Changes

Summary

  1. [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier (commit: 6d5d030) (details)
  2. [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark (commit: 4b36797) (details)
  3. [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query (commit: 8760032) (details)
  4. [SPARK-32512][SQL] add alter table add/drop partition command for datasourcev2 (commit: 1eb236b) (details)
Commit 6d5d03095798a2ca2014ada340424512d60810ce by wenchen
[SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use
UnresolvedTableOrView to resolve the identifier
### What changes were proposed in this pull request?
This PR proposes to migrate `SHOW CREATE TABLE` to use
`UnresolvedTableOrView` to resolve the table identifier. This allows
consistent resolution rules (temp view first, etc.) to be applied for
both v1/v2 commands. More info about the consistent resolution rule
proposal can be found in
[JIRA](https://issues.apache.org/jira/browse/SPARK-29900) or [proposal
doc](https://docs.google.com/document/d/1hvLjGA8y_W_hhilpngXVub1Ebv8RsMap986nENCFnrg/edit?usp=sharing).
Note that `SHOW CREATE TABLE` works only with a v1 table and a permanent
view, and is not supported for v2 tables.
### Why are the changes needed?
The changes allow consistent resolution behavior when resolving the
table identifier. For example, the following is the current behavior:
```scala
sql("CREATE TEMPORARY VIEW t AS SELECT 1")
sql("CREATE DATABASE db")
sql("CREATE TABLE t (key INT, value STRING) USING hive")
sql("USE db")
sql("SHOW CREATE TABLE t AS SERDE") // Succeeds
```
With this change, `SHOW CREATE TABLE ... AS SERDE` above fails with the following:
```
org.apache.spark.sql.AnalysisException: t is a temp view not table or permanent view.; line 1 pos 0
  at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveTempViews$$anonfun$apply$7.$anonfun$applyOrElse$43(Analyzer.scala:883)
  at scala.Option.map(Option.scala:230)
```
This is expected, since the temporary view is resolved first and `SHOW
CREATE TABLE ... AS SERDE` does not support a temporary view.
Note that there is no behavior change for `SHOW CREATE TABLE` without
`AS SERDE` since it was already resolving to a temporary view first. See
below for more detail.
### Does this PR introduce _any_ user-facing change?
After this PR, `SHOW CREATE TABLE t AS SERDE` is resolved to a temp view
`t` instead of table `db.t` in the above scenario.
Note that there is no behavior change for `SHOW CREATE TABLE` without
`AS SERDE`, but the exception message changes from `SHOW CREATE TABLE is
not supported on a temporary view` to `t is a temp view not table or
permanent view`.
### How was this patch tested?
Updated existing tests.
Closes #30321 from imback82/show_create_table.
Authored-by: Terry Kim <yuminkim@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 6d5d030)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala (diff)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DDLParserSuite.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/ShowCreateTableSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala (diff)
Commit 4b367976a877adb981f65d546e1522fdf30d0731 by yamamuro
[SPARK-33417][SQL][TEST] Correct the behaviour of query filters in
TPCDSQueryBenchmark
### What changes were proposed in this pull request?
This PR intends to fix the behaviour of query filters in
`TPCDSQueryBenchmark`. We can use an option `--query-filter` for
selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But,
the current master has a weird behaviour about the option. For example,
if we pass `--query-filter q6` so as to run the TPCDS q6 only,
`TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the
`filterQueries` method does not respect the name suffix. So, there is no
way now to run the TPCDS q6 only.
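For context, a minimal sketch of suffix-aware filtering (the method name `filterQueries` comes from the PR; the exact signature and matching rule below are illustrative assumptions, not the actual benchmark code):
```scala
// Hypothetical sketch: a query from a suffixed set (e.g. the v2.7
// variants) only runs when the filter names it WITH its suffix, so
// "--query-filter q6" no longer matches "q6-v2.7".
def filterQueries(queries: Seq[String], filter: Set[String], suffix: String = ""): Seq[String] =
  if (filter.isEmpty) queries
  else queries.filter(q => filter.contains(q + suffix))

// filterQueries(Seq("q6", "q8"), Set("q6"))          -> Seq("q6")
// filterQueries(Seq("q6", "q8"), Set("q6"), "-v2.7") -> Seq()   (q6-v2.7 was not requested)
```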
### Why are the changes needed?
Bugfix.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Manually checked.
Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.
Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
(commit: 4b36797)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala (diff)
Commit 8760032f4f7e1ef36fee6afc45923d3826ef14fc by gurwls223
[SPARK-33412][SQL] OverwriteByExpression should resolve its delete
condition based on the table relation not the input query
### What changes were proposed in this pull request?
Make a special case in `ResolveReferences`, which resolves
`OverwriteByExpression`'s condition expression based on the table
relation instead of the input query.
### Why are the changes needed?
The condition expression is ultimately passed to the table
implementation, so we should resolve it using the table schema.
Previously this worked because of a hack in `ResolveReferences` that
delays resolution while `outputResolved == false`. However, this hack
doesn't work for tables that accept any schema, like
https://github.com/delta-io/delta/pull/521 .
We may wrongly resolve the delete condition using the input query's
output columns, which don't match the table's column names.
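For illustration, a sketch of the shape of the problem using the public v2 writer API (the catalog, table, and column names are placeholder assumptions):
```scala
// The source DataFrame has columns (a, b); the target v2 table is
// assumed to have columns (key, value). The overwrite condition below
// refers to the TABLE column `key`, so it must be resolved against the
// table relation's schema, not the input query's output (a, b).
import org.apache.spark.sql.functions.col

val df = spark.range(10)
  .select(col("id").as("a"), col("id").cast("string").as("b"))
df.writeTo("testcat.ns.target").overwrite(col("key") === 1)
```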
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing tests, plus an updated test for v2 writes.
Closes #30318 from cloud-fan/v2-write.
Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 8760032)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (diff)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala (diff)
Commit 1eb236b9360a000afc30424341698fe26ee96d0f by wenchen
[SPARK-32512][SQL] add alter table add/drop partition command for
datasourcev2
### What changes were proposed in this pull request?
This patch adds `AlterTableAddPartitionExec` and
`AlterTableDropPartitionExec` based on the new table partition API
defined in #28617.
### Does this PR introduce _any_ user-facing change?
Yes. Users can use `ALTER TABLE ... ADD PARTITION` or `ALTER TABLE ...
DROP PARTITION` to create or drop partitions in a v2 table.
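For illustration, the new commands use the standard partition DDL syntax against a v2 catalog (the catalog, table, and partition column names below are placeholders, not names from the PR):
```scala
// Hypothetical v2 catalog table with an assumed partition column `dt`.
sql("ALTER TABLE testcat.ns.tbl ADD PARTITION (dt = '2020-11-11')")
sql("ALTER TABLE testcat.ns.tbl DROP PARTITION (dt = '2020-11-11')")
```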
### How was this patch tested?
Ran existing suites and fixed old tests.
Closes #29339 from stczwd/SPARK-32512-new.
Lead-authored-by: stczwd <qcsd2011@163.com>
Co-authored-by: Jacky Lee <qcsd2011@163.com>
Co-authored-by: Jackey Lee <qcsd2011@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 1eb236b)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (diff)
The file was modified sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala (diff)
The file was added sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/AlterTableAddPartitionExec.scala
The file was added sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTablePartitionV2SQLSuite.scala
The file was modified sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala (diff)
The file was added sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolvePartitionSpec.scala
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Implicits.scala (diff)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DDLParserSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala (diff)
The file was added sql/core/src/test/scala/org/apache/spark/sql/connector/DatasourceV2SQLBase.scala
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala (diff)
The file was added sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/AlterTableDropPartitionExec.scala
The file was modified sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala (diff)
The file was added sql/catalyst/src/test/scala/org/apache/spark/sql/connector/InMemoryPartitionTableCatalog.scala