Changes

Summary

  1. [SPARK-34466][SQL][DOCS] Improve docs for `ALTER TABLE .. RENAME TO` (commit: 4a9a1d4) (details)
  2. [SPARK-34314][SQL] Fix partitions schema inference (commit: b26e7b5) (details)
  3. [SPARK-34421][SQL] Resolve temporary functions and views in views with (commit: 27abb6a) (details)
  4. [SPARK-28123][SQL] String Functions: support btrim (commit: 06df121) (details)
  5. [SPARK-34283][SQL] Combines all adjacent 'Union' operators into a single (commit: 96bcb4b) (details)
  6. [SPARK-34469][K8S] Ignore RegisterExecutor when SparkContext is stopped (commit: 484a83e) (details)
  7. [SPARK-24818][CORE] Support delay scheduling for barrier execution (commit: 4dc16f2) (details)
  8. [SPARK-34471][SS][DOCS] Document Streaming Table APIs in Structured (commit: 489d32a) (details)
  9. [SPARK-7768][CORE][SQL] Open UserDefinedType as a Developer API (commit: f78466d) (details)
  10. [SPARK-34379][SQL] Map JDBC RowID to StringType rather than LongType (commit: 82b33a3) (details)
  11. [SPARK-34481][SQL] Refactor dataframe reader/writer optionsWithPath (commit: 7de49a8) (details)
  12. [SPARK-20977][CORE] Use a non-final field for the state of (commit: fadd0f5) (details)
  13. [SPARK-34373][SQL] HiveThriftServer2 startWithContext may hang with a (commit: 1fac706) (details)
  14. [SPARK-34360][SQL] Support truncation of v2 tables (commit: 04c3125) (details)
  15. [SPARK-34384][CORE] Add missing docs for ResourceProfile APIs (commit: 546d2eb) (details)
  16. [SPARK-34486][K8S] Upgrade kubernetes-client to 4.13.2 (commit: 020e84e) (details)
  17. [SPARK-34487][K8S][TESTS] Use the runtime Hadoop version in K8s IT (commit: 9942548) (details)
  18. [SPARK-34129][SQL] Add table name to LogicalRelation.simpleString (commit: 94f9617) (details)
  19. [SPARK-34029][SQL][TESTS] Add OrcEncryptionSuite and FakeKeyProvider (commit: 03f4cf5) (details)
  20. [SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views (commit: 6ea4b5f) (details)
  21. [SPARK-34468][SQL] Rename v2 table in place if new name has single part (commit: a22d20a) (details)
  22. [SPARK-34167][SQL] Reading parquet with IntDecimal written as a (commit: 38fbe56) (details)
  23. [SPARK-34495][TESTS] Add `DedicatedJVMTest` test tag (commit: 2fb5f21) (details)
  24. [SPARK-34450][SQL][TESTS] Unify v1 and v2 ALTER TABLE .. RENAME tests (commit: 23a5996) (details)
  25. [SPARK-34432][SQL][TESTS] Add JavaSimpleWritableDataSource (commit: 9767041) (details)
  26. [MINOR][DOCS] Add table_identifier in sql-migration-guide for SHOW (commit: a6a82c8) (details)
  27. [SPARK-34473][SQL] Avoid NPE in DataFrameReader.schema(StructType) (commit: 02c784c) (details)
  28. [SPARK-34496][BUILD] Upgrade ZSTD-JNI to 1.4.8-5 for better API (commit: 0bccf16) (details)
  29. [MINOR][SQL] Fix the comment for CalendarIntervalType about (commit: 7df4fed) (details)
Commit 4a9a1d42e7bc3120fc114393e524deb1b16f7419 by wenchen
[SPARK-34466][SQL][DOCS] Improve docs for `ALTER TABLE .. RENAME TO`

### What changes were proposed in this pull request?
Explicitly highlight that the table rename command cannot move a table between databases.

### Why are the changes needed?
To inform users about the actual behavior of the table rename command.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
```sql
spark-sql> CREATE DATABASE db1;
spark-sql> CREATE DATABASE db2;
spark-sql> CREATE TABLE db1.tbl1 (c0 INT);
spark-sql> ALTER TABLE db1.tbl1 RENAME TO db2.tbl1;
Error in query: RENAME TABLE source and destination databases do not match: 'db1' != 'db2';
spark-sql> ALTER TABLE db1.tbl1 RENAME TO db1.tbl2;
spark-sql> SHOW TABLES IN db1 LIKE '*';
db1 tbl2 false
```

Closes #31586 from MaxGekk/doc-rename-table.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 4a9a1d4)
The file was modified docs/sql-ref-syntax-ddl-alter-table.md (diff)
Commit b26e7b510bbaee63c4095ab47e75ff2a70e377d7 by wenchen
[SPARK-34314][SQL] Fix partitions schema inference

### What changes were proposed in this pull request?
Infer the partitions schema by:
1. inferring the common type over all partition part values, and
2. casting those values to the common type

Before the changes:
1. Spark creates a literal with most appropriate type for concrete partition value i.e. `part0=-0` -> `Literal(0, IntegerType)`, `part0=abc` -> `Literal(UTF8String.fromString("abc"), StringType)`.
2. Finds the common type for all literals of a partition column. For the example above, it is `StringType`.
3. Casts those literal to the desired type:
  - `Cast(Literal(0, IntegerType), StringType)` -> `UTF8String.fromString("0")`
  - `Cast(Literal(UTF8String.fromString("abc"), StringType), StringType)` -> `UTF8String.fromString("abc")`

In the example, we get the partition part value "0", which differs from the original "-0". Spark shouldn't modify string-typed partition part values because that can influence query results.

Closes #31423

### Why are the changes needed?
The changes fix the bug demonstrated by the example:
1. There are partitioned parquet files (file format doesn't matter):
```
/private/var/folders/p3/dfs6mf655d7fnjrsjvldh0tc0000gn/T/spark-e09eae99-7ecf-4ab2-b99b-f63f8dea658d
├── _SUCCESS
├── part=-0
│   └── part-00001-02144398-2896-4d21-9628-a8743d098cb4.c000.snappy.parquet
└── part=AA
    └── part-00000-02144398-2896-4d21-9628-a8743d098cb4.c000.snappy.parquet
```
placed into two partitions, "AA" and **"-0"**.

2. When reading them w/o specified schema:
```
val df = spark.read.parquet(path)
df.printSchema()
root
|-- id: integer (nullable = true)
|-- part: string (nullable = true)
```
the inferred type of the partition column `part` is the **string** type.
3. The expected values in the column `part` are "AA" and "-0" but we get:
```
df.show(false)
+---+----+
|id |part|
+---+----+
|0  |AA  |
|1  |0   |
+---+----+
```
So, Spark returns **"0"** instead of **"-0"**.
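With the fix, string-typed partition values should round-trip unchanged. A sketch of the expected output, per the description above:
```scala
df.show(false)
// +---+----+
// |id |part|
// +---+----+
// |0  |AA  |
// |1  |-0  |  <- the original partition part value is preserved
// +---+----+
```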

### Does this PR introduce _any_ user-facing change?
This PR can change query results.

### How was this patch tested?
By running new test and existing test suites:
```
$ build/sbt "test:testOnly *FileIndexSuite"
$ build/sbt "test:testOnly *ParquetV1PartitionDiscoverySuite"
$ build/sbt "test:testOnly *ParquetV2PartitionDiscoverySuite"
```

Closes #31549 from MaxGekk/fix-partition-file-index-2.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: b26e7b5)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala (diff)
Commit 27abb6ab5674b8663440dc738a0ba79c185fb063 by wenchen
[SPARK-34421][SQL] Resolve temporary functions and views in views with CTEs

### What changes were proposed in this pull request?
This PR:
- Fixes a bug that prevents analysis of the following (a runnable sketch follows this list):
  ```
  CREATE TEMPORARY VIEW temp_view AS WITH cte AS (SELECT temp_func(0)) SELECT * FROM cte;
  SELECT * FROM temp_view
  ```
  by throwing:
  ```
  Undefined function: 'temp_func'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.
  ```
- and fixes a bug where no analysis error is reported when it should be:
  ```
  CREATE TEMPORARY VIEW temp_view AS SELECT 0;
  CREATE VIEW view_on_temp_view AS WITH cte AS (SELECT * FROM temp_view) SELECT * FROM cte
  ```
  by properly collecting temporary objects from VIEW definitions with CTEs.

- Minor refactor to make the affected code more readable.
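A runnable sketch of the first repro above; the identity UDF standing in for `temp_func` is purely illustrative:
```scala
// Register a temporary function, then reference it from a CTE inside a temporary view.
spark.udf.register("temp_func", (x: Int) => x)  // illustrative stand-in for temp_func
spark.sql("CREATE TEMPORARY VIEW temp_view AS WITH cte AS (SELECT temp_func(0)) SELECT * FROM cte")
spark.sql("SELECT * FROM temp_view").show()     // failed to analyze before this fix
```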

### Why are the changes needed?
To fix a bug introduced with https://github.com/apache/spark/pull/30567

### Does this PR introduce _any_ user-facing change?
Yes, the query works again.

### How was this patch tested?
Added new UT + existing ones.

Closes #31550 from peter-toth/SPARK-34421-temp-functions-in-views-with-cte.

Authored-by: Peter Toth <peter.toth@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 27abb6a)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala (diff)
Commit 06df1210d4117e3b7cb5b1779e4fae672383be34 by wenchen
[SPARK-28123][SQL] String Functions: support btrim

### What changes were proposed in this pull request?
Spark already supports `trim`/`ltrim`/`rtrim`. The function `btrim` is an alternate form of `TRIM(BOTH <chars> FROM <expr>)`.
`btrim` removes the longest string consisting only of specified characters from the start and end of a string.
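A hedged usage sketch (the two-argument example mirrors the PostgreSQL docs linked below; the outputs are what the semantics above imply):
```scala
spark.sql("SELECT btrim('xyxtrimyyx', 'xy')").show()  // -> trim
spark.sql("SELECT btrim('   SparkSQL   ')").show()    // one-argument form trims spaces -> SparkSQL
```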

Mainstream databases that support this feature are shown below:

**Postgresql**
https://www.postgresql.org/docs/11/functions-binarystring.html

**Vertica**
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/String/BTRIM.htm?tocpath=SQL%20Reference%20Manual%7CSQL%20Functions%7CString%20Functions%7C_____5

**Redshift**
https://docs.aws.amazon.com/redshift/latest/dg/r_BTRIM.html

**Druid**
https://druid.apache.org/docs/latest/querying/sql.html#string-functions

**Greenplum**
http://docs.greenplum.org/6-8/ref_guide/function-summary.html

### Why are the changes needed?
btrim is very useful.

### Does this PR introduce _any_ user-facing change?
Yes. `btrim` is a new function.

### How was this patch tested?
Jenkins test.

Closes #31390 from beliefer/SPARK-28123-support-btrim.

Authored-by: gengjiaan <gengjiaan@360.cn>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 06df121)
The file was modified sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala (diff)
The file was modified sql/core/src/test/resources/sql-tests/inputs/postgreSQL/strings.sql (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/postgreSQL/strings.sql.out (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala (diff)
The file was modified sql/core/src/test/resources/sql-functions/sql-expression-schema.md (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/string-functions.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/inputs/string-functions.sql (diff)
Commit 96bcb4bbe4340fd42e79ad618239bd1c3bbf9fd7 by wenchen
[SPARK-34283][SQL] Combines all adjacent 'Union' operators into a single 'Union' when using 'Dataset.union.distinct.union.distinct'

### What changes were proposed in this pull request?

Handled the 'Deduplicate(Keys, Union)' operation in the rule 'CombineUnions' to combine adjacent 'Union' operators into a single 'Union' if necessary when using 'Dataset.union.distinct.union.distinct'.
Currently this only handles distinct-like 'Deduplicate', where the keys == output. For example:
```
val df1 = Seq((1, 2, 3)).toDF("a", "b", "c")
val df2 = Seq((6, 2, 5)).toDF("a", "b", "c")
val df3 = Seq((2, 4, 3)).toDF("c", "a", "b")
val df4 = Seq((1, 4, 5)).toDF("b", "a", "c")
val unionDF1 = df1.unionByName(df2).dropDuplicates(Seq("b", "a", "c"))
      .unionByName(df3).dropDuplicates().unionByName(df4)
      .dropDuplicates("a")
```
In this case, **all Union operators will be combined**.
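One way to verify the collapse is to count `Union` nodes in the optimized plan (a hedged sketch reusing `unionDF1` from above):
```scala
import org.apache.spark.sql.catalyst.plans.logical.Union

// All adjacent unions should have been merged into a single Union node.
val unions = unionDF1.queryExecution.optimizedPlan.collect { case u: Union => u }
assert(unions.size == 1)
```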
By contrast:
```
val df1 = Seq((1, 2, 3)).toDF("a", "b", "c")
val df2 = Seq((6, 2, 5)).toDF("a", "b", "c")
val df3 = Seq((2, 4, 3)).toDF("c", "a", "b")
val df4 = Seq((1, 4, 5)).toDF("b", "a", "c")
val unionDF = df1.unionByName(df2).dropDuplicates(Seq("a"))
      .unionByName(df3).dropDuplicates("c").unionByName(df4)
      .dropDuplicates("b")
```
In this case, **none of the unions will be combined, because Deduplicate.keys does not equal Union.output**.

### Why are the changes needed?

When using 'Dataset.union.distinct.union.distinct', the resulting operator is 'Deduplicate(Keys, Union)'. AstBuilder transforms SQL-style 'UNION' into the 'Distinct(Union)' operator, and the rule 'CombineUnions' in the Optimizer only handles 'Distinct(Union)', not 'Deduplicate(Keys, Union)'.
Please see the detailed  description in [SPARK-34283](https://issues.apache.org/jira/browse/SPARK-34283).

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit tests.

Closes #31404 from zzcclp/SPARK-34283.

Authored-by: Zhichao Zhang <441586683@qq.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 96bcb4b)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala (diff)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SetOperationSuite.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/DataFrameSetOperationsSuite.scala (diff)
Commit 484a83e73e8b9420ffd42b4eb95dcdf4cf66ebf0 by dhyun
[SPARK-34469][K8S] Ignore RegisterExecutor when SparkContext is stopped

### What changes were proposed in this pull request?

This PR aims to make `KubernetesClusterSchedulerBackend` ignore `RegisterExecutor` message when `SparkContext` is stopped already.

### Why are the changes needed?

If `SparkDriver` is terminated, the executors will be removed by K8s automatically.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the newly added test case.

Closes #31587 from dongjoon-hyun/SPARK-34469.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 484a83e)
The file was modified resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackendSuite.scala (diff)
The file was modified resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala (diff)
Commit 4dc16f2d59827e3ff38f76da1821bc2682c9f48b by mridulatgmail.com
[SPARK-24818][CORE] Support delay scheduling for barrier execution

### What changes were proposed in this pull request?

This PR tries to support the (non-legacy) delay scheduling for the barrier execution.

The idea is to add a pending-launch task list (`barrierPendingLaunchTasks`) in the barrier `TaskSetManager`. We don't actually add those pending tasks to the running list, post task-start events to the listeners, and so on until all tasks in the barrier `TaskSetManager` have been added to `barrierPendingLaunchTasks` within a single `resourceOffers()` round. If only some of the tasks can be launched after a single `resourceOffers()` round, we revert all resources assigned to the tasks in `barrierPendingLaunchTasks`, clear the list, and wait for the next `resourceOffers()` round. The barrier `TaskSetManager` will eventually launch, since we've already ensured there are enough slots before scheduling.
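A self-contained conceptual sketch of that all-or-nothing logic; `Task`, `buffer`, and `revert` are illustrative stand-ins, not Spark internals:
```scala
import scala.collection.mutable

final case class Task(id: Int)

class BarrierLaunchSketch(totalTasks: Int) {
  private val barrierPendingLaunchTasks = mutable.ArrayBuffer.empty[Task]

  // Within a single resourceOffers() round, assignable tasks are only buffered.
  def buffer(task: Task): Unit = barrierPendingLaunchTasks += task

  // At the end of the round: launch everything, or revert and retry next round.
  def finishRound(revert: Seq[Task] => Unit): Seq[Task] = {
    val launched =
      if (barrierPendingLaunchTasks.size == totalTasks) {
        barrierPendingLaunchTasks.toList           // every slot satisfied: launch together
      } else {
        revert(barrierPendingLaunchTasks.toList)   // give the assigned resources back
        Nil                                        // wait for the next round
      }
    barrierPendingLaunchTasks.clear()
    launched
  }
}
```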

### Why are the changes needed?

Currently, with delay scheduling enabled for the barrier execution, the application can abort immediately when only some of the tasks can be launched. This is really bad, especially when the application has already completed many stages before the barrier stage. For example, the application may run some ETL jobs before the barrier job (for ML).

After this PR, this scenario should no longer happen.

### Does this PR introduce _any_ user-facing change?

Yes, users will no longer face the `Fail resource offers for barrier stage...` error.

### How was this patch tested?

Added/updated unit tests.

Closes #30650 from Ngone51/barrier-delay-scheduling.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
(commit: 4dc16f2)
The file was modified core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala (diff)
The file was modified core/src/test/scala/org/apache/spark/scheduler/BarrierTaskContextSuite.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala (diff)
The file was modified core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala (diff)
Commit 489d32aa9bb9ef9446ac8df19deb0693f305b092 by kabhwan.opensource
[SPARK-34471][SS][DOCS] Document Streaming Table APIs in Structured Streaming Programming Guide

### What changes were proposed in this pull request?

This change is to document the newly added streaming table APIs in Structured Streaming Programming Guide.
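A minimal sketch of the table APIs the new section covers; the table names and checkpoint path are illustrative:
```scala
// Read a table as a stream, then write the stream out to another table.
val stream = spark.readStream.table("input_table")
stream.writeStream
  .option("checkpointLocation", "/tmp/checkpoint")  // illustrative path
  .toTable("output_table")
```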

### Why are the changes needed?

This will help our users when they try to use the new APIs.

### Does this PR introduce _any_ user-facing change?
Yes. Users will see the changes in the programming guide.

### How was this patch tested?
Built the HTML page and verified.

Attached is a screenshot of the section added:
![Table APIs Section - Scala](https://user-images.githubusercontent.com/44179472/108581923-1ff86700-736b-11eb-8fcd-efa04ac936de.png)

Closes #31590 from bozhang2820/table-api-doc.

Lead-authored-by: Bo Zhang <bo.zhang@databricks.com>
Co-authored-by: Bo Zhang <bozhang2820@gmail.com>
Signed-off-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
(commit: 489d32a)
The file was modified docs/structured-streaming-programming-guide.md (diff)
Commit f78466dca6f0ddb1c979842f5a22e1a1e3b535bf by srowen
[SPARK-7768][CORE][SQL] Open UserDefinedType as a Developer API

### What changes were proposed in this pull request?

UserDefinedType and UDTRegistration become public Developer APIs, not package-private to Spark.
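An illustrative sketch of what this opens up: a trivial UDT for a `Point` class, stored as an "x,y" string (not taken from the PR itself):
```scala
import org.apache.spark.sql.types.{DataType, StringType, UserDefinedType}
import org.apache.spark.unsafe.types.UTF8String

case class Point(x: Double, y: Double)

// With UserDefinedType public, developers can subclass it directly.
class PointUDT extends UserDefinedType[Point] {
  override def sqlType: DataType = StringType
  override def serialize(p: Point): Any = UTF8String.fromString(s"${p.x},${p.y}")
  override def deserialize(datum: Any): Point = {
    val Array(x, y) = datum.toString.split(",")
    Point(x.toDouble, y.toDouble)
  }
  override def userClass: Class[Point] = classOf[Point]
}
```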

### Why are the changes needed?

This proposes to simply open up the UserDefinedType class as a developer API. It was public in 1.x, but closed in 2.x for some possible redesign that does not seem to have happened.

Other libraries have managed to define UDTs anyway by inserting shims into the Spark namespace, and this evidently has worked OK. But package isolation in Java 9+ breaks this.

The logic here is mostly: this is de facto a stable API, so can at least be open to developers with the usual caveats about developer APIs.

Open questions:

- Is there in fact some important redesign that's needed before opening it? The comment to this effect is from 2016
- Is this all that needs to be opened up? Like PythonUserDefinedType?
- Should any of this be kept package-private?

This was first proposed in https://github.com/apache/spark/pull/16478 though it was a larger change, but, the other API issues it was fixing seem to have been addressed already (e.g. no need to return internal Spark types). It was never really reviewed.

My hunch is that there isn't much downside, and some upside, to just opening this as-is now.

### Does this PR introduce _any_ user-facing change?

UserDefinedType becomes visible to developers to subclass.

### How was this patch tested?

Existing tests; there is no change to the existing logic.

Closes #31461 from srowen/SPARK-7768.

Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(commit: f78466d)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/types/UserDefinedType.scala (diff)
The file was modified mllib/src/main/scala/org/apache/spark/ml/linalg/VectorUDT.scala (diff)
The file was modified mllib/src/main/scala/org/apache/spark/ml/linalg/MatrixUDT.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/types/UDTRegistration.scala (diff)
Commit 82b33a304160e4f950de613c3d17f88fa3e75e5e by sarutak
[SPARK-34379][SQL] Map JDBC RowID to StringType rather than LongType

### What changes were proposed in this pull request?

This PR fixes an issue where `java.sql.RowId` is mapped to `LongType`; `StringType` should be preferred.

In the current implementation, the JDBC RowID type is mapped to `LongType` except in `OracleDialect`, but there is no guarantee that a RowID can be converted to a long.
`java.sql.RowId` declares `toString` and the specification of `java.sql.RowId` says

> _all methods on the RowId interface must be fully implemented if the JDBC driver supports the data type_
(https://docs.oracle.com/javase/8/docs/api/java/sql/RowId.html)

So, we should prefer StringType to LongType.
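A hedged illustration of the mapping change (not the exact `JdbcUtils.getCatalystType` code):
```scala
import java.sql.Types
import org.apache.spark.sql.types.{DataType, NullType, StringType}

def catalystTypeFor(sqlType: Int): DataType = sqlType match {
  case Types.ROWID => StringType  // previously LongType; RowId only guarantees toString
  case _           => NullType    // placeholder for all other JDBC type mappings
}
```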

### Why are the changes needed?

This seems to be a potential bug.

### Does this PR introduce _any_ user-facing change?

Yes. RowID is mapped to StringType rather than LongType.

### How was this patch tested?

A new test was added, and the existing test case `SPARK-32992: map Oracle's ROWID type to StringType` in `OracleIntegrationSuite` passes.

Closes #31491 from sarutak/rowid-type.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>
(commit: 82b33a3)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala (diff)
The file was modified docs/sql-migration-guide.md (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/jdbc/OracleDialect.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala (diff)
Commit 7de49a8fc0c47fb4d2ce44e3ebe2978e002d9699 by dhyun
[SPARK-34481][SQL] Refactor dataframe reader/writer optionsWithPath logic

### What changes were proposed in this pull request?

Extract the optionsWithPath logic into its own function.

### Why are the changes needed?

Reduce the code duplication and improve modularity.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Just some refactoring. Existing tests.

Closes #31599 from yuchenhuo/SPARK-34481.

Authored-by: Yuchen Huo <yuchen.huo@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 7de49a8)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala (diff)
Commit fadd0f5d9bff79cbd785631aa2962b9eda644ab8 by srowen
[SPARK-20977][CORE] Use a non-final field for the state of CollectionAccumulator

This PR is a fix for the JLS 17.5.3 violation identified in
zsxwing's [19/Feb/19 11:47 comment](https://issues.apache.org/jira/browse/SPARK-20977?focusedCommentId=16772277&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16772277) on the JIRA.

### What changes were proposed in this pull request?
- Use a var field to hold the state of the collection accumulator

### Why are the changes needed?
AccumulatorV2's auto-registration of the accumulator during readObject doesn't work with final fields that are post-processed outside readObject. As it stands, incompletely initialized objects are published to the heartbeat thread. This leads to sporadic exceptions that knock out executors, which increases the cost of the jobs. We observe such failures on a regular basis; see https://github.com/NVIDIA/spark-rapids/issues/1522.
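A conceptual sketch of the fix (illustrative, not the actual `AccumulatorV2` code): state assigned inside readObject does not get the safe-publication guarantees of final fields (JLS 17.5.3), so a plain var is the honest representation of the object's lifecycle:
```scala
import java.io.ObjectInputStream

class CollectionAccLike[T] extends Serializable {
  // A var, not a final val: readObject below reassigns it before `this` escapes.
  @transient private var list: java.util.List[T] = new java.util.ArrayList[T]()

  def add(v: T): Unit = list.add(v)

  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    list = new java.util.ArrayList[T]()  // re-initialize transient state
    // auto-registration may publish `this` to the heartbeat thread right here
  }
}
```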

### Does this PR introduce _any_ user-facing change?
None

### How was this patch tested?
- This is a concurrency bug that is almost impossible to reproduce as a quick unit test.
- By trial and error I crafted a command https://github.com/NVIDIA/spark-rapids/pull/1688 that reproduces the issue on my dev box several times per hour, with the first occurrence often within a few minutes. After the patch, these exceptions have not shown up after running overnight for 10+ hours.
- Existing unit tests in `*AccumulatorV2Suite` and `*LiveEntitySuite`.

Closes #31540 from gerashegalov/SPARK-20977.

Authored-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Sean Owen <srowen@gmail.com>
(commit: fadd0f5)
The file was modified core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala (diff)
Commit 1fac706db560001411672c5ade42f6608f82989e by gurwls223
[SPARK-34373][SQL] HiveThriftServer2 startWithContext may hang with a race issue

### What changes were proposed in this pull request?

Fix a race issue by interrupting the thread.

### Why are the changes needed?

```
21:43:26.809 WARN org.apache.thrift.server.TThreadPoolServer: Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: No underlying server socket.
at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:126)
at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
at org.apache.thrift.transport.TServerTransport.acceException in thread "Thread-15" java.io.IOException: Stream closed
at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:170)
at java.io.BufferedInputStream.read(BufferedInputStream.java:336)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at scala.sys.process.BasicIO$.loop$1(BasicIO.scala:238)
at scala.sys.process.BasicIO$.transferFullyImpl(BasicIO.scala:246)
at scala.sys.process.BasicIO$.transferFully(BasicIO.scala:227)
at scala.sys.process.BasicIO$.$anonfun$toStdOut$1(BasicIO.scala:221)
```
When the TServer tries to `serve` after `stop`, it hangs forever with the log above.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Passing CI.

Closes #31479 from yaooqinn/SPARK-34373.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 1fac706)
The file was modified sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java (diff)
The file was modified sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java (diff)
The file was modified sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java (diff)
Commit 04c3125dcfb2a40b13eef443e5b543795aa31c34 by gurwls223
[SPARK-34360][SQL] Support truncation of v2 tables

### What changes were proposed in this pull request?
1. Add a new interface, `TruncatableTable`, which represents tables that allow atomic truncation (a sketch follows this list).
2. Implement the new method in `InMemoryTable` and in `InMemoryPartitionTable`.
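A hedged sketch of a v2 table opting into the new capability, assuming the interface exposes a single `truncateTable(): Boolean` method; the in-memory storage is illustrative:
```scala
import java.util
import org.apache.spark.sql.connector.catalog.{TableCapability, TruncatableTable}
import org.apache.spark.sql.types.StructType

class TinyTruncatableTable extends TruncatableTable {
  private val rows = new util.ArrayList[AnyRef]()
  override def name(): String = "tiny"
  override def schema(): StructType = new StructType().add("i", "int")
  override def capabilities(): util.Set[TableCapability] = util.Collections.emptySet[TableCapability]()
  override def truncateTable(): Boolean = { rows.clear(); true }  // all-or-nothing for in-memory state
}
```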

### Why are the changes needed?
To support `TRUNCATE TABLE` for v2 tables.

### Does this PR introduce _any_ user-facing change?
Should not.

### How was this patch tested?
Added new tests to `TableCatalogSuite` that check truncation of non-partitioned and partitioned tables:
```
$ build/sbt "test:testOnly *TableCatalogSuite"
```

Closes #31475 from MaxGekk/dsv2-truncate-table.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 04c3125)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/connector/InMemoryTable.scala (diff)
The file was modified sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsDelete.java (diff)
The file was added sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TruncatableTable.java
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/TableCatalogSuite.scala (diff)
Commit 546d2eb5d46813a14c7bd30113fb6bb038cdd2fc by gurwls223
[SPARK-34384][CORE] Add missing docs for ResourceProfile APIs

### What changes were proposed in this pull request?

This PR adds missing docs for ResourceProfile-related APIs. Besides, it includes a few minor changes on the API (usage is sketched after this list):

* ResourceProfileBuilder.build -> ResourceProfileBuilder.builder()
* Provides a Java-specific API `allSupportedExecutorResourcesJList`
* Makes `ResourceAllocator` private, since it was mistakenly exposed previously
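A usage sketch of the documented builder APIs (resource amounts are illustrative):
```scala
import org.apache.spark.resource.{ExecutorResourceRequests, ResourceProfileBuilder, TaskResourceRequests}

// Build a profile requiring 4-core/8g executors and 2 CPUs per task.
val profile = new ResourceProfileBuilder()
  .require(new ExecutorResourceRequests().cores(4).memory("8g"))
  .require(new TaskResourceRequests().cpus(2))
  .build()
```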

### Why are the changes needed?

Add missing API docs

### Does this PR introduce _any_ user-facing change?

No, as Apache Spark 3.1 hasn't been officially released.

### How was this patch tested?

Updated unit tests due to the signature change of `build()`.

Closes #31496 from Ngone51/resource-profile-api-cleanup.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 546d2eb)
The file was modified core/src/main/scala/org/apache/spark/resource/TaskResourceRequests.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/resource/ResourceAllocator.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/resource/ResourceProfileBuilder.scala (diff)
The file was modified resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocatorSuite.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/resource/ExecutorResourceRequests.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala (diff)
The file was modified resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStepSuite.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/resource/ExecutorResourceRequest.scala (diff)
The file was modified core/src/main/scala/org/apache/spark/resource/TaskResourceRequest.scala (diff)
Commit 020e84e92f5fe81a144f909ac0d1879ab5ec4dd5 by gurwls223
[SPARK-34486][K8S] Upgrade kubernetes-client to 4.13.2

### What changes were proposed in this pull request?

This PR aims to upgrade `kubernetes-client` library from 4.12.0 to 4.13.2 for Apache Spark 3.2.0.

### Why are the changes needed?

This will bring [K8s 1.19.1](https://github.com/fabric8io/kubernetes-client/pull/2541) models officially and the latest bug fixes.

- https://github.com/fabric8io/kubernetes-client/releases/tag/v4.13.0
- https://github.com/fabric8io/kubernetes-client/releases/tag/v4.13.1
- https://github.com/fabric8io/kubernetes-client/releases/tag/v4.13.2

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Pass the K8s IT and UT.

```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- Run SparkR on simple dataframe.R example
Run completed in 19 minutes, 25 seconds.
Total number of tests run: 27
Suites: completed 2, aborted 0
Tests: succeeded 27, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Closes #31602 from dongjoon-hyun/SPARK-34486.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 020e84e)
The file was modified dev/deps/spark-deps-hadoop-2.7-hive-2.3 (diff)
The file was modified resource-managers/kubernetes/core/pom.xml (diff)
The file was modified resource-managers/kubernetes/integration-tests/pom.xml (diff)
The file was modified dev/deps/spark-deps-hadoop-3.2-hive-2.3 (diff)
Commit 9942548c37ee6b08b6e29332c1e42407f4026fd3 by dhyun
[SPARK-34487][K8S][TESTS] Use the runtime Hadoop version in K8s IT

### What changes were proposed in this pull request?

This PR aims to use the runtime Hadoop version in K8s integration test.

### Why are the changes needed?

SPARK-33212 upgraded the Hadoop dependency from 3.2.0 to 3.2.2, and we will upgrade to 3.3.x+.
We had better use the runtime Hadoop version instead of a static string.
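The runtime version can be read from Hadoop's own `VersionInfo`; whether the IT uses exactly this call is an assumption:
```scala
import org.apache.hadoop.util.VersionInfo

// Resolve the Hadoop version at runtime instead of hardcoding a string like "3.2.0".
val hadoopVersion: String = VersionInfo.getVersion
```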

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the K8s IT.

This is tested locally like the following.
```
KubernetesSuite:
...
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
...
```

Closes #31604 from dongjoon-hyun/SPARK-34487.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 9942548)
The file was modified resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/DepsTestsSuite.scala (diff)
Commit 94f9617cb486cc56acb880a6968def9cfbb8afac by srowen
[SPARK-34129][SQL] Add table name to LogicalRelation.simpleString

### What changes were proposed in this pull request?

This PR adds the table name to `LogicalRelation.simpleString`.

### Why are the changes needed?

Make the optimized logical plan more readable.

Before this PR:
```
== Optimized Logical Plan ==
Project [i_item_sk#7 AS ss_item_sk#162], Statistics(sizeInBytes=8.07E+27 B)
+- Join Inner, (((i_brand_id#14 = brand_id#159) AND (i_class_id#16 = class_id#160)) AND (i_category_id#18 = category_id#161)), Statistics(sizeInBytes=2.42E+28 B)
   :- Project [i_item_sk#7, i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=8.5 MiB, rowCount=3.69E+5)
   :  +- Filter ((isnotnull(i_brand_id#14) AND isnotnull(i_class_id#16)) AND isnotnull(i_category_id#18)), Statistics(sizeInBytes=150.0 MiB, rowCount=3.69E+5)
   :     +- Relation[i_item_sk#7,i_item_id#8,i_rec_start_date#9,i_rec_end_date#10,i_item_desc#11,i_current_price#12,i_wholesale_cost#13,i_brand_id#14,i_brand#15,i_class_id#16,i_class#17,i_category_id#18,i_category#19,i_manufact_id#20,i_manufact#21,i_size#22,i_formulation#23,i_color#24,i_units#25,i_container#26,i_manager_id#27,i_product_name#28] parquet, Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
   +- Aggregate [brand_id#159, class_id#160, category_id#161], [brand_id#159, class_id#160, category_id#161], Statistics(sizeInBytes=2.73E+21 B)
      +- Aggregate [brand_id#159, class_id#160, category_id#161], [brand_id#159, class_id#160, category_id#161], Statistics(sizeInBytes=2.73E+21 B)
         +- Join LeftSemi, (((brand_id#159 <=> i_brand_id#14) AND (class_id#160 <=> i_class_id#16)) AND (category_id#161 <=> i_category_id#18)), Statistics(sizeInBytes=2.73E+21 B)
            :- Join LeftSemi, (((brand_id#159 <=> i_brand_id#14) AND (class_id#160 <=> i_class_id#16)) AND (category_id#161 <=> i_category_id#18)), Statistics(sizeInBytes=2.73E+21 B)
            :  :- Project [i_brand_id#14 AS brand_id#159, i_class_id#16 AS class_id#160, i_category_id#18 AS category_id#161], Statistics(sizeInBytes=2.73E+21 B)
            :  :  +- Join Inner, (ss_sold_date_sk#51 = d_date_sk#52), Statistics(sizeInBytes=3.83E+21 B)
            :  :     :- Project [ss_sold_date_sk#51, i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=387.3 PiB)
            :  :     :  +- Join Inner, (ss_item_sk#30 = i_item_sk#7), Statistics(sizeInBytes=516.5 PiB)
            :  :     :     :- Project [ss_item_sk#30, ss_sold_date_sk#51], Statistics(sizeInBytes=61.1 GiB)
            :  :     :     :  +- Filter ((isnotnull(ss_item_sk#30) AND isnotnull(ss_sold_date_sk#51)) AND dynamicpruning#168 [ss_sold_date_sk#51]), Statistics(sizeInBytes=580.6 GiB)
            :  :     :     :     :  +- Project [d_date_sk#52], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
            :  :     :     :     :     +- Filter ((((d_year#58 >= 1999) AND (d_year#58 <= 2001)) AND isnotnull(d_year#58)) AND isnotnull(d_date_sk#52)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
            :  :     :     :     :        +- Relation[d_date_sk#52,d_date_id#53,d_date#54,d_month_seq#55,d_week_seq#56,d_quarter_seq#57,d_year#58,d_dow#59,d_moy#60,d_dom#61,d_qoy#62,d_fy_year#63,d_fy_quarter_seq#64,d_fy_week_seq#65,d_day_name#66,d_quarter_name#67,d_holiday#68,d_weekend#69,d_following_holiday#70,d_first_dom#71,d_last_dom#72,d_same_day_ly#73,d_same_day_lq#74,d_current_day#75,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
            :  :     :     :     +- Relation[ss_sold_time_sk#29,ss_item_sk#30,ss_customer_sk#31,ss_cdemo_sk#32,ss_hdemo_sk#33,ss_addr_sk#34,ss_store_sk#35,ss_promo_sk#36,ss_ticket_number#37L,ss_quantity#38,ss_wholesale_cost#39,ss_list_price#40,ss_sales_price#41,ss_ext_discount_amt#42,ss_ext_sales_price#43,ss_ext_wholesale_cost#44,ss_ext_list_price#45,ss_ext_tax#46,ss_coupon_amt#47,ss_net_paid#48,ss_net_paid_inc_tax#49,ss_net_profit#50,ss_sold_date_sk#51] parquet, Statistics(sizeInBytes=580.6 GiB)
            :  :     :     +- Project [i_item_sk#7, i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=8.5 MiB, rowCount=3.69E+5)
            :  :     :        +- Filter (((isnotnull(i_brand_id#14) AND isnotnull(i_class_id#16)) AND isnotnull(i_category_id#18)) AND isnotnull(i_item_sk#7)), Statistics(sizeInBytes=150.0 MiB, rowCount=3.69E+5)
            :  :     :           +- Relation[i_item_sk#7,i_item_id#8,i_rec_start_date#9,i_rec_end_date#10,i_item_desc#11,i_current_price#12,i_wholesale_cost#13,i_brand_id#14,i_brand#15,i_class_id#16,i_class#17,i_category_id#18,i_category#19,i_manufact_id#20,i_manufact#21,i_size#22,i_formulation#23,i_color#24,i_units#25,i_container#26,i_manager_id#27,i_product_name#28] parquet, Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
            :  :     +- Project [d_date_sk#52], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
            :  :        +- Filter ((((d_year#58 >= 1999) AND (d_year#58 <= 2001)) AND isnotnull(d_year#58)) AND isnotnull(d_date_sk#52)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
            :  :           +- Relation[d_date_sk#52,d_date_id#53,d_date#54,d_month_seq#55,d_week_seq#56,d_quarter_seq#57,d_year#58,d_dow#59,d_moy#60,d_dom#61,d_qoy#62,d_fy_year#63,d_fy_quarter_seq#64,d_fy_week_seq#65,d_day_name#66,d_quarter_name#67,d_holiday#68,d_weekend#69,d_following_holiday#70,d_first_dom#71,d_last_dom#72,d_same_day_ly#73,d_same_day_lq#74,d_current_day#75,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
            :  +- Aggregate [i_brand_id#14, i_class_id#16, i_category_id#18], [i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=1414.2 EiB)
            :     +- Project [i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=1414.2 EiB)
            :        +- Join Inner, (cs_sold_date_sk#113 = d_date_sk#52), Statistics(sizeInBytes=1979.9 EiB)
            :           :- Project [cs_sold_date_sk#113, i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=231.1 PiB)
            :           :  +- Join Inner, (cs_item_sk#94 = i_item_sk#7), Statistics(sizeInBytes=308.2 PiB)
            :           :     :- Project [cs_item_sk#94, cs_sold_date_sk#113], Statistics(sizeInBytes=36.2 GiB)
            :           :     :  +- Filter ((isnotnull(cs_item_sk#94) AND isnotnull(cs_sold_date_sk#113)) AND dynamicpruning#169 [cs_sold_date_sk#113]), Statistics(sizeInBytes=470.5 GiB)
            :           :     :     :  +- Project [d_date_sk#52], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
            :           :     :     :     +- Filter ((((d_year#58 >= 1999) AND (d_year#58 <= 2001)) AND isnotnull(d_year#58)) AND isnotnull(d_date_sk#52)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
            :           :     :     :        +- Relation[d_date_sk#52,d_date_id#53,d_date#54,d_month_seq#55,d_week_seq#56,d_quarter_seq#57,d_year#58,d_dow#59,d_moy#60,d_dom#61,d_qoy#62,d_fy_year#63,d_fy_quarter_seq#64,d_fy_week_seq#65,d_day_name#66,d_quarter_name#67,d_holiday#68,d_weekend#69,d_following_holiday#70,d_first_dom#71,d_last_dom#72,d_same_day_ly#73,d_same_day_lq#74,d_current_day#75,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
            :           :     :     +- Relation[cs_sold_time_sk#80,cs_ship_date_sk#81,cs_bill_customer_sk#82,cs_bill_cdemo_sk#83,cs_bill_hdemo_sk#84,cs_bill_addr_sk#85,cs_ship_customer_sk#86,cs_ship_cdemo_sk#87,cs_ship_hdemo_sk#88,cs_ship_addr_sk#89,cs_call_center_sk#90,cs_catalog_page_sk#91,cs_ship_mode_sk#92,cs_warehouse_sk#93,cs_item_sk#94,cs_promo_sk#95,cs_order_number#96L,cs_quantity#97,cs_wholesale_cost#98,cs_list_price#99,cs_sales_price#100,cs_ext_discount_amt#101,cs_ext_sales_price#102,cs_ext_wholesale_cost#103,... 10 more fields] parquet, Statistics(sizeInBytes=470.5 GiB)
            :           :     +- Project [i_item_sk#7, i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=8.5 MiB, rowCount=3.72E+5)
            :           :        +- Filter isnotnull(i_item_sk#7), Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
            :           :           +- Relation[i_item_sk#7,i_item_id#8,i_rec_start_date#9,i_rec_end_date#10,i_item_desc#11,i_current_price#12,i_wholesale_cost#13,i_brand_id#14,i_brand#15,i_class_id#16,i_class#17,i_category_id#18,i_category#19,i_manufact_id#20,i_manufact#21,i_size#22,i_formulation#23,i_color#24,i_units#25,i_container#26,i_manager_id#27,i_product_name#28] parquet, Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
            :           +- Project [d_date_sk#52], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
            :              +- Filter ((((d_year#58 >= 1999) AND (d_year#58 <= 2001)) AND isnotnull(d_year#58)) AND isnotnull(d_date_sk#52)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
            :                 +- Relation[d_date_sk#52,d_date_id#53,d_date#54,d_month_seq#55,d_week_seq#56,d_quarter_seq#57,d_year#58,d_dow#59,d_moy#60,d_dom#61,d_qoy#62,d_fy_year#63,d_fy_quarter_seq#64,d_fy_week_seq#65,d_day_name#66,d_quarter_name#67,d_holiday#68,d_weekend#69,d_following_holiday#70,d_first_dom#71,d_last_dom#72,d_same_day_ly#73,d_same_day_lq#74,d_current_day#75,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
            +- Aggregate [i_brand_id#14, i_class_id#16, i_category_id#18], [i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=650.5 EiB)
               +- Project [i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=650.5 EiB)
                  +- Join Inner, (ws_sold_date_sk#147 = d_date_sk#52), Statistics(sizeInBytes=910.6 EiB)
                     :- Project [ws_sold_date_sk#147, i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=106.3 PiB)
                     :  +- Join Inner, (ws_item_sk#116 = i_item_sk#7), Statistics(sizeInBytes=141.7 PiB)
                     :     :- Project [ws_item_sk#116, ws_sold_date_sk#147], Statistics(sizeInBytes=16.6 GiB)
                     :     :  +- Filter ((isnotnull(ws_item_sk#116) AND isnotnull(ws_sold_date_sk#147)) AND dynamicpruning#170 [ws_sold_date_sk#147]), Statistics(sizeInBytes=216.4 GiB)
                     :     :     :  +- Project [d_date_sk#52], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
                     :     :     :     +- Filter ((((d_year#58 >= 1999) AND (d_year#58 <= 2001)) AND isnotnull(d_year#58)) AND isnotnull(d_date_sk#52)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
                     :     :     :        +- Relation[d_date_sk#52,d_date_id#53,d_date#54,d_month_seq#55,d_week_seq#56,d_quarter_seq#57,d_year#58,d_dow#59,d_moy#60,d_dom#61,d_qoy#62,d_fy_year#63,d_fy_quarter_seq#64,d_fy_week_seq#65,d_day_name#66,d_quarter_name#67,d_holiday#68,d_weekend#69,d_following_holiday#70,d_first_dom#71,d_last_dom#72,d_same_day_ly#73,d_same_day_lq#74,d_current_day#75,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
                     :     :     +- Relation[ws_sold_time_sk#114,ws_ship_date_sk#115,ws_item_sk#116,ws_bill_customer_sk#117,ws_bill_cdemo_sk#118,ws_bill_hdemo_sk#119,ws_bill_addr_sk#120,ws_ship_customer_sk#121,ws_ship_cdemo_sk#122,ws_ship_hdemo_sk#123,ws_ship_addr_sk#124,ws_web_page_sk#125,ws_web_site_sk#126,ws_ship_mode_sk#127,ws_warehouse_sk#128,ws_promo_sk#129,ws_order_number#130L,ws_quantity#131,ws_wholesale_cost#132,ws_list_price#133,ws_sales_price#134,ws_ext_discount_amt#135,ws_ext_sales_price#136,ws_ext_wholesale_cost#137,... 10 more fields] parquet, Statistics(sizeInBytes=216.4 GiB)
                     :     +- Project [i_item_sk#7, i_brand_id#14, i_class_id#16, i_category_id#18], Statistics(sizeInBytes=8.5 MiB, rowCount=3.72E+5)
                     :        +- Filter isnotnull(i_item_sk#7), Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
                     :           +- Relation[i_item_sk#7,i_item_id#8,i_rec_start_date#9,i_rec_end_date#10,i_item_desc#11,i_current_price#12,i_wholesale_cost#13,i_brand_id#14,i_brand#15,i_class_id#16,i_class#17,i_category_id#18,i_category#19,i_manufact_id#20,i_manufact#21,i_size#22,i_formulation#23,i_color#24,i_units#25,i_container#26,i_manager_id#27,i_product_name#28] parquet, Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
                     +- Project [d_date_sk#52], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
                        +- Filter ((((d_year#58 >= 1999) AND (d_year#58 <= 2001)) AND isnotnull(d_year#58)) AND isnotnull(d_date_sk#52)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
                           +- Relation[d_date_sk#52,d_date_id#53,d_date#54,d_month_seq#55,d_week_seq#56,d_quarter_seq#57,d_year#58,d_dow#59,d_moy#60,d_dom#61,d_qoy#62,d_fy_year#63,d_fy_quarter_seq#64,d_fy_week_seq#65,d_day_name#66,d_quarter_name#67,d_holiday#68,d_weekend#69,d_following_holiday#70,d_first_dom#71,d_last_dom#72,d_same_day_ly#73,d_same_day_lq#74,d_current_day#75,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
```

After this PR:
```
== Optimized Logical Plan ==
Project [i_item_sk#9 AS ss_item_sk#3], Statistics(sizeInBytes=8.07E+27 B)
+- Join Inner, (((i_brand_id#16 = brand_id#0) AND (i_class_id#18 = class_id#1)) AND (i_category_id#20 = category_id#2)), Statistics(sizeInBytes=2.42E+28 B)
   :- Project [i_item_sk#9, i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=8.5 MiB, rowCount=3.69E+5)
   :  +- Filter ((isnotnull(i_brand_id#16) AND isnotnull(i_class_id#18)) AND isnotnull(i_category_id#20)), Statistics(sizeInBytes=150.0 MiB, rowCount=3.69E+5)
   :     +- Relation tpcds5t.item[i_item_sk#9,i_item_id#10,i_rec_start_date#11,i_rec_end_date#12,i_item_desc#13,i_current_price#14,i_wholesale_cost#15,i_brand_id#16,i_brand#17,i_class_id#18,i_class#19,i_category_id#20,i_category#21,i_manufact_id#22,i_manufact#23,i_size#24,i_formulation#25,i_color#26,i_units#27,i_container#28,i_manager_id#29,i_product_name#30] parquet, Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
   +- Aggregate [brand_id#0, class_id#1, category_id#2], [brand_id#0, class_id#1, category_id#2], Statistics(sizeInBytes=2.73E+21 B)
      +- Aggregate [brand_id#0, class_id#1, category_id#2], [brand_id#0, class_id#1, category_id#2], Statistics(sizeInBytes=2.73E+21 B)
         +- Join LeftSemi, (((brand_id#0 <=> i_brand_id#16) AND (class_id#1 <=> i_class_id#18)) AND (category_id#2 <=> i_category_id#20)), Statistics(sizeInBytes=2.73E+21 B)
            :- Join LeftSemi, (((brand_id#0 <=> i_brand_id#16) AND (class_id#1 <=> i_class_id#18)) AND (category_id#2 <=> i_category_id#20)), Statistics(sizeInBytes=2.73E+21 B)
            :  :- Project [i_brand_id#16 AS brand_id#0, i_class_id#18 AS class_id#1, i_category_id#20 AS category_id#2], Statistics(sizeInBytes=2.73E+21 B)
            :  :  +- Join Inner, (ss_sold_date_sk#53 = d_date_sk#54), Statistics(sizeInBytes=3.83E+21 B)
            :  :     :- Project [ss_sold_date_sk#53, i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=387.3 PiB)
            :  :     :  +- Join Inner, (ss_item_sk#32 = i_item_sk#9), Statistics(sizeInBytes=516.5 PiB)
            :  :     :     :- Project [ss_item_sk#32, ss_sold_date_sk#53], Statistics(sizeInBytes=61.1 GiB)
            :  :     :     :  +- Filter ((isnotnull(ss_item_sk#32) AND isnotnull(ss_sold_date_sk#53)) AND dynamicpruning#150 [ss_sold_date_sk#53]), Statistics(sizeInBytes=580.6 GiB)
            :  :     :     :     :  +- Project [d_date_sk#54], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
            :  :     :     :     :     +- Filter ((((d_year#60 >= 1999) AND (d_year#60 <= 2001)) AND isnotnull(d_year#60)) AND isnotnull(d_date_sk#54)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
            :  :     :     :     :        +- Relation tpcds5t.date_dim[d_date_sk#54,d_date_id#55,d_date#56,d_month_seq#57,d_week_seq#58,d_quarter_seq#59,d_year#60,d_dow#61,d_moy#62,d_dom#63,d_qoy#64,d_fy_year#65,d_fy_quarter_seq#66,d_fy_week_seq#67,d_day_name#68,d_quarter_name#69,d_holiday#70,d_weekend#71,d_following_holiday#72,d_first_dom#73,d_last_dom#74,d_same_day_ly#75,d_same_day_lq#76,d_current_day#77,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
            :  :     :     :     +- Relation tpcds5t.store_sales[ss_sold_time_sk#31,ss_item_sk#32,ss_customer_sk#33,ss_cdemo_sk#34,ss_hdemo_sk#35,ss_addr_sk#36,ss_store_sk#37,ss_promo_sk#38,ss_ticket_number#39L,ss_quantity#40,ss_wholesale_cost#41,ss_list_price#42,ss_sales_price#43,ss_ext_discount_amt#44,ss_ext_sales_price#45,ss_ext_wholesale_cost#46,ss_ext_list_price#47,ss_ext_tax#48,ss_coupon_amt#49,ss_net_paid#50,ss_net_paid_inc_tax#51,ss_net_profit#52,ss_sold_date_sk#53] parquet, Statistics(sizeInBytes=580.6 GiB)
            :  :     :     +- Project [i_item_sk#9, i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=8.5 MiB, rowCount=3.69E+5)
            :  :     :        +- Filter (((isnotnull(i_brand_id#16) AND isnotnull(i_class_id#18)) AND isnotnull(i_category_id#20)) AND isnotnull(i_item_sk#9)), Statistics(sizeInBytes=150.0 MiB, rowCount=3.69E+5)
            :  :     :           +- Relation tpcds5t.item[i_item_sk#9,i_item_id#10,i_rec_start_date#11,i_rec_end_date#12,i_item_desc#13,i_current_price#14,i_wholesale_cost#15,i_brand_id#16,i_brand#17,i_class_id#18,i_class#19,i_category_id#20,i_category#21,i_manufact_id#22,i_manufact#23,i_size#24,i_formulation#25,i_color#26,i_units#27,i_container#28,i_manager_id#29,i_product_name#30] parquet, Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
            :  :     +- Project [d_date_sk#54], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
            :  :        +- Filter ((((d_year#60 >= 1999) AND (d_year#60 <= 2001)) AND isnotnull(d_year#60)) AND isnotnull(d_date_sk#54)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
            :  :           +- Relation tpcds5t.date_dim[d_date_sk#54,d_date_id#55,d_date#56,d_month_seq#57,d_week_seq#58,d_quarter_seq#59,d_year#60,d_dow#61,d_moy#62,d_dom#63,d_qoy#64,d_fy_year#65,d_fy_quarter_seq#66,d_fy_week_seq#67,d_day_name#68,d_quarter_name#69,d_holiday#70,d_weekend#71,d_following_holiday#72,d_first_dom#73,d_last_dom#74,d_same_day_ly#75,d_same_day_lq#76,d_current_day#77,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
            :  +- Aggregate [i_brand_id#16, i_class_id#18, i_category_id#20], [i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=1414.2 EiB)
            :     +- Project [i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=1414.2 EiB)
            :        +- Join Inner, (cs_sold_date_sk#115 = d_date_sk#54), Statistics(sizeInBytes=1979.9 EiB)
            :           :- Project [cs_sold_date_sk#115, i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=231.1 PiB)
            :           :  +- Join Inner, (cs_item_sk#96 = i_item_sk#9), Statistics(sizeInBytes=308.2 PiB)
            :           :     :- Project [cs_item_sk#96, cs_sold_date_sk#115], Statistics(sizeInBytes=36.2 GiB)
            :           :     :  +- Filter ((isnotnull(cs_item_sk#96) AND isnotnull(cs_sold_date_sk#115)) AND dynamicpruning#151 [cs_sold_date_sk#115]), Statistics(sizeInBytes=470.5 GiB)
            :           :     :     :  +- Project [d_date_sk#54], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
            :           :     :     :     +- Filter ((((d_year#60 >= 1999) AND (d_year#60 <= 2001)) AND isnotnull(d_year#60)) AND isnotnull(d_date_sk#54)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
            :           :     :     :        +- Relation tpcds5t.date_dim[d_date_sk#54,d_date_id#55,d_date#56,d_month_seq#57,d_week_seq#58,d_quarter_seq#59,d_year#60,d_dow#61,d_moy#62,d_dom#63,d_qoy#64,d_fy_year#65,d_fy_quarter_seq#66,d_fy_week_seq#67,d_day_name#68,d_quarter_name#69,d_holiday#70,d_weekend#71,d_following_holiday#72,d_first_dom#73,d_last_dom#74,d_same_day_ly#75,d_same_day_lq#76,d_current_day#77,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
            :           :     :     +- Relation tpcds5t.catalog_sales[cs_sold_time_sk#82,cs_ship_date_sk#83,cs_bill_customer_sk#84,cs_bill_cdemo_sk#85,cs_bill_hdemo_sk#86,cs_bill_addr_sk#87,cs_ship_customer_sk#88,cs_ship_cdemo_sk#89,cs_ship_hdemo_sk#90,cs_ship_addr_sk#91,cs_call_center_sk#92,cs_catalog_page_sk#93,cs_ship_mode_sk#94,cs_warehouse_sk#95,cs_item_sk#96,cs_promo_sk#97,cs_order_number#98L,cs_quantity#99,cs_wholesale_cost#100,cs_list_price#101,cs_sales_price#102,cs_ext_discount_amt#103,cs_ext_sales_price#104,cs_ext_wholesale_cost#105,... 10 more fields] parquet, Statistics(sizeInBytes=470.5 GiB)
            :           :     +- Project [i_item_sk#9, i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=8.5 MiB, rowCount=3.72E+5)
            :           :        +- Filter isnotnull(i_item_sk#9), Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
            :           :           +- Relation tpcds5t.item[i_item_sk#9,i_item_id#10,i_rec_start_date#11,i_rec_end_date#12,i_item_desc#13,i_current_price#14,i_wholesale_cost#15,i_brand_id#16,i_brand#17,i_class_id#18,i_class#19,i_category_id#20,i_category#21,i_manufact_id#22,i_manufact#23,i_size#24,i_formulation#25,i_color#26,i_units#27,i_container#28,i_manager_id#29,i_product_name#30] parquet, Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
            :           +- Project [d_date_sk#54], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
            :              +- Filter ((((d_year#60 >= 1999) AND (d_year#60 <= 2001)) AND isnotnull(d_year#60)) AND isnotnull(d_date_sk#54)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
            :                 +- Relation tpcds5t.date_dim[d_date_sk#54,d_date_id#55,d_date#56,d_month_seq#57,d_week_seq#58,d_quarter_seq#59,d_year#60,d_dow#61,d_moy#62,d_dom#63,d_qoy#64,d_fy_year#65,d_fy_quarter_seq#66,d_fy_week_seq#67,d_day_name#68,d_quarter_name#69,d_holiday#70,d_weekend#71,d_following_holiday#72,d_first_dom#73,d_last_dom#74,d_same_day_ly#75,d_same_day_lq#76,d_current_day#77,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
            +- Aggregate [i_brand_id#16, i_class_id#18, i_category_id#20], [i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=650.5 EiB)
               +- Project [i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=650.5 EiB)
                  +- Join Inner, (ws_sold_date_sk#149 = d_date_sk#54), Statistics(sizeInBytes=910.6 EiB)
                     :- Project [ws_sold_date_sk#149, i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=106.3 PiB)
                     :  +- Join Inner, (ws_item_sk#118 = i_item_sk#9), Statistics(sizeInBytes=141.7 PiB)
                     :     :- Project [ws_item_sk#118, ws_sold_date_sk#149], Statistics(sizeInBytes=16.6 GiB)
                     :     :  +- Filter ((isnotnull(ws_item_sk#118) AND isnotnull(ws_sold_date_sk#149)) AND dynamicpruning#152 [ws_sold_date_sk#149]), Statistics(sizeInBytes=216.4 GiB)
                     :     :     :  +- Project [d_date_sk#54], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
                     :     :     :     +- Filter ((((d_year#60 >= 1999) AND (d_year#60 <= 2001)) AND isnotnull(d_year#60)) AND isnotnull(d_date_sk#54)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
                     :     :     :        +- Relation tpcds5t.date_dim[d_date_sk#54,d_date_id#55,d_date#56,d_month_seq#57,d_week_seq#58,d_quarter_seq#59,d_year#60,d_dow#61,d_moy#62,d_dom#63,d_qoy#64,d_fy_year#65,d_fy_quarter_seq#66,d_fy_week_seq#67,d_day_name#68,d_quarter_name#69,d_holiday#70,d_weekend#71,d_following_holiday#72,d_first_dom#73,d_last_dom#74,d_same_day_ly#75,d_same_day_lq#76,d_current_day#77,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
                     :     :     +- Relation tpcds5t.web_sales[ws_sold_time_sk#116,ws_ship_date_sk#117,ws_item_sk#118,ws_bill_customer_sk#119,ws_bill_cdemo_sk#120,ws_bill_hdemo_sk#121,ws_bill_addr_sk#122,ws_ship_customer_sk#123,ws_ship_cdemo_sk#124,ws_ship_hdemo_sk#125,ws_ship_addr_sk#126,ws_web_page_sk#127,ws_web_site_sk#128,ws_ship_mode_sk#129,ws_warehouse_sk#130,ws_promo_sk#131,ws_order_number#132L,ws_quantity#133,ws_wholesale_cost#134,ws_list_price#135,ws_sales_price#136,ws_ext_discount_amt#137,ws_ext_sales_price#138,ws_ext_wholesale_cost#139,... 10 more fields] parquet, Statistics(sizeInBytes=216.4 GiB)
                     :     +- Project [i_item_sk#9, i_brand_id#16, i_class_id#18, i_category_id#20], Statistics(sizeInBytes=8.5 MiB, rowCount=3.72E+5)
                     :        +- Filter isnotnull(i_item_sk#9), Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
                     :           +- Relation tpcds5t.item[i_item_sk#9,i_item_id#10,i_rec_start_date#11,i_rec_end_date#12,i_item_desc#13,i_current_price#14,i_wholesale_cost#15,i_brand_id#16,i_brand#17,i_class_id#18,i_class#19,i_category_id#20,i_category#21,i_manufact_id#22,i_manufact#23,i_size#24,i_formulation#25,i_color#26,i_units#27,i_container#28,i_manager_id#29,i_product_name#30] parquet, Statistics(sizeInBytes=151.1 MiB, rowCount=3.72E+5)
                     +- Project [d_date_sk#54], Statistics(sizeInBytes=8.6 KiB, rowCount=731)
                        +- Filter ((((d_year#60 >= 1999) AND (d_year#60 <= 2001)) AND isnotnull(d_year#60)) AND isnotnull(d_date_sk#54)), Statistics(sizeInBytes=175.6 KiB, rowCount=731)
                           +- Relation tpcds5t.date_dim[d_date_sk#54,d_date_id#55,d_date#56,d_month_seq#57,d_week_seq#58,d_quarter_seq#59,d_year#60,d_dow#61,d_moy#62,d_dom#63,d_qoy#64,d_fy_year#65,d_fy_quarter_seq#66,d_fy_week_seq#67,d_day_name#68,d_quarter_name#69,d_holiday#70,d_weekend#71,d_following_holiday#72,d_first_dom#73,d_last_dom#74,d_same_day_ly#75,d_same_day_lq#76,d_current_day#77,... 4 more fields] parquet, Statistics(sizeInBytes=17.1 MiB, rowCount=7.30E+4)
```
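
For reference, a minimal sketch of how such a statistics-annotated plan can be produced; the `tpcds5t` tables come from the PR's test environment, and the session setup below is an assumption, not part of this change:

```scala
// A hedged sketch, not from the PR: print the optimized logical plan with
// Statistics(...) per node, which is where the table name added here appears.
import org.apache.spark.sql.SparkSession

object ExplainCostExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("explain-cost-example")
      .config("spark.sql.cbo.enabled", "true") // enable cost-based optimization
      .getOrCreate()

    // Collect statistics first so plan nodes carry sizeInBytes/rowCount.
    spark.sql("ANALYZE TABLE tpcds5t.item COMPUTE STATISTICS")

    // "cost" mode prints the optimized plan annotated with statistics,
    // similar to the output above.
    spark.table("tpcds5t.item").explain("cost")
  }
}
```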

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit test.

Closes #31196 from wangyum/SPARK-34129.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(commit: 94f9617)
The file was modified sql/core/src/test/resources/sql-tests/results/explain.sql.out (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/explain-aqe.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/explain-cbo.sql.out (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/QueryExecutionSuite.scala (diff)
Commit 03f4cf584526eac3436c8b9035b7c92bb96ed0f1 by dhyun
[SPARK-34029][SQL][TESTS] Add OrcEncryptionSuite and FakeKeyProvider

### What changes were proposed in this pull request?

This is a retry of #31065. Last time, the newly added test cases passed in Jenkins and individually, but the change was reverted because they failed when `GitHub Actions` ran with `SERIAL_SBT_TESTS=1`.

In this PR, the `SecurityTest` tag is used to isolate the `KeyProvider`.

This PR aims to add a basis for a columnar encryption test framework by adding `OrcEncryptionSuite` and `FakeKeyProvider`.

Please note that we will improve this further in both Apache Spark and Apache ORC in the Apache Spark 3.2.0 timeframe.
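
As background, a minimal sketch of how a ScalaTest tag can isolate such suites; `SecurityTestTag` and the suite below are illustrative stand-ins, not the PR's actual classes (the real tag is a Java annotation under `common/tags`, listed in the changed files below):

```scala
// A hedged sketch using ScalaTest's object-based tags; the filtering
// mechanism is the same as with Spark's annotation-based tags.
import org.scalatest.Tag
import org.scalatest.funsuite.AnyFunSuite

object SecurityTestTag extends Tag("org.apache.spark.tags.SecurityTest")

class OrcEncryptionExampleSuite extends AnyFunSuite {
  test("write and read an encrypted ORC file", SecurityTestTag) {
    // Tagged tests can then be included or excluded at run time, e.g. with
    // the ScalaTest runner args -n (include a tag) and -l (exclude a tag).
    assert(1 + 1 == 2)
  }
}
```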

### Why are the changes needed?

Apache ORC 1.6 supports columnar encryption.

### Does this PR introduce _any_ user-facing change?

No. This is for a test case.

### How was this patch tested?

Pass the newly added test suite.

Closes #31603 from dongjoon-hyun/SPARK-34486-RETRY.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 03f4cf5)
The file was added sql/core/src/test/resources/META-INF/services/org.apache.hadoop.crypto.key.KeyProviderFactory
The file was modified project/SparkBuild.scala (diff)
The file was added common/tags/src/test/java/org/apache/spark/tags/SecurityTest.java
The file was added sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcEncryptionSuite.scala
The file was added sql/core/src/test/java/test/org/apache/spark/sql/execution/datasources/orc/FakeKeyProvider.java
The file was modified .github/workflows/build_and_test.yml (diff)
Commit 6ea4b5fda7fd32f78e204e3de466fdc07e47ee89 by wenchen
[SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views

### What changes were proposed in this pull request?
Update the public docs of SQL commands that alter cached tables/views. For instance:
<img width="869" alt="Screenshot 2021-02-08 at 15 11 48" src="https://user-images.githubusercontent.com/1580697/107217940-fd3b8980-6a1f-11eb-98b9-9b2e3fe7f4ef.png">
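
One concrete case the updated pages cover, as a hedged illustration runnable in spark-shell (where `spark` is predefined); the exact per-command cache semantics are in the modified docs listed below:

```scala
// Commands that alter a table also affect its cached data.
spark.sql("CREATE TABLE t (id INT) USING parquet")
spark.sql("CACHE TABLE t")
spark.sql("TRUNCATE TABLE t")       // also invalidates t's cached data
spark.sql("SELECT * FROM t").show() // re-scans the (now empty) table
```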

### Why are the changes needed?
To inform users about the behavior of these commands when they alter cached tables or views.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By running the command below and manually checking the docs:
```
$ SKIP_API=1 SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 SKIP_RDOC=1 jekyll serve --watch
```

Closes #31524 from MaxGekk/doc-cmd-caching.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 6ea4b5f)
The file was modified docs/sql-ref-syntax-ddl-alter-view.md (diff)
The file was modified docs/sql-ref-syntax-ddl-alter-table.md (diff)
The file was modified docs/sql-ref-syntax-dml-load.md (diff)
The file was modified docs/sql-ref-syntax-ddl-repair-table.md (diff)
The file was modified docs/sql-ref-syntax-ddl-truncate-table.md (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala (diff)
The file was modified docs/sql-ref-syntax-ddl-drop-table.md (diff)
Commit a22d20a6ca6e763b3c6011d6019ab92a7f54ea87 by wenchen
[SPARK-34468][SQL] Rename v2 table in place if new name has single part

### What changes were proposed in this pull request?
If the new table name consists of a single part (no namespaces), the v2 `ALTER TABLE .. RENAME TO` command renames the table while keeping it in the same namespace. For example:
```sql
ALTER TABLE catalog_name.ns1.ns2.ns3.ns4.ns5.tbl RENAME TO new_table
```
the command should rename the source table to `catalog_name.ns1.ns2.ns3.ns4.ns5.new_table`. Before the changes, the command moved the table to the "root" namespace, i.e. `catalog_name.new_table`.
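
A hedged end-to-end illustration, runnable in spark-shell; `testcat` is an assumed v2 catalog (e.g. registered via `spark.sql.catalog.testcat`), not something from the PR:

```scala
// After this fix, a single-part new name keeps the table in its namespace.
spark.sql("CREATE TABLE testcat.ns1.ns2.tbl (id INT)")
spark.sql("ALTER TABLE testcat.ns1.ns2.tbl RENAME TO new_tbl")
spark.sql("SHOW TABLES IN testcat.ns1.ns2").show() // now lists `new_tbl`
// Before the fix, the table would have ended up at testcat.new_tbl instead.
```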

### Why are the changes needed?
To match the behavior of the v1 implementation of `ALTER TABLE .. RENAME TO` and of other DBMSs.

### Does this PR introduce _any_ user-facing change?
Yes

### How was this patch tested?
By running new test:
```
$ build/sbt "sql/test:testOnly *DataSourceV2SQLSuite"
```

Closes #31594 from MaxGekk/rename-table-single-part.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: a22d20a)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/RenameTableExec.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala (diff)
Commit 38fbe560fd08168e90c575f7707368ddf758c3a9 by wenchen
[SPARK-34167][SQL] Reading parquet with IntDecimal written as a LongDecimal blows up

### What changes were proposed in this pull request?
If an IntDecimal type was written as a LongDecimal in a Parquet file, Spark should read it as a long from `VectorizedValuesReader` but write it to the `WritableColumnVector` as an int, by down-casting it and calling the appropriate method. `readLongs` has been modified to take a boolean flag that tells it whether the number fits in a 32-bit decimal, so the value can be downsized accordingly.
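
A minimal sketch of the narrowing idea, with hypothetical names; the actual change lives in the Java vectorized readers listed in the changed files below:

```scala
// Hedged sketch: values physically stored as INT64 but logically declared as a
// 32-bit decimal are read as longs and down-cast before landing in the
// int-backed column vector.
def readLongsAsInts(values: Array[Long], target: Array[Int], fitsIn32Bits: Boolean): Unit = {
  var i = 0
  while (i < values.length) {
    if (fitsIn32Bits) {
      target(i) = values(i).toInt // safe: the decimal's precision guarantees it fits
    } else {
      sys.error(s"value ${values(i)} does not fit in a 32-bit decimal")
    }
    i += 1
  }
}
```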

### Why are the changes needed?
If a Parquet file writes an IntDecimal as a LongDecimal, which is permitted by the Parquet spec, Spark will not be able to read it and will throw an exception. The reason this happens is that `readLong` tries to write the long to a `WritableColumnVector` that has been initialized to accept only ints, which leads to a `NullPointerException`.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually tested and added a unit test.

Closes #31284 from razajafri/decimal_fix.

Authored-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 38fbe56)
The file was modified sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java (diff)
The file was added sql/core/src/test/resources/test-data/decimal32-written-as-64-bit-dict.snappy.parquet
The file was added sql/core/src/test/resources/test-data/decimal32-written-as-64-bit.snappy.parquet
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala (diff)
The file was modified sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetDictionary.java (diff)
The file was modified sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java (diff)
Commit 2fb5f21b1ec63a54a8d9b9b0fce8268aa81c88a7 by gurwls223
[SPARK-34495][TESTS] Add `DedicatedJVMTest` test tag

### What changes were proposed in this pull request?

This PR aims to add a test tag, `DedicatedJVMTest`, and replaces `SecurityTest` with it.

### Why are the changes needed?

To have a reusable general test tag.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #31607 from dongjoon-hyun/SPARK-34495.

Lead-authored-by: Dongjoon Hyun <dhyun@apple.com>
Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 2fb5f21)
The file was removed common/tags/src/test/java/org/apache/spark/tags/SecurityTest.java
The file was added common/tags/src/test/java/org/apache/spark/tags/DedicatedJVMTest.java
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcEncryptionSuite.scala (diff)
The file was modified .github/workflows/build_and_test.yml (diff)
Commit 23a5996a4619262030d6f50fe035f4586d1a3f24 by wenchen
[SPARK-34450][SQL][TESTS] Unify v1 and v2 ALTER TABLE .. RENAME tests

### What changes were proposed in this pull request?
1. Move parser tests from `DDLParserSuite` to `AlterTableRenameParserSuite`.
2. Port DS v1 tests from `DDLSuite` and other test suites to `v1.AlterTableRenameBase` and to `v1.AlterTableRenameSuite`.
3. Add a test for DSv2 `ALTER TABLE .. RENAME` to `v2.AlterTableRenameSuite`.

### Why are the changes needed?
To improve test coverage.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By running new test suites:
```
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *AlterTableRenameSuite"
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *AlterTableRenameParserSuite"
```

Closes #31575 from MaxGekk/unify-rename-table-tests.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 23a5996)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DDLParserSuite.scala (diff)
The file was modified sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala (diff)
The file was added sql/core/src/test/scala/org/apache/spark/sql/execution/command/AlterTableRenameSuiteBase.scala
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/command/DropTableSuiteBase.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/AlterTableRecoverPartitionsSuite.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLCommandTestUtils.scala (diff)
The file was added sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/AlterTableRenameSuite.scala
The file was added sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableRenameSuite.scala
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/command/TruncateTableSuiteBase.scala (diff)
The file was added sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/AlterTableRenameSuite.scala
The file was added sql/core/src/test/scala/org/apache/spark/sql/execution/command/AlterTableRenameParserSuite.scala
The file was modified sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala (diff)
Commit 9767041153eec4c79c197662030f399f88b64d3e by wenchen
[SPARK-34432][SQL][TESTS] Add JavaSimpleWritableDataSource

### What changes were proposed in this pull request?

This is a follow-up of https://github.com/apache/spark/pull/19269

In #19269, there is only a Scala implementation of a simple writable data source in `DataSourceV2Suite`.

This PR adds a Java implementation of it.

### Why are the changes needed?

To improve test coverage.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing test suites.

Closes #31560 from kevincmchen/SPARK-34432.

Lead-authored-by: kevincmchen <kevincmchen@tencent.com>
Co-authored-by: Kevin Pis <68981916+kevincmchen@users.noreply.github.com>
Co-authored-by: Kevin Pis <kc4163568@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 9767041)
The file was modified sql/core/src/test/java/test/org/apache/spark/sql/connector/JavaSimpleBatchTable.java (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2Suite.scala (diff)
The file was added sql/core/src/test/java/test/org/apache/spark/sql/connector/JavaSimpleWritableDataSource.java
Commit a6a82c8e69eedb4ab83c99fb71772553a0ea4e84 by yumwang
[MINOR][DOCS] Add table_identifier in sql-migration-guide for SHOW CREATE TABLE

### What changes were proposed in this pull request?
Add `table_identifier` in sql-migration-guide for SHOW CREATE TABLE.

### Why are the changes needed?
To make the document more readable.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing test suites.

Closes #31608 from Karl-WangSK/sqldoc.

Lead-authored-by: Karl-WangSK <shikai.wang@linkflowtech.com>
Co-authored-by: ShiKai Wang <wskqing@gmail.com>
Signed-off-by: Yuming Wang <yumwang@ebay.com>
(commit: a6a82c8)
The file was modified docs/sql-migration-guide.md (diff)
Commit 02c784ca686fc675b63ce37f03215bc6c2fec869 by wenchen
[SPARK-34473][SQL] Avoid NPE in DataFrameReader.schema(StructType)

### What changes were proposed in this pull request?

This fixes a regression in `DataFrameReader.schema(StructType)` to avoid an NPE if the given `StructType` is null. Note that passing null to Spark public APIs leads to undefined behavior: there is no documentation of the null behavior, and it's just an accident that `DataFrameReader.schema(StructType)` worked before. So I think this is not a 3.1 blocker.
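
A simplified sketch of the guard; the PR's actual method also handles char/varchar replacement, and the class and field names here are illustrative, not the exact source:

```scala
import org.apache.spark.sql.types.StructType

// Hedged sketch: treat a null schema as a no-op instead of dereferencing it.
class ReaderLike {
  private var userSpecifiedSchema: Option[StructType] = None

  def schema(schema: StructType): this.type = {
    if (schema != null) { // the 3.1 regression dereferenced `schema` unconditionally
      userSpecifiedSchema = Some(schema)
    }
    this
  }
}
```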

### Why are the changes needed?

It fixes a 3.1 regression.

### Does this PR introduce _any_ user-facing change?

Yes. Now `spark.read.schema(null: StructType)` is a no-op as before, while on the current branch-3.1 it throws an NPE.

### How was this patch tested?

It's undefined behavior and the fix is very obvious, so I didn't add a test. We should add tests once we clearly define and fix the null behavior for all public APIs.

Closes #31593 from cloud-fan/minor.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 02c784c)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala (diff)
Commit 0bccf1664f7169ec71027355dcb58a2f229393b1 by gurwls223
[SPARK-34496][BUILD] Upgrade ZSTD-JNI to 1.4.8-5 for better API compatibility

### What changes were proposed in this pull request?

This PR aims to upgrade ZSTD-JNI to 1.4.8-5 for better API compatibility.

### Why are the changes needed?

Previously, we upgraded ZSTD-JNI for its performance improvements. Also, the `Apache Spark`/`Apache Parquet`/`Apache Avro` master branches are using ZSTD-JNI 1.4.8-x.

This PR upgrades to a new minor version for better API compatibility:
- https://github.com/luben/zstd-jni/commit/def1860c6f454ca50bf360bb0da133545daeed31
- https://github.com/luben/zstd-jni/commit/188c803044249409ee234f7e987cfdd712a567fd

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs

Closes #31609 from dongjoon-hyun/SPARK-34496.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 0bccf16)
The file was modified dev/deps/spark-deps-hadoop-3.2-hive-2.3 (diff)
The file was modified dev/deps/spark-deps-hadoop-2.7-hive-2.3 (diff)
The file was modified pom.xml (diff)
Commit 7df4fed420c1164e5e025ebce525850c025d9400 by wenchen
[MINOR][SQL] Fix the comment for CalendarIntervalType about comparability

### What changes were proposed in this pull request?
In this PR, I propose to partially revert https://github.com/apache/spark/pull/26659 with regard to the comparability of interval values. The comment became incorrect after https://github.com/apache/spark/pull/27262.
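
For context, a hedged illustration of the non-comparability the corrected comment describes, runnable in spark-shell; the exact error text may differ by version:

```scala
// Ordering comparisons on calendar intervals are rejected at analysis time.
spark.sql("SELECT INTERVAL 1 DAY > INTERVAL 23 HOUR").show()
// throws org.apache.spark.sql.AnalysisException: ... data type mismatch ...
```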

### Why are the changes needed?
The comment is incorrect and might confuse Spark devs/users.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By checking the Scala coding style via `./dev/scalastyle`.

Closes #31610 from MaxGekk/doc-interval-not-comparable.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 7df4fed)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala (diff)