Changes

Summary

  1. [SPARK-33641][SQL][DOC][FOLLOW-UP] Add migration guide for CHAR VARCHAR (commit: c88edda) (details)
  2. [SPARK-33669] Wrong error message from YARN application state monitor (commit: 48f93af) (details)
  3. [SPARK-33655][SQL] Improve performance of processing FETCH_PRIOR (commit: a713a7e) (details)
  4. [SPARK-33719][DOC] Add make_date/make_timestamp/make_interval into the (commit: 9959d49) (details)
  5. [SPARK-33071][SPARK-33536][SQL][FOLLOW-UP] Rename deniedMetadataKeys to (commit: b5399d4) (details)
  6. [SPARK-33722][SQL] Handle DELETE in ReplaceNullWithFalseInPredicate (commit: fa9ce1d) (details)
  7. [SPARK-33725][BUILD] Upgrade snappy-java to 1.1.8.2 (commit: 667f64f) (details)
  8. [SPARK-33727][K8S] Fall back from gnupg.net to openpgp.org (commit: 991b797) (details)
  9. [SPARK-33724][K8S] Add decom script as a configuration param (commit: 1c7f5f1) (details)
  10. [SPARK-33558][SQL][TESTS] Unify v1 and v2 ALTER TABLE .. ADD PARTITION (commit: af37c7f) (details)
  11. [SPARK-33714][SQL] Migrate ALTER VIEW ... SET/UNSET TBLPROPERTIES (commit: b112e2b) (details)
  12. [SPARK-33732][K8S][TESTS] Kubernetes integration tests doesn't work with (commit: 795db05) (details)
  13. [SPARK-32670][SQL][FOLLOWUP] Group exception messages in Catalyst (commit: cef28c2) (details)
  14. [SPARK-33692][SQL] View should use captured catalog and namespace to (commit: 1554977) (details)
Commit c88eddac3bf860d04bba91fc913f8b2069a94153 by wenchen
[SPARK-33641][SQL][DOC][FOLLOW-UP] Add migration guide for CHAR VARCHAR types

### What changes were proposed in this pull request?

Add migration guide for CHAR VARCHAR types

### Why are the changes needed?

for migration

### Does this PR introduce _any_ user-facing change?

doc change

### How was this patch tested?

passing ci

Closes #30654 from yaooqinn/SPARK-33641-F.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: c88edda)
The file was modified docs/sql-migration-guide.md (diff)
Commit 48f93af9f3d40de5bf087eb1a06c1b9954b2ad76 by mridulatgmail.com
[SPARK-33669] Wrong error message from YARN application state monitor when sc.stop in yarn client mode

### What changes were proposed in this pull request?
This change makes `InterruptedIOException` be treated as `InterruptedException` when closing `YarnClientSchedulerBackend`, so the error "YARN application has exited unexpectedly xxx" is no longer logged.

### Why are the changes needed?
In YARN client mode, when stopping `YarnClientSchedulerBackend`, it first tries to interrupt the YARN application monitor thread. In `MonitorThread.run()`, it catches `InterruptedException` to respond gracefully to the stopping request.

But the `client.monitorApplication` method can also throw `InterruptedIOException` while a Hadoop RPC call is in flight. In that case, `MonitorThread` does not know it was interrupted, so a YARN application failure is reported and "Failed to contact YARN for application xxxxx;  YARN application has exited unexpectedly with state xxxxx" is logged at error level, which confuses users.
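
For illustration, a minimal sketch of treating `InterruptedIOException` the same way as `InterruptedException`; this assumes a simplified monitor thread, and the real `YarnClientSchedulerBackend`/`MonitorThread` code differs:

```scala
import java.io.InterruptedIOException

class MonitorThreadSketch(monitorApplication: () => Unit) extends Thread {
  override def run(): Unit = {
    try {
      // may throw InterruptedIOException from inside a Hadoop RPC call
      monitorApplication()
    } catch {
      // Treat both exceptions as the normal response to being interrupted during
      // sc.stop(), instead of reporting "YARN application has exited unexpectedly".
      case _: InterruptedException | _: InterruptedIOException =>
        println("Interrupted while monitoring the YARN application; shutting down quietly.")
    }
  }
}
```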

### Does this PR introduce _any_ user-facing change?
Yes

### How was this patch tested?
very simple patch, seems no need?

Closes #30617 from sqlwindspeaker/yarn-client-interrupt-monitor.

Authored-by: suqilong <suqilong@qiyi.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
(commit: 48f93af)
The file was modified resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala (diff)
The file was modified resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala (diff)
Commit a713a7eee3e7f76df6210a6e215ffc0d67ec71f2 by gurwls223
[SPARK-33655][SQL] Improve performance of processing FETCH_PRIOR

### What changes were proposed in this pull request?
Currently, when a client sends FETCH_PRIOR to Thriftserver, Thriftserver re-iterates from the start position. Because Thriftserver caches a query result in an array when the THRIFTSERVER_INCREMENTAL_COLLECT feature is off, FETCH_PRIOR can be implemented without re-iterating over the result. A trait `FetchIterator` is added to separate the implementations for an iterator and an array. `FetchIterator` also supports moving the cursor to an absolute position, which will be useful for implementing FETCH_RELATIVE and FETCH_ABSOLUTE.
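
A simplified sketch of the idea (names and signatures here are illustrative, not the actual Thriftserver API): when the result is cached in an array, FETCH_PRIOR can reposition the cursor directly instead of re-iterating from the start.

```scala
trait FetchLikeIterator[A] extends Iterator[A] {
  def fetchAbsolute(pos: Long): Unit   // move the cursor to an absolute position
  def fetchPrior(n: Long): Unit        // move the cursor back n rows
}

// Array-backed variant: repositioning is O(1) because all rows are already cached.
class ArrayFetchIterator[A](rows: Array[A]) extends FetchLikeIterator[A] {
  private var cursor = 0L
  override def hasNext: Boolean = cursor < rows.length
  override def next(): A = { val r = rows(cursor.toInt); cursor += 1; r }
  override def fetchAbsolute(pos: Long): Unit =
    cursor = math.max(0L, math.min(pos, rows.length.toLong))
  override def fetchPrior(n: Long): Unit = fetchAbsolute(cursor - n)
}
```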

### Why are the changes needed?
For better performance of Thriftserver.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
FetchIteratorSuite

Closes #30600 from Dooyoung-Hwang/refactor_with_fetch_iterator.

Authored-by: Dooyoung Hwang <dooyoung.hwang@sk.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: a713a7e)
The file was added sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/FetchIterator.scala
The file was added sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/FetchIteratorSuite.scala
The file was modified sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala (diff)
Commit 9959d49942d334b03a05c43299f3949a48e5fa17 by gurwls223
[SPARK-33719][DOC] Add make_date/make_timestamp/make_interval into the doc of ANSI Compliance

### What changes were proposed in this pull request?

Add make_date/make_timestamp/make_interval into the doc of ANSI Compliance

### Why are the changes needed?

This lets users know that these functions throw runtime exceptions under ANSI mode when the result is not valid.
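
For illustration (assuming a `spark-shell` session where `spark` is the active `SparkSession`), the behavior the doc now describes:

```scala
// Under ANSI mode these functions raise a runtime error for invalid input:
spark.conf.set("spark.sql.ansi.enabled", "true")
spark.sql("SELECT make_date(2020, 13, 1)").show()  // month 13 is invalid -> runtime exception

// With ANSI mode off, the same query returns NULL instead of failing:
spark.conf.set("spark.sql.ansi.enabled", "false")
spark.sql("SELECT make_date(2020, 13, 1)").show()
```
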
### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Build doc and check it in browser:
![image](https://user-images.githubusercontent.com/1097932/101608930-34a79e80-39bb-11eb-9294-9d9b8c3f6faa.png)

Closes #30683 from gengliangwang/improveDoc.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 9959d49)
The file was modified docs/sql-ref-ansi-compliance.md (diff)
Commit b5399d4ef1c4e3df9d01a07e76bede41d7255d1c by gurwls223
[SPARK-33071][SPARK-33536][SQL][FOLLOW-UP] Rename deniedMetadataKeys to nonInheritableMetadataKeys in Alias

### What changes were proposed in this pull request?

This PR is a followup of https://github.com/apache/spark/pull/30488. This PR proposes to rename `Alias.deniedMetadataKeys` to `Alias.nonInheritableMetadataKeys` to make it less confusing.

### Why are the changes needed?

To make it easier to maintain and read.

### Does this PR introduce _any_ user-facing change?

No. This is rather a code cleanup.

### How was this patch tested?

Ran the unit tests written in the previous PR manually. Jenkins and GitHub Actions in this PR should also test them.

Closes #30682 from HyukjinKwon/SPARK-33071-SPARK-33536.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: b5399d4)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/Column.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AliasHelper.scala (diff)
Commit fa9ce1d4e893e3a32bc05e4d95241d32710deb54 by dongjoon
[SPARK-33722][SQL] Handle DELETE in ReplaceNullWithFalseInPredicate

### What changes were proposed in this pull request?

This PR adds `DeleteFromTable` to supported plans in `ReplaceNullWithFalseInPredicate`.

### Why are the changes needed?

This change allows Spark to optimize delete conditions like we optimize filters.
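
For illustration, the kind of rewrite the rule performs, now also applied to DELETE conditions (table `t` and column `p` are placeholders):

```scala
// A NULL predicate result filters a row out just like FALSE, so the null branch
// can be replaced with false and the IF collapsed:
//
//   DELETE FROM t WHERE IF(p > 0, true, null)
//   ==>  DELETE FROM t WHERE p > 0
//
// Against a DSv2 table that supports deletes, such a statement would be issued as:
spark.sql("DELETE FROM t WHERE IF(p > 0, true, null)")
```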

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This PR extends the existing test cases to also cover `DeleteFromTable`.

Closes #30688 from aokolnychyi/spark-33722.

Authored-by: Anton Okolnychyi <aokolnychyi@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(commit: fa9ce1d)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicateSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala (diff)
Commit 667f64f447a75141b091c361acebdc363bfe9288 by dongjoon
[SPARK-33725][BUILD] Upgrade snappy-java to 1.1.8.2

### What changes were proposed in this pull request?

This upgrades snappy-java to 1.1.8.2.

### Why are the changes needed?

Minor version upgrade that includes:

- [Fixed](https://github.com/xerial/snappy-java/pull/265) an initialization issue when using a recent Mac OS X version
- Support Apple Silicon (M1, Mac-aarch64)
- Fixed the pure-java Snappy fallback logic when no native library for your platform is found.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit test.

Closes #30690 from viirya/upgrade-snappy.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(commit: 667f64f)
The file was modified pom.xml (diff)
The file was modified dev/deps/spark-deps-hadoop-2.7-hive-2.3 (diff)
The file was modified dev/deps/spark-deps-hadoop-3.2-hive-2.3 (diff)
Commit 991b7977b5006e1e0d02b7d67a3e0fc50f5a9f66 by gurwls223
[SPARK-33727][K8S] Fall back from gnupg.net to openpgp.org

### What changes were proposed in this pull request?

While building the R Docker image, if we can't fetch the key from gnupg.net, fall back to openpgp.org.

### Why are the changes needed?

gnupg.net key servers are flaky and sometimes fail to resolve or return keys.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Tried to add the key on my desktop and it failed; then tried to add the key with openpgp.org and it succeeded.

Closes #30696 from holdenk/SPARK-33727-gnupg-server-is-flaky.

Authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 991b797)
The file was modified resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile (diff)
Commit 1c7f5f1ac7ecf0390410d2da6f3b1a615a5a71cc by dongjoon
[SPARK-33724][K8S] Add decom script as a configuration param

### What changes were proposed in this pull request?

Makes the location of the decommission script used in Kubernetes for graceful shutdown configurable.

### Why are the changes needed?

Some environments don't use the Spark image builder and instead mount the decompressed Spark distro. In those environments, configuring the location of the decommissioning script is required.
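
A hedged sketch of the idea: the script location becomes an ordinary configuration value with a default matching the image layout, which mounted-distro deployments can override. The config key and paths below are illustrative assumptions, not the exact names added by this PR.

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set(
    "spark.kubernetes.executor.decommissionScript",   // assumed key name, for illustration only
    "/mnt/spark-distro/sbin/decommission-worker.sh")  // mounted-distro location (illustrative)

// The executor pod builder would then read the location instead of hard-coding it:
val decomScript =
  conf.get("spark.kubernetes.executor.decommissionScript", "/opt/decom.sh")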

### Does this PR introduce _any_ user-facing change?

New configuration parameter.

### How was this patch tested?

Existing decommissioning integration test.

Closes #30694 from holdenk/SPARK-33724-allow-decommissioning-script-location-to-be-configured.

Authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(commit: 1c7f5f1)
The file was modified resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala (diff)
The file was modified resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala (diff)
Commit af37c7f4115a2edf46a304f90db0aec4d3edde16 by wenchen
[SPARK-33558][SQL][TESTS] Unify v1 and v2 ALTER TABLE .. ADD PARTITION tests

### What changes were proposed in this pull request?
1. Move the `ALTER TABLE .. ADD PARTITION` parsing tests to `AlterTableAddPartitionParserSuite`
2. Place the v1 tests for `ALTER TABLE .. ADD PARTITION` from `DDLSuite` and the v2 tests from `AlterTablePartitionV2SQLSuite` into the common trait `AlterTableAddPartitionSuiteBase`, so the tests run for V1, Hive V1, and V2 data sources (see the sketch after this list).
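
A toy sketch of the unification pattern (plain ScalaTest, illustrative names only, not the real suites): shared behavior lives in a base trait, and thin suites bind it to each catalog implementation.

```scala
import org.scalatest.funsuite.AnyFunSuite

trait AlterTableAddPartitionSuiteBaseSketch extends AnyFunSuite {
  def catalogVersion: String  // "V1", "Hive V1", or "V2"

  test(s"ALTER TABLE .. ADD PARTITION basic behavior ($catalogVersion)") {
    // shared assertions for every catalog implementation would go here
    assert(catalogVersion.nonEmpty)
  }
}

class V1AddPartitionSuiteSketch extends AlterTableAddPartitionSuiteBaseSketch {
  override def catalogVersion: String = "V1"
}

class V2AddPartitionSuiteSketch extends AlterTableAddPartitionSuiteBaseSketch {
  override def catalogVersion: String = "V2"
}
```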

### Why are the changes needed?
- The unification allows running common `ALTER TABLE .. ADD PARTITION` tests for DSv1, Hive DSv1, and DSv2
- We can detect missing features and differences between DSv1 and DSv2 implementations.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By running new test suites:
```
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *AlterTableAddPartitionSuite"
```

Closes #30685 from MaxGekk/unify-alter-table-add-partition-tests.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: af37c7f)
The file was added sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala
The file was modified sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala (diff)
The file was added sql/core/src/test/scala/org/apache/spark/sql/execution/command/AlterTableAddPartitionSuiteBase.scala
The file was added sql/core/src/test/scala/org/apache/spark/sql/execution/command/AlterTableAddPartitionParserSuite.scala
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DDLParserSuite.scala (diff)
The file was added sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/AlterTableAddPartitionSuite.scala
The file was added sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/AlterTableAddPartitionSuite.scala
The file was modified sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTablePartitionV2SQLSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolvePartitionSpec.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala (diff)
Commit b112e2bfa619d028004cbc7fb8ec1363689729a7 by wenchen
[SPARK-33714][SQL] Migrate ALTER VIEW ... SET/UNSET TBLPROPERTIES commands to use UnresolvedView to resolve the identifier

### What changes were proposed in this pull request?

This PR adds an `allowTemp` flag to `UnresolvedView` so that `Analyzer` can check whether to resolve temp views or not.

This PR also migrates `ALTER VIEW ... SET/UNSET TBLPROPERTIES` to use `UnresolvedView` to resolve the table/view identifier. This allows consistent resolution rules (temp view first, etc.) to be applied for both v1/v2 commands. More info about the consistent resolution rule proposal can be found in [JIRA](https://issues.apache.org/jira/browse/SPARK-29900) or [proposal doc](https://docs.google.com/document/d/1hvLjGA8y_W_hhilpngXVub1Ebv8RsMap986nENCFnrg/edit?usp=sharing).
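
A minimal sketch, with illustrative names rather than the actual Catalyst classes, of how such an `allowTemp` flag lets resolution either accept a temp view or fail with a descriptive, command-specific message:

```scala
object ViewResolutionSketch {
  // Illustrative stand-in for UnresolvedView; not the real Catalyst class.
  case class UnresolvedViewSketch(
      nameParts: Seq[String],
      commandName: String,
      allowTemp: Boolean)

  // Resolution either accepts the temp view or fails with a message that names
  // the command, mirroring the behavior described in this PR.
  def resolveView(u: UnresolvedViewSketch, isTempView: Seq[String] => Boolean): String = {
    if (isTempView(u.nameParts) && !u.allowTemp) {
      throw new IllegalArgumentException(
        s"${u.nameParts.mkString(".")} is a temp view. ${u.commandName} expects a permanent view.")
    }
    s"resolved view: ${u.nameParts.mkString(".")}"
  }
}
```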

### Why are the changes needed?

To use `UnresolvedView` for view resolution.

One benefit is that the exception message is better for `ALTER VIEW ... SET/UNSET TBLPROPERTIES`. Before, if a temp view was passed, you would just get a `NoSuchTableException` with `Table or view 'tmpView' not found in database 'default'`. With this PR, you get a more descriptive exception message: `tmpView is a temp view. ALTER VIEW ... SET TBLPROPERTIES expects a permanent view`.

### Does this PR introduce _any_ user-facing change?

The exception message changes as described above.

### How was this patch tested?

Updated existing tests.

Closes #30676 from imback82/alter_view_set_unset_properties.

Authored-by: Terry Kim <yuminkim@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: b112e2b)
The file was modified sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala (diff)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DDLParserSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/command/PlanResolutionSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala (diff)
Commit 795db05bf6911aa2a66eea57460409a238957b40 by dongjoon
[SPARK-33732][K8S][TESTS] Kubernetes integration tests doesn't work with Minikube 1.9+

### What changes were proposed in this pull request?

This PR changes `Minikube.scala` for Kubernetes integration tests to work with Minikube 1.9+.
`Minikube.scala` assumes that `apiserver.key` and `apiserver.crt` are in `~/.minikube/`.
But as of Minikube 1.9, they are in `~/.minikube/profiles/<profile>`.
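
A sketch of the resulting lookup logic (paths and the default profile name are illustrative; the real `Minikube.scala` code differs):

```scala
import java.nio.file.{Files, Path, Paths}

object MinikubeCertSketch {
  // Resolve the apiserver cert for both old and new Minikube layouts.
  def apiServerCert(minikubeHome: Path, profile: String = "minikube"): Path = {
    val newLayout = minikubeHome.resolve(s"profiles/$profile/apiserver.crt") // Minikube 1.9+
    val oldLayout = minikubeHome.resolve("apiserver.crt")                    // pre-1.9
    if (Files.exists(newLayout)) newLayout else oldLayout
  }
}

// Example: MinikubeCertSketch.apiServerCert(Paths.get(sys.props("user.home"), ".minikube"))
```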

### Why are the changes needed?

Currently, the Kubernetes integration tests don't work with Minikube 1.9+.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

I confirmed the following test passes.
```
$ build/sbt -Pkubernetes -Pkubernetes-integration-tests package 'kubernetes-integration-tests/testOnly -- -z "SparkPi with no"'
```

Closes #30700 from sarutak/minikube-1.9.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(commit: 795db05)
The file was modified resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/backend/minikube/Minikube.scala (diff)
Commit cef28c2c51d06506afd8a5f5ac725a1a0fd53b6d by wenchen
[SPARK-32670][SQL][FOLLOWUP] Group exception messages in Catalyst Analyzer in one file

### What changes were proposed in this pull request?
This PR follows up https://github.com/apache/spark/pull/29497.
https://github.com/apache/spark/pull/29497 only gave an example of grouping the `AnalysisException`s raised in Analyzer into `QueryCompilationErrors`.
This PR groups the remaining `AnalysisException`s into `QueryCompilationErrors`.
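
A sketch of the grouping pattern, with illustrative names only (the real code raises `AnalysisException` from methods in `QueryCompilationErrors`):

```scala
object QueryCompilationErrorsSketch {
  // One place to build each analysis error, so wording stays consistent
  // across every analyzer rule that raises it.
  def unresolvedColumnError(colName: String, tableName: String): IllegalArgumentException =
    new IllegalArgumentException(s"Column '$colName' does not exist in table '$tableName'.")
}

// A call site inside an analyzer rule then becomes a one-liner:
// throw QueryCompilationErrorsSketch.unresolvedColumnError("c", "t")
```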

### Why are the changes needed?
It will largely help with the standardization of error messages and their maintenance.

### Does this PR introduce _any_ user-facing change?
No. Error messages remain unchanged.

### How was this patch tested?
No new tests; all original tests pass to make sure it doesn't break any existing behavior.

Closes #30564 from beliefer/SPARK-32670-followup.

Lead-authored-by: gengjiaan <gengjiaan@360.cn>
Co-authored-by: Jiaan Geng <beliefer@163.com>
Co-authored-by: beliefer <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: cef28c2)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/QueryCompilationErrors.scala (diff)
Commit 1554977670ffa452242b1433f0bff44c88c35722 by wenchen
[SPARK-33692][SQL] View should use captured catalog and namespace to lookup function

### What changes were proposed in this pull request?
Use the view's captured catalog and namespace to look up functions, so that functions referred to by the view won't be overridden by a newly created function with the same name but a different database or function type (i.e. a temporary function).
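
An illustrative repro of the scenario (database, table, view, function, and class names are placeholders, and the UDF class would need to exist on the classpath):

```scala
// The view captures db1.my_func at creation time.
spark.sql("CREATE VIEW v AS SELECT db1.my_func(col) FROM db1.t")

// Later, someone registers a temporary function with the same name...
spark.sql("CREATE TEMPORARY FUNCTION my_func AS 'some.other.UDFClass'")

// ...with this fix, querying the view still resolves the captured db1.my_func
// instead of picking up the newly created temporary function.
spark.sql("SELECT * FROM v").show()
```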

### Why are the changes needed?
Bug fix. Without this PR, changing the current database or creating a temporary function with
the same name may cause a failure when querying a view.

### Does this PR introduce _any_ user-facing change?
Yes, bug fix.

### How was this patch tested?
newly added and existing test cases.

Closes #30662 from linhongliu-db/SPARK-33692.

Lead-authored-by: Linhong Liu <linhong.liu@databricks.com>
Co-authored-by: Linhong Liu <67896261+linhongliu-db@users.noreply.github.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 1554977)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewTestSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (diff)