Changes

Summary

  1. [SPARK-34604][PYTHON][TESTS] Use eventually in (commit: 8005900) (details)
  2. [SPARK-33084][SQL][TEST][FOLLOWUP] Add ResetSystemProperties trait to (commit: 9d2d620) (details)
  3. [SPARK-34610][PYTHON][TEST] Fix Python UDF used in (commit: 331d459) (details)
  4. [SPARK-34584][SQL] Static partition should also follow (commit: 8f1eec4) (details)
  5. [SPARK-34577][SQL] Fix drop/add columns to a dataset of `DESCRIBE (commit: db62710) (details)
  6. [SPARK-34599][SQL] Fix the issue that INSERT INTO OVERWRITE doesn't (commit: 53e4dba) (details)
  7. [SPARK-34614][SQL] ANSI mode: Casting String to Boolean should throw (commit: 2b1c170) (details)
  8. [SPARK-34567][SQL] CreateTableAsSelect should update metrics too (commit: 401e270) (details)
  9. [SPARK-34482][SS] Correct the active SparkSession for (commit: e7e0161) (details)
  10. [SPARK-34605][SQL] Support `java.time.Duration` as an external type of (commit: 17601e0) (details)
  11. [SPARK-32924][WEBUI] Make duration column in master UI sorted in the (commit: 9ac5ee2) (details)
  12. [SPARK-34607][SQL] Add `Utils.isMemberClass` to fix a malformed class (commit: dbce74d) (details)
  13. [SPARK-34376][SQL] Support regexp as a SQL function (commit: 814d81c) (details)
  14. [SPARK-34613][SQL] Fix view does not capture disable hint config (commit: 43aacd5) (details)
  15. [SPARK-34608][SQL] Remove unused output of AddJarCommand (commit: 979b9bc) (details)
  16. [SPARK-34609][SQL] Unify resolveExpressionBottomUp and (commit: dc78f33) (details)
  17. [MINOR][SQL][DOCS] Fix some spelling issues in SQL migration guide (commit: 75db6e7) (details)
  18. [SPARK-34635][UI] Add trailing slashes in URLs to reduce unnecessary (commit: 358697b) (details)
  19. [SPARK-22748][SQL] Analyze __grouping__id as a literal function (commit: ca326c4) (details)
  20. [SPARK-34621][SQL] Unify output of ShowCreateTableAsSerdeCommand  and (commit: 654f19d) (details)
  21. [SPARK-34592][WEBUI] Mark indeterminate RDD in Web UI (commit: c91a756) (details)
  22. [SPARK-34624][CORE] Exclude non-jar dependencies of ivy/maven packages (commit: 1fd7368) (details)
  23. [SPARK-34643][R][DOCS] Use CRAN URL in canonical form (commit: f72b906) (details)
  24. [SPARK-34595][SQL] DPP support RLIKE (commit: 1a97224) (details)
Commit 800590035c483434cf59edf63bc771836ae27714 by gurwls223
[SPARK-34604][PYTHON][TESTS] Use eventually in TaskContextTestsWithWorkerReuse.test_task_context_correct_with_python_worker_reuse

### What changes were proposed in this pull request?

`TaskContextTestsWithWorkerReuse.test_task_context_correct_with_python_worker_reuse` can be flaky and fails sometimes:

```
======================================================================
ERROR [1.798s]: test_task_context_correct_with_python_worker_reuse (pyspark.tests.test_taskcontext.TaskContextTestsWithWorkerReuse)
...
test_task_context_correct_with_python_worker_reuse
    self.assertTrue(pid in worker_pids)
AssertionError: False is not true

----------------------------------------------------------------------
```

I suspect that the Python worker was killed for some reason and a new attempt created a new Python worker.

This PR fixes the flakiness simply by retrying the test case.

### Why are the changes needed?

To make the tests more robust.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually tested by controlling the conditions in the test code.

Closes #31723 from HyukjinKwon/SPARK-34604.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 8005900)
The file was modified: python/pyspark/tests/test_taskcontext.py (diff)
Commit 9d2d620e9878a52d79530104a04c409c9ec43de6 by gurwls223
[SPARK-33084][SQL][TEST][FOLLOWUP] Add ResetSystemProperties trait to SQLQuerySuite to avoid clearing ivy.home

### What changes were proposed in this pull request?
Add the `ResetSystemProperties` trait to `SQLQuerySuite` so that system property changes made by any of the tests will not affect other suites/tests. Specifically, the system property changes made by `SPARK-33084: Add jar support Ivy URI in SQL -- jar contains udf class` are targeted here (which sets and then clears `ivy.home`).
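
As an illustration, a minimal sketch of how the trait is meant to be used (the suite name and path below are made up; this is not the actual `SQLQuerySuite` code):

```scala
package org.apache.spark.sql

import org.apache.spark.SparkFunSuite
import org.apache.spark.util.ResetSystemProperties

// ResetSystemProperties snapshots System.getProperties before each test and
// restores the snapshot afterwards, so a test may change ivy.home freely.
class IvyHomePropertySuite extends SparkFunSuite with ResetSystemProperties {
  test("resolve an artifact from a custom ivy.home") {
    System.setProperty("ivy.home", "/tmp/custom-ivy-home") // hypothetical path
    // ... run the ADD JAR / Ivy resolution code under test ...
    // No manual cleanup needed: an externally configured ivy.home is restored
    // after this test and still applies to later suites.
  }
}
```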

### Why are the changes needed?
PR #29966 added a new test case that adjusts the `ivy.home` system property to force Ivy to resolve an artifact from a custom location. At the end of the test, the value is cleared. Clearing the value meant that, if a custom value of `ivy.home` was configured externally, it would not apply for tests run after this test case.

### Does this PR introduce _any_ user-facing change?
No, this is only in tests.

### How was this patch tested?
Existing unit tests continue to pass, whether or not `spark.jars.ivySettings` is configured (which adjusts the behavior of Ivy w.r.t. handling of `ivy.home` and `ivy.default.ivy.user.dir` properties).

Closes #31694 from xkrogen/xkrogen-SPARK-33084-ivyhome-sysprop-followon.

Authored-by: Erik Krogen <xkrogen@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 9d2d620)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala (diff)
Commit 331d459ee7024fe500ea69ad59fc2bc7ebf20859 by gurwls223
[SPARK-34610][PYTHON][TEST] Fix Python UDF used in GroupedAggPandasUDFTests

### What changes were proposed in this pull request?

Fixes a Python UDF `plus_one` used in `GroupedAggPandasUDFTests` to always return float (double) values.

### Why are the changes needed?

The Python UDF `plus_one` used in `GroupedAggPandasUDFTests` always returns `v + 1` regardless of its type. The return type of the UDF is 'double', so if the input is an int, the result will be `null`.

```py
>>> df = spark.range(10).toDF('id') \
...             .withColumn("vs", array([lit(i * 1.0) + col('id') for i in range(20, 30)])) \
...             .withColumn("v", explode(col('vs'))) \
...             .drop('vs') \
...             .withColumn('w', lit(1.0))
>>> @udf('double')
... def plus_one(v):
...   assert isinstance(v, (int, float))
...   return v + 1
...
>>> @pandas_udf('double', PandasUDFType.GROUPED_AGG)
... def sum_udf(v):
...   return v.sum()
...
>>> df.groupby(plus_one(df.id)).agg(sum_udf(df.v)).show()
+------------+----------+
|plus_one(id)|sum_udf(v)|
+------------+----------+
|        null|    2900.0|
+------------+----------+
```

This is meaningless and should be:

```py
>>> @udf('double')
... def plus_one(v):
...   assert isinstance(v, (int, float))
...   return float(v + 1)
...
>>> df.groupby(plus_one(df.id)).agg(sum_udf(df.v)).sort('plus_one(id)').show()
+------------+----------+
|plus_one(id)|sum_udf(v)|
+------------+----------+
|         1.0|     245.0|
|         2.0|     255.0|
|         3.0|     265.0|
|         4.0|     275.0|
|         5.0|     285.0|
|         6.0|     295.0|
|         7.0|     305.0|
|         8.0|     315.0|
|         9.0|     325.0|
|        10.0|     335.0|
+------------+----------+
```

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

Fixed the test.

Closes #31730 from ueshin/issues/SPARK-34610/test_pandas_udf_grouped_agg.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 331d459)
The file was modified: python/pyspark/sql/tests/test_pandas_udf_grouped_agg.py (diff)
The file was modified: python/pyspark/sql/tests/test_pandas_udf_window.py (diff)
Commit 8f1eec4d138da604b890111d0a6daaef86d44ef2 by gurwls223
[SPARK-34584][SQL] Static partition should also follow StoreAssignmentPolicy when insert into v2 tables

### What changes were proposed in this pull request?

This is a followup of https://github.com/apache/spark/pull/27597 and simply applies the fix in the v2 table insertion code path.

### Why are the changes needed?

bug fix

### Does this PR introduce _any_ user-facing change?

Yes, v2 table insertion with static partitions now also follows `StoreAssignmentPolicy`.
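
For illustration, a sketch of the now-covered case (catalog, table, and `USING` provider names are hypothetical; `spark.sql.storeAssignmentPolicy` is the standard policy switch), e.g. in `spark-shell`:

```scala
spark.conf.set("spark.sql.storeAssignmentPolicy", "ANSI")
spark.sql("CREATE TABLE testcat.t (i INT, p INT) USING foo PARTITIONED BY (p)")

// The static partition value is a string that cannot be safely cast to INT,
// so under the ANSI policy this insert should now fail analysis for v2 tables
// as well, instead of being silently written.
spark.sql("INSERT INTO testcat.t PARTITION (p = 'not-a-number') SELECT 1")
```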

### How was this patch tested?

Moved the test from https://github.com/apache/spark/pull/27597 to the general test suite `SQLInsertTestSuite`, which covers DS v2, file source, and Hive tables.

Closes #31726 from cloud-fan/insert.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 8f1eec4)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (diff)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/SQLInsertTestSuite.scala (diff)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala (diff)
Commit db627107b7afad7a6eef538b85c3e998ca833fee by wenchen
[SPARK-34577][SQL] Fix drop/add columns to a dataset of `DESCRIBE NAMESPACE`

### What changes were proposed in this pull request?
In the PR, I propose to generate "stable" output attributes per the logical node of the DESCRIBE NAMESPACE command.

### Why are the changes needed?
This fixes the issue demonstrated by the example:

```
sql(s"CREATE NAMESPACE ns")
val description = sql(s"DESCRIBE NAMESPACE ns")
description.drop("name")
```

```
[info]   org.apache.spark.sql.AnalysisException: Resolved attribute(s) name#74 missing from name#25,value#26 in operator !Project [name#74]. Attribute(s) with the same name appear in the operation: name. Please check if the right attribute(s) are used.;
[info] !Project [name#74]
[info] +- LocalRelation [name#25, value#26]
```

### Does this PR introduce _any_ user-facing change?
After this change, the user's `drop()`/`add()` calls work as expected.

### How was this patch tested?
Added UT

Closes #31705 from AngersZhuuuu/SPARK-34577.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: db62710)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala (diff)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala (diff)
The file was modified: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala (diff)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala (diff)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala (diff)
Commit 53e4dba7c489ac5c0ad61f0121c4e247de5b485c by wenchen
[SPARK-34599][SQL] Fix the issue that INSERT INTO OVERWRITE doesn't support partition columns containing dot for DSv2

### What changes were proposed in this pull request?

`ResolveInsertInto.staticDeleteExpression` should use `UnresolvedAttribute.quoted` to create the delete expression so that we will treat the entire `attr.name` as a column name.
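
For context, a small sketch of the difference (the column name is just an example):

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute

// The string-based apply splits on dots, treating them as name-part separators:
UnresolvedAttribute("a.b").nameParts        // Seq("a", "b")

// quoted keeps the whole string as a single name part, which is what a
// partition column literally named `a.b` needs:
UnresolvedAttribute.quoted("a.b").nameParts // Seq("a.b")
```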

### Why are the changes needed?

When users use a dot in a partition column name, queries like ```INSERT OVERWRITE $t1 PARTITION (`a.b` = 'a') (`c.d`) VALUES('b')``` do not work.

### Does this PR introduce _any_ user-facing change?

Without this fix, the above query will throw
```
[info]   org.apache.spark.sql.AnalysisException: cannot resolve '`a.b`' given input columns: [a.b, c.d];
[info] 'OverwriteByExpression RelationV2[a.b#17, c.d#18] default.tbl, ('a.b <=> cast(a as string)), false
[info] +- Project [a.b#19, ansi_cast(col1#16 as string) AS c.d#20]
[info]    +- Project [cast(a as string) AS a.b#19, col1#16]
[info]       +- LocalRelation [col1#16]
```

With the fix, the query will run correctly.

### How was this patch tested?

The newly added test.

Closes #31713 from zsxwing/SPARK-34599.

Authored-by: Shixiong Zhu <zsxwing@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 53e4dba)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (diff)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/connector/InsertIntoTests.scala (diff)
Commit 2b1c1700167e9083617d4ea4c2b6982dbf4dd063 by gengliang.wang
[SPARK-34614][SQL] ANSI mode: Casting String to Boolean should throw exception on parse error

### What changes were proposed in this pull request?

In ANSI mode, casting String to Boolean should throw an exception on a parse error instead of returning null.
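
For illustration (in `spark-shell`, assuming the standard ANSI switch `spark.sql.ansi.enabled`):

```scala
spark.conf.set("spark.sql.ansi.enabled", "true")

spark.sql("SELECT CAST('true' AS BOOLEAN)").show() // fine: true
spark.sql("SELECT CAST('abc' AS BOOLEAN)").show()  // now throws on the parse error
                                                   // instead of returning null
```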

### Why are the changes needed?

For better ANSI compliance

### Does this PR introduce _any_ user-facing change?

Yes, in ANSI mode there will be an exception on parse failure of casting String value to Boolean type.

### How was this patch tested?

Unit tests.

Closes #31734 from gengliangwang/ansiCastToBoolean.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
(commit: 2b1c170)
The file was modified: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala (diff)
The file was modified: docs/sql-ref-ansi-compliance.md (diff)
The file was modified: sql/core/src/test/resources/sql-tests/results/postgreSQL/boolean.sql.out (diff)
Commit 401e270c179021dd7bfd136b143ccc6d01c04755 by wenchen
[SPARK-34567][SQL] CreateTableAsSelect should update metrics too

### What changes were proposed in this pull request?
For the `CreateTableAsSelect` command we use `InsertIntoHiveTable` or `InsertIntoHadoopFsRelationCommand` to insert data.
The metrics of `InsertIntoHiveTable` and `InsertIntoHadoopFsRelationCommand` are updated in `FileFormatWriter.write()`, but only `CreateTableAsSelectCommand` is shown in the Web UI SQL tab.
We need to update `CreateTableAsSelectCommand`'s metrics too.

Before this PR:
![image](https://user-images.githubusercontent.com/46485123/109411226-81f44480-79db-11eb-99cb-b9686b15bf61.png)

After this PR:
![image](https://user-images.githubusercontent.com/46485123/109411232-8ae51600-79db-11eb-9111-3bea0bc2d475.png)

![image](https://user-images.githubusercontent.com/46485123/109905192-62aa2f80-7cd9-11eb-91f9-04b16c9238ae.png)

### Why are the changes needed?
Complete SQL Metrics

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually tested.

Closes #31679 from AngersZhuuuu/SPARK-34567.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 401e270)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala (diff)
The file was modified: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala (diff)
The file was modified: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLMetricsSuite.scala (diff)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala (diff)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala (diff)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala (diff)
Commit e7e016192f882cfb430d706c2099e58e1bcc014c by wenchen
[SPARK-34482][SS] Correct the active SparkSession for StreamExecution.logicalPlan

### What changes were proposed in this pull request?

Set the active SparkSession to `sparkSessionForStream` and disable AQE & CBO before initializing `StreamExecution.logicalPlan`.

### Why are the changes needed?

The active session should be `sparkSessionForStream`. Otherwise, settings like

https://github.com/apache/spark/blob/6b34745cb9b294c91cd126c2ea44c039ee83cb84/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala#L332-L335

wouldn't take effect if callers access them from the active SQLConf, e.g., the `InsertAdaptiveSparkPlan` rule. Besides, unlike `InsertAdaptiveSparkPlan` (which skips streaming plans), `CostBasedJoinReorder` could theoretically take effect.
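
A simplified sketch of the idea (not the actual `StreamExecution` code; the session construction is illustrative):

```scala
import org.apache.spark.sql.SparkSession

// 'sparkSessionForStream' stands in for the stream-specific session mentioned above.
val sparkSessionForStream = SparkSession.builder().master("local[*]").getOrCreate()
sparkSessionForStream.conf.set("spark.sql.adaptive.enabled", "false") // disable AQE
sparkSessionForStream.conf.set("spark.sql.cbo.enabled", "false")      // disable CBO

// Making it the active session means rules that read the active SQLConf
// (e.g. InsertAdaptiveSparkPlan) observe these settings while the streaming
// logical plan is being initialized.
SparkSession.setActiveSession(sparkSessionForStream)
```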

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Tested manually. Before the fix, `InsertAdaptiveSparkPlan` would try to apply AQE on the plan (though it wouldn't take effect). After this fix, the rule returns directly.

Closes #31600 from Ngone51/active-session-for-stream.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: e7e0161)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala (diff)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala (diff)
Commit 17601e014c6ccb48958d35ffb04bedeac8cfc66a by wenchen
[SPARK-34605][SQL] Support `java.time.Duration` as an external type of the day-time interval type

### What changes were proposed in this pull request?
In the PR, I propose to extend the Spark SQL API to accept [`java.time.Duration`](https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html) as an external type of the recently added Catalyst type `DayTimeIntervalType` (see #31614). The Java class `java.time.Duration` has similar semantics to the ANSI SQL day-time interval type, and it is the most suitable external type for `DayTimeIntervalType`. In more detail:
1. Added `DurationConverter` which converts `java.time.Duration` instances to/from the internal representation of the Catalyst type `DayTimeIntervalType` (a `Long`). The `DurationConverter` object uses new methods of `IntervalUtils` (a small sketch of these conversions follows this list):
    - `durationToMicros()` converts the input duration to the total length in microseconds. If this duration is too large to fit in a `Long`, the method throws an `ArithmeticException`. **Note:** _the input duration has nanosecond precision; the method casts the nanos part to microseconds by dividing by 1000._
    - `microsToDuration()` obtains a `java.time.Duration` representing a number of microseconds.
2. Support new type `DayTimeIntervalType` in `RowEncoder` via the methods `createDeserializerForDuration()` and `createSerializerForJavaDuration()`.
3. Extended the Literal API to construct literals from `java.time.Duration` instances.
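
A small sketch of the conversion semantics from item 1, written with plain `java.time` arithmetic rather than the actual `IntervalUtils` code:

```scala
import java.time.Duration

// Total length in microseconds; the nanosecond part is truncated to microsecond
// precision (the real durationToMicros() raises ArithmeticException on overflow).
val d = Duration.ofDays(10).plusNanos(1500)
val micros = d.getSeconds * 1000000L + d.getNano / 1000 // 864000000001

// And back: a Duration with microsecond precision (cf. microsToDuration()).
val roundTrip = Duration.ofSeconds(micros / 1000000L, (micros % 1000000L) * 1000)
```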

### Why are the changes needed?
1. To allow users to parallelize collections of `java.time.Duration`, construct day-time interval columns, and collect such columns back to the driver side.
2. This will allow writing tests in other sub-tasks of SPARK-27790.

### Does this PR introduce _any_ user-facing change?
The PR extends existing functionality. So, users can parallelize instances of the `java.time.Duration` class and collect them back:

```Scala
scala> val ds = Seq(java.time.Duration.ofDays(10)).toDS
ds: org.apache.spark.sql.Dataset[java.time.Duration] = [value: daytimeinterval]

scala> ds.collect
res0: Array[java.time.Duration] = Array(PT240H)
```

### How was this patch tested?
- Added a few tests to `CatalystTypeConvertersSuite` to check conversion from/to `java.time.Duration`.
- Checking row encoding by new tests in `RowEncoderSuite`.
- Making literals of `DayTimeIntervalType` is tested in `LiteralExpressionSuite`.
- Check collecting by `DatasetSuite` and `JavaDatasetSuite`.

Closes #31729 from MaxGekk/java-time-duration.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 17601e0)
The file was modified: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/SpecializedGettersReader.java (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala (diff)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala (diff)
The file was modified: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/RowEncoderSuite.scala (diff)
The file was modified: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala (diff)
The file was modified: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/CatalystTypeConvertersSuite.scala (diff)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/SQLImplicits.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SerializerBuildHelper.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedUnsafeProjection.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SpecificInternalRow.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/RowEncoder.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala (diff)
The file was modified: sql/core/src/test/java/test/org/apache/spark/sql/JavaDatasetSuite.java (diff)
The file was modified: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/IntervalUtilsSuite.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/InternalRow.scala (diff)
Commit 9ac5ee2e17ca491eabf2e6e7d33ce7cfb5a002a7 by dhyun
[SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order

### What changes were proposed in this pull request?

Make the "duration" column in the standalone-mode master UI sort by numeric duration, so that the column is sorted in the correct order.

Before changes:
![image](https://user-images.githubusercontent.com/26694233/110025426-f5a49300-7cf4-11eb-86f0-2febade86be9.png)

After changes:
![image](https://user-images.githubusercontent.com/26694233/110025604-33092080-7cf5-11eb-8b34-215688faf56d.png)

### Why are the changes needed?

Fix a UI bug to make the sorting consistent across different pages.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Ran several apps with different durations and verified the duration column on the master page can be sorted correctly.

Closes #31743 from baohe-zhang/SPARK-32924.

Authored-by: Baohe Zhang <baohe.zhang@verizonmedia.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 9ac5ee2)
The file was modified: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala (diff)
Commit dbce74d39d192f702972df06382591b3e8e43a77 by yamamuro
[SPARK-34607][SQL] Add `Utils.isMemberClass` to fix a malformed class name error on jdk8u

### What changes were proposed in this pull request?

This PR intends to fix a bug in `objects.NewInstance` when a user runs Spark on jdk8u and the given `cls` in `NewInstance` is a deeply-nested inner class, e.g.:
```
  object OuterLevelWithVeryVeryVeryLongClassName1 {
    object OuterLevelWithVeryVeryVeryLongClassName2 {
      object OuterLevelWithVeryVeryVeryLongClassName3 {
        object OuterLevelWithVeryVeryVeryLongClassName4 {
          object OuterLevelWithVeryVeryVeryLongClassName5 {
            object OuterLevelWithVeryVeryVeryLongClassName6 {
              object OuterLevelWithVeryVeryVeryLongClassName7 {
                object OuterLevelWithVeryVeryVeryLongClassName8 {
                  object OuterLevelWithVeryVeryVeryLongClassName9 {
                    object OuterLevelWithVeryVeryVeryLongClassName10 {
                      object OuterLevelWithVeryVeryVeryLongClassName11 {
                        object OuterLevelWithVeryVeryVeryLongClassName12 {
                          object OuterLevelWithVeryVeryVeryLongClassName13 {
                            object OuterLevelWithVeryVeryVeryLongClassName14 {
                              object OuterLevelWithVeryVeryVeryLongClassName15 {
                                object OuterLevelWithVeryVeryVeryLongClassName16 {
                                  object OuterLevelWithVeryVeryVeryLongClassName17 {
                                    object OuterLevelWithVeryVeryVeryLongClassName18 {
                                      object OuterLevelWithVeryVeryVeryLongClassName19 {
                                        object OuterLevelWithVeryVeryVeryLongClassName20 {
                                          case class MalformedNameExample2(x: Int)
                                        }}}}}}}}}}}}}}}}}}}}
```

The root cause that Kris (rednaxelafx) investigated is as follows (kudos to Kris):

The reason why the test case above is so convoluted is in the way Scala generates the class name for nested classes. In general, Scala generates a class name for a nested class by inserting the dollar-sign ( `$` ) in between each level of class nesting. The problem is that this format can concatenate into a very long string that goes beyond certain limits, so Scala will change the class name format beyond a certain length threshold.

For the example above, we can see that the first two levels of class nesting have class names that look like this:
```
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassName1$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassName1$OuterLevelWithVeryVeryVeryLongClassName2$
```
If we leave out the fact that Scala uses a dollar-sign ( `$` ) suffix for the class name of the companion object, `OuterLevelWithVeryVeryVeryLongClassName1`'s full name is a prefix (substring) of `OuterLevelWithVeryVeryVeryLongClassName2`'s full name.

But if we keep going deeper into the levels of nesting, you'll find names that look like:
```
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$2a1321b953c615695d7442b2adb1$$$$ryVeryLongClassName8$OuterLevelWithVeryVeryVeryLongClassName9$OuterLevelWithVeryVeryVeryLongClassName10$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$2a1321b953c615695d7442b2adb1$$$$ryVeryLongClassName8$OuterLevelWithVeryVeryVeryLongClassName9$OuterLevelWithVeryVeryVeryLongClassName10$OuterLevelWithVeryVeryVeryLongClassName11$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$85f068777e7ecf112afcbe997d461b$$$$VeryLongClassName11$OuterLevelWithVeryVeryVeryLongClassName12$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$85f068777e7ecf112afcbe997d461b$$$$VeryLongClassName11$OuterLevelWithVeryVeryVeryLongClassName12$OuterLevelWithVeryVeryVeryLongClassName13$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$85f068777e7ecf112afcbe997d461b$$$$VeryLongClassName11$OuterLevelWithVeryVeryVeryLongClassName12$OuterLevelWithVeryVeryVeryLongClassName13$OuterLevelWithVeryVeryVeryLongClassName14$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$5f7ad51804cb1be53938ea804699fa$$$$VeryLongClassName14$OuterLevelWithVeryVeryVeryLongClassName15$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$5f7ad51804cb1be53938ea804699fa$$$$VeryLongClassName14$OuterLevelWithVeryVeryVeryLongClassName15$OuterLevelWithVeryVeryVeryLongClassName16$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$5f7ad51804cb1be53938ea804699fa$$$$VeryLongClassName14$OuterLevelWithVeryVeryVeryLongClassName15$OuterLevelWithVeryVeryVeryLongClassName16$OuterLevelWithVeryVeryVeryLongClassName17$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$69b54f16b1965a31e88968df1a58d8$$$$VeryLongClassName17$OuterLevelWithVeryVeryVeryLongClassName18$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$69b54f16b1965a31e88968df1a58d8$$$$VeryLongClassName17$OuterLevelWithVeryVeryVeryLongClassName18$OuterLevelWithVeryVeryVeryLongClassName19$
org.apache.spark.sql.catalyst.encoders.ExpressionEncoderSuite$OuterLevelWithVeryVeryVeryLongClassNam$$$$69b54f16b1965a31e88968df1a58d8$$$$VeryLongClassName17$OuterLevelWithVeryVeryVeryLongClassName18$OuterLevelWithVeryVeryVeryLongClassName19$OuterLevelWithVeryVeryVeryLongClassName20$
```
with a hash code in the middle and various levels of nesting omitted.

The `java.lang.Class.isMemberClass` method is implemented in JDK8u as:
http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/tip/src/share/classes/java/lang/Class.java#l1425
```
    /**
     * Returns {@code true} if and only if the underlying class
     * is a member class.
     *
     * @return {@code true} if and only if this class is a member class.
     * @since 1.5
     */
    public boolean isMemberClass() {
        return getSimpleBinaryName() != null && !isLocalOrAnonymousClass();
    }

    /**
     * Returns the "simple binary name" of the underlying class, i.e.,
     * the binary name without the leading enclosing class name.
     * Returns {@code null} if the underlying class is a top level
     * class.
     */
    private String getSimpleBinaryName() {
        Class<?> enclosingClass = getEnclosingClass();
        if (enclosingClass == null) // top level class
            return null;
        // Otherwise, strip the enclosing class' name
        try {
            return getName().substring(enclosingClass.getName().length());
        } catch (IndexOutOfBoundsException ex) {
            throw new InternalError("Malformed class name", ex);
        }
    }
```
and the problematic code is `getName().substring(enclosingClass.getName().length())` -- if a class's enclosing class's full name is *longer* than the nested class's full name, this logic would end up going out of bounds.

The bug has been fixed in JDK9 by https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8057919 , but still exists in the latest JDK8u release. So from the Spark side we'd need to do something to avoid hitting this problem.

### Why are the changes needed?

Bugfix on jdk8u.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added tests.

Closes #31733 from maropu/SPARK-34607.

Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
(commit: dbce74d)
The file was modified: core/src/main/scala/org/apache/spark/util/Utils.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala (diff)
The file was modified: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala (diff)
Commit 814d81c1e511f87ee5556724bae64c6b54689ca9 by gurwls223
[SPARK-34376][SQL] Support regexp as a SQL function

### What changes were proposed in this pull request?

We have the rule `RLIKE: 'RLIKE' | 'REGEXP';` in `SqlBase.g4`.
We seem to have missed adding `REGEXP` as a SQL function just like `RLIKE`.

### Why are the changes needed?

Symmetry and beauty.
This is also a built-in function in Hive, so we can reduce the migration pain for those users.

### Does this PR introduce _any_ user-facing change?

Yes, a new `regexp` function as an alias of `rlike`.
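
A quick illustration in `spark-shell` (assuming the function is registered purely as an alias of `RLIKE`):

```scala
// Both forms should return true for the same pattern.
spark.sql("SELECT 'SystemDrive' RLIKE 'Sys'").show()
spark.sql("SELECT regexp('SystemDrive', 'Sys')").show()
```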

### How was this patch tested?

new tests

Closes #31488 from yaooqinn/SPARK-34376.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(commit: 814d81c)
The file was modified: sql/core/src/test/resources/sql-tests/results/regexp-functions.sql.out (diff)
The file was modified: sql/core/src/test/resources/sql-tests/inputs/regexp-functions.sql (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala (diff)
The file was modified: sql/core/src/test/resources/sql-functions/sql-expression-schema.md (diff)
Commit 43aacd5069294e1215e86cd43bd0810bda998be2 by wenchen
[SPARK-34613][SQL] Fix view does not capture disable hint config

### What changes were proposed in this pull request?

Add an allow list of SQL configs to capture for views.

### Why are the changes needed?

Spark stores the original SQL text of a view, and captures and stores SQL configs in the view metadata.

Config capturing skips configs with certain prefixes, e.g. `spark.sql.optimizer.`, but unfortunately `spark.sql.optimizer.disableHints` starts with `spark.sql.optimizer.`.

We need an allow list to make sure such configs are still captured.
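
For illustration, a hedged sketch of the scenario (view and table names are made up), e.g. in `spark-shell`:

```scala
// With hints disabled at view-creation time, the broadcast hint written into the
// view text should stay disabled when the view is queried later. Before the fix,
// spark.sql.optimizer.disableHints was skipped by the prefix-based filter and so
// was not stored in the view metadata.
spark.conf.set("spark.sql.optimizer.disableHints", "true")
spark.sql("CREATE VIEW v_no_hint AS SELECT /*+ BROADCAST(t) */ * FROM t")

// Later, possibly from a session where the config is unset:
spark.sql("SELECT * FROM v_no_hint") // should still ignore the hint
```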

### Does this PR introduce _any_ user-facing change?

Yes, this is a bug fix.

### How was this patch tested?

Add test.

Closes #31732 from ulysses-you/SPARK-34613.

Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 43aacd5)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala (diff)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewTestSuite.scala (diff)
Commit 979b9bcf5d36bcbf1063d9f999c32fd70fa25f89 by dhyun
[SPARK-34608][SQL] Remove unused output of AddJarCommand

### What changes were proposed in this pull request?
Remove the unused output of `AddJarCommand`, keeping it consistent and clean.

### Why are the changes needed?
To keep the command output consistent and clean.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Not needed.

Closes #31725 from AngersZhuuuu/SPARK-34608.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 979b9bc)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/command/resources.scala (diff)
Commit dc78f337cbca94872cdfc7725542b993003bd622 by wenchen
[SPARK-34609][SQL] Unify resolveExpressionBottomUp and resolveExpressionTopDown

### What changes were proposed in this pull request?

It's a bit confusing to see `resolveExpressionBottomUp` and `resolveExpressionTopDown`, which provide similar functionality but with different tree traversal orders. It turns out that the real difference between these 2 methods is: which attributes should the columns be resolved to? `resolveExpressionTopDown` resolves columns using the output attributes of the plan's children, while `resolveExpressionBottomUp` resolves columns using the output attributes of the plan itself.

This PR unifies `resolveExpressionBottomUp` and `resolveExpressionTopDown` by putting the common logic in a new method and letting both of them just call the new method. This PR also renames `resolveExpressionBottomUp` and `resolveExpressionTopDown` to make the difference clear.
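
A schematic sketch of the distinction (method names here are illustrative, not the exact new API):

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.internal.SQLConf

// Columns resolved against the children's output (what resolveExpressionTopDown did):
def resolveAgainstChildOutput(e: Expression, plan: LogicalPlan): Expression =
  e.transformUp { case u: UnresolvedAttribute =>
    plan.resolveChildren(u.nameParts, SQLConf.get.resolver).getOrElse(u)
  }

// Columns resolved against the plan's own output (what resolveExpressionBottomUp did):
def resolveAgainstOwnOutput(e: Expression, plan: LogicalPlan): Expression =
  e.transformUp { case u: UnresolvedAttribute =>
    plan.resolve(u.nameParts, SQLConf.get.resolver).getOrElse(u)
  }
```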

### Why are the changes needed?

code cleanup

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

Closes #31728 from cloud-fan/resolve.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: dc78f33)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (diff)
Commit 75db6e7d9e4241b00422e432ede5e6c120a4f1a8 by dhyun
[MINOR][SQL][DOCS] Fix some spelling issues in SQL migration guide

### What changes were proposed in this pull request?

1. Add a space between words.
2. Unify the initials' case.

### Why are the changes needed?

correct spelling issues for better user experience

### Does this PR introduce _any_ user-facing change?

yes.

### How was this patch tested?

manually

Closes #31748 from hopefulnick/doc_rectify.

Authored-by: nickhliu <nickhliu@tencent.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 75db6e7)
The file was modified: docs/sql-migration-guide.md (diff)
Commit 358697b386c4e91a8753faad02a496011138a28c by dhyun
[SPARK-34635][UI] Add trailing slashes in URLs to reduce unnecessary redirects

### What changes were proposed in this pull request?

Add trailing slashes in URLs of Spark UI pages.

### Why are the changes needed?

When a user accesses a URL without a trailing slash, Spark UI always responds with a 302 redirect to a URL with a trailing slash.
![image](https://user-images.githubusercontent.com/1097932/110072744-1be92380-7d33-11eb-98d4-50df12f59ae3.png)

Adding trailing slashes to URLs in UI pages can reduce such unnecessary redirects.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Manual test. It's a very simple change.

Closes #31753 from gengliangwang/reduceRedirect.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 358697b)
The file was modified: core/src/main/scala/org/apache/spark/deploy/master/ui/ApplicationPage.scala (diff)
The file was modified: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala (diff)
The file was modified: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala (diff)
The file was modified: core/src/main/scala/org/apache/spark/deploy/worker/ui/WorkerPage.scala (diff)
Commit ca326c4bb397f24937c22487922018f250ab4e7d by wenchen
[SPARK-22748][SQL] Analyze __grouping__id as a literal function

### What changes were proposed in this pull request?

This PR intends to refactor the logic to resolve `__grouping_id` in the `Analyzer`; it moves the logic from `ResolveFunctions` to `ResolveReferences` (`resolveLiteralFunction`).

The original author of this PR is sqlwindspeaker (#30781).

Closes #30781.

### Why are the changes needed?

Code refactoring.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added tests in `AnalysisSuite`.

Closes #31751 from maropu/SPARK-22748.

Authored-by: suqilong <suqilong@qiyi.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: ca326c4)
The file was modified: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (diff)
Commit 654f19dfd7f25cce811670b9f87c65f423eda58f by wenchen
[SPARK-34621][SQL] Unify output of ShowCreateTableAsSerdeCommand  and ShowCreateTableCommand

### What changes were proposed in this pull request?
Unify output of ShowCreateTableAsSerdeCommand  and ShowCreateTableCommand

### Why are the changes needed?
Unify output of ShowCreateTableAsSerdeCommand  and ShowCreateTableCommand

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

Closes #31737 from AngersZhuuuu/SPARK-34621.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 654f19d)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala (diff)
The file was modified: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala (diff)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala (diff)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala (diff)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala (diff)
Commit c91a75620465c5606b837a595d6cc55c7219026f by viirya
[SPARK-34592][WEBUI] Mark indeterminate RDD in Web UI

### What changes were proposed in this pull request?

This patch proposes to mark indeterminate RDD in Web UI.

### Why are the changes needed?

It is somewhat hard to track which part of a graph of RDDs is indeterminate. In some cases we may need to track indeterminate RDDs. For example, when an indeterminate map stage fails, Spark may be unable to roll back some parent stages. The developers are usually unable to easily identify the indeterminate part of a complicated RDD computation. If the Web UI can show indeterminate RDDs like it shows cached RDDs, it could be useful for tracking them.

### Does this PR introduce _any_ user-facing change?

Yes, there is a UI change for users.

### How was this patch tested?

Manual check with Web UI locally. Updated existing unit tests.

<img width="544" alt="Screen Shot 2021-03-02 at 12 38 02 AM" src="https://user-images.githubusercontent.com/68855/109621580-020bce80-7af0-11eb-834f-46b0f89d47c0.png">
<img width="390" alt="Screen Shot 2021-03-05 at 9 27 14 AM" src="https://user-images.githubusercontent.com/68855/110151181-04db1d80-7d95-11eb-8b3a-7235f7fe9eac.png">

Closes #31707 from viirya/SPARK-34592.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
(commit: c91a756)
The file was modified: core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala (diff)
The file was modified: core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.css (diff)
The file was modified: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala (diff)
The file was modified: core/src/test/scala/org/apache/spark/ui/scope/RDDOperationGraphSuite.scala (diff)
The file was modified: core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.js (diff)
The file was modified: core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala (diff)
The file was modified: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala (diff)
The file was modified: core/src/main/scala/org/apache/spark/ui/UIUtils.scala (diff)
The file was modified: core/src/main/scala/org/apache/spark/rdd/RDD.scala (diff)
Commit 1fd73686baebfcd438f0b228f5a5ffd7c7a9f689 by dhyun
[SPARK-34624][CORE] Exclude non-jar dependencies of ivy/maven packages

### What changes were proposed in this pull request?
Exclude non-jar dependencies of the ivy/maven packages we want to resolve, as our current dependency resolution code assumes artifacts to be jars. See https://github.com/apache/spark/blob/17601e014c6ccb48958d35ffb04bedeac8cfc66a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L1215 and https://github.com/apache/spark/blob/17601e014c6ccb48958d35ffb04bedeac8cfc66a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L318

### Why are the changes needed?
Some maven artifacts define non-jar dependencies. One such example is `hive-exec`'s dependency on the `pom` of `apache-curator` https://repo1.maven.org/maven2/org/apache/hive/hive-exec/2.3.8/hive-exec-2.3.8.pom

Today trying to depend on such an artifact using `--packages` will print an error but continue without including the non-jar dependency. Doing the same using `spark.sql("ADD JAR ivy://org.apache.hive:hive-exec:2.3.8?exclude=org.pentaho:pentaho-aggdesigner-algorithm")` will cause a failure. Detailed stacktraces can be found in SPARK-34624.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added unit test. Retried the same example in `spark-shell` which produced the stacktrace in the JIRA.

Closes #31741 from shardulm94/add-jar-filter-poms.

Authored-by: Shardul Mahadik <smahadik@linkedin.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: 1fd7368)
The file was modified: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala (diff)
The file was modified: core/src/test/scala/org/apache/spark/deploy/SparkSubmitUtilsSuite.scala (diff)
Commit f72b9068adadaa0478a5306c4778346b04ac8a47 by dhyun
[SPARK-34643][R][DOCS] Use CRAN URL in canonical form

### What changes were proposed in this pull request?

This PR fixes the URL links to use CRAN URL in canonical form.
CRAN package submission was failed as below:

```
   Found the following (possibly) invalid URLs:
     URL: https://cran.r-project.org/web/packages/e1071/index.html
       From: man/spark.naiveBayes.Rd
       Status: 200
       Message: OK
       CRAN URL not in canonical form
     URL: https://cran.r-project.org/web/packages/mixtools/index.html
       From: man/spark.gaussianMixture.Rd
       Status: 200
       Message: OK
       CRAN URL not in canonical form
     URL: https://cran.r-project.org/web/packages/survival/index.html
       From: man/spark.survreg.Rd
       Status: 200
       Message: OK
       CRAN URL not in canonical form
     URL: https://cran.r-project.org/web/packages/topicmodels/index.html
       From: man/spark.lda.Rd
       Status: 200
       Message: OK
       CRAN URL not in canonical form
     The canonical URL of the CRAN page for a package is
       https://CRAN.R-project.org/package=pkgname
```

### Why are the changes needed?

To fix CRAN package submission

### Does this PR introduce _any_ user-facing change?

It exposes the canonical form of the URLs to end users.

### How was this patch tested?

I manually clicked each link.

Closes #31759 from HyukjinKwon/minor-doc-fixes.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(commit: f72b906)
The file was modified: R/pkg/R/mllib_regression.R (diff)
The file was modified: R/pkg/R/mllib_clustering.R (diff)
The file was modified: R/pkg/R/mllib_classification.R (diff)
The file was modified: R/DOCUMENTATION.md (diff)
Commit 1a9722420eda672f42c1d1e81baa6d6c40126df7 by yamamuro
[SPARK-34595][SQL] DPP support RLIKE

### What changes were proposed in this pull request?
This PR makes DPP support the RLIKE expression:

```sql
SELECT date_id, product_id FROM fact_sk f
JOIN dim_store s
ON f.store_id = s.store_id WHERE s.country RLIKE  '[DE|US]'
```
### Why are the changes needed?
Improve query performance.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Unit test.

Closes #31722 from chaojun-zhang/SPARK-34595.

Authored-by: helloman <zcj23085@gmail.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
(commit: 1a97224)
The file was modified: sql/core/src/test/scala/org/apache/spark/sql/DynamicPartitionPruningSuite.scala (diff)
The file was modified: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala (diff)