Changes

Summary

  1. [SPARK-32516][SQL][FOLLOWUP] Remove unnecessary check if path string is (commit: ab2fa88) (details)
  2. [SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer (commit: 06a9945) (details)
  3. [SPARK-32819][SQL] ignoreNullability parameter should be effective (commit: add267c) (details)
  4. [SPARK-32826][SQL] Set the right column size for the null type in (commit: 9ab8a2c) (details)
  5. [SPARK-32312][SQL][PYTHON][TEST-JAVA11] Upgrade Apache Arrow to version (commit: e0538bd) (details)
  6. Revert "[SPARK-32772][SQL][FOLLOWUP] Remove legacy silent support mode (commit: 4a09613) (details)
  7. [SPARK-32831][SS] Refactor SupportsStreamingUpdate to represent actual (commit: db89b0e) (details)
  8. [SPARK-32832][SS] Use CaseInsensitiveMap for DataStreamReader/Writer (commit: 2f85f95) (details)
  9. [SPARK-32841][BUILD] Use Apache Hadoop 3.2.0 for PyPI and CRAN (commit: dbc4137) (details)
  10. [SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement (commit: 8f61005) (details)
  11. [SPARK-32828][SQL] Cast from a derived user-defined type to a base type (commit: 7eb76d6) (details)
  12. [SPARK-32840][SQL] Invalid interval value can happen to be just adhesive (commit: 5669b21) (details)
  13. [SPARK-32777][SQL] Aggregation support aggregate function with multiple (commit: a22871f) (details)
  14. [SPARK-32822][SQL] Change the number of partitions to zero when a range (commit: 5f468cc) (details)
  15. [SPARK-32677][SQL][DOCS][MINOR] Improve code comment in (commit: 328d81a) (details)
Commit ab2fa881edb0a75a94b7f5965a49687658d74060 by wenchen
[SPARK-32516][SQL][FOLLOWUP] Remove unnecessary check if path string is
empty for DataFrameWriter.save(), DataStreamReader.load() and
DataStreamWriter.start()
### What changes were proposed in this pull request?
This PR is a follow up to
https://github.com/apache/spark/pull/29543#discussion_r485409606, which
correctly points out that the check for the empty string is not
necessary.
### Why are the changes needed?
The unnecessary check actually could cause more confusion.
For example,
```scala
scala> Seq(1).toDF.write.option("path", "/tmp/path1").parquet("")
java.lang.IllegalArgumentException: Can not create a Path from an empty string
  at org.apache.hadoop.fs.Path.checkPathArg(Path.java:168)
```
even when the `path` option is set. This PR fixes that confusion.
### Does this PR introduce _any_ user-facing change?
Yes, the above example now throws a consistent exception message whether or not the path parameter value is empty.
```scala
scala> Seq(1).toDF.write.option("path", "/tmp/path1").parquet("")
org.apache.spark.sql.AnalysisException: There is a 'path' option set and save() is called with a path parameter. Either remove the path option, or call save() without the parameter. To ignore this check, set 'spark.sql.legacy.pathOptionBehavior.enabled' to 'true'.;
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:290)
  at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:856)
  ... 47 elided
```
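As the message suggests, the old behavior can be restored with the legacy flag; a minimal sketch (paths are hypothetical):
```scala
// Illustrative only: opt back into the old behavior via the legacy flag named in
// the error message above, so the path option and path parameter can coexist.
spark.conf.set("spark.sql.legacy.pathOptionBehavior.enabled", true)
Seq(1).toDF.write.option("path", "/tmp/path1").parquet("/tmp/path2")  // no AnalysisException
```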
### How was this patch tested?
Added unit tests.
Closes #29697 from imback82/SPARK-32516-followup.
Authored-by: Terry Kim <yuminkim@gmail.com> Signed-off-by: Wenchen Fan
<wenchen@databricks.com>
(commit: ab2fa88)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamReaderWriterSuite.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala (diff)
Commit 06a994517fc3080b14f01183eeb17e56ab52eaa8 by dongjoon
[SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check writer
options correctly
### What changes were proposed in this pull request?
This PR aims to fix the test coverage at `DataStreamReaderWriterSuite`.
### Why are the changes needed?
Currently, the test case checks `DataStreamReader` options instead of
`DataStreamWriter` options.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the revised test case.
Closes #29701 from dongjoon-hyun/SPARK-32836.
Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon
Hyun <dongjoon@apache.org>
(commit: 06a9945)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamReaderWriterSuite.scala (diff)
Commit add267c4de51274f8f57dbdcce8c497e09899df3 by wenchen
[SPARK-32819][SQL] ignoreNullability parameter should be effective
recursively
### What changes were proposed in this pull request?
This patch proposes to apply the `ignoreNullability` parameter recursively in the `equalsStructurally` method.
### Why are the changes needed?
`equalsStructurally` is used to check type equality. We can optionally ask it to ignore the nullability check. But the `ignoreNullability` parameter is not passed down recursively to nested types, so it produces a confusing error like:
```
data type mismatch: argument 3 requires array<array<string>> type,
however ... is of array<array<string>> type.
```
when running the query `select aggregate(split('abcdefgh',''), array(array('')), (acc, x) -> array(array( x ) ) )`.
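A simplified sketch of the intended behavior (illustrative only; the actual logic lives in `DataType.equalsStructurally`):
```scala
// Simplified sketch: propagate ignoreNullability into nested element and field types,
// so nested array/struct types compare equal regardless of inner nullability flags.
import org.apache.spark.sql.types._

def structurallyEqual(a: DataType, b: DataType, ignoreNullability: Boolean): Boolean =
  (a, b) match {
    case (ArrayType(ea, na), ArrayType(eb, nb)) =>
      (ignoreNullability || na == nb) && structurallyEqual(ea, eb, ignoreNullability)
    case (StructType(fa), StructType(fb)) =>
      fa.length == fb.length && fa.zip(fb).forall { case (x, y) =>
        x.name == y.name &&
        (ignoreNullability || x.nullable == y.nullable) &&
        structurallyEqual(x.dataType, y.dataType, ignoreNullability)
      }
    case _ => a == b
  }
```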
### Does this PR introduce _any_ user-facing change?
Yes, fixed a bug when running user query.
### How was this patch tested?
Unit tests.
Closes #29698 from viirya/SPARK-32819.
Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Wenchen
Fan <wenchen@databricks.com>
(commit: add267c)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/ansi/higher-order-functions.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/higher-order-functions.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/inputs/higher-order-functions.sql (diff)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala (diff)
Commit 9ab8a2c36db324748295d23c62339e9cb68ba77e by wenchen
[SPARK-32826][SQL] Set the right column size for the null type in
SparkGetColumnsOperation
### What changes were proposed in this pull request?
In Spark 3.0.0, `SparkGetColumnsOperation` could not recognize NULL columns. It can now, as a side effect of https://issues.apache.org/jira/browse/SPARK-32696 /
https://github.com/apache/spark/commit/f14f3742e0c98dd306abf02e93d2f10d89bc423f, but test coverage for that change was never added.
In Spark, the column size for null fields should be 1, so this PR sets the right column size for the null type.
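A rough sketch of the kind of per-type mapping involved (illustrative only; the real logic lives in `SparkGetColumnsOperation`):
```scala
// Illustrative sketch: deriving a COLUMN_SIZE value per Catalyst type for the
// GetColumns metadata, with the null type now reporting 1.
import org.apache.spark.sql.types._

def columnSize(dt: DataType): Option[Int] = dt match {
  case NullType       => Some(1)            // the fix: null type has column size 1
  case d: DecimalType => Some(d.precision)
  case StringType | BinaryType | _: ArrayType | _: MapType | _: StructType => None
  case other          => Some(other.defaultSize)
}
```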
### Why are the changes needed?
Adds test coverage and fixes the client-side information reported for the null type through JDBC.
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
Added UTs for both this PR and SPARK-32696.
Closes #29687 from yaooqinn/SPARK-32826.
Authored-by: Kent Yao <yaooqinn@hotmail.com> Signed-off-by: Wenchen Fan
<wenchen@databricks.com>
(commit: 9ab8a2c)
The file was modified sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkGetColumnsOperation.scala (diff)
The file was modified sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/SparkMetadataOperationSuite.scala (diff)
Commit e0538bd38cd43feaa064e30df940edf8fb2088de by gurwls223
[SPARK-32312][SQL][PYTHON][TEST-JAVA11] Upgrade Apache Arrow to version
1.0.1
### What changes were proposed in this pull request?
Upgrade Apache Arrow to version 1.0.1 for the Java dependency and
increase minimum version of PyArrow to 1.0.0.
This release marks a transition to binary stability of the columnar
format (which was already informally backward-compatible going back to
December 2017) and a transition to Semantic Versioning for the Arrow
software libraries. Also note that the Java arrow-memory artifact has
been split to separate dependence on netty-buffer and allow users to
select an allocator. Spark will continue to use `arrow-memory-netty` to
maintain performance benefits.
Versions 1.0.0 - 1.0.1 include the following selected fixes/improvements relevant to Spark users:
ARROW-9300 - [Java] Separate Netty Memory to its own module
ARROW-9272 - [C++][Python] Reduce complexity in python to arrow conversion
ARROW-9016 - [Java] Remove direct references to Netty/Unsafe Allocators
ARROW-8664 - [Java] Add skip null check to all Vector types
ARROW-8485 - [Integration][Java] Implement extension types integration
ARROW-8434 - [C++] Ipc RecordBatchFileReader deserializes the Schema multiple times
ARROW-8314 - [Python] Provide a method to select a subset of columns of a Table
ARROW-8230 - [Java] Move Netty memory manager into a separate module
ARROW-8229 - [Java] Move ArrowBuf into the Arrow package
ARROW-7955 - [Java] Support large buffer for file/stream IPC
ARROW-7831 - [Java] unnecessary buffer allocation when calling splitAndTransferTo on variable width vectors
ARROW-6111 - [Java] Support LargeVarChar and LargeBinary types and add integration test with C++
ARROW-6110 - [Java] Support LargeList Type and add integration test with C++
ARROW-5760 - [C++] Optimize Take implementation
ARROW-300 - [Format] Add body buffer compression option to IPC message protocol using LZ4 or ZSTD
ARROW-9098 - RecordBatch::ToStructArray cannot handle record batches with 0 column
ARROW-9066 - [Python] Raise correct error in isnull()
ARROW-9223 - [Python] Fix to_pandas() export for timestamps within structs
ARROW-9195 - [Java] Wrong usage of Unsafe.get from bytearray in ByteFunctionsHelper class
ARROW-7610 - [Java] Finish support for 64 bit int allocations
ARROW-8115 - [Python] Conversion when mixing NaT and datetime objects not working
ARROW-8392 - [Java] Fix overflow related corner cases for vector value comparison
ARROW-8537 - [C++] Performance regression from ARROW-8523
ARROW-8803 - [Java] Row count should be set before loading buffers in VectorLoader
ARROW-8911 - [C++] Slicing a ChunkedArray with zero chunks segfaults
View release notes here: https://arrow.apache.org/release/1.0.1.html
https://arrow.apache.org/release/1.0.0.html
### Why are the changes needed?
Upgrade brings fixes, improvements and stability guarantees.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing tests with pyarrow 1.0.0 and 1.0.1
Closes #29686 from BryanCutler/arrow-upgrade-100-SPARK-32312.
Authored-by: Bryan Cutler <cutlerb@gmail.com> Signed-off-by: HyukjinKwon
<gurwls223@apache.org>
(commit: e0538bd)
The file was modified python/setup.py (diff)
The file was modified dev/deps/spark-deps-hadoop-3.2-hive-2.3 (diff)
The file was modified python/pyspark/sql/pandas/utils.py (diff)
The file was modified python/docs/source/user_guide/arrow_pandas.rst (diff)
The file was modified dev/deps/spark-deps-hadoop-2.7-hive-2.3 (diff)
The file was modified sql/catalyst/pom.xml (diff)
The file was modified dev/deps/spark-deps-hadoop-2.7-hive-1.2 (diff)
The file was modified pom.xml (diff)
Commit 4a096131eee99ff8cb32022072e085dda87f766f by gurwls223
Revert "[SPARK-32772][SQL][FOLLOWUP] Remove legacy silent support mode
for spark-sql CLI"
This reverts commit f1f7ae420ee5cfc5755141d5ff25604bb78be465.
(commit: 4a09613)
The file was modified sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala (diff)
Commit db89b0e1b8bb98db6672f2b89e42e8a14e06e745 by kabhwan.opensource
[SPARK-32831][SS] Refactor SupportsStreamingUpdate to represent actual
meaning of the behavior
### What changes were proposed in this pull request?
This PR renames `SupportsStreamingUpdate` to `SupportsStreamingUpdateAsAppend`, as the new interface name represents the actual behavior more clearly. This PR also removes the `update()` method (so the interface is effectively a marker interface), as implementations of `SupportsStreamingUpdateAsAppend` should support append mode by default, hence there is no need to toggle a flag on it.
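A minimal sketch of what such a marker interface looks like (names and classes below are simplified illustrations, not the actual Spark code):
```scala
// Hypothetical, simplified sketch: a marker trait with no update() hook. A sink
// mixing it in declares that it accepts Update output mode but writes the updated
// rows as plain appends.
trait SupportsStreamingUpdateAsAppend

class AppendOnlySink extends SupportsStreamingUpdateAsAppend {
  def write(rows: Seq[String]): Unit = rows.foreach(println) // append-only write path
}
```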
### Why are the changes needed?
SupportsStreamingUpdate was intended to revive the functionality of the streaming Update output mode for internal data sources, but despite the name, the interface isn't really used to perform actual updates on the sink; all sinks implementing this interface do appends, so strictly speaking it just supports update as append. Renaming the interface makes that clear.
### Does this PR introduce _any_ user-facing change?
No, as the class is only for internal data sources.
### How was this patch tested?
Jenkins test will follow.
Closes #29693 from HeartSaVioR/SPARK-32831.
Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
(commit: db89b0e)
The file was modified external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala (diff)
The file was added sql/catalyst/src/main/scala/org/apache/spark/sql/internal/connector/SupportsStreamingUpdateAsAppend.scala
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/noop/NoopDataSource.scala (diff)
The file was removed sql/catalyst/src/main/scala/org/apache/spark/sql/internal/connector/SupportsStreamingUpdate.scala
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/console.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/ForeachWriterTable.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/memory.scala (diff)
Commit 2f85f9516cfc33a376871cf27f9fb4ac30ecbed8 by dongjoon
[SPARK-32832][SS] Use CaseInsensitiveMap for DataStreamReader/Writer
options
### What changes were proposed in this pull request?
This PR aims to fix nondeterministic behavior of DataStreamReader/Writer options like the following.
```scala
scala> spark.readStream.format("parquet").option("paTh", "1").option("PATH", "2").option("Path", "3").option("patH", "4").option("path", "5").load()
org.apache.spark.sql.AnalysisException: Path does not exist: 1;
```
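A small illustration of the idea, assuming `CaseInsensitiveMap` behaves as it does elsewhere in Spark (differently cased keys collapse to one entry and lookups ignore casing):
```scala
// Illustrative only: with a case-insensitive map, every casing of "path" maps to the
// same entry, so the last value set wins deterministically.
import org.apache.spark.sql.catalyst.util.CaseInsensitiveMap

val opts = CaseInsensitiveMap(Map("paTh" -> "1")) + ("PATH" -> "2") + ("path" -> "5")
opts.get("Path")  // expected: Some("5")
```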
### Why are the changes needed?
This will make the behavior deterministic.
### Does this PR introduce _any_ user-facing change?
Yes, but the previous behavior was nondeterministic.
### How was this patch tested?
Pass the newly added test cases.
Closes #29702 from dongjoon-hyun/SPARK-32832.
Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon
Hyun <dongjoon@apache.org>
(commit: 2f85f95)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamReaderWriterSuite.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala (diff)
Commit dbc41376de636ca2aec354021647a67df319d914 by dongjoon
[SPARK-32841][BUILD] Use Apache Hadoop 3.2.0 for PyPI and CRAN
### What changes were proposed in this pull request?
The PyPI and CRAN distributions were not changed earlier because of concerns about selecting Hadoop and Hive versions. For PyPI, there is now a PR open at https://github.com/apache/spark/pull/29703. For CRAN, Hadoop and Hive versions can already be selected via `SparkR::install.spark`.
### Why are the changes needed?
To keep the default profiles consistent in distributions
### Does this PR introduce _any_ user-facing change?
Yes, the default distributions will use Hadoop 3.2.
### How was this patch tested?
Jenkins tests.
Closes #29704 from HyukjinKwon/SPARK-32058.
Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Dongjoon
Hyun <dongjoon@apache.org>
(commit: dbc4137)
The file was modified dev/create-release/release-build.sh (diff)
Commit 8f610057232c4c2919bdd8bdfba7a4608a360b9f by wenchen
[SPARK-32456][SS][FOLLOWUP] Update doc to note about using SQL statement
with streaming Dataset
### What changes were proposed in this pull request?
This patch proposes to update the docs (both the Structured Streaming guide and the Dataset dropDuplicates method doc) to add a note about using SQL statements with a streaming Dataset.
Once end users create a temp view based on a streaming Dataset, they tend to stop thinking about "streaming" and write whatever they would for a batch query. In many cases that works, but not so smoothly when streaming aggregation is involved: they still need to care about maintaining the state store.
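An illustrative example of the pitfall (hypothetical query; the rate source is used only for demonstration):
```scala
// Illustrative only: the temp view makes this look like a batch query, but it is
// still a streaming aggregation and therefore maintains state in the state store.
val stream = spark.readStream.format("rate").load()
stream.createOrReplaceTempView("rate_view")

val counts = spark.sql(
  "SELECT value % 10 AS bucket, count(*) AS cnt FROM rate_view GROUP BY bucket")

counts.writeStream
  .outputMode("complete")
  .format("console")
  .start()
```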
### Why are the changes needed?
Although SPARK-32456 fixed the weird error message, as a side effect some operations are now enabled on streaming workloads via SQL statements, which is error-prone if end users don't realize what they're doing.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Only doc change.
Closes #29461 from HeartSaVioR/SPARK-32456-FOLLOWUP-DOC.
Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: 8f61005)
The file was modified docs/structured-streaming-programming-guide.md (diff)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala (diff)
Commit 7eb76d698836a251065753117e22285dd1a8aa8f by yamamuro
[SPARK-32828][SQL] Cast from a derived user-defined type to a base type
### What changes were proposed in this pull request?
This PR intends to fix an existing bug in `UserDefinedTypeSuite`, shown below:
```
[info] - SPARK-19311: UDFs disregard UDT type hierarchy (931 milliseconds)
16:22:35.936 WARN org.apache.spark.sql.catalyst.expressions.SafeProjection: Expr codegen error and falling back to interpreter mode
org.apache.spark.SparkException: Cannot cast org.apache.spark.sql.ExampleSubTypeUDT46b1771f to org.apache.spark.sql.ExampleBaseTypeUDT31e8d979.
  at org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeCastFunction(Cast.scala:891)
  at org.apache.spark.sql.catalyst.expressions.CastBase.doGenCode(Cast.scala:852)
  at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$3(Expression.scala:147)
  ...
```
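A hedged sketch of the acceptance rule at play (a plausible reading of the fix, not the verbatim Spark code in `Cast`):
```scala
// Hypothetical simplification: model each UDT by the JVM class of the user-facing
// type it wraps; casting from a derived UDT to a base UDT is acceptable when the
// base user class is assignable from the derived user class.
def udtCastAllowed(fromUserClass: Class[_], toUserClass: Class[_]): Boolean =
  toUserClass.isAssignableFrom(fromUserClass)

// e.g. a "sub type" wrapping ArrayList can be cast to a "base type" wrapping List:
udtCastAllowed(classOf[java.util.ArrayList[_]], classOf[java.util.List[_]])  // true
```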
### Why are the changes needed?
bugfix
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Added unit tests.
Closes #29691 from maropu/FixUdtBug.
Authored-by: Takeshi Yamamuro <yamamuro@apache.org> Signed-off-by:
Takeshi Yamamuro <yamamuro@apache.org>
(commit: 7eb76d6)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/types/TestUDT.scala (diff)
The file was modified sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/UserDefinedTypeSuite.scala (diff)
Commit 5669b212ec38a55ce5db62835801a65ff1021c7f by wenchen
[SPARK-32840][SQL] Invalid interval value can happen to be just adhesive
with the unit
### What changes were proposed in this pull request?
In this PR, we add a check for STRING-form interval values before parsing multiple-unit intervals, and fail directly if the interval value contains alphabetic characters, to prevent correctness issues like `interval '1 day 2' day` = `3 days`.
### Why are the changes needed?
fix correctness issue
### Does this PR introduce _any_ user-facing change?
Yes. In Spark 3.0.0, `interval '1 day 2' day` = `3 days`, but now we fail with a ParseException.
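An illustration of the behavior change (hypothetical session output):
```scala
// Illustrative only: the adhesive "2" in the string value used to be silently added
// to the outer unit, yielding 3 days; it now fails at parse time instead.
spark.sql("SELECT interval '1 day 2' day")
// Spark 3.0.0:      returns 3 days (correctness issue)
// After this patch: org.apache.spark.sql.catalyst.parser.ParseException
```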
### How was this patch tested?
add a test.
Closes #29708 from yaooqinn/SPARK-32840.
Authored-by: Kent Yao <yaooqinn@hotmail.com> Signed-off-by: Wenchen Fan
<wenchen@databricks.com>
(commit: 5669b21)
The file was modified sql/core/src/test/resources/sql-tests/results/interval.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/ansi/interval.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/inputs/interval.sql (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala (diff)
Commit a22871f50a78d5d6784b6e978014bfa1b70a98ce by wenchen
[SPARK-32777][SQL] Aggregation support aggregate function with multiple
foldable expressions
### What changes were proposed in this pull request?
Spark SQL has a bug, shown below:
```
spark.sql("SELECT COUNT(DISTINCT 2), COUNT(DISTINCT 2, 3)").show()
+-----------------+--------------------+
|count(DISTINCT 2)|count(DISTINCT 2, 3)|
+-----------------+--------------------+
|                1|                   1|
+-----------------+--------------------+

spark.sql("SELECT COUNT(DISTINCT 2), COUNT(DISTINCT 3, 2)").show()
+-----------------+--------------------+
|count(DISTINCT 2)|count(DISTINCT 3, 2)|
+-----------------+--------------------+
|                1|                   0|
+-----------------+--------------------+
```
The first query is correct, but the second is not. The root cause is that the second query is rewritten by `RewriteDistinctAggregates`, which expands the output but loses the literal 2.
### Why are the changes needed?
Fix a bug.
`SELECT COUNT(DISTINCT 2), COUNT(DISTINCT 3, 2)` should return `1, 1`.
### Does this PR introduce _any_ user-facing change?
Yes.
### How was this patch tested?
New UT.
Closes #29626 from
beliefer/support-multiple-foldable-distinct-expressions.
Lead-authored-by: gengjiaan <gengjiaan@360.cn> Co-authored-by: beliefer
<beliefer@163.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(commit: a22871f)
The file was modified sql/core/src/test/resources/sql-tests/inputs/group-by-filter.sql (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/count.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/results/group-by-filter.sql.out (diff)
The file was modified sql/core/src/test/resources/sql-tests/inputs/count.sql (diff)
The file was modified sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala (diff)
Commit 5f468cc21ef621151c200edfeea0411342c6d8bb by yamamuro
[SPARK-32822][SQL] Change the number of partitions to zero when a range
is empty with WholeStageCodegen disabled or falled back
### What changes were proposed in this pull request?
This PR changes the behavior of RangeExec, when WholeStageCodegen is disabled or falls back, to set the number of partitions to zero when a range is empty.
In the current master, if WholeStageCodegen takes effect, the number of partitions of an empty range is changed to zero.
```
spark.range(1, 1, 1, 1000).rdd.getNumPartitions
res0: Int = 0
```
But it isn't if WholeStageCodegen is disabled or falls back.
```
spark.conf.set("spark.sql.codegen.wholeStage", false)
spark.range(1, 1, 1, 1000).rdd.getNumPartitions
res2: Int = 1000
```
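With this patch, the non-codegen path is expected to match the codegen behavior (illustrative):
```scala
// Illustrative only: expected result after this change, with WholeStageCodegen disabled.
spark.conf.set("spark.sql.codegen.wholeStage", false)
spark.range(1, 1, 1, 1000).rdd.getNumPartitions  // 0, matching the codegen path
```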
### Why are the changes needed?
To achieve better performance even when WholeStageCodegen is disabled or falls back.
### Does this PR introduce _any_ user-facing change?
Yes. The number of partitions returned by `getNumPartitions` for an empty range will change when WholeStageCodegen is disabled.
### How was this patch tested?
New test.
Closes #29681 from sarutak/zero-size-range.
Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by:
Takeshi Yamamuro <yamamuro@apache.org>
(commit: 5f468cc)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala (diff)
The file was modified sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala (diff)
Commit 328d81a2d1131742bcfba5117896c093db39e721 by yamamuro
[SPARK-32677][SQL][DOCS][MINOR] Improve code comment in
CreateFunctionCommand
### What changes were proposed in this pull request?
We made a mistake in https://github.com/apache/spark/pull/29502, as
there is no code comment to explain why we can't load the UDF class when
creating functions. This PR improves the code comment.
### Why are the changes needed?
To avoid making the same mistake.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
N/A
Closes #29713 from cloud-fan/comment.
Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Takeshi
Yamamuro <yamamuro@apache.org>
(commit: 328d81a)
The file was modified sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala (diff)