[SPARK-31561][SQL] Add QUALIFY Clause #55401

Open
viirya wants to merge 4 commits into apache:master from viirya:qualify-clause

Conversation

@viirya
Member

@viirya viirya commented Apr 17, 2026

What changes were proposed in this pull request?

Add QUALIFY clause to Spark SQL using UnresolvedQualify LogicalPlan node and a self-contained ResolveQualify rule in Analyzer, with structured error conditions.

PR #55019 models QUALIFY as a marker expression (QualifyExpression) wrapped inside a Filter, a design that forces QUALIFY handling to be scattered across four Analyzer rules. This PR instead models QUALIFY as a LogicalPlan node (UnresolvedQualify) resolved by a single self-contained ResolveQualify rule, which completes all of its work in one pass once the child plan is resolved.

Why are the changes needed?

QUALIFY is supported by several popular SQL engines, including Snowflake and Databricks SQL, and users expect it when porting SQL that filters on window-function results. Without it, equivalent Spark queries need an extra subquery or CTE just to filter on a window alias.

This change closes that gap and makes Spark SQL more compatible with existing SQL workloads while preserving clear analyzer rules around window and aggregate semantics.
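To illustrate the gap described above, a QUALIFY query and its pre-QUALIFY equivalent might look like the following sketch (table and column names are illustrative, not taken from this PR):

```sql
-- With QUALIFY: keep only the top-ranked row per department.
SELECT emp, dept, salary,
       ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
FROM employees
QUALIFY rn = 1;

-- Without QUALIFY: the window result must first be materialized in a
-- subquery (or CTE) before the alias can be filtered.
SELECT emp, dept, salary, rn
FROM (
  SELECT emp, dept, salary,
         ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
  FROM employees
) t
WHERE rn = 1;
```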

Does this PR introduce any user-facing change?

Yes. Spark SQL can now parse and analyze queries that use QUALIFY, for example:

SELECT a, ROW_NUMBER() OVER (ORDER BY b) AS rn
FROM t
QUALIFY rn = 1

This PR also introduces user-visible analysis errors for invalid QUALIFY usage, such as using aggregate functions directly in the QUALIFY predicate.
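As a sketch of the invalid case, a predicate that applies an aggregate function directly in the QUALIFY clause would be rejected at analysis time (the query below is illustrative; the exact error condition names are defined in this PR and not reproduced here):

```sql
-- Invalid: QUALIFY filters on window-function results, not bare aggregates.
SELECT a, ROW_NUMBER() OVER (ORDER BY b) AS rn
FROM t
QUALIFY SUM(b) > 10;  -- analysis error: aggregate function in QUALIFY predicate
```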

How was this patch tested?

Unit tests and end-to-end tests.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code

viirya and others added 4 commits April 18, 2026 13:40
Add QUALIFY clause to Spark SQL using UnresolvedQualify LogicalPlan node
and a self-contained ResolveQualify rule in Analyzer, with structured
error conditions.

Co-authored-by: Claude Code
Co-authored-by: Chao Sun <chao@openai.com>
…dation, and strict error handling to ResolveQualify

- Add resolveConditionSubqueries to handle correlated subqueries in QUALIFY
  conditions, using the same fake-Project pattern as HAVING resolution.
- Validate that resolved attributes in the Aggregate case are present in
  grouping expressions or aggregate output, rejecting invalid references.
- Change the catch-all case in resolveQualifyCondition to throw
  SparkException.internalError instead of silently returning.

Co-authored-by: Claude Code
Co-authored-by: Chao Sun <chao@openai.com>
…on and aggregate validation

- Add test for correlated subquery in QUALIFY condition (EXISTS).
- Add test that non-grouping column references with GROUP BY are rejected.

Co-authored-by: Claude Code
Co-authored-by: Chao Sun <chao@openai.com>
…r test for QUALIFY clause

- Generate qualify.sql.out and analyzer-results/qualify.sql.out via
  SPARK_GENERATE_GOLDEN_FILES=1.
- Fix SparkSqlParserSuite QUALIFY test to assert node types instead of
  full plan tree comparison.
- Fix scalastyle issues: non-ASCII em-dash, import line length, unused import.

Co-authored-by: Claude Code
Co-authored-by: Chao Sun <chao@openai.com>
@viirya viirya requested a review from sunchao April 18, 2026 23:20
