Skip to content

Conversation

@adriangb
Copy link
Contributor

Adds more interesting test cases extracted from #19538

The reasoning is that it's (1) better to split the additional tests out from that PR to make that PR easier to review and (2) it's a good thing to document current behavior with complex edge cases like these, so even if that PR doesn't get merged there is still value in having these as regression tests.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jan 26, 2026
@adriangb adriangb requested a review from alamb January 26, 2026 21:10
Comment on lines 1174 to 1193
###
# Test 11.4: Core augmentation with column reordering
# Projects: s['value'], s['label'], id (note 'id' is LAST in the SELECT list)
# Predicate: id > 1
# When 'id' is added for filtering, columns are reordered to match input schema order.
# Result: 'id' moves from SELECT position 3 to physical plan index 2.
# Physical plan shows: FilterExec: id@2 > 1 (note the index after reordering).
###

query TT
EXPLAIN SELECT s['value'], s['label'], id FROM simple_struct WHERE id > 1;
----
logical_plan
01)Projection: get_field(simple_struct.s, Utf8("value")), get_field(simple_struct.s, Utf8("label")), simple_struct.id
02)--Filter: simple_struct.id > Int64(1)
03)----TableScan: simple_struct projection=[id, s], partial_filters=[simple_struct.id > Int64(1)]
physical_plan
01)ProjectionExec: expr=[get_field(s@1, value) as simple_struct.s[value], get_field(s@1, label) as simple_struct.s[label], id@0 as id]
02)--FilterExec: id@0 > 1
03)----DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/projection_pushdown/simple.parquet]]}, projection=[id, s], file_type=parquet, predicate=id@0 > 1, pruning_predicate=id_null_count@1 != row_count@2 AND id_max@0 > 1, required_guarantees=[]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you help explain this test to me? Comments state

Physical plan shows: FilterExec: id@2 > 1 (note the index after reordering).

But the plan shows

02)--FilterExec: id@0 > 1

Is this expected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should just remove the comment. These were pulled out of another PR and updated, the index will change once the ProjectionExec is pushed through the FilterExec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended in 37e75b1

Remove comments that imply projections will be pushed down, keeping only
generic descriptions of what each test covers. This reflects that these
tests are now regression tests rather than optimization development tests.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thanks @adriangb and @Jefffrey


###
# Test 4.4: Projection with duplicate column through Sort
# The projection expands the logical output (3→4 columns) but reduces physical columns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the comment "reduces physical columns" as there are three physical column and all three are scanned 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is that after the projection there are 4 columns but before the projection there are 3 columns. But I agree the comment is confusing, I'll reword it


# Verify correctness
query IIII
SELECT col_a, col_b, col_c, col_b as col_b_dup FROM three_cols ORDER BY col_a;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given this data was inserted in order of col_a it might be better to check a different sort order (e.g. sort by col_a desc)

@adriangb adriangb added this pull request to the merge queue Jan 27, 2026
Merged via the queue into apache:main with commit 1a48d58 Jan 27, 2026
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants