compiler parity for typealias#7645
Conversation
📝 WalkthroughWalkthroughThe pull request refactors symbol table and code generation for PEP 695 type aliases and PEP 649 annotations. It introduces Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
76ef95e to
8a0f9ba
Compare
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] test: cpython/Lib/test/test_compile.py (TODO: 18) dependencies: dependent tests: (no tests depend on compile) Legend:
|
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
crates/codegen/src/compile.rs (3)
4585-4592:⚠️ Potential issue | 🟡 MinorUse the same
formatvarname as the module/class annotation symbol table.
symboltable.rsregisters module/class__annotate__scopes with"format", but Line 4589 appends".format". Sinceenter_scopealready copies symbol-table varnames, this creates an extra unused local and breaksco_varnamesparity.Proposed fix
self.current_code_info() .metadata .varnames - .insert(".format".to_owned()); + .insert("format".to_owned());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/compile.rs` around lines 4585 - 4592, The code is inserting ".format" into the current code object's metadata.varnames, creating an extra local that breaks co_varnames parity with symboltable.rs which registers "format" for __annotate__ scopes; change the insertion in the block that calls self.current_code_info().metadata.varnames.insert(...) to insert "format" (no leading dot) so it matches the symbol-table/enter_scope behavior and keeps co_varnames aligned with emit_format_validation() and the __annotate__ scopes managed in symboltable.rs.
1892-1911:⚠️ Potential issue | 🟠 MajorSkip synthetic parents repeatedly when building
__qualname__.This only skips one parent. For a nested scope inside an annotation scope inside a type-params scope, the grandparent can still be
TypeParams, so the generated qualname can include<generic parameters of ...>instead of the real owner.Proposed fix
- let parent_scope = self - .symbol_table_stack - .get(parent_idx) - .map(|table| table.typ); - - // CPython skips both generic-parameter scopes and annotation scopes - // when building qualnames for the contained function/class code object. - if matches!( - parent_scope, - Some(CompilerScope::TypeParams | CompilerScope::Annotation) - ) || parent.metadata.name.starts_with("<generic parameters of ") - { - if stack_size == 2 { - // If we're immediately within the module, qualname is just the name - return current_obj_name; - } - // Use grandparent - parent_idx = stack_size - 3; - parent = &self.code_stack[parent_idx]; - } + // CPython skips generic-parameter and annotation scopes when building + // qualnames for contained function/class code objects. These synthetic + // scopes can be stacked, so keep walking until a real parent is found. + loop { + let parent_scope = self + .symbol_table_stack + .get(parent_idx) + .map(|table| table.typ); + let skip_parent = matches!( + parent_scope, + Some(CompilerScope::TypeParams | CompilerScope::Annotation) + ) || parent.metadata.name.starts_with("<generic parameters of "); + if !skip_parent { + break; + } + if parent_idx == 0 { + return current_obj_name; + } + parent_idx -= 1; + parent = &self.code_stack[parent_idx]; + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/compile.rs` around lines 1892 - 1911, The current logic only skips one synthetic parent when building __qualname__ (using parent_scope, symbol_table_stack and checks for CompilerScope::TypeParams | CompilerScope::Annotation and parent.metadata.name starting with "<generic parameters of "), which can leave further synthetic ancestors in place; change this to repeatedly skip synthetic parents by looping (while) to move parent_idx up until the parent is not a TypeParams/Annotation and its metadata.name is not the generic-parameters sentinel, updating parent_scope and parent from symbol_table_stack and code_stack each iteration, and preserve the existing early-return when the adjusted stack_size indicates the module-level parent (using stack_size to decide when to return current_obj_name).
3011-3016:⚠️ Potential issue | 🔴 CriticalFix type-parameter evaluator parameter spelling to match type-alias evaluator and symbol table registration.
Change
"format"to".format"at line 3012. The type-parameter bound/default evaluator scope should use the same parameter name as the type-alias evaluator (compile_typealias_value_closureline 3068) and the symbol table registration (symboltable.rs lines 1803, 2323), which all use".format". Using different spellings will create inconsistentco_varnamesindices.Current code
self.current_code_info() .metadata .varnames .insert("format".to_owned());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/compile.rs` around lines 3011 - 3016, The varname inserted into the current code info is misspelled — replace the string "format" with ".format" in the call to self.current_code_info().metadata.varnames.insert(...) so the type-parameter evaluator parameter matches the type-alias evaluator (see compile_typealias_value_closure) and the symbol table registration entries; this ensures co_varnames indices remain consistent with emit_format_validation() and the symbol table registrations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 2329-2335: The branch for SymbolScope::GlobalExplicit incorrectly
falls back to checking can_see_class_scope and returning NameOp::DictOrGlobals;
update the match arm handling SymbolScope::GlobalExplicit so it unconditionally
returns NameOp::Global (remove the can_see_class_scope check and the
NameOp::DictOrGlobals path) so explicit global declarations always use
NameOp::Global instead of class-dict lookup.
In `@crates/codegen/src/symboltable.rs`:
- Around line 1769-1778: Update the grammatically incorrect error string in the
SymbolTableError returned from the name.as_name_expr() check: replace "type
alias expect name" with a clear message such as "type alias expects name" or
"type alias must be a Name node" in the code inside the conditional that checks
let Some(name_expr) = name.as_name_expr(); ensure the replacement occurs where
SymbolTableError { error: ... } is constructed so the new message is
logged/returned; optionally add or adjust a unit test to exercise this error
path to validate the corrected message.
---
Outside diff comments:
In `@crates/codegen/src/compile.rs`:
- Around line 4585-4592: The code is inserting ".format" into the current code
object's metadata.varnames, creating an extra local that breaks co_varnames
parity with symboltable.rs which registers "format" for __annotate__ scopes;
change the insertion in the block that calls
self.current_code_info().metadata.varnames.insert(...) to insert "format" (no
leading dot) so it matches the symbol-table/enter_scope behavior and keeps
co_varnames aligned with emit_format_validation() and the __annotate__ scopes
managed in symboltable.rs.
- Around line 1892-1911: The current logic only skips one synthetic parent when
building __qualname__ (using parent_scope, symbol_table_stack and checks for
CompilerScope::TypeParams | CompilerScope::Annotation and parent.metadata.name
starting with "<generic parameters of "), which can leave further synthetic
ancestors in place; change this to repeatedly skip synthetic parents by looping
(while) to move parent_idx up until the parent is not a TypeParams/Annotation
and its metadata.name is not the generic-parameters sentinel, updating
parent_scope and parent from symbol_table_stack and code_stack each iteration,
and preserve the existing early-return when the adjusted stack_size indicates
the module-level parent (using stack_size to decide when to return
current_obj_name).
- Around line 3011-3016: The varname inserted into the current code info is
misspelled — replace the string "format" with ".format" in the call to
self.current_code_info().metadata.varnames.insert(...) so the type-parameter
evaluator parameter matches the type-alias evaluator (see
compile_typealias_value_closure) and the symbol table registration entries; this
ensures co_varnames indices remain consistent with emit_format_validation() and
the symbol table registrations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 35eaf435-1de2-4960-af00-7f73988741ce
⛔ Files ignored due to path filters (1)
Lib/test/test_compile.pyis excluded by!Lib/**
📒 Files selected for processing (3)
crates/codegen/src/compile.rscrates/codegen/src/symboltable.rscrates/vm/src/stdlib/_symtable.rs
| SymbolScope::GlobalExplicit => { | ||
| if can_see_class_scope { | ||
| NameOp::DictOrGlobals | ||
| } else { | ||
| NameOp::Global | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Read-only CPython parity probe; requires a CPython with PEP 649 support.
python3.14 - <<'PY'
import dis
src = """
X = 'global'
class C:
locals()['X'] = 'class'
global X
y: X
"""
code = compile(src, "<probe>", "exec")
dis.dis(code)
for const in code.co_consts:
if hasattr(const, "co_name"):
print("\\nCODE", const.co_name, const.co_varnames, const.co_freevars)
dis.dis(const)
for nested in const.co_consts:
if hasattr(nested, "co_name"):
print("\\nNESTED", nested.co_name, nested.co_varnames, nested.co_freevars)
dis.dis(nested)
PYRepository: RustPython/RustPython
Length of output: 115
🏁 Script executed:
cat -n crates/codegen/src/compile.rs | sed -n '2300,2350p'Repository: RustPython/RustPython
Length of output: 2279
🏁 Script executed:
rg "enum NameOp" -A 10 crates/codegen/src/Repository: RustPython/RustPython
Length of output: 642
🏁 Script executed:
rg "DictOrGlobals\|NameOp::Global" -B 2 -A 2 crates/codegen/src/ | head -50Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Test with available Python to understand global behavior in class scope
python3 <<'PYTHON'
import dis
# Test case: global declaration in class scope with annotation
src = """
X = 'global'
class C:
locals()['X'] = 'class_local'
global X
y: X
"""
code = compile(src, "<test>", "exec")
print("=== Module level ===")
dis.dis(code)
# Extract and show class code
for const in code.co_consts:
if hasattr(const, "co_name") and const.co_name == "C":
print("\n=== Class C ===")
dis.dis(const)
# Show annotation code if exists
for nested in const.co_consts:
if hasattr(nested, "co_name"):
print(f"\n=== Nested: {nested.co_name} ===")
dis.dis(nested)
PYTHONRepository: RustPython/RustPython
Length of output: 2511
Remove classdict fallback from explicit global declarations.
GlobalExplicit must unconditionally use NameOp::Global, not DictOrGlobals. CPython's global declaration bypasses class scope lookup entirely, even within annotation contexts under PEP 649. The current code at lines 2329–2335 violates this by checking __classdict__ first when can_see_class_scope is set, which breaks code that relies on explicit global names overriding class-local bindings.
Change to:
SymbolScope::GlobalExplicit => NameOp::Global🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/codegen/src/compile.rs` around lines 2329 - 2335, The branch for
SymbolScope::GlobalExplicit incorrectly falls back to checking
can_see_class_scope and returning NameOp::DictOrGlobals; update the match arm
handling SymbolScope::GlobalExplicit so it unconditionally returns
NameOp::Global (remove the can_see_class_scope check and the
NameOp::DictOrGlobals path) so explicit global declarations always use
NameOp::Global instead of class-dict lookup.
| let Some(name_expr) = name.as_name_expr() else { | ||
| return Err(SymbolTableError { | ||
| error: "type alias expect name".to_owned(), | ||
| location: Some( | ||
| self.source_file | ||
| .to_source_code() | ||
| .source_location(name.range().start(), PositionEncoding::Utf8), | ||
| ), | ||
| }); | ||
| }; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm the type of StmtTypeAlias.name in ruff_python_ast and whether non-Name is reachable.
rg -nP 'struct\s+StmtTypeAlias\b' -C 10
rg -nP '\bTypeAlias\s*\{' -C 5 --type=rustRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Search for StmtTypeAlias definition with broader patterns
fd -i typealias -t f --type rust | head -20
rg -l 'StmtTypeAlias' --type=rustRepository: RustPython/RustPython
Length of output: 391
🏁 Script executed:
#!/bin/bash
# Look in the symboltable.rs file itself for context around the error
cat -n crates/codegen/src/symboltable.rs | sed -n '1760,1790p'Repository: RustPython/RustPython
Length of output: 1649
🏁 Script executed:
#!/bin/bash
# Search for the as_name_expr method to understand what it does
rg -B 3 -A 3 'fn as_name_expr' --type=rustRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Search for StmtTypeAlias struct definition
rg 'struct StmtTypeAlias' -A 10Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Search for as_name_expr method implementation in the Expr type
rg 'as_name_expr' -B 2 -A 5Repository: RustPython/RustPython
Length of output: 1268
🏁 Script executed:
#!/bin/bash
# Look for the definition of Expr type and Name variant
rg 'pub enum Expr' -A 50 | head -100Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Search for StmtTypeAlias in the AST files
rg 'StmtTypeAlias' crates/vm/src/stdlib/_ast/ -A 8Repository: RustPython/RustPython
Length of output: 3722
🏁 Script executed:
#!/bin/bash
# Look for the Expr enum definition
rg -l 'pub enum Expr' crates/Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Search for as_name_expr implementation more broadly
rg 'as_name_expr' --type-list | head -5
find crates/ -name '*.rs' -exec grep -l 'as_name_expr' {} \;Repository: RustPython/RustPython
Length of output: 535
🏁 Script executed:
#!/bin/bash
# Look for the Expr enum or struct in parser crates
find crates/ -name '*.rs' | xargs grep -l 'enum Expr' | head -5Repository: RustPython/RustPython
Length of output: 98
🏁 Script executed:
#!/bin/bash
# Search for the actual Rust ast module definition of StmtTypeAlias
rg 'pub struct StmtTypeAlias' -A 5Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Search more broadly for TypeAlias struct
rg 'struct.*TypeAlias' -B 2 -A 8 | head -60Repository: RustPython/RustPython
Length of output: 2337
🏁 Script executed:
#!/bin/bash
# Look for impl blocks with as_name_expr
rg 'impl.*Expr' -A 30 | grep -A 30 'as_name_expr'Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Find files that define StmtTypeAlias (the Rust struct, not Python wrapper)
find crates/ -name '*.rs' -exec grep -l 'ast::StmtTypeAlias' {} \;Repository: RustPython/RustPython
Length of output: 133
🏁 Script executed:
#!/bin/bash
# Look in the ast module definition
find crates/ -path '*/ast/*.rs' -o -path '*/ast.rs' | head -20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Search for where StmtTypeAlias struct is defined with name field
rg 'StmtTypeAlias.*{' -B 2 -A 15 | head -80Repository: RustPython/RustPython
Length of output: 160
🏁 Script executed:
#!/bin/bash
# Check crates directory structure
ls -la crates/ | grep -E 'parser|ast'Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Look at imports in symboltable.rs to find where ast types come from
head -50 crates/codegen/src/symboltable.rs | grep -E 'use|import'Repository: RustPython/RustPython
Length of output: 341
🏁 Script executed:
#!/bin/bash
# Look for where the ast module is imported
rg 'use.*ast' crates/codegen/src/symboltable.rs | head -10Repository: RustPython/RustPython
Length of output: 161
🏁 Script executed:
#!/bin/bash
# Find all ast modules
find crates/ -name 'lib.rs' -o -name 'mod.rs' | xargs grep -l 'StmtTypeAlias' 2>/dev/nullRepository: RustPython/RustPython
Length of output: 49
🏁 Script executed:
#!/bin/bash
# Check if there's a ruff dependency
rg 'ruff' crates/codegen/Cargo.tomlRepository: RustPython/RustPython
Length of output: 183
🏁 Script executed:
#!/bin/bash
# Check if ruff_python_ast is available in the workspace or external
grep -r 'ruff_python_ast' crates/codegen/Cargo.toml
cat crates/codegen/Cargo.toml | grep -A 5 -B 5 'ruff'Repository: RustPython/RustPython
Length of output: 679
🏁 Script executed:
#!/bin/bash
# Look at Cargo.lock or workspace definition for ruff version
find . -name 'Cargo.lock' -o -name 'workspace.toml' | head -3Repository: RustPython/RustPython
Length of output: 77
🌐 Web query:
ruff_python_ast StmtTypeAlias name field type
💡 Result:
In the ruff_python_ast crate, StmtTypeAlias (corresponding to Python's PEP 695 type statement) is a struct with the following fields: pub struct StmtTypeAlias { range: TextRange, node_index: AtomicNodeIndex, name: Expr, // The name of the type alias (typically an ExprName) type_params: Option<Box>, value: Expr, // The type expression assigned to the alias } The name field is an Expr (usually ExprName for simple identifiers), not a direct string or Identifier type. This matches the ast.toml definition used to generate the Rust AST models.
Citations:
- 1: https://github.com/astral-sh/ruff/blob/main/crates/ruff_python_ast/ast.toml
- 2: https://github.com/astral-sh/ruff/blob/main/crates/ruff_python_ast/src/generated.rs
- 3: https://github.com/astral-sh/ruff/blob/56eb6b62/crates/ruff_python_ast/src/helpers.rs
🏁 Script executed:
#!/bin/bash
# Now search for as_name_expr implementation in ruff-related code or in RustPython
rg 'as_name_expr' -B 3 -A 10Repository: RustPython/RustPython
Length of output: 2178
🌐 Web query:
ruff as_name_expr Expr method implementation
💡 Result:
In Ruff's ruff_python_ast crate, Expr is an enum representing Python expressions. The as_name_expr method is a helper function (likely defined via a trait or extension) that pattern-matches an Expr to check if it is an Expr::Name variant and returns Option<&ExprName> if true, allowing safe downcasting to access the name identifier (id: &Name). This pattern is used extensively in Ruff's linter rules for semantic analysis, such as: - Checking if call arguments or attribute values are simple names (e.g., in PERF rules for list comprehensions: value.as_name_expr). - Resolving identifiers in type checking and binding analysis (e.g., func.as_name_expr in rule checks). - Validating targets in named expressions (walrus operator :=, Expr::Named). Example usage from source: let Some(list_name) = value.as_name_expr else { return; }; If value is Expr::Name(ExprName { id, .. }), it extracts the ExprName; otherwise, None. The method follows Rust's pattern-matching idiom for AST node visitors, similar to as_name, as_call, etc. No official documentation found, but implementation is in crates/ruff_python_ast, used across ruff_linter rules.
Citations:
- 1: https://github.com/astral-sh/ruff/blob/main/crates%2Fruff_linter%2Fsrc%2Frules%2Fperflint%2Frules%2Fmanual_list_comprehension.rs
- 2: https://github.com/astral-sh/ruff/blob/main/crates%2Fruff_linter%2Fsrc%2Frules%2Fpyupgrade%2Frules%2Fconvert_typed_dict_functional_to_class.rs
- 3: https://github.com/astral-sh/ruff/blob/main/crates%2Fruff_linter%2Fsrc%2Frules%2Fpylint%2Frules%2Fnamed_expr_without_context.rs
- 4: https://github.com/astral-sh/ruff/blob/56eb6b62/crates/ruff_python_ast/src/helpers.rs
- 5: https://github.com/astral-sh/ruff/blob/56eb6b62/crates/ruff_linter/src/checkers/ast/analyze/expression.rs
- 6: https://github.com/astral-sh/ruff/blob/56eb6b62/crates/ruff_python_ast/src/nodes.rs
🏁 Script executed:
#!/bin/bash
# Search for test cases related to TypeAlias error handling
rg 'type alias' -i crates/ | grep -i 'test\|error' | head -20Repository: RustPython/RustPython
Length of output: 292
🏁 Script executed:
#!/bin/bash
# Look for how other syntax errors are phrased in the codebase
rg 'error.*expect' -i crates/codegen/src/ | head -10Repository: RustPython/RustPython
Length of output: 278
🏁 Script executed:
#!/bin/bash
# Check if there are tests for invalid TypeAlias syntax
find crates/ -name '*.py' -o -name '*.rs' | xargs grep -l 'type.*=' | grep testRepository: RustPython/RustPython
Length of output: 125
🌐 Web query:
Python PEP 695 type alias syntax error message name must be identifier
💡 Result:
The error message "name must be identifier" in Python PEP 695 type alias syntax occurs when the name after the type keyword is not a valid Python identifier. According to PEP 695, the syntax is type NAME = expression, where NAME must follow Python's identifier rules: it must start with a letter (a-z, A-Z) or underscore (_), followed by letters, digits (0-9), or underscores. Invalid characters (e.g., numbers at start, special chars like -, $, @) or keywords trigger a SyntaxError with this message during parsing. Examples: - Valid: type MyAlias = list[int] - Invalid: type 1alias = int (starts with digit) → SyntaxError: name must be identifier - Invalid: type my-alias = str (contains '-') → SyntaxError: invalid character in identifier or similar, but parser expects NAME token. This is enforced by the grammar rule type_alias: "type" n=NAME t=[type_params] '=' b=expression in PEP 695. Python's lexer defines identifiers strictly (docs.python.org/3/reference/lexical_analysis.html#identifiers), and the PEG parser raises precise SyntaxErrors for violations in type statements (Python 3.12+). To fix: Use a valid identifier name (e.g., capitalize like MyType, avoid reserved words/special chars).
Citations:
- 1: https://peps.python.org/pep-0695/
- 2: https://typing.python.org/en/latest/spec/aliases.html
- 3: https://docs.python.org/3/reference/lexical_analysis.html
- 4: https://jellezijlstra.github.io/pep695.html
- 5: misleading SyntaxError (f-string and type alias related) python/cpython#109114
Fix grammatically incorrect error message.
"type alias expect name" is grammatically incorrect. It should be "type alias expects name" or restructured as "type alias must be a Name node".
The error path is reachable when the parser produces non-Name expressions during error recovery or for malformed input. Adding a unit test for this path would be beneficial.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/codegen/src/symboltable.rs` around lines 1769 - 1778, Update the
grammatically incorrect error string in the SymbolTableError returned from the
name.as_name_expr() check: replace "type alias expect name" with a clear message
such as "type alias expects name" or "type alias must be a Name node" in the
code inside the conditional that checks let Some(name_expr) =
name.as_name_expr(); ensure the replacement occurs where SymbolTableError {
error: ... } is constructed so the new message is logged/returned; optionally
add or adjust a unit test to exercise this error path to validate the corrected
message.
Summary by CodeRabbit
Bug Fixes
Chores