Bytecode parity CFG normalization and jump cleanup#7721
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR refactors bytecode code generation with improved constant handling, enhanced try/finally control-flow shaping through NOP/jump management, better global name resolution in class scopes, incremental static attribute collection, and extensive structural test updates. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 6/8 reviews remaining, refill in 13 minutes and 18 seconds.Comment |
c9dec96 to
ded263c
Compare
Reorder try/except/else/finally to emit else+finally before except handlers matching CPython layout. Add set_no_location for cleanup blocks. Extend CFG reorder pass to handle true-path jump-back for generators, break/continue, and assert in loops. Add stop-iteration error handler awareness to block protection.
remove_nops and remove_redundant_nops_in_blocks repeated has_jump_predecessor / has_plain_jump_predecessor / target lookups per block, scanning all blocks each time. With ~200,000 if blocks this became O(B^2 * I) and timed out test_compile.test_stack_overflow. Fold the three flags into one O(B * I) pass via compute_target_predecessor_flags.
ded263c to
23c689d
Compare
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: (module 'class test_dis' not found) Legend:
|
| "jitted", | ||
| "jitting", | ||
| "kwonly", | ||
| "lolcatz", |
There was a problem hiding this comment.
adding vocabulary only for string in test data is bad idea
There was a problem hiding this comment.
Maybe leave it as a single ignore statement in one file?
There was a problem hiding this comment.
The test was duplication of test_dis.py. removed the test
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 10175-10239: try_fold_slice_constant_part is too narrow (it only
matches literal AST variants) so slice bounds like unary-folded values and
special names (e.g. __debug__) aren't recognized; update
try_fold_slice_constant_part to first call the existing try_fold_constant_expr
(or delegate to the same folding logic) for the given expr and, if that returns
Some(ConstantData), return it; keep the existing literal matching as a fallback
only if try_fold_constant_expr isn't available/applicable, and preserve error
mapping (map_err(|e| self.error(e))?) and the CompileResult<Option<...>>
contract so try_fold_constant_slice can treat slice bounds the same as other
constant folding paths.
- Around line 574-607: statement_ends_with_scope_exit currently omits
ast::Stmt::Try and so misses nested-try cases; update
statement_ends_with_scope_exit to handle ast::Stmt::Try (match the Try arm) and
return true when the try has a non-empty finalbody or when it has handlers and
its try body ends with a scope-exit (mirror the logic used in
preserves_finally_entry_nop: check finalbody.is_empty() and/or
(!handlers.is_empty() && Self::statements_end_with_scope_exit(body)) so nested
try inside if/else is treated as ending with a scope-exit).
In `@crates/codegen/src/ir.rs`:
- Around line 4085-4088: The code currently bails out when
first_unprotected_suffix(block) returns Some(_) but it should treat that
position as the start of the deopt segment instead of continuing; change the
branch that checks same_block_tail_start so that when Some(start_idx) is
returned you set the segment start to that index (i.e., begin splitting/handling
the deopt segment at start_idx) and proceed with the existing deopt/splitting
logic, rather than continuing; ensure this preserves behavior elsewhere (blocks
still split only at jumps) and covers the same-block protected→unprotected tail
case (refer to first_unprotected_suffix, same_block_tail_start, and the
surrounding block-splitting/deopt logic).
- Around line 4410-4417: The ReturnIter matcher currently only checks for
Instruction::Call after the current cursor, so keyword-call variants are
ignored; update the branch that inspects real_instrs.get(cursor + 1) to also
accept Instruction::CallKw (in addition to Instruction::Call) when deciding to
produce DeoptKind::ReturnIter (with tail_start_idx: cursor + 2), i.e. match both
Call and CallKw against the real instruction (same logic that uses
Instruction::GetIter afterwards) so CallKw triggers the same deopt path;
reference symbols: Instruction::Call, Instruction::CallKw,
DeoptKind::ReturnIter, Instruction::GetIter, real_instrs, cursor.
- Around line 2563-2572: starts_after_for_cleanup currently only checks
immediate fallthroughs via block.next and misses cases where an END_FOR/POP_ITER
cleanup falls through through one or more empty blocks; update the computation
of fallthrough_predecessors (and thus starts_after_for_cleanup) to walk the
chain of next pointers from each block (using BlockIdx and block.next) until you
find the first non-empty successor (or NULL) and record that predecessor index,
then compute starts_after_for_cleanup by calling ends_with_for_cleanup on that
resolved successor index instead of the immediate block.next so cleanup
fallthroughs through empty blocks are correctly detected.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 368e1a8e-a411-4fc1-979f-9354540ded8e
⛔ Files ignored due to path filters (1)
Lib/test/test_dis.pyis excluded by!Lib/**
📒 Files selected for processing (4)
.cspell.jsoncrates/codegen/src/compile.rscrates/codegen/src/ir.rscrates/codegen/src/symboltable.rs
| fn statement_ends_with_scope_exit(stmt: &ast::Stmt) -> bool { | ||
| match stmt { | ||
| ast::Stmt::Return(_) | ast::Stmt::Raise(_) => true, | ||
| ast::Stmt::If(ast::StmtIf { | ||
| body, | ||
| elif_else_clauses, | ||
| .. | ||
| }) => { | ||
| let has_else = elif_else_clauses | ||
| .last() | ||
| .is_some_and(|clause| clause.test.is_none()); | ||
| has_else | ||
| && Self::statements_end_with_scope_exit(body) | ||
| && elif_else_clauses | ||
| .iter() | ||
| .all(|clause| Self::statements_end_with_scope_exit(&clause.body)) | ||
| } | ||
| _ => false, | ||
| } | ||
| } | ||
|
|
||
| fn preserves_finally_entry_nop(body: &[ast::Stmt]) -> bool { | ||
| body.last().is_some_and(|stmt| match stmt { | ||
| ast::Stmt::Try(ast::StmtTry { | ||
| body, | ||
| handlers, | ||
| finalbody, | ||
| .. | ||
| }) => { | ||
| !finalbody.is_empty() | ||
| || (!handlers.is_empty() && Self::statements_end_with_scope_exit(body)) | ||
| } | ||
| _ => false, | ||
| }) |
There was a problem hiding this comment.
Handle nested try statements in the scope-exit heuristic.
statement_ends_with_scope_exit() never returns true for ast::Stmt::Try, so preserves_finally_entry_nop() still drops the outer finally-entry anchor for shapes like if cond: try: return ... finally: ... else: try: return ... finally: .... That is the same nested-finally case this helper is meant to preserve.
Possible fix
fn statement_ends_with_scope_exit(stmt: &ast::Stmt) -> bool {
match stmt {
ast::Stmt::Return(_) | ast::Stmt::Raise(_) => true,
+ ast::Stmt::Try(ast::StmtTry {
+ body,
+ handlers,
+ orelse,
+ finalbody,
+ ..
+ }) => {
+ let handlers_exit = !handlers.is_empty()
+ && handlers.iter().all(|handler| {
+ let ast::ExceptHandler::ExceptHandler(
+ ast::ExceptHandlerExceptHandler { body, .. },
+ ) = handler;
+ Self::statements_end_with_scope_exit(body)
+ });
+ Self::statements_end_with_scope_exit(body)
+ && (orelse.is_empty() || Self::statements_end_with_scope_exit(orelse))
+ && (!finalbody.is_empty() || handlers_exit)
+ }
ast::Stmt::If(ast::StmtIf {
body,
elif_else_clauses,
..
}) => {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/codegen/src/compile.rs` around lines 574 - 607,
statement_ends_with_scope_exit currently omits ast::Stmt::Try and so misses
nested-try cases; update statement_ends_with_scope_exit to handle ast::Stmt::Try
(match the Try arm) and return true when the try has a non-empty finalbody or
when it has handlers and its try body ends with a scope-exit (mirror the logic
used in preserves_finally_entry_nop: check finalbody.is_empty() and/or
(!handlers.is_empty() && Self::statements_end_with_scope_exit(body)) so nested
try inside if/else is treated as ending with a scope-exit).
| let same_block_tail_start = first_unprotected_suffix(block); | ||
| if same_block_tail_start.is_some() { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Don’t skip the same-block protected tail case.
first_unprotected_suffix() returning Some(_) is the exact signal that this block already contains the protected→unprotected transition you want to deopt, but the current code bails out instead of starting the segment there. Because blocks are only split at jumps, not at POP_BLOCK, this leaves reachable same-block store-tail cases untreated.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/codegen/src/ir.rs` around lines 4085 - 4088, The code currently bails
out when first_unprotected_suffix(block) returns Some(_) but it should treat
that position as the start of the deopt segment instead of continuing; change
the branch that checks same_block_tail_start so that when Some(start_idx) is
returned you set the segment start to that index (i.e., begin splitting/handling
the deopt segment at start_idx) and proceed with the existing deopt/splitting
logic, rather than continuing; ensure this preserves behavior elsewhere (blocks
still split only at jumps) and covers the same-block protected→unprotected tail
case (refer to first_unprotected_suffix, same_block_tail_start, and the
surrounding block-splitting/deopt logic).
- try_fold_constant_slice now delegates to try_fold_constant_expr so slice bounds accept the same constants other folding paths do (unary-folded values, __debug__, etc.). - remove_nops resolves fallthrough predecessors through empty blocks via next_nonempty_block before checking ends_with_for_cleanup. - should_deopt_borrowed_attr_chain ReturnIter matcher now accepts Instruction::CallKw alongside Instruction::Call, matching the Call/CallKw treatment in the surrounding deopt trigger check.
There was a problem hiding this comment.
♻️ Duplicate comments (3)
crates/codegen/src/ir.rs (2)
2563-2568:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDon’t let empty blocks overwrite the cleanup predecessor.
This still breaks on
A -> empty -> B: Line 2567 gets overwritten by the empty block, sostarts_after_for_cleanup[B]becomes false and the block-start no-location NOP can survive incorrectly. Record only the last non-empty fallthrough predecessor here.Suggested fix
let mut fallthrough_predecessors = vec![None; self.blocks.len()]; for (pred_idx, block) in self.blocks.iter().enumerate() { + if block.instructions.is_empty() { + continue; + } let next = next_nonempty_block(&self.blocks, block.next); if next != BlockIdx::NULL { fallthrough_predecessors[next.idx()] = Some(pred_idx); } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/ir.rs` around lines 2563 - 2568, The fallthrough_predecessors assignment currently allows empty blocks to overwrite a non-empty predecessor (breaking cases like A -> empty -> B); in the loop over self.blocks (using fallthrough_predecessors, next_nonempty_block, and pred_idx) only set fallthrough_predecessors[next.idx()] = Some(pred_idx) when the current block is non-empty (i.e., skip assigning for empty blocks) so that the last non-empty fallthrough predecessor is recorded and doesn't get clobbered.
4086-4088:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse the same-block unprotected suffix instead of skipping it.
first_unprotected_suffix()returningSome(start)is the exact case this pass needs to deopt, but Line 4087 bails out. That still leaves reachable same-block protected→unprotected store tails untreated.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/ir.rs` around lines 4086 - 4088, The code currently skips blocks when first_unprotected_suffix(block) returns Some(start), but this is the exact case that should trigger deoptimization; instead of continuing, use the returned start index to handle the same-block protected→unprotected store tail (i.e., call the deopt/emission logic with same_block_tail_start or the start value from first_unprotected_suffix). Update the branch that checks same_block_tail_start to invoke the same deopt path used for cross-block tails (reusing the deopt emission code path or helper function) so same-block unprotected suffixes are processed rather than ignored.crates/codegen/src/compile.rs (1)
574-607:⚠️ Potential issue | 🟠 Major | ⚡ Quick winBroaden the finally-entry preservation heuristic.
Line 595 still only preserves the anchor when the trailing statement is a literal
Stmt::Try, and Line 574 still treats nestedStmt::Tryas non-exiting. That means an outertry/finallywhose body ends in anif/elsewhose branches each end withtry ... finally: return/raisewill still drop the finally-entry NOP and diverge from CPython’s shape.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/compile.rs` around lines 574 - 607, The heuristic needs broadening: update statement_ends_with_scope_exit and preserves_finally_entry_nop so nested Try nodes are treated as scope-exiting when appropriate and the trailing anchor-preservation checks look through If/Try recursively. Specifically, enhance statement_ends_with_scope_exit (currently matching Return/Raise/If) to also recognize ast::Stmt::Try as exiting when its finalbody is non-empty or when it has handlers and its try body itself ends with a scope exit (reusing Self::statements_end_with_scope_exit). Then change preserves_finally_entry_nop to not only match a literal trailing Stmt::Try but to call statement_ends_with_scope_exit on the last statement (or otherwise inspect the last stmt via the same recursive logic) so an outer try/finally whose final visible control flow is from nested try/finally-return/raise inside an if/else keeps the finally-entry NOP; use the existing helpers (statement_ends_with_scope_exit, statements_end_with_scope_exit) and symbols (preserves_finally_entry_nop, statement_ends_with_scope_exit) to implement this.
🧹 Nitpick comments (1)
crates/codegen/src/compile.rs (1)
10181-10207: ⚡ Quick winDeduplicate the slice-bound folding logic.
These three
matchblocks only differ by the input expression. Pulling the fold-or-Nonelogic into one local helper will keep this path from drifting again.♻️ Suggested refactor
fn try_fold_constant_slice( &mut self, lower: Option<&ast::Expr>, upper: Option<&ast::Expr>, step: Option<&ast::Expr>, ) -> CompileResult<Option<ConstantData>> { - let start = match lower { - Some(expr) => { - let Some(constant) = self.try_fold_constant_expr(expr)? else { - return Ok(None); - }; - constant - } - None => ConstantData::None, - }; - let stop = match upper { - Some(expr) => { - let Some(constant) = self.try_fold_constant_expr(expr)? else { - return Ok(None); - }; - constant - } - None => ConstantData::None, - }; - let step = match step { - Some(expr) => { - let Some(constant) = self.try_fold_constant_expr(expr)? else { - return Ok(None); - }; - constant - } - None => ConstantData::None, - }; + let mut fold_part = |expr: Option<&ast::Expr>| -> CompileResult<Option<ConstantData>> { + match expr { + Some(expr) => self.try_fold_constant_expr(expr), + None => Ok(Some(ConstantData::None)), + } + }; + + let Some(start) = fold_part(lower)? else { + return Ok(None); + }; + let Some(stop) = fold_part(upper)? else { + return Ok(None); + }; + let Some(step) = fold_part(step)? else { + return Ok(None); + }; Ok(Some(ConstantData::Slice { elements: Box::new([start, stop, step]), })) }As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/compile.rs` around lines 10181 - 10207, The three identical match blocks for start/stop/step duplicate the fold-or-None logic; extract a small local helper (e.g., a closure fold_bound = |opt_expr| -> Result<ConstantData, _> or similar) that takes an Option<Expr>, calls self.try_fold_constant_expr(opt_expr) when Some, returns the folded ConstantData or propagates the outer Ok(None) semantics when try_fold_constant_expr yields None, and returns ConstantData::None when input is None; then call this helper to set start, stop and step instead of repeating the match. Ensure the helper uses the same error propagation (the ? on try_fold_constant_expr) and preserves the current behavior for lower/upper/step and the created start/stop/step bindings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/codegen/src/compile.rs`:
- Around line 574-607: The heuristic needs broadening: update
statement_ends_with_scope_exit and preserves_finally_entry_nop so nested Try
nodes are treated as scope-exiting when appropriate and the trailing
anchor-preservation checks look through If/Try recursively. Specifically,
enhance statement_ends_with_scope_exit (currently matching Return/Raise/If) to
also recognize ast::Stmt::Try as exiting when its finalbody is non-empty or when
it has handlers and its try body itself ends with a scope exit (reusing
Self::statements_end_with_scope_exit). Then change preserves_finally_entry_nop
to not only match a literal trailing Stmt::Try but to call
statement_ends_with_scope_exit on the last statement (or otherwise inspect the
last stmt via the same recursive logic) so an outer try/finally whose final
visible control flow is from nested try/finally-return/raise inside an if/else
keeps the finally-entry NOP; use the existing helpers
(statement_ends_with_scope_exit, statements_end_with_scope_exit) and symbols
(preserves_finally_entry_nop, statement_ends_with_scope_exit) to implement this.
In `@crates/codegen/src/ir.rs`:
- Around line 2563-2568: The fallthrough_predecessors assignment currently
allows empty blocks to overwrite a non-empty predecessor (breaking cases like A
-> empty -> B); in the loop over self.blocks (using fallthrough_predecessors,
next_nonempty_block, and pred_idx) only set fallthrough_predecessors[next.idx()]
= Some(pred_idx) when the current block is non-empty (i.e., skip assigning for
empty blocks) so that the last non-empty fallthrough predecessor is recorded and
doesn't get clobbered.
- Around line 4086-4088: The code currently skips blocks when
first_unprotected_suffix(block) returns Some(start), but this is the exact case
that should trigger deoptimization; instead of continuing, use the returned
start index to handle the same-block protected→unprotected store tail (i.e.,
call the deopt/emission logic with same_block_tail_start or the start value from
first_unprotected_suffix). Update the branch that checks same_block_tail_start
to invoke the same deopt path used for cross-block tails (reusing the deopt
emission code path or helper function) so same-block unprotected suffixes are
processed rather than ignored.
---
Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 10181-10207: The three identical match blocks for start/stop/step
duplicate the fold-or-None logic; extract a small local helper (e.g., a closure
fold_bound = |opt_expr| -> Result<ConstantData, _> or similar) that takes an
Option<Expr>, calls self.try_fold_constant_expr(opt_expr) when Some, returns the
folded ConstantData or propagates the outer Ok(None) semantics when
try_fold_constant_expr yields None, and returns ConstantData::None when input is
None; then call this helper to set start, stop and step instead of repeating the
match. Ensure the helper uses the same error propagation (the ? on
try_fold_constant_expr) and preserves the current behavior for lower/upper/step
and the created start/stop/step bindings.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 3a985768-6733-40e0-83a9-169f9b5a540b
📒 Files selected for processing (2)
crates/codegen/src/compile.rscrates/codegen/src/ir.rs
2c31fa9 to
c267535
Compare
Summary by CodeRabbit
Bug Fixes
Refactor
Tests