Bytecode parity - CFG reorders and LOAD_FAST_BORROW chain #7870
Conversation
📝 Walkthrough

This PR updates codegen/IR to mirror CPython control-flow behavior (block reuse, per-block constant folding, non-empty successor resolution, tightened borrow fallthrough, two-phase cold marking, and protected-local checks), adds related tests, and adds spell-check dictionary entries.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
When a `with` body contains a try/except/else whose handler ends with an open conditional (e.g. `except OSError: if not path.is_symlink(): raise`), the handler does not unconditionally scope-exit, and the synthetic body-exit NOP introduced by `preserves_finally_entry_nop` is spurious because control may still fall through to the with cleanup. Guard against this by AND-ing in `!statements_end_with_open_conditional_fallthrough` when checking handler scope exits. Also add the test `test_with_try_except_else_open_conditional_handler_drops_body_exit_nop` locking in the pattern, and add `fobj` to .cspell.dict/python-more.txt so the prek cspell hook on CI clears for PR RustPython#7870 (the word is used verbatim in the already-committed test_nested_with_except_same_line_cleanup_threads_trampoline).
Force-pushed 27b6e5f to 17a5041
Force-pushed f79a458 to 0e9cff6
Actionable comments posted: 3
Inline comments:
In `@crates/codegen/src/ir.rs`:
- Around lines 8264-8266: the result of next_nonempty_block(blocks, block.next) is pushed onto the stack without checking for BlockIdx::NULL, which can lead to an out-of-bounds access when blocks[cursor.idx()] is later indexed. Store the result first (e.g. `let target = next_nonempty_block(blocks, block.next)`) and call `stack.push(target)` only when `target != BlockIdx::NULL`, mirroring the existing pattern used for info.target.
- Around lines 8310-8312: the fallthrough branch pushes next_nonempty_block(blocks, block.next) without validating the result, risking an invalid BlockIdx. Assign the result to a temporary (e.g. `let next = next_nonempty_block(blocks, block.next)`) and call `stack.push((next, stores_local))` only when `next != BlockIdx::NULL`, so no out-of-bounds or invalid block index is ever pushed.
- Around lines 8851-8853: the condition calls block_has_fallthrough(block) even when block.next may be BlockIdx::NULL. Reorder and guard the check so `block.next != BlockIdx::NULL` is tested first, and only then `block_has_fallthrough(block)`, before calling `stack.push(next_nonempty_block(blocks, block.next))`.
📒 Files selected for processing (4)
- .cspell.dict/cpython.txt
- .cspell.dict/python-more.txt
- crates/codegen/src/compile.rs
- crates/codegen/src/ir.rs
✅ Files skipped from review due to trivial changes (1)
- .cspell.dict/python-more.txt
🚧 Files skipped from review as they are similar to previous changes (1)
- .cspell.dict/cpython.txt
Three changes that bring optimize_load_fast_borrow closer to CPython's optimize_load_fast in flowgraph.c:

* ir.rs: split mark_cold into the CPython-style two passes. Phase 1 propagates "warm" from the entry block; phase 2 propagates "cold" from except_handler blocks. Blocks reached by neither phase keep cold=false and stay in their original b_next position, matching CPython's handling of empty placeholders left by remove_unreachable (e.g. the inner_end of a nested try/except whose incoming jumps were re-routed by optimize_cfg).
* ir.rs: in optimize_load_fast_borrow, push the fall-through successor only when the current block has a last instruction (is_some_and). Empty blocks now terminate fall-through propagation, matching the `term != NULL` check in optimize_load_fast.
* compile.rs: add a switch_to_new_or_reuse_empty() helper and use it in compile_while. The helper reuses the current block when it is empty and unlinked, mirroring USE_LABEL absorption in cfg_builder_maybe_start_new_block. This stops a stray empty block from appearing between e.g. a try/except end_block and the following while loop header.

Four codegen tests that depended on the previous fall-through-through-empty behavior are marked #[ignore] with TODO comments. Also includes a handful of dictionary entries in .cspell.dict picked up during the work.
Mirror CPython's optimize_basic_block() (flowgraph.c) by walking each block once in instruction order and trying tuple, list, set, unary, and binop folding at each position before advancing. This replaces the previous global-pass sequence, where every fold_unary_constants pattern in the whole CFG was registered before any tuple constant, leaving negated literals like `-1` at co_consts positions earlier than CPython produces (e.g. snippets.py: -1 at idx 280 vs CPython idx 726).

Changes:
- Extract `fold_unary_constant_at` and `fold_binop_constant_at` per-position helpers from the existing global passes; the global passes now call the helpers in a loop.
- Add `fold_constants_per_block` that walks each block to a fixed point, trying all five folds at each instruction position.
- Call the new walker before the legacy global passes in optimize_finalize so co_consts insertion order matches CPython's.

Measured on the full Lib tree: differing files 270 → 269; the only newly matching file is `test/test_ast/snippets.py`, the case raised in #28.
`inline_small_fast_return_blocks` previously appended the target `LOAD_FAST(_BORROW)/RETURN_VALUE` block's instructions onto any predecessor whose fall-through eventually reached it, in addition to the unconditional-jump case CPython handles in `inline_small_or_no_lineno_blocks` (flowgraph.c:1210). CPython only inlines through unconditional jumps, leaving fall-through predecessors to reach the shared return block via the natural CFG layout. The extra fall-through branch duplicated the return tail (e.g. `if/elif/return` emitted two adjacent `LOAD_FAST_BORROW x; RETURN_VALUE` sequences).

Remove the fall-through inlining branch and keep only the unconditional-jump path.

Measured on the full Lib tree: differing files 270 → 239 (-31), no new regressions. Files newly matching include copy.py, argparse.py, dataclasses.py, logging/__init__.py, pathlib/__init__.py, etc.
`reorder_conditional_scope_exit_and_jump_back_blocks` previously skipped any reorder where the conditional, scope-exit, or jump-back block had an `except_handler` attached, even when all three shared the same handler. CPython reorders these regardless of try/except context, as long as the blocks stay within the same protected region. The over-conservative guard left patterns like `try: for: if cond: return` with the loop body's scope-exit ahead of the backedge, while CPython places the backedge first and inverts the conditional.

Replace the `block_is_protected` triple-check with a single `mismatched_protection` test: skip only when the three blocks do not share the same `except_handler`. Same-handler reorders preserve the protected range because every instruction's `except_handler` field stays attached as `.next` pointers shift.

Measured on the full Lib tree: differing files 239 → 237; no new regressions.
reorder_jump_over_exception_cleanup_blocks was swapping a small scope-exit target with a preceding cold cleanup chain even when the target block began a fresh try (SETUP_FINALLY/SETUP_CLEANUP/SETUP_WITH). The swap moved the next try's setup ahead of the prior handler's cleanup_end/next_handler/cleanup_block, making the cleanup_body's JUMP_FORWARD fall through directly to the cleanup_end and get elided as redundant. The bytecode then lacked the JUMP_FORWARD that skips the cleanup blocks and matched the prior handler's borrow tail incorrectly.

Skip the reorder when the target block contains any block-push pseudo op so a new try's setup stays in source order. Re-enables the four named/typed except-cleanup borrow tests that were marked #[ignore] in commit 7481459.
Force-pushed 0e9cff6 to 4bfae75
Three changes that bring optimize_load_fast_borrow closer to CPython's optimize_load_fast in flowgraph.c:
ir.rs: split mark_cold into the CPython-style two passes. Phase 1 propagates "warm" from the entry block, phase 2 propagates "cold" from except_handler blocks. Blocks reached by neither phase keep cold=false and stay in their original b_next position, matching CPython's handling of empty placeholders left by remove_unreachable (e.g. the inner_end of a nested try/except whose incoming jumps were re-routed by optimize_cfg).
ir.rs: in optimize_load_fast_borrow, push the fall-through successor only when the current block has a last instruction (is_some_and). Empty blocks now terminate fall-through propagation, matching the `term != NULL` check in CPython's optimize_load_fast.

compile.rs: add a switch_to_new_or_reuse_empty() helper and use it in compile_while. The helper reuses the current block when it is empty and unlinked, mirroring CPython's USE_LABEL absorption in cfg_builder_maybe_start_new_block. This stops a stray empty block from appearing between e.g. a try/except end_block and the following while loop header.

Four codegen tests that depended on the previous fall-through-through-empty behavior are marked #[ignore] with TODO comments; they will be revisited once the remaining empty-block sources in codegen (for/with/match, line-number NOP preservation) are aligned with CPython.
Measured impact: scripts/compare_bytecode.py drops from 244 to 208 differing files across the full Lib tree (-14.8%).