Skip to content

Bytecode parity - boolop, comprehension, CFG passes#7631

Merged
youknowone merged 3 commits intoRustPython:mainfrom
youknowone:bytecode-parity-int
Apr 20, 2026
Merged

Bytecode parity - boolop, comprehension, CFG passes#7631
youknowone merged 3 commits intoRustPython:mainfrom
youknowone:bytecode-parity-int

Conversation

@youknowone
Copy link
Copy Markdown
Member

@youknowone youknowone commented Apr 19, 2026

Summary by CodeRabbit

  • Improvements

    • Refined comprehension inlining and symbol handling to better match CPython semantics.
    • Improved detection and tracking of hidden/local variables; frame local snapshots and exposed frame.f_locals now reflect hidden-local semantics more accurately.
    • Slightly more precise source-location generation for compiled code.
  • Development Tools

    • More robust bytecode dump/comparison workflow with retries, timeouts, and a new option to write dump JSON to a file.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 19, 2026

📝 Walkthrough

Walkthrough

Refactors comprehension inlining and symbol merging, adds a hidden-locals overlay and overlay-aware locals APIs to frames, and rewrites bytecode dump tooling to write JSON to files with timeout-aware subprocess retry and per-target reruns.

Changes

Cohort / File(s) Summary
Symbol table / comprehension inlining
crates/codegen/src/symboltable.rs
Removed await-based gating for inlining; inline_comprehension() returns an IndexSet<String> of removed class-implicit names, treats __class__, __classdict__, __conditional_annotations__ specially, merges symbols into parent with revised scope assignment, marks inlined cells with COMP_CELL, and restricts LOCAL→CELL promotion to function-like scopes.
Frame locals & hidden-overlay
crates/vm/src/frame.rs, crates/vm/src/builtins/frame.rs
Added f_locals_hidden_overlay: PyMutex<Option<PyDictRef>> with GC/clear handling; added has_active_hidden_locals(); split locals sync into sync_visible_locals_to_mapping and f_locals_mapping; locals() now returns overlay-aware snapshots when hidden locals active; frame.f_locals accessor switched to use f_locals_mapping.
Bytecode dump & compare tooling
scripts/compare_bytecode.py, scripts/dis_dump.py
dis_dump.py gains --output to target a file. compare_bytecode.py now creates job dicts with separate temp output files, threads a DUMP_TIMEOUT, reads JSON from output files via _load_dump_output, adds _run_sync_dump and _rerun_missing_targets to serially rerun missing/failed targets, and merges recovered results.
Location conversion refactor
crates/compiler-core/src/marshal.rs
Moved creation of mk closure out of branch and explicitly constructs SourceLocation with mk(line) and zero character offset for None-kind entries instead of default_loc().

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant Frame
  participant VM
  participant Overlay
  Caller->>Frame: write_local(name, value)
  Frame->>Frame: has_active_hidden_locals()?
  alt hidden locals active
    Frame->>Overlay: ensure/init overlay mapping (f_locals_hidden_overlay)
    Frame->>Overlay: write name→value
    Frame->>VM: mark locals_dirty = true
  else no hidden locals
    Frame->>VM: sync_visible_locals_to_mapping(name, value)
    VM->>VM: update backing mapping
  end
  Caller->>Frame: locals() request
  Frame->>Frame: has_active_hidden_locals()?
  alt hidden locals active
    Frame->>Overlay: return fresh snapshot from overlay (merged)
  else
    Frame->>VM: return backing mapping snapshot
  end
Loading
sequenceDiagram
  participant Controller
  participant Worker as dis_dump subprocess
  participant FS as OutputFile
  Controller->>Worker: start dump job (--output OutputFile)
  Worker->>FS: write JSON results (or empty/partial)
  Controller->>Worker: wait for completion (with DUMP_TIMEOUT)
  alt timeout or unreadable JSON
    Controller->>Controller: _run_sync_dump / _rerun_missing_targets (serial)
    Controller->>Worker: start per-target subprocess invocations (serial)
    Worker->>FS: write per-target JSON (or error)
  end
  Controller->>FS: _load_dump_output(output_file) -> parse / merge results
  Controller->>Controller: merge recovered results into final report
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰
I hopped through scopes with nimble paws and cheer,
hid locals in a nook so callers need not fear.
Dumps now write to files and try again when sad,
class names neatly spliced where once they made me mad—
a rabbit hums, then bounds away in gear.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 58.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The PR title 'Bytecode parity - boolop, comprehension, CFG passes' is vague and overly generic. While it mentions some components, it doesn't clearly convey the primary change—improving bytecode parity through symbol table and frame handling changes. Provide a more specific title that highlights the main change, such as 'Fix comprehension inlining and hidden locals handling for bytecode parity' or similar to better reflect the substantial refactoring work.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the bytecode-parity-int branch 2 times, most recently from 2e4339e to d8ac86b Compare April 19, 2026 13:57
- Flatten nested same-op BoolOp and add IfExpr to jump_if
- Simplify is_name_imported to module-level only
- Enable inlined comprehensions in module/class scope
- Add TweakInlinedComprehensionScopes with fast_hidden tracking
- Track fb_range in FBlockInfo for with-statement line info
- Add constant subscript folding and unary Not folding
- Add emit_return_const_no_location for implicit returns
- Use JumpNoInterrupt for ternary/except jumps
- Add STACK_USE_GUIDELINE threshold for collection building
- Reorder CFG pipeline: inline small blocks earlier, resolve
  line numbers before cold block extraction
- Add redirect_empty_unconditional_jump_targets,
  reorder_conditional_chain_and_jump_back_blocks,
  materialize_empty_conditional_exit_targets,
  duplicate_shared_jump_back_targets passes
- Add borrow deoptimization for multi-handler, named-except,
  protected conditional tail, and protected import joins
- Run folding/optimization passes twice around peephole
@youknowone youknowone force-pushed the bytecode-parity-int branch from 8debf0b to 7367e26 Compare April 20, 2026 11:09
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 20, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

(module 'code test_compile test_inspect test_json test_peepholer' not found)

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

- Augmented assignment with two-part slices uses BINARY_SLICE/STORE_SLICE
- Inline async comprehensions (remove await/async guards)
- Inlined comprehension cleanup jump uses JumpNoInterrupt
- Class firstlineno uses first decorator line when decorated
- Recurse into nested functions for static attribute collection
- Fold unary positive complex constants (+0.0j)
- Add deoptimize_borrow_for_folded_nonliteral_exprs pass
- Add inline_single_predecessor_artificial_expr_exit_blocks pass
- Skip shared artificial expr-stmt exit blocks in small-block inlining
- Mark folded boolop tail as folded_from_nonliteral_expr
@youknowone youknowone force-pushed the bytecode-parity-int branch from 7367e26 to f4674b6 Compare April 20, 2026 12:02
@youknowone youknowone marked this pull request as ready for review April 20, 2026 12:50
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/vm/src/frame.rs (1)

918-1022: ⚠️ Potential issue | 🟠 Major

Preserve f_locals write-back semantics when returning an overlay.

Line 923 creates a detached dict and Lines 1018-1020 return it without storing it on the frame. frame.f_locals callers can mutate that dict, but locals_to_fast() later reads from self.locals.mapping(vm), so those mutations are silently lost while hidden locals are active. Consider either routing only locals() snapshot behavior through this overlay path, or storing/merging the overlay so f_locals changes can be synced back before resume.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/frame.rs` around lines 918 - 1022, The overlay dict created in
overlay_locals is returned as ArgMapping::from_dict_exact without being stored
on the frame, so mutations to frame.f_locals are lost when locals_to_fast()
later reads self.locals.mapping(vm); update the code in the locals-to-ArgMapping
path (the block using overlay_locals and returning ArgMapping::from_dict_exact)
to either persist the overlay on the frame (e.g., assign it to a frame field
like self.overlay_locals or self.f_locals_overlay) so future locals_to_fast()
can sync it back, or merge any mutations from the overlay into the frame’s
underlying locals mapping before returning (so locals_to_fast() will see them);
ensure related symbols touched are overlay_locals, ArgMapping::from_dict_exact,
frame.f_locals (or self.locals), and locals_to_fast().
scripts/compare_bytecode.py (1)

75-113: ⚠️ Potential issue | 🟡 Minor

Close temp files in exception handler before unlinking to avoid Windows PermissionError.

If an exception occurs between the NamedTemporaryFile creations and their close() calls (e.g., during files_file.write() on line 21), both files remain open. The exception handler then calls os.unlink() on the open files, which raises PermissionError on Windows and masks the original exception. Close both handles in the except block before unlinking, or use context managers to ensure files are closed.

While the script is currently Linux-only in CI workflows, this is worth fixing for general robustness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` around lines 75 - 113, The exception handler
currently unlinks files_file and output_file while they may still be open, which
causes PermissionError on Windows; update the except block in the function that
creates output_file/files_file so it first checks for and closes both handles
(files_file.close() and output_file.close()) if they exist/open (guard via
locals() or try/except), then call os.unlink() for each path, and finally
re-raise the original exception; alternatively wrap the temp file logic in
context managers but ensure the same variables (files_file, output_file) are
closed before unlinking to avoid masking the original error.
🧹 Nitpick comments (4)
scripts/dis_dump.py (1)

395-400: Minor: consider contextlib.ExitStack or a small helper for cleaner stdout/file fallback.

The current try/finally is correct, but the two-branch if args.output on lines 395 and 399 can be consolidated using contextlib.nullcontext(sys.stdout) so that the with semantics handle closing uniformly.

♻️ Proposed refactor
-    output = open(args.output, "w", encoding="utf-8") if args.output else sys.stdout
-    try:
-        json.dump(results, output, ensure_ascii=False, separators=(",", ":"))
-    finally:
-        if args.output:
-            output.close()
+    import contextlib
+    output_cm = (
+        open(args.output, "w", encoding="utf-8")
+        if args.output
+        else contextlib.nullcontext(sys.stdout)
+    )
+    with output_cm as output:
+        json.dump(results, output, ensure_ascii=False, separators=(",", ":"))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/dis_dump.py` around lines 395 - 400, Replace the try/finally
file-handle pattern around json.dump with a context manager that unifies stdout
and file handling: use contextlib.nullcontext(sys.stdout) (or
contextlib.ExitStack) to open args.output when present and yield sys.stdout
otherwise, then perform json.dump(results, output, ...) inside a single with
block; reference the variables/functions output, args.output, and json.dump to
locate and modify the existing block.
scripts/compare_bytecode.py (1)

194-199: stdout referenced via locals() after a timeout path that never sets it.

On the timeout branch (lines 179–186) data = None is assigned but stdout is not, hence the "stdout" in locals() guard on line 197. This works, but the flow is fragile — any refactor that adds another exception path will need to remember this invariant. A plain stdout = b"" at the top of the function (mirroring the pattern in _run_sync_dump) would eliminate the locals() check and make the intent explicit.

♻️ Suggested tidy
 def _finish_one(job):
     """Wait for a single dis_dump.py process and return parsed JSON."""
     proc = job["proc"]
     expected = {relpath for relpath, _ in job["targets"]}
+    stdout = b""
     try:
         stdout = proc.communicate(timeout=600)[0]
     except subprocess.TimeoutExpired:
         ...
         data = None
     else:
         data = _load_dump_output(job["output_file"])
     finally:
         ...
 
-    stray = stdout.decode(errors="replace").strip() if "stdout" in locals() else ""
+    stray = stdout.decode(errors="replace").strip()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` around lines 194 - 199, The function currently
uses "stdout" only in some branches and guards it with '"stdout" in locals()'
which is fragile; initialize stdout = b"" at the top of the function (same
pattern as _run_sync_dump) so stdout is always defined, remove the locals()
check and simply decode stdout (e.g. stray =
stdout.decode(errors="replace").strip()) and check if stray to emit the warning;
update references in the timeout/error handling in this function accordingly
(keep the proc.returncode check and stray warning logic) so no code paths leave
stdout undefined.
crates/codegen/src/symboltable.rs (2)

2227-2237: Minor: comment narrower than the actual gate.

The comment states the exclusion is for "annotation scopes nested in classes that can see class scope", but the check is parent_can_see_class_scope, which is also set on TypeParams scopes inside a class (see enter_type_param_block, line 1079) and on scan_type_param_bound_or_default scopes. So generic-class type-parameter scopes and TypeVar bound/default scopes nested in a class will also skip inlining. If that is intentional (it matches CPython), consider broadening the comment to avoid future confusion.

📝 Suggested wording
         // PEP 709: Mark non-generator comprehensions for inlining.
-        // CPython's symtable marks all non-generator comprehensions for
-        // inlining, except annotation scopes nested in classes that can see
-        // class scope.
+        // CPython's symtable marks all non-generator comprehensions for
+        // inlining, except when the enclosing scope can see class scope
+        // (annotation, type-parameter, and TypeVar bound/default scopes
+        // nested in a class).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/symboltable.rs` around lines 2227 - 2237, The comment is
too narrow: the gate checking parent_can_see_class_scope (used with
is_generator) also applies to TypeParams and type-var bound/default scopes (see
enter_type_param_block and scan_type_param_bound_or_default), so update the
comment above the inlining logic to state that any nested annotation-like scopes
that can see class scope (including generic-class type-parameter scopes and
TypeVar bound/default scopes) are excluded from comp_inlined rather than only
“annotation scopes nested in classes that can see class scope”; leave the logic
using parent_can_see_class_scope, is_generator, and setting comp_inlined
unchanged.

570-607: Optional: fuse the two post-analysis passes over symbol_table.symbols.

The pass at 572-576 must run before analyze_symbol (line 579) so that the Unknown + COMP_CELL branch at 689-701 sees the flag. That's fine. However, the third pass at 600-607 (LOCAL→CELL promotion) can be folded into the analyze_symbol loop at 579-586 since its decision depends only on the just-resolved symbol.scope, inlined_cells, and promote_inlined_cells_to_cell (which is constant per call). This also lets the free-var collection at 583-585 see the promoted Cell scope in one pass, which may matter if a future change relies on Cell symbols appearing in newfree consistently.

♻️ Sketch
-        for symbol in symbol_table.symbols.values_mut() {
-            if inlined_cells.contains(&symbol.name) {
-                symbol.flags.insert(SymbolFlags::COMP_CELL);
-            }
-        }
-
-        // Analyze symbols in current scope
-        for symbol in symbol_table.symbols.values_mut() {
-            self.analyze_symbol(symbol, symbol_table.typ, sub_tables, class_entry)?;
-
-            // Collect free variables from this scope
-            if symbol.scope == SymbolScope::Free || symbol.flags.contains(SymbolFlags::FREE_CLASS) {
-                newfree.insert(symbol.name.clone());
-            }
-        }
-
-        // PEP 709 / CPython symtable.c:
-        // - only promote LOCAL -> CELL in function-like scopes, where
-        //   analyze_cells() runs. Module and class scopes keep their normal
-        //   scope and rely on DEF_COMP_CELL for comprehension-only cells.
-        let promote_inlined_cells_to_cell = matches!(
+        for symbol in symbol_table.symbols.values_mut() {
+            if inlined_cells.contains(&symbol.name) {
+                symbol.flags.insert(SymbolFlags::COMP_CELL);
+            }
+        }
+
+        // PEP 709 / CPython symtable.c: only promote LOCAL -> CELL in
+        // function-like scopes where analyze_cells() runs.
+        let promote_inlined_cells_to_cell = matches!(
             symbol_table.typ,
             CompilerScope::Function
                 | CompilerScope::AsyncFunction
                 | CompilerScope::Lambda
                 | CompilerScope::Comprehension
                 | CompilerScope::Annotation
         );
-        for symbol in symbol_table.symbols.values_mut() {
-            if inlined_cells.contains(&symbol.name)
-                && promote_inlined_cells_to_cell
-                && symbol.scope == SymbolScope::Local
-            {
-                symbol.scope = SymbolScope::Cell;
-            }
-        }
+        for symbol in symbol_table.symbols.values_mut() {
+            self.analyze_symbol(symbol, symbol_table.typ, sub_tables, class_entry)?;
+
+            if promote_inlined_cells_to_cell
+                && symbol.scope == SymbolScope::Local
+                && inlined_cells.contains(&symbol.name)
+            {
+                symbol.scope = SymbolScope::Cell;
+            }
+
+            if symbol.scope == SymbolScope::Free || symbol.flags.contains(SymbolFlags::FREE_CLASS) {
+                newfree.insert(symbol.name.clone());
+            }
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/symboltable.rs` around lines 570 - 607, The third pass
that promotes LOCAL→CELL based on inlined_cells and
promote_inlined_cells_to_cell should be folded into the loop that calls
self.analyze_symbol so you only iterate symbol_table.symbols once; after calling
self.analyze_symbol(symbol, ...), compute promote_inlined_cells_to_cell (as
already defined) and if inlined_cells.contains(&symbol.name) &&
promote_inlined_cells_to_cell && symbol.scope == SymbolScope::Local then set
symbol.scope = SymbolScope::Cell before you collect free vars into newfree,
ensuring the COMP_CELL bit-setting pass (which must run first) remains unchanged
and newfree sees the promoted scope in the same pass.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/compare_bytecode.py`:
- Around line 164-170: _rerun_missing_targets currently treats a falsy return
from _run_sync_dump the same (dropping both {} and None), losing persistent
read/parse failures from _load_dump_output; change _rerun_missing_targets to
explicitly check for None vs empty dict after calling _run_sync_dump (use
identity or explicit is None check), and when data is None insert a synthetic
error entry for that target (e.g. a dict value indicating a parse/read failure)
into the recovered mapping so downstream accounting in main() (cp_err/rp_err and
cp_data/rp_data) will count/report the error; keep successful {} and non-empty
dict returns unchanged.
- Around line 133-161: The _run_sync_dump function currently ignores its timeout
parameter and uses a fragile locals() trick for timed_out; update
proc.communicate(...) to use the function parameter
(proc.communicate(timeout=timeout)) and initialize timed_out = False before the
try/except so you can set timed_out = True in the except handler and drop the
locals() lookup in the finally. Also mirror this change in the similar helper
_finish_one so the timeout value is respected consistently across both helpers.

---

Outside diff comments:
In `@crates/vm/src/frame.rs`:
- Around line 918-1022: The overlay dict created in overlay_locals is returned
as ArgMapping::from_dict_exact without being stored on the frame, so mutations
to frame.f_locals are lost when locals_to_fast() later reads
self.locals.mapping(vm); update the code in the locals-to-ArgMapping path (the
block using overlay_locals and returning ArgMapping::from_dict_exact) to either
persist the overlay on the frame (e.g., assign it to a frame field like
self.overlay_locals or self.f_locals_overlay) so future locals_to_fast() can
sync it back, or merge any mutations from the overlay into the frame’s
underlying locals mapping before returning (so locals_to_fast() will see them);
ensure related symbols touched are overlay_locals, ArgMapping::from_dict_exact,
frame.f_locals (or self.locals), and locals_to_fast().

In `@scripts/compare_bytecode.py`:
- Around line 75-113: The exception handler currently unlinks files_file and
output_file while they may still be open, which causes PermissionError on
Windows; update the except block in the function that creates
output_file/files_file so it first checks for and closes both handles
(files_file.close() and output_file.close()) if they exist/open (guard via
locals() or try/except), then call os.unlink() for each path, and finally
re-raise the original exception; alternatively wrap the temp file logic in
context managers but ensure the same variables (files_file, output_file) are
closed before unlinking to avoid masking the original error.

---

Nitpick comments:
In `@crates/codegen/src/symboltable.rs`:
- Around line 2227-2237: The comment is too narrow: the gate checking
parent_can_see_class_scope (used with is_generator) also applies to TypeParams
and type-var bound/default scopes (see enter_type_param_block and
scan_type_param_bound_or_default), so update the comment above the inlining
logic to state that any nested annotation-like scopes that can see class scope
(including generic-class type-parameter scopes and TypeVar bound/default scopes)
are excluded from comp_inlined rather than only “annotation scopes nested in
classes that can see class scope”; leave the logic using
parent_can_see_class_scope, is_generator, and setting comp_inlined unchanged.
- Around line 570-607: The third pass that promotes LOCAL→CELL based on
inlined_cells and promote_inlined_cells_to_cell should be folded into the loop
that calls self.analyze_symbol so you only iterate symbol_table.symbols once;
after calling self.analyze_symbol(symbol, ...), compute
promote_inlined_cells_to_cell (as already defined) and if
inlined_cells.contains(&symbol.name) && promote_inlined_cells_to_cell &&
symbol.scope == SymbolScope::Local then set symbol.scope = SymbolScope::Cell
before you collect free vars into newfree, ensuring the COMP_CELL bit-setting
pass (which must run first) remains unchanged and newfree sees the promoted
scope in the same pass.

In `@scripts/compare_bytecode.py`:
- Around line 194-199: The function currently uses "stdout" only in some
branches and guards it with '"stdout" in locals()' which is fragile; initialize
stdout = b"" at the top of the function (same pattern as _run_sync_dump) so
stdout is always defined, remove the locals() check and simply decode stdout
(e.g. stray = stdout.decode(errors="replace").strip()) and check if stray to
emit the warning; update references in the timeout/error handling in this
function accordingly (keep the proc.returncode check and stray warning logic) so
no code paths leave stdout undefined.

In `@scripts/dis_dump.py`:
- Around line 395-400: Replace the try/finally file-handle pattern around
json.dump with a context manager that unifies stdout and file handling: use
contextlib.nullcontext(sys.stdout) (or contextlib.ExitStack) to open args.output
when present and yield sys.stdout otherwise, then perform json.dump(results,
output, ...) inside a single with block; reference the variables/functions
output, args.output, and json.dump to locate and modify the existing block.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 5b61ad95-98d4-40ae-b77d-a2a1ae6a06ce

📥 Commits

Reviewing files that changed from the base of the PR and between b18b71b and f4674b6.

⛔ Files ignored due to path filters (6)
  • Lib/test/test_code.py is excluded by !Lib/**
  • Lib/test/test_compile.py is excluded by !Lib/**
  • Lib/test/test_inspect/test_inspect.py is excluded by !Lib/**
  • Lib/test/test_peepholer.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_bool_op.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (7)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/codegen/src/symboltable.rs
  • crates/compiler-core/src/marshal.rs
  • crates/vm/src/frame.rs
  • scripts/compare_bytecode.py
  • scripts/dis_dump.py

Comment thread scripts/compare_bytecode.py Outdated
Comment thread scripts/compare_bytecode.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
scripts/compare_bytecode.py (1)

172-196: ⚠️ Potential issue | 🟡 Minor

Preserve failed reruns in the result map.

failed/empty targets are only logged, then omitted from recovered. If both interpreters lose the same target, the final report can under-count it instead of reporting an error.

Suggested fix
 def _rerun_missing_targets(interpreter, targets, base_dir):
     recovered = {}
     failed = []
     empty = []
     for target in targets:
+        relpath = target[0]
         data = _run_sync_dump(interpreter, [target], base_dir)
         if data is None:
-            failed.append(target[0])
+            failed.append(relpath)
+            recovered[relpath] = {
+                "status": "error",
+                "error": "dump helper failed while rerunning target",
+            }
         elif data:
             recovered.update(data)
         else:
-            empty.append(target[0])
+            empty.append(relpath)
+            recovered[relpath] = {
+                "status": "error",
+                "error": "dump helper produced no data while rerunning target",
+            }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` around lines 172 - 196, The
_rerun_missing_targets function currently logs failed/empty targets but omits
them from the returned recovered map, causing under-reporting; after computing
failed and empty (use the same keying as recovered, i.e. target[0]), add entries
for each failed and empty target into recovered (e.g., recovered[failed_name] =
None) before returning so the caller sees these missing items instead of
silently dropping them; update _rerun_missing_targets to insert those sentinel
entries after the warning prints and then return recovered.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/builtins/frame.rs`:
- Around line 704-706: The call to self.f_locals_hidden_overlay.lock().take()
returns an Option<PyDictRef> annotated #[must_use], so explicitly discard it to
avoid warnings; update the code to bind the result (e.g., let _ =
self.f_locals_hidden_overlay.lock().take()) just like the pattern used for
self.temporary_refs.lock().clear(), referencing the f_locals_hidden_overlay
field and its take() call to make the intent clear and silence the must_use
lint.

In `@crates/vm/src/frame.rs`:
- Around line 1027-1063: The snapshot dicts for hidden locals are being created
empty and only fastlocals are synced, which drops visible names from the backing
mapping; update f_locals_mapping and locals so when you create a new overlay
dict you first seed it with the backing locals mapping (self.locals) before
overlaying fastlocals. Concretely: in f_locals_mapping when overlay is None,
construct the new dict, copy entries from self.locals.mapping(vm) (or clone its
dict) into overlay_dict before storing it in f_locals_hidden_overlay, and in
locals() when building the fresh snapshot use vm.ctx.new_dict() then copy
self.locals.mapping(vm) entries into that dict before calling
sync_visible_locals_to_mapping; keep using
ArgMapping::from_dict_exact(overlay_dict) and then overlay the fastlocals as you
do now.
- Around line 874-883: The code takes a mutable slice via
(*self.iframe.get()).localsplus.fastlocals_mut() and then calls
has_active_hidden_locals() which accesses the same frame through UnsafeCell,
creating overlapping &mut/& accesses; move the computation of overlay_locals
(the has_active_hidden_locals() check, f_locals_hidden_overlay.lock().clone(),
ArgMapping::from_dict_exact, and locals_map selection via ArgMapping::mapping or
self.locals.mapping(vm)) to occur before taking fastlocals_mut() so you only
borrow immutably/inspect the frame first and then obtain fastlocals_mut()
afterwards, ensuring no simultaneous mutable and shared access to
iframe/localsplus.

---

Duplicate comments:
In `@scripts/compare_bytecode.py`:
- Around line 172-196: The _rerun_missing_targets function currently logs
failed/empty targets but omits them from the returned recovered map, causing
under-reporting; after computing failed and empty (use the same keying as
recovered, i.e. target[0]), add entries for each failed and empty target into
recovered (e.g., recovered[failed_name] = None) before returning so the caller
sees these missing items instead of silently dropping them; update
_rerun_missing_targets to insert those sentinel entries after the warning prints
and then return recovered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: c68aac22-f06b-45a0-81f4-59003ee1c1fe

📥 Commits

Reviewing files that changed from the base of the PR and between f4674b6 and aac855f.

📒 Files selected for processing (4)
  • crates/codegen/src/symboltable.rs
  • crates/vm/src/builtins/frame.rs
  • crates/vm/src/frame.rs
  • scripts/compare_bytecode.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/codegen/src/symboltable.rs

Comment on lines 704 to +706
// Clear temporary refs
self.temporary_refs.lock().clear();
self.f_locals_hidden_overlay.lock().take();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify the ignored Option::take return does not introduce Rust/clippy warnings.
# Expected: no unused_must_use/clippy warnings from crates/vm/src/builtins/frame.rs.

cargo clippy --package rustpython-vm -- -D warnings

Repository: RustPython/RustPython

Length of output: 151


🏁 Script executed:

# First, find the file and check its size
wc -l crates/vm/src/builtins/frame.rs

Repository: RustPython/RustPython

Length of output: 100


🏁 Script executed:

# Read the code around lines 704-706
sed -n '695,715p' crates/vm/src/builtins/frame.rs

Repository: RustPython/RustPython

Length of output: 586


🏁 Script executed:

# Search for any clippy allow attributes in the file or around the function
grep -n "allow\|suppress" crates/vm/src/builtins/frame.rs | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Try a more lightweight clippy check on just this file
cargo clippy --package rustpython-vm -- crates/vm/src/builtins/frame.rs 2>&1 | head -100

Repository: RustPython/RustPython

Length of output: 151


🏁 Script executed:

# Search for f_locals_hidden_overlay definition and usage
rg -n "f_locals_hidden_overlay" crates/vm/src/builtins/frame.rs -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 228


🏁 Script executed:

# Check the struct definition for this field
rg -n "struct.*Frame" crates/vm/src/builtins/frame.rs | head -5

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Search for the struct definition of Frame containing f_locals_hidden_overlay
rg -n "f_locals_hidden_overlay\s*:" crates/vm/src/builtins/frame.rs -B 5 -A 1

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Also search for the full struct definition
rg -n "pub struct.*Frame" crates/vm/src/builtins/frame.rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Get the first 150 lines of the file to see struct/imports context
head -150 crates/vm/src/builtins/frame.rs

Repository: RustPython/RustPython

Length of output: 4550


🏁 Script executed:

# Search for the actual Frame struct definition - it might be in a different file
find crates -name "*.rs" -type f | xargs grep -l "f_locals_hidden_overlay" | head -5

Repository: RustPython/RustPython

Length of output: 119


🏁 Script executed:

# Also search for PyFrame or Frame struct that contains this field
rg -n "struct.*Frame\|f_locals_hidden_overlay" crates/vm/src/builtins/frame.rs | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there are other Rust files defining Frame with this field
fd -e rs crates/vm/src | xargs grep -l "f_locals_hidden_overlay"

Repository: RustPython/RustPython

Length of output: 424


🏁 Script executed:

# Search in the other frame.rs file for the struct definition
rg -n "struct.*Frame|f_locals_hidden_overlay" crates/vm/src/frame.rs -B 3 -A 3 | head -50

Repository: RustPython/RustPython

Length of output: 2131


🏁 Script executed:

# Get the struct definition from frame.rs
rg -n "pub struct Frame" crates/vm/src/frame.rs -A 30

Repository: RustPython/RustPython

Length of output: 2281


🏁 Script executed:

# Look at the context around line 704-706 more carefully - check what function this is in
sed -n '680,715p' crates/vm/src/builtins/frame.rs

Repository: RustPython/RustPython

Length of output: 1131


🏁 Script executed:

# Check for any #[allow] or clippy suppress attributes in that function
sed -n '680,715p' crates/vm/src/builtins/frame.rs | grep -i "allow\|suppress"

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Let's check if the current code produces warnings by running cargo check instead (faster than clippy)
cargo check --package rustpython-vm 2>&1 | grep -i "unused_must_use\|f_locals_hidden_overlay" | head -20

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Let's verify the Rust standard library behavior - Option::take is marked #[must_use]
# Check if there are any recent uses of Option::take() in the codebase to see patterns
rg "\.take\(\)" crates/vm/src/builtins/frame.rs -B 1 -A 1

Repository: RustPython/RustPython

Length of output: 161


🏁 Script executed:

# Check the Rust version and whether must_use is enforced in this project
cat Cargo.toml | head -30

Repository: RustPython/RustPython

Length of output: 1257


🏁 Script executed:

# Check the rustpython-vm Cargo.toml for any lints configuration
cat crates/vm/Cargo.toml | grep -A 20 "lints\|profile"

Repository: RustPython/RustPython

Length of output: 89


🏁 Script executed:

# Check if there are workspace-level lint settings
cat Cargo.toml | grep -A 20 "\[lints\]"

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look for any build.rs or other configuration that might set lint levels
ls -la crates/vm/ | grep -E "build\.rs|clippy"

Repository: RustPython/RustPython

Length of output: 123


🏁 Script executed:

# Find the workspace lints configuration
rg "\[lints\]" Cargo.toml -A 50

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Also check if there's any clippy configuration in rust-toolchain or .clippy.toml
cat rust-toolchain.toml 2>/dev/null || echo "No rust-toolchain.toml"

Repository: RustPython/RustPython

Length of output: 95


🏁 Script executed:

# Check for clippy.toml
cat .clippy.toml 2>/dev/null || cat clippy.toml 2>/dev/null || echo "No clippy config found"

Repository: RustPython/RustPython

Length of output: 87


🏁 Script executed:

# Verify that Option::take is indeed marked #[must_use] in Rust stdlib
# Let's also check if there are other .take() calls in the file to see patterns
rg "\.take()" crates/vm/src/builtins/frame.rs

Repository: RustPython/RustPython

Length of output: 116


🏁 Script executed:

# Check if there are any allow(unused_must_use) or similar attributes around line 706
sed -n '700,715p' crates/vm/src/builtins/frame.rs | cat -A

Repository: RustPython/RustPython

Length of output: 437


🏁 Script executed:

# Check what the return type of PyMutex::lock() is and verify the must_use attribute
rg "impl.*PyMutex|type Target" crates/vm/src -B 2 -A 5 | head -40

Repository: RustPython/RustPython

Length of output: 1871


🏁 Script executed:

# Check if there are other patterns in the codebase of handling must_use values
rg "let _ = .*\.take\(\)" crates/vm/src/ | head -10

Repository: RustPython/RustPython

Length of output: 120


🏁 Script executed:

# Also check for explicit drop patterns
rg "drop\(" crates/vm/src/builtins/frame.rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Verify the exact semantic - does the value need to be kept or is it safe to discard?
# Look at what the clear method is supposed to do
sed -n '670,710p' crates/vm/src/builtins/frame.rs

Repository: RustPython/RustPython

Length of output: 1500


🏁 Script executed:

# Find the workspace lints definition - it must be in a parent Cargo.toml or workspace
find . -maxdepth 1 -name "Cargo.toml" -exec grep -l "workspace" {} \;

Repository: RustPython/RustPython

Length of output: 77


🏁 Script executed:

# Check the root level workspace configuration
head -100 Cargo.toml

Repository: RustPython/RustPython

Length of output: 2906


🏁 Script executed:

# Get the full workspace definition including lints
tail -200 Cargo.toml | head -150

Repository: RustPython/RustPython

Length of output: 5299


🏁 Script executed:

# Or search specifically for the lints section in the root Cargo.toml
rg "^\[lints\]" Cargo.toml -A 50

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check if there's a workspace.lints section in the full Cargo.toml
grep -n "^\[workspace.lints\]" Cargo.toml

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Verify the Rust standard library behavior - Option::take is marked #[must_use]
# Let's check the rust docs or verify the behavior
python3 << 'EOF'
# The Rust standard library marks Option::take() with #[must_use]
# This means ignoring its return value will trigger a compiler warning
# when compiled with warnings-as-errors or in strict lint modes
print("Rust Option<T>::take() is marked #[must_use]")
print("Source: https://doc.rust-lang.org/std/option/enum.Option.html#method.take")
print()
print("Ignoring a #[must_use] return value will trigger:")
print("  - warning: unused return value of `Option::<T>::take`")
print("  - error when lints are enforced with -D warnings")
EOF

Repository: RustPython/RustPython

Length of output: 341


🏁 Script executed:

# Confirm the pattern used elsewhere in the codebase
grep -B 2 -A 2 "let _ = .*\.take()" crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 205


Use idiomatic pattern to explicitly discard the Option<PyDictRef> from take().

Line 706 ignores the return value of take(), which is marked #[must_use] and will trigger compiler warnings when lints are enforced. Bind the result explicitly using let _ = to make the intent clear, matching the pattern used elsewhere in the codebase (e.g., crates/vm/src/frame.rs).

🛠️ Proposed fix
         // Clear temporary refs
         self.temporary_refs.lock().clear();
-        self.f_locals_hidden_overlay.lock().take();
+        let _ = self.f_locals_hidden_overlay.lock().take();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Clear temporary refs
self.temporary_refs.lock().clear();
self.f_locals_hidden_overlay.lock().take();
// Clear temporary refs
self.temporary_refs.lock().clear();
let _ = self.f_locals_hidden_overlay.lock().take();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/builtins/frame.rs` around lines 704 - 706, The call to
self.f_locals_hidden_overlay.lock().take() returns an Option<PyDictRef>
annotated #[must_use], so explicitly discard it to avoid warnings; update the
code to bind the result (e.g., let _ =
self.f_locals_hidden_overlay.lock().take()) just like the pattern used for
self.temporary_refs.lock().clear(), referencing the f_locals_hidden_overlay
field and its take() call to make the intent clear and silence the must_use
lint.

Comment thread crates/vm/src/frame.rs Outdated
Comment thread crates/vm/src/frame.rs
@youknowone youknowone force-pushed the bytecode-parity-int branch from aac855f to a04edb1 Compare April 20, 2026 14:24
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/compare_bytecode.py`:
- Around line 172-192: In _rerun_missing_targets, don't treat any non-empty
returned mapping as success; after calling _run_sync_dump and getting data,
verify that data contains the requested relpath (e.g., if relpath in data or
data.get(relpath) is not None) before calling recovered.update(data); if the
relpath key is missing then treat it like an empty/unexpected result (append
relpath to empty and set recovered[relpath] to the error message about no data),
otherwise merge the returned mapping into recovered and consider it recovered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 37593bfe-688f-4991-9ed2-b6d49fd82976

📥 Commits

Reviewing files that changed from the base of the PR and between aac855f and a04edb1.

📒 Files selected for processing (4)
  • crates/codegen/src/symboltable.rs
  • crates/vm/src/builtins/frame.rs
  • crates/vm/src/frame.rs
  • scripts/compare_bytecode.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/vm/src/builtins/frame.rs
  • crates/vm/src/frame.rs

Comment on lines +172 to +192
def _rerun_missing_targets(interpreter, targets, base_dir):
recovered = {}
failed = []
empty = []
for target in targets:
relpath = target[0]
data = _run_sync_dump(interpreter, [target], base_dir)
if data is None:
failed.append(relpath)
recovered[relpath] = {
"status": "error",
"error": "dump helper failed while rerunning target",
}
elif data:
recovered.update(data)
else:
empty.append(relpath)
recovered[relpath] = {
"status": "error",
"error": "dump helper produced no data while rerunning target",
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Verify the rerun returned the requested target before marking it recovered.

Line 185 treats any non-empty rerun output as success. If the helper returns a mapping with only unexpected keys, the original relpath remains missing while the rerun is counted as recovered.

Proposed fix
-        elif data:
-            recovered.update(data)
+        elif relpath in data:
+            recovered[relpath] = data[relpath]
+        elif data:
+            failed.append(relpath)
+            recovered[relpath] = {
+                "status": "error",
+                "error": "dump helper did not return target while rerunning",
+            }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/compare_bytecode.py` around lines 172 - 192, In
_rerun_missing_targets, don't treat any non-empty returned mapping as success;
after calling _run_sync_dump and getting data, verify that data contains the
requested relpath (e.g., if relpath in data or data.get(relpath) is not None)
before calling recovered.update(data); if the relpath key is missing then treat
it like an empty/unexpected result (append relpath to empty and set
recovered[relpath] to the error message about no data), otherwise merge the
returned mapping into recovered and consider it recovered.

@youknowone youknowone merged commit 9140ef5 into RustPython:main Apr 20, 2026
21 checks passed
@youknowone youknowone deleted the bytecode-parity-int branch April 20, 2026 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant