Skip to content

Add c-api error handling#7784

Closed
bschoenmaeckers wants to merge 12 commits into
RustPython:mainfrom
bschoenmaeckers:capi-errors
Closed

Add c-api error handling#7784
bschoenmaeckers wants to merge 12 commits into
RustPython:mainfrom
bschoenmaeckers:capi-errors

Conversation

@bschoenmaeckers
Copy link
Copy Markdown
Contributor

@bschoenmaeckers bschoenmaeckers commented May 5, 2026

This is the next step in adding c-api support. It also contains utilities to handle errors for implementing other c-api functions.

Summary by CodeRabbit

  • New Features

    • Added optional C API support enabling RustPython to function as a Python interpreter through C FFI bindings, compatible with existing C extensions and libraries expecting standard Python C interfaces.
  • Refactor

    • Refactored thread-local VM tracking mechanism to use a stack-based approach.
    • Tightened conditional compilation requirements for fork-related functionality to require host environment support.
  • Chores

    • Updated spell-check dictionary entries.
    • Removed unused threading dependency.

@bschoenmaeckers bschoenmaeckers changed the title ADD c-api error handling Add c-api error handling May 5, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 5, 2026

📝 Walkthrough

Walkthrough

This PR introduces a Rust-to-C FFI binding layer (rustpython-capi crate) exposing core RustPython interpreter lifecycle, exception, object, and state management operations. It refactors VM thread management from scoped TLS to stack-based attachment, adds conditional fork-safety gates behind a host_env feature, and wires the C-API through Cargo features and build configuration.

Changes

C-API Layer Integration

Layer / File(s) Summary
Feature & Build Configuration
Cargo.toml, build.rs, .cspell.dict/*
New optional capi feature added to top-level crate with rustpython-capi workspace dependency; linker export-dynamic flags gated behind CARGO_FEATURE_CAPI on Linux/macOS; spell-check dictionaries updated with pystate and PYGILSTATE.
VM Thread Management Foundation
crates/vm/src/vm/thread.rs
Replaced scoped_thread_local! VM tracking with stack-based VM_STACK (RefCell) and added per-thread GILSTATE_VM storage; introduced CurrentVmAttachState enum and public attach_current_thread() / release_current_thread() APIs for external thread attachment; adjusted Unix stop-the-world transitions to use new stack mechanism via set_current_vm().
Exception API Visibility
crates/vm/src/vm/mod.rs
Exposed current_exception() and set_exception() from pub(crate) to pub; added take_raised_exception() to extract top exception with thread state updates.
C-API Crate Infrastructure
crates/capi/Cargo.toml, crates/capi/.cargo/config.toml, crates/capi/pyo3-rustpython.config, crates/capi/src/lib.rs
New rustpython-capi crate configured as both cdylib and rlib; pyo3 dev-dependency with abi3 feature; environment config for PYO3_CONFIG_FILE and PYO3_NO_PYTHON; module surface declares object, pyerrors, pylifecycle, pystate, refcount modules; added get_main_interpreter() and set_main_interpreter() for process-wide interpreter access.
FFI Result Abstraction
crates/capi/src/util.rs
Introduced FfiResult<Output> trait with error sentinel (ERR_VALUE) and into_output() conversion; implementations for unit, primitives, raw pointers, and PyResult<T> enable consistent C-ABI boundary error handling.
C-API Implementations
crates/capi/src/pylifecycle.rs, crates/capi/src/pystate.rs, crates/capi/src/object.rs, crates/capi/src/pyerrors.rs, crates/capi/src/refcount.rs
Implemented Python C-API entry points: Py_Initialize/Py_Finalize manage interpreter lifecycle via MAIN_INTERP global; PyGILState_Ensure/Release manage thread attachment; Py_TYPE, Py_IS_TYPE, PyType_GetFlags provide object/type introspection; exception zoo statics and functions (PyErr_Occurred, PyErr_SetRaisedException, PyErr_NewException, etc.) handle exception lifecycle; _Py_IncRef/_Py_DecRef manage reference counts.
Integration & Feature Gating
src/lib.rs
Conditional execution path under feature = "capi" initializes main interpreter via rustpython_capi, sets it before running, and finalizes after completion; non-capi builds use original interp.run() path.
Fork-Safety Conditioning
crates/vm/src/builtins/type.rs, crates/vm/src/codecs.rs, crates/vm/src/intern.rs, crates/vm/src/object/core.rs, crates/vm/src/stdlib/_imp.rs, crates/vm/src/stdlib/_io.rs, crates/vm/src/stdlib/_thread.rs
Fork-related reinitialization helpers now require feature = "host_env" in addition to feature = "threading"; affects reinit_after_fork methods, weakref lock reset, import lock management, and std-stream reinitialization to prevent compilation when C-API host-environment integration is not intended.
Dependency Cleanup
crates/vm/Cargo.toml
Removed unused scoped-tls dependency; thread tracking now uses stack-based thread_local instead.

Sequence Diagram

sequenceDiagram
    participant Client as External C Code
    participant API as RustPython C-API
    participant Interp as Interpreter<br/>(MAIN_INTERP)
    participant VM as VirtualMachine
    participant Exc as Exception Stack

    Client->>API: Py_Initialize()
    API->>Interp: create & store MAIN_INTERP
    Interp->>VM: Interpreter::with_init()
    Interp->>VM: ensure_thread_has_vm_attached()
    
    Client->>API: PyGILState_Ensure()
    API->>VM: attach_current_thread()
    VM-->>API: return attached state

    Client->>API: PyErr_SetString(PyExc_TypeError, msg)
    API->>Exc: set exception in VM
    Exc-->>API: exception stored

    Client->>API: PyErr_Occurred()
    Exc-->>API: return exception ptr

    Client->>API: PyGILState_Release(state)
    API->>VM: release_current_thread()
    VM-->>API: detached

    Client->>API: Py_FinalizeEx()
    API->>Interp: finalize main interpreter
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly Related PRs

  • RustPython/RustPython#7648: Both PRs add the same C-API infrastructure (crates/capi crate, pylifecycle/pystate/refcount modules, Cargo.toml wiring, build.rs linker flags) with identical file structure and symbols.
  • RustPython/RustPython#7166: Related at the VM/exception level — both PRs modify exception stack storage and thread-local exception state handling, with this PR exposing public exception APIs that depend on refactored exception management.

Suggested Reviewers

  • ShaharNaveh
  • youknowone

Poem

🐰 A C-bridge now spans the gulf so deep,
Where RustPython's VM treasures sleep,
Initialize, attach threads with glee,
Reference counts dance merrily,
Safe FFI boundaries firmly weep! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Add c-api error handling' partially relates to the changeset but is overly broad and vague. The PR introduces extensive C-API infrastructure including object types, lifecycle management, thread state, reference counting, and more—not just error handling. While error handling is present (pyerrors.rs), it represents only one aspect of a much larger C-API implementation effort. Consider a more specific title such as 'Implement Python C-API bindings and infrastructure' or 'Add comprehensive C-API layer with object, lifecycle, and thread management' to better reflect the full scope of changes.
✅ Passed checks (4 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/vm/src/object/core.rs (1)

477-503: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Gate narrowing on reset_weakref_locks_after_fork breaks its call site in posix.rs::py_os_after_fork_child

reset_weakref_locks_after_fork (line 500) is now compiled only under unix + threading + host_env, but its call site in crates/vm/src/stdlib/posix.rs::py_os_after_fork_child (line ~791) is gated only on #[cfg(feature = "threading")] (and posix.rs is Unix-only, making it effectively unix + threading). Builds with unix + threading but without host_env will fail to compile there.

The call in posix.rs:

#[cfg(feature = "threading")]
crate::object::reset_weakref_locks_after_fork();

needs to become:

#[cfg(all(feature = "threading", feature = "host_env"))]
crate::object::reset_weakref_locks_after_fork();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/object/core.rs` around lines 477 - 503, The call site gating
for reset_weakref_locks_after_fork in posix.rs::py_os_after_fork_child must
match the function's cfg; change the cfg on the call to require both feature =
"threading" and feature = "host_env" (i.e. use #[cfg(all(feature = "threading",
feature = "host_env"))]) so the call to
crate::object::reset_weakref_locks_after_fork() is only compiled when the
function is present.
crates/vm/src/vm/mod.rs (1)

2063-2098: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Back C-API exception functions with a separate error-indicator state, not the handled-exception stack.

Currently, PyErr_Occurred() and PyErr_GetRaisedException() read from the same stack that backs sys.exc_info() and generator gi_exc_state (the exc_info slot managed by PUSH_EXC_INFO/POP_EXCEPT). According to CPython semantics, these C-API functions should operate on the per-thread "error indicator" state—representing exceptions still propagating between C calls—which is distinct from and independent of the handled-exception state. Using the exc_info stack causes:

  1. C-API code can observe exceptions caught by Python try/except, which should be invisible at the C level.
  2. Calling PyErr_GetRaisedException() + PyErr_SetRaisedException() can corrupt the active exception state of the current frame or generator.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/vm/mod.rs` around lines 2063 - 2098, The C-API error-indicator
must be backed by a separate per-Vm state, not the exc_info stack used by
sys.exc_info()/PUSH_EXC_INFO; change the methods so they don't read/write
exceptions.stack: leave current_exception/set_exception (and their semantics for
the exc_info stack) as-is, and introduce/replace the C-API facing API with a
separate storage (e.g., add a new field like raised_exception:
RefCell<Option<PyBaseExceptionRef>>) and implement get_raised_exception /
set_raised_exception / take_raised_exception to operate on that field instead of
exceptions.stack; ensure thread::update_thread_exception continues to reflect
the thread-visible error indicator using the new getter (not topmost_exception),
and update any callers that relied on the old take_raised_exception to use the
new C-API methods so C-level errors do not corrupt the handled-exception stack.
🧹 Nitpick comments (5)
crates/capi/src/pystate.rs (1)

11-13: 💤 Low value

Caveat for future callers of with_vm.

with_current_vm holds a RefCell borrow on VM_STACK for the duration of the callback (per crates/vm/src/vm/thread.rs:88-99). If f ever transitively calls release_current_thread or anything that tries to mutate VM_STACK, it will hit a borrow_mut panic. None of today's call sites do that, but a brief safety comment on with_vm would prevent future footguns.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/capi/src/pystate.rs` around lines 11 - 13, with_vm currently calls
with_current_vm which holds a RefCell borrow on VM_STACK for the duration of the
callback; if the provided closure f (or anything it transitively calls) tries to
mutate VM_STACK (for example by calling release_current_thread) it will cause a
borrow_mut panic — add a concise safety comment on with_vm explaining that the
closure must not call any function that mutates or re-enters VM_STACK (e.g.,
release_current_thread or other VM stack-mutating APIs) and suggest using a
different API or refactoring to avoid holding the borrow across such calls;
reference the functions with_vm and with_current_vm and the VM_STACK cell in the
comment to help future callers locate the caveat.
crates/capi/src/util.rs (1)

79-85: 💤 Low value

usize → isize cast can produce a spurious error sentinel.

self as isize wraps when self > isize::MAX, which would silently turn a very large length into a negative value indistinguishable from ERR_VALUE = -1. For sizes returned by C-API functions (e.g., Py_ssize_t-shaped APIs) this is unlikely in practice, but consider isize::try_from(self).unwrap_or(isize::MAX) or asserting the upper bound, or at minimum a comment documenting the assumption.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/capi/src/util.rs` around lines 79 - 85, The impl FfiResult<isize> for
usize currently casts usize -> isize in into_output which can wrap for values >
isize::MAX and collide with ERR_VALUE; change into_output in the impl for usize
to convert safely (e.g., use isize::try_from(self).unwrap_or(isize::MAX) or
otherwise saturate/validate and return isize::MAX on overflow) and document the
chosen behavior, making sure ERR_VALUE (-1) remains distinct; update the impl
block (symbols: FfiResult<isize> for usize, const ERR_VALUE, fn
into_output(self, _vm: &VirtualMachine)) accordingly.
crates/capi/src/object.rs (1)

70-83: ⚡ Quick win

Replace panic! with CPython-conventional error.

A panic! in an extern "C" function aborts the process (since Rust 1.81 panics across the FFI boundary unwind into an abort). CPython's Py_GetConstant/Py_GetConstantBorrowed conventionally raise SystemError and return NULL on an invalid id, which is friendlier and matches what C-extension authors expect.

♻️ Suggested change
-pub extern "C" fn Py_GetConstantBorrowed(constant_id: c_uint) -> *mut PyObject {
-    with_vm(|vm| {
-        let ctx = &vm.ctx;
-        match constant_id {
-            0 => ctx.none.as_object(),
-            1 => ctx.false_value.as_object(),
-            2 => ctx.true_value.as_object(),
-            3 => ctx.ellipsis.as_object(),
-            4 => ctx.not_implemented.as_object(),
-            _ => panic!("Invalid constant_id passed to Py_GetConstantBorrowed"),
-        }
-        .as_raw()
-    })
-}
+pub extern "C" fn Py_GetConstantBorrowed(constant_id: c_uint) -> *mut PyObject {
+    with_vm(|vm| -> PyResult<PyObjectRef> {
+        let ctx = &vm.ctx;
+        let obj = match constant_id {
+            0 => ctx.none.as_object(),
+            1 => ctx.false_value.as_object(),
+            2 => ctx.true_value.as_object(),
+            3 => ctx.ellipsis.as_object(),
+            4 => ctx.not_implemented.as_object(),
+            _ => return Err(vm.new_system_error(format!(
+                "Py_GetConstantBorrowed: invalid constant_id {constant_id}"
+            ))),
+        };
+        Ok(obj.to_owned())
+    })
+}

(Or, if the as_raw/borrowed-pointer semantics need to be preserved exactly, return *mut PyObject directly via FfiResult and set the exception inline, returning null_mut() on the bad-id path.)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/capi/src/object.rs` around lines 70 - 83, In Py_GetConstantBorrowed
replace the panic! on the _ (invalid id) branch with CPython-style behavior:
create/raise a SystemError on the VM/context (e.g. via
vm.ctx.new_system_error(...) or the crate's VM exception-raising helper) and
return null_mut() from the extern "C" function; alternatively, change the
function to return an FfiResult that sets the exception and yields a null
pointer on error. Ensure you still return the borrowed object's as_raw() for
valid ids and do not unwind across the FFI boundary.
crates/capi/src/lib.rs (1)

27-33: 💤 Low value

Context::genesis() is reused; verify it matches the interpreter's context.

init_exception_statics(&Context::genesis().exceptions) initializes exception statics from the genesis context, while the stored interpreter may have been built with its own Context. Today genesis() is effectively a global singleton in RustPython so this works, but if Context ever becomes per-interpreter, the statics here will diverge from the interpreter actually serving C-API calls. Consider deriving the exceptions from the interpreter being installed (e.g., interpreter.enter(|vm| init_exception_statics(&vm.ctx.exceptions)) before storing) so the wiring stays correct under future refactors.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/capi/src/lib.rs` around lines 27 - 33, The code currently calls
init_exception_statics(&Context::genesis().exceptions) which may diverge if
Context becomes per-interpreter; change set_main_interpreter to derive
exceptions from the Interpreter being installed by entering the interpreter
(e.g., use Interpreter::enter or interpreter.enter(|vm| ...)) and call
init_exception_statics(&vm.ctx.exceptions) while still in that closure, then
store the interpreter; ensure the safety comment is updated to reflect that
exception statics are initialized from the interpreter's Context and keep the
assert!(interp.is_none()) and assignment to *interp = Some(interpreter) as
before.
build.rs (1)

5-12: 💤 Low value

Windows + capi has no symbol-export configuration.

When target == "windows" && capi_enabled, the binary won't export the C-API symbols (no --export-all-symbols, no .def file, no __declspec(dllexport) on the Rust side). Linux/macOS get -Wl,--export-dynamic/-Wl,-export_dynamic, but Windows binaries need explicit export markup to be loadable as a Python host by C extensions. If Windows support for capi is intended, consider emitting an equivalent flag or .def file; if it's deliberately deferred, a comment here documenting that would help.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@build.rs` around lines 5 - 12, The Windows branch currently lacks any export
configuration when target.as_str() == "windows" and capi_enabled is true; update
the build.rs match to handle that case by either (a) generating a
module-definition (.def) file listing the C-API symbols and emitting a
cargo:rustc-link-arg-bin=rustpython=/DEF:<path> (or rustc-link-arg for the MSVC
linker) or (b) detecting MinGW and emitting the appropriate
-Wl,--export-all-symbols flag; implement one of these in the "windows" arm and
add a brief comment that documents which method you chose and why (reference the
target.as_str() match, the capi_enabled flag, and the println! calls that emit
cargo:rustc-link-arg-bin).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/capi/.cargo/config.toml`:
- Around line 1-3: The PYO3 env entries (PYO3_CONFIG_FILE and PYO3_NO_PYTHON)
defined in the local crate .cargo/config.toml won't be picked up for
workspace-root builds; move those two entries into the workspace root
.cargo/config.toml, keep PYO3_CONFIG_FILE pointed at pyo3-rustpython.config (do
not change it to ../pyo3-rustpython.config since relative=true already resolves
to the crate dir), and remove the duplicate/local config so workspace builds
(e.g., builds with the capi feature) inherit the PYO3 settings.

In `@crates/capi/src/object.rs`:
- Around line 32-67: PyType_GetFlags currently initializes flags from
ty.slots.flags.bits() (u64) and then ORs c_ulong constants
(PY_TPFLAGS_LONG_SUBCLASS, etc.), which breaks on Windows because c_ulong is
u32; fix by casting the initial value to c_ulong—i.e., change the initialization
of flags in PyType_GetFlags so it uses ty.slots.flags.bits() as c_ulong before
any |= with the PY_TPFLAGS_* constants so all operations are the same c_ulong
type.

In `@crates/capi/src/pyerrors.rs`:
- Around line 99-105: PyErr_Occurred currently returns the exception instance
(exc.as_object().as_raw()) but must return the exception type (the first arg to
PyErr_Set*), so change the mapper in PyErr_Occurred to return the exception's
type raw pointer instead — i.e., when vm.current_exception() yields exc, return
the raw pointer for exc's class/type (use the VM/exception API such as
exc.class().as_object().as_raw() or the equivalent on your exception type) and
keep the borrowed semantics (no INCREF), preserving the unwrap_or_default
fallback.
- Around line 118-123: In PyErr_SetRaisedException, validate the incoming
pointer before taking ownership: check that exc is not null (use NonNull::new
instead of NonNull::new_unchecked and return an appropriate Err if null), then
construct the PyObjectRef via PyObjectRef::from_raw only after the non-null
check, and replace downcast_unchecked() with a checked downcast (e.g.,
downcast() or the crate's checked variant) so you verify the object is an
exception instance before returning Err(exception).
- Around line 192-236: The extern "C" functions (notably PyErr_NewException,
PyErr_NewExceptionWithDoc, PyErr_PrintEx, PyErr_DisplayException,
PyErr_WriteUnraisable) currently call expect!/panic!/assert!, which will abort
the process on bad input; instead catch invalid inputs, set a Python SystemError
on the VM and return NULL (or return early) per the C-API contract. Replace uses
of expect()/panic!/assert! in PyErr_NewException (e.g.,
CStr::from_ptr(...).to_str().expect(...), rsplit_once().expect(...), panic! for
bad base types, and assert! on dict) with guarded checks that build an error
message, call vm.ctx.exceptions.system_error (or
vm.set_exception(vm.ctx.exceptions.system_error, message)) to set the exception
on the VM, and then return std::ptr::null_mut(); apply the same pattern to
PyErr_NewExceptionWithDoc (delegate should propagate NULL on error) and to
PyErr_PrintEx / PyErr_DisplayException / PyErr_WriteUnraisable so they no longer
panic but set an appropriate SystemError and return gracefully.

In `@crates/capi/src/pylifecycle.rs`:
- Around line 42-53: Py_FinalizeEx() currently returns success without resetting
interpreter state; update Py_FinalizeEx to properly finalize and clear
MAIN_INTERP (drop/free the interpreter handle and set MAIN_INTERP to null/None)
and update whatever initialized flag Py_IsInitialized() checks so it returns 0
after finalize; ensure Py_Finalize() still calls Py_FinalizeEx(), and mirror the
same teardown in Py_FinalizeEx so a subsequent Py_InitializeEx() can create a
fresh interpreter. Reference: Py_FinalizeEx, Py_Finalize, MAIN_INTERP,
Py_IsInitialized, Py_InitializeEx.

In `@crates/capi/src/pystate.rs`:
- Around line 46-49: The PyEval_SaveThread stub must not silently return NULL;
update the body of the extern "C" function PyEval_SaveThread to fail loudly
instead of ptr::null_mut(): add a TODO comment about implementing proper GIL
release, and replace the current return with either an explicit abort (print a
clear message then process::abort()) or call unimplemented!() so callers (and C
extensions expecting a valid PyThreadState pointer) will fail loudly during
development; also ensure the complementary PyEval_RestoreThread handling is
noted in the TODO so the full pair gets implemented against
release_current_thread/attach_current_thread later.

In `@crates/vm/src/codecs.rs`:
- Around line 161-164: The cfg gates are inconsistent:
CodecsRegistry::reinit_after_fork and StringPool::reinit_after_fork are only
defined with #[cfg(all(unix, feature = "threading", feature = "host_env"))] but
posix::reinit_locks_after_fork (posix.rs::reinit_locks_after_fork) is gated with
#[cfg(all(unix, feature = "threading"))], causing missing method errors when
host_env is disabled; update the cfg on posix::reinit_locks_after_fork to
#[cfg(all(unix, feature = "threading", feature = "host_env"))] so its calls to
CodecsRegistry::reinit_after_fork and StringPool::reinit_after_fork compile
consistently.

In `@crates/vm/src/intern.rs`:
- Around line 39-42: The function reinit_locks_after_fork is currently gated
with #[cfg(all(unix, feature = "threading"))] but calls
StringPool::reinit_after_fork and CodecRegistry::reinit_after_fork which are
defined only when #[cfg(all(unix, feature = "threading", feature = "host_env"))]
is set; change the attribute on reinit_locks_after_fork to #[cfg(all(unix,
feature = "threading", feature = "host_env"))] so the compilation gate matches
the methods it calls (locate the reinit_locks_after_fork function and update its
cfg to include feature = "host_env").

In `@crates/vm/src/stdlib/_io.rs`:
- Around line 5-6: The exported function cfg is too restrictive: change the
module-level export so reinit_std_streams_after_fork and reinit_io_locks are
available under #[cfg(all(unix, feature = "threading"))] and move the
#[cfg(feature = "host_env")] guard inside each function body; implement the
host_env-disabled path as a no-op (early return) so calls from posix.rs (which
are under #[cfg(feature = "threading")]) compile when host_env is not enabled,
while keeping the real implementation when host_env is present.

In `@crates/vm/src/stdlib/_thread.rs`:
- Around line 2-3: The call site py_os_after_fork_child in posix.rs calls
after_fork_child which is only exported under #[cfg(all(unix, feature =
"threading", feature = "host_env"))]; update the gating around
py_os_after_fork_child (or the specific call) to require the same cfg (e.g.
#[cfg(all(unix, feature = "threading", feature = "host_env"))]) so the call is
only compiled when after_fork_child is available, ensuring the cfg for
py_os_after_fork_child matches the after_fork_child export.

In `@crates/vm/src/vm/thread.rs`:
- Around line 175-190: The release_current_thread function currently drops
GILSTATE_VM before popping the raw pointer from VM_STACK, which can cause
destructors to call with_current_vm and dereference freed memory; fix by
reversing the order in release_current_thread: first remove/pop the VM pointer
from VM_STACK (the vms.borrow_mut().pop().expect(...) call) and only after that
take/drop the GILSTATE_VM (GILSTATE_VM.with(|gilstate_vm|
gilstate_vm.borrow_mut().take())); keep the existing detach_thread call and
other logic unchanged.

---

Outside diff comments:
In `@crates/vm/src/object/core.rs`:
- Around line 477-503: The call site gating for reset_weakref_locks_after_fork
in posix.rs::py_os_after_fork_child must match the function's cfg; change the
cfg on the call to require both feature = "threading" and feature = "host_env"
(i.e. use #[cfg(all(feature = "threading", feature = "host_env"))]) so the call
to crate::object::reset_weakref_locks_after_fork() is only compiled when the
function is present.

In `@crates/vm/src/vm/mod.rs`:
- Around line 2063-2098: The C-API error-indicator must be backed by a separate
per-Vm state, not the exc_info stack used by sys.exc_info()/PUSH_EXC_INFO;
change the methods so they don't read/write exceptions.stack: leave
current_exception/set_exception (and their semantics for the exc_info stack)
as-is, and introduce/replace the C-API facing API with a separate storage (e.g.,
add a new field like raised_exception: RefCell<Option<PyBaseExceptionRef>>) and
implement get_raised_exception / set_raised_exception / take_raised_exception to
operate on that field instead of exceptions.stack; ensure
thread::update_thread_exception continues to reflect the thread-visible error
indicator using the new getter (not topmost_exception), and update any callers
that relied on the old take_raised_exception to use the new C-API methods so
C-level errors do not corrupt the handled-exception stack.

---

Nitpick comments:
In `@build.rs`:
- Around line 5-12: The Windows branch currently lacks any export configuration
when target.as_str() == "windows" and capi_enabled is true; update the build.rs
match to handle that case by either (a) generating a module-definition (.def)
file listing the C-API symbols and emitting a
cargo:rustc-link-arg-bin=rustpython=/DEF:<path> (or rustc-link-arg for the MSVC
linker) or (b) detecting MinGW and emitting the appropriate
-Wl,--export-all-symbols flag; implement one of these in the "windows" arm and
add a brief comment that documents which method you chose and why (reference the
target.as_str() match, the capi_enabled flag, and the println! calls that emit
cargo:rustc-link-arg-bin).

In `@crates/capi/src/lib.rs`:
- Around line 27-33: The code currently calls
init_exception_statics(&Context::genesis().exceptions) which may diverge if
Context becomes per-interpreter; change set_main_interpreter to derive
exceptions from the Interpreter being installed by entering the interpreter
(e.g., use Interpreter::enter or interpreter.enter(|vm| ...)) and call
init_exception_statics(&vm.ctx.exceptions) while still in that closure, then
store the interpreter; ensure the safety comment is updated to reflect that
exception statics are initialized from the interpreter's Context and keep the
assert!(interp.is_none()) and assignment to *interp = Some(interpreter) as
before.

In `@crates/capi/src/object.rs`:
- Around line 70-83: In Py_GetConstantBorrowed replace the panic! on the _
(invalid id) branch with CPython-style behavior: create/raise a SystemError on
the VM/context (e.g. via vm.ctx.new_system_error(...) or the crate's VM
exception-raising helper) and return null_mut() from the extern "C" function;
alternatively, change the function to return an FfiResult that sets the
exception and yields a null pointer on error. Ensure you still return the
borrowed object's as_raw() for valid ids and do not unwind across the FFI
boundary.

In `@crates/capi/src/pystate.rs`:
- Around line 11-13: with_vm currently calls with_current_vm which holds a
RefCell borrow on VM_STACK for the duration of the callback; if the provided
closure f (or anything it transitively calls) tries to mutate VM_STACK (for
example by calling release_current_thread) it will cause a borrow_mut panic —
add a concise safety comment on with_vm explaining that the closure must not
call any function that mutates or re-enters VM_STACK (e.g.,
release_current_thread or other VM stack-mutating APIs) and suggest using a
different API or refactoring to avoid holding the borrow across such calls;
reference the functions with_vm and with_current_vm and the VM_STACK cell in the
comment to help future callers locate the caveat.

In `@crates/capi/src/util.rs`:
- Around line 79-85: The impl FfiResult<isize> for usize currently casts usize
-> isize in into_output which can wrap for values > isize::MAX and collide with
ERR_VALUE; change into_output in the impl for usize to convert safely (e.g., use
isize::try_from(self).unwrap_or(isize::MAX) or otherwise saturate/validate and
return isize::MAX on overflow) and document the chosen behavior, making sure
ERR_VALUE (-1) remains distinct; update the impl block (symbols:
FfiResult<isize> for usize, const ERR_VALUE, fn into_output(self, _vm:
&VirtualMachine)) accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 7a69f801-df68-4089-9da5-e5695cac77a4

📥 Commits

Reviewing files that changed from the base of the PR and between 22d4f43 and 61f4387.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • .cspell.dict/cpython.txt
  • .cspell.dict/python-more.txt
  • Cargo.toml
  • build.rs
  • crates/capi/.cargo/config.toml
  • crates/capi/Cargo.toml
  • crates/capi/pyo3-rustpython.config
  • crates/capi/src/lib.rs
  • crates/capi/src/object.rs
  • crates/capi/src/pyerrors.rs
  • crates/capi/src/pylifecycle.rs
  • crates/capi/src/pystate.rs
  • crates/capi/src/refcount.rs
  • crates/capi/src/util.rs
  • crates/vm/Cargo.toml
  • crates/vm/src/builtins/type.rs
  • crates/vm/src/codecs.rs
  • crates/vm/src/intern.rs
  • crates/vm/src/object/core.rs
  • crates/vm/src/stdlib/_imp.rs
  • crates/vm/src/stdlib/_io.rs
  • crates/vm/src/stdlib/_thread.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
  • src/lib.rs
💤 Files with no reviewable changes (1)
  • crates/vm/Cargo.toml

Comment thread crates/capi/.cargo/config.toml
Comment thread crates/capi/src/object.rs
Comment on lines +32 to +67
#[unsafe(no_mangle)]
pub unsafe extern "C" fn PyType_GetFlags(ptr: *const PyTypeObject) -> c_ulong {
let ctx = Context::genesis();
let zoo = &ctx.types;
let exp_zoo = &ctx.exceptions;

let ty = unsafe { &*ptr };
let mut flags = ty.slots.flags.bits();

if ty.is_subtype(zoo.int_type) {
flags |= PY_TPFLAGS_LONG_SUBCLASS;
}
if ty.is_subtype(zoo.list_type) {
flags |= PY_TPFLAGS_LIST_SUBCLASS
}
if ty.is_subtype(zoo.tuple_type) {
flags |= PY_TPFLAGS_TUPLE_SUBCLASS;
}
if ty.is_subtype(zoo.bytes_type) {
flags |= PY_TPFLAGS_BYTES_SUBCLASS;
}
if ty.is_subtype(zoo.str_type) {
flags |= PY_TPFLAGS_UNICODE_SUBCLASS;
}
if ty.is_subtype(zoo.dict_type) {
flags |= PY_TPFLAGS_DICT_SUBCLASS;
}
if ty.is_subtype(exp_zoo.base_exception_type) {
flags |= PY_TPFLAGS_BASE_EXC_SUBCLASS;
}
if ty.is_subtype(zoo.type_type) {
flags |= PY_TPFLAGS_TYPE_SUBCLASS;
}

flags
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the underlying repr of slots.flags.
ast-grep --pattern 'pub struct PyTypeFlags: $_'
rg -nP --type=rust -C3 'bitflags!\s*\{[^}]*PyTypeFlags' 
rg -nP --type=rust -C2 'pub flags:\s*PyTypeFlags'

Repository: RustPython/RustPython

Length of output: 331


🏁 Script executed:

#!/bin/bash
# Find PyTypeFlags definition and its underlying type
rg -n "struct PyTypeFlags|bitflags.*PyTypeFlags" crates/vm/src/types/slot.rs -A 20 -B 2

Repository: RustPython/RustPython

Length of output: 994


🏁 Script executed:

#!/bin/bash
# Find PY_TPFLAGS_* constants definition
rg -n "PY_TPFLAGS_LONG_SUBCLASS|PY_TPFLAGS_LIST_SUBCLASS" --type=rust -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 932


🏁 Script executed:

#!/bin/bash
# Check what type bits() returns and what c_ulong is in context
rg -n "c_ulong|c_uint" crates/capi/src/object.rs -B 2 -A 2 | head -40

Repository: RustPython/RustPython

Length of output: 1078


Add explicit cast of bits() to c_ulong to fix Windows compilation.

ty.slots.flags is a PyTypeFlags bitflags with underlying type u64 (line 221, crates/vm/src/types/slot.rs), so bits() returns u64. The constants PY_TPFLAGS_LONG_SUBCLASS etc. are declared as c_ulong (lines 7–14, crates/capi/src/object.rs). On Windows, c_ulong is u32 while on 64-bit Unix it is u64. The current code does flags |= c_ulong_constant, which requires matching types in Rust and will fail to compile on Windows.

Cast the result to c_ulong:

let mut flags = ty.slots.flags.bits() as c_ulong;

High bits are not lost since the constants use shifts 1 << 24 through 1 << 31 (within 32 bits), and the function return type is already c_ulong.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/capi/src/object.rs` around lines 32 - 67, PyType_GetFlags currently
initializes flags from ty.slots.flags.bits() (u64) and then ORs c_ulong
constants (PY_TPFLAGS_LONG_SUBCLASS, etc.), which breaks on Windows because
c_ulong is u32; fix by casting the initial value to c_ulong—i.e., change the
initialization of flags in PyType_GetFlags so it uses ty.slots.flags.bits() as
c_ulong before any |= with the PY_TPFLAGS_* constants so all operations are the
same c_ulong type.

Comment on lines +99 to +105
#[unsafe(no_mangle)]
pub extern "C" fn PyErr_Occurred() -> *mut PyObject {
with_vm(|vm| {
vm.current_exception()
.map(|exc| exc.as_object().as_raw())
.unwrap_or_default()
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In CPython's C API, what does PyErr_Occurred() return: the current exception instance or the current exception type?

💡 Result:

PyErr_Occurred returns the current exception type, not the exception instance. If the error indicator is set, it returns a borrowed reference to the exception type (the first argument to the last call to a PyErr_Set* function or PyErr_Restore). If no exception is set, it returns NULL.[1][2][3]

Citations:


PyErr_Occurred() must return the exception type, not the exception instance.

Per the CPython C API specification, PyErr_Occurred() returns a borrowed reference to the exception type (the first argument to the last call to a PyErr_Set* function). The current implementation returns the exception instance via exc.as_object().as_raw(), which breaks C callers that check against PyExc_* type constants or branch on exception type.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/capi/src/pyerrors.rs` around lines 99 - 105, PyErr_Occurred currently
returns the exception instance (exc.as_object().as_raw()) but must return the
exception type (the first arg to PyErr_Set*), so change the mapper in
PyErr_Occurred to return the exception's type raw pointer instead — i.e., when
vm.current_exception() yields exc, return the raw pointer for exc's class/type
(use the VM/exception API such as exc.class().as_object().as_raw() or the
equivalent on your exception type) and keep the borrowed semantics (no INCREF),
preserving the unwrap_or_default fallback.

Comment on lines +118 to +123
pub unsafe extern "C" fn PyErr_SetRaisedException(exc: *mut PyObject) {
with_vm::<PyResult<Infallible>, _>(|_vm| {
let exception =
unsafe { PyObjectRef::from_raw(NonNull::new_unchecked(exc)).downcast_unchecked() };
Err(exception)
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In CPython 3.12+ C API, does PyErr_SetRaisedException(NULL) clear the current exception, and is the argument required to be an exception instance?

💡 Result:

No, PyErr_SetRaisedException(NULL) does not clear the current exception. The official documentation states: "Set exc as the exception currently being raised, clearing the existing exception if one is set. This call steals a reference to exc, which must be a valid exception."[1][2][3][4] Passing NULL would violate the requirement that exc must be a valid exception, as NULL is not a valid PyObject* exception instance, and attempting to steal a reference from NULL would likely crash or cause undefined behavior. To clear the current exception in CPython 3.12+, use PyErr_Clear() instead.[3][5] The argument is required to be an exception instance (a valid PyObject* that is a valid exception), not NULL.[1]

Citations:


Validate exc before from_raw()/downcast_unchecked().

Per CPython 3.12+ documentation, PyErr_SetRaisedException() requires exc to be a valid exception instance—NULL is invalid and will cause undefined behavior. The current code uses NonNull::new_unchecked(exc) (UB if NULL) and downcast_unchecked() (no validation that exc is an exception). Add a null check and convert downcast_unchecked() to a checked downcast before taking ownership.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/capi/src/pyerrors.rs` around lines 118 - 123, In
PyErr_SetRaisedException, validate the incoming pointer before taking ownership:
check that exc is not null (use NonNull::new instead of NonNull::new_unchecked
and return an appropriate Err if null), then construct the PyObjectRef via
PyObjectRef::from_raw only after the non-null check, and replace
downcast_unchecked() with a checked downcast (e.g., downcast() or the crate's
checked variant) so you verify the object is an exception instance before
returning Err(exception).

Comment on lines +192 to +236
pub unsafe extern "C" fn PyErr_NewException(
name: *const c_char,
base: *mut PyObject,
dict: *mut PyObject,
) -> *mut PyObject {
with_vm(|vm| {
let (module, name) = unsafe {
CStr::from_ptr(name)
.to_str()
.expect("Exception name is not valid UTF-8")
.rsplit_once('.')
.expect("Exception name must be of the form 'module.ExceptionName'")
};

let bases = unsafe { base.as_ref() }.map(|bases| {
if let Some(ty) = bases.downcast_ref::<PyType>() {
vec![ty.to_owned()]
} else if let Some(tuple) = bases.downcast_ref::<PyTuple>() {
tuple
.iter()
.map(|item| item.to_owned().downcast())
.collect::<Result<Vec<_>, _>>()
.expect("PyErr_NewException base tuple must contain only types")
} else {
panic!("PyErr_NewException base must be a type or a tuple of types");
}
});

assert!(
dict.is_null(),
"PyErr_NewException with non-null dict is not supported yet"
);

vm.ctx.new_exception_type(module, name, bases)
})
}

#[unsafe(no_mangle)]
pub unsafe extern "C" fn PyErr_NewExceptionWithDoc(
name: *const c_char,
_doc: *const c_char,
base: *mut PyObject,
dict: *mut PyObject,
) -> *mut PyObject {
unsafe { PyErr_NewException(name, base, dict) }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

What happens when a Rust panic reaches an extern "C"boundary, and how does CPython'sPyErr_NewException report invalid arguments?

💡 Result:

  1. What happens when a Rust panic reaches an extern "C" boundary? When compiling with panic=unwind, Rust traditionally treated unwinding across ABI boundaries like extern "C" as undefined behavior, because extern "C" functions are optimized under the assumption they cannot unwind (effectively LLVM nounwind). Rust’s c-unwind work changed the behavior so that if a Rust function with the non-unwind ABI (extern "C") panics and that panic would otherwise “escape” across the boundary, the runtime aborts instead of attempting to unwind into the C ABI [1]. If you instead use the unwind-enabled ABI, extern "C-unwind" (and corresponding "*-unwind" ABIs), unwinding is permitted to cross between Rust and foreign frames that support unwinding [2]. Practical upshot: - extern "C" Rust function + uncaught panic: the panic is trapped at the ABI boundary and the process aborts (with the c-unwind change) rather than unwinding through C frames [1]. - extern "C-unwind" Rust function: unwinding is allowed across the boundary (where supported) [2]. Caveat: if you “mix” panic/unwind modes between Rust crates or between Rust and external code, there can be soundness issues/UB scenarios around unwinding expectations, depending on how the final binary is built [3]. 2) How CPython’s PyErr_NewException reports invalid arguments PyErr_NewException(const char *name, PyObject *base, PyObject *dict) is documented as creating a new exception class whose C name must be of the form module.classname [4]. If the name does not contain a dot (i.e., it’s not module.classname), CPython raises a SystemError with the message "PyErr_NewException: name must be module.class" and returns NULL [5]. In CPython’s implementation, it also defaults base to PyExc_Exception when base is NULL, and if dict is NULL it creates a new empty dict; failures in these helper steps lead to NULL being returned (after DECREF cleanup) [5]. So the key “invalid arguments” behavior to know: - Invalid name format (no module.classname dot): sets SystemError("PyErr_NewException: name must be module.class") and returns NULL [5][4]. - base==NULL and dict==NULL are not errors; they are treated as defaults (base becomes Exception; dict is created) [5][4].

Citations:


🏁 Script executed:

# Check the function signature and confirm it's extern "C"
head -200 crates/capi/src/pyerrors.rs | tail -50

Repository: RustPython/RustPython

Length of output: 1389


🏁 Script executed:

# Search for other potential panicking patterns in the same file
rg -n 'expect\(|panic!\(|assert!\(' crates/capi/src/pyerrors.rs | head -20

Repository: RustPython/RustPython

Length of output: 879


🏁 Script executed:

# Check if lines 261-263 are in a test context
sed -n '250,270p' crates/capi/src/pyerrors.rs

Repository: RustPython/RustPython

Length of output: 477


Fix panicking C-API functions that should return NULL and set error state instead.

This pattern affects multiple functions in the file (PyErr_NewException, PyErr_PrintEx, PyErr_DisplayException, PyErr_WriteUnraisable). Using expect(), panic!(), and assert!() in extern "C" functions causes the process to abort when invalid arguments are provided, rather than following the C-API contract: return NULL and set a Python exception. Per CPython's PyErr_NewException, invalid input like malformed exception names should set a SystemError and return NULL, not panic.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/capi/src/pyerrors.rs` around lines 192 - 236, The extern "C" functions
(notably PyErr_NewException, PyErr_NewExceptionWithDoc, PyErr_PrintEx,
PyErr_DisplayException, PyErr_WriteUnraisable) currently call
expect!/panic!/assert!, which will abort the process on bad input; instead catch
invalid inputs, set a Python SystemError on the VM and return NULL (or return
early) per the C-API contract. Replace uses of expect()/panic!/assert! in
PyErr_NewException (e.g., CStr::from_ptr(...).to_str().expect(...),
rsplit_once().expect(...), panic! for bad base types, and assert! on dict) with
guarded checks that build an error message, call vm.ctx.exceptions.system_error
(or vm.set_exception(vm.ctx.exceptions.system_error, message)) to set the
exception on the VM, and then return std::ptr::null_mut(); apply the same
pattern to PyErr_NewExceptionWithDoc (delegate should propagate NULL on error)
and to PyErr_PrintEx / PyErr_DisplayException / PyErr_WriteUnraisable so they no
longer panic but set an appropriate SystemError and return gracefully.

Comment thread crates/vm/src/codecs.rs
Comment on lines +161 to 164
#[cfg(all(unix, feature = "threading", feature = "host_env"))]
pub(crate) unsafe fn reinit_after_fork(&self) {
unsafe { crate::common::lock::reinit_rwlock_after_fork(&self.inner) };
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Same #[cfg] gate mismatch as StringPool::reinit_after_fork — breaks posix.rs::reinit_locks_after_fork

CodecsRegistry::reinit_after_fork is called from crates/vm/src/stdlib/posix.rs::reinit_locks_after_fork (line 843), which is gated only on #[cfg(all(unix, feature = "threading"))]. After this gate narrowing, the method won't exist when host_env is disabled, causing a compilation error under unix + threading (without host_env).

This is the same reinit_locks_after_fork function affected by the intern.rs change; both its string_pool and codec_registry calls are now broken under that feature combination. The fix is the same: widen reinit_locks_after_fork's gate to #[cfg(all(unix, feature = "threading", feature = "host_env"))] in posix.rs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/codecs.rs` around lines 161 - 164, The cfg gates are
inconsistent: CodecsRegistry::reinit_after_fork and
StringPool::reinit_after_fork are only defined with #[cfg(all(unix, feature =
"threading", feature = "host_env"))] but posix::reinit_locks_after_fork
(posix.rs::reinit_locks_after_fork) is gated with #[cfg(all(unix, feature =
"threading"))], causing missing method errors when host_env is disabled; update
the cfg on posix::reinit_locks_after_fork to #[cfg(all(unix, feature =
"threading", feature = "host_env"))] so its calls to
CodecsRegistry::reinit_after_fork and StringPool::reinit_after_fork compile
consistently.

Comment thread crates/vm/src/intern.rs
Comment on lines +39 to 42
#[cfg(all(unix, feature = "threading", feature = "host_env"))]
pub(crate) unsafe fn reinit_after_fork(&self) {
unsafe { crate::common::lock::reinit_rwlock_after_fork(&self.inner) };
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the cfg guard on reinit_locks_after_fork and its callers in posix.rs
rg -n "reinit_locks_after_fork|reinit_after_fork" --type rust -B2

Repository: RustPython/RustPython

Length of output: 6242


Compilation mismatch: reinit_locks_after_fork calls methods gated on host_env

StringPool::reinit_after_fork and CodecRegistry::reinit_after_fork both require #[cfg(all(unix, feature = "threading", feature = "host_env"))], but reinit_locks_after_fork (which calls both at lines 842–845) is only gated on #[cfg(all(unix, feature = "threading"))]. Building on Unix with threading enabled but host_env disabled will compile the containing function but fail when it reaches those method calls, since the methods won't be defined.

Widen the gate on reinit_locks_after_fork to match:

Suggested fix
-#[cfg(all(unix, feature = "threading"))]
+#[cfg(all(unix, feature = "threading", feature = "host_env"))]
 fn reinit_locks_after_fork(vm: &VirtualMachine) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/intern.rs` around lines 39 - 42, The function
reinit_locks_after_fork is currently gated with #[cfg(all(unix, feature =
"threading"))] but calls StringPool::reinit_after_fork and
CodecRegistry::reinit_after_fork which are defined only when #[cfg(all(unix,
feature = "threading", feature = "host_env"))] is set; change the attribute on
reinit_locks_after_fork to #[cfg(all(unix, feature = "threading", feature =
"host_env"))] so the compilation gate matches the methods it calls (locate the
reinit_locks_after_fork function and update its cfg to include feature =
"host_env").

Comment on lines +5 to 6
#[cfg(all(unix, feature = "threading", feature = "host_env"))]
pub(crate) use _io::reinit_std_streams_after_fork;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify cfg mismatch between declaration/export and call sites.
set -euo pipefail

rg -n -C3 'reinit_std_streams_after_fork|cfg\(all\(unix, feature = "threading", feature = "host_env"\)\)|cfg\(feature = "threading"\)' \
  crates/vm/src/stdlib/_io.rs \
  crates/vm/src/stdlib/posix.rs

echo
echo "Expected: call sites and symbol availability should be gated compatibly."
echo "If call sites are only cfg(feature=\"threading\") while symbol requires host_env, build matrix is inconsistent."

Repository: RustPython/RustPython

Length of output: 5974


Critical cfg mismatch breaks unix+threading builds without host_env feature

reinit_std_streams_after_fork is exported and defined with #[cfg(all(unix, feature = "threading", feature = "host_env"))], but crates/vm/src/stdlib/posix.rs:784 calls it under only #[cfg(feature = "threading")]. This breaks compilation for unix + threading + !host_env, where the call site exists but the symbol is not compiled.

Move the host_env check inside the function body so the function is available under #[cfg(all(unix, feature = "threading"))], with a no-op implementation when host_env is disabled.

Suggested fix
-#[cfg(all(unix, feature = "threading", feature = "host_env"))]
+#[cfg(all(unix, feature = "threading"))]
 pub(crate) unsafe fn reinit_std_streams_after_fork(vm: &VirtualMachine) {
+    #[cfg(not(feature = "host_env"))]
+    {
+        let _ = vm;
+        return;
+    }
+
+    #[cfg(feature = "host_env")]
+    {
         for name in ["stdin", "stdout", "stderr"] {
             let Ok(stream) = vm.sys_module.get_attr(name, vm) else {
                 continue;
             };
             reinit_io_locks(&stream);
         }
+    }
 }

Also applies to the reinit_io_locks helper function (line 5016).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/stdlib/_io.rs` around lines 5 - 6, The exported function cfg is
too restrictive: change the module-level export so reinit_std_streams_after_fork
and reinit_io_locks are available under #[cfg(all(unix, feature = "threading"))]
and move the #[cfg(feature = "host_env")] guard inside each function body;
implement the host_env-disabled path as a no-op (early return) so calls from
posix.rs (which are under #[cfg(feature = "threading")]) compile when host_env
is not enabled, while keeping the real implementation when host_env is present.

Comment on lines +2 to 3
#[cfg(all(unix, feature = "threading", feature = "host_env"))]
pub(crate) use _thread::after_fork_child;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

after_fork_child's narrowed gate breaks its call site in posix.rs::py_os_after_fork_child

after_fork_child is now exported and defined under #[cfg(all(unix, feature = "threading", feature = "host_env"))] (lines 2–3 and 1060–1061). Its call site in crates/vm/src/stdlib/posix.rs::py_os_after_fork_child is gated only on #[cfg(feature = "threading")] (effectively unix + threading in a Unix-only module), missing the host_env requirement. Building with unix + threading but without host_env will fail to compile.

Fix the call site in posix.rs:

-#[cfg(feature = "threading")]
+#[cfg(all(feature = "threading", feature = "host_env"))]
 crate::stdlib::_thread::after_fork_child(vm);

Also applies to: 1060-1061

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/stdlib/_thread.rs` around lines 2 - 3, The call site
py_os_after_fork_child in posix.rs calls after_fork_child which is only exported
under #[cfg(all(unix, feature = "threading", feature = "host_env"))]; update the
gating around py_os_after_fork_child (or the specific call) to require the same
cfg (e.g. #[cfg(all(unix, feature = "threading", feature = "host_env"))]) so the
call is only compiled when after_fork_child is available, ensuring the cfg for
py_os_after_fork_child matches the after_fork_child export.

Comment on lines +175 to +190
pub fn release_current_thread(state: CurrentVmAttachState) {
if state == CurrentVmAttachState::AlreadyAttached {
return;
}

let gilstate_vm = GILSTATE_VM.with(|gilstate_vm| gilstate_vm.borrow_mut().take());
drop(gilstate_vm);

VM_STACK.with(|vms| {
vms.borrow_mut()
.pop()
.expect("release_current_thread() called without an attached VM");
});

#[cfg(unix)]
detach_thread();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

🧩 Analysis chain

🌐 Web query:

In Rust, if a raw pointer is copied from a Boxand theBox is dropped while that raw pointer is still stored elsewhere, is it valid to dereference that pointer during or after the drop of the boxed value?

💡 Result:

No. If you copy a raw pointer derived from a Box and then drop the Box while that pointer is still stored elsewhere, dereferencing the pointer during or after the drop is invalid (it triggers undefined behavior). [1][2] Reasoning (grounded in Rust’s rules for pointers and “dangling”): 1) Raw pointers have no automatic lifetime/liveness guarantees; copying/dropping the pointer itself doesn’t keep the referent alive. [3] 2) A pointer is “dangling” if it is null or if not all bytes it points to are part of the same live allocation; dereferencing a dangling pointer is one of the ways Rust characterizes incorrect behavior. [1][4] 3) After Box is dropped, the heap allocation backing it is deallocated; the raw pointer no longer has provenance/dereferenceability for non-zero-sized accesses, so reads through it require that the pointer be valid for reads to the relevant memory range, which it won’t be after free. [1][2][5] Practical takeaway: if you need a pointer that remains dereferenceable beyond the Box’s drop, you must ensure the allocation stays alive (e.g., by transferring ownership via Box::into_raw and later reconstructing via Box::from_raw), rather than dropping the Box while retaining a raw pointer to its contents. [6]

Citations:


🏁 Script executed:

cat -n crates/vm/src/vm/thread.rs | sed -n '175,190p'

Repository: RustPython/RustPython

Length of output: 624


🏁 Script executed:

# Search for destructor implementations (Drop trait) that might call with_current_vm
rg -A 10 'impl.*Drop.*for.*ThreadedVirtualMachine|impl.*Drop.*for.*VirtualMachine' crates/vm/src/

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Search for any calls to with_current_vm() to understand what it does
rg -B 5 -A 10 'fn with_current_vm' crates/vm/src/vm/thread.rs

Repository: RustPython/RustPython

Length of output: 621


release_current_thread() drops the VM before removing its raw pointer from VM_STACK, violating Rust's memory safety.

The function drops GILSTATE_VM at line 181 while the corresponding raw pointer remains in VM_STACK (popped at line 185). If any destructor in the Box runs during this drop and calls with_current_vm(), it dereferences a pointer into freed memory—undefined behavior in Rust.

Swap the order: pop from VM_STACK first, then drop the boxed VM.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/vm/thread.rs` around lines 175 - 190, The
release_current_thread function currently drops GILSTATE_VM before popping the
raw pointer from VM_STACK, which can cause destructors to call with_current_vm
and dereference freed memory; fix by reversing the order in
release_current_thread: first remove/pop the VM pointer from VM_STACK (the
vms.borrow_mut().pop().expect(...) call) and only after that take/drop the
GILSTATE_VM (GILSTATE_VM.with(|gilstate_vm| gilstate_vm.borrow_mut().take()));
keep the existing detach_thread call and other logic unchanged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants