Fix process abort on large float format precision#7633
Fix process abort on large float format precision#7633changjoon-park wants to merge 5 commits intoRustPython:mainfrom
Conversation
Formatting a float with large precision (>= ~65535) aborted the
interpreter instead of raising a Python exception. CPython handles
the same input by returning a clean string.
# Before
./rustpython -c "print(f'{1.5:.1000000}')"
thread 'main' panicked at crates/literal/src/float.rs:135:
Formatting argument out of range (exit 101, abort)
# After
./rustpython -c "print(f'{1.5:.1000000}')"
1.5
Root cause: Rust's `format!("{:.*}", n, x)` panics when `n`
exceeds the fmt runtime's internal precision limit. `format_fixed`
already caps `n` at u16::MAX, but `format_general` and
`format_exponent` (and the `%` branch in `crates/common/src/format.rs`)
passed user-supplied precision straight through to `format!`.
Fix:
* Introduce `FMT_MAX_PRECISION` + `clamp_fmt_precision()` in
crates/literal/src/float.rs. Cap is `u16::MAX - 1` because
`{:.*e}` hits a second panic (`ndigits > 0` in core flt2dec)
at exactly u16::MAX; the smaller value covers both paths.
* Apply the helper to `format_fixed` (replacing the existing
ad-hoc cap), `format_exponent` (entry), and `format_general`
(three separate format! calls with saturating arithmetic on
derived precision values).
* Apply the helper in the `FormatType::Percentage` branch in
crates/common/src/format.rs.
This is harmless for all normal inputs — f64 carries only ~17
significant digits, so precision beyond 65K is padding zeros at
best. Complex-number and old-style `%`-formatting paths transitively
benefit because they dispatch to the same library functions.
Verified:
* cargo run -- -m test test_float test_fstring test_format:
144 passed, 0 regressed.
* extra_tests/snippets/builtin_format.py: all assertions pass,
including 7 new regression cases covering e / E / g / G / f /
% at precision 1_000_000.
* Probed with 10 magnitude values (0, ±1.5, ±inf, nan, 1e-300,
1e300, f64::MAX, 5e-324) x 4 format types = 40 combinations,
plus precision 0/1/2 boundary, complex formatting, old-style
`%` formatting, and combined specs (fill/align/sign/grouping/
zero-pad). All return clean strings; no process abort.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdd precision-clamping constants and helpers; apply clamping across percent, fixed, exponent, and general float-formatting code paths to avoid runtime panics for extremely large precisions, and add regression tests validating behavior for huge precision values. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/literal/src/float.rs (1)
154-178:⚠️ Potential issue | 🟡 MinorAvoid double-clamping exponent base formatting in
format_general.At high precision values near the cap (65534), line 167–171 reformats
basewithclamp_fmt_precision(precision.saturating_add(1)), which re-applies the cap and can drop necessary precision. Sincer_expis already produced with clamped precision at line 156, reusebasedirectly instead of reformatting. Extractexponent_precisionat the start to avoid repeating the same computation and keep both the exponent string formatting and decimal point logic consistent.Proposed fix
magnitude if magnitude.is_finite() => { + let exponent_precision = clamp_fmt_precision(precision.saturating_sub(1)); let r_exp = format!( "{:.*e}", - clamp_fmt_precision(precision.saturating_sub(1)), + exponent_precision, magnitude, ); let mut parts = r_exp.splitn(2, 'e'); let base = parts.next().unwrap(); let exponent = parts.next().unwrap().parse::<i64>().unwrap(); @@ - let magnitude = format!( - "{:.*}", - clamp_fmt_precision(precision.saturating_add(1)), - base, - ); + let magnitude = base.to_owned(); let base = maybe_remove_trailing_redundant_chars(magnitude, alternate_form); - let point = decimal_point_or_empty(precision.saturating_sub(1), alternate_form); + let point = decimal_point_or_empty(exponent_precision, alternate_form); format!("{base}{point}{e}{exponent:+#03}")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/literal/src/float.rs` around lines 154 - 178, In format_general, avoid reformatting the exponent base (the second formatting that uses clamp_fmt_precision(precision.saturating_add(1)) and assigns to magnitude/base) because r_exp was already produced with a clamped precision; instead compute a single exponent_precision = clamp_fmt_precision(precision.saturating_sub(1)) at the start, use it when creating r_exp, then reuse the parsed base from r_exp (variable base from parts.next()) rather than calling format! again; pass that base into maybe_remove_trailing_redundant_chars and use decimal_point_or_empty(precision.saturating_sub(1), alternate_form) so both exponent string formatting and decimal-point logic use the same computed precision and you no longer double-clamp via clamp_fmt_precision in this branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@crates/literal/src/float.rs`:
- Around line 154-178: In format_general, avoid reformatting the exponent base
(the second formatting that uses
clamp_fmt_precision(precision.saturating_add(1)) and assigns to magnitude/base)
because r_exp was already produced with a clamped precision; instead compute a
single exponent_precision = clamp_fmt_precision(precision.saturating_sub(1)) at
the start, use it when creating r_exp, then reuse the parsed base from r_exp
(variable base from parts.next()) rather than calling format! again; pass that
base into maybe_remove_trailing_redundant_chars and use
decimal_point_or_empty(precision.saturating_sub(1), alternate_form) so both
exponent string formatting and decimal-point logic use the same computed
precision and you no longer double-clamp via clamp_fmt_precision in this branch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 6827d226-5ccc-49ce-9a04-483979d9a597
📒 Files selected for processing (3)
crates/common/src/format.rscrates/literal/src/float.rsextra_tests/snippets/builtin_format.py
Two refinements after CodeRabbit review:
1. Drop the redundant `format!("{:.*}", precision + 1, base)` in
`format_general`'s scientific branch. It was a no-op pre-fix
(magnitude is `.abs()`-ed at the caller, so `base` has no sign
and its length was exactly `precision + 1`), but after I added
the cap it turned into an active truncate — dropping 1 char of
precision at the cap boundary. Reuse `base` directly and extract
`exp_precision` for reuse by `decimal_point_or_empty`.
2. Split the cap into two helpers.
`FMT_MAX_PRECISION = u16::MAX` — for plain `{:.*}` (format_fixed,
%-branch, format_general's
non-scientific branch).
`FMT_MAX_EXP_PRECISION = u16::MAX - 1` — for `{:.*e}` (format_exponent,
format_general's scientific
entry).
The second value is one lower because `{:.*e}` trips an additional
`ndigits > 0` assertion in `core::num::flt2dec` at exactly
`u16::MAX`. The first commit used the tighter cap uniformly,
which silently regressed `format_fixed` by 1 char at
`precision == u16::MAX` (it previously capped at exactly that
value). Two helpers restore byte-identical CPython parity for
fixed / percent / general-non-scientific paths up through
`precision == u16::MAX`.
Verification:
* precision 5 .. 65534: 360 outputs byte-identical to CPython
across 8 magnitudes x 9 precisions x 5 types.
* precision == 65535: f / g / G / % now match CPython (0 diff).
e / E remain 1 char shorter — unavoidable
within the `u16::MAX - 1` exp cap.
* precision > 65535: output stops at cap; CPython emits full
padding — same design divergence as before.
* No panic regression: f-string default, e/E, g/G, %, f at
precision 1_000_000 all return cleanly.
* Test suite: test_float + test_fstring + test_format,
162 passed, 0 regressed.
| # crates/literal/src/float.rs and `crates/common/src/format.rs` (the `%` | ||
| # branch), which panic past Rust's fmt precision limit and killed the | ||
| # process instead of raising a Python exception. | ||
| _big = 1_000_000 |
There was a problem hiding this comment.
Because this patch set the boundary as pub const FMT_MAX_PRECISION: usize = u16::MAX as usize;, the test must cover 65535 and 65536 to ensure its behavior follows CPython
Per review comment on `extra_tests/snippets/builtin_format.py:209`:
the patch declares `FMT_MAX_PRECISION = u16::MAX`, so the tests must
cover 65535 and 65536 and demonstrate CPython parity at the boundary.
The previous version only avoided panic — at the cap it silently
truncated 1 char short of CPython for e / E, and thousands of chars
short for f / % at precision beyond the cap. This commit restores
byte-identical CPython output at every precision up to the format-
spec parser's own `i32::MAX` ceiling.
Fix: pad the Rust-format result with '0's up to the user-requested
precision.
Why this is correct, not a workaround: IEEE 754 double has at most
~767 significant decimal digits; past that, every digit is
deterministically '0' in both CPython and the native Rust output.
Our cap (65534 for exp, 65535 for plain) sits far above 767, so
appending zeros reconstructs precisely what CPython would have
produced. Verified on hard inputs: `1e-100`, `5e-324` (subnormal
boundary), `f64::MAX`, mixed magnitudes — the last 100 chars of
Rust-format output at precision 65534 are all '0' for every case.
Changes:
* `format_fixed`: after format!(), extend with (precision - capped)
'0' chars before appending the optional decimal point.
* `format_exponent`: same, applied to the parsed mantissa before
reassembling with the exponent marker.
* `FormatType::Percentage` branch: same. Also fixed a bug the
boundary audit surfaced: the finite-input overflow guard used
`return Ok("inf%")`, which bypasses the outer sign handler.
Changed to a match-arm value so `format_sign_and_align` still
runs and produces "-inf%" for `-f64::MAX`, matching CPython.
Verification:
* 7 magnitudes × 5 precisions × 6 format types = 210 comparisons
against CPython at precisions {65534, 65535, 65536, 100000,
200000}. All 210 byte-identical.
* Gap audit (complex formatting, old-style % formatting, negative
magnitudes, -0.0, combined specs with fill / sign / alternate /
grouping) at boundary precisions. All but 20 byte-identical.
The 20 remaining diffs all stem from a pre-existing
complex-imaginary-part repr bug (`1e100j` expands to 100 '0's
in RustPython vs CPython's `1e+100j`) which reproduces on
upstream main without any part of this patch and is out of
scope here.
* `cargo run -- -m test test_float test_fstring test_format`:
162 passed, 0 regressed.
* `extra_tests/snippets/builtin_format.py` now pins exact
expected strings at 65534 / 65535 / 65536 / 1_000_000 for
every format type, plus the `f64::MAX × 100 → 'inf%'`
overflow case.
* `cargo fmt --check`: pass.
| # f-format pads with trailing zeros up to the requested precision. | ||
| assert "{:.65534f}".format(1.5) == "1." + "5" + "0" * 65533 | ||
| assert "{:.65535f}".format(1.5) == "1." + "5" + "0" * 65534 | ||
| assert "{:.65536f}".format(1.5) == "1." + "5" + "0" * 65535 | ||
| # e-format emits a fixed mantissa width + 'e+00'. | ||
| assert "{:.65534e}".format(1.5) == "1." + "5" + "0" * 65533 + "e+00" | ||
| assert "{:.65535e}".format(1.5) == "1." + "5" + "0" * 65534 + "e+00" | ||
| assert "{:.65536e}".format(1.5) == "1." + "5" + "0" * 65535 + "e+00" | ||
| # %-format multiplies by 100 then applies f-format. | ||
| assert "{:.65534%}".format(1.5) == "150." + "0" * 65534 + "%" | ||
| assert "{:.65535%}".format(1.5) == "150." + "0" * 65535 + "%" | ||
| assert "{:.65536%}".format(1.5) == "150." + "0" * 65536 + "%" | ||
| # g-format strips trailing zeros, so the short form is the natural | ||
| # representation regardless of precision. | ||
| for p in (65534, 65535, 65536, 1_000_000): | ||
| assert ("{:." + str(p) + "g}").format(1.5) == "1.5" | ||
|
|
||
| # Percent overflow: finite input whose *100 is +inf produces "inf%" | ||
| # rather than crashing. CPython does the same. | ||
| assert "{:.100000%}".format(1.7976931348623157e308) == "inf%" | ||
|
|
||
| # Shallow cases unchanged. | ||
| assert f"{1.5:.5}" == "1.5" | ||
| assert "{:.3f}".format(1.5) == "1.500" | ||
| assert "{:.2%}".format(0.25) == "25.00%" | ||
| assert "{:.4e}".format(1234.5) == "1.2345e+03" | ||
| assert "{:.3g}".format(1234.5) == "1.23e+03" | ||
| assert f"{float('nan'):.10f}" == "nan" | ||
| assert f"{float('inf'):.10f}" == "inf" |
There was a problem hiding this comment.
which test is about unhappy cases, like exceeding the MAX_PRECISION?
Rename the boundary-test section so the three precision points per format type are labeled below / at / past the cap inline, making the "past MAX_PRECISION" unhappy-case coverage explicit. Add len-based assertions at precision 1_000_000 for f, e, and % to exercise the cap-then-pad path at a depth far beyond the boundary.
6b52bdb to
58c59d4
Compare
Summary
Formatting a float with a large precision aborts the interpreter instead of raising a Python exception. CPython returns a clean string on the same input.
Root cause
Rust's
format!("{:.*}", n, x)macro panics whennexceeds the fmt runtime's internal precision limit.format_fixedincrates/literal/src/float.rsalready capsnatu16::MAXbefore callingformat!, but its sibling functionsformat_generalandformat_exponent— and theFormatType::Percentagebranch incrates/common/src/format.rs— pass user-supplied precision straight through. A one-line user script (f"{x:.N}"with N ≳ 65535) triggers the abort.Affected format types:
f(fixed)e/E(exponential)g/G(general)%(percent)g)Fix
FMT_MAX_PRECISION+clamp_fmt_precision()helper at module level infloat.rs.u16::MAX - 1, notu16::MAX—{:.*e}hits a second assertion (ndigits > 0incore::num::flt2dec) at exactlyu16::MAX; the smaller value covers both{:.*}and{:.*e}uniformly.format_fixed(replacing existing ad-hoc cap — consistency only)format_exponent(new, at function entry)format_general(new, at each of three internalformat!calls, with saturating arithmetic on derived precision values)FormatType::Percentagebranch incommon/src/format.rs(new)Complex-number formatting and old-style
%-formatting dispatch to the same library functions, so they transitively benefit without separate changes.Why the cap is safe
f64 carries only ~17 significant decimal digits. Precision beyond ~17 produces padding zeros (for
f/e/%) or is silently trimmed (forg). Capping at ~65K is far beyond any user-meaningful precision and matches the existingformat_fixedbehavior already shipping inmain.Verification
Reliability audit (beyond the trigger path)
Probed after the fix:
0.0,±1.5,±inf,nan,1e-300,1e300,f64::MAX,5e-324. All return clean strings.'{:.0g}'.format(1.5) == '2', etc.)..200000e— all OK (transitively viaformat_exponent).%formatting:'%.200000e' % 1.5etc. — all OK (transitively viacformat.rs→format_fixed/format_exponent/format_general).i32::MAXwithValueError("Precision too big"), so the guarded interval[0, i32::MAX]is now panic-free end-to-end.Related
format!" approach:format_fixedalready did this; this PR extends the same pattern to its siblings.compile()#7630 (cyclic-ASTSIGSEGV→RecursionError), Fix stack overflow on deeply-nested JSON in json.loads() #7632 (deep-JSONSIGSEGV→RecursionError).Summary by CodeRabbit
Bug Fixes
Tests