Reject format-string field index above Py_ssize_t::MAX by changjoon-park · Pull Request #7708 · RustPython/RustPython

changjoon-park · 2026-04-27T20:56:06Z

Background

CPython rejects digit-only format-string field names that overflow Py_ssize_t at parse time with ValueError: Too many decimal digits in format string (Python/string_parser.c::get_integer). RustPython's FieldName::parse accepted any digit string usize::from_str could parse:

Values within usize but above isize::MAX → wrong exception at lookup (IndexError: tuple index out of range)
Values above usize::MAX → fell through to Keyword, then KeyError

Repro

import sys
fs = '{{{}:.6f}}'.format(sys.maxsize + 1)  # '{9223372036854775808:.6f}'

fs.format(2.34)
# CPython 3.14:    ValueError: Too many decimal digits in format string
# RustPython:      IndexError: tuple index out of range  (before this PR)

Fix

In FieldName::parse, after parse_usize returns the digit-only index, reject values above isize::MAX (matching CPython's Py_ssize_t::MAX). When parse_usize itself fails on a digits-only string (overflowing usize), raise the same error. A new FormatParseError::TooManyDecimalDigits maps to the byte-identical CPython wording.

The cap is isize::MAX as usize, which equals Py_ssize_t::MAX on every platform (i64 on 64-bit, i32 on 32-bit). Values up to isize::MAX parse successfully and only fail at lookup, matching CPython.

Nested [index] field-name parts ('{0[huge]}') have the same shape but no upstream regression test; left for a follow-up.

Tests unmasked

test_str.StrTest.test_format_huge_item_number

Verification

CPython 3.14.4 byte-identical for boundary cases:

{2147483647} (i32::MAX): IndexError (parse OK)
{2147483648} (i32::MAX+1): IndexError (parse OK — not band-aid)
{9223372036854775807} (isize::MAX): IndexError (parse OK)
{9223372036854775808} (isize::MAX+1): ValueError
{99999999999999999999...} (usize overflow): ValueError

Normal field names unchanged: {0}, {abc}, {0[1]}, {abc1} (digit in keyword), {007} (leading zeros), {0.attr}, dynamic '{0:{1}}'.format('x', 5).

No regressions across test_str, test_format, test_fstring.

Summary by CodeRabbit

Bug Fixes
- Enhanced validation for numeric field names in format strings to detect values exceeding parsing limits and provide more precise error messaging when invalid format specifications are encountered.

CPython rejects digit-only format-string field names that overflow Py_ssize_t at parse time with ValueError: Too many decimal digits in format string (Python/string_parser.c::get_integer). RustPython's FieldName::parse accepted any digit string usize::from_str could parse, producing IndexError or KeyError at lookup instead. Cap the parsed index at isize::MAX (Py_ssize_t::MAX on every platform) inside FieldName::parse. Also reject digits-only strings whose value overflows usize itself (caught when parse_usize returns None on an all-digit input). A new FormatParseError::TooManyDecimalDigits maps to the byte-identical CPython wording. Unmasks test_str.StrTest.test_format_huge_item_number.

coderabbitai · 2026-04-27T20:56:21Z

📝 Walkthrough

Walkthrough

Adds a new TooManyDecimalDigits error variant to FormatParseError enum. Updates FieldName::parse in the common format module to detect when digit-only field segments exceed integer limits and report them with this new error type. The VM layer's exception conversion is extended to map this error variant to a ValueError.

Changes

Cohort / File(s)	Summary
Format parsing error handling `crates/common/src/format.rs`	Introduces `TooManyDecimalDigits` enum variant to `FormatParseError`. Updates `FieldName::parse` logic to validate that parsed numeric values don't exceed `isize::MAX` and to handle `usize` parsing overflow for all-digit segments as `TooManyDecimalDigits` instead of falling back to `Keyword` type.
VM exception conversion `crates/vm/src/format.rs`	Extends `FormatParseError` to Python exception conversion to handle the new `TooManyDecimalDigits` variant, returning a `ValueError` with an appropriate error message.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested reviewers

ShaharNaveh
youknowone

Poem

🐰 Whiskers twitching with delight,
The digits dance beyond their height,
Too many zeros, now they say,
"You've overflowed in every way!"
A new error hops into place, ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title directly and specifically describes the main change: rejecting format-string field indices that exceed Py_ssize_t::MAX, which is the core objective of this changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-04-27T20:56:57Z

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] test: cpython/Lib/test/test_str.py (TODO: 14)
[x] test: cpython/Lib/test/test_fstring.py (TODO: 19)
[x] test: cpython/Lib/test/test_string_literals.py (TODO: 4)

dependencies:

dependent tests: (no tests depend on str)

Legend:

[+] path exists in CPython
[x] up-to-date, [ ] outdated

coderabbitai

🧹 Nitpick comments (1)

crates/common/src/format.rs (1)
1186-1204: Logic correctly mirrors CPython's get_integer and handles both overflow regimes.

The two-stage check is correct:

If parse_usize succeeds, values in (isize::MAX, usize::MAX] are now rejected (this regime is otherwise reachable on 64-bit).

If parse_usize fails on an all-ASCII-digit segment, it's a usize overflow and is also rejected.

Non-digit inputs (-5, +5 with current Rust semantics, leading whitespace, non-UTF-8 wtf8) still fall through to FieldType::Keyword exactly as before, so no regression.

Optional nit: consider adding unit tests next to test_parse_field_name (line 1729) for the three boundaries — isize::MAX (Index), isize::MAX + 1 (TooManyDecimalDigits), and a usize-overflowing digits-only string (TooManyDecimalDigits) — to lock the behavior in at this layer rather than relying solely on the unmasked test_format_huge_item_number.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/common/src/format.rs` around lines 1186 - 1204, The parsing branch
correctly enforces two overflow regimes but lacks explicit unit tests locking
this behavior; add tests near test_parse_field_name that assert
parse_usize/field-name parsing returns FieldType::Index for isize::MAX, and
returns FormatParseError::TooManyDecimalDigits for both isize::MAX + 1 and for a
digits-only string that would overflow usize (to cover the second overflow
branch that triggers the early error), referencing the existing parse logic
around parse_usize, FieldType::Index, FormatParseError::TooManyDecimalDigits and
FieldType::Keyword so future changes don’t regress this behavior.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@crates/common/src/format.rs`:
- Around line 1186-1204: The parsing branch correctly enforces two overflow
regimes but lacks explicit unit tests locking this behavior; add tests near
test_parse_field_name that assert parse_usize/field-name parsing returns
FieldType::Index for isize::MAX, and returns
FormatParseError::TooManyDecimalDigits for both isize::MAX + 1 and for a
digits-only string that would overflow usize (to cover the second overflow
branch that triggers the early error), referencing the existing parse logic
around parse_usize, FieldType::Index, FormatParseError::TooManyDecimalDigits and
FieldType::Keyword so future changes don’t regress this behavior.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 591ff054-6b9c-455a-9866-62aa6b8606f6

📥 Commits

Reviewing files that changed from the base of the PR and between 1d42ee5 and 455519f.

⛔ Files ignored due to path filters (1)

Lib/test/test_str.py is excluded by !Lib/**

📒 Files selected for processing (2)

crates/common/src/format.rs
crates/vm/src/format.rs

ShaharNaveh

🔥

changjoon-park · 2026-04-27T21:41:19Z

🔥

🔥🔥🔥

coderabbitai Bot reviewed Apr 27, 2026

View reviewed changes

ShaharNaveh approved these changes Apr 27, 2026

View reviewed changes

youknowone approved these changes Apr 28, 2026

View reviewed changes

youknowone merged commit 68aece5 into RustPython:main Apr 28, 2026
21 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reject format-string field index above Py_ssize_t::MAX#7708

Reject format-string field index above Py_ssize_t::MAX#7708
youknowone merged 1 commit intoRustPython:mainfrom
changjoon-park:fix-format-huge-field-index

changjoon-park commented Apr 27, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 27, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Apr 27, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

ShaharNaveh left a comment

Uh oh!

changjoon-park commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

changjoon-park commented Apr 27, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Background

Repro

Fix

Tests unmasked

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Apr 27, 2026

📦 Library Dependencies

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

ShaharNaveh left a comment

Choose a reason for hiding this comment

Uh oh!

changjoon-park commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

changjoon-park commented Apr 27, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 27, 2026 •

edited

Loading