
gh-144191: Dataclasses single field ordering #144222

Open
whyvineet wants to merge 6 commits into python:main from
whyvineet:dataclasses-single-field-ordering

Conversation

@whyvineet

@whyvineet whyvineet commented Jan 25, 2026

Simplify single-field dataclass ordering comparisons
#144191
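For context, here is a minimal sketch (the class name `Point` is ours, not from the PR) of the comparison the `@dataclass(order=True)` machinery generates today versus the simplified form this PR proposes for single-field classes:

```python
# Hypothetical illustration of the two code-generation strategies.
from dataclasses import dataclass

@dataclass(order=True)
class Point:
    x: int

a, b = Point(1), Point(2)

# Current behaviour: the lone field is wrapped in a 1-tuple before comparing.
assert ((a.x,) < (b.x,)) == (a < b)

# Proposed simplification: compare the field directly, no tuple wrapping.
assert (a.x < b.x) == (a < b)
```

For ordinary values the two forms agree; the discussion below turns on whether they agree in every case.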

@whyvineet whyvineet requested a review from ericvsmith as a code owner January 25, 2026 15:33
@bedevere-app

bedevere-app Bot commented Jan 25, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

Member

@johnslavik johnslavik left a comment


Thanks! You can wait for the green light from Eric before going forward or add tests for this even now. I'd probably change test_1_field_compare. And of course a news entry, too.
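As a rough sketch of what extending that test could cover (this is not the actual CPython test suite, just an illustration of the shape), all six rich comparisons on a single-field ordered dataclass:

```python
# Hypothetical test sketch; class and test names are ours.
import unittest
from dataclasses import dataclass

@dataclass(order=True)
class C:
    x: int

class SingleFieldCompareTest(unittest.TestCase):
    def test_orderings(self):
        lo, hi = C(1), C(2)
        self.assertTrue(lo < hi)
        self.assertTrue(lo <= hi)
        self.assertTrue(hi > lo)
        self.assertTrue(hi >= lo)
        self.assertEqual(C(1), C(1))
        self.assertNotEqual(lo, hi)

if __name__ == "__main__":
    unittest.main()
```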

Comment thread on Lib/dataclasses.py (Outdated)
@bedevere-app

bedevere-app Bot commented Jan 26, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@whyvineet
Author

Hey @picnixz and @johnslavik, just a quick update: I ran a small local benchmark on a CPython 3.15 dev build to ensure this change doesn't cause any noticeable slowdown. In my tests, the match/case version was slightly faster in the single-field scenario and comparable in the multi-field case. Since this code executes only during class creation, I didn’t notice any adverse effects.

I’m happy to follow your advice on this. If you’d rather test it yourselves or revert to the if/else approach for clarity, just let me know, and I’ll update the PR accordingly.

@whyvineet
Author

Just for reference, here are the numbers I observed locally (CPython 3.15 dev build):

  • Single-field case: match/case ~8–9% faster than if len(flds) == 1
  • Multiple-field case: no meaningful difference (well under 1%)

These were consistent across repeated runs.

@picnixz
Member

picnixz commented Jan 29, 2026

Please share the benchmarking script and the way you ran it. Class creation matters when considering import time as well.

@whyvineet
Author

The script was executed using the freshly built interpreter. Each implementation (match/case vs if/else) is run back-to-back for a large number of iterations to reduce noise.

Script
import timeit
import sys
from dataclasses import dataclass

class MockField:
    def __init__(self, name):
        self.name = name

def _tuple_str(obj_name, flds):
    if not flds:
        return '()'
    return f'({",".join(f"{obj_name}.{f.name}" for f in flds)},)'

def match_impl(flds):
    match flds:
        case [single_fld]:
            self_expr = f"self.{single_fld.name}"
            other_expr = f"other.{single_fld.name}"
        case _:
            self_expr = _tuple_str("self", flds)
            other_expr = _tuple_str("other", flds)
    return self_expr, other_expr

def ifelse_impl(flds):
    if len(flds) == 1:
        self_expr = f"self.{flds[0].name}"
        other_expr = f"other.{flds[0].name}"
    else:
        self_expr = _tuple_str("self", flds)
        other_expr = _tuple_str("other", flds)
    return self_expr, other_expr

def run_benchmark():
    single_field = [MockField("value")]
    multi_field = [MockField("x"), MockField("y"), MockField("z")]
    iterations = 1_000_000

    # Just to make sure that I am using the local build
    print("Python:", sys.version)
    print("Executable:", sys.executable)
    print()

    print("Single-field case")
    print("match/case:", timeit.timeit(lambda: match_impl(single_field), number=iterations))
    print("if/else:", timeit.timeit(lambda: ifelse_impl(single_field), number=iterations))

    print("Multi-field case")
    print("match/case:", timeit.timeit(lambda: match_impl(multi_field), number=iterations))
    print("if/else:", timeit.timeit(lambda: ifelse_impl(multi_field), number=iterations))

    print("Real dataclass comparison (sanity check)")
    @dataclass(order=True)
    class A:
        x: int

    @dataclass(order=True)
    class B:
        x: int
        y: int
        z: int

    a1, a2 = A(1), A(2)
    b1, b2 = B(1, 2, 3), B(2, 3, 4)

    print("single-field:", timeit.timeit(lambda: a1 < a2, number=iterations))
    print("multi-field: ", timeit.timeit(lambda: b1 < b2, number=iterations))

if __name__ == "__main__":
    run_benchmark()

@whyvineet whyvineet requested a review from picnixz February 1, 2026 10:26
@picnixz
Member

picnixz commented Feb 1, 2026

Can you use pyperf instead of custom benchmarks? I am also interested in the stdev.

@whyvineet
Author

@picnixz, please excuse any rough edges here... I’m still getting familiar with pyperf, so I’m happy to adjust if I’m misusing it in any way.

PS D:\dev\cpython> PCbuild\amd64\python.exe benchmark.py --rigorous
.........................................
WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainly of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclasses single-field match/case: Mean +- std dev: 403 ns +- 37 ns
.........................................
WARNING: the benchmark result may be unstable
* the standard deviation (49.3 ns) is 11% of the mean (454 ns)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclasses single-field if/else: Mean +- std dev: 454 ns +- 49 ns
Script
import pyperf

class MockField:
    def __init__(self, name):
        self.name = name

def _tuple_str(obj_name, flds):
    if not flds:
        return '()'
    return f'({",".join(f"{obj_name}.{f.name}" for f in flds)},)'

def match_impl(flds):
    match flds:
        case [single_fld]:
            self_expr = f"self.{single_fld.name}"
            other_expr = f"other.{single_fld.name}"
        case _:
            self_expr = _tuple_str("self", flds)
            other_expr = _tuple_str("other", flds)
    return self_expr, other_expr

def ifelse_impl(flds):
    if len(flds) == 1:
        self_expr = f"self.{flds[0].name}"
        other_expr = f"other.{flds[0].name}"
    else:
        self_expr = _tuple_str("self", flds)
        other_expr = _tuple_str("other", flds)
    return self_expr, other_expr

runner = pyperf.Runner()
single_field = [MockField("value")]

runner.bench_func(
    "dataclasses single-field match/case",
    match_impl,
    single_field
)

runner.bench_func(
    "dataclasses single-field if/else",
    ifelse_impl,
    single_field
)

@picnixz
Member

picnixz commented Feb 1, 2026

I would like the results with the dataclass usage. And with more than just 3 fields. Is the interpreter built with PGO and LTO? Or is it a debug build?

@whyvineet
Author

@picnixz, here are the results with dataclass usage (I'm using a debug build).

match/case
PS D:\dev\cpython> PCbuild\amd64\python.exe benchmark.py --rigorous
.........................................
WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainly of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (1 field): Mean +- std dev: 530 us +- 47 us
.........................................
WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainly of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (5 fields): Mean +- std dev: 876 us +- 71 us
.........................................
WARNING: the benchmark result may be unstable
* the standard deviation (178 us) is 14% of the mean (1.28 ms)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (10 fields): Mean +- std dev: 1.28 ms +- 0.18 ms
if/else
PS D:\dev\cpython> PCbuild\amd64\python.exe benchmark.py --rigorous
.........................................
WARNING: the benchmark result may be unstable
* the standard deviation (61.7 us) is 12% of the mean (521 us)
* the maximum (813 us) is 56% greater than the mean (521 us)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (1 field): Mean +- std dev: 521 us +- 62 us
.........................................
WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainly of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (5 fields): Mean +- std dev: 835 us +- 57 us
.........................................
WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainly of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (10 fields): Mean +- std dev: 1.14 ms +- 0.07 ms
Script
import pyperf
from dataclasses import dataclass

def make_dataclass(n):
    namespace = {f"x{i}": int for i in range(n)}
    return dataclass(order=True)(
        type(f"C{n}", (), {"__annotations__": namespace})
    )

runner = pyperf.Runner()

runner.bench_func("dataclass creation (1 field)", make_dataclass, 1)
runner.bench_func("dataclass creation (5 fields)", make_dataclass, 5)
runner.bench_func("dataclass creation (10 fields)", make_dataclass, 10)

@picnixz
Member

picnixz commented Feb 2, 2026

Results on DEBUG builds are not relevant. Please use a PGO/LTO build.

@picnixz
Member

picnixz commented Feb 2, 2026

In addition can you use pyperf compare as well? And if possible remove the time needed for creating the type. We are only interested in the time needed by the decorator.

@whyvineet
Author

@picnixz, I ran pyperf compare_to ifelse_version.json matchcase_version.json (timing only the @dataclass(order=True) decorator) on PGO/LTO build and here are the results:

Benchmark hidden because not significant (3): dataclass decorator (1 field), dataclass decorator (5 fields), dataclass decorator (10 fields)

Geometric mean: 1.01x slower

@picnixz
Member

picnixz commented Mar 1, 2026

Arf, actually I don't know if we can make that change. It's a breaking change for dataclasses wrapping a single float, for instance math.nan. We always run into this kind of issue. Should dataclasses be semantically equivalent to tuples, or should dataclasses be compared pointwise? I remember we changed the pointwise checks to tuple checks.

So I think we need more discussion on the issue. Sorry. But for now, I'd suggest that we close this PR. It's not that it's bad, it's just that this can (once again) give surprises (in particular, I want to hear from more core devs).


If we make the change, it can only go into 3.15. People may have already adapted to the behavior that was introduced, so changing it again could make things worse for them (maybe, I don't know). At least, I would prefer to do so.
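To make the compatibility hazard concrete, a small illustration (not from the thread itself): tuple comparison short-circuits on object identity before calling `==`, while a direct field comparison does not, so a dataclass wrapping a single NaN answers `<=` differently under the two code-generation strategies.

```python
import math

nan = math.nan

# Tuple-based comparison (current generated code): tuple equality uses an
# identity shortcut, so the shared NaN object is treated as equal to itself,
# the 1-tuples compare equal, and <= is True.
assert (nan,) <= (nan,)
assert (nan,) == (nan,)

# Direct comparison (what the simplification would generate): NaN is neither
# equal to nor ordered against itself, so <= is False.
assert not (nan <= nan)
assert not (nan == nan)
```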
