Feature request: lightweight clear_data() for repeated collection without rerunning _init_for_start() #2139

@cyberthirst

Is your feature request related to a problem? Please describe.

For repeated measurement cycles (start/stop/extract/clear in a loop), Coverage.erase() is the only reliable reset API on Coverage itself. However, erase() sets _inited_for_start = False, so the next start() reruns _init_for_start(), which rebuilds Core, Collector, and InOrOut, creates a new CoverageData, and registers cleanup handlers again.

In our case (a fuzzer collecting branch/arc coverage thousands of times in no_disk mode), this repeated re-initialization is overhead: tracing configuration does not change between cycles, only collected data does.

Note: this request is not about keeping per-thread tracer instances alive across start()/stop(); Collector.start() recreates tracer instances each time by design.

Describe the solution you'd like

A public method like Coverage.clear_data() that:

  • Clears collected coverage data (SQLite rows / in-memory state)
  • Keeps existing Core/Collector/InOrOut wiring intact for the next start()
  • Does not set _inited_for_start = False
  • Avoids repeated cleanup-handler registration for every cycle
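
The intended semantics can be illustrated with a toy model. The sketch below is not coverage.py code: ToyCoverage, its counters, and run_cycles are hypothetical stand-ins that only mirror the names used in this issue (_inited_for_start, _init_for_start(), erase(), and the proposed clear_data()), assuming that start-time initialization is the expensive step.

```python
class ToyCoverage:
    """Hypothetical stand-in for Coverage; models only the init flag and data."""

    def __init__(self):
        self._inited_for_start = False
        self.init_count = 0   # how many times start-time init has run
        self._arcs = set()    # stands in for collected coverage data

    def _init_for_start(self):
        # Per the issue text, the real method builds Core/Collector/InOrOut;
        # this model only counts the calls.
        self.init_count += 1
        self._inited_for_start = True

    def start(self):
        if not self._inited_for_start:
            self._init_for_start()

    def record(self, arc):
        self._arcs.add(arc)

    def stop(self):
        pass

    def get_arcs(self):
        return set(self._arcs)

    def erase(self):
        # Current behavior as described in this issue: drop data AND force re-init.
        self._arcs.clear()
        self._inited_for_start = False

    def clear_data(self):
        # Proposed behavior: drop data only, keep start-time initialization.
        self._arcs.clear()


def run_cycles(cov, reset, cycles):
    """Run start/record/stop/reset `cycles` times; return the init count."""
    for i in range(cycles):
        cov.start()
        cov.record((i, i + 1))
        cov.stop()
        assert cov.get_arcs() == {(i, i + 1)}  # each cycle starts clean
        reset(cov)
    return cov.init_count


erase_inits = run_cycles(ToyCoverage(), ToyCoverage.erase, 1000)
clear_inits = run_cycles(ToyCoverage(), ToyCoverage.clear_data, 1000)
```

In this model the erase()-based loop re-initializes on every cycle, while the clear_data() loop initializes once. Both give each cycle a clean slate.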

Usage would be:

import coverage

cov = coverage.Coverage(branch=True, data_file=None, config_file=False)

for i in range(10000):
    cov.start()
    do_work()  # placeholder for the code under measurement
    cov.stop()
    data = cov.get_data()
    # ... extract arcs/lines from data ...
    cov.clear_data()  # proposed API: clear data, keep start-time initialization
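
For the extraction step in the loop, a minimal sketch follows. It relies on CoverageData.measured_files() and CoverageData.arcs(), which exist in coverage.py today. The helper name extract_arcs and the FakeData class are hypothetical and are used only so that the snippet runs without a real measurement.

```python
def extract_arcs(data):
    """Return {filename: set of (from_line, to_line)} from a CoverageData-like object."""
    result = {}
    for filename in data.measured_files():
        arcs = data.arcs(filename)  # may be None if nothing was recorded
        result[filename] = set(arcs or ())
    return result


class FakeData:
    """Hypothetical stand-in exposing the two CoverageData methods used above."""

    def __init__(self, arcs_by_file):
        self._arcs_by_file = arcs_by_file

    def measured_files(self):
        return set(self._arcs_by_file)

    def arcs(self, filename):
        return self._arcs_by_file.get(filename)


snapshot = extract_arcs(FakeData({"a.py": [(1, 2), (2, -1)], "b.py": None}))
```

Copying the arcs into plain sets before resetting means the snapshot holds its own values and does not depend on what happens to the data object afterwards.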

Describe alternatives you've considered

  1. Calling Coverage.erase() each cycle. This works functionally, but it reruns _init_for_start() on each iteration. In no_disk mode it also accumulates CoverageData/SQLite objects until process shutdown, because old data objects are retained so that _atexit can force-close them.
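
The accumulation described in this alternative can be modeled in a few lines. This is a hypothetical model, not coverage.py's actual code: ToyNoDiskCoverage, _retained, and close_all are invented names, and the only assumption taken from this issue is that each erase() leaves the previous in-memory data object alive until an at-exit cleanup closes it.

```python
import sqlite3


class ToyNoDiskCoverage:
    """Hypothetical model: each erase() makes a fresh in-memory DB and keeps the old one."""

    def __init__(self):
        self._data = sqlite3.connect(":memory:")
        self._retained = []  # old connections kept until shutdown

    def erase(self):
        # The old object is not closed here; it is only remembered so that
        # it can be force-closed later.
        self._retained.append(self._data)
        self._data = sqlite3.connect(":memory:")

    def close_all(self):
        # Stands in for the at-exit cleanup.
        for conn in self._retained:
            conn.close()
        self._retained.clear()
        self._data.close()


cov = ToyNoDiskCoverage()
for _ in range(500):
    cov.erase()
retained_before_exit = len(cov._retained)  # grows by one per cycle
cov.close_all()
```

In this model the number of retained objects grows linearly with the number of cycles, which is the behavior a data-only reset would avoid.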

Additional context

Our use case is a fuzzer that reuses one Coverage instance across thousands of iterations in no_disk mode during compilation. Each iteration compiles different code and we extract arcs, so we need a clean slate each cycle. Include/exclude rules, branch mode, and other tracing configuration are fixed; only coverage data should reset.

Labels: enhancement
