Skip to content

Counter.most_common() Performance Improvement for Small n #144127

@udaypatel1

Description

@udaypatel1

Feature or enhancement

Overview

collections.Counter.most_common() currently has two execution paths:

if n is None:
    return sorted(self.items(), key=itemgetter(1), reverse=True)
else:
    return heapq.nlargest(n, self.items(), key=itemgetter(1))

While this is correct and efficient in general, there are common cases (especially n == 1 or very small n) where heapq.nlargest introduces avoidable overhead (heap allocation, tuple comparisons, Python-level callbacks).

This proposal is to investigate and potentially introduce small fast paths for common n values, without changing semantics or public API behavior.


Counter.most_common(1) is a very common usage pattern (e.g., “find the mode”), but it still uses the full capability of heapq.nlargest.

For example:

c = Counter(data)
c.most_common(1)

This could potentially be implemented more efficiently via a single-pass max selection:

max(self.items(), key=itemgetter(1))

or an equivalent loop-based approach, avoiding heap construction entirely.

Similarly, for very small n, a specialized strategy may outperform heapq.nlargest, especially for large counters.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Linked PRs

Metadata

Metadata

Assignees

Labels

performancePerformance or resource usagestdlibStandard Library Python modules in the Lib/ directorytype-featureA feature request or enhancement

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions