Counting Sort
BasisPrerequisites
Every sort you have seen so far — bubble, selection, insertion, merge — works by comparing pairs of elements. It can be proved that any comparison-based sort requires at least comparisons in the worst case. Counting sort sidesteps this bound entirely by not comparing elements at all.
The comparison lower bound (in brief)
A comparison sort can be modelled as a binary decision tree: each internal node is a comparison, each leaf is a possible sorted order. For elements there are possible orderings, so the tree must have at least leaves. A binary tree with leaves has height at least . Hence any comparison sort uses comparisons — always.
Counting sort avoids this bound by using the values themselves as indices, never comparing them.
When counting sort applies
Counting sort requires two things:
- Integer keys — or keys that can be mapped to integers.
- A known, small range — the number of distinct key values, , must be small relative to .
If , then counting sort runs in time. If is large (say, sorting 32-bit integers with ), the counts array itself would dwarf the input.
The three phases
Counting sort works in three passes over the data.
Phase 1: count
Allocate a counts array of length and initialise it to zero. Walk the input once, incrementing counts[x - min] for each element x.
Phase 2: prefix sum
Transform counts in-place so that counts[i] holds the number of elements with key . This tells you the last valid output position for each key.
Phase 3: place
Walk the input array in reverse and write each element into the output at position counts[x - min] - 1, then decrement counts[x - min]. Reversing ensures stability.
Zig implementation
const std = @import("std");
/// Sorts arr in-place. Every value in arr must be in [min_val, max_val].
fn countingSort(
arr: []i64,
min_val: i64,
max_val: i64,
allocator: std.mem.Allocator,
) !void {
const range: usize = @intCast(max_val - min_val + 1);
// Phase 1: count occurrences.
const counts = try allocator.alloc(usize, range);
defer allocator.free(counts);
@memset(counts, 0);
for (arr) |x| counts[@intCast(x - min_val)] += 1;
// Phase 2: prefix sum — counts[i] becomes the number of elements ≤ min_val+i.
for (1..range) |i| counts[i] += counts[i - 1];
// Phase 3: place elements into the output in reverse order (preserves stability).
const out = try allocator.alloc(i64, arr.len);
defer allocator.free(out);
var i = arr.len;
while (i > 0) {
i -= 1;
const key: usize = @intCast(arr[i] - min_val);
counts[key] -= 1;
out[counts[key]] = arr[i];
}
@memcpy(arr, out);
}
Running example
Sort [4, 2, 2, 8, 3, 3, 1] with min_val = 1, max_val = 8, so range = 8.
After Phase 1 (counts indexed 0–7, representing keys 1–8):
key: 1 2 3 4 5 6 7 8
count: 1 2 2 1 0 0 0 1
After Phase 2 (prefix sums):
key: 1 2 3 4 5 6 7 8
count: 1 3 5 6 6 6 6 7
counts[3] = 6 means there are 6 elements with key ≤ 4, so the last 4 must go into output position 5 (0-indexed).
Phase 3 (traverse input right-to-left):
arr[6] = 1:counts[0]becomes 0, place at output[0].arr[5] = 3:counts[2]becomes 4, place at output[4].arr[4] = 3:counts[2]becomes 3, place at output[3].- … and so on.
Final output: [1, 2, 2, 3, 3, 4, 8]. ✓
Why reverse traversal ensures stability
Two elements with the same key arrive in the output in reverse input order if you traverse forwards — later duplicates land first. Traversing in reverse processes the last duplicate first, which decrements counts[key] to the last available slot for that key, and the first duplicate ends up in the first slot. Equal elements therefore appear in the output in the same order they appeared in the input: stable.
Complexity
| Metric | Complexity |
|---|---|
| Time | |
| Extra space | |
| Stable | Yes |
| Comparison-based | No |
The terms come from initialising and scanning the counts array; the terms come from the three linear passes over the input.
When , the total is — linear. When , the algorithm degrades: the counts array is mostly zeros and you spend most of your time initialising it.
When not to use counting sort
- Large key range: sorting arbitrary 32-bit signed integers requires entries — a 32 GB counts array. Not practical.
- Non-integer keys: floating-point or string keys cannot be used as direct indices. Use a comparison-based sort instead.
- Unknown range: if
min_valandmax_valare not known in advance, you need an extra pass to find them (still , but adds complexity).
Summary
- Counting sort counts occurrences, computes prefix sums to find output positions, then places each element in its correct slot.
- It is not comparison-based and therefore beats the lower bound.
- Time and space are both , where is the key range.
- It is stable: reverse traversal in Phase 3 ensures equal elements keep their original order.
- It is practical when ; it is impractical when is large or keys are not integers.
- Radix sort uses counting sort as a stable subroutine to sort integer keys of any size digit by digit.