# benchmark: counting uniques

While working on a Rosalind exercise recently, I came upon an interesting solution for a counting-uniques problem.

A canonical performant solution for a counting-uniques problem is to make a single traversal through the collection of elements and populate a hash table that maps from each unique element to a count of the number of times it appears in the collection.

The hash table then provides constant-time lookup for each count, yielding an $\mathcal{O}(n)$ asymptotic run time.

So at first glance this solution, which simply calls `list.count`

for each
unique element in the given list, seemed like a performance-naive one:

```
def count_uniques(list, elts_to_be_counted):
return list(map(lambda x: list.count(x), elts_to_be_counted))
```

This solution has an $\mathcal{O}(m n)$ runtime, where $m$ is the number of
unique elements in the list (in this case, passed in via the
`elts_to_be_counted`

parameter) and $n$ the number of total elements. (Hence
quadratic for a list where each element is unique.)

The naivete was (perhaps) all mine, however. From the accompanying discussion:

Interestingly, this is much faster (10x in my tests) than tracking the counts manually with a dictionary, even though the dictionary solution only iterates through the string once, rather than four times.

and:

count is a low level C method in python so it is bound to be faster than any loop.

Skeptical, I decided to benchmark it myself.

## Python’s `list.count(x)`

### Implementation

The following benchmarks our two competing implementations in Python. In the
blue corner, we have `test_count`

, which counts uniques using the built-in
`list.count`

function. In the red corner, the tersely named
`test_count_with_ht`

, which uses a hash table to do the same.

We test each method over 100 trials, each on a 1,000-element list, where each list element is a randomly selected integer.

```
import benchmark as bm
import random as rand
from collections import Counter
from collections import defaultdict
from sets import Set
class BenchmarkCount(bm.Benchmark):
each = 100
def setUp(self):
self.list_length = 1000
self.list = [rand.randint(1, 100)] * self.list_length
def test_count(self):
uniques = Set(self.list)
return list(map(lambda x: self.list.count(x), uniques))
def test_count_with_ht(self):
counts = defaultdict(lambda: 0)
for elt in self.list:
counts[elt] += 1
return list(map(lambda x: counts[x], counts.keys()))
def test_count_with_counter(self):
counts = Counter(self.list)
return list(map(lambda x: counts[x], counts.keys()))
```

### Results

Sure enough, `test_count`

wins—although not quite with the reported 10x
performance advantage, once the cost of generating the list of unique elements
is internalized:

```
name | rank | runs | mean | sd | timesBaseline
-------------------|------|------|-----------|-----------|--------------
count | 1 | 100 | 6.971e-05 | 2.83e-05 | 1.0
count with ht | 2 | 100 | 0.0001138 | 4.082e-05 | 1.63237456821
count with counter | 3 | 100 | 0.0003266 | 8.942e-05 | 4.68466773829
```

Still, something feels off about

so it is bound to be faster than any loop

Faster than *any* loop?
Really? I’m not so sure.
interpreting charitably: where *any loop* means ‘any single-traversal implementation in Python’

The economy gained by peforming computations in C may dominate over some range of $n$, but the asymptotic analysis should obtain eventually.

And indeed it does. Let’s see what happens if we replace

```
self.list = [rand.randint(1, 100)] * self.list_length
```

with

```
self.list = list(range(self.list_length))
```

that is, if we test the worst possible case, in which each element in the list is unique.

### Results: $\mathcal{O}\left(n^2\right)$ case

```
name | rank | runs | mean | sd | timesBaseline
-------------------|------|------|-----------|-----------|--------------
count with ht | 1 | 100 | 0.0004774 | 0.0001103 | 1.0
count with counter | 2 | 100 | 0.000598 | 0.0001219 | 1.25244197195
count | 3 | 100 | 0.01856 | 0.00281 | 38.8764132476
```

Here, the hash table and `Counter`

implementations perform markedly better, with
an almost 40x performance improvement. As expected, there’s a cross-over point
at which, if the input is bad enough, the practical performance advantage reaped
by letting C do the counting (even if with an inefficient algorithm) ceases to
be a net performance gain.

## Ruby’s `Enumerable#count`

Out of curiosity, I wanted to see if we’d see a similar effect in Ruby. `Enumberable#count`

is implemented in C,
after all. Does it too exhibit this pattern?

### Implementation

```
require 'benchmark/ips'
def count(list, uniques)
uniques.map { |entry| list.count(entry) }
end
def count_with_ht(list, uniques)
counts = list.each_with_object(Hash.new(0)) { |entry, cts| cts[entry] += 1 }
uniques.map { |entry| counts[entry] }
end
LENGTH = 1_000
LIST = (1..LENGTH).map { rand(100) }.freeze
UNIQUES = LIST.uniq.freeze
UNIQ_LIST = (1..LENGTH).entries.freeze
Benchmark.ips do |x|
x.report("with count") { count(LIST, UNIQUES) }
x.report("with hash table") { count_with_ht(LIST, UNIQUES) }
x.report("with count (n^2)") { count(UNIQ_LIST, UNIQ_LIST) }
x.report("with hash table (n^2)") { count_with_ht(UNIQ_LIST, UNIQ_LIST) }
end
```

### Results

Not so much:

```
Calculating -------------------------------------
with count 295.937 (± 6.8%) i/s - 1.479k
with hash table 5.419k (± 7.0%) i/s - 27.508k
with count (n^2) 29.980 (± 6.7%) i/s - 150.000
with hash table (n^2) 2.457k (±23.2%) i/s - 10.800k
```

In Ruby, the hash table implementation is more performant regardless of whether
or not it’s working on a collection of all-unique elements. That’s even after
giving the `Enumerable#count`

implementation an advantage by parametrizing the
list of unique elements rather than generating it within the method.