# benchmark: counting uniques

16 Mar 2015 - New York

While working on a Rosalind exercise recently, I came upon an interesting solution for a counting-uniques problem.

A canonical performant solution to a counting-uniques problem is to make a single traversal through the collection, populating a hash table that maps each unique element to the number of times it appears.

The hash table then provides constant-time lookup for each count, yielding an $\mathcal{O}(n)$ asymptotic run time.
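A minimal sketch of that canonical approach (the function name here is mine, for illustration):

```python
from collections import defaultdict

def count_uniques_ht(elements):
    # One pass to build the table: O(n) time, O(m) space for m uniques.
    counts = defaultdict(int)
    for elt in elements:
        counts[elt] += 1
    return counts

counts = count_uniques_ht(["a", "b", "a", "c", "a"])
# counts["a"] == 3, and any subsequent lookup is O(1)
```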

So at first glance this solution, which simply calls list.count for each unique element in the given list, seemed like a performance-naive one:

```python
def count_uniques(lst, elts_to_be_counted):
    return [lst.count(x) for x in elts_to_be_counted]
```


This solution has an $\mathcal{O}(m n)$ runtime, where $m$ is the number of unique elements in the list (passed in here via the elts_to_be_counted parameter) and $n$ is the total number of elements. (Hence quadratic for a list in which every element is unique.)
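The per-call cost is linear because list.count scans the whole list. A pure-Python equivalent of its behavior (an illustration, not CPython's actual implementation) makes the $\mathcal{O}(m n)$ total explicit:

```python
def count_linear(lst, x):
    # What lst.count(x) does, written out: one full O(n) scan per call.
    hits = 0
    for elt in lst:
        if elt == x:
            hits += 1
    return hits

data = [1, 2, 1, 3, 1]
assert count_linear(data, 1) == data.count(1)  # m such calls => O(m * n) total
```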

The naivete was (perhaps) all mine, however. From the accompanying discussion:

> Interestingly, this is much faster (10x in my tests) than tracking the counts manually with a dictionary, even though the dictionary solution only iterates through the string once, rather than four times.

and:

> count is a low level C method in python so it is bound to be faster than any loop.

Skeptical, I decided to benchmark it myself.

### Python’s list.count(x)

#### Implementation

The following benchmarks our two competing implementations in Python. In the blue corner, we have test_count, which counts uniques using the built-in list.count function. In the red corner, the tersely named test_count_with_ht, which uses a hash table to do the same.

We run each method for 100 trials, each on a 1,000-element list of randomly selected integers between 1 and 100.

```python
import benchmark as bm
import random as rand

from collections import Counter
from collections import defaultdict


class BenchmarkCount(bm.Benchmark):
    each = 100

    def setUp(self):
        self.list_length = 1000
        self.list = [rand.randint(1, 100) for _ in range(self.list_length)]

    def test_count(self):
        uniques = set(self.list)
        return list(map(lambda x: self.list.count(x), uniques))

    def test_count_with_ht(self):
        counts = defaultdict(int)
        for elt in self.list:
            counts[elt] += 1
        return list(map(lambda x: counts[x], counts.keys()))

    def test_count_with_counter(self):
        counts = Counter(self.list)
        return list(map(lambda x: counts[x], counts.keys()))
```


#### Results

Sure enough, test_count wins—although not quite with the reported 10x performance advantage, once the cost of generating the list of unique elements is internalized:

```
              name | rank | runs |      mean |        sd | timesBaseline
-------------------|------|------|-----------|-----------|--------------
             count |    1 |  100 | 6.971e-05 |  2.83e-05 |           1.0
     count with ht |    2 |  100 | 0.0001138 | 4.082e-05 | 1.63237456821
count with counter |    3 |  100 | 0.0003266 | 8.942e-05 | 4.68466773829
```


> so it is bound to be faster than any loop

[interpreting charitably: where loop == ‘single-traversal implementation in Python’].

Faster than any loop? Really? I’m not so sure.

The economy gained by performing computations in C may dominate over some range of $n$, but the asymptotic analysis should obtain eventually.

And indeed it does. Let’s see what happens if we replace

```python
self.list = [rand.randint(1, 100) for _ in range(self.list_length)]
```


with

```python
self.list = list(range(self.list_length))
```


that is, if we test the worst possible case, in which each element in the list is unique.

#### Results: $\mathcal{O}\left(n^2\right)$ case

```
              name | rank | runs |      mean |        sd | timesBaseline
-------------------|------|------|-----------|-----------|--------------
     count with ht |    1 |  100 | 0.0004774 | 0.0001103 |           1.0
count with counter |    2 |  100 |  0.000598 | 0.0001219 | 1.25244197195
             count |    3 |  100 |   0.01856 |   0.00281 | 38.8764132476
```


Here, the hash table and Counter implementations perform markedly better, with an almost 40x performance improvement. As expected, there’s a cross-over point at which, if the input is bad enough, the practical performance advantage reaped by letting C do the counting (even if with an inefficient algorithm) ceases to be a net performance gain.
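The cross-over can be probed directly with a quick sweep (a sketch of my own, separate from the benchmark above, with helper names invented for illustration): fix the list length and vary how many distinct values it contains.

```python
import random
import timeit

def count_builtin(lst, uniques):
    # m calls into the C-implemented list.count: O(m * n) overall.
    return [lst.count(x) for x in uniques]

def count_ht(lst, uniques):
    # Single pass in Python to build the table: O(n) overall.
    counts = {}
    for elt in lst:
        counts[elt] = counts.get(elt, 0) + 1
    return [counts[x] for x in uniques]

n = 1000
for distinct in (10, 100, 1000):
    lst = [random.randrange(distinct) for _ in range(n)]
    uniques = list(set(lst))
    t_builtin = timeit.timeit(lambda: count_builtin(lst, uniques), number=100)
    t_ht = timeit.timeit(lambda: count_ht(lst, uniques), number=100)
    print(f"{distinct:>5} distinct values: count {t_builtin:.4f}s, hash table {t_ht:.4f}s")
```

As the number of distinct values grows, the C-side advantage per scan is overwhelmed by the growing number of scans.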

### Ruby’s Enumerable#count

Out of curiosity, I wanted to see whether we'd observe a similar effect in Ruby. Enumerable#count is implemented in C, after all. Does it too exhibit this pattern?

#### Implementation

```ruby
require 'benchmark/ips'

def count(list, uniques)
  uniques.map { |entry| list.count(entry) }
end

def count_with_ht(list, uniques)
  counts = list.each_with_object(Hash.new(0)) { |entry, cts| cts[entry] += 1 }
  uniques.map { |entry| counts[entry] }
end

LENGTH    = 1_000
LIST      = (1..LENGTH).map { rand(100) }.freeze
UNIQUES   = LIST.uniq.freeze
UNIQ_LIST = (1..LENGTH).entries.freeze

Benchmark.ips do |x|
  x.report("with count")      { count(LIST, UNIQUES) }
  x.report("with hash table") { count_with_ht(LIST, UNIQUES) }

  x.report("with count (n^2)")      { count(UNIQ_LIST, UNIQ_LIST) }
  x.report("with hash table (n^2)") { count_with_ht(UNIQ_LIST, UNIQ_LIST) }
end
```


#### Results

Not so much:

```
Calculating -------------------------------------
           with count    295.937  (± 6.8%) i/s -      1.479k
      with hash table      5.419k (± 7.0%) i/s -     27.508k
     with count (n^2)     29.980  (± 6.7%) i/s -    150.000
with hash table (n^2)      2.457k (±23.2%) i/s -     10.800k
```


In Ruby, the hash table implementation is more performant regardless of whether it's working on a collection of all-unique elements. That's even after giving the Enumerable#count implementation a head start by parametrizing the list of unique elements rather than generating it within the method.