I’m working through the lessons in the free online version of Allen B. Downey’s Think Stats. One thing I like to do whenever I write code is test it, so almost as soon as I began the exercises, I added a module called testwell.
Since performance is often a consideration with statistical computing, my testing module includes a basic timing function called timeit. The Python standard library includes a timeit module, but I don’t find it very friendly: it evaluates code strings in its own separate namespace, so you can’t simply hand it a function defined in your current environment. I just want a timing function that can time comparable pieces of code in the current environment. Here’s my timeit function:
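To illustrate what I mean, here’s a minimal sketch of using the stdlib version — the statement and its setup both have to be passed as strings (the `sum(data)` expression is just a stand-in operation for illustration):

```python
import timeit

# The stdlib timeit evaluates code strings in its own namespace, so anything
# the statement needs must be handed over via a setup string -- you can't
# just reference names from the enclosing scope:
elapsed = timeit.timeit('sum(data)', setup='data = range(100)', number=1000)
print('%.6f seconds for 1000 runs' % elapsed)
```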
```python
import time

def timeit(f, *args, **kw):
    """Time a function over n trials:

        f1 = lambda a: a * a
        def f2(a, b): return f1(a) + f1(b)
        def f3(): f2(10, 20)

    USAGE:
        t1 = timeit(f1, a, n=1000)
        t2 = timeit(f2, a, b, n=1000)
        t3 = timeit(f3, n=1000)
        pprint([t1, t2, t3])
    """
    n = kw.get('n', 100)
    print 'timing %s over %s trials' % (f, n)
    t0 = time.time()
    for i in range(n):
        f(*args)
    total_time = time.time() - t0
    per_trial = total_time / n
    return '%.2f (%s trials at %.6f per trial)' % (total_time, n, per_trial)
```
It takes a function as its first argument, followed by any positional arguments to pass to that function, and an n keyword argument specifying how many times to run it.
The pattern I like best is to enclose the operation you wish to time in a zero-argument function, then just pass that along with the n keyword:
```python
import pprint

num_trials = 100000
f1 = lambda: Percentile(scores, 50)
f2 = lambda: iPercentile(scores, 50)

t1 = timeit(f1, n=num_trials)
t2 = timeit(f2, n=num_trials)
pprint.pprint([t1, t2])

# output
timing <function at 0x8950bc4> over 100000 trials
timing <function at 0x8950bfc> over 100000 trials
['0.86 (100000 trials at 0.000009 per trial)',
 '0.38 (100000 trials at 0.000004 per trial)']
```
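As an aside, if you’d rather stay with the standard library: as of Python 2.6, timeit.timeit also accepts a callable instead of a code string, which sidesteps the setup-string dance for this kind of comparison. A minimal sketch, with `sum(range(100))` standing in for whatever you’re timing:

```python
import timeit

# timeit.timeit also accepts a callable, so a zero-argument lambda that
# closes over the current environment works directly -- no setup strings.
# sum(range(100)) is just a stand-in operation for illustration.
f = lambda: sum(range(100))
elapsed = timeit.timeit(f, number=1000)
print('%.2f (%s trials at %.6f per trial)' % (elapsed, 1000, elapsed / 1000))
```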