Benchmarking R, Python, and Julia

27 January 2016

The folks at Simply Statistics recently emphasized the performance benefits of alternative R builds. The bottom line is that you should not use the default binaries provided by base R from CRAN; instead, consider either using the Microsoft R binary or building R from source yourself. The latter option is made easier on Mac OS X by Homebrew. Do one of these things, and your R programs will run much faster.

Out of curiosity, I replicated the profiling used by Simply Statistics on my computer for Microsoft R, Homebrew R, Python, and Julia. The specs below come from About This Mac and from running sessionInfo() in R.

  • MacBook Air, 8 GB memory
  • 2.2 GHz processor, 4 cores
  • R version 3.2.3 (2015-12-10)
  • Platform: x86_64-apple-darwin14.5.0 (64-bit)
  • Running under: OS X 10.10.5 (Yosemite)

The code to be run in R is:

system.time({ x <- replicate(5e3, rnorm(5e3)); tcrossprod(x) }) 

The equivalent in Python (run at the IPython interactive prompt) is:

from numpy import random
%timeit n = int(5e3); x = random.randn(n,n); x.dot(x.T)
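
Since %timeit only exists inside IPython, here is a minimal sketch of how the same measurement might be made from a plain Python script. The helper names crossprod and best_of are my own inventions for illustration, not part of NumPy or IPython, and the sketch assumes Python 3.3 or later for time.perf_counter (the timing reported below was made with Python 2.7.10 and %timeit, not with this code).

```python
import time


def crossprod(n):
    """Draw an n-by-n standard normal matrix and multiply it by its transpose."""
    # Imported here so the timing helper below works without NumPy installed.
    from numpy import random
    x = random.randn(n, n)
    return x.dot(x.T)


def best_of(func, repeats=3):
    """Return the shortest wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


# Example call for the matrix size used in this post (hypothetical usage):
# print(best_of(lambda: crossprod(int(5e3))))
```

Taking the minimum over a few repeats is one common way to reduce noise from other processes. It is a design choice of this sketch, not a claim about how %timeit summarizes its runs.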

And in Julia:

tic(); n = round(Int,5e3); x=randn(n,n); x*x'; toc()

Here are the results:

Platform          Elapsed time (seconds)
Microsoft R       5.3
Homebrew R        5.3
Python (2.7.10)   3.4
Julia (0.4)       2.6

As noted by Yihui Xie, the source-built R from Homebrew performs equivalently to the Microsoft version of R. Julia is the fastest, about twice as fast as either R build (2.6 versus 5.3 seconds), and Python falls between R and Julia.

Another (subjective) benchmark is how long it took me to figure out the timing code in each language. R was the fastest because I could copy and paste from the blog post. Python took a little longer because I couldn’t remember the syntax of the %timeit magic. Finally, it took me a good 20 minutes to work out the Julia syntax. This is biased by my heavier use of R and Python compared to Julia, but I think it illustrates that in small applications, the free performance improvement from a faster build of R might outweigh the even greater improvement from switching to a newer language, because of the human time needed to learn that language. Of course, for those of us who need to run many computationally intensive simulations, languages like Julia have a major appeal!
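
To put a rough number on that trade-off, here is a back-of-the-envelope sketch in Python. The function name break_even_runs is hypothetical, and the calculation assumes that the 20 minutes is the entire learning cost and that every future run looks like the benchmark above. Both are simplifications.

```python
def break_even_runs(learning_seconds, slow_seconds, fast_seconds):
    """Number of runs after which the time saved per run repays the learning time."""
    saved_per_run = slow_seconds - fast_seconds
    if saved_per_run <= 0:
        raise ValueError("the faster option must take less time per run")
    return learning_seconds / saved_per_run


# Figures from this post: 20 minutes of learning, 5.3 s (R) versus 2.6 s (Julia).
runs = break_even_runs(20 * 60, 5.3, 2.6)
print(round(runs))  # about 444 runs
```

Under those assumptions, the switch pays for itself only after a few hundred runs of this size, which fits the intuition that it matters mostly for heavy simulation work.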