EC2 AMI for scientific computing in Python and R
By Drew Conway, on April 11th, 2011
Like many people who crunch numbers frequently, I have increasingly been integrating Amazon's cloud computing services into my daily workflow. In particular, I have been using their Elastic Compute Cloud (EC2) on a regular basis. The service is an excellent way to offload computationally intensive work from your laptop for pennies on the dollar.
One drawback I have found, however, is that there are no obvious pre-configured images, called AMIs, designed for scientific computing in the languages I use most: Python and R. The best public AMI I could find was an Ubuntu 10 image provided by the good people at MIT's STARDEV Project, which includes several useful pre-installed libraries and optimized versions of the core scientific Python libraries. This AMI is great, but it was still missing several Python packages I use on a regular basis (NetworkX, scikits.learn, sympy, etc.), and it had an old version of R with only the base packages installed. This would simply not do.
Thus began the odyssey of modifying the StarCluster AMI to more fully support scientific computing in Python and R. I have now uploaded and made public the resulting image, which includes several hundred Python and R packages for scientific computing, statistics, machine learning, data mining and visualization.
You can access it directly with the AMI ID:
AMI ID: ami-84bd41ed
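For readers who would rather launch the image from a script than from the AWS console, here is a minimal sketch using the boto3 library. It is not part of the original setup, and several details are assumptions: the region (AMI IDs are region-specific, and "us-east-1" is only a guess), the instance type, and the key pair name are all placeholders you would need to replace. The helper names build_launch_params and launch_instance are hypothetical, chosen for illustration.

```python
def build_launch_params(ami_id, instance_type, key_name=None):
    """Build the keyword arguments for an EC2 run_instances call.

    This is a pure function: it makes no AWS calls, so it can be
    inspected or tested without credentials.
    """
    params = {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,  # launch exactly one instance
        "MaxCount": 1,
    }
    if key_name is not None:
        # Name of an existing EC2 key pair, needed for SSH access.
        params["KeyName"] = key_name
    return params


def launch_instance(ami_id, instance_type, region_name, key_name=None):
    """Launch one instance from the given AMI and return its instance ID.

    Requires boto3 and configured AWS credentials; running it will
    incur charges on your account.
    """
    import boto3  # imported here so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    response = ec2.run_instances(
        **build_launch_params(ami_id, instance_type, key_name)
    )
    return response["Instances"][0]["InstanceId"]


if __name__ == "__main__":
    # Placeholder values: the instance type, region and key pair name
    # are assumptions, not details taken from the post.
    print(build_launch_params("ami-84bd41ed", "m1.small", key_name="my-key"))
    # To actually launch (costs money, needs credentials):
    # launch_instance("ami-84bd41ed", "m1.small", "us-east-1", key_name="my-key")
```

Keeping the parameter construction separate from the API call makes it easy to check what will be sent to AWS before anything is launched.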