How to Setup a Python Environment for Machine Learning
In this tutorial, you will learn how to set up a stable Python Machine Learning development environment. You’ll be able to get right down into the ML and never have to worry about installing packages ever again.
Setting up your Python environment for Machine Learning can be a tricky task. If you’ve never set up something like that before, you might spend hours fiddling with different commands trying to get the thing to work. But we just want to get right to the ML!
In this tutorial, you will learn how to set up a stable Python Machine Learning development environment. You’ll be able to get right down into the ML and never have to worry about installing packages ever again.
(1) Set up Python 3 and Pip
The first step is to install pip , a Python package manager:
sudo apt-get install python3-pip
Using pip, we’ll be able to install any Python package that’s indexed in the Python Package Index with a simple pip install your_package
. You’ll see soon how we use it to set up our virtual environment too.
Next, we’ll set Python 3 to be the default when running either the pip
or python
commands from command line. This makes using Python 3 easier and more convenient. If we didn’t do this, then if we wanted to use Python 3, we’d have to remember to type out pip3
and python3
every time!
To force Python 3 to be the default, we’re going to modify the ~/.bashrc
file. From the command line, execute the following command to view that file:
nano ~/.bashrc
Scroll on down to the # some more ls aliases section and add the following line:
alias python='python3'
Save the file and reload your changes:
source ~/.bashrc
Boom! Python 3 is now your default Python! You can run it with a simple python your_program
on the command line.
(2) Create a virtual environment
Now we’ll set up a virtual environment. In there, we’ll install all of the python packages that we need for Machine Learning.
We use virtual environments in order to separate our coding set ups. Imagine if at some point you wanted to do 2 different projects on your computer, which required different libraries of different versions. Having them all in the same working environment can be messy and you’ll likely run into the problem of conflicting library versions. Your ML code for project 1 needs version 1.0 of numpy
, but project 2 needs version 1.15. Yikes!
A virtual environment allows us to isolate our working areas to avoid those conflicts.
First, install the relevant packages:
sudo pip install virtualenv virtualenvwrapper
Once we have virtualenv and virtualenvwrapper installed, we’ll again need to edit our ~/.bashrc
file. Place these 3 lines right at the bottom and save it.
export WORKON_HOME=$HOME/.virtualenvs export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3 source /usr/local/bin/virtualenvwrapper.sh
Save the file and reload your changes:
source ~/.bashrc
Great! Now we can finally create our virtual environment like so:
mkvirtualenv ml
We’ve just created a virtual environment called ml
. To enter it, do this:
workon ml
Nice! Any library installations that you do while in the ml
virtualenv will be isolated in there and never conflict with any other environments! So whenever you wish to run code that depends on libraries installed in the ml
environment, enter into first with the workon
command and then run your code as normal.
If you need to exit the virtualenv, run this command:
deactivate
(3) Install Machine Learning libraries
Now we can install our ML libraries! We’ll go with the most commonly used ones:
- numpy: for any work with matrices, especially math operations
- scipy: scientific and technical computing
- pandas: data handling, manipulation, and analysis
- matplotlib: data visualisation
- scikit learn: machine learning
Here’s a simple trick to install all of those libraries in one quick shot! Create a requirements.txt
file and list all of the packages you wish to install like so:
numpy scipy pandas matplotlib scikit-learn
Once that’s done, just execute this command:
pip install -r requirements.txt
Voila! Pip will go ahead and install all of the packages listed in the file in one shot.
Congratulations, your environment is set up and you’re ready to do Machine Learning!
Like to learn?
Follow me on twitter where I post all about the latest and greatest AI, Technology, and Science!
Bio: George Seif is a Certified Nerd and AI / Machine Learning Engineer.
Original. Reposted with permission.
Related:
- Selecting the Best Machine Learning Algorithm for Your Regression Problem
- The 5 Clustering Algorithms Data Scientists Need to Know
- 5 Quick and Easy Data Visualizations in Python with Code