An Introduction to Scientific Python (and a Bit of the Maths Behind It) – Matplotlib
An introductory overview of Matplotlib, one of the foundational aspects of Scientific Computing in Python, along with some explanation of the maths involved.
By Jamal Moir, Oxford Brookes University.
One of the most popular uses for Python, especially in recent years, is data processing, analysis and visualisation. This leads on to topics such as the analysis of 'big data', which has applications in pretty much every type of business you can imagine, and to a personal interest of mine: Machine Learning.
Python has a vast array of powerful tools available to help with this processing, analysis and visualisation of data, which is one of the main reasons that Python has gained such momentum in the scientific world.
In this series of posts, we will take a look at the main libraries used in scientific Python and learn how to use them to bend data to our will. We won't just be learning to churn out template code, however; we will also learn a bit of the maths behind it so that we can understand what is going on a little better.
So let's kick things off with an incredibly useful little number that we will be using throughout this series of posts: Matplotlib.
What is Matplotlib?
Simply put, it's a graphing library for Python. It has a humongous array of tools that you can use to create anything from simple scatter plots, to sine curves, to 3D graphs. It is used heavily in the scientific Python community for data visualisation. You can read more about the ideas behind Matplotlib on its website, but I especially recommend taking a look at the gallery there to see the amazing things you can pull off with this library.
Plotting a Simple Graph
To get started we will plot a simple sine wave from 0 to 2 pi. You will notice that we are using NumPy here; don't worry too much about it for now if you don't know how to use it, as we will be covering NumPy in the next post.
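A minimal sketch of the imports, assuming the conventional aliases plt and np that the calls quoted later in this post (plt.show() and np.linspace) rely on:

```python
# Import each library under a short alias instead of using 'from x import *'.
import matplotlib.pyplot as plt
import numpy as np
```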
These are the imports we will be using. As I've mentioned in a previous post (and others), the 'from x import *' way of importing is not good. We don't want to be typing out matplotlib.pyplot and numpy all the time though, as they are long, so as a compromise we will import them under the short aliases plt and np.
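A sine plot along these lines might look like the following sketch. The variable name x and the np.linspace call come from the description in this section; storing the result of plt.plot in a variable called lines is an assumption made purely for illustration, so that the plotted data can be inspected afterwards.

```python
import matplotlib.pyplot as plt
import numpy as np

# 50 evenly spaced values from 0 to 2 pi (both endpoints included).
x = np.linspace(0, 2 * np.pi, 50)

# Plot sin(x) against x; plt.plot returns a list of the lines it created.
lines = plt.plot(x, np.sin(x))

# Display the graph.
plt.show()
```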
The above code will produce a simple sine curve. The 'np.linspace(0, 2 * np.pi, 50)' bit of code produces an array of 50 evenly spaced numbers from 0 to 2 pi.
The plot command is the short and sweet line of code that actually creates the graph. Note that without the first x argument used here, the x axis would not go from 0 to 2 pi; it would instead use the array indices of the plotted values (0 to 49 in this case).
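To see the difference the x argument makes, the sketch below plots the same data with and without it and reads back the x values Matplotlib stored for each line. The names y, with_x and without_x are illustrative only; get_xdata() is the Line2D method for retrieving a line's x values.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x)

# With x supplied, the x axis runs over the values in x (0 to 2 pi).
(with_x,) = plt.plot(x, y)

# With only y supplied, Matplotlib uses the indices 0, 1, ..., 49 instead.
(without_x,) = plt.plot(y)

print(with_x.get_xdata()[-1])     # last x value: 2 pi, roughly 6.283
print(without_x.get_xdata()[-1])  # last x value: the final index, 49
```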
The final bit of code, plt.show(), displays the graph; without it nothing will appear.
You will get something like this:
Plotting Two Datasets on One Graph
A lot of the time you will want to plot more than one dataset on a graph. In Matplotlib this is simple.
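A minimal sketch of plotting sin(x) and sin(2x) in a single call, reusing the assumed plt and np aliases and the x array from the previous section (the lines variable is again only there for illustration):

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)

# Two (x, y) pairs in a single plt.plot call give two lines on one graph.
lines = plt.plot(x, np.sin(x),
                 x, np.sin(2 * x))

plt.show()
```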
The above code plots the graphs of both sin(x) and sin(2x). It is pretty much the same as the previous code for plotting one dataset, except this time, inside the same plt.plot() call, we define another dataset separated by a comma. You will end up with a graph with two lines on it, like this:
Customising the Look of the Lines
When you have multiple datasets on one graph, it is useful to be able to change the look of the plotted lines to make it easier to tell the datasets apart.
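A minimal sketch of styling each line with a format string passed after its (x, y) pair. The format strings 'r-o' and 'g--' are the ones discussed in this section; the choice of sin(x) and sin(2x) as the two datasets is an assumption carried over from the previous example.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)

# 'r-o': red, solid line, circle markers.  'g--': green, dashed line.
lines = plt.plot(x, np.sin(x), 'r-o',
                 x, np.sin(2 * x), 'g--')

plt.show()
```

Each format string combines a colour letter with optional line and marker symbols from the lists in this section.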
In the above code you can see two examples of different line stylings: 'r-o' and 'g--'. The letters 'r' and 'g' are the line colours, and the symbols that follow are the line and marker styles. For example, '-o' creates a solid line with dots on it and '--' creates a dashed line. As with most aspects of Matplotlib, the best thing to do here is play.
Colours:
Blue - 'b'
Green - 'g'
Red - 'r'
Cyan - 'c'
Magenta - 'm'
Yellow - 'y'
Black - 'k' ('b' is taken by blue so the last letter is used)
White - 'w'
Line Styles:
Solid Line - '-'
Dashed - '--'
Dotted - ':'
Dash-dotted - '-.'
Often Used Markers:
Point - '.'
Pixel - ','
Circle - 'o'
Square - 's'
Triangle - '^'
For more markers, see the markers page of the Matplotlib documentation.
You will end up with something like this: