Plotting with Matplotlib
Though there are many options for plotting data in Python, we will be using Matplotlib. In particular, we will be using the pyplot module in Matplotlib, which provides MATLAB-like plotting. The reason for this is simple: Matplotlib is the most common module used for plotting in Python and many examples of plotting you may find online will be using Matplotlib.
Downloading the data and preparing
- For our first lesson plotting data using Matplotlib we will again be using the weather data file from Lesson 5.
- Save a copy of this file in your home directory or a directory for the materials for this week’s lesson.
- The data file contains observed daily mean, minimum, and maximum temperatures from June 2016 recorded at the Kumpula weather observation station in Helsinki. It is derived from a data file of daily temperature measurements downloaded from the US National Oceanic and Atmospheric Administration’s National Centers for Environmental Information climate database.
- We expect something like 30 lines of data in the data file.
- If you have not already started Spyder you should do so now. You can find directions on how to open Spyder at the start of Lesson 1.
- Navigate in Spyder to the directory where you have stored the downloaded data file. You can do this most easily in Spyder by using the filesystem browser above the File/Variable explorer panel. Click on the file folder to select the directory where you would like to work, click Open, then click on the listed file path that is now displayed beside the file folder and press Enter.
Spyder plots in separate windows
By default, Spyder plots will be shown in the IPython console, but this can be annoying when saving and interacting with the plots we make. We can change how plots are displayed in Spyder to have them show up in a separate window by changing the preferences.
- Start by opening the Spyder preferences.
- Mac users can go to python -> Preferences... in the menubar
- Linux/Windows users can go to Tools -> Preferences
- In the Preferences window, click on IPython console on the left side of the window, then on the Graphics tab.
- Under Graphics backend, select Automatic for the backend.
- Restart Spyder.
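One way to check that the change took effect after restarting is to ask Matplotlib which backend is active. This is only a minimal sketch: the name printed depends on your installation, and a name containing "inline" would suggest that plots are still being sent to the IPython console.

```python
import matplotlib

# Ask Matplotlib which backend it is currently using.
# The exact name varies by system and installation.
backend = matplotlib.get_backend()
print("Active Matplotlib backend:", backend)
```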
Plotting data with Matplotlib
To start, we will need to import both Pandas and pyplot.
In [1]: import pandas as pd
In [2]: import matplotlib.pyplot as plt
Note again that we are renaming the modules when we import them. Perhaps now it is clearer why you might want to rename a module on import: having to type matplotlib.pyplot every time you use one of its functions would be a pain.
With our modules imported, we can now read in the data file in the same way as we did in Lesson 5.
In [3]: dataFrame = pd.read_csv('Kumpula-June-2016-w-metadata.txt', skiprows=8)
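The skiprows=8 option tells Pandas to ignore the first 8 lines of the file, so the ninth line is read as the column header. The sketch below illustrates this with an in-memory stand-in for the file. The metadata lines and temperature values are made-up placeholders, not the real Kumpula observations, and the comma-separated layout is an assumption.

```python
import io
import pandas as pd

# Hypothetical file contents: 8 metadata lines, then a header row and data rows.
# All values here are made-up placeholders for illustration only.
raw = "\n".join(
    [f"# metadata line {i}" for i in range(1, 9)]
    + [
        "YEARMODA,TEMP,MAX,MIN",
        "20160601,65.5,73.6,54.7",
        "20160602,65.8,80.8,55.0",
    ]
)

# skiprows=8 skips the metadata, so the header row becomes the column names
df = pd.read_csv(io.StringIO(raw), skiprows=8)
print(df.columns.tolist())
print(len(df), "rows of data")
```

With the real file, checking len(dataFrame) in the same way is a quick test of whether roughly 30 days of data were read.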
As you may recall, we will now have a Pandas DataFrame with 4 columns.
In [4]: print(dataFrame.columns)
Index(['YEARMODA', 'TEMP', 'MAX', 'MIN'], dtype='object')
OK, so let’s get to plotting! We can start by using the Matplotlib plt.plot() function.
In [5]: x = dataFrame['YEARMODA']
In [6]: y = dataFrame['TEMP']
In [7]: plt.plot(x, y)
Out[7]: [<matplotlib.lines.Line2D at 0x1078f07b8>]
In [8]: plt.show()
If all goes well, you should see the plot above.
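The Out[7] line in the session above is the list of Line2D objects that plt.plot() returns, one per plotted line. The sketch below inspects that object to show that the data is already stored in it before anything is displayed. It assumes a non-interactive backend (Agg) so it can run without a plot window, and it uses made-up placeholder values rather than the real data.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Made-up placeholder values standing in for the YEARMODA and TEMP columns
x = [20160601, 20160602, 20160603]
y = [65.5, 65.8, 68.4]

lines = plt.plot(x, y)  # returns a list with one Line2D per plotted line
line = lines[0]

# The plotted values are held by the Line2D object, even though
# nothing has been drawn on screen yet
stored_y = [float(v) for v in line.get_ydata()]
print(type(line).__name__, stored_y)
```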
OK, so what happened here? Well, first we assigned the values we would like to plot, the date and temperature, to the variables x and y. This isn’t necessary, per se, but does make it easier to see what is plotted. Next, it is perhaps pretty obvious that plt.plot() is a function in pyplot that produces a simple x-y plot. However, just like most variables in Python, creating the plot simply stores the information about the plot in memory. The plot is not displayed on the screen until you type plt.show().
We can make our plot look a bit nicer and provide more information by using a few additional pyplot options.
In [9]: plt.plot(x, y, 'ro--')
Out[9]: [<matplotlib.lines.Line2D at 0x114d6da90>]
In [10]: plt.title('Kumpula temperatures in June 2016')
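The format string 'ro--' passed in In [9] packs three settings into one argument: r for a red line, o for circle markers, and -- for a dashed line style. As a rough sketch of what that shorthand means, the example below draws the same line twice, once with the format string and once with explicit keyword arguments, and prints the resulting properties. The data values are made-up placeholders and the Agg backend is an assumption that lets the code run without a plot window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Made-up placeholder values for illustration only
x = [1, 2, 3]
y = [65.5, 65.8, 68.4]

plt.figure()

# Shorthand: color, marker, and line style in one format string
(short,) = plt.plot(x, y, 'ro--')

# The same three settings spelled out as keyword arguments
(explicit,) = plt.plot(x, y, color='r', marker='o', linestyle='--')

for line in (short, explicit):
    print(line.get_color(), line.get_marker(), line.get_linestyle())
```

Both lines should report the same color, marker, and line style. The keyword form is longer but can be easier to read when several style options are combined.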