Basic elements of Python¶
In this lesson we will revisit data types, learn how data can be stored in Python lists, and about the concept of objects in programming. If you have not already started Spyder you should do so now. You can find directions on how to open Spyder at the start of Lesson 1.
Sources¶
Like the previous lesson, this lesson is inspired by the Programming with Python lessons from the Software Carpentry organization.
Data types revisited¶
We saw a bit about variables and their values in the lesson last week, and we continue today with some variables related to FMI observation stations in Finland. For each station, a number of pieces of information are given, including the name of the station, an FMISID, its latitude, its longitude, and the station type. We can store this information and some additional information for a given station in Python as follows:
In [1]: stationName = 'Helsinki Kaivopuisto' In [2]: stationID = 132310 In [3]: stationLat = 60.15 In [4]: stationLong = 24.96 In [5]: stationType = 'Mareographs'
Here we have 5 values assigned to variables related to a single observation station. Each variable has a unique name and they can store different types of data.
We can explore the different types of data stored in variables using the
type()
function.In [6]: type(stationName) Out[6]: str In [7]: type(stationID) Out[7]: int In [8]: type(stationLat) Out[8]: float
As expected, we see that the
stationName
is a character string, thestationID
is an integer, and thestationLat
is a floating point number.Note
Remember, the data types are important because some are not compatible with one another.
In [9]: stationName + stationID --------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-9-a9cbfafbc479> in <module>() ----> 1 stationName + stationID TypeError: must be str, not int
Here we get a
TypeError
because Python does not know how we are expected to combine a string of characters (stationName
) with an integer value (stationID
).It is not the case that things like the the
stationName
andstationID
cannot be combined at all, but in order to combine a character string with a number we need to perform a data type conversion. For example, we can could convert thestationID
into a character string using thestr()
function.In [10]: stationIDStr = str(stationID) In [11]: type(stationIDStr) Out[11]: str In [12]: print(stationIDStr) 132310
As you can see,
str()
converts a numerical value into a character string with the same numbers as before.Note
Similar to using
str()
to convert numbers to character strings,int()
can be used to convert strings or floating point numbers to integers andfloat()
can be used to convert strings or integers to floating point numbers.Attention
Poll pause - Questions 2.2, 2.3
Please visit the class polling page to participate (those present in lecture only).
Although most mathematical operations operate on numerical values, a common way to combine character strings is using the addition operator
+
.In [13]: stationNameAndID = stationName + ": " + str(stationID) In [14]: print(stationNameAndID) Helsinki Kaivopuisto: 132310
Note that here we are converting
stationID
to a character string using thestr()
function within the assignment to the variablestationNameAndID
. Alternatively, we could have simply addedstationName
andstationIDStr
.
Lists and indices¶
Above we have seen a bit of data related to one of several FMI observation stations in the Helsinki area. Rather than having individual variables for each of those stations, we can store many related values in a collection. The simplest type of collection in Python is a list.
Let’s first create a list of selected
stationName
values.In [15]: stationNames = ['Helsinki Harmaja', 'Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula'] In [16]: print(stationNames) ['Helsinki Harmaja', 'Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula'] In [17]: type(stationNames) Out[17]: list
Here we have a list of 4
stationName
values in a list calledstationNames
. As you can see, thetype()
function recognizes this as a list. Lists can be created using the square brackets ([
and]
), with commas separating the values in the list.To access an individual value in the list we need to use an index value. An index value is a number that refers to a given position in the list. Let’s check out the first value in our list as an example:
In [18]: print(stationNames[1]) Helsinki Kaisaniemi
Wait, what? This is the second value in the list we’ve created, what is wrong? As it turns out, Python (and many other programming languages) start values stored in collections with the index value 0. Thus, to get the value for the first item in the list, we must use index 0.
In [19]: print(stationNames[0]) Helsinki Harmaja
OK, that makes sense, but it may take some getting used to…
As it turns out, index values are extremely useful, very commonly used in many programming languages, yet often a point of confusion for new programmers. Thus, we need to have a trick for remembering what an index value is and how they are used. For this, we need to be introduced to Bill.
As you can see, Bill is a vending machine that contains 6 items. Like Python lists, the list of items available from Bill starts at 0 and increases in increments of 1.
The way Bill works is that you insert your money, then select the location of the item you wish to receive. In an analogy to Python, we could say Bill is simply a list of food items and the buttons you push to get them are the index values. For example, if you would like to buy a taco from Bill, you would push button
3
. The equivalent operation in Python would simply beIn [20]: print(Bill[3]) Taco
We can find the length of a list using the
len()
function.In [21]: len(stationNames) Out[21]: 4
Just as expected, there are 4 values in our list and
len(stationNames)
returns a value of 4.If we know the length of the list, we can now use it to find the value of the last item in the list, right?
In [22]: print(stationNames[4]) --------------------------------------------------------------------------- IndexError Traceback (most recent call last) <ipython-input-22-802825731e74> in <module>() ----> 1 print(stationNames[4]) IndexError: list index out of range
What, an
IndexError
? That’s right, since our list starts with index 0 and has 4 values, the index of the last item in the list islen(SampleIDs) - 1
. That isn’t ideal, but fortunately there’s a nice trick in Python to find the last item in a list.In [23]: print(stationNames) ['Helsinki Harmaja', 'Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula'] In [24]: print(stationNames[-1]) Helsinki Kumpula In [25]: print(stationNames[-4]) Helsinki Harmaja
Yes, in Python you can go backwards through lists by using negative index values. Index -1 gives the last value in the list and index
-len(SampleIDs)
would give the first. Of course, you still need to keep the index values within their ranges.In [26]: print(stationNames[-5]) --------------------------------------------------------------------------- IndexError Traceback (most recent call last) <ipython-input-26-b40ca6c3b597> in <module>() ----> 1 print(stationNames[-5]) IndexError: list index out of range
Attention
Poll pause - Question 2.4
Please visit the class polling page to participate (those present in lecture only).
Another nice feature of lists is that they are mutable, meaning that the values in a list that has been defined can be modified. Consider a list of the observation station types corresponding to the station names in the
stationNames
list.In [27]: stationTypes = ['Weather stations', 'Weather stations', 'Weather stations', 'Weather stations'] In [28]: print(stationTypes) ['Weather stations', 'Weather stations', 'Weather stations', 'Weather stations']
Now as we saw before, the station type for Helsinki Kaivopuisto should be ‘Mareographs’, not ‘Weather stations’. Fortunately, this is an easy fix. We simply replace the value at the corresponding location in the list with the correct one.
In [29]: stationTypes[2] = 'Mareographs' In [30]: print(stationTypes) ['Weather stations', 'Weather stations', 'Mareographs', 'Weather stations']
Lists also do not need to have only one type of data. Let’s consider that in addition to having a list of each station name, FMISID, latitude, etc. we would like to have a list of all of the values for station ‘Helsinki Kaivopuisto’.
In [31]: stationHelKaivo = [stationName, stationID, stationLat, stationLong, stationType] In [32]: print(stationHelKaivo) ['Helsinki Kaivopuisto', 132310, 60.15, 24.96, 'Mareographs']
Here we have one list with 3 different type of data in it. We can confirm this using the
type()
function.In [33]: type(stationHelKaivo) Out[33]: list In [34]: type(stationHelKaivo[0]) # The station name Out[34]: str In [35]: type(stationHelKaivo[1]) # The FMISID Out[35]: int In [36]: type(stationHelKaivo[2]) # The station latitude Out[36]: float
Finally, we can add and remove values from lists to change their lengths. Let’s consider that we no longer want to include the first value in the
stationNames
list.In [37]: print(stationNames) ['Helsinki Harmaja', 'Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula'] In [38]: del stationNames[0] In [39]: print(stationNames) ['Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']
del
allows values in lists to be removed. It can also be used to delete values from memory in Python. If we would instead like to add a few samples to thestationNames
list, we can do so as follows.In [40]: stationNames.append('Helsinki lighthouse') In [41]: stationNames.append('Helsinki Malmi airfield') In [42]: print(stationNames) ['Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula', 'Helsinki lighthouse', 'Helsinki Malmi airfield']
As you can see, we add values one at a time using
stationNames.append()
.list.append()
is called a method in Python, which is a function that works for a given data type (a list in this case). We’ll see a bit more about these below.
The concept of objects¶
Python is one of a number of computer programming languages that are called ‘object-oriented languages’. It took me quite some time to understand what this meant, but the simple explanation is that we can consider the variables that we define to be ‘objects’ that can contain both data known as attributes and a specific set of functions (methods). The previous sentence could take quite some time to understand by itself, but using an example the concept of ‘objects’ is much easier to understand.
Let’s consider our list
stationNames
. As we know, we already have data in the liststationNames
, and we can modify that data using built-in methods such asstationNames.append()
. We can also do other things such as count the number of times a value occurs in a list, or where it occurs.In [43]: stationNames.count('Helsinki Kumpula') # The count method counts the number of occurences of a value Out[43]: 1 In [44]: stationNames.index('Helsinki Kumpula') # The index method gives the index value of an item in a list Out[44]: 2
The good news here is that our selected station name is only in the list once. Should we need to modify it for some reason, we also now know where it is in the list (index
2
).There are two other common methods for lists that we need to see. First, there is the
.reverse()
method, used to reverse the order of items in a list.In [45]: stationNames.reverse() In [46]: print(stationNames) ['Helsinki Malmi airfield', 'Helsinki lighthouse', 'Helsinki Kumpula', 'Helsinki Kaivopuisto', 'Helsinki Kaisaniemi']
Yay, it works!
Caution
A common mistake when sorting lists is to do something like
stationNames = stationNames.reverse()
. Do not do this! When reversing lists with.reverse()
theNone
value is returned (this is why there is no screen ouput when runningstationNames.reverse()
). If you then assign the output ofstationNames.reverse()
tostationNames
you will reverse the list, but then overwrite its contents with the returned valueNone
. This means you’ve deleted the list contents (!).The
.sort()
method works the same way.In [47]: stationNames.sort() # Notice no output here... In [48]: print(stationNames) ['Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula', 'Helsinki Malmi airfield', 'Helsinki lighthouse']
As you can see, the list has been sorted alphabetically using the
.sort()
method, but there is no screen output when this occurs. Again, if you were to assign that output tostationNames
the list would get sorted, but the contents would then be assignedNone
.Note
As you may have noticed,
Helsinki Malmi airfield
comes beforeHelsinki lighthouse
in the sorted list. This is because alphabetical sorting in Python places capital letters before lowercase letters.We won’t discuss any list attributes because as far as I know there aren’t any, but we’ll encounter some very useful attributes of other data types in the future.