This page was generated from source/notebooks/L2/Python-basic-elements.ipynb.

Basic elements of Python

In this lesson we will revisit data types, learn how data can be stored in Python lists, and about the concept of objects in programming.

Sources

Like the previous lesson, this lesson is inspired by the Programming with Python lessons from the Software Carpentry organization.

Note

There are some Python cells in this notebook that already contain code. You just need to press Shift-Enter to run those cells. We’re trying to avoid having you race to keep up typing in basic things for the lesson so you can focus on the main points :D.

Data types revisited

Let’s start with some data

We saw a bit about variables and their values in the lesson last week, and we continue today with some variables related to FMI observation stations in Finland. For each station, a number of pieces of information are given, including the name of the station, an FMI station ID number (FMISID), its latitude, its longitude, and the station type. We can store this information and some additional information for a given station in Python as follows:

[4]:
stationName = 'Helsinki Kaivopuisto'
[5]:
stationID = 132310
[6]:
stationLat = 60.15
[7]:
stationLong = 24.96
[8]:
stationType = 'Mareographs'

Here we have 5 values assigned to variables related to a single observation station. Each variable has a unique name and they can store different types of data.

Reminder: Data types and their compatibility

We can explore the different types of data stored in variables using the type() function.

[9]:
type(stationName)
[9]:
str
[10]:
type(stationID)
[10]:
int
[11]:
type(stationLat)
[11]:
float

As expected, we see that the stationName is a character string, the stationID is an integer, and the stationLat is a floating point number.

Note

We haven’t mentioned it explicitly yet, but the variable names in this lesson use another popular variable format called camelCase. In camelCase the words in the variable name are not separated by underscores or any other character, but rather the first letter is capitalized for all words in the name other than the first one.

Note

Remember, the data types are important because some are not compatible with one another.

[12]:
stationName + stationID
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-a9cbfafbc479> in <module>()
----> 1 stationName + stationID

TypeError: must be str, not int

Here we get a TypeError because Python does not know how to combine a string of characters (stationName) with an integer value (stationID).

Converting data from one type to another

It is not the case that things like the stationName and stationID cannot be combined at all, but in order to combine a character string with a number we need to perform a data type conversion to make them compatible. For example, we can convert the stationID integer value into a character string using the str() function.

[13]:
stationIDStr = str(stationID)
[14]:
type(stationIDStr)
[14]:
str
[16]:
print(stationIDStr)
132310

As you can see, str() converts a numerical value into a character string with the same numbers as before.

Note

Similar to using str() to convert numbers to character strings, int() can be used to convert strings or floating point numbers to integers and float() can be used to convert strings or integers to floating point numbers.
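As a minimal sketch of the conversions mentioned in the note above, the cell below converts between strings, integers, and floating point numbers. The variable names stationLatStr, latTruncated, and idAsFloat are only for illustration and are not used elsewhere in the lesson.

```python
# Values stored as character strings (for example, as read from a text file)
stationIDStr = '132310'
stationLatStr = '60.15'

stationID = int(stationIDStr)        # string -> integer
stationLat = float(stationLatStr)    # string -> floating point number
latTruncated = int(stationLat)       # float -> integer (the decimal part is dropped, not rounded)
idAsFloat = float(stationID)         # integer -> floating point number

print(stationID, type(stationID))
print(stationLat, type(stationLat))
print(latTruncated)
print(idAsFloat)
```

Note that int() only accepts strings that look like whole numbers, so int('60.15') would raise a ValueError. Converting with float() first avoids that.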

Attention

Poll pause - Questions 2.2, 2.3

Please visit the class polling page to participate (those present in lecture).

Combining text and numbers

Although most mathematical operations operate on numerical values, a common way to combine character strings is using the addition operator +.

[13]:
stationNameAndID = stationName + ": " + str(stationID)
[14]:
print(stationNameAndID)
Helsinki Kaivopuisto: 132310

Note that here we are converting stationID to a character string using the str() function within the assignment to the variable stationNameAndID. Alternatively, we could have simply added stationName and stationIDStr.
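The two approaches described above can be compared side by side, as in the sketch below. The last part uses an f-string, which is not covered in this lesson and is shown only as one possible alternative (it assumes Python 3.6 or newer). The names combined and formatted are just for illustration.

```python
stationName = 'Helsinki Kaivopuisto'
stationID = 132310
stationIDStr = str(stationID)

# Adding strings that have already been converted
combined = stationName + ": " + stationIDStr
print(combined)

# An f-string converts the value to text inside the curly braces
formatted = f"{stationName}: {stationID}"
print(formatted)
```

Both print statements should produce the same text as stationNameAndID above.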

Lists and indices

Above we have seen a bit of data related to one of several FMI observation stations in the Helsinki area. Rather than having individual variables for each of those stations, we can store many related values in a collection. The simplest type of collection in Python is a list.

Creating a list

Let’s first create a list of selected stationName values.

[16]:
stationNames = ['Helsinki Harmaja', 'Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']
[17]:
print(stationNames)
['Helsinki Harmaja', 'Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']
[18]:
type(stationNames)
[18]:
list

Here we have 4 stationName values stored in a list called stationNames. As you can see, the type() function recognizes this as a list. Lists can be created using the square brackets ([ and ]), with commas separating the values in the list.

Index values

To access an individual value in the list we need to use an index value. An index value is a number that refers to a given position in the list. Let’s check out the first value in our list as an example:

[19]:
print(stationNames[1])
Helsinki Kaisaniemi

Wait, what? This is the second value in the list we’ve created, so what is wrong? As it turns out, Python (like many other programming languages) numbers the values stored in collections starting from index 0. Thus, to get the value for the first item in the list, we must use index 0.

[20]:
print(stationNames[0])
Helsinki Harmaja

OK, that makes sense, but it may take some getting used to…

A useful analog - Bill the vending machine

As it turns out, index values are extremely useful and very commonly used in many programming languages, yet they are often a point of confusion for new programmers. Thus, we need a trick for remembering what an index value is and how it is used. For this, we need to be introduced to Bill.

Bill, the vending machine.

As you can see, Bill is a vending machine that contains 6 items. Like Python lists, the list of items available from Bill starts at 0 and increases in increments of 1.

The way Bill works is that you insert your money, then select the location of the item you wish to receive. In an analogy to Python, we could say Bill is simply a list of food items and the buttons you push to get them are the index values. For example, if you would like to buy a taco from Bill, you would push button 3. An equivalent operation in Python could simply be

print(Bill[3])
Taco

Number of items in a list

We can find the length of a list using the len() function.

[23]:
len(stationNames)
[23]:
4

Just as expected, there are 4 values in our list and len(stationNames) returns a value of 4.

Index value tips

If we know the length of the list, we can now use it to find the value of the last item in the list, right?

[24]:
print(stationNames[4])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-24-802825731e74> in <module>()
----> 1 print(stationNames[4])

IndexError: list index out of range

What, an IndexError?!? That’s right, since our list starts with index 0 and has 4 values, the index of the last item in the list is len(stationNames) - 1. That isn’t ideal, but fortunately there’s a nice trick in Python to find the last item in a list.

[25]:
print(stationNames)
['Helsinki Harmaja', 'Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']
[26]:
print(stationNames[-1])
Helsinki Kumpula
[27]:
print(stationNames[-4])
Helsinki Harmaja

Yes, in Python you can go backwards through lists by using negative index values. Index -1 gives the last value in the list and index -len(stationNames) would give the first. Of course, you still need to keep the index values within their ranges.

[28]:
print(stationNames[-5])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-28-b40ca6c3b597> in <module>()
----> 1 print(stationNames[-5])

IndexError: list index out of range
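To summarize the index value tips, the sketch below shows that for a list of 4 values the valid index values run from -4 to 3. The variable names lastIndex and firstNegativeIndex are only for illustration.

```python
stationNames = ['Helsinki Harmaja', 'Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']

lastIndex = len(stationNames) - 1         # 3 for a list of 4 values
firstNegativeIndex = -len(stationNames)   # -4 for a list of 4 values

print(stationNames[lastIndex])            # Same item as stationNames[-1]
print(stationNames[-1])
print(stationNames[firstNegativeIndex])   # Same item as stationNames[0]
print(stationNames[0])
```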

Attention

Poll pause - Question 2.4

Please visit the class polling page to participate (those present in lecture).

Modifying list values

Another nice feature of lists is that they are mutable, meaning that the values in a list that has been defined can be modified. Consider a list of the observation station types corresponding to the station names in the stationNames list.

[29]:
stationTypes = ['Weather stations', 'Weather stations', 'Weather stations', 'Weather stations']
print(stationTypes)
['Weather stations', 'Weather stations', 'Weather stations', 'Weather stations']

Now as we saw before, the station type for Helsinki Kaivopuisto should be ‘Mareographs’, not ‘Weather stations’. Fortunately, this is an easy fix. We simply replace the value at the corresponding location in the list with the correct one.

[30]:
stationTypes[2] = 'Mareographs'
print(stationTypes)
['Weather stations', 'Weather stations', 'Mareographs', 'Weather stations']

Data types in lists

Lists can also store more than one type of data. Let’s consider that in addition to having a list of each station name, FMISID, latitude, etc. we would like to have a list of all of the values for station ‘Helsinki Kaivopuisto’.

[31]:
stationHelKaivo = [stationName, stationID, stationLat, stationLong, stationType]
print(stationHelKaivo)
['Helsinki Kaivopuisto', 132310, 60.15, 24.96, 'Mareographs']

Here we have one list with 3 different types of data in it. We can confirm this using the type() function.

[33]:
type(stationHelKaivo)
[33]:
list
[34]:
type(stationHelKaivo[0])    # The station name
[34]:
str
[37]:
type(stationHelKaivo[1])    # The FMISID
[37]:
int
[38]:
type(stationHelKaivo[2])    # The station latitude
[38]:
float

Adding and removing values from lists

Finally, we can add and remove values from lists to change their lengths. Let’s consider that we no longer want to include the first value in the stationNames list.

[39]:
print(stationNames)
['Helsinki Harmaja', 'Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']
[40]:
del stationNames[0]
[41]:
print(stationNames)
['Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']

del allows values in lists to be removed. It can also be used to delete values from memory in Python. If we would instead like to add a few station names to the stationNames list, we can do so as follows.

[42]:
stationNames.append('Helsinki lighthouse')
stationNames.append('Helsinki Malmi airfield')
[43]:
print(stationNames)
['Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula', 'Helsinki lighthouse', 'Helsinki Malmi airfield']

As you can see, we add values one at a time using stationNames.append(). list.append() is called a method in Python, which is a function that works for a given data type (a list in this case). We’ll see a bit more about these below.
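As a small sketch of how del and append() change the length of a list, the cell below removes the last value using a negative index and then adds it back. The variable names lengthBefore, lengthAfterDel, and lengthAfterAppend are only for illustration.

```python
stationNames = ['Helsinki Harmaja', 'Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']
lengthBefore = len(stationNames)

del stationNames[-1]                      # Remove the last value in the list
lengthAfterDel = len(stationNames)

stationNames.append('Helsinki Kumpula')   # New values are always added to the end
lengthAfterAppend = len(stationNames)

print(lengthBefore, lengthAfterDel, lengthAfterAppend)
print(stationNames[-1])
```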

The concept of objects

Python is one of a number of computer programming languages that are called ‘object-oriented languages’. It may take quite some time to understand what this means, but the simple explanation is that we can consider the variables that we define to be ‘objects’ that can contain both data known as attributes and a specific set of functions known as methods. The previous sentence could also take some time to understand by itself, but using an example the concept of ‘objects’ is much easier to understand.

A (bad) example of methods

Let’s consider our list stationNames. As we know, we already have data in the list stationNames, and we can modify that data using built-in methods such as stationNames.append(). In this case, the method append() is something that exists for lists, but not for other data types. It is intuitive that you might like to add (or append) things to a list, but perhaps it does not make sense to append to other data types.

[45]:
stationNameLength = len(stationNames)
[46]:
print(stationNameLength)
5
[47]:
type(stationNameLength)
[47]:
int
[48]:
stationNameLength.append(1)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-48-4c1ef8aeb47c> in <module>()
----> 1 stationNameLength.append(1)

AttributeError: 'int' object has no attribute 'append'

Here we get an AttributeError because there is no method built in to the int data type to append to int data. While append() makes sense for list data, it is not sensible for int data, which is the reason no such method exists for int data.
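If you want to check whether a method exists without triggering an error, one option is the built-in hasattr() function. It is not part of this lesson and is shown here only as a hedged illustration of the idea that methods belong to particular data types.

```python
stationNames = ['Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']
stationNameLength = len(stationNames)

listHasAppend = hasattr(stationNames, 'append')       # Lists have an append() method
intHasAppend = hasattr(stationNameLength, 'append')   # Integers do not

print(listHasAppend)
print(intHasAppend)
```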

Some other useful list methods

With lists we can do a number of useful things, such as count the number of times a value occurs in a list or where it occurs.

[49]:
stationNames.count('Helsinki Kumpula')    # The count method counts the number of occurrences of a value
[49]:
1
[50]:
stationNames.index('Helsinki Kumpula')    # The index method gives the index value of an item in a list
[50]:
2

The good news here is that our selected station name is only in the list once. Should we need to modify it for some reason, we also now know where it is in the list (index 2).
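What happens when the value is not in the list? The sketch below uses a made-up station name to show that count() returns 0 in that case. index() would instead raise a ValueError, so here we check first using the in keyword. Both the in check and the if statement are not covered in this lesson and are included only as an assumption about how you might guard the call.

```python
stationNames = ['Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']
missingName = 'Helsinki Example station'    # Hypothetical name, not a real station

missingCount = stationNames.count(missingName)
print(missingCount)                         # 0, because the value does not occur in the list

# Calling stationNames.index(missingName) directly would raise a ValueError
if missingName in stationNames:
    print(stationNames.index(missingName))
else:
    print(missingName, 'is not in the list')
```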

Reversing a list

There are two other common methods for lists that we need to see. First, there is the .reverse() method, used to reverse the order of items in a list.

[51]:
stationNames.reverse()
[52]:
print(stationNames)
['Helsinki Malmi airfield', 'Helsinki lighthouse', 'Helsinki Kumpula', 'Helsinki Kaivopuisto', 'Helsinki Kaisaniemi']

Yay, it works!

Caution

A common mistake when reversing lists is to do something like stationNames = stationNames.reverse(). Do not do this! When reversing lists with .reverse() the None value is returned (this is why there is no screen output when running stationNames.reverse()). If you then assign the output of stationNames.reverse() to stationNames you will reverse the list, but then overwrite its contents with the returned value None. This means you’ve deleted the list contents (!).
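The behavior described in the caution can be seen safely by storing the returned value in a separate variable. The names returnedValue and reversedCopy are only for illustration, and the built-in reversed() function is not part of this lesson. It is shown as one possible way to get a reversed copy while leaving the original list untouched.

```python
stationNames = ['Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula']

returnedValue = stationNames.reverse()   # The list is reversed in place
print(returnedValue)                     # None, nothing useful is returned
print(stationNames)                      # The list itself is now reversed

# reversed() leaves stationNames unchanged and list() builds a new list from it
reversedCopy = list(reversed(stationNames))
print(reversedCopy)
```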

Sorting a list

The .sort() method works the same way.

[53]:
stationNames.sort()   # Notice no output here...
[54]:
print(stationNames)
['Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula', 'Helsinki Malmi airfield', 'Helsinki lighthouse']

As you can see, the list has been sorted alphabetically using the .sort() method, but there is no screen output when this occurs. Again, if you were to assign that output to stationNames the list would get sorted, but the contents would then be assigned None.

Note

As you may have noticed, Helsinki Malmi airfield comes before Helsinki lighthouse in the sorted list. This is because alphabetical sorting in Python places capital letters before lowercase letters.
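If you would prefer a sort that ignores capitalization, one option is the key parameter of .sort(). This goes beyond what the lesson covers, so treat the cell below as a sketch: key=str.lower tells Python to compare lowercase versions of the names while keeping the original values in the list.

```python
stationNames = ['Helsinki Kaisaniemi', 'Helsinki Kaivopuisto', 'Helsinki Kumpula', 'Helsinki lighthouse', 'Helsinki Malmi airfield']

stationNames.sort()                # Default sorting: capital letters come first
print(stationNames)

stationNames.sort(key=str.lower)   # Compare lowercase versions of each name
print(stationNames)
```

With the key in place, Helsinki lighthouse should come before Helsinki Malmi airfield.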

List attributes

We won’t discuss any list attributes because as far as we know there aren’t any, but we’ll encounter some very useful attributes of other data types in the future.