Lesson overview

In this lesson we will learn the basics of reading data from and writing data to files, and do some basic data manipulation and analysis. This year we are teaching these concepts using your choice of two Python modules: Pandas and NumPy. Pandas is a modern, feature-rich data analysis framework for Python that makes data analysis and manipulation straightforward through easy-to-use data structures and operations. NumPy is designed for efficient numerical computing in Python and is used under the hood by Pandas.

We recommend that students continuing on to the Automating GIS processes course follow the lessons using Pandas below.

  1. Pandas lesson
  2. Exercise 5: Data analysis I

Those continuing on in the Introduction to Quantitative Geology course should follow the NumPy lessons.

  1. NumPy lesson
  2. Exercise 5: Data analysis I

Learning goals

After this week's lesson you should be able to:

  • Read data from a .csv text file into Python using Pandas or NumPy
  • Do simple data analysis using Pandas/NumPy data structures and related functions
  • Write data from Pandas/NumPy to a .csv text file
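As a rough preview of these three goals, the sketch below reads a small .csv file, computes a mean and a derived column, and writes the result back to disk, once with Pandas and once with NumPy. The file names, the column names (DAY, TEMP_C, TEMP_F), and the sample values are made up for illustration and are not the data used in the lessons or the exercise.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Hypothetical sample data, written to a temporary folder so the sketch is self-contained
workdir = Path(tempfile.mkdtemp())
in_path = workdir / "temperatures.csv"
in_path.write_text("DAY,TEMP_C\n1,10.0\n2,12.5\n3,13.5\n")

# --- Pandas ---
data = pd.read_csv(in_path)                    # read the .csv file into a DataFrame
mean_pandas = data["TEMP_C"].mean()            # simple analysis: mean of one column
data["TEMP_F"] = data["TEMP_C"] * 9 / 5 + 32   # derive a new column (Celsius to Fahrenheit)
out_pandas = workdir / "temperatures_pandas.csv"
data.to_csv(out_pandas, index=False)           # write to .csv without the row index

# --- NumPy ---
array = np.genfromtxt(in_path, delimiter=",", skip_header=1)  # read numeric rows, skip the header
mean_numpy = array[:, 1].mean()                # mean of the second column (TEMP_C)
temp_f = array[:, 1] * 9 / 5 + 32              # same conversion on a NumPy array
out_numpy = workdir / "temperatures_numpy.csv"
np.savetxt(
    out_numpy,
    np.column_stack((array, temp_f)),          # append the new column to the original data
    delimiter=",",
    header="DAY,TEMP_C,TEMP_F",
    comments="",                               # keep the header line free of a leading '#'
    fmt="%.1f",
)

print("Mean temperature (Pandas):", mean_pandas)
print("Mean temperature (NumPy):", mean_numpy)
```

The Pandas version keeps column names attached to the data, while the NumPy version works with plain numeric arrays and column positions. That difference is one reason the two lesson tracks are taught separately.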

Lecture videos

Pandas

Lesson 5.1P - Exploring data using Pandas

Dave Whipp & Henrikki Tenkanen, University of Helsinki @ Geo-Python channel on YouTube.

Lesson 5.2P - Processing data with Pandas

Dave Whipp & Henrikki Tenkanen, University of Helsinki @ Geo-Python channel on YouTube.

NumPy

Lesson 5.1N - Exploring data using NumPy

Dave Whipp & Henrikki Tenkanen, University of Helsinki @ Geo-Python channel on YouTube.

Lesson 5.2N - Processing data with NumPy

Dave Whipp & Henrikki Tenkanen, University of Helsinki @ Geo-Python channel on YouTube.

Lesson 5.3 - Exercise 5 preview

Dave Whipp & Henrikki Tenkanen, University of Helsinki @ Geo-Python channel on YouTube.

Lesson 5.4 - Exercise 4 review

Dave Whipp & Henrikki Tenkanen, University of Helsinki @ Geo-Python channel on YouTube.