Getting Started


  • You can run a cloud based Jupyter Notebook with a Google account and a web browser.
  • You can use a Jupyter notebook to edit and run Python.
  • Notebooks can include both code and markdown (text) cells.

Variables and Types


  • Use variables to store values.
  • Use print to display values.
  • Format output with f-strings.
  • Variables persist between cells.
  • Variables must be created before they are used.
  • Variables can be used in calculations.
  • Use an index to get a single character from a string.
  • Use a slice to get a portion of a string.
  • Use the built-in function len to find the length of a string.
  • Python is case-sensitive.
  • Every object has a type.
  • Use the built-in function type to find the type of an object.
  • Types control what operations can be done on objects.
  • Variables only change value when something is assigned to them.

Lists


  • A list stores many values in a single structure.
  • Use an item’s index to fetch it from a list.
  • Lists’ values can be replaced by assigning to them.
  • Appending items to a list lengthens it.
  • Use del to remove items from a list entirely.
  • Lists may contain values of different types.
  • Character strings can be indexed like lists.
  • Character strings are immutable.
  • Indexing beyond the end of the collection is an error.

Built-in Functions and Help


  • Use comments to add documentation to programs.
  • A function may take zero or more arguments.
  • Commonly-used built-in functions include max, min, and round.
  • Functions may only work for certain (combinations of) arguments.
  • Functions may have default values for some arguments.
  • Use the built-in function help to get help for a function.
  • Every function returns something.

Libraries & Pandas


  • Most of the power of a programming language is in its libraries.
  • A program must import a library module in order to use it.
  • Use help to learn about the contents of a library module.
  • Import specific items from a library to shorten programs.
  • Create an alias for a library when importing it to shorten programs.

For Loops


  • A for loop executes commands once for each value in a collection.
  • The first line of the for loop must end with a colon, and the body must be indented.
  • Indentation is always meaningful in Python.
  • A for loop is made up of a collection, a loop variable, and a body.
  • Loop variables can be called anything (but it is strongly advised to have a meaningful name to the looping variable).
  • The body of a loop can contain many statements.
  • Use range to iterate over a sequence of numbers.
  • The Accumulator pattern turns many values into one.

Looping Over Data Sets


  • Use a for loop to process files given a list of their names.
  • Use glob.glob to find sets of files whose names match a pattern.
  • Use glob and for to process batches of files.
  • Use a list “accumulator” to append a DataFrame to an empty list [].
  • The .merge(), .join(), and .concat() methods can combine pandas DataFrames.

Conditionals


  • Use if statements to control whether or not a block of code is executed.
  • Conditionals are often used inside loops.
  • Use else to execute a block of code when an if condition is not true.
  • Use elif to specify additional tests.
  • Conditions are tested once, in order.
  • Use and and or to check against multiple value statements.

Writing Functions


  • Break programs down into functions to make them easier to understand.
  • Define a function using def with a name, parameters, and a block of code.
  • Defining a function does not run it.
  • Arguments in call are matched to parameters in definition.
  • Functions may return a result to their caller using return.

Data Visualisation


  • Explored the use of pandas for basic data manipulation, ensuring correct indexing with DatetimeIndex to enable time-series operations like resampling.
  • Used pandas’ built-in plot() for initial visualizations and faced issues with overplotting, leading to adjustments like data filtering and resampling to simplify plots.
  • Introduced Plotly for advanced interactive visualizations, enhancing user engagement through dynamic plots such as line graphs, area charts, and bar plots with capabilities like dropdown selections.

Using Pandas


  • Use builtin methods .sum(), .mean(), unique(), and nunique() to explore summary statistics on the rows and colums in your DataFrame.
  • Use .groupby() to work with subsets of your dataset.
  • Sort pandas series with .sort_values().
  • Use .loc() and .iloc() to pinpoint specific locations in Pandas DataFrames.
  • Save DataFrames to CSV and pickle files using .to_csv() and .to_pickle().

Tidy Data with Pandas


  • In tidy data each variable forms a column, each observation forms a row, and each type of observational unit forms a table.
  • Using pandas for data manipulation to reshape data is fundamental for preparing data for analysis.

Wrap-Up


  • Python supports a large community within and outwith research.
  • Follow standard Python style (using PEP8) in your code.