Beta
This lesson is in the beta phase, which means that it is ready for teaching by instructors outside of the original author team.
Introduction to Python
You can run a cloud-based Jupyter Notebook with a Google account and
a web browser.
You can use a Jupyter notebook to edit and run Python.
Notebooks can include both code and markdown (text) cells.
Use variables to store values.
Use print to display values.
Format output with f-strings.
Variables persist between cells.
Variables must be created before they are used.
Variables can be used in calculations.
Use an index to get a single character from a string.
Use a slice to get a portion of a string.
Use the built-in function len to find the length of a string.
Python is case-sensitive.
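The points above on variables and strings can be sketched in a few lines. The names and values here (title, year) are hypothetical and chosen only for illustration.

```python
# Hypothetical values, chosen only for illustration.
title = "Python"
year = 1991

# An f-string embeds variable values in text.
message = f"{title} first appeared in {year}."
print(message)

# Variables can be used in calculations.
age = 2024 - year

first_char = title[0]      # index: a single character
first_three = title[0:3]   # slice: characters at positions 0, 1 and 2
n_chars = len(title)       # length of the string
print(first_char, first_three, n_chars)

# Python is case-sensitive: Title was never created, so this raises NameError.
try:
    print(Title)
except NameError as error:
    print("NameError:", error)
```

In a notebook, title and year would remain available in later cells once this cell has run.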
Every object has a type.
Use the built-in function type to find the type of an object.
Types control what operations can be done on objects.
Variables only change value when something is assigned to them.
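A short sketch of how types affect operations, using hypothetical variables count and label:

```python
count = 3      # an integer
label = "3"    # a string that happens to contain a digit

print(type(count))
print(type(label))

total = count + int(label)   # convert the string first, then add numbers
repeated = label * count     # multiplying a string by an integer repeats it

# Adding an integer and a string is not allowed.
try:
    count + label
except TypeError as error:
    print("TypeError:", error)

# Variables only change value when something is assigned to them.
first = 1
second = 2 * first
first = 10
print(second)   # second keeps the value it was given earlier
```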
A list stores many values in a single structure.
Use an item’s index to fetch it from a list.
Lists’ values can be replaced by assigning to them.
Appending items to a list lengthens it.
Use del to remove items from a list entirely.
Lists may contain values of different types.
Character strings can be indexed like lists.
Character strings are immutable.
Indexing beyond the end of the collection is an error.
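The list operations above might look like this in practice. The list contents are made up for the example.

```python
temperatures = [17.5, 19.0, 21.3]

temperatures[0] = 18.0       # replace a value by assigning to its index
temperatures.append(22.1)    # appending lengthens the list
del temperatures[1]          # remove the item at index 1 entirely
print(temperatures)

mixed = ["ten", 10, 10.0]    # values of different types in one list

word = "carbon"
first_letter = word[0]       # strings can be indexed like lists

# Strings are immutable, so assigning to a position fails.
try:
    word[0] = "C"
except TypeError as error:
    print("TypeError:", error)

# Indexing beyond the end of a collection is an error.
try:
    temperatures[99]
except IndexError as error:
    print("IndexError:", error)
```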
Use comments to add documentation to programs.
A function may take zero or more arguments.
Commonly-used built-in functions include max, min, and round.
Functions may only work for certain (combinations of)
arguments.
Functions may have default values for some arguments.
Use the built-in function help to get help for a function.
Every function returns something.
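As a small illustration of the built-in functions mentioned above (the numbers are arbitrary):

```python
values = [3.712, 1.5, 9.25]

largest = max(values)
smallest = min(values)

# round has a default for its second argument: zero decimal places.
rounded_default = round(3.712)
rounded_one = round(3.712, 1)

# Functions only work for certain combinations of arguments.
try:
    max(1, "a")
except TypeError as error:
    print("TypeError:", error)

# Every function returns something; print returns None.
result = print("hello")
print(result)

# In a notebook, help(round) displays the documentation for round.
```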
Most of the power of a programming language is in its
libraries.
A program must import a library module in order to use it.
Use help to learn about the contents of a library module.
Import specific items from a library to shorten programs.
Create an alias for a library when importing it to shorten
programs.
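The three import styles described above can be sketched with modules from the standard library. The alias stats is an arbitrary choice for this example.

```python
import math                  # import the whole module
from math import sqrt        # import one specific item
import statistics as stats   # import with an alias

circle_area = math.pi * 2 ** 2        # area of a circle with radius 2
root = sqrt(16)                       # no "math." prefix needed
average = stats.mean([1, 2, 3, 4])    # the alias shortens the module name

print(circle_area, root, average)

# In a notebook, help(math) lists the contents of the math module.
```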
A for loop executes commands once for each value in a
collection.
The first line of the for loop must end with a colon, and the body must be indented.
Indentation is always meaningful in Python.
A for loop is made up of a collection, a loop variable, and a body.
Loop variables can be called anything, but giving the loop variable a
meaningful name is strongly advised.
The body of a loop can contain many statements.
Use range to iterate over a sequence of numbers.
The Accumulator pattern turns many values into one.
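A minimal sketch of a for loop, range, and the accumulator pattern, with made-up data:

```python
# Accumulator pattern: start from an initial value and update it in the loop.
total = 0
for number in range(1, 6):    # range(1, 6) produces 1, 2, 3, 4, 5
    total = total + number    # the indented lines form the loop body
print(total)

# The accumulator can also be a list that grows on each pass.
lengths = []
for word in ["red", "green", "blue"]:
    lengths.append(len(word))
print(lengths)
```

Here number and word are the loop variables; the names are chosen to describe what each value represents.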
Use a for loop to process files given a list of their names.
Use glob.glob to find sets of files whose names match a pattern.
Use glob and for to process batches of files.
Use a list “accumulator” to append a DataFrame to an empty list [].
The .merge() and .join() methods and the pandas.concat() function can combine pandas DataFrames.
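One way the batch-processing points above could fit together is sketched below. The file names, column names, and values are hypothetical; the example writes two small CSV files to a temporary folder first so that it runs on its own.

```python
import glob
import os
import tempfile

import pandas as pd

# Create two small CSV files (hypothetical data) so the example is self-contained.
folder = tempfile.mkdtemp()
pd.DataFrame({"branch": ["north"], "visits": [10]}).to_csv(
    os.path.join(folder, "2021.csv"), index=False
)
pd.DataFrame({"branch": ["south"], "visits": [20]}).to_csv(
    os.path.join(folder, "2022.csv"), index=False
)

# Find every CSV file in the folder, then read each one inside a for loop.
frames = []   # list accumulator, starts empty
for filename in sorted(glob.glob(os.path.join(folder, "*.csv"))):
    frames.append(pd.read_csv(filename))

# Stack the collected DataFrames into a single DataFrame.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

glob.glob does not guarantee any particular order, so sorted is used here to make the result predictable.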
Use if statements to control whether or not a block of code is executed.
Conditionals are often used inside loops.
Use else to execute a block of code when an if condition is not true.
Use elif to specify additional tests.
Conditions are tested once, in order.
Use the and and or operators to combine multiple conditions in a single test.
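A brief sketch of conditionals inside a loop, using arbitrary threshold values:

```python
labels = []
for mass in [3.5, 9.0, 12.4]:
    # Conditions are tested in order; only the first true branch runs.
    if mass > 10.0:
        labels.append("large")
    elif mass > 5.0:
        labels.append("medium")
    else:
        labels.append("small")
print(labels)

# and requires both conditions to be true; or requires at least one.
mid_range = []
extremes = []
for mass in [3.5, 9.0, 12.4]:
    if mass > 5.0 and mass < 10.0:
        mid_range.append(mass)
    if mass < 5.0 or mass > 10.0:
        extremes.append(mass)
print(mid_range, extremes)
```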
Break programs down into functions to make them easier to
understand.
Define a function using def with a name, parameters, and a block of code.
Defining a function does not run it.
Arguments in a call are matched to parameters in the definition.
Functions may return a result to their caller using return.
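The points above on defining and calling functions can be illustrated with a small, hypothetical conversion function:

```python
def fahrenheit_to_celsius(temp_f, digits=1):
    """Convert a temperature from Fahrenheit to Celsius, rounded to `digits` places."""
    return round((temp_f - 32) * 5 / 9, digits)


# Defining the function above does not run it; calling it does.
# Positional arguments are matched to parameters in order.
boiling = fahrenheit_to_celsius(212)

# Keyword arguments are matched by name, so their order does not matter.
body = fahrenheit_to_celsius(digits=2, temp_f=98.6)

print(boiling, body)
```

The digits parameter has a default value, so the first call can leave it out.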
Use pandas for basic data manipulation, and index time-stamped data
with a DatetimeIndex to enable time-series operations like resampling.
Use pandas’ built-in plot() for initial visualizations; when overplotting
makes a plot hard to read, filter or resample the data to simplify it.
Use Plotly for advanced interactive visualizations such as line graphs,
area charts, and bar plots, with capabilities like dropdown selections.
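As a rough sketch of why a DatetimeIndex matters, the example below converts a text column to dates, sets it as the index, and resamples. The column names and counts are invented for illustration.

```python
import pandas as pd

# Hypothetical daily counts with dates stored as text.
raw = pd.DataFrame(
    {
        "date": ["2023-01-01", "2023-01-02", "2023-01-03",
                 "2023-01-04", "2023-01-05", "2023-01-06"],
        "checkouts": [5, 7, 3, 8, 2, 6],
    }
)

# Convert the text to datetimes and use them as the index.
raw["date"] = pd.to_datetime(raw["date"])
daily = raw.set_index("date")
print(type(daily.index))

# With a DatetimeIndex in place, resample can group rows into 3-day bins.
three_day = daily.resample("3D").sum()
print(three_day)
```

Resampling to coarser bins like this is one way to reduce overplotting before calling plot() on the result.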
Use the built-in methods .sum(), .mean(), .unique(), and .nunique() to explore summary
statistics on the rows and columns in your DataFrame.
Use .groupby() to work with subsets of your dataset.
Sort pandas Series with .sort_values().
Use .loc[] and .iloc[] to pinpoint specific locations in pandas DataFrames.
Save DataFrames to CSV and pickle files using .to_csv() and .to_pickle().
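The pandas operations listed above might be combined as follows. The DataFrame and its column names are hypothetical, and the output files go to a temporary folder.

```python
import os
import tempfile

import pandas as pd

# Hypothetical loan counts, invented for this example.
books = pd.DataFrame(
    {
        "branch": ["north", "north", "south", "south"],
        "format": ["print", "ebook", "print", "print"],
        "loans": [12, 30, 7, 21],
    }
)

# Summary statistics on a column.
total_loans = books["loans"].sum()
mean_loans = books["loans"].mean()
branches = books["branch"].unique()      # the distinct values
n_formats = books["format"].nunique()    # how many distinct values

# Work with subsets: total loans per branch.
loans_by_branch = books.groupby("branch")["loans"].sum()

# Sort a Series from largest to smallest.
sorted_loans = books["loans"].sort_values(ascending=False)

# .iloc selects by integer position, .loc by label or condition.
first_row_loans = books.iloc[0, 2]
south_loans = books.loc[books["branch"] == "south", "loans"]

# Save the DataFrame as CSV and as a pickle file.
out_dir = tempfile.mkdtemp()
books.to_csv(os.path.join(out_dir, "books.csv"), index=False)
books.to_pickle(os.path.join(out_dir, "books.pkl"))
```

Note that .loc and .iloc take square brackets because they are indexers rather than ordinary methods.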
In tidy data each variable forms a column, each observation forms a
row, and each type of observational unit forms a table.
Reshaping data with pandas is a fundamental step in preparing data
for analysis.
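One common way to reshape a wide table into tidy form is pandas' melt method, sketched below with invented data. This is an illustration of the tidy-data idea, not necessarily the approach used in the lesson itself.

```python
import pandas as pd

# Hypothetical wide table: one column per year, so "year" is not yet a variable.
wide = pd.DataFrame(
    {
        "branch": ["north", "south"],
        "2021": [10, 20],
        "2022": [15, 25],
    }
)

# melt turns the year columns into rows: one observation per row.
tidy = wide.melt(id_vars="branch", var_name="year", value_name="visits")
print(tidy)
```

After reshaping, each variable (branch, year, visits) has its own column and each row is a single observation.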
Python supports a large community within and outwith research.
Follow standard Python style (PEP 8) in your code.