Python#

This first chapter outlines the basic ways to work in Python. If you’re familiar with other programming languages, you’ll likely spot some overlap.

See also

For more on anything you see here, you can check out the Python documentation.

Variables#

Use the = sign to store anything inside a variable.

myvar = 5
print(myvar)
5

Use descriptive variable names. Names can’t contain spaces, so you can try styles like:

snake_case
camelCase
OrEven_mix_it_up

Variables have distinct types, and you can find out a variable’s data type with the type() function.

type(myvar)
int
stringvar = "five"
type(stringvar)
str
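As a small sketch of how types interact, the example below converts values between types with the built-in int(), str(), and float() functions. The variable names are made up for this illustration. Note that print(type(...)) shows the type as <class 'int'>, while a notebook cell displays it as just int.

```python
# Hypothetical variable names, chosen only for this illustration
count = 5
label = "5"

# Values that look alike can still have different types
print(type(count))  # <class 'int'>
print(type(label))  # <class 'str'>

# Built-in functions convert a value from one type to another
as_number = int(label)   # string to integer
as_text = str(count)     # integer to string
as_float = float(count)  # integer to floating-point number

print(as_number + count)  # 10
```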

Comments#

Use the # sign at the start of a line to create a comment.

# This variable contains a continuous value
some_variable = 2.5

You can use comments to keep track of what you’re using, leave a note for someone working after you, and especially to create a citation if the code you’re using isn’t your own.

Data Structures#

Python contains several useful data structures. These overlap with the data structures available in other languages, but they often have different names.

Lists#

Lists are analogous to arrays in other languages. They use square brackets to hold an ordered sequence of items.

mylist = [5,6,7]

secondlist = ["cat","dog","fish"]

print(mylist)
[5, 6, 7]

In a list, you can easily access items using bracket notation. Putting a single index in the brackets is called “indexing,” and putting a range of indices in the brackets is called “list slicing.”

# Get the first item in a list, at index 0
mylist[0]
5
# Get several items in a list, with a range of indices (the end index is not included)
secondlist[1:3]
['dog', 'fish']
# Get items from the end of a list with negative values
mylist[-1]
7
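A few more slicing patterns, sketched with a hypothetical list called letters: you can leave out the start or the end of the range, and you can add a third number as a step.

```python
# A hypothetical list used only for this illustration
letters = ["a", "b", "c", "d", "e"]

# Leave out the start to begin at index 0
first_two = letters[:2]     # ['a', 'b']

# Leave out the end to continue to the last item
from_third = letters[2:]    # ['c', 'd', 'e']

# Negative indices count back from the end
last_two = letters[-2:]     # ['d', 'e']

# A third number sets the step between items
every_other = letters[::2]  # ['a', 'c', 'e']

print(first_two, from_third, last_two, every_other)
```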

You can also create a list of numbers (starting at 0) with the range() function:

list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Note

Python also has tuples, which are a lot like lists but use parentheses instead of brackets, e.g. (1, 2, 3). Unlike lists, tuples can’t be changed after they’re created. They’re often used for small groups of related items, such as pairs.
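Here is a minimal sketch of two things tuples are commonly used for: unpacking a pair into separate variables, and holding values that shouldn’t change. The names point, x, and y are illustrative only.

```python
# A hypothetical two-item tuple
point = (3, 4)

# "Unpack" the tuple into two separate variables
x, y = point
print(x, y)  # 3 4

# Tuples can't be modified in place; trying to do so raises a TypeError
try:
    point[0] = 10
except TypeError as error:
    message = str(error)
    print(message)
```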

Sets#

Sets are like lists, but they contain only unique items and don’t keep those items in any particular order. They use curly braces instead of square brackets.

myset = {1, 2, 3, 4}
print(myset)
{1, 2, 3, 4}

You can turn a list into a set to get only the unique items in that list:

repeating_list = [1, 2, 1, 3, 3, 5, 4, 6, 4, 1]
set(repeating_list)
{1, 2, 3, 4, 5, 6}

Sets have special operators to combine sets and find intersections.

set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Items in either set
set1 | set2
{1, 2, 3, 4, 5}
# Items in both sets
set1 & set2
{3}
# Items in the first set but not the second
set1 - set2
{1, 2}
# Items in either set but not both
set1 ^ set2
{1, 2, 4, 5}
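Each of these operators also has an equivalent, more descriptive set method. The sketch below repeats the operations above with those methods and adds a membership test using the in keyword.

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Method equivalents of the |, &, -, and ^ operators
combined = set1.union(set2)                    # same as set1 | set2
shared = set1.intersection(set2)               # same as set1 & set2
only_in_first = set1.difference(set2)          # same as set1 - set2
only_in_second = set2.difference(set1)         # order matters for difference
in_one_only = set1.symmetric_difference(set2)  # same as set1 ^ set2

# Check whether a single item is in a set
has_three = 3 in set1

print(combined, shared, only_in_first, only_in_second, in_one_only, has_three)
```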

Dictionaries#

Dictionaries are analogous to objects or hash maps in other languages. They contain key-value pairs, and like sets they’re surrounded by curly braces.

mydictionary = {"pet_name": "Fido", "age": 5, "pet_type": "dog"}
print(mydictionary)
{'pet_name': 'Fido', 'age': 5, 'pet_type': 'dog'}

You can access items in a dictionary using bracket notation, similar to list indexing but with a key instead of a numeric position.

mydictionary["pet_name"]
'Fido'
mydictionary["age"]
5
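Bracket notation also works for adding and updating items. The sketch below uses a hypothetical dictionary named pet, along with the .get() method, which returns a fallback value instead of raising an error when a key is missing.

```python
# A hypothetical dictionary used only for this illustration
pet = {"pet_name": "Fido", "age": 5}

# Assigning to a new key adds a key-value pair
pet["pet_type"] = "dog"

# Assigning to an existing key replaces its value
pet["age"] = 6

# .get() returns a default when the key doesn't exist
owner = pet.get("owner", "unknown")

# The `in` keyword checks whether a key exists
has_name = "pet_name" in pet

print(pet, owner, has_name)
```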

If you have two lists (or other ordered sequences) of the same length, you can use zip() to combine them as the keys and values of a new dictionary.

Note

The zip function returns an iterator (see below), so you need to convert it to a dictionary with the dict() function.

# Keep in mind that dictionary keys must be unique values

zipped_dictionary = zip(secondlist, mylist)
dict(zipped_dictionary)
{'cat': 5, 'dog': 6, 'fish': 7}

Generators/Iterators#

Some functions and methods in Python return iterators, generators, or other special objects instead of a familiar data type like a list. These objects often need to be converted to a specific data type before you can use them the way you’d expect.

For example, the dictionary method .values() returns the values of a dictionary as a special “view” object rather than a list:

mydictionary.values()
dict_values(['Fido', 5, 'dog'])

We can see those values above, but we can’t access them in the way we would a list. The code below will throw an error:

mydictionary.values()[1]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[21], line 1
----> 1 mydictionary.values()[1]

TypeError: 'dict_values' object is not subscriptable

To make this work, we need to convert the result of .values() into a list:

dict_values = mydictionary.values()
dict_values = list(dict_values)

dict_values[1]
5
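Another reason to convert is that the object zip() returns can be consumed only once. The sketch below, which uses illustrative lists named animals and ages, shows that a second pass over the same zip object comes back empty.

```python
# Hypothetical lists used only for this illustration
animals = ["cat", "dog", "fish"]
ages = [5, 6, 7]

# zip() pairs the items up lazily, producing them only when asked
pairs = zip(animals, ages)

# The first conversion uses up every pair
first_pass = list(pairs)

# Nothing is left for a second conversion
second_pass = list(pairs)

print(first_pass)   # [('cat', 5), ('dog', 6), ('fish', 7)]
print(second_pass)  # []
```

If you need the values more than once, store the converted list (or dictionary) in a variable and reuse that.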

Manipulating Data Structures#

Once you know how to create and access data structures, you can use some basic Python concepts to manipulate them.

Loops#

The statements for and in let you loop through a list. This is called, straightforwardly, a “for loop.”

# The variable name after the word `for` is a new name you're creating
for x in mylist:
    print(x)
5
6
7

You can do anything you want inside a for loop to manipulate each item:

for x in mylist:
    y = x*5
    print(y)
25
30
35

If your goal is to create a new list with some altered items, you can put a for loop into brackets and write it all on a single line. This is a special Python concept called a “list comprehension.” List comprehensions keep your code concise and easy to read. The code below does the same thing as above, but puts the items into a brand new list:

newlist = [x*5 for x in mylist]
newlist
[25, 30, 35]
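A list comprehension can also filter items by adding an if clause at the end. Here is a minimal sketch with a hypothetical list called numbers, shown next to the equivalent for loop.

```python
# A hypothetical list used only for this illustration
numbers = [5, 6, 7, 8]

# Keep only the even numbers, then multiply each by 5
evens_times_five = [x * 5 for x in numbers if x % 2 == 0]

# The same logic written as a regular for loop
loop_result = []
for x in numbers:
    if x % 2 == 0:
        loop_result.append(x * 5)

print(evens_times_five)  # [30, 40]
print(loop_result)       # [30, 40]
```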

For loops aren’t just for lists! You can use for loops with dictionaries too. Looping over a dictionary directly gives you only its keys, so use the .items() method to get each key and value together:

for k,v in mydictionary.items():
    print(k,v)
pet_name Fido
age 5
pet_type dog
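For comparison, here is a short sketch of the other ways to loop over a dictionary: directly (which yields the keys) and with .values(). The dictionary name pet is hypothetical.

```python
# A hypothetical dictionary used only for this illustration
pet = {"pet_name": "Fido", "age": 5, "pet_type": "dog"}

# Looping over the dictionary itself yields each key
keys_seen = []
for key in pet:
    keys_seen.append(key)

# Looping over .values() yields each value
values_seen = []
for value in pet.values():
    values_seen.append(value)

print(keys_seen)    # ['pet_name', 'age', 'pet_type']
print(values_seen)  # ['Fido', 5, 'dog']
```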

Conditions#

Commonly used inside loops, conditions let you evaluate true or false statements. They will perform one action if a statement is true and a different action if the statement is false (else).

Conditions typically use operators to compare things to one another. These operators include:

  • == is equal to

  • != is not equal to

  • > is greater than

  • >= is greater than or equal to

  • < is less than

  • <= is less than or equal to

for x in mylist:
    if x == 7:
        print("Hooray!")
    else:
        print("Hip")
Hip
Hip
Hooray!

You can also use elif to create a chain of multiple conditions.

for x in [5, 6, 7, 8]:
    if x/2 == 3 or x/2 == 4:
        print(x*10)
    elif x < 6:
        print("small and odd")
    else:
        print("This is seven.")
small and odd
60
This is seven.
80
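The or keyword used above has a counterpart, and, which requires both comparisons to be true. The sketch below combines and with an elif chain; the list of numbers and the labels are made up for illustration.

```python
labels = []

for x in [-2, 0, 3, 4]:
    # `and` is true only when both comparisons are true
    if x > 0 and x % 2 == 0:
        labels.append("positive and even")
    # Checked only if the condition above was false
    elif x > 0:
        labels.append("positive and odd")
    elif x == 0:
        labels.append("zero")
    # Runs when none of the conditions above were true
    else:
        labels.append("negative")

print(labels)
# ['negative', 'zero', 'positive and odd', 'positive and even']
```

Python checks the conditions from top to bottom and runs only the first branch that matches, so the order of the elif statements matters.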

Functions#

Functions are simply reusable bits of code. In Python, they’re generally a word followed by parentheses. Inside the parentheses are arguments: bits of data or information that the function needs to work. You’ve used functions already: print(), range(), and type() are all built-in Python functions:

type(mydictionary)
dict

You can easily create your own functions in Python using the def statement. You give your function a name and some arguments, and then you tell it what to do. Most functions also hand back a result with the return statement; a function without one returns None.

# As in a for loop, the `arg` variable is one you're creating on the spot.
def myFunction(arg):
    x = arg*10
    return x

Once you create a function, it will be stored in memory for you to use later.

myFunction(5)
50
myFunction(8)
80
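Functions can take more than one argument, and an argument can have a default value that is used when the caller leaves it out. The sketch below uses hypothetical functions named multiply and greet to show a default argument and a function with no return statement.

```python
# `factor` has a default value, so callers may leave it out
def multiply(value, factor=10):
    return value * factor

# This function prints something but has no return statement
def greet(name):
    print("Hello, " + name)

default_result = multiply(5)           # uses factor=10
custom_result = multiply(5, factor=3)  # overrides the default
nothing = greet("Fido")                # prints a greeting, returns None

print(default_result, custom_result, nothing)  # 50 15 None
```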

Note

Methods are a lot like functions, but they are attached to an object with a period. For example, in mydictionary.values(), .values() is a method that gets the dictionary’s values.

Libraries#

Sometimes you will write functions yourself, and often you will use functions that others have written. The Python community is very large, full of people who’ve made helpful code that you can reuse. They’ve packaged this code into libraries. Libraries include sets of methods and functions that you can use. In the rest of this book, we’ll focus on three large data analysis libraries: pandas, altair, and networkx.

To import a library, you can simply use the import statement. If you like, you can also abbreviate the name of the library using as.

import pandas as pd

Generally, importing libraries is the first thing you do, at the very top of your code. Once you’ve imported a library, you can use any of its functions by putting the library’s name (or abbreviation) and a period in front of the function name.

# More on what this function does in the next chapter!
pd.Series(mylist)
0    5
1    6
2    7
dtype: int64

You can also import just small parts of very large libraries. For example, networkx has bipartite functions that you can import like so:

from networkx.algorithms import bipartite

This lets you refer to the imported part by a shorter name, which is handy when you only need one or two functions.
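The two import styles can be tried side by side without installing anything. The sketch below uses two modules from Python’s standard library, statistics and math, purely as stand-ins; they are not among the libraries this book focuses on, and the abbreviation stats is just a choice made here.

```python
# Import a whole module under a shorter name
import statistics as stats

# Import a single function from a module
from math import sqrt

values = [5, 6, 7]

# Functions from an abbreviated module need the abbreviation as a prefix
average = stats.mean(values)

# A directly imported function can be called by its own name
root = sqrt(16)

print(average)  # 6
print(root)     # 4.0
```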


That’s it for Python basics! In the next chapters, we’ll see how to use Python to manipulate tabular data.