Writing readable code

We have now covered enough Python for us to talk a little bit about readability.

So far, we’ve not thought about how to write “good” code, just code that works. If you want other people to be able to read your code then you need to think a little bit about how it’s presented and how other people might read it.

An example “code makeover”

Take the following piece of code which calculates “something”.

values = [10, 20, 30, 40, 50]
result_1 = 2 * 8.314 * 298.15 / values[0]
result_2 = 2 * 8.314 * 298.15 / values[1]
result_3 = 2 * 8.314 * 298.15 / values[2]
result_4 = 2 * 8.314 * 298.15 / values[3]
result_5 = 2 * 8.314 * 298.15 / values[4]

Clearly this is doing some multiplication and division, but why?

In fact, this calculates the pressure of two moles of an ideal gas at ambient temperature for volumes of \(10\), \(20\), \(30\), \(40\), and \(50\, \mathrm{m^3}\) using

\[ pV = nRT \, . \]
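Rearranging for the pressure shows what each line of the code evaluates. As a quick check, taking \(R = 8.314\, \mathrm{J\,K^{-1}\,mol^{-1}}\) (the value hard-coded above), the first line works out as

\[ p = \frac{nRT}{V} = \frac{2 \times 8.314 \times 298.15}{10} \approx 495.8\, \mathrm{Pa} \, . \]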

Clearly this code should be modified to actually tell us this, so let’s improve it.

The first thing we can do is use variables to store the number of moles, ideal gas constant, and temperature. This avoids repetition, and so is a less error-prone approach to programming.

from scipy import constants

n = 2
T = 298.15

values = [10, 20, 30, 40, 50]

result_1 = n * constants.R * T / values[0]
result_2 = n * constants.R * T / values[1]
result_3 = n * constants.R * T / values[2]
result_4 = n * constants.R * T / values[3]
result_5 = n * constants.R * T / values[4]

This is better, but the variable names result_N and values can be replaced with something chemically meaningful - perhaps pressure and volume! To do this, we can make an empty list for the pressures, and then add each computed value to it.

Tip

Always use variable names which are meaningful and actually relate to the quantity that those variables are storing!

from scipy import constants

n = 2
T = 298.15

volumes = [10, 20, 30, 40, 50]

pressures = []

pressures.append(n * constants.R * T / volumes[0])
pressures.append(n * constants.R * T / volumes[1])
pressures.append(n * constants.R * T / volumes[2])
pressures.append(n * constants.R * T / volumes[3])
pressures.append(n * constants.R * T / volumes[4])

It should be obvious that this is much clearer than the code we started with. The final improvement we can make is to add comments - text which is included in the code but ignored by Python.

Syntax

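Python’s comment syntax uses a hash symbol #. Text enclosed in a pair of triple quotes ''' is strictly a string rather than a comment, but if it is not assigned to a variable it has no effect on the rest of the code, so it is often used in the same way. For example

# Everything after the hash is ignored by Python!
'''
Everything
between these triple quotes 
is ignored as well
'''
print('Look, the comments disappear when this code is executed')

Look, the comments disappear when this code is executed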

Typically we use triple quotes either at the top of a file/cell, or when defining a function (which we’ll learn about in the next session). Otherwise a hash is used for a single line comment.
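A hash comment does not need a line to itself. The short sketch below (with variable names and values chosen purely for illustration) shows a comment placed after code on the same line, and that a hash inside a string is treated as part of the string rather than as a comment.

```python
# A comment on its own line describes the code that follows
temperature = 298.15  # a comment can also follow code on the same line

# A hash inside quotes is part of the string, not a comment
message = 'A # inside a string is not a comment'
print(message)
```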

Tip

You should always write comments at the same time as you write your code - this is much easier than coming back later and trying to remember what the code does!

Adding comments to our ideal gas code results in the following

from scipy import constants

# Number of moles
n = 2

# Temperature in K
T = 298.15

# Volumes in m^3
volumes = [10, 20, 30, 40, 50]

# Empty list of pressures
pressures = []

# Calculate pressure (Pa) at each volume
# using the ideal gas equation
pressures.append(n * constants.R * T / volumes[0])
pressures.append(n * constants.R * T / volumes[1])
pressures.append(n * constants.R * T / volumes[2])
pressures.append(n * constants.R * T / volumes[3])
pressures.append(n * constants.R * T / volumes[4])

In future sessions we’ll see how to make this code shorter and even more readable using some other features of Python, but this is the level of clarity you should aspire to when writing your code.
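As a brief preview only, one way the repeated lines could be shortened is with a for loop, which has not been introduced yet. This is a minimal sketch rather than the version developed in later sessions: the gas constant is written out as R = 8.314 (the value from the original code) so that the example runs without SciPy, and the temperature is the 298.15 K used in the original code.

```python
# Ideal gas constant in J K^-1 mol^-1 (value taken from the original code;
# scipy.constants.R could be used instead)
R = 8.314

# Number of moles
n = 2

# Temperature in K
T = 298.15

# Volumes in m^3
volumes = [10, 20, 30, 40, 50]

# Empty list of pressures
pressures = []

# Calculate pressure (Pa) at each volume using the ideal gas equation
for volume in volumes:
    pressures.append(n * R * T / volume)

print(pressures)
```

The comments and variable names are unchanged from the longer version; only the five near-identical lines have been replaced.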

Basic code styling

The example above contains a number of stylistic choices which we haven’t discussed so far, but which are still important for maintaining readability.

Whitespace

As you’ve probably gathered by now, Python is relatively relaxed when it comes to whitespace. For example, we can type basic addition like this

5 + 7

or like this

5+7

This same principle applies to much of Python’s basic syntax, such as variable assignment

a = 5
a=5

or when creating a list

elemental_symbols = ['Li', 'Na', 'K', 'Rb', 'Cs', 'Fr']
elemental_symbols = ['Li','Na','K','Rb','Cs','Fr']

… or a dictionary

regular_shape_edges = {3 : 'Triangle', 4 : 'Square', 5 : 'Pentagon' , 6 : 'Hexagon' , 7 : 'Heptagon' , 8 : 'Octagon'}
regular_shape_edges = {3:'Triangle',4:'Square',5:'Pentagon',6:'Hexagon',7:'Heptagon',8:'Octagon'}

where we’ve seen before that we can write dictionaries, and all data structures for that matter, on more than one line

regular_shape_edges = {
    3 : 'Triangle',
    4 : 'Square',
    5 : 'Pentagon',
    6 : 'Hexagon',
    7 : 'Heptagon',
    8 : 'Octagon'
}

Clearly, adding a bit of whitespace and sometimes writing your code over several lines can really improve its readability.
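To confirm that these layout choices are purely cosmetic, the sketch below compares a compact dictionary with the same dictionary spread over several lines. It also splits a simple sum across lines inside parentheses. The names compact, spaced and total_edges are made up for this illustration.

```python
# The same dictionary written compactly and spread over several lines
compact = {3:'Triangle',4:'Square',5:'Pentagon'}
spaced = {
    3: 'Triangle',
    4: 'Square',
    5: 'Pentagon',
}

# Whitespace and line breaks do not change the contents
print(compact == spaced)

# A long calculation can also be split across lines inside parentheses
total_edges = (
    3
    + 4
    + 5
)
print(total_edges)
```

This prints True followed by 12: Python sees exactly the same objects however the code is laid out.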

We can also use whitespace to block out our code, for example

import math

numbers = [1, 2, 3, 4, 5, 6]
letters = ['a', 'b', 'c', 'd', 'e', 'f']

sqrt_2 = math.sqrt(numbers[1])
len_letters = len(letters)

print(sqrt_2)
print(len_letters)

Variable names

We touched on this above (and in Session 1) but variable names must accurately describe the purpose or meaning of a variable. Think of the following rules when choosing a variable name

Avoid single letters unless there is a good reason to use one - e.g. K is appropriate for an equilibrium constant, but j for the root-mean-squared displacement of a gas is not.
Be explicit - e.g. element_symbols is better than es - it clearly tells you what the variable contains.
Avoid UPPER CASE unless there is a natural reason to use it. Much like single letter names there are some situations where it makes sense e.g. delta_G, but in general you should avoid it.
Only use characters you can type with a keyboard - don’t use symbols like Δ as variable names.
Separate words with an underscore _. Spaces and dashes do not work!
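The rules above can be illustrated with a short sketch. All values here are hypothetical and chosen only for illustration. The last three lines use the string method isidentifier() to check whether some text would be accepted by Python as a variable name.

```python
# Harder to read: the name says nothing about what is stored
es = ['Li', 'Na', 'K']

# Easier to read: the name describes the contents
element_symbols = ['Li', 'Na', 'K']

# A single upper-case letter is reasonable for a conventional symbol
K = 4.5            # hypothetical equilibrium constant
delta_G = -3700.0  # hypothetical Gibbs energy change in J mol^-1

# Underscores make a valid name; spaces and dashes do not
print('element_symbols'.isidentifier())
print('element symbols'.isidentifier())
print('element-symbols'.isidentifier())
```

This prints True, False, False: only the name with an underscore is valid.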

Documentation

We’ve already seen that we can use comments to document our code and explain its function. Since you’re writing your code in a Jupyter notebook, you can also use a Markdown Cell to add plain text and equations outside of your code.

You can create a Markdown cell in a Jupyter notebook following the guidance available here (see Switching cell type). Markdown is a markup language which can be used to create nicely formatted text, equations, and tables.

Tip

Use comments when writing small messages or definitions within the code - particularly for parts of the code that are otherwise hard to interpret.

Use Markdown Cells when writing a longer form explanation - perhaps the science behind your work, your motivations, and when stating and analysing results.

Going forward

Now that you understand how to write readable code, and why it is important, you should put these ideas into practice as you move forward.