Optimising Python

There are a number of optimisations and shortcuts you can take to speed up code; here we focus on Python specifically.

Analysis

The first step in speeding up code is profiling it, so you know where the time is actually being spent. This can be done in a number of ways: the standard library's cProfile module gives a per-function breakdown, and in Jupyter the %timeit magic times individual statements.
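As a minimal sketch, using cProfile (the function here is just a stand-in for real code):

import cProfile

def slow_function():
    # deliberately wasteful work, standing in for the code being profiled
    return sum(i**2 for i in range(100_000))

# prints how many times each function was called and where the time went
cProfile.run("slow_function()")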

Numpy

When working with arrays or lists and performing operations on them, it is recommended to use numpy broadcasting instead of for loops. This is because numpy's operations are implemented in optimised, compiled C code, which executes far faster than an interpreted Python loop.

For example, compare the two cells below (assuming numpy has already been imported as np).

%%timeit
x = np.arange(1, 5+1)
for i in range(len(x)):
    x[i] = x[i]**2

# > 4.45 µs ± 972 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

%%timeit
x = np.arange(1, 5+1)
x = np.power(x, 2)  # broadcast: squares every element in one call

# > 2.29 µs ± 350 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

As you can see, the vectorised version takes roughly half the time even for a five-element array, and the gap widens considerably as arrays grow.

However, note that numpy arrays are not always better than lists: iteratively appending items is much faster with native Python lists, because np.append has to copy the entire array on every call.
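As a minimal sketch of the difference, compare these two cells (timing results omitted; the asymptotics are the point):

%%timeit
out = []
for i in range(10_000):
    out.append(i)            # amortised O(1): the list grows in place

%%timeit
out = np.array([], dtype=int)
for i in range(10_000):
    out = np.append(out, i)  # O(n): copies the whole array each iteration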

Cython

Cython is an extension of Python that is compiled into optimised C or C++ code, which can provide a significant speedup.

The steps are as follows:

  1. Ensure the code is type annotated
  2. Save it as a .pyx file
  3. The .pyx file is translated into C
  4. The C is compiled into a shared library, which Python can load

The cythonize shell command performs steps 3-4 for you. If you're working in a Jupyter notebook, the %%cython cell magic (available after running %load_ext Cython) does the same.
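For example, assuming the annotated code lives in a file called fast_module.pyx (the filename is hypothetical):

$ cythonize -i fast_module.pyx   # translates to C and compiles a shared library in place

# the compiled extension is then imported like any other module
import fast_module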

The type annotations must be done using C types, not Python types!

To type variables, use cdef, for example

cdef double complex value # value is the variable name

To type function parameters and return values, use cpdef, which produces a function callable from both C and Python:

cpdef double square(double x=5):
    return x*x

It’s also possible to use numpy in Cython. You need to import twice:

import numpy as np
cimport numpy as np

Then, use numpy as normal, with a couple of alterations: most importantly, declare the element type and number of dimensions of array arguments, so that Cython can generate fast C-level indexing.
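A minimal sketch of a typed-array function (the function and variable names are illustrative):

import numpy as np
cimport numpy as np

cpdef double sum_squares(np.ndarray[np.double_t, ndim=1] arr):
    cdef double total = 0.0
    cdef Py_ssize_t i
    for i in range(arr.shape[0]):
        total += arr[i] * arr[i]  # plain C indexing, no Python overhead
    return total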

Numba

An alternative to all this rewriting is numba, a just-in-time (JIT) compiler that translates a subset of Python and numpy code into machine code.

To use it, import numba. The @numba.njit decorator can be applied to a function to speed it up. However, the first call will be slow, as that is when the function gets compiled.
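A minimal sketch (the summing function is just an illustration):

import numba
import numpy as np

@numba.njit
def total(x):
    s = 0.0
    for i in range(x.shape[0]):
        s += x[i]
    return s

x = np.random.rand(1_000_000)
total(x)  # first call: slow, triggers compilation
total(x)  # subsequent calls: run the compiled machine code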

numba also supports parallelisation. For example, to operate on all elements of an array independently, we need to do two things: pass parallel=True to the decorator, and use numba.prange instead of range for the loop to be parallelised, as in the sketch below.
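A minimal sketch (the function name is illustrative):

import numba
import numpy as np

@numba.njit(parallel=True)              # 1. enable automatic parallelisation
def square_all(x):
    out = np.empty_like(x)
    for i in numba.prange(x.shape[0]):  # 2. prange marks this loop as parallel
        out[i] = x[i] ** 2
    return out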

Data Analysis

If .csv files get too large, they can be stored far more compactly using Parquet, a compressed, column-oriented file format.

Parquet files are column-based rather than row-based, and so are better suited to column-wise operations.

You can achieve a significant speedup just with this change, for example:

import pandas as pd

df = pd.DataFrame(...)

df.to_parquet("data.parquet")

df = pd.read_parquet("data.parquet", engine="pyarrow")

# column operations are now much faster

Note that this requires the pyarrow package, which can be installed first with pip install pyarrow.
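Because Parquet stores each column contiguously, you can also load just the columns you need rather than the whole file (the column names here are hypothetical):

# reads only the two named columns from disk
df = pd.read_parquet("data.parquet", columns=["price", "volume"])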

Inbuilts

functools.reduce is a useful function that applies a two-argument function cumulatively over a list, collapsing it to a single value.

For example:

xvals = [1,2,3,4,5,4,3,2,1]
max_val = 0
for i in xvals:
    if i>max_val:
        max_val = i

is equivalent (for non-negative values, since max_val starts at 0) to

from functools import reduce
reduce(max, xvals)
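In this particular case max(xvals) is the idiomatic one-liner; reduce earns its keep when there is no dedicated built-in for the accumulation. For example, a running product with an explicit starting value (the final argument, 1, is the initialiser):

from functools import reduce

xvals = [1, 2, 3, 4, 5, 4, 3, 2, 1]
product = reduce(lambda acc, x: acc * x, xvals, 1)  # ((((1*1)*2)*3)*...) = 2880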