Optimizing Python (Aggregate functions)

Introduction

Another way to speed up our Python code is to reduce the number of times a function is called. This is easy to do if we have a function that is to be applied to each member of a list. Our first instinct in that situation might be to use a loop or list comprehension to call the function on each element in the list.

Consider Sample 1, which converts a list of Cartesian-coordinate (x, y) tuples to polar (r, Θ) tuples. (If you have not yet gone through the function alias lesson, you should do so, and complete that lesson’s exercise before looking at the samples in this lesson.)

In Sample 1, lines 3 through 6 define the cartesianToPolar() function, which takes a Cartesian tuple and returns an equivalent polar tuple. Note that the optimization tips we have covered so far do not help us in this function – there is no list to use comprehension on, and using aliases won’t help, since there is only one lookup of each math module function required.

Line 8 of the code creates a list of random Cartesian points, and line 10 uses list comprehension to apply the cartesianToPolar() function to each, creating a new list called polars.

1
2
3
4
5
6
7
8
9
10
# Sample 1 - item-wise Cartesian to polar coordinates
# n contains number of coordinates
def cartesianToPolar(point):
    theta = math.atan(point[1] / point[0])
    r = math.sqrt(point[0] ** 2 + point[1] ** 2)
    return (theta, r)

euclideans = [(random.random(), random.random()) for i in range(n)]

polars = [cartesianToPolar(p) for p in euclideans]

Apply functions to the list versus each item

The primary optimization tip here is to make the list the parameter of the function, rather than an item in the list. That way, we call the function one time, versus calling it n times. Sample 2 shows this tip in practice; the cartesianToPolar() function now takes a list of tuples as its parameter, instead of a single point. The function uses list comprehension to built the list of polar coordinates to return, applying the same conversion formula as Sample 1.

1
2
3
4
5
6
7
8
# Sample 2 - list-wise Cartesian to polar coordinates
# n contains number of coordinates
def cartesianToPolar(points):
    return [(math.atan(point[1] / point[0]), math.sqrt(point[0] ** 2 + point[1] ** 2))
        for point in points]

euclideans = [(random.random(), random.random()) for i in range(n)]
polars = cartesianToPolar(euclideans)

Incorporating previous lessons

You may have noticed a way that we can improve the performance of the list comprehension in lines 4 and 5 of Sample 2: aliases. Each time a new polar coordinate tuple is created for the return list, the Python interpreter must look in to the math module to find the atan() and sqrt() functions. We can improve our performance by creating local names for these functions, thus looking each up only once, instead of n times. This idea is applied in Sample 3 below.

1
2
3
4
5
6
7
8
9
10
# Sample 3 - list-wise Cartesian to polar coords, w/ aliases
# n contains number of coordinates
def cartesianToPolar(points):
    at = math.atan
    sqt = math.sqrt
    return [(at(point[1] / point[0]), sqt(point[0] ** 2 + point[1] ** 2))
        for point in points]

euclideans = [(random.random(), random.random()) for i in range(n)]
polars = cartesianToPolar(euclideans)

In Sample 3, lines 4 and 5 declare local names for math.atan and math.sqrt: at and sqt, respectively. Then the list comprehension in lines 6 and 7 uses these local names to form the return list of polar coordinate tuples. We expect Sample 3 to be faster than Sample 2, which in turn is expect to be faster than Sample 1.

Tip: If you need to apply one of your own functions to all of the values in a list, consider re-designing your function so that you can pass the whole list to the function instead of passing a single list element.

We can go even faster, though.

Ignoring a previous lesson

In the using built-ins lesson, we encouraged programmers to use the built-in capabilities of Python, and / or trusted, well-tested modules and libraries, to improve the performance of their code. Sometimes, however, we can improve the speed of our code by breaking that “rule.”

In the conversion formula in line 6 of Sample 3, the code is computing the r value for the polar coordinate by calculating this formula:

Polar coordinates r

In Sample 3, we are taking advantage of the built-in Python exponentiation operator, **. The problem is that ** is a powerful, general-purpose operator that is overkill for the problem at hand. Since computing x² is the same as x × x, we can actually improve our performance by using multiplication instead of the exponentiation operator. This is shown in Sample 4.

1
2
3
4
5
6
7
8
9
10
# Sample 4 - list-wise Cartesian to polar coords, w/ alias, w/o **
# n contains the number of coordinates
def cartesianToPolar(points):
    at = math.atan
    sqt = math.sqrt
    return [(at(point[1] / point[0]), sqt(point[0] * point[0] + point[1] * point[1]))
        for point in points]

euclideans = [(random.random(), random.random()) for i in range(n)]
polars = cartesianToPolar(euclideans)

Line 6 of Sample 4 has replaced exponentiation with multiplication. Now, let us take a look at the performance of these four versions of the same code. The following graph shows the time taken by each of the four samples for a variety of list sizes, and the speedup comparing Sample 1 with Sample 4. We can see from the graph that applying the function in aggregate, using aliases, and using multiplication instead of exponentiation is approximately 1.5 times as fast as the naive code in Sample 1.

Aggregate function speedup

Tip: Calculating the square of a value via multiplication is faster than using the exponentiation operator to accomplish the same task.

Exercise

In machine learning, one task we frequently have to perform is to clean data, by removing data points that, for one reason or another, cannot be input to our machine learning model. This exercise is a simple taste of data cleaning. The code shown below creates a list of numbers which are supposed to be in the range [0, 1). However, a certain percentage of the values are invalid. The validateElement() function determines if a given value is valid or not. Then, in the main program, line 60 creates a new list that contains only valid data.

Your task in this exercise is to replace validateElement() with a function that works on the entire list rather than elements of the list, and then to modify the main program to use your new function.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
'''
Aggregate function exercise. 
'''

import random

def buildData(n, defectRate):
    '''
    Build a list of sample data in the range [0, 1).
    A certain percentage of the elements will be outside
    the range; these represent invalid data points.

    parameters
    ----------
        n - number of list elements
        defectRate - percent of elements that are invalid

    returns
    -------
        List of sample data with some invalid values. 
    '''
    values = []
    rand = random.random
    for i in range(n):
        if rand() <= defectRate:
            if rand() <= 0.5:
                values.append(-rand())
            else:
                values.append(1.0 + rand())
        else:
            values.append(rand())
    return values

# TODO: replace this with a function that operates on
# the aggregate instead of individual elements
def validateElement(element):
    '''
    Determine if a single data point is valid.

    parameter
    ---------
        element - number supposed to be in [0, 1)

    returns 
    -------
        True if element is valid, False if it is not
        in the specified range
    '''
    return element >= 0.0 and element < 1.0

# main program
n = 10_000_000
defectRate = 0.01

# build sample data
originals = buildData(n, defectRate)

# TODO: replace this with a call to your new function
# filter data
filteredData = [x for x in originals if validateElement(x)]

Speedup

We refer to speedup as a way to compare our optimized code with its original, slower counterpart. Speedup is a measure of how well we have improved the runtime of our code. If T₀ is the time taken by the original code, and T₁ is the time taken by the improved code, then speedup is defined as

Speedup

A speedup of 2 would mean that our optimized code is twice as fast as the original, while any value under 1 would mean that we actually made our code slower!

All performance figures on this page were obtained on a Windows 10 PC with an Intel® Core™ i5-9600K CPU @ 3.70GHz and 32GB of RAM, in Python 3.6.9 running in an Ubuntu Windows Subsystem for Linux environment.