Easy performance optimizations in Python

September 16, 2019

Performance is probably not the first thing that comes to mind when you think about Python. Nor is it critical in a typical I/O-intensive application, where most CPU cycles are spent waiting. But if a few small fixes can give your program a tiny performance boost, that doesn't hurt, right?

Here are three easy fixes that can give your Python code that little extra speed it deserves.

1. Using {} instead of dict to initialize a dictionary

When initializing a new, empty dictionary, using {} is considerably faster than calling the dict built-in (about 3.6 times faster in the measurement below).

 
$ python3 -m timeit "x = dict()"
2000000 loops, best of 5: 111 nsec per loop

$ python3 -m timeit "x = {}"
10000000 loops, best of 5: 30.7 nsec per loop

To see why, let's look at the bytecode representation of both the statements.

 
>>> import dis
>>> dis.dis("x = dict()")
  1           0 LOAD_NAME                0 (dict)
              2 CALL_FUNCTION            0
              4 STORE_NAME               1 (x)
              6 LOAD_CONST               0 (None)
              8 RETURN_VALUE

 
>>> dis.dis("x = {}")
  1           0 BUILD_MAP                0
              2 STORE_NAME               0 (x)
              4 LOAD_CONST               0 (None)
              6 RETURN_VALUE

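dict() is slower because the interpreter first has to look up the name dict (which may have been rebound to something else) and then make a function call, only to get back the same empty dictionary that {} builds with a single BUILD_MAP instruction. Hence, any call to dict() with no arguments can be safely replaced with {}.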
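
The name lookup can be observed directly. The following is a minimal sketch, assuming a CPython interpreter: the helper opnames and the namespace dictionary are illustrative names, and the exact opcode names vary between CPython versions. It lists the instructions for both statements and then shows that dict is an ordinary name that can be rebound, which is why the compiler cannot turn dict() into {} on its own.

```python
import dis


def opnames(source):
    # Compile the source string and return the names of its bytecode instructions.
    return [ins.opname for ins in dis.get_instructions(source)]


call_ops = opnames("x = dict()")
literal_ops = opnames("x = {}")
print(call_ops)     # includes a name lookup and a call instruction
print(literal_ops)  # includes BUILD_MAP and no call

# Rebinding the name dict changes what dict() does, but not what {} does.
namespace = {"dict": lambda: "not a dict"}
exec("x = dict()\ny = {}", namespace)
print(namespace["x"])  # not a dict
print(namespace["y"])  # {}
```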

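Take care, however, with a {} (or the result of dict(), which is the same kind of object) that gets passed around to a lot of functions, for example as a default argument. A dictionary is mutable, so every function that receives it can modify the same object. In such cases, you may want to pass the dict callable instead, and call it only inside the function, so that each call gets a fresh dictionary.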
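
One way to read this advice is as a guard against shared mutable state. The sketch below uses two hypothetical functions, add_tag_shared and add_tag_fresh, to contrast a {} default that is shared across calls with a dict callable that is executed inside the function.

```python
def add_tag_shared(tag, tags={}):
    # Anti-pattern: the default {} is created once, when the function is
    # defined, so every call without `tags` mutates the same dictionary.
    tags[tag] = True
    return tags


def add_tag_fresh(tag, factory=dict):
    # The callable is executed inside the function, so each call
    # starts from a new, empty dictionary.
    tags = factory()
    tags[tag] = True
    return tags


first = add_tag_shared("a")
second = add_tag_shared("b")
print(first is second, second)  # True {'a': True, 'b': True}

third = add_tag_fresh("a")
fourth = add_tag_fresh("b")
print(third is fourth, fourth)  # False {'b': True}
```

Passing the callable costs a function call on every invocation, so this trades a little speed for isolation between callers.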

2. Using is instead of == for singleton comparison

When comparing against a singleton object such as True, False, or None, is should be preferred over ==. This is because is compares the identities (IDs) of the two objects directly, and a singleton object's ID never changes during the lifetime of the program.

 
$ python3 -m timeit "x = 1; x == None"
10000000 loops, best of 5: 32.3 nsec per loop

 
$ python3 -m timeit "x = 1; x is None"
10000000 loops, best of 5: 21.2 nsec per loop

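In contrast, == dispatches to the __eq__ method of the objects being compared. In the above example, the int class's __eq__ is invoked. Therefore, even though the time difference in the above example doesn't look like much, it is going to increase for instances of classes with a more complex __eq__ method. A custom __eq__ can also report equality with None for an object that is not None, so is gives the more reliable answer as well as the faster one.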
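
To make the difference concrete, here is a small sketch with a hypothetical class, Loose, whose __eq__ counts its calls and claims equality with everything. The == comparison runs that method, while is never does.

```python
class Loose:
    """Hypothetical class whose __eq__ claims equality with anything."""

    def __init__(self):
        self.eq_calls = 0

    def __eq__(self, other):
        # Count every invocation so the dispatch is visible.
        self.eq_calls += 1
        return True


obj = Loose()
equal_result = obj == None     # runs Loose.__eq__
identity_result = obj is None  # compares identities only

print(equal_result)     # True
print(identity_result)  # False
print(obj.eq_calls)     # 1
```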

3. Avoid unnecessary calls to len()

To check in a condition whether a list is non-empty, doing this is faster:

 
values = [1, 2, 3, 4, 5]
if values:
    pass  # do something

As compared to doing this:

 
values = [1, 2, 3, 4, 5]
if len(values):
    pass  # do something

Or even:

 
values = [1, 2, 3, 4, 5]
if bool(values):
    pass  # do something

Let us time all three variations using a slightly abridged snippet:

 
$ python3 -m timeit "x = [1, 2, 3, 4, 5]; y = 5 if x else 6"
5000000 loops, best of 5: 66.9 nsec per loop

 
$ python3 -m timeit "x = [1, 2, 3, 4, 5]; y = 5 if len(x) else 6"
2000000 loops, best of 5: 109 nsec per loop

 
$ python3 -m timeit "x = [1, 2, 3, 4, 5]; y = 5 if bool(x) else 6"
2000000 loops, best of 5: 149 nsec per loop

bool(x) is the slowest because, on top of the truth test, it pays for a name lookup and a function call, and the truth test itself ends up calling the equivalent of len(x), as there is no __bool__ method defined for list. To see why the second version is slower than the first, let's dig into the bytecode.

 
>>> dis.dis("x = [1, 2, 3, 4, 5]; y = 5 if x else 6")
  1           0 LOAD_CONST               0 (1)
              2 LOAD_CONST               1 (2)
              4 LOAD_CONST               2 (3)
              6 LOAD_CONST               3 (4)
              8 LOAD_CONST               4 (5)
             10 BUILD_LIST               5
             12 STORE_NAME               0 (x)
             14 LOAD_NAME                0 (x)
             16 POP_JUMP_IF_FALSE       22
             18 LOAD_CONST               4 (5)
             20 JUMP_FORWARD             2 (to 24)
        >>   22 LOAD_CONST               5 (6)
        >>   24 STORE_NAME               1 (y)
             26 LOAD_CONST               6 (None)
             28 RETURN_VALUE

 
>>> dis.dis("x = [1, 2, 3, 4, 5]; y = 5 if len(x) else 6")
  1           0 LOAD_CONST               0 (1)
              2 LOAD_CONST               1 (2)
              4 LOAD_CONST               2 (3)
              6 LOAD_CONST               3 (4)
              8 LOAD_CONST               4 (5)
             10 BUILD_LIST               5
             12 STORE_NAME               0 (x)
             14 LOAD_NAME                1 (len)
             16 LOAD_NAME                0 (x)
             18 CALL_FUNCTION            1
             20 POP_JUMP_IF_FALSE       26
             22 LOAD_CONST               4 (5)
             24 JUMP_FORWARD             2 (to 28)
        >>   26 LOAD_CONST               5 (6)
        >>   28 STORE_NAME               2 (y)
             30 LOAD_CONST               6 (None)
             32 RETURN_VALUE


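In the second case, there are two extra instructions (4 more bytes of bytecode): LOAD_NAME, which looks up len, and CALL_FUNCTION, which calls it. POP_JUMP_IF_FALSE tests the truth value of whatever is on top of the stack, and for a list that test already checks the length internally (if you dig deep into the CPython implementation). In the second case, however, the call to len precedes the condition check, so the length is first computed as an integer and that integer is then tested for truth. Hence, it ends up being slower than the first version.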
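
The fallback from truth testing to the length can be observed on a user-defined class. The sketch below uses a hypothetical Bag class that defines __len__ but no __bool__ and counts how often __len__ runs. A plain `if bag` style test calls it once, without any explicit len().

```python
class Bag:
    """Hypothetical container with __len__ but no __bool__."""

    def __init__(self, items):
        self.items = list(items)
        self.len_calls = 0

    def __len__(self):
        # Count every invocation so the implicit call is visible.
        self.len_calls += 1
        return len(self.items)


full = Bag([1, 2, 3])
empty = Bag([])

# No explicit len() here: the truth test falls back to __len__.
full_branch = "truthy" if full else "falsy"
empty_branch = "truthy" if empty else "falsy"

print(full_branch, full.len_calls)    # truthy 1
print(empty_branch, empty.len_calls)  # falsy 1
```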