A while ago, I had to speed up some Python code. I learnt a few things about speeding up Python code that I am sharing here (in no particular order, but a couple of major ones are underlined). Perhaps you will find them useful:
- Profile first. Find where the most time is spent using `time`, `cProfile` or `line_profiler`.
- Replace `Pandas` operations with `Numpy` operations wherever possible.
- While defining placeholder `Numpy` arrays, specify the `dtype`.
- Use `ndarray` instead of `recarray`, structured arrays etc.
- Vectorize the code if possible. This is major.
- Multiple `Numpy` functions can often do the same job. Compare which `Numpy` function gives better performance.
- Sometimes, you might have to write your own lower-level `Numpy` function for better performance rather than use the ready-made library method.
- If sorting is happening somewhere, there is room for good speed gains. Try `np.searchsorted()`.
- Use `Numba`. If you can make it work, it works like magic. Sometimes, you might have to alter your code slightly to make it `Numba` compatible, as only the libraries and features listed in its documentation are supported.
- Put pre-processing in a separate function. Do calculations whose results do not change only once.
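
For the profiling tip, here is a minimal sketch of how `cProfile` could be wrapped around a single call. The functions `slow_sum_of_squares` and `profile_call` are hypothetical names made up for illustration; only `cProfile`, `pstats` and `io` come from the standard library.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately slow pure-Python loop (hypothetical workload)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    # Collect the report in a string instead of printing it directly.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # show the top 5 entries
    return result, buffer.getvalue()


result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The report lists functions by cumulative time, which is usually enough to decide which part of the code deserves attention first.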
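
The tips on `dtype`, vectorization and comparing alternative functions can be sketched together. This is only an illustration on made-up random data: `hypot_loop` and `hypot_vectorized` are hypothetical helpers, and `np.hypot` stands in as one example of a second NumPy function that does the same job.

```python
import timeit

import numpy as np


def hypot_loop(x, y):
    """Element-by-element Python loop writing into a preallocated array."""
    out = np.zeros(len(x), dtype=np.float64)  # placeholder with explicit dtype
    for i in range(len(x)):
        out[i] = (x[i] * x[i] + y[i] * y[i]) ** 0.5
    return out


def hypot_vectorized(x, y):
    """Same calculation expressed as whole-array operations."""
    return np.sqrt(x * x + y * y)


rng = np.random.default_rng(0)
x = rng.random(10_000)
y = rng.random(10_000)

loop_result = hypot_loop(x, y)
vec_result = hypot_vectorized(x, y)
builtin_result = np.hypot(x, y)  # another NumPy function for the same job

# An explicit integer dtype avoids the float64 default of np.zeros.
counts = np.zeros(1_000, dtype=np.int32)

# Timings vary by machine, so they are printed rather than asserted.
print("loop      :", timeit.timeit(lambda: hypot_loop(x, y), number=3))
print("vectorized:", timeit.timeit(lambda: hypot_vectorized(x, y), number=3))
print("np.hypot  :", timeit.timeit(lambda: np.hypot(x, y), number=3))
```

Which of the two array versions wins depends on the machine and the array size, so it is worth measuring on your own data rather than assuming.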
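
For the sorting tip, a small sketch of what `np.searchsorted` can replace. The task here (counting how many sorted edges are less than or equal to each value) and the helper names are assumptions chosen for illustration, not taken from the original code.

```python
import numpy as np


def count_edges_loop(edges, values):
    """For each value, count the edges <= value with a linear scan."""
    out = np.empty(len(values), dtype=np.intp)
    for i, v in enumerate(values):
        count = 0
        for e in edges:
            if e <= v:
                count += 1
        out[i] = count
    return out


def count_edges_searchsorted(edges, values):
    """Same result via binary search; edges must already be sorted."""
    return np.searchsorted(edges, values, side="right")


edges = np.array([0.0, 1.0, 2.0, 5.0])
values = np.array([-1.0, 0.0, 0.5, 2.0, 7.0])

loop_counts = count_edges_loop(edges, values)
fast_counts = count_edges_searchsorted(edges, values)
print(fast_counts)  # -> [0 1 1 3 4]
```

The linear scan touches every edge for every value, while the binary search only needs a logarithmic number of comparisons per value.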
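
For the `Numba` tip, a hedged sketch of a loop that is awkward to vectorize by hand but easy to compile. The function `running_max` is a hypothetical example, and the no-op fallback decorator is my own addition so that the snippet still runs (uncompiled) when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func=None, **kwargs):
        """Fallback: no-op decorator so the sketch runs without Numba."""
        if func is None:
            return lambda f: f
        return func


@njit
def running_max(values):
    """Running maximum of a 1-D array, written as an explicit loop."""
    out = np.empty_like(values)
    current = values[0]
    for i in range(values.shape[0]):
        if values[i] > current:
            current = values[i]
        out[i] = current
    return out


data = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
print(running_max(data))
```

The first call includes compilation time when Numba is present, so time the second call onwards when comparing against plain NumPy.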
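
For the last tip, a minimal sketch of moving a calculation that never changes out of the repeated work. The names `prepare` and `weighted_mean` and the toy data are hypothetical.

```python
import numpy as np


def prepare(weights):
    """Pre-processing: normalise the weights once, outside the hot path."""
    w = np.asarray(weights, dtype=np.float64)
    return w / w.sum()


def weighted_mean(batch, normalised_weights):
    """Per-batch work that reuses the precomputed weights."""
    return batch @ normalised_weights


weights = [1.0, 2.0, 3.0, 4.0]
normalised = prepare(weights)  # computed once

batches = [np.arange(4.0) + k for k in range(3)]
results = [weighted_mean(b, normalised) for b in batches]
print(results)
```

Keeping the invariant part in its own function also makes it easier to spot in a profile if it ends up being called more often than intended.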