A while ago, I had to speed up some Python code. I learnt a few things along the way that I am sharing here (in no particular order, though a couple matter more than the rest). Perhaps you will find them useful:
- Profile first. Find out where the time is actually spent using `time`, `cProfile` or `line_profiler`.
- Replace pandas operations with NumPy operations wherever possible.
- When defining placeholder NumPy arrays, specify the `dtype`.
- Use `ndarray` instead of `recarray`, structured arrays, etc.
- Vectorize the code if possible. This is major.
- Several NumPy functions can often do the same job. Compare which one gives better performance.
- Sometimes, you might have to write your own lower-level NumPy function for better performance rather than use the ready-made library method.
- If sorting is happening somewhere, there is likely room for speed gains. Try `np.searchsorted()`.
- Use Numba. If you can make it work, it works like magic. Sometimes, you might have to alter your code slightly to make it Numba-compatible, since it supports only the Python and NumPy features listed in its documentation.
- Put pre-processing in a separate function. Calculations whose results do not change should be done only once.
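To illustrate the "profile first" tip, here is a minimal sketch using only the standard library. The functions `slow_sum_of_squares` and `workload` are made-up stand-ins for your own code, not anything from the original project.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Hypothetical hot spot: a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    """Hypothetical workload that calls the hot spot several times."""
    return [slow_sum_of_squares(20_000) for _ in range(20)]


# Collect timing data only around the code of interest.
profiler = cProfile.Profile()
profiler.enable()
results = workload()
profiler.disable()

# Sort by cumulative time and print at most ten entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

The report lists each function with its call count and cumulative time, which is usually enough to tell which part is worth optimizing before touching anything else.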
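The vectorization and `dtype` tips can be sketched roughly as follows. The arithmetic and the function names `scaled_loop` and `scaled_vectorized` are assumptions chosen for illustration; the point is only the contrast between an element-by-element loop and a single whole-array expression.

```python
import numpy as np


def scaled_loop(values, factor):
    """Element-by-element version with an explicit Python loop."""
    # Placeholder array with an explicit dtype.
    out = np.empty(len(values), dtype=np.float64)
    for i in range(len(values)):
        out[i] = values[i] * factor + 1.0
    return out


def scaled_vectorized(values, factor):
    """Same arithmetic expressed as one whole-array expression."""
    return values * factor + 1.0


values = np.arange(100_000, dtype=np.float64)
loop_result = scaled_loop(values, 2.5)
vec_result = scaled_vectorized(values, 2.5)

# Both versions should agree; only the speed differs.
print(np.allclose(loop_result, vec_result))
```

Timing the two functions with `timeit` on your own data is the fair way to check the gain, since it depends on array size and on the operation involved.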
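For the sorting tip, one possible use of `np.searchsorted()` is looking up where values fall in an already sorted array, which uses binary search instead of re-sorting or scanning. The bin edges and query values below are invented for the example.

```python
import numpy as np

# Hypothetical data: sorted bin edges and some query values.
edges = np.array([0.0, 10.0, 20.0, 50.0, 100.0])
queries = np.array([5.0, 10.0, 42.0, 99.9])

# For each query, find the index where it would be inserted to keep
# `edges` sorted. side="right" places equal values after existing ones.
insert_at = np.searchsorted(edges, queries, side="right")

# Index of the bin whose left edge is <= the query.
bin_index = insert_at - 1
print(bin_index)  # [0 1 2 3]
```

This only works when the first argument is already sorted, so it pays off most when the same sorted array is queried many times.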