Have you ever worked on a problem where you wanted to optimise your code or compare performance of two different version of a program. I was working on a problem to optimise a python program that; connects to a database, query a table and returns some data. The goal was to make whole process blazing fast so that dashboard users can see result under 2 second. And that is when the code profiling tool came into my toolbox.
On a separate note, developer is required to write code and test it, but a seasoned developer also test his code for performance. Because the real test of an application is when lots of people start using it.
Enough of the fuss, now let's understand what is code profiling.
Code Profiling:
Let's see that in action:
import random
import time
from functools import reduce
# Define a complex calculation function
def some_complex_calculation(x, y):
# Simulate a complex calculation, e.g., multiplying two numbers
result = x * y
return result
# Using the reduce function for vectorization
def using_reduce(func, data: list):
t1 = time.time()
result = reduce(func, data)
t2 = time.time()
return result, t2 - t1
# Using a plain for loop
def using_loop(data):
t1 = time.time()
result = 1
for num in data:
result *= num
t2 = time.time()
return result, t2-t1
if __name__ == "__main__":
# Generate a list of random numbers for demonstration
data = [random.randint(1, 1000) for _ in range(10000)]
result_of_reduce, time_taken_reduce = using_reduce(some_complex_calculation, data)
print("Using reduce function:")
print("Execution Time for reduce function is:", time_taken_reduce, "seconds")
result_plain_loop, time_plain_loop = using_loop(data)
print("\nUsing plain for loop:")
print("Execution Time using loop is:", time_plain_loop, "seconds")
Using reduce function:
Execution Time for reduce function is: 0.017206430435180664 seconds
Using plain for loop:
Execution Time using loop is: 0.014813899993896484 seconds
Just runtime is not enough, I need more info:
Above program gives only runtime of the code, but most of the times you need lot other info based on what does your program do. And for that there are lots of profilers available at your disposal.
1. cProfile:
import cProfile
import random
# Define a complex calculation function
def complex_calculation(x):
# Simulate a complex calculation, e.g., squaring the number
result = x * x
return result
# Generate a list of random numbers for demonstration
data = [random.randint(1, 1000) for _ in range(10000)]
# Using the map function for vectorization
def using_map():
return list(map(complex_calculation, data))
# Using a plain for loop
def using_plain_loop():
result = []
for num in data:
result.append(complex_calculation(num))
return result
if __name__ == "__main__":
print("Using map function:")
cProfile.run("using_map()")
print("\nUsing plain for loop:")
cProfile.run("using_plain_loop()")
Using map function:
10004 function calls in 0.002 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.002 0.002 <string>:1(<module>)
1 0.001 0.001 0.002 0.002 profiling.py:15(using_map)
10000 0.001 0.000 0.001 0.000 profiling.py:5(complex_calculation)
1 0.000 0.000 0.002 0.002 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Using plain for loop:
20004 function calls in 0.005 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.005 0.005 <string>:1(<module>)
1 0.003 0.003 0.005 0.005 profiling.py:19(using_plain_loop)
1000 0.001 0.000 0.001 0.000 profiling.py:5(complex_calculation)
1 0.000 0.000 0.005 0.005 {built-in method builtins.exec}
10000 0.001 0.000 0.001 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
2. profile:
What if you want line level profile of you code? profile module is here for that case. Similar to cProfile, the profile module is also part of the standard library and offers line-level profiling. It can be used for fine-grained analysis.
import profile
import random
# Define a function that simulates some work
def my_slow_function():
total = 0
for _ in range(100000):
total += random.randint(1, 1000)
return total
# Profile the function using cProfile
if __name__ == "__main__":
profiler = profile.Profile()
profiler.run('my_slow_function()')
profiler.print_stats()
802422 function calls in 1.930 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
100000 0.107 0.000 0.107 0.000 :0(bit_length)
1 0.000 0.000 1.916 1.916 :0(exec)
102417 0.106 0.000 0.106 0.000 :0(getrandbits)
300000 0.362 0.000 0.362 0.000 :0(index)
1 0.014 0.014 0.014 0.014 :0(setprofile)
1 0.000 0.000 1.916 1.916 <string>:1(<module>)
1 0.117 0.117 1.916 1.916 mix_profiling.py:6(my_slow_function)
1 0.000 0.000 1.930 1.930 profile:0(my_slow_function())
0 0.000 0.000 profile:0(profiler)
100000 0.343 0.000 0.556 0.000 random.py:239(_randbelow_with_getrandbits)
100000 0.649 0.000 1.567 0.000 random.py:292(randrange)
100000 0.232 0.000 1.799 0.000 random.py:366(randint)
3. line_profiler:
import random
from line_profiler import LineProfiler
prof = LineProfiler()
def profile(func):
from functools import wraps
@wraps(func)
def wrapper(*args, **kwargs):
try:
return prof(func)(*args, **kwargs)
finally:
prof.print_stats()
return wrapper
@profile
def my_slow_function():
total = 0
for _ in range(100000):
total += random.randint(1, 1000)
return total
if __name__ == '__main__':
my_slow_function()
Timer unit: 1e-09 s
Total time: 0.401355 s
File: /home/sanjeet/Documents/git_pod/basic_python/profiling/mix_profiling.py
Function: my_slow_function at line 38
Line # Hits Time Per Hit % Time Line Contents
==============================================================
38 @profile
39 def my_slow_function():
40 1 1535.0 1535.0 0.0 total = 0
41 100001 26632008.0 266.3 6.6 for _ in range(100000):
42 100000 374721383.0 3747.2 93.4 total += random.randint(1, 1000) 43 1 178.0 178.0 0.0 return total
Process finished with exit code 0
4. memory_profiler:
from memory_profiler import profile
import random
@profile
def memory_intensive_function():
data = [random.randint(1, 1000) for _ in range(1000000)]
result = sum(data)
del data # To free up memory
return result
if __name__ == "__main__":
result = memory_intensive_function()
print("Result:", result)
Filename: /home/sanjeet/Documents/git_pod/basic_python/profiling/mix_profiling.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
54 19.4 MiB 19.4 MiB 1 @profile
55 def memory_intensive_function():
56 50.2 MiB 30.9 MiB 1000003 data = [random.randint(1, 1000) for _ in range(1000000)]
57 50.2 MiB 0.0 MiB 1 result = sum(data)
58 21.0 MiB -29.2 MiB 1 del data # To free up memory
59 21.0 MiB 0.0 MiB 1 return result
Result: 500539097
Process finished with exit code 0
5. Timeit:
Enough of static code profiling, what if I want to profile a running python app?
2. PySpy
SEO: Code Profiling, line_profiler, memory_profiler, cprofile, timeit, timeit example, cprofile example, python code profiling.
