-
Notifications
You must be signed in to change notification settings - Fork 130
Description
I just ran an old test case of mine to measure worst-case overhead (where the main line being measured is very simple) and noticed an overhead increase on the main branch of ~2x since 4.1.3. This is the absolute worst-case though, and if I recall correctly, the normal overhead was only 2-3x in 4.1.3 (I haven't measured it again on main). I plan to put some time in to find where the regressions were introduced and how they can be fixed! This is the script I'm using to measure, and following that is the output on different versions.
import time
def main(profiled=False):
start = time.perf_counter()
for x in range(2000000):
y = x
elapsed_ms = round((time.perf_counter()-start)*1000, 2)
if profiled:
print(f"Time Elapsed profiling: {elapsed_ms}ms")
else:
print(f"Time Elapsed without profiling: {elapsed_ms}ms")
return elapsed_ms
if __name__ == "__main__":
e1 = main()
main = profile(main)
e2 = main(True)
print(f"Slowdown from profiling: {e2-e1}ms, {round(e2/e1, 2)}x")4.1.3 output:
kernprof -lvz test.py
Time Elapsed without profiling: 84.68ms
Time Elapsed profiling: 2219.49ms
Slowdown from profiling: 2134.81ms, 26.21x
Wrote profile results to test.py.lprof
Timer unit: 1e-06 s
Total time: 0.994691 s
File: test.py
Function: main at line 3
Line # Hits Time Per Hit % Time Line Contents
==============================================================
3 def main(profiled=False):
4 1 1.7 1.7 0.0 start = time.perf_counter()
5 2000001 546656.5 0.3 55.0 for x in range(2000000):
6 2000000 447940.7 0.2 45.0 y = x
7 1 40.5 40.5 0.0 elapsed_ms = round((time.perf_counter()-start)*1000, 2)
8 1 1.0 1.0 0.0 if profiled:
9 1 48.4 48.4 0.0 print(f"Time Elapsed profiling: {elapsed_ms}ms")
10 else:
11 print(f"Time Elapsed without profiling: {elapsed_ms}ms")
12 1 2.1 2.1 0.0 return elapsed_ms
main branch output:
kernprof -lvz test.py
Time Elapsed without profiling: 84.85ms
Time Elapsed profiling: 5103.88ms
Slowdown from profiling: 5019.03ms, 60.15x
Wrote profile results to 'test.py.lprof'
Timer unit: 1e-06 s
Total time: 3.7014 s
File: test.py
Function: main at line 3
Line # Hits Time Per Hit % Time Line Contents
==============================================================
3 def main(profiled=False):
4 1 2.6 2.6 0.0 start = time.perf_counter()
5 2000001 1872583.6 0.9 50.6 for x in range(2000000):
6 2000000 1828740.6 0.9 49.4 y = x
7 1 25.6 25.6 0.0 elapsed_ms = round((time.perf_counter()-start)*1000, 2)
8 1 1.2 1.2 0.0 if profiled:
9 1 45.6 45.6 0.0 print(f"Time Elapsed profiling: {elapsed_ms}ms")
10 else:
11 print(f"Time Elapsed without profiling: {elapsed_ms}ms")
12 1 4.8 4.8 0.0 return elapsed_ms
The relative difference between versions is basically the only relevant thing here, so I'm not planning to bother providing specs, but I can assure you that things that could cause obvious variations between runs like turbo boost were turned off and everything was pinned to a single core.