implement incbench command for ease-of-use benchmark (#1884)
# Description
implement incbench command as entrypoint for ease-of-use benchmark
automatically check NUMA/socket info and dump it as a table for easier understanding
support both Linux and Windows platforms
add benchmark documents
dump benchmark summary
add benchmark UTs
# General Use Cases
incbench main.py: run 1 instance on NUMA:0.
incbench --num_i 2 main.py: run 2 instances on NUMA:0.
incbench --num_c 2 main.py: run multi-instances with 2 cores per instance on NUMA:0.
incbench -C 24-47 main.py: run 1 instance on COREs:24-47.
incbench -C 24-47 --num_c 4 main.py: run multi-instances with 4 COREs per instance on COREs:24-47.
---------
Signed-off-by: xin3he <[email protected]>
Co-authored-by: chen, suyue <[email protected]>
| num_cores_per_instance | None | Number of cores in each instance |
| C, cores | 0-${num_cores_on_NUMA-1} | Range of visible cores |
| cross_memory | False | Whether to allocate memory across NUMA nodes |

> Note: `cross_memory` is set to True only when memory is insufficient.

### General Use Cases

1. `incbench main.py`: run 1 instance on NUMA:0.
2. `incbench --num_i 2 main.py`: run 2 instances on NUMA:0.
3. `incbench --num_c 2 main.py`: run multi-instances with 2 cores per instance on NUMA:0.
4. `incbench -C 24-47 main.py`: run 1 instance on COREs:24-47.
5. `incbench -C 24-47 --num_c 4 main.py`: run multi-instances with 4 COREs per instance on COREs:24-47.

> Note:
> - `num_i` works the same as `num_instances`
> - `num_c` works the same as `num_cores_per_instance`
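
To make the arithmetic behind use case 5 concrete, here is a minimal Python sketch of how a core range such as `24-47` could be divided into instances of 4 cores each. The helpers `parse_core_range` and `split_into_instances` are hypothetical names used only for illustration; they are not part of `incbench`, and dropping an incomplete trailing group is an assumption rather than documented behavior.

```python
from typing import List


def parse_core_range(spec: str) -> List[int]:
    """Expand an inclusive range such as "24-47" into a list of core ids.

    Hypothetical helper for illustration; not an incbench API.
    """
    start, end = (int(part) for part in spec.split("-"))
    return list(range(start, end + 1))


def split_into_instances(cores: List[int], cores_per_instance: int) -> List[List[int]]:
    """Group cores into consecutive, equally sized chunks.

    Assumption: cores that do not fill a complete chunk are left unused.
    """
    count = len(cores) // cores_per_instance
    return [
        cores[i * cores_per_instance:(i + 1) * cores_per_instance]
        for i in range(count)
    ]


# Mirrors `incbench -C 24-47 --num_c 4 main.py`: 24 visible cores, 4 per instance.
groups = split_into_instances(parse_core_range("24-47"), 4)
for index, group in enumerate(groups):
    print(f"instance {index}: cores {group[0]}-{group[-1]}")
```

Under these assumptions the 24 visible cores yield 6 instances, the first on cores 24-27 and the last on cores 44-47.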

### Dump Throughput and Latency Summary

To merge benchmark results from multiple instances, `incbench` automatically scans log file messages for "throughput" and "latency" information matching the following patterns.
examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/smooth_quant/run_benchmark.sh
examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/smooth_quant/run_clm_no_trainer.py