NCCL Tests

This project incorporates NVIDIA NCCL Tests as part of our benchmarking and validation framework.

NCCL (NVIDIA Collective Communications Library) is a high-performance library for collective communication primitives (such as all-reduce, all-gather, broadcast, reduce, and reduce-scatter) on multi-GPU and multi-node systems. It is optimized for NVIDIA hardware and widely used by deep learning frameworks like PyTorch and TensorFlow to scale training across multiple GPUs and nodes.

Kubernetes

For details on how we deploy and manage these tests in Kubernetes, see our Kubernetes README.

Docker

For details on how we deploy and manage these tests in Docker, see our Docker README.

NCCL Tests - AllReduce Benchmark

About the AllReduce Test

The all_reduce operation combines arrays of data from all GPUs with a reduction (typically a sum) and distributes the result back to every GPU. It is fundamental in distributed deep learning for synchronizing gradients across devices.
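
To make the semantics concrete, here is a minimal pure-Python sketch that simulates a sum all-reduce over per-GPU buffers. It does not call NCCL or touch a GPU; the function name `simulate_all_reduce_sum` and the list-of-lists representation are assumptions made purely for illustration.

```python
from typing import List


def simulate_all_reduce_sum(buffers: List[List[float]]) -> List[List[float]]:
    """Simulate a sum all-reduce: buffers[i] is the array held by rank i.

    Hypothetical helper for illustration only; real NCCL performs this
    on device memory over NVLink, PCIe, or the network.
    """
    if not buffers:
        return []
    length = len(buffers[0])
    if any(len(b) != length for b in buffers):
        raise ValueError("all ranks must contribute arrays of equal length")
    # Reduce step: element-wise sum across all ranks.
    reduced = [sum(values) for values in zip(*buffers)]
    # Distribute step: every rank receives its own copy of the result.
    return [list(reduced) for _ in buffers]


if __name__ == "__main__":
    # Three "GPUs", each holding a two-element gradient.
    grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(simulate_all_reduce_sum(grads))
    # → [[9.0, 12.0], [9.0, 12.0], [9.0, 12.0]]
```

After the call, every rank holds the same summed array, which is the property gradient synchronization relies on.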

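The NCCL all_reduce test reports:

Bandwidth (GB/s): How much data is reduced per second, reported as algorithm bandwidth (algbw) and bus bandwidth (busbw).
Latency (μs): Time taken for one operation to complete.
Scalability: How performance changes as GPUs or nodes are added, assessed by comparing runs at different scales.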
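
As a rough illustration of how the bandwidth figures relate to the measured time, the sketch below follows the convention described in the upstream nccl-tests performance notes: algorithm bandwidth is message size divided by time, and for all-reduce the bus bandwidth scales that by 2(n-1)/n for n ranks. The function names are hypothetical, and the sample numbers are invented inputs, not benchmark results.

```python
def algorithm_bandwidth_gbps(size_bytes: int, time_us: float) -> float:
    """Algorithm bandwidth in GB/s (1 GB = 1e9 bytes), from size and time."""
    if time_us <= 0:
        raise ValueError("time_us must be positive")
    return size_bytes / (time_us * 1e-6) / 1e9


def all_reduce_bus_bandwidth_gbps(size_bytes: int, time_us: float, num_ranks: int) -> float:
    """Bus bandwidth for all-reduce: algbw scaled by 2 * (n - 1) / n."""
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")
    algbw = algorithm_bandwidth_gbps(size_bytes, time_us)
    return algbw * 2 * (num_ranks - 1) / num_ranks


if __name__ == "__main__":
    # Made-up example: a 1 GB message reduced in 10,000 microseconds on 8 ranks.
    size, time_us, ranks = 1_000_000_000, 10_000.0, 8
    print(f"algbw: {algorithm_bandwidth_gbps(size, time_us):.1f} GB/s")
    print(f"busbw: {all_reduce_bus_bandwidth_gbps(size, time_us, ranks):.1f} GB/s")
```

The scaling factor makes bus bandwidth comparable across different rank counts, which is why it is usually the figure to compare against hardware link speed when judging scalability.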


voltagepark/nccl-tests
