Single-Cell Analysis Blueprint

This repository houses tutorial notebooks for running GPU-accelerated single-cell analysis workflows using RAPIDS-singlecell, a GPU-accelerated library developed by scverse®. The goal of this repository is to help users try out and explore the capabilities of RAPIDS-singlecell on datasets ranging from 250 thousand to 11 million cells. To make this as easy as possible, we set up two GPU environments on Brev that get you working with GPU-accelerated single-cell workflows as quickly as possible (see Quickstart). We've also provided instructions to run these notebooks on your own CUDA-enabled GPU systems (see Bring your own compute).

These notebooks will be valuable for single-cell scientists who want to quickly evaluate the ease of use and explore the biological interpretability of RAPIDS-singlecell results. Scientists will also find value in learning to apply these methods to very large datasets. This repository is also broadly useful for any data scientist or developer who wants to run and evaluate single-cell methods with RAPIDS-singlecell. The datasets used in these tutorials were made publicly available by 10x Genomics and CZ CELLxGENE. The base container is the 25.04 RAPIDSAI Notebooks Container, which is freely available from NVIDIA's NGC Catalog following the instructions below.

If you like these notebooks and this GPU-accelerated capability and want to support scverse's efforts, please learn more about scverse and consider joining their community.

Quick start

The quickest way to use these blueprints is to use one of our pre-configured NVIDIA Brev resources.

Select your resource size, and click "Deploy Now":

For a Standard Instance (L40s), click here:
For a Large Instance (8x H100), click here:

Click Deploy Launchable on the Brev.dev Launchable page
Wait for the Container status to show Ready (can take up to 8 minutes). Then, click Access GPU
On the Instance page, click Open Notebook

You should drop into a fully installed and populated JupyterLab environment. Open up your desired notebook from the list below, and have a great time!

Overview

This repository contains a diverse set of notebooks to help get anyone started using RAPIDS-singlecell developed by scverse.

The outline below is a suggested exploration flow. Unless otherwise noted, you can choose any notebook to get started, as long as you have the GPU resources to run the notebook.

For those who are new to basic single-cell data analysis, the end-to-end analysis in 01_scRNA_analysis_preprocessing.ipynb is the best place to start. It walks you through data preprocessing, cleanup, visualization, and investigation.
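As a rough sketch of what such an end-to-end pass can look like, the helper below chains a conventional sequence of steps through the scanpy-style `pp`/`tl` namespaces that RAPIDS-singlecell also exposes. The step list, the helper name `run_basic_workflow`, and every parameter value are assumptions chosen for illustration, not a copy of the notebook; the notebook itself remains the reference.

```python
def run_basic_workflow(adata, backend, n_top_genes=5000, n_comps=100):
    """Run a conventional scRNA-seq pipeline with a scanpy-style backend.

    `backend` is assumed to be a module exposing `pp` and `tl` namespaces,
    such as `rapids_singlecell` (GPU) or `scanpy` (CPU). All parameter
    values here are illustrative defaults, not the notebook's settings.
    """
    pp, tl = backend.pp, backend.tl

    # Normalize counts per cell, then log-transform.
    pp.normalize_total(adata, target_sum=1e4)
    pp.log1p(adata)

    # Flag the most variable genes for downstream analysis.
    pp.highly_variable_genes(adata, n_top_genes=n_top_genes)

    # Dimensionality reduction and neighborhood graph.
    pp.pca(adata, n_comps=n_comps)
    pp.neighbors(adata, n_neighbors=15)

    # 2D embedding for visualization and graph-based clustering.
    tl.umap(adata)
    tl.leiden(adata, resolution=1.0)
    return adata

# Typical GPU usage (requires a CUDA-capable GPU and rapids_singlecell):
#   import rapids_singlecell as rsc
#   rsc.get.anndata_to_GPU(adata)   # move the count matrix to the GPU first
#   run_basic_workflow(adata, rsc)
```

Passing the backend as an argument keeps the step order in one place, so the same function can be tried with `scanpy` on a CPU for comparison.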

Notebook	Description	Min GPU Size / Instance
01_scRNA_analysis_preprocessing.ipynb	End-to-end workflow in which we understand the cells, run ETL on the dataset, then visualize and explore the results. This tutorial is suitable for all users.	24GB / Standard RSC Instance
02_scRNA_analysis_extended.ipynb	This notebook continues from the outputs of 01_scRNA_analysis_preprocessing.ipynb as an overview of methods that can be used to investigate transcriptional regulation.	24GB / Standard RSC Instance
03_scRNA_analysis_with_pearson_residuals.ipynb	End-to-end workflow, like 01_scRNA_analysis_preprocessing.ipynb, but uses Pearson residuals for normalization.	24GB / Standard RSC Instance
04_scRNA_analysis_dask_out_of_core.ipynb	In this notebook, we show how the analysis scales to up to 11M cells by using Dask. Requires a 48GB GPU.	48GB / Standard RSC Instance
05_scRNA_analysis_multi_GPU.ipynb	This notebook extends the 11M-cell dataset analysis with Dask without exceeding memory limits. It scales to use all available GPUs, uses chunk-based execution, and manages memory efficiently. Requires 8x H100s or better. For all other GPU systems, please run 04_scRNA_analysis_dask_out_of_core.ipynb instead.	8x 80GB / Large RSC Instance
06_scRNA_analysis_90k_brain_example.ipynb	In this notebook, we show the diversity of capabilities by running a workflow similar to 01_scRNA_analysis_preprocessing.ipynb, but on brain cells.	24GB / Standard RSC Instance
07_scRNA_analysis_1.3M_brain_example.ipynb	In this notebook, we scale up the analysis of 06_scRNA_analysis_90k_brain_example.ipynb to 1.3 million brain cells. Requires an 80GB GPU, like an H100.	80GB / Large RSC Instance

You can find more detail on each notebook in the Notebooks README.
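To make the GPU requirements in the table easier to check against your own hardware, here is a small sketch that encodes the "Min GPU Size" column and lists the notebooks a given system should be able to run. The names `NotebookRequirement`, `REQUIREMENTS`, and `runnable_notebooks` are hypothetical helpers for illustration; the memory figures are taken from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotebookRequirement:
    name: str
    min_gpu_gb: int          # minimum memory per GPU, in GB
    min_gpu_count: int = 1   # number of GPUs required


# Values transcribed from the "Min GPU Size / Instance" column.
REQUIREMENTS = [
    NotebookRequirement("01_scRNA_analysis_preprocessing.ipynb", 24),
    NotebookRequirement("02_scRNA_analysis_extended.ipynb", 24),
    NotebookRequirement("03_scRNA_analysis_with_pearson_residuals.ipynb", 24),
    NotebookRequirement("04_scRNA_analysis_dask_out_of_core.ipynb", 48),
    NotebookRequirement("05_scRNA_analysis_multi_GPU.ipynb", 80, min_gpu_count=8),
    NotebookRequirement("06_scRNA_analysis_90k_brain_example.ipynb", 24),
    NotebookRequirement("07_scRNA_analysis_1.3M_brain_example.ipynb", 80),
]


def runnable_notebooks(gpu_gb, gpu_count=1):
    """Return the notebook names whose requirements the given system meets."""
    return [
        req.name
        for req in REQUIREMENTS
        if gpu_gb >= req.min_gpu_gb and gpu_count >= req.min_gpu_count
    ]


# Example: a single 48GB GPU covers notebooks 01-04 and 06.
print(runnable_notebooks(48))
```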

Note

To ensure you have the maximum GPU memory available, please remember to shut down your completed notebook's kernel before starting a new notebook. If you don't, you may experience out-of-memory (OOM) errors. To fix that, shut down all the kernels, then restart only the kernel for the notebook you want to run.
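If you are unsure whether an old kernel is still holding GPU memory, one way to check is to query `nvidia-smi` before starting the next notebook. The sketch below assumes `nvidia-smi` is on your `PATH` and uses its CSV query output; the helper names `parse_gpu_memory` and `gpu_memory` are hypothetical and not part of this repository.

```python
import subprocess

# With "nounits", nvidia-smi reports memory values as plain MiB integers.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse nvidia-smi CSV text into (index, used_mib, total_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used, total = (field.strip() for field in line.split(","))
        rows.append((int(index), int(used), int(total)))
    return rows


def gpu_memory():
    """Run nvidia-smi and return per-GPU memory usage (requires an NVIDIA GPU)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


# Usage on a GPU machine:
#   for index, used, total in gpu_memory():
#       print(f"GPU {index}: {used} / {total} MiB in use")
```

If a GPU still shows substantial memory in use after you have closed a notebook, its kernel is probably still running and should be shut down.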

Deploying this Repository

The goal of this repository is to make it easy to try GPU-accelerated single-cell analysis workflows on different compute environments and datasets. Our preferred environment is NVIDIA Brev, but you can also run these in your own GPU-connected environment. We've provided a few tutorials below on how to set this up, and the easiest place to start is to follow the Quickstart instructions.

Using our Brev Launchable (super easy mode - highly recommended)

Follow our Quickstart Instructions above.

Create a custom instance using Brev (knowledgeable users only)

If you want to try a compute environment on Brev that's not one of the Quickstart Launchables, you will need to create a new Launchable or Standalone Compute Instance. This will let you select your desired cloud provider and desired compute resource. Note, we have not tested this on every combination of cloud provider and instance type, so your experience may vary.

If you're interested in trying this out, please follow the instructions here: Setting up your Custom Brev Launchable

Deploy on a CUDA compatible GPU system (knowledgeable users only)

You may want to run these notebooks off of Brev, on your own hardware. Great! We wrote a (somewhat) easy tutorial here: Bring your own compute

Support

If you have any questions about these notebooks or need support, please open an Issue on this repository and we will respond there.

NVIDIA-AI-Blueprints/single-cell-analysis-blueprint
