
Commit 06e9ebe

Add instructions to install vLLM+cu118 (#1717)

1 parent c5f7740 commit 06e9ebe

File tree

1 file changed: +19 -5 lines changed


docs/source/getting_started/installation.rst

Lines changed: 19 additions & 5 deletions
@@ -3,14 +3,14 @@
 Installation
 ============
 
-vLLM is a Python library that also contains pre-compiled C++ and CUDA (11.8) binaries.
+vLLM is a Python library that also contains pre-compiled C++ and CUDA (12.1) binaries.
 
 Requirements
 ------------
 
 * OS: Linux
 * Python: 3.8 -- 3.11
-* GPU: compute capability 7.0 or higher (e.g., V100, T4, RTX20xx, A100, L4, etc.)
+* GPU: compute capability 7.0 or higher (e.g., V100, T4, RTX20xx, A100, L4, H100, etc.)
 
 Install with pip
 ----------------
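
The Requirements hunk above raises the GPU floor examples but keeps the compute capability 7.0 minimum. That minimum can be checked up front; a minimal sketch, assuming a recent NVIDIA driver whose `nvidia-smi` supports the `compute_cap` query field (older builds may not):

```shell
# Query the compute capability of the first GPU (assumption: recent driver
# with support for `--query-gpu=compute_cap`).
CAP="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n1)"

# Compare numerically against the 7.0 floor from the Requirements section;
# `cap + 0` forces awk to treat the version string as a number.
awk -v cap="$CAP" 'BEGIN { exit !(cap + 0 >= 7.0) }' \
  && echo "GPU supported (compute capability ${CAP})" \
  || echo "GPU below vLLM's 7.0 minimum (compute capability ${CAP})"
```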
@@ -23,9 +23,24 @@ You can install vLLM using pip:
    $ conda create -n myenv python=3.8 -y
    $ conda activate myenv
 
-   $ # Install vLLM.
+   $ # Install vLLM with CUDA 12.1.
    $ pip install vllm
 
+.. note::
+
+   As of now, vLLM's binaries are compiled on CUDA 12.1 by default.
+   However, you can install vLLM with CUDA 11.8 by running:
+
+   .. code-block:: console
+
+      $ # Install vLLM with CUDA 11.8.
+      $ # Replace `cp310` with your Python version (e.g., `cp38`, `cp39`, `cp311`).
+      $ pip install https://github.com/vllm-project/vllm/releases/download/v0.2.2/vllm-0.2.2+cu118-cp310-cp310-manylinux1_x86_64.whl
+
+      $ # Re-install PyTorch with CUDA 11.8.
+      $ pip uninstall torch -y
+      $ pip install torch --upgrade --index-url https://download.pytorch.org/whl/cu118
 
 
.. _build_from_source:
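
Rather than editing the `cp310` tag in the note's wheel URL by hand, the matching tag can be derived from the running interpreter. A minimal sketch; the release version (`v0.2.2`) and URL pattern come from the diff above, while the variable names are illustrative:

```shell
# Build the cpXY wheel tag (e.g. cp310) from the current Python interpreter.
PY_TAG="cp$(python3 -c 'import sys; print("%d%d" % sys.version_info[:2])')"

# Assemble the cu118 wheel URL following the release pattern shown in the note.
WHEEL_URL="https://github.com/vllm-project/vllm/releases/download/v0.2.2/vllm-0.2.2+cu118-${PY_TAG}-${PY_TAG}-manylinux1_x86_64.whl"
echo "${WHEEL_URL}"
```

Note that the release only ships wheels for the Python versions listed in the comment (`cp38` through `cp311`), so a tag outside that range yields a URL with no wheel behind it.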

@@ -45,6 +60,5 @@ You can also build and install vLLM from source:
 
 .. code-block:: console
 
-   $ # Pull the Docker image with CUDA 11.8.
    $ # Use `--ipc=host` to make sure the shared memory is large enough.
-   $ docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3
+   $ docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:23.10-py3
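
The `--ipc=host` flag in the hunk above exists because Docker's default `/dev/shm` is only 64 MB, which is too small for inter-process tensor exchange; sharing the host IPC namespace lifts that limit. A minimal sketch for checking the shared-memory mount from inside a container (the 1 GB threshold is an illustrative assumption, not a documented requirement):

```shell
# Report the size of the shared-memory mount in kilobytes; with --ipc=host
# this is the host's /dev/shm rather than Docker's 64 MB default.
SHM_KB="$(df -k /dev/shm | awk 'NR == 2 { print $2 }')"

# Warn below an (illustrative) 1 GB threshold.
if [ "$SHM_KB" -lt 1048576 ]; then
    echo "warning: /dev/shm is only ${SHM_KB} KB; consider --ipc=host or --shm-size"
else
    echo "/dev/shm is ${SHM_KB} KB"
fi
```

An alternative to sharing the host IPC namespace is an explicit `--shm-size` on `docker run` (e.g. `--shm-size=8g`), which raises the limit while keeping the container's IPC isolated.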
