Installation
============

- vLLM is a Python library that also contains pre-compiled C++ and CUDA (11.8) binaries.
+ vLLM is a Python library that also contains pre-compiled C++ and CUDA (12.1) binaries.

Requirements
------------

* OS: Linux
* Python: 3.8 -- 3.11
- * GPU: compute capability 7.0 or higher (e.g., V100, T4, RTX20xx, A100, L4, etc.)
+ * GPU: compute capability 7.0 or higher (e.g., V100, T4, RTX20xx, A100, L4, H100, etc.)

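The Python requirement above can be sanity-checked from the interpreter itself. A minimal stdlib-only sketch (the `python_version_supported` helper is illustrative, not part of vLLM):

```python
import sys

# Illustrative helper (not a vLLM API): check that a (major, minor, ...)
# version tuple falls in the documented supported range, 3.8 -- 3.11.
def python_version_supported(version_info=sys.version_info):
    major, minor = version_info[0], version_info[1]
    return (3, 8) <= (major, minor) <= (3, 11)

print(python_version_supported((3, 10, 0)))  # True
```

For the GPU requirement, `torch.cuda.get_device_capability()` reports the device's compute capability once PyTorch is installed.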
Install with pip
----------------
@@ -23,9 +23,24 @@ You can install vLLM using pip:
    $ conda create -n myenv python=3.8 -y
    $ conda activate myenv

- $ # Install vLLM.
+ $ # Install vLLM with CUDA 12.1.
    $ pip install vllm

+ .. note::
+
+     As of now, vLLM's binaries are compiled with CUDA 12.1 by default.
+     However, you can install vLLM with CUDA 11.8 by running:
+
+     .. code-block:: console
+
+         $ # Install vLLM with CUDA 11.8.
+         $ # Replace `cp310` with your Python version (e.g., `cp38`, `cp39`, `cp311`).
+         $ pip install https://github.com/vllm-project/vllm/releases/download/v0.2.2/vllm-0.2.2+cu118-cp310-cp310-manylinux1_x86_64.whl
+
+         $ # Re-install PyTorch with CUDA 11.8.
+         $ pip uninstall torch -y
+         $ pip install torch --upgrade --index-url https://download.pytorch.org/whl/cu118
+
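The `cp310` fragment in the wheel filename is the CPython compatibility tag. A quick stdlib-only way to compute the tag for a given interpreter version (the `cpython_wheel_tag` helper is illustrative, not part of vLLM or pip):

```python
import sys

# Illustrative helper (not a vLLM API): build the CPython wheel tag,
# e.g. "cp310", from a (major, minor) Python version. Defaults to the
# running interpreter.
def cpython_wheel_tag(major=None, minor=None):
    if major is None or minor is None:
        major, minor = sys.version_info[0], sys.version_info[1]
    return f"cp{major}{minor}"

print(cpython_wheel_tag(3, 10))  # cp310
```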

.. _build_from_source:

@@ -45,6 +60,5 @@ You can also build and install vLLM from source:

.. code-block:: console

- $ # Pull the Docker image with CUDA 11.8.
    $ # Use `--ipc=host` to make sure the shared memory is large enough.
- $ docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:22.12-py3
+ $ docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:23.10-py3