Commit 525a43f

docs: update deployment examples (#135)
Signed-off-by: rmdg88 <[email protected]>
Signed-off-by: Rui Dias Gomes <[email protected]>
1 parent c1ce471 commit 525a43f

4 files changed: 230 additions and 3 deletions

.markdownlint-cli2.yaml

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ config:
   no-emphasis-as-header: false
   first-line-heading: false
   MD033:
-    allowed_elements: ["details", "summary", "br", "a", "p", "img"]
+    allowed_elements: ["details", "summary", "br", "a", "b", "p", "img"]
   MD024:
     siblings_only: true
 globs:

docs/deploy-examples/compose-gpu.yaml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+services:
+  docling:
+    image: ghcr.io/docling-project/docling-serve-cu124
+    container_name: docling-serve
+    ports:
+      - 5001:5001
+    environment:
+      - DOCLING_SERVE_ENABLE_UI=true
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: all # nvidia-smi
+              capabilities: [gpu]
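
Review note: the `count: all` reservation above only works when the host driver is recent enough for the `cu124` image. The Requirements list in `docs/deployment.md` from this same commit gives `>=550.54.14` as the minimum. A minimal sketch of that check follows. It assumes a `sort` that supports `-V` (GNU coreutils or BusyBox); the helper names `version_ge` and `check_driver` are hypothetical and not part of the commit.

```sh
# Minimum driver version taken from the Requirements list in docs/deployment.md.
MIN_DRIVER="550.54.14"

# version_ge A B: succeeds when version A >= version B.
# Assumption: `sort -V` (version sort) is available on this host.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_driver VERSION: prints a verdict and returns non-zero when VERSION is too old.
check_driver() {
  if version_ge "$1" "$MIN_DRIVER"; then
    echo "ok: driver $1 >= $MIN_DRIVER"
  else
    echo "too old: driver $1 < $MIN_DRIVER"
    return 1
  fi
}
```

On a GPU host this could be called as `check_driver "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"` before running `docker compose ... up -d`.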
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+# This example deployment configures Docling Serve with a Service and cuda image
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: docling-serve
+  labels:
+    app: docling-serve
+    component: docling-serve-api
+spec:
+  ports:
+    - name: http
+      port: 5001
+      targetPort: http
+  selector:
+    app: docling-serve
+    component: docling-serve-api
+---
+kind: Deployment
+apiVersion: apps/v1
+metadata:
+  name: docling-serve
+  labels:
+    app: docling-serve
+    component: docling-serve-api
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: docling-serve
+      component: docling-serve-api
+  template:
+    metadata:
+      labels:
+        app: docling-serve
+        component: docling-serve-api
+    spec:
+      restartPolicy: Always
+      containers:
+        - name: api
+          resources:
+            limits:
+              cpu: 500m
+              memory: 2Gi
+              nvidia.com/gpu: 1 # Limit to one GPU
+            requests:
+              cpu: 250m
+              memory: 1Gi
+              nvidia.com/gpu: 1 # Limit to one GPU
+          env:
+            - name: DOCLING_SERVE_ENABLE_UI
+              value: 'true'
+          ports:
+            - name: http
+              containerPort: 5001
+              protocol: TCP
+          imagePullPolicy: Always
+          image: 'ghcr.io/docling-project/docling-serve-cu124'

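Review note: the Deployment above sets `imagePullPolicy: Always` on a CUDA image, so the pod may need some time before it answers on port 5001. Anyone scripting the "apply, port-forward, query" sequence might want to wait rather than fire the test query immediately. The helper below is a generic retry loop, sketched under that assumption; the name `retry` and its argument order are hypothetical and not part of the commit.

```sh
# retry TRIES DELAY CMD...: runs CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Returns 0 on success, 1 otherwise.
# Note: POSIX sh has no local variables, so tries/delay/attempt are globals.
retry() {
  tries="$1"
  delay="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    if [ "$attempt" -lt "$tries" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

With the port-forward from `docs/deployment.md` running, a call such as `retry 30 5 curl -fsS -o /dev/null localhost:5001/docs` would poll every five seconds for up to thirty attempts. The `/docs` path is an assumption here; substitute any URL the service is known to answer. `oc rollout status deployment/docling-serve` covers the Kubernetes side of the same wait.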
docs/deployment.md

Lines changed: 156 additions & 2 deletions
@@ -1,7 +1,161 @@
-# Deployment
+# Deployment Examples
+
+This document provides deployment examples for running the application in different environments.
+
+Choose the deployment option that best fits your setup.
+
+- **[Local GPU](#local-gpu)**: For deploying the application locally on a machine with a NVIDIA GPU (using Docker Compose).
+- **[OpenShift](#openshift)**: For deploying the application on an OpenShift cluster, designed for cloud-native environments.
+
+---
+
+## Local GPU
+
+### Docker compose
+
+Manifest example: [compose-gpu.yaml](./deploy-examples/compose-gpu.yaml)
+
+This deployment has the following features:
+
+- NVIDIA cuda enabled
+
+Install the app with:
+
+```sh
+docker compose -f docs/deploy-examples/compose-gpu.yaml up -d
+```
+
+For using the API:
+
+```sh
+# Make a test query
+curl -X 'POST' \
+  "localhost:5001/v1alpha/convert/source/async" \
+  -H "accept: application/json" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
+  }'
+```
+
+<details>
+<summary><b>Requirements</b></summary>
+
+- debian/ubuntu/rhel/fedora/opensuse
+- docker
+- nvidia drivers >=550.54.14
+- nvidia-container-toolkit
+
+Docs:
+
+- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/supported-platforms.html)
+- [CUDA Toolkit Release Notes](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#id6)
+
+</details>
+
+<details>
+<summary><b>Steps</b></summary>
+
+1. Check driver version and which GPU you want to use (0/1/2/3.. and update [compose-gpu.yaml](./deploy-examples/compose-gpu.yaml) file or use `count: all`)
+
+```sh
+nvidia-smi
+```
+
+2. Check if the NVIDIA Container Toolkit is installed/updated
+
+```sh
+# debian
+dpkg -l | grep nvidia-container-toolkit
+```
+
+```sh
+# rhel
+rpm -q nvidia-container-toolkit
+```
+
+NVIDIA Container Toolkit install steps can be found here:
+
+<https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html>
+
+3. Check which runtime is being used by Docker
+
+```sh
+# docker
+docker info | grep -i runtime
+```
+
+4. If the default Docker runtime changes back from 'nvidia' to 'default' after restarting the Docker service (optional):
+
+Backup the daemon.json file:
+
+```sh
+sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.bak
+```
+
+Update the daemon.json file:
+
+```sh
+echo '{
+  "runtimes": {
+    "nvidia": {
+      "path": "nvidia-container-runtime"
+    }
+  },
+  "default-runtime": "nvidia"
+}' | sudo tee /etc/docker/daemon.json > /dev/null
+```
+
+Restart the Docker service:
+
+```sh
+sudo systemctl restart docker
+```
+
+Confirm 'nvidia' is the default runtime used by Docker by repeating step 3.
+
+5. Run the container:
+
+```sh
+docker compose -f docs/deploy-examples/compose-gpu.yaml up -d
+```
+
+</details>
 
 ## OpenShift
 
+### Simple deployment
+
+Manifest example: [docling-serve-simple.yaml](./deploy-examples/docling-serve-simple.yaml)
+
+This deployment example has the following features:
+
+- Deployment configuration
+- Service configuration
+- NVIDIA cuda enabled
+
+Install the app with:
+
+```sh
+oc apply -f docs/deploy-examples/docling-serve-simple.yaml
+```
+
+For using the API:
+
+```sh
+# Port-forward the service
+oc port-forward svc/docling-serve 5001:5001
+
+# Make a test query
+curl -X 'POST' \
+  "localhost:5001/v1alpha/convert/source/async" \
+  -H "accept: application/json" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "http_sources": [{"url": "https://arxiv.org/pdf/2501.17887"}]
+  }'
+```
+
 ### Secure deployment with `oauth-proxy`
 
 Manifest example: [docling-serve-oauth.yaml](./deploy-examples/docling-serve-oauth.yaml)
@@ -15,7 +169,7 @@ This deployment has the following features:
 Install the app with:
 
 ```sh
-kubectl apply -f docs/deploy-examples/docling-serve-oauth.yaml
+oc apply -f docs/deploy-examples/docling-serve-oauth.yaml
 ```
 
 For using the API:

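Review note: the test query added to `docs/deployment.md` appears twice with the same hard-coded PDF URL. For trying other documents, the request body could be built from a parameter instead. The sketch below reuses the endpoint path, headers, and `http_sources` field exactly as they appear in the diff; the function names `convert_payload` and `convert_async` are hypothetical. It assumes the URL contains no double quotes or backslashes, since no JSON escaping is done.

```sh
# convert_payload URL: prints the JSON request body for a single HTTP source.
# Assumption: URL has no double quotes or backslashes (no JSON escaping here).
convert_payload() {
  printf '{"http_sources": [{"url": "%s"}]}\n' "$1"
}

# convert_async URL [BASE]: POSTs the payload to the async convert endpoint.
# BASE defaults to localhost:5001, the port published by both examples.
convert_async() {
  curl -sS -X POST "${2:-localhost:5001}/v1alpha/convert/source/async" \
    -H "accept: application/json" \
    -H "Content-Type: application/json" \
    -d "$(convert_payload "$1")"
}
```

Calling `convert_async "https://arxiv.org/pdf/2501.17887"` sends the same request as the documented `curl` command, against either the Compose container or the port-forwarded OpenShift service.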