Commit 3f090b7

docs: Example and instructions on how to load model weights to persistent volume (#197)
Signed-off-by: Viktor Kuropiatnyk <[email protected]>
1 parent 21c1791 commit 3f090b7

5 files changed, +195 -1 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@ An easy to use UI is available at the `/ui` endpoint.
 
 ## Documentation and advance usages
 
-Visit the [Docling Serve documentation](./docs/README.md) for learning how to [configure the webserver](./docs/configuration.md), use all the [runtime options](./docs/usage.md) of the API and [deployment examples](./docs/deployment.md).
+Visit the [Docling Serve documentation](./docs/README.md) to learn how to [configure the webserver](./docs/configuration.md), use all the [runtime options](./docs/usage.md) of the API, follow the [deployment examples](./docs/deployment.md), and [pre-load model weights into a persistent volume](./docs/pre-loading-models.md).
 
 ## Get help and support

docs/deploy-examples/docling-model-cache-deployment.yaml

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
kind: Deployment
apiVersion: apps/v1
metadata:
  name: docling-serve
  labels:
    app: docling-serve
    component: docling-serve-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: docling-serve
      component: docling-serve-api
  template:
    metadata:
      labels:
        app: docling-serve
        component: docling-serve-api
    spec:
      restartPolicy: Always
      containers:
        - name: api
          resources:
            limits:
              cpu: 500m
              memory: 2Gi
            requests:
              cpu: 250m
              memory: 1Gi
          env:
            - name: DOCLING_SERVE_ENABLE_UI
              value: 'true'
            - name: DOCLING_SERVE_ARTIFACTS_PATH
              value: '/modelcache'
          ports:
            - name: http
              containerPort: 5001
              protocol: TCP
          imagePullPolicy: Always
          image: 'ghcr.io/docling-project/docling-serve-cpu'
          volumeMounts:
            - name: docling-model-cache
              mountPath: /modelcache
      volumes:
        - name: docling-model-cache
          persistentVolumeClaim:
            claimName: docling-model-cache-pvc

docs/deploy-examples/docling-model-cache-job.yaml

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: docling-model-cache-load
spec:
  selector: {}
  template:
    metadata:
      name: docling-model-load
    spec:
      containers:
        - name: loader
          image: ghcr.io/docling-project/docling-serve-cpu:main
          command:
            - docling-tools
            - models
            - download
            - '--output-dir=/modelcache'
            - 'layout'
            - 'tableformer'
            - 'code_formula'
            - 'picture_classifier'
            - 'smolvlm'
            - 'granite_vision'
            - 'easyocr'
          volumeMounts:
            - name: docling-model-cache
              mountPath: /modelcache
      volumes:
        - name: docling-model-cache
          persistentVolumeClaim:
            claimName: docling-model-cache-pvc
      restartPolicy: Never

docs/deploy-examples/docling-model-cache-pvc.yaml

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: docling-model-cache-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi

docs/pre-loading-models.md

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
# Pre-loading models for docling

This document provides examples for pre-loading docling models to a persistent volume and re-using that volume in docling-serve deployments.

1. Create a persistent volume claim that will store the model weights:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: docling-model-cache-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi
```

If you don't want to use the default storage class, set a custom storage class as follows:

```yaml
spec:
  ...
  storageClassName: <Storage Class Name>
```

Manifest example: [docling-model-cache-pvc.yaml](./deploy-examples/docling-model-cache-pvc.yaml)

2. Download the model weights with `docling-tools`. Because this is a one-time operation, a Kubernetes Job is a good fit:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: docling-model-cache-load
spec:
  selector: {}
  template:
    metadata:
      name: docling-model-load
    spec:
      containers:
        - name: loader
          image: ghcr.io/docling-project/docling-serve-cpu:main
          command:
            - docling-tools
            - models
            - download
            - '--output-dir=/modelcache'
            - 'layout'
            - 'tableformer'
            - 'code_formula'
            - 'picture_classifier'
            - 'smolvlm'
            - 'granite_vision'
            - 'easyocr'
          volumeMounts:
            - name: docling-model-cache
              mountPath: /modelcache
      volumes:
        - name: docling-model-cache
          persistentVolumeClaim:
            claimName: docling-model-cache-pvc
      restartPolicy: Never
```

The job mounts the previously created persistent volume and runs the same command we would use to download models locally:
`docling-tools models download --output-dir <MOUNT-PATH> [LIST_OF_MODELS]`

The manifest lists the desired models individually; alternatively, use the `--all` parameter to download all models.

Manifest example: [docling-model-cache-job.yaml](./deploy-examples/docling-model-cache-job.yaml)
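
For the `--all` variant mentioned above, the loader container's `command` might look like the fragment below. This is a sketch rather than part of the example manifest: it assumes `--all` simply takes the place of the explicit model list, and that everything else in the job stays unchanged.

```yaml
# Fragment of the loader container in the Job spec (sketch, not a full manifest).
command:
  - docling-tools
  - models
  - download
  - '--output-dir=/modelcache'
  # Assumption: '--all' replaces the individual model names.
  - '--all'
```

Downloading every model takes more space than a hand-picked list, so check that the storage requested by the claim is still sufficient.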

3. Mount the volume in the docling-serve deployment and set the `DOCLING_SERVE_ARTIFACTS_PATH` environment variable to point to it.
Make the following additions to the deployment:

```yaml
spec:
  template:
    spec:
      containers:
        - name: api
          env:
            ...
            - name: DOCLING_SERVE_ARTIFACTS_PATH
              value: '/modelcache'
          volumeMounts:
            - name: docling-model-cache
              mountPath: /modelcache
          ...
      volumes:
        - name: docling-model-cache
          persistentVolumeClaim:
            claimName: docling-model-cache-pvc
```

Make sure the value of `DOCLING_SERVE_ARTIFACTS_PATH` matches both the directory the models were downloaded to and the path where the volume is mounted.

Now, when docling-serve executes tasks, the underlying docling installation will load model weights from the mounted volume.

Manifest example: [docling-model-cache-deployment.yaml](./deploy-examples/docling-model-cache-deployment.yaml)
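
To illustrate the path requirement above, the sketch below shows the places that would have to change together if a different location were used. The path `/models` is hypothetical and does not appear in the example manifests; the simplest way to keep things aligned is to use the same value for the job's `--output-dir`, both `mountPath` entries, and `DOCLING_SERVE_ARTIFACTS_PATH`.

```yaml
# Job, loader container (fragment): download target and mount point.
command:
  - docling-tools
  - models
  - download
  - '--output-dir=/models'   # hypothetical path, must match the values below
  - 'layout'                 # remaining models as in the job manifest
volumeMounts:
  - name: docling-model-cache
    mountPath: /models
---
# Deployment, api container (fragment): artifacts path and mount point.
env:
  - name: DOCLING_SERVE_ARTIFACTS_PATH
    value: '/models'
volumeMounts:
  - name: docling-model-cache
    mountPath: /models
```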
