Commit 7dc3d9d

move --image & --keep-groups to run, serve, perplexity, bench commands

This eliminates accidental image pulls when not using containers. Since these options are only used by container commands, there is no need for them anywhere else.

Fixes: #1662
Signed-off-by: Daniel J Walsh <[email protected]>

1 parent: b7c15ce

17 files changed: +192 additions, -67 deletions
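
In sketch form, the change moves container-only flags from the global parser into a per-command helper. The argparse sketch below is illustrative, not RamaLama's actual code; the real `runtime_options` also wires in `accel_image(CONFIG)` defaults and shell completers.

```python
import argparse

def runtime_options(parser):
    # Container-only flags, now registered per subcommand instead of globally.
    parser.add_argument("--image", default="quay.io/ramalama/ramalama")
    parser.add_argument("--keep-groups", dest="podman_keep_groups", action="store_true")

top = argparse.ArgumentParser(prog="ramalama")
sub = top.add_subparsers(dest="command")
for name in ("run", "serve", "perplexity", "bench"):
    runtime_options(sub.add_parser(name))
sub.add_parser("list")  # non-container command: no --image default to resolve or pull

args = top.parse_args(["run", "--image", "quay.io/ramalama/cuda:0.10"])
print(args.image)  # quay.io/ramalama/cuda:0.10
```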

docs/ramalama-bench.1.md
Lines changed: 27 additions & 0 deletions

@@ -48,6 +48,33 @@ for a value and set the variable only if it is set on the host.
 #### **--help**, **-h**
 show this help message and exit
 
+#### **--image**=IMAGE
+OCI container image to run with specified AI model. RamaLama defaults to using
+images based on the accelerator it discovers. For example:
+`quay.io/ramalama/ramalama`. See the table below for all default images.
+The default image tag is based on the minor version of the RamaLama package.
+Version 0.10.0 of RamaLama pulls an image with a `:0.10` tag from the quay.io/ramalama OCI repository. The --image option overrides this default.
+
+The default can be overridden in the ramalama.conf file or via the
+RAMALAMA_IMAGE environment variable. `export RAMALAMA_IMAGE=quay.io/ramalama/aiimage:1.2` tells
+RamaLama to use the `quay.io/ramalama/aiimage:1.2` image.
+
+Accelerated images:
+
+| Accelerator | Image |
+| ------------------------| -------------------------- |
+| CPU, Apple | quay.io/ramalama/ramalama |
+| HIP_VISIBLE_DEVICES | quay.io/ramalama/rocm |
+| CUDA_VISIBLE_DEVICES | quay.io/ramalama/cuda |
+| ASAHI_VISIBLE_DEVICES | quay.io/ramalama/asahi |
+| INTEL_VISIBLE_DEVICES | quay.io/ramalama/intel-gpu |
+| ASCEND_VISIBLE_DEVICES | quay.io/ramalama/cann |
+| MUSA_VISIBLE_DEVICES | quay.io/ramalama/musa |
+
+#### **--keep-groups**
+pass --group-add keep-groups to podman (default: False)
+If GPU device on host system is accessible to user via group access, this option leaks the groups into the container.
+
 #### **--name**, **-n**
 name of the container to run the Model in
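
The tag rule in the new `--image` text is mechanical enough to sketch. `derive_tag` below is a hypothetical helper, not RamaLama's implementation; it only mirrors the documented "minor version" behavior:

```python
def derive_tag(package_version: str) -> str:
    # Documented rule: the default image tag is the package's minor version,
    # so RamaLama 0.10.0 pulls quay.io/ramalama/<image>:0.10.
    major, minor = package_version.split(".")[:2]
    return f"{major}.{minor}"

assert derive_tag("0.10.0") == "0.10"
assert derive_tag("1.2.3") == "1.2"
```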

docs/ramalama-perplexity.1.md
Lines changed: 27 additions & 0 deletions

@@ -53,6 +53,33 @@ for a value and set the variable only if it is set on the host.
 #### **--help**, **-h**
 show this help message and exit
 
+#### **--image**=IMAGE
+OCI container image to run with specified AI model. RamaLama defaults to using
+images based on the accelerator it discovers. For example:
+`quay.io/ramalama/ramalama`. See the table below for all default images.
+The default image tag is based on the minor version of the RamaLama package.
+Version 0.10.0 of RamaLama pulls an image with a `:0.10` tag from the quay.io/ramalama OCI repository. The --image option overrides this default.
+
+The default can be overridden in the ramalama.conf file or via the
+RAMALAMA_IMAGE environment variable. `export RAMALAMA_IMAGE=quay.io/ramalama/aiimage:1.2` tells
+RamaLama to use the `quay.io/ramalama/aiimage:1.2` image.
+
+Accelerated images:
+
+| Accelerator | Image |
+| ------------------------| -------------------------- |
+| CPU, Apple | quay.io/ramalama/ramalama |
+| HIP_VISIBLE_DEVICES | quay.io/ramalama/rocm |
+| CUDA_VISIBLE_DEVICES | quay.io/ramalama/cuda |
+| ASAHI_VISIBLE_DEVICES | quay.io/ramalama/asahi |
+| INTEL_VISIBLE_DEVICES | quay.io/ramalama/intel-gpu |
+| ASCEND_VISIBLE_DEVICES | quay.io/ramalama/cann |
+| MUSA_VISIBLE_DEVICES | quay.io/ramalama/musa |
+
+#### **--keep-groups**
+pass --group-add keep-groups to podman (default: False)
+If GPU device on host system is accessible to user via group access, this option leaks the groups into the container.
+
 #### **--name**, **-n**
 name of the container to run the Model in

docs/ramalama-rag.1.md
Lines changed: 27 additions & 0 deletions

@@ -35,6 +35,33 @@ for a value and set the variable only if it is set on the host.
 #### **--help**, **-h**
 Print usage message
 
+#### **--image**=IMAGE
+OCI container image to run with specified AI model. RamaLama defaults to using
+images based on the accelerator it discovers. For example:
+`quay.io/ramalama/ramalama-rag`. See the table below for all default images.
+The default image tag is based on the minor version of the RamaLama package.
+Version 0.10.0 of RamaLama pulls an image with a `:0.10` tag from the quay.io/ramalama OCI repository. The --image option overrides this default.
+
+The default can be overridden in the ramalama.conf file or via the
+RAMALAMA_IMAGE environment variable. `export RAMALAMA_IMAGE=quay.io/ramalama/aiimage:1.2` tells
+RamaLama to use the `quay.io/ramalama/aiimage:1.2` image.
+
+Accelerated images:
+
+| Accelerator | Image |
+| ------------------------| ------------------------------ |
+| CPU, Apple | quay.io/ramalama/ramalama-rag |
+| HIP_VISIBLE_DEVICES | quay.io/ramalama/rocm-rag |
+| CUDA_VISIBLE_DEVICES | quay.io/ramalama/cuda-rag |
+| ASAHI_VISIBLE_DEVICES | quay.io/ramalama/asahi-rag |
+| INTEL_VISIBLE_DEVICES | quay.io/ramalama/intel-gpu-rag |
+| ASCEND_VISIBLE_DEVICES | quay.io/ramalama/cann-rag |
+| MUSA_VISIBLE_DEVICES | quay.io/ramalama/musa-rag |
+
+#### **--keep-groups**
+pass --group-add keep-groups to podman (default: False)
+If GPU device on host system is accessible to user via group access, this option leaks the groups into the container.
+
 #### **--network**=*none*
 sets the configuration for network namespaces when handling RUN instructions

docs/ramalama-run.1.md
Lines changed: 27 additions & 0 deletions

@@ -61,6 +61,33 @@ for a value and set the variable only if it is set on the host.
 #### **--help**, **-h**
 Show this help message and exit
 
+#### **--image**=IMAGE
+OCI container image to run with specified AI model. RamaLama defaults to using
+images based on the accelerator it discovers. For example:
+`quay.io/ramalama/ramalama`. See the table below for all default images.
+The default image tag is based on the minor version of the RamaLama package.
+Version 0.10.0 of RamaLama pulls an image with a `:0.10` tag from the quay.io/ramalama OCI repository. The --image option overrides this default.
+
+The default can be overridden in the ramalama.conf file or via the
+RAMALAMA_IMAGE environment variable. `export RAMALAMA_IMAGE=quay.io/ramalama/aiimage:1.2` tells
+RamaLama to use the `quay.io/ramalama/aiimage:1.2` image.
+
+Accelerated images:
+
+| Accelerator | Image |
+| ------------------------| -------------------------- |
+| CPU, Apple | quay.io/ramalama/ramalama |
+| HIP_VISIBLE_DEVICES | quay.io/ramalama/rocm |
+| CUDA_VISIBLE_DEVICES | quay.io/ramalama/cuda |
+| ASAHI_VISIBLE_DEVICES | quay.io/ramalama/asahi |
+| INTEL_VISIBLE_DEVICES | quay.io/ramalama/intel-gpu |
+| ASCEND_VISIBLE_DEVICES | quay.io/ramalama/cann |
+| MUSA_VISIBLE_DEVICES | quay.io/ramalama/musa |
+
+#### **--keep-groups**
+pass --group-add keep-groups to podman (default: False)
+If GPU device on host system is accessible to user via group access, this option leaks the groups into the container.
+
 #### **--keepalive**
 duration to keep a model loaded (e.g. 5m)
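
The same option text implies a resolution order: an explicit `--image` wins, otherwise the `RAMALAMA_IMAGE` environment variable, otherwise the accelerator default. A hedged sketch of that precedence (`resolve_image` is hypothetical, and the ramalama.conf override is omitted for brevity):

```python
import os

def resolve_image(cli_image: str | None, accel_default: str) -> str:
    # --image beats RAMALAMA_IMAGE, which beats the discovered default.
    return cli_image or os.environ.get("RAMALAMA_IMAGE") or accel_default

os.environ["RAMALAMA_IMAGE"] = "quay.io/ramalama/aiimage:1.2"
print(resolve_image(None, "quay.io/ramalama/ramalama"))  # quay.io/ramalama/aiimage:1.2
print(resolve_image("quay.io/ramalama/cuda:0.10", "quay.io/ramalama/ramalama"))  # CLI wins
```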

docs/ramalama-serve.1.md
Lines changed: 27 additions & 0 deletions

@@ -93,6 +93,33 @@ show this help message and exit
 #### **--host**="0.0.0.0"
 IP address for llama.cpp to listen on.
 
+#### **--image**=IMAGE
+OCI container image to run with specified AI model. RamaLama defaults to using
+images based on the accelerator it discovers. For example:
+`quay.io/ramalama/ramalama`. See the table below for all default images.
+The default image tag is based on the minor version of the RamaLama package.
+Version 0.10.0 of RamaLama pulls an image with a `:0.10` tag from the quay.io/ramalama OCI repository. The --image option overrides this default.
+
+The default can be overridden in the ramalama.conf file or via the
+RAMALAMA_IMAGE environment variable. `export RAMALAMA_IMAGE=quay.io/ramalama/aiimage:1.2` tells
+RamaLama to use the `quay.io/ramalama/aiimage:1.2` image.
+
+Accelerated images:
+
+| Accelerator | Image |
+| ------------------------| -------------------------- |
+| CPU, Apple | quay.io/ramalama/ramalama |
+| HIP_VISIBLE_DEVICES | quay.io/ramalama/rocm |
+| CUDA_VISIBLE_DEVICES | quay.io/ramalama/cuda |
+| ASAHI_VISIBLE_DEVICES | quay.io/ramalama/asahi |
+| INTEL_VISIBLE_DEVICES | quay.io/ramalama/intel-gpu |
+| ASCEND_VISIBLE_DEVICES | quay.io/ramalama/cann |
+| MUSA_VISIBLE_DEVICES | quay.io/ramalama/musa |
+
+#### **--keep-groups**
+pass --group-add keep-groups to podman (default: False)
+If GPU device on host system is accessible to user via group access, this option leaks the groups into the container.
+
 #### **--model-draft**
 
 A draft model is a smaller, faster model that helps accelerate the decoding
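
For `--keep-groups`, the man-page text maps onto a single podman flag. A sketch of the plausible wiring (illustrative only; the real translation lives in RamaLama's engine code, which this commit does not touch):

```python
def podman_base_args(podman_keep_groups: bool) -> list[str]:
    args = ["podman", "run", "--rm"]
    if podman_keep_groups:
        # Keep the caller's supplementary groups inside the container, so
        # group-gated GPU device nodes on the host remain accessible.
        args += ["--group-add", "keep-groups"]
    return args

print(podman_base_args(True))   # ['podman', 'run', '--rm', '--group-add', 'keep-groups']
print(podman_base_args(False))  # ['podman', 'run', '--rm']
```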

docs/ramalama.1.md
Lines changed: 0 additions & 27 deletions

@@ -23,18 +23,6 @@ version of RamaLama. For example RamaLama version 1.2.3 on an NVIDIA system
 pulls quay.io/ramalama/cuda:1.2. To override the default image use the
 `--image` option.
 
-Accelerated images:
-
-| Accelerator | Image |
-| ------------------------| -------------------------- |
-| CPU, Apple | quay.io/ramalama/ramalama |
-| HIP_VISIBLE_DEVICES | quay.io/ramalama/rocm |
-| CUDA_VISIBLE_DEVICES | quay.io/ramalama/cuda |
-| ASAHI_VISIBLE_DEVICES | quay.io/ramalama/asahi |
-| INTEL_VISIBLE_DEVICES | quay.io/ramalama/intel-gpu |
-| ASCEND_VISIBLE_DEVICES | quay.io/ramalama/cann |
-| MUSA_VISIBLE_DEVICES | quay.io/ramalama/musa |
-
 RamaLama pulls AI Models from model registries. Starting a chatbot or a rest API service from a simple single command. Models are treated similarly to how Podman and Docker treat container images.
 
 When both Podman and Docker are installed, RamaLama defaults to Podman, The `RAMALAMA_CONTAINER_ENGINE=docker` environment variable can override this behaviour. When neither are installed RamaLama attempts to run the model with software on the local system.
@@ -137,21 +125,6 @@ The default can be overridden in the ramalama.conf file or via the RAMALAMA_CONT
 #### **--help**, **-h**
 show this help message and exit
 
-#### **--image**=IMAGE
-OCI container image to run with specified AI model. RamaLama defaults to use
-images based on the accelerator it discovers. For example:
-`quay.io/ramalama/ramalama`. See the table below for all default images.
-The default image tag is based on the minor version of the RamaLama package.
-Version 0.10.0 of RamaLama pulls $IMAGE:0.10 from the quay.io/ramalama OCI repository. The --image option overrides this default.
-
-The default can be overridden in the ramalama.conf file or via the
-RAMALAMA_IMAGE environment variable. `export RAMALAMA_IMAGE=quay.io/ramalama/aiimage:1.2` tells
-RamaLama to use the `quay.io/ramalama/aiimage:1.2` image.
-
-#### **--keep-groups**
-pass --group-add keep-groups to podman (default: False)
-Needed to access the gpu on some systems, but has an impact on security, use with caution.
-
 #### **--nocontainer**
 Do not run RamaLama in the default container (default: False)
 The default can be overridden in the ramalama.conf file.

docs/ramalama.conf
Lines changed: 1 addition & 1 deletion

@@ -25,7 +25,7 @@
 # OCI model car image
 # Image to use when building and pushing --type=car models
 #
-#carimage = "registry.access.redhat.com/ubi9-micro:latest"
+#carimage = "registry.access.redhat.com/ubi10-micro:latest"
 
 # Run RamaLama in the default container.
 #

docs/ramalama.conf.5.md
Lines changed: 1 addition & 1 deletion

@@ -65,7 +65,7 @@ The ramalama table contains settings to configure and manage the OCI runtime.
 Unified API layer for Inference, RAG, Agents, Tools, Safety, Evals, and Telemetry.
 Options: llama-stack, none
 
-**carimage**="registry.access.redhat.com/ubi9-micro:latest"
+**carimage**="registry.access.redhat.com/ubi10-micro:latest"
 
 OCI model car image
 

ramalama/cli.py
Lines changed: 33 additions & 17 deletions

@@ -21,7 +21,6 @@
 
 import ramalama.chat as chat
 import ramalama.oci
-import ramalama.rag
 from ramalama import engine
 from ramalama.chat import default_prefix
 from ramalama.common import accel_image, get_accel, perror
@@ -30,6 +29,7 @@
 from ramalama.model import MODEL_TYPES
 from ramalama.model_factory import ModelFactory, New
 from ramalama.model_store.global_store import GlobalModelStore
+from ramalama.rag import rag_image
 from ramalama.shortnames import Shortnames
 from ramalama.stack import Stack
 from ramalama.version import print_version, version
@@ -192,21 +192,6 @@ def configure_arguments(parser):
         default=CONFIG.engine,
         help="""run RamaLama using the specified container engine.
 The RAMALAMA_CONTAINER_ENGINE environment variable modifies default behaviour.""",
-    )
-    parser.add_argument(
-        "--image",
-        default=accel_image(CONFIG),
-        help="OCI container image to run with the specified AI model",
-        action=OverrideDefaultAction,
-        completer=local_images,
-    )
-    parser.add_argument(
-        "--keep-groups",
-        dest="podman_keep_groups",
-        default=CONFIG.keep_groups,
-        action="store_true",
-        help="""pass `--group-add keep-groups` to podman, if using podman.
-Needed to access gpu on some systems, but has security implications.""",
     )
     parser.add_argument(
         "--nocontainer",
@@ -520,7 +505,7 @@ def info_cli(args):
         "Engine": {
             "Name": args.engine,
         },
-        "Image": args.image,
+        "Image": accel_image(CONFIG),
         "Runtime": args.runtime,
         "Store": args.store,
         "UseContainer": args.container,
@@ -662,6 +647,7 @@ def convert_cli(args):
     model = ModelFactory(tgt, args).create_oci()
 
     source_model = _get_source_model(args)
+    args.carimage = rag_image(accel_image(CONFIG))
     model.convert(source_model, args)
 
 
@@ -789,6 +775,21 @@ def runtime_options(parser, command):
         help="IP address to listen",
         completer=suppressCompleter,
     )
+    parser.add_argument(
+        "--image",
+        default=accel_image(CONFIG),
+        help="OCI container image to run with the specified AI model",
+        action=OverrideDefaultAction,
+        completer=local_images,
+    )
+    parser.add_argument(
+        "--keep-groups",
+        dest="podman_keep_groups",
+        default=CONFIG.keep_groups,
+        action="store_true",
+        help="""pass `--group-add keep-groups` to podman.
+If the GPU device on the host is accessible via group access, this option leaks the user groups into the container.""",
+    )
     if command == "run":
         parser.add_argument(
             "--keepalive", type=str, help="duration to keep a model loaded (e.g. 5m)", completer=suppressCompleter
@@ -1060,6 +1061,21 @@ def rag_parser(subparsers):
         help="environment variables to add to the running RAG container",
         completer=local_env,
     )
+    parser.add_argument(
+        "--image",
+        default=accel_image(CONFIG),
+        help="OCI container image to run with the specified AI model",
+        action=OverrideDefaultAction,
+        completer=local_images,
+    )
+    parser.add_argument(
+        "--keep-groups",
+        dest="podman_keep_groups",
+        default=CONFIG.keep_groups,
+        action="store_true",
+        help="""pass `--group-add keep-groups` to podman.
+If the GPU device on the host is accessible via group access, this option leaks the user groups into the container.""",
+    )
     add_network_argument(parser, dflt=None)
     parser.add_argument(
         "--pull",

ramalama/config.py
Lines changed: 1 addition & 1 deletion

@@ -64,7 +64,7 @@ class RamalamaSettings:
 class BaseConfig:
     container: bool = None  # type: ignore
     image: str = None  # type: ignore
-    carimage: str = "registry.access.redhat.com/ubi9-micro:latest"
+    carimage: str = "registry.access.redhat.com/ubi10-micro:latest"
     ctx_size: int = 2048
     engine: SUPPORTED_ENGINES | None = field(default_factory=get_default_engine)
     env: list[str] = field(default_factory=list)
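
The config change is a one-line default bump, but it interacts with ramalama.conf: the dataclass default applies only when no override is loaded. A reduced sketch (`load_conf` is a hypothetical stand-in for RamaLama's config loading):

```python
from dataclasses import dataclass

@dataclass
class BaseConfig:
    # New default base image for --type=car models.
    carimage: str = "registry.access.redhat.com/ubi10-micro:latest"

def load_conf(overrides: dict) -> BaseConfig:
    # A ramalama.conf value, when present, replaces the dataclass default.
    return BaseConfig(**overrides)

print(load_conf({}).carimage)                                   # ubi10-micro default
print(load_conf({"carimage": "example.com/mycar:1"}).carimage)  # conf override wins
```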
