Skip to content

Always failed to run a specific model using ramalama run $model_name #1414

@melodyliu1986

Description

@melodyliu1986

Issue Description

I wanted to run a chatbot using the run command, but it never succeeded. Can you provide me with some tips on what happened here?

Steps to reproduce the issue

$ cat /etc/redhat-release 
Fedora release 40 (Forty)

$ pip show ramalama
Name: ramalama
Version: 0.8.3
Summary: RamaLama is a command line tool for working with AI LLM models.
Home-page: https://github.com/containers/ramalama

$ ramalama pull granite3-moe

$ ramalama pull tiny

$ ramalama list
NAME                                      MODIFIED       SIZE     
ollama://granite3-moe/granite3-moe:latest 14 minutes ago 783.77 MB
ollama://tinyllama/tinyllama:latest       2 seconds ago  608.16 MB

$ ramalama run granite3-moe
🦭 > how are you?
Error: could not connect to: http://127.0.0.1:8080/v1/chat/completions

$ ramalama run tinyllama
🦭 > what is the longest river in the world?
Error: could not connect to: http://127.0.0.1:8080/v1/chat/completions

$ ramalama serve granite3-moe
serving on port 8080
build: 5335 (d8919424) with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) for x86_64-redhat-linux
system info: n_threads = 4, n_threads_batch = 4, total_threads = 8

Image

Describe the results you received

the ramalama run always failed, but it's successful to serve the model using ramalama serve.

Describe the results you expected

I expected that I could run the chatbot successfully.

ramalama info output

$ ramalama info
{
    "Accelerator": "none",
    "Engine": {
        "Info": {
            "host": {
                "arch": "amd64",
                "buildahVersion": "1.39.0",
                "cgroupControllers": [
                    "cpu",
                    "memory",
                    "pids"
                ],
                "cgroupManager": "systemd",
                "cgroupVersion": "v2",
                "conmon": {
                    "package": "conmon-2.1.12-2.fc40.x86_64",
                    "path": "/usr/bin/conmon",
                    "version": "conmon version 2.1.12, commit: "
                },
                "cpuUtilization": {
                    "idlePercent": 96.22,
                    "systemPercent": 0.75,
                    "userPercent": 3.03
                },
                "cpus": 8,
                "databaseBackend": "sqlite",
                "distribution": {
                    "distribution": "fedora",
                    "variant": "matecompiz",
                    "version": "40"
                },
                "eventLogger": "journald",
                "freeLocks": 2048,
                "hostname": "fedora",
                "idMappings": {
                    "gidmap": [
                        {
                            "container_id": 0,
                            "host_id": 1000,
                            "size": 1
                        },
                        {
                            "container_id": 1,
                            "host_id": 524288,
                            "size": 65536
                        }
                    ],
                    "uidmap": [
                        {
                            "container_id": 0,
                            "host_id": 1000,
                            "size": 1
                        },
                        {
                            "container_id": 1,
                            "host_id": 524288,
                            "size": 65536
                        }
                    ]
                },
                "kernel": "6.13.8-100.fc40.x86_64",
                "linkmode": "dynamic",
                "logDriver": "journald",
                "memFree": 17538068480,
                "memTotal": 33331781632,
                "networkBackend": "netavark",
                "networkBackendInfo": {
                    "backend": "netavark",
                    "dns": {
                        "package": "aardvark-dns-1.14.0-1.fc40.x86_64",
                        "path": "/usr/libexec/podman/aardvark-dns",
                        "version": "aardvark-dns 1.14.0"
                    },
                    "package": "netavark-1.14.0-1.fc40.x86_64",
                    "path": "/usr/libexec/podman/netavark",
                    "version": "netavark 1.14.0"
                },
                "ociRuntime": {
                    "name": "crun",
                    "package": "crun-1.20-2.fc40.x86_64",
                    "path": "/usr/bin/crun",
                    "version": "crun version 1.20\ncommit: 9c9a76ac11994701dd666c4f0b869ceffb599a66\nrundir: /run/user/1000/crun\nspec: 1.0.0\n+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL"
                },
                "os": "linux",
                "pasta": {
                    "executable": "/usr/bin/pasta",
                    "package": "passt-0^20250217.ga1e48a0-2.fc40.x86_64",
                    "version": ""
                },
                "remoteSocket": {
                    "exists": true,
                    "path": "/run/user/1000/podman/podman.sock"
                },
                "rootlessNetworkCmd": "pasta",
                "security": {
                    "apparmorEnabled": false,
                    "capabilities": "CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT",
                    "rootless": true,
                    "seccompEnabled": true,
                    "seccompProfilePath": "/usr/share/containers/seccomp.json",
                    "selinuxEnabled": true
                },
                "serviceIsRemote": false,
                "slirp4netns": {
                    "executable": "/usr/bin/slirp4netns",
                    "package": "slirp4netns-1.3.1-1.fc40.x86_64",
                    "version": "slirp4netns version 1.3.1\ncommit: e5e368c4f5db6ae75c2fce786e31eef9da6bf236\nlibslirp: 4.7.0\nSLIRP_CONFIG_VERSION_MAX: 4\nlibseccomp: 2.5.5"
                },
                "swapFree": 8589930496,
                "swapTotal": 8589930496,
                "uptime": "5h 39m 32.00s (Approximately 0.21 days)",
                "variant": ""
            },
            "plugins": {
                "authorization": null,
                "log": [
                    "k8s-file",
                    "none",
                    "passthrough",
                    "journald"
                ],
                "network": [
                    "bridge",
                    "macvlan",
                    "ipvlan"
                ],
                "volume": [
                    "local"
                ]
            },
            "registries": {
                "search": [
                    "registry.fedoraproject.org",
                    "registry.access.redhat.com",
                    "docker.io",
                    "registry-proxy.engineering.redhat.com"
                ]
            },
            "store": {
                "configFile": "/home/songliu/.config/containers/storage.conf",
                "containerStore": {
                    "number": 0,
                    "paused": 0,
                    "running": 0,
                    "stopped": 0
                },
                "graphDriverName": "overlay",
                "graphOptions": {},
                "graphRoot": "/home/songliu/.local/share/containers/storage",
                "graphRootAllocated": 510405902336,
                "graphRootUsed": 152152965120,
                "graphStatus": {
                    "Backing Filesystem": "btrfs",
                    "Native Overlay Diff": "true",
                    "Supports d_type": "true",
                    "Supports shifting": "false",
                    "Supports volatile": "true",
                    "Using metacopy": "false"
                },
                "imageCopyTmpDir": "/var/tmp",
                "imageStore": {
                    "number": 152
                },
                "runRoot": "/run/user/1000/containers",
                "transientStore": false,
                "volumePath": "/home/songliu/.local/share/containers/storage/volumes"
            },
            "version": {
                "APIVersion": "5.4.0",
                "BuildOrigin": "Fedora Project",
                "Built": 1739232000,
                "BuiltTime": "Tue Feb 11 08:00:00 2025",
                "GitCommit": "",
                "GoVersion": "go1.22.11",
                "Os": "linux",
                "OsArch": "linux/amd64",
                "Version": "5.4.0"
            }
        },
        "Name": "podman"
    },
    "Image": "quay.io/ramalama/ramalama:0.8",
    "Runtime": "llama.cpp",
    "Shortnames": {
        "Files": [
            "/home/songliu/.local/share/ramalama/shortnames.conf"
        ],
        "Names": {
            "cerebrum": "huggingface://froggeric/Cerebrum-1.0-7b-GGUF/Cerebrum-1.0-7b-Q4_KS.gguf",
            "deepseek": "ollama://deepseek-r1",
            "dragon": "huggingface://llmware/dragon-mistral-7b-v0/dragon-mistral-7b-q4_k_m.gguf",
            "gemma3": "hf://bartowski/google_gemma-3-4b-it-GGUF/google_gemma-3-4b-it-IQ2_M.gguf",
            "gemma3:12b": "hf://bartowski/google_gemma-3-12b-it-GGUF/google_gemma-3-12b-it-IQ2_M.gguf",
            "gemma3:1b": "hf://bartowski/google_gemma-3-1b-it-GGUF/google_gemma-3-1b-it-IQ2_M.gguf",
            "gemma3:27b": "hf://bartowski/google_gemma-3-27b-it-GGUF/google_gemma-3-27b-it-IQ2_M.gguf",
            "gemma3:4b": "hf://bartowski/google_gemma-3-4b-it-GGUF/google_gemma-3-4b-it-IQ2_M.gguf",
            "granite": "ollama://granite3.1-dense",
            "granite-code": "hf://ibm-granite/granite-3b-code-base-2k-GGUF/granite-3b-code-base.Q4_K_M.gguf",
            "granite-code:20b": "hf://ibm-granite/granite-20b-code-base-8k-GGUF/granite-20b-code-base.Q4_K_M.gguf",
            "granite-code:34b": "hf://ibm-granite/granite-34b-code-base-8k-GGUF/granite-34b-code-base.Q4_K_M.gguf",
            "granite-code:3b": "hf://ibm-granite/granite-3b-code-base-2k-GGUF/granite-3b-code-base.Q4_K_M.gguf",
            "granite-code:8b": "hf://ibm-granite/granite-8b-code-base-4k-GGUF/granite-8b-code-base.Q4_K_M.gguf",
            "granite-lab-7b": "huggingface://instructlab/granite-7b-lab-GGUF/granite-7b-lab-Q4_K_M.gguf",
            "granite-lab-8b": "huggingface://ibm-granite/granite-8b-code-base-GGUF/granite-8b-code-base.Q4_K_M.gguf",
            "granite-lab:7b": "huggingface://instructlab/granite-7b-lab-GGUF/granite-7b-lab-Q4_K_M.gguf",
            "granite:2b": "ollama://granite3.1-dense:2b",
            "granite:7b": "huggingface://instructlab/granite-7b-lab-GGUF/granite-7b-lab-Q4_K_M.gguf",
            "granite:8b": "ollama://granite3.1-dense:8b",
            "hermes": "huggingface://NousResearch/Hermes-2-Pro-Mistral-7B-GGUF/Hermes-2-Pro-Mistral-7B.Q4_K_M.gguf",
            "ibm/granite": "ollama://granite3.1-dense:8b",
            "ibm/granite:2b": "ollama://granite3.1-dense:2b",
            "ibm/granite:7b": "huggingface://instructlab/granite-7b-lab-GGUF/granite-7b-lab-Q4_K_M.gguf",
            "ibm/granite:8b": "ollama://granite3.1-dense:8b",
            "merlinite": "huggingface://instructlab/merlinite-7b-lab-GGUF/merlinite-7b-lab-Q4_K_M.gguf",
            "merlinite-lab-7b": "huggingface://instructlab/merlinite-7b-lab-GGUF/merlinite-7b-lab-Q4_K_M.gguf",
            "merlinite-lab:7b": "huggingface://instructlab/merlinite-7b-lab-GGUF/merlinite-7b-lab-Q4_K_M.gguf",
            "merlinite:7b": "huggingface://instructlab/merlinite-7b-lab-GGUF/merlinite-7b-lab-Q4_K_M.gguf",
            "mistral": "huggingface://TheBloke/Mistral-7B-Instruct-v0.2-GGUF/mistral-7b-instruct-v0.2.Q4_K_M.gguf",
            "mistral-small3.1": "hf://bartowski/mistralai_Mistral-Small-3.1-24B-Instruct-2503-GGUF/mistralai_Mistral-Small-3.1-24B-Instruct-2503-IQ2_M.gguf",
            "mistral-small3.1:24b": "hf://bartowski/mistralai_Mistral-Small-3.1-24B-Instruct-2503-GGUF/mistralai_Mistral-Small-3.1-24B-Instruct-2503-IQ2_M.gguf",
            "mistral:7b": "huggingface://TheBloke/Mistral-7B-Instruct-v0.2-GGUF/mistral-7b-instruct-v0.2.Q4_K_M.gguf",
            "mistral:7b-v1": "huggingface://TheBloke/Mistral-7B-Instruct-v0.1-GGUF/mistral-7b-instruct-v0.1.Q5_K_M.gguf",
            "mistral:7b-v2": "huggingface://TheBloke/Mistral-7B-Instruct-v0.2-GGUF/mistral-7b-instruct-v0.2.Q4_K_M.gguf",
            "mistral:7b-v3": "huggingface://MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF/Mistral-7B-Instruct-v0.3.Q4_K_M.gguf",
            "mistral_code_16k": "huggingface://TheBloke/Mistral-7B-Code-16K-qlora-GGUF/mistral-7b-code-16k-qlora.Q4_K_M.gguf",
            "mistral_codealpaca": "huggingface://TheBloke/Mistral-7B-codealpaca-lora-GGUF/mistral-7b-codealpaca-lora.Q4_K_M.gguf",
            "mixtao": "huggingface://MaziyarPanahi/MixTAO-7Bx2-MoE-Instruct-v7.0-GGUF/MixTAO-7Bx2-MoE-Instruct-v7.0.Q4_K_M.gguf",
            "openchat": "huggingface://TheBloke/openchat-3.5-0106-GGUF/openchat-3.5-0106.Q4_K_M.gguf",
            "openorca": "huggingface://TheBloke/Mistral-7B-OpenOrca-GGUF/mistral-7b-openorca.Q4_K_M.gguf",
            "phi2": "huggingface://MaziyarPanahi/phi-2-GGUF/phi-2.Q4_K_M.gguf",
            "smollm:135m": "ollama://smollm:135m",
            "tiny": "ollama://tinyllama"
        }
    },
    "Store": "/home/songliu/.local/share/ramalama",
    "UseContainer": true,
    "Version": "0.8.3"
}

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions