Releases: containers/ramalama
v0.11.3
What's Changed
- musa: re-enable whisper.cpp build and update its commit SHA by @yeahdongcn in #1758
- Bump to v0.11.2 by @rhatdan in #1757
- fix model name in stack.py by @pbalczynski in #1759
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1753769805 by @renovate[bot] in #1765
- vLLM v0.10.0 release by @ericcurtin in #1763
- fix(deps): update react monorepo to v19.1.1 by @renovate[bot] in #1762
- Fix excess error output in run command by @arortiz-rh in #1760
- call Model.validate_args() from Model.ensure_model_exists() by @mikebonnet in #1772
- De-duplicate bash build scripts by @ericcurtin in #1773
- Enable/Disable thinking on reasoning models by @rhatdan in #1768
- Include fix that allows us to build on older ARM SoCs by @ericcurtin in #1775
- Enable multiline chat by @engelmi in #1777
- Include the host in the quadlet's PublishPort directive by @Stebalien in #1771
- Fix run/generate for oci models by @olliewalsh in #1779
- chore(deps): update dependency typescript to ~5.9.0 by @renovate[bot] in #1782
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1753978585 by @red-hat-konflux-kflux-prd-rh03[bot] in #1781
- Typing and bug squashes by @ieaves in #1764
- Adding --add-to-unit option to --generate to allow creating or updati… by @Annakan in #1774
- Fix handling of configured_has_all by @rhatdan in #1784
- building stable diffusion. by @jtligon in #1769
- Use modelstore as tmpdir for hfcli by @engelmi in #1787
- Set TMPDIR to /var/tmp if not set by @rhatdan in #1786
New Contributors
- @pbalczynski made their first contribution in #1759
- @arortiz-rh made their first contribution in #1760
- @Stebalien made their first contribution in #1771
- @Annakan made their first contribution in #1774
Full Changelog: v0.11.2...v0.11.3
v0.11.2
What's Changed
- Bump to v0.11.1 by @rhatdan in #1726
- konflux: add pipelines for ramalama-vllm and layered images by @mikebonnet in #1717
- Don't override image when using rag if user specified it by @rhatdan in #1727
- Re-enable passing chat template to model by @engelmi in #1732
- No virglrenderer in RHEL by @ericcurtin in #1728
- Add stale GitHub workflow to maintain older issues and PRs. by @rhatdan in #1733
- konflux: build -rag images on bigger instances with large disks by @mikebonnet in #1737
- musa: upgrade musa sdk to rc4.2.0 by @yeahdongcn in #1697
- Remove GGUF version check when parsing by @engelmi in #1738
- Define image within container with full name by @rhatdan in #1734
- musa: disable build of whisper.cpp, and update llama.cpp by @mikebonnet in #1745
- Include mmproj mount in quadlet by @olliewalsh in #1742
- Adds docs site by @ieaves in #1736
- Fix listing models by @engelmi in #1748
- fix(deps): update dependency huggingface-hub to ~=0.34.0 by @renovate[bot] in #1747
- chore(deps): update dependency typescript to ~5.8.0 by @renovate[bot] in #1746
- Use blobs directory as context directory on convert by @engelmi in #1739
- konflux: push images to the quay.io/ramalama org after integration testing by @mikebonnet in #1743
- CUDA vLLM variant by @ericcurtin in #1741
- Add setuptools_scm by @ericcurtin in #1749
- Fixes docsite page linking by @ieaves in #1752
- Fix kube volumemount for hostpaths and add mmproj by @olliewalsh in #1751
- More cuda vLLM enablement by @ericcurtin in #1750
- Fix assembling URLs for big models by @engelmi in #1756
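Several entries above touch GGUF parsing (e.g. the version-check removal in #1738). For context, a GGUF file begins with a 4-byte magic followed by a little-endian uint32 version; a minimal header reader might look like this (a sketch based on the publicly documented GGUF layout, not ramalama's parser):

```python
import struct


def read_gguf_version(header: bytes) -> int:
    """Read the version from a GGUF header: 4-byte magic b"GGUF"
    followed by a little-endian uint32 version field."""
    if header[:4] != b"GGUF":
        raise ValueError("not a GGUF file")
    (version,) = struct.unpack_from("<I", header, 4)
    return version
```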
Full Changelog: v0.11.1...v0.11.2
v0.11.1
What's Changed
- Bump to 0.11.0 by @rhatdan in #1694
- Mistral should point to lmstudio gguf by @ericcurtin in #1698
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752587049 by @renovate[bot] in #1699
- chore(deps): update quay.io/konflux-ci/build-trusted-artifacts:latest docker digest to f7d0c51 by @renovate[bot] in #1696
- reduce unnecessary image pulls during testing, and re-enable a couple tests by @mikebonnet in #1700
- Minor fixes to rpm builds by packit and spec file. by @smooge in #1704
- konflux: build cuda on arm64, and simplify testing by @mikebonnet in #1687
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752625787 by @red-hat-konflux-kflux-prd-rh03[bot] in #1710
- Included ramalama.conf in wheel by @carlwgeorge in #1711
- Improve NVIDIA GPU detection. by @jwieleRH in #1617
- README: remove duplicate statements by @rhatdan in #1707
- fix GPU selection and pytorch URL when building rag images by @mikebonnet in #1709
- Add support for Intel Iris Xe Graphics (46AA, 46A6, 46A8) by @tonyjames in #1712
- konflux: add pipelines for asahi, cann, intel-gpu, llama-stack, musa, openvino, and ramalama-cli by @mikebonnet in #1708
- Add vllm to cpu inferencing Containerfile by @ericcurtin in #1677
- build_rag.sh: install cmake by @mikebonnet in #1716
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1718
- container-images: add virglrenderer to vulkan by @slp in #1714
- added milvus support and qol console logs for rag command by @bmahabirbu in #1720
- fixes issue where format=markdown saves to duplicate absolute path by @bmahabirbu in #1719
- Engine should be created after checks by @rhatdan in #1722
- Use model organization as namespace when pulling Ollama models by @engelmi in #1721
- Consolidate run and chat commands together, also allow specification of prefix in ramalama.conf by @rhatdan in #1706
- If container fails on Run, warn and exit by @rhatdan in #1723
- Added temporary migration routine for non-namespaced ollama models by @engelmi in #1725
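The Ollama namespacing change (#1721) and its migration routine (#1725) boil down to prefixing bare model names with their organization; a sketch, assuming "library" is the default namespace as on the Ollama registry:

```python
def namespaced(name: str, default_org: str = "library") -> str:
    """Prefix a bare Ollama model name with its organization.

    Illustrative sketch only; assumes the registry's default
    "library" namespace, not ramalama's exact logic.
    """
    base, _, tag = name.partition(":")
    if "/" not in base:
        base = f"{default_org}/{base}"
    return f"{base}:{tag}" if tag else base
```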
New Contributors
- @tonyjames made their first contribution in #1712
Full Changelog: v0.11.0...v0.11.1
v0.11.0
What's Changed
- Bump to v0.10.1 by @rhatdan in #1667
- Adds the ability to include vision based context to chat via --rag by @ieaves in #1661
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1751897624 by @red-hat-konflux-kflux-prd-rh03[bot] in #1670
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1664
- feat: allow for dynamic version installing of ramalama-stack by @nathan-weinberg in #1671
- Inspect add safetensor support by @engelmi in #1666
- Revert "feat: allow for dynamic version installing of ramalama-stack" by @ericcurtin in #1672
- move --image & --keep-groups to run, serve, perplexity, bench commands by @rhatdan in #1669
- mlx fixes by @ericcurtin in #1673
- Enhance ref file and mount all snapshot files to container by @engelmi in #1643
- Hide --container option, having --container/--nocontainer is confusing by @rhatdan in #1675
- Enable SELinux separation by @rhatdan in #1676
- chore: bump ramalama-stack to 0.2.5 by @nathan-weinberg in #1680
- Bugfix for chat by @ericcurtin in #1679
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1752069608 by @renovate[bot] in #1668
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752069608 by @red-hat-konflux-kflux-prd-rh03[bot] in #1684
- konflux: add integration tests that run in multi-arch VMs by @mikebonnet in #1683
- Allow ramalama rag to output different formats by @rhatdan in #1685
- Bug/chat fix by @ieaves in #1681
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1688
- Only install if pyproject.toml exists by @ericcurtin in #1689
- Readme improvements: Update model's name and improve CUDA_VISIBLE_DEVICES section by @mbortoli in #1691
- Move rpms by @smooge in #1693
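The README update on CUDA_VISIBLE_DEVICES (#1691) concerns the standard comma-separated list of GPU indices. Parsing that list can be sketched as follows (illustrative only; the variable may also hold GPU UUIDs, which this sketch ignores):

```python
def visible_gpus(env: dict) -> list[int]:
    """Parse CUDA_VISIBLE_DEVICES as a comma-separated list of
    integer GPU indices; returns [] when the variable is unset."""
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return []
    return [int(part) for part in raw.split(",") if part.strip()]
```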
Full Changelog: v0.10.1...v0.11.0
v0.10.1
What's Changed
- Bump to v0.10.0 by @rhatdan in #1629
- Fix handling of --host option when running in a container by @rhatdan in #1628
- Start process of moving python-ramalama to ramalama by @smooge in #1498
- Fix modelstore deleting logic when multiple references refer to the same blob/snapshot by @olliewalsh in #1620
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1751287003 by @red-hat-konflux-kflux-prd-rh03 in #1633
- run tests during build pipelines by @mikebonnet in #1614
- Split the model store into multiple files by @engelmi in #1640
- chore: bump ramalama-stack to 0.2.4 by @nathan-weinberg in #1639
- Use config instance for defining pull behavior in accel_image by @engelmi in #1638
- quadlet: add missing privileged options by @jbtrystram in #1631
- build layered images from Containerfiles by @mikebonnet in #1641
- Add command to list available models by @ericcurtin in #1635
- Adds a user configuration setting to disable gpu prompting by @ieaves in #1632
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1751445649 by @red-hat-konflux-kflux-prd-rh03 in #1658
- Update lint and format tools configuration by @telemaco in #1659
- konflux: add pipelines for the layered images of ramalama, cuda, rocm, and rocm-ubi by @mikebonnet in #1657
- Always use absolute path for --store option by @rhatdan in #1637
- Add .pre-commit-config.yaml by @telemaco in #1660
- MLX runtime support by @kush-gupt in #1642
- Make sure errors and progress messages go to STDERR by @rhatdan in #1665
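"Always use absolute path for --store option" (#1637) is the usual path-normalization step; a sketch (not ramalama's code):

```python
import os


def normalize_store(path: str) -> str:
    """Expand a leading ~ and resolve the --store path to an
    absolute path, as #1637 describes."""
    return os.path.abspath(os.path.expanduser(path))
```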
New Contributors
- @jbtrystram made their first contribution in #1631
- @telemaco made their first contribution in #1659
Full Changelog: v0.10.0...v0.10.1
v0.10.0
What's Changed
- Bump to v0.9.3 by @rhatdan in #1586
- Remove last libexec program by @rhatdan in #1576
- Don't pull image when doing ramalama --help call by @rhatdan in #1589
- API key support by @ericcurtin in #1578
- Move RamaLama container image to default to fedora:42 by @rhatdan in #1595
- Missing options of api_key and pid2kill are causing crashes by @rhatdan in #1601
- Some of our tests are running for hours, need to be timed out by @rhatdan in #1602
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1750786174 by @red-hat-konflux-kflux-prd-rh03 in #1600
- konflux: centralize pipeline definitions by @mikebonnet in #1599
- Allow std input by @ericcurtin in #1606
- konflux: use shared pipelines for rocm, rocm-ubi, and cuda by @mikebonnet in #1608
- Prune model store code by @engelmi in #1607
- Switchout hasattr for getattr wherever possible by @rhatdan in #1605
- add support for running bats in a container by @mikebonnet in #1598
- Separate build image into its own VM by @rhatdan in #1609
- container-images: pin mesa version to COPR by @slp in #1603
- konflux: build bats image by @red-hat-konflux-kflux-prd-rh03 in #1612
- rename "nopull" boolean to "pull" by @ktdreyer in #1611
- Use standard zsh completion directory by @carlwgeorge in #1619
- Free up disk space for building all images by @rhatdan in #1615
- Fix removing of file based URL models by @rhatdan in #1610
- chore: bump ramalama-stack to 0.2.3 by @nathan-weinberg in #1616
- Fixup to work with llama-stack by @rhatdan in #1588
- Fix unit tests for machines with GPUs by @sarroutbi in #1621
- Want to pick up support for gemma3n by @ericcurtin in #1623
- Add gemma aliases by @ericcurtin in #1624
- Adds the ability to pass files to ramalama run by @ieaves in #1570
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03 in #1627
New Contributors
- @ktdreyer made their first contribution in #1611
- @carlwgeorge made their first contribution in #1619
Full Changelog: v0.9.3...v0.10.0
v0.9.3
What's Changed
- Convert tabs to spaces by @ericcurtin in #1538
- Make minimum version of Python consistent by @Hasnep in #1512
- Upgrade podman by @ericcurtin in #1540
- Bump to v0.9.2 by @rhatdan in #1537
- Downgrade whisper by @ericcurtin in #1543
- Deduplicate code by @ericcurtin in #1539
- Add dnf update -y to Fedora ROCm build by @ericcurtin in #1544
- model: always pass in GPU offloading parameters by @alaviss in #1502
- Run bats test with TMPDIR pointing at /mnt/tmp by @rhatdan in #1548
- Tabs to spaces by @ericcurtin in #1549
- Add GGML_VK_VISIBLE_DEVICES env var by @ericcurtin in #1547
- Create tempdir when run as non-root user by @rhatdan in #1551
- Red Hat Konflux kflux-prd-rh03 update ramalama by @red-hat-konflux-kflux-prd-rh03 in #1542
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372 by @red-hat-konflux-kflux-prd-rh03 in #1555
- Fix default prefix for systems with no engines by @rhatdan in #1556
- Add install command via homebrew by @scraly in #1558
- Remove Model flag for safetensor files for now by @engelmi in #1559
- Add verbose rule for complete output on unit tests by @sarroutbi in #1562
- Reuse code for unit test execution rules by @sarroutbi in #1564
- :latest tag should not be assumed for non-OCI artefacts by @ericcurtin in #1534
- Replace ramalama-client-code with ramalama chat by @rhatdan in #1550
- Document the image format created/consumed by the oci:// transport by @mtrmac in #1569
- Trying to save space by @ericcurtin in #1541
- Fix test_accel unit test to fallback to latest by @sarroutbi in #1567
- install ramalama into containers from the current checkout by @mikebonnet in #1566
- TMT: run tests with GPUs by @lsm5 in #1101
- fix: vLLM serving and model mounting by @kush-gupt in #1571
- Make model argument mandatory by @ericcurtin in #1574
- fix: broken link in CI dashboard by @nathan-weinberg in #1580
- chore: bump ramalama-stack to 0.2.2 by @nathan-weinberg in #1579
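":latest tag should not be assumed for non-OCI artefacts" (#1534) implies the default tag applies only to oci:// references; one way to sketch that rule (illustrative, not ramalama's implementation):

```python
def complete_tag(ref: str) -> str:
    """Append ":latest" only to oci:// references that lack a tag;
    other transports are returned untouched (sketch of #1534)."""
    if ref.startswith("oci://") and ":" not in ref.rsplit("/", 1)[-1]:
        return ref + ":latest"
    return ref
```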
New Contributors
- @red-hat-konflux-kflux-prd-rh03 made their first contribution in #1542
- @scraly made their first contribution in #1558
- @mtrmac made their first contribution in #1569
Full Changelog: v0.9.2...v0.9.3
v0.9.2
What's Changed
- Only print this in the llama-stack case by @ericcurtin in #1486
- Throw exception when using OCI without engine by @rhatdan in #1471
- Make sure llama-stack URL is shown to user by @rhatdan in #1490
- Fix #1489 by @yeahdongcn in #1491
- There's a change that we want that avoids using software rasterizers by @ericcurtin in #1495
- Install uv to fix build issue by @ericcurtin in #1496
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372 by @renovate in #1492
- Only enumerate ROCm-capable AMD GPUs by @alaviss in #1500
- amdkfd: add constants for heap types by @alaviss in #1501
- This is not a multi-model model by @ericcurtin in #1499
- fix: remove unneeded dependency from Llama Stack container by @nathan-weinberg in #1503
- Increase retry attempts to attempt to connect to server by @ericcurtin in #1507
- Ignore errors when removing snapshot directory by @engelmi in #1511
- Add Python shebang files to linting by @Hasnep in #1514
- For ramalama ls, shorten huggingface lines by @ericcurtin in #1516
- Update black target version by @Hasnep in #1513
- Wait for up to 16 seconds for model to load by @ericcurtin in #1510
- This installs ramalama via uv if python3 version is too old by @ericcurtin in #1497
- fix(deps): update dependency huggingface-hub to ~=0.33.0 by @renovate in #1505
- chore(common/intel_gpus): detect arc a770, a750 by @kwaa in #1517
- Do not run with --tty when not in interactive mode by @rhatdan in #1506
- Update to add multi-modal by @rhatdan in #1522
- Add --all option to ramalama ls by @engelmi in #1528
- Add colors to "ramalama serve" if we can by @ericcurtin in #1529
- Change the FROM for asahi container image by @ericcurtin in #1523
- Refactor config and arg typing by @ieaves in #1488
- Add ramalama chat command by @rhatdan in #1531
- Suggest using uv pip install to get missing module by @rhatdan in #1532
- Not sure this is supposed to be here by @ericcurtin in #1535
- chore: bump ramalama-stack to 0.2.1 by @nathan-weinberg in #1536
- honor the user specifying the image by @rhatdan in #1527
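"Wait for up to 16 seconds for model to load" (#1510) and the retry-count bump (#1507) describe a poll-until-deadline loop; a generic sketch (the 16 s figure comes from the entry above, everything else is illustrative):

```python
import time


def wait_until_ready(probe, timeout: float = 16.0, interval: float = 0.5) -> bool:
    """Poll `probe` until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```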
Full Changelog: v0.9.1...v0.9.2
v0.9.1
What's Changed
- feat: s390x build commands by @taronaeo in #1459
- docs: update container_build.sh help information by @taronaeo in #1461
- chore: remove unclear else from llama and whisper build by @taronaeo in #1464
- fix: lock down ramalama-stack version in llama-stack Containerfile by @nathan-weinberg in #1465
- Rename: RepoFile=>HFStyleRepoFile, BaseRepository=>HFStyleRepository, BaseRepoModel=>HFStyleRepoModel by @yeahdongcn in #1466
- Documentation improvements by @waltdisgrace in #1468
- Change timeouts by @ericcurtin in #1469
- llama-stack container build fails with == 1.5.0 by @rhatdan in #1467
- Do not override a small subset of env vars by @ericcurtin in #1475
- Call set_gpu_type_env_vars rather than set_accel_env_vars by @ericcurtin in #1476
- Don't warmup by default by @ericcurtin in #1477
- chore: bump 'ramalama-stack' version to 0.2.0 by @nathan-weinberg in #1478
- Adds dev dependency groups by @ieaves in #1481
- fix(deps): update dependency huggingface-hub to ~=0.32.4 by @renovate in #1483
- Fix handling of generate with llama-stack by @rhatdan in #1472
- Update demos to show serving models. by @rhatdan in #1474
- Bump to v0.9.1 by @rhatdan in #1484
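"Do not override a small subset of env vars" (#1475) is the classic set-if-absent pattern; sketched generically (not ramalama's code):

```python
def apply_defaults(env: dict, defaults: dict) -> dict:
    """Fill in default values without clobbering anything the user
    already set (sketch of the behavior described in #1475)."""
    for key, value in defaults.items():
        env.setdefault(key, value)
    return env
```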
New Contributors
- @waltdisgrace made their first contribution in #1468
Full Changelog: v0.9.0...v0.9.1
v0.9.0
What's Changed
- chore: bump llama.cpp to support tool streaming by @p5 in #1438
- Bump to v0.8.5 by @rhatdan in #1439
- fix: update references to Python 3.8 to Python 3.11 by @nathan-weinberg in #1441
- Fix quadlet handling of duplicate options by @olliewalsh in #1442
- fix(gguf_parser): fix big endian model parsing by @taronaeo in #1444
- Choice could be not set and should not be used by @rhatdan in #1447
- fix(run): Ensure 'run' subcommand works with host proxy settings. by @melodyliu1986 in #1430
- Switch default ramalama image build to use VULKAN by @rhatdan in #1449
- make ramalama-client-core send default model to server by @rhatdan in #1450
- fix(gguf_parser): fix memoryerror exception when loading non-native models by @taronaeo in #1452
- Small logging improvements by @almusil in #1455
- feat(model_store): prevent model endianness mismatch on download by @taronaeo in #1454
- Add support for llama-stack by @rhatdan in #1413
- Refactoring huggingface.py and modelscope.py and extract repo_model_base.py by @yeahdongcn in #1456
- Eliminate selinux-policy packages from containers by @rhatdan in #1451
- Snapshot verification by @engelmi in #1458
- Add support for generating kube.yaml and quadlet/kube files for llama… by @rhatdan in #1457
- Bump to v0.9.0 by @rhatdan in #1462
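The endianness fixes above (#1444, #1452, #1454) revolve around detecting byte order from the GGUF header's uint32 version field; a heuristic sketch (assumption: a plausible version is a small integer, so a huge little-endian read suggests a big-endian file — this is not ramalama's actual detector):

```python
import struct


def gguf_byteorder(header: bytes) -> str:
    """Guess a GGUF file's byte order from its uint32 version field."""
    if header[:4] != b"GGUF":
        raise ValueError("not a GGUF file")
    (little,) = struct.unpack_from("<I", header, 4)
    return "little" if little < 0x10000 else "big"
```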
Full Changelog: v0.8.5...v0.9.0