
v0.6.0

@joerunde joerunde released this 31 Jul 22:28
· 9 commits to main since this release
a7bc26b

This release:

  • 🎉 Supports embedding models on vLLM v1!
  • 🔥 Removes all remaining support for vLLM v0
  • ⚡ Contains performance and stability fixes for continuous batching
    • ⚗️ Tested on ibm-granite/granite-3.3-8b-instruct with settings up to --max-num-seqs 4 --max-model-len 8192 --tensor-parallel-size 4
  • 📦 Officially supports vLLM 0.9.2 and 0.10.0
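
As a rough illustration of the tested continuous batching configuration listed above, the sketch below builds a `vllm serve` command line with those flags and checks the values against the bounds stated in these notes. The helper `build_serve_command` and the `TESTED_LIMITS` table are hypothetical names introduced here for illustration, not part of this project or of vLLM. Any environment variables or platform setup needed to enable continuous batching are not covered.

```python
import shlex

# Upper bounds quoted in the release notes for
# ibm-granite/granite-3.3-8b-instruct. This table is illustrative only.
TESTED_LIMITS = {
    "max_num_seqs": 4,
    "max_model_len": 8192,
    "tensor_parallel_size": 4,
}


def build_serve_command(model, max_num_seqs=4, max_model_len=8192,
                        tensor_parallel_size=4):
    """Return an argv list for `vllm serve` (hypothetical helper).

    Raises ValueError when a value is below 1 or above the range the
    release notes describe as tested. Larger values may still work;
    they are simply outside what the notes cover.
    """
    requested = {
        "max_num_seqs": max_num_seqs,
        "max_model_len": max_model_len,
        "tensor_parallel_size": tensor_parallel_size,
    }
    for name, value in requested.items():
        if value < 1:
            raise ValueError(f"{name} must be at least 1, got {value}")
        if value > TESTED_LIMITS[name]:
            raise ValueError(
                f"{name}={value} is above the tested limit "
                f"of {TESTED_LIMITS[name]}"
            )
    return [
        "vllm", "serve", model,
        "--max-num-seqs", str(max_num_seqs),
        "--max-model-len", str(max_model_len),
        "--tensor-parallel-size", str(tensor_parallel_size),
    ]


if __name__ == "__main__":
    # Print a shell-ready command for the largest tested configuration.
    print(shlex.join(build_serve_command("ibm-granite/granite-3.3-8b-instruct")))
```

Rejecting values above the tested range is a deliberately conservative choice for this sketch; a warning would also be reasonable, since the notes only say which settings were tested.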

What's Changed

New Contributors

Full Changelog: v0.5.3...v0.6.0