Releases: llm-d/llm-d-kv-cache-manager
v0.3.1
What's Changed
- Add Prow GitHub Actions by @Jooho in #117
- fix: Modified the download url of libtokenizers.darwin-x86_64.tar.gz by @WillardHu in #110
- Update code-ownership files to best utilize PROW + auto assign by @vMaroon in #121
- CI: Expand LRUStore Unit Tests for Partial and Prefix Matches by @yankay in #120
- fix: remove OWNERS_ALIASES and update OWNERS by @Jooho in #122
- Implement auto-assign for reviewers without write permissions by @vMaroon in #123
- feat: Add a SliceMapE function for handle errors and add unit tests by @WillardHu in #119
- add benchmark data by @vMaroon in #129
- [feat]support specifying imagePullSecrets for chart by @my-git9 in #130
- Support new KVEvents format by @vMaroon in #132
- Fix indexer behavior when no kvblock-keys are generated by @vMaroon in #118
New Contributors
- @Jooho made their first contribution in #117
- @WillardHu made their first contribution in #110
- @my-git9 made their first contribution in #130
Full Changelog: v0.3.0...v0.3.1
v0.3.0
Summary
- Production-ready OpenAI Chat-Completions preprocessing library
- Synchronous tokenization with caching
- Expanded benchmarking and stronger test coverage
- General code and documentation improvements
What's Changed
- Bump helm-chart Image by @vMaroon in #66
- Doc Enhancements by @vMaroon in #73
- Update LICENSE by @vMaroon in #74
- Fix README Diagram by @vMaroon in #75
- Enhance README Diagram Clarity by @vMaroon in #78
- Fix kv_events offline example by @irar2 in #82
- fix: Redis kvblock parsing bugs and add basic unit tests by @yankay in #80
- fix: correct shell command substitution syntax in Makefile by @yankay in #81
- Optimized chat completions library, build support and testing infrastructure by @guygir in #79
- Remove redundant keys return from Index.Lookup interface by @sagiahrac in #84
- KVEvents/others minor refactoring by @vMaroon in #88
- Add InMemoryIndex unit tests by @sagiahrac in #86
- Add instrumentedIndex basic unit tests by @sagiahrac in #87
- docs: fix mermaid chart arrow syntax by @Zerohertz in #93
- Chat-Completions Enhancements: Updated Examples + Code Improvements by @guygir in #92
- Tokenization unit tests by @sagiahrac in #90
- feat: Add Synchronous Tokenization Support to Tokenization Pool by @sagiahrac in #95
- [CI]: added some index-related test cases while refactoring the test code to be more concise. by @yankay in #102
- [docs] Update KV-Events and KV-Cache examples with correct paths and commands by @yankay in #106
New Contributors
- @irar2 made their first contribution in #82
- @yankay made their first contribution in #80
- @guygir made their first contribution in #79
- @sagiahrac made their first contribution in #84
- @Zerohertz made their first contribution in #93
Full Changelog: v0.2.1...v0.3.0-rc1
v0.2.1
v0.2.0
What's Changed
- Introduced vLLM-Native KV-Events processing and new indexing backends
  - In-Memory index (default): KV-Events are digested and stored in memory
  - Redis index
- Added observability and real-time Prometheus metrics
  - Tracks KV-Block admissions, evictions, lookups and hit-rates
- Enhanced configurability
- Updated integration in llm-d-inference-scheduler (accurate prefix-cache aware scorer)
- Initial support for OpenAI-compatible Chat Completions templating (library)
- Enhanced user examples and end-to-end (vLLM <-> indexer) deployment setup
- General documentation improvements
PRs
- (chore): typo in tokenizer file by @buraksekili in #39
- [KV-Events] Introduce KV-Block Indexing Backends - Part 1 of 3 by @vMaroon in #40
- fix: replace llm-d tag to 0.0.8 by @kfirtoledo in #42
- docs: Add a setup documentation about examples/kv-cache-index by @buraksekili in #38
- [KV-Events] KV-Events Processing - Part 3 of 3 by @vMaroon in #44
- Matched Default TokenProcessorConfig.BlockSize with vLLM's by @vMaroon in #52
- [KVBlock.Index] Prometheus Metrics & Logging by @vMaroon in #53
- Enhance Configurability by @vMaroon in #55
- Update configuration.md by @vMaroon in #56
- Implement Metrics Logging Configuration in Indexer by @vMaroon in #57
- Completions-Support (#50) Extension by @guygir in #58
New Contributors
- @buraksekili made their first contribution in #39
- @guygir made their first contribution in #58
Full Changelog: v0.1.1...v0.2.0-RC1
v0.1.1
What's Changed
- Update OWNERS by @vMaroon in #25
- Update CONTRIBUTING.md by @clubanderson in #27
- Update README.md by @clubanderson in #28
- Update CONTRIBUTING.md by @clubanderson in #35
- Refactor Redis config to use redis.Options struct by @relyt0925 in #37
New Contributors
- @clubanderson made their first contribution in #27
- @relyt0925 made their first contribution in #37
Full Changelog: v0.1.0...v0.1.1