Releases: llm-d/llm-d-kv-cache-manager

v0.3.1

25 Sep 23:05
598637a

What's Changed

  • Add Prow GitHub Actions by @Jooho in #117
  • fix: Modified the download url of libtokenizers.darwin-x86_64.tar.gz by @WillardHu in #110
  • Update code-ownership files to best utilize PROW + auto assign by @vMaroon in #121
  • CI: Expand LRUStore Unit Tests for Partial and Prefix Matches by @yankay in #120
  • fix: remove OWNERS_ALIASES and update OWNERS by @Jooho in #122
  • Implement auto-assign for reviewers without write permissions by @vMaroon in #123
  • feat: Add a SliceMapE function for handle errors and add unit tests by @WillardHu in #119
  • add benchmark data by @vMaroon in #129
  • [feat]support specifying imagePullSecrets for chart by @my-git9 in #130
  • Support new KVEvents format by @vMaroon in #132
  • Fix indexer behavior when no kvblock-keys are generated by @vMaroon in #118

Full Changelog: v0.3.0...v0.3.1

v0.3.0

04 Sep 17:05
68d56a3

Summary

  • Production-ready, OpenAI-compatible Chat-Completions preprocessing library
  • Synchronous tokenization with caching
  • Expanded benchmarking and stronger test coverage
  • General code and documentation improvements
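
As a rough illustration of what "synchronous tokenization with caching" can look like, the following Go sketch wraps a tokenizer so that a repeated prompt is served from memory instead of being re-encoded. It is not the library's code: the `Tokenizer` interface, `CachedTokenizer`, and the toy `countingTokenizer` are hypothetical names invented for this example, and the real tokenization pool may be structured quite differently.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Tokenizer is a hypothetical interface for this sketch only.
type Tokenizer interface {
	Encode(prompt string) ([]int, error)
}

// countingTokenizer is a toy stand-in: each word becomes one "token"
// whose id is the word length. It counts how often it is invoked.
type countingTokenizer struct{ calls int }

func (t *countingTokenizer) Encode(prompt string) ([]int, error) {
	t.calls++
	words := strings.Fields(prompt)
	ids := make([]int, len(words))
	for i, w := range words {
		ids[i] = len(w)
	}
	return ids, nil
}

// CachedTokenizer (hypothetical) answers synchronously and remembers
// results keyed by the exact prompt string.
type CachedTokenizer struct {
	mu    sync.Mutex
	inner Tokenizer
	cache map[string][]int
}

func NewCachedTokenizer(inner Tokenizer) *CachedTokenizer {
	return &CachedTokenizer{inner: inner, cache: make(map[string][]int)}
}

// Encode blocks until the tokens are available. The lock is held while
// the inner tokenizer runs, so concurrent callers are serialized.
func (c *CachedTokenizer) Encode(prompt string) ([]int, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if ids, ok := c.cache[prompt]; ok {
		return ids, nil // cache hit: inner tokenizer is not called
	}
	ids, err := c.inner.Encode(prompt)
	if err != nil {
		return nil, err
	}
	c.cache[prompt] = ids
	return ids, nil
}

func main() {
	inner := &countingTokenizer{}
	tok := NewCachedTokenizer(inner)
	for i := 0; i < 3; i++ {
		ids, err := tok.Encode("hello kv cache")
		if err != nil {
			panic(err)
		}
		fmt.Println(ids)
	}
	fmt.Println("inner calls:", inner.calls)
}
```

The cache here is unbounded and keyed on the full prompt, which keeps the sketch short; a real deployment would more plausibly bound the cache size and evict old entries.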

What's Changed

  • Bump helm-chart Image by @vMaroon in #66
  • Doc Enhancements by @vMaroon in #73
  • Update LICENSE by @vMaroon in #74
  • Fix README Diagram by @vMaroon in #75
  • Enhance README Diagram Clarity by @vMaroon in #78
  • Fix kv_events offline example by @irar2 in #82
  • fix: Redis kvblock parsing bugs and add basic unit tests by @yankay in #80
  • fix: correct shell command substitution syntax in Makefile by @yankay in #81
  • Optimized chat completions library, build support and testing infrastructure by @guygir in #79
  • Remove redundant keys return from Index.Lookup interface by @sagiahrac in #84
  • KVEvents/others minor refactoring by @vMaroon in #88
  • Add InMemoryIndex unit tests by @sagiahrac in #86
  • Add instrumentedIndex basic unit tests by @sagiahrac in #87
  • docs: fix mermaid chart arrow syntax by @Zerohertz in #93
  • Chat-Completions Enhancements: Updated Examples + Code Improvements by @guygir in #92
  • Tokenization unit tests by @sagiahrac in #90
  • feat: Add Synchronous Tokenization Support to Tokenization Pool by @sagiahrac in #95
  • [CI]: added some index-related test cases while refactoring the test code to be more concise. by @yankay in #102
  • [docs] Update KV-Events and KV-Cache examples with correct paths and commands by @yankay in #106

Full Changelog: v0.2.1...v0.3.0-rc1

v0.2.1

24 Jul 13:00
56b4bd5

What's Changed

  • kvevents Package Build Data Exportation by @vMaroon in #61
  • Update Tokenizer Release Version by @vMaroon in #63
  • Remove Default StorageClass: "ocs-storagecluster-cephfs" by @dumb0002 in #64

Full Changelog: v0.2.0...v0.2.1

v0.2.0

19 Jul 21:54
8a60b22

What's Changed

  • Introduced vLLM-Native KV-Events processing and new indexing backends
    • In-Memory index (default): KV-Events are digested and stored in memory
    • Redis index
  • Added observability and real-time Prometheus metrics
    • Tracks KV-Block admissions, evictions, lookups and hit-rates
  • Enhanced configurability
  • Updated integration in llm-d-inference-scheduler (accurate prefix-cache aware scorer)
  • Initial support for OpenAI-compatible Chat Completions templating (library)
  • Enhanced user examples and end-to-end (vLLM <-> indexer) deployment setup
  • General documentation improvements
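
To make the in-memory index and its metrics more concrete, here is a minimal Go sketch of the idea: block keys map to the set of pods holding that block, and plain integer counters record admissions, evictions, lookups, and hits. Everything in it is an assumption made for illustration. The `Index` type, its methods, and the `uint64` block keys are hypothetical, the counters stand in for the Prometheus metrics the release mentions, and the sketch is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"sort"
)

// Index is a hypothetical in-memory KV-block index: block key -> pod set.
type Index struct {
	pods map[uint64]map[string]struct{}

	// Plain counters standing in for real metrics.
	admissions, evictions, lookups, hits int
}

func NewIndex() *Index {
	return &Index{pods: make(map[uint64]map[string]struct{})}
}

// Add records that pod now holds the block; duplicates are not recounted.
func (ix *Index) Add(key uint64, pod string) {
	set, ok := ix.pods[key]
	if !ok {
		set = make(map[string]struct{})
		ix.pods[key] = set
	}
	if _, exists := set[pod]; !exists {
		set[pod] = struct{}{}
		ix.admissions++
	}
}

// Evict removes the (block, pod) pair if it is present.
func (ix *Index) Evict(key uint64, pod string) {
	set, ok := ix.pods[key]
	if !ok {
		return
	}
	if _, exists := set[pod]; exists {
		delete(set, pod)
		ix.evictions++
		if len(set) == 0 {
			delete(ix.pods, key)
		}
	}
}

// Lookup returns the pods holding the block, sorted for stable output.
// A lookup counts as a hit when at least one pod holds the block.
func (ix *Index) Lookup(key uint64) []string {
	ix.lookups++
	set := ix.pods[key]
	if len(set) == 0 {
		return nil
	}
	ix.hits++
	out := make([]string, 0, len(set))
	for p := range set {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

// HitRate is hits divided by lookups, or 0 before any lookup.
func (ix *Index) HitRate() float64 {
	if ix.lookups == 0 {
		return 0
	}
	return float64(ix.hits) / float64(ix.lookups)
}

func main() {
	ix := NewIndex()
	ix.Add(1, "pod-a")
	ix.Add(1, "pod-b")
	ix.Add(2, "pod-a")
	ix.Evict(2, "pod-a")

	fmt.Println(ix.Lookup(1)) // [pod-a pod-b]
	fmt.Println(ix.Lookup(2)) // []
	fmt.Println("hit rate:", ix.HitRate())
}
```

In this picture, the Add and Evict calls would be driven by the KV-Events stream coming from vLLM, and a scorer would call Lookup to find pods that already hold a prompt's prefix blocks.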

PRs

  • (chore): typo in tokenizer file by @buraksekili in #39
  • [KV-Events] Introduce KV-Block Indexing Backends - Part 1 of 3 by @vMaroon in #40
  • fix: replace llm-d tag to 0.0.8 by @kfirtoledo in #42
  • docs: Add a setup documentation about examples/kv-cache-index by @buraksekili in #38
  • [KV-Events] KV-Events Processing - Part 3 of 3 by @vMaroon in #44
  • Matched Default TokenProcessorConfig.BlockSize with vLLM's by @vMaroon in #52
  • [KVBlock.Index] Prometheus Metrics & Logging by @vMaroon in #53
  • Enhance Configurability by @vMaroon in #55
  • Update configuration.md by @vMaroon in #56
  • Implement Metrics Logging Configuration in Indexer by @vMaroon in #57
  • Completions-Support (#50) Extension by @guygir in #58

Full Changelog: v0.1.1...v0.2.0-RC1

v0.1.1

03 Jun 07:29
c7f0332
Pre-release

Full Changelog: v0.1.0...v0.1.1

v0.1.0

20 May 09:11
fec714f
Pre-release

New Contributors

  • @oglok made their first contribution in #21

Full Changelog: 0.0.3...v0.1.0