[Roadmap]: 0.7.0 roadmap and release date #4230

@harryskim

Hi Dynamo developers!

Many apologies for being late. It is conference season, and we were unfortunately swamped with talk preparations. v0.6.1 has already gone out, but we wanted to provide visibility into the Dynamo v0.7.0 release. Please refer to the long-term H2 roadmap here.

As before in H2, we are continuing to make progress on five major focus areas:

  1. Performance
  2. Fault tolerance
  3. K8s deployment
  4. KV cache management and transfer
  5. Scheduling with smart router and planner

Additionally, the 0.7.0 release will ship with CUDA v13 and include two new artifacts for increased modularity:

  1. KVBM pip wheel
  2. End Point Picker (EPP) container image

This release will emphasize:

  1. Providing maximum performance via composability (KV-aware routing + KV offloading + disaggregated serving)
  2. Seamless production-grade serving, from configuration (AIConfigurator & Planner) to production (Grove & granular fault tolerance for LLMs)

📅 Timeline

The target date for the v0.7.0 release is 11/19 (Thu)

Dynamo v0.7.0 Features

1. Performance

  • Consolidated exemplar showcasing composability with disaggregated serving, KV-aware routing, and KV offloading with KVBM.

2. Fault Tolerance & Observability

Fault Tolerance

  • ETCD lease keep-alive resilience.
  • ETCD watcher resilience.
  • Fault tolerance CI harness.
  • Request cancellation test cases.
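
To make the keep-alive resilience item concrete, here is a minimal sketch of retrying a lease refresh with capped exponential backoff and jitter. It is only an illustration of the general technique, not Dynamo's implementation: `send_keep_alive`, `keep_alive_with_backoff`, and all parameter names and defaults are hypothetical, and no real ETCD client API is used.

```python
import random
import time
from typing import Callable


def keep_alive_with_backoff(
    send_keep_alive: Callable[[], None],  # hypothetical: refreshes the lease, raises ConnectionError on failure
    max_attempts: int = 5,
    base_delay_s: float = 0.1,
    max_delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Refresh a lease, retrying transient failures.

    Returns the number of attempts used. Re-raises the last
    ConnectionError if every attempt fails.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            send_keep_alive()
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # Exponential backoff, capped at max_delay_s.
            delay = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            # Full jitter so many clients do not retry in lockstep.
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting, which is the kind of hook a fault tolerance CI harness could use to inject failures.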

Observability

  • Achieve parity with SGLang, TRT-LLM, and vLLM metrics
  • Add engine (component/backend) metrics
  • Extend metric collection guide for K8s with CPU metrics
  • Publish Nsight integration example

3. K8s Deployment

  • Remove ETCD dependency
  • Multi-LoRA support
  • SLA profiler and AIConfigurator integration

4. KV Cache Management & Transfer

KV Block Manager

Note: G1 = HBM, G2 = Host memory, G3 = Local disk, G4 = Remote storage

  • Enable KV event sharing with router
  • Pip wheel for KVBM
  • Performant G4 offloading
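
The G1 to G4 naming in the note above can be sketched as a small ordered type. This is an illustrative model only: the `Tier` enum and `next_offload_tier` helper are hypothetical names, and the assumption that blocks are offloaded one tier at a time from faster to slower storage is not stated in the roadmap.

```python
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    """Storage tiers as named in the note (lower value = faster)."""
    G1 = 1  # HBM
    G2 = 2  # Host memory
    G3 = 3  # Local disk
    G4 = 4  # Remote storage


def next_offload_tier(current: Tier) -> Optional[Tier]:
    """Return the next slower tier, or None when already at G4.

    Assumption: offloading moves one tier at a time toward G4.
    """
    if current < Tier.G4:
        return Tier(current + 1)
    return None
```

For example, `next_offload_tier(Tier.G3)` yields `Tier.G4`, the remote storage tier that the "Performant G4 offloading" item targets.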

5. Planning & Routing

Router

  • Enable composability with KV-aware routing + KVBM + disaggregated serving
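
As a rough illustration of what KV-aware routing means, the sketch below sends a request to the worker whose cached blocks share the longest prefix with the request, breaking ties by lower load. It is a toy model under stated assumptions, not Dynamo's router: the `Worker` class, `prefix_overlap`, `pick_worker`, and the use of integer block hashes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Worker:
    name: str
    # Hypothetical: hashes of cached KV blocks, in prefix order.
    cached_blocks: List[int] = field(default_factory=list)
    active_requests: int = 0


def prefix_overlap(request_blocks: Sequence[int], cached_blocks: Sequence[int]) -> int:
    """Count how many leading blocks of the request are already cached."""
    matched = 0
    for wanted, cached in zip(request_blocks, cached_blocks):
        if wanted != cached:
            break
        matched += 1
    return matched


def pick_worker(request_blocks: Sequence[int], workers: Sequence[Worker]) -> Worker:
    """Prefer the longest cached prefix, then the least-loaded worker."""
    if not workers:
        raise ValueError("no workers available")
    return max(
        workers,
        key=lambda w: (prefix_overlap(request_blocks, w.cached_blocks), -w.active_requests),
    )
```

A production router would also have to weigh cache reuse against load more carefully and learn about cached blocks from events, which is where the "KV event sharing with router" item under KV Block Manager comes in.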

Planner

  • SLA planner MoE scaling support
  • Seamless UX for using AIC + Planner + Grove for multinode deployments
  • Extend Planner support to aggregated deployments

If there are any additional features that need to be considered or prioritized, please let us know in the comments. Thank you so much for your ongoing feedback; we will do our best to incorporate it into GA Dynamo in December 🙏.
