What would you like to be added?
Inspired by the research paper *Vidur: A Large-Scale Simulation Framework For LLM Inference*. From its abstract:
> Optimizing the deployment of large language models (LLMs) is expensive today since it requires experimentally running an application workload against an LLM implementation while exploring the large configuration space formed by system knobs such as parallelization strategies, batching techniques, and scheduling policies. We present Vidur-Search, a configuration search tool that helps optimize LLM deployment. Vidur-Search uses Vidur to automatically identify the most cost-effective deployment configuration that meets application performance constraints. For example, Vidur-Search finds the best deployment configuration for LLaMA2-70B in one hour on a CPU machine, in contrast to a deployment-based exploration which would require 42K GPU hours, costing $218K.
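To make the idea concrete, here is a minimal sketch of the kind of search Vidur-Search performs: enumerate deployment knobs, query a simulator instead of real GPUs, and keep the cheapest configuration that satisfies a latency SLO. The `simulate` cost model below is entirely made up for illustration; the real tool uses Vidur's learned performance model, and the knob names are hypothetical.

```python
from itertools import product

# Hypothetical stand-in for a Vidur-style simulator: returns a predicted
# (p99_latency_ms, cost_per_1k_tokens) for a deployment configuration.
# The formula is illustrative only, not taken from the paper.
def simulate(tp_degree, batch_size, scheduler):
    latency_ms = 400.0 / tp_degree + 5.0 * batch_size
    if scheduler == "priority":
        latency_ms *= 0.9
    cost = 0.02 * tp_degree * (64.0 / batch_size)
    return latency_ms, cost

def search_best_config(latency_slo_ms=300.0):
    """Grid-search the knob space; keep the cheapest config meeting the SLO."""
    best = None  # (cost, config) of the best feasible point seen so far
    for tp, bs, sched in product([1, 2, 4, 8],
                                 [8, 16, 32, 64],
                                 ["fcfs", "priority"]):
        latency, cost = simulate(tp, bs, sched)
        if latency <= latency_slo_ms and (best is None or cost < best[0]):
            best = (cost, {"tp_degree": tp, "batch_size": bs, "scheduler": sched})
    return best

best = search_best_config()
print(best)
```

Because every candidate is evaluated by the simulator rather than a live deployment, the whole sweep runs in seconds on a CPU, which is exactly the cost argument the paper makes.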
Why is this needed?
Not sure if this is in the scope of Katib, but I'm glad to raise an issue here.
Love this feature?
Give it a 👍. We prioritize the features with the most 👍.