xzyaoi commented Apr 8, 2025

No description provided.

xzyaoi merged commit 4349939 into dev Apr 8, 2025
1 check failed
xzyaoi deleted the feature/offload branch April 8, 2025 09:38
xzyaoi added a commit that referenced this pull request May 27, 2025
* Minor fix (#46)

* leftover minor

* minor

* fix dockerfile for x86_64 to include hopper

* fix benchmark script

* minor

* Models/gemma 3 (#47)

* wip: revamp model registration

* fix gemma3 for causal LM

* gemma3

* update dockerfile

* Bump transformers from 4.46.3 to 4.48.0 in /meta (#49)

Bumps [transformers](https://github.com/huggingface/transformers) from 4.46.3 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.46.3...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-version: 4.48.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Xiaozhe Yao <[email protected]>

* Support #44

* redo docs (#52)

* WIP: Offloading and other utilities (#51)

* wip: offload support

* debug

* fix sampling

* fix aarch64 dockerfile

* update aarch script

* remove sgl-kernel from dependencies

* update dockerfile

* minor reorg

* remove templated message printout

* fix dockerfile in aarch build

* relax triton req

* move `triton` dep to requirements-cuda

* minor fix

* fix build issues

* test dependencies

* move torch-memory-saver

* minor

* fix build issues & add initial metrics ui

* ready to build

* wip: multistage build

* Models/qwen3 (#53)

* init: qwen3

* qwen3

* minor

* minor refactor: health check on server starts

* minor update

* update buildfile

* minor

* minor bug fix

* logger

* update dev

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

2 participants