Labels: feature request, pending on roadmap
Description
With huggingface/trl#3094, TRL will support vLLM for generation (at least in GRPO) by launching a server with `trl vllm-serve --model <model_name>`. This means we can now use vLLM for larger models that require multi-GPU setups, by setting different `CUDA_VISIBLE_DEVICES` for the vLLM server and the training process. It looks like PEFT support for it is coming soon. In theory, this means you could fine-tune Llama 3 70B in 4-bit with GRPO and Unsloth (if you happen to have, say, 3 A100s linked together and a lot of time).
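For concreteness, here's a rough sketch of what the split setup might look like, assuming the server-mode GRPO options (`use_vllm`, `vllm_server_host`, `vllm_server_port`) as they landed in TRL around that PR; exact option names may differ by version, and the dataset and reward function are just placeholders:

```python
# Terminal 1: serve the model with vLLM on its own GPUs, e.g.
#   CUDA_VISIBLE_DEVICES=0,1 trl vllm-serve --model meta-llama/Meta-Llama-3-70B
#
# Terminal 2: run GRPO training on the remaining GPU(s), as below.
import os

# Keep the training process off the GPUs the vLLM server is using.
# Must be set before anything initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "2"

from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

dataset = load_dataset("trl-lib/tldr", split="train")

def reward_len(completions, **kwargs):
    # Toy placeholder reward: prefer completions around 50 characters.
    return [-abs(50 - len(c)) for c in completions]

config = GRPOConfig(
    output_dir="grpo-llama3-70b",
    use_vllm=True,               # generate via the external vLLM server
    vllm_server_host="0.0.0.0",  # where `trl vllm-serve` is listening
    vllm_server_port=8000,
)

trainer = GRPOTrainer(
    model="meta-llama/Meta-Llama-3-70B",
    args=config,
    reward_funcs=reward_len,
    train_dataset=dataset,
)
trainer.train()
```

The point of the split is that generation and training never share a device: vLLM gets its own `CUDA_VISIBLE_DEVICES` (and could use tensor parallelism across them), while the trainer only ever sees the remaining GPU.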
I may be jumping the gun a bit here, but I've been looking forward to multi-GPU vLLM support in TRL for a while now and would love to see it integrated with Unsloth (even if we're still limited to 1 GPU training for now).