What's Changed
- feat: Bagel Image Understanding by @pufanyi in #43
- fix: Bagel Docs Data Format by @pufanyi in #44
- fix: Allow training bagel on understanding dataset when visual_gen=True by @pufanyi in #46
- feat: Bagel naive implementation of sparse attention by @kcz358 in #45
- feat: Better merge and print batch input by @kcz358 in #48
- fix: Merge fsdp by @kcz358 in #49
- add_single_gpu_muon&fix_some_bugs by @BIGKnight in #53
- [v0.1.2] release: hydra launch config, sit, and rae training by @kcz358 in #50
- fix: Fix launch from cli using config examples by @kcz358 in #54
- feat: Support Qwen2.5 Omni Thinker by @kcz358 in #56
- feat: Add llava_ov, bagel and better cicd readme and control by @kcz358 in #57
- docs: Add an auto docs build, may be deprecated by @kcz358 in #58
- feat: Support Qwen3-VL ulysses sequence parallel operation by @kcz358 in #59
- fix: Fix random shuffle seed on same dp rank to prevent sp hang by @kcz358 in #60
- docs: improve documentation accuracy and add Qwen-VL training guide by @mwxely in #62
- Fix/reorg examples by @Luodian in #61
- Dev/readme by @Luodian in #63
- docs: Fix some example errors and better documentation on implementing new class by @kcz358 in #64
New Contributors
Full Changelog: v0.1.1...v0.1.2