[v0.1.2] First official Release with new models and feature support

kcz358 released this 25 Oct 11:11
· 38 commits to main since this release
78a0185

What's Changed

  • feat: Bagel Image Understanding by @pufanyi in #43
  • fix: Bagel Docs Data Format by @pufanyi in #44
  • fix: Allow training bagel on understanding dataset when visual_gen=True by @pufanyi in #46
  • feat: Bagel naive implementation of sparse attention by @kcz358 in #45
  • feat: Better merge and print batch input by @kcz358 in #48
  • fix: Merge fsdp by @kcz358 in #49
  • add_single_gpu_muon&fix_some_bugs by @BIGKnight in #53
  • [v0.1.2] release: hydra launch config, sit, and rae training by @kcz358 in #50
  • fix: Fix launch from cli using config examples by @kcz358 in #54
  • feat: Support Qwen2.5 Omni Thinker by @kcz358 in #56
  • feat: Add llava_ov, bagel and better cicd readme and control by @kcz358 in #57
  • docs: Add a auto build docs, may be deprecated by @kcz358 in #58
  • feat: Support Qwen3-VL ulysses sequence parallel operation by @kcz358 in #59
  • fix: Fix random shuffle seed on same dp rank to prevent sp hang by @kcz358 in #60
  • docs: improve documentation accuracy and add Qwen-VL training guide by @mwxely in #62
  • Fix/reorg examples by @Luodian in #61
  • Dev/readme by @Luodian in #63
  • docs: Fix some examples error and better documentation on implementing new class by @kcz358 in #64

New Contributors

Full Changelog: v0.1.1...v0.1.2