What's Changed
- Add comparison of different FP8 matmuls by @drisspg in #36
- Fix device-TMA kernel by @drisspg in #37
- Update benchmarks by @drisspg in #39
- hacking by @drisspg in #41
- Add option to use torch.distributed.breakpoint in NaN checker by @danielvegamyhre in #46
- update by @drisspg in #47
- remove examples by @drisspg in #43
- misc by @drisspg in #44
- tweaks by @drisspg in #48
- Fix release by @drisspg in #51
- update by @drisspg in #52

New Contributors
- @danielvegamyhre made their first contribution in #46

Full Changelog: v0.0.1...v0.0.2