What's Changed 🚀
v0.6.0 marks the official release of our new Ray-based distributed engine, Flotilla! If you are already using the Ray runner, you do not need to change anything. Setting the DAFT_RUNNER=ray environment variable, or calling daft.context.set_runner_ray() within your Python program, will use Flotilla by default.
All operations except cross join, sort merge join, and pivot are currently supported. We will be working on adding support for them soon! If you need to use the legacy Ray runner, please call daft.set_execution_config(use_legacy_ray_runner=True).
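As a rough sketch of the two options above: the helper below sets DAFT_RUNNER=ray and, if asked, opts back into the legacy runner. The function name `configure_daft_runner` and its return strings are hypothetical and exist only for illustration; the environment variable and the `set_execution_config(use_legacy_ray_runner=True)` call are the ones named in these notes. The import is guarded so the snippet still runs where Daft is not installed.

```python
import os


def configure_daft_runner(use_legacy: bool = False) -> str:
    """Hypothetical helper: pick the Ray runner before Daft runs any query."""
    # Per the release notes, DAFT_RUNNER=ray selects the Ray runner,
    # which uses Flotilla by default in v0.6.0.
    os.environ["DAFT_RUNNER"] = "ray"

    try:
        import daft
    except ImportError:
        # Daft is not available in this environment; nothing more to configure.
        return "daft-not-installed"

    if use_legacy:
        # Opt back into the legacy Ray runner, as described in the notes.
        daft.set_execution_config(use_legacy_ray_runner=True)
        return "legacy-ray"

    return "flotilla"


mode = configure_daft_runner()
print(mode)
```

Calling `configure_daft_runner(use_legacy=True)` may be useful if a pipeline depends on cross join, sort merge join, or pivot, which the notes list as not yet supported on Flotilla.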
💥 Breaking Changes
SQLCatalog was deprecated in v0.5 and is now removed in favor of the bindings kwargs.
Before:
catalog = SQLCatalog({"test_data": df})
result = daft.sql("SELECT * FROM test_data", catalog=catalog)
After:
bindings = {"test_data": df}
result = daft.sql("SELECT * FROM test_data", **bindings)
- feat!: revert daft.func behavior on literal arguments @kevinzwang (#5087)
- revert!: "revert: Temporarily revert "Remove deprecated APIs for 0.6" @desmondcheongzx (#5084)
✨ Features
- feat(embed_text): Support LM Studio as a provider @desmondcheongzx (#5103)
- feat: Implement embed_image() @desmondcheongzx (#5101)
- feat!: revert daft.func behavior on literal arguments @kevinzwang (#5087)
- feat: Automatically grab embedding dimensions for sentence transformers @desmondcheongzx (#5078)
- feat: add mcap datasource reader @Jay-ju (#4727)
🐛 Bug Fixes
- fix: Undo skipcheck change @srilman (#5131)
- fix: fix youtube video reading @rchowell (#5126)
- fix: Remove flotilla fallback @colin-ho (#5114)
- fix: Add nulls in json reads if a line doesn't contain the field from the schema @colin-ho (#4993)
- fix: Check if UDFs are Serializable @srilman (#5091)
- fix: nightly property test @malcolmgreaves (#5076)
- fix: Handle Unserializable Errors in Process UDFs @srilman (#5075)
- fix: Implement Multi-Column Aggregations with List-like columns @srilman (#5017)
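To illustrate the JSON fix in #4993, here is a plain-Python sketch of the behavior its title describes: a line that lacks a schema field gets a null for that field. This is not Daft's implementation; the helper `read_jsonl_with_schema` and the sample data are hypothetical.

```python
import json


def read_jsonl_with_schema(lines, schema_fields):
    """Hypothetical helper: parse JSON lines, filling missing fields with None."""
    rows = []
    for line in lines:
        record = json.loads(line)
        # dict.get returns None when the field is absent from this line.
        rows.append({field: record.get(field) for field in schema_fields})
    return rows


sample = ['{"id": 1, "name": "a"}', '{"id": 2}']
rows = read_jsonl_with_schema(sample, ["id", "name"])
print(rows)
```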
🚀 Performance
- perf: Implement count pushdown for parquet @desmondcheongzx (#5038)
- perf(flotilla): Use Worker Affinity with Pre-Shuffle Merge @srilman (#5112)
- perf: Split UDFs from Filters @srilman (#5070)
- perf(embed_text): Let Sentence Transformers select the best available device @desmondcheongzx (#5082)
♻️ Refactor
📖 Documentation
- docs: fix navigation labels to match section names @ykdojo (#5121)
- docs: fix flickering typewriter animation on overview page @ykdojo (#5118)
- docs: Add batch inference use case @desmondcheongzx (#5116)
- docs: Add docs for custom data sources and sinks @desmondcheongzx (#5115)
- docs: add dark mode support for Algolia DocSearch @ykdojo (#5109)
- docs: add noindex tag to non-stable pages @jaychia (#5105)
- docs: Add text guide @desmondcheongzx (#5102)
- docs: Improve installation instructions @desmondcheongzx (#5094)
- docs: More fixes to the overview page in light mode @desmondcheongzx (#5095)
- docs: Document write_turbopuffer in the user guide @desmondcheongzx (#5092)
👷 CI
- ci: fix test-wheels job in build-wheel.yml @kevinzwang (#5134)
- ci: Truncate the # of concurrent jobs in PR CI @srilman (#5122)
- ci: Run tests before publish @colin-ho (#5009)
- ci: Always run the unit-tests required check @colin-ho (#5119)
- ci: Do not skip postmerge tests @desmondcheongzx (#5096)
🔧 Maintenance
- chore: Add AGENTS.md @srilman (#5124)
- chore: Remove docs codeowners @desmondcheongzx (#5111)
- chore: Clean up write_turbopuffer guide @desmondcheongzx (#5093)
⏪ Reverts
- revert!: "revert: Temporarily revert "Remove deprecated APIs for 0.6" @desmondcheongzx (#5084)
Full Changelog: v0.5.22...v0.5.23