
Conversation

@universalmind303 (Contributor) commented Mar 5, 2025

Note for reviewers

I tried to use the Sphinx autodoc stuff, but the markdown in `daft.pyspark` wasn't rendering properly, so I just copy/pasted it. But I don't think we're even using the Sphinx stuff anymore, so 🤷


codecov bot commented Mar 5, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.60%. Comparing base (7ef32fc) to head (58e23a5).
Report is 5 commits behind head on main.


@@            Coverage Diff             @@
##             main    #3919      +/-   ##
==========================================
+ Coverage   75.35%   77.60%   +2.25%     
==========================================
  Files         767      768       +1     
  Lines      103619    98818    -4801     
==========================================
- Hits        78080    76689    -1391     
+ Misses      25539    22129    -3410     

see 44 files with indirect coverage changes


@@ -0,0 +1,30 @@
# PySpark.

The `daft.pyspark` module provides a way to create a PySpark session that can be run locally or backed by a ray cluster.
Contributor

Can we add a relative link between the user guide and the API docs for the `daft.pyspark` method?

docs/mkdocs.yml Outdated
@@ -43,6 +43,7 @@ nav:
- Tutorials: resources/tutorials.md
- Benchmarks: resources/benchmarks/tpch.md # Benchmarks can expand into a folder once we have more
- Telemetry: resources/telemetry.md
- Spark Connect: spark_connect.md
Contributor

I'm trying to decide where this doc best fits in the TOC. I feel like it would fit best under Integrations, but at the same time it's not exactly like the other integrations. I almost want to put it under Migration Guide, but it's not exactly that either.

If you think it belongs in L1, I would maybe move it between Catalogs and Distributed Computing?

Contributor

Future idea: I feel like we should segment the integrations further

Contributor

I think we can move it to top-level. Feels like a big enough feature


The `daft.pyspark` module provides a way to create a PySpark session that can be run locally or backed by a ray cluster.

This serves as a way to run the daft query engine, but with a spark compatible API.
Contributor

Sorry, small nit: can we capitalize Daft, Spark, and Ray on the previous line?
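
For reference, here's a minimal sketch of the workflow the quoted doc describes: create a Daft-backed Spark session either locally or against a Ray cluster, then use the ordinary PySpark API. The `builder.local()` / `builder.remote(...)` calls and the Ray address format are assumptions based on the description above, not verified API:

```python
from daft.pyspark import SparkSession  # assumed import path, per this PR's module name
from pyspark.sql.functions import col

# Create a Spark session backed by the Daft engine running locally.
# (builder.local() is an assumption based on the doc's "run locally" wording.)
spark = SparkSession.builder.local().getOrCreate()

# Alternatively, back the session with an existing Ray cluster.
# The host/port below are placeholders, not a real address:
# spark = SparkSession.builder.remote("ray://<head_node_host>:10001").getOrCreate()

# From here on it's the regular PySpark API; only the execution engine differs.
spark.createDataFrame([{"hello": "world"}]).select(col("hello")).show()

spark.stop()
```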

@universalmind303 merged commit 2e3189f into Eventual-Inc:main on Mar 6, 2025
45 checks passed