-
Notifications
You must be signed in to change notification settings - Fork 25
Add Substrait Protobuf Visitor Framework #136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add Substrait Protobuf Visitor Framework #136
Conversation
4c84acc
to
f0523ff
Compare
## IMPORTANT: Read-Only Analysis Only | ||
This interface is designed for read-only analysis, and assumes the tree does not change | ||
as it is being walked. Any modification of the tree could disrupt the walking of the | ||
tree causing nodes to be missed, visited out of order, or cause panics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some modifications would be reasonably safe with this interface. Here are a few possibilities: https://github.com/voltrondata/spark-substrait-gateway/tree/main/src/transforms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, I can update the description to be less intense and just warn
v.count++ | ||
} | ||
|
||
// Visitor that only cares about expressions (e.g., finding all functions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably for finding all utilized functions versus checking the extension list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I can also push a plan context that will get the extension list. didn't want the one PR to get too big.
|
||
// Visit provides a convenient way to traverse a Substrait plan. | ||
// It walks all relations in the plan, calling the appropriate visitor methods | ||
// based on what interfaces the visitor implements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth noting that you have to visit the entire plan. So if you wanted to stop early once you found something you couldn't handle (such as an unsupported type) you couldn't bail early.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should have an ability to exit early? the visitRel/visitExpr could return an "ok" and if any of the visits return !ok we don't traverse deeper. Wouldn't stop all other recursive branches though.
// | ||
// If your visitor needs plan context (for extensions, etc.), construct it | ||
// with NewPlanContext and pass it to your visitor's constructor. | ||
func Walk(rel *proto.Rel, visitor Visitor) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use this to start a visit at any point in the tree? An alternative name could be VisitRelation since it also visits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, yeah I like that name more. Will update.
) | ||
|
||
// walk implements the core traversal logic for Substrait relations. | ||
// For cycle detection in DAG structures, wrap your visitor with CycleDetectingVisitor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see CycleDetectingVisitor
in this PR. That said are there any use cases where we would need them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, not really? If you were accepting substrait protobufs that were built within the service then there isn't anything in go to stop a cycle, but if you unmarshal the protobuf then it couldn't have a cycle. I think it's just cleaner to remove this for now.
Thanks @EpsilonPrime for the review, will fix up the comments and push a revision. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #136 +/- ##
==========================================
+ Coverage 64.18% 64.84% +0.66%
==========================================
Files 44 47 +3
Lines 11029 11336 +307
==========================================
+ Hits 7079 7351 +272
- Misses 3631 3655 +24
- Partials 319 330 +11 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
We've developed an efficient visitor pattern implementation for traversing Substrait protobuf plans in Go. We'd like to contribute this to the Substrait community as it solves common traversal needs in an idiomatic Go way.
Issue
Tradeoffs and design is outlined in #135