-
Notifications
You must be signed in to change notification settings - Fork 25.5k
Fully native DTensor.__new__ #162508
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fully native DTensor.__new__ #162508
Conversation
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. [ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162508
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 2c44479 with merge base a63221a ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k pragupta ezyang msaroufim dcci [ghstack-poisoned]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really sure what the end goal here is and the level of optimization we're aiming for. But pybind is most likely costing you a lot, especially for things like setting a slot attribute?
extra_dispatch_keys = extra_dispatch_keys.add(c10::DispatchKey::Negative); | ||
} | ||
|
||
py::handle spec = py::handle(r.pyobject(2)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does .pyobject() return a new or borrowed reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
borrowed -- it just vends from the input args array, which is borrowed.
} | ||
|
||
py::handle spec = py::handle(r.pyobject(2)); | ||
const auto tensor_meta = spec.attr(dtensor_interned_strings.tensor_meta); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some error checking from the old code is lost here when this is None?
Also what happens here when the attribute doesn't exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens here when the attribute doesn't exist?
PyObject_GetAttr() will set an error, and pybind11 will throw.
Some error checking from the old code is lost here when this is None
you can't get an attribute from None, so this change is safe. it's harder to debug, so I suppose we can TORCH_CHECK here.
py_tensor.attr(dtensor_interned_strings._spec) = spec; | ||
py_tensor.attr(dtensor_interned_strings._local_tensor) = local_tensor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never seen this before. Is the refcounting acutally correct on this?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why wouldn't it be? it would be an obvious bug in pybind11 if this didn't work
Per discussions with @ezyang, the end goal is C++ DTensor. I've chosen to move toward that incrementally, since 1) it would be better if we don't actually have to stop and do a full rewrite 2) incremental software development is faster in general 3) it is unlikely that the incremental pieces will be entirely different from what we would write if we were sitting down with a blank slate.
pybind11's C++ wrappers for the CPython API seem mostly OK . The cost seems to center around per-bound-C++-function-call overhead. I have a pair of upstream pybind11 PRs that reduce that overhead somewhat waiting for reviews (pybind/pybind11#5824 and pybind/pybind11#5830). |
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k pragupta ezyang msaroufim dcci [ghstack-poisoned]
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k pragupta ezyang msaroufim dcci [ghstack-poisoned]
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k pragupta ezyang msaroufim dcci [ghstack-poisoned]
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k pragupta ezyang msaroufim dcci [ghstack-poisoned]
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k pragupta ezyang msaroufim dcci [ghstack-poisoned]
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k pragupta ezyang msaroufim dcci [ghstack-poisoned]
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k pragupta ezyang msaroufim dcci [ghstack-poisoned]
…ly native DTensor.__new__" Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k pragupta ezyang msaroufim dcci [ghstack-poisoned]
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. Pull Request resolved: pytorch#162508 Approved by: https://github.com/ezyang ghstack dependencies: pytorch#161695
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. Pull Request resolved: pytorch#162508 Approved by: https://github.com/ezyang ghstack dependencies: pytorch#161695
…nsor_info (#162968) Next PR writes a C++ implementation. Seems good to have tests first. Pull Request resolved: #162968 Approved by: https://github.com/ezyang ghstack dependencies: #161695, #162508
Move the entirety of `__new__` into C++, saving a layer of disable_dynamo and making progress toward all-C++. Pull Request resolved: pytorch#162508 Approved by: https://github.com/ezyang ghstack dependencies: pytorch#161695
…nsor_info (pytorch#162968) Next PR writes a C++ implementation. Seems good to have tests first. Pull Request resolved: pytorch#162968 Approved by: https://github.com/ezyang ghstack dependencies: pytorch#161695, pytorch#162508
Stack from ghstack (oldest at bottom):
Move the entirety of
__new__
into C++, saving a layer of disable_dynamo and making progress toward all-C++.cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @ezyang @msaroufim @dcci