Bind fill_next_token_bitmask against nb::ndarray #338
Conversation
Sorry, I just got time to review this. I have left some comments. It's generally good to support non-torch tensor types, like mlx, tensorflow, jax.array, mojo tensor, etc.
Currently there are still relatively few users using non-torch tensors, but I do hope more users can leverage XGrammar for structured generation on different platforms.
You're also very welcome to support Mojo. If you're interested, you can implement support for Mojo tensors in contrib/mojo.py.
@Ahajha I have made the changes. Please check it out and you're welcome to make comments!
Here's where I'm not sure. I have a few goals here that I should probably lay out, though admittedly, we (Modular) might be switching away from xgrammar for some of these.
The whole reason we're even considering switching away is the PyTorch dependency. We're in a sense a PT competitor, so having it as a dependency is weird. It's also a packaging nightmare: it's massive, and you have to opt in to the smaller CPU-only build if that's all you use (which is the case for us currently). So our options here are to either switch away, use xgrammar as a C++ library (in which case this PR doesn't really matter; we'd just make the requisite bindings changes in-house), or submit the changes needed to remove PT as a dependency here, and this is the only PT-touching function we actually use (I don't think we even use the kernels, AFAICT).
All that said, by annotating the type with `torch.Tensor`, we keep the PT dependency around. I think there's a way we could mark it as annotation-only, but we'd still need to do a bit of work to remove the dep from the rest of the file. At the end of the day, for this to work for us, the PT dependency would have to be dropped, and probably the only way forward is to push everything torch-related into an extra. Unfortunately there's no (current) way to specify a default extra, so that would break users (they'd have to require something like `xgrammar[torch]`). I think we're of the mindset that it's a good project, just not a good fit for us.
The other goal is just a personal one, which is moving more code to bindings; I'll respond to that comment on the issue shortly. The idea of using `ArrayLike` is that it's what nanobind autogenerates for ndarrays, so it would carry over nicely.
The code context under discussion in nanobind.cc:

```diff
 namespace nb = nanobind;
 using namespace xgrammar;

-namespace {
+namespace xgrammar {
```
This anonymous namespace was intentional. It probably doesn't make too much of a difference, but it marks any functions within it as compilation-unit-local, which could help the compiler optimize the code, and it can also produce warnings if anything inside it is unused. Since these functions are only used in this file, we can mark them as such.
Probably not a huge deal, but that was the motivation.
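For illustration only (this is a generic sketch, not the actual xgrammar source), the difference between the two forms looks like this:

```cpp
// Anonymous namespace: everything inside gets internal linkage, i.e. it is
// private to this translation unit. The compiler can warn if a helper is
// unused and is free to inline it and drop the symbol entirely.
namespace {
int CountSetBits(unsigned x) {
  int n = 0;
  for (; x != 0; x &= x - 1) ++n;
  return n;
}
}  // namespace

// Named namespace: the same helper has external linkage by default, so it is
// visible to other translation units (and could collide with symbols there)
// unless it is additionally marked `static`.
namespace xgrammar {
int CountSetBitsVisible(unsigned x) {
  int n = 0;
  for (; x != 0; x &= x - 1) ++n;
  return n;
}
}  // namespace xgrammar
```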
Yes, I understand the motivation. But since we have enabled LTO, the difference between the two could be minor, so either is fine, I think.
@Ahajha Thanks for the response! I think it's reasonable to gradually separate out the Torch-related code (for example, placing it in contrib/torch.py). But several issues arise when we want to maintain backward compatibility:
I'd like to hear your thoughts on these two issues. At the very least, moving the Torch-related code to contrib/torch.py might make it easier for you to fork the code and remove the torch dependency for your use case. As for this PR, I think we can change the annotation to `ArrayLike`. Can you also mention in the Python docstring that this parameter
@Ahajha I think we can update to remove `torch.Tensor` from the type annotation and merge this PR. I can do it tomorrow. If you have any further comments, feel free to share them!
@Ahajha Thank you for your contribution!
@Ubospica Sorry about the lack of response. Yes, that should be fine, thanks!
No worries. Thank you!
Nanobind has built-in support for the DLPack standard, which allows us to take anything adhering to the DLPack spec here rather than just PyTorch objects. See here: https://nanobind.readthedocs.io/en/latest/ndarray.html
This does add a dependency on numpy, but only for a single type, which does seem a little overkill. If I push on #233 a little bit, though, we should only need these definitions in the stub files, and they will be unneeded for the .py files. So ideally this is a temporary dependency.

My goal is to reduce the dependency on PyTorch a bit by making the code more general. I don't know if (or even think that) we can remove it entirely, but this seems worthwhile to do.
The annotation on the `nb::ndarray` class will actually check at runtime that the parameter has the desired properties (in this case, that it's on the CPU and holds int32_t), and will simply fail to call the function if not; a sketch of the idea follows the list below. This might be a slight breaking change in terms of which exception actually gets raised, but I think this is reasonable.

A few related changes:
- `const&` instead of `*`; didn't see a good reason for it.
- `nanobind.cc`, just to keep things hygienic.
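To make the `nb::ndarray` mechanism described above concrete, here is a minimal, self-contained sketch. The function and module names are made up for illustration, and the real `fill_next_token_bitmask` binding has a different signature; this only shows how the constrained `nb::ndarray` parameter behaves:

```cpp
#include <cstdint>

#include <nanobind/nanobind.h>
#include <nanobind/ndarray.h>

namespace nb = nanobind;

namespace {

// The nb::ndarray template arguments ask nanobind to verify, at call time,
// that the argument is an int32 tensor on the CPU with C-contiguous layout.
// Any DLPack-capable object (torch, numpy, jax, mlx, ...) that satisfies
// these constraints is accepted; anything else fails overload resolution
// and raises a TypeError on the Python side.
void FillBitmaskSketch(
    nb::ndarray<int32_t, nb::c_contig, nb::device::cpu> bitmask) {
  int32_t* data = bitmask.data();  // raw view into the caller-owned buffer
  for (size_t i = 0; i < bitmask.size(); ++i) {
    data[i] = ~int32_t{0};  // e.g. mark every token as allowed
  }
}

}  // namespace

NB_MODULE(bitmask_sketch_ext, m) {
  m.def("fill_bitmask_sketch", &FillBitmaskSketch, nb::arg("bitmask"));
}
```

A caller could then pass, for example, a CPU torch tensor or a numpy array of dtype int32; both go through the DLPack/buffer protocols nanobind understands, and the extension itself never imports torch.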