
Conversation

@Ahajha (Contributor) commented Jun 16, 2025

Nanobind has built-in support for the DLPack standard, which allows us to take anything adhering to the DLPack spec here rather than just PyTorch objects. See here: https://nanobind.readthedocs.io/en/latest/ndarray.html

This does add a dependency on numpy, even though it's only needed for a single type, which does feel a little like overkill. If I push on #233 a bit, though, these definitions should only be needed in the stub files and not in the .py files, so ideally this is a temporary dependency.

My goal is to reduce the dependency on PyTorch a bit by making the code more general. I don't know whether we can remove it entirely (and don't necessarily think we can), but this seems worthwhile to do.

The annotations on the nb::ndarray class are checked at runtime: nanobind verifies that the argument has the desired properties (in this case, that it lives on the CPU and holds int32_t values) and simply refuses to call the function if it doesn't. This might be a slight breaking change in terms of which exception actually gets raised, but I think that's reasonable.
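For reference, here's a minimal sketch of what such a constrained parameter can look like in a nanobind binding (the function and module names here are illustrative, not the exact code in this PR):

```cpp
#include <cstdint>

#include <nanobind/nanobind.h>
#include <nanobind/ndarray.h>

namespace nb = nanobind;

// Accepts any object that implements the DLPack protocol (torch.Tensor,
// numpy.ndarray, jax.Array, ...) as long as it is an int32 array on the CPU.
// If the argument doesn't satisfy the annotations, nanobind raises a TypeError
// before the function body ever runs.
void FillNextTokenBitmask(nb::ndarray<int32_t, nb::device::cpu> bitmask) {
  int32_t* data = bitmask.data();  // typed pointer into the caller's buffer
  size_t rows = bitmask.shape(0);  // shape is queried per axis
  (void)data;
  (void)rows;
}

NB_MODULE(xgrammar_example, m) { m.def("fill_next_token_bitmask", &FillNextTokenBitmask); }
```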

A few related changes:

  • I've converted a few function signatures to take DLTensor by const& instead of by pointer; I didn't see a good reason for the pointer version (a rough sketch of this change is below).
  • Added an anonymous namespace in nanobind.cc, just to keep things hygienic.
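A rough sketch of the signature change from the first bullet (the function name is just illustrative, not the actual API):

```cpp
#include <dlpack/dlpack.h>

// Before: the tensor descriptor was passed by pointer:
// void ApplyTokenBitmask(DLTensor* bitmask);

// After: pass by const reference. Callers hand in the same DLPack descriptor,
// but the signature now communicates that the argument is required (non-null)
// and that the descriptor struct itself is not modified (the buffer it points
// to may still be written).
void ApplyTokenBitmask(const DLTensor& bitmask);
```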

@Ahajha force-pushed the ahajha/torch-to-dltensor branch from a962dbc to 4ccf711 on June 16, 2025 at 06:25
@Ubospica (Collaborator) left a comment

Sorry, I only just got time to review this. I have left some comments. It's generally good to support non-torch tensor types, like mlx, tensorflow, jax.Array, Mojo tensors, etc.

Currently there are still relatively few users working with non-torch tensors, but I do hope more users can leverage XGrammar for structured generation on different platforms.

You’re also very welcome to support Mojo. If you’re interested, you can implement support for Mojo tensors in contrib/mojo.py.

@Ubospica (Collaborator) commented

@Ahajha I have made the changes. Please check it out; you're welcome to leave comments!

@Ahajha (Contributor, Author) left a comment

Here's the part I'm not sure about. I have a few goals here that I should probably lay out. Though admittedly, we (Modular) might be switching away from xgrammar for some of these reasons.

The whole reason we're even considering switching away is the PyTorch dependency. We're in a sense a PT competitor, so having it as a dependency is odd. It's also a packaging nightmare: it's massive, and you have to opt in to the smaller CPU-only build if that's all you use (which is the case for us currently). So our options are to switch away, to use xgrammar as a C++ library (in which case this PR doesn't really matter; we'd just make the requisite bindings changes in-house), or to submit the changes needed to remove PT as a dependency here. This is the only function touching PT that we actually use (I don't think we even use the kernels, AFAICT).

All that said, by annotating the type with torch.Tensor we keep the PT dependency around. I think there's a way we could mark it as annotation-only, but we'd still need to do a bit of work to remove the dependency from the rest of the file. At the end of the day, for this to work for us the PT dependency would have to be dropped, and probably the only way forward is to push everything torch-related into an extra. Unfortunately there's no (current) way to specify a default extra, so that would break users (you'd have to require something like xgrammar[torch]). I think we're of the mindset that it's a good project, just not a good fit for us.

The other goal is a personal one: moving more code into the bindings. I'll respond to that comment on the issue shortly, but the idea of using ArrayLike is that it's what nanobind autogenerates for ndarrays, so it would carry over nicely.

Comment on lines 22 to 26

```cpp
namespace nb = nanobind;
using namespace xgrammar;

namespace {
namespace xgrammar {
```

@Ahajha (Contributor, Author) replied

This anonymous namespace was intentional. It probably doesn't make too much of a difference, but it gives any functions inside it internal linkage (local to the translation unit), which can help the compiler optimize, and it also lets the compiler warn if anything in it is unused. Since these functions are only used within this file, we can mark them as such.
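Roughly this pattern (the helper shown is illustrative, not the actual contents of nanobind.cc):

```cpp
#include <cstdint>

#include <dlpack/dlpack.h>

namespace {  // internal linkage: everything here is local to this translation unit

// A file-local helper. Because it has internal linkage, the compiler can
// inline it freely and (with -Wunused-function or similar) warn if it ever
// becomes dead code.
int32_t* GetBitmaskData(const DLTensor& bitmask) {
  return reinterpret_cast<int32_t*>(static_cast<char*>(bitmask.data) + bitmask.byte_offset);
}

}  // namespace
```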

Probably not a huge deal, but that was the motivation.

@Ubospica (Collaborator) replied

Yes, I understand the motivation. But since we have enabled LTO, the difference between the two is likely minor, so either is fine, I think.

@Ubospica (Collaborator) commented Jul 22, 2025

> Here's the part I'm not sure about. I have a few goals here that I should probably lay out. [...]

@Ahajha Thanks for the response!

I think it’s reasonable to gradually separate out the Torch-related code (for example, placing it in contrib/torch.py). But several issues arise when we want to maintain backward compatibility:

  1. allocate_token_bitmask still needs to return a torch.Tensor, and since it’s a public API, that means the codebase still needs to depend on Torch.
  2. Regarding Python dependencies, one possible solution is to let xgrammar[all] install all dependencies, while xgrammar itself doesn’t depend on Torch. I’m just a bit concerned that this might break backward compatibility with pip install xgrammar.

I’d like to hear your thoughts on these two issues.

At the very least, moving the Torch-related code to contrib/torch.py might make it easier for you to fork the code and remove the torch dependency for your use case.

As for this PR, I think we can change the annotation to ArrayLike. Can you also mention in the Python docstring that this parameter accepts array-like objects that support the DLPack protocol, such as torch.Tensor? Then we can go ahead and merge it.
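For example, if the docstring lives on the binding itself, something along these lines (just a sketch; the wording, argument name, and function are illustrative, not the final code):

```cpp
// Inside the NB_MODULE(...) block; FillNextTokenBitmask stands in for the
// actual bound function.
m.def(
    "fill_next_token_bitmask", &FillNextTokenBitmask, nb::arg("bitmask"),
    // The docstring carries over into the generated stub, next to the
    // ArrayLike annotation that nanobind's stub generator emits for
    // nb::ndarray parameters.
    "Fill the next-token bitmask in place.\n\n"
    "``bitmask`` may be any array-like object that supports the DLPack "
    "protocol (e.g. torch.Tensor or numpy.ndarray), with int32 dtype and "
    "resident on the CPU.");
```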

@Ubospica (Collaborator) commented Jul 29, 2025

@Ahajha I think we can update the PR to remove torch.Tensor from the type annotation and then merge it. I can do it tomorrow. If you have any further comments, feel free to share them!

@Ubospica (Collaborator) commented

@Ahajha Thank you for your contribution!

@Ahajha (Contributor, Author) commented Jul 30, 2025

@Ubospica Sorry about the lack of response. Yes that should be fine, thanks!

@Ubospica (Collaborator) commented

> @Ubospica Sorry about the lack of response. Yes that should be fine, thanks!

No worries. Thank you!

@Ubospica merged commit 604d004 into mlc-ai:main on Aug 4, 2025. 38 checks passed.