Add errors when dtype is anything other than int32 for ptr metadata #1492
Conversation
…lengths Signed-off-by: Pavani Majety <[email protected]>
Summary of Changes
Hello @pavanimajety, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces robust data type validation for key metadata arguments (qo_indptr, kv_indptr, and kv_len_arr) within the flashinfer/mla.py module. By enforcing that these arguments are torch.int32 tensors, the changes aim to prevent potential errors and ensure compatibility with underlying operations that rely on specific data type conventions. Additionally, the relevant docstrings have been updated to reflect these int32 requirements, enhancing code clarity for developers.
Highlights
- Data Type Validation: Added explicit ValueError checks in the plan function within flashinfer/mla.py to ensure that the kv_len_arr, kv_indptr, and qo_indptr tensors are of torch.int32 dtype, preventing potential runtime issues from incorrect data types.
- Documentation Update: Updated the docstrings for kv_indptr, kv_indices, and kv_len_arr in flashinfer/mla.py to explicitly state torch.Tensor(int32) as their expected data type, improving code clarity and documentation.
Code Review
This pull request adds important validation checks for the data types of metadata arguments in the plan method, ensuring they are torch.int32. This is a good improvement for robustness.
My review includes a suggestion to refactor the validation logic to be more concise and to add a missing check for the kv_indices tensor, which was updated in the docstring but not validated in the code.
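A more concise version of the checks, as the review suggests, could look roughly like this. It is a sketch under the assumption that plan() receives these four tensors; the function name validate_metadata_dtypes is illustrative, not from the PR:

```python
import torch

def validate_metadata_dtypes(qo_indptr, kv_indptr, kv_indices, kv_len_arr):
    # A single loop replaces several near-identical if-blocks, and it
    # also covers kv_indices, which the docstring update mentions but
    # the original checks missed.
    named = {
        "qo_indptr": qo_indptr,
        "kv_indptr": kv_indptr,
        "kv_indices": kv_indices,
        "kv_len_arr": kv_len_arr,
    }
    for name, t in named.items():
        if t.dtype != torch.int32:
            raise ValueError(
                f"Expected {name} to have dtype torch.int32, got {t.dtype}"
            )
```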
Thanks @pavanimajety for bringing this up!
torch.Tensor(int32) is invalid typing, and I encourage using torch.IntTensor instead (IntTensor in torch refers specifically to an int32 tensor).
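A quick check confirms the reviewer's point that torch.IntTensor corresponds to int32, which is why it works as unambiguous docstring shorthand:

```python
import torch

# torch.IntTensor is the classic CPU tensor type whose element dtype
# is int32, so writing "torch.IntTensor" in a docstring pins down the
# expected dtype exactly.
t = torch.IntTensor([0, 1, 2])
print(t.dtype)  # torch.int32
```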
Signed-off-by: Pavani Majety <[email protected]>
Addressed the feedback!
On a side note, does it make sense to make the ptr arrays int64, and leave kv_lens and kv_indices in int32?
Sure, at the kernel level it could be supported, but let's leave it for later PRs :)
LGTM
📌 Description
This PR adds a very simple error check for the metadata arguments qo_indptr, kv_indptr and kv_lens array.
🔍 Related Issues
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
- Installed pre-commit by running pip install pre-commit (or used your preferred method).
- Installed the hooks with pre-commit install.
- Ran pre-commit run --all-files and fixed any reported issues.
🧪 Tests
- Tests have been added/updated as needed (unittest, etc.).
Reviewer Notes