iofu728 (Collaborator) commented May 12, 2025

What does this PR do?

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

@iofu728

iofu728 requested a review from Copilot on May 12, 2025, 12:14
iofu728 self-assigned this on May 12, 2025
iofu728 added the feature label on May 12, 2025

Copilot AI left a comment

Pull Request Overview

This PR introduces support for xAttention by adding new kernel implementations and updating related configuration, module, and documentation files.

  • Added xAttention kernels in the ops folder.
  • Updated module routing, model patching, and configuration to include xAttention.
  • Revised the README to document the new feature.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| minference/ops/xattention_fa.py | Added kernel implementations for xAttention operations. |
| minference/modules/forward.py | Updated to map "xattention" to its forward function. |
| minference/models_patch.py | Extended model patching support to include "xattention". |
| minference/minference_configuration.py | Updated configuration to register "xattention" as a supported type. |
| README.md | Documented the xAttention feature in the supported methods list. |
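
The routing and registration changes summarized above follow a familiar pattern: an allow-list of supported attention types plus a string-keyed table of forward functions. The sketch below is a hypothetical illustration of that pattern only, not MInference's actual code. `SUPPORTED_ATTN_TYPES`, `FORWARD_REGISTRY`, `register_forward`, `get_forward`, the `"dense"` entry, and the placeholder function bodies are all assumed names invented for this example.

```python
from typing import Callable, Dict

# Hypothetical allow-list, standing in for the configuration's supported types.
SUPPORTED_ATTN_TYPES = {"dense", "xattention"}

# Hypothetical routing table: attention type name -> forward function.
FORWARD_REGISTRY: Dict[str, Callable[..., str]] = {}


def register_forward(attn_type: str):
    """Return a decorator that registers a forward function under attn_type."""
    if attn_type not in SUPPORTED_ATTN_TYPES:
        raise ValueError(f"unsupported attn_type: {attn_type!r}")

    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        FORWARD_REGISTRY[attn_type] = fn
        return fn

    return decorator


@register_forward("dense")
def dense_forward(q, k, v) -> str:
    return "dense"  # placeholder body; a real forward would compute attention


@register_forward("xattention")
def xattention_forward(q, k, v) -> str:
    return "xattention"  # placeholder body standing in for the new kernels


def get_forward(attn_type: str) -> Callable[..., str]:
    """Look up the forward function, failing loudly on unknown names."""
    try:
        return FORWARD_REGISTRY[attn_type]
    except KeyError:
        raise ValueError(f"no forward registered for {attn_type!r}") from None
```

With a table like this, supporting a new attention type touches two places, the allow-list and the registration, which matches the shape of the file changes listed in the summary.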
Comments suppressed due to low confidence (1)

minference/ops/xattention_fa.py:213

  • The parameter name 'is_caual' is misspelled; consider renaming it to 'is_causal' for clarity.
def flat_group_gemm_fuse_reshape_kernel(Q, K, Out, ... is_caual: tl.constexpr,):

assert q_len % reshaped_block_size == 0
try:
    assert k_len % segment_size == 0
except:

Copilot AI May 12, 2025

Avoid using a bare except with a debugger breakpoint; consider handling a specific exception or removing the breakpoint in production code.

Suggested change
- except:
+ except AssertionError:
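
To illustrate the reviewer's point: a bare `except:` catches every exception, including ones unrelated to the divisibility check, while `except AssertionError:` catches only the failed assertion. The sketch below is a standalone illustration, not code from the PR. The function names `check_divisible_bare` and `check_divisible_narrow` and the zero `segment_size` used to provoke an unrelated error are assumptions made for the demo.

```python
def check_divisible_bare(k_len: int, segment_size: int) -> bool:
    """Bare except: hides every failure, including unrelated bugs."""
    try:
        assert k_len % segment_size == 0
    except:  # noqa: E722 (deliberately bare to show the problem)
        return False
    return True


def check_divisible_narrow(k_len: int, segment_size: int) -> bool:
    """Narrow except: handles only the failed assertion."""
    try:
        assert k_len % segment_size == 0
    except AssertionError:
        return False
    return True


# A genuine divisibility failure is handled the same way by both versions.
print(check_divisible_bare(10, 3), check_divisible_narrow(10, 3))

# segment_size == 0 raises ZeroDivisionError, which is a different bug.
# The bare version silently reports "not divisible".
print(check_divisible_bare(10, 0))

# The narrow version lets the unrelated error propagate to the caller.
try:
    check_divisible_narrow(10, 0)
except ZeroDivisionError as exc:
    print("propagated:", type(exc).__name__)
```

Note that `assert` statements are removed when Python runs with the `-O` flag, so an explicit `if ...: raise ValueError(...)` is a more robust way to validate shapes than either variant.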

iofu728 merged commit 91a0506 into main on May 12, 2025
1 check passed
iofu728 deleted the hjiang/add_xattention branch on May 12, 2025, 12:26