iofu728 (Collaborator) commented on Apr 16, 2025

What does this PR do?

  • Fixes the residual computation in the Llama decoder layer.
  • Fixes the handling of self._supports_num_logits_to_keep.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

@iofu728

iofu728 added the bug label (Something isn't working) on Apr 16, 2025
iofu728 requested a review from Copilot on Apr 16, 2025, 07:58
iofu728 self-assigned this on Apr 16, 2025

Copilot AI left a comment

Pull Request Overview

This PR fixes residual computation in the Llama decoder layer and improves the handling of logits support, while updating the constructor signature for DynamicCacheWithRepeat.

  • Uses hidden_states.clone() to create an independent residual copy in the decoder layer.
  • Adds additional checks for supported logits properties in cache preparation.
  • Updates the __init__ signature of DynamicCacheWithRepeat to require a config parameter.
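
The first bullet concerns aliasing: if the residual is only another name for hidden_states, any in-place update to hidden_states also changes the residual. The following is a minimal sketch of that effect, not the code from minference/patch.py. Plain Python lists stand in for tensors, and scale_in_place, layer_with_alias and layer_with_copy are hypothetical names introduced for illustration; in PyTorch the analogous independent copy is hidden_states.clone().

```python
# Plain Python lists stand in for tensors (an assumption for illustration);
# in PyTorch the analogous independent copy is hidden_states.clone().

def scale_in_place(values, factor):
    """Hypothetical stand-in for a sub-layer that mutates its input in place."""
    for i in range(len(values)):
        values[i] *= factor
    return values


def layer_with_alias(hidden_states):
    residual = hidden_states  # same object, not a copy
    hidden_states = scale_in_place(hidden_states, 2.0)
    # residual was overwritten too, so this computes 2x + 2x
    return [h + r for h, r in zip(hidden_states, residual)]


def layer_with_copy(hidden_states):
    residual = list(hidden_states)  # independent copy, like .clone()
    hidden_states = scale_in_place(hidden_states, 2.0)
    # residual still holds the original input, so this computes 2x + x
    return [h + r for h, r in zip(hidden_states, residual)]


aliased = layer_with_alias([1.0, 2.0])
copied = layer_with_copy([1.0, 2.0])
print(aliased)  # [4.0, 8.0]
print(copied)   # [3.0, 6.0]
```

Only the copied variant adds the original input back, which is what a residual connection is meant to do.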

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File: Description
  • minference/patch.py: Clones hidden states in the Llama decoder layer to avoid unintended mutations.
  • minference/modules/kvcompression.py: Adds conditional checks for logits support and updates DynamicCacheWithRepeat's __init__.
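
A conditional check for logits support can be sketched as a guarded attribute lookup. This is an assumption about the general pattern, not the code in minference/modules/kvcompression.py: the two model classes and both helper functions are hypothetical, and only the attribute name _supports_num_logits_to_keep comes from this PR.

```python
# Hypothetical stand-ins; these classes are not part of MInference or transformers.
class ModelWithCheck:
    def _supports_num_logits_to_keep(self):
        return True


class ModelWithoutCheck:
    pass


def supports_num_logits_to_keep(model):
    """Return True only if the model exposes the capability check and it passes."""
    checker = getattr(model, "_supports_num_logits_to_keep", None)
    if checker is None:
        return False  # attribute missing: treat as unsupported
    return bool(checker()) if callable(checker) else bool(checker)


def prepare_kwargs(model, model_kwargs):
    """Add num_logits_to_keep only when the model reports support for it."""
    kwargs = dict(model_kwargs)  # do not mutate the caller's dict
    if supports_num_logits_to_keep(model) and "num_logits_to_keep" not in kwargs:
        kwargs["num_logits_to_keep"] = 1
    return kwargs


with_check = prepare_kwargs(ModelWithCheck(), {})
without_check = prepare_kwargs(ModelWithoutCheck(), {"use_cache": True})
```

Looking the attribute up with getattr and a default means a model that lacks it is treated as unsupported rather than raising AttributeError.
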
Comments suppressed due to low confidence (1)

minference/modules/kvcompression.py:370

  • Changing the __init__ signature to require a 'config' argument may break backward compatibility if existing instantiations do not pass this parameter. Verify that all usages of DynamicCacheWithRepeat are updated accordingly.
def __init__(self, config, *args, **kwargs):
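
The compatibility concern can be reproduced with two small stand-in classes. This is a sketch under assumptions: CacheBefore and CacheAfter are hypothetical, the earlier signature is assumed to have accepted only *args and **kwargs, and the dict passed as config is a placeholder rather than a real model config.

```python
class CacheBefore:
    """Hypothetical stand-in for the assumed earlier signature."""

    def __init__(self, *args, **kwargs):
        self.args = args


class CacheAfter:
    """Hypothetical stand-in using the signature shown above."""

    def __init__(self, config, *args, **kwargs):
        self.config = config


CacheBefore()  # the assumed old signature accepts a call with no arguments

try:
    CacheAfter()  # same call site, new signature
    error = None
except TypeError as exc:
    error = str(exc)  # message names the missing 'config' argument

cache = CacheAfter({"num_hidden_layers": 2})  # placeholder config
```

Any call site that previously constructed the cache without arguments would raise TypeError after the change, so each one has to pass a config.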

iofu728 merged commit 5416d89 into main on Apr 16, 2025
1 check passed
iofu728 deleted the hjiang/fix_generation branch on Apr 16, 2025, 07:59