AWQ resolved mappings -- ensure shapes align #1372


Merged
merged 8 commits into main from bdellabe/awq-resolve-mappings
Apr 24, 2025

Conversation

brian-dellabetta
Collaborator

@brian-dellabetta brian-dellabetta commented Apr 23, 2025

SUMMARY:
Add a robustness check to AWQ to exclude mappings where the layer shapes don't align. This is a known issue; I wanted to fix it in a separate PR because the fix lives in a different location in AutoAWQ, and I wanted to keep the initial AWQ PR as close to the AutoAWQ implementation as possible so the git history reflects our changes well.

TEST PLAN:
Added unit tests covering the default case and edge cases.


👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

@brian-dellabetta brian-dellabetta changed the title awq resolved mappings ensure shapes align AWQ resolved mappings -- ensure shapes align Apr 23, 2025
@kylesayrs
Collaborator

Shouldn't we be erroring if the user passes a bad mapping, rather than excluding model-specific names?

@kylesayrs
Collaborator

kylesayrs commented Apr 23, 2025

Also, is there an architecture-agnostic way to verify a mapping? How do we know that the model doesn't perform a reshaping in between mappings?

@brian-dellabetta brian-dellabetta marked this pull request as ready for review April 23, 2025 19:05
@brian-dellabetta
Collaborator Author

Shouldn't we be erroring if the user passes a bad mapping, rather than excluding model-specific names?

It does; it will fail during the call if, for example, a user has `"re:.*input_layernorm"` in the smooth-layer mapping but no matching layer can be found. This is a different edge case: the mapping itself resolves, but some models don't have compatible shapes.

@brian-dellabetta brian-dellabetta added the ready When a PR is ready for review label Apr 23, 2025
@brian-dellabetta
Collaborator Author

Also, is there an architecture-agnostic way to verify a mapping? How do we know that the model doesn't perform a reshaping in between mappings?

I talked to @rahul-tuli about an architecture-agnostic way, but for now we resolved to scope this to exactly how AutoAWQ handles it here. In reality, if `smooth_layer.out_features` doesn't match `balance_layer.in_features`, it will fail at runtime. Wonkier architectures may hit further errors, but we resolved to handle those as they arise rather than attempt a more all-encompassing solution.
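The runtime failure mode described above can be sketched as a minimal, hypothetical check (the function name and the dimension values below are illustrative assumptions, not the PR's actual code):

```python
# Hypothetical sketch of the shape-alignment check: AWQ folds a
# per-channel smoothing scale computed over the smooth layer's output
# channels into each balance layer's input channels, so the two
# dimensions must match exactly.

def shapes_align(smooth_out_features, balance_in_features_list):
    return all(smooth_out_features == n for n in balance_in_features_list)

# MHA-style attention: v_proj and o_proj are both 4096-wide, so a
# v_proj -> o_proj mapping is kept.
print(shapes_align(4096, [4096]))  # True

# GQA-style attention: v_proj produces only 1024 features while o_proj
# consumes 4096, so the mapping must be excluded.
print(shapes_align(1024, [4096]))  # False
```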

@kylesayrs
Collaborator

@brian-dellabetta I'm worried about breaking/polluting other model architectures. If we're excluding names specific to an architecture, shouldn't we be creating a model architecture mappings registry, similar to smoothquant?

@brian-dellabetta
Collaborator Author

brian-dellabetta commented Apr 23, 2025

@brian-dellabetta I'm worried about breaking/polluting other model architectures. If we're excluding names specific to an architecture, shouldn't we be creating a model architecture mappings registry, similar to smoothquant?

@kylesayrs we do have a mappings registry for each model family, see the diff just above here. However, specific models within a family can have different shapes. For example, ["TinyLlama/TinyLlama-1.1B-Chat-v1.0", "meta-llama/Llama-2-7b-hf", "meta-llama/Meta-Llama-3-8B-Instruct"] can all use the "Llama" mappings in the registry, but the v_proj/o_proj mapping needs to be excluded for TinyLlama and Llama 3 because the shapes don't align; the user would otherwise hit a runtime error. Our rule of thumb is to adhere to the original AutoAWQ implementation for details like this.
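To make the mismatch concrete, here is a hedged sketch of the arithmetic (the head counts and head_dim values are taken from the models' published configs, but treat the exact numbers as assumptions):

```python
# v_proj's output width is num_key_value_heads * head_dim, while
# o_proj's input width is num_attention_heads * head_dim (the hidden
# size). With multi-head attention the two match; with grouped-query
# attention (fewer KV heads) they do not.

def proj_width(num_heads, head_dim):
    return num_heads * head_dim

# Llama-2-7B (MHA): 32 attention heads == 32 KV heads, head_dim 128
print(proj_width(32, 128))  # 4096 -> matches o_proj, mapping kept

# Meta-Llama-3-8B (GQA): 8 KV heads vs 32 attention heads, head_dim 128
print(proj_width(8, 128))   # 1024 -> does not match 4096, excluded
```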

@kylesayrs
Collaborator

@brian-dellabetta I see. Since v_proj and o_proj are really common names, I'd still rather limit these changes to the llama model type.

I'd suggest that, rather than logging the exclusion for each layer (which could be noisy), we hoist the logic out of the for loops and modify the mappings based on the config.

```python
def validate_mapping(self, model: PreTrainedModel, mappings: List[AWQMapping]):
    if model.config.model_type == "llama":
        num_attn_heads = model.config.num_attention_heads
        num_kv_heads = model.config.num_key_value_heads
        if num_attn_heads != num_kv_heads:
            logger.info("Excluding v_proj mappings due to shape mismatch")
            mappings = [
                mapping
                for mapping in mappings
                if mapping.smooth_layer != "re:.*v_proj"
            ]
    return mappings
```

@brian-dellabetta
Collaborator Author

@brian-dellabetta I see. Since v_proj and o_proj are really common names, I'd still rather limit these changes to the llama model type.

@kylesayrs it is not limited to llama. See AutoAWQ's qwen, mistral and gemma implementations for a few additional examples.

@kylesayrs
Collaborator

@brian-dellabetta My point is that AWQ explicitly lists which architectures this rule applies to, whereas the current implementation applies the rule to all architectures.

However, it seems like this is consistent enough across architectures to merit a blanket rule.

If logging for every layer is noisy, we can talk about logging only once.

kylesayrs
kylesayrs previously approved these changes Apr 23, 2025
@brian-dellabetta
Collaborator Author

@brian-dellabetta My point is that AWQ explicitly lists which architectures this rule applies to, whereas the current implementation applies the rule to all architectures.

However, it seems like this is consistent enough across architectures to merit a blanket rule.

If logging for every layer is noisy, we can talk about logging only once.

@kylesayrs yeah, I don't think they picked those model architectures deliberately; I'd bet they were added as errors arose, because without this check the algorithm errors out further down. I'm not sure why they do full shape checks, which seem more exclusive than just checking `smooth_layer.out_features != balance_layer.in_features`, but Rahul encouraged following suit with their implementation. The authors have said this seems to have minimal effect with respect to the other mappings as well.

Yeah, I thought about the logging being noisy for large models. Since we are restricting this to o_proj/v_proj layers, I'll change the logging to a count.

Signed-off-by: Brian Dellabetta <[email protected]>
@brian-dellabetta brian-dellabetta force-pushed the bdellabe/awq-resolve-mappings branch from f250c1c to 51a58a0 Compare April 23, 2025 21:14
Collaborator

@shanjiaz shanjiaz left a comment


Thanks for adding tests! Logs are much cleaner too. Works for me locally 🎉

@dsikka dsikka enabled auto-merge (squash) April 24, 2025 03:20
@dsikka dsikka requested a review from kylesayrs April 24, 2025 03:20
@dsikka dsikka merged commit e936d45 into main Apr 24, 2025
8 checks passed
@dsikka dsikka deleted the bdellabe/awq-resolve-mappings branch April 24, 2025 14:01
kylesayrs pushed a commit that referenced this pull request Apr 29, 2025
Co-authored-by: Dipika Sikka <[email protected]>
4 participants