AWQ resolved mappings -- ensure shapes align #1372
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.
Shouldn't we be erroring if the user passes a bad mapping, rather than excluding model-specific names?
Also, is there an architecture-agnostic way to verify a mapping? How do we know that the model doesn't perform a reshaping in between mappings?
It does; it will fail in the call if, for example, a user has
I talked to @rahul-tuli about an architecture-agnostic way, but I think for now we resolved to scope this to exactly how AutoAWQ handles it here. In reality, if smooth_layer.out_features doesn't match balance_layer.in_features, it will fail during runtime. Wonkier architectures may hit further errors, but we resolved to handle those as they arise rather than attempting a more all-encompassing solution.
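A minimal sketch of that runtime constraint, assuming plain nn.Linear modules (shapes_align is an illustrative helper, not part of this PR):

```python
import torch.nn as nn

def shapes_align(smooth_layer: nn.Linear, balance_layer: nn.Linear) -> bool:
    # AWQ scales computed on the smooth layer's outputs can only be folded
    # into a balance layer whose input width matches that output width.
    return smooth_layer.out_features == balance_layer.in_features

# e.g. a typical MLP mapping where the widths line up
up_proj = nn.Linear(4096, 11008)
down_proj = nn.Linear(11008, 4096)
assert shapes_align(up_proj, down_proj)
```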
@brian-dellabetta I'm worried about breaking/polluting other model architectures. If we're excluding names specific to an architecture, shouldn't we be creating a model architecture mappings registry, similar to smoothquant?
@kylesayrs we do have a mappings registry for each model family, see the diff just above here. However, the specific model can have different shapes. For example,
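As a hypothetical illustration of the kind of shape difference at play (the numbers below are illustrative, not taken from a particular checkpoint):

```python
# Grouped-query attention: fewer KV heads than attention heads
hidden_size = 4096
num_attention_heads = 32
num_key_value_heads = 8
head_dim = hidden_size // num_attention_heads   # 128

v_proj_out_features = num_key_value_heads * head_dim  # 1024
o_proj_in_features = num_attention_heads * head_dim   # 4096

# v_proj's output is narrower than o_proj's input, so a
# v_proj -> o_proj smoothing mapping cannot be applied as-is.
assert v_proj_out_features != o_proj_in_features
```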
@brian-dellabetta I see. I'd suggest that, rather than logging the exclusion for each layer (which could be noisy), we hoist the logic out of the for loops and modify the mappings based on the config:

def validate_mapping(self, model: PreTrainedModel, mappings: List[AWQMapping]):
    if model.config.model_type == "llama":
        num_attn_heads = model.config.num_attention_heads
        num_kv_heads = model.config.num_key_value_heads
        if num_attn_heads != num_kv_heads:
            # GQA: v_proj's output width no longer matches o_proj's input width
            logger.info("Excluding v_proj due to shape mismatch")
            mappings = [
                mapping for mapping in mappings
                if mapping.smooth_layer != "re:.*v_proj"
            ]
    return mappings
@kylesayrs it is not limited to llama. See AutoAWQ's qwen, mistral and gemma implementations for a few additional examples. |
@brian-dellabetta My point is that AWQ explicitly lists which architectures this rule applies to, whereas the current implementation applies the rule to all architectures. However, it seems like this is consistent enough across architectures to merit a blanket rule. If logging for every layer is noisy, we can talk about logging only once. |
@kylesayrs yeah, I don't think they picked those model architectures in particular; I bet they were added as errors arose, because without this check the algorithm will error out further down. I am not sure why they are doing shape checks, which seems more exclusive than just checking
Yeah, I thought about the logging being noisy for large models. Since we are restricting this to o_proj/v_proj layers, I'll change it to a count.
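A minimal sketch of the count-based logging approach, assuming a simplified mapping structure (ResolvedMapping and filter_mappings are illustrative names, not the PR's actual code):

```python
from typing import List, NamedTuple

import torch.nn as nn
from loguru import logger

class ResolvedMapping(NamedTuple):
    # simplified stand-in for the modifier's resolved mapping object
    smooth_layer: nn.Linear
    balance_layer: nn.Linear

def filter_mappings(mappings: List[ResolvedMapping]) -> List[ResolvedMapping]:
    """Drop mappings whose shapes don't align and log a single summary count."""
    kept = [
        m for m in mappings
        if m.smooth_layer.out_features == m.balance_layer.in_features
    ]
    num_skipped = len(mappings) - len(kept)
    if num_skipped:
        logger.info(f"Skipped {num_skipped} mappings due to shape mismatch")
    return kept
```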
Force-pushed from f250c1c to 51a58a0
Thanks for adding tests! Logs are much cleaner too. Works for me locally 🎉
SUMMARY:
Add a robustness check to AWQ to exclude mappings where the layer shapes don't align. This is a known issue; I wanted to do it in a separate PR because it occurs in a different location in AutoAWQ, and I wanted to keep the initial AWQ PR as close to the AutoAWQ implementation as possible so the git history reflects our changes well.
TEST PLAN:
Added unit tests covering the default case and edge cases.
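For reference, a hedged sketch of what such tests might look like, reusing the illustrative ResolvedMapping/filter_mappings helpers from the sketch above (not the PR's actual test code):

```python
import torch.nn as nn

def test_mismatched_mapping_is_excluded():
    # GQA-style shapes: the smooth layer is narrower than the balance layer's input
    v_proj = nn.Linear(64, 16)
    o_proj = nn.Linear(64, 64)
    assert filter_mappings([ResolvedMapping(v_proj, o_proj)]) == []

def test_aligned_mapping_is_kept():
    up_proj = nn.Linear(64, 256)
    down_proj = nn.Linear(256, 64)
    mappings = [ResolvedMapping(up_proj, down_proj)]
    assert filter_mappings(mappings) == mappings
```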