GPTQ Activation Ordering #94
Merged · 64 commits
012138a actorder (horheynm)
f88c84e g_idx fix (horheynm)
3211fe1 fix (horheynm)
bbbf564 lint (horheynm)
8d29f0d propagate g_idx with perm (horheynm)
89224e9 scratch (horheynm)
cb8446d GPTQ - move calibration of quantization params to after hessian calibr…
d7029a0 no recompute (horheynm)
eeff533 clean up (horheynm)
842b150 remove unwanted code (horheynm)
240c39d draft (horheynm)
820d08a draft (horheynm)
564845e draft (horheynm)
6f54737 mimic gptq (horheynm)
2cc99bb permutation seems to be working (kylesayrs)
6fe537d WIP: fails on non-square weights (kylesayrs)
6611073 pass perm into quant params calculation (kylesayrs)
9077969 works on vllm and loading with identity permutation (kylesayrs)
6a1565e WIP: working pytorch with actorder (kylesayrs)
1940df4 able to inference with script and reload, needed to set (kylesayrs)
11beac1 remove testing comments (kylesayrs)
9456698 remove scripts (kylesayrs)
0c773e6 remove dregs (kylesayrs)
b6bebc2 merge actorder and group cases (kylesayrs)
3bde194 code structuring and cleanup (kylesayrs)
758c495 use `refresh_layer_weight_quant_params` (kylesayrs)
85fb1ff update_layer_weight_quant_params reuse (kylesayrs)
5b52e9d deep copy H to allow for future reuse (kylesayrs)
9e2cef9 hoist group_size (kylesayrs)
e725cc7 remove footer note (kylesayrs)
2392b83 apply style (kylesayrs)
a5a30e1 fix rebase dregs (kylesayrs)
ca6fc6e remove extra line (kylesayrs)
6f99634 move lines for better grouping (kylesayrs)
b726bd6 move for better diff (kylesayrs)
2002761 remove extra lines (kylesayrs)
0ef0c5b use getattr to avoid pr dep (kylesayrs)
476aed0 Revert "use getattr to avoid pr dep" (kylesayrs)
ffb809c add actorder to docstring (kylesayrs)
edc02d4 Merge remote-tracking branch 'origin' into kylesayrs/activation-ordering (kylesayrs)
bc49946 do not clone hessian (kylesayrs)
99f2286 apply style (kylesayrs)
48b36c2 avoid unset g_idx parameter by observing directly (kylesayrs)
9550f14 use update_layer_weight_quant_params (kylesayrs)
d22ff2e Merge remote-tracking branch 'origin/main' into kylesayrs/activation-… (kylesayrs)
72d919f Merge branch 'main' into kylesayrs/activation-ordering (kylesayrs)
e4d37a6 indent for when quantization_scheme is missing (kylesayrs)
cdc8bcd add actorder e2e test (kylesayrs)
1fe188b do not freeze if initialized from gptq (kylesayrs)
b06a103 add get_attr_chain helper function (kylesayrs)
f293efd cleanup and clarify logic (kylesayrs)
a99e0da apply style (kylesayrs)
bf915d4 rename to getattr_chain, handle no default case (kylesayrs)
66ef96b out of place type conversion (kylesayrs)
98aaf88 Merge remote-tracking branch 'origin/gptq-cleanup' into kylesayrs/act… (kylesayrs)
91c877a account for extra case (kylesayrs)
b711e14 remove freeze_quantization argument (kylesayrs)
974dbc7 remove fake_quantization case, update debug message (kylesayrs)
094e429 remove todo (kylesayrs)
582c179 Merge remote-tracking branch 'origin/gptq-cleanup' into kylesayrs/act… (kylesayrs)
febb741 correct name (kylesayrs)
83a1d93 Merge remote-tracking branch 'origin/main' into kylesayrs/activation-… (kylesayrs)
a1646e5 Merge remote-tracking branch 'origin/main' into kylesayrs/activation-… (kylesayrs)
eef6bab change to false in docstring (kylesayrs)
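The commit trail above converges on GPTQ activation ordering: before quantizing, weight columns are permuted by decreasing Hessian diagonal so the most activation-sensitive columns are quantized first, and a `g_idx` tensor records each original column's quantization group so scales and zero-points resolve correctly at inference. A minimal sketch of that bookkeeping, assuming a PyTorch-style calibration Hessian (an illustration, not the PR's code):

```python
import torch

def actorder_permutation(H: torch.Tensor) -> torch.Tensor:
    # H is the (in_features x in_features) Hessian accumulated from
    # calibration activations; its diagonal estimates column importance.
    return torch.argsort(torch.diag(H), descending=True)

def make_g_idx(perm: torch.Tensor, group_size: int) -> torch.Tensor:
    # g_idx[j] = quantization group of original column j. Column perm[i]
    # sits at position i after permutation, so its group is i // group_size.
    num_cols = perm.numel()
    g_idx = torch.empty(num_cols, dtype=torch.long)
    g_idx[perm] = torch.arange(num_cols) // group_size
    return g_idx

# Sketch of use: permute W and H, quantize group-wise in permuted order,
# then un-permute the quantized weights and persist g_idx alongside them:
#   W_perm = W[:, perm]; H_perm = H[perm][:, perm]
#   W_q = quantize_groupwise(W_perm)[:, torch.argsort(perm)]
```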
5 changes: 5 additions & 0 deletions
tests/llmcompressor/transformers/compression/configs/actorder_1.1b.yaml
@@ -0,0 +1,5 @@
cadence: "nightly"
test_type: "regression"
model_stub: "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
new_recipe: "tests/llmcompressor/transformers/compression/recipes/new_quant_actorder.yaml"
ppl_threshold: 20
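This nightly regression config compresses the TinyLlama stub with the recipe below and gates the result on perplexity staying within `ppl_threshold`. A hypothetical sketch of how such a config could be consumed (not the repo's actual test harness):

```python
import yaml

with open(
    "tests/llmcompressor/transformers/compression/configs/actorder_1.1b.yaml"
) as f:
    cfg = yaml.safe_load(f)

# Apply cfg["new_recipe"] to cfg["model_stub"], measure perplexity,
# then gate the test on the configured ceiling:
# assert measured_ppl <= cfg["ppl_threshold"]  # here, 20
```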
19 changes: 19 additions & 0 deletions
tests/llmcompressor/transformers/compression/recipes/new_quant_actorder.yaml
@@ -0,0 +1,19 @@
test_stage:
  quant_modifiers:
    QuantizationModifier:
      ignore: ["lm_head", "model.layers.0.mlp.down_proj"]
      config_groups:
        group_0:
          weights:
            num_bits: 4
            type: "int"
            symmetric: False
            strategy: "group"
            group_size: 128
            actorder: True
          input_activations: null
          output_activations: null
          targets: ["Linear"]
    GPTQModifier:
      block_size: 128
      sequential_update: False
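The recipe quantizes Linear weights to 4-bit asymmetric ints in groups of 128 with activation ordering enabled, then applies GPTQ error compensation. A hedged usage sketch with llm-compressor's one-shot entrypoint; the dataset name, sample count, sequence length, and output path are placeholders, not values from this PR:

```python
from llmcompressor.transformers import oneshot

oneshot(
    model="TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T",
    dataset="open_platypus",  # placeholder calibration dataset
    recipe="tests/llmcompressor/transformers/compression/recipes/new_quant_actorder.yaml",
    max_seq_length=512,  # placeholder
    num_calibration_samples=64,  # placeholder
    output_dir="./tinyllama-w4-actorder",
)
```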