SUMMARY:
- The current tests fail because, when the tinystories model is
loaded, its lm_head ends up with device type "meta"
- This model is generally problematic, so we switch to TinyLlama
- Because the model is large, we target just one layer for
quantization to contain runtime, and update the asserts to reflect
that only one layer is quantized
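For illustration only, a minimal sketch of the "target one layer, assert one layer" idea. It is not the test code from this change: the module names follow the usual Llama-style layout, and `MODULE_NAMES` and `select_targets` are hypothetical stand-ins for whatever the real test inspects.

```python
import re

# Hypothetical module names in a Llama-style layout (assumption, not
# taken from the actual test).
MODULE_NAMES = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.mlp.down_proj",
    "model.layers.1.self_attn.q_proj",
    "lm_head",
]


def select_targets(names, pattern):
    """Return the names that fully match the regex pattern."""
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name)]


# Target exactly one layer so the test stays fast.
targets = select_targets(MODULE_NAMES, r"model\.layers\.0\.self_attn\.q_proj")

# The asserts mirror the narrowed scope: one layer selected, the rest untouched.
assert targets == ["model.layers.0.self_attn.q_proj"]
untouched = [name for name in MODULE_NAMES if name not in targets]
assert "lm_head" in untouched
```

Using a full match rather than a substring search keeps the target from accidentally widening to sibling layers such as `model.layers.1.self_attn.q_proj`.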
TESTING:
- All tests pass with these changes