@ATMxsp01 ATMxsp01 commented Aug 12, 2024

This PR sets the fuse rope option for the Mistral model to False.

PR-11747 fixed the difference in mistral-7B-instruct-v0.2 FP16 results between ipex-llm/ipex/NV on LongBench. However, that fix caused some other precisions, such as sym_int4, to output garbled characters. This PR disables the fuse rope optimization as a temporary workaround for that problem.

Note

Since fuse qkv is also disabled in this case, model performance will decrease.
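
For readers unfamiliar with the trade-off, here is a minimal sketch of what switching off a fused rope path looks like in principle. It is not the ipex-llm code: `USE_FUSE_ROPE`, `rope_reference`, `apply_rope`, and the `fused_kernel` argument are hypothetical names used only for illustration, and the reference path is the plain "rotate half" form of rotary position embedding written in pure Python for a single head vector.

```python
import math

# Hypothetical switch standing in for the fuse rope option this PR sets to False.
USE_FUSE_ROPE = False


def rotate_half(x):
    """Map the two halves of a vector (x1, x2) to (-x2, x1)."""
    half = len(x) // 2
    return [-v for v in x[half:]] + list(x[:half])


def rope_reference(x, position, base=10000.0):
    """Unfused rotary position embedding for one head vector at one position."""
    dim = len(x)
    half = dim // 2
    # One rotation angle per pair (x[i], x[i + half]).
    angles = [position / (base ** (2 * i / dim)) for i in range(half)]
    angles = angles + angles
    rotated = rotate_half(x)
    return [v * math.cos(a) + r * math.sin(a)
            for v, r, a in zip(x, rotated, angles)]


def apply_rope(x, position, fused_kernel=None):
    """Call the fused kernel only when the switch is on and a kernel is given."""
    if USE_FUSE_ROPE and fused_kernel is not None:
        return fused_kernel(x, position)
    # Slower fallback path, taken whenever the switch is off.
    return rope_reference(x, position)
```

With the switch off, every call goes through the reference path regardless of whether a fused kernel is available, which is the kind of correctness-over-speed fallback the note above describes.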

@cyita cyita requested a review from Oscilloscope98 August 12, 2024 07:49
@cyita cyita left a comment

LGTM

cyita commented Aug 12, 2024

@cyita cyita merged commit 1b05cab into intel:main Aug 12, 2024
@ATMxsp01 ATMxsp01 deleted the longbench-patch branch August 13, 2024 06:51
@qiyuangong qiyuangong mentioned this pull request Aug 13, 2024
7 tasks
qiyuangong added a commit that referenced this pull request Aug 13, 2024
* Fix mistral forward_qkv without self.rotary_emb.base in q4_0.
* Replace apply_rotary_pos_emb_no_cache_xpu with rotary_half_inplaced.
* Revert #11765

2 participants