Commit 2443ba9
Fix long contexts in LoRA (#624)
#566 breaks the long-context + LoRA flow.
That change assumes that caching the sin-cos buffer for the first decoder
layer is sufficient to handle all cases, which does not hold for
long-context + LoRA.
This PR skips the `_prepare_cos_sin` call prior to the HpuModelAdapter forward
in the long-context + LoRA flow.

1 parent 9555fef commit 2443ba9
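The failure mode described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code from this commit: `ToyRotary`, `prepare_cos_sin`, `adapter_forward`, the `skip_prepare` flag, and the per-token `offsets` are all hypothetical stand-ins. It assumes that the long-context + LoRA path shifts token positions by per-token offsets, so a sin-cos table cached once from the unshifted positions would be stale.

```python
import math


def cos_sin(positions, dim=4, base=10000.0):
    """Return (cos, sin) rotary tables for a list of integer positions."""
    inv_freq = [base ** (-2 * i / dim) for i in range(dim // 2)]
    cos = [[math.cos(p * f) for f in inv_freq] for p in positions]
    sin = [[math.sin(p * f) for f in inv_freq] for p in positions]
    return cos, sin


class ToyRotary:
    """Hypothetical rotary layer that can reuse a pre-computed cos/sin table."""

    def __init__(self):
        self.cached = None

    def prepare_cos_sin(self, positions):
        # Hypothetical analogue of `_prepare_cos_sin`: cache from raw positions.
        self.cached = cos_sin(positions)

    def forward(self, positions, offsets=None):
        if self.cached is not None:
            # A primed cache is reused as-is, so any offsets are ignored.
            return self.cached
        if offsets is not None:
            # Assumed long-context + LoRA behaviour: shift each position.
            positions = [p + o for p, o in zip(positions, offsets)]
        return cos_sin(positions)


def adapter_forward(rotary, positions, offsets=None, skip_prepare=False):
    """Toy adapter forward; `skip_prepare` (hypothetical) bypasses the caching call."""
    rotary.cached = None
    if not skip_prepare:
        rotary.prepare_cos_sin(positions)
    return rotary.forward(positions, offsets)


rotary = ToyRotary()
positions = [0, 1, 2]
offsets = [0, 4096, 4096]  # illustrative per-token offsets

expected = cos_sin([p + o for p, o in zip(positions, offsets)])
stale = adapter_forward(rotary, positions, offsets)                     # cache primed
fresh = adapter_forward(rotary, positions, offsets, skip_prepare=True)  # cache skipped
```

In this toy setup, `stale` is built from the unshifted positions and so disagrees with `expected`, while `fresh` matches it. Skipping the caching call only when offsets are in play keeps the cached fast path for every other flow.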
File tree
3 files changed
+24
-6
lines changed
- tests/lora
- vllm
  - lora/punica_wrapper
  - model_executor/layers
Changed hunks per file, in page order (line ranges include context lines):

| File (page order) | Original lines | New lines | Added | Removed |
|---|---|---|---|---|
| 1 | 64-69 | 64-70 | 1 | 0 |
| 2 | 2-7 | 2-9 | 2 | 0 |
| 2 | 86-95 | 88-101 | 7 | 3 |
| 2 | 102-111 | 108-125 | 10 | 2 |
| 3 | 232-238 | 232-241 | 4 | 1 |