We introduce Post-Optimization Model Edit (POME), a new algorithm that enhances the performance of fine-tuned large language models using only their pretrained and fine-tuned checkpoints, without requiring extra data or further optimization. The core idea is to apply a muon-style projection to the weight difference between the fine-tuned and pretrained checkpoints, equalizing the influence of dominant update directions and pruning small singular values that often represent noise.
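The core operation can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: we assume the fine-tuning delta is SVD-decomposed, the bottom `truncation` fraction of singular values is pruned, the surviving directions are given equal weight (the muon-style, orthogonalizing step), and the edited delta is rescaled by `alpha` before being merged back. The function name `pome_edit` and every detail below are assumptions for illustration.

```python
import numpy as np

def pome_edit(w_pre, w_ft, alpha=1.5, truncation=0.5):
    """Hypothetical sketch of a muon-style edit of the fine-tuning delta.

    Assumed behavior (not from the repo): prune the bottom `truncation`
    fraction of singular values, give the kept update directions equal
    weight, rescale to the original delta's Frobenius norm, and scale
    the result by `alpha` before adding it back to the base weights.
    """
    delta = w_ft - w_pre                          # fine-tuning update
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    k = max(1, int(len(s) * (1.0 - truncation)))  # directions to keep
    s_proj = np.zeros_like(s)
    s_proj[:k] = 1.0                              # equalize kept directions
    delta_proj = (u * s_proj) @ vt                # u @ diag(s_proj) @ vt
    # preserve the original update magnitude, then apply the alpha scale
    delta_proj *= np.linalg.norm(delta) / (np.linalg.norm(delta_proj) + 1e-12)
    return w_pre + alpha * delta_proj
```

Under this reading, the `--alpha`, `--truncation`, and `--layer` flags in the scripts below would select the scaling factor, the pruned fraction of singular values, and which weight matrices (e.g. `up_proj`) to edit.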
- [2025/10] 🔥 We propose POME [[arXiv](https://arxiv.org/abs/2510.06627)] [HuggingFace], a simple and efficient method for post-optimization model editing.
Prerequisites:
- Python >= 3.10
- PyTorch >= 2.5.1
- CUDA >= 11.6
We strongly recommend creating a fresh Anaconda environment (Python >= 3.10) to run our examples.
Clone POME and install the required packages:
```shell
git clone https://github.com/sglucas/POME.git
cd POME
pip install -r requirements.txt
```
POME:
```shell
# For LLaMA2-7B
BASE_MODEL='meta-llama/Llama-2-7b-hf'
MODEL="your_model_path"
OUTPUT="output_path"
python pome.py --base_model $BASE_MODEL --model $MODEL --output_path $OUTPUT --alpha 2.6 --truncation 0.5 --layer up_proj
```

```shell
# For LLaMA3-8B
BASE_MODEL='meta-llama/Meta-Llama-3-8B'
MODEL="your_model_path"
OUTPUT="output_path"
python pome.py --base_model $BASE_MODEL --model $MODEL --output_path $OUTPUT --alpha 1.5 --truncation 0.5 --layer up_proj
```

```shell
# For Gemma2-9B
BASE_MODEL='google/gemma-2-9b'
MODEL="your_model_path"
OUTPUT="output_path"
python pome.py --base_model $BASE_MODEL --model $MODEL --output_path $OUTPUT --alpha 1.2 --truncation 0.5 --layer up_proj
```
Evaluation on GSM8K and MATH:
```shell
cd metamath
bash eval_gsm8k.sh --model $OUTPUT --data_file ./data/test/GSM8K_test.jsonl
bash eval_math.sh --model $OUTPUT --data_file ./data/test/MATH_test.jsonl
```
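The `--data_file` arguments point at JSON Lines files (one JSON object per line). If you want to inspect a test set before evaluating, a minimal reader (our own helper, not part of the repo) could look like:

```python
import json

def load_jsonl(path):
    """Read a .jsonl file: one JSON object per line, blank lines skipped."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

The exact field names in each record (e.g. the question and answer keys) depend on the dataset files shipped with the repo.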
You can also download the models directly from Hugging Face and then evaluate them:
| Model | MetaMathQA | MetaMathQA+POME |
|---|---|---|
| LLaMA2-7B | 🤗 HuggingFace | 🤗 HuggingFace |
| LLaMA3-8B | 🤗 HuggingFace | 🤗 HuggingFace |
| Gemma2-9B | 🤗 HuggingFace | 🤗 HuggingFace |
Results (test accuracy, %):
| Model | Task | Adam | +POME |
|---|---|---|---|
| LLaMA2-7B | GSM8K | 67.2 | 69.7 |
| LLaMA2-7B | MATH | 19.4 | 19.7 |
| LLaMA3-8B | GSM8K | 80.3 | 81.4 |
| LLaMA3-8B | MATH | 31.5 | 32.7 |
| Gemma2-9B | GSM8K | 82.2 | 83.3 |
| Gemma2-9B | MATH | 36.1 | 37.3 |
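As a quick sanity check on the table, the per-row gains and their average can be computed directly (the scores below are copied from the table above; the variable names are ours):

```python
# Adam baseline vs. +POME accuracy, in the table's row order:
# LLaMA2-7B GSM8K/MATH, LLaMA3-8B GSM8K/MATH, Gemma2-9B GSM8K/MATH
adam = [67.2, 19.4, 80.3, 31.5, 82.2, 36.1]
pome = [69.7, 19.7, 81.4, 32.7, 83.3, 37.3]
gains = [round(p - a, 1) for a, p in zip(adam, pome)]
avg_gain = round(sum(gains) / len(gains), 2)
```

Every row improves, with the largest gain on LLaMA2-7B GSM8K.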
Coming soon.
```bibtex
@misc{liu2025pomepostoptimizationmodel,
  title={POME: Post Optimization Model Edit via Muon-style Projection},
  author={Yong Liu and Di Fu and Yang Luo and Zirui Zhu and Minhao Cheng and Cho-Jui Hsieh and Yang You},
  year={2025},
  eprint={2510.06627},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2510.06627},
}
```