POME: Post Optimization Model Edit via Muon-style Projection

Introduction

We introduce Post-Optimization Model Edit (POME), a new algorithm that enhances the performance of fine-tuned large language models using only their pretrained and fine-tuned checkpoints, without requiring extra data or further optimization. The core idea is to apply a Muon-style projection to $\Delta W$, the difference between the fine-tuned and pretrained weights. This projection uses truncated singular value decomposition (SVD) to equalize the influence of dominant update directions and prune small singular values, which often represent noise. As a simple post-processing step, POME is completely decoupled from the training pipeline: it requires zero modifications to training code and imposes no overhead, making it universally compatible with any optimizer or distributed framework.
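For intuition, the edit can be sketched in a few lines of PyTorch. This is a minimal sketch of the idea above, not the repository's pome.py: the function name, the use of torch.linalg.svd, and the exact rescaling are assumptions for illustration, while the alpha and truncation parameters mirror the CLI flags used below.

import torch

def pome_edit(w_pre, w_ft, alpha=1.0, truncation=0.5):
    # Fine-tuning update: difference between fine-tuned and pretrained weights.
    delta = w_ft - w_pre
    # Truncated SVD of the update.
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    # Keep only the top fraction of singular directions; the discarded
    # tail of small singular values tends to carry noise.
    k = max(1, int(truncation * s.numel()))
    # Muon-style projection: equalize the influence of the retained
    # directions by setting their singular values to 1.
    delta_proj = u[:, :k] @ vh[:k, :]
    # Rescale the equalized update (assumed normalization; pome.py may
    # use a different scaling) and add it back to the pretrained weights.
    return w_pre + alpha * delta_proj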

Latest News 🔥

  • [2025/10] 🔥 We propose POME [arXiv] [HuggingFace], a simple and efficient method for post-optimization model editing.

1. MetaMathQA

Prerequisites:

  • Python >= 3.10
  • PyTorch >= 2.5.1
  • CUDA >= 11.6

We strongly recommend using Anaconda to create a new environment (Python >= 3.10) to run our examples.

Clone POME and install the required packages:

git clone https://github.com/sglucas/POME.git
cd POME/metamath
pip install -r requirements.txt

Run POME (roughly: --alpha scales the projected update, --truncation sets the SVD truncation ratio, and --layer selects which weight matrices to edit):

# For LLaMA2-7B
BASE_MODEL='meta-llama/Llama-2-7b-hf'
MODEL="your_model_path"
OUTPUT="output_path"
python pome.py --base_model $BASE_MODEL --model $MODEL --output_path $OUTPUT --alpha 2.6 --truncation 0.5 --layer up_proj

# For LLaMA3-8B
BASE_MODEL='meta-llama/Meta-Llama-3-8B'
MODEL="your_model_path"
OUTPUT="output_path"
python pome.py --base_model $BASE_MODEL --model $MODEL --output_path $OUTPUT --alpha 1.5 --truncation 0.5 --layer up_proj

# For Gemma2-9B
BASE_MODEL='google/gemma-2-9b'
MODEL="your_model_path"
OUTPUT="output_path"
python pome.py --base_model $BASE_MODEL --model $MODEL --output_path $OUTPUT --alpha 1.2 --truncation 0.5 --layer up_proj
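For intuition, applying the pome_edit sketch from the introduction to every up_proj matrix (matching the --layer up_proj flag above) could look like the following. This is an illustration rather than the repository's pome.py; the model paths reuse the placeholders from the commands above.

from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tuned = AutoModelForCausalLM.from_pretrained("your_model_path")

# Edit each fine-tuned up_proj weight in place using its pretrained counterpart.
base_params = dict(base.named_parameters())
for name, p in tuned.named_parameters():
    if "up_proj" in name and p.ndim == 2:
        edited = pome_edit(base_params[name].data.float(), p.data.float(),
                           alpha=2.6, truncation=0.5)
        p.data.copy_(edited.to(p.dtype))

tuned.save_pretrained("output_path")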

Evaluation on GSM8K and MATH:

cd metamath
bash eval_gsm8k.sh --model $OUTPUT --data_file ./data/test/GSM8K_test.jsonl
bash eval_math.sh --model $OUTPUT --data_file ./data/test/MATH_test.jsonl

You can also directly download the models from Hugging Face and then evaluate them:

| Model | MetaMathQA | MetaMathQA+POME |
|-----------|----------------|-----------------|
| LLaMA2-7B | 🤗 HuggingFace | 🤗 HuggingFace |
| LLaMA3-8B | 🤗 HuggingFace | 🤗 HuggingFace |
| Gemma2-9B | 🤗 HuggingFace | 🤗 HuggingFace |
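For example, with huggingface_hub the checkpoints can be fetched programmatically; the repo id below is a placeholder, so substitute one of the links from the table above. The returned local directory can then be passed as --model to the evaluation scripts.

from huggingface_hub import snapshot_download

# Placeholder repo id; substitute a link from the table above.
local_dir = snapshot_download("ORG/MetaMathQA-LLaMA2-7B-POME")
print(local_dir)  # pass this path as --model to eval_gsm8k.sh / eval_math.sh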

Results:

| Model | Task | Adam (baseline) | +POME |
|-----------|-------|-----------------|-------|
| LLaMA2-7B | GSM8K | 67.2 | 69.7 |
| LLaMA2-7B | MATH | 19.4 | 19.7 |
| LLaMA3-8B | GSM8K | 80.3 | 81.4 |
| LLaMA3-8B | MATH | 31.5 | 32.7 |
| Gemma2-9B | GSM8K | 82.2 | 83.3 |
| Gemma2-9B | MATH | 36.1 | 37.3 |

2. Code Generation

Coming soon.

Citation

@misc{liu2025pomepostoptimizationmodel,
      title={POME: Post Optimization Model Edit via Muon-style Projection}, 
      author={Yong Liu and Di Fu and Yang Luo and Zirui Zhu and Minhao Cheng and Cho-Jui Hsieh and Yang You},
      year={2025},
      eprint={2510.06627},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2510.06627}, 
}
