Skip to content

bigai-nlco/LatentSeek

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Introduction

LatentSeek

LatentSeek is a novel framework that enhances LLM reasoning through Test-Time Instance-level Adaptation (TTIA) within the model's latent space. Specifically, LatentSeek leverages policy gradient to iteratively update latent representations, guided by self-generated reward signals.

Installation

conda create -n latentseek python=3.10
conda activate latentseek
pip3 install torch torchvision torchaudio
pip install transformers datasets tqdm accelerate
pip install termcolor

# for evaluation
cd src/extract_judge_answer/latex2sympy
pip install -e .
pip install math-verify
pip install word2number

Usage

cd src
cd scripts
vim example.sh # modify this according to your need
cd ..
sh scripts/example.sh

The example.sh file

PATH_TO_DATA= # path to the dataset (the path str should contain either "AIME_2024", "gsm8k", "MATH-500")
PATH_TO_MODEL= # path to the model 
rho= # the value of rho, which is the hyperparameter for the fractional update
lr= # the learning rate
solver_prompt_idx= # the index of the solver prompt to use (0 for "boxex", 1 for "json")

python main.py \
    --dataset $PATH_TO_DATA \
    --model_name_or_path $PATH_TO_MODEL \
    --output_dir ./output \
    --k $rho \
    --lr $lr \
    --solver_prompt_idx $solver_prompt_idx \
    --device "cuda" \

Files for Modification

Citation

@misc{li2025seekdarkreasoningtesttime,
      title={Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space}, 
      author={Hengli Li and Chenxi Li and Tong Wu and Xuekai Zhu and Yuxuan Wang and Zhaoxin Yu and Eric Hanchen Jiang and Song-Chun Zhu and Zixia Jia and Ying Nian Wu and Zilong Zheng},
      year={2025},
      eprint={2505.13308},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.13308}, 
}

Contact

About

Official Repository of LatentSeek

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •