LLOME

This is the repository for the paper "Generalists vs. Specialists: Evaluating LLMs on Highly-Constrained Biophysical Sequence Optimization Tasks," by Angelica Chen*, Samuel D. Stanton*, Frances Ding, Robert G. Alberstein, Andrew M. Watkins, Richard Bonneau, Vladimir Gligorijević, Kyunghyun Cho, and Nathan C. Frey.

Set-Up

  1. Clone this repo to your home directory! (Some of the scripts in this repo assume that your code is stored in ~/llome. If the path to your repo is different, change the path_to_repo option in the Hydra config; see the sketch after this list.)
  2. Install the requirements in a virtual environment:
uv venv --python 3.10
source .venv/bin/activate
uv pip install -r requirements.txt
  3. Run aws configure to set up access to S3 (if you are using S3 storage).
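
If you cloned the repo somewhere other than ~/llome, a minimal sketch of the relevant Hydra config entry might look like the following (path_to_repo is the documented option name; the exact config file and nesting depend on your setup, and the path shown is a placeholder):

# excerpt from a pipeline config YAML (sketch)
path_to_repo: /path/to/your/llome  # wherever you cloned the repo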

To test your setup

python -m run_pipeline --config-name=pipeline_sanity_check_marge local_output_dir=<PATH_TO_LOCAL_OUTPUT_DIR> parent_output_dir=<S3_OUTPUT_DIR>

Important things to know about the LLOME pipeline

  • Storage: This pipeline makes use of both local storage (local_output_dir in the YAML config) and S3 storage (parent_output_dir in the YAML config). All model checkpoints are automatically copied to S3 and deleted from local storage after the end of a training job. By default, all generated seeds, generations, and training data are stored in <config.parent_output_dir>/<config.run_name>, where config is the Hydra config.
    • To disable S3 storage and use only local storage, simply set parent_output_dir: "null" in the config (see the sketch after this list).
  • Slurm Configurations: All training, dataset generation/formatting, and generation jobs are launched as separate SLURM jobs, which can be configured via the respective slurm_args subsections in the config.
  • Resuming jobs: This pipeline is designed to automatically resume previous runs, assuming that the pipeline is run with the same arguments as a prior run. To disable this behavior, use --overwrite=True.
  • Training large models: We currently use DDP (without accompanying model parallelism) to train our models (FSDP is a TODO though!), which means that you likely need GPU RAM that is ~4x the size of the model to train in full precision. Our code is currently only tested on single-node multi-GPU setups.
  • Analyzing results: To assess the performance of the LLM at iteration $i$, analyze the training data generated for iteration $i+1$ -- this accounts for both the likelihood-based ranking and filtering of the LLM at iteration $i$.
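
For reference, the storage- and run-related options described above might be set in a config roughly as follows (run_name, local_output_dir, and parent_output_dir are the documented option names; the values and layout shown here are placeholders):

# storage settings in a pipeline config YAML (sketch)
run_name: my_llome_run                     # outputs land in <parent_output_dir>/<run_name>
local_output_dir: /scratch/llome-outputs   # local scratch space
parent_output_dir: s3://my-bucket/llome    # set to "null" to keep everything local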

To reproduce experiments

Note: Update the slurm_args section in each config before running your experiment!
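
The exact contents of a slurm_args subsection depend on your cluster; the sketch below uses hypothetical section and field names (train, partition, gres, time) purely for illustration:

# slurm_args subsection in a pipeline config YAML (sketch; field names are hypothetical)
train:
  slurm_args:
    partition: gpu
    gres: gpu:4
    time: "24:00:00"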

For function $f_2$:

LLOME-SFT:

python -m run_pipeline --config-name=pipeline_sft_f2 local_output_dir=<PATH_TO_LOCAL_OUTPUT_DIR> parent_output_dir=<S3_OUTPUT_DIR>

LLOME-MargE:

python -m run_pipeline --config-name=pipeline_marge_f2 local_output_dir=<PATH_TO_LOCAL_OUTPUT_DIR> parent_output_dir=<S3_OUTPUT_DIR>

LLOME-DPO:

python -m run_pipeline --config-name=pipeline_dpo_f2 local_output_dir=<PATH_TO_LOCAL_OUTPUT_DIR> parent_output_dir=<S3_OUTPUT_DIR>
