This repo contains a fork of PurpleLlama for the ProSec project.
- Install the dependencies of PurpleLlama:
```bash
cd CybersecurityBenchmarks
pip install -r requirements.txt
# Install cargo if not already installed:
#   sudo apt-get install cargo
# Then add $HOME/.cargo/bin to your PATH.
cargo install weggli
```
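If the install succeeded, `weggli` should now be resolvable from your shell (a quick sanity check; `--help` is the standard help flag):

```bash
# weggli should be on PATH after adding $HOME/.cargo/bin
which weggli
weggli --help
```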
- Install vLLM:

```bash
pip install vllm==0.7.3
```
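To confirm the pinned version was installed, you can check vLLM's `__version__` attribute:

```bash
# Should print 0.7.3
python -c "import vllm; print(vllm.__version__)"
```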
Use the following commands to run experiments:
```bash
./eval-quick.sh <path to model> <name of the run> <vllm port>
# or
./eval-full.sh <path to model> <name of the run> <vllm port>
```
For example,

```bash
./eval-quick.sh model-ckpts/prosec-phi3mini first-run 8001
```

will run the evaluation with the model checkpoint at `model-ckpts/prosec-phi3mini`, name the run `first-run`, and host the vLLM server on port `8001`.
The two scripts take the same arguments and differ only in scope: `eval-quick.sh` runs the experiment on a small subset of the evaluation dataset, while `eval-full.sh` runs the full evaluation.
Both scripts first host the given model on a local vLLM server, then call the scripts in PurpleLlama to evaluate the hosted model.
The results are saved in `CybersecurityBenchmarks/datasets/instruct-stat`.
After the evaluation finishes, both scripts kill the vLLM server to free up resources.
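The overall lifecycle inside both scripts follows a serve-evaluate-kill pattern, roughly like the sketch below (illustrative only, assuming the scripts use vLLM's OpenAI-compatible `vllm serve` entrypoint; the health-check loop and the commented-out evaluation call are placeholders, not copied from the actual scripts):

```bash
#!/bin/bash
# Illustrative sketch of the serve -> evaluate -> kill lifecycle.
MODEL_PATH=$1   # e.g. model-ckpts/prosec-phi3mini
RUN_NAME=$2     # e.g. first-run
PORT=$3         # e.g. 8001

# 1. Host the model on a local vLLM OpenAI-compatible server in the background.
vllm serve "$MODEL_PATH" --port "$PORT" &
SERVER_PID=$!

# 2. Wait until the server answers health checks before starting the evaluation.
until curl -s "http://localhost:$PORT/health" > /dev/null; do sleep 5; done

# 3. Run the PurpleLlama evaluation against the hosted endpoint.
#    (The real scripts invoke the CybersecurityBenchmarks tooling here.)

# 4. Kill the vLLM server to free up resources once the evaluation is done.
kill "$SERVER_PID"
```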