Security scanner detecting Python Pickle files performing suspicious actions.
Scan a malicious model on Hugging Face:
```
pip install picklescan
picklescan --huggingface ykilcher/totally-harmless-model
```

The scanner reports that the Pickle is calling `eval()` to execute arbitrary code:
```
https://huggingface.co/ykilcher/totally-harmless-model/resolve/main/pytorch_model.bin:archive/data.pkl: global import '__builtin__ eval' FOUND

----------- SCAN SUMMARY -----------
Scanned files: 1
Infected files: 1
Dangerous globals: 1
```

The scanner can also load Pickles from local files, directories, URLs, and zip archives (à la PyTorch):
```
picklescan --path downloads/pytorch_model.bin
picklescan --path downloads
picklescan --url https://huggingface.co/sshleifer/tiny-distilbert-base-cased-distilled-squad/resolve/main/pytorch_model.bin
```

To scan NumPy's `.npy` files, install the `numpy` package first (`pip install numpy`).
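To make the `eval` finding above concrete, here is a minimal sketch (not from the picklescan codebase) of how a Pickle can smuggle an `eval` call via `__reduce__`, and how the dangerous global shows up in the opcode stream:

```python
import io
import pickle
import pickletools

class Payload:
    # pickle calls __reduce__ when serializing; returning (eval, args)
    # makes pickle.load() call eval(...) on the victim's machine.
    def __reduce__(self):
        return (eval, ("1 + 1",))

data = pickle.dumps(Payload())

# The stream imports the `eval` global (a GLOBAL/STACK_GLOBAL opcode),
# which is the kind of dangerous global picklescan reports.
dis = io.StringIO()
pickletools.dis(data, dis)
print("eval" in dis.getvalue())
```

Note that `pickletools.dis` (like picklescan) only reads the opcode stream without executing it; never call `pickle.load` on an untrusted file to inspect it.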
The scanner exit status codes are (à la ClamAV):

- `0`: scan did not find malware
- `1`: scan found malware
- `2`: scan failed
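These exit codes make it easy to gate a pipeline on the scan result. A hypothetical wrapper (the `scan_cmd` parameter is an illustration-only seam so the logic is testable; the real invocation is just `picklescan --path <file>`):

```python
import subprocess

def is_safe(path, scan_cmd=("picklescan", "--path")):
    """Return True iff the scanner exits 0 (no malware found).

    Exit codes mirror ClamAV: 0 = clean, 1 = malware, 2 = scan failed.
    scan_cmd is parameterized purely for illustration/testing.
    """
    result = subprocess.run([*scan_cmd, path])
    if result.returncode == 2:
        raise RuntimeError(f"scan failed on {path}")
    return result.returncode == 0
```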
Create and activate the conda environment (miniconda is sufficient):
```
conda env create -f conda.yaml
conda activate picklescan
```
Install the package in editable mode to develop and test:
```
python3 -m pip install -e .
```
Edit with VS Code:
```
code .
```
Run unit tests:
```
pytest tests
```
Run manual tests:
- Local PyTorch (zip) file

  ```
  mkdir downloads
  wget -O downloads/pytorch_model.bin https://huggingface.co/ykilcher/totally-harmless-model/resolve/main/pytorch_model.bin
  picklescan -l DEBUG -p downloads/pytorch_model.bin
  ```

- Remote PyTorch (zip) URL

  ```
  picklescan -l DEBUG -u https://huggingface.co/prajjwal1/bert-tiny/resolve/main/pytorch_model.bin
  ```

Lint the code:
```
black src tests --line-length 140
flake8 src tests --count --show-source
```
Publish the package to PyPI: bump the package version in setup.cfg and create a GitHub release. This triggers the publish workflow.
Alternative manual steps to publish the package:
```
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m build
python3 -m twine upload dist/*
```
Test the package: bump the version of picklescan in conda.test.yaml and run:

```
conda env remove -n picklescan-test
conda env create -f conda.test.yaml
conda activate picklescan-test
picklescan --huggingface ykilcher/totally-harmless-model
```
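For reference, conda.test.yaml pins the released package; a hypothetical sketch of its shape (the actual file in the repo is authoritative):

```yaml
name: picklescan-test
dependencies:
  - python
  - pip
  - pip:
      # bump this pin to the version under test
      - picklescan==0.0.0
```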
Tested on Linux 5.10.102.1-microsoft-standard-WSL2 x86_64 (WSL2).
- pickledoc -- Non-official but in-depth documentation of the Pickle file format
- pickledbg -- Step-by-step Pickle disassembly debugger
- pickletools.py -- The official "documentation" of the Pickle file format (where documentation == code)
- Machine Learning Attack Series: Backdooring Pickle Files, Johann Rehberger, 2022
- Hugging Face Pickle Scanning, Luc Georges, 2022
- The hidden dangers of loading open-source AI models (ARBITRARY CODE EXPLOIT!), Yannic Kilcher, 2022
- Secure Machine Learning at Scale with MLSecOps, Alejandro Saucedo, 2022
- Backdooring Pickles: A decade only made things worse, ColdwaterQ, DEF CON 2022
- Never a dill moment: Exploiting machine learning pickle files, Evan Sultanik, 2021 (tool: Fickling)
- Exploiting Python pickles, David Hamann, 2020
- Dangerous Pickles - malicious python serialization, Evan Sangaline, 2017
- Python Pickle Security Problems and Solutions, Travis Cunningham, 2015
- Arbitrary code execution with Python pickles, Stephen Checkoway, 2013
- Sour Pickles: A serialised exploitation guide in one part, Marco Slaviero, BlackHat USA 2011 (see also: doc, slides)