Security scanner detecting Python Pickle files performing suspicious actions.
Scan a malicious model on Hugging Face:
```
pip install picklescan
picklescan --huggingface ykilcher/totally-harmless-model
```

The scanner reports that the Pickle is calling `eval()` to execute arbitrary code:
```
https://huggingface.co/ykilcher/totally-harmless-model/resolve/main/pytorch_model.bin:archive/data.pkl: global import '__builtin__ eval' FOUND

----------- SCAN SUMMARY -----------
Scanned files: 1
Infected files: 1
Dangerous globals: 1
```

The scanner can also load Pickles from local files, directories, URLs, and zip archives (à la PyTorch):
```
picklescan --path downloads/pytorch_model.bin
picklescan --path downloads
picklescan --url https://huggingface.co/sshleifer/tiny-distilbert-base-cased-distilled-squad/resolve/main/pytorch_model.bin
```

To scan NumPy's `.npy` files, install the `numpy` package first (`pip install numpy`).
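To make the `eval` finding above concrete, here is a minimal sketch (not from the picklescan codebase) of how a Pickle can smuggle an `eval` call via `__reduce__`, and how the dangerous global shows up in the opcode stream:

```python
import io
import pickle
import pickletools

class Payload:
    # pickle calls __reduce__ when serializing; returning (eval, args)
    # makes pickle.load() call eval(...) on the victim's machine.
    def __reduce__(self):
        return (eval, ("1 + 1",))

data = pickle.dumps(Payload())

# The stream imports the `eval` global (a GLOBAL/STACK_GLOBAL opcode),
# which is the kind of dangerous global picklescan reports.
dis = io.StringIO()
pickletools.dis(data, dis)
print("eval" in dis.getvalue())
```

Note that `pickletools.dis` (like picklescan) only reads the opcode stream without executing it; never call `pickle.load` on an untrusted file to inspect it.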
The scanner exit status codes are (à la ClamAV):

- `0`: scan did not find malware
- `1`: scan found malware
- `2`: scan failed
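These exit codes make it easy to gate a pipeline on the scan result. A hypothetical wrapper (the `scan_cmd` parameter is an illustration-only seam so the logic is testable; the real invocation is just `picklescan --path <file>`):

```python
import subprocess

def is_safe(path, scan_cmd=("picklescan", "--path")):
    """Return True iff the scanner exits 0 (no malware found).

    Exit codes mirror ClamAV: 0 = clean, 1 = malware, 2 = scan failed.
    scan_cmd is parameterized purely for illustration/testing.
    """
    result = subprocess.run([*scan_cmd, path])
    if result.returncode == 2:
        raise RuntimeError(f"scan failed on {path}")
    return result.returncode == 0
```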
Create and activate the conda environment (miniconda is sufficient):
```
conda env create -f conda.yaml
conda activate picklescan
```
Install the package in editable mode to develop and test:
```
python3 -m pip install -e .
```
Edit with VS Code:
```
code .
```
Run unit tests:
```
pytest tests
```
Run manual tests:
- Local PyTorch (zip) file

  ```
  mkdir downloads
  wget -O downloads/pytorch_model.bin https://huggingface.co/ykilcher/totally-harmless-model/resolve/main/pytorch_model.bin
  picklescan -l DEBUG -p downloads/pytorch_model.bin
  ```

- Remote PyTorch (zip) URL

  ```
  picklescan -l DEBUG -u https://huggingface.co/prajjwal1/bert-tiny/resolve/main/pytorch_model.bin
  ```

Lint the code:
```
black src tests --line-length 140
flake8 src tests --count --show-source
```
Publish the package to PyPI: bump the package version in setup.cfg and create a GitHub release. This triggers the publish workflow.
Alternative manual steps to publish the package:
```
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m build
python3 -m twine upload dist/*
```
Test the package: bump the version of picklescan in conda.test.yaml and run:

```
conda env remove -n picklescan-test
conda env create -f conda.test.yaml
conda activate picklescan-test
picklescan --huggingface ykilcher/totally-harmless-model
```
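For reference, conda.test.yaml pins the released package; a hypothetical sketch of its shape (the actual file in the repo is authoritative):

```yaml
name: picklescan-test
dependencies:
  - python
  - pip
  - pip:
      # bump this pin to the version under test
      - picklescan==0.0.0
```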
Tested on Linux 5.10.102.1-microsoft-standard-WSL2 x86_64 (WSL2).
- pickledoc -- Non-official but in-depth documentation of the Pickle file format
- pickledbg -- Step-by-step Pickle disassembly debugger
- pickletools.py -- The official "documentation" of the Pickle file format (where documentation == code)
- Machine Learning Attack Series: Backdooring Pickle Files, Johann Rehberger, 2022
- Hugging Face Pickle Scanning, Luc Georges, 2022
- The hidden dangers of loading open-source AI models (ARBITRARY CODE EXPLOIT!), Yannic Kilcher, 2022
- Secure Machine Learning at Scale with MLSecOps, Alejandro Saucedo, 2022
- Backdooring Pickles: A decade only made things worse, ColdwaterQ, DEF CON 2022
- Never a dill moment: Exploiting machine learning pickle files, Evan Sultanik, 2021 (tool: Fickling)
- Exploiting Python pickles, David Hamann, 2020
- Dangerous Pickles - malicious python serialization, Evan Sangaline, 2017
- Python Pickle Security Problems and Solutions, Travis Cunningham, 2015
- Arbitrary code execution with Python pickles, Stephen Checkoway, 2013
- Sour Pickles: A serialised exploitation guide in one part, Marco Slaviero, BlackHat USA 2011 (see also: doc, slides)