This repository contains all of the code needed to query the HuggingFace API using the hf_research CLI.
We recommend using uv for installation.
uv venv
uv pip install .
However, everything will work the same with a standard pip installation.
pip install .
NOTE: If you're installing with pip and want to use the CLI, make sure the Python environment you install into is on your PATH.
Once installed, the CLI will be accessible as hfss.
Generating a complete scan of HuggingFace involves running two commands in sequence.
hfss scan full
hfss integrity drift
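Since the two commands are meant to run in order, they can be chained so that the second only starts once the first has finished successfully. This is a minimal sketch, assuming hfss exits with a non-zero status on failure (this README does not confirm that); run_full_scan is a hypothetical helper name, not part of the CLI.

```shell
# Hypothetical wrapper: run both scan stages in order.
# Assumes hfss exits non-zero on failure, so && skips the
# second stage if the first one did not complete.
run_full_scan() {
    hfss scan full && hfss integrity drift
}
```

After defining the function in your shell, call run_full_scan from the directory where you want the scan to run.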
You'll likely need a HuggingFace API token to complete these scans without being severely rate limited. You can provide a token by setting the HF_TOKEN environment variable or by adding it to a .env file wherever you run the scan.
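Either approach can be set up from the shell. The sketch below shows both; hf_your_token_here is a placeholder rather than a real token, and the .env line assumes the common KEY=value dotenv syntax, which this README does not spell out.

```shell
# Option 1: export the token for the current shell session.
# "hf_your_token_here" is a placeholder, not a real token.
export HF_TOKEN="hf_your_token_here"

# Option 2: append the token to a .env file in the directory
# where you run the scan (assumes KEY=value dotenv syntax).
printf 'HF_TOKEN=%s\n' "hf_your_token_here" >> .env
```

Only one of the two is needed. If you use a .env file, keep it out of version control so the token is not committed.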
By default, the CLI can be stopped and restarted as needed. Final and temporary scan results are stored in a configurable cache directory, which defaults to cache in the directory where the command is invoked. This can also be configured via the CACHE_DIR environment variable.
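Because the default cache location depends on where the command is invoked, pinning CACHE_DIR to an absolute path may make it easier to resume a scan from a different directory. A minimal sketch, assuming CACHE_DIR accepts an absolute path; the hfss_cache directory name is hypothetical.

```shell
# Pin the cache to an absolute path based on the current directory.
# "hfss_cache" is a hypothetical directory name chosen for illustration.
export CACHE_DIR="$PWD/hfss_cache"

# Create the directory up front; harmless if the CLI would create it anyway.
mkdir -p "$CACHE_DIR"
```

Commands started from this shell afterwards inherit CACHE_DIR regardless of the directory they are run from.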
WORKERS: Controls threading for parallelized queries.
VERBOSE: Controls verbosity of output logging.