
Features Database

amrismil edited this page Oct 28, 2025 · 1 revision

Instead of generating feature files locally, you can download them from the AlphaPulldown Features Database, which contains precomputed protein features for major model organisms.

Installation

Note

For EMBL cluster users: the generated feature files are available directly at /g/alphafold/input_features/

To access the Features Database, you need to install the MinIO Client (mc).

Steps:

  1. Download the mc binary.
  2. Make the binary executable.
  3. Move it to your PATH for system-wide access.

Example for Linux on AMD64 (x86_64):

curl -O https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc
sudo mv mc /usr/local/bin/
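
On machines that are not AMD64, the architecture segment of the download URL changes. The sketch below picks that segment from the output of `uname -m`. The function name `mc_download_url` is hypothetical, and the `linux-arm64` path is an assumption about MinIO's release layout that is not covered by this page, so check it against the MinIO download page before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the mc download URL for Linux.
mc_download_url() {
    case "$1" in
        x86_64|amd64)  arch=amd64 ;;
        aarch64|arm64) arch=arm64 ;;  # assumption: not stated in this page
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
    echo "https://dl.min.io/client/mc/release/linux-${arch}/mc"
}

# Usage (not run here): curl -O "$(mc_download_url "$(uname -m)")"
```

The function only prints a URL, so you can inspect the result before passing it to curl.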

Verify installation:

To ensure mc is correctly installed, you can run:

mc --help
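
If the command is not found, the binary is probably not on your PATH. A minimal check, assuming a POSIX shell, might look like the following; `have_cmd` is a hypothetical helper name. Note that GNU Midnight Commander also installs an executable called `mc`, so the printed path is worth a glance on systems where both may be present.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report where the shell resolves `mc`, or say that it is missing.
if have_cmd mc; then
    echo "mc found at: $(command -v mc)"
else
    echo "mc not found on PATH" >&2
fi
```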

Configuration

Set up an alias for easy access to the AlphaPulldown Features Database hosted at EMBL:

mc alias set embl https://s3.embl.de "" "" --api S3v4

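The two empty strings are the access key and secret key, left blank for anonymous access. With the alias in place, you can address the Features Database through paths beginning with embl/, much as you would a local directory.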

Downloading Features

Once mc is installed and configured, you can start accessing the Features Database. The mc subcommands mirror familiar Unix commands such as ls and cp.

List available organisms:

To view the list of available organisms with precomputed feature files, run:

mc ls embl/alphapulldown/input_features

Each organism directory contains compressed .pkl.xz feature files, named according to their UniProt ID.

Download specific protein features:

For example, to download the feature file for the protein with UniProt ID Q6BF25 from Escherichia coli, use:

mc cp embl/alphapulldown/input_features/Escherichia_coli/Q6BF25.pkl.xz Q6BF25.pkl.xz
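
When you need a handful of proteins rather than a whole organism, the single-file command can be wrapped in a loop. The sketch below is one way to do that; `feature_path` and `fetch_features` are hypothetical helper names, and it assumes the `embl` alias from the Configuration section is already set.

```shell
#!/bin/sh
# Hypothetical helpers for fetching several feature files in one go.
# Assumes the `embl` alias from the Configuration section is set up.

# Print the remote path of one feature file: feature_path ORGANISM UNIPROT_ID
feature_path() {
    echo "embl/alphapulldown/input_features/$1/$2.pkl.xz"
}

# Download each listed UniProt ID into the current directory:
# fetch_features ORGANISM ID [ID...]
fetch_features() {
    organism=$1
    shift
    for id in "$@"; do
        # Stop at the first failed download so missing IDs are noticed.
        mc cp "$(feature_path "$organism" "$id")" "$id.pkl.xz" || return 1
    done
}

# Usage (not run here): fetch_features Escherichia_coli Q6BF25
```

Add further UniProt IDs after the first one to fetch them in the same call. The loop stops at the first ID that fails, which usually means that ID is not present in the organism directory.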

Download all features for an organism:

To download all feature files for proteins from a specific organism, such as E. coli, copy the entire directory:

mc cp --recursive embl/alphapulldown/input_features/Escherichia_coli/ ./Escherichia_coli/

Alternatively, you can mirror the organism's directory, which copies any remote files that are missing from your local directory:

mc mirror embl/alphapulldown/input_features/Escherichia_coli/ Escherichia_coli/

The mirror is one-way, from the remote directory to your local system. Rerun the command later to pick up files that have since been added to the database.
