This repository contains Python code to reproduce the core analysis described in the manuscript: "A stability-weighted combined map for Cell Painting across regular and prolonged incubation".
- Loads a single CSV file that contains both exposure windows.
- Keeps only compounds with at least N independent long-incubation iterations (optional filter).
- Splits metadata vs numeric features.
- Standardises features and runs PCA.
- Builds a kNN graph in PCA space and performs Leiden clustering (Scanpy).
- Computes within-window replicate stability (mean within-compound replicate distance in PCA space).
- Builds a Combined dataset by selecting, for each compound, the exposure window with higher stability.
- Exports cluster tables and (optionally) UMAP plots for Short / Long / Combined.
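The stability and Combined-selection steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code. It assumes that "higher stability" means a smaller mean pairwise Euclidean distance between a compound's replicates in PCA space, that a window with fewer than two replicates has no stability score, and that the table already carries a `window` column with the values Short and Long. The function names, the `window` column and the `PC1`/`PC2` columns are hypothetical.

```python
import numpy as np
import pandas as pd


def mean_replicate_distance(pcs):
    """Mean pairwise Euclidean distance between replicate rows of `pcs`."""
    n = pcs.shape[0]
    if n < 2:
        return np.nan  # undefined for a single replicate (assumption)
    diffs = pcs[:, None, :] - pcs[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(n, k=1)  # count each replicate pair once
    return float(dists[upper].mean())


def build_combined(df, pc_cols, compound_col="Metadata_Name", window_col="window"):
    """Score each (compound, window) pair and keep the tighter window per compound."""
    records = []
    for (compound, window), group in df.groupby([compound_col, window_col]):
        dist = mean_replicate_distance(group[pc_cols].to_numpy(dtype=float))
        records.append({compound_col: compound, window_col: window, "mean_dist": dist})
    scores = pd.DataFrame(records).dropna(subset=["mean_dist"])
    # Smaller mean distance is treated as higher stability (assumption).
    best = scores.loc[scores.groupby(compound_col)["mean_dist"].idxmin()]
    combined = df.merge(best[[compound_col, window_col]], on=[compound_col, window_col])
    return scores, combined


# Toy data: two compounds, two replicates per exposure window.
toy = pd.DataFrame({
    "Metadata_Name": ["cpd_A"] * 4 + ["cpd_B"] * 4,
    "window": ["Short", "Short", "Long", "Long"] * 2,
    "PC1": [0, 0, 0, 0, 0, 4, 1, 1],
    "PC2": [0, 1, 0, 3, 0, 0, 1, 2],
})
scores, combined = build_combined(toy, pc_cols=["PC1", "PC2"])
chosen = combined.groupby("Metadata_Name")["window"].first().to_dict()
```

In this toy example cpd_A keeps its Short replicates and cpd_B keeps its Long replicates, because those are the windows in which the replicates lie closer together.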
Outputs are saved to the folder specified in out_dir.
The input CSV should contain:
- metadata columns starting with Metadata_ (e.g., Metadata_Name, Metadata_Source, Metadata_MoA)
- numeric feature columns (CellProfiler features)
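One way to separate the two column groups is sketched below. It is only an illustration under assumptions: the helper name `split_metadata_and_features` is hypothetical, the feature column names are made up, and every non-metadata column with a numeric dtype is treated as a feature.

```python
import pandas as pd


def split_metadata_and_features(df, prefix="Metadata_"):
    """Return (metadata, features) as two DataFrames."""
    meta_cols = [c for c in df.columns if c.startswith(prefix)]
    # Assumption: any remaining column with a numeric dtype is a feature.
    feature_cols = [
        c for c in df.columns
        if c not in meta_cols and pd.api.types.is_numeric_dtype(df[c])
    ]
    return df[meta_cols], df[feature_cols]


# Tiny stand-in for the input CSV (values are illustrative only).
example = pd.DataFrame({
    "Metadata_Name": ["cpd_A", "cpd_B"],
    "Metadata_Source": ["batch1_sh", "batch1_l"],
    "Cells_AreaShape_Area": [101.5, 98.2],
    "Nuclei_Intensity_MeanIntensity_DNA": [0.31, 0.44],
})
metadata, features = split_metadata_and_features(example)
```

Non-numeric columns that lack the Metadata_ prefix are dropped from both groups in this sketch, so it is worth checking the column counts after the split.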
By default, the script uses Metadata_Source to split Short vs Long using string patterns:
- short: contains sh
- long: contains l
If your naming differs, update the patterns in the config.
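The pattern-based split could look roughly like the sketch below. It assumes plain, case-sensitive substring matching, which the notebook may or may not use, and the function name `label_exposure_window` is hypothetical. Because a single-letter pattern such as l matches many strings, this sketch raises an error when a value matches both patterns; that check is an addition for illustration and is not described above.

```python
import pandas as pd


def label_exposure_window(df, source_col="Metadata_Source",
                          short_pattern="sh", long_pattern="l"):
    """Label each row as Short, Long or unassigned from its source string."""
    source = df[source_col].astype(str)
    # Literal substring match, case-sensitive (assumption).
    is_short = source.str.contains(short_pattern, regex=False)
    is_long = source.str.contains(long_pattern, regex=False)
    ambiguous = is_short & is_long
    if ambiguous.any():
        raise ValueError(
            f"values matching both patterns: {sorted(source[ambiguous].unique())}"
        )
    window = pd.Series("unassigned", index=df.index)
    window.loc[is_short] = "Short"
    window.loc[is_long] = "Long"
    return window


rows = pd.DataFrame({"Metadata_Source": ["batch1_sh", "batch1_l", "batch2_sh", "other"]})
window = label_exposure_window(rows)
```

A source value such as plate_sh contains both sh and l, so it would trigger the error here; more specific patterns avoid that kind of overlap.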
- UMAP is used for 2D visualisation.
- The code sets a random seed for reproducibility.
- A stability-weighted combined map for Cell Painting.ipynb
- env/siriusrequirements.txt: full pip freeze shared by the supervisor (may include non-Windows packages).
- env/requirements_windows.txt: Windows-friendly requirements (same as above but with non-Windows lines removed).
- environment.yml: conda environment file that installs Python and then installs env/requirements_windows.txt.