Below is a description of the project's general file and folder structure, how conda environments can be used, and how each element is typically utilized. All executable stage scripts should be named `app.py` for consistency.
Project root:

- `manifest.json` - the main project manifest (if used). It may contain common settings or link individual pipeline stages.
- Other files not related to a specific stage.
Stage folders (for example, `load_anndata`, `clustering`, `dimensionality_reduction`):

- `manifest.json` (inside the stage folder)
  - Describes the key parameters required by this stage.
  - Contains the stage name, its description, execution order (`stage`), the types and requirements of input parameters (`params`), and information about returned data (`return`).
  - May specify dependencies (e.g., `depends_and_script` or `depends_or_script`) and the environments used (`conda`, `libs`, `conda_pip`).
  - Conda usage: if `conda` is specified, the system creates or reuses a conda environment with the requested Python version and installs the listed libraries (`libs` via conda, `conda_pip` via pip in that conda environment). This isolation helps avoid library conflicts.
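As an illustration only, a stage `manifest.json` combining the fields listed above might look like the following. The exact schema (field spellings, value formats, numbering of `stage`) is an assumption for the sake of the example, not a specification:

```json
{
  "name": "dimensionality_reduction",
  "description": "Reduces the dimensionality of an AnnData object",
  "stage": 3,
  "params": {
    "adata": {"type": "anndata", "required": true},
    "method": {"type": "string", "required": false, "default": "pca"}
  },
  "return": {
    "adata": {"type": "anndata"}
  },
  "conda": "python=3.11",
  "libs": ["scanpy", "numpy"],
  "conda_pip": ["scvi-tools"]
}
```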
- `app.py` (the executable script for this stage)
  - This file name should always be `app.py` to maintain a consistent structure.
  - It contains the core business logic: reading data, transforming it, analyzing it, and producing output.
  - Typically, it defines a function (often `run(**kwargs)`) that:
    - Imports the necessary dependencies (e.g., `scanpy`, `scvi`, `numpy`).
    - Reads parameters from `kwargs`, which are provided from `manifest.json` (e.g., file path, analysis method, metrics).
    - Calls a helper function or a series of functions that perform the main logic (e.g., data loading, clustering, dimensionality reduction).
    - Returns the result in the format described in the manifest (usually a dictionary whose keys match the fields in `return`).
Typical structure of `app.py`:

- Import libraries:

```python
import scanpy as sc
import numpy as np
import pandas as pd
# ...
```

- Define helper functions (e.g., `reduce_dimensionality`, `cluster`, `load_data`):

```python
def reduce_dimensionality(adata, method='pca', **kwargs):
    # Dimensionality reduction logic
    return adata
```

- The `run(**kwargs)` function:

```python
def run(**kwargs):
    # Read arguments
    adata = kwargs.get('adata')
    method = kwargs.get('method', 'pca')
    # ...
    # Call a helper function
    out = reduce_dimensionality(adata, method=method)
    # Return the result
    return {'adata': out}
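The pipeline runner itself is not part of this description. Purely as a sketch of how the pieces fit together, a runner might load a stage folder's `app.py` and call its `run(**kwargs)` entry point like this (the function name `run_stage` and the idea of passing the manifest's `default` values are assumptions for illustration):

```python
import importlib.util
import json
from pathlib import Path

def run_stage(stage_dir):
    """Load a stage folder's manifest and execute the run() entry point of its app.py."""
    stage_dir = Path(stage_dir)
    manifest = json.loads((stage_dir / "manifest.json").read_text())
    # Load app.py from the stage folder as a module
    spec = importlib.util.spec_from_file_location("stage_app", stage_dir / "app.py")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # Collect default parameter values declared in the manifest (hypothetical schema)
    params = {name: p.get("default") for name, p in manifest.get("params", {}).items()}
    # Call the stage's entry point with those parameters
    return module.run(**params)
```

This keeps each stage self-contained: the runner only needs the folder path, and everything else (parameters, entry point) comes from the stage's own files.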
`manifest.json` in each folder:

- Defines which parameters the stage requires and what data it returns.
- Specifies the execution order in the pipeline.
- Determines which libraries (conda or pip) are needed for the stage.
- May include version constraints for packages.
- Conda environments: if `conda` is specified, the system will create or reuse the indicated environment (for example, `python=3.11`) and install the specified libraries.
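Since the manifest declares each stage's parameters, a runner could also validate the `kwargs` it is about to pass against them. The sketch below assumes a hypothetical `params`/`required` schema (consistent with the fields described above, but not a specification):

```python
def check_params(manifest, kwargs):
    """Verify that every parameter the manifest marks as required is present in kwargs.

    The 'params' and 'required' field names are assumed conventions,
    mirroring the manifest description in this document.
    """
    missing = [
        name for name, spec in manifest.get("params", {}).items()
        if spec.get("required", False) and name not in kwargs
    ]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    return True
```

Failing fast like this surfaces configuration mistakes before a stage starts doing expensive work.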
`app.py` in each folder:

- Performs the main work: processes data using parameters obtained from `manifest.json`.
- Produces output that subsequent stages can access.
- Has a structure consisting of several steps:
  - Imports
  - Helper functions
  - The `run(**kwargs)` function, which is the entry point.
```
project_root/
├── manifest.json                 # Main (root) manifest, if present
├── load_anndata/
│   ├── manifest.json             # Manifest for the loading stage
│   └── app.py                    # Script performing data loading
├── clustering/
│   ├── manifest.json             # Manifest for the clustering stage
│   └── app.py                    # Script for clustering data
├── dimensionality_reduction/
│   ├── manifest.json             # Manifest for the dimensionality reduction stage
│   └── app.py                    # Script performing the analysis
└── other_folders_or_files        # Other files/folders in the project
```
- Store at most one stage per folder (with its own `manifest.json` and `app.py`).
- The main manifest can set the overall pipeline logic or serve as the entry point for the entire system.
- Each `app.py` should be as focused as possible, making the stage easier to test, modify, and reuse.
- Parameters in `manifest.json` should be described in as much detail as possible so that users understand what is required as input and what will be returned as output.
- Conda environments: when `conda` is specified, each stage can be isolated in its own environment to avoid library version conflicts across different scripts.