Use somewhat normalized "experiments" table instead of conditions/timecourses #585

@dilpath

@matthiaskoenig and others have suggested we make our tables more normalized.

Instead of a timecourses table (#581), I would suggest an experiments table, which merges the conditions and timecourses tables into a single table. The main idea is that conditions/timecourses all describe the "input" function of the dynamical system. Combining this "input function information" into a single table enables some additional operations.

All "unquoted" tables in this post are in the new proposed format.

> Tables quoted like this are in old formats.

Conditions table -> experiments table

The following columns are sufficient to define a "normalized" PEtab v1 conditions table.

experimentId
    The experiment ID.
inputId
    The input variable ID.
    For example, an experimental condition parameter in the SBML model.
value
    The value that `inputId` takes. 

Example 1: classic conditions table as experiments table

This PEtab v1 conditions table

> conditionId k0 k2
> cond1 5 3

is now this PEtab v2 experiments table

experimentId inputId value
cond1 k0 5
cond1 k2 3

This enables additional optional columns, e.g. for units.
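
A minimal sketch of the Example 1 conversion, assuming the v1 conditions table is tab-separated and that every column other than `conditionId` (and the optional v1 label column `conditionName`) is an input. The function name `conditions_to_experiments` is made up for illustration and is not part of any PEtab library.

```python
import csv
import io


def conditions_to_experiments(wide_tsv: str) -> list[dict[str, str]]:
    """Melt a wide v1 conditions table into long (experimentId, inputId, value) rows."""
    reader = csv.DictReader(io.StringIO(wide_tsv), delimiter="\t")
    rows = []
    for record in reader:
        for column, value in record.items():
            # conditionId identifies the row; conditionName is only a label.
            if column in ("conditionId", "conditionName"):
                continue
            rows.append(
                {
                    "experimentId": record["conditionId"],
                    "inputId": column,
                    "value": value,
                }
            )
    return rows


# The v1 table from Example 1.
wide = "conditionId\tk0\tk2\ncond1\t5\t3\n"
long_rows = conditions_to_experiments(wide)

print("experimentId", "inputId", "value", sep="\t")
for row in long_rows:
    print(row["experimentId"], row["inputId"], row["value"], sep="\t")
```

Values are kept as strings here, so parameter IDs used as values in a v1 table would pass through unchanged.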

Timecourses table -> experiments table

This experiments table can be extended to support timecourses like #581, with the following optional column:

time
    The time at which the condition is applied.
    The earliest time of the `experimentId` is its `t0` for simulation.

Example 2: timecourses table as experiments table

This timecourse in the currently-proposed format (#581)

> conditionId k0
> cond1 1
> cond2 2
> cond3 3
>
> timecourseId timecourse
> tc1 0:cond1; 10:cond2; 250:cond3

is now specified in these long formats for the conditions and timecourses

normalized_conditions.tsv

experimentId inputId value
cond1 k0 1
cond2 k0 2
cond3 k0 3

normalized_timecourses.tsv

experimentId inputId time
tc1 cond1 0
tc1 cond2 10
tc1 cond3 250

which are specified in the PEtab YAML like

...
problems:
- experiment_files:
  - normalized_conditions.tsv
  - normalized_timecourses.tsv
  measurement_files:
  - ....tsv
  ...

Here, you might notice the trick: the two tables can be combined into a single experiments table. The two long tables above and the joint table below are equivalent and share the exact same format, so all of them are valid tables in the proposed format.

experimentId  inputId  value  time
cond1         k0       1
cond2         k0       2
cond3         k0       3
tc1           cond1           0
tc1           cond2           10
tc1           cond3           250

This joint table enables a lot more flexibility, e.g. the following two features.

(1) Timecourses can be specified in terms of model parameters directly, e.g. the above joint table is equivalent to

experimentId inputId value time
tc1 k0 1 0
tc1 k0 2 10
tc1 k0 3 250

(2) Nesting is now possible, for easier specification of periodic timecourses.
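
Feature (1) can be checked with a small denesting sketch. It assumes that an `inputId` matching some `experimentId` is a reference to that experiment, that an empty `time` means "applied at the time of the parent row", and that there are no reference cycles. The tuple layout and the name `denest` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict

# (experimentId, inputId, value, time); None marks an empty cell.
JOINT_TABLE = [
    ("cond1", "k0", 1, None),
    ("cond2", "k0", 2, None),
    ("cond3", "k0", 3, None),
    ("tc1", "cond1", None, 0),
    ("tc1", "cond2", None, 10),
    ("tc1", "cond3", None, 250),
]


def denest(table, experiment_id):
    """Replace references to other experiments by the model inputs they set."""
    by_experiment = defaultdict(list)
    for exp, input_id, value, time in table:
        by_experiment[exp].append((input_id, value, time))

    def expand(exp, offset):
        flat = []
        for input_id, value, time in by_experiment[exp]:
            applied_at = offset + (time or 0)
            if input_id in by_experiment:
                # inputId names another experiment: expand it at this time.
                flat.extend(expand(input_id, applied_at))
            else:
                flat.append((experiment_id, input_id, value, applied_at))
        return flat

    return sorted(expand(experiment_id, 0), key=lambda row: row[3])


flat_tc1 = denest(JOINT_TABLE, "tc1")
for row in flat_tc1:
    print(*row, sep="\t")
```

Under these assumptions the result has the same rows as the `tc1 k0 ...` table in (1).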

Nested timecourses

We already agreed that repeating timecourse specification is useful. I would add nested timecourses too, since I already have a use case. Hence the following optional column:

repeatEvery
    The `inputId` is repeated (reapplied/restarted from its `t0`) every `repeatEvery` time units.

Example 3: Nested and repeating timecourse

This describes an experiment where a switch is toggled on/off every 5 time units until t=100.

  • switchOn and switchOff are like PEtab v1 conditions
  • switchSequence is like a timecourse as in #581 (Specification of timecourses & long condition table)
  • experiment1 is a nested timecourse where switchSequence is repeated every 10 time units to simulate the repeated toggling of the switch, until t=100.

experimentId    inputId         value  time  repeatEvery
switchOn        switch          1
switchOff       switch          0
switchSequence  switchOn               0
switchSequence  switchOff              5
experiment1     switchSequence         0     10
experiment1     switchOff              100
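
To make the intended semantics of Example 3 concrete, here is a rough sketch that expands `experiment1` into a flat list of (time, inputId, value) events. The rules it encodes are my assumptions, not settled parts of the proposal: times inside a nested experiment are relative to the time at which it is applied, a row stays active until the next later row of the same experiment, and a repeating row stops repeating at that point. The name `expand` and the tuple layout are hypothetical.

```python
import math
from collections import defaultdict

# (experimentId, inputId, value, time, repeatEvery); None marks an empty cell.
TABLE = [
    ("switchOn", "switch", 1, None, None),
    ("switchOff", "switch", 0, None, None),
    ("switchSequence", "switchOn", None, 0, None),
    ("switchSequence", "switchOff", None, 5, None),
    ("experiment1", "switchSequence", None, 0, 10),
    ("experiment1", "switchOff", None, 100, None),
]


def expand(table, experiment_id):
    """Flatten a nested/repeating experiment into (time, inputId, value) events."""
    by_experiment = defaultdict(list)
    for exp, input_id, value, time, repeat_every in table:
        by_experiment[exp].append((time or 0, input_id, value, repeat_every))

    def events(exp, offset, end):
        rows = sorted(by_experiment[exp], key=lambda row: row[0])
        out = []
        for time, input_id, value, repeat_every in rows:
            start = offset + time
            if start >= end:
                continue  # outside the window granted by the parent row
            # Assumption: a row is active until the next later row of the
            # same experiment, or until the parent's window closes.
            later = [offset + row[0] for row in rows if row[0] > time]
            stop = min(later[0], end) if later else end
            if repeat_every is None:
                windows = [(start, stop)]
            elif math.isinf(stop):
                raise ValueError(f"{exp}: repeating row has no end time")
            else:
                windows = []
                t = start
                while t < stop:
                    windows.append((t, min(t + repeat_every, stop)))
                    t += repeat_every
            for window_start, window_end in windows:
                if input_id in by_experiment:
                    out.extend(events(input_id, window_start, window_end))
                else:
                    out.append((window_start, input_id, value))
        return out

    return sorted(events(experiment_id, 0, math.inf), key=lambda event: event[0])


flat_events = expand(TABLE, "experiment1")
for event in flat_events:
    print(*event, sep="\t")
```

With these rules the switch is set to 1 at t=0, 10, ..., 90, set to 0 at t=5, 15, ..., 95, and set to 0 once more at t=100, which is the flat list of condition changes a tool would need to apply.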

Pros

  • most users do not need timecourses, but v2 currently requires a timecourses table. This combined experiments table means users don't need a dummy (timecourse1 = 0:condition1) timecourse table to convert their PEtab v1 problems into v2, and can instead use any condition/timecourse/nested timecourse experimentId in the measurements table. I think this is more intuitive for users.
  • this experiments table defines inputs, and then supports merging/repeating/concatenating into timecourses and nested timecourses. i.e. all "input" information that PEtab core intends to support is in a single table.
  • some basic operations on experiments are possible. For example, one could take some complicated condition cond1 with 1000 input variables and change just one of them, like

experimentId  inputId  value
cond2         cond1
cond2         k999     3

i.e., I think this format future-proofs PEtab v2 by supporting many features/operations on conditions. In the end, these can all be "denested" easily into things that look like PEtab v1 conditions applied at specific time points (or, SBML events), so it makes no difference to PEtab-compatible tools.
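
The override in the last bullet could be resolved roughly as follows. This sketch assumes that rows are applied in table order, so a later row of the same experiment overrides an earlier one, and that references are acyclic. The values given to cond1 and the name `resolve` are invented for the example.

```python
def resolve(table, experiment_id):
    """Return the {inputId: value} mapping that an untimed experiment defines."""
    experiment_ids = {row[0] for row in table}
    inputs = {}
    for exp, input_id, value in table:
        if exp != experiment_id:
            continue
        if input_id in experiment_ids:
            # Reference to another experiment: take over all of its inputs.
            inputs.update(resolve(table, input_id))
        else:
            # Plain input: set it, overriding any inherited value.
            inputs[input_id] = value
    return inputs


# Hypothetical cond1 with 1000 inputs k0..k999 (values are made up).
table = [("cond1", f"k{i}", float(i)) for i in range(1000)]
# cond2 reuses cond1 and changes only k999, as in the table above.
table += [("cond2", "cond1", None), ("cond2", "k999", 3.0)]

cond1 = resolve(table, "cond1")
cond2 = resolve(table, "cond2")
print(cond1["k999"], cond2["k999"], len(cond2))
```

The result is an ordinary flat condition, which is the sense in which such operations make no difference to downstream tools.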

Cons

  • users should see Example 1 in the docs, and the optional columns in Examples 2 and 3 should be presented carefully since they are irrelevant (and potentially confusing) to most.
