@matthiaskoenig and others have suggested we make our tables more normalized.
Instead of a timecourses table (#581), I would suggest an experiments table, which merges the conditions and timecourses tables into a single table. The main idea is that conditions/timecourses all describe the "input" function of the dynamical system. Combining this "input function information" into a single table enables some additional operations.
All "unquoted" tables in this post are in the new proposed format.
> Tables quoted like this are in old formats.
Conditions table -> experiments table
The following columns are sufficient to define a "normalized" PEtab v1 conditions table.
experimentId
The experiment ID.
inputId
The input variable ID.
For example, an experimental condition parameter in the SBML model.
value
The value that `inputId` takes.
Example 1: classic conditions table as experiments table
This PEtab v1 conditions table
> | conditionId | k0 | k2 |
> |---|---|---|
> | cond1 | 5 | 3 |
is now this PEtab v2 experiments table
| experimentId | inputId | value |
|---|---|---|
| cond1 | k0 | 5 |
| cond1 | k2 | 3 |
This enables additional optional columns, e.g. for units.
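The wide-to-long conversion in Example 1 is mechanical. A minimal sketch using only the standard library follows; the function name `conditions_to_experiments` is hypothetical and not part of any PEtab library, and skipping the optional `conditionName` column and empty cells is an assumption of this sketch.

```python
import csv
import io


def conditions_to_experiments(wide_tsv: str) -> list[dict[str, str]]:
    """Convert a wide PEtab v1 conditions table into long experiments rows."""
    reader = csv.DictReader(io.StringIO(wide_tsv), delimiter="\t")
    rows = []
    for record in reader:
        experiment_id = record["conditionId"]
        for column, value in record.items():
            # Assumption: ID/name columns and empty cells produce no row.
            if column in ("conditionId", "conditionName") or not value:
                continue
            rows.append(
                {"experimentId": experiment_id, "inputId": column, "value": value}
            )
    return rows


# The v1 table from Example 1, as TSV text.
wide = "conditionId\tk0\tk2\ncond1\t5\t3\n"
long_rows = conditions_to_experiments(wide)
for row in long_rows:
    print(row["experimentId"], row["inputId"], row["value"], sep="\t")
```

Values stay as strings here, since a real conditions table may contain parameter IDs as well as numbers.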
Timecourses table -> experiments table
This experiments table can be extended to support timecourses like #581, with the following optional column:
time
The time at which the condition is applied.
The earliest time of the `experimentId` is its `t0` for simulation.
Example 2: timecourses table as experiments table
This timecourse in the currently-proposed format (#581)
> | conditionId | k0 |
> |---|---|
> | cond1 | 1 |
> | cond2 | 2 |
> | cond3 | 3 |

> | timecourseId | timecourse |
> |---|---|
> | tc1 | 0:cond1; 10:cond2; 250:cond3 |
is now specified in these long formats for the conditions and timecourses
normalized_conditions.tsv
| experimentId | inputId | value |
|---|---|---|
| cond1 | k0 | 1 |
| cond2 | k0 | 2 |
| cond3 | k0 | 3 |
normalized_timecourses.tsv
| experimentId | inputId | time |
|---|---|---|
| tc1 | cond1 | 0 |
| tc1 | cond2 | 10 |
| tc1 | cond3 | 250 |
which are specified in the PEtab YAML like

    ...
    problems:
      - experiment_files:
          - normalized_conditions.tsv
          - normalized_timecourses.tsv
        measurement_files:
          - ....tsv
    ...

Here you might notice the trick: the two files can be combined into a single experiments table. Those two long tables and the joint table below are equivalent and share exactly the same format; all are valid tables in the proposed format.
| experimentId | inputId | value | time |
|---|---|---|---|
| cond1 | k0 | 1 | |
| cond2 | k0 | 2 | |
| cond3 | k0 | 3 | |
| tc1 | cond1 | | 0 |
| tc1 | cond2 | | 10 |
| tc1 | cond3 | | 250 |
This joint table enables a lot more flexibility, e.g. the following two features.
(1) Timecourses can be specified in terms of model parameters directly, e.g. the above joint table is equivalent to
| experimentId | inputId | value | time |
|---|---|---|---|
| tc1 | k0 | 1 | 0 |
| tc1 | k0 | 2 | 10 |
| tc1 | k0 | 3 | 250 |
(2) Nesting is now possible, for easier specification of periodic timecourses.
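The equivalence claimed in (1) can be checked mechanically. The sketch below inlines condition references into parameter rows. It assumes that rows without a `time` are condition definitions and that a timed row whose `inputId` names such a condition inherits all of that condition's inputs; the helper `inline_conditions` is hypothetical.

```python
def inline_conditions(rows):
    """Replace timed rows that reference a condition by that condition's inputs."""
    # Assumption: rows with no time define reusable conditions.
    conditions = {}
    for r in rows:
        if r["time"] is None:
            conditions.setdefault(r["experimentId"], []).append(r)

    flat = []
    for r in rows:
        if r["time"] is None:
            continue  # definitions are consumed, not emitted
        # Fall back to the row itself when inputId is a plain model parameter.
        for c in conditions.get(r["inputId"], [r]):
            flat.append(
                {
                    "experimentId": r["experimentId"],
                    "inputId": c["inputId"],
                    "value": c["value"],
                    "time": r["time"],
                }
            )
    return flat


def row(experiment_id, input_id, value=None, time=None):
    return {"experimentId": experiment_id, "inputId": input_id,
            "value": value, "time": time}


joint = [
    row("cond1", "k0", value=1),
    row("cond2", "k0", value=2),
    row("cond3", "k0", value=3),
    row("tc1", "cond1", time=0),
    row("tc1", "cond2", time=10),
    row("tc1", "cond3", time=250),
]
direct = inline_conditions(joint)
```

With these inputs, `direct` holds the three `tc1` rows with `k0` set to 1, 2, and 3 at times 0, 10, and 250.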
Nested timecourses
We already agreed that repeating timecourse specification is useful. I would add nested timecourses too, since I already have a use case. Hence the following optional column:
repeatEvery
The `inputId` is repeated (reapplied/restarted from its `t0`) every `repeatEvery` time units.
Example 3: Nested and repeating timecourse
This describes an experiment where a switch is toggled on/off every 5 time units until t=100.
- `switchOn` and `switchOff` are like PEtab v1 conditions
- `switchSequence` is like a timecourse as in #581
- `experiment1` is a nested timecourse where `switchSequence` is repeated every 10 time units to simulate the repeated toggling of the switch, until `t=100`.
| experimentId | inputId | value | time | repeatEvery |
|---|---|---|---|---|
| switchOn | switch | 1 | | |
| switchOff | switch | 0 | | |
| switchSequence | switchOn | | 0 | |
| switchSequence | switchOff | | 5 | |
| experiment1 | switchSequence | | 0 | 10 |
| experiment1 | switchOff | | 100 | |
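To make the intended semantics of Example 3 concrete, here is a rough sketch of how a tool might "denest" such a table into plain `(time, parameter, value)` events. The proposal does not settle several details, so the following are assumptions of this sketch only: an empty `time` means 0; a repeating row keeps repeating until the next later row of the same experiment; each repetition is cut off when the next one starts; cyclic references are not detected. The function `denest` is hypothetical.

```python
from collections import defaultdict
from math import inf


def denest(rows, experiment_id):
    """Flatten a (possibly nested, repeating) experiment into sorted events."""
    table = defaultdict(list)
    for r in rows:
        table[r["experimentId"]].append(r)

    def expand(exp_id, offset, until):
        own = table[exp_id]
        times = [offset + (r["time"] or 0) for r in own]
        events = []
        for r, start in zip(own, times):
            # Assumption: a row is active until the next later sibling row.
            end = min([t for t in times if t > start] + [until])
            period = r["repeatEvery"]
            if period and end == inf:
                raise ValueError("repeating row needs a later row to bound it")
            starts = [start]
            if period:
                starts, k = [], 0
                while start + k * period < end:
                    starts.append(start + k * period)
                    k += 1
            for t in starts:
                if t >= until:
                    continue
                if r["inputId"] in table:  # reference to another experiment
                    stop = min(end, t + period) if period else end
                    events += expand(r["inputId"], t, stop)
                else:  # plain model input
                    events.append((t, r["inputId"], r["value"]))
        return events

    return sorted(expand(experiment_id, 0, inf))


def row(exp, inp, value=None, time=None, repeat=None):
    return {"experimentId": exp, "inputId": inp, "value": value,
            "time": time, "repeatEvery": repeat}


example3 = [
    row("switchOn", "switch", value=1),
    row("switchOff", "switch", value=0),
    row("switchSequence", "switchOn", time=0),
    row("switchSequence", "switchOff", time=5),
    row("experiment1", "switchSequence", time=0, repeat=10),
    row("experiment1", "switchOff", time=100),
]
events = denest(example3, "experiment1")
```

Under these assumptions, `events` alternates `switch = 1` and `switch = 0` every 5 time units from `t=0` to `t=95`, followed by a final `switch = 0` at `t=100`.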
Pros
- most users do not need timecourses, but v2 currently requires a timecourses table. This combined experiments table means users don't need a dummy (`timecourse1 = 0:condition1`) timecourse table to convert their PEtab v1 problems into v2, and can instead use any condition/timecourse/nested timecourse `experimentId` in the measurements table. I think this is more intuitive for users.
- this experiments table defines inputs, and then supports merging/repeating/concatenating into timecourses and nested timecourses, i.e. all "input" information that PEtab core intends to support is in a single table.
- some basic operations on experiments are possible. For example, one could modify some complicated condition `cond1` with 1000 input variables at just one of its input variables like this:
| experimentId | inputId | value |
|---|---|---|
| cond2 | cond1 | |
| cond2 | k999 | 3 |
i.e., I think this format future-proofs PEtab v2 by supporting many features/operations on conditions. In the end, these can all be "denested" easily into things that look like PEtab v1 conditions applied at specific time points (or, SBML events), so it makes no difference to PEtab-compatible tools.
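The "inherit and override" operation above could be resolved as follows. This is a sketch under one assumption the proposal does not spell out: rows are applied in table order, so a later row for the same input overrides an earlier or inherited one. The function `resolve_condition` and the placeholder values for `cond1` are hypothetical.

```python
def resolve_condition(rows, experiment_id):
    """Return {inputId: value} for an untimed condition, following references."""
    known_ids = {exp for exp, _, _ in rows}
    resolved = {}
    for exp, input_id, value in rows:
        if exp != experiment_id:
            continue
        if input_id in known_ids:
            # Inherit everything from the referenced condition.
            resolved.update(resolve_condition(rows, input_id))
        else:
            # Assumption: later rows override earlier or inherited values.
            resolved[input_id] = value
    return resolved


# cond1 sets 1000 inputs (placeholder values); cond2 changes only k999.
rows = [("cond1", f"k{i}", 0.0) for i in range(1000)]
rows += [("cond2", "cond1", None), ("cond2", "k999", 3.0)]

cond2 = resolve_condition(rows, "cond2")
print(len(cond2), cond2["k0"], cond2["k999"])
```

The result is an ordinary flat mapping, which is the sense in which such tables can be reduced to PEtab v1-style conditions before a tool ever sees them.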
Cons
- users should see Example 1 in the docs, and the optional columns in Examples 2 and 3 should be presented carefully since they are irrelevant (and potentially confusing) to most.