-
Notifications
You must be signed in to change notification settings - Fork 62
Description
What's the functionality you would like to add
A generic interface to aggregate of (Time, Value)
pairs and manipulate such timeseries.
How would the function be used
SELECT * FROM data;
time | value
------------------------+-------
2019-01-01 00:01:00-05 | 1
2019-01-01 00:01:30-05 | 2
2019-01-01 00:02:00-05 | 2
2019-01-01 00:02:30-05 | 2
2019-01-01 00:03:00-05 | 3
2019-01-01 00:04:00-05 |
2019-01-01 00:04:30-05 |
2019-01-01 00:05:00-05 |
2019-01-01 00:05:30-05 |
2019-01-01 00:06:00-05 | 6
2019-01-01 00:06:30-05 | 6
(11 rows)
-- a timeseries is sorted set of (time, value) pairs
SELECT timeseries(time, value, sample('1 minute') |> interpolate()) FROM data;
timeseries
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
(time, double precision) ("2019-01-01 00:01:00-05",1), ("2019-01-01 00:01:30-05",2), ("2019-01-01 00:02:00-05",2), ("2019-01-01 00:02:30-05",2), ("2019-01-01 00:03:00-05",3), ("2019-01-01 00:04:00-05", NULL), ("2019-01-01 00:04:30-05", NULL), ("2019-01-01 00:05:00-05", NULL), ("2019-01-01 00:05:30-05", NULL), ("2019-01-01 00:06:00-05",6), ("2019-01-01 00:06:30-05",6)
(1 row)
-- it can take a pipeline of transformations to be applied to the underlying data
-- enabling transformations such as bucketing, interpolation and others
SELECT timeseries(time, value, sample('1 minute') |> interpolate()) FROM data;
timeseries
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
(time, double precision) ("2019-01-01 00:01:00-05", 1), ("2019-01-01 00:02:00-05",2), ("2019-01-01 00:03:00-05",3), ("2019-01-01 00:04:00-05",3.6666666666666665), ("2019-01-01 00:05:00-05",4.666666666666666), ("2019-01-01 00:06:00-05",6)
(1 row)
Is your feature request related to a problem? A wishlist item Please describe.
TODO expand other wishlist items this relates to
Open Questions
-
What, if anything, should be the pipeline syntax?
-
We have multiple ways we can relate the pipeline to the aggregate, on is that the aggreagate takes in the pipline as an argument (shown above) this allows us the most flexibility in execution, and may allow us incremental execution in some instances. An alternative is that the pipeline takes in the aggregate as in
timeseries(time, value) |> sample('1 minute') |> interpolate()
this may be a more natural syntax, expecially with output aggregations as in
timeseries(time, value) |> sample('1 minute') |> interpolate()
in the current syntax that would look something like
timeseries(time, value, sample('1 minute') |> interpolate())
-
Ordering: to what extent should we force the input to be ordered, versus sort internally to the aggregation?
-
Should we have separate types for regular and irregular timeseries?