interpolated_average can't pull values through empty buckets 

**Relevant system information:**
- OS: Timescale Managed Service Azure
- PostgreSQL version: 14.4
- TimescaleDB Toolkit version: 1.10.0
- Installation method: Timescale Managed Service Azure



**Describe the bug**
Time weighted average fails if buckets without values exist. It just ignores buckets without data (output has less rows than expected) and for buckets with data that are adjacent to buckets without data it computes wrong averages.

**To Reproduce**
- Input data:

```
ts, signal_id, value
('2019-12-31T23:10:00', 8, 10);
('2020-01-01T03:10:00', 8, 20);
('2020-01-01T03:50:00', 8, 40);
('2020-01-01T00:00:00', 9, 10);
('2020-01-01T03:10:00', 9, 20);
('2020-01-01T03:50:00', 9, 40);
```

- Expected output (LOCF, no linear interpolation):
```
ts, signal_id, value
('2020-01-01T01:00:00', 8, 10)
('2020-01-01T02:00:00', 8, 10)
('2020-01-01T03:00:00', 8,10)
('2020-01-01T04:00:00', 8,21.66666)
('2020-01-01T05:00:00', 8,40)
('2020-01-01T01:00:00', 9, 10)
('2020-01-01T02:00:00', 9, 10)
('2020-01-01T03:00:00', 9,10)
('2020-01-01T04:00:00', 9,21.66666)
('2020-01-01T05:00:00', 9,40)
```

- Query used
```
SELECT
    signal_id,
    time_value,
    toolkit_experimental.interpolated_average(
        tws,
        time_value,
        '3600 s',
        LAG(tws) OVER (Partition By signal_id ORDER BY time_value),
        LEAD(tws) OVER (Partition By signal_id ORDER BY time_value)
    )
    FROM (
    SELECT
        signal_id,
        time_bucket('3600 s', ts) AS time_value,
        time_weight('LOCF', ts, measurement_value) AS tws
    FROM measurements
    Where ts BETWEEN '2020-01-01T00:00:00' AND '2020-01-01T05:00:00'
    And signal_id = 9    
    GROUP BY signal_id, time_value
    ) t
```

- Actual output:
```
('2020-01-01T01:00:00', 9, 10)
('2020-01-01T03:00:00', 9, 18)
('2020-01-01T01:00:00', 8, 10)
('2020-01-01T03:00:00', 8, 18)
```

**Workaround**
with: 
- *datetime_from* = '2020-01-01T00:00:00'
- *datetime_to* = '2020-01-01T05:00:00' 
- *signal_id* = 9
- *resampled_interval* = 3600
- fall_back_time = *datetime_from* - 30 Days

```
with t1 as (
    SELECT signal_id, ts, measurement_value
    From measurements
    Where ts >= '{datetime_from}' AND ts < '{datetime_to}' And signal_id in ({signals})
),
t2 as (SELECT
    time_bucket_gapfill('{resampled_interval} s', ts) AS time_value_gap_fill,
    signal_id,
    locf(last(measurement_value,ts)) as locf_value,
    last(measurement_value, ts) as indicator_empty_bucket
    From measurements
    Where ts BETWEEN '{fall_back_time}' AND '{datetime_to}' and signal_id in ({signals})
    GROUP BY signal_id, time_value_gap_fill),
t3 as (
    select signal_id, time_value_gap_fill as ts, locf_value as measurement_value
    from t2 where indicator_empty_bucket is Null
    and time_value_gap_fill >= '{datetime_from-datetime.timedelta(seconds=resampled_interval)}'
),
t4 as (
select * from t3 union all select * from t1 order by ts),
t5 as (
    SELECT signal_id, time_bucket('{resampled_interval} s', ts) AS interval_label ,
     time_weight('LOCF', ts, measurement_value) AS tws
     FROM t4
    GROUP BY signal_id, interval_label
),
t6 as (
    SELECT interval_label, signal_id,
        toolkit_experimental.interpolated_average(
            tws,
            interval_label,
            '{resampled_interval} s',
            LAG(tws) OVER (Partition By signal_id ORDER BY interval_label),
            LEAD(tws) OVER (Partition By signal_id ORDER BY interval_label)) as time_weighted_average
            from t5
)
Select interval_label +'{resampled_interval} s' ,
 signal_id,
 time_weighted_average from t6 where interval_label < '{datetime_to}'
  and interval_label > '{datetime_from-datetime.timedelta(seconds=resampled_interval)}';
```

This should work out of the box. The workaround query also works for multiple signals. 



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

interpolated_average can't pull values through empty buckets #548

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

interpolated_average can't pull values through empty buckets #548

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions