-
Notifications
You must be signed in to change notification settings - Fork 62
Description
While combining several heartbeat aggs together using the rollup function I noticed that I get a different result if I first do a rollup on a different field and then roll the results together. Rollup for heartbeat agg should always give the same result for the same base data, regardless of the intermediate steps.
The data for this query is from the SustData data set, I've constructed the aggregates like so:
CREATE MATERIALIZED VIEW weekly_heartbeat AS
SELECT
time_bucket('1 week', tmstp) as week,
iid as unit,
deploy,
heartbeat_agg(tmstp, time_bucket('1w', tmstp), '1w', '2m')
FROM power_samples
GROUP BY 1,2,3;
Here are the rollup queries I ran and the results I got:
with week_agg as (select deploy, week, rollup(heartbeat_agg) as agg from weekly_heartbeat group by 1, 2)
select deploy, live_ranges(rollup(agg)) from week_agg group by deploy order by deploy;
deploy | live_ranges
--------+-----------------------------------------------------
1 | ("2010-07-29 00:00:00+00","2010-11-26 00:01:00+00")
2 | ("2010-11-25 00:00:00+00","2011-03-27 01:01:59+00")
2 | ("2011-03-27 03:00:00+00","2012-03-25 01:01:59+00")
2 | ("2012-03-25 03:00:26+00","2012-04-17 00:01:00+00")
2 | ("2012-04-20 00:00:00+00","2012-04-21 00:01:00+00")
2 | ("2012-05-11 00:00:00+00","2012-05-13 00:01:00+00")
2 | ("2013-02-20 00:00:00+00","2013-02-21 00:01:00+00")
3 | ("2012-08-01 00:00:01+00","2013-03-31 01:01:16+00")
3 | ("2013-03-31 03:00:03+00","2013-05-22 00:01:00+00")
4 | ("2013-07-31 00:00:00+00","2014-03-30 01:01:49+00")
4 | ("2014-03-30 03:00:01+00","2014-04-25 00:01:00+00")
with unit_agg as (select deploy, unit, rollup(heartbeat_agg) as agg from weekly_heartbeat group by 1,2)
select deploy, live_ranges(rollup(agg)) from unit_agg group by 1 order by 1;
deploy | live_ranges
--------+-----------------------------------------------------
1 | ("2010-07-29 00:00:00+00","2010-08-01 00:01:00+00")
1 | ("2010-08-02 00:00:00+00","2010-11-26 00:01:00+00")
2 | ("2010-11-25 00:00:00+00","2011-03-27 01:01:59+00")
2 | ("2011-03-27 03:00:00+00","2012-03-25 01:01:59+00")
2 | ("2012-03-25 03:00:26+00","2012-04-17 00:01:00+00")
2 | ("2012-04-20 00:00:00+00","2012-04-21 00:01:00+00")
2 | ("2012-05-11 00:00:00+00","2012-05-13 00:01:00+00")
2 | ("2013-02-20 00:00:00+00","2013-02-21 00:01:00+00")
3 | ("2012-08-01 00:00:01+00","2013-03-31 01:01:16+00")
3 | ("2013-03-31 03:00:03+00","2013-05-26 00:01:00+00")
4 | ("2013-07-31 00:00:00+00","2014-03-30 01:01:49+00")
4 | ("2014-03-30 03:00:01+00","2014-04-25 00:01:00+00")
select deploy, live_ranges(rollup(heartbeat_agg)) from weekly_heartbeat group by deploy order by deploy;
deploy | live_ranges
--------+-----------------------------------------------------
1 | ("2010-07-29 00:00:00+00","2010-09-13 00:01:40+00")
1 | ("2010-09-27 00:00:00+00","2010-11-15 00:01:00+00")
1 | ("2010-11-22 00:00:41+00","2010-11-26 00:01:00+00")
2 | ("2010-11-25 00:00:00+00","2011-03-27 01:01:59+00")
2 | ("2011-03-27 03:00:00+00","2012-03-25 01:01:59+00")
2 | ("2012-03-25 03:00:26+00","2012-04-17 00:01:00+00")
2 | ("2012-04-20 00:00:00+00","2012-04-21 00:01:00+00")
2 | ("2012-05-11 00:00:00+00","2012-05-13 00:01:00+00")
2 | ("2013-02-20 00:00:00+00","2013-02-21 00:01:00+00")
3 | ("2012-08-01 00:00:01+00","2013-03-31 01:01:16+00")
3 | ("2013-03-31 03:00:03+00","2013-04-15 00:01:09+00")
3 | ("2013-05-06 00:00:05+00","2013-05-13 00:01:25+00")
3 | ("2013-05-20 00:00:04+00","2013-05-22 00:01:00+00")
4 | ("2013-07-31 00:00:00+00","2014-03-30 01:01:49+00")
4 | ("2014-03-30 03:00:01+00","2014-04-07 00:01:01+00")
As can be seen, there are some missing ranges when simply rolling all the data together than aren't present if the overlapping ranges are first rolled together. There's an even different result in one of the cases when first rolling together the aggregates for a unit and then rolling up by week.