Skip to content

Conversation

zilder
Copy link
Member

@zilder zilder commented Feb 21, 2025

OSM sets a special callback for dropping tiered chunks that is used by timescaledb in ts_chunk_do_drop_chunks. But in background worker OSM library doesn't get loaded and retention policy ends up not removing tiered chunks.

Disable-check: approval-count
(for some reason Gayathri's approval did not count)

Copy link

codecov bot commented Feb 21, 2025

Codecov Report

❌ Patch coverage is 33.33333% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.53%. Comparing base (412f427) to head (e637e4c).
⚠️ Report is 44 commits behind head on main.

Files with missing lines Patch % Lines
src/chunk.c 33.33% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7766      +/-   ##
==========================================
+ Coverage   82.47%   82.53%   +0.06%     
==========================================
  Files         247      247              
  Lines       47520    47494      -26     
  Branches    12081    12078       -3     
==========================================
+ Hits        39190    39201      +11     
- Misses       3474     3491      +17     
+ Partials     4856     4802      -54     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@zilder zilder force-pushed the zilder/retention_bgw_load_osm branch from b24b4ec to c2d3700 Compare September 12, 2025 12:33
@zilder zilder marked this pull request as ready for review September 12, 2025 12:34
Copy link

@svenklemm, @antekresic: please review this pull request.

Powered by pull-review

@zilder zilder force-pushed the zilder/retention_bgw_load_osm branch from c2d3700 to 176fa37 Compare September 12, 2025 13:00
elog(LOG, "failed to load OSM extension: %s", error->message);

/* Finally, free error data */
FreeErrorData(error);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transaction is borked after an error, so you need rethrow the error here. If you want to continue processing, you need to execute the load in a sub-transaction that can fail while the main transaction continues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably throw the error here though... you only try to load this library if an OSM chunk exists (i.e., OSM is being used), so if you cannot load the extension lib at that point, you should probably throw an error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced the explicit library loading code with the GetFdwRoutineByRelId() call as you suggested in slack, it works perfectly.

@zilder zilder force-pushed the zilder/retention_bgw_load_osm branch 2 times, most recently from 1f56ec5 to 01565d9 Compare September 15, 2025 13:00
@zilder zilder requested a review from erimatnor September 15, 2025 14:07
Copy link
Contributor

@gayyappan gayyappan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the previous commit and update the commit message/PR comment.

@zilder zilder force-pushed the zilder/retention_bgw_load_osm branch from 01565d9 to 831f7d1 Compare September 17, 2025 07:55
The OSM library doesn't get loaded when a retention policy runs from
a background worker. This patch ensures that the library gets loaded by
calling `GetFdwRoutineByRelId()` on the OSM chunk.
@zilder zilder force-pushed the zilder/retention_bgw_load_osm branch from 831f7d1 to e637e4c Compare September 17, 2025 13:40
@zilder zilder merged commit c3559d4 into timescale:main Sep 17, 2025
41 of 42 checks passed
@philkra philkra added the force-auto-backport Automatically backport this PR or fix of this issue, even if it's not marked as "bug" label Sep 17, 2025
philkra added a commit that referenced this pull request Sep 30, 2025
## 2.22.1 (2025-09-30)

This release contains performance improvements and bug fixes since the
[2.22.0](https://github.com/timescale/timescaledb/releases/tag/2.20.0)
release. We recommend that you upgrade at the next available
opportunity.

This release blocks the ability to leverage **concurrent refresh
policies** in **hierarchical continous aggregates**, as potential
deadlocks can occur. If you have [concurrent refresh
policies](https://docs.tigerdata.com/use-timescale/latest/continuous-aggregates/refresh-policies/#add-concurrent-refresh-policies)
in **hierarchical** continous aggregates, [please disable the
jobs](https://docs.tigerdata.com/api/latest/jobs-automation/alter_job/#samples),
as following:

```
SELECT alter_job("<job_id_of_concurrent_policy>", scheduled => false);
```

**Bugfixes**
* [#7766](#7766) Load OSM
extension in retention background worker to drop tiered chunks
* [#8550](#8550) Error in
gapfill with expressions over aggregates and groupby columns and
out-of-order columns
* [#8593](#8593) Error on
change of invalidation method for continuous aggregate
* [#8599](#8599) Fix attnum
mismatch bug in chunk constraint checks
* [#8607](#8607) Fix
interrupted continous aggregate refresh materialization phase leaving
behind pending materialization ranges
* [#8638](#8638) `ALTER
TABLE RESET` for `orderby` settings
* [#8644](#8644) Fix
migration script for sparse index configuration
* [#8657](#8657) Fix
`CREATE TABLE WITH` when using UUIDv7 partitioning
* [#8659](#8659) Don't
propagate `ALTER TABLE` commands to foreign data wrapper chunks
* [#8693](#8693) Compressed
index not chosen for `varchar` typed `segmentby` columns
* [#8707](#8707) Block
concurrent refresh policies for hierarchical continous aggregate due to
potential deadlocks

**Thanks**
* @MKrkkl for reporting a bug in Gapfill queries with expressions over
aggregates and groupby columns
* @brandonpurcell-dev for creating a test case that showed a bug in
`CREATE TABLE WITH` when using UUIDv7 partitioning
* @snyrkill for reporting a bug when interrupting a continous aggregate
refresh

---------

Signed-off-by: Philip Krauss <[email protected]>
Co-authored-by: timescale-automation <123763385+github-actions[bot]@users.noreply.github.com>
Co-authored-by: philkra <[email protected]>
Co-authored-by: Philip Krauss <[email protected]>
Co-authored-by: Iain Cox <[email protected]>
@timescale-automation timescale-automation added the released-2.22.1 Released in 2.22.1 label Sep 30, 2025
philkra added a commit that referenced this pull request Oct 1, 2025
## 2.22.1 (2025-09-30)

This release contains performance improvements and bug fixes since the
[2.22.0](https://github.com/timescale/timescaledb/releases/tag/2.20.0)
release. We recommend that you upgrade at the next available
opportunity.

This release blocks the ability to leverage **concurrent refresh
policies** in **hierarchical continous aggregates**, as potential
deadlocks can occur. If you have [concurrent refresh
policies](https://docs.tigerdata.com/use-timescale/latest/continuous-aggregates/refresh-policies/#add-concurrent-refresh-policies)
in **hierarchical** continous aggregates, [please disable the
jobs](https://docs.tigerdata.com/api/latest/jobs-automation/alter_job/#samples),
as following:

```
SELECT alter_job("<job_id_of_concurrent_policy>", scheduled => false);
```

**Bugfixes**
* [#7766](#7766) Load OSM
extension in retention background worker to drop tiered chunks
* [#8550](#8550) Error in
gapfill with expressions over aggregates and groupby columns and
out-of-order columns
* [#8593](#8593) Error on
change of invalidation method for continuous aggregate
* [#8599](#8599) Fix attnum
mismatch bug in chunk constraint checks
* [#8607](#8607) Fix
interrupted continous aggregate refresh materialization phase leaving
behind pending materialization ranges
* [#8638](#8638) `ALTER
TABLE RESET` for `orderby` settings
* [#8644](#8644) Fix
migration script for sparse index configuration
* [#8657](#8657) Fix
`CREATE TABLE WITH` when using UUIDv7 partitioning
* [#8659](#8659) Don't
propagate `ALTER TABLE` commands to foreign data wrapper chunks
* [#8693](#8693) Compressed
index not chosen for `varchar` typed `segmentby` columns
* [#8707](#8707) Block
concurrent refresh policies for hierarchical continous aggregate due to
potential deadlocks

**Thanks**
* @MKrkkl for reporting a bug in Gapfill queries with expressions over
aggregates and groupby columns
* @brandonpurcell-dev for creating a test case that showed a bug in
`CREATE TABLE WITH` when using UUIDv7 partitioning
* @snyrkill for reporting a bug when interrupting a continous aggregate
refresh

---------

Signed-off-by: Philip Krauss <[email protected]>
Co-authored-by: timescale-automation <123763385+github-actions[bot]@users.noreply.github.com>
Co-authored-by: philkra <[email protected]>
Co-authored-by: Philip Krauss <[email protected]>
Co-authored-by: Iain Cox <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backported-2.22.x force-auto-backport Automatically backport this PR or fix of this issue, even if it's not marked as "bug" released-2.22.1 Released in 2.22.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants