Caching information for repeated UPSERTs to the same chunk #8182
**Codecov Report**

Attention: Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #8182      +/-   ##
==========================================
+ Coverage   82.29%   82.41%   +0.12%
==========================================
  Files         256      256
  Lines       47934    47958      +24
  Branches    12077    12089      +12
==========================================
+ Hits        39448    39526      +78
- Misses       3643     3648       +5
+ Partials     4843     4784      -59
```
This change brings close to a 2x improvement in upsert performance.
When upserting data into the same compressed chunk, we perform index checks and a compression-settings lookup that are surprisingly expensive. This change caches the scan keys and compression settings between repeated calls to the same compressed chunk, which speeds up these repeated upserts.
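For illustration, the sketch below shows the kind of workload this targets: repeated `INSERT ... ON CONFLICT` statements that keep landing in the same compressed chunk of a hypertable. The table, column, and constraint names are hypothetical and not taken from this PR; they only serve to make the scenario concrete.

```
-- Hypothetical hypertable with a unique constraint and compression enabled.
CREATE TABLE metrics (
    ts     timestamptz NOT NULL,
    device int         NOT NULL,
    value  float8,
    UNIQUE (ts, device)
);
SELECT create_hypertable('metrics', 'ts');
ALTER TABLE metrics SET (timescaledb.compress, timescaledb.compress_segmentby = 'device');

-- ... load data, then compress the chunks ...
SELECT compress_chunk(c) FROM show_chunks('metrics') c;

-- Repeated upserts into the same compressed chunk: with this change, the scan
-- keys and compression settings are cached across calls instead of being
-- rebuilt for every statement.
INSERT INTO metrics (ts, device, value)
VALUES ('2025-01-01 00:00:01', 1, 0.5)
ON CONFLICT (ts, device) DO UPDATE SET value = EXCLUDED.value;
```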
@akuzm: I believe I addressed all the concerns.
Thanks! Let's also ask @antekresic to take a look since I'm not really up to date on this code.
Great work @dbeck ! LGTM 🎉
## 2.21.0 (2025-07-08)

This release contains performance improvements and bug fixes since the 2.20.3 release. We recommend that you upgrade at the next available opportunity.

**Highlighted features in TimescaleDB v2.21.0**

* The attach & detach chunks feature allows manually adding or removing chunks from a hypertable with uncompressed chunks, similar to PostgreSQL's partition management.
* Continued improvement of backfilling into the columnstore, achieving up to 2.5x speedup for constrained tables, by introducing caching logic that boosts throughput for writes to compressed chunks, bringing `INSERT` performance close to that of uncompressed chunks.
* Optimized `DELETE` operations on the columnstore through batch-level deletions of non-segmentby keys in the filter condition, greatly improving performance (up to 42x faster in some cases), reducing bloat, and lowering resource usage.
* The heavy lock taken in Continuous Aggregate refresh was relaxed, enabling concurrent refreshes for non-overlapping ranges and eliminating the need for complex customer workarounds.
* [tech preview] Direct Compress is an innovative TimescaleDB feature that improves high-volume data ingestion by compressing data in memory and writing it directly to disk, reducing I/O overhead, eliminating dependency on background compression jobs, and significantly boosting insert performance.

**Sunsetting of the hypercore access method**

We made the decision to deprecate the hypercore access method (TAM) with the 2.21.0 release. It was an experiment which did not show the signals we hoped for, and it will be sunset in TimescaleDB 2.22.0, scheduled for September 2025. Upgrading to 2.22.0 and higher will be blocked if TAM is still in use. Since TAM's inception in [2.18.0](https://github.com/timescale/timescaledb/releases/tag/2.18.0), we learned that btrees were not the right architecture. The recent advancements in the columnstore—such as more performant backfilling, SkipScan, adding check constraints, and faster point queries—put the [columnstore](https://www.timescale.com/blog/hypercore-a-hybrid-row-storage-engine-for-real-time-analytics) close to or on par with TAM without the storage cost of the additional index. We apologize for the inconvenience this action potentially causes and are here to assist you during the migration process.
**Migration path**

```
do $$
declare
  relid regclass;
begin
  for relid in
    select cl.oid
    from pg_class cl
    join pg_am am on (am.oid = cl.relam)
    where am.amname = 'hypercore'
  loop
    raise notice 'converting % to heap', relid::regclass;
    execute format('alter table %s set access method heap', relid);
  end loop;
end
$$;
```

**Features**
* [#8081](#8081) Use JSON error code for job configuration parsing
* [#8100](#8100) Support splitting compressed chunks
* [#8131](#8131) Add policy to process hypertable invalidations
* [#8141](#8141) Add function to process hypertable invalidations
* [#8165](#8165) Reindex recompressed chunks in compression policy
* [#8178](#8178) Add columnstore option to `CREATE TABLE WITH`
* [#8179](#8179) Implement direct `DELETE` on non-segmentby columns
* [#8182](#8182) Cache information for repeated upserts into the same compressed chunk
* [#8187](#8187) Allow concurrent Continuous Aggregate refreshes
* [#8191](#8191) Add option to not process hypertable invalidations
* [#8196](#8196) Show deprecation warning for TAM
* [#8208](#8208) Use `NULL` compression for bool batches with all null values like the other compression algorithms
* [#8223](#8223) Support for attach/detach chunk
* [#8265](#8265) Set incremental Continuous Aggregate refresh policy on by default
* [#8274](#8274) Allow creating concurrent continuous aggregate refresh policies
* [#8314](#8314) Add support for timescaledb_lake in loader
* [#8209](#8209) Add experimental support for Direct Compress of `COPY`
* [#8341](#8341) Allow quick migration from hypercore TAM to (columnstore) heap

**Bugfixes**
* [#8153](#8153) Restoring a database having NULL compressed data
* [#8164](#8164) Check columns when creating new chunk from table
* [#8294](#8294) The "vectorized predicate called for a null value" error for WHERE conditions like `x = any(null::int[])`
* [#8307](#8307) Fix missing catalog entries for bool and null compression in fresh installations
* [#8323](#8323) Fix DML issue with expression indexes and BHS

**GUCs**
* `enable_direct_compress_copy`: Enable experimental support for direct compression during `COPY`, default: off
* `enable_direct_compress_copy_sort_batches`: Enable batch sorting during direct compress `COPY`, default: on
* `enable_direct_compress_copy_client_sorted`: Correct handling of data sorting by the user is required for this option, default: off

---------

Signed-off-by: Philip Krauss <[email protected]>
Co-authored-by: philkra <[email protected]>
Co-authored-by: philkra <[email protected]>
Co-authored-by: Fabrízio de Royes Mello <[email protected]>
Co-authored-by: Anastasiia Tovpeko <[email protected]>
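As an illustration of the GUCs listed above, the experimental direct compress `COPY` path can be toggled per session. This is only a sketch: the GUC names come from the release notes, while the `timescaledb.` prefix is assumed here because it is the usual namespace for TimescaleDB settings.

```
-- Sketch: enable the experimental direct compress COPY path for this session.
SET timescaledb.enable_direct_compress_copy = on;
SET timescaledb.enable_direct_compress_copy_sort_batches = on;

-- Only enable this if the client already sends data sorted correctly,
-- as required by the release notes:
-- SET timescaledb.enable_direct_compress_copy_client_sorted = on;
```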