@melihmutlu melihmutlu commented May 21, 2025

Chunks must have the same columns as their hypertables. create_chunk()
should not allow a chunk with additional columns that do not exist in
the parent hypertable. Check columns by name in create_chunk() when
creating a chunk from an existing table.

This commit also forces chunks and their hypertable to have the same
expressions for generated columns, even though PostgreSQL inheritance
may allow them to differ.

On PG15, a chunk column may be generated if and only if the
corresponding hypertable column is also generated. On later versions,
this restriction is enforced by PostgreSQL itself.
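
As a rough illustration of the rule described above, here is a minimal Python model of a name-based column check. It is only a sketch, not the C code in `src/chunk.c`: the `Column` class and `check_chunk_columns` function are hypothetical names invented for this example, and the model assumes that dropped columns are ignored and that a generated column is represented by its expression text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for one pg_attribute row."""
    name: str
    dropped: bool = False                 # models attisdropped
    generated_expr: Optional[str] = None  # None means "not a generated column"


def check_chunk_columns(hypertable: List[Column], chunk: List[Column]) -> None:
    """Raise ValueError unless the live columns of both tables match by name."""
    ht = {c.name: c for c in hypertable if not c.dropped}
    ch = {c.name: c for c in chunk if not c.dropped}

    # Columns present in the chunk but not in the hypertable.
    extra = sorted(ch.keys() - ht.keys())
    if extra:
        raise ValueError(f"chunk has columns not in hypertable: {extra}")

    # Columns present in the hypertable but not in the chunk.
    missing = sorted(ht.keys() - ch.keys())
    if missing:
        raise ValueError(f"chunk is missing hypertable columns: {missing}")

    # Generated columns must agree: both generated with the same expression,
    # or both not generated at all.
    for name, col in ch.items():
        if col.generated_expr != ht[name].generated_expr:
            raise ValueError(f"generated expression mismatch on column {name!r}")
```

Because the comparison is keyed on column names, the order of columns and any dropped columns in either table do not affect the outcome in this model.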

codecov bot commented May 21, 2025

Codecov Report

Attention: Patch coverage is 75.86207% with 7 lines in your changes missing coverage. Please review.

Project coverage is 82.41%. Comparing base (bc69773) to head (276b7d6).
Report is 100 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/chunk.c | 76.19% | 0 Missing and 5 partials ⚠️ |
| src/utils.c | 75.00% | 0 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8164      +/-   ##
==========================================
+ Coverage   82.32%   82.41%   +0.09%     
==========================================
  Files         256      256              
  Lines       47990    47988       -2     
  Branches    12095    12102       +7     
==========================================
+ Hits        39506    39551      +45     
- Misses       3635     3646      +11     
+ Partials     4849     4791      -58     

@melihmutlu melihmutlu force-pushed the check_atts_create_chunk branch from 281c545 to a558b03 Compare May 22, 2025 07:14
@melihmutlu melihmutlu marked this pull request as ready for review May 22, 2025 07:24
@philkra philkra added this to the v2.21.0 milestone May 22, 2025
@erimatnor erimatnor left a comment

Just minor nits and suggestions. Otherwise LGTM.

@mkindahl mkindahl left a comment

I agree with the other reviewers that the locks are too strong and you should just use locks strong enough to prevent modifications of the table definition.

I am also missing tests for the following (some were missing from before, but it's good to have checks for them anyway). I might have missed that some of them already exist, so feel free to point me to them in that case.

  • Create a chunk but use a table that you created with more columns than necessary and then altered to drop some columns. This is to catch cases where the attribute numbers do not match but the names match. (You are checking `attisdropped`, so it should be fine, but good to test it.)
  • Create a chunk that has all the right names and types, but in a different order than the parent.
  • Create a table with too few columns, insert some data, add a new column with the correct type to match the parent hypertable, and then try to add the table as a chunk. This is to see what happens with attributes that have `atthasmissing` set.
  • Add a chunk that has all the correct names compared to the parent table, but with different types. I don't see any such tests in the current tests, so it might be good to add one since you're making this change.
  • Add a chunk with a generated column of the correct name and type.

After you've added chunks that should be possible to add, you should probably try to insert some rows in the parent table and see that they are routed correctly (granted, these were missing since before).
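
The first two scenarios in the list above both come down to matching columns by name rather than by attribute number. The following Python sketch models that idea under stated assumptions: each table is a list of `(name, dropped)` pairs in attribute order, attribute numbers start at 1, and `map_attnos_by_name` is a hypothetical helper written for this illustration, not a TimescaleDB or PostgreSQL function.

```python
from typing import Dict, List, Tuple

# (column name, dropped flag) in attribute order; attno = position, starting at 1.
Attr = Tuple[str, bool]


def map_attnos_by_name(parent: List[Attr], child: List[Attr]) -> Dict[int, int]:
    """Map each live child attribute number to the parent attribute number
    of the column with the same name. Dropped columns are skipped."""
    parent_attno = {
        name: attno
        for attno, (name, dropped) in enumerate(parent, start=1)
        if not dropped
    }
    mapping: Dict[int, int] = {}
    for attno, (name, dropped) in enumerate(child, start=1):
        if dropped:
            continue  # a dropped column still occupies an attribute number
        if name not in parent_attno:
            raise ValueError(f"column {name!r} does not exist in parent")
        mapping[attno] = parent_attno[name]
    return mapping


parent = [("time", False), ("temp", False)]

# Table created with an extra column that was later dropped:
# names match the parent, attribute numbers do not.
print(map_attnos_by_name(parent, [("time", False), ("junk", True), ("temp", False)]))
# {1: 1, 3: 2}

# Same names as the parent, but in a different order.
print(map_attnos_by_name(parent, [("temp", False), ("time", False)]))
# {1: 2, 2: 1}
```

In both cases the mapping is well defined even though positions differ, which is the property those tests are meant to exercise.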

@melihmutlu melihmutlu force-pushed the check_atts_create_chunk branch 3 times, most recently from faa9744 to d5d0b96 Compare May 23, 2025 12:59
@melihmutlu melihmutlu force-pushed the check_atts_create_chunk branch 3 times, most recently from f57507e to 4beba26 Compare May 27, 2025 09:35
@melihmutlu

See the commit postgres/postgres@8bf6ec3 for the changes related to restrictions on generated columns in inheritance after PG15.

@melihmutlu melihmutlu force-pushed the check_atts_create_chunk branch from 4beba26 to 4f6259d Compare May 28, 2025 11:17
@mkindahl mkindahl left a comment

A few suggestions for extending the tests and a few nits. Otherwise looks good.

@melihmutlu melihmutlu force-pushed the check_atts_create_chunk branch from 4f6259d to ee728a7 Compare June 2, 2025 07:29
@melihmutlu melihmutlu force-pushed the check_atts_create_chunk branch from ee728a7 to aa7e90d Compare June 2, 2025 10:20
@melihmutlu melihmutlu force-pushed the check_atts_create_chunk branch from aa7e90d to 276b7d6 Compare June 2, 2025 11:58
@melihmutlu melihmutlu enabled auto-merge (squash) June 2, 2025 11:59
@melihmutlu melihmutlu merged commit e6c4a38 into timescale:main Jun 2, 2025
43 of 44 checks passed
natalya-aksman pushed a commit to natalya-aksman/timescaledb that referenced this pull request Jun 5, 2025
@philkra philkra mentioned this pull request Jul 2, 2025
philkra added a commit that referenced this pull request Jul 8, 2025
## 2.21.0 (2025-07-08)

This release contains performance improvements and bug fixes since the
2.20.3 release. We recommend that you upgrade at the next available
opportunity.

**Highlighted features in TimescaleDB v2.21.0**
* The attach & detach chunks feature allows manually adding or removing
chunks from a hypertable with uncompressed chunks, similar to
PostgreSQL’s partition management.
* Continued improvement of backfilling into the columnstore, achieving
up to 2.5x speedup for constrained tables, by introducing caching logic
that boosts throughput for writes to compressed chunks, bringing
`INSERT` performance close to that of uncompressed chunks.
* Optimized `DELETE` operations on the columnstore through batch-level
deletions of non-segmentby keys in the filter condition, improving
performance by up to 42x in some cases, reducing bloat, and lowering
resource usage.
* The heavy lock taken in Continuous Aggregate refresh was relaxed,
enabling concurrent refreshes for non-overlapping ranges and eliminating
the need for complex customer workarounds.
* [tech preview] Direct Compress is an innovative TimescaleDB feature
that improves high-volume data ingestion by compressing data in memory
and writing it directly to disk, reducing I/O overhead, eliminating
dependency on background compression jobs, and significantly boosting
insert performance.

**Sunsetting of the hypercore access method**
We decided to deprecate the hypercore access method (TAM) with the
2.21.0 release. It was an experiment that did not show the signals we
hoped for, and it will be sunset in TimescaleDB 2.22.0, scheduled for
September 2025. Upgrading to 2.22.0 and higher will be blocked if TAM is
still in use. Since TAM’s inception in
[2.18.0](https://github.com/timescale/timescaledb/releases/tag/2.18.0),
we learned that btrees were not the right architecture. The recent
advancements in the columnstore—such as more performant backfilling,
SkipScan, adding check constraints, and faster point queries—put the
[columnstore](https://www.timescale.com/blog/hypercore-a-hybrid-row-storage-engine-for-real-time-analytics)
close to or on par with TAM without the storage from the additional
index. We apologize for the inconvenience this action potentially causes
and are here to assist you during the migration process.

Migration path

```
do $$
declare   
   relid regclass;
begin
   for relid in
       select cl.oid from pg_class cl
       join pg_am am on (am.oid = cl.relam)
       where am.amname = 'hypercore'
   loop
       raise notice 'converting % to heap', relid::regclass;
       execute format('alter table %s set access method heap', relid);
   end loop;
end
$$;
```

**Features**
* [#8081](#8081) Use JSON
error code for job configuration parsing
* [#8100](#8100) Support
splitting compressed chunks
* [#8131](#8131) Add policy
to process hypertable invalidations
* [#8141](#8141) Add
function to process hypertable invalidations
* [#8165](#8165) Reindex
recompressed chunks in compression policy
* [#8178](#8178) Add
columnstore option to `CREATE TABLE WITH`
* [#8179](#8179) Implement
direct `DELETE` on non-segmentby columns
* [#8182](#8182) Cache
information for repeated upserts into the same compressed chunk
* [#8187](#8187) Allow
concurrent Continuous Aggregate refreshes
* [#8191](#8191) Add option
to not process hypertable invalidations
* [#8196](#8196) Show
deprecation warning for TAM
* [#8208](#8208) Use `NULL`
compression for bool batches with all null values like the other
compression algorithms
* [#8223](#8223) Support
for attach/detach chunk
* [#8265](#8265) Set
incremental Continuous Aggregate refresh policy on by default
* [#8274](#8274) Allow
creating concurrent continuous aggregate refresh policies
* [#8314](#8314) Add
support for timescaledb_lake in loader
* [#8209](#8209) Add
experimental support for Direct Compress of `COPY`
* [#8341](#8341) Allow
quick migration from hypercore TAM to (columnstore) heap

**Bugfixes**
* [#8153](#8153) Restoring
a database having NULL compressed data
* [#8164](#8164) Check
columns when creating new chunk from table
* [#8294](#8294) The
"vectorized predicate called for a null value" error for WHERE
conditions like `x = any(null::int[])`.
* [#8307](#8307) Fix
missing catalog entries for bool and null compression in fresh
installations
* [#8323](#8323) Fix DML
issue with expression indexes and BHS

**GUCs**
* `enable_direct_compress_copy`: Enable experimental support for direct
compression during `COPY`, default: off
* `enable_direct_compress_copy_sort_batches`: Enable batch sorting
during direct compress `COPY`, default: on
* `enable_direct_compress_copy_client_sorted`: Correct handling of data
sorting by the user is required for this option, default: off

---------

Signed-off-by: Philip Krauss <[email protected]>
Co-authored-by: philkra <[email protected]>
Co-authored-by: philkra <[email protected]>
Co-authored-by: Fabrízio de Royes Mello <[email protected]>
Co-authored-by: Anastasiia Tovpeko <[email protected]>
@timescale-automation timescale-automation added the released-2.21.0 Released in 2.21.0 label Jul 8, 2025
7 participants