3 changes: 2 additions & 1 deletion kvdb-rocksdb/CHANGELOG.md
@@ -6,7 +6,8 @@ The format is based on [Keep a Changelog].

## [Unreleased]

-## [0.20.2] - 2025-11-17
+## [0.21.0] - 2025-12-01
+- Expose function `force_compact` to forcefully compact columns [#958](https://github.com/paritytech/parity-common/pull/958)
- Make snappy and jemalloc configurable features [#950](https://github.com/paritytech/parity-common/pull/950)

## [0.20.1] - 2025-11-07
2 changes: 1 addition & 1 deletion kvdb-rocksdb/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "kvdb-rocksdb"
version = "0.20.2"
version = "0.21.0"
description = "kvdb implementation backed by RocksDB"
rust-version = "1.71.1"
authors.workspace = true
55 changes: 14 additions & 41 deletions kvdb-rocksdb/src/lib.rs
@@ -14,10 +14,8 @@ use std::{
collections::HashMap,
error, io,
path::{Path, PathBuf},
-time::{Duration, Instant},
};

-use parking_lot::Mutex;
use rocksdb::{
BlockBasedOptions, ColumnFamily, ColumnFamilyDescriptor, CompactOptions, Options, ReadOptions, WriteBatch,
WriteOptions, DB,
@@ -271,7 +269,6 @@ pub struct Database {
read_opts: ReadOptions,
block_opts: BlockBasedOptions,
stats: stats::RunningDbStats,
last_compaction: Mutex<Instant>,
}

/// Generate the options for RocksDB, based on the given `DatabaseConfig`.
@@ -354,23 +351,15 @@ impl Database {
Self::open_primary(&opts, path.as_ref(), config, column_names.as_slice(), &block_opts)?
};

-let db = Database {
+Ok(Database {
inner: DBAndColumns { db, column_names },
config: config.clone(),
opts,
read_opts,
write_opts,
block_opts,
stats: stats::RunningDbStats::new(),
-last_compaction: Mutex::new(Instant::now()),
-};
-
-// After opening the DB, we want to compact it.
-//
-// This just in case the node crashed before to ensure the db stays fast.
-db.force_compaction()?;
Review thread on the removed `db.force_compaction()?;` call:

Reviewer: dq: This is not limited to archive nodes, but we have observed it while running archives?

Member Author: I had tested this before mainly with warp-synced nodes, i.e. not a node that has a 600GB disk 🙈 That it takes ages was detected while trying to switch to the stable25rc1 release on our westend nodes.

Reviewer: Makes sense, this must be the fix for the westend nodes that were killed by the keep-alive/health-check services 🙏

Member Author: Yes, this is the fix for them.

Member Author: I tested it locally with the archive db. Before, my PC was not able to compact it in any reasonable amount of time.
-Ok(db)
+})
}

/// Internal api to open a database in primary mode.
@@ -472,21 +461,7 @@ impl Database {
}
self.stats.tally_bytes_written(stats_total_bytes as u64);

-let res = cfs.db.write_opt(batch, &self.write_opts).map_err(other_io_err)?;
-
-// If we have written more data than what we want to have stored in a `sst` file, we force compaction.
-// We also ensure that we only compact once per minute.
-//
-// Otherwise, rocksdb read performance is going down, after e.g. a warp sync.
-if stats_total_bytes > self.config.compaction.initial_file_size as usize &&
-    self.last_compaction.lock().elapsed() > Duration::from_secs(60)
-{
-    self.force_compaction()?;
-
-    *self.last_compaction.lock() = Instant::now();
-}
-
-Ok(res)
+cfs.db.write_opt(batch, &self.write_opts).map_err(other_io_err)
}

/// Get value by key.
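Editor's note: with this hunk, `Database::write` no longer triggers any compaction implicitly. A minimal sketch (not part of this PR) of how an embedder could reproduce the removed once-per-minute throttling on top of the new public `force_compact` follows; the `ThrottledCompactor` type, the column index, and the 60-second interval are assumptions, not API from this crate:

```rust
use std::{io, time::{Duration, Instant}};

use kvdb_rocksdb::Database;
use parking_lot::Mutex;

/// Hypothetical helper: compacts a column at most once per minute,
/// mirroring the throttling logic this PR removed from `write`.
struct ThrottledCompactor {
	last_compaction: Mutex<Instant>,
}

impl ThrottledCompactor {
	fn maybe_compact(&self, db: &Database, col: u32) -> io::Result<()> {
		// Only compact if the last compaction is more than a minute old.
		if self.last_compaction.lock().elapsed() > Duration::from_secs(60) {
			db.force_compact(col)?;
			*self.last_compaction.lock() = Instant::now();
		}
		Ok(())
	}
}
```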
@@ -606,25 +581,23 @@ impl Database {
self.inner.db.try_catch_up_with_primary().map_err(other_io_err)
}

-/// Force compacting the entire db.
-fn force_compaction(&self) -> io::Result<()> {
+/// Force compact a single column.
+///
+/// After compaction of the column, this may lead to better read performance.
+pub fn force_compact(&self, col: u32) -> io::Result<()> {
let mut compact_options = CompactOptions::default();
compact_options.set_bottommost_level_compaction(rocksdb::BottommostLevelCompaction::Force);

-// Don't ask me why we can not just use `compact_range_opt`...
-// But we are forced to trigger compaction on every column. Actually we only need this for the `STATE` column,
-// but we don't know which one this is here. So, we just iterate all of them.
-for col in 0..self.inner.column_names.len() {
-    self.inner
-        .db
-        .compact_range_cf_opt(self.inner.cf(col)?, None::<Vec<u8>>, None::<Vec<u8>>, &compact_options);
-}
+self.inner.db.compact_range_cf_opt(
+    self.inner.cf(col as usize)?,
+    None::<Vec<u8>>,
+    None::<Vec<u8>>,
+    &compact_options,
+);
Ok(())
}
}

-// duplicate declaration of methods here to avoid trait import in certain existing cases
+// Duplicate declaration of methods here to avoid trait import in certain existing cases
// at time of addition.
impl KeyValueDB for Database {
fn get(&self, col: u32, key: &[u8]) -> io::Result<Option<DBValue>> {
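For reference, a hedged usage sketch of the API this PR exposes: opening a database and compacting one column explicitly, for example after a bulk import such as a warp sync. The column count, the column index, and the path are illustrative assumptions, not values from this PR:

```rust
use kvdb_rocksdb::{Database, DatabaseConfig};

fn open_and_compact() -> std::io::Result<Database> {
	// Two columns; column 1 stands in for a state column here.
	let config = DatabaseConfig::with_columns(2);
	let db = Database::open(&config, "./db")?;

	// Compaction now runs only when the caller asks for it, so a large
	// database no longer blocks `open` or every `write` on a full compaction.
	db.force_compact(1)?;
	Ok(db)
}
```

This matches the direction of the diff: the caller decides which column to compact and when, instead of the crate compacting every column on open and periodically during writes.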