Please compress the database dump #354

@MayeulC

Describe the bug

My Synapse database is big. Database dumps can take more than 100 GB. My latest one (single-user server) is 32.8 GB.
Writing that file to disk and reading it back takes a long time, not to mention the wasted disk space.

Backups used to be compressed, but back then they were first tar-ed, then compressed. Both stages took a while.

Suggested solution

Pipe the PostgreSQL dump to a fast, multithreaded compressor:

ynh_psql_dump_db --database="$synapse_db_name" > ${YNH_CWD}/dump.sql

Change this to

ynh_psql_dump_db --database="$synapse_db_name" | zstd -3 -T0 > "${YNH_CWD}/dump.sql"

(Ideally this would be combined with nice and ionice, like this:)

ynh_psql_dump_db --database="$synapse_db_name" | ionice -n 6 nice zstd -3 -T0 -o "${YNH_CWD}/dump.sql"
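
One caveat with the pipeline above: by default bash only reports the exit status of the last command in a pipeline, so a failed dump would go unnoticed as long as the compressor exits cleanly. Below is a minimal sketch of a wrapper that enables `pipefail` for the pipeline, assuming the backup script runs under bash. The `dump_compressed` function and the `compressor` array are hypothetical names introduced for illustration, not existing YunoHost helpers.

```shell
# Sketch (bash): stream a dump through a compressor, failing if either side fails.
# The compressor is kept in an array so that it can be swapped out.
compressor=(zstd -3 -T0)

# Usage: dump_compressed <output file> <dump command> [args...]
dump_compressed() {
    local out="$1"
    shift
    (
        # Without pipefail, only the compressor's exit status would be reported.
        set -o pipefail
        "$@" | "${compressor[@]}" > "$out"
    )
}
```

It could then be called as `dump_compressed "${YNH_CWD}/dump.sql" ynh_psql_dump_db --database="$synapse_db_name"`. The subshell keeps the `pipefail` setting from leaking into the rest of the script.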

On the restore side:

ynh_psql_execute_file_as_root --file="${YNH_CWD}/dump.sql" --database="$synapse_db_name"

Hmm, this is not as straightforward. Either create a FIFO with mkfifo and pass it as the path, or change the helper (or introduce a new one) so that it does not do the redirection itself:
https://github.com/YunoHost/yunohost/blob/4b9e26b974b0cc8f7aa44fd773537508316b8ba6/helpers/postgresql#L78-L79
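
The FIFO option could look roughly like the sketch below, assuming bash and a helper that simply reads the file at the path it is given. `restore_via_fifo` and the `decompressor` array are hypothetical names introduced for illustration. The consumer is passed as a command that receives the FIFO path as its last argument.

```shell
# Sketch (bash): feed a compressed dump to a command that only accepts a file path.
# The decompressor is kept in an array so that it can be swapped out.
decompressor=(zstd -dc)

# Usage: restore_via_fifo <compressed dump> <consumer command> [args...]
# The consumer is called with the FIFO path appended as its last argument.
restore_via_fifo() {
    local archive="$1"
    shift
    local tmpdir fifo writer status=0
    tmpdir="$(mktemp -d)" || return 1
    fifo="$tmpdir/dump.fifo"
    mkfifo "$fifo" || { rm -rf "$tmpdir"; return 1; }

    # Decompress into the FIFO in the background.
    # The writer blocks until the consumer opens the FIFO for reading.
    "${decompressor[@]}" "$archive" > "$fifo" &
    writer=$!

    if ! "$@" "$fifo"; then
        status=1
        # The consumer may have failed before opening the FIFO: unblock the writer.
        kill "$writer" 2>/dev/null
    fi
    # Also report a failed decompression.
    wait "$writer" || status=1

    rm -rf "$tmpdir"
    return "$status"
}
```

With a small wrapper such as `restore_from() { ynh_psql_execute_file_as_root --file="$1" --database="$synapse_db_name"; }`, the restore would become `restore_via_fifo "${YNH_CWD}/dump.sql" restore_from`. Changing the helper to read from standard input would avoid the FIFO bookkeeping altogether.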

Expected gains

zstd level 3 gets an old dump from 17 GB down to 3.8 GB. Level 7 only gets this down to 3.5 GB. Level 1 (minimum, fastest) reaches 4.2 GB.
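
To put these sizes in perspective, the short sketch below turns them into compression ratios and space savings. The sizes are the ones quoted above; `compression_summary` is a name made up for this example.

```shell
# Sketch: compression ratio and space saved per zstd level, from the sizes above.
# Input columns: level, compressed size in GB. The original dump is 17 GB.
compression_summary() {
    # LC_ALL=C keeps "." as the decimal separator whatever the locale.
    LC_ALL=C awk -v orig=17 '{
        printf "level %s: %.1fx smaller, %.0f%% saved\n", $1, orig / $2, 100 * (1 - $2 / orig)
    }' <<'EOF'
1 4.2
3 3.8
7 3.5
EOF
}

compression_summary
```

All three levels land between a 4x and 5x reduction, so the choice of level matters much less than compressing at all.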

Both archive creation and dump time should be faster as less data needs to be written to disk, especially for hard disks.

Sample runs

These runs were collected with some I/O in the background (synapse restoration in progress).

# time cat tmp/apps/synapse/backup/dump.sql |zstd -T0 -1 > test1.zst
cat tmp/apps/synapse/backup/dump.sql  0,20s user 20,29s system 14% cpu 2:24,87 total
zstd -T0 -1 > test1.zst  110,79s user 10,31s system 83% cpu 2:24,89 total
# time cat tmp/apps/synapse/backup/dump.sql |zstd -T0 -3 > test3.zst
cat tmp/apps/synapse/backup/dump.sql  0,25s user 15,99s system 10% cpu 2:32,51 total
zstd -T0 -3 > test3.zst  120,20s user 7,80s system 83% cpu 2:32,58 total
# time cat tmp/apps/synapse/backup/dump.sql |zstd -T0 -7 > test7.zst
cat tmp/apps/synapse/backup/dump.sql  0,17s user 16,89s system 8% cpu 3:16,85 total
zstd -T0 -7 > test7.zst  630,73s user 7,80s system 324% cpu 3:17,04 total
# time cat tmp/apps/synapse/backup/dump.sql > test0.sql
cat tmp/apps/synapse/backup/dump.sql > test0.sql  0,17s user 21,84s system 10% cpu 3:33,71 total

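This suggests that level 3 is probably good enough, and that compressing barely adds any time to the operation, if any: in these runs, every compressed copy (2:25, 2:33 and 3:17) finished faster than the plain copy (3:34). That is at least the case on my machine (powerful CPU, relatively slow disk for the backup partition).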
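
To repeat this comparison on another machine, a loop along the following lines could be used. It is only a sketch: `bench_levels` is a hypothetical name, timing is done with whole seconds from `date`, and it assumes that `zstd` is installed and that the dump already exists on disk.

```shell
# Sketch (bash): compress one file at several zstd levels, printing size and time.
# Usage: bench_levels <input file> <level> [level...]
bench_levels() {
    local input="$1"
    shift
    local level out start end size
    for level in "$@"; do
        out="$(mktemp)" || return 1
        start="$(date +%s)"
        # -T0 uses all cores, -q silences progress output, -c writes to stdout.
        zstd -T0 "-$level" -q -c "$input" > "$out" || { rm -f "$out"; return 1; }
        end="$(date +%s)"
        size="$(wc -c < "$out" | tr -d ' ')"
        printf 'level %s: %s bytes in %ss\n' "$level" "$size" "$((end - start))"
        rm -f "$out"
    done
}
```

For example, `bench_levels tmp/apps/synapse/backup/dump.sql 1 3 7` would cover the three levels measured above. Results depend heavily on background I/O and on which disk holds the temporary files.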
