Conversation

@mrkbac (Contributor) commented May 3, 2025

Changelog

Updates the `recover` CLI command so that it no longer decompresses every chunk, making it 4x-27x faster

Description

This PR updates the `recover` command.

It copies chunks as-is without decompressing them, only decompressing a chunk if it contains unknown channels.
If the last record in the MCAP is a message index, the last chunk is also decompressed and its message indexes are recreated, ensuring no indexes are missing.
This makes it 4x-27x faster than the regular command.
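
To illustrate the idea (a hedged sketch, not the code in this PR): an MCAP record is a 1-byte opcode followed by a little-endian uint64 body length and the body, so a chunk record can be forwarded to the output verbatim, compressed payload and all. The package name and the needsRewrite callback below are hypothetical; the opcode value comes from the MCAP spec.

package mcaprecover

import (
	"encoding/binary"
	"errors"
	"io"
)

const opChunk = 0x06 // Chunk record opcode from the MCAP spec

// copyRecords streams MCAP records from r to w; the 8-byte magic at the start
// of the file is assumed to have been consumed and re-emitted by the caller.
// Each record is a 1-byte opcode, a uint64 little-endian body length, and the
// body. Chunk records are forwarded byte-for-byte unless needsRewrite reports
// that the chunk must be decompressed, e.g. because it references channels
// that were never defined.
func copyRecords(r io.Reader, w io.Writer, needsRewrite func(chunkBody []byte) bool) error {
	var hdr [9]byte
	for {
		if _, err := io.ReadFull(r, hdr[:]); err != nil {
			if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
				return nil // truncated file: stop at the last complete record
			}
			return err
		}
		body := make([]byte, binary.LittleEndian.Uint64(hdr[1:]))
		if _, err := io.ReadFull(r, body); err != nil {
			if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
				return nil // record cut off at the end of a damaged file
			}
			return err
		}
		if hdr[0] == opChunk && needsRewrite(body) {
			// Slow path (omitted): decompress the chunk, drop records for
			// unknown channels, and rewrite it with fresh message indexes.
			continue
		}
		// Fast path: copy the record as-is, compressed chunk payload included.
		if _, err := w.Write(hdr[:]); err != nil {
			return err
		}
		if _, err := w.Write(body); err != nil {
			return err
		}
	}
}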

Benchmarks (MacBook Pro M1):

300MB File:

hyperfine \
  --warmup 2 \
  --runs 10 \
  'go run cli/mcap/main.go recover 300MB.mcap -o /dev/null' \
  'mcap recover 300MB.mcap -o /dev/null'
Benchmark 1: go run cli/mcap/main.go recover 300MB.mcap -o /dev/null
  Time (mean ± σ):     496.2 ms ±   9.1 ms    [User: 435.7 ms, System: 2779.0 ms]
  Range (min … max):   477.2 ms … 513.6 ms    10 runs
 
Benchmark 2: mcap recover 300MB.mcap -o /dev/null
  Time (mean ± σ):      2.170 s ±  0.026 s    [User: 3.226 s, System: 2.814 s]
  Range (min … max):    2.134 s …  2.213 s    10 runs
 
Summary
  go run cli/mcap/main.go recover 300MB.mcap -o /dev/null ran
    4.37 ± 0.10 times faster than mcap recover 300MB.mcap -o /dev/null

19GB File:

hyperfine \
  --warmup 1 \
  --runs 3 \
  'go run cli/mcap/main.go recover 19GB.mcap -o /dev/null' \
  'mcap recover 19GB.mcap -o /dev/null'
Benchmark 1: go run cli/mcap/main.go recover 19GB.mcap -o /dev/null
  Time (mean ± σ):      4.685 s ±  0.024 s    [User: 5.386 s, System: 5.382 s]
  Range (min … max):    4.666 s …  4.712 s    3 runs
 
Benchmark 2: mcap recover 19GB.mcap -o /dev/null
  Time (mean ± σ):     128.508 s ±  0.340 s    [User: 205.740 s, System: 9.388 s]
  Range (min … max):   128.152 s … 128.829 s    3 runs
 
Summary
  go run cli/mcap/main.go recover 19GB.mcap -o /dev/null ran
   27.43 ± 0.16 times faster than mcap recover 19GB.mcap -o /dev/null

This PR also introduces `mcap info` support on unindexed MCAPs and `mcap recover-in-place`.

@mrkbac mrkbac requested a review from james-rms as a code owner May 3, 2025 12:13
@james-rms (Collaborator)

Hi @mrkbac - thanks for your contribution, this is a really nice idea.

I don't think it makes sense to keep this as a separate subcommand - recover should do this by default where appropriate.

"appropriate" could be considered as any MCAP containing both chunk and message indexes, where no chunk contains unknown channels.

I'd be in favor of making this the default behavior, and adding a --always-decompress-chunks flag to disable this behavior.
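
Purely as an illustration of that decision (the type, function, and parameter names below are hypothetical; only the --always-decompress-chunks flag name comes from the comment above):

package mcaprecover

// recoverOptions sketches the proposed default: copy chunks as-is only when
// it is safe to do so, with an opt-out flag for full decompression.
type recoverOptions struct {
	alwaysDecompressChunks bool // proposed --always-decompress-chunks flag
}

func useFastPath(opts recoverOptions, hasChunkIndexes, hasMessageIndexes, hasUnknownChannels bool) bool {
	if opts.alwaysDecompressChunks {
		return false // user explicitly asked for full decompression
	}
	// "Appropriate": chunk and message indexes are present and every chunk's
	// channels are known, so chunks can be copied without inspecting them.
	return hasChunkIndexes && hasMessageIndexes && !hasUnknownChannels
}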

@mrkbac (Contributor, Author) commented May 15, 2025

Hey @james-rms, I've updated the PR and replaced recover with the new chunk-based version.
`-a` always decompresses the chunks and regenerates the message indexes, but it still copies the original chunks over to the new file, so it is still about 20% faster than the current version (a rough sketch of the index regeneration follows after the list below).

There are a few differences from the original recover command; let me know if I should add them:

  • No chunk-size and compression flags, since I just copy the chunks from the source
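
As a rough sketch of what regenerating a chunk's message indexes involves (based on the MCAP spec's record layouts, with hypothetical names, not this PR's code): walk the decompressed chunk records and, for each Message record, collect its channel ID, log time, and byte offset; per channel, that is what a MessageIndex record stores.

package mcaprecover

import "encoding/binary"

// messageIndexEntry pairs a message's log time with its byte offset inside the
// chunk's uncompressed records section, as stored in MCAP MessageIndex records.
type messageIndexEntry struct {
	logTime uint64
	offset  uint64
}

// rebuildMessageIndexes walks the uncompressed records of one chunk and groups
// (log_time, offset) pairs per channel. Layouts follow the MCAP spec: each
// record is a 1-byte opcode, a uint64 LE length, then the body; a Message body
// starts with channel_id (uint16), sequence (uint32), log_time (uint64).
func rebuildMessageIndexes(records []byte) map[uint16][]messageIndexEntry {
	const opMessage = 0x05
	indexes := make(map[uint16][]messageIndexEntry)
	for pos := uint64(0); pos+9 <= uint64(len(records)); {
		op := records[pos]
		length := binary.LittleEndian.Uint64(records[pos+1 : pos+9])
		bodyStart := pos + 9
		if bodyStart+length > uint64(len(records)) {
			break // chunk contents end mid-record; stop at the last complete one
		}
		if op == opMessage && length >= 22 { // 22 bytes = minimum Message body size
			body := records[bodyStart : bodyStart+length]
			channelID := binary.LittleEndian.Uint16(body[0:2])
			logTime := binary.LittleEndian.Uint64(body[6:14])
			indexes[channelID] = append(indexes[channelID], messageIndexEntry{
				logTime: logTime,
				offset:  pos, // offset of the record within the uncompressed chunk
			})
		}
		pos = bodyStart + length
	}
	return indexes
}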

@mrkbac mrkbac changed the title from mcap recover-fast CLI command to Faster mcap recover CLI command May 15, 2025
@james-rms (Collaborator)

There's a lot here - I have a few comments, but it's hard to parse out which changes depend on them. Could you pull the changes for info, doctor, and recover-in-place out of this one and into separate PRs?

@mrkbac mrkbac requested a review from gasmith as a code owner May 27, 2025 07:11
@mrkbac (Contributor, Author) commented May 27, 2025

Sure, I've removed them from here. Let me know if I should change anything else.

@james-rms james-rms merged commit 703df26 into foxglove:main Jul 31, 2025
28 checks passed
@jhurliman (Contributor)

"mcap info support on unindexed mcaps"

Legend. Is there a separate PR open for this?

@mrkbac (Contributor, Author) commented Aug 15, 2025

Not yet; I removed it from here to keep this PR focused. I'll try to open one today or tomorrow.
