
Conversation

Member

@philipphofmann philipphofmann commented Sep 24, 2025

This RFC aims to develop a strategy for SDKs to prevent the loss of logs when an application terminates abnormally, such as a crash or watchdog termination.

Rendered RFC

@philipphofmann philipphofmann marked this pull request as ready for review September 25, 2025 06:23
@philipphofmann philipphofmann requested a review from a team September 25, 2025 12:31
Member

@AbhiPrasad AbhiPrasad left a comment

I think it would be good to add a note about recording client outcomes for this, for example if the FIFO queue is full.

Everything else lgtm!

1) if the application crashes between storing the envelope and deleting a `batch-processor-cache-file` or `log-crash-recover-file`.
2) if a crash occurs between step 2 and before step 3 of the FIFO queue process, where logs might exist in both the `batch-processor-cache-file` and the FIFO queue.

In both cases, the SDK will send duplicate logs on the next launch. While this isn't acceptable long-term, we accept it for now because solving this correctly is complicated and we don't currently handle this edge case for other telemetry data such as crashes. If duplicate logs become problematic, we can implement database-style atomic operations using marker files or something similar to prevent duplication.
Member

Can we solve this duplicate issue by attaching item IDs to logs? If so, I'd be open to doing that; the protocol can easily be expanded to allow SDKs to set this.

Member Author

Yes, we can. This would be pretty easy for 2), but for 1) we would somehow need to hold back sending envelopes until we run this check, which could be a bit tricky but doable. Do you think it's worth the effort to do this already in the first iteration?

Member

Do you think it's worth the effort to already do this in the first iteration?

Yes, because duplicate log data both has a billing impact (users pay for two logs instead of one) and risks reducing trust in the product if users are onboarding onto Sentry for the first time.

Member Author

OK, let's do it, even though it's some extra logic I only wanted to add if needed. Will update the RFC.
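
To make the item-ID approach concrete, here is a minimal sketch of what dedup on the next launch could look like. All names (`LogItem`, `LogDeduplicator`) are hypothetical and not part of the Sentry SDK or protocol; the point is only that logs recovered from the `batch-processor-cache-file` or `log-crash-recover-file` get filtered against the IDs of logs already handed to the transport.

```swift
import Foundation

// Hypothetical log item carrying a unique ID, as discussed above.
struct LogItem: Codable {
    let id: UUID
    let body: String
    let timestamp: Date
}

// Minimal dedup sketch: before re-sending logs recovered from disk,
// drop every item whose ID was already part of a sent envelope.
final class LogDeduplicator {
    private var sentIDs = Set<UUID>()

    // Called when an envelope containing these logs was handed to the transport.
    func markSent(_ items: [LogItem]) {
        sentIDs.formUnion(items.map(\.id))
    }

    // Called on the next launch for logs recovered from disk.
    func filterRecovered(_ items: [LogItem]) -> [LogItem] {
        items.filter { !sentIDs.contains($0.id) }
    }
}
```

As noted above for case 1), this only works if sending stored envelopes is held back until the recovered logs have been filtered.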


When the BatchProcessor receives a log, it performs the following steps:

1. Put the log into the FIFO queue on the calling thread.
Member

Do we have recommendations on queue size? Should we adjust this based on backpressure (like the amount of logs being recorded on the calling threads)?

Member Author

I just picked 64 for now. As we're immediately writing these to disk, I think this should be sufficient.

Member

I think 64 will not be enough, depending on the time interval. During startup an app can easily log more than that in a short time span, e.g. due to bootstrapping services.
But 64 is a relative value, so we should clarify what it is relative to.

Member Author

@philipphofmann philipphofmann Sep 30, 2025

For the FIFO queue we don't have a time interval. The SDK has to write the log to disk immediately after adding it to the queue, so there is no time interval for storing to disk. I updated the wording. Do you still believe 64 isn't enough in that case, @philprime and @AbhiPrasad?

Member Author

@AbhiPrasad, please have another look now.

If SDKs are using a flush interval of 5 seconds, that gives us about 12-13 logs per second over a span of 5 seconds.
I'd say it's a nice number for small/medium projects, but it would be nice if users could configure it.

Member Author

@philipphofmann philipphofmann Oct 2, 2025

@lucas-zimerman, yes, we can make this configurable, but I would only do so if required. As already pointed out, this number is for the in-memory FIFO queue. When users add logs, the SDKs first put them into the in-memory FIFO queue and then immediately store them to disk asynchronously. So the in-memory FIFO queue will only overflow if the bg thread can't keep up storing the logs to disk and then flushing them out. So even if a user adds 100 or even 1000 logs per second, SDKs should be able to keep up.
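
To illustrate the reasoning above, a minimal sketch of the bounded in-memory FIFO queue with a capacity of 64, including a hypothetical hook for the client-outcome note from earlier in the thread. The types and the `recordLostLog` callback are illustrative, not the actual SDK API.

```swift
import Foundation

// Sketch of the bounded in-memory FIFO queue described above (capacity 64).
// `recordLostLog` is a hypothetical hook for recording a client outcome
// when a log is dropped because the queue is full.
final class LogFIFOQueue {
    private let capacity = 64
    private var buffer: [String] = []   // serialized logs
    private let lock = NSLock()

    /// Returns false and records a lost log when the queue is full, which
    /// only happens if the background thread can't keep up writing logs
    /// to the batch-processor-cache-file.
    func enqueue(_ serializedLog: String, recordLostLog: () -> Void) -> Bool {
        lock.lock(); defer { lock.unlock() }
        guard buffer.count < capacity else {
            recordLostLog()
            return false
        }
        buffer.append(serializedLog)
        return true
    }

    /// Called by the background thread after the log was persisted to disk.
    func dequeue() -> String? {
        lock.lock(); defer { lock.unlock() }
        return buffer.isEmpty ? nil : buffer.removeFirst()
    }
}
```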

Member

@philprime philprime left a comment

Solid RFC, no other feedback than the already open discussions.

Member

@markushi markushi left a comment

Left a few minor comments, but this is already in very good shape!


@antonis antonis left a comment

The suggested approach looks solid to me! Thank you for driving this @philipphofmann 🙇

Also looping in @lucas-zimerman since he has more context on the RN Logs implementation and for awareness.


The BatchProcessor maintains its logic of batching multiple logs together into a single envelope to avoid multiple HTTP requests.

Hybrid SDKs pass every log down to the native SDKs, which will put every log in their BatchProcessor and its cache.

With that said, will the batch processor only serve as a queue for sending logs, or will it also invoke integrations for parsing the logs, like beforelog on the native side?

Are there hybrid SDKs using the batch processor at the moment?

Member Author

@philipphofmann philipphofmann Oct 2, 2025

batch processor only serve as a queue for sending logs

Only for sending logs. Logs go into the BatchProcessor after going through all integrations, beforeSend, etc. I updated the wording. Is it clear now, @lucas-zimerman?

@lucas-zimerman

Left a few minor comments, but this is looking good!


1. Put the log into the FIFO queue on the calling thread.
2. On a background thread, serialize the next log of the FIFO queue and store it in the `batch-processor-cache-file`.
3. Remove the log from the FIFO queue.
Member

I guess I'm a bit confused here - why do we want to remove the log from the queue? Or is this queue a different one from what we already have in the BatchProcessor?

My understanding was that, in addition to what we already have (the existing queue), we would just spin up a task to store the same log in a file. If the app crashes and the in-memory logs are lost, we'd use the file on the next launch to send the leftovers.

Member Author

When a crash occurs, the SDKs write the logs in the FIFO queue to the log-crash-recover-file and send these logs on the next SDK launch.

@romtsn, the FIFO queue exists to prevent log loss when a crash occurs immediately after logging.

logger.trace("Starting database connection")

// The above log must show up in Sentry
SentrySDK.crash()

If we don't use async-safe memory and only store the logs to disk on a BG thread, we can't guarantee that on Cocoa. Can you on Java?
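
For context, a deliberately simplified sketch of the crash-time path described here. A real implementation has to stay async-signal-safe: it would keep the queued logs as pre-serialized bytes in pre-allocated memory, open the file descriptor for the `log-crash-recover-file` at SDK init, and only call `write(2)` inside the handler. The Swift collections below are just for readability, and all names are illustrative, not the actual SDK.

```swift
import Darwin
import Foundation

// Sketch of the crash-time path: on crash, dump whatever is still in the
// in-memory FIFO queue to the log-crash-recover-file so the next launch
// can send it.
enum LogCrashRecovery {
    static var recoverFileDescriptor: Int32 = -1     // opened at SDK init
    static var pendingSerializedLogs: [[UInt8]] = [] // filled by the FIFO queue

    // Called from the crash/watchdog handler; only uses write(2).
    static func writePendingLogs() {
        guard recoverFileDescriptor >= 0 else { return }
        for bytes in pendingSerializedLogs {
            bytes.withUnsafeBufferPointer { buf in
                _ = write(recoverFileDescriptor, buf.baseAddress, buf.count)
            }
        }
    }
}
```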

Member

@romtsn romtsn Oct 3, 2025

As discussed in DMs, let's figure out the details later when you move this to the develop docs. My concern was that we're doing double the work here: removing log entries from the in-memory queue, but then also loading them from disk into memory again (step 4 below), which entails more I/O work.

So why not keep them in memory, use the in-memory queue as the main source, and only use the disk cache for exceptional cases (like crashes/watchdog terminations)?

Member Author

Yep, let's figure this out once we add this to the develop docs. Thanks for bringing it up @romtsn.
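
For when this moves to the develop docs, a minimal sketch of the alternative described above: the in-memory queue stays the source of truth for normal flushing, and the cache file is append-only on the happy path and only read back after an abnormal exit. Everything here (names, file handling) is illustrative, not a decided design.

```swift
import Foundation

// Sketch of the alternative above: the in-memory buffer is the main source;
// the cache file is a write-only crash backup during normal operation.
final class LogBuffer {
    private var inMemory: [Data] = []   // serialized logs
    private let cacheFileURL: URL       // batch-processor-cache-file
    private let queue = DispatchQueue(label: "log-buffer")

    init(cacheFileURL: URL) {
        self.cacheFileURL = cacheFileURL
    }

    func add(_ serializedLog: Data) {
        queue.async {
            self.inMemory.append(serializedLog)
            // Append to disk purely as a crash backup; the normal flush
            // path never reads this file back.
            self.append(serializedLog, to: self.cacheFileURL)
        }
    }

    /// Normal flush: build the envelope from memory, then reset the backup.
    func flush(send: @escaping ([Data]) -> Void) {
        queue.async {
            send(self.inMemory)
            self.inMemory.removeAll()
            try? Data().write(to: self.cacheFileURL)   // truncate the backup
        }
    }

    private func append(_ data: Data, to url: URL) {
        guard let handle = try? FileHandle(forWritingTo: url) else {
            try? data.write(to: url)   // file doesn't exist yet
            return
        }
        defer { handle.closeFile() }
        _ = handle.seekToEndOfFile()
        handle.write(data + Data("\n".utf8))
    }
}
```

This would avoid the double I/O mentioned above at the cost of keeping all unsent logs in memory until the next flush.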

Member

@AbhiPrasad AbhiPrasad left a comment

lgtm!

philipphofmann and others added 2 commits October 3, 2025 06:10
Member

@romtsn romtsn left a comment

One point for later, but LGTM otherwise!

@philipphofmann philipphofmann merged commit 71ce99e into main Oct 3, 2025
6 checks passed
@philipphofmann philipphofmann deleted the rfc/logs-for-crashes branch October 3, 2025 13:42