Skip to content

Conversation

@sparrc
Copy link

@sparrc sparrc commented Nov 5, 2025

Fixes #11113

There were a number of use-after-free and invalid memory access bugs found via the above issue that occur when the forward input pauses due to mem_buf_limit being full. This PR attempts to fix all of those errors to avoid segfaults in the forward input under heavy load. See valgrind output below confirming no errors found after running for 30 minutes at high load.

When using the forward input plugin with a low Mem_Buf_Limit, Fluent Bit would crash with a segmentation fault when memory limits were hit and the plugin was paused. The issue was that fw_conn_del_all() would free all connections immediately, but event handlers were still running and trying to access that freed memory.

Summary

This commit solves the problem by tracking what the plugin instance is doing at any moment and delaying connection cleanup if needed. It adds a state field to the plugin context with four possible values:

  • FW_INSTANCE_STATE_RUNNING - normal operation
  • FW_INSTANCE_STATE_ACCEPTING_CLIENT - currently accepting a new connection
  • FW_INSTANCE_STATE_PROCESSING_PACKET - event handler is processing data
  • FW_INSTANCE_STATE_PAUSED - plugin has been paused

When in_fw_pause() is called, it checks the current state. If the plugin is currently accepting a client or processing a packet, it skips calling fw_conn_del_all() and just sets the state to PAUSED.

The in_fw_collect() function (which accepts new connections) and fw_conn_event() (which processes incoming data) both set their respective states when they start, then check if the state changed to PAUSED before they exit. If it did, they call fw_conn_del_all() themselves after they're done with the connection.

This ensures that connections are never freed while code is actively using them. The cleanup just gets deferred until the current operation finishes, preventing the use-after-free that was causing the crash.

Problem

The plugin could crash when a connection was deleted (e.g., during pause/shutdown) while messages were still being processed.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change

see issue report.

  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found
==150631== 
==150631== HEAP SUMMARY:
==150631==     in use at exit: 0 bytes in 0 blocks
==150631==   total heap usage: 471,017 allocs, 471,017 frees, 5,381,700,270 bytes allocated
==150631== 
==150631== All heap blocks were freed -- no leaks are possible
==150631== 
==150631== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@coderabbitai

This comment was marked as outdated.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
plugins/in_forward/fw_conn.c (1)

162-163: Redundant initialization detected.

Line 163 sets conn->handshake_status = FW_HANDSHAKE_ESTABLISHED, but line 172 does the same. The second assignment is meant to be the default before the conditional logic starting at line 173. The first assignment on line 163 appears redundant.

Remove the redundant assignment:

     conn->being_deleted = 0;
-    conn->handshake_status = FW_HANDSHAKE_ESTABLISHED;
     /*
      * Always force the secure-forward handshake when:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8579d63 and e4572e6.

📒 Files selected for processing (3)
  • plugins/in_forward/fw_conn.c (5 hunks)
  • plugins/in_forward/fw_conn.h (1 hunks)
  • plugins/in_forward/fw_prot.c (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
plugins/in_forward/fw_prot.c (1)
src/flb_sds.c (1)
  • flb_sds_destroy (389-399)
plugins/in_forward/fw_conn.c (1)
include/fluent-bit/flb_mem.h (1)
  • flb_free (126-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-without-cxx (3.31.6)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
🔇 Additional comments (5)
plugins/in_forward/fw_prot.c (2)

1079-1082: Good defensive check, but memory ordering concerns remain.

The early safety check properly prevents processing when the connection is being deleted or the buffer is NULL, returning 0 to signal no data received. This prevents use-after-free and null pointer dereferences.

However, the check relies on reading conn->being_deleted without synchronization (see related comment on fw_conn.h). If this flag is not atomic or volatile, the check might not see an in-progress deletion due to memory ordering issues.


1296-1312: TOCTOU gap is valid, but proposed fix is incomplete.

The race condition is real: fw_conn_del() is called from event handlers without mutex protection and freely deallocates the connection object, while fw_prot_process() checks being_deleted outside any mutex lock. Between the unprotected check at line 1297 and resource access at line 1314, a concurrent deletion can occur.

The proposed fix helps by moving the being_deleted check inside conn_mutex, but it doesn't fully resolve the issue because:

  1. fw_conn_del() itself (fw_conn.c:245) does not acquire conn_mutex before calling flb_free(conn) at line 284
  2. The gap persists after the mutex is released at line 1312 and before resources are accessed at line 1314
  3. being_deleted serves as a graceful-shutdown flag rather than complete synchronization

A complete fix requires either protecting fw_conn_del() with conn_mutex, or ensuring fw_prot_process() is always called with the mutex held throughout all resource access, not just the brief check window.

plugins/in_forward/fw_conn.c (3)

48-51: Good defensive check.

Validating that the connection object is not NULL before proceeding prevents null pointer dereferences. This is especially important when the connection might have been cleaned up asynchronously.


247-251: Excellent placement of the deletion flag.

Setting being_deleted = 1 at the very start of the deletion function, before any cleanup operations, is the correct approach. This ensures that any in-flight operations checking this flag will see the deletion state as early as possible, reducing the window for use-after-free scenarios.


264-264: Excellent defensive cleanup pattern.

Null-setting pointers after freeing them is a solid defensive programming practice that prevents:

  • Double-free vulnerabilities: If fw_conn_del is somehow called twice, the null checks prevent attempting to free already-freed memory
  • Use-after-free detection: Any code attempting to use these pointers after deletion will get a null pointer dereference instead of accessing freed memory, making bugs easier to detect

The guarded free for conn->buf (checking if it's not NULL before freeing) is also good practice.

Also applies to: 275-275, 278-282

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

coderabbitai[bot]

This comment was marked as outdated.

@sparrc
Copy link
Author

sparrc commented Nov 7, 2025

@edsiper @leonardo-albertovich @cosmo0920

Hi all, this PR is ready for review, I've commented on the two remaining AI code review comments, please take a look, thanks!

@sparrc sparrc requested a review from a team November 7, 2025 17:12
@leonardo-albertovich
Copy link
Contributor

leonardo-albertovich commented Nov 7, 2025

Considering the threading model the changes don't make sense to me so please give me a chance to perform a complete review of the PR before merging it.

As a side note, the only situation in which inter-thread play could be a problem would entail the plugin running in threaded mode (which according to issue #1113 is not the case) and there being either incorrect signaling (ie. acting from the main thread instead of signaling) or incorrect processing of events (ie. adding sockets or timers to the main threads event loop instead of the plugins thread).

I mentioned these possible issues in case your original issue didn't completely reflect the deployment (or in case there's some funny code in fluent-bit enabling threaded mode for this plugin by default) but I lean more towards the root of the issue being something different.

I noticed that you mentioned that this triggers when the plugin intance is suspended due to it exceeding mem_buf_limit. Do you have a clear cut repro case I can use to accelerate the validation process?

@sparrc
Copy link
Author

sparrc commented Nov 7, 2025

Considering the threading model the changes don't make sense to me so please give me a chance to perform a complete review of the PR before merging it.

As a side note, the only situation in which inter-thread play could be a problem would entail the plugin running in threaded mode (which according to issue #1113 is not the case) and there being either incorrect signaling (ie. acting from the main thread instead of signaling) or incorrect processing of events (ie. adding sockets or timers to the main threads event loop instead of the plugins thread).

I mentioned these possible issues in case your original issue didn't completely reflect the deployment (or in case there's some funny code in fluent-bit enabling threaded mode for this plugin by default) but I lean more towards the root of the issue being something different.

I noticed that you mentioned that this triggers when the plugin intance is suspended due to it exceeding mem_buf_limit. Do you have a clear cut repro case I can use to accelerate the validation process?

Thanks @leonardo-albertovich for taking a look. I am not running in threaded mode. I see slightly different logs when I run in threaded mode when the plugin pauses input. I will try to repro in threaded mode too. EDIT: Im not able to reproduce this issue with the input in threaded mode.

For repro, this is what I've been running with:

  1. Fluent-bit config:
[SERVICE]
    Flush        1

[INPUT]
    Name forward
    Mem_Buf_Limit 4MB
    unix_path /var/run/fluent.sock

[OUTPUT]
    Name  file
    Match *
    Path  /home/ec2-user/fluent
  1. build and run fluent-bit:
cmake -DFLB_RELEASE=Off -DFLB_MTRACE=On -DFLB_DEV=On -DFLB_BACKTRACE=On -DFLB_DEBUG=On -DFLB_TRACE=On -DFLB_VALGRIND=On ..
make -j $(nproc)
$(pwd)/bin/fluent-bit -c fluent.conf
  1. Run high load log generator script on the socket. I am using a docker container to do this like below. There is no requirement to use docker and the fluentd log driver to do this, it was just the simplest way for me to write to a unix socket in the correct message format.
docker run -d --log-driver=fluentd --log-opt mode=non-blocking \
                      --log-opt fluentd-address=unix:///var/run/fluent.sock \
                      -e LOG_RATE_PER_SECOND=1000 -e LOG_SIZE_KB=7 \
                      public.ecr.aws/cssparr/loggen:latest
  1. I will usually see a segfault within 1-2 minutes of running these steps.
  2. If running with valgrind, you can see many errors even if you ctrl-C the process before the segfault occurs.
sudo valgrind -s --leak-check=full $(pwd)/bin/fluent-bit -c ~/fluent.conf
...

...
==478421==    by 0x541C723: clone (in /usr/lib64/libc.so.6)
==478421== 
==478421== ERROR SUMMARY: 123185 errors from 31 contexts (suppressed: 0 from 0)

@sparrc
Copy link
Author

sparrc commented Nov 7, 2025

I also have a python script for this if it helps. It doesnt seem able to push FLB hard enough to segfault (at least not in reasonable timeframes), but it does raise similar valgrind errors.

See https://gist.github.com/sparrc/cbf7882c9b2479d062fdeb116719a3ab

running the above script like

sudo pip install msgpack
sudo python ./main.py -s /var/run/fluent.sock --log-size 2048

@leonardo-albertovich
Copy link
Contributor

I have analyzed this issue and while the premise is correct I think there interpretation of the root cause is slightly off.

Firstly, due to how input plugins operate it's not possible for a race condition to be the cause of this because there is only one thread acting of the data (which I verified) which means the locks are unnecessary.

Secondly, the premature cleanup is happening during event processing which means it's much easier and cleaner to delay the cleanup process by keeping tabs on the plugins state as a whole rather than implementing reference counting.

Please check this branch I've created in my fork to assist you on the refactoring process. https://github.com/leonardo-albertovich/fluent-bit/tree/leonardo-issue_11113-fix

In my local setup I've used a fluent-bit server with the following configuration :

[SERVICE]
    Flush 20
    Grace 0

[INPUT]
    Name forward
    Mem_Buf_Limit 4MB
    unix_path /home/leo/fluent.sock

[OUTPUT]
    Name  stdout
    Match *

Which causes fluent-bit to buffer the received data 20 seconds before flushing it.

And a fluent-bit producer instance with this configuration :

[SERVICE]
    Flush 1
    Grace 0

[INPUT]
    Name dummy
    Rate 100
    Copies 100

[OUTPUT]
    Match *
    Name forward
    Workers 1
    unix_path /home/leo/fluent.sock

[OUTPUT]
    Match *
    Name forward
    Workers 1
    unix_path /home/leo/fluent.sock

[OUTPUT]
    Match *
    Name forward
    Workers 1
    unix_path /home/leo/fluent.sock

[OUTPUT]
    Match *
    Name forward
    Workers 1
    unix_path /home/leo/fluent.sock

[OUTPUT]
    Match *
    Name forward
    Workers 1
    unix_path /home/leo/fluent.sock

[OUTPUT]
    Match *
    Name forward
    Workers 1
    unix_path /home/leo/fluent.sock

[OUTPUT]
    Match *
    Name forward
    Workers 1
    unix_path /home/leo/fluent.sock

[OUTPUT]
    Match *
    Name forward
    Workers 1
    unix_path /home/leo/fluent.sock

[OUTPUT]
    Match *
    Name forward
    Workers 1
    unix_path /home/leo/fluent.sock

[OUTPUT]
    Match *
    Name forward
    Workers 1
    unix_path /home/leo/fluent.sock

Which basically floods the server with the same data over and over.

This easily exceeds the 4 MB limit and reliably reproduces the issue when running master as is.

Please don't hesitate to contact me if you have any questions or comments.

This change fixes a use after free issue related to connection
disposal which caused the event handler to access invalid
memory when the memory limits were exceeded during ingestion.

In order to overcome this issue we track the plugin instances
state and delay the connection cleanup process.

Signed-off-by: Leonardo Albertovich <[email protected]>
@sparrc sparrc force-pushed the sparrc-sigsegv-fix-1 branch from df6a298 to 9472157 Compare November 10, 2025 17:35
@sparrc
Copy link
Author

sparrc commented Nov 10, 2025

Thanks @leonardo-albertovich, that approach does seem quite a lot simpler and better, I have tested it and it fixes the issue for us as well. I have cherry-picked and pushed your commit onto this PR if that is alright with you, I will update the description now.

@sparrc sparrc changed the title in_forward: add race condition protection for connection deletion in_forward: fix connection release on pause memory corruption Nov 10, 2025
coderabbitai[bot]

This comment was marked as off-topic.

@sparrc
Copy link
Author

sparrc commented Nov 10, 2025

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +171 to +174
ctx->state = state_backup;

if (ctx->state == FW_INSTANCE_STATE_PAUSED) {
fw_conn_del_all(ctx);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore state before pause check prevents deferred cleanup

After pausing, in_fw_pause skips fw_conn_del_all() if a connection is being accepted and sets ctx->state to PAUSED so that the acceptor can clean up once it finishes. However in in_fw_collect the code unconditionally restores the previous state before checking for FW_INSTANCE_STATE_PAUSED. Because state_backup is almost always RUNNING, the fw_conn_del_all() path never runs and the state is reset back to running even when a pause happened during the accept. This leaves already-accepted connections open and the plugin marked as running despite being paused, so ingestion continues instead of being torn down.

Useful? React with 👍 / 👎.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think this is a valid problem because in in_fw_collect there's an if-statement that ensures that the connections are closed when the is_paused boolean is set:

if(ctx->is_paused) {
    flb_downstream_conn_release(connection);
    flb_plg_trace(ins, "TCP connection will be closed FD=%i", connection->fd);
    ctx->state = state_backup;
    return -1;
}

@leonardo-albertovich
Copy link
Contributor

Looks good to me, thanks for being open to my suggestions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SIGSEGV when using forward input plugin with low Mem_Buf_Limit

2 participants