-
Notifications
You must be signed in to change notification settings - Fork 1.8k
in_calyptia_fleet: load configuration on initial startup. #10996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds a wrapper collect callback and a two-stage startup: an initial oneshot fetch (with retry and scheduling pause/resume) handled by Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor FB as Fluent Bit Engine
participant PF as Fleet Plugin
participant MC as Main Collector
participant OC as Oneshot Collector
participant API as Fleet API
rect rgba(229,240,255,0.6)
note right of PF: Initialization
FB->>PF: in_calyptia_fleet_init()
alt Existing committed config
PF-->>FB: Proceed with normal operation
else Needs initial fetch
PF->>MC: Pause schedule
PF->>OC: Schedule oneshot collect (in_calyptia_fleet_collect_once)
end
end
rect rgba(235,255,235,0.6)
note over OC,API: Oneshot initial fetch (retry-capable)
OC->>API: Fetch fleet config
alt Success
OC-->>PF: Commit config
PF->>MC: Resume schedule
else Error / retry
loop Up to FLEET_INITIAL_MAX_TRIES
OC->>API: Retry after interval
end
alt Still failing
PF-->>FB: Init error and cleanup
end
end
end
rect rgba(255,248,229,0.6)
note right of MC: Normal collection via wrapper
FB->>PF: cb_collect -> in_calyptia_fleet_collect_cb()
PF->>PF: Invoke original collect path (in_calyptia_fleet_collect)
PF-->>FB: Return status
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Signed-off-by: Phillip Whelan <[email protected]>
f05db2d to
63a2dfd
Compare
|
Here is a debug log: And a valgrind log: Note: valgrind shows a memory leak in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/in_calyptia_fleet/in_calyptia_fleet.c (2)
2655-2682: Add cleanup for initial_fd collector.The
in_calyptia_fleet_destroyfunction doesn't clean upctx->initial_fdif it was created but never fired (e.g., if initialization fails after line 2626). This could leak the collector resource.Apply this diff to add cleanup for the initial collector:
static void in_calyptia_fleet_destroy(struct flb_in_calyptia_fleet_config *ctx) { if (!ctx) { return; } if (ctx->fleet_url) { flb_sds_destroy(ctx->fleet_url); } if (ctx->fleet_files_url) { flb_sds_destroy(ctx->fleet_files_url); } if (ctx->fleet_id && ctx->fleet_id_found) { flb_sds_destroy(ctx->fleet_id); } if (ctx->collect_fd >= 0) { flb_input_collector_delete(ctx->collect_fd, ctx->ins); } + if (ctx->initial_fd >= 0) { + flb_input_collector_delete(ctx->initial_fd, ctx->ins); + } + if (ctx->u) { flb_upstream_destroy(ctx->u); } flb_free(ctx); }
2490-2498: Initialize initial_fd to -1 for consistency.The
ctx->initial_fdfield is not explicitly initialized, relying onflb_callocto zero it. However,ctx->collect_fdis explicitly initialized to-1at line 2496 for consistency with the cleanup checks that use>= 0. The same pattern should apply toinitial_fd.Apply this diff:
ctx = flb_calloc(1, sizeof(struct flb_in_calyptia_fleet_config)); if (!ctx) { flb_errno(); return -1; } ctx->ins = in; ctx->collect_fd = -1; + ctx->initial_fd = -1; ctx->fleet_id_found = FLB_FALSE; ctx->config_timestamp = -1;
🧹 Nitpick comments (2)
plugins/in_calyptia_fleet/in_calyptia_fleet.h (1)
28-30: Clarify the retry limit naming.
FLEET_INITIAL_MAX_TRIES = 1is confusing because it suggests one try, but the implementation actually allows the initial attempt plus one retry (2 total attempts). Consider renaming toFLEET_INITIAL_MAX_RETRIESor updating the value to2with a name likeFLEET_INITIAL_MAX_ATTEMPTSfor clarity.plugins/in_calyptia_fleet/in_calyptia_fleet.c (1)
2076-2082: Consider removing the wrapper callback.The
in_calyptia_fleet_collect_cbwrapper adds an unnecessary level of indirection. The oneshot callback at line 2094 could callin_calyptia_fleet_collectdirectly, simplifying the code structure.Apply this diff to simplify:
-static int in_calyptia_fleet_collect_cb(struct flb_input_instance *ins, - struct flb_config *config, - void *in_context) -{ - FLB_INPUT_RETURN(in_calyptia_fleet_collect(ins, config, in_context)); -} - static int in_calyptia_fleet_collect_once(struct flb_input_instance *ins, struct flb_config *config, void *in_context) { int ret; struct flb_in_calyptia_fleet_config *ctx = in_context; /* pause the oneshot interval forever. */ flb_input_collector_delete(ctx->initial_fd, ins); ret = in_calyptia_fleet_collect(ins, config, in_context);Then update line 2602 and line 2747 to use
in_calyptia_fleet_collectdirectly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/in_calyptia_fleet/in_calyptia_fleet.c(5 hunks)plugins/in_calyptia_fleet/in_calyptia_fleet.h(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
plugins/in_calyptia_fleet/in_calyptia_fleet.c (3)
src/flb_input.c (5)
flb_input_collector_delete(2074-2093)flb_input_set_collector_time(1685-1704)flb_input_collector_start(1790-1809)flb_input_collector_resume(2095-2151)flb_input_collector_pause(2035-2072)src/flb_upstream.c (1)
flb_upstream_destroy(656-698)include/fluent-bit/flb_mem.h (1)
flb_free(126-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (25)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: PR - fuzzing test
🔇 Additional comments (1)
plugins/in_calyptia_fleet/in_calyptia_fleet.c (1)
2055-2074: LGTM!The refactoring to use a goto label for error handling is appropriate and makes the control flow clearer.
…o schedule. Signed-off-by: Phillip Whelan <[email protected]>
…al one time collector. Signed-off-by: Phillip Whelan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
plugins/in_calyptia_fleet/in_calyptia_fleet.c (2)
2619-2621: Consider checking the return value.
calyptia_config_commitcan fail and returnFLB_FALSE. While not critical during initialization, logging an error on failure would aid troubleshooting.if (is_fleet_config(ctx, config)) { - calyptia_config_commit(ctx); + if (calyptia_config_commit(ctx) == FLB_FALSE) { + flb_plg_warn(ctx->ins, "failed to commit existing fleet config"); + } }
2603-2639: Consider adding explanatory comment for two-stage startup.The initialization logic implements two distinct paths:
- If already using a fleet config, commit it immediately
- Otherwise, pause the main collector and schedule a one-shot initial fetch with retries
A brief comment explaining this two-stage approach would improve maintainability.
ctx->collect_fd = ret; flb_plg_info(ctx->ins, "fleet collector initialized with interval: %d sec %d nsec", ctx->interval_sec, ctx->interval_nsec); + /* Two-stage startup: commit existing fleet config or fetch initial config with one-shot collector */ if (is_fleet_config(ctx, config)) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
plugins/in_calyptia_fleet/in_calyptia_fleet.c(5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
plugins/in_calyptia_fleet/in_calyptia_fleet.c (1)
src/flb_input.c (5)
flb_input_collector_delete(2074-2093)flb_input_set_collector_time(1685-1704)flb_input_collector_resume(2095-2151)flb_input_collector_start(1790-1809)flb_input_collector_pause(2035-2072)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: PR - fuzzing test
🔇 Additional comments (5)
plugins/in_calyptia_fleet/in_calyptia_fleet.c (5)
2055-2074: LGTM: Refactored for reusability.The addition of a temporary
retvariable and thefleet_id_errorlabel allows this function to be called from both the wrapper callback and the one-shot callback while maintaining consistent error handling.
2077-2082: LGTM: Clean wrapper pattern.The wrapper callback maintains the plugin callback signature while allowing
in_calyptia_fleet_collectto be reused by the one-shot collector.
2084-2120: LGTM: Robust retry logic with proper collector lifecycle management.The one-shot collector correctly handles all scenarios:
- On success, resumes the main collector immediately
- On failure, schedules a retry (if attempts remain) or resumes the main collector (if exhausted)
- On retry collector creation failure, resumes the main collector to prevent indefinite pause
The past review concern about resuming the main collector on retry failure has been addressed (line 2107).
2748-2748: LGTM: Correct callback registration.The plugin now uses the wrapper callback, which maintains the expected signature while enabling the two-stage startup mechanism.
2084-2120: ✅ Past review comments addressed.The issues raised in previous reviews have been resolved:
- Line 2107 ensures the main collector resumes even when retry collector creation fails
- Line 2635 uses
in_calyptia_fleet_destroy(ctx)for consistent cleanup instead of manual resource management
Summary
To avoid slow startups this patch uses a oneshot interval to download the initial configuration when registering an agent to a fleet.
Enter
[N/A]in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor