filter_kubernetes: add entity attribute retrieval logics #10586
base: master
Conversation
CI needs to pass before the review.
Signed-off-by: Zhihong Lin <[email protected]>
5dc266e to 954dc73 (compare)
Walkthrough

This set of changes introduces a pod-to-service association feature in the Kubernetes filter plugin, including new configuration options, background mapping-fetch logic, TLS and platform detection enhancements, and extensive metadata enrichment. The test suite is expanded with new test data and test cases for various pod association and kubelet configurations.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant TestRunner
    participant KubernetesFilter
    participant PodServiceMapFetcher
    participant UpstreamEndpoint
    participant HashTable

    TestRunner->>KubernetesFilter: Start filter with pod association enabled
    KubernetesFilter->>PodServiceMapFetcher: Spawn background thread
    loop Every refresh interval
        PodServiceMapFetcher->>HashTable: Try preload from file
        alt File not found
            PodServiceMapFetcher->>UpstreamEndpoint: HTTP GET pod-service map (TLS if configured)
            UpstreamEndpoint-->>PodServiceMapFetcher: JSON mapping
        end
        PodServiceMapFetcher->>HashTable: Parse and update mapping (mutex protected)
    end
    KubernetesFilter->>HashTable: Lookup pod-to-service association during metadata enrichment
```
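As a rough illustration of the refresh loop the diagram describes, here is a minimal, self-contained C sketch of the pattern: a background thread periodically rebuilds a shared map under a mutex while the reader takes the same lock. It uses only pthreads; names such as `fetch_pod_service_map()` are hypothetical placeholders, not the plugin's actual implementation.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical shared state standing in for the plugin's pod hash table. */
static pthread_mutex_t map_mutex = PTHREAD_MUTEX_INITIALIZER;
static char service_for_pod[64] = "";            /* "map" reduced to one entry */

/* Placeholder for the HTTP/file fetch the plugin performs. */
static void fetch_pod_service_map(char *out, size_t size)
{
    snprintf(out, size, "%s", "checkout-service");  /* pretend fetch result */
}

/* Background thread: refresh the mapping on a fixed interval. */
static void *refresh_thread(void *arg)
{
    (void) arg;
    for (int i = 0; i < 3; i++) {                /* bounded loop for the demo */
        char fresh[64];
        fetch_pod_service_map(fresh, sizeof(fresh));

        pthread_mutex_lock(&map_mutex);          /* writer holds the lock */
        strncpy(service_for_pod, fresh, sizeof(service_for_pod) - 1);
        pthread_mutex_unlock(&map_mutex);

        sleep(1);                                /* refresh interval */
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, refresh_thread, NULL);

    /* Reader (filter path): take the same lock for every lookup. */
    for (int i = 0; i < 3; i++) {
        pthread_mutex_lock(&map_mutex);
        printf("pod -> service: %s\n",
               service_for_pod[0] ? service_for_pod : "(not loaded yet)");
        pthread_mutex_unlock(&map_mutex);
        sleep(1);
    }

    pthread_join(tid, NULL);
    return 0;
}
```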
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 18
🔭 Outside diff range comments (1)
tests/runtime/filter_kubernetes.c (1)
386-409: Check return values of file operations in tests.

The test doesn't verify if `clear_file` and `write_log_to_file` succeed, which could lead to false positives if file operations fail.

```diff
     if(type == KUBE_POD_ASSOCIATION) {
-        clear_file(path);
+        ret = clear_file(path);
+        TEST_CHECK_(ret == 0, "clearing log file");
     }

     /* Start the engine */
     ret = flb_start(ctx.flb);
     TEST_CHECK_(ret == 0, "starting engine");
     if (ret == -1) {
         goto exit;
     }

     /* Poll for up to 3 seconds or until we got a match */
     for (ret = 0; ret < 3000 && result.nMatched != nExpected; ret++) {
         usleep(1000);
         if (ret == 2000 && type == KUBE_POD_ASSOCIATION) {
-            write_log_to_file(path);
+            int write_ret = write_log_to_file(path, NULL);
+            TEST_CHECK_(write_ret == 0, "writing log entry");
         }
     }
```
🧹 Nitpick comments (7)
tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled-pod.meta (1)
1-109: Consider consolidating duplicate test data.

This file is nearly identical to `options_use-pod-association-enabled.meta` except for the pod name. Consider using a template or parameterized approach to reduce duplication and maintenance burden.

plugins/filter_kubernetes/kube_conf.h (2)
159-159: Inconsistent naming convention.

The field `deploymentRegex` uses camelCase while other fields in the struct use snake_case (e.g., `api_host`, `api_port`). Consider renaming to `deployment_regex` for consistency.

```diff
-    struct flb_regex *deploymentRegex;
+    struct flb_regex *deployment_regex;
```
91-100: Consider using flexible buffer sizes.

The `service_attributes` struct uses fixed-size buffers. While the sizes align with AWS limits, consider whether dynamic allocation might be more memory-efficient, especially if many instances are created.

plugins/filter_kubernetes/kubernetes.c (1)
478-478: Fix parameter type mismatch.

The function signature expects a `const char *`, but `namespace_kube_buf` is already `const char *`, making the cast unnecessary and potentially hiding type issues.

```diff
-                             const char *namespace_kube_buf,
+                             const char *namespace_kube_buf,
```

plugins/filter_kubernetes/kube_meta.c (3)
832-847: Simplify NULL checks in callback function.

The NULL checks for `name`, `value`, and `data` are defensive but unnecessary in a callback context. If these are NULL, there's a bigger problem with the caller. Consider simplifying to just check `vlen`:

```diff
 static void cb_results_workload(const char *name, const char *value,
                                 size_t vlen, void *data)
 {
-    if (name == NULL || value == NULL || vlen == 0 || data == NULL) {
+    if (vlen == 0) {
         return;
     }
     struct flb_kube_meta *meta = data;
```
861-953: LGTM! Well-structured workload detection logic.

The implementation correctly handles the priority-based workload search through owner references. The use of goto for the `set_workload` label is appropriate here as it reduces code duplication.

Minor suggestions:
- Consider adding a comment explaining the priority order at the function start
- Some variable declarations could be moved closer to their usage
1467-1537: LGTM! Comprehensive AWS entity metadata enrichment.

The implementation correctly adds AWS entity fields to the metadata with proper conditional checks and string packing. The code follows the existing pattern in the codebase.
Consider extracting a helper function to reduce repetition:
```c
static void pack_string_field(msgpack_packer *mp_pck,
                              const char *key, size_t key_len,
                              const char *value, size_t value_len)
{
    msgpack_pack_str(mp_pck, key_len);
    msgpack_pack_str_body(mp_pck, key, key_len);
    msgpack_pack_str(mp_pck, value_len);
    msgpack_pack_str_body(mp_pck, value, value_len);
}
```
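For context, a self-contained sketch of how such a helper would be used with msgpack-c; the key names ("Cluster", "Workload") and values here are illustrative assumptions, not necessarily the exact AWS entity keys that `kube_meta.c` emits.

```c
#include <msgpack.h>
#include <string.h>

/* The helper proposed in the comment above. */
static void pack_string_field(msgpack_packer *mp_pck,
                              const char *key, size_t key_len,
                              const char *value, size_t value_len)
{
    msgpack_pack_str(mp_pck, key_len);
    msgpack_pack_str_body(mp_pck, key, key_len);
    msgpack_pack_str(mp_pck, value_len);
    msgpack_pack_str_body(mp_pck, value, value_len);
}

int main(void)
{
    msgpack_sbuffer sbuf;
    msgpack_packer pck;
    const char *cluster = "demo-cluster";       /* illustrative values only */
    const char *workload = "nginx-deployment";

    msgpack_sbuffer_init(&sbuf);
    msgpack_packer_init(&pck, &sbuf, msgpack_sbuffer_write);

    /* Two-entry map: each repeated key/value packing block becomes one call. */
    msgpack_pack_map(&pck, 2);
    pack_string_field(&pck, "Cluster", 7, cluster, strlen(cluster));
    pack_string_field(&pck, "Workload", 8, workload, strlen(workload));

    msgpack_sbuffer_destroy(&sbuf);
    return 0;
}
```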
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (19)
- `tests/runtime/data/kubernetes/log/options/options_use-kubelet-disabled-daemonset_fluent-bit.log` is excluded by `!**/*.log`
- `tests/runtime/data/kubernetes/log/options/options_use-kubelet-disabled-deployment_fluent-bit.log` is excluded by `!**/*.log`
- `tests/runtime/data/kubernetes/log/options/options_use-kubelet-disabled-pod_fluent-bit.log` is excluded by `!**/*.log`
- `tests/runtime/data/kubernetes/log/options/options_use-kubelet-disabled-replicaset_fluent-bit.log` is excluded by `!**/*.log`
- `tests/runtime/data/kubernetes/log/options/options_use-kubelet-enabled-daemonset_fluent-bit.log` is excluded by `!**/*.log`
- `tests/runtime/data/kubernetes/log/options/options_use-kubelet-enabled-deployment_fluent-bit.log` is excluded by `!**/*.log`
- `tests/runtime/data/kubernetes/log/options/options_use-kubelet-enabled-pod_fluent-bit.log` is excluded by `!**/*.log`
- `tests/runtime/data/kubernetes/log/options/options_use-kubelet-enabled-replicaset_fluent-bit.log` is excluded by `!**/*.log`
- `tests/runtime/data/kubernetes/log/options/options_use-pod-association-enabled_fluent-bit.log` is excluded by `!**/*.log`
- `tests/runtime/data/kubernetes/out/options/options_use-kubelet-disabled-daemonset_fluent-bit.out` is excluded by `!**/*.out`
- `tests/runtime/data/kubernetes/out/options/options_use-kubelet-disabled-deployment_fluent-bit.out` is excluded by `!**/*.out`
- `tests/runtime/data/kubernetes/out/options/options_use-kubelet-disabled-pod_fluent-bit.out` is excluded by `!**/*.out`
- `tests/runtime/data/kubernetes/out/options/options_use-kubelet-disabled-replicaset_fluent-bit.out` is excluded by `!**/*.out`
- `tests/runtime/data/kubernetes/out/options/options_use-kubelet-enabled-daemonset_fluent-bit.out` is excluded by `!**/*.out`
- `tests/runtime/data/kubernetes/out/options/options_use-kubelet-enabled-deployment_fluent-bit.out` is excluded by `!**/*.out`
- `tests/runtime/data/kubernetes/out/options/options_use-kubelet-enabled-pod_fluent-bit.out` is excluded by `!**/*.out`
- `tests/runtime/data/kubernetes/out/options/options_use-kubelet-enabled-replicaset_fluent-bit.out` is excluded by `!**/*.out`
- `tests/runtime/data/kubernetes/out/options/options_use-pod-association-enabled_fluent-bit.out` is excluded by `!**/*.out`
- `tests/runtime/data/kubernetes/servicemap/options_use-pod-association-enabled_fluent-bit.map` is excluded by `!**/*.map`
📒 Files selected for processing (18)
- `plugins/filter_kubernetes/kube_conf.c` (4 hunks)
- `plugins/filter_kubernetes/kube_conf.h` (3 hunks)
- `plugins/filter_kubernetes/kube_meta.c` (22 hunks)
- `plugins/filter_kubernetes/kube_meta.h` (2 hunks)
- `plugins/filter_kubernetes/kube_regex.c` (1 hunk)
- `plugins/filter_kubernetes/kube_regex.h` (1 hunk)
- `plugins/filter_kubernetes/kubernetes.c` (5 hunks)
- `tests/runtime/data/kubernetes/meta/options_use-kubelet-disabled-daemonset.meta` (1 hunk)
- `tests/runtime/data/kubernetes/meta/options_use-kubelet-disabled-deployment.meta` (1 hunk)
- `tests/runtime/data/kubernetes/meta/options_use-kubelet-disabled-pod.meta` (1 hunk)
- `tests/runtime/data/kubernetes/meta/options_use-kubelet-disabled-replicaset.meta` (1 hunk)
- `tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled-daemonset.meta` (1 hunk)
- `tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled-deployment.meta` (1 hunk)
- `tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled-pod.meta` (1 hunk)
- `tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled-replicaset.meta` (1 hunk)
- `tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled.meta` (1 hunk)
- `tests/runtime/data/kubernetes/meta/options_use-pod-association-enabled.meta` (1 hunk)
- `tests/runtime/filter_kubernetes.c` (9 hunks)
🔇 Additional comments (26)
tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled-replicaset.meta (1)
1-119: LGTM! Well-structured test data for ReplicaSet pod association.

The JSON structure is valid and provides comprehensive test data for a pod owned by a ReplicaSet. The owner reference correctly specifies the controlling ReplicaSet with appropriate metadata fields.
tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled.meta (1)
20-30: LGTM! Proper addition of Deployment owner reference.

The ownerReferences field is correctly structured with appropriate API version, kind, and controller flags for a Deployment resource.
tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled-daemonset.meta (1)
1-119: LGTM! Comprehensive test data for DaemonSet pod association.

The JSON manifest is well-structured and provides complete test coverage for a pod owned by a DaemonSet, including proper owner reference metadata.
plugins/filter_kubernetes/kube_conf.c (4)
193-196: LGTM! Pod hash table creation is properly configured.

The pod hash table creation correctly uses the configured TTL value and eviction policy for managing pod-to-service mappings.

214-216: LGTM! Proper cleanup of pod hash table.

The pod hash table cleanup is correctly implemented with null check before destruction.

226-228: LGTM! Deployment regex cleanup is properly handled.

The deployment regex is properly destroyed with appropriate null check.

243-253: LGTM! Comprehensive cleanup of new pod association resources.

All new resources for the pod association feature are properly cleaned up:
- TLS context destruction
- Upstream connection cleanup
- Platform string deallocation
The cleanup order and null checks are appropriate.
tests/runtime/data/kubernetes/meta/options_use-kubelet-disabled-deployment.meta (1)
1-126: LGTM! Well-structured test data for deployment pod scenario.

The Kubernetes Pod manifest JSON is properly structured with:
- Valid metadata including owner references to ReplicaSet (typical for deployment)
- Complete pod specification with container, volumes, and tolerations
- Realistic status information for a running pod
- Appropriate annotations for Prometheus monitoring
This test data effectively supports validation of pod association with deployment workloads.

tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled-deployment.meta (1)
tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled-deployment.meta (1)
1-119: Verify the JSON structure for kubelet enabled scenario.

The JSON structure appears to mix individual Pod format (lines 1-4) with list format (items array). This seems unusual as:
- Kubelet API typically returns pod lists without outer Pod kind/apiVersion
- The combination of Pod metadata at root level with items array is non-standard
Please verify this structure matches the expected kubelet API response format for the test scenario.

plugins/filter_kubernetes/kube_meta.h (4)
plugins/filter_kubernetes/kube_meta.h (4)
30-31: LGTM! Consistent field additions for cluster and workload metadata.

The new length and string pointer fields follow the established pattern in the struct, maintaining consistency with existing metadata fields.
Also applies to: 38-39
40-41: LGTM! Proper string pointer declarations for new metadata fields.

The cluster and workload string pointers are appropriately declared and positioned within the struct.
Also applies to: 46-47
61-66: LGTM! Well-defined constants for ConfigMap API and resource types.

The new macro and constants follow existing naming conventions:
- ConfigMap API path format is consistent with existing API formats
- Resource type constants are clearly defined and appropriately named
81-81: LGTM! Appropriate function declaration for pod association initialization.

The function declaration follows the established pattern and naming convention for initialization functions in this module.

tests/runtime/data/kubernetes/meta/options_use-kubelet-disabled-replicaset.meta (1)
tests/runtime/data/kubernetes/meta/options_use-kubelet-disabled-replicaset.meta (1)
1-126: LGTM! Appropriate test data for ReplicaSet pod scenario.

The Kubernetes Pod manifest JSON is well-structured with:
- Valid metadata with owner reference directly to ReplicaSet (line 23)
- Complete pod specification matching other test files
- Consistent status information for a running pod
- Proper differentiation from deployment scenario by having direct ReplicaSet ownership
This effectively tests pod association with ReplicaSet workloads distinct from deployment-managed ReplicaSets.

tests/runtime/data/kubernetes/meta/options_use-pod-association-enabled.meta (1)
tests/runtime/data/kubernetes/meta/options_use-pod-association-enabled.meta (1)
4-4: Empty metadata object in Pod List response.

The top-level `metadata` field is empty, which is unusual for a Kubernetes Pod List API response. Consider whether this accurately represents the expected API response structure.

tests/runtime/data/kubernetes/meta/options_use-kubelet-disabled-pod.meta (1)

1-116: Different JSON structure compared to pod-association-enabled.meta.

This file contains a single Pod object while `options_use-pod-association-enabled.meta` contains a Pod List. Ensure this structural difference is intentional for your test scenarios.

tests/runtime/data/kubernetes/meta/options_use-kubelet-disabled-daemonset.meta (1)
19-28: LGTM! Appropriate ownerReferences for DaemonSet.

The ownerReferences section correctly identifies this pod as being owned by a DaemonSet, which is appropriate for testing pod association with different workload types.
plugins/filter_kubernetes/kube_conf.h (1)
228-235: LGTM! Comprehensive TLS configuration.

The TLS configuration fields for pod association provide proper support for mutual TLS authentication with CA verification, client certificates, and debugging options.

plugins/filter_kubernetes/kubernetes.c (2)
plugins/filter_kubernetes/kubernetes.c (2)
189-273: LGTM! Well-structured error handling.

The function properly handles errors with appropriate cleanup of upstream connections and TLS resources. The TLS certificate rotation logic is well implemented.

125-129: Memory leak on JSON parsing error.

The `buffer` is allocated by `flb_pack_json` but not freed before the early return when parsing fails.

```diff
     if (ret < 0) {
         flb_plg_warn(ctx->ins, "Could not parse json response = %s", api_buf);
-        flb_free(buffer);
         return;
     }
+
```

Move the `flb_free(buffer)` to line 186 where it's freed in all paths.

Likely an incorrect or invalid review comment.
tests/runtime/filter_kubernetes.c (2)
286-286: Consider implications of debug log level in tests.

Changing from "error" to "debug" will produce more verbose output during test runs, which might make it harder to identify actual test failures.
Is this change intentional for debugging the new pod association feature? Consider using a conditional log level based on an environment variable to allow both verbose and quiet test runs.
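A minimal sketch of the environment-variable approach suggested here; the variable name `FLB_KUBE_TEST_LOG_LEVEL` is hypothetical, not an existing Fluent Bit setting.

```c
#include <stdio.h>
#include <stdlib.h>

/* Pick the test log level from an environment variable, defaulting to
 * "error" so normal runs stay quiet. FLB_KUBE_TEST_LOG_LEVEL is a
 * hypothetical name chosen for illustration. */
static const char *test_log_level(void)
{
    const char *level = getenv("FLB_KUBE_TEST_LOG_LEVEL");
    return level ? level : "error";
}

int main(void)
{
    /* The returned string would be passed wherever the test currently
     * hard-codes the log level ("error" or "debug"). */
    printf("Log_Level = %s\n", test_log_level());
    return 0;
}
```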
491-513: Verify test certificate paths exist or are mocked.

The Kubernetes filter tests (`tests/runtime/filter_kubernetes.c`) use hardcoded absolute paths for TLS fixtures—`/tst/ca.crt`, `/tst/client.crt`, and `/tst/client.key`—but I don't see corresponding files or setup in the repo. Please ensure these files are created or mocked in your test harness (or move them under the existing `DPATH` test directory) so the tests can reliably locate them.

- File: tests/runtime/filter_kubernetes.c (around lines 491–513)
- Hardcoded paths to verify or replace:
  - `"pod_association_host_server_ca_file", "/tst/ca.crt"`
  - `"pod_association_host_client_cert_file", "/tst/client.crt"`
  - `"pod_association_host_client_key_file", "/tst/client.key"`
plugins/filter_kubernetes/kube_meta.c (4)
355-418: LGTM! Good generalization of the function.

The refactoring to accept `resource_type` and `resource_name` parameters makes the function more reusable for different Kubernetes resources, which is a good architectural improvement.

1996-2020: LGTM! Well-implemented pod association initialization.

The function properly handles TLS setup and upstream connection creation with appropriate error handling and resource cleanup. The thread-safe configuration is correctly applied.

1827-1891: Verify cache key buffer size calculation.

The cache key construction adds multiple pod association fields. While the size calculation appears correct, it's complex and error-prone. Consider adding a safety margin or using dynamic allocation:

```diff
     pod_service_found = flb_hash_table_get(ctx->pod_hash_table,
                                            meta->podname, meta->podname_len,
                                            &tmp_service_attributes,
                                            &tmp_service_attr_size);
     if (pod_service_found != -1 && tmp_service_attributes != NULL) {
         if (tmp_service_attributes->name[0] != '\0') {
             n += tmp_service_attributes->name_len + 1;
         }
         if (tmp_service_attributes->environment[0] != '\0') {
             n += tmp_service_attributes->environment_len + 1;
         }
         if (tmp_service_attributes->name_source[0] != '\0') {
             n += tmp_service_attributes->name_source_len + 1;
         }
     }
+
+    /* Add safety margin for separators and null terminator */
+    n += 10;
```
2213-2228: LGTM! Clever platform detection logic.

The platform detection based on the presence of the `aws-auth` ConfigMap in the `kube-system` namespace is a smart way to differentiate between EKS and native Kubernetes clusters.
```c
    /* Configuration used for enabling pod to service name mapping*/
    int use_pod_association;
    char *pod_association_host;
    char *pod_association_endpoint;
    int pod_association_port;

    /*
     * TTL is used to check how long should the mapped entry
     * remain in the hash table
     */
    struct flb_hash_table *pod_hash_table;
    int pod_service_map_ttl;
    int pod_service_map_refresh_interval;
    flb_sds_t pod_service_preload_cache_path;
    struct flb_upstream *pod_association_upstream;

    /*
     * This variable holds the Kubernetes platform type
     * Current checks for EKS or Native Kuberentes
     */
    char *platform;

    /*
     * This value is used for holding the platform config
     * value. Platform will be overriden with this variable
     * if it's set
     */
    char *set_platform;
```
💡 Verification agent
🧩 Analysis chain
Verify thread safety for pod_hash_table access
The `pod_hash_table` field will likely be accessed from multiple threads (the main thread and the background fetch thread mentioned in the PR summary). Ensure proper synchronization mechanisms are in place.
🏁 Script executed:
```bash
#!/bin/bash
# Search for mutex/lock usage around pod_hash_table access
rg -A 5 -B 5 "pod_hash_table" plugins/filter_kubernetes/
```
Length of output: 5956
Synchronize pod_hash_table reads in kube_meta.c
While `parse_pod_service_map()` properly guards writes to `ctx->pod_hash_table` with `pthread_mutex_lock(&metadata_mutex)` / `pthread_mutex_unlock(&metadata_mutex)`, the lookups in `kube_meta.c` occur without any locking. This introduces a data race between the background fetch thread (writes) and the pipeline threads (reads).

Please wrap each `flb_hash_table_get(ctx->pod_hash_table, …)` in `plugins/filter_kubernetes/kube_meta.c` with the same `metadata_mutex` lock/unlock pair. For example:
```diff
-    pod_service_found = flb_hash_table_get(ctx->pod_hash_table,
-                                           meta->podname, meta->podname_len,
-                                           &tmp_service_attributes,
-                                           &tmp_service_attr_size);
+    pthread_mutex_lock(&metadata_mutex);
+    pod_service_found = flb_hash_table_get(ctx->pod_hash_table,
+                                           meta->podname, meta->podname_len,
+                                           &tmp_service_attributes,
+                                           &tmp_service_attr_size);
+    pthread_mutex_unlock(&metadata_mutex);
```
Locations to update:
- `plugins/filter_kubernetes/kube_meta.c`, around the two calls to `flb_hash_table_get(...)` where `pod_service_found` is assigned.
This change will ensure safe concurrent access to the shared hash table.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In plugins/filter_kubernetes/kube_meta.c around the lines where
pod_service_found is assigned using flb_hash_table_get, the reads from
ctx->pod_hash_table are not protected by the metadata_mutex, causing a data race
with writes in parse_pod_service_map. To fix this, wrap each flb_hash_table_get
call with pthread_mutex_lock(&metadata_mutex) before the call and
pthread_mutex_unlock(&metadata_mutex) immediately after, ensuring thread-safe
concurrent access to the shared pod_hash_table.
```c
static int get_api_server_configmap(struct flb_kube *ctx,
                                    const char *namespace, const char *configmap,
                                    char **out_buf, size_t *out_size)
{
    int ret;
    int packed = -1;
    int root_type;
    char uri[1024];
    char *buf;
    size_t size;

    *out_buf = NULL;
    *out_size = 0;

    if (packed == -1) {

        ret = snprintf(uri, sizeof(uri) - 1, FLB_KUBE_API_CONFIGMAP_FMT, namespace,
                       configmap);

        if (ret == -1) {
            return -1;
        }
        flb_plg_debug(ctx->ins,
                      "Send out request to API Server for configmap information");
        packed = get_meta_info_from_request(ctx, namespace, FLB_KUBE_CONFIGMAP, configmap,
                                            &buf, &size, &root_type, uri, false);
    }

    /* validate pack */
    if (packed == -1) {
        return -1;
    }

    *out_buf = buf;
    *out_size = size;

    return 0;
}
```
Remove redundant condition check and improve error handling.
The function has a redundant check and could benefit from better error handling:
- The condition `if (packed == -1)` on line 498 is always true since `packed` is initialized to -1 and never modified before this check.
- The `snprintf` return value should be checked more thoroughly.
Apply this diff to fix these issues:
```diff
 static int get_api_server_configmap(struct flb_kube *ctx,
                                     const char *namespace, const char *configmap,
                                     char **out_buf, size_t *out_size)
 {
     int ret;
     int packed = -1;
     int root_type;
     char uri[1024];
     char *buf;
     size_t size;

     *out_buf = NULL;
     *out_size = 0;

-    if (packed == -1) {
-
-        ret = snprintf(uri, sizeof(uri) - 1, FLB_KUBE_API_CONFIGMAP_FMT, namespace,
-                       configmap);
-
-        if (ret == -1) {
-            return -1;
-        }
-        flb_plg_debug(ctx->ins,
-                      "Send out request to API Server for configmap information");
-        packed = get_meta_info_from_request(ctx, namespace, FLB_KUBE_CONFIGMAP, configmap,
-                                            &buf, &size, &root_type, uri, false);
+    ret = snprintf(uri, sizeof(uri) - 1, FLB_KUBE_API_CONFIGMAP_FMT, namespace,
+                   configmap);
+    if (ret < 0 || ret >= sizeof(uri) - 1) {
+        flb_plg_error(ctx->ins, "URI too long or snprintf failed");
+        return -1;
     }
+    flb_plg_debug(ctx->ins,
+                  "Send out request to API Server for configmap information");
+    packed = get_meta_info_from_request(ctx, namespace, FLB_KUBE_CONFIGMAP, configmap,
+                                        &buf, &size, &root_type, uri, false);

     /* validate pack */
     if (packed == -1) {
         return -1;
     }

     *out_buf = buf;
     *out_size = size;

     return 0;
 }
```
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In plugins/filter_kubernetes/kube_meta.c around lines 484 to 522, remove the
redundant condition check 'if (packed == -1)' since 'packed' is initialized to
-1 and not changed before this check. Instead, directly perform the snprintf
call and check its return value to ensure it does not exceed the buffer size or
return an error. Add proper error handling for snprintf by verifying the return
value is non-negative and less than the buffer size. Then proceed to call
get_meta_info_from_request and handle its return value accordingly.
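As a standalone illustration of the return-value check recommended above (independent of the plugin code), a small C program showing how `snprintf` reports both errors and truncation; the URI and buffer size here are illustrative:

```c
#include <stdio.h>

int main(void)
{
    char uri[32];   /* deliberately small to demonstrate truncation */
    int ret;

    /* snprintf returns the length it wanted to write (excluding the NUL),
     * so ret >= sizeof(uri) means the output was truncated. */
    ret = snprintf(uri, sizeof(uri), "/api/v1/namespaces/%s/configmaps/%s",
                   "kube-system", "aws-auth");
    if (ret < 0 || (size_t) ret >= sizeof(uri)) {
        fprintf(stderr, "URI too long or snprintf failed (needed %d bytes)\n", ret);
        return 1;
    }
    printf("uri: %s\n", uri);
    return 0;
}
```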
```c
static void get_cluster_from_environment(struct flb_kube *ctx,struct flb_kube_meta *meta)
{
    if(meta->cluster == NULL) {
        char* cluster_name = getenv("CLUSTER_NAME");
        if(cluster_name) {
            meta->cluster = strdup(cluster_name);
            meta->cluster_len = strlen(cluster_name);
            meta->fields++;
        } else {
            free(cluster_name);
        }
        flb_plg_debug(ctx->ins, "Cluster name is %s.", meta->cluster);
    }
}
```
Critical: Remove incorrect free() call on getenv() result.
The function has a critical memory management issue. `getenv()` returns a pointer to static storage that must NOT be freed. The `free(cluster_name)` call on line 620 will cause undefined behavior.
Apply this diff to fix the issue:
```diff
 static void get_cluster_from_environment(struct flb_kube *ctx,struct flb_kube_meta *meta)
 {
     if(meta->cluster == NULL) {
         char* cluster_name = getenv("CLUSTER_NAME");
         if(cluster_name) {
             meta->cluster = strdup(cluster_name);
             meta->cluster_len = strlen(cluster_name);
             meta->fields++;
-        } else {
-            free(cluster_name);
         }
         flb_plg_debug(ctx->ins, "Cluster name is %s.", meta->cluster);
     }
 }
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```c
static void get_cluster_from_environment(struct flb_kube *ctx,struct flb_kube_meta *meta)
{
    if (meta->cluster == NULL) {
        char* cluster_name = getenv("CLUSTER_NAME");
        if (cluster_name) {
            meta->cluster = strdup(cluster_name);
            meta->cluster_len = strlen(cluster_name);
            meta->fields++;
        }
        flb_plg_debug(ctx->ins, "Cluster name is %s.", meta->cluster);
    }
}
```
🤖 Prompt for AI Agents
In plugins/filter_kubernetes/kube_meta.c around lines 611 to 625, the code
incorrectly calls free() on the pointer returned by getenv(), which points to
static memory and must not be freed. Remove the free(cluster_name) call entirely
to fix this critical memory management issue.
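For reference, a small standalone example of the correct ownership pattern the fix relies on: the pointer returned by `getenv()` is never freed, only the `strdup()` copy is.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *copy = NULL;
    const char *cluster_name = getenv("CLUSTER_NAME");  /* points to static storage */

    if (cluster_name) {
        copy = strdup(cluster_name);   /* heap copy owned by this code */
        if (!copy) {
            return 1;
        }
        printf("Cluster name is %s.\n", copy);
        free(copy);                    /* free only the strdup() result */
    }
    /* Never call free(cluster_name): the environment block is not ours. */
    return 0;
}
```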
```diff
@@ -34,6 +34,7 @@ int flb_kube_regex_init(struct flb_kube *ctx)
             ctx->regex = flb_regex_create(KUBE_TAG_TO_REGEX);
         }
     }
+    ctx->deploymentRegex = flb_regex_create(DEPLOYMENT_REGEX);
```
Add error handling for deployment regex creation.
The new `deploymentRegex` creation lacks error checking. If `flb_regex_create(DEPLOYMENT_REGEX)` fails, the function will still return success, potentially leading to runtime issues when the regex is used.
```diff
 ctx->deploymentRegex = flb_regex_create(DEPLOYMENT_REGEX);
+if (!ctx->deploymentRegex) {
+    return -1;
+}
```
🤖 Prompt for AI Agents
In plugins/filter_kubernetes/kube_regex.c at line 37, the call to
flb_regex_create for deploymentRegex lacks error handling. Modify the code to
check if flb_regex_create returns NULL or an error indicator, and if so, handle
the failure appropriately by returning an error code or cleaning up resources
before returning. This ensures the function does not proceed with an invalid
regex and prevents potential runtime issues.
```diff
@@ -26,6 +26,8 @@

 #define KUBE_JOURNAL_TO_REGEX "^(?<name_prefix>[^_]+)_(?<container_name>[^\\._]+)(\\.(?<container_hash>[^_]+))?_(?<pod_name>[^_]+)_(?<namespace_name>[^_]+)_[^_]+_[^_]+$"

+#define DEPLOYMENT_REGEX "^(?<deployment>.+)-(?<id>[bcdfghjklmnpqrstvwxz2456789]{6,10})$"
```
💡 Verification agent
🧩 Analysis chain
Verify the deployment regex pattern matches Kubernetes naming conventions.
The regex pattern uses a broad `.+` capture for the deployment name and a specific character class `[bcdfghjklmnpqrstvwxz2456789]{6,10}` for the ID. Please ensure this pattern accurately matches Kubernetes deployment-generated ReplicaSet names and that the character class aligns with Kubernetes' naming algorithm.
🏁 Script executed:
```bash
#!/bin/bash
# Description: Verify the deployment regex pattern against real Kubernetes deployment naming patterns
# Expected: Find examples of deployment-generated ReplicaSet names that match this pattern

# Search for any existing tests or examples that use this regex pattern
rg -A 3 -B 3 "DEPLOYMENT_REGEX|deploymentRegex"

# Look for Kubernetes naming convention documentation or examples in the codebase
rg -A 5 "replicaset.*name|deployment.*name" --type c
```
Length of output: 4266
Align DEPLOYMENT_REGEX with Kubernetes naming rules
The current regex in `plugins/filter_kubernetes/kube_regex.h` (line 29), `#define DEPLOYMENT_REGEX "^(?<deployment>.+)-(?<id>[bcdfghjklmnpqrstvwxz2456789]{6,10})$"`, allows any characters in the deployment name and uses a custom consonant-only ID class of length 6–10. Real Kubernetes ReplicaSet suffixes are:
- Generated by hashing the PodTemplateSpec, then encoding in lowercase alphanumeric (base36).
- Always exactly 10 characters.
- Deployment names must conform to DNS-1123 subdomain rules (`[a-z0-9]([-a-z0-9]*[a-z0-9])?`).

Please update and add tests accordingly:
• Enforce DNS-1123 on `<deployment>`
• Restrict `<id>` to exactly 10 lowercase alphanumeric chars
• Add unit tests using known examples (e.g. `nginx-deployment-66b6c48dd5`)
Proposed change:
```diff
--- a/plugins/filter_kubernetes/kube_regex.h
+++ b/plugins/filter_kubernetes/kube_regex.h
-#define DEPLOYMENT_REGEX "^(?<deployment>.+)-(?<id>[bcdfghjklmnpqrstvwxz2456789]{6,10})$"
+#define DEPLOYMENT_REGEX "^(?<deployment>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?)-(?<id>[a-z0-9]{10})$"
```
• File: plugins/filter_kubernetes/kube_regex.h (line 29)
• Add matching tests in plugins/filter_kubernetes
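To make the suggested unit test concrete, here is a standalone sketch that checks the known example name against the reviewer's proposed pattern using POSIX ERE (the named capture groups are dropped, since POSIX regex does not support them). It validates shape only and is not wired into the Fluent Bit test suite.

```c
#include <regex.h>
#include <stdio.h>

int main(void)
{
    /* Reviewer's proposed pattern with the (?<name>...) groups removed,
     * because POSIX ERE has no named captures. */
    const char *pattern = "^[a-z0-9]([-a-z0-9]*[a-z0-9])?-[a-z0-9]{10}$";
    const char *samples[] = {
        "nginx-deployment-66b6c48dd5",  /* expected to match */
        "NginxDeployment-66b6c48dd5",   /* uppercase: expected to fail */
    };
    regex_t re;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        fprintf(stderr, "regcomp failed\n");
        return 1;
    }
    for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        int rc = regexec(&re, samples[i], 0, NULL, 0);
        printf("%-30s %s\n", samples[i], rc == 0 ? "match" : "no match");
    }
    regfree(&re);
    return 0;
}
```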
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
--- a/plugins/filter_kubernetes/kube_regex.h
+++ b/plugins/filter_kubernetes/kube_regex.h
@@ -29,1 +29,1 @@
-#define DEPLOYMENT_REGEX "^(?<deployment>.+)-(?<id>[bcdfghjklmnpqrstvwxz2456789]{6,10})$"
+#define DEPLOYMENT_REGEX "^(?<deployment>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?)-(?<id>[a-z0-9]{10})$"
```
🤖 Prompt for AI Agents
In plugins/filter_kubernetes/kube_regex.h at line 29, update DEPLOYMENT_REGEX to
enforce DNS-1123 subdomain rules on the deployment name by restricting it to
lowercase alphanumeric characters and hyphens in the correct pattern, and change
the ID group to exactly 10 lowercase alphanumeric characters (base36). Then, add
unit tests in the plugins/filter_kubernetes test suite using known valid
examples like "nginx-deployment-66b6c48dd5" to verify the regex matches
correctly.

```json
    "name": "use-kubelet-disabled-pod",
    "namespace": "options",
    "resourceVersion": "74466568",
    "selfLink": "/api/v1/namespaces/core/pods/base",
```
Inconsistent namespace and pod name in selfLink
The selfLink references namespace "core" and pod "base" while the actual namespace is "options" and pod name is "use-kubelet-disabled-pod". This should be corrected.
- "selfLink": "/api/v1/namespaces/core/pods/base",
+ "selfLink": "/api/v1/namespaces/options/pods/use-kubelet-disabled-pod",
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```diff
-    "selfLink": "/api/v1/namespaces/core/pods/base",
+    "selfLink": "/api/v1/namespaces/options/pods/use-kubelet-disabled-pod",
```
🤖 Prompt for AI Agents
In tests/runtime/data/kubernetes/meta/options_use-kubelet-disabled-pod.meta at
line 17, the selfLink field incorrectly references namespace "core" and pod
"base" instead of the correct namespace "options" and pod name
"use-kubelet-disabled-pod". Update the selfLink value to reflect the correct
namespace and pod name to maintain consistency with the actual resource.

```json
    "name": "use-kubelet-enabled-pod",
    "namespace": "options",
    "resourceVersion": "74466568",
    "selfLink": "/api/v1/namespaces/core/pods/base",
```
Inconsistent namespace and pod name in selfLink
The selfLink references namespace "core" and pod "base" while the actual namespace is "options" and pod name is "use-kubelet-enabled-pod".
- "selfLink": "/api/v1/namespaces/core/pods/base",
+ "selfLink": "/api/v1/namespaces/options/pods/use-kubelet-enabled-pod",
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```diff
-    "selfLink": "/api/v1/namespaces/core/pods/base",
+    "selfLink": "/api/v1/namespaces/options/pods/use-kubelet-enabled-pod",
```
🤖 Prompt for AI Agents
In tests/runtime/data/kubernetes/meta/options_use-kubelet-enabled-pod.meta at
line 19, the selfLink field incorrectly references namespace "core" and pod "base". Update the selfLink to use the correct namespace "options" and pod name "use-kubelet-enabled-pod" to match the actual resource identifiers.

```json
    "name": "use-pod-association-enabled",
    "namespace": "options",
    "resourceVersion": "74466568",
    "selfLink": "/api/v1/namespaces/core/pods/base",
```
Inconsistent namespace in selfLink
The selfLink references namespace "core" (`/api/v1/namespaces/core/pods/base`) while the Pod is actually in namespace "options" (line 17). This should be `/api/v1/namespaces/options/pods/use-pod-association-enabled`.

```diff
-    "selfLink": "/api/v1/namespaces/core/pods/base",
+    "selfLink": "/api/v1/namespaces/options/pods/use-pod-association-enabled",
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```diff
-    "selfLink": "/api/v1/namespaces/core/pods/base",
+    "selfLink": "/api/v1/namespaces/options/pods/use-pod-association-enabled",
```
🤖 Prompt for AI Agents
In tests/runtime/data/kubernetes/meta/options_use-pod-association-enabled.meta
at line 19, the selfLink incorrectly uses the namespace "core" instead of
"options". Update the selfLink value to
"/api/v1/namespaces/options/pods/use-pod-association-enabled" to match the
actual Pod namespace and name.
```c
// Helper function to clear the file
static void clear_file(const char *filename) {
    FILE *file;

    // Open the file in "w" mode to empty it
    file = fopen(filename, "w");
    if (file == NULL) {
        perror("Error opening file to clear content");
        return;
    }

    // Close the file to complete truncation
    fclose(file);
}
```
🛠️ Refactor suggestion
Add error handling for file operations.
The `clear_file` function logs errors but continues execution, which could lead to test failures that are hard to diagnose.
```diff
-static void clear_file(const char *filename) {
+static int clear_file(const char *filename) {
     FILE *file;

     // Open the file in "w" mode to empty it
     file = fopen(filename, "w");
     if (file == NULL) {
         perror("Error opening file to clear content");
-        return;
+        return -1;
     }

     // Close the file to complete truncation
     fclose(file);
+    return 0;
 }
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```c
// Helper function to clear the file
static int clear_file(const char *filename) {
    FILE *file;

    // Open the file in "w" mode to empty it
    file = fopen(filename, "w");
    if (file == NULL) {
        perror("Error opening file to clear content");
        return -1;
    }

    // Close the file to complete truncation
    fclose(file);
    return 0;
}
```
🤖 Prompt for AI Agents
In tests/runtime/filter_kubernetes.c around lines 72 to 85, the clear_file
function currently logs errors when failing to open the file but continues
execution, which can cause unclear test failures. Modify the function to handle
errors more robustly by returning an error code or using a mechanism to signal
failure to the caller, and ensure the caller checks this result to handle the
error appropriately instead of proceeding silently.
```c
static void write_log_to_file(const char *filename) {
    FILE *file;
    char log_entry[512];

    // Log message to write
    const char *log_template = "{\"log\":\"Fluent Bit is logging\\n\",\"stream\":\"stdout\",\"time\":\"2019-04-01T17:58:33.598656444Z\"}";

    // Open the file for appending
    file = fopen(filename, "a");
    if (file == NULL) {
        perror("Error opening file");
        return;
    }
    // Format the final log entry with the current time
    snprintf(log_entry, sizeof(log_entry), log_template);

    // Write the log entry to the file
    fprintf(file, "%s\n", log_entry);

    // Close the file
    fclose(file);
}
```
🛠️ Refactor suggestion
Improve error handling and remove hardcoded template.
The function has similar error handling issues and uses a hardcoded timestamp that might not match expected test data.
```diff
-static void write_log_to_file(const char *filename) {
+static int write_log_to_file(const char *filename, const char *timestamp) {
     FILE *file;
     char log_entry[512];

     // Log message to write
-    const char *log_template = "{\"log\":\"Fluent Bit is logging\\n\",\"stream\":\"stdout\",\"time\":\"2019-04-01T17:58:33.598656444Z\"}";
+    const char *log_template = "{\"log\":\"Fluent Bit is logging\\n\",\"stream\":\"stdout\",\"time\":\"%s\"}";

     // Open the file for appending
     file = fopen(filename, "a");
     if (file == NULL) {
         perror("Error opening file");
-        return;
+        return -1;
     }
     // Format the final log entry with the current time
-    snprintf(log_entry, sizeof(log_entry), log_template);
+    snprintf(log_entry, sizeof(log_entry), log_template, timestamp ? timestamp : "2019-04-01T17:58:33.598656444Z");

     // Write the log entry to the file
     fprintf(file, "%s\n", log_entry);

     // Close the file
     fclose(file);
+    return 0;
 }
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```c
static int write_log_to_file(const char *filename, const char *timestamp) {
    FILE *file;
    char log_entry[512];

    // Log message to write
    const char *log_template =
        "{\"log\":\"Fluent Bit is logging\\n\",\"stream\":\"stdout\",\"time\":\"%s\"}";

    // Open the file for appending
    file = fopen(filename, "a");
    if (file == NULL) {
        perror("Error opening file");
        return -1;
    }
    // Format the final log entry with the current time
    snprintf(log_entry, sizeof(log_entry),
             log_template,
             timestamp ? timestamp : "2019-04-01T17:58:33.598656444Z");

    // Write the log entry to the file
    fprintf(file, "%s\n", log_entry);

    // Close the file
    fclose(file);
    return 0;
}
```
🤖 Prompt for AI Agents
In tests/runtime/filter_kubernetes.c around lines 88 to 109, improve error
handling by checking the return values of file operations and handle errors
appropriately. Replace the hardcoded timestamp in the log template with a
dynamically generated current timestamp to ensure the log entry matches expected
test data. Use appropriate time functions to format the timestamp string before
writing the log entry to the file.
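A standalone sketch of the "use appropriate time functions" suggestion: formatting a current UTC timestamp in the RFC 3339 shape used by the log template. The nanosecond part is filled with zeros here for simplicity, and `gmtime_r` (POSIX) is assumed to be available in the test environment.

```c
#include <stdio.h>
#include <time.h>

int main(void)
{
    char stamp[64];
    char with_nanos[80];
    time_t now = time(NULL);
    struct tm tm_utc;

    /* Thread-safe UTC breakdown of the current time. */
    gmtime_r(&now, &tm_utc);
    strftime(stamp, sizeof(stamp), "%Y-%m-%dT%H:%M:%S", &tm_utc);

    /* Append a fixed fractional part and the UTC suffix, matching the shape
     * of the hardcoded value (e.g. 2019-04-01T17:58:33.598656444Z). */
    snprintf(with_nanos, sizeof(with_nanos), "%s.000000000Z", stamp);

    printf("time field: %s\n", with_nanos);
    return 0;
}
```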
…n ups Signed-off-by: Zhihong Lin <[email protected]>
Background
CloudWatch introduced a new feature called Explore Related, a UI component that helps users navigate between their telemetry (metrics, logs) and AWS resources. This is done by including a new field, called Entity, in the PutLogEvents API. More documentation about this new field can be found here: https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_Entity.html
Problem
To enable the Explore Related experience in an AWS EKS cluster, we need additional data such as the workload type (deployment, daemonset, etc.) and the cluster name. There is also an option to associate the name of the service running within the cluster, but Fluent Bit doesn't have existing functionality for that.
Enter `[N/A]` in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:
Screenshot of how UI interacts with the changes made in this PR
Example configuration file for the change
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
`ok-package-test` label to test for all targets (requires maintainer to do).

Documentation
Need to add documentation about the new configuration values that enable the Explore Related experience for users.
Backporting
Fluent Bit is licensed under Apache 2.0; by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Bug Fixes
Tests