
out_s3: added blob handling feature #9907


Open
wants to merge 19 commits into master

Conversation

leonardo-albertovich
Collaborator

No description provided.


@swapneils swapneils left a comment


Added some comments that I'd prefer to clarify and/or change before merging this.

I'm also curious why the Ubuntu unit tests are failing.


    /* A match against "$TAG[" indicates an invalid or out of bounds tag part. */
    if (strstr(s3_key, tmp)){
        flb_warn("[s3_key] Invalid / Out of bounds tag part: At most 10 tag parts "


Nit: Is going above 10 tag parts the most likely cause of this issue? The "first part" of the error message implies that tags in the wrong place also trigger this check, in which case the "second part" should mention that in more detail as well to avoid confusing operators.
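
For reference, a standalone sketch of how I read this check (illustrative only, not the plugin's actual code): once the $TAG[n] placeholders have been substituted, any leftover "$TAG[" literal in the rendered key means the format referenced a tag part that isn't available, so a warning is emitted.

#include <stdio.h>
#include <string.h>

static void check_tag_parts(const char *s3_key)
{
    const char *tmp = "$TAG[";   /* marker that should be gone after substitution */

    if (strstr(s3_key, tmp) != NULL) {
        fprintf(stderr,
                "[s3_key] invalid / out of bounds tag part in key: %s\n",
                s3_key);
    }
}

int main(void)
{
    check_tag_parts("logs/app/file");       /* fully substituted: no warning   */
    check_tag_parts("logs/$TAG[11]/file");  /* unresolved part: prints warning */
    return 0;
}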

Collaborator Author


This part of the code is faithful to the original; I'd raise that point with the code owner, who would be much more qualified to address it than myself.


valid_blob_path = (char *) blob_path;

while (*valid_blob_path == '.' ||


Can this value ever use the ../ format? If so the semantics here would ignore that motion, which is incorrect.

Collaborator Author


Yes, ignoring that part would be correct. Regardless, what we want is the internal structure, so ./a/b/c/d/file.txt would be the same as ../../../../a/b/c/d/file.txt, because what we want is a/b/c/d/file.txt.
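
To make that concrete, here's a standalone sketch of the normalization (illustrative only; the helper name is made up, and I'm assuming the truncated condition above also skips '/'):

#include <stdio.h>

static const char *skip_relative_prefix(const char *blob_path)
{
    const char *valid_blob_path = blob_path;

    /* skip any leading '.' and '/' characters */
    while (*valid_blob_path == '.' || *valid_blob_path == '/') {
        valid_blob_path++;
    }

    return valid_blob_path;
}

int main(void)
{
    /* both lines print "a/b/c/d/file.txt" */
    printf("%s\n", skip_relative_prefix("./a/b/c/d/file.txt"));
    printf("%s\n", skip_relative_prefix("../../../../a/b/c/d/file.txt"));
    return 0;
}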


Got it. Do we have a way to deal with duplicates in this case, e.g. ./logs/1.txt and ../../logs/1.txt both being sent to the same bucket? With the current implementation it seems like these would overwrite each other, which is unintuitive since from the user side these are two different filepaths.

Alternatively, can we gate this filtering behind a blob-specific flag of some sort? (Assuming the above concern is valid, that is, we shouldn't be introducing flags unless they provide customer benefit.)
I think it's probably reasonable for the current behavior to be the default, to minimize the chance of too-long S3 keys, but users should be able to see this in documentation and opt-out if their specific filenames are incompatible with the assumption that the internal directory structure is a unique key.

Collaborator Author


I think there's been a misunderstanding here. Firstly, this is already gated, as it's in flb_get_s3_blob_key, which is only used for blobs. Also, there should be no way for a blob to have a path that includes ../.

struct multipart_upload *m_upload,
char *pre_signed_url);

int abort_multipart_upload(struct flb_s3 *ctx,


Why are we creating this function? I see we're using it to abort multipart blob uploads in s3.c, but why is that needed here?

Collaborator Author


I don't think I'm following you; it's there because it's a prototype. Could you please clarify the point?


Nvm, I see from here that we set this status to 1 in sqlite on exit or retry exhaustion, and given it's only called within the upload function itself there shouldn't be issues with race conditions making us retry when we shouldn't.


int flb_blob_db_lock(struct flb_blob_db *context)
{
    return flb_lock_acquire(&context->global_lock,


As I understand it, every blob publish operation depends on manipulating the same sqlite table. Do we know how this affects performance when parallelizing publishes? Is there a way to get a narrower lock here?

Collaborator Author


I tried to narrow the scope of the locks as much as possible; as it is, the only parts that take the lock are the ones that query or modify the database. There are very few external calls in that part of the code (aborting an upload, obtaining the pre-signed URL, committing an upload), and to be honest I'd rather err on the safe side than risk introducing a bug in this case.
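
To illustrate the pattern, here's a standalone sketch (a pthread mutex stands in for the flb_lock API and the helper names are made up): the lock is held only around the database access, while external calls such as the upload run unlocked.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t blob_db_lock = PTHREAD_MUTEX_INITIALIZER;

static int query_next_part(int *part_id)
{
    /* placeholder for the SQLite query that picks the next pending part */
    *part_id = 42;
    return 0;
}

static int upload_part(int part_id)
{
    /* placeholder for the external S3 call; runs without the lock held */
    printf("uploading part %d\n", part_id);
    return 0;
}

int process_one_part(void)
{
    int part_id;
    int ret;

    pthread_mutex_lock(&blob_db_lock);
    ret = query_next_part(&part_id);      /* DB access under the lock */
    pthread_mutex_unlock(&blob_db_lock);

    if (ret != 0) {
        return ret;
    }

    return upload_part(part_id);          /* network I/O outside the lock */
}

int main(void)
{
    return process_one_part();
}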


int flb_blob_db_unlock(struct flb_blob_db *context)
{
    return flb_lock_release(&context->global_lock,


Nit: out of curiosity, why are we using global_lock here but db_lock in azure_blob_db.c?

Collaborator Author


azure_blob_db.c was the first version of this code which in turn was derived from another piece of code. When I took over this implementation I decided to abstract the blob database management into a globally available component to prevent future code duplication.

Back then the plan was to refactor azure_blob to use this component but that slipped through the cracks.


    s3_client = ctx->s3_client;
    if (s3_plugin_under_test() == FLB_TRUE) {
        /* c = mock_s3_call("TEST_ABORT_MULTIPART_UPLOAD_ERROR", "AbortMultipartUpload"); */


Nit: Do we need this line?

Collaborator Author


Yes, we do, because it's key to the testing system. However, I left it commented out because mock_s3_call does not have a code path for AbortMultipartUpload, which I meant to add but forgot about.

I'll add that and uncomment the line.

time_t upload_parts_freshness_threshold;
int file_delivery_attempt_limit;
int part_delivery_attempt_limit;
flb_sds_t authorization_endpoint_url;


How are we using this authorization endpoint? Isn't the process of publishing binary data to S3 equivalent to string data, +/- the declared body type in the HTTP requests?

Collaborator Author


The authorization endpoint is an internal requirement; @edsiper might be able to explain it better. It's basically a service that provides pre-signed URLs.

sched = flb_sched_ctx_get();

/* convert from seconds to milliseconds (scheduler needs ms) */
ms = ctx->upload_parts_timeout * 1000;


Is this the correct semantics?

From the description of upload_parts_timeout below it seems like a maximum age to retry blob publishing, past which we will drop a blob instead of retrying.
As I understand the code, here we're instead inserting blob parts into the database and then attempting to publish them every upload_parts_timeout seconds, dropping them on the first failure.

Collaborator Author


It's a naming error; the maximum age is dictated by upload_part_freshness_limit.

As for the dropping part, that's dictated by file_delivery_attempt_limit and part_delivery_attempt_limit which default to 1.
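
Roughly, the relationship between those settings looks like this (standalone sketch; the decision logic below is an assumption for illustration, not the plugin's actual code). The scheduler interval only controls how often pending parts are re-checked; dropping is driven by the freshness limit and the attempt limits.

#include <stdbool.h>
#include <stdio.h>
#include <time.h>

struct part_state {
    time_t created;
    int delivery_attempts;
};

static bool should_drop_part(const struct part_state *part,
                             time_t now,
                             time_t upload_part_freshness_limit,
                             int part_delivery_attempt_limit)
{
    if (now - part->created > upload_part_freshness_limit) {
        return true;   /* part is too old to keep retrying */
    }

    if (part->delivery_attempts >= part_delivery_attempt_limit) {
        return true;   /* attempt limit (default 1) exhausted */
    }

    return false;
}

int main(void)
{
    struct part_state part = { .created = time(NULL), .delivery_attempts = 1 };

    /* with the default attempt limit of 1, this part would be dropped */
    printf("%d\n", should_drop_part(&part, time(NULL), 3600, 1));
    return 0;
}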

@@ -732,6 +732,203 @@ char* strtok_concurrent(
#endif
}

/* Constructs S3 object key as per the blob format. */
flb_sds_t flb_get_s3_blob_key(const char *format,

A majority of this is duplicated in flb_get_s3_key(...); does it make sense to try and introduce two helpers?

This could avoid possible divergence between the two methods over time.

Collaborator Author


There are a few key differences between the two, and since I'm not the code owner I'm not comfortable making heavy changes to the original, so I'd rather not.

char *tag,
char *source,
char *destination,
char *path,

There are other instances where we are using cfl_sds_t in place of char *; should we be consistent and migrate all char * here to cfl_sds_t?

Collaborator Author


No, those are output parameters. The blob db component always returns cfl_sds_t strings (which is not just a typedef for char *), but it cannot force the client code to provide them for input parameters, as all it needs are NULL-terminated strings. Otherwise we'd force client code to unnecessarily duplicate strings, which would make the API very cumbersome and thus reduce its usage.
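
To make the convention concrete, here's a small standalone sketch (the function name and the strdup-based stand-in for cfl_sds_t are made up for illustration; the real type lives in the cfl library):

#include <stdlib.h>
#include <string.h>

/* Stand-in only: the real cfl_sds_t is also declared as char *, but it
 * points just past a hidden length header and is released with
 * cfl_sds_destroy(). The point here is the convention: inputs are plain
 * NUL-terminated strings owned by the caller, outputs are allocated by
 * the blob db layer and handed back as cfl_sds_t. */
typedef char *cfl_sds_t;
struct flb_blob_db;               /* opaque context, defined elsewhere */

int blob_db_file_lookup(struct flb_blob_db *context,
                        const char *path,            /* input parameter  */
                        cfl_sds_t *out_source,       /* output parameter */
                        cfl_sds_t *out_destination)  /* output parameter */
{
    (void) context;
    (void) path;

    /* stand-in for the real allocation done by the blob db component */
    *out_source = strdup("blob source");
    *out_destination = strdup("blob destination");

    return 0;
}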


/* Fluent Bit
* ==========
* Copyright (C) 2015-2024 The Fluent Bit Authors

Should the copyright block include our current year? 🫣

Collaborator Author


Good catch

}


/* file destination update */

Should this comment include remote-id (similar to file part below)?

Collaborator Author


Yes, copy & paste betrayed me

return FLB_BLOB_DB_ERROR_PREPARING_STATEMENT_GET_NEXT_FILE_PART;
}

result = sqlite3_prepare_v2(context->db->handler,

Should these next results contain a comment above (similar to prior prepared statement blocks)?

Collaborator Author


I'll add them just to be consistent, but TBH the constant name and text are almost the same; IIRC that's why I left the original comments but didn't add new ones.


static int flb_blob_db_file_reset_part_upload_states(struct flb_blob_db *context,
uint64_t id,
char *path)

Similar to above comments on unused parameters

Collaborator Author

@leonardo-albertovich leonardo-albertovich Apr 24, 2025


Thank you, I'll fix it.


int flb_blob_db_file_reset_upload_states(struct flb_blob_db *context,
uint64_t id,
char *path)

Similar to above comments on unused parameters

Collaborator Author

@leonardo-albertovich leonardo-albertovich Apr 24, 2025


Thank you, I'll fix it.

uint64_t part_id,
size_t offset_start,
size_t offset_end,
int64_t *out_id)

Similar to above comments on unused parameters

Collaborator Author


Thank you. I think this one is different from the others; it seems I forgot to set an output parameter. I'll fix it.


flb_blob_db_lock(&ctx->blob_db);

while (1) {

Collaborator Author


Sure, I've addressed it, thank you.

return ret;
}

static int blob_fetch_multipart_complete_pre_signed_url(struct flb_s3 *context,

These fetch_*_pre_signed_url methods are near duplicates with minor changes to the tmp (path) variable; can we consider using a helper method for getting the pre-signed URL and having each method handle its own tmp/path generation, to reduce code?

Collaborator Author


Makes sense, I'll address it.


If it's a big lift, we can push it to a later tech-debt PR that is more about code-shrink than feature-build
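
For the record, the shape being suggested is roughly this (standalone sketch; the signatures, the stub body and the path format are all made up for illustration, not the plugin's real code): one shared helper talks to the authorization endpoint, and each fetch_*_pre_signed_url variant only builds its own path.

#include <stdio.h>

struct flb_s3 { int dummy; };  /* stand-in for the plugin context */

/* hypothetical shared helper: one place that would perform the
 * authorization-endpoint exchange for an already-built path */
static int blob_fetch_pre_signed_url(struct flb_s3 *context,
                                     const char *path,
                                     char *url_out, size_t url_size)
{
    (void) context;
    /* stub: the real helper would perform the HTTP request here */
    snprintf(url_out, url_size, "https://example.invalid/presigned%s", path);
    return 0;
}

/* each variant then shrinks to path construction plus one call */
static int fetch_multipart_complete_url_example(struct flb_s3 *context,
                                                const char *bucket,
                                                const char *key,
                                                const char *upload_id,
                                                char *url_out,
                                                size_t url_size)
{
    char tmp[1024];

    snprintf(tmp, sizeof(tmp), "/%s/%s?uploadId=%s", bucket, key, upload_id);

    return blob_fetch_pre_signed_url(context, tmp, url_out, url_size);
}

int main(void)
{
    struct flb_s3 ctx = {0};
    char url[2048];

    fetch_multipart_complete_url_example(&ctx, "my-bucket", "blob/key",
                                         "upload-123", url, sizeof(url));
    printf("%s\n", url);
    return 0;
}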

@@ -334,7 +334,7 @@ static int complete_multipart_upload_payload(struct flb_s3 *ctx,
     int offset = 0;
     flb_sds_t etag;
     size_t size = COMPLETE_MULTIPART_UPLOAD_BASE_LEN;
-    char part_num[7];
+    char part_num[11];

Are we not limited to numbers 1..10000 per docs - https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html

Part numbers can be any number from 1 to 10,000, inclusive.

Collaborator Author


AWS might reject the request if the part number is higher than 10000, but there's an sprintf call which writes whatever 32-bit signed integer it gets in the part number field into that buffer. Considering how memory alignment works, I'm not worried about spending 8 more bytes if that means that in such a fringe case we get a nice error rather than a stack overflow.
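
For reference, the sizing argument in standalone form (illustration only, not the plugin code): "%d" for a non-negative 32-bit integer needs at most 10 digits plus the terminating NUL, so an 11-byte buffer holds any valid part number, while the old 7-byte buffer only covers values up to 6 digits. Using a bounded write keeps even unexpected values from running past the end.

#include <limits.h>
#include <stdio.h>

int main(void)
{
    char part_num[11];

    snprintf(part_num, sizeof(part_num), "%d", 10000);    /* normal case */
    printf("%s\n", part_num);

    snprintf(part_num, sizeof(part_num), "%d", INT_MAX);  /* 10 digits, still fits */
    printf("%s\n", part_num);

    return 0;
}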

@leonardo-albertovich
Collaborator Author

Please let me know if there's anything I missed; thanks for taking the time to review this PR.

@edsiper
Member

edsiper commented May 16, 2025

@leonardo-albertovich this is the last item to address: #9907 (comment)

@edsiper
Member

edsiper commented Jun 27, 2025

PR looks good and I ran some extra tests. However, I need to do a manual rebase since GH is not allowing it.
