Fix: dictionary changed size during iteration in GCSObjectMetadataClient #468

ikyasam18 · 2025-06-16T12:06:37Z

The issue was in the _adjust_gcs_metadata_limit_size method, where the labels
dictionary was being modified while it was being iterated over. Fixed by:

Creating a list of keys to remove first
Removing the keys after completing the iteration

This prevents the "dictionary changed size during iteration" error that
occurred when large metadata needed to be adjusted to fit GCS size limits.
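
The two steps above can be sketched roughly as follows. This is a minimal illustration of the collect-then-delete pattern, not the actual gokart code: the names `trim_labels` and `label_size` are hypothetical, and the rule for choosing which labels to drop is an assumption made for the example.

```python
from typing import Dict, List

MAX_SIZE = 8 * 1024  # limit in bytes, assumed here for illustration


def label_size(key: str, value: str) -> int:
    """Hypothetical helper: size of one label in UTF-8 bytes."""
    return len(key.encode('utf-8')) + len(value.encode('utf-8'))


def trim_labels(labels: Dict[str, str], max_size: int = MAX_SIZE) -> Dict[str, str]:
    """Remove labels that do not fit within max_size (mutates and returns labels)."""
    current_size = 0
    keys_to_remove: List[str] = []
    # Pass 1: only read the dict while iterating and record the keys that overflow.
    for key, value in labels.items():
        size = label_size(key, value)
        if current_size + size > max_size:
            keys_to_remove.append(key)
        else:
            current_size += size
    # Pass 2: delete after the iteration has finished, so no RuntimeError is raised.
    for key in keys_to_remove:
        del labels[key]
    return labels
```

Calling `del labels[key]` inside the first loop instead would change the dict's size mid-iteration, which is what triggers the error this PR fixes.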

Copilot

Pull Request Overview

This PR fixes a crash in _adjust_gcs_metadata_limit_size by avoiding in-place dict modification during iteration and adds a test to verify metadata trimming.

Accumulates keys to remove before deleting them to prevent “dictionary changed size during iteration” errors
Adds a unit test to confirm total metadata size remains within GCS limits

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
gokart/gcs_obj_metadata_client.py	Change loop to collect and then remove keys
test/test_gcs_obj_metadata_client.py	New test ensuring large metadata is truncated

Comments suppressed due to low confidence (3)

test/test_gcs_obj_metadata_client.py:143

[nitpick] The test name suggests an expected runtime error but actually checks size truncation. Rename it to something like test_adjust_gcs_metadata_limit_size_truncates_labels for clarity.

def test_adjust_gcs_metadata_limit_size_runtime_error(self):

test/test_gcs_obj_metadata_client.py:151

Consider adding assertions to verify that specific keys beyond the size limit were removed, not just the total size, for a more robust test.

self.assertLessEqual(total_size, 8 * 1024)

gokart/gcs_obj_metadata_client.py:159

This method mutates the labels dict in-place and returns it. To avoid unexpected side effects, consider returning a new dict or documenting the in-place behavior in the docstring.

def _get_label_size(label_name: str, label_value: str) -> int:
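
One way to follow this suggestion is to build and return a new dict rather than deleting from the input. The sketch below is only an illustration under that assumption; `trimmed_copy` is a hypothetical name and does not exist in gokart.

```python
from typing import Dict


def trimmed_copy(labels: Dict[str, str], max_size: int) -> Dict[str, str]:
    """Return a new dict holding the labels that fit; the input is left untouched."""
    kept: Dict[str, str] = {}
    current_size = 0
    for key, value in labels.items():
        size = len(key.encode('utf-8')) + len(value.encode('utf-8'))
        if current_size + size <= max_size:
            kept[key] = value
            current_size += size
    return kept
```

Because nothing is removed from `labels`, there is no dict mutation during iteration at all, and callers keep their original metadata.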

gokart/gcs_obj_metadata_client.py

hiro-o918 · 2025-06-16T12:27:33Z

test/test_gcs_obj_metadata_client.py

+        result = GCSObjectMetadataClient._adjust_gcs_metadata_limit_size(large_labels)
+
+        total_size = sum(len(k.encode('utf-8')) + len(v.encode('utf-8')) for k, v in result.items())
+        self.assertLessEqual(total_size, 8 * 1024)


Can you define a constant in gokart/gcs_obj_metadata_client.py

MAX_GCS_METADATA_SIZE: Final[int] = 8 * 1024

and refer to it both here and at L157 of gokart/gcs_obj_metadata_client.py?
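
A rough sketch of what sharing the constant could look like. `MAX_GCS_METADATA_SIZE` is the name proposed in this comment; the helper functions `total_label_size` and `fits_in_gcs_metadata` are hypothetical and only serve to show one definition being used in two places.

```python
from typing import Dict, Final

# Single definition of the limit (8 KiB), as proposed in the review comment.
MAX_GCS_METADATA_SIZE: Final[int] = 8 * 1024


def total_label_size(labels: Dict[str, str]) -> int:
    """Hypothetical helper: total UTF-8 byte size of all keys and values."""
    return sum(len(k.encode('utf-8')) + len(v.encode('utf-8')) for k, v in labels.items())


def fits_in_gcs_metadata(labels: Dict[str, str]) -> bool:
    """Hypothetical check that production code and tests could both call."""
    return total_label_size(labels) <= MAX_GCS_METADATA_SIZE
```

A test could then assert against `MAX_GCS_METADATA_SIZE` instead of repeating the literal `8 * 1024`.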

Thank you for your review. I've fixed it!

kitagry

LGTM! Thanks

kitagry · 2025-06-16T12:48:09Z

test/test_gcs_obj_metadata_client.py

+
+        result = GCSObjectMetadataClient._adjust_gcs_metadata_limit_size(large_labels)
+
+        total_size = sum(len(k.encode('utf-8')) + len(v.encode('utf-8')) for k, v in result.items())


Is encode('utf-8') needed?

With non-ASCII characters (such as Japanese), it is necessary because the number of characters and the number of bytes do not match.
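
A small illustration of the difference between character count and byte count. The label strings are made up for the example.

```python
ascii_label = 'task'
japanese_label = 'タスク'  # three katakana characters

# len() on a str counts characters (code points), not bytes.
char_counts = (len(ascii_label), len(japanese_label))

# Encoding to UTF-8 first gives the length in bytes.
byte_counts = (len(ascii_label.encode('utf-8')), len(japanese_label.encode('utf-8')))

print(char_counts)  # (4, 3)
print(byte_counts)  # (4, 9)
```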

I agree. But is it needed in this test case?

I apologize. I mistook it for production code. Since this test only needs to verify that no error occurs, it doesn't seem necessary.

hiro-o918

LGTM!

kitagry · 2025-06-18T06:54:17Z

Thank you!

Fix: dictionary changed size during iteration in GCSObjectMetadataClient

954a135

ikyasam18 requested a review from hiro-o918 as a code owner June 16, 2025 12:06

chroe:ruff

61a5c22

hiro-o918 requested a review from Copilot June 16, 2025 12:15

Copilot AI reviewed Jun 16, 2025

hiro-o918 reviewed Jun 16, 2025

Masayuki-Kamoda added 2 commits June 16, 2025 21:33

refactor: Convert lists to tuples to reduce memory usage

e6357e5

fix: Use constant for GCS metadata size

fed2512

kitagry approved these changes Jun 16, 2025

hiro-o918 approved these changes Jun 16, 2025

fix: Remove unnecessary result assignment in metadata size adjustment test

4b53078

kitagry merged commit 8cf6e6e into m3dev:master Jun 18, 2025
8 checks passed
