
Conversation

@ryan-gang (Contributor) commented on Jul 26, 2025

Description

  • Added utility functions to get encoded bytes and the length of encodable objects.
  • Removed getEncodedLength methods in favor of utils.GetEncodedLength for consistent encoding length calculation.
  • Replaced serializer.GetEncodedBytes with utils.GetEncodedBytes to unify encoding calls across packages.

This is part 2 of 2 in a stack made with GitButler.

Summary by CodeRabbit

  • Refactor

    • Centralized encoding utilities by moving encoding functions to a shared utilities package.
    • Updated internal logic to use new utility functions for encoding and length calculation.
    • Refactored record encoding to ensure length-prefixed encoding is handled consistently.
  • Bug Fixes

    • Corrected record encoding to properly handle empty keys.
  • New Features

    • Added utility functions for obtaining encoded bytes and length for encodable objects.


coderabbitai bot commented Jul 26, 2025

Walkthrough

This change centralizes encoding utilities by moving GetEncodedBytes and introducing GetEncodedLength in the utils package, replacing previous usage in the serializer package. It refactors record batch encoding logic to use these utilities, removes redundant length calculation methods, and updates all relevant calls and imports accordingly.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| **Assertion/Test Utility Refactor**<br>`internal/assertions/fetch_response_assertion.go`, `internal/cluster_metadata_payload_test.go` | Updated imports and replaced `serializer.GetEncodedBytes` with `utils.GetEncodedBytes` in assertions and tests. |
| **Cluster Metadata Serialization**<br>`protocol/serializer/cluster_metadata.go`, `protocol/serializer/cluster_metadata_binspec.go` | Switched all `GetEncodedBytes` calls to `utils.GetEncodedBytes` and updated imports. |
| **Produce Request Encoding**<br>`protocol/api/produce_request.go` | Replaced the manual encoded-length calculation with `utils.GetEncodedLength` in partition data encoding. |
| **Record Batch Encoding Refactor**<br>`protocol/api/record_batch.go` | Removed `getEncodedLength` methods, added `EncodeFull`, refactored encoding logic to use the new utilities, and fixed a nil-key check bug. |
| **Serializer Utilities Cleanup**<br>`protocol/serializer/utils.go` | Removed `GetEncodedBytes` and related imports from the serializer utilities. |
| **Utils Package Enhancements**<br>`protocol/utils/utils.go` | Added `GetEncodedBytes` and `GetEncodedLength` utility functions for generic encoding and length calculation. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant Utils
    participant Encoder

    Caller->>Utils: GetEncodedBytes(encodableObject)
    Utils->>Encoder: NewEncoder(buffer)
    Utils->>Encoder: encodableObject.Encode(encoder)
    Utils-->>Caller: Encoded bytes

    Caller->>Utils: GetEncodedLength(encodableObject)
    Utils->>Utils: GetEncodedBytes(encodableObject)
    Utils-->>Caller: len(encoded bytes)
```
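The flow in the diagram can be sketched in Go. This is an illustrative reconstruction, not the repository's code: the real utilities live in `protocol/utils/utils.go` and operate on the project's `Encodable` interface and `*encoder.Encoder`, while here a simplified interface and `bytes.Buffer` stand in, and `demoRecord` is a hypothetical type used only for the demo.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Encodable is a simplified stand-in for the project's
// protocol/interface Encodable (the real one takes an *encoder.Encoder).
type Encodable interface {
	Encode(buf *bytes.Buffer)
}

// GetEncodedBytes sketches the utility described in the walkthrough:
// create an encoder, encode the object, return the bytes written.
func GetEncodedBytes(e Encodable) []byte {
	var buf bytes.Buffer
	e.Encode(&buf)
	return buf.Bytes()
}

// GetEncodedLength is defined in terms of GetEncodedBytes,
// mirroring the second exchange in the sequence diagram.
func GetEncodedLength(e Encodable) int {
	return len(GetEncodedBytes(e))
}

// demoRecord is a hypothetical Encodable used only for illustration.
type demoRecord struct {
	ID uint32
}

func (r demoRecord) Encode(buf *bytes.Buffer) {
	binary.Write(buf, binary.BigEndian, r.ID) // a uint32 is 4 bytes
}

func main() {
	r := demoRecord{ID: 42}
	fmt.Println(GetEncodedLength(r)) // prints 4
}
```

Defining the length helper on top of the bytes helper is what lets callers drop their per-type `getEncodedLength` methods.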

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~18 minutes

Possibly related PRs

  • Add FetchResponseAssertion #72: Also updates fetch_response_assertion.go to use utils.GetEncodedBytes, showing a direct code-level relationship.
  • Produce framework #1 #78: Both PRs modify encoding length logic in produce_request.go and record_batch.go, relating to how encoded lengths are calculated and used.

Suggested reviewers

  • rohitpaulk



📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e57af10 and 7c28edb.

📒 Files selected for processing (8)
  • internal/assertions/fetch_response_assertion.go (2 hunks)
  • internal/cluster_metadata_payload_test.go (7 hunks)
  • protocol/api/produce_request.go (2 hunks)
  • protocol/api/record_batch.go (3 hunks)
  • protocol/serializer/cluster_metadata.go (5 hunks)
  • protocol/serializer/cluster_metadata_binspec.go (3 hunks)
  • protocol/serializer/utils.go (0 hunks)
  • protocol/utils/utils.go (1 hunks)
💤 Files with no reviewable changes (1)
  • protocol/serializer/utils.go
🧰 Additional context used
🧬 Code Graph Analysis (6)
internal/cluster_metadata_payload_test.go (1)
protocol/utils/utils.go (1)
  • GetEncodedBytes (11-18)
protocol/serializer/cluster_metadata_binspec.go (2)
protocol/utils/utils.go (1)
  • GetEncodedBytes (11-18)
protocol/api/record_batch.go (1)
  • RecordHeader (350-353)
internal/assertions/fetch_response_assertion.go (1)
protocol/utils/utils.go (1)
  • GetEncodedBytes (11-18)
protocol/utils/utils.go (4)
protocol/interface/interface.go (1)
  • Encodable (13-15)
protocol/encoder/encoder.go (1)
  • Encoder (13-16)
protocol/serializer/utils.go (1)
  • GetEncodedBytes (14-26)
protocol/api/fetch_request.go (2)
  • Encode (32-48)
  • encode (86-113)
protocol/api/produce_request.go (2)
protocol/utils/utils.go (1)
  • GetEncodedLength (20-22)
protocol/api/record_batch.go (1)
  • RecordBatches (15-15)
protocol/api/record_batch.go (2)
protocol/encoder/encoder.go (1)
  • Encoder (13-16)
protocol/utils/utils.go (1)
  • GetEncodedLength (20-22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cursor Bugbot
🔇 Additional comments (13)
protocol/serializer/cluster_metadata_binspec.go (2)

10-10: LGTM: Import statement correctly added.

The utils package import has been properly added to support the centralized encoding utilities.


92-92: LGTM: Function calls consistently updated.

All GetEncodedBytes calls have been correctly updated to use utils.GetEncodedBytes, which aligns with the PR objective of centralizing encoding utilities. The refactoring is consistent across all record value assignments.

Also applies to: 113-113, 120-120, 127-127

internal/assertions/fetch_response_assertion.go (2)

9-9: LGTM: Import correctly updated to utils package.

The import has been properly changed from the serializer package to the centralized utils package.


186-187: LGTM: Encoding function calls correctly updated.

Both utils.GetEncodedBytes calls for expected and actual record batches are correctly updated to use the centralized utility function. The byte comparison logic remains intact.

internal/cluster_metadata_payload_test.go (2)

9-9: LGTM: Test import correctly updated.

The import has been properly changed to use the centralized utils package for encoding utilities.


160-160: LGTM: All test encoding calls consistently updated.

All test functions have been correctly updated to use utils.GetEncodedBytes, maintaining test functionality while using the centralized encoding utilities. The test coverage remains intact.

Also applies to: 177-177, 190-190, 204-204, 219-219, 244-244

protocol/serializer/cluster_metadata.go (2)

9-9: LGTM: Utils package import properly added.

The utils package import has been correctly added to support the centralized encoding utilities.


144-144: LGTM: Encoding function calls consistently refactored.

All GetEncodedBytes calls have been systematically updated to use utils.GetEncodedBytes, maintaining consistency with the centralization effort across the codebase. The record batch encoding logic remains unchanged.

Also applies to: 165-165, 172-172, 193-193, 200-200, 221-221, 228-228, 235-235

protocol/api/produce_request.go (2)

6-6: LGTM: Utils package import added for centralized utilities.

The utils package import has been correctly added to support the new centralized encoding length calculation.


16-16: LGTM: Encoding length calculation correctly centralized.

The call has been properly updated to use utils.GetEncodedLength() instead of the previous getEncodedLength() method, which aligns with the PR objective of centralizing encoding utilities. The type cast RecordBatches(p.RecordBatches) ensures proper interface compatibility.

protocol/utils/utils.go (1)

11-18: LGTM! Clean implementation of encoding utility.

The GetEncodedBytes function correctly initializes an encoder, encodes the object, and returns the appropriate byte slice. The buffer size of 1024 bytes should be sufficient for most use cases.

protocol/api/record_batch.go (2)

61-61: LGTM! Correctly uses the new EncodeFull method.

The change from record.Encode(pe) to record.EncodeFull(pe) ensures that each record is properly length-prefixed in the record batch encoding.


228-243: LGTM! Encode method correctly implements record fields.

The Encode method properly encodes all record fields including attributes, timestamp delta, offset delta, key, value, and headers. The logic for handling nil keys is correct.


@ryan-gang changed the title from "Add methods on Encodable" to "Add utils on Encodable" on Jul 26, 2025
@ryan-gang self-assigned this on Jul 26, 2025
@ryan-gang requested a review from rohitpaulk on Jul 26, 2025 at 11:14
Base automatically changed from r-branch-fetch to main July 27, 2025 16:41
```diff
@@ -233,11 +219,13 @@ type Record struct {
 	Headers []RecordHeader
 }
 
-func (r Record) Encode(pe *encoder.Encoder) {
-	pe.PutVarint(int64(r.getEncodedLength())) // Length placeholder
+func (r Record) EncodeFull(pe *encoder.Encoder) {
```
Member:

Why does this get an "Encode" and "EncodeFull" instead of just Encode? If it's about satisfying the interface so you can use GetEncodedLength, we're doing this wrong

Contributor (Author):

Record's encoding needs the length of its encoded bytes as a prefix, but computing that length requires running the encoding itself, so the old Encode effectively depended on its own output. Splitting it up — Encode for the body (which satisfies the Encodable interface, so GetEncodedLength works) and EncodeFull for the length-prefixed form — avoids repeating the encode logic in two places. That was my solution.

Member:

That doesn't seem like a good enough reason to pollute a public interface. I haven't thought through how this would work with the encoder we're using, but if we were using a simple Encode() []byte interface, for example, I'd just use a private method encodeContents() and then call something like encodeLength() + encodeContents() within Encode(). That way the public interface stays simple and matches all our other "encodable" entities.

@rohitpaulk rohitpaulk closed this Aug 25, 2025