
Conversation

@colinlyguo
Contributor

@colinlyguo colinlyguo commented Jun 16, 2025

Purpose or design rationale of this PR

This PR implements codecv8.

PR title

Your PR title must follow conventional commits (as we are doing squash merge for each PR), so it must start with one of the following types:

  • feat: A new feature

Breaking change label

Does this PR have the breaking-change label?

  • No, this PR is not a breaking change

Summary by CodeRabbit

  • New Features
    • Added support for CodecV8 and the "feynman" hardfork, enabling enhanced batch processing for post-Feynman blocks.
    • Introduced a new compression scheme for CodecV8 with improved compression and compatibility validation.
    • Unified batch byte compression interface extended to all codec versions, including CodecV0 through CodecV8.
  • Bug Fixes
    • Updated codec version selection and compression compatibility checks to include CodecV8.
  • Tests
    • Added extensive CodecV8 tests covering batch creation, encoding, decoding, hashing, and compression.
  • Chores
    • Updated dependencies for improved stability.
    • Added build scripts and tooling for compression libraries with symbol prefixing to prevent conflicts.
    • Refactored compression code by separating legacy and standard implementations and reorganizing Rust encoder libraries.
    • Enhanced README with detailed build instructions for compression libraries on multiple platforms.

@coderabbitai

coderabbitai bot commented Jun 16, 2025

Warning

Rate limit exceeded

@colinlyguo has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 6 minutes and 7 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between 61656c1 and efd0b3a.

📒 Files selected for processing (1)
  • encoding/zstd/add_symbol_prefix.sh (1 hunks)

Walkthrough

This update introduces support for codec version 8 (CodecV8) and the "feynman" hardfork. It adds a new DACodecV8 struct embedding DACodecV7 with overridden methods for a new compression scheme. The DACodecV7 struct is extended with a forcedVersion field to support dynamic versioning. Codec selection logic is updated to return DACodecV8 for Feynman-era blocks. Comprehensive tests for CodecV8 batch encoding, decoding, hashing, and compression are added. Related functions and comments are updated for the new version and hardfork. Compression logic is refactored to separate legacy and standard zstd implementations with corresponding Rust encoder crates and build scripts. Symbol prefixing scripts and build constraints for static libraries are added or updated.
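The forcedVersion embedding described in the walkthrough can be sketched as follows. This is a minimal sketch based on the walkthrough's description; the constant values and constructor body are assumptions, not the repository's exact code.

```go
package main

import "fmt"

// CodecVersion enumerates DA codec versions (sketch).
type CodecVersion uint8

const (
	CodecV7 CodecVersion = 7
	CodecV8 CodecVersion = 8
)

// DACodecV7 carries an optional forcedVersion so an embedding codec
// can reuse all of its logic while reporting a newer version.
type DACodecV7 struct {
	forcedVersion *CodecVersion
}

// Version returns the forced version when set, else CodecV7.
func (d *DACodecV7) Version() CodecVersion {
	if d.forcedVersion != nil {
		return *d.forcedVersion
	}
	return CodecV7
}

// DACodecV8 embeds DACodecV7 and overrides only what differs
// (here: nothing, beyond forcing the reported version).
type DACodecV8 struct {
	DACodecV7
}

func NewDACodecV8() *DACodecV8 {
	v := CodecV8
	return &DACodecV8{DACodecV7: DACodecV7{forcedVersion: &v}}
}

func main() {
	fmt.Println(NewDACodecV8().Version()) // 8
	fmt.Println((&DACodecV7{}).Version()) // 7
}
```

The override hook lets hardcoded version constants in V7 code paths be replaced with `Version()` calls, which is exactly the change the table below records for encoding/codecv7.go.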

Changes

File(s) Change Summary
encoding/codecv7.go Added forcedVersion *CodecVersion to DACodecV7; updated Version() method; replaced hardcoded version constants with Version() calls; added CompressScrollBatchBytes method calling legacy compression.
encoding/codecv8.go Added new DACodecV8 struct embedding DACodecV7; implemented new compression method using standard zstd; added methods for batch creation, blob construction, compatibility checks, and size estimation; detailed error handling and logging.
encoding/codecv8_test.go Added comprehensive unit tests for CodecV8 covering batch creation, encoding, decoding, hashing, compression, and edge cases.
encoding/da.go Updated comments to include CodecV8; extended GetHardforkName to return "feynman" for Feynman hardfork; updated GetCodecVersion to return CodecV8 for Feynman; included CodecV8 in compression enabling functions.
encoding/interfaces.go Added CompressScrollBatchBytes method to Codec interface; added CodecV8 constant; updated CodecFromVersion and CodecFromConfig to support DACodecV8; added top-level CompressScrollBatchBytes function selecting codec by block info.
encoding/codecv0.go Added CompressScrollBatchBytes method returning input unmodified for DACodecV0.
encoding/codecv2.go Added CompressScrollBatchBytes method to DACodecV2 wrapping legacy compression; replaced direct calls to package compression function.
encoding/codecv4.go Replaced direct calls to package compression function with calls to CompressScrollBatchBytes method on DACodecV4.
encoding/da_test.go Modified test to use legacy compression function explicitly.
encoding/interfaces_test.go Added test case for CodecV8 in CodecFromVersion tests.
encoding/zstd/zstd.go Split CompressScrollBatchBytes into two functions: CompressScrollBatchBytesLegacy (for codec v2-v7) and CompressScrollBatchBytesStandard (for codec v8+); updated error messages accordingly.
encoding/zstd/libscroll_zstd_*.go (linux_amd64, linux_arm64, darwin_arm64) Refined build constraints; changed linker flags to link separate legacy and standard encoder static libraries instead of one combined.
encoding/zstd/add_scroll_prefix_in_zstd_related_symbols.sh Deleted old symbol prefixing script.
encoding/zstd/add_symbol_prefix.sh Added new script to prefix symbols in static libraries to avoid conflicts; performs conflict detection and verification.
libzstd/encoder-legacy/Cargo.toml Renamed package to encoder-legacy; set library name to encoder_legacy; specified staticlib crate type; minor formatting.
libzstd/encoder-legacy/Makefile Added Makefile for building, installing, cleaning legacy encoder static library with platform detection and macOS deployment target.
libzstd/encoder-legacy/src/lib.rs Added Rust source implementing legacy compression with customized zstd encoder and FFI function compress_scroll_batch_bytes_legacy.
libzstd/encoder-standard/.gitignore Added .gitignore to ignore build output directories.
libzstd/encoder-standard/Cargo.toml Converted from workspace to standalone package; renamed to encoder-standard; replaced local dependency with external zstd crate with experimental feature.
libzstd/encoder-standard/Makefile Added Makefile for building, installing, cleaning standard encoder static library with platform detection and macOS deployment target.
libzstd/encoder-standard/rust-toolchain Added Rust toolchain specification file locking nightly version.
libzstd/encoder-standard/src/lib.rs Added Rust source implementing standard compression with zstd encoder and FFI function compress_scroll_batch_bytes_standard; re-exported zstd crate.
libzstd/encoder/src/lib.rs Deleted legacy encoder Rust source wrapping zstd with old parameters.
libzstd/src/lib.rs Deleted Rust source implementing original compression FFI function compress_scroll_batch_bytes.
README.md Expanded FAQ with detailed instructions for building legacy and standard encoders separately, running symbol prefix script, and platform-specific notes.
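Taken together, the table describes a three-way compression dispatch: a no-op for CodecV0, legacy zstd for CodecV2 through CodecV7, and standard zstd for CodecV8. A minimal sketch of that dispatch follows; the real methods call cgo FFI functions, so the string labels here merely stand in for those calls, and the treatment of versions not named in the table is an assumption.

```go
package main

import "fmt"

type CodecVersion uint8

// compressScrollBatchBytes mirrors the per-codec CompressScrollBatchBytes
// methods listed above. Labels stand in for the real zstd FFI calls:
// "legacy"   -> zstd.CompressScrollBatchBytesLegacy   (codec v2-v7)
// "standard" -> zstd.CompressScrollBatchBytesStandard (codec v8+)
func compressScrollBatchBytes(v CodecVersion, batch []byte) (string, []byte) {
	switch {
	case v == 0:
		return "none", batch // CodecV0: input returned unmodified
	case v >= 8:
		return "standard", batch
	default:
		return "legacy", batch
	}
}

func main() {
	for _, v := range []CodecVersion{0, 2, 7, 8} {
		scheme, _ := compressScrollBatchBytes(v, []byte("batch"))
		fmt.Printf("v%d -> %s\n", v, scheme)
	}
}
```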

Sequence Diagram(s)

sequenceDiagram
    participant App
    participant CodecSelector
    participant DACodecV7
    participant DACodecV8

    App->>CodecSelector: Request codec for block number & timestamp
    CodecSelector->>CodecSelector: Check IsFeynman(timestamp)
    alt Feynman active (CodecV8)
        CodecSelector->>DACodecV8: Instantiate with forcedVersion=CodecV8
        CodecSelector-->>App: Return DACodecV8
    else EuclidV2 active (CodecV7)
        CodecSelector->>DACodecV7: Instantiate with forcedVersion=CodecV7
        CodecSelector-->>App: Return DACodecV7
    else
        CodecSelector->>DACodecV7: Instantiate default
        CodecSelector-->>App: Return DACodecV7
    end

    App->>Codec: CompressScrollBatchBytes(batchBytes)
    alt CodecV7
        Codec->>zstd: CompressScrollBatchBytesLegacy(batchBytes)
        zstd-->>Codec: CompressedData or error
    else CodecV8
        Codec->>zstd: CompressScrollBatchBytesStandard(batchBytes)
        zstd-->>Codec: CompressedData or error
    end
    Codec-->>App: CompressedData or error
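The branching in the diagram can be condensed into a small selector. The timestamp thresholds and helper names below are illustrative stand-ins for the chain-config checks, not the repository's actual values.

```go
package main

import "fmt"

// Hypothetical activation timestamps, for the sketch only.
const feynmanTime, euclidV2Time = 2_000, 1_000

func isFeynman(ts uint64) bool  { return ts >= feynmanTime }
func isEuclidV2(ts uint64) bool { return ts >= euclidV2Time }

// codecVersionFor mirrors the selection order in the diagram:
// IsFeynman is checked before IsEuclidV2, so the newer fork wins.
func codecVersionFor(ts uint64) string {
	switch {
	case isFeynman(ts):
		return "DACodecV8"
	case isEuclidV2(ts):
		return "DACodecV7"
	default:
		return "DACodecV7 (default)"
	}
}

func main() {
	fmt.Println(codecVersionFor(2_500)) // DACodecV8
	fmt.Println(codecVersionFor(1_500)) // DACodecV7
}
```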


Suggested reviewers

  • Thegaram
  • jonastheis

Poem

🐇 Hopping through code with nimble grace,
CodecV8 joins the race.
Feynman’s fork, fresh and bright,
Compression leaps to new height.
Tests ensure the path is clear,
New versions bring the cheer!
🌿✨


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🔭 Outside diff range comments (1)
encoding/codecv7.go (1)

307-318: ⚠️ Potential issue

Blob size estimation ignores compression branch

When enableCompression == true we append compressedPayloadBytes, but still
return blobEnvelopeV7OffsetPayload + uint64(len(payloadBytes)).
payloadBytes is the uncompressed length, so the estimate can overshoot and
lead to gas over-estimation.

-   return blobEnvelopeV7OffsetPayload + uint64(len(payloadBytes)), calculatePaddedBlobSize(uint64(len(blobBytes))), nil
+   return blobEnvelopeV7OffsetPayload + uint64(len(blobBytes)-blobEnvelopeV7OffsetPayload), // actual payload length
+          calculatePaddedBlobSize(uint64(len(blobBytes))),
+          nil
🧹 Nitpick comments (3)
encoding/codecv8.go (1)

3-14: Add compile-time interface assertion for new codec

DACodecV8 relies entirely on embedded DACodecV7 methods.
A compile-time guard makes accidental interface drift visible during go vet.

 package encoding

 type DACodecV8 struct {
     DACodecV7
 }

+// Compile-time safety: ensure we still satisfy the Codec interface.
+var _ Codec = (*DACodecV8)(nil)
+
 func NewDACodecV8() *DACodecV8 {
     v := CodecV8
     return &DACodecV8{
         DACodecV7: DACodecV7{
             forcedVersion: &v,
         },
     }
 }
encoding/codecv8_test.go (2)

61-64: Run sub-tests in parallel

Add t.Parallel() inside each t.Run body to let the cases execute concurrently and speed the suite up.


247-266: Minor: pre-allocate slice index instead of manual j counter

Instead of manually skipping L1 messages with an auxiliary counter, filter the slice up-front:

var l2Txs []*types.TransactionData
for _, tx := range block.Transactions {
    if tx.Type != types.L1MessageTxType {
        l2Txs = append(l2Txs, tx)
    }
}
require.Equal(t, len(l2Txs), len(txDataDecoded))
for i := range l2Txs {
    assertEqualTransactionData(t, l2Txs[i], txDataDecoded[i])
}

It is clearer and avoids off-by-one risks.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bfa7133 and 17bf3e4.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (6)
  • encoding/codecv7.go (5 hunks)
  • encoding/codecv8.go (1 hunks)
  • encoding/codecv8_test.go (1 hunks)
  • encoding/da.go (6 hunks)
  • encoding/interfaces.go (2 hunks)
  • go.mod (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
encoding/interfaces.go (1)
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/interfaces.go:95-108
Timestamp: 2024-10-17T04:13:14.579Z
Learning: In the `CodecFromConfig` function in the Go `encoding/interfaces.go` file, if none of the chain configuration conditions match, it's acceptable to default to returning `&DACodecV0{}` because, in the current logic, we can only deduce the codec version as the function implements, and the logic is complete.
encoding/codecv8.go (1)
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/interfaces.go:95-108
Timestamp: 2024-10-17T04:13:14.579Z
Learning: In the `CodecFromConfig` function in the Go `encoding/interfaces.go` file, if none of the chain configuration conditions match, it's acceptable to default to returning `&DACodecV0{}` because, in the current logic, we can only deduce the codec version as the function implements, and the logic is complete.
encoding/codecv7.go (3)
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/codecv1_types.go:105-116
Timestamp: 2024-10-18T03:40:09.800Z
Learning: The code in `encoding/codecv1_types.go`, specifically the `Encode` method in `daBatchV1`, has been updated. Previous comments regarding hardcoded byte offsets may be outdated.
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/interfaces.go:95-108
Timestamp: 2024-10-17T04:13:14.579Z
Learning: In the `CodecFromConfig` function in the Go `encoding/interfaces.go` file, if none of the chain configuration conditions match, it's acceptable to default to returning `&DACodecV0{}` because, in the current logic, we can only deduce the codec version as the function implements, and the logic is complete.
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/codecv1_types.go:152-154
Timestamp: 2024-10-18T03:41:31.377Z
Learning: In the `daBatchV1` struct, the `BlobBytes()` method is intentionally returning `nil`.
encoding/da.go (1)
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/codecv1_types.go:105-116
Timestamp: 2024-10-18T03:40:09.800Z
Learning: The code in `encoding/codecv1_types.go`, specifically the `Encode` method in `daBatchV1`, has been updated. Previous comments regarding hardcoded byte offsets may be outdated.
🧬 Code Graph Analysis (4)
encoding/interfaces.go (1)
encoding/codecv8.go (1)
  • NewDACodecV8 (7-14)
encoding/codecv8.go (2)
encoding/codecv7.go (1)
  • DACodecV7 (20-22)
encoding/interfaces.go (1)
  • CodecV8 (93-93)
encoding/codecv8_test.go (2)
encoding/interfaces.go (3)
  • CodecFromVersion (97-120)
  • CodecV8 (93-93)
  • DABatch (33-43)
encoding/da.go (3)
  • Batch (119-129)
  • Block (102-107)
  • TxsToTxsData (378-416)
encoding/da.go (1)
encoding/interfaces.go (5)
  • CodecV7 (92-92)
  • CodecV8 (93-93)
  • CodecV4 (89-89)
  • CodecV5 (90-90)
  • CodecV6 (91-91)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: tests
🔇 Additional comments (5)
encoding/codecv7.go (1)

20-30: forcedVersion design looks correct

Nice use of an override hook; this keeps V7 logic reusable for V8.
No functional issues spotted here.

encoding/interfaces.go (2)

93-97: Remember to update any String()/Marshal helpers for CodecVersion

CodecV8 is added to the enum – ensure helper methods such as
func (v CodecVersion) String() string or any JSON marshalers include the new
case; otherwise logs/JSON will show numeric 8.
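A String() helper of the kind this comment asks to keep in sync might look like the following sketch; the repository's actual helper, if one exists, may differ in names and cases.

```go
package main

import "fmt"

// CodecVersion is a stand-in for the enum in encoding/interfaces.go.
type CodecVersion uint8

// String covers the new CodecV8 case explicitly; without it (or a
// generic fallback like the one below), logs would print the bare "8".
func (v CodecVersion) String() string {
	switch v {
	case 7:
		return "codecv7"
	case 8:
		return "codecv8"
	default:
		return fmt.Sprintf("codecv%d", uint8(v))
	}
}

func main() {
	fmt.Println(CodecVersion(8)) // codecv8
}
```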


122-128: Codec selection path prioritisation looks good

Placing IsFeynman before IsEuclidV2 guarantees the newer hard-fork wins.
No further concerns.

encoding/da.go (2)

769-773: Duplicated hard-fork labels

GetHardforkName now returns "euclidV2" for the window between Euclid-V2
and Feynman, which is fine, but note that "euclid" and "euclidV2" are now
both possible outputs. Verify downstream consumers expect this extra label.


818-840: Compression gate updated – good

Including CodecV8 in the compatibility-check branch keeps behaviour aligned
with V7+; change looks correct.

jonastheis
jonastheis previously approved these changes Jun 18, 2025
@jonastheis
Contributor

jonastheis commented Jun 18, 2025

Will this PR also contain the zstd version update?

@colinlyguo
Contributor Author

colinlyguo commented Jun 18, 2025

Will this PR also contain the zstd version update?

Sounds good. Will change the target branch of this PR: #51.

* update zstd dep

* turn zstd dependency to official

* support 2 zstd libs

* address comments

* tweak comments

* typo fix

* add CompressScrollBatchBytes interface

---------

Co-authored-by: colin <[email protected]>
Co-authored-by: colinlyguo <[email protected]>
noel2004
noel2004 previously approved these changes Jun 25, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (2)
encoding/codecv8_test.go (2)

73-89: Tests should use public interface methods instead of casting to private types

The test is still casting to the private *daBatchV7 type, which creates tight coupling to internal implementation details. This makes the tests brittle and could break if the embedding structure changes.

Also applies to: 92-100
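The interface-level testing style the comment advocates can be sketched as follows. The DABatch stand-in here is heavily trimmed relative to the PR's actual interface, and the stub type is ours; the point is only that assertions go through exported methods, never through a cast to a private concrete type.

```go
package main

import (
	"bytes"
	"fmt"
)

// DABatch is a trimmed stand-in for the exported batch interface
// in encoding/interfaces.go.
type DABatch interface {
	Encode() []byte
}

// stubBatch plays the role of the value a codec's NewDABatch returns.
type stubBatch struct{ payload []byte }

func (b stubBatch) Encode() []byte { return b.payload }

// checkEncode asserts through the interface only, so the check keeps
// working even if the concrete type behind it (e.g. *daBatchV7) or
// the embedding structure changes.
func checkEncode(b DABatch, want []byte) bool {
	return bytes.Equal(b.Encode(), want)
}

func main() {
	fmt.Println(checkEncode(stubBatch{payload: []byte{1, 2, 3}}, []byte{1, 2, 3})) // true
}
```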


237-238: Incorrect use of strings.TrimRight on hex strings

The use of strings.TrimRight with "0" can remove individual '0' characters, potentially creating odd-length hex strings and losing significant half-byte data.

🧹 Nitpick comments (8)
libzstd/encoder-legacy/src/lib.rs (1)

69-101: Review unsafe code usage and error handling in FFI function.

The unsafe extern "C" function handles memory correctly:

  • Proper slice construction from raw pointers
  • Buffer size validation before writing
  • Comprehensive error handling and propagation
  • Correct output size reporting

However, consider these improvements:

 pub unsafe extern "C" fn compress_scroll_batch_bytes_legacy(
     src: *const c_uchar,
     src_size: u64,
     output_buf: *mut c_uchar,
     output_buf_size: *mut u64,
 ) -> *const c_char {
+    // Add null pointer checks
+    if src.is_null() || output_buf.is_null() || output_buf_size.is_null() {
+        return b"compress_scroll_batch_bytes_legacy: null pointer passed\0".as_ptr() as *const c_char;
+    }
+    
     let buf_size = *output_buf_size;
+    if buf_size == 0 {
+        return b"compress_scroll_batch_bytes_legacy: zero output buffer size\0".as_ptr() as *const c_char;
+    }
encoding/zstd/zstd.go (1)

21-24: Consider consistent empty input handling.

Both functions return an error for empty input, but the behavior might be inconsistent with the underlying C functions. Consider whether empty input should return empty output instead, especially since some compression scenarios might legitimately have empty batches.

-	if len(batchBytes) == 0 {
-		return nil, fmt.Errorf("input batch is empty")
-	}
+	if len(batchBytes) == 0 {
+		return []byte{}, nil
+	}

Also applies to: 42-45

libzstd/encoder-standard/Makefile (1)

1-1: Consider adding a test target for completeness.

While not critical, adding a test target would align with standard Makefile conventions and the static analysis recommendation.

-.PHONY: all clean build install
+.PHONY: all clean build install test

+test:
+	cargo test
libzstd/encoder-legacy/Makefile (1)

1-1: Consider adding a test target for consistency.

Same as the standard encoder, adding a test target would improve consistency with Makefile conventions.

-.PHONY: all clean build install
+.PHONY: all clean build install test

+test:
+	cargo test
libzstd/encoder-standard/src/lib.rs (1)

80-82: Consider handling the expect case more gracefully.

While the expect message indicates this should be infallible, consider whether a proper error return would be more robust for the FFI boundary.

-    encoder.set_pledged_src_size(Some(src.len() as u64)).expect(
-        "compress_scroll_batch_bytes_standard: failed to set pledged src size, should be infallible",
-    );
+    if let Err(e) = encoder.set_pledged_src_size(Some(src.len() as u64)) {
+        return out_as_err(&format!("failed to set pledged src size: {}", e), out);
+    }
encoding/codecv8_test.go (1)

295-295: Fix typo in variable name

The variable name contains a typo: "Incompressable" should be "Incompressible".

-maxAvailableBytesIncompressable := maxEffectiveBlobBytes - 5 - blobPayloadV7MinEncodedLength
+maxAvailableBytesIncompressible := maxEffectiveBlobBytes - 5 - blobPayloadV7MinEncodedLength

Also update all usages of this variable in lines 345, 351, and 357.

encoding/zstd/add_symbol_prefix.sh (2)

34-38: Prefix-existence check may yield false positives

grep -q "${PREFIX}" triggers on any symbol containing the substring, not necessarily one starting with the prefix, e.g. foo_scroll_legacy_bar.

-if "$LLVM_NM" "$LIB_FILE" 2>/dev/null | grep -q "${PREFIX}"; then
+if "$LLVM_NM" "$LIB_FILE" 2>/dev/null | awk '$3 ~ /^'"$PREFIX"'/ {found=1; exit} END{exit !found}'; then

71-76: Potential race & loss of file metadata during mv

Renaming ${LIB_FILE%.*}_new.a over the original loses timestamps and, if interrupted, may leave the library missing. Use mv -f to force overwrite and/or cp --preserve=timestamps followed by mv to reduce risk.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 17bf3e4 and ba3a9d0.

⛔ Files ignored due to path filters (2)
  • libzstd/encoder-legacy/Cargo.lock is excluded by !**/*.lock
  • libzstd/encoder-standard/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • encoding/codecv0.go (1 hunks)
  • encoding/codecv2.go (5 hunks)
  • encoding/codecv4.go (3 hunks)
  • encoding/codecv7.go (7 hunks)
  • encoding/codecv8.go (1 hunks)
  • encoding/codecv8_test.go (1 hunks)
  • encoding/da_test.go (1 hunks)
  • encoding/interfaces.go (4 hunks)
  • encoding/interfaces_test.go (1 hunks)
  • encoding/zstd/add_scroll_prefix_in_zstd_related_symbols.sh (0 hunks)
  • encoding/zstd/add_symbol_prefix.sh (1 hunks)
  • encoding/zstd/libscroll_zstd_darwin_arm64.go (1 hunks)
  • encoding/zstd/libscroll_zstd_linux_amd64.go (1 hunks)
  • encoding/zstd/libscroll_zstd_linux_arm64.go (1 hunks)
  • encoding/zstd/zstd.go (3 hunks)
  • libzstd/encoder-legacy/Cargo.toml (1 hunks)
  • libzstd/encoder-legacy/Makefile (1 hunks)
  • libzstd/encoder-legacy/src/lib.rs (1 hunks)
  • libzstd/encoder-standard/.gitignore (1 hunks)
  • libzstd/encoder-standard/Cargo.toml (1 hunks)
  • libzstd/encoder-standard/Makefile (1 hunks)
  • libzstd/encoder-standard/rust-toolchain (1 hunks)
  • libzstd/encoder-standard/src/lib.rs (1 hunks)
  • libzstd/encoder/src/lib.rs (0 hunks)
  • libzstd/src/lib.rs (0 hunks)
💤 Files with no reviewable changes (3)
  • encoding/zstd/add_scroll_prefix_in_zstd_related_symbols.sh
  • libzstd/src/lib.rs
  • libzstd/encoder/src/lib.rs
✅ Files skipped from review due to trivial changes (7)
  • libzstd/encoder-standard/rust-toolchain
  • libzstd/encoder-standard/.gitignore
  • encoding/zstd/libscroll_zstd_darwin_arm64.go
  • encoding/zstd/libscroll_zstd_linux_amd64.go
  • encoding/zstd/libscroll_zstd_linux_arm64.go
  • libzstd/encoder-legacy/Cargo.toml
  • libzstd/encoder-standard/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (2)
  • encoding/codecv7.go
  • encoding/interfaces.go
🧰 Additional context used
🧠 Learnings (4)
encoding/codecv4.go (1)
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/codecv1_types.go:105-116
Timestamp: 2024-10-18T03:40:09.800Z
Learning: The code in `encoding/codecv1_types.go`, specifically the `Encode` method in `daBatchV1`, has been updated. Previous comments regarding hardcoded byte offsets may be outdated.
encoding/codecv2.go (1)
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/codecv1_types.go:105-116
Timestamp: 2024-10-18T03:40:09.800Z
Learning: The code in `encoding/codecv1_types.go`, specifically the `Encode` method in `daBatchV1`, has been updated. Previous comments regarding hardcoded byte offsets may be outdated.
encoding/codecv8_test.go (1)
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/codecv1_types.go:105-116
Timestamp: 2024-10-18T03:40:09.800Z
Learning: The code in `encoding/codecv1_types.go`, specifically the `Encode` method in `daBatchV1`, has been updated. Previous comments regarding hardcoded byte offsets may be outdated.
encoding/codecv8.go (1)
Learnt from: colinlyguo
PR: scroll-tech/da-codec#25
File: encoding/codecv1_types.go:105-116
Timestamp: 2024-10-18T03:40:09.800Z
Learning: The code in `encoding/codecv1_types.go`, specifically the `Encode` method in `daBatchV1`, has been updated. Previous comments regarding hardcoded byte offsets may be outdated.
🪛 checkmake (0.2.2)
libzstd/encoder-legacy/Makefile

[warning] 1-1: Missing required phony target "test"

(minphony)

libzstd/encoder-standard/Makefile

[warning] 1-1: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: tests
🔇 Additional comments (23)
encoding/codecv0.go (1)

434-436: LGTM! Correct no-op compression implementation for CodecV0.

The implementation correctly provides a no-op compression method that returns the input bytes unchanged, which is appropriate for CodecV0 that doesn't perform compression. This maintains interface compatibility with other codec versions.

libzstd/encoder-legacy/src/lib.rs (3)

17-20: Verify the block size target and maximum blocks constants.

The constants N_BLOCK_SIZE_TARGET = 124 * 1024 and N_MAX_BLOCKS = 10 should be verified to ensure they align with the expected compression requirements for scroll batches.

Please confirm these values are appropriate for the intended use case and match any existing configuration in the Go codebase.


23-50: Well-configured zstd encoder for legacy compression.

The encoder configuration appropriately:

  • Disables literal compression for deterministic output
  • Sets window log limit for compatibility
  • Configures target block size
  • Disables checksum, magic bytes, and dictionary ID for minimal overhead
  • Includes content size for decode-time validation

This configuration appears optimized for the scroll-tech use case.


53-64: Robust error message handling with buffer overflow protection.

The error handling correctly:

  • Checks buffer capacity before copying
  • Provides fallback error message when buffer is too small
  • Properly null-terminates C-style strings
  • Uses safe slice operations

Good defensive programming practices.

encoding/interfaces_test.go (1)

27-27: Test case correctly added for CodecV8.

The test case follows the established pattern and correctly validates that CodecFromVersion returns a DACodecV8 instance for CodecV8 input.

encoding/da_test.go (1)

132-132: Correctly updated to use legacy compression method.

The test appropriately uses CompressScrollBatchBytesLegacy which aligns with the compression method separation introduced in this PR.

encoding/codecv4.go (3)

206-206: Correctly refactored to use codec instance method for compression.

The change from direct package call to codec instance method d.CompressScrollBatchBytes(batchBytes) enables version-specific compression behavior and improves encapsulation.


268-268: Consistent compression method refactoring.

Properly updated to use the codec instance method, maintaining consistency with the compression refactoring pattern throughout the codebase.


296-296: Appropriate use of codec instance compression method.

The refactoring correctly delegates compression to the codec instance, enabling proper version-specific compression handling.

encoding/codecv2.go (2)

157-157: Well-executed refactoring to instance methods.

The migration from direct zstd.CompressScrollBatchBytes calls to the instance method d.CompressScrollBatchBytes is consistent and enables codec-specific compression implementations while maintaining the same functionality.

Also applies to: 239-239, 252-252, 266-266


293-296: Clean wrapper implementation for legacy compression.

The new CompressScrollBatchBytes method properly encapsulates the legacy compression logic, maintaining backward compatibility for CodecV2 while enabling the architectural flexibility for different compression implementations across codec versions.

encoding/zstd/zstd.go (1)

17-36: Excellent separation of legacy and standard compression.

The split into two distinct functions with clear documentation enables proper codec versioning while maintaining consistent API patterns. The error handling and C interop are implemented correctly.

Also applies to: 38-57

libzstd/encoder-standard/Makefile (1)

1-57: Well-structured Makefile with proper platform support.

The platform detection, build process, and installation logic are correctly implemented. The macOS deployment target setting and ranlib usage are appropriate for ensuring compatibility.

libzstd/encoder-legacy/Makefile (1)

1-57: Consistent Makefile implementation for legacy encoder.

The structure and logic mirror the standard encoder Makefile appropriately, ensuring consistent build processes across both encoder libraries.

libzstd/encoder-standard/src/lib.rs (3)

23-50: Well-configured encoder initialization.

The encoder configuration appropriately disables literal compression, sets the window log limit, and configures frame parameters for the CodecV8 use case. The parameter choices align well with the compression requirements.


52-64: Robust error message handling for FFI.

The out_as_err function properly handles error message truncation and null-termination for C-style strings, ensuring safe error reporting across FFI boundaries.
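The same truncate-and-terminate idea can be illustrated in Go (this is a sketch of the behavior described above, not the crate's actual code): copy the message into a fixed-size C-style buffer, truncating if needed while always reserving one byte for the NUL terminator.

```go
package main

import "fmt"

// writeCError copies msg into buf as a NUL-terminated C-style string,
// truncating when msg plus the terminator does not fit. It returns the
// number of message bytes written, excluding the NUL.
func writeCError(msg string, buf []byte) int {
	if len(buf) == 0 {
		return 0
	}
	n := len(msg)
	if n > len(buf)-1 { // reserve one byte for the NUL terminator
		n = len(buf) - 1
	}
	copy(buf, msg[:n])
	buf[n] = 0
	return n
}

func main() {
	buf := make([]byte, 8)
	n := writeCError("compression failed: input too large", buf)
	fmt.Printf("wrote %d bytes: %q\n", n, buf[:n]) // wrote 7 bytes: "compres"
}
```

The invariant worth testing at an FFI boundary is exactly the one the review praises: no write past the buffer, and the result is always NUL-terminated regardless of message length.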


66-101: Solid FFI implementation with proper safety measures.

The compression function correctly handles memory safety, buffer bounds checking, and error propagation. The unsafe code usage is appropriate and well-contained.

encoding/codecv8.go (6)

17-40: Well-documented design rationale

Excellent documentation explaining why specific methods need to be overridden due to Go's method receiver behavior. The constructor properly initializes the codec with the forced version.


42-65: Correct implementation of compression compatibility check

The method properly implements the new compression scheme with clear documentation of the dual-mode behavior. Good error handling and debug logging.


67-142: Robust blob construction with proper compression handling

The implementation correctly constructs blobs with optional compression, proper size validation, and challenge digest calculation. The padding strategy for challenge digest is well-documented.
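The "optional compression with size validation" decision can be sketched as follows (a hedged illustration: the blob capacity constant and function names are hypothetical, not the codec's real implementation): compression is used only when it is enabled, actually shrinks the payload, and the result still fits the blob.

```go
package main

import "fmt"

const maxBlobPayload = 131072 // hypothetical usable blob capacity in bytes

// chooseBlobPayload picks between the raw and compressed forms of the
// batch bytes, returning the chosen payload and whether it is compressed.
// Oversized payloads are rejected with an error rather than truncated.
func chooseBlobPayload(raw, compressed []byte, enableCompression bool) ([]byte, bool, error) {
	if enableCompression && len(compressed) < len(raw) {
		if len(compressed) > maxBlobPayload {
			return nil, false, fmt.Errorf("compressed payload %d exceeds blob capacity %d", len(compressed), maxBlobPayload)
		}
		return compressed, true, nil
	}
	if len(raw) > maxBlobPayload {
		return nil, false, fmt.Errorf("raw payload %d exceeds blob capacity %d", len(raw), maxBlobPayload)
	}
	return raw, false, nil
}

func main() {
	payload, compressed, err := chooseBlobPayload(make([]byte, 1000), make([]byte, 400), true)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(payload), compressed) // prints "400 true"
}
```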


144-168: Clear separation of concerns for compatibility checking

The method correctly implements compatibility checking for batch processing with clear documentation of its purpose and behavior.


170-208: Accurate size estimation with compression awareness

The estimation methods correctly account for compression effects while maintaining consistent interfaces for chunk and batch operations.


210-213: Clean delegation to new compression implementation

The method correctly implements the new compression scheme by delegating to CompressScrollBatchBytesStandard.

@colinlyguo
Contributor Author

Will this PR also contain the zstd version update?

Sounds good, will change the target branch of this PR: #51.

Following up on this comment: merged this PR: #51.

Thegaram previously approved these changes Jun 25, 2025
@colinlyguo colinlyguo dismissed stale reviews from Thegaram and noel2004 via 33af79e June 25, 2025 13:49
georgehao previously approved these changes Jun 25, 2025
Member

@georgehao left a comment:

LGTM

@coderabbitai bot left a comment:

Actionable comments posted: 0

🧹 Nitpick comments (3)
README.md (3)

22-43: Consolidate the 3-step rebuild into a single make target

The current instructions require developers to manually cd into two different crates and then run a separate symbol-prefixing script. This is error-prone (easy to forget step 3) and makes CI scripting harder. Consider adding a top-level Make target (e.g. make libzstd) that:

  1. Invokes make install in both encoder folders.
  2. Runs encoding/zstd/add_symbol_prefix.sh automatically (optionally no-op on unsupported OSes).

That would reduce these three bullets to a single command in the README and keeps human and CI steps in sync.


47-52: Specify a minimum Rust toolchain to avoid mismatched builds

Explicitly state the tested Rust version (rustup default 1.75.0 or rust-toolchain file) so that future Rust releases with breaking changes don’t silently produce incompatible .a files.


54-64: Docker snippet: add missing build commands & non-root tip

The Docker examples install build-essentials but never run the encoder builds. Readers may assume the library is produced automatically. Append:

for d in libzstd/encoder-legacy libzstd/encoder-standard; do (cd "$d" && make install); done

Also add --user flags or a note about chown so generated files aren’t root-owned when the container exits.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76abffb and 5d118d0.

📒 Files selected for processing (1)
  • README.md (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: tests
🔇 Additional comments (1)
README.md (1)

45-45: add_symbol_prefix.sh macOS-only ⇒ Linux rebuild still requires a macOS host

The note effectively means Linux-only contributors cannot produce reproducible binaries. Either document a Linux alternative (e.g. objcopy --prefix-symbols flow) or flag that cross-platform contributors need macOS access. Otherwise downstream packagers may be blocked.

Thegaram previously approved these changes Jun 25, 2025
@colinlyguo force-pushed the feat-support-codecv8 branch from 0cc621b to 0212dc2 on June 25, 2025 15:44
@Thegaram merged commit 58b8994 into main on Jun 26, 2025
4 checks passed