Skip to content

Conversation

danila-schelkov
Copy link
Contributor

findFrameCompressedSize JNI wrapper on zstd library, proposed as a solution for #360. decompressFrame method for arrays, that uses findFrameCompressedSize for decompression a single frame

if (srcPosition + srcSize > src.length) {
throw new ArrayIndexOutOfBoundsException(srcPosition + srcSize);
}
return findFrameCompressedSize0(src, srcPosition, srcSize);
Copy link
Owner

@luben luben Jun 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should guard here for invalid input, something like:

long size = findFrameCompressedSize0(...);
if Zstd.isError(size) {
    throw new ZstdException(size);
}
return size;

Else it will fail in decompress for invalid size that is harder to figure out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add it, but what should we do with findDirectByteBufferFrameCompressedSize which is a native method, should we wrap it and handle the errors too?

If so, shouldn't we do the same thing with all getFrameContentSize methods

They may return one of these errors:

#define ZSTD_CONTENTSIZE_UNKNOWN (0ULL - 1)
#define ZSTD_CONTENTSIZE_ERROR (0ULL - 2)

Which for some reason not present in the zstd_errors.h as a ZSTD_ErrorCode enum value

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I think we should wrap it also and handle the error in similar way

@luben
Copy link
Owner

luben commented Jun 29, 2025

LGTM, added one comment

Copy link

codecov bot commented Jun 29, 2025

Codecov Report

Attention: Patch coverage is 42.85714% with 12 lines in your changes missing coverage. Please review.

Project coverage is 59.84%. Comparing base (c76455c) to head (80f7b7e).
Report is 52 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #361      +/-   ##
============================================
- Coverage     60.01%   59.84%   -0.17%     
- Complexity      308      324      +16     
============================================
  Files            26       27       +1     
  Lines          1473     1584     +111     
  Branches        170      183      +13     
============================================
+ Hits            884      948      +64     
- Misses          434      468      +34     
- Partials        155      168      +13     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@luben
Copy link
Owner

luben commented Jun 29, 2025

LGTM. Thanks for the contribution!

@luben luben merged commit 051e8de into luben:master Jun 29, 2025
8 checks passed
@danila-schelkov danila-schelkov deleted the decompress-frame branch June 30, 2025 13:32
@danila-schelkov
Copy link
Contributor Author

Thanks for merge! Could you please answer my questions here: #361 (comment)

Also, will you release an updated version soon?

@luben
Copy link
Owner

luben commented Jun 30, 2025

Hmm, thanks for raising that question - I missed it. I think you are right and we should also wrap that method

Comment on lines +1403 to +1412
/**
* Decompress data, assuming that whole buffer is a compressed data
*
* @param src the source buffer
* @return byte array with the decompressed data
*/
public static byte[] decompress(byte[] src) {
return decompress(src, (int) getFrameContentSize(src));
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair I think we should remove this variant of decompress - all other variants decompress all frames in the given payload, this one will decompress only the first frame.

I think people should use decompressFrame if they want that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Should I open a new PR removing this variant, or will we continue in this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could also modify this method to decompress all frames frame by frame using decompressFrame and calculating offsets. In my opinion such API would be great to use, if people don't want to work with streams

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that behaviour is going to be closer to the rest of decompress methods.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's open new PR

dongjoon-hyun added a commit to apache/orc that referenced this pull request Jul 15, 2025
### What changes were proposed in this pull request?

This PR aims to upgrade `zstd-jni` to 1.5.7-4.

### Why are the changes needed?

To bring the latest bug fixes and improvements.
- luben/zstd-jni#360
- luben/zstd-jni#361
- luben/zstd-jni#362

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #2329 from dongjoon-hyun/ORC-1957.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun added a commit to apache/orc that referenced this pull request Jul 15, 2025
### What changes were proposed in this pull request?

This PR aims to upgrade `zstd-jni` to 1.5.7-4.

### Why are the changes needed?

To bring the latest bug fixes and improvements.
- luben/zstd-jni#360
- luben/zstd-jni#361
- luben/zstd-jni#362

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #2329 from dongjoon-hyun/ORC-1957.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit e138293)
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants