Skip to content

Implement gzread and gzbuffer #345

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 19, 2025

Conversation

brian-pane
Copy link

No description provided.

Copy link

codecov bot commented Apr 17, 2025

Codecov Report

Attention: Patch coverage is 87.19212% with 26 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
libz-rs-sys/src/gz.rs 87.19% 26 Missing ⚠️
Flag Coverage Δ
fuzz-compress ?
fuzz-decompress ?
test-aarch64-apple-darwin 89.56% <63.05%> (+0.09%) ⬆️
test-x86_64-apple-darwin 87.58% <50.73%> (-0.09%) ⬇️
test-x86_64-unknown-linux-gnu 90.39% <74.87%> (+0.32%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
libz-rs-sys/src/gz.rs 93.52% <87.19%> (-1.36%) ⬇️

... and 7 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@brian-pane
Copy link
Author

This is still a work in progress. I'm still tracking down an uninitialized read that happens somewhere in the inflate operation.

@brian-pane

This comment was marked as outdated.

@folkertdev
Copy link
Collaborator

What command do you run?

@folkertdev
Copy link
Collaborator

I think the miri error is actually a soundness bug in the inflate code, where some uninitialized memory is read. We should not be doing that of course, but I don't think it's dangerous in this case. I'll fix that separately.

@brian-pane
Copy link
Author

My local Miri command is MIRIFLAGS="-Zmiri-disable-isolation" cargo +nightly miri test gz::gread_gzip ... and if I patch in #346 it fixed the error message I was getting.

@brian-pane
Copy link
Author

I'm not sure what's happening in the failed x86_64-pc-windows-gnu test. I'll try to set up a local Windows dev environment tomorrow to repro it.

@brian-pane
Copy link
Author

I got the test error to repro in a local x86_64-pc-windows-gnu environment. I'll debug there to try to fix it.

@folkertdev
Copy link
Collaborator

I can reproduce the problem by using wine to evaluate the test. I've added the following to my local .cargo/config.toml file:

testw = "test --target x86_64-pc-windows-gnu"

[target.'cfg(windows)']
runner = "wine"

The I can run these commands to gather the stdout, and add dbg!s to try and figure out where the two implementations differ

> cargo test --features="gz" gread_gzip -- --nocapture 2&> /tmp/old.txt
> cargo testw --features="gz" gread_gzip -- --nocapture 2&> /tmp/new.txt

I use meld to quickly see where the two outputs diverge. The first difference between the functions is that a read on windows here returns 0

while have < len {
ret = unsafe { libc::read(state.fd, buf.add(have).cast::<_>(), (len - have) as _) };
if ret <= 0 {
break;
}

@brian-pane
Copy link
Author

Yes, on Windows I see libc read returning 128 bytes on the first five reads, then 22 bytes, then zero.

@brian-pane
Copy link
Author

The latest change that I just pushed adds a little test case called libc_read in test-libz-rs-sys that bypasses all the gz code and iterates on the libc read call to read the test file. That test case only manages to read 662 bytes from the file on Windows, which is the same number that the gread_gzip test reads before a read returns zero. That's consistent, at least, but it's surprising. Does libc read work differently on Windows than on Posix?

@folkertdev
Copy link
Collaborator

ah yes, you need to add the libc::O_BINARY flag on windows to the libc::open so that the file is opened in binary mode

@brian-pane
Copy link
Author

Thanks! O_BINARY fixed the issue. Now the test case that reads a text file is failing - but only in the Windows CI run, not when I run it locally on Windows. In the CI x86_64-pc-windows-gnu test failure, the number of bytes read from the file matches what I get if I pass the file through unix2dos. My guess is that the file gets translated to DOS-style newlines when checked out by git in the Windows test environment. I'll rework my test case to validate the number of bytes read against the file size returned by stat, rather than hard-coding a check for what the file size is on Linux.

@brian-pane brian-pane force-pushed the gzread branch 2 times, most recently from 7b4ff3b to 97c28f0 Compare April 19, 2025 16:56
let mut count = 0;
loop {
count += 1;
assert_ne!(count, 100);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this left from debugging?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! Leftover from debugging. Thanks for catching that!

Comment on lines 957 to 959
continue; // Now that we've tried reading, we can try to copy from the output buffer.
// The copy above assures that we will leave with space in the
// output buffer, allowing at least one gzungetc() to succeed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this reads nicer if you put the comment above the continue (and maybe add an empty line after the assert_ne

@folkertdev
Copy link
Collaborator

2 final minor things, otherwise this looks good!

@folkertdev folkertdev merged commit a625663 into trifectatechfoundation:main Apr 19, 2025
24 checks passed
@brian-pane brian-pane deleted the gzread branch April 22, 2025 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants