Skip to content

Implement gzungetc #355

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 1, 2025
Merged

Conversation

brian-pane
Copy link

No description provided.

Copy link

codecov bot commented Apr 30, 2025

Codecov Report

Attention: Patch coverage is 98.66667% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
libz-rs-sys/src/gz.rs 98.66% 1 Missing ⚠️
Flag Coverage Δ
fuzz-compress ?
fuzz-decompress ?
test-aarch64-apple-darwin 88.43% <98.66%> (+0.68%) ⬆️
test-x86_64-apple-darwin 86.17% <5.33%> (-0.43%) ⬇️
test-x86_64-unknown-linux-gnu 88.00% <4.00%> (-0.52%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
libz-rs-sys/src/gz.rs 92.11% <98.66%> (+0.84%) ⬆️

... and 5 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

if state.next == state.output {
// Safety: `state.have` is at least 1 if we got this far, so `state.output` has been
// allocated, and `state.next` always points within the `state.output` buffer.
let dst = unsafe { (state.next as *mut u8).add(1) };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does this 1 come from? I'm having some trouble mapping this back to the C original

    if (state->x.next == state->out) {
        unsigned char *src = state->out + state->x.have;
        unsigned char *dest = state->out + (state->size << 1);
        while (src > state->out)
            *--dest = *--src;
        state->x.next = dest;
    }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that. I misread the C original as handling the case where there is 1 free byte at the end of the buffer, but it actually covers all cases between 0 and state->size << 1 free bytes. I'll update the PR to shift the content to the very end of the buffer to match the C version.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not confident that the current implementation is right, especially given that the test did not flag the problem earlier.

This bit of logic is tricky, so while I appreciate the from-first-principles reasoning, I think it would also be useful to add why/how this maps back to the C source code, and to make sure that the tests properly exercise this logic.

I also tried running the test locally but with use libz_sys::*; at the top, overwriting the imports with the C version of those functions, and that fails (I commented out the line using gzbuffer, which libz_sys does not export, it then fails on assert_eq!(unsafe { gzungetc('-' as c_int, file) }, -1);, which apparently returns 45 using the C version.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can address why the test didn't flag the problem in the previous version: it was a performance regression, rather than a functionality bug. zlib-ng shifts the content as far right as possible, which avoids a quadratic cost if the caller invokes gzungetc repeatedly. My original implementation, which shifted just one byte to the right, had the same functional result (the character passed to gzungetc was pushed onto the front of the buffered content) but would be very inefficient if gzungetc was called repeatedly in that state.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gzbuffer plays an important role in the test case: it ensures that the buffer is small enough so the various gzread and gzungetc operations can fill and drain it. The test case sets up 16 bytes of data in the buffer before the line assert_eq!(unsafe { gzungetc('-' as c_int, file) }, -1); to exercise the full-buffer case. But without calling gzbuffer, the default buffer size in zlib-ng is something like 128KB. So the gzungetc call on that line will succeed, returning 45 (ascii for -).

I'll add some more comments to the test code to document the buffer arithmetic that the tests are using.

@brian-pane brian-pane force-pushed the gzungetc branch 2 times, most recently from 154e1ca to d66e3aa Compare April 30, 2025 21:34
@folkertdev folkertdev merged commit 8b15596 into trifectatechfoundation:main May 1, 2025
24 checks passed
@brian-pane brian-pane deleted the gzungetc branch May 1, 2025 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants