-
-
Notifications
You must be signed in to change notification settings - Fork 28
Implement gzungetc #355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement gzungetc #355
Conversation
Codecov ReportAttention: Patch coverage is
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 5 files with indirect coverage changes 🚀 New features to boost your workflow:
|
libz-rs-sys/src/gz.rs
Outdated
if state.next == state.output { | ||
// Safety: `state.have` is at least 1 if we got this far, so `state.output` has been | ||
// allocated, and `state.next` always points within the `state.output` buffer. | ||
let dst = unsafe { (state.next as *mut u8).add(1) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where does this 1 come from? I'm having some trouble mapping this back to the C original
if (state->x.next == state->out) {
unsigned char *src = state->out + state->x.have;
unsigned char *dest = state->out + (state->size << 1);
while (src > state->out)
*--dest = *--src;
state->x.next = dest;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching that. I misread the C original as handling the case where there is 1 free byte at the end of the buffer, but it actually covers all cases between 0 and state->size << 1
free bytes. I'll update the PR to shift the content to the very end of the buffer to match the C version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not confident that the current implementation is right, especially given that the test did not flag the problem earlier.
This bit of logic is tricky, so while I appreciate the from-first-principles reasoning, I think it would also be useful to add why/how this maps back to the C source code, and to make sure that the tests properly exercise this logic.
I also tried running the test locally but with use libz_sys::*;
at the top, overwriting the imports with the C version of those functions, and that fails (I commented out the line using gzbuffer
, which libz_sys
does not export, it then fails on assert_eq!(unsafe { gzungetc('-' as c_int, file) }, -1);
, which apparently returns 45
using the C version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I can address why the test didn't flag the problem in the previous version: it was a performance regression, rather than a functionality bug. zlib-ng shifts the content as far right as possible, which avoids a quadratic cost if the caller invokes gzungetc repeatedly. My original implementation, which shifted just one byte to the right, had the same functional result (the character passed to gzungetc was pushed onto the front of the buffered content) but would be very inefficient if gzungetc was called repeatedly in that state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gzbuffer
plays an important role in the test case: it ensures that the buffer is small enough so the various gzread
and gzungetc
operations can fill and drain it. The test case sets up 16 bytes of data in the buffer before the line assert_eq!(unsafe { gzungetc('-' as c_int, file) }, -1);
to exercise the full-buffer case. But without calling gzbuffer
, the default buffer size in zlib-ng is something like 128KB. So the gzungetc
call on that line will succeed, returning 45 (ascii for -
).
I'll add some more comments to the test code to document the buffer arithmetic that the tests are using.
154e1ca
to
d66e3aa
Compare
No description provided.