-
Notifications
You must be signed in to change notification settings - Fork 7
Add support for filelike objects in create_dataset #183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #183 +/- ##
==========================================
+ Coverage 93.98% 94.25% +0.26%
==========================================
Files 42 46 +4
Lines 3375 3568 +193
==========================================
+ Hits 3172 3363 +191
- Misses 203 205 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
@greglucas Just making sure you saw this. I realized last night that for supporting S3 files, we actually don't have to make any changes because an S3Path behaves exactly like a Path object and create_dataset at least appears to work with it, but MyPy throws a fit if you pass in a cloudpathlib.S3Path object into create_dataset. That said, I think this is still a good thing to support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love the double nesting going on with an internal helper function. I'm curious about your thoughts of restructuring things in two potentially different ways.
- Do the handling within the generator itself. Can we do a
yield from generator(f)
after doing the transformation within the generator? i.e. if we get a file that is unopened, lets open that within a context block and then call the generator again yielding from itself. - Perhaps this is a good usecase for singledispatch where you could choose what to do based upon the input type and register how to open the file-likes, sockets, raw bytes, ... https://docs.python.org/3/library/functools.html#functools.singledispatch
a561ce7
to
8e0685c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments on trying to consolidate a few more things.
I'm not sure how I feel about singledispatch. I think it seems OK, but I'm not sure it helped as much as I thought it would in my mind :)
The one concern I have with it is that I think the read_packet_file dispatch will read the entire file at once before entering the generator whereas your implementation before would have passed the open file handle in and then called read()
(potentially a full read by default, but controllable by a user).
current_pos = 0 # Keep track of where we are in the buffer | ||
start_time = time.time_ns() | ||
while True: | ||
if total_length_bytes and n_bytes_parsed == total_length_bytes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need the first check? I think you could just keep the equality and 0 == None
would also be False if that is what you're looking to catch. (I'm guessing this was just copied over, so just noting that here because I saw it)
if total_length_bytes and n_bytes_parsed == total_length_bytes: | |
if n_bytes_parsed == total_length_bytes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, yeah that seems like it's unnecessary. I'll see if it breaks anything in some way I'm not thinking of.
- Refactor generator definitions into generators subpackage - Use singledispatch to read packet files in xarr.py - Add singledispatch for setting up generator binary reader - Cleanup typehinting in definitions.py - Add tests for generators module
19abd78
to
67caef2
Compare
Add File-like Object Support to
create_dataset
Summary
Enhanced the
create_dataset
function to accept file-like objects (such as those created withopen(filepath, "rb")
orio.BytesIO
)in addition to file paths for both packet files and XTCE definitions.
Changes Made
Core Implementation (
space_packet_parser/xarr.py
)packet_files
parameter to acceptBinaryIO
andbytes
types alongside existingstr
andPath
typesobjects (used directly)
_process_generator()
to avoid codeduplication
Comprehensive Test Coverage (
tests/unit/test_xarr.py
)open(filepath, "rb")
andio.BytesIO
objectsio.StringIO
for XTCE definitions alongside binary file-like objects for packet dataChecklist