-
Notifications
You must be signed in to change notification settings - Fork 11
feat: Add CLI validate/info #281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…tion - Introduced a command-line interface (CLI) for the ome-zarr-models package, allowing users to validate and retrieve information about OME-Zarr datasets. - Improved error handling in the open_ome_zarr function to provide detailed feedback on validation failures. - Updated versioning for LabelsAttrs and Plate classes to explicitly define the version as "0.5". - Adjusted validation logic in the Image class to ensure proper multiscale level matching.
tests/test_cli.py
Outdated
| ], | ||
| ) | ||
| @pytest.mark.parametrize("cmd", ["validate", "info"]) | ||
| def test_cli(url: str, cmd: str, monkeypatch: pytest.MonkeyPatch) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
your call on whether you want to include this test. can always be removed if those URLS become invalid (and currently the test will only xfail if the URL is invalid)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Snap, see my comment below! I think pytest-vcr might be a nice halfway house here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really up to you. if you can use vcr or just change the tests to use your internal fixtures. what would you like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets keep it simple for now and use an internal fixture if that works.
|
👏 thanks a lot for this! I left some comments inline, but overall 👍 for the CLI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverted a bunch of stuff
tests/test_cli.py
Outdated
| ], | ||
| ) | ||
| @pytest.mark.parametrize("cmd", ["validate", "info"]) | ||
| def test_cli(url: str, cmd: str, monkeypatch: pytest.MonkeyPatch) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really up to you. if you can use vcr or just change the tests to use your internal fixtures. what would you like?
That's what this library is designed to fix! |
just a thought, from an outside perspective: Given the confusing state of the spec/schema, with self-contradictory stuff happening in multiple places... i do think that one incredibly valuable "service" this library could provide is to actually make the experience of reading/writing OME-ZARR more pleasant and clear. Every time I come back to check in on stuff, to try to adopt the spec and use it in more places, i find that things are definitely not in a state of readiness where I would feel comfortable encouraging the "average" user to dip their toes into it. If you take a hard line here and say "we strictly follow the spec" and that spec is itself unstable and self-inconsistent... then all we really do here is relay that confusion. One approach might be to warn on confusions or instabilities, and then cast to your current interpretation of what valid actually means. |
That's an interesting idea, we could basically take a "parse, don't validate" approach. This would widen the types of data that this library consumes, but the output of this library would still be strictly typed. And users could be notified when input had to be coerced / corrected. Given the inconsistent implementations out there, this is probably a good direction from a "use this library with the most data" perspective. |
Very open to features that make working with OME-Zarr groups easier, without violating the specification 👍
The spec itself being unstable and self-incosistent is not something we can fix in this implementation. We try our best to follow the specification strictly, because that's what other implementations (should) also do. That's the whole point of having a specification, so that everyone agrees what the (meta)data should look like. In this particular case, what if someone has an image reader that relies on the version field that the specification must be present, but ome-zarr-models allows users to write data without that version field? Then we're causing even more confusion!
100% hear and agree with this. Myself and other folks developing this library have tried to use our learnings here to influence and improve the specification and development practices around it, but it is somewhat of an uphill struggle
This has been suggested already at #184, perhaps we should move discussion there? |
no, you wouldn't just pass that through unaltered when you write ... you would do the "parse don't validate" approach that @d-v-b was mentioning. That is, follow Postel's law... "Be strict in what you send, be liberal in what you accept" don't demand that everyone out there be following every twist and turn of the spec, and read through the git repo in addition to the spec in order to figure out what they're supposed to do when loading a file that someone sent them... (e.g. from IDR). Instead, when you're reading a document, give a best faith effort to parse it, but when writing data, you always guarantee it's valid according to the spec (at least, as you interpret it :). you can, of course, always offer a "strict" mode (just like pydantic) that also mandates that the incoming data already be valid ... no casting allowed ... this whole robustness principle of ensuring that output is valid, without demanding that input also be valid, is indeed a big point/benefit of having a pydantic model |
|
the proposal here is equivalent to defining two types: an input type and an output type, where the input type is wider than the output type, along with a well-behaved (non-invertible) transformation from instances of the input type to instances of the output type (i.e., parsing). |
|
yes, but, to be clear (and as you know) ... that doesn't practically mean you have to define two different pydantic models (input/output) for every thing. The model defines the output type, and |
|
Ooh, I hadn't come across Postel's law before, or considered the difference between reading and writing really. This is definitley off topic now though - could you open an fresh issue (can just be a copy/paste of some of the comments above) about being less strict with input data where the meaning would be unambiguous. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great overall - a few requests for changes, and a couple of questions.
Co-authored-by: David Stansby <[email protected]>
Co-authored-by: David Stansby <[email protected]>
Co-authored-by: David Stansby <[email protected]>
|
all set! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 - just a minor nitpick about a keword only argument. Once that's fixed feel free to merge (I should have invited you as a developer)
Co-authored-by: David Stansby <[email protected]>
|
this looks great @tlambert03! |
this adds a new
ome-zarr-modelscli, with two (related) commands:not doing anything fancy yet, just making sure the models get constructed properly. No new dependencies.
relevant forum topic: https://forum.image.sc/t/where-to-place-ome-zarr-file-to-test-the-ngff-validator/104637/20?u=talley