Skip to content

Conversation

@tlambert03
Copy link
Contributor

@tlambert03 tlambert03 commented Sep 15, 2025

this adds a new ome-zarr-models cli, with two (related) commands:

# simply validate and exit with proper return code.
ome-zarr-models validate https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0066/ExpD_chicken_embryo_MIP.ome.zarr

# print info
ome-zarr-models info https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0066/ExpD_chicken_embryo_MIP.ome.zarr

not doing anything fancy yet, just making sure the models get constructed properly. No new dependencies.

relevant forum topic: https://forum.image.sc/t/where-to-place-ome-zarr-file-to-test-the-ngff-validator/104637/20?u=talley

…tion

- Introduced a command-line interface (CLI) for the ome-zarr-models package, allowing users to validate and retrieve information about OME-Zarr datasets.
- Improved error handling in the open_ome_zarr function to provide detailed feedback on validation failures.
- Updated versioning for LabelsAttrs and Plate classes to explicitly define the version as "0.5".
- Adjusted validation logic in the Image class to ensure proper multiscale level matching.
],
)
@pytest.mark.parametrize("cmd", ["validate", "info"])
def test_cli(url: str, cmd: str, monkeypatch: pytest.MonkeyPatch) -> None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your call on whether you want to include this test. can always be removed if those URLS become invalid (and currently the test will only xfail if the URL is invalid)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snap, see my comment below! I think pytest-vcr might be a nice halfway house here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is really up to you. if you can use vcr or just change the tests to use your internal fixtures. what would you like?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets keep it simple for now and use an internal fixture if that works.

@tlambert03 tlambert03 changed the title feat: Add CLI support (and modify some error handling) feat: Add CLI validate/info (and modify some error handling) Sep 15, 2025
@dstansby
Copy link
Collaborator

👏 thanks a lot for this! I left some comments inline, but overall 👍 for the CLI

Copy link
Contributor Author

@tlambert03 tlambert03 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted a bunch of stuff

],
)
@pytest.mark.parametrize("cmd", ["validate", "info"])
def test_cli(url: str, cmd: str, monkeypatch: pytest.MonkeyPatch) -> None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is really up to you. if you can use vcr or just change the tests to use your internal fixtures. what would you like?

@d-v-b
Copy link
Contributor

d-v-b commented Sep 15, 2025

(it's surprisingly hard to find an actual valid v0.5 example out there 😂 )

That's what this library is designed to fix!

@tlambert03
Copy link
Contributor Author

(it's surprisingly hard to find an actual valid v0.5 example out there 😂 )

That's what this library is designed to fix!

just a thought, from an outside perspective:

Given the confusing state of the spec/schema, with self-contradictory stuff happening in multiple places... i do think that one incredibly valuable "service" this library could provide is to actually make the experience of reading/writing OME-ZARR more pleasant and clear. Every time I come back to check in on stuff, to try to adopt the spec and use it in more places, i find that things are definitely not in a state of readiness where I would feel comfortable encouraging the "average" user to dip their toes into it. If you take a hard line here and say "we strictly follow the spec" and that spec is itself unstable and self-inconsistent... then all we really do here is relay that confusion. One approach might be to warn on confusions or instabilities, and then cast to your current interpretation of what valid actually means.

@d-v-b
Copy link
Contributor

d-v-b commented Sep 15, 2025

One approach might be to warn on confusions or instabilities, and then cast to your current interpretation of what valid actually means.

That's an interesting idea, we could basically take a "parse, don't validate" approach. This would widen the types of data that this library consumes, but the output of this library would still be strictly typed. And users could be notified when input had to be coerced / corrected. Given the inconsistent implementations out there, this is probably a good direction from a "use this library with the most data" perspective.

@dstansby
Copy link
Collaborator

Given the confusing state of the spec/schema, with self-contradictory stuff happening in multiple places... i do think that one incredibly valuable "service" this library could provide is to actually make the experience of reading/writing OME-ZARR more pleasant and clear.

Very open to features that make working with OME-Zarr groups easier, without violating the specification 👍

If you take a hard line here and say "we strictly follow the spec" and that spec is itself unstable and self-inconsistent... then all we really do here is relay that confusion.

The spec itself being unstable and self-incosistent is not something we can fix in this implementation. We try our best to follow the specification strictly, because that's what other implementations (should) also do. That's the whole point of having a specification, so that everyone agrees what the (meta)data should look like.

In this particular case, what if someone has an image reader that relies on the version field that the specification must be present, but ome-zarr-models allows users to write data without that version field? Then we're causing even more confusion!

Every time I come back to check in on stuff, to try to adopt the spec and use it in more places, i find that things are definitely not in a state of readiness where I would feel comfortable encouraging the "average" user to dip their toes into it. I

100% hear and agree with this. Myself and other folks developing this library have tried to use our learnings here to influence and improve the specification and development practices around it, but it is somewhat of an uphill struggle ☹️

That's an interesting idea, we could basically take a "parse, don't validate" approach.

This has been suggested already at #184, perhaps we should move discussion there?

@tlambert03
Copy link
Contributor Author

tlambert03 commented Sep 15, 2025

In this particular case, what if someone has an image reader that relies on the version field that the specification must be present, but ome-zarr-models allows users to write data without that version field? Then we're causing even more confusion!

no, you wouldn't just pass that through unaltered when you write ... you would do the "parse don't validate" approach that @d-v-b was mentioning. That is, follow Postel's law... "Be strict in what you send, be liberal in what you accept"

don't demand that everyone out there be following every twist and turn of the spec, and read through the git repo in addition to the spec in order to figure out what they're supposed to do when loading a file that someone sent them... (e.g. from IDR). Instead, when you're reading a document, give a best faith effort to parse it, but when writing data, you always guarantee it's valid according to the spec (at least, as you interpret it :).

you can, of course, always offer a "strict" mode (just like pydantic) that also mandates that the incoming data already be valid ... no casting allowed

... this whole robustness principle of ensuring that output is valid, without demanding that input also be valid, is indeed a big point/benefit of having a pydantic model

@d-v-b
Copy link
Contributor

d-v-b commented Sep 15, 2025

the proposal here is equivalent to defining two types: an input type and an output type, where the input type is wider than the output type, along with a well-behaved (non-invertible) transformation from instances of the input type to instances of the output type (i.e., parsing).

@tlambert03
Copy link
Contributor Author

yes, but, to be clear (and as you know) ... that doesn't practically mean you have to define two different pydantic models (input/output) for every thing. The model defines the output type, and model/field_validator define the input type.

@tlambert03 tlambert03 changed the title feat: Add CLI validate/info (and modify some error handling) feat: Add CLI validate/info Sep 15, 2025
@dstansby
Copy link
Collaborator

Ooh, I hadn't come across Postel's law before, or considered the difference between reading and writing really. This is definitley off topic now though - could you open an fresh issue (can just be a copy/paste of some of the comments above) about being less strict with input data where the meaning would be unambiguous.

Copy link
Collaborator

@dstansby dstansby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall - a few requests for changes, and a couple of questions.

@tlambert03
Copy link
Contributor Author

all set!

Copy link
Collaborator

@dstansby dstansby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 - just a minor nitpick about a keword only argument. Once that's fixed feel free to merge (I should have invited you as a developer)

@d-v-b
Copy link
Contributor

d-v-b commented Sep 16, 2025

this looks great @tlambert03!

@d-v-b d-v-b merged commit 074d272 into ome-zarr-models:main Sep 16, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants