Provide a way to extract XMP metadata (png & webp & tiff only for now) #2567
XMP is a common metadata format, and multiple image formats supported by this crate can carry XMP metadata. Similar to the ICC profile and Exif metadata, we extend the ImageDecoder trait to provide this functionality. For now this is only implemented for png.
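As a rough illustration of what extending a decoder trait this way can look like, here is a minimal sketch. The `Decoder` trait, the `String` error type, and both decoder structs are hypothetical stand-ins invented for this example; they are not the crate's actual definitions, and the real method signature may differ. The point is only that a default method returning `Ok(None)` lets existing decoders keep compiling while individual formats opt in.

```rust
/// Hypothetical, simplified stand-in for a decoder trait (not the crate's real one).
trait Decoder {
    fn dimensions(&self) -> (u32, u32);

    /// Returns the raw XMP packet if the format carries one.
    /// The default implementation reports "no metadata", so decoders
    /// that do not know about XMP need no changes.
    fn xmp_metadata(&mut self) -> Result<Option<Vec<u8>>, String> {
        Ok(None)
    }
}

/// A decoder that never overrides the default.
struct NoMetadataDecoder;

impl Decoder for NoMetadataDecoder {
    fn dimensions(&self) -> (u32, u32) {
        (1, 1)
    }
}

/// A decoder that opts in by overriding the method.
/// The `xmp` field is an assumption standing in for data found while parsing.
struct PngLikeDecoder {
    xmp: Option<Vec<u8>>,
}

impl Decoder for PngLikeDecoder {
    fn dimensions(&self) -> (u32, u32) {
        (2, 2)
    }

    fn xmp_metadata(&mut self) -> Result<Option<Vec<u8>>, String> {
        Ok(self.xmp.clone())
    }
}
```

Callers can then treat every decoder uniformly and simply handle the `None` case for formats without support.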
Sorry for the back-and-forth. I decided to add Tiff now as well, since the roll pulled in the fix for processing the Bytes in Tiff :)
Actually, having more decoders support it (and that set in particular is really core) makes the addition much more convincing as common behavior.
It would be great to coordinate with the zune-jpeg author to clear the remaining issues and cut a release.
This PR unfortunately makes it harder to fix the memory limit handling for the PNG decoder because it requires |
Isn't that what we added the |
I am also happy to move the metadata extraction to the png crate if it fits better there. That way we could just peek at the keyword and otherwise seek over the contents?
Yes, since we've moved to ignore ancillary chunks that are broken we might also partially ignore them when they exceed memory limits. And by partially I mean keep a list of offsets where they occur so that their contents can be selectively retrieved. That should combine well with interfaces to read such chunks without fully allocating them in memory (also planned for chunks we want to ignore). We could add a flag to do so preemptively while keeping only a prefix in memory, enough to determine if they are relevant for XMP or Adobe embeds. I think there are a lot of potential variants that won't break functionality (with minor coordination to use those features in |
Thought about this a bit more, and I think there's a way we can implement this without needing invasive changes to the png crate chunk parsing state machine. The main point would be storing the positions of text chunks during initial parsing, and only seeking to each one and reading its label if the |
XMP is a common metadata format, and multiple image formats supported by this crate can carry XMP metadata. Similar to the ICC profile and Exif metadata, we extend the ImageDecoder trait to provide this functionality.
For now this is only implemented for png, webp, and tiff.
This is related to #2568.