Sorry if this is already covered in documentation I didn't find; I read over the existing docs and previous issues about video.
From what I understand, basisu video currently has the following constraints:
- You can only seek to i-frames
- The encoder currently generates only one i-frame, at the start, so in practice you can only seek to the start (though this could change in the future)
- Once you seek to an i-frame you have to decode every frame as you move forward; you can't skip any frames
- Every input frame in a video needs to be the same size
- If I were to use texture arrays in basisu in order to get the ability to "seek", I think every element needs to be the same size too?
Is that correct?
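If I've understood those constraints correctly, a wrapper-level seek would look roughly like this. This is just a sketch of my mental model, not basisu API: `decode_frame` is a hypothetical stand-in for whatever the actual transcoder call is, and the sorted list of i-frame indices is something I'd have to track myself.

```python
import bisect

def seek(target, iframe_indices, decode_frame):
    """Decode from the nearest i-frame at or before `target` up to
    `target`, since intermediate frames can't be skipped.

    iframe_indices must be sorted ascending and contain frame 0.
    Returns the list of frame indices that had to be decoded.
    """
    # Find the last i-frame index <= target.
    pos = bisect.bisect_right(iframe_indices, target) - 1
    start = iframe_indices[pos]
    decoded = []
    for frame in range(start, target + 1):
        decode_frame(frame)  # every intermediate frame must be decoded
        decoded.append(frame)
    return decoded
```

With today's single i-frame at frame 0, `seek(8, [0], ...)` would have to decode all nine frames 0 through 8.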
Some questions as follow-up:
- Is there any documentation on how to detect whether a frame is an i-frame, for once the encoder learns to insert i-frames periodically?
- Do I need to do something special when seeking 'back' to a specific i-frame?
- If I mess up seeking (by trying to seek to a non-i-frame, or by skipping a frame) will it crash, or just produce garbled output? I'm wrapping basisu for use in a higher-level language, so I want to know what kind of guardrails to implement at that level to avoid erroneous bug reports heading your way.
- When I decode a new frame, is there a way to get a delta region to reduce how much data I upload to the GPU? My animations are quite large (2Kx2K or more in some cases, with most of the space transparent or static), so this would be helpful if the decoder already supports it. If it doesn't, no worries.
- Can video have mips? If so, is it legal to skip decoding some of them, or do I also need to decode every mip of every frame as I go?
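On the guardrails question above, what I have in mind at the wrapper level is something like this: a thin state tracker that only ever lets callers step forward one frame or jump to a known i-frame, so invalid seeks get rejected before they reach the decoder. Everything here is hypothetical bookkeeping on my side; `decode_frame` again stands in for the real transcode call.

```python
class VideoGuard:
    """Wrapper-level guardrails for a basisu video stream: only allow
    stepping forward one frame at a time, or seeking directly to a
    known i-frame. `decode_frame` is a hypothetical stand-in for the
    real transcoder call, not actual basisu API."""

    def __init__(self, frame_count, iframe_indices, decode_frame):
        self.frame_count = frame_count
        self.iframes = frozenset(iframe_indices)
        self.decode_frame = decode_frame
        self.current = None  # nothing decoded yet

    def seek(self, target):
        # Reject bad seeks here instead of handing them to the decoder.
        if target not in self.iframes:
            raise ValueError(f"frame {target} is not an i-frame")
        self.decode_frame(target)
        self.current = target
        return target

    def step(self):
        # Advance exactly one frame; skipping is never allowed.
        nxt = 0 if self.current is None else self.current + 1
        if nxt >= self.frame_count:
            raise IndexError("stepped past the last frame")
        if self.current is None and 0 not in self.iframes:
            raise ValueError("frame 0 is not an i-frame; seek first")
        self.decode_frame(nxt)
        self.current = nxt
        return nxt
```

Knowing whether a mistaken seek crashes or just garbles output tells me whether these checks are a nicety or a hard requirement.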
Right now my game's animations are encoded as a basis image with mips for every frame, and I decode all of them at startup to have them ready. As you'd expect, this increases steady-state VRAM usage and load times quite a bit, so I'd love to migrate to basisu video. The alternatives (AV1, Theora, H.264, etc.) either have poor performance or portability, or lack alpha channel support. The fact that basis decodes to block-compressed texture data is an added bonus.
One upside I get right now for my approach to videos (one ktx2 file per frame) is that I can crop the input to avoid encoding big blocks of transparent pixels, which helps reduce memory usage. My understanding is that basisu video will handle this scenario just fine since it won't re-encode all that empty space if it hasn't changed - it should be fine to stop cropping if I just have one 'canvas' texture I decode into instead of having separate textures for every frame.
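On the delta-region idea: even if the decoder doesn't expose changed regions, my fallback plan is to diff the decoded block data on the CPU between consecutive frames and upload only the changed rectangle. A rough sketch, assuming I can view each frame's block-compressed output as a flat byte buffer of fixed-size block records in row-major order:

```python
def changed_block_rect(prev, curr, blocks_w, blocks_h, block_bytes):
    """Return the bounding rectangle (min_x, min_y, max_x, max_y), in
    block coordinates, of blocks that differ between two frames of
    block-compressed data, or None if the frames are identical.

    prev/curr are flat byte buffers: one `block_bytes`-sized record
    per block, row-major, blocks_w * blocks_h blocks total."""
    min_x, min_y = blocks_w, blocks_h
    max_x = max_y = -1
    for by in range(blocks_h):
        row = by * blocks_w * block_bytes
        for bx in range(blocks_w):
            off = row + bx * block_bytes
            if prev[off:off + block_bytes] != curr[off:off + block_bytes]:
                min_x = min(min_x, bx); max_x = max(max_x, bx)
                min_y = min(min_y, by); max_y = max(max_y, by)
    if max_x < 0:
        return None  # nothing changed, nothing to upload
    return (min_x, min_y, max_x, max_y)
```

The resulting rectangle would map straight to a partial compressed-texture upload, which should help a lot for my mostly-static frames.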
The only reason I'm concerned about seeking is that a common scenario for animations - and one I use extensively - is custom loop regions, so I'd need a relatively efficient way to seek from, say, frame 8 back to frame 4 every second or so. I'm okay with doing the seek and decode on a worker thread, and I'm okay with it eating a couple milliseconds of CPU time if it needs to, since my current frame times are around 1-3ms anyway. I know my loop points in advance (they're in a configuration file), so I could potentially feed them to the encoder as a hint about where to place an i-frame, if the encoder learns to do that at some point.
I'd be happy to share an example animation to give you a sense of what I'm working with, if that would give you ideas on what to do here or help you make suggestions.
I'd also be happy to contribute more detailed documentation upstream, if the replies to this issue aren't the right place to keep this information.