rom1v commented Jun 15, 2025

Introduce a new packet type, a "session" packet, containing metadata about the encoding session. It is used only for the video stream, and currently includes the video resolution.

For illustration, here is a sequence of packets on the video stream:

                                    device rotation
                                    v
CODEC | SESSION | MEDIA | MEDIA | … | SESSION | MEDIA | MEDIA | …
       1920x1080 <-----------------> 1080x1920 <------------------
                  encoding session 1            encoding session 2

This metadata is not strictly necessary, since the video resolution can be determined after decoding. However, it allows detection of cases where the encoder does not respect the requested size (and logs a warning), even without decoding (e.g., when there is no video playback).
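
A minimal sketch of how a receiver might use the session packet to detect a size mismatch without decoding. It assumes the 12-byte header discussed later in this thread, with the most significant bit marking a session packet and the width and height stored as big-endian 32-bit integers. The class and method names are hypothetical and are not taken from the scrcpy sources.

```java
import java.nio.ByteBuffer;

public class SessionHeaderSketch {
    // Fixed header size, as stated in this thread
    static final int HEADER_SIZE = 12;

    /**
     * Returns {width, height} if the header describes a session packet,
     * or null for any other packet type.
     */
    static int[] parseSession(byte[] header) {
        if (header.length != HEADER_SIZE) {
            throw new IllegalArgumentException("expected a 12-byte header");
        }
        ByteBuffer buf = ByteBuffer.wrap(header); // big-endian by default
        int first = buf.getInt();
        if (first >= 0) {
            // Most significant bit is clear: not a session packet
            // (the flag position is an assumption of this sketch)
            return null;
        }
        int width = buf.getInt();
        int height = buf.getInt();
        return new int[] {width, height};
    }

    /** Hypothetical check: true if the encoder honored the requested size. */
    static boolean matchesRequest(int[] session, int requestedWidth, int requestedHeight) {
        return session[0] == requestedWidth && session[1] == requestedHeight;
    }
}
```

A caller could log a warning whenever `matchesRequest` returns false for a newly received session packet, even if no decoder is running.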

Additional metadata could be added later if necessary, for example the actual device rotation.

Refs #5918
Refs #5894

rom1v and others added 4 commits June 13, 2025 14:04
The stream metadata will contain both:
 - the codec id at the start of the stream
 - the session metadata (video width and height) at the start of every
   "session" (typically on rotation)

Introduce a new packet type, a "session" packet, containing metadata
about the encoding session. It is used only for the video stream,
and currently includes the video resolution.

For illustration, here is a sequence of packets on the video stream:

                                        device rotation
                                        v
    CODEC | SESSION | MEDIA | MEDIA | … | SESSION | MEDIA | MEDIA | …
           1920x1080 <-----------------> 1080x1920 <------------------
                      encoding session 1            encoding session 2

This metadata is not strictly necessary, since the video resolution can
be determined after decoding. However, it allows detection of cases
where the encoder does not respect the requested size (and logs a
warning), even without decoding (e.g., when there is no video playback).

Additional metadata could be added later if necessary, for example the
actual device rotation.

Refs #5918 <#5918>
Refs #5894 <#5894>

Co-authored-by: gz0119 <[email protected]>

The delay buffer must forward the session packets while preserving
their order relative to media packets.

Warn if the size of a decoded video frame does not match the session
metadata.

sbfkcel commented Oct 9, 2025

To make the session data more compact and leave room for additional metadata, would it be better to put the flag and width together in the first 4 bytes?

    public void writeSessionMeta(int width, int height) throws IOException {
        writeSessionMeta(width, height, 0);
    }

    public void writeSessionMeta(int width, int height, int extra) throws IOException {
        if (sendStreamMeta) {
            headerBuffer.clear();

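            // PACKET_FLAG_SESSION is a 64-bit constant: shift it down before narrowing to int,
            // otherwise the cast would discard the flag bit
            int flagAndWidth = (int) (PACKET_FLAG_SESSION >> 32) | (width & Integer.MAX_VALUE);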
            headerBuffer.putInt(flagAndWidth);
            headerBuffer.putInt(height);
            headerBuffer.putInt(extra);
            headerBuffer.flip();
            IO.writeFully(fd, headerBuffer);
        }
    }

    public void writeSessionMeta(int width, int height) throws IOException {
        if (sendStreamMeta) {
            headerBuffer.clear();
            headerBuffer.putInt((int) (PACKET_FLAG_SESSION >> 32)); // Set the first bit to 1
            headerBuffer.putInt(width);
            headerBuffer.putInt(height);
            headerBuffer.flip();
            IO.writeFully(fd, headerBuffer);
        }
    }

rom1v commented Oct 9, 2025

That way, the packet header is always 12 bytes:

    ssize_t r = net_recv_all(demuxer->socket, header, SC_PACKET_HEADER_SIZE);

If it were variable, it would require the client to buffer the data (more copies) and "packetize" (more complexity). As a tradeoff, it uses 2 syscalls (one to read the header, one to read the full payload).
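
The two-read pattern described above can be sketched as follows. This is an illustration only, not the client code (which is written in C). It assumes that a media header carries 8 bytes of PTS and flags followed by a 4-byte payload length, which is not stated in this thread, and that a session packet has no payload. `FixedHeaderReader` and `readPacket` are hypothetical names.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class FixedHeaderReader {
    static final int HEADER_SIZE = 12;

    /** Reads one packet and returns a short description of it. */
    static String readPacket(DataInputStream in) throws IOException {
        // First read: the header always has the same size, so no buffering
        // or re-packetizing is needed to find packet boundaries
        byte[] header = new byte[HEADER_SIZE];
        in.readFully(header);
        ByteBuffer buf = ByteBuffer.wrap(header); // big-endian by default

        if (buf.getInt(0) < 0) {
            // Most significant bit set: session packet (assumed flag position),
            // width and height follow, and there is no payload
            return "session " + buf.getInt(4) + "x" + buf.getInt(8);
        }

        // Assumed media layout: bytes 0..7 hold PTS and flags, bytes 8..11 the length
        int length = buf.getInt(8);
        if (length < 0) {
            throw new IOException("invalid payload length: " + length);
        }

        // Second read: the full payload, straight into its final buffer
        byte[] payload = new byte[length];
        in.readFully(payload);
        return "media " + payload.length + " bytes";
    }
}
```

With a variable-size header, the first `readFully` call would not know how many bytes to request, which is where the extra buffering would come from.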

sbfkcel commented Oct 13, 2025

Sorry, I may not have explained it clearly enough. The session metadata is indeed always 12 bytes. However, since the flag, width, and height each occupy 4 bytes, there’s no remaining space for expansion.

Therefore, my proposal is to combine the flag and width into 4 bytes, keep height as 4 bytes, and reserve the remaining 4 bytes for future expansion.
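
To make the proposed layout concrete, here is a self-contained sketch of the packing and unpacking. It assumes the flag is the most significant bit of the first 32-bit word, leaving 31 bits for the width. `PackedSessionMeta`, `SESSION_FLAG` and `WIDTH_MASK` are hypothetical names used for illustration only.

```java
import java.nio.ByteBuffer;

public class PackedSessionMeta {
    static final int HEADER_SIZE = 12;
    static final int SESSION_FLAG = 1 << 31;          // 0x80000000
    static final int WIDTH_MASK = Integer.MAX_VALUE;  // 0x7FFFFFFF

    /** Packs flag + width into word 0, height into word 1, extra into word 2. */
    static byte[] encode(int width, int height, int extra) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE); // big-endian by default
        buf.putInt(SESSION_FLAG | (width & WIDTH_MASK));
        buf.putInt(height);
        buf.putInt(extra);
        return buf.array();
    }

    /** Returns {width, height, extra}, or throws if the session flag is not set. */
    static int[] decode(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header);
        int first = buf.getInt();
        if ((first & SESSION_FLAG) == 0) {
            throw new IllegalArgumentException("not a session packet");
        }
        int width = first & WIDTH_MASK; // strip the flag bit
        int height = buf.getInt();
        int extra = buf.getInt();
        return new int[] {width, height, extra};
    }
}
```

The total size stays at 12 bytes, and the last word is free for whatever is added later.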

rom1v commented Oct 13, 2025

Note that there is no backward or forward compatibility (the protocol is always used between matching client and server), so there is nothing to "reserve": the protocol can be changed completely in any version.

If we want to add something taking 31 bits or fewer, we can store it just after the flag. If we need a lot more data, we can add a payload. We can also move the width/height fields as needed.
