144 changes: 144 additions & 0 deletions proposals/3552-extensible-events-images.md
@@ -0,0 +1,144 @@
# MSC3552: Extensible Events - Images and Stickers
Member Author

@maltee1 says:

What about sending multiple images in a single message with a single caption? I imagine people would want to send an "album" rather than cluttering the timeline with individual messages, if they have lots of related pictures to share. It's also something that other platforms support (I know Signal does) and would improve bridging if it's available in Matrix.

Member Author

Events in Matrix are intended to represent a single unit of information - it's already a bit questionable to have a caption on an image in the same event, but there are slightly more pros than cons for doing so (currently). Albums would instead be represented by a series of image events, linked together using relationships of some kind, potentially with an "information event" to store things like the caption.

A relationship-based system would allow for richer support too: adding images/videos after the fact, mixed content, content from other senders, edits, better organization (moving images between albums), etc. Representing the whole thing as one event gets complicated to manage at a technical level.

Excuse my ignorance, but wouldn't splitting up an album into several events lead to the same problems that splitting up caption+image into two events would have? For example, a bridge not knowing how long to wait for related events to appear before bridging.

Member Author

Some of the problems are shared, though from looking at it in the past it always felt like albums should be distinct events.

Regardless, albums would be handled by another MSC (this MSC defines an image format and is tightly scoped to that).

For what it's worth, I see value in adding caption+image in the same event from an accessibility standpoint (see: alt text in HTML/some social media platforms)

Member Author

@maltee1 says:

Another question: Does it make sense to offer an image in several resolutions rather than distinguishing between a thumbnail and the full image? I know of at least one platform (WeChat) that offers sending and receiving images in two resolutions, one that should be sufficient for most phone screens and "full size", which is just the original image. I don't know exactly how the protocol works, but I suppose it also uses thumbnails, which brings the number of resolutions to 3. The current proposal allows for several thumbnails, which could be used to represent that, but defining an image as a thumbnail implies that it shouldn't be used to show full-screen, even if the resolution may be sufficient. Instead, we should possibly not distinguish a "thumbnail" but simply offer several resolutions as desired (with the suggestion that one of them is thumbnail-sized) and have clients pick the one they want to show for a given purpose. Depending on available image resolution and expected bandwidth a thumbnail might even be unnecessary but we want to include a high-res version. The current proposal is not designed for that.

Member Author

In short, a 1080p thumbnail wouldn't be unreasonable to include here. Some clients already rely on "high quality" thumbnails, which would be represented here.

Clients are already asked to find the thumbnail that works for them.


As an aside, please use comments on the diff - comments not on the diff are likely to get ignored.

Thanks for your patience, I will reply to the diff from now on.
If clients are expected to go through all available image sizes and pick the one that suits a particular purpose, isn't it just complicating implementations to include all sizes but one in a single list and keep the largest size separate? What's the meaning of "thumbnail" in that case?

Member Author

It's mostly to delineate "original copy" and "machine-resized" images.


[MSC1767](https://github.com/matrix-org/matrix-doc/pull/1767) describes Extensible Events in detail,
though deliberately does not include schemas for some messaging types. This MSC covers only images
and stickers.

*Rationale*: Splitting the MSCs down into individual parts makes it easier to implement and review in
stages without blocking other pieces of the overall idea. For example, an issue with the way images
are represented should not block the overall schema from going through.

This MSC additionally relies upon [MSC3551](https://github.com/matrix-org/matrix-doc/pull/3551).

## Proposal

Using [MSC1767](https://github.com/matrix-org/matrix-doc/pull/1767)'s system, a new event type
is introduced to describe applicable functionality: `m.image`. This event type is simply an image
upload, akin to the now-legacy [`m.image` `msgtype` from `m.room.message`](https://spec.matrix.org/v1.1/client-server-api/#mimage).
Comment on lines +15 to +17

In my view, creating a different event type is less "extensible". With this proposal, it is possible to add a text caption to an existing m.image event by replacing (editing) it and adding the caption, but it is not possible to add an image to an existing m.message text event, since you cannot replace the type of an event. This doesn't make any sense. Text messages and image messages should both be candidates for being edited into a text+image message.

I think it would make more sense if the m.message event was more extensible and allowed the inclusion of text, images, or both.

Is there any rationale for making it a different message type?


An example is:

```json5
{
    "type": "m.image",
    "content": {
        "m.markup": [
            // Format of the fallback is not defined, but should have enough information for a text-only
            // client to do something with the image, just like with plain file uploads.
            {"body": "matrix.png (12 KB) https://example.org/_matrix/media/v3/download/example.org/abcd1234"}
        ],
        "m.file": {
            "url": "mxc://example.org/abcd1234",
            "name": "matrix.png",
            "mimetype": "image/png",
            "size": 12345
        },
        "m.image_details": { // optional
            "width": 640,
            "height": 480
        },
        "m.thumbnail": [ // optional
            {
                // A thumbnail is an m.file + m.image_details, i.e. a small image
                "m.file": {
                    "url": "mxc://example.org/efgh5678",
                    "mimetype": "image/jpeg",
                    "size": 123

                    // "name" is optional in this scenario
                },
                "m.image_details": {
                    "width": 160,
                    "height": 120
                }
            },
            // ...
        ],
        "m.caption": { // optional - goes above/below image
            "m.markup": [{"body": "Look at this cool Matrix logo"}]
        },
        "m.alt_text": { // optional - accessibility consideration for image
            "m.markup": [{"body": "matrix logo"}]
        }
    }
}
```

With consideration for extensible events, the following content blocks are defined:

* `m.image_details` - Currently records width and height (both required, in pixels), but in
  future could additionally supply other image details such as colour space.
* `m.thumbnail` - An array of (usually) smaller images the client can use to show in place of
  the event's image for bandwidth or size considerations. Currently requires two other content
  blocks nested under it: `m.file` and `m.image_details`.
  * Clients should find the thumbnail most suitable for them - the array is not ordered, but
    senders are encouraged to list smaller images (by byte size) first.
  * Multiple thumbnail formats may be supplied (webp, webm, jpeg, etc) with the same dimensions.
    Clients should ensure they are capable of rendering the type before picking that thumbnail.
  * `m.file`'s `mimetype` is a required field in this block.
  * `m.file`'s `name` is optional in this block.
* `m.alt_text` - Alternative text for the content, for accessibility considerations. Currently
  requires an `m.markup` content block to be nested within it, however senders should only
  specify a plain text body for ease of parsing.
  * *Note*: We use the full capability of `m.markup` here not for mimetype, but for future support
    of translations and other text-based extensions.
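
As a rough illustration of the thumbnail rules above, a client might pick a thumbnail along these lines. This is only a sketch: `SUPPORTED_MIMETYPES`, `pick_thumbnail`, and the "smallest image that still covers the target box" policy are assumptions made for the example, not something this MSC mandates.

```python
from typing import Optional

# Hypothetical set of MIME types this particular client can render.
SUPPORTED_MIMETYPES = {"image/png", "image/jpeg"}


def pick_thumbnail(thumbnails: list, target_width: int, target_height: int) -> Optional[dict]:
    """Return the smallest renderable thumbnail that covers the target box.

    Falls back to the largest renderable thumbnail when none is big enough,
    and to None when no thumbnail is usable. The array order is ignored,
    since the proposal says it is not guaranteed.
    """
    usable = []
    for thumb in thumbnails:
        file_block = thumb.get("m.file", {})
        details = thumb.get("m.image_details", {})
        mimetype = file_block.get("mimetype")
        # mimetype is required for thumbnails; skip types we cannot render.
        if not isinstance(mimetype, str) or mimetype not in SUPPORTED_MIMETYPES:
            continue
        width, height = details.get("width"), details.get("height")
        # width and height are both required by m.image_details.
        if not isinstance(width, int) or not isinstance(height, int):
            continue
        usable.append(thumb)

    if not usable:
        return None

    def area(thumb: dict) -> int:
        d = thumb["m.image_details"]
        return d["width"] * d["height"]

    big_enough = [
        t for t in usable
        if t["m.image_details"]["width"] >= target_width
        and t["m.image_details"]["height"] >= target_height
    ]
    if big_enough:
        return min(big_enough, key=area)
    return max(usable, key=area)
```

When `pick_thumbnail` returns `None`, a client would presumably fall back to the event's own `m.file`.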

Together with content blocks from other proposals, an `m.image` is described as:

* **Required** - An `m.markup` block to act as a fallback for clients which can't process images.
* **Required** - An `m.file` block to contain the image itself. Clients use this to show the image.
* **Optional** - An `m.image_details` block to describe any image-specific metadata, such as dimensions.
As with existing `m.room.message` events today, clients should keep images within a set of
reasonable bounds, regardless of sender-supplied values. For example, by enforcing both a minimum
and a maximum rendered size.
* **Optional** - An `m.thumbnail` block (array) to describe any thumbnails for the image.
* **Optional** - An `m.caption` block to represent any text that should be shown above or below the
image. Currently this MSC does not describe a way to pick whether the text goes above or below,
leaving this as an implementation detail. A future MSC may investigate ways of representing this,
if needed.
* **Optional** - An `m.alt_text` block to represent alternative/descriptive text for the image. This
is used as an accessibility feature, and per the block's definition above should only contain a plain
text representation at the moment. Clients are encouraged to assume there is no alt text if no plain
text representations are present. For clarity, this value would be supplied to the `alt` attribute
of an `img` node in HTML.
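
The required blocks and the bounds on sender-supplied dimensions could be checked roughly as follows. This is a sketch under stated assumptions: the function names, the `MIN_SIDE`/`MAX_SIDE` limits, and the default size are invented for illustration, and a real client would pick its own values.

```python
# Hypothetical display bounds in pixels; a real client would choose its own.
MIN_SIDE = 32
MAX_SIDE = 800


def is_valid_image_content(content: dict) -> bool:
    """Check the two required blocks: an m.markup fallback and an m.file with a URL."""
    markup = content.get("m.markup")
    file_block = content.get("m.file")
    has_fallback = isinstance(markup, list) and len(markup) > 0
    has_file = isinstance(file_block, dict) and isinstance(file_block.get("url"), str)
    return has_fallback and has_file


def display_size(content: dict, default=(320, 240)):
    """Return a (width, height) to render at, without trusting sender values blindly."""
    details = content.get("m.image_details")
    if not isinstance(details, dict):
        return default  # m.image_details is optional
    width, height = details.get("width"), details.get("height")
    if not (isinstance(width, int) and isinstance(height, int)) or width <= 0 or height <= 0:
        return default
    # Scale down to fit within MAX_SIDE, preserving the aspect ratio.
    scale = min(1.0, MAX_SIDE / width, MAX_SIDE / height)
    width, height = width * scale, height * scale
    # Enforce the minimum on each side; extreme aspect ratios get distorted here.
    return (max(MIN_SIDE, round(width)), max(MIN_SIDE, round(height)))
```

The computed size only controls layout. The actual pixel dimensions are known once the file has been downloaded and decoded.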

The above describes the minimum requirements for sending an `m.image` event. Senders can add additional
blocks, however as per the extensible events system, receivers which understand image events should not
honour them.

To represent stickers, we instead use a mixin on `m.image_details`. A new (optional) boolean field
called `m.sticker` is added if the client is intended to render the image as a sticker. When rendering
as a sticker, the `m.caption` can be shown as a tooltip (or similar) rather than inline with the image
itself. `m.sticker` defaults to `false`.
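
A receiving client could detect the sticker mixin as sketched below. The helper names and the returned strings are hypothetical. The tooltip placement reflects the "can be shown as a tooltip" suggestion above and is not a requirement.

```python
def is_sticker(content: dict) -> bool:
    """True only when m.image_details carries m.sticker set to boolean true."""
    details = content.get("m.image_details")
    if not isinstance(details, dict):
        return False
    # A missing or non-boolean value falls back to the default of false.
    return details.get("m.sticker") is True


def caption_placement(content: dict) -> str:
    """Return "tooltip" for stickers, "inline" for images, "none" without a caption."""
    if "m.caption" not in content:
        return "none"
    return "tooltip" if is_sticker(content) else "inline"
```

Treating anything other than a literal `true` as "not a sticker" is the conservative choice made here.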

The [`m.sticker` event type](https://spec.matrix.org/v1.1/client-server-api/#msticker) is deprecated
and removed, like `m.room.message` in MSC1767.

Note that `m.file` supports encryption and therefore it's possible to encrypt thumbnails and images
too.

If a client does not support rendering images inline, the client would instead typically represent
the event as a plain file upload, then fall further back to a plain text message.
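
The fallback chain described here (image, then file, then text) might look like the sketch below. The capability flags and function names are assumptions for the example. Treating an `m.markup` entry without a `mimetype` as plain text is also an assumption, based on the example event above.

```python
def render_mode(content: dict, can_render_images: bool, can_render_files: bool) -> str:
    """Choose how to present an m.image event: "image", "file", or "text"."""
    has_file = isinstance(content.get("m.file"), dict)
    if has_file and can_render_images:
        return "image"
    if has_file and can_render_files:
        return "file"
    return "text"


def plain_fallback(content: dict) -> str:
    """Return the first plain-text body from m.markup, or an empty string."""
    markup = content.get("m.markup")
    if not isinstance(markup, list):
        return ""
    for entry in markup:
        # Assumption: entries without a mimetype are plain text.
        if isinstance(entry, dict) and entry.get("mimetype", "text/plain") == "text/plain":
            body = entry.get("body")
            if isinstance(body, str):
                return body
    return ""
```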

## Potential issues

The schema duplicates some of the information into the text fallback, though this is unavoidable
and intentional for fallback considerations.

## Alternatives

No significant alternatives known.

## Security considerations

The same considerations which currently apply to files, images, stickers, and extensible events also
apply here. For example, bounds on image size, assuming sender-provided details about the file are
false, etc.

## Unstable prefix

While this MSC is not considered stable, implementations should use `org.matrix.msc1767.*` as a prefix in
place of `m.*` throughout this proposal. Note that this uses the namespace of the parent MSC rather than
the namespace of this MSC - this is deliberate.
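
For illustration, an implementation could map stable identifiers to their unstable forms as below. This is a sketch: the helper names are hypothetical, and it assumes that only the event type and `m.*` dictionary keys need rewriting, while string values are left untouched.

```python
STABLE_PREFIX = "m."
UNSTABLE_PREFIX = "org.matrix.msc1767."


def to_unstable_name(name: str) -> str:
    """Rewrite "m.foo" to "org.matrix.msc1767.foo"; leave other names alone."""
    if name.startswith(STABLE_PREFIX):
        return UNSTABLE_PREFIX + name[len(STABLE_PREFIX):]
    return name


def to_unstable(value):
    """Recursively rename m.* dictionary keys; other values are copied as-is."""
    if isinstance(value, dict):
        return {to_unstable_name(k): to_unstable(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_unstable(v) for v in value]
    return value


def to_unstable_event(event: dict) -> dict:
    """Return a copy of the event with an unstable type and unstable content keys."""
    converted = dict(event)
    converted["type"] = to_unstable_name(event["type"])
    converted["content"] = to_unstable(event.get("content", {}))
    return converted
```

With this sketch, an `m.image` event becomes `org.matrix.msc1767.image`, and its `m.file` block becomes `org.matrix.msc1767.file`.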

Note that extensible events should only be used in an appropriate room version as well.
1 change: 1 addition & 0 deletions themes/docsy
Submodule docsy added at 5023a2