@WhitePeter WhitePeter commented Nov 14, 2025

For the time being this is a POC to eventually fix/mitigate the display of long URLs in OSD stats, see #10975 (comment).

This introduces a simple ellipsize function and an additional max_len attribute for append that automatically ellipsizes the str argument. The hard-coded length of 32 chars was chosen for testing purposes. Ideally it would change dynamically with the available space, but I haven't yet figured out whether and how that can be determined. If it can't be done, providing a script option is easy enough.

In conjunction with #17021 this could be a fix for #10975.

P.S.: I chose to do this in stats.lua because it fixes the immediate issue and Lua comes with some batteries included, which makes this easier and more readable. But I am wondering whether extending the filename property getter would be the better place for this, e.g. via a new sub-property /short. Anyway, for now this is an easy (partial) win.

Takes a string and an optional length and returns a string of that length
with '...' in the middle. If no length is provided, it defaults to the
good old 80 chars. If the string is shorter than the target length, the input
string is returned unchanged.

This, in conjunction with mpv-player#17021, is intended to eventually fix mpv-player#10975.

If set, the string gets ellipsized to the desired length before being
appended.

As a POC for a possible fix of mpv-player#10975, the "File" field now uses the
filename property ellipsized to 32 chars. For now the length is
hard-coded and rather small to make its effect more prominent in
testing.
return text end

local middle = max_len/2 + max_len%2
local ellipsized = ("%s...%s"):format(text:sub(1, middle-1), text:sub(-middle+2))

This will split multi-byte characters and grapheme clusters and so won't work as expected or produce invalid UTF-8 with non-ASCII values.

Best you can do in Lua here is to at least operate on codepoints and avoid the invalid UTF-8 part (it may still split things like emoji sequences or modifier characters and produce weird results).
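
A minimal sketch of the codepoint-based approach suggested above, assuming only Lua 5.1-compatible string functions (no `utf8` library). The names `ellipsize_middle` and `UTF8_CHAR` are illustrative and not taken from the PR; the default of 80 mirrors the commit message. For valid UTF-8 input it never cuts inside a multi-byte character, but it can still split grapheme clusters such as emoji sequences.

```lua
-- One UTF-8 encoded character: a lead byte plus its continuation bytes.
-- NUL is left out so the pattern behaves the same on Lua 5.1, LuaJIT
-- and later versions. Stray invalid bytes are simply not matched.
local UTF8_CHAR = "[\1-\127\194-\244][\128-\191]*"

-- Shorten `text` to at most `max_len` codepoints, keeping the start and
-- the end and putting "..." in the middle.
local function ellipsize_middle(text, max_len)
    max_len = max_len or 80

    local chars = {}
    for ch in text:gmatch(UTF8_CHAR) do
        chars[#chars + 1] = ch
    end

    -- Nothing to do if it already fits; a budget below 4 leaves no room
    -- for "..." plus at least one character, so the text is left alone.
    if #chars <= max_len or max_len < 4 then
        return text
    end

    local keep = max_len - 3          -- codepoints left besides "..."
    local head = math.ceil(keep / 2)  -- the start gets the extra one
    local tail = keep - head

    return table.concat(chars, "", 1, head) .. "..." ..
           table.concat(chars, "", #chars - tail + 1, #chars)
end

print(ellipsize_middle("a-rather-long-file-name-for-testing.mkv", 20))
```

Counting codepoints instead of bytes fixes the invalid-UTF-8 problem only; it says nothing about how wide the result is once rendered.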

Author

@WhitePeter WhitePeter Nov 14, 2025

Thanks, I was hoping for Schrödinger's UTF-8 support cat to be alive in Lua. So there doesn't seem to be any way to get this on the cheap. I'll explore bstr support for UTF-8 then.

@guidocella
Contributor

stats already had this for terminal output and we removed it in a187110. Clipping is already handled by \q2, which is accurate for proportional fonts, and ${term-clip-cc}, which considers multi-cell characters.

@WhitePeter
Author

> stats already had this for terminal output and we removed it in a187110. Clipping is already handled by \q2, which is accurate for proportional fonts, and ${term-clip-cc}, which considers multi-cell characters.

But I don't like that the clipping is done at the end. I want something that inserts an ellipsis in the middle if at all possible. I feel I may be getting somewhere with Lua, though. I want to try some things before I push. With some inspiration from keyname_cells and some digging in the Lua wiki, I think I can get something working. It won't be perfect, since that would be too expensive IMO, but I don't think it has to be for stats.lua. What's the worst that can happen? Some chars off-screen?

@WhitePeter
Author

WhitePeter commented Nov 14, 2025

Also, isn't stats.lua what's responsible for rendering the stats pages (default keys i/I)? How does terminal output figure in that? (edit: never mind, I see my mistake now; changed title to disambiguate)
This is mainly about the File: field in stats page 1.

@WhitePeter WhitePeter changed the title from "stats: ellipsize filename" to "player/lua/stats.lua: ellipsize filename" on Nov 14, 2025
@guidocella
Contributor

guidocella commented Nov 14, 2025

That is not feasible with ASS. You can't know the text length before rendering short of using osd-overlay with compute_bounds everywhere. You are arbitrarily cutting a few characters completely ignoring the window width, font size and window resizing while the text is displayed. Why should users only see 32 characters on a 4k window with small font size?

It is also completely arbitrary to do this for the stats filename and not for any other OSD message. If anything, this should be implemented in libass.

This also doesn't fix the linked issue which is about the output when printing tracks to the terminal, and has no relation to stats.
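
To illustrate the measuring step mentioned above, here is a rough sketch built on mpv's `osd-overlay` command with its `compute_bounds` and `hidden` parameters. The function name `measure_ass_width`, the overlay ID 63, and passing `mp` in as an argument (so the function can be tried with a stub outside mpv) are assumptions made for this example, not code from stats.lua.

```lua
-- Ask mpv to lay out `ass_text` without drawing it and return the width
-- of the resulting bounding box, or nil if mpv reports no bounds.
-- `mp` is the usual scripting API table; it is a parameter here only so
-- that the function can be exercised without a running player.
local function measure_ass_width(mp, ass_text, res_x, res_y)
    local res = mp.command_native({
        name = "osd-overlay",
        id = 63,                -- arbitrary ID, assumed unused by the script
        format = "ass-events",
        data = ass_text,
        res_x = res_x,
        res_y = res_y,
        hidden = true,          -- lay out the text but do not show it
        compute_bounds = true,  -- request x0/y0/x1/y1 in the result
    })

    -- The bounds are absent when the rectangle is empty or unknown.
    if not res or not res.x0 or not res.x1 then
        return nil
    end
    return res.x1 - res.x0
end
```

Inside a script this would be called as `measure_ass_width(mp, text, res_x, res_y)`. Doing so for every candidate cut point, and again on every resize, is the cost being objected to here.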

@WhitePeter
Author

WhitePeter commented Nov 14, 2025

Those 32 chars are just for testing, because my filename examples are rather short. It is meant to be configurable at least, eventually. I held off while this is still POC.

Re: text width, the underlying assumption is that the worst case is a monospace font. So if the text fits using that it sure does with proportional. Doesn't that make it easier to determine what max_len to aim for?

Anyway, if this cannot be done because of limitations with ASS rendering, I'd still like to have the second best option, some script-opt to set, say filename_max_len (opt-in: if unset, no change compared to prior versions).

Or just tell me if this is a dead end so I don't waste any more time.
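
The opt-in variant described above could look roughly like this. `filename_max_len` is the option name floated in this comment and does not exist in stats.lua; `limit_length` and its `shorten` callback are placeholders for whatever ellipsizing function ends up being used. `mp.options.read_options` is mpv's regular script-option loader; the `pcall` only keeps the sketch loadable outside mpv.

```lua
-- Hypothetical option: 0 means "unset", i.e. behave exactly as before.
local o = {
    filename_max_len = 0,
}

-- Inside mpv this picks up script-opts/stats.conf and
-- --script-opts=stats-filename_max_len=N; elsewhere the require fails
-- and the defaults above stay in place.
local ok, options = pcall(require, "mp.options")
if ok then
    options.read_options(o, "stats")
end

-- Apply `shorten(text, max_len)` only when the user opted in.
local function limit_length(text, shorten)
    if o.filename_max_len <= 0 then
        return text
    end
    return shorten(text, o.filename_max_len)
end
```

With the default of 0 the text passes through untouched, which matches the "if unset, no change compared to prior versions" requirement.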

@afishhh

afishhh commented Nov 14, 2025

> Re: text width, the underlying assumption is that the worst case is a monospace font. So if the text fits using that it sure does with proportional. Doesn't that make it easier to determine what max_len to aim for?

This is so wrong. There is no correspondence between a character's size in bytes and the width of the glyphs it generates. For example, fullwidth Chinese characters usually occupy two ASCII characters' worth of space but consist of more than two bytes. When dealing with codepoints, which is what you should be doing at the very least, a fullwidth character and an ASCII character each count as only one.

In the terminal this is handled by agreeing on a function for character width in cells that programs and terminal emulators use (see: wcwidth). This doesn't give perfect results and many programs/terminals don't do it properly but it works. I think this is what guido is talking about mpv implementing above.

In the GUI world I like to believe that we hold ourselves to high standards and do precise width calculations after shaping text with the appropriate fonts. This gets complicated when doing things like line-breaking (or inserting ellipses) though due to technicalities of text layout. This is why for it to be perfect this has to have work done by libass.
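
The three ways of measuring discussed above (bytes, codepoints, terminal cells) can be compared with a small sketch. The names `text_sizes`, `codepoint`, and `is_wide` are made up for this example, and `is_wide` is a deliberately crude approximation covering a few East Asian ranges; it is not a replacement for a real `wcwidth` table.

```lua
-- One UTF-8 encoded character (lead byte plus continuation bytes).
local UTF8_CHAR = "[\1-\127\194-\244][\128-\191]*"

-- Decode a single UTF-8 character into its codepoint using plain
-- arithmetic, so it also runs on Lua 5.1 without a bit library.
local function codepoint(ch)
    local lead = ch:byte(1)
    if lead < 0x80 then
        return lead
    end
    local cp, extra
    if lead >= 0xF0 then
        cp, extra = lead % 0x08, 3
    elseif lead >= 0xE0 then
        cp, extra = lead % 0x10, 2
    else
        cp, extra = lead % 0x20, 1
    end
    for i = 2, extra + 1 do
        -- Each continuation byte contributes its low 6 bits.
        cp = cp * 0x40 + (ch:byte(i) or 0x80) % 0x40
    end
    return cp
end

-- Crude stand-in for wcwidth: a handful of ranges that terminals
-- usually draw two cells wide. Incomplete on purpose.
local function is_wide(cp)
    return (cp >= 0x1100 and cp <= 0x115F)  -- Hangul Jamo
        or (cp >= 0x2E80 and cp <= 0xA4CF)  -- CJK radicals up to Yi
        or (cp >= 0xAC00 and cp <= 0xD7A3)  -- Hangul syllables
        or (cp >= 0xF900 and cp <= 0xFAFF)  -- CJK compatibility ideographs
        or (cp >= 0xFF00 and cp <= 0xFF60)  -- fullwidth forms
end

-- Returns byte count, codepoint count, and estimated terminal cells.
local function text_sizes(text)
    local codepoints, cells = 0, 0
    for ch in text:gmatch(UTF8_CHAR) do
        codepoints = codepoints + 1
        cells = cells + (is_wide(codepoint(ch)) and 2 or 1)
    end
    return #text, codepoints, cells
end

print(text_sizes("abc"))
```

For the three-character string 日本語 this reports 9 bytes, 3 codepoints, and 6 cells. None of the three numbers is the rendered width in a proportional font, which only exists after shaping.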

@WhitePeter
Author

> > Re: text width, the underlying assumption is that the worst case is a monospace font. So if the text fits using that it sure does with proportional. Doesn't that make it easier to determine what max_len to aim for?
>
> This is so wrong.

Or just oversimplified?

> There is no correspondence between a character's size in bytes and the width of the glyphs it generates. For example fullwidth Chinese characters usually occupy two ASCII characters worth of space but consist of more than two bytes.

I know as much. But knowing that a certain cluster of bytes will result in the equivalent of two chars width in a terminal (or monospace font) and knowing the font size should be enough for calculating the space required, no? And if one does use a monospace font, the result should fit perfectly in said space. If the font is proportional, all that's lost is some vacant space.

> In the GUI world I like to believe that we hold ourselves to high standards and do precise width calculations after shaping text with the appropriate fonts.

But you don't, do you? I've just played a video file with a >200 char filename and File: in the stats page (i) just seems to render what doesn't fit off-screen. There is a half character at the rightmost edge suggesting so. And for such cases, i.e. when absolute URIs get returned verbatim by the filename property, I think it would be good to ellipsize. Doesn't have to be perfect but it will be an improvement to status quo.

> This gets complicated when doing things like line-breaking (or inserting ellipses) though due to technicalities of text layout. This is why for it to be perfect this has to have work done by libass.

I think perfection can never be achieved anyway, so why not take something that will work in the vicinity of good enough and doesn't explode in case it misses the mark by some exotic unicode symbol or two. And if one is fine with status quo, don't opt-in once the option exists and you'll be none the wiser.

@guidocella
Contributor

> I know as much. But knowing that a certain cluster of bytes will result in the equivalent of two chars width in a terminal (or monospace font) and knowing the font size should be enough for calculating the space required, no? And if one does use a monospace font, the result should fit perfectly in said space. If the font is proportional, all that's lost is some vacant space.

But you don't know the width of ASS output without compute_bounds.

> But you don't, do you? I've just played a video file with a >200 char filename and File: in the stats page (i) just seems to render what doesn't fit off-screen. There is a half character at the rightmost edge suggesting so. And for such cases, i.e. when absolute URIs get returned verbatim by the filename property, I think it would be good to ellipsize. Doesn't have to be perfect but it will be an improvement to status quo.

It is worse because it cuts arbitrarily leaving empty space, while clipping used all available space at any window and font width, with no code required on our side.

> I think perfection can never be achieved anyway, so why not take something that will work in the vicinity of good enough and doesn't explode in case it misses the mark by some exotic unicode symbol or two. And if one is fine with status quo, don't opt-in once the option exists and you'll be none the wiser.

It is not good enough to cut a fixed number of bytes. And it is still stupid to do for a single string in all of mpv.

@WhitePeter
Author

WhitePeter commented Nov 14, 2025

> > I know as much. But knowing that a certain cluster of bytes will result in the equivalent of two chars width in a terminal (or monospace font) and knowing the font size should be enough for calculating the space required, no? And if one does use a monospace font, the result should fit perfectly in said space. If the font is proportional, all that's lost is some vacant space.
>
> But you don't know the width of ASS output without compute_bounds.

So the script-opt variant then.

> > But you don't, do you? I've just played a video file with a >200 char filename and File: in the stats page (i) just seems to render what doesn't fit off-screen. There is a half character at the rightmost edge suggesting so. And for such cases, i.e. when absolute URIs get returned verbatim by the filename property, I think it would be good to ellipsize. Doesn't have to be perfect but it will be an improvement to status quo.
>
> It is worse because it cuts arbitrarily leaving empty space, while clipping used all available space at any window and font width, with no code required on our side.

But clipping has downsides too, someone else has pointed out in the linked issue, IIRC. Having start, end and '...' in the middle leaves more context, especially the file extension.

> > I think perfection can never be achieved anyway, so why not take something that will work in the vicinity of good enough and doesn't explode in case it misses the mark by some exotic unicode symbol or two. And if one is fine with status quo, don't opt-in once the option exists and you'll be none the wiser.
>
> It is not good enough to cut a fixed number of bytes. And it is still stupid to do for a single string in all of mpv.

Then you can just leave said option alone (unset) and be golden. As I've said before, this is not necessarily exclusive to the file name. Any append caller can just set the attribute and be done with it. It's up to the caller to determine max_len beforehand, or the user to set a script-opt, which is not there yet.

@WhitePeter
Author

@guidocella Also, where is clipping even happening right now? I couldn't find anything that seems to be doing it, and the File: field in stats.lua page 1 definitely doesn't do it. The last thing on that line with a >200 char (pure ASCII) filename is half a char, with no '...' in sight.

@guidocella
Contributor

I guess it doesn't actively clip ASS output, it just doesn't wrap if there are no spaces in the filename and it keeps going beyond the window. It does clip terminal output with term-clip-cc.

@WhitePeter
Author

Now I get it, the terminal thing. stats.lua works with --no-video as well, for instance. That's when term-clip-cc comes into play. But if I were to use that for OSD purposes the clipping would be done to terminal line length which is totally uncoupled from vo dimensions. So the way I see it, there is currently no way of clipping or ellipsizing in the OSD.

@guidocella
Contributor

ASS clipping is done with \q2, without ellipsis. See https://aegisub.org/docs/latest/ass_tags/
