Theta-Dev
Contributor

I added support for the new, short date format (e.g. 1wk ago).

Fixes #1067
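
To illustrate the kind of input this concerns, here is a minimal Python sketch (not the code from this PR) of parsing English relative dates in both the long and the short form. The function name `parse_time_ago` and every abbreviation in the table other than `wk` are assumptions made for the example.

```python
import re
from typing import Optional, Tuple

# Hypothetical token table: only "wk" is taken from the example above,
# the other short forms are assumed for illustration.
UNITS = {
    "seconds": ["sec", "second", "seconds"],
    "minutes": ["min", "minute", "minutes"],
    "hours": ["hr", "hour", "hours"],
    "days": ["d", "day", "days"],
    "weeks": ["wk", "week", "weeks"],
    "months": ["mo", "month", "months"],
    "years": ["yr", "year", "years"],
}

# Invert the table so each token maps to its canonical unit name.
TOKEN_TO_UNIT = {token: unit for unit, tokens in UNITS.items() for token in tokens}

# A number, optional whitespace, a unit token, an optional dot, then "ago".
PATTERN = re.compile(r"(\d+)\s*([a-z]+)\.?\s+ago", re.IGNORECASE)


def parse_time_ago(text: str) -> Optional[Tuple[int, str]]:
    """Return (amount, unit) for strings like '1wk ago', or None if unrecognised."""
    match = PATTERN.search(text)
    if match is None:
        return None
    unit = TOKEN_TO_UNIT.get(match.group(2).lower())
    if unit is None:
        return None
    return int(match.group(1)), unit
```

With this table, `parse_time_ago("1wk ago")` and `parse_time_ago("1 week ago")` both yield `(1, "weeks")`, while unknown units return `None` instead of guessing.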

@Theta-Dev Theta-Dev force-pushed the fix/short-date-pasing branch from 59ba106 to 019a6a9 Compare June 1, 2023 10:45
@TobiGr
Contributor

TobiGr commented Jun 6, 2023

LGTM, but what about the other languages? I'd guess we need to update them, too. Should we ask the community for help here?

@Theta-Dev
Contributor Author

Parsing YouTube data in other languages is currently disabled, and supporting it would require more extensive changes to the dictionary and the parser.

I have implemented (and tested) a parser that works with all languages. Here is the dictionary for it. Also note that there are some special cases that have to be handled separately (e.g. in French a is both an article and the short form of "years").

https://code.thetadev.de/ThetaDev/rustypipe/src/branch/main/testfiles/dict/dictionary.json

@AudricV AudricV added the "bug" (Issue is related to a bug) and "youtube" (service, https://www.youtube.com/) labels Jun 10, 2023
@AudricV
Member

AudricV commented Jun 17, 2023

Parsing YouTube data in other languages is currently disabled

Yes, but clients can also use the timeago parser for things other than YouTube, even if YouTube is its main purpose.

Is the data you provided extracted from YouTube? Do you want to add support for other languages here using an approach similar to yours? I think we should have a parser that separates date units (seconds, hours, days, ...), digits (1, 2, 3, ...) and number units (tens, hundreds, thousands, ...) for each language we want to support.
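
As a rough sketch of that idea (an illustration only, not an existing API of the extractor), a per-language dictionary could keep the three categories apart. The class `TimeAgoDictionary`, the function `parse_with_dictionary` and the English entries are all hypothetical, and tens such as "twenty" are left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class TimeAgoDictionary:
    """Hypothetical per-language dictionary with the three categories kept separate."""
    units: Dict[str, str]        # date units: token -> canonical unit
    digits: Dict[str, int]       # number words: token -> value
    multipliers: Dict[str, int]  # number units: token -> factor


# Assumed English entries, for illustration only.
ENGLISH = TimeAgoDictionary(
    units={"hour": "hours", "hours": "hours", "day": "days", "days": "days",
           "week": "weeks", "weeks": "weeks", "year": "years", "years": "years"},
    digits={"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
            "six": 6, "seven": 7, "eight": 8, "nine": 9},
    multipliers={"hundred": 100, "thousand": 1000},
)


def parse_with_dictionary(text: str, d: TimeAgoDictionary) -> Optional[Tuple[int, str]]:
    """Scan whitespace-separated tokens and combine amount and unit."""
    amount: Optional[int] = None
    unit: Optional[str] = None
    for token in text.lower().split():
        if token.isdigit():
            amount = int(token)
        elif token in d.digits:
            amount = d.digits[token]
        elif token in d.multipliers:
            # A bare multiplier ("hundred") counts as 1 * factor.
            amount = (amount or 1) * d.multipliers[token]
        elif token in d.units:
            unit = d.units[token]
    if amount is None or unit is None:
        return None
    return amount, unit
```

Adding a language would then mean supplying another `TimeAgoDictionary` instance rather than changing the parsing loop, although real languages would likely need extra rules for word order and inflection.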

in French a is both an article and the short form of "years"

Did you mean the third-person singular form of the verb "to have" and the short form of years? There is no article a, only à.

@Theta-Dev
Contributor Author

Theta-Dev commented Jun 17, 2023

This is the French term in question (5 years ago):

il y a 5 a

I currently have a special case for the French language which checks if the string ends with a.

The data from the parsing dictionary is a combination of data extracted from YouTube and the CLDR repository.
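
The special case described above could look roughly like the following Python sketch. It is not the rustypipe implementation: the function name `parse_french_time_ago` and the unit table are assumptions, and only the trailing "a" rule comes from this comment.

```python
import re
from typing import Optional, Tuple

# Hypothetical French unit tokens. "a" is deliberately NOT in this table,
# because it also occurs in the phrase "il y a" and needs special handling.
FR_UNITS = {
    "jour": "days", "jours": "days",
    "semaine": "weeks", "semaines": "weeks",
    "an": "years", "ans": "years",
}


def parse_french_time_ago(text: str) -> Optional[Tuple[int, str]]:
    """Return (amount, unit) for strings like 'il y a 5 a', or None."""
    # Split into runs of digits and runs of letters.
    tokens = re.findall(r"\d+|[^\W\d_]+", text.lower())
    amount = next((int(t) for t in tokens if t.isdigit()), None)
    if amount is None:
        return None
    # Special case: "a" means years only when the string ends with it.
    if tokens[-1] == "a":
        return amount, "years"
    for token in tokens:
        if token in FR_UNITS:
            return amount, FR_UNITS[token]
    return None
```

Checking only the final token avoids the trap where a plain dictionary lookup would match the "a" inside "il y a" and report years for every French string.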

@AudricV AudricV changed the title from "[YouTube] fix: parsing of short date formats" to "[YouTube] Fix parsing short date formats (English only)" Jun 18, 2023
@AudricV
Member

AudricV commented Jun 18, 2023

Merging this PR, as we would need to overhaul the timeago parser system to work with other languages on short time units. Thanks for the fix!

@AudricV AudricV changed the title from "[YouTube] Fix parsing short date formats (English only)" to "[YouTube] Fix parsing short relative date formats (English only)" Jun 18, 2023
@AudricV AudricV merged commit ad97f08 into TeamNewPipe:dev Jun 18, 2023
Development

Successfully merging this pull request may close these issues.

[YouTube] New short date format