This repository was archived by the owner on Apr 26, 2024. It is now read-only.

There is no underscore in the character class in the regular expression capture for charset detection in URL previews #10307

@srividyut

Description

Line 61:

_charset_match = re.compile(br'<\s*meta[^>]*charset\s*=\s*"?([a-z0-9-]+)"?', flags=re.I)

Line 63:

br'\s*<\s*\?\s*xml[^>]*encoding="([a-z0-9-]+)"', flags=re.I

  • Outside Europe and the United States, URL previews come out garbled for web pages that use certain character encodings. The names of some of these encodings contain underscores (for example, Shift_JIS), which the current character class does not accept.
  • The fix touches 2 lines and changes only 2 characters in total, one per line.

Original proposal: ([a-z0-9-]+) -> ([a-z0-9-_]+)
Corrected (21/07/16 02:00): ([a-z0-9-]+) -> ([a-z0-9_-]+)

(21/07/16 02:00) A hyphen inside a character class needs to be escaped unless it is at the beginning or end of the class, so the underscore goes before the hyphen. The "Source Editor Screenshot" was also incorrect, so I deleted it.
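The effect of the character-class change can be checked in isolation. The following is a minimal sketch, not the project's actual code: the two patterns are copied from the lines quoted above, while `build_patterns` and the sample inputs are hypothetical names introduced only for this illustration.

```python
import re


def build_patterns(name_class: bytes):
    """Build the two charset patterns quoted in this issue around a given
    character class (hypothetical helper, for comparison only)."""
    meta = re.compile(
        br'<\s*meta[^>]*charset\s*=\s*"?(' + name_class + br'+)"?', flags=re.I
    )
    xml = re.compile(
        br'\s*<\s*\?\s*xml[^>]*encoding="(' + name_class + br'+)"', flags=re.I
    )
    return meta, xml


# Character class as it is today, and as proposed in this issue.
old_meta, old_xml = build_patterns(br'[a-z0-9-]')
new_meta, new_xml = build_patterns(br'[a-z0-9_-]')

# Made-up sample documents that declare Shift_JIS.
html_doc = b'<html><head><meta charset="Shift_JIS"></head></html>'
xml_doc = b'<?xml version="1.0" encoding="Shift_JIS"?><root/>'

# The old meta pattern stops at the underscore and captures only a prefix.
old_meta_name = old_meta.search(html_doc).group(1)
# The old XML pattern requires a closing quote right after the name,
# so it does not match at all.
old_xml_match = old_xml.match(xml_doc)

# With the underscore allowed, both patterns capture the full name.
new_meta_name = new_meta.search(html_doc).group(1)
new_xml_name = new_xml.match(xml_doc).group(1)
```

In this sketch the old meta pattern yields the truncated name `Shift`, the old XML pattern finds nothing, and both new patterns yield `Shift_JIS`. Names without an underscore, such as `utf-8`, are captured identically by both versions.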

  • After adding the underscore, I sent a message containing a URL from the client. Pages whose character encoding name contains an underscore, such as Shift_JIS, were then displayed without garbled characters.
  • I opened a pull request with the same content. Sorry for being a beginner in Python.
    Please ignore this, as it seems to be excluded in the test.
  • Alternatively, I may be overthinking it, but there could be a deeper reason why the underscore was left out.
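To illustrate why a truncated charset name leads to garbled text, here is a small sketch under stated assumptions. `decode_body` is a hypothetical helper, and its fallback to UTF-8 with replacement characters is an assumption made for this example, not a description of how the project handles unknown encodings.

```python
import codecs
import re

OLD_PATTERN = re.compile(
    br'<\s*meta[^>]*charset\s*=\s*"?([a-z0-9-]+)"?', flags=re.I
)
NEW_PATTERN = re.compile(
    br'<\s*meta[^>]*charset\s*=\s*"?([a-z0-9_-]+)"?', flags=re.I
)


def decode_body(pattern, body: bytes) -> str:
    """Decode a page using the charset captured by `pattern`.

    Hypothetical helper: falls back to UTF-8 (an assumption for this
    sketch) when no charset is found or the captured name is unknown.
    """
    match = pattern.search(body)
    encoding = match.group(1).decode("ascii") if match else "utf-8"
    try:
        codecs.lookup(encoding)
    except LookupError:
        # A truncated name such as "Shift" is not a known codec.
        encoding = "utf-8"
    return body.decode(encoding, errors="replace")


# A made-up page encoded as Shift_JIS with a Japanese title.
title = "日本語のページ"
page = ('<meta charset="Shift_JIS"><title>' + title + "</title>").encode("shift_jis")

decoded_old = decode_body(OLD_PATTERN, page)  # charset read as "Shift"
decoded_new = decode_body(NEW_PATTERN, page)  # charset read as "Shift_JIS"
```

With the old pattern the title bytes are decoded with the wrong encoding and come out as replacement characters, while the new pattern recovers the original title.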

Metadata

    Labels

    S-Minor: Blocks non-critical functionality, workarounds exist.
    T-Defect: Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
    good first issue: Good for newcomers
