Need census of Unicode characters in PG texts #276

@eshellman

Here's something that would be really useful for PG development going forward: a count of the occurrences of Unicode code points used in PG texts. A recent issue, #271, notes that the 2em dash is often used but is missing from many typefaces.

Applications:

  • we could automatically add an embedded "polyfill" subsetted font to epubs to improve the appearance of characters with spotty coverage in typefaces, or we could recommend that users choose fonts with comprehensive coverage.
  • we could produce a test epub that producers can use to check coverage of non-ASCII code points.
  • data science!

How:
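
One possible starting point, as a minimal sketch only: it assumes the texts are available locally as UTF-8 plain-text files, and the function names (`count_code_points`, `census`, `non_ascii_report`) are hypothetical, not part of any existing PG tooling.

```python
from collections import Counter
from pathlib import Path
import unicodedata


def count_code_points(text: str) -> Counter:
    """Count how often each Unicode code point occurs in a string."""
    return Counter(text)


def census(paths) -> Counter:
    """Sum code point counts over a collection of text files.

    Assumption: files are UTF-8. Undecodable bytes are not fatal;
    errors="replace" turns them into U+FFFD, which then shows up in the tally.
    """
    total = Counter()
    for path in paths:
        total.update(Path(path).read_text(encoding="utf-8", errors="replace"))
    return total


def non_ascii_report(counts: Counter):
    """Return (code point, name, count) rows for non-ASCII characters,
    most frequent first."""
    rows = []
    for ch, n in counts.most_common():
        if ord(ch) > 0x7F:
            name = unicodedata.name(ch, "<unnamed>")
            rows.append((f"U+{ord(ch):04X}", name, n))
    return rows
```

Something like `non_ascii_report(census(Path("texts").rglob("*.txt")))` would then list the non-ASCII code points across a directory of texts (the `texts` directory is a placeholder). The same rows could feed a font subsetter or a test epub, but that part is not sketched here.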
