Expose java.lang.Character.getType() and constant fields like COMBINING_SPACING_MARK #103

@ctjlewis

It seems that J2CL does not expose Character.getType or the Unicode category constant fields (like Character.COMBINING_SPACING_MARK); code that uses them fails with a "symbol not found" error.

For reference, see google/closure-compiler#3639, where Closure Compiler was unable to interpret a composite Unicode sequence as a valid IdentifierPart. Part of CC's parsing process relies on Scanner.java, a class that determines whether a given token is an IdentifierStart or IdentifierPart in compliance with the ECMAScript spec. All Unicode category checks there are currently done by testing whether the character falls within one of several hard-coded Unicode ranges (see below). I replicated that approach for this fix, but it is neither as future-proof nor as legible as Character.getType(ch) == Character.COMBINING_SPACING_MARK, which would keep working as the Unicode standard evolves over time.

private static boolean isCombiningMark(char ch) {
    // TODO (ctjl): Implement in a more reliable and future-proofed way, i.e.:
    // return Character.getType(ch) == Character.NON_SPACING_MARK;
    return (
      // 0300-036F
      (0x0300 <= ch && ch <= 0x036F) ||
      // 1AB0-1AFF
      (0x1AB0 <= ch && ch <= 0x1AFF) ||
      // 1DC0-1DFF
      (0x1DC0 <= ch && ch <= 0x1DFF) ||
      // 20D0-20FF
      (0x20D0 <= ch && ch <= 0x20FF) ||
      // FE20-FE2F
      (0xFE20 <= ch && ch <= 0xFE2F)
    );
}

This hard-coded, manual approach is taken for every Unicode category check in the jsComp library because the J2CL compile must succeed in order to push a release (code using Character.getType() compiles with Maven, but not with Bazel). It would be beneficial for the CC library if J2CL could support Character.getType() and the category constants.
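
To illustrate the difference on a standard JVM (not under J2CL, where the category-based version is what fails to compile), here is a minimal sketch that puts the two approaches side by side. The class name CombiningMarks and both method names are hypothetical and are not taken from Scanner.java. The category-based check assumes that both Mn (NON_SPACING_MARK) and Mc (COMBINING_SPACING_MARK) should count as combining marks, which matches the ES5 definition of UnicodeCombiningMark but may not be exactly what the CC fix needs.

```java
public final class CombiningMarks {
  private CombiningMarks() {}

  /** Range-based check, mirroring the hard-coded ranges quoted in this issue. */
  public static boolean isCombiningMarkByRange(char ch) {
    return (0x0300 <= ch && ch <= 0x036F)
        || (0x1AB0 <= ch && ch <= 0x1AFF)
        || (0x1DC0 <= ch && ch <= 0x1DFF)
        || (0x20D0 <= ch && ch <= 0x20FF)
        || (0xFE20 <= ch && ch <= 0xFE2F);
  }

  /**
   * Category-based check. Assumption: Mn and Mc both count as combining marks.
   * This relies on the JDK's Unicode tables, so it tracks whatever Unicode
   * version the running JDK ships with.
   */
  public static boolean isCombiningMarkByType(char ch) {
    int type = Character.getType(ch);
    return type == Character.NON_SPACING_MARK
        || type == Character.COMBINING_SPACING_MARK;
  }

  public static void main(String[] args) {
    // U+0301 COMBINING ACUTE ACCENT: category Mn, inside the 0300-036F range.
    char acute = (char) 0x0301;
    // U+093E DEVANAGARI VOWEL SIGN AA: category Mc, outside every hard-coded range.
    char vowelSignAa = (char) 0x093E;

    for (char ch : new char[] {acute, vowelSignAa, 'a'}) {
      System.out.printf(
          "U+%04X byRange=%b byType=%b%n",
          (int) ch, isCombiningMarkByRange(ch), isCombiningMarkByType(ch));
    }
  }
}
```

The two checks agree on U+0301 and on plain letters, but only the category-based check accepts U+093E, since that character is a combining spacing mark that lies outside all five hard-coded blocks. Gaps like this are what make the range-based approach fragile.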
