Commit 10ba78d

mszabo-wikia authored and sjamesr committed
Reduce the incidence of infinite loops while case folding
Commit dc7d6e5 unfortunately increases the incidence of infinite loops during case folding when re2j runs on a JVM newer than the version used to generate the bundled UnicodeTables.java and the input contains a rune that would require special case folding rules to form a closed fold loop. \u1C80 (Cyrillic Small Letter Rounded Ve) is an example of such a rune.

Work around the issue by inverting the order of the parameters passed to equalsIgnoreCase(), so that the rune from the pattern being matched, rather than the rune from the input content, undergoes case folding.

This does not fully eliminate the possibility of an infinite loop in this scenario, since the pattern may itself contain one of the problematic runes. It does, however, restore the situation as it was before dc7d6e5, since the previous logic also performed case folding on the rune from the pattern and not on the rune from the content.

Signed-off-by: Máté Szabó <[email protected]>
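The failure mode described above can be illustrated with a toy model. The sketch below is not re2j's code: the class FoldOrbitSketch, its three-entry fold table, and the maxSteps guard are all hypothetical, chosen only to show why the argument order matters when one rune's fold orbit never returns to its starting point.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only. The fold table below is made up for illustration and is
// NOT re2j's UnicodeTables data; 'x' stands in for a rune such as \u1C80.
final class FoldOrbitSketch {
  enum Result { EQUAL, NOT_EQUAL, STUCK }

  // Hypothetical fold table: 'a' and 'A' form a closed orbit (a -> A -> a),
  // while 'x' folds into that orbit but nothing ever folds back to 'x'.
  private static final Map<Integer, Integer> FOLD = new HashMap<>();

  static {
    FOLD.put((int) 'a', (int) 'A');
    FOLD.put((int) 'A', (int) 'a');
    FOLD.put((int) 'x', (int) 'a');
  }

  // Runes without an entry fold to themselves.
  static int fold(int r) {
    return FOLD.getOrDefault(r, r);
  }

  // Walks the fold orbit of r1 looking for r2, so only r1 is ever folded.
  // The walk ends when the orbit returns to r1. The maxSteps guard is an
  // addition of this sketch so that an unclosed orbit is reported as STUCK
  // instead of looping forever.
  static Result compare(int r1, int r2, int maxSteps) {
    if (r1 == r2) {
      return Result.EQUAL;
    }
    int steps = 0;
    for (int r = fold(r1); r != r1; r = fold(r)) {
      if (r == r2) {
        return Result.EQUAL;
      }
      if (steps++ >= maxSteps) {
        return Result.STUCK;
      }
    }
    return Result.NOT_EQUAL;
  }
}
```

With this table, compare('x', 'B', 16) returns STUCK, because the walk starts at 'x' and then cycles between 'a' and 'A' without ever returning to 'x'. compare('B', 'x', 16) returns NOT_EQUAL at once, because 'B' folds to itself. Folding the pattern rune first therefore only helps when the problematic rune is in the input, which mirrors the caveat in the commit message.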
1 parent 9b3f052 commit 10ba78d

2 files changed, +7 -1 lines changed

java/com/google/re2j/Inst.java

Lines changed: 5 additions & 1 deletion

@@ -51,8 +51,12 @@ boolean matchRune(int r) {
     if (runes.length == 1) {
       int r0 = runes[0];
 
+      // If this pattern is case-insensitive, apply Unicode case folding to compare the two runes.
+      // Note that this may result in a case-folding loop when executed on a JVM newer than the version used
+      // to generate Unicode data for re2j, so attempt to reduce the chance of that occurring
+      // by performing case folding on |r0| from the pattern rather than |r| from the input.
       if ((arg & RE2.FOLD_CASE) != 0) {
-        return Unicode.equalsIgnoreCase(r, r0);
+        return Unicode.equalsIgnoreCase(r0, r);
       }
       return r == r0;
     }

java/com/google/re2j/Unicode.java

Lines changed: 2 additions & 0 deletions

@@ -125,6 +125,8 @@ static int simpleFold(int r) {
   // equalsIgnoreCase performs case-insensitive equality comparison
   // on the given runes |r1| and |r2|, with special consideration
   // for the likely scenario where both runes are ASCII characters.
+  // If non-ASCII, Unicode case folding will be performed on |r1|
+  // to compare it to |r2|.
   // -1 is interpreted as the end-of-file mark.
   static boolean equalsIgnoreCase(int r1, int r2) {
     // Runes already match, or one of them is EOF
