// When performing string search operations, ICU internally uses a break iterator that does not allow breaking between certain characters, according to
// the Grapheme Cluster Boundary Rules specified in http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundary_Rules.
// Unfortunately, not all of these rules produce the behavior we need in .NET. For example, the rules do not allow breaking between the CR '\r' and LF '\n' characters,
// so searching for "\n" in a string like "\r\n" returns a not-found result.
// We customize the break iterator to include only the rules that give the desired behavior. Mainly, we include the GB9 rule http://www.unicode.org/reports/tr29/#GB9,
// which does not allow breaking before nonspacing marks.
// The general rule syntax is explained in https://unicode-org.github.io/icu/userguide/boundaryanalysis/break-rules.html.
// The ICU rule definitions are at https://github.com/unicode-org/icu/blob/main/icu4c/source/data/brkitr/rules/char.txt.
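
The effect described in the comment above can be illustrated without ICU. The following is a minimal sketch, not the code from this change: `is_combining_mark`, `is_boundary`, `find_on_boundaries`, and the `keep_crlf_rule` flag are hypothetical names, and the combining-mark test covers only U+0300..U+036F as a rough stand-in for the character set that GB9 actually protects. It assumes that a match counts only when it starts and ends on a boundary, which is the behavior the comment attributes to the break iterator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the characters covered by
 * GB9: only the Combining Diacritical Marks block, U+0300..U+036F. */
bool is_combining_mark(uint32_t cp)
{
    return cp >= 0x0300 && cp <= 0x036F;
}

/* Returns true if a break is allowed immediately before text[pos].
 * keep_crlf_rule toggles a GB3-like rule (no break between CR and LF). */
bool is_boundary(const uint32_t *text, size_t len, size_t pos, bool keep_crlf_rule)
{
    if (pos == 0 || pos >= len)
        return true; /* start and end of text are always boundaries */
    if (keep_crlf_rule && text[pos - 1] == '\r' && text[pos] == '\n')
        return false; /* GB3-like: CR x LF */
    if (is_combining_mark(text[pos]))
        return false; /* GB9-like: no break before a combining mark */
    return true;
}

/* Returns the index of the first occurrence of pat in text that both starts
 * and ends on an allowed boundary, or -1 if there is none. */
long find_on_boundaries(const uint32_t *text, size_t tlen,
                        const uint32_t *pat, size_t plen,
                        bool keep_crlf_rule)
{
    assert(text != NULL && pat != NULL);
    if (plen == 0 || plen > tlen)
        return -1;
    for (size_t i = 0; i + plen <= tlen; i++) {
        size_t j = 0;
        while (j < plen && text[i + j] == pat[j])
            j++;
        if (j == plen &&
            is_boundary(text, tlen, i, keep_crlf_rule) &&
            is_boundary(text, tlen, i + plen, keep_crlf_rule))
            return (long)i;
    }
    return -1;
}
```

In this sketch, searching for "\n" in "\r\n" returns -1 when `keep_crlf_rule` is true, because the only candidate match begins between CR and LF, and returns 1 when the rule is dropped. Searching for 'e' in 'e' followed by U+0301 returns -1 in both cases, since the GB9-like rule is always kept.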