+ * Is this code unit a lead surrogate (U+d800..U+dbff)?
+ * @param c 16-bit code unit
+ * @return true or false
+ */
+#define IS_LEAD(c) (((c)&0xfffffc00) == 0xd800)
+
+/**
+ * Is this code unit a trail surrogate (U+dc00..U+dfff)?
+ * @param c 16-bit code unit
+ * @return true or false
+ */
+#define IS_TRAIL(c) (((c)&0xfffffc00) == 0xdc00)
+
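Note on the two macros above: the mask 0xfffffc00 clears the low 10 bits, so each test compares the remaining upper bits of the unit against the prefix shared by a whole 1,024-value surrogate block. Because the mask is 32 bits wide, a value above 0xffff never matches either test. A minimal sketch of that classification follows; the `unit_kind` enum and the `classify_unit` helper are hypothetical names used only for illustration and are not part of the commit.

```c
#include <stdint.h>

/* Same definitions as in the diff. */
#define IS_LEAD(c) (((c)&0xfffffc00) == 0xd800)
#define IS_TRAIL(c) (((c)&0xfffffc00) == 0xdc00)

/* Hypothetical names, for illustration only. */
typedef enum { UNIT_OTHER = 0, UNIT_LEAD = 1, UNIT_TRAIL = 2 } unit_kind;

/* Classify a single UTF-16 code unit. */
static unit_kind classify_unit(uint16_t c)
{
    if (IS_LEAD(c)) {
        return UNIT_LEAD;   /* 0xd800..0xdbff */
    }
    if (IS_TRAIL(c)) {
        return UNIT_TRAIL;  /* 0xdc00..0xdfff */
    }
    return UNIT_OTHER;      /* any non-surrogate unit */
}
```

The boundary values are the interesting cases: 0xd7ff and 0xe000 fall just outside the surrogate range, while 0xdbff is the last lead unit and 0xdc00 the first trail unit.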
+/**
+ * Get a code point index from a string at a code point boundary offset,
+ * and advance the offset to the next code point boundary.
+ * (Post-incrementing forward iteration.)
+ * "Safe" macro, handles unpaired surrogates and checks for string boundaries.
+ *
+ * The length can be negative for a NUL-terminated string.
+ *
+ * The offset may point to the lead surrogate unit
+ * for a supplementary code point, in which case for casing will be read
+ * the following trail surrogate as well.
+ * If the offset points to a trail surrogate or
+ * to a single, unpaired lead surrogate, then for casing will be read that unpaired surrogate.
+ *
+ * @param s const uint16_t* string
+ * @param i output string offset, must be i<length
+ * @param length string length
+ */
+#define NEXTOFFSET(s, i, length) { \
+    uint16_t c = (s)[(i)++]; \
+    if (IS_LEAD(c)) { \
+        uint16_t __c2; \
+        if ((i) != (length) && IS_TRAIL(__c2 = (s)[(i)])) { \
+            ++(i); \
+        } \
+    } \
+}
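A sketch of how NEXTOFFSET could drive a forward scan, assuming an explicit, non-negative length. The macro is restated here without the `__c2` temporary (which is assigned but never read) so that the block compiles cleanly under strict warnings; the behaviour is intended to be the same. `count_code_points` is a hypothetical helper and is not part of the commit.

```c
#include <stdint.h>

#define IS_LEAD(c) (((c)&0xfffffc00) == 0xd800)
#define IS_TRAIL(c) (((c)&0xfffffc00) == 0xdc00)

/* Restated from the diff, minus the unused temporary. */
#define NEXTOFFSET(s, i, length) { \
    uint16_t c = (s)[(i)++]; \
    if (IS_LEAD(c) && (i) != (length) && IS_TRAIL((s)[(i)])) { \
        ++(i); \
    } \
}

/* Hypothetical helper: count code points in a UTF-16 buffer.
 * An unpaired surrogate counts as one code point, matching the
 * "safe" behaviour described in the macro's comment. */
static int32_t count_code_points(const uint16_t* s, int32_t length)
{
    int32_t i = 0;
    int32_t n = 0;
    while (i < length) {
        NEXTOFFSET(s, i, length); /* advances i by 1 or 2 code units */
        n++;
    }
    return n;
}
```

For the units 0x0041, 0xD801, 0xDC37, 0xD800, 0x0042 the helper returns 4: the pair 0xD801 0xDC37 is consumed in one step, and the unpaired lead 0xD800 is consumed on its own because the unit after it is not a trail surrogate. A lead surrogate in the last position is also safe, since the `(i) != (length)` check stops the macro from reading past the end.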
 
 /**
  * Append a code point to a string, overwriting 1 or 2 code units.
@@ -46,6 +87,11 @@
 ChangeCaseNative
 
 Performs upper or lower casing of a string into a new buffer, taking into account the specified locale.
+Two things we are considering here:
+1. Prohibiting code point expansions. Some characters code points expand when uppercased or lowercased, which may lead to an insufficient destination buffer.
+Instead, we prohibit these expansions and iterate through the string character by character opting for the original character if it would have been expanded.
+2. Properly handling surrogate pairs. Characters can be comprised of more than one code point
+(i.e. surrogate pairs like \uD801\uDC37). All code points for a character are needed to properly change case
 Returns 0 for success, non-zero on failure see ErrorCodes.
 Performs upper or lower casing of a string into a new buffer.
+Two things we are considering here:
+1. Prohibiting code point expansions. Some characters code points expand when uppercased or lowercased, which may lead to an insufficient destination buffer.
+Instead, we prohibit these expansions and iterate through the string character by character opting for the original character if it would have been expanded.
+2. Properly handling surrogate pairs. Characters can be comprised of more than one code point
+(i.e. surrogate pairs like \uD801\uDC37). All code points for a character are needed to properly change case
 Returns 0 for success, non-zero on failure see ErrorCodes.
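Both comment blocks state the same two rules, so one sketch may help illustrate them. In UTF-16 terms, a surrogate pair such as \uD801\uDC37 is two code units that together encode a single code point (here U+10437), so the pair has to be combined before any case lookup. The code below is a minimal sketch under stated assumptions, not the commit's implementation: `toy_to_upper` is a toy stand-in for a real case-mapping table, and `unit_count` and `upper_no_expand` are hypothetical names. With this toy table the expansion guard never actually fires; it is included to show where such a check could sit.

```c
#include <stdint.h>

#define IS_LEAD(c) (((c)&0xfffffc00) == 0xd800)
#define IS_TRAIL(c) (((c)&0xfffffc00) == 0xdc00)

/* Toy stand-in for a real case-mapping table (assumption, not from the commit).
 * Maps ASCII a-z and Deseret small letters U+10428..U+1044F only. Everything
 * else, including U+00DF whose full uppercase form "SS" would expand, is
 * returned unchanged. */
static uint32_t toy_to_upper(uint32_t cp)
{
    if (cp >= 'a' && cp <= 'z') return cp - 0x20;
    if (cp >= 0x10428 && cp <= 0x1044F) return cp - 0x28;
    return cp;
}

/* Number of UTF-16 code units needed to store a code point. */
static int32_t unit_count(uint32_t cp)
{
    return cp >= 0x10000 ? 2 : 1;
}

/* Hypothetical sketch: upper-case src into dst without growing the string.
 * Returns 0 on success, non-zero if dst is too small. */
static int32_t upper_no_expand(const uint16_t* src, int32_t srcLen,
                               uint16_t* dst, int32_t dstLen)
{
    int32_t i = 0;
    int32_t j = 0;
    while (i < srcLen) {
        uint32_t cp = src[i++];
        /* Rule 2: combine a lead/trail pair into one code point. */
        if (IS_LEAD(cp) && i != srcLen && IS_TRAIL(src[i])) {
            cp = 0x10000 + ((cp - 0xd800) << 10) + (src[i++] - 0xdc00);
        }
        uint32_t mapped = toy_to_upper(cp);
        /* Rule 1: keep the original if the result would need more units. */
        if (unit_count(mapped) > unit_count(cp)) {
            mapped = cp;
        }
        if (j + unit_count(mapped) > dstLen) {
            return 1;
        }
        if (mapped >= 0x10000) {
            mapped -= 0x10000;
            dst[j++] = (uint16_t)(0xd800 + (mapped >> 10));
            dst[j++] = (uint16_t)(0xdc00 + (mapped & 0x3ff));
        } else {
            dst[j++] = (uint16_t)mapped;
        }
    }
    return 0;
}
```

With the input units 0x0061, 0x00DF, 0xD801, 0xDC37 and a four-unit destination, this sketch writes 0x0041, 0x00DF, 0xD801, 0xDC0F and returns 0: the ASCII letter is upper-cased, U+00DF is left as it is, and U+10437 maps to U+1040F as a whole code point. Because no step writes more units than it read, a destination the same size as the source is always sufficient here.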