Skip to content

Conversation

@iamleot
Copy link

@iamleot iamleot commented Aug 31, 2025

isspace(3), tolower(3) and toupper(3) functions only accepts values that are representable as unsigned char or is EOF. For all other values the behavior is undefined.

Cast to unsigned char to avoid any undefined behavior.


This was noticed on NetBSD.


Trying to build commit 34b8886 in NetBSD/amd64 -current (11.99.1) we have compiler warnings:

[...]
cc -DVERSION=\"1.1\" -D_GNU_SOURCE -MD -Wall -Wextra -g -std=c99 -O3 -pedantic -Ideps -Werror=vla -c -o src/match.o src/match.c
In file included from /usr/include/ctype.h:100,
                 from src/match.c:1:
src/match.c: In function ‘setup_match_struct’:
src/match.c:62:56: warning: array subscript has type ‘char’ [-Wchar-subscripts]
   62 |                 match->lower_needle[i] = tolower(needle[i]);
      |                                                        ^
src/match.c:65:60: warning: array subscript has type ‘char’ [-Wchar-subscripts]
   65 |                 match->lower_haystack[i] = tolower(haystack[i]);
      |                                                            ^
[...]
In file included from /usr/include/ctype.h:100,
                 from deps/greatest/greatest.h:99,
                 from test/test_properties.c:4:
test/test_properties.c: In function ‘prop_positions_should_match_characters_in_string’:
test/test_properties.c:138:35: warning: array subscript has type ‘char’ [-Wchar-subscripts]
  138 |                 if (toupper(needle[i]) != toupper(haystack[positions[i]])) {
      |                                   ^
test/test_properties.c:138:59: warning: array subscript has type ‘char’ [-Wchar-subscripts]
  138 |                 if (toupper(needle[i]) != toupper(haystack[positions[i]])) {
      |                                                           ^
[...]

...and also the tests ends up in crashing fzytest:

[...]
./test/fzytest

* Suite match_suite:
........................
24 tests - 24 passed, 0 failed, 0 skipped (0 ticks, 0.000 sec)

* Suite choices_suite:
....gmake: *** [Makefile:29: check] Segmentation fault (core dumped)
$ gdb -core fzytest.core test/fzytest
Reading symbols from test/fzytest...
[New process 18720]
[New process 10539]
[New process 15827]
Core was generated by `fzytest'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000004069f2 in setup_match_struct (match=0x7166511fd760, needle=0x40a569 "e",
    haystack=0x40ac90 "Edmund Husserl - Méditations cartésiennes - Introduction a la phénoménologie.pdf") at src/match.c:65
65                      match->lower_haystack[i] = tolower(haystack[i]);
[Current thread is 1 (process 18720)]
(gdb) quit
[...]

This PR avoids that and all the tests now passes:

$ gmake test
./test/fzytest

* Suite match_suite:
........................
24 tests - 24 passed, 0 failed, 0 skipped (0 ticks, 0.000 sec)

* Suite choices_suite:
......
6 tests - 6 passed, 0 failed, 0 skipped (4 ticks, 0.040 sec)

* Suite properties_suite:
..
2 tests - 2 passed, 0 failed, 0 skipped (113 ticks, 1.130 sec)

Total: 32 tests (117 ticks, 1.170 sec), 102 assertions
Pass: 32, fail: 0, skip: 0.

For possible more information regarding ctype(3) abuse please give a look to CAVEATS section of ctype(3) NetBSD man page.

isspace(3), tolower(3) and toupper(3) functions only accepts values
that are representable as unsigned char or is EOF. For all other values
the behavior is undefined.

Cast to unsigned char to avoid any undefined behavior.
@iamleot iamleot marked this pull request as ready for review August 31, 2025 10:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant