summaryrefslogtreecommitdiffhomepage
path: root/src/nxt_clang.h (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2024-10-29Use nxt_nitems() instead of sizeof() for strings (arrays)Alejandro Colomar1-1/+1
sizeof() should never be used to get the size of an array. It is very unsafe, since arrays easily decay to pointers, and sizeof() applied to a pointer gives false results that compile and produce silent bugs. It's better to use nxt_items(), which implements sizeof() division, which recent compilers warn when used with pointers. This change would have caught a couple of bugs that were *almost* introduced First up is the _infamous_ ternary macro bug (yes, using the ternary operator in a macro is of itself a bad idea) nxt_str_set(&port, (r->tls ? "https://" : "http://")); which in the macro expansion runs: (&port)->length = nxt_length((r->tls ? : "https://" : "http://")); which evaluates to: port.length = sizeof(r->tls ? "https://" : "http://") - 1; which evaluates to: port.length = 8 - 1; Of course, we didn't want a compile-time-constant 8 there, but rather the length of the string. The above bug is not obvious to the untrained eye, so let's show some example programs that may give some more hints about the problem. $ cat sizeof.c #include <stdio.h> int main(void) { printf("%zu\n", sizeof("01")); printf("%zu\n", sizeof("012")); printf("%zu\n", sizeof(char *)); } $ cc -Wall -Wextra sizeof.c $ ./a.out 3 4 8 sizeof() returns the size in bytes of the array passed to it, which in case of char strings, it is equivalent to the length of the string + 1 (for the terminating '\0'). However, arrays decay very easily in C, and they decay to a pointer to the first element in the array. In case of strings, that is a 'char *'. When sizeof() is given a pointer, it returns the size of the pointer, which in most platforms is 8. The ternary operator (?) performs default promotions (and other nefarious stuff) that may surprise even the most experienced programmers. It contrasts the __builtin_choose_expr() GCC builtin [1], which performs almost equivalently, but without the unwanted effects of the ternary operator. [1]: <https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fchoose_005fexpr> $ cat ?.c #include <stdio.h> int main(void) { printf("%zu\n", sizeof("01")); printf("%zu\n", sizeof(__builtin_choose_expr(1, "01", "01"))); printf("%zu\n", sizeof(1 ? "01" : "01")); printf("%zu\n", sizeof(char *)); } $ cc -Wall -Wextra ?.c $ ./a.out 3 3 8 8 In the above program, we can see how the ternary operator (?) decays the array into a pointer, and makes it so that sizeof() will return a constant 8. As we can see, everything in the use of the macro would make it look like it should work, but the combination of some seemingly-safe side effects of various C features produces a completely unexpected bug. Second up is a more straight forward case of simply calling nxt_length() on a char * pointer. Like the above this will generally result in a length of 7. When you sit and think about it, you know very well sizeof(char *) is probably 8 these days (but may be some other value like 4). But when you're in the depths of code it's very easy to overlook this when all you're thinking about is to get the length of some string. Let's look at this patch in action $ cat sdiv.c #include <stdio.h> #define nxt_nitems(x) (sizeof(x) / sizeof((x)[0])) #define nxt_length(s) (nxt_nitems(s) - 1) #define nxt_unsafe_length(s) (sizeof(s) - 1) #define STR_LITERAL "1234567890" static const char *str_lit = "1234567890"; int main(void) { printf("[STR_LITERAL] nxt_unsafe_length(\"1234567890\") [%lu]\n", nxt_unsafe_length(STR_LITERAL)); printf("[STR_LITERAL] nxt_length(\"1234567890\") [%lu]\n", nxt_length(STR_LITERAL)); printf("[char * ] nxt_unsafe_length(\"1234567890\") [%lu]\n", nxt_unsafe_length(str_lit)); printf("[char * ] nxt_length(\"1234567890\") [%lu]\n", nxt_length(str_lit)); return 0; } First lets compile it without any flags $ make sdiv $ ./sdiv [STR_LITERAL] nxt_unsafe_length("1234567890") [10] [STR_LITERAL] nxt_length("1234567890") [10] [char * ] nxt_unsafe_length("1234567890") [7] [char * ] nxt_length("1234567890") [7] It compiled without error and runs, although with incorrect results for the two char *'s. Now lets build it with -Wsizeof-pointer-div (also enabled with -Wall) $ CFLAGS="-Wsizeof-pointer-div" make sdiv cc -Wsizeof-pointer-div nxt_nitems.c -o nxt_nitems sdiv.c: In function ‘main’: sdiv.c:3:44: warning: division ‘sizeof (const char *) / sizeof (char)’ does not compute the number of array elements [-Wsizeof-pointer-div] 3 | #define nxt_nitems(x) (sizeof(x) / sizeof((x)[0])) | ^ nxt_nitems.c:4:34: note: in expansion of macro ‘nxt_nitems’ 4 | #define nxt_length(s) (nxt_nitems(s) - 1) | ^~~~~~~~~~ nxt_nitems.c:22:16: note: in expansion of macro ‘nxt_length’ 22 | nxt_length(str_lit)); | ^~~~~~~~~~ nxt_nitems.c:10:20: note: first ‘sizeof’ operand was declared here 10 | static const char *str_lit = "1234567890"; | ^~~~~~~ So we now get a very loud compiler warning (coming from nxt_length(char *), nxt_unsafe_length() of course didn't trigger any warnings), telling us we're being daft. The good news is this didn't find any existing bugs! Let's keep it that way... Link: <https://stackoverflow.com/a/57537491> Cc: Andrew Clayton <a.clayton@nginx.com> Signed-off-by: Alejandro Colomar <alx@nginx.com> Reviewed-by: Andrew Clayton <a.clayton@nginx.com> Tested-by: Andrew Clayton <a.clayton@nginx.com> [ Tweaked and expanded the commit message - Andrew ] Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-05-03Fixed #define style.Alejandro Colomar1-44/+22
We had a mix of styles for declaring function-like macros: Style A: #define \ foo() \ do { \ ... \ } while (0) Style B: #define foo() \ do { \ ... \ } while (0) We had a similar number of occurences of each style: $ grep -rnI '^\w*(.*\\' | wc -l 244 $ grep -rn 'define.*(.*)' | wc -l 239 (Those regexes aren't perfect, but a very decent approximation.) Real examples: $ find src -type f | xargs sed -n '/^nxt_double_is_zero/,/^$/p' nxt_double_is_zero(f) \ (fabs(f) <= FLT_EPSILON) $ find src -type f | xargs sed -n '/define nxt_http_field_set/,/^$/p' #define nxt_http_field_set(_field, _name, _value) \ do { \ (_field)->name_length = nxt_length(_name); \ (_field)->value_length = nxt_length(_value); \ (_field)->name = (u_char *) _name; \ (_field)->value = (u_char *) _value; \ } while (0) I'd like to standardize on a single style for them, and IMO, having the identifier in the same line as #define is a better option for the following reasons: - Programmers are used to `#define foo() ...` (readability). - One less line of code. - The program for finding them is really simple (see below). function grep_ngx_func() { if (($# != 1)); then >&2 echo "Usage: ${FUNCNAME[0]} <func>"; return 1; fi; find src -type f \ | grep '\.[ch]$' \ | xargs grep -l "$1" \ | sort \ | xargs pcregrep -Mn "(?s)^\$[\w\s*]+?^$1\(.*?^}"; find src -type f \ | grep '\.[ch]$' \ | xargs grep -l "$1" \ | sort \ | xargs pcregrep -Mn "(?s)define $1\(.*?^$" \ | sed -E '1s/^[^:]+:[0-9]+:/&\n\n/'; } $ grep_ngx_func Usage: grep_ngx_func <func> $ grep_ngx_func nxt_http_field_set src/nxt_http.h:98: #define nxt_http_field_set(_field, _name, _value) \ do { \ (_field)->name_length = nxt_length(_name); \ (_field)->value_length = nxt_length(_value); \ (_field)->name = (u_char *) _name; \ (_field)->value = (u_char *) _value; \ } while (0) $ grep_ngx_func nxt_sprintf src/nxt_sprintf.c:56: u_char * nxt_cdecl nxt_sprintf(u_char *buf, u_char *end, const char *fmt, ...) { u_char *p; va_list args; va_start(args, fmt); p = nxt_vsprintf(buf, end, fmt, args); va_end(args); return p; } ................ Scripted change: ................ $ find src -type f \ | grep '\.[ch]$' \ | xargs sed -i '/define *\\$/{N;s/ *\\\n/ /;s/ //}'
2022-04-27Added NXT_MAYBE_UNUSED for __attribute__((__unused__)).Alejandro Colomar1-0/+11
When testing some configurations of compilers and OSes, I noticed that clang(1) 13 on Debian caused a function to be compiled but unused, and the compiler triggered a compile error. To avoid that error, use __attribute__((__unused__)). Let's call our wrapper NXT_MAYBE_UNUSED, since it describes itself more precisely than the GCC attribute name. It's also the name that C2x (likely C23) has given to the standard attribute, which is [[maybe_unused]], so it's also likely to be more readable because of that name being in ISO C.
2018-06-25Introduced nxt_length() macro.Valentin Bartenev1-0/+4
2018-06-21More effective implementation of nxt_popcount().Valentin Bartenev1-2/+2
This method requires as many iterations as there are set bits, while the previous one has to shift up to the position of the highest bit.
2018-06-20Using own popcount where the compiler builtin is not available.Sergey Kandaurov1-0/+21
2018-04-04Style: capitalized letters in hexadecimal literals.Valentin Bartenev1-2/+2
2017-07-18More accurate "packed" attribute declaration.Valentin Bartenev1-2/+11
2017-07-07Cosmetic changes to remove some annoying valgrind messages.Max Romanov1-0/+2
2017-06-26Interface for mapping JSON configuration objects to C structures.Valentin Bartenev1-0/+4
2017-06-26Build on Solaris 11 fixed.Max Romanov1-3/+0
2017-06-19Memory pools refactoring.Igor Sysoev1-0/+6
2017-05-31Skeleton of router configuration and request processing.Igor Sysoev1-0/+4
2017-04-11The nxt_expect() macro.Valentin Bartenev1-2/+10