Age | Commit message (Collapse) | Author | Files | Lines |
|
Due to 'char' (unless explicitly set) being signed or unsigned depending
on architecture, e.g on x86 it's signed, while on Arm it's unsigned,
this can lead to subtle bugs such if you use a plain char as a byte
thinking it's unsigned on all platforms (maybe you live in the world of
Arm).
What we can do is tell the compiler to treat 'char' as unsigned by
default, thus it will be consistent across platforms. Seeing as most of
the time it doesn't matter whether char is signed or unsigned, it
really only matters when you're dealing with 'bytes', which means it
makes sense to default char to unsigned.
The Linux Kernel made this change at the end of 2022.
This will also allow in the future to convert our u_char's to char's
(which will now be unsigned) and pass them directly into the libc
functions for example, without the need for casting.
Here is what the ISO C standard has to say
From §6.2.5 Types ¶15
The three types char, signed char, and unsigned char are collectively
called the character types. The implementation shall define char to
have the same range, representation, and behavior as either signed
char or unsigned char.[45]
and from Footnote 45)
CHAR_MIN, defined in <limits.h>, will have one of the values 0 or
SCHAR_MIN, and this can be used to distinguish the two options.
Irrespective of the choice made, char is a separate type from the
other two and is not compatible with either.
If you're still unsure why you'd want this change...
It was never clear to me, why we used u_char, perhaps that was used as
an alternative to -funsigned-char...
But that still leaves the potential for bugs with char being unsigned vs
signed...
Then because we use u_char but often need to pass such things into libc
(and perhaps other functions) which normally take a 'char' we need to
cast these cases.
So this change brings at least two (or more) benefits
1) Removal of potential for char unsigned vs signed bugs.
2) Removal of a bunch of casts. Reducing casting to the bare minimum
is good. This helps the compiler to do proper type checking.
3) Readability/maintainability, everything is now just char...
What if you want to work with bytes?
Well with char being unsigned (everywhere) you can of course use char.
However it would be much better to use the uint8_t type for that to
clearly signify that intention.
Link: <https://lore.kernel.org/lkml/Y1Bfg06qV0sDiugt@zx2c4.com/>
Link: <https://lore.kernel.org/lkml/20221019203034.3795710-1-Jason@zx2c4.com/>
Link: <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3bc753c06dd02a3517c9b498e3846ebfc94ac3ee>
Link: <https://www.iso-9899.info/n1570.html#6.2.5p15>
Suggested-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Currently Unit doesn't specify any specific C standard for compiling and
will thus be compiled under whatever the compiler happens to default to.
Current releases of GCC and Clang (13.x & 17.x respectively at the time
of writing) default to gnu17 (C17 + GNU extensions).
Our oldest still-supported system is RHEL/CentOS 7, that comes with GCC
4.8.5 which defaults to gnu90.
Up until now this hasn't really been an issue and we have been able to
use some C99 features that are implemented as GNU extensions in older
compilers, e.g
- designated initializers
- flexible array members
- trailing comma in enum declaration (compiles with -std=c89, warns
with -std=c89 -pedantic)
- snprintf(3)
- long long (well we test for it but don't actually use it)
- bool / stdbool.h
- variadic macros
However there are a couple of C99 features that aren't GNU extensions
that would be handy to be able to use, i.e
- The ability to declare variables inside for () loops, e.g
for (int i = 0; ...; ...)
- C99 inline functions (not to be confused with what's available with
-std=gnu89).
However, if we are going to switch up to C99, then perhaps we should
just leap frog to C11 instead (the Linux Kernel did in fact make the
switch from gnu89 to gnu11 in March '22). C17 is perhaps still a little
new and is really just C11 + errata.
GCC 4.8 as in RHEL 7 has *some* support for C11, so while we can make
full use of C99, we couldn't yet make full use of C11, However RHEL 7 is
EOL on June 30th 2024, after which we will no longer have that
restriction and in the meantime we can restrict ourselves to the
supported set of features (or implement fallbacks where appropriate).
It can only be a benefit that we would be compiling Unit consistently
under the same language standard.
This will also help give the impression that Unit is a modern C code
base.
It is also worth noting the following regarding GCC
"A version with corrections integrated was prepared in 2017 and published
in 2018 as ISO/IEC 9899:2018; it is known as C17 and is supported with
-std=c17 or -std=iso9899:2017; the corrections are also applied with -
std=c11, and the only difference between the options is the value of
STDC_VERSION."
Suggested-by: Andrew Clayton <a.clayton@nginx.com>
Acked-by: Andrew Clayton <a.clayton@nginx.com>
[ Andrew wrote the commit message ]
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Link: <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e8c07082a810fbb9db303a2b66b66b8d7e588b53>
Link: <https://www.ibm.com/blog/announcement/ibm-is-announcing-red-hat-enterprise-linux-7-is-going-end-of-support-on-30-june-2024/>
Link: <https://gcc.gnu.org/onlinedocs/gcc-8.1.0/gcc/Standards.html#C-Language>
Cc: Dan Callahan <d.callahan@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
One issue you have when trying to debug Unit under say GDB is that at
the default optimisation level we use of -O (-O1) the compiler will
often optimise things out which means they are not available for
inspection in the debugger.
This patch allows you to pass 'D=1' to make, e.g
$ make D=1 ...
Which will set -O0 overriding the previously set -O, basically disabling
optimisations, we could use -Og, but the clang(1) man page says this is
best and it seems to not cause any issues when debugging GCC generated
code.
Co-developed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This causes signed integer & pointer overflow to have a defined
behaviour of wrapping according to two's compliment. I.e INT_MAX will
wrap to INT_MIN and vice versa.
This is mainly to cover existing cases, not an invitation to add more.
Cc: Dan Callahan <d.callahan@f5.com>
Suggested-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Aliasing is essentially when you access the same memory via different
types.
If the compiler knows this doesn't happen it can make some
optimisations.
There is however code in Unit, for example in the wasm language module
and the websocket code that may fall foul of strict-aliasing rules.
(For the wasm module I explicitly disable it there)
In auto/cc/test for GCC we have
NXT_CFLAGS="$NXT_CFLAGS -O"
...
# -O2 enables -fstrict-aliasing and -fstrict-overflow.
#NXT_CFLAGS="$NXT_CFLAGS -O2"
#NXT_CFLAGS="$NXT_CFLAGS -Wno-strict-aliasing"
So with GCC by default we effectively compile with -fno-strict-aliasing.
For clang we have this
NXT_CFLAGS="$NXT_CFLAGS -O"
...
#NXT_CFLAGS="$NXT_CFLAGS -O2"
...
NXT_CFLAGS="$NXT_CFLAGS -fstrict-aliasing"
(In _clang_, -fstrict-aliasing is always enabled by default)
So in clang we always build with -fstrict-aliasing. I don't think this
is the best idea, building with something as fundamental as this
disabled in one compiler and enabled in another.
This patch adjusts the Clang side of things to match that of GCC. I.e
compile with -fno-strict-aliasing. It also explicitly sets
-fno-strict-aliasing for GCC, which is what we were getting anyway but
lets be explicit about it.
Cc: Dan Callahan <d.callahan@f5.com>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Expand on the comment on why we don't enable -Wstrict-overflow=5 on GCC.
Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96658>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This is what -Wextra used to be called, but any version of GCC or Clang
in at least the last decade has -Wextra.
Cc: Dan Callahan <d.callahan@f5.com>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
We really only support building Unit with GCC and Clang.
Cc: Dan Callahan <d.callahan@f5.com>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
We really only support building Unit with GCC and Clang.
Cc: Dan Callahan <d.callahan@f5.com>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
We only really support building Unit with GCC and Clang.
Cc: Dan Callahan <d.callahan@f5.com>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
We don't run on Windows and only really support compiling Unit with GCC
and Clang.
Cc: Dan Callahan <d.callahan@f5.com>
Co-developed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|