Are flexible array members valid in C++? - c++

In C99, you can declare a flexible array member of a struct as such:
struct blah
{
int foo[];
};
However, when someone here at work tried to compile some code using clang in C++, that syntax did not work. (It had been working with MSVC.) We had to convert it to:
struct blah
{
int foo[0];
};
Looking through the C++ standard, I found no reference to flexible member arrays at all; I always thought [0] was an invalid declaration, but apparently for a flexible member array it is valid. Are flexible member arrays actually valid in C++? If so, is the correct declaration [] or [0]?

C++ was first standardized in 1998, so it predates the addition of flexible array members to C (which was new in C99). There was a corrigendum to C++ in 2003, but that didn't add any relevant new features. The next revision of C++ (C++2b) is still under development, and flexible array members still have not been added to it.

C++ doesn't support C99 flexible array members at the end of structures, either using an empty index notation or a 0 index notation (barring vendor-specific extensions):
struct blah
{
int count;
int foo[]; // not valid C++
};
struct blah
{
int count;
int foo[0]; // also not valid C++
};
As far as I know, C++0x will not add this, either.
However, if you size the array to 1 element:
struct blah
{
int count;
int foo[1];
};
the code will compile, and work quite well, but it is technically undefined behavior. You can allocate the appropriate memory with an expression that is unlikely to have off-by-one errors:
struct blah* p = (struct blah*) malloc( offsetof(struct blah, foo[desired_number_of_elements]));
if (p) {
p->count = desired_number_of_elements;
// initialize your p->foo[] array however appropriate - it has `count`
// elements (indexable from 0 to count-1)
}
So it's portable between C90, C99 and C++ and works just as well as C99's flexible array members.
Raymond Chen did a nice writeup about this: Why do some structures end with an array of size 1?
Note: In Raymond Chen's article, there's a typo/bug in an example initializing the 'flexible' array. It should read:
for (DWORD Index = 0; Index < NumberOfGroups; Index++) { // note: used '<' , not '='
TokenGroups->Groups[Index] = ...;
}
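As a rough illustration of the sizing expression discussed above, here is a C++ sketch. The helper names blah_alloc_size and blah_create are hypothetical. The sketch uses offsetof(blah, foo) plus a multiplication rather than a runtime index inside offsetof, since I would not rely on the latter in standard C++. Accessing elements past foo[0] is still technically undefined behavior, as noted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct blah {
    int count;
    int foo[1];  // "struct hack": declared with 1 element, allocated with more
};

// Hypothetical helper: bytes needed for a blah holding n trailing ints.
// offsetof(blah, foo) is where the array starts, so no element is counted twice.
inline std::size_t blah_alloc_size(std::size_t n) {
    return offsetof(blah, foo) + n * sizeof(int);
}

// Hypothetical helper: allocate the block and record the element count.
// Returns nullptr on allocation failure; release the block with std::free.
inline blah* blah_create(std::size_t n) {
    // Never allocate less than the struct itself, so *p is a complete object.
    std::size_t bytes = blah_alloc_size(n);
    if (bytes < sizeof(blah)) bytes = sizeof(blah);
    blah* p = static_cast<blah*>(std::malloc(bytes));
    if (p) p->count = static_cast<int>(n);
    return p;
}
```

Compared with sizeof(blah) + n * sizeof(int), this form does not over-allocate by the one element that the declaration already includes.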

If you can restrict your application to only require a few known sizes, then you can effectively achieve a flexible array with a template.
template <typename BASE, typename T, unsigned SZ>
struct Flex : public BASE {
T flex_[SZ];
};
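A possible way to use this template is sketched below. PacketHeader, SmallPacket, LargePacket and capacity are hypothetical names invented for the illustration; only Flex comes from the answer above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed header shared by every size variant.
struct PacketHeader {
    unsigned length;
};

template <typename BASE, typename T, unsigned SZ>
struct Flex : public BASE {
    T flex_[SZ];
};

// Each size is a distinct type, so the array bound is known at compile time.
using SmallPacket = Flex<PacketHeader, char, 16>;
using LargePacket = Flex<PacketHeader, char, 256>;

// Works for any SZ: the template parameter carries the array size.
template <typename BASE, typename T, unsigned SZ>
constexpr std::size_t capacity(const Flex<BASE, T, SZ>&) {
    return SZ;
}
```

The trade-off is that every size is a separate type, so this only helps when the set of sizes is known when the code is compiled.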

The second one will not contain elements but rather will point right after blah. So if you have a structure like this:
struct something
{
int a, b;
int c[0];
};
you can do things like this:
struct something *val = (struct something *)malloc(sizeof(struct something) + 5 * sizeof(int));
val->a = 1;
val->b = 2;
val->c[0] = 3;
In this case c will behave as an array with 5 ints but the data in the array will be after the something structure.
The product I'm working on uses this as a sized string:
struct String
{
unsigned int allocated;
unsigned int size;
char data[0];
};
Because of the supported architectures this will consume 8 bytes plus allocated.
Of course all this is C but g++ for example accepts it without a hitch.

If you only want
struct blah { int foo[]; };
then you don't need the struct at all and you can simply deal with a malloc'ed/new'ed int array.
If you have some members at the beginning:
struct blah { char a,b; /*int foo[]; //not valid in C++*/ };
then in C++, I suppose you could replace foo with a foo member function:
struct blah { alignas(int) char a,b;
int *foo(void) { return reinterpret_cast<int*>(&this[1]); } };
Example use:
#include <stdlib.h>
struct blah {
alignas(int) char a,b;
int *foo(void) { return reinterpret_cast<int*>(&this[1]); }
};
int main()
{
blah *b = (blah*)malloc(sizeof(blah)+10*sizeof(int));
if(!b) return 1;
b->foo()[1]=1;
}

A proposal is underway, and might make it into some future C++ version.
See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1039r0.html for details (the proposal is fairly new, so it's subject to changes)

I faced the same problem of declaring a flexible array member that can be used from C++ code. Looking through the glibc headers, I found some uses of flexible array members, e.g. in struct inotify_event, which is declared as follows (comments and some unrelated members omitted):
struct inotify_event
{
//Some members
char name __flexarr;
};
The __flexarr macro, in turn is defined as
/* Support for flexible arrays.
Headers that should use flexible arrays only if they're "real"
(e.g. only if they won't affect sizeof()) should test
#if __glibc_c99_flexarr_available. */
#if defined __STDC_VERSION__ && __STDC_VERSION__ >= 199901L
# define __flexarr []
# define __glibc_c99_flexarr_available 1
#elif __GNUC_PREREQ (2,97)
/* GCC 2.97 supports C99 flexible array members as an extension,
even when in C89 mode or compiling C++ (any version). */
# define __flexarr []
# define __glibc_c99_flexarr_available 1
#elif defined __GNUC__
/* Pre-2.97 GCC did not support C99 flexible arrays but did have
an equivalent extension with slightly different notation. */
# define __flexarr [0]
# define __glibc_c99_flexarr_available 1
#else
/* Some other non-C99 compiler. Approximate with [1]. */
# define __flexarr [1]
# define __glibc_c99_flexarr_available 0
#endif
I'm not familiar with the MSVC compiler, but you'd probably have to add one more conditional macro depending on the MSVC version.

Flexible arrays are not part of the C++ standard yet. That is why int foo[] or int foo[0] may not compile. While there is a proposal being discussed, it has not yet been accepted into the newest revision of C++ (C++2b).
However, almost all modern compilers support them via compiler extensions.
GCC has a zero-length array extension, which is supported for C++.
Clang aims to support a broad range of GCC extensions.
MSVC has a nonstandard extension and a warning associated with it.
The catch is that if you use this extension at a high warning level (-Wall --pedantic), it may result in a warning.
A workaround is to use an array with one element and access it out of bounds. While this solution is UB by the spec (dcl.array and expr.add), most compilers will produce valid code, and even clang -fsanitize=undefined is happy with it:
#include <new>
#include <type_traits>
struct A {
int a[1];
};
int main()
{
using storage_type = std::aligned_storage_t<1024, alignof(A)>;
static storage_type memory;
A *ptr_a = new (&memory) A;
ptr_a->a[2] = 42;
return ptr_a->a[2];
}
With all that said, if you want your code to be standard-compliant and not depend on any compiler extension, you will have to avoid using this feature.

Flexible array members are not supported in standard C++; however, the clang documentation says:
"In addition to the language extensions listed here, Clang aims to support a broad range of GCC extensions."
The gcc documentation for C++ says:
"The GNU compiler provides these extensions to the C++ language (and you can also use most of the C language extensions in your C++ programs)."
And the gcc documentation for C documents support for arrays of zero length:
https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html

The better solution is to declare it as a pointer:
struct blah
{
int* foo;
};
Or better yet, to declare it as a std::vector:
struct blah
{
std::vector<int> foo;
};
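A minimal sketch of the std::vector variant follows. The factory name make_blah is hypothetical and only there to show how the size is chosen at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct blah {
    std::vector<int> foo;  // owns its elements; size is chosen at run time
};

// Hypothetical factory: builds a blah with n zero-initialized elements.
inline blah make_blah(std::size_t n) {
    blah b;
    b.foo.assign(n, 0);
    return b;
}
```

Note that, unlike a flexible array member, the elements live in a separate heap allocation rather than directly after the struct, so this does not help when the layout must be one contiguous block.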

Related

How can I indicate to the compiler that a pointer parameter is aligned?

I'm writing the spectacular function:
void foo(void* a) {
if (check_something_at_runtime_only()) {
int* as_ints { static_cast<int*>(a) };
// do things with as_ints
}
else {
char* as_chars { static_cast<char*>(a) };
// do things with as_chars
}
}
Suppose we know that some work with as_ints would benefit from it being better-aligned; e.g. if the memory transaction size on my platform is N bytes, then I can read the first N/sizeof(int) elements with a single machine instruction (ignoring SIMD/vectorization here) - provided a is N-byte-aligned.
Now, I could indicate alignment by having foo always take an int * - at least on platforms for which larger types can only be read from aligned addresses - but I would rather keep the type void *, since it doesn't have to be an array of ints, really.
I would have liked to be able to write something like
void foo(alignas(sizeof(int)) void* a) { ... }
but, apparently, alignas doesn't apply to pointers, so I can't.
Is there another way to guarantee to the compiler that the argument address will be aligned?
Notes:
I'm interested both in what the C++ standard (any version) allows, and in compiler-specific extensions in GCC, clang and NVCC (the CUDA compiler).
In C++20 you can use std::assume_aligned:
#include <memory>
int *as_ints = std::assume_aligned<sizeof(int)>(static_cast<int*>(a));
In GCC/Clang you can do
int *as_ints = static_cast<int*>(__builtin_assume_aligned(a, sizeof(int)));
or if a is function parameter just mark it directly with __attribute((aligned(4))).
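For illustration, here is a sketch of a function that takes a void* and passes an alignment promise to the compiler. The function name sum_aligned_ints and the 16-byte figure are assumptions made for the example; the builtin is only used when GCC or Clang is detected, and other compilers simply get no hint.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: sums n ints starting at a, which the caller promises
// is 16-byte aligned. Breaking that promise is undefined behavior.
inline long sum_aligned_ints(const void* a, std::size_t n) {
#if defined(__GNUC__) || defined(__clang__)
    // GCC/Clang builtin: returns the same address, annotated as 16-byte aligned.
    const int* as_ints =
        static_cast<const int*>(__builtin_assume_aligned(a, 16));
#else
    // Fallback for other compilers: no alignment hint, same result.
    const int* as_ints = static_cast<const int*>(a);
#endif
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += as_ints[i];
    return total;
}
```

A caller would typically back the promise with storage declared alignas(16). The hint changes nothing observable; it only gives the optimizer permission to assume the alignment.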

clang++ cannot initialize a variable of type 'int(*)[dim2]' with an rvalue of type 'int (*)[dim2]'

Why does the code
void fcn(int *twoDArrayPtr, const int dim1, const int dim2) {
int (*array)[dim2] = reinterpret_cast<int (*)[dim2]>(twoDArrayPtr);
}
int main() {
return 0;
}
generate the compiler error
error: cannot initialize a variable of type 'int (*)[dim2]' with
an rvalue of type 'int (*)[dim2]'
The types are the same, so I'd think the assignment can be performed. Since int (*)[dim2] is a pointer to an array of size dim2 and as such could be a pointer to a bunch of arrays of size dim2 in contiguous memory indexable by the pointer, I would think this should work.
I'm using clang++ on Mac OS/X with the following version information:
Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin14.0.0
Thread model: posix
dim2 is not a compile-time constant, and VLAs (variable-length arrays) don't exist in C++. Some other compilers (such as gcc) have language extensions to allow VLAs in C++, but clang's behavior is standard-conforming.
You can work around the problem with a class (or class template) that does the address translation for you, such as
#include <cstddef> // std::size_t
// This is very rudimentary, but it's a starting point.
template<typename T>
class array2d_ref {
public:
array2d_ref(T *p, std::size_t dim) : data_(p), dim_(dim) { }
T *operator[](std::size_t i) { return &data_[i * dim_]; }
private:
T *data_;
std::size_t dim_;
};
...
array2d_ref<int> array(twoDArrayPtr, dim2);
But I'm afraid it is not possible (portably) to have a pointer-to-array unless you know the dimension of the array at compile time.
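To make the address translation concrete, here is a small usage sketch of the wrapper class above. The function fill_grid and the values it writes are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
class array2d_ref {
public:
    array2d_ref(T* p, std::size_t dim) : data_(p), dim_(dim) {}
    // Row i starts i * dim_ elements into the flat buffer.
    T* operator[](std::size_t i) { return &data_[i * dim_]; }
private:
    T* data_;
    std::size_t dim_;
};

// Hypothetical example: writes row * 10 + col into every cell of a flat
// dim1 x dim2 buffer, using two-index syntax through the wrapper.
inline void fill_grid(int* twoDArrayPtr, std::size_t dim1, std::size_t dim2) {
    array2d_ref<int> array(twoDArrayPtr, dim2);
    for (std::size_t r = 0; r < dim1; ++r)
        for (std::size_t c = 0; c < dim2; ++c)
            array[r][c] = static_cast<int>(r * 10 + c);
}
```

The first index goes through operator[] and yields a plain T*, and the second index is ordinary pointer indexing, so array[r][c] reaches element r * dim2 + c of the flat buffer.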
You're trying to use C99's Variable Length Array (VLA) feature when you use dim2 as the array dimension in your cast. (gcc, for example, does support this by extension: https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html.)
You can't do this in standard C++. Runtime-sized arrays were proposed for C++14, but the feature was removed from the draft before C++14 was published, so it has not become part of the standard.
Pertinent quotes about the proposed feature:
Runtime-sized arrays offer the same syntax and performance of C99’s VLAs... Bear in mind that runtime-sized arrays aren’t precisely the same as C99’s VLAs. The C++14 feature is more restrained, which is just as well. Specifically, the following properties are excluded:
Runtime-sized multidimensional arrays
Modifications to the function declarator syntax
sizeof(a) being a runtime-evaluated expression returning the size of a
typedef int a[n]; evaluating n and passing it through the typedef
So your code is still not legal standard C++; it compiles only where the compiler offers VLAs as an extension.
I've tried it out on the Visual Studio 2015 Beta and sadly at time of writing it is not supported :(
Although clang does not support variable-length arrays, there is a workaround. The following compiles with clang++ 4.0.0:
void fcn(int *twoDArrayPtr, const int dim1, const int dim2) {
using array_type = int (*)[dim2];
array_type array = reinterpret_cast<array_type>(twoDArrayPtr);
}
int main() {
return 0;
}
I'm not sure why this alias declaration should make any difference. It certainly seems inconsistent.

Can code that is valid in both C and C++ produce different behavior when compiled in each language?

C and C++ have many differences, and not all valid C code is valid C++ code.
(By "valid" I mean standard code with defined behavior, i.e. not implementation-specific/undefined/etc.)
Is there any scenario in which a piece of code valid in both C and C++ would produce different behavior when compiled with a standard compiler in each language?
To make it a reasonable/useful comparison (I'm trying to learn something practically useful, not to try to find obvious loopholes in the question), let's assume:
Nothing preprocessor-related (which means no hacks with #ifdef __cplusplus, pragmas, etc.)
Anything implementation-defined is the same in both languages (e.g. numeric limits, etc.)
We're comparing reasonably recent versions of each standard (e.g. say, C++98 and C90 or later)
If the versions matter, then please mention which versions of each produce different behavior.
Here is an example that takes advantage of the difference between function calls and object declarations in C and C++, as well as the fact that C90 allows the calling of undeclared functions:
#include <stdio.h>
struct f { int x; };
int main() {
f();
}
int f() {
return printf("hello");
}
In C++ this will print nothing because a temporary f is created and destroyed, but in C90 it will print hello because functions can be called without having been declared.
In case you were wondering about the name f being used twice, the C and C++ standards explicitly allow this, and to make an object you have to say struct f to disambiguate if you want the structure, or leave off struct if you want the function.
For C++ vs. C90, there's at least one way to get different behavior that's not implementation defined. C90 doesn't have single-line comments. With a little care, we can use that to create an expression with entirely different results in C90 and in C++.
int a = 10 //* comment */ 2
+ 3;
In C++, everything from the // to the end of the line is a comment, so this works out as:
int a = 10 + 3;
Since C90 doesn't have single-line comments, only the /* comment */ is a comment. The first / and the 2 are both parts of the initialization, so it comes out to:
int a = 10 / 2 + 3;
So, a correct C++ compiler will give 13, but a strictly correct C90 compiler 8. Of course, I just picked arbitrary numbers here -- you can use other numbers as you see fit.
The following, valid in C and C++, is going to (most likely) result in different values in i in C and C++:
int i = sizeof('a');
See Size of character ('a') in C/C++ for an explanation of the difference.
Another one from this article:
#include <stdio.h>
int sz = 80;
int main(void)
{
struct sz { char c; };
int val = sizeof(sz); // sizeof(int) in C,
// sizeof(struct sz) in C++
printf("%d\n", val);
return 0;
}
C90 vs. C++11 (int vs. double):
#include <stdio.h>
int main()
{
auto j = 1.5;
printf("%d", (int)sizeof(j));
return 0;
}
In C auto means local variable. In C90 it's ok to omit variable or function type. It defaults to int. In C++11 auto means something completely different, it tells the compiler to infer the type of the variable from the value used to initialize it.
Another example that I haven't seen mentioned yet, this one highlighting a preprocessor difference:
#include <stdio.h>
int main()
{
#if true
printf("true!\n");
#else
printf("false!\n");
#endif
return 0;
}
This prints "false" in C and "true" in C++ - In C, any undefined macro evaluates to 0. In C++, there's 1 exception: "true" evaluates to 1.
Per C++11 standard:
a. The comma operator performs lvalue-to-rvalue conversion in C but not C++:
char arr[100];
int s = sizeof(0, arr); // The comma operator is used.
In C++ the value of this expression will be 100 and in C this will be sizeof(char*).
b. In C++ the type of enumerator is its enum. In C the type of enumerator is int.
enum E { a, b, c };
sizeof(a) == sizeof(int); // In C
sizeof(a) == sizeof(E); // In C++
This means that sizeof(int) may not be equal to sizeof(E).
c. In C++ a function declared with empty params list takes no arguments. In C empty params list mean that the number and type of function params is unknown.
int f(); // int f(void) in C++
// int f(*unknown*) in C
This program prints 1 in C++ and 0 in C:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int d = (int)(abs(0.6) + 0.5);
printf("%d", d);
return 0;
}
This happens because there is double abs(double) overload in C++, so abs(0.6) returns 0.6 while in C it returns 0 because of implicit double-to-int conversion before invoking int abs(int). In C, you have to use fabs to work with double.
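The C++ side of this difference can be sketched as follows. Both function names are hypothetical; the second one only imitates what the C call does by converting the argument to int before taking the absolute value.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// In C++, std::abs is overloaded for double, so 0.6 is not truncated to 0.
inline int rounded_abs_cpp(double v) {
    return static_cast<int>(std::abs(v) + 0.5);
}

// Imitates the C behavior: the argument is converted to int first.
inline int rounded_abs_like_c(double v) {
    return static_cast<int>(std::abs(static_cast<int>(v)) + 0.5);
}
```

With an argument of 0.6, the first function computes (int)(0.6 + 0.5) and the second computes (int)(0 + 0.5), which matches the 1 versus 0 described above.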
#include <stdio.h>
int main(void)
{
printf("%d\n", (int)sizeof('a'));
return 0;
}
In C, this prints whatever the value of sizeof(int) is on the current system, which is typically 4 in most systems commonly in use today.
In C++, this must print 1.
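The C++ half of this claim can be checked at compile time. The sketch below is only an illustration; char_literal_size is a hypothetical helper name.

```cpp
#include <cassert>
#include <type_traits>

// In C++ a character literal has type char, and sizeof(char) is 1 by definition.
static_assert(std::is_same<decltype('a'), char>::value, "'a' is a char in C++");
static_assert(sizeof('a') == 1, "sizeof(char) is always 1");

// Integer promotion still applies in arithmetic: +'a' has type int.
static_assert(std::is_same<decltype(+'a'), int>::value, "promoted to int");

inline int char_literal_size() { return static_cast<int>(sizeof('a')); }
```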
Another sizeof trap: boolean expressions.
#include <stdio.h>
int main() {
printf("%d\n", (int)sizeof !0);
}
It equals sizeof(int) in C, because the expression is of type int, but is typically 1 in C++ (though it's not required to be). In practice they are almost always different.
An old chestnut that depends on the C compiler, not recognizing C++ end-of-line comments...
...
int a = 4 //* */ 2
+2;
printf("%i\n",a);
...
The C++ Programming Language (3rd Edition) gives three examples:
sizeof('a'), as @Adam Rosenfield mentioned;
// comments being used to create hidden code:
int f(int a, int b)
{
return a //* blah */ b
;
}
Structures etc. hiding stuff in outer scopes, as in your example.
Another one listed by the C++ Standard:
#include <stdio.h>
int x[1];
int main(void) {
struct x { int a[2]; };
/* size of the array in C */
/* size of the struct in C++ */
printf("%d\n", (int)sizeof(x));
}
Inline functions in C default to external scope, whereas those in C++ do not.
Compiling the following two files together would print "I am inline" in the case of GNU C but nothing for C++.
File 1
#include <stdio.h>
struct fun{};
int main()
{
fun(); // In C, this calls the inline function from file 2, whereas in C++
// this would create a temporary object of type struct fun
return 0;
}
File 2
#include <stdio.h>
inline void fun(void)
{
printf("I am inline\n");
}
Also, C++ implicitly treats any const global as static unless it is explicitly declared extern, unlike C in which extern is the default.
#include <stdio.h>
struct A {
double a[32];
};
int main() {
struct B {
struct A {
short a, b;
} a;
};
printf("%d\n", (int)sizeof(struct A));
return 0;
}
This program prints 256 (32 * sizeof(double), assuming 8-byte doubles) when compiled using a C++ compiler and 4 (two 2-byte shorts) when compiled using a C compiler.
This is because C does not have the notion of scope resolution. In C, a structure declared inside another structure is visible in the enclosing scope, so struct A in main refers to the inner one.
struct abort
{
int x;
};
int main()
{
abort();
return 0;
}
Returns with exit code of 0 in C++, or 3 in C.
This trick could probably be used to do something more interesting, but I couldn't think of a good way of creating a constructor that would be palatable to C. I tried making a similarly boring example with the copy constructor, that would let an argument be passed, albeit in a rather non-portable fashion:
struct exit
{
int x;
};
int main()
{
struct exit code;
code.x=1;
exit(code);
return 0;
}
VC++ 2005 refused to compile that in C++ mode, though, complaining about how "exit code" was redefined. (I think this is a compiler bug, unless I've suddenly forgotten how to program.) It exited with a process exit code of 1 when compiled as C though.
Don't forget the distinction between the C and C++ global namespaces. Suppose you have a foo.cpp
#include <cstdio>
void foo(int r)
{
printf("I am C++\n");
}
and a foo2.c
#include <stdio.h>
void foo(int r)
{
printf("I am C\n");
}
Now suppose you have a main.c and main.cpp which both look like this:
extern void foo(int);
int main(void)
{
foo(1);
return 0;
}
When compiled as C++, it will use the symbol in the C++ global namespace; in C it will use the C one:
$ diff main.cpp main.c
$ gcc -o test main.cpp foo.cpp foo2.c
$ ./test
I am C++
$ gcc -o test main.c foo.cpp foo2.c
$ ./test
I am C
int main(void) {
const int dim = 5;
int array[dim];
}
This is rather peculiar in that it is valid in C++ and in C99, C11, and C17 (though optional in C11, C17); but not valid in C89.
In C99+ it creates a variable-length array, which has its own peculiarities over normal arrays, as it has a runtime type instead of compile-time type, and sizeof array is not an integer constant expression in C. In C++ the type is wholly static.
If you try to add an initializer here:
int main(void) {
const int dim = 5;
int array[dim] = {0};
}
then the code is valid C++ but not C, because variable-length arrays cannot have an initializer.
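On the C++ side, the static nature of the type can be verified with static_assert. This is only a sketch; the function name element_count is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

inline std::size_t element_count() {
    const int dim = 5;        // usable in constant expressions in C++
    int array[dim] = {0};     // an ordinary int[5], so an initializer is allowed
    static_assert(std::is_same<decltype(array), int[5]>::value,
                  "the array type is fixed at compile time");
    static_assert(sizeof(array) == 5 * sizeof(int),
                  "sizeof is a constant expression here");
    return sizeof(array) / sizeof(array[0]);
}
```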
Empty structures have size 0 in C (where they are a GNU extension, not standard C) and size 1 in C++:
#include <stdio.h>
typedef struct {} Foo;
int main()
{
printf("%zu\n", sizeof(Foo));
return 0;
}
This concerns lvalues and rvalues in C and C++.
In the C programming language, both the pre-increment and the post-increment operators return rvalues, not lvalues. This means that they cannot be on the left side of the = assignment operator. Both these statements will give a compiler error in C:
int a = 5;
a++ = 2; /* error: lvalue required as left operand of assignment */
++a = 2; /* error: lvalue required as left operand of assignment */
In C++ however, the pre-increment operator returns an lvalue, while the post-increment operator returns an rvalue. It means that an expression with the pre-increment operator can be placed on the left side of the = assignment operator!
int a = 5;
a++ = 2; // error: lvalue required as left operand of assignment
++a = 2; // No error: a gets assigned to 2!
Now why is this so? The post-increment increments the variable, and it returns the variable as it was before the increment happened. This is actually just an rvalue. The former value of the variable a is copied into a register as a temporary, and then a is incremented. But the former value of a is returned by the expression, it is an rvalue. It no longer represents the current content of the variable.
The pre-increment first increments the variable, and then it returns the variable as it became after the increment happened. In this case, we do not need to store the old value of the variable into a temporary register. We just retrieve the new value of the variable after it has been incremented. So the pre-increment returns an lvalue; it returns the variable a itself. We can assign this lvalue to something else, as in the following statements. This is an implicit conversion of an lvalue into an rvalue.
int x = a;
int x = ++a;
Since the pre-increment returns an lvalue, we can also assign something to it. The following two statements are identical. In the second assignment, first a is incremented, then its new value is overwritten with 2.
int a = 0;
a = 2;
++a = 2; // Valid in C++.
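The value categories described above can be checked in C++ with decltype. The sketch below uses a hypothetical function name. To my understanding, the expression ++a = 2 is well-defined from C++11 onward because of that standard's sequencing rules.

```cpp
#include <cassert>
#include <type_traits>

// decltype applied to an lvalue expression of type int yields int&,
// and applied to a prvalue of type int yields int. Its operand is not evaluated.
inline int preincrement_then_assign() {
    int a = 5;
    static_assert(std::is_same<decltype(++a), int&>::value,
                  "++a is an lvalue in C++");
    static_assert(std::is_same<decltype(a++), int>::value,
                  "a++ is a prvalue in C++");
    ++a = 2;   // increments a to 6, then overwrites it with 2
    return a;
}
```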

Binary serialization of variable length data and zero length arrays, is it safe?

I did some research but cannot find a definite approval or disapproval.
What I want is a fixed-size structure plus a variable-length part, so that serialization can be expressed in a simple and less error-prone way.
struct serialized_data
{
int len;
int type;
char variable_length_text[0];
};
And then:
struct serialized_data *buff = (struct serialized_data*)malloc(sizeof(struct serialized_data)+5);
buff->len=5;
buff->type=1;
memcpy(buff->variable_length_text, "abcd", 5);
Unfortunately I can't find out whether MSVC, GCC, Clang etc. are OK with it.
Maybe there is a better way to achieve the same?
I really don't want those ugly casts all around:
memcpy((char*)(((char*)buff)+sizeof(struct serialized_data)), "abcd", 5);
This program is using a zero length array. This is not C but a GNU extension.
http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
A common idiom in C89, called the struct hack, was to use:
struct serialized_data
{
int len;
int type;
char variable_length_text[1];
};
Unfortunately its common use as a flexible array is not strictly conforming.
C99 comes with something similar to perform the same task: a feature called the flexible array member.
Here is an example right from the Standard (C99, 6.7.2.1p17)
struct s { int n; double d[]; };
int m = 12; // some value
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
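If the code has to build as C++ without relying on zero-length arrays or flexible array members, one alternative is to keep only the fixed part in a struct and copy it into a byte buffer in front of the payload. The following is a sketch under that assumption. The names serialized_header, serialize and deserialize_text are hypothetical, and byte order and struct padding are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical fixed-size header; the text bytes follow it in the buffer.
struct serialized_header {
    std::int32_t len;
    std::int32_t type;
};

// Builds header + payload in one contiguous buffer, without a zero-length array.
inline std::vector<unsigned char> serialize(std::int32_t type, const std::string& text) {
    serialized_header h;
    h.len = static_cast<std::int32_t>(text.size() + 1);  // include the '\0'
    h.type = type;
    std::vector<unsigned char> buf(sizeof h + static_cast<std::size_t>(h.len));
    std::memcpy(buf.data(), &h, sizeof h);
    std::memcpy(buf.data() + sizeof h, text.c_str(), static_cast<std::size_t>(h.len));
    return buf;
}

// Reads the text back; the header is copied out with memcpy to avoid alignment
// and aliasing problems. Returns an empty string if the buffer is too short.
inline std::string deserialize_text(const std::vector<unsigned char>& buf) {
    serialized_header h;
    if (buf.size() < sizeof h) return std::string();
    std::memcpy(&h, buf.data(), sizeof h);
    if (h.len <= 0 || buf.size() < sizeof h + static_cast<std::size_t>(h.len))
        return std::string();
    // len counts the terminator, so drop it when building the std::string.
    return std::string(reinterpret_cast<const char*>(buf.data() + sizeof h),
                       static_cast<std::size_t>(h.len) - 1);
}
```

The casts are confined to two small functions instead of being spread over every call site, which is roughly what the question asks for.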
