What is the size of void? - c++

What would this statement yield?
void *p = malloc(sizeof(void));
Edit: An extension to the question.
If sizeof(void) yields 1 with GCC, then 1 byte of memory is allocated and the pointer p points to that byte. Suppose p was 0x2345: would p++ increment it to 0x2346? I am talking about p and not *p.

The type void has no size; that would be a compilation error. For the same reason you can't do something like:
void n;
EDIT.
To my surprise, doing sizeof(void) actually does compile in GNU C:
$ echo 'int main() { printf("%d", sizeof(void)); }' | gcc -xc -w - && ./a.out
1
However, in C++ it does not:
$ echo 'int main() { printf("%d", sizeof(void)); }' | gcc -xc++ -w - && ./a.out
<stdin>: In function 'int main()':
<stdin>:1: error: invalid application of 'sizeof' to a void type
<stdin>:1: error: 'printf' was not declared in this scope

If you are using GCC and you are not using compilation flags that remove compiler specific extensions, then sizeof(void) is 1. GCC has a nonstandard extension that does that.
In general, void is an incomplete type, and you cannot use sizeof on incomplete types.

Although void may stand in place for a type, it cannot actually hold a value. Therefore, it has no size in memory. Getting the size of a void isn’t defined.
A void pointer is simply a language construct meaning a pointer to untyped memory.

void has no size. In both C and C++, the expression sizeof (void) is invalid.
In C, quoting N1570 6.5.3.4 paragraph 1:
The sizeof operator shall not be applied to an expression that
has function type or an incomplete type, to the parenthesized name of
such a type, or to an expression that designates a bit-field member.
(N1570 is a draft of the 2011 ISO C standard.)
void is an incomplete type. This paragraph is a constraint, meaning that any conforming C compiler must diagnose any violation of it. (The diagnostic message may be a non-fatal warning.)
The C++ 11 standard has very similar wording. Both editions were published after this question was asked, but the rules go back to the 1989 ANSI C standard and the earliest C++ standards. In fact, the rule that void is an incomplete type to which sizeof may not be applied goes back exactly as far as the introduction of void into the language.
gcc has an extension that treats sizeof (void) as 1. gcc is not a conforming C compiler by default, so in its default mode it doesn't warn about sizeof (void). Extensions like this are permitted even for fully conforming C compilers, but the diagnostic is still required.

Taking the size of void is a GCC extension.

sizeof() cannot be applied to incomplete types, and void is an incomplete type that cannot be completed.

In C, sizeof(void) == 1 in GCC, but this appears to depend on your compiler.
In C++, I get:
In function 'int main()':
Line 2: error: invalid application of 'sizeof' to a void type
compilation terminated due to -Wfatal-errors.

To the 2nd part of the question: note that sizeof(void *) != sizeof(void).
On a 32-bit architecture, sizeof(void *) is 4 bytes, but that is the size of the pointer itself and has no bearing on p++. The amount by which a pointer is incremented depends on the size of the type it points to. With GCC's extension, sizeof(void) is 1, so p++ advances p by 1 byte, from 0x2345 to 0x2346.

While sizeof(void) perhaps makes no sense in itself, it matters when you're doing pointer arithmetic on a void *.
E.g.:
void *p;
while(...)
p++;
If sizeof(void) is considered 1 then this will work.
If sizeof(void) is considered 0 then you hit an infinite loop.

Most C++ compilers chose to raise a compile error when trying to get sizeof(void).
When compiling C, gcc is not conforming and chose to define sizeof(void) as 1. It may look strange, but it has a rationale. In pointer arithmetic, adding or subtracting one unit means adding or subtracting the size of the pointed-to object. Defining sizeof(void) as 1 therefore lets void* act as a pointer to a byte (an untyped memory address). Otherwise you would get surprising behavior from pointer arithmetic, such as p+1 == p when p is a void*. Such pointer arithmetic on void pointers is not allowed in C++ but works fine when compiling C with gcc.
The standard recommended way would be to use char* for that kind of purpose (pointer to byte).
Another similar difference between C and C++ when using sizeof occurs when you defined an empty struct like:
struct Empty {
} empty;
Using gcc as my C compiler sizeof(empty) returns 0.
Using g++ the same code will return 1.
I'm not sure what the C and C++ standards say on this point, but I believe giving empty structs/objects a nonzero size helps with reference management: it prevents two references to different consecutive objects, the first one being empty, from having the same address. If references are implemented using hidden pointers, as is often done, ensuring different addresses helps when comparing them.
But this merely avoids one surprising behavior (corner-case comparison of references) by introducing another one (empty objects, even PODs, consume at least 1 byte of memory).

Related

Conversion to void** on different compilers

I've been running the following code through different compilers:
int main()
{
float **a;
void **b;
b = a;
}
From what I've been able to gather, void ** is not a generic pointer which means that any conversion from another pointer should not compile or at least throw a warning. However, here are my results (all done on Windows):
gcc - Throws a warning, as expected.
g++ - Throws an error, as expected (this is due to the less permissive typing of C++, right?)
MSVC (cl.exe) - Throws no warnings whatsoever, even with /Wall specified.
My question is: Am I missing something about the whole thing and is there any specific reason why MSVC does not produce a warning? MSVC does produce a warning when converting from void ** to float **.
Another thing of note: If I replace b = a with the explicit conversion b = (void **)a, none of the compilers throw a warning. I thought this should be an invalid cast, so why wouldn't there be any warnings?
The reason I am asking this question is because I was starting to learn CUDA and in the official Programming Guide (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#device-memory) the following code can be found:
// Allocate vectors in device memory
float* d_A;
cudaMalloc(&d_A, size);
which should perform an implicit conversion to void ** for &d_A, as the first argument of cudaMalloc is of type void **. Similar code can be found all over the documentation. Is this just sloppy work on NVIDIA's end or am I, again, missing something? Since nvcc uses MSVC, the code compiles without warnings.
Am I missing something about the whole thing and is there any specific reason why MSVC does not produce a warning? MSVC does produce a warning when converting from void ** to float **
This assignment without a cast is a constraint violation, so a standard-compliant compiler will print a warning or error. However, MSVC is not a fully compliant C implementation.
Another thing of note: If I replace b = a with the explicit conversion b = (void **)a, none of the compilers throw a warning. I thought this should be an invalid cast, so why wouldn't there be any warnings?
Pointer conversions via a cast are allowed in some situations. The C standard says the following in section 6.3.2.3p7:
A pointer to an object type may be converted to a pointer to a different object type. If the
resulting pointer is not correctly aligned for the referenced type, the behavior is
undefined. Otherwise, when converted back again, the result shall compare equal to the
original pointer. When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
So you can convert between pointer types provided there are no alignment issues and you only convert back (unless the target is a char *).
float* d_A;
cudaMalloc(&d_A, size);
...
Is this just sloppy work on NVIDIA's end or am I, again, missing something?
Presumably, this function is dereferencing the given pointer and writing the address of some allocated memory. That would mean it is trying to write to a float * as if it were a void *. This is not the same as the typical conversion to/from a void *. Strictly speaking this looks like undefined behavior, although it "works" because modern x86 processors (when not in real mode) use the same representation for all pointer types.

Difference between Function pointer casting in c and c++

The behavior of my code is different in c and c++.
void *(*funcPtr)() = dlsym(some symbol..) ; // (1) works in c but not c++
int (*funcPtr)();
*(void**)(&funcPtr) = dlsym(some symbol..) ; // (2) works in c++
I don't understand how the 2nd cast works in C++ while the 1st doesn't. The error message shown for (1) in C++ is: invalid conversion from void* to void*().
The problem is that dlsym returns a void* pointer. While in C, any such pointer is implicitly convertible into any other (object!) pointer type (for comparison: casting the result of malloc), this is not the case in C++ (here you need to cast the result of malloc).
For function pointers, though, this conversion is not implicit even in C. Apparently, since your code compiles, your compiler performs this implicit conversion for function pointers too, as a compiler extension (consistent with object pointers); to be fully standard compliant, however, it should issue a diagnostic, e.g. a compiler warning, as mandated by C17 6.5.4 paragraph 3:
Conversions that involve pointers, other than where permitted by the constraints of 6.5.16.1, shall be specified by means of an explicit cast.
But now instead of casting the target pointer, you rather should cast the result of dlsym into the appropriate function pointer:
int (*funcPtr)() = reinterpret_cast<int(*)()>(dlsym(some symbol..));
or simpler:
auto funcPtr = reinterpret_cast<int(*)()>(dlsym(some symbol..));
or even:
int (*funcPtr)() = reinterpret_cast<decltype(funcPtr)>(dlsym(some symbol..));
(Last one especially interesting if funcPtr has been declared previously.)
Additionally, you should prefer C++ over C style casts, see my variants above. You get more precise control over what type of cast actually occurs.
Side note: Have you noticed that the two function pointers declared in your question differ in return type (void* vs. int)? Additionally, a function not accepting any arguments in C needs to be declared as void f(void); accordingly, function pointers: void*(*f)(void). C++ allows usage of void for compatibility, while omitting void in C has a totally different meaning: the function could accept anything, and you need to know from elsewhere (documentation) how many arguments of which types can actually be passed.
Strictly speaking, none of this code is valid in either language.
void *(*funcPtr)() = dlsym(some symbol..) ;
dlsym returns type void* so this is not valid in either language. The C standard only allows for implicit conversions between void* and pointers to object type, see C17 6.3.2.3:
A pointer to void may be converted to or from a pointer to any object type. A pointer to
any object type may be converted to a pointer to void and back again; the result shall
compare equal to the original pointer.
Specifically, your code is a language violation of one of the rules of simple assignment, C17 6.5.16.1, emphasis mine:
the left operand has atomic, qualified, or unqualified pointer type, and (considering
the type the left operand would have after lvalue conversion) one operand is a pointer
to an object type, and the other is a pointer to a qualified or unqualified version of
void, and the type pointed to by the left has all the qualifiers of the type pointed to
by the right;
The reason why it might compile on certain compilers is overly lax compiler settings. If you are for example using gcc, you need to compile with gcc -std=c17 -pedantic-errors to get a language compliant compiler, otherwise it defaults to the non-standard "gnu11" language.
int (*funcPtr)();
*(void**)(&funcPtr) = dlsym(some symbol..) ; // (2) works in c++
Here you explicitly force the function pointer to become type void**. The cast is fine syntax-wise, but this may or may not be a valid pointer conversion on the specific compiler. Again, conversions between object pointers and function pointers are not supported by the C or C++ standard, but you are relying on non-standard extensions. Formally this code invokes undefined behavior in both C and C++.
In practice, lots of compilers have well-defined behavior here, because most systems have the same representation of object pointers and function pointers.
Given that the conversion is valid, you can of course de-reference a void** to get a void* and then assign another void* to it.

c++ function call with square parenthesis

int func(int n)
{return n;}
int main()
{ cout << func[4] ;
cout << func[4,3,5] ;}
What do these actually mean? I guess it is about accessing func+4 and func is allocated space on calling func[4].
But, func[4,3,5] is just absurd.
The reason this code compiles and func[4] is not a syntax error is:
1. Function types can implicitly convert to pointers of the same type.
So, if we have code like this:
int f(int);
using func_t = int(*)(int);
void g(func_t);
we can write
g(f)
and aren't forced to write g(&f). The &, taking us from type int(int) to int(*)(int) happens implicitly.
2. In C (and necessarily in C++ for compatibility) pointers are connected to arrays, and when p is a pointer p[x] is the same as *(p + x). So func[4] is the same as *(func + 4).
3. *(p+x) has the type of a function int(int), but again can implicitly decay to a pointer type whenever necessary. So *(func + 4) can implicitly just be (func + 4).
4. Pointers of any type are streamable to std::cout.
Note, that just because it isn't a syntax error doesn't mean it is valid. Of course it is undefined behavior, and as the compiler warning emitted by gcc and clang indicates, pointer arithmetic with a function pointer is generally wrong, because you cannot make an array of functions. The implementation places functions however it likes. (You can make an array of function pointers but that is something else entirely.)
Edit: I should correct myself -- this answer is not entirely correct. func[4] is not valid, because the pointer is not a pointer to an object type. @holyblackcat's answer is correct; see that answer for the reference in the standard.
This code should be ill-formed, and gcc only compiles it without an error because they are using a nonstandard extension by default. Clang and msvc correctly reject this code.
I'm surprised no answer mentions it, but:
The code in the question is simply not valid C++.
It's rejected by Clang and MSVC with no flags. GCC rejects it with -pedantic-errors.
a[b] (in the absence of operator overloading) is defined as *(a + b), and the builtin operator + requires the pointer operand to be a pointer to an object type (which function pointers are not).
[expr.add]/1
...either both operands shall have arithmetic or unscoped enumeration type, or one operand shall be a pointer to a completely-defined object type and the other shall have integral or unscoped enumeration type.
(Emphasis mine.)
GCC compiles the code because an extension allowing arithmetic on function pointer is enabled by default.
Due to function-to-pointer decay, func[4] is treated as &(&func)[4], which effectively means &func + 4, which (as the link explains) simply adds 4 to the numerical value of the pointer. Calling the resulting pointer will most likely cause a crash or unpredictable results.
std::cout doesn't have an overload of << suitable for printing function pointers, and the best suitable overload the compiler is able to find is the one for printing bools. The pointer gets converted to bool, and since it's non-null, it becomes true, which is then printed as 1.
Lastly, func[4,3,5] has the same effect as func[5], since in this context , is treated as an operator, and x , y is equal to y.
Since it has not been mentioned yet: func[4, 3, 5] is identical to func[5] - the commas in there are the builtin comma operator, which evaluates the left-hand side expression, discards its result, and then evaluates the right-hand side expression. There is no function call happening here and the commas in the code are not delimiting function parameters.
Yes, it is about accessing func+4, which is not defined and leads to a garbage value. The compiler indicates this with the following warning message:
Program: In function 'int main()':
Program:7: warning: pointer to a function used in arithmetic

Semantics of unary & on numeric literal

What is the unary-& doing here?
int * a = 1990;
int result = &5[a];
If you were to print result you would get the value 2010.
You have to compile it with -fpermissive or it will stop due to errors.
In C, x [y] and y [x] are identical. So &5[a] is the same as &a[5].
&5[a] is the same as &a[5] and the same as a + 5. In your case it's undefined behavior because a points to nowhere.
C11 standard chapter 6.5.6 Additive operators/8 (the same in C++):
If both the pointer
operand and the result point to elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
"...unary & on numeric literal"?
Postfix operators in C always have higher priority than prefix ones. In case of &5[a], the [] has higher priority than the &. Which means that in &5[a] the unary & is not applied to "numeric literal" as you seem to incorrectly believe. It is applied to the entire 5[a] subexpression. I.e. &5[a] is equivalent to &(5[a]).
As for what 5[a] means - this is a beaten-to-death FAQ. Look it up.
And no, you don't have "to compile it with -fpermissive" (my compiler tells me it doesn't even know what -fpermissive is). You have to figure out that this
int * a = 1990;
is not legal code in either C or C++. If anything, it requires an explicit cast
int * a = (int *) 1990;
not some obscure switch of some specific compiler you happened to be using at the moment. The same applies to another illegal initialization in int result = &5[a].
Finally, even if we overlook the illegal code and the undefined behavior triggered by that 5[a], the behavior of this code will still be highly implementation-dependent. I.e. the answer is no, in general case you will not get 2010 in result.
You cannot apply the unary & operator to an integer literal, because a literal is not an lvalue.
Due to operator precedence, your code doesn't do that. Since the indexing operator [] binds more tightly than unary &, &5[a] is equivalent to &(5[a]).
Here's a program similar to yours, except that it's valid code, not requiring -fpermissive to compile:
#include <stdio.h>
int main(void) {
int arr[6];
int *ptr1 = arr;
int *ptr2 = &5[ptr1];
printf("%p %p\n", (void *)ptr1, (void *)ptr2);
}
As explained in this question and my answer, the indexing operator is commutative (because it's defined in terms of addition, and addition is commutative), so 5[a] is equivalent to a[5]. So the expression &5[ptr1] computes the address of element 5 of arr.
In your program:
int * a = 1990;
int result = &5[a];
the initialization of a is invalid because a is of type int* and 1990 is of type int, and there is no implicit conversion from int to int*. Likewise, the initialization of result is invalid because &5[a] is of type int*. Apparently -fpermissive causes the compiler to violate the rules of the language and permit these invalid implicit conversions.
At least in the version of gcc I'm using, the -fpermissive option is valid only for C++ and Objective-C, not for C. In C, gcc permits such implicit conversions (with a warning) anyway. I strongly recommend not using this option. (Your question is tagged both C and C++. Keep in mind that C and C++ are two distinct, though closely related, languages. They happen to behave similarly in this case, but it's usually best to pick one language or the other.)

Can sizeof return 0 (zero)

Is it possible for the sizeof operator to ever return 0 (zero) in C or C++? If it is possible, is it correct from a standards point of view?
In C++ an empty class or struct has a sizeof at least 1 by definition. From the C++ standard, 9/3 "Classes": "Complete objects and member subobjects of class type shall have nonzero size."
In C an empty struct is not permitted, except by extension (or a flaw in the compiler).
This is a consequence of the grammar (which requires that there be something inside the braces) along with this sentence from 6.7.2.1/7 "Structure and union specifiers": "If the struct-declaration-list contains no named members, the behavior is undefined".
If a zero-sized structure is permitted, then it's a language extension (or a flaw in the compiler). For example, in GCC the extension is documented in "Structures with No Members", which says:
GCC permits a C structure to have no members:
struct empty {
};
The structure will have size zero. In C++, empty structures are part of the language. G++ treats empty structures as if they had a single member of type char.
sizeof never returns 0 in C and in C++. Every time you see sizeof evaluating to 0 it is a bug/glitch/extension of a specific compiler that has nothing to do with the language.
Every object in C must have a unique address. Worded another way, an address must hold no more than one object of a given type (in order for pointer dereferencing to work). That being said, consider an 'empty' struct:
struct emptyStruct {};
and, more specifically, an array of them:
struct emptyStruct array[10];
struct emptyStruct* ptr = &array[0];
If the objects were indeed empty (that is, if sizeof(struct emptyStruct) == 0), then ptr++ ==> (void*)ptr + sizeof(struct emptyStruct) ==> ptr, which doesn't make sense. Which object would *ptr then refer to, ptr[0] or ptr[1]?
Even if a structure has no contents, the compiler should treat it as if it is one byte in length in order to maintain the "one address, one object" principle.
The C language specification (section A7.4.8) words this requirement as
when applied to a structure or union,
the result (of the sizeof operator)
is the number of bytes in the object,
including any padding required to make
the object tile an array
Since a padding byte must be added to an "empty" object in order for it to work in an array, sizeof() must therefore return a value of at least 1 for any valid input.
Edit:
Section A8.3 of the C spec calls a struct without a list of members an incomplete type, and the definition of sizeof specifically states (with emphasis added):
The operator (sizeof) may not be
applied to an operand of function
type, or of incomplete type, or to a
bit-field.
That would imply that using sizeof on an empty struct would be equally as invalid as using it on a data type that has not been defined. If your compiler allows the use of empty structs, be aware that using sizeof on them is not allowed as per the C spec. If your compiler allows you to do this anyway, understand that this is non-standard behavior that will not work on all compilers; do not rely on this behavior.
Edit: See also this entry in Bjarne Stroustrup's FAQ.
Empty structs, as isbadawi mentions. Also gcc allows arrays of 0 size:
int a[0];
sizeof(a);
EDIT: After seeing the MSDN link, I tried the empty struct in VS2005 and sizeof did return 1. I'm not sure if that's a VS bug or if the spec is somehow flexible about that sort of thing.
In my view, it is better that sizeof returns 0 for a structure of size 0 (in the spirit of C), but then the programmer has to be careful when taking the sizeof an empty struct.
It may cause a problem, though: when an array of such structures is defined, then
&arr[1] == &arr[2] == &arr[0]
which makes them lose their identities.
I guess this doesn't directly answer your question of whether it is possible or not. It may be possible depending on the compiler (as said in Michael's answer above).
typedef struct {
int : 0;
} x;
x x1;
x x2;
Under MSVC 2010 (/Za /Wall):
sizeof(x) == 4
&x1 != &x2
Under GCC (-ansi -pedantic -Wall) :
sizeof(x) == 0
&x1 != &x2
i.e. Even though under GCC it has zero size, instances of the struct have distinct addresses.
ANSI C (C89 and C99 - I haven't looked at C++) says "It shall be possible to express the address of each individual byte of an object uniquely." This seems ambiguous in the case of a zero-sized object, since it arguably has no bytes.
Edit: "A bit-field declaration with no declarator, but only a colon and a width, indicates an unnamed bit-field. As a special case of this, a bit-field with a width of 0 indicates that no further bit-field is to be packed into the unit in which the previous bit-field, if any, was placed."
I think it never returns 0 in C; empty structs are not allowed.
Here's a test, where sizeof yields 0
#include <stdio.h>
void func(int i)
{
int vla[i];
printf ("%u\n",(unsigned)sizeof vla);
}
int main(void)
{
func(0);
return 0;
}
If you have this:
struct Foo {};
struct Bar { Foo v[]; };
g++ -ansi returns sizeof(Bar) == 0, as do clang and the Intel compiler.
However, this does not compile with gcc. I deduce it's a C++ extension.
struct Empty {
} em;
struct Zero {
Empty a[0];
} zr;
printf("em=%zu\n", sizeof(em));
printf("zr=%zu\n", sizeof(zr));
Result:
em=1
zr=0