how sizeof works on unimplemented function? [duplicate] - c++

I have two functions without any implementation.
I expected the linker to report an "undefined reference" error for hello and world.
But surprisingly, the code compiles and runs without any error.
#include <stdio.h>
int hello();
char world();
int main() {
    printf("sizeof hello = %zu, sizeof world = %zu\n", sizeof(hello()), sizeof(world()));
}
sizeof hello = 4, sizeof world = 1

sizeof(hello()) is the size of the return type, int, not the size of the function. The function is not called.
The function does not need to be defined to determine its return type declared by int hello();.
Deeper (in C)
sizeof works with sizeof unary-expression and sizeof ( type-name ).
The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant. C17dr § 6.5.3.4 2
Since the type of the operand hello() is an int (and not a variable length array), the operand is not evaluated and is then like sizeof(int).
Aside: sizeof returns a size_t.
"%zu" is a correct specifier to match a size_t. "%ld" is not specified to work.
// printf("sizeof hello = %ld, sizeof world = %ld\n", sizeof(hello()), sizeof(world()));
printf("sizeof hello = %zu, sizeof world = %zu\n", sizeof(hello()), sizeof(world()));

sizeof is unevaluated context. It does not actually call the functions. No definition is required. The declaration is sufficient to know that they return int and char.
You could as well have written sizeof(int) and sizeof(char) to get the same output. sizeof(&hello) would result in the size of a function pointer to hello and does not require the definition either. And sizeof(hello) just makes no sense, because functions have no size (at least not in a sense that sizeof could tell you).
For details I refer you to https://en.cppreference.com/w/cpp/language/sizeof (c++) and https://en.cppreference.com/w/c/language/sizeof (c).

The title of your question:
how sizeof works on unimplemented function?
implies you are trying to determine how much memory the function implementation uses, similar to sizeof(int) returning how much memory an int variable uses.
You can't use sizeof() on a function like that.
Per 6.5.3.4 The sizeof and _Alignof operators, paragraph 1 of the (draft) C11 standard:
The sizeof operator shall not be applied to an expression that has function type...
The draft C23 standard has the exact same wording:
The sizeof operator shall not be applied to an expression that has function type...

Related

Size of a function when call sizeof [duplicate]

I happened to stumble across this piece of code.
#include <iostream>

int x(int a) {
    std::cout << a << std::endl;
    return a + 1;
}

int main()
{
    std::cout << sizeof(x(20)) << std::endl;
    return 0;
}
I expected it to print 20 followed by 4. But it just prints 4. Why does this happen? Isn't it incorrect to optimize out a function that has a side effect (printing to IO/file etc)?
sizeof is a compile-time operator, and the operand is never evaluated.
sizeof is actually an operator, and it is evaluated at compile time.
The compiler can evaluate it because the size of the return type of x is fixed; it cannot change during program execution.
The result of sizeof is computed at compile time in C++, so there is no function call to x(20).
sizeof() gives the size of the datatype. In your case it doesn't need to call the function to obtain the datatype.
I suspect sizeof also does its business at compile time rather than runtime...
Let me quote the C++03 standard, §5.3.3.
The sizeof operator yields the number of bytes in the object
representation of its operand. The operand is either an expression,
which is not evaluated, or a parenthesized type-id.

Why sizeof int is wrong, while sizeof(int) is right?

We know that sizeof is an operator used for calculating the size of any datatype and expression, and when the operand is an expression, the parentheses can be omitted.
int main()
{
int a;
sizeof int;
sizeof( int );
sizeof a;
sizeof( a );
return 0;
}
The first usage of sizeof is wrong, while the others are right.
When it is compiled using gcc, the following error message will be given:
main.c:5:9: error: expected expression before ‘int’
My question is why the C standard does not allow this kind of operation. Will sizeof int cause any ambiguity?
The following could be ambiguous:
sizeof int * + 1
Is that (sizeof (int*)) + 1, or (sizeof(int)) * (+1)?
Obviously the C language could have introduced a rule to resolve the ambiguity, but I can imagine why it didn't bother. With the language as it stands, a type specifier never appears "naked" in an expression, and so there is no need for rules to resolve whether that second * is part of the type or an arithmetic operator.
The existing grammar does already resolve the potential ambiguity of sizeof (int *) + 1. It is (sizeof(int*))+1, not sizeof((int*)(+1)).
C++ has a somewhat similar issue to resolve with function-style cast syntax. You can write int(0) and you can write typedef int *intptr; intptr(0);, but you can't write int*(0). In that case, the resolution is that the "naked" type must be a simple type name, it can't just be any old type id that might have spaces in it, or trailing punctuation. Maybe sizeof could have been defined with the same restriction, I'm not certain.
From C99 Standard
6.5.3.4.2
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type.
In your case, int is neither an expression nor a parenthesized type name.
There are two ways to use the sizeof operator in C. The syntax is this:
C11 6.5.3 Unary operators
...
sizeof unary-expression
sizeof ( type-name )
Whenever you use a type as the operand, you must have the parentheses, by the syntax definition of the language. If you use sizeof on an expression, you don't need the parentheses.
The C standard gives one such example of where you might want to use it on an expression:
sizeof array / sizeof array[0]
However, for the sake of consistency, and to avoid bugs related to operator precedence, I would personally advise to always use () no matter the situation.

Why does decay to pointer for array argument appear not to apply to sizeof()?

I read a question earlier that was closed due to being an exact duplicate of this
When a function has a specific-size array parameter, why is it replaced with a pointer?
and
How to find the 'sizeof' (a pointer pointing to an array)?
but after reading this I am still confused by how sizeof() works. I understand that passing an array as an argument to a function such as
void foo(int a[5])
will result in the array argument decaying to a pointer. What I did not find in the above 2 question links was a clear answer as to why it is that the sizeof() function itself is exempt from (or at least seemingly exempt from) this pointer decay behaviour. If sizeof() behaved like any other function then
int a[5] = {1,2,3,4,5};
cout << sizeof(a) << endl;
the above should output 4 instead of 20. Have I missed something obvious? This seems to contradict the decay-to-pointer behaviour. Sorry for bringing this up again, but I am having a hard time understanding why this happens, despite having happily used the function for years without really thinking about it.
Because the standard says so (emphasis mine):
(C99, 6.3.2.1p3) "Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue."
Note that for C++, the standard explicitly says the size is the size of the array:
(C++11, 5.3.3p2 sizeof) "[...] When applied to an array, the result is the total number of bytes in the array. This implies that the size of an array of n elements is n times the size of an element."
sizeof is an operator, not a function. It's a specific one at that, too. The parentheses aren't even necessary if it's an expression:
int a;
sizeof (int); //needed because `int` is a type
sizeof a; //optional because `a` is an expression
sizeof (a); //^ also works
As you can see, it's on this chart of precedence as well. It's also one of the non-overloadable operators.

Is sizeof(*ptr) undefined behavior when pointing to invalid memory?

We all know that dereferencing a null pointer or a pointer to unallocated memory invokes undefined behaviour.
But what is the rule when used within an expression passed to sizeof?
For example:
int *ptr = 0;
int size = sizeof(*ptr);
Is this also undefined?
In most cases, you will find that sizeof(*x) does not actually evaluate *x at all. And, since it's the evaluation (de-referencing) of a pointer that invokes undefined behaviour, you'll find it's mostly okay. The C11 standard has this to say in 6.5.3.4. The sizeof operator /2 (my emphasis in all these quotes):
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.
This is identical wording to the same section in C99. C89 had slightly different wording because, of course, there were no VLAs at that point. From 3.3.3.4. The sizeof operator:
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand, which is not itself evaluated. The result is an integer constant.
So, in C, for all non-VLAs, no dereferencing takes place and the statement is well defined. If the type of *x is a VLA, that's considered an execution-phase sizeof, something that needs to be worked out while the code is running - all others can be calculated at compile time. If x itself is the VLA, it's the same as the other cases, no evaluation takes place when using *x as an argument to sizeof().
C++ has (as expected, since it's a different language) slightly different rules, as shown in the various iterations of the standard:
First, C++03 5.3.3. Sizeof /1:
The sizeof operator yields the number of bytes in the object representation of its operand. The operand is either an expression, which is not evaluated, or a parenthesized type-id.
In C++11 5.3.3. Sizeof /1, you'll find slightly different wording but the same effect:
The sizeof operator yields the number of bytes in the object representation of its operand. The operand is either an expression, which is an unevaluated operand (Clause 5), or a parenthesized type-id.
C++11 5. Expressions /7 (the above mentioned clause 5) defines the term "unevaluated operand" as perhaps one of the most useless, redundant phrases I've read for a while, but I don't know what was going through the mind of the ISO people when they wrote it:
In some contexts ([some references to sections detailing those contexts - pax]), unevaluated operands appear. An unevaluated operand is not evaluated.
C++14/17 have the same wording as C++11 but not necessarily in the same sections, as stuff was added before the relevant parts. They're in 5.3.3. Sizeof /1 and 5. Expressions /8 for C++14 and 8.3.3. Sizeof /1 and 8. Expressions /8 for C++17.
So, in C++, evaluation of *x in sizeof(*x) never takes place, so it's well defined, provided you follow all the other rules like providing a complete type, for example. But, the bottom line is that no dereferencing is done, which means it does not cause a problem.
You can actually see this non-evaluation in the following program:
#include <iostream>
#include <cmath>

int main() {
    int x = 42;
    std::cout << x << '\n';
    std::cout << sizeof(x = 6) << '\n';
    std::cout << sizeof(x++) << '\n';
    std::cout << sizeof(x = 15 * x * x + 7 * x - 12) << '\n';
    std::cout << sizeof(x += std::sqrt(4.0)) << '\n';
    std::cout << x << '\n';
}
You might think that the final line would output something vastly different to 42 (774, based on my rough calculations) because x has been changed quite a bit. But that is not actually the case since it's only the type of the expression in sizeof that matters here, and the type boils down to whatever type x is.
What you do see (other than the possibility of different int sizes on lines other than the first and last) is:
42
4
4
4
4
42
No. sizeof is an operator, and works on types, not the actual value (which is not evaluated).
To remind you that it's an operator, I suggest you get in the habit of omitting the brackets where practical.
int* ptr = 0;
size_t size = sizeof *ptr;
size = sizeof (int); /* brackets still required when naming a type */
The answer may well be different for C, where sizeof is not necessarily a compile-time construct, but in C++ the expression provided to sizeof is never evaluated. As such, there is never a possibility for undefined behavior to exhibit itself. By similar logic, you can also "call" functions that are never defined [because the function is never actually called, no definition is necessary], a fact that is frequently used in SFINAE rules.
sizeof and decltype do not evaluate their operands, computing types only.
sizeof(*ptr) is the same as sizeof(int) in this case.
Since sizeof does not evaluate its operand (except in the case of variable length arrays if you're using C99 or later), in the expression sizeof (*ptr), ptr is not evaluated, therefore it is not dereferenced. The sizeof operator only needs to determine the type of the expression *ptr to get the appropriate size.

How does "sizeof" work in this helper for determining array size?

I've found this article that brings up the following template and a macro for getting array size:
template<typename Type, size_t Size>
char ( &ArraySizeHelper(Type( &Array )[Size]) )[Size];
#define _countof(Array) sizeof(ArraySizeHelper(Array))
and I find the following part totally unclear. sizeof is applied to a function declaration. I'd expect the result to be "size of function pointer". Why does it obtain "size of return value" instead?
sizeof is applied to the result of a function call, not a declaration. It therefore gives the size of the return value, which in this case is a reference to an array of chars.
The template causes the array in the return type to have the same number of elements as the argument array, which is fed to the function from the macro.
Finally, sizeof is then applied to a reference to this char array. sizeof on a reference is the same as sizeof on the type itself. Since sizeof(char) == 1, this gives the number of elements in the array.
template<typename Type, size_t Size>
char (&ArraySizeHelper(Type(&Array)[Size]))[Size];
#define _countof(Array) sizeof(ArraySizeHelper(Array))
sizeof is applied to a function declaration. I'd expect the result to be "size of function pointer". Why does it obtain "size of return value" instead?
It's not sizeof ArraySizeHelper (which would be illegal: you can't take sizeof a function), nor sizeof &ArraySizeHelper (not even implicitly, as the implicit conversion from function to pointer-to-function is explicitly disallowed by the Standard; for C++0x see 5.3.3). Rather, it's sizeof ArraySizeHelper(Array), which is equivalent to sizeof the value that the function call returns, i.e. sizeof(char[Size]), hence Size.
ArraySizeHelper is a function template which returns a reference to a char array of size Size. The template takes two parameters: one is a type (Type), and the other is a value (Size).
So when you pass an object of type, say, A[100] to the function, the compiler deduces both arguments of the template: Type becomes A, and Size becomes 100.
So the return type of the instantiated function becomes char(&)[100]. Since the argument of sizeof is never evaluated, the function need not have a definition. sizeof only needs to know the return type of the function, and sizeof applied to a reference yields the size of the referenced type, char[100]. That is equivalent to sizeof(char[100]), which yields 100, the size of the array.
Another interesting point to note is that sizeof(char) is not compiler-dependent, unlike other primitive types (other than the variants of char; see note 1). It's ALWAYS 1. So sizeof(char[100]) is guaranteed to be 100.
1. Size of all variants of char is ONE, be it char, signed char, unsigned char according to the Standard.