I happened to stumble across this piece of code.
#include <iostream>

int x(int a){
    std::cout<<a<<std::endl;
    return a + 1;
}
int main()
{
    std::cout<<sizeof(x(20))<<std::endl;
    return 0;
}
I expected it to print 20 followed by 4, but it prints only 4. Why does this happen? Isn't it incorrect to optimize out a function that has a side effect (printing to IO/file etc.)?
sizeof is a compile-time operator, and the operand is never evaluated.
sizeof is actually an operator, and it is evaluated at compile time.
The compiler can evaluate it because the size of the return type of x is fixed; it cannot change during program execution.
The result of sizeof is computed at compile time in C++, so there is no function call to x(20).
sizeof() gives the size of the datatype. In your case it doesn't need to call the function to obtain the datatype.
I suspect sizeof also does its business at compile time rather than runtime...
Let me quote c++03 standard, #5.3.3.
The sizeof operator yields the number of bytes in the object
representation of its operand. The operand is either an expression,
which is not evaluated, or a parenthesized type-id.
I have two functions without any implementation.
I expect that the linker returns an undefined reference to hello and world error.
But surprisingly, the code compiles and runs without any error.
#include <stdio.h>
int hello();
char world();
int main() {
printf("sizeof hello = %zu, sizeof world = %zu\n", sizeof(hello()), sizeof(world()));
}
sizeof hello = 4, sizeof world = 1
sizeof(hello()) is the size of the return type, int, not the size of the function. The function is not called.
The function does not need to be defined to determine its return type declared by int hello();.
Deeper (in C)
sizeof works with sizeof unary-expression and sizeof ( type-name ).
The size is determined from the type of the operand. The result
is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant. C17dr § 6.5.3.4 2
Since the type of the operand hello() is an int (and not a variable length array), the operand is not evaluated and is then like sizeof(int).
Aside: sizeof returns a size_t.
"%zu" is a correct specifier to match a size_t. "%ld" is not specified to work.
// printf("sizeof hello = %ld, sizeof world = %ld\n", sizeof(hello()), sizeof(world()));
printf("sizeof hello = %zu, sizeof world = %zu\n", sizeof(hello()), sizeof(world()));
sizeof is unevaluated context. It does not actually call the functions. No definition is required. The declaration is sufficient to know that they return int and char.
You could as well have written sizeof(int) and sizeof(char) to get the same output. sizeof(&hello) would result in the size of a function pointer to hello and does not require the definition either. And sizeof(hello) just makes no sense, because functions have no size (at least not in a sense that sizeof could tell you).
For details I refer you to https://en.cppreference.com/w/cpp/language/sizeof (c++) and https://en.cppreference.com/w/c/language/sizeof (c).
The title of your question:
how sizeof works on unimplemented function?
implies you are trying to determine how much memory the function implementation uses, similar to sizeof(int) returning how much memory an int variable uses.
You can't use sizeof() on a function like that.
Per 6.5.3.4 The sizeof and _Alignof operators, paragraph 1 of the (draft) C11 standard:
The sizeof operator shall not be applied to an expression that has function type...
The draft C23 standard has the exact same wording:
The sizeof operator shall not be applied to an expression that has function type...
Is there a way in C and C++ to cause functions returning void to be evaluated in unspecified order?
I know that function arguments are evaluated in unspecified order so for functions not returning void this can be used to evaluate those functions in unspecified order:
#include <stdio.h>
int hi(void) {
puts("hi");
return 0;
}
int bye(void) {
puts("bye");
return 0;
}
int moo(void) {
puts("moo");
return 0;
}
void dummy(int a, int b, int c) {}
int main(void) {
dummy(hi(), bye(), moo());
}
Legal C and C++ code compiled by a conforming compiler may print hi, bye, and moo in any order. This is not undefined behavior (nasal demons would not be valid); there is simply more than one but fewer than infinitely many valid outputs, and a compliant compiler need not even be deterministic in what it produces.
Is there any way to do this without the dummy return values?
Clarification: This is an abstract question about C and C++. A better original phrasing might have been is there any context in which function evaluation order is unspecified for functions returning void? I'm not trying to solve a specific problem.
You can take advantage of the fact that the left-hand side of the comma operator is a discarded-value expression (a void expression in C), like this:
int main(void) {
dummy((hi(),0), (bye(),0), (moo(),0));
}
From the draft C++ standard section 5.18 Comma operator:
A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded-value expression (Clause 5).
and C11 section 6.5.17 Comma operator:
The left operand of a comma operator is evaluated as a void expression; there is a
sequence point between its evaluation and that of the right operand. Then the right
operand is evaluated; the result has its type and value.
As Matt points out, it is also possible to mix the above method with arithmetic operators to achieve unspecified order of evaluation:
(hi(),0) + (bye(),0) + (moo(),0) ;
Well there's always the obvious approach of putting pointers to the functions in a container, shuffling it up (or as suggested in a comment sorting it), and calling each item in the container. If you need to have the same behavior each run just make sure your seed is the same each time.
We know that sizeof is an operator used for calculating the size of any datatype and expression, and when the operand is an expression, the parentheses can be omitted.
int main()
{
int a;
sizeof int;
sizeof( int );
sizeof a;
sizeof( a );
return 0;
}
The first usage of sizeof is wrong, while the others are right.
When it is compiled using gcc, the following error message will be given:
main.c:5:9: error: expected expression before ‘int’
My question is why the C standard does not allow this kind of operation. Will sizeof int cause any ambiguity?
The following could be ambiguous:
sizeof int * + 1
Is that (sizeof (int*)) + 1, or (sizeof(int)) * (+1)?
Obviously the C language could have introduced a rule to resolve the ambiguity, but I can imagine why it didn't bother. With the language as it stands, a type specifier never appears "naked" in an expression, and so there is no need for rules to resolve whether that second * is part of the type or an arithmetic operator.
The existing grammar does already resolve the potential ambiguity of sizeof (int *) + 1. It is (sizeof(int*))+1, not sizeof((int*)(+1)).
C++ has a somewhat similar issue to resolve with function-style cast syntax. You can write int(0) and you can write typedef int *intptr; intptr(0);, but you can't write int*(0). In that case, the resolution is that the "naked" type must be a simple type name, it can't just be any old type id that might have spaces in it, or trailing punctuation. Maybe sizeof could have been defined with the same restriction, I'm not certain.
From C99 Standard
6.5.3.4.2
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a
type.
In your case, the operand int is neither an expression nor a parenthesized type name.
There are two ways to use the sizeof operator in C. The syntax is this:
C11 6.5.3 Unary operators
...
sizeof unary-expression
sizeof ( type-name )
Whenever you use a type as the operand, you must have the parentheses, by the syntax definition of the language. If you use sizeof on an expression, you don't need the parentheses.
The C standard gives one such example of where you might want to use it on an expression:
sizeof array / sizeof array[0]
However, for the sake of consistency, and to avoid bugs related to operator precedence, I would personally advise to always use () no matter the situation.
We all know that dereferencing a null pointer or a pointer to unallocated memory invokes undefined behaviour.
But what is the rule when used within an expression passed to sizeof?
For example:
int *ptr = 0;
int size = sizeof(*ptr);
Is this also undefined?
In most cases, you will find that sizeof(*x) does not actually evaluate *x at all. And, since it's the evaluation (de-referencing) of a pointer that invokes undefined behaviour, you'll find it's mostly okay. The C11 standard has this to say in 6.5.3.4. The sizeof operator /2 (my emphasis in all these quotes):
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.
This is identical wording to the same section in C99. C89 had slightly different wording because, of course, there were no VLAs at that point. From 3.3.3.4. The sizeof operator:
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand, which is not itself evaluated. The result is an integer constant.
So, in C, for all non-VLAs, no dereferencing takes place and the statement is well defined. If the type of *x is a VLA, that's considered an execution-phase sizeof, something that needs to be worked out while the code is running - all others can be calculated at compile time. If x itself is the VLA, it's the same as the other cases, no evaluation takes place when using *x as an argument to sizeof().
C++ has (as expected, since it's a different language) slightly different rules, as shown in the various iterations of the standard:
First, C++03 5.3.3. Sizeof /1:
The sizeof operator yields the number of bytes in the object representation of its operand. The operand is either an expression, which is not evaluated, or a parenthesized type-id.
In C++11 5.3.3. Sizeof /1, you'll find slightly different wording but the same effect:
The sizeof operator yields the number of bytes in the object representation of its operand. The operand is either an expression, which is an unevaluated operand (Clause 5), or a parenthesized type-id.
C++11 5. Expressions /7 (the above mentioned clause 5) defines the term "unevaluated operand" as perhaps one of the most useless, redundant phrases I've read for a while, but I don't know what was going through the mind of the ISO people when they wrote it:
In some contexts ([some references to sections detailing those contexts - pax]), unevaluated operands appear. An unevaluated operand is not evaluated.
C++14/17 have the same wording as C++11 but not necessarily in the same sections, as stuff was added before the relevant parts. They're in 5.3.3. Sizeof /1 and 5. Expressions /8 for C++14 and 8.3.3. Sizeof /1 and 8. Expressions /8 for C++17.
So, in C++, evaluation of *x in sizeof(*x) never takes place, so it's well defined, provided you follow all the other rules like providing a complete type, for example. But, the bottom line is that no dereferencing is done, which means it does not cause a problem.
You can actually see this non-evaluation in the following program:
#include <iostream>
#include <cmath>
int main() {
int x = 42;
std::cout << x << '\n';
std::cout << sizeof(x = 6) << '\n';
std::cout << sizeof(x++) << '\n';
std::cout << sizeof(x = 15 * x * x + 7 * x - 12) << '\n';
std::cout << sizeof(x += sqrt(4.0)) << '\n';
std::cout << x << '\n';
}
You might think that the final line would output something vastly different to 42 (774, based on my rough calculations) because x has been changed quite a bit. But that is not actually the case since it's only the type of the expression in sizeof that matters here, and the type boils down to whatever type x is.
What you do see (other than the possibility of a different int size on the lines other than the first and last) is:
42
4
4
4
4
42
No. sizeof is an operator, and works on types, not the actual value (which is not evaluated).
To remind you that it's an operator, I suggest you get in the habit of omitting the brackets where practical.
int* ptr = 0;
size_t size = sizeof *ptr;
size = sizeof (int); /* brackets still required when naming a type */
The answer may well be different for C, where sizeof is not necessarily a compile-time construct, but in C++ the expression provided to sizeof is never evaluated. As such, there is never a possibility for undefined behavior to exhibit itself. By similar logic, you can also "call" functions that are never defined [because the function is never actually called, no definition is necessary], a fact that is frequently used in SFINAE rules.
sizeof and decltype do not evaluate their operands, computing types only.
sizeof(*ptr) is the same as sizeof(int) in this case.
Since sizeof does not evaluate its operand (except in the case of variable length arrays if you're using C99 or later), in the expression sizeof (*ptr), ptr is not evaluated, therefore it is not dereferenced. The sizeof operator only needs to determine the type of the expression *ptr to get the appropriate size.