I wonder whether it's safe to do this kind of initialization according to the C/C++ standards:
int a = 5;
int tab[] = { a , tab[0] + 1 , tab[1] };
It successfully compiles and executes with gcc 4.5 and clang 2.9, but will it always be true?
Printing this table gives 5 6 6. It's initialized at global scope.
Generally this is interesting for both C and C++, but I want to use it in C++ :)
C++03/C++11 answer
No, it won't.
On the right-hand side of the =, tab exists [1] but, if it has automatic storage duration, it has not yet been initialised, so your use of tab[0] and tab[1] is using an uninitialised variable.
If tab is at namespace scope (and thus has static storage duration and has been zero-initialized), then this is "safe" but your use of tab[0] there is not going to give you 5.
It's difficult to provide standard references for this, other than to say that there is nothing in 8.5 "Initializers" that explicitly makes this possible, and rules elsewhere fill in the rest.
[1] [n3290: 3.3.2/1]: The point of declaration for a name is immediately after its complete declarator (Clause 8) and before its initializer (if any) [..]
int a = 5;
int tab[] = { a , tab[0] + 1 , tab[1] };
If these variables are declared at namespace scope, then they're okay, as at namespace scope variables are zero-initialized (because of static initialization - read this for detail).
But if they're declared at function scope, then second line invokes undefined behaviour, since the local variables are not statically initialized, that means, tab[0] and tab[1] are uninitialized, which you use to initialize the array. Reading uninitialized variables invokes undefined behavior.
In the C99 standard, it seems that the order of initialization of the members is guaranteed:
§6.7.8/17: Each brace-enclosed initializer list has an associated current object. When no designations are present, subobjects of the current object are initialized in order according to the type of the current object: array elements in increasing subscript order, structure members in declaration order, and the first named member of a union. In contrast, a designation causes the following initializer to begin initialization of the subobject described by the designator. Initialization then continues forward in order, beginning with the next subobject after that described by the designator.
But as @Tomalak mentions in the comments, that does not fully guarantee the operation, since the compiler could first evaluate all of the initializer expressions and only then apply the results in that order. That is, the quote above does not impose an order between the initialization of tab[0] and the evaluation of the expression tab[0]+1 that is used to initialize tab[1] (it only imposes an ordering between the initialization of tab[0] and tab[1]).
As for the C++ standard, neither the current standard nor the FDIS of the upcoming C++0x standard seems to have a specific clause defining the order in which the initialization is performed. The only mention of ordering comes from
§8.5.1/2 When an aggregate is initialized the initializer can contain an initializer-clause consisting of a brace-enclosed, comma-separated list of initializer-clauses for the members of the aggregate, written in increasing subscript or member order.
But that only relates to the order by which the entries in the initializer are written, not how it is actually evaluated.
So, I've now run a couple of tests regarding your problem.
All compilations have been performed with your example code above, using the following scheme:
$(GCC) -o a.out test.c -Wall -Wextra -pedantic -std=$(STD)
This yielded the following results:
for GCC = gcc, the standards -std=c89; -std=iso9899:1990; -std=iso9899:199409; -std=gnu89 resulted in a warning showing up: initializer element is not computable at load time and undefined behaviour at runtime, meaning that the second and third value of the array were random garbage.
the standards -std=c99; -std=iso9899:1999; -std=gnu99 did not produce this warning, but also showed undefined behaviour at runtime.
for GCC = g++, the standards -std=c++98; -std=gnu++98; -std=c++0x produced no warning, and the code worked as you'd expected it to, resulting in an array containing the values {5, 6, 6}.
However, as most of the people advised, it might be unwise to use this, since your code might behave differently on other compilers, or maybe even other versions of the same compiler, which is generally a bad thing :)
hope that helped.
Yes - it will probably work as you expect.
No - (you didn't ask, but) don't use it; it makes no sense and it's bad practice.
I was researching the initializer syntax in C++, and according to cppreference, there are three possible ways of writing it, 1) brackets, 2) equal sign, 3) braces.
When trying to initialize an array with the 1) brackets syntax, I encounter an error.
int test() {
    int a[](1);
    return 0;
}
Testing that on Compiler Explorer, I get from clang 11.0.0
error: array initializer must be an initializer list
int a[](1);
^
And similarly, on the same site with gcc 10.2
error: array must be initialized with a brace-enclosed initializer
Now, I know how to use braces to initialize the array without an error. But that is not the point of this question.
I'm looking for a correspondence of C++ standard with this reported error.
I'm looking at this standard draft timsong-cpp (should be around the C++ 20 time).
The section "(17.1)
If the initializer is a (non-parenthesized) braced-init-list or is = braced-init-list, ..." talks about braced lists - not our case.
Then there is a section "(17.5)
Otherwise, if the destination type is an array, the object is initialized as follows ..."
I think this should cover our case. It is about initialization of an array, and it is also an "otherwise" section, meaning it does not talk about the braced lists. It could talk about 1) brackets or 2) equal sign, but from further text we see that it requires an expression-list:
"Let x1, …, xk be the elements of the expression-list. "
That expression list will be there when using the 1) brackets syntax. As a side note, the section "14 If the entity being initialized ..." requires the expression list to only be a single expression in this case.
According to the standard wording, the declaration int a[](1); should set the length of the array to 1 and initialize its only element with value 1. But that does not happen in the implementations.
Which other parts of the standard can prevent this interpretation? Or is there something else I'm missing?
The feature that you're trying to use was added in C++20, and is called "Parenthesized initialization of aggregates". As can be seen from the compiler support page, GCC supports this from GCC10, whereas Clang doesn't support this feature yet.
I noticed just now that the following code can be compiled with clang/gcc/clang++/g++, using c99, c11, c++11 standards.
int main(void) {
    int i = i;
}
and even with -Wall -Wextra, none of the compilers even reports warnings.
By modifying the code to int i = i + 1; and with -Wall, they may report:
why.c:2:13: warning: variable 'i' is uninitialized when used within its own initialization [-Wuninitialized]
    int i = i + 1;
        ~   ^
1 warning generated.
My questions:
Why is this even allowed by compilers?
What does the C/C++ standards say about this? Specifically, what's the behavior of this? UB or implementation dependent?
Because i is uninitialized when used to initialize itself, it has an indeterminate value at that time. An indeterminate value can be either an unspecified value or a trap representation.
If your implementation supports padding bits in integer types and if the indeterminate value in question happens to be a trap representation, then using it results in undefined behavior.
If your implementation does not have padding in integers, then the value is simply unspecified and there is no undefined behavior.
EDIT:
To elaborate further, the behavior can still be undefined if i never has its address taken at some point. This is detailed in section 6.3.2.1p2 of the C11 standard:
If the lvalue designates an object of automatic storage
duration that could have been declared with the register storage
class (never had its address taken), and that object is uninitialized
(not declared with an initializer and no assignment to it
has been performed prior to use), the behavior is undefined.
So if you never take the address of i, then you have undefined behavior. Otherwise, the statements above apply.
This is a warning, it's not related to the standard.
Warnings are heuristic with "optimistic" approach. The warning is issued only when the compiler is sure that it's going to be a problem. In cases like this you have better luck with clang or newest versions of gcc as stated in comments (see another related question of mine: why am I not getting an "used uninitialized" warning from gcc in this trivial example?).
Anyway, in the first case:
int i = i;
does nothing, since i==i already. It is possible that the assignment is completely optimized out as it's useless. With compilers which don't "see" self-initialization as a problem you can do this without a warning:
int i = i;
printf("%d\n",i);
Whereas this triggers a warning all right:
int i;
printf("%d\n",i);
Still, it's bad enough not to be warned about this, since from now on i is seen as initialized.
In the second case:
int i = i + 1;
A computation between an uninitialized value and 1 must be performed. Undefined behaviour happens there.
I believe you are okay with getting the warning in case of
int i = i + 1;
as expected, however, you expect the warning to be displayed even in case of
int i = i;
also.
Why is this even allowed by compilers?
There is nothing inherently wrong with the statement. See the related discussions:
Why does the compiler allow initializing a variable with itself?
Why is initialization of a new variable by itself valid?
for more insight.
What does the C/C++ standards say about this? Specifically, what's the behavior of this? UB or implementation dependent?
This is undefined behavior, as the type int can have a trap representation and you have never taken the address of the variable under discussion. So, technically, you'll face UB as soon as you try to use the (indeterminate) value stored in variable i.
You should turn on your compiler warnings. In gcc, compile with -Winit-self to get a warning in C.
For C++, -Winit-self is already enabled by -Wall.
There is an old post asking for a construct for which sizeof would return 0. There are some high score answers from high reputation users saying that by the standard no type or variable can have sizeof 0. And I agree 100% with that.
However there is this new answer which presents this solution:
struct ZeroMemory {
    int *a[0];
};
I was just about to down-vote and comment on it, but time spent here taught me to check even the things that I am 100% sure on. So... to my surprise both gcc and clang show the same results: sizeof(ZeroMemory) == 0. Even more, sizeof a variable is 0:
ZeroMemory z{};
static_assert(sizeof(z) == 0); // Awkward...
Whaaaat...?
Godbolt link
How is this possible?
Before C was standardized, many compilers would have had no difficulty handling zero-size types as long as code never tried to subtract one pointer to a zero-size type from another. Such types were useful, and supporting them was easier and cheaper than forbidding them. Other compilers decided to forbid such types, however, and some static-assertion code may have relied upon the fact that they would squawk if code tried to create a zero-sized array. The authors of the Standard were faced with a choice:
1. Allow compilers to silently accept zero-sized array declarations, even in cases where the purpose of such declarations would be to trigger a diagnostic and abort compilation, and require that all compilers accept such declarations (though not necessarily silently) as producing zero-sized objects.
2. Allow compilers to silently accept zero-sized array declarations, even in cases where the purpose of such declarations would be to trigger a diagnostic and abort compilation, and allow compilers encountering such declarations to either abort compilation or continue it at their leisure.
3. Require that implementations issue a diagnostic if code declares a zero-sized array, but then allow implementations to either abort compilation or continue it (with whatever semantics they see fit) at their leisure.
The authors of the Standard opted for #3. Consequently, zero-sized array declarations are regarded by the Standard as an "extension", even though such constructs were widely supported before the Standard forbade them.
The C++ Standard allows for the existence of empty objects, but in an effort to allow the addresses of empty objects to be usable as tokens it mandates that they have a minimum size of 1. For an object that has no members to have a size of 0 would thus violate the Standard. If an object contains zero-sized members, however, the C++ Standard imposes no requirements about how it is processed beyond the fact that a program containing such a declaration must trigger a diagnostic. Since most code that uses such declarations expects the resulting objects to have a size of zero, the most useful behavior for compilers receiving such code is to treat them that way.
As pointed out by Jarod42 zero size arrays are not standard C++, but GCC and Clang extensions.
Adding -pedantic produces this warning:
5 : <source>:5:12: warning: zero size arrays are an extension [-Wzero-length-array]
    int *a[0];
           ^
I always forget that -std=c++XX (instead of -std=gnu++XX) doesn't disable all extensions.
This still doesn't explain the sizeof behavior. But at least we know it's not standard...
In C++, a zero-size array is illegal.
ISO/IEC 14882:2003 8.3.4/1:
[..] If the constant-expression (5.19) is present, it shall be an integral constant expression and its value shall be greater than zero. The constant expression specifies the bound of (number of elements in) the array. If the value of the constant expression is N, the array has N elements numbered 0 to N-1, and the type of the identifier of D is “derived-declarator-type-list array of N T”. [..]
g++ requires the -pedantic flag to give a warning on a zero-sized array.
Zero length arrays are an extension by GCC and Clang. Applying sizeof to zero-length arrays evaluates to zero.
A C++ class (empty) can't have size 0, but note that the class ZeroMemory is not empty. It has a named member with size 0 and applying sizeof will return zero.
I have some simple C++ code:
#include <iostream>
int main(){
    {
        int a = 10;
        tag:
        std::cout << a << std::endl;
    }
    goto tag;
    return 0;
}
Now, I know it is not a good idea to use goto, and that if I use goto to jump into some other scope I will get a compile error. I have tried this and, as expected, it gives me a compile error. But my question is whether there is any way in which this could end up in an infinite loop.
I am asking this question because of this question
I note that the code in the question is C++ code and not C code. However, the question is dual-tagged with C and C++, which is irksome since the rules for C and C++ are different.
The code fails to compile in C++ but equivalent code compiles in C
The C++ code in the question should not compile. Analogous code written in C should compile, but the net result is an infinite loop.
C++11
In ISO/IEC 14882:2011 (the C++11 standard; I don't have an official copy of the 2014 standard to report on), it says:
6.6.4 The goto statement [stmt.goto]
¶1 The goto statement unconditionally transfers control to the statement labeled by the identifier. The identifier shall be a label (6.1) located in the current function.
6.7 Declaration statement [stmt.dcl]
¶1 A declaration statement introduces one or more new identifiers into a block; it has the form declaration-statement:
block-declaration
If an identifier introduced by a declaration was previously declared in an outer block, the outer declaration is hidden for the remainder of the block, after which it resumes its force.
¶2 Variables with automatic storage duration (3.7.3) are initialized each time their declaration-statement is executed. Variables with automatic storage duration declared in the block are destroyed on exit from the block (6.6).
¶3 It is possible to transfer into a block, but not in a way that bypasses declarations with initialization. A program that jumps (87) from a point where a variable with automatic storage duration is not in scope to a point where it is in scope is ill-formed unless the variable has scalar type, class type with a trivial default constructor and a trivial destructor, a cv-qualified version of one of these types, or an array of one of the preceding types and is declared without an initializer (8.5).
87) The transfer from the condition of a switch statement to a case label is considered a jump in this respect.
Although a plain int is a scalar type, the jump bypasses the initialization and so is not allowed.
C11
In ISO/IEC 9899:2011 (the C11 standard), it says:
6.8.6.1 The goto statement
Constraints
¶1 The identifier in a goto statement shall name a label located somewhere in the enclosing function. A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier.
Semantics
¶2 A goto statement causes an unconditional jump to the statement prefixed by the named label in the enclosing function.
Note that constraint violations require a diagnostic. Violations of rules in semantics sections do not require a diagnostic.
And in Annex I (Common Warnings), which is an informative annex and not a normative one, it says:
— A block with initialization of an object that has automatic storage duration is jumped into (6.2.4).
And
6.2.4 Storage durations of objects
¶5 An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration, as do some compound literals. The result of attempting to indirectly access an object with automatic storage duration from a thread other than the one with which the object is associated is implementation-defined.
¶6 For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
¶7 For such an object that does have a variable length array type, its lifetime extends from the declaration of the object until execution of the program leaves the scope of the declaration.35) If the scope is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate.
35) Leaving the innermost block containing the declaration, or jumping to a point in that block or an embedded block prior to the declaration, leaves the scope of the declaration.
Note that there is no variably modified type (no VLA or variable length array) in the code in the question. Standard C++ does not support the concept of VLAs (though the GNU C++ compiler does allow them as an extension).
Code (goto1.c):
#include <stdio.h>
int main(void)
{
    {
        int a = 10;
tag:
        printf("%d\n", a);
    }
    goto tag;
    return 0;
}
Sample compilation:
$ gcc -std=c11 -O3 -g -Wall -Wextra -Werror goto1.c -o goto1
$
Those are fairly stringent warning options, and GCC utters not a peep — which is permissible behaviour given what the C standard says.
As the first comment to the question you mention says, the behavior is undefined. That basically means every compiler is free to interpret your code however it wants. Clearly, your compiler sees the problem and throws an error, while a different compiler may allow the code to compile, especially if its internal representation of scope allows the loop to happen.