enum element limit - c++

Is there a maximum number of allowable enum elements in C++?
(Question arose from answer to my previous question on defines)

There is no specified maximum or minimum; it depends on your implementation. However, Annex B of the C++ standard lists the following as a recommended minimum:
— Enumeration constants in a single enumeration [4096].
This is strictly a recommendation, not a requirement.

The language doesn't specify any such thing. However, compilers can have limits. You'd have to check your compiler docs for that.

In C, an enum is just a better-scoped set of #defines. In detail, from the C standard:
an enum value is of a type that is compatible with an implementation-defined one of the integral types.
My guess is that C++ has a similar definition and that C++0x adds the ability to specify the type. All in all, that would mean the number of enumerators is theoretically limited by the underlying type (whatever it is; int most of the time, I suppose, since the C standard is not clear enough about this). But long before you can set up millions of symbols, your compiler will probably crash or run out of memory.
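The point about the underlying type can be sketched in C++11, where an enumeration may fix its underlying type explicitly. This is only an illustration: `Small` and `distinct_values` are hypothetical names, and the helper bounds the number of distinct values, not the number of enumerator names (several enumerators may share one value).

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical enumeration with an explicitly fixed underlying type (C++11).
enum class Small : unsigned char { First = 0, Last = 255 };

// Number of distinct values the underlying type of enumeration E can hold.
// The arithmetic is done modulo 2^N in unsigned long long, so it is only
// meaningful for underlying types narrower than unsigned long long.
template <class E>
constexpr unsigned long long distinct_values()
{
    using U = typename std::underlying_type<E>::type;
    return static_cast<unsigned long long>(std::numeric_limits<U>::max())
         - static_cast<unsigned long long>(std::numeric_limits<U>::min())
         + 1ULL;
}

static_assert(std::is_same<std::underlying_type<Small>::type,
                           unsigned char>::value,
              "Small is stored in an unsigned char");
```

Calling `distinct_values<Small>()` gives the number of values an unsigned char can represent, which is the most distinct enumerator values `Small` could ever have, whatever the compiler's own limits are.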

Related

What is the maximum number of dimensions allowed for an array, and why?

What is the maximum number of dimensions that you can use when declaring an array?
For example:
#include <iostream>
int main()
{
    int a[3][3][3][4][3];
    a[2][2][2][2][2] = 9;
}
So, how many dimensions can we declare for an array?
What is the limit, and what is the reason behind it?
ISO/IEC 9899:2011 — C
In C, the C11 standard requires:
5.2.4.1 Translation limits
The implementation shall be able to translate and execute at least one program that
contains at least one instance of every one of the following limits: [18]
…
12 pointer, array, and function declarators (in any combinations) modifying an
arithmetic, structure, union, or void type in a declaration.
…
[18] Implementations should avoid imposing fixed translation limits whenever possible.
That means that a standard-compliant compiler must allow at least 12 array dimensions on a simple type like int, but should avoid imposing any limit if at all possible. The C90 and C99 standards required the same limit.
ISO/IEC 14882:2011 — C++
For C++11, the equivalent information is:
Annex B (informative) Implementation quantities [implimits]
Because computers are finite, C++ implementations are inevitably limited in the size of the programs they
can successfully process. Every implementation shall document those limitations where known. This documentation
may cite fixed limits where they exist, say how to compute variable limits as a function of available
resources, or say that fixed limits do not exist or are unknown.
2 The limits may constrain quantities that include those described below or others. The bracketed number
following each quantity is recommended as the minimum for that quantity. However, these quantities are
only guidelines and do not determine compliance.
…
Pointer, array, and function declarators (in any combination) modifying a class, arithmetic, or incomplete
type in a declaration [256].
…
Thus, in C++, the recommendation is that you should be able to use at least 256 dimensions in an array declaration.
Note that even after you've got the compiler to accept your code, there will ultimately be limits imposed by the memory on the machine where the code is run. The standards specify the minimum number of dimensions that the compiler must allow (over-specify in the C++ standard; the mind boggles at the thought of a 256-dimensional array). The intention is that you shouldn't run into a problem — use as many dimensions as you need. (Can you imagine working with the source code for a 64-dimensional array, let alone anything more — the individual expressions in the source would be horrid to behold, let alone write, read, modify.)
In practice, the number of dimensions is limited only by the amount of memory your machine has. You could declare an array with 100 (or n) dimensions. [1]
Note: if your code accesses memory outside the bounds of the array, that is undefined behavior.
[1] The standard specifies a minimum limit of 12 in the case of C and recommends 256 in the case of C++11. (This information was added after discussion with Jonathan Leffler. My earlier answer only pointed out the maximum limit, which is constrained by machine memory.)
The maximum also depends on the stack size. For example, if the stack size is 1 MB, then the size of a local int a[xx][xx][xx][xx][xx] must be less than 1 MB.

For what values of the alignment parameter does std::align work as expected?

I want to use std::align to align a buffer of unsigned char to a specific power-of-two value.
So far the documentation that I looked at seems to say this should work, except for this critical caveat at the end of the description:
The behavior is undefined if alignment is not a fundamental or
extended alignment value supported by the implementation (until C++17)
power of two (since C++17).
I can't fully parse that. If I had to take a guess, the last part means "until C++17, alignment needs to be a fundamental or extended alignment and after C++17, it just needs to be a power of two".
Is my parsing correct?
More importantly, I'm interested in the C++11 behavior: what are the definitions of fundamental and extended alignment and how can I determine what my implementation supports?
Finally, what happens if I pass a value of alignment that does satisfy the above - is it UB or just an implementation-defined result?
Perhaps I could determine the answers myself with access to the standard, but based on my search it seems to be a pay-to-read document.
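For reference, a minimal sketch of calling std::align on an unsigned char buffer. `align_in_buffer` and `is_aligned` are hypothetical helpers, not part of the library. In C++11 a fundamental alignment is one that is no greater than alignof(std::max_align_t); whether larger (extended) alignments are supported is implementation-defined, so the sketch assumes the caller passes a power of two that the implementation supports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper: returns the first address inside buf that is aligned
// to `alignment` and still leaves room for `size` bytes, or nullptr if the
// request does not fit. `alignment` is assumed to be a supported power of two.
inline void* align_in_buffer(unsigned char* buf, std::size_t buf_size,
                             std::size_t alignment, std::size_t size)
{
    void* p = buf;
    std::size_t space = buf_size;
    // std::align adjusts p and space on success and returns nullptr,
    // leaving both unchanged, when the request cannot be satisfied.
    return std::align(alignment, size, p, space);
}

// Hypothetical helper used to check a result.
inline bool is_aligned(const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Passing alignof(std::max_align_t) as the alignment stays within the fundamental alignments, which is the conservative choice under the C++11 wording.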

Is it undefined behaviour to allocate overlarge stack structures?

This is a C specification question.
We all know this is legal C and should work fine on any platform:
#include <stdio.h>
/* Stupid way to count the length of a number */
int count_len(int val) {
    char buf[256];
    return sprintf(buf, "%d", val);
}
But this is almost guaranteed to crash:
#include <stdio.h>
/* Stupid way to count the length of a number */
int count_len(int val) {
    char buf[256000000];
    return sprintf(buf, "%d", val);
}
The difference is that the latter program blows the stack and will probably crash. But, purely semantically, it really isn't any different than the previous program.
According to the C spec, is the latter program actually undefined behavior? If so, what distinguishes it from the former? If not, what in the C spec says it's OK for a conforming implementation to crash?
(If this differs between C89/C99/C11/C++*, this would be interesting too).
The language standards for C (C89, C99, C11) begin with a scope section containing this wording (also found in some C++, C#, Fortran and Pascal standards):
This International Standard does not specify
- the size or complexity of a program and its data that will exceed the capacity of any specific data-processing system or the capacity of a particular processor;
- all minimal requirements of a data-processing system that is capable of supporting a conforming implementation.
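The gcc compiler does offer an option to check for stack overflow at runtime. The excerpt below appears to come from the GNAT (Ada) documentation, which is why its example compiles an .adb file: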
21.1 Stack Overflow Checking
For most operating systems, gcc does not perform stack overflow checking by default. This means that if the main environment task or some other task exceeds the available stack space, then unpredictable behavior will occur. Most native systems offer some level of protection by adding a guard page at the end of each task stack. This mechanism is usually not enough for dealing properly with stack overflow situations because a large local variable could “jump” above the guard page. Furthermore, when the guard page is hit, there may not be any space left on the stack for executing the exception propagation code. Enabling stack checking avoids such situations.
To activate stack checking, compile all units with the gcc option -fstack-check. For example:
gcc -c -fstack-check package1.adb
Units compiled with this option will generate extra instructions to check that any use of the stack (for procedure calls or for declaring local variables in declare blocks) does not exceed the available stack space. If the space is exceeded, then a Storage_Error exception is raised.
During the standardization process for C99 there was an attempt to make a stronger statement within the standard: while size and complexity are beyond the scope of the standard, the implementer has a responsibility to document the limits.
The rationale was:
The definition of conformance has always been a problem with the C
Standard, being described by one author as "not even rubber teeth, more
like rubber gums". Though there are improvements in C9X compared with C89,
many of the issues still remain.
This paper proposes changes which, while not perfect, hopefully improve the
situation.
The following wording was suggested for inclusion in section 5.2.4.1:
translation or execution might fail if the size or complexity of a program or its data exceeds the capacity of the implementation.
The implementation shall document a way to determine if the size or complexity of a correct program exceeds or might exceed the capacity of the implementation.
5.2.4.1. An implementation is always free to state that a given program is
too large or too complex to be translated or executed. However, to stop
this being a way to claim conformance while providing no useful facilities
whatsoever, the implementer must provide a way to determine whether a
program is likely to exceed the limits. The method need not be perfect, so
long as it errs on the side of caution.
One way to do this would be to have a formula which converted values such
as the number of variables into, say, the amount of memory the compiler
would need. Similarly, if there is a limit on stack space, the formula need
only show how to determine the stack requirements for each function call
(assuming this is the only place the stack is allocated) and need not work
through every possible execution path (which would be impossible in the
face of recursion). The compiler could even have a mode which output a
value for each function in the program.
The proposed wording did not make it into the C99 standard, and therefore this area remained outside the scope of the standard. Section 5.2.4.1 of C99 does list these limits:
The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:
127 nesting levels of blocks
63 nesting levels of conditional inclusion
12 pointer, array, and function declarators (in any combinations) modifying an arithmetic, structure, union, or incomplete type in a declaration
63 nesting levels of parenthesized declarators within a full declarator
63 nesting levels of parenthesized expressions within a full expression
63 significant initial characters in an internal identifier or a macro name (each universal character name or extended source character is considered a single character)
31 significant initial characters in an external identifier (each universal character name specifying a short identifier of 0000FFFF or less is considered 6 characters, each universal character name specifying a short identifier of 00010000 or more is considered 10 characters, and each extended source character is considered the same number of characters as the corresponding universal character name, if any)
4095 external identifiers in one translation unit
511 identifiers with block scope declared in one block
4095 macro identifiers simultaneously defined in one preprocessing translation unit
127 parameters in one function definition
127 arguments in one function call
127 parameters in one macro definition
127 arguments in one macro invocation
4095 characters in a logical source line
4095 characters in a character string literal or wide string literal (after concatenation)
65535 bytes in an object (in a hosted environment only)
15 nesting levels for #included files
1023 case labels for a switch statement (excluding those for any nested switch statements)
1023 members in a single structure or union
1023 enumeration constants in a single enumeration
63 levels of nested structure or union definitions in a single struct-declaration-list
In C++, Annex B indicates that the maximum size of an object is an implementation-specific finite number. That would tend to limit arrays with automatic storage class.
However, I'm not seeing something specifically for space accumulated by all automatic variables on the call stack, which is where a stack overflow ought to be triggered. I'm also not seeing a recursion limit in Annex B, although that would be closely related.
The C standard is silent on all issues relating to stack overflow. This is a bit strange since it's very vocal in just about every other corner of C programming. AFAIK there is no specification that a certain amount of automatic storage must be available, and no way of detecting or recovering from exhaustion of the space available for automatic storage. The abstract machine is assumed to have an unlimited amount of automatic storage.
I believe the behavior is undefined by omission -- if a 250,000,000-byte local object actually exceeds the implementation's capacity.
Quoting the 2011 ISO C standard, section 1 (Scope), paragraph 2:
This International Standard does not specify
[...]
- the size or complexity of a program and its data that will exceed the capacity of any
specific data-processing system or the capacity of a particular processor
So the standard explicitly acknowledges that a program may exceed the capacity of an implementation.
I think we can safely assume that a program that exceeds the capacity of an implementation is not required to behave the same way as one that does not; otherwise there would be no point in mentioning it.
Since nothing in the standard defines the behavior of such a program, the behavior is undefined. This is specified by section 4 (Conformance), paragraph 2:
[...] Undefined behavior is otherwise indicated in this
International Standard by the words "undefined behavior" or by the
omission of any explicit definition of behavior. There is no
difference in emphasis among these three; they all describe "behavior
that is undefined".
Of course an implementation on most modern computers could easily allocate 250 million bytes of memory; that's only a small fraction of the available RAM on the computer I'm typing this on, for example. But many operating systems place a fairly low limit on the amount of stack space that a program can allocate.
(Incidentally, I'm assuming that the code in the question is a fragment of some complete program that actually calls the function. As it stands, the code has no behavior, since there's no call to count_len, nor is there a main function. I might not normally mention that, but you did use the "language-lawyer" tag.)
Anyone who would argue that the behavior is not undefined should explain either (a) why having the program crash does not make the implementation non-conforming, or (b) how a program crash is within the scope of defined behavior (even if it's implementation-defined or unspecified).
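Whatever the standard says about the oversized version, the practical way to stay clear of the problem is to avoid the large automatic object altogether. A minimal sketch, assuming a C99/C++11 library: called with a null destination and size 0, snprintf writes nothing and returns the length the output would have had, so no buffer is needed at all. The function name is reused from the question only for illustration.

```cpp
#include <cassert>
#include <cstdio>

// Counts the characters needed to print val in decimal without any buffer.
// With a null destination and a size of 0, snprintf stores nothing and
// returns the number of characters the formatted output would contain.
inline int count_len(int val)
{
    return std::snprintf(nullptr, 0, "%d", val);
}
```

This uses only a few bytes of automatic storage, so the stack capacity of the implementation no longer matters for this function.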

Maximum number of cases that can be addressed using switch statement

This is out of curiosity. What is the maximum number of switch cases I can have in a single switch including the default: case. I mean like this:
switch(ch)
{
case 1:
//some statement
break;
case 2:
//some statement
break;
.
.
.
.
case n:
//some statement
break;
default:
//default statement
}
My question is: what is the maximum value that n can have here? Although this is not programmatically significant, I found it a rather intriguing thought. I searched some blogs and found a statement here.
From a doc I have, it is said that:
Standard C specifies that a switch can have at least 257 case
statements. Standard C++ recommends that at least 16,384 case
statements be supported! The real value must be implementation
dependent.
But I don't know how accurate this information is. Can somebody give me an idea? Also, what is meant by implementation dependent? Suppose there is a limit like this: can I somehow change it to a higher or lower value?
The draft C++ standard Annex B (informative) Implementation quantities says (emphasis mine):
Because computers are finite, C++ implementations are inevitably limited in the size of the programs they can successfully process. Every implementation shall document those limitations where known. [...]
The limits may constrain quantities that include those described below or others. The bracketed number following each quantity is recommended as the minimum for that quantity. However, these quantities are only guidelines and do not determine compliance.
and includes the following item:
— Case labels for a switch statement (excluding those for any nested switch statements) [16384].
but these are not hard limits, only recommended minimums.
The implementation is the compiler, the standard library, and the supporting tools, so implementation dependent here basically means that the compiler decides what the limit is, although it should document this limit. The draft standard defines implementation-defined behavior in section 1.3.10 as:
behavior, for a well-formed program construct and correct data, that depends on the implementation and that each implementation documents
We can see that gcc does not impose a limit for C:
GCC is only limited by available memory.
which should also cover C++ in this case and it looks like Visual Studio also does not place a limit:
Microsoft C does not limit the number of case values in a switch statement. The number is limited only by the available memory. ANSI C requires at least 257 case labels be allowed in a switch statement.
I cannot find similar documentation for clang.
Your question is tagged C++, so per C++98 Annex B/1:
Because computers are finite, C++ implementations are inevitably
limited in the size of the programs they can successfully process.
Every implementation shall document those limitations where known.
This documentation may cite fixed limits where they exist, say how to
compute variable limits as a function of available resources, or say
that fixed limits do not exist or are unknown.
And then Annex B/2:
The limits may constrain quantities that include those described below
or others. The bracketed number following each quantity is recommended
as the minimum for that quantity. However, these quantities are only
guidelines and do not determine compliance.
So as long as the implementation documents what it's doing, any maximum number of case labels is allowed. However, the standard recommends 16384 in the list that follows.
Per the C99 standard, section 5.2.4.1 Translation limits says:
The implementation shall be able to translate and execute at least one program that
contains at least one instance of every one of the following limits:
and includes the following line:
— 1023 case labels for a switch statement (excluding those for any nested switch
statements)
Per the C++98 standard, Annex B (informative) Implementation quantities says:
The limits may constrain quantities that include those described below
or others. The bracketed number following each quantity is recommended
as the minimum for that quantity. However, these quantities are only
guidelines and do not determine compliance.
— Case labels for a switch statement (excluding those for any nested
switch statements) [16 384].
In theory the max number of cases a switch statement can have depends on the data type of the variable you use:
data_type x
switch(x)
{
...
}
For char you have 256, for short you have 65536, and so on: the number of distinct values that data_type can represent.
However, the compiler has to generate code for the switch statement, and the code it usually generates is something like
cmp(R1,$value)
IFT jmp _subroutine
cmp(R1,$value2)
IFT jmp _subroutine2
...
The more cases you add, the higher the pressure on the registers and the larger the code size gets. Since memory and registers are not infinite, and the compiler is human-written there has to be a limit - and that is what is meant by implementation dependent. Each compiler can permit a different number of cases for a switch statement.
Implementation dependent means that the behaviour is not defined by the standard; it is the decision of the compiler. The C++ standard does not mandate a minimum for how many labels a switch statement shall support.
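As an illustration of a switch that uses every value of its controlling type, the sketch below generates 256 case labels with helper macros. The macro names and the `doubled` function are hypothetical. 256 labels is well below both the C99 minimum of 1023 and the 16384 recommended by the C++ Annex B, so any reasonable compiler should accept it.

```cpp
#include <cassert>

// Hypothetical helper macros that expand to 1, 4, 16, 64 and 256 labels.
#define CASE1(n)   case (n): return (n) * 2;
#define CASE4(n)   CASE1(n) CASE1((n) + 1) CASE1((n) + 2) CASE1((n) + 3)
#define CASE16(n)  CASE4(n) CASE4((n) + 4) CASE4((n) + 8) CASE4((n) + 12)
#define CASE64(n)  CASE16(n) CASE16((n) + 16) CASE16((n) + 32) CASE16((n) + 48)
#define CASE256(n) CASE64(n) CASE64((n) + 64) CASE64((n) + 128) CASE64((n) + 192)

// A switch with one case label for each value from 0 to 255.
inline int doubled(unsigned char x)
{
    switch (x) {
        CASE256(0)
    }
    return -1; // only reached if unsigned char is wider than 8 bits
}
```

Each label must be unique within the switch, which is why the range of the controlling type puts an upper bound on the number of useful labels.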

Determining the alignment of C/C++ structures in relation to its members

Can the alignment of a structure type be found if the alignments of the structure members are known?
Eg. for:
struct S
{
a_t a;
b_t b;
c_t c[];
};
is the alignment of S = max(alignment_of(a), alignment_of(b), alignment_of(c))?
Searching the internet I found that "for structured types the largest alignment requirement of any of its elements determines the alignment of the structure" (in What Every Programmer Should Know About Memory) but I couldn't find anything remotely similar in the standard (latest draft more exactly).
Edited:
Many thanks for all the answers, especially to Robert Gamble who provided a really good answer to the original question and the others who contributed.
In short:
To ensure alignment requirements for structure members, the alignment of a structure must be at least as strict as the alignment of its strictest member.
As for determining the alignment of a structure, a few options were presented, and with a bit of research this is what I found:
- c++ std::tr1::alignment_of
  - not standard yet, but close (technical report 1), should be in the C++0x
  - the following restrictions are present in the latest draft: Precondition: T shall be a complete type, a reference type, or an array of unknown bound, but shall not be a function type or (possibly cv-qualified) void.
    - this means that my presented use case with the C99 flexible array won't work (this is not that surprising since flexible arrays are not standard c++)
  - in the latest c++ draft it is defined in the terms of a new keyword - alignas (this has the same complete type requirement)
  - in my opinion, should c++ standard ever support C99 flexible arrays, the requirement could be relaxed (the alignment of the structure with the flexible array should not change based on the number of the array elements)
- c++ boost::alignment_of
  - mostly a tr1 replacement
  - seems to be specialized for void and returns 0 in that case (this is forbidden in the c++ draft)
  - Note from developers: strictly speaking you should only rely on the value of ALIGNOF(T) being a multiple of the true alignment of T, although in practice it does compute the correct value in all the cases we know about.
  - I don't know if this works with flexible arrays, it should (might not work in general, this resolves to compiler intrinsic on my platform so I don't know how it will behave in the general case)
- Andrew Top presented a simple template solution for calculating the alignment in the answers
  - this seems to be very close to what boost is doing (boost will additionally return the object size as the alignment if it is smaller than the calculated alignment as far as I can see) so probably the same notice applies
  - this works with flexible arrays
- use Windbg.exe to find out the alignment of a symbol
  - not compile time, compiler specific, didn't test it
- using offsetof on the anonymous structure containing the type
  - see the answers, not reliable, not portable with c++ non-POD
- compiler intrinsics, eg. MSVC __alignof
  - works with flexible arrays
  - alignof keyword is in the latest c++ draft
If we want to use the "standard" solution we're limited to std::tr1::alignment_of, but that won't work if you mix your c++ code with c99's flexible arrays.
As I see it there is only 1 solution - use the old struct hack:
struct S
{
a_t a;
b_t b;
c_t c[1]; // "has" more than 1 member, strictly speaking this is undefined behavior in both c and c++ when used this way
};
The diverging c and c++ standards and their growing differences are unfortunate in this case (and every other case).
Another interesting question is (if we can't find out the alignment of a structure in a portable way) what the strictest possible alignment requirement is. There are a couple of solutions I could find:
- boost (internally) uses a union of a variety of types and applies boost::alignment_of to it
- the latest c++ draft contains std::aligned_storage
  - The value of default-alignment shall be the most stringent alignment requirement for any C++ object type whose size is no greater than Len
  - so std::alignment_of< std::aligned_storage<BigEnoughNumber>>::value should give us the maximum alignment
  - draft only, not standard yet (if ever), tr1::aligned_storage does not have this property
Any thoughts on this would also be appreciated.
I have temporarily unchecked the accepted answer to get more visibility and input on the new sub-questions
There are two closely related concepts here:
- The alignment required by the processor to access a particular object
- The alignment that the compiler actually uses to place objects in memory
To ensure alignment requirements for structure members, the alignment of a structure must be at least as strict as the alignment of its strictest member. I don't think this is spelled out explicitly in the standard, but it can be inferred from the following facts (which are spelled out individually in the standard):
- Structures are allowed to have padding between their members (and at the end)
- Arrays are not allowed to have padding between their elements
- You can create an array of any structure type
If the alignment of a structure were not at least as strict as that of each of its members, you would not be able to create an array of structures, since some members of some elements would not be properly aligned.
Now the compiler must ensure a minimum alignment for the structure based on the alignment requirements of its members but it can also align objects in a stricter fashion than required, this is often done for performance reasons. For example, many modern processors will allow access to 32-bit integers in any alignment but accesses may be significantly slower if they are not aligned on a 4-byte boundary.
There is no portable way to determine the alignment enforced by the processor for any given type because this is not exposed by the language, although since the compiler obviously knows the alignment requirements of the target processor it could expose this information as an extension.
There is also no portable way (at least in C) to determine how a compiler will actually align an object although many compilers have options to provide some level of control over the alignment.
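Since C++11 (and C11, via _Alignof) the language does expose the alignment the compiler uses, so the inference above can be checked directly. The following is only a sketch: the structure `S` and the helper `alignment_of_S` are hypothetical, and the checks state only the "at least as strict" relation, not equality.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical structure with members of different alignment requirements.
struct S
{
    char  a;
    int   b;
    short c;
};

// The structure is at least as strictly aligned as each of its members...
static_assert(alignof(S) >= alignof(char),  "char member fits");
static_assert(alignof(S) >= alignof(int),   "int member fits");
static_assert(alignof(S) >= alignof(short), "short member fits");

// ...and its size is a multiple of its alignment, so every element of an
// array of S is correctly aligned without padding between elements.
static_assert(sizeof(S) % alignof(S) == 0, "arrays of S stay aligned");

// Hypothetical helper returning the alignment the compiler chose for S.
constexpr std::size_t alignment_of_S()
{
    return alignof(S);
}
```

On most platforms `alignment_of_S()` equals alignof(int) here, but the standard only guarantees the inequalities checked above.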
I wrote this type trait code to determine the alignment of any type (based on the compiler rules already discussed). You may find it useful:
template <class T>
class Traits
{
public:
    // A char followed by a T: the padding inserted before b reveals
    // the alignment the compiler uses for T.
    struct AlignmentFinder
    {
        char a;
        T b;
    };
    enum {AlignmentOf = sizeof(AlignmentFinder) - sizeof(T)};
};
So now you can go:
std::cout << "The alignment of structure S is: " << Traits<S>::AlignmentOf << std::endl;
The following macro will return the alignment requirement of any given type (even if it's a struct):
#define TYPE_ALIGNMENT( t ) offsetof( struct { char x; t test; }, test )
Note: I probably borrowed this idea from a Microsoft header at some point way back in my past...
Edit: as Robert Gamble points out in the comments, this macro is not guaranteed to work. In fact, it will certainly not work very well if the compiler is set to pack elements in structures. So if you decide to use it, use it with caution.
Some compilers have an extension that allows you obtain the alignment of a type (for example, starting with VS2002, MSVC has an __alignof() intrinsic). Those should be used when available.
As the others mentioned, it's implementation dependent. Visual Studio 2005 uses 8 bytes as the default structure alignment. Internally, items are aligned by their size: a float has 4-byte alignment, a double uses 8, etc.
You can override the behavior with #pragma pack. GCC (and most compilers) have similar compiler options or pragmas.
It is possible to assume a structure alignment if you know more details about the compiler options that are in use. For example, #pragma pack(1) will force alignment on the byte level for some compilers.
Side note: I know the question was about alignment, but a side issue is padding. For embedded programming, binary data, and so forth -- In general, don't assume anything about structure alignment if possible. Rather use explicit padding if necessary in the structures. I've had cases where it was impossible to duplicate the exact alignment used in one compiler to a compiler on a different platform without adding padding elements. It had to do with the alignment of structures inside of structures, so adding padding elements fixed it.
If you want to find this out for a particular case in Windows, open up windbg:
Windbg.exe -z \path\to\somemodule.dll -y \path\to\symbols
Then, run:
dt somemodule!CSomeType
I don't think memory layout is guaranteed in any way in any C standard. This is very much vendor- and architecture-dependent. There might be ways to do it that work in 90% of cases, but they are not standard.
I would be very glad to be proven wrong, though =)
I agree mostly with Paul Betts, Ryan and Dan. Really, it's up to the developer: you can either keep the default alignment semantics that Robert described (Robert's explanation is just the default behaviour and not by any means enforced or required), or you can set up whatever alignment you want with /Zp[##].
What this means is that if you have a typedef with floats, long doubles, uchars, etc., with various assortments of arrays included, and then have another type which has some of these oddly shaped members, a single byte, and then another odd member, it will simply be aligned at whatever preference the make/solution file defines.
As noted earlier, using windbg's dt command at runtime you can find out how the compiler laid out the structure in memory.
You can also use any pdb reading tool like dia2dump to extract this info from pdb's statically.
Modified from Peeter Joot's Blog
C structure alignment is based on the biggest size native type in the structure, at least generally (an exception is something like using a 64-bit integer on win32 where only 32-bit alignment is required).
If you have only chars and arrays of chars, once you add an int, that int will end up starting on a 4 byte boundary (with possible hidden padding before the int member). Additionally, if the structure isn’t a multiple of sizeof(int), hidden padding will be added at the end. Same thing for short and 64-bit types.
Example:
struct blah1 {
char x ;
char y[2] ;
};
sizeof(blah1) == 3
struct blah1plusShort {
char x ;
char y[2] ;
// <<< hidden one byte inserted by the compiler here
// <<< z will start on a 2 byte boundary (if beginning of struct is aligned).
short z ;
char w ;
// <<< hidden one byte tail pad inserted by the compiler.
// <<< the total struct size is a multiple of the biggest element.
// <<< This ensures alignment if used in an array.
};
sizeof(blah1plusShort) == 8
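The hidden padding described in the comments can be checked at compile time with offsetof and alignof. This is a sketch under the assumption of a C++11 compiler; `Blah1PlusShort` mirrors the structure above and `hidden_padding_bytes` is a hypothetical helper. The exact byte counts depend on the platform, so only the portable relations are asserted.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as blah1plusShort above.
struct Blah1PlusShort
{
    char  x;
    char  y[2];
    short z;
    char  w;
};

// z must start at an offset that is a multiple of its alignment.
static_assert(offsetof(Blah1PlusShort, z) % alignof(short) == 0,
              "z is aligned inside the struct");

// Tail padding makes the size a multiple of the struct's alignment,
// which keeps every element of an array aligned.
static_assert(sizeof(Blah1PlusShort) % alignof(Blah1PlusShort) == 0,
              "size is a multiple of the alignment");

// Hypothetical helper: bytes of padding the compiler inserted, computed as
// the struct size minus the sizes of its four char bytes and one short.
constexpr std::size_t hidden_padding_bytes()
{
    return sizeof(Blah1PlusShort) - (4 * sizeof(char) + sizeof(short));
}
```

On a typical platform with a 2-byte short this reports the two hidden bytes noted in the comments above, one before z and one after w.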
I read this answer after 8 years and I feel that the accepted answer from @Robert is generally right, but mathematically wrong.
To ensure alignment requirements for structure members, the alignment of a structure must be at least as strict as the least common multiple of the alignment of its members. Consider an odd example, where the alignment requirements of members are 4 and 10; in which case the alignment of the structure is LCM(4, 10) which is 20, and not 10. Of course, it is odd to see platforms with such alignment requirement which is not a power of 2, and thus for all practical cases, the structure alignment is equal to the maximum alignment of its members.
The reason for this is that, only if the address of the structure starts with the LCM of its member alignments, the alignment of all the members can be satisfied and the padding between the members and the end of the structure is independent of the start address.
Update: As pointed out by @chqrlie in the comments, the C standard does not allow such odd alignment values. However, this answer still shows why the structure alignment is the maximum of its member alignments: the maximum happens to be the least common multiple, and thus the members are always aligned relative to an address that is a common multiple.
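The arithmetic behind this argument is easy to check. The sketch below uses hypothetical helpers `gcd_of`, `lcm_of` and `lcm_equals_max_for_powers_of_two`; it confirms the LCM(4, 10) example and that for power-of-two alignments the least common multiple is simply the larger value.

```cpp
#include <cassert>
#include <cstddef>

// Euclid's algorithm, written recursively so it is valid C++11 constexpr.
constexpr std::size_t gcd_of(std::size_t a, std::size_t b)
{
    return b == 0 ? a : gcd_of(b, a % b);
}

// Least common multiple of two non-zero values.
constexpr std::size_t lcm_of(std::size_t a, std::size_t b)
{
    return a / gcd_of(a, b) * b;
}

// The hypothetical non-power-of-two case from the answer.
static_assert(lcm_of(4, 10) == 20, "LCM(4, 10) is 20, not 10");

// Checks that LCM equals max for every pair of powers of two up to
// 2^max_shift, which is the case for real alignment values.
inline bool lcm_equals_max_for_powers_of_two(unsigned max_shift)
{
    for (unsigned i = 0; i <= max_shift; ++i) {
        for (unsigned j = 0; j <= max_shift; ++j) {
            const std::size_t a = static_cast<std::size_t>(1) << i;
            const std::size_t b = static_cast<std::size_t>(1) << j;
            if (lcm_of(a, b) != (a > b ? a : b)) {
                return false;
            }
        }
    }
    return true;
}
```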