Is there a limit to the length of identifier names in C++? - c++

Is there a length limit to the names of variables in C++? What is it? Does this have anything to do with the "64/32-bitness" of the machine?
EDIT: Specifically, what is GCC's limit?

Section [lex.name] of the C++ standard says:
An identifier is an arbitrarily long sequence of letters and digits.
However, variable names that share a very large number of initial characters may not be treated as separate variables; the exact number of initial characters used is implementation-specific. Annex B says:
Because computers are finite, C++ implementations are inevitably limited in the size of the programs they can successfully process. Every implementation shall document those limitations where known. This documentation may cite fixed limits where they exist, say how to compute variable limits as a function of available resources, or say that fixed limits do not exist or are unknown.
The limits may constrain quantities that include those described below or others. The bracketed number following each quantity is recommended as the minimum for that quantity. However, these quantities are only guidelines and do not determine compliance.
For gcc, the limits are:
Preprocessor: no limit
C language: no limit
C++: Probably same as C, no separate limit documented. "Some choices are documented in the corresponding document for the C language"
Linker (controls external names linked across compilation units): Platform-specific, often unlimited
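As a quick sanity check of the "no limit" entries above, here is a minimal sketch with two local variables whose names share a prefix of about 100 characters and differ only in the final character. The function name `long_name_demo` and the identifiers are made up for illustration. If a compiler truncated the names, the second declaration would be rejected as a redeclaration.

```cpp
// Hypothetical example: both names share a ~100-character prefix and
// differ only in their last character ('a' versus 'b').
int long_name_demo() {
    int this_identifier_is_deliberately_far_longer_than_any_name_a_person_would_write_by_hand_in_real_code_a = 1;
    int this_identifier_is_deliberately_far_longer_than_any_name_a_person_would_write_by_hand_in_real_code_b = 2;

    // Combine the two values so the result shows they are distinct objects.
    return this_identifier_is_deliberately_far_longer_than_any_name_a_person_would_write_by_hand_in_real_code_a * 10
         + this_identifier_is_deliberately_far_longer_than_any_name_a_person_would_write_by_hand_in_real_code_b;
}
```

This only exercises local names. External names also pass through the linker, whose limits are platform-specific as noted above.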

In MS Visual Studio 2003–2012 the maximum length of an identifier is 2047 characters (per MSDN).

Related

How many variables can be in local scope

My question is pretty simple: how many variables can be in local scope to be properly translated?
I have to create a small translator (for studying purposes) from C++ to Assembly. During the translation process, there is a dynamic table of identifiers (variable names, in the simple case, I suppose). How many can there be?
I mean, my table is dynamic anyway as well, but I need to create an array of tokens where each has 2 numbers - a table ID and a record ID in the table. So I want to know, which type should these IDs be - int, short, long, etc?
How many variables can be in local scope
The C++ standard does not specify an exact maximum number.
It does have the following recommendation (quoted from the latest standard draft):
[implimits]
Because computers are finite, C++ implementations are inevitably limited in the size of the programs they can successfully process.
Every implementation shall document those limitations where known.
This documentation may cite fixed limits where they exist, say how to compute variable limits as a function of available resources, or say that fixed limits do not exist or are unknown.
The limits may constrain quantities that include those described below or others.
The bracketed number following each quantity is recommended as the minimum for that quantity.
However, these quantities are only guidelines and do not determine compliance.
Identifiers with block scope declared in one block ([basic.scope.block]) [1'024].
Someone wrote a test for this, and commonly used compilers appear to support at least 8k: https://github.com/fritzone/cpp-stresstest
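For the "which type should these IDs be" part of the question, one possible sketch is below. The `Token` layout, the field widths, and the helper `fits_in_record_id` are assumptions made for illustration, not something the standard prescribes. The point is only that a fixed-width unsigned type comfortably covers both the recommended minimum of 1024 and the roughly 8k observed in the stress test.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical token layout for the translator described in the question.
struct Token {
    std::uint16_t table_id;   // which identifier table the token refers to
    std::uint32_t record_id;  // which record inside that table
};

// Recommended minimum from Annex B for block-scope identifiers.
constexpr std::uint64_t kRecommendedBlockScopeIdentifiers = 1024;

// True when `count` records can be addressed by a 32-bit record ID.
constexpr bool fits_in_record_id(std::uint64_t count) {
    return count <= std::numeric_limits<std::uint32_t>::max();
}

static_assert(fits_in_record_id(kRecommendedBlockScopeIdentifiers),
              "record_id must cover the recommended minimum");
```

A 16-bit record ID would also cover 8k entries, but the wider type leaves headroom at little cost.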

What is the maximum number of dimensions allowed for an array, and why?

What is the maximum number of dimensions that you can use when declaring an array?
For example:
int main()
{
    int a[3][3][3][4][3];
    a[2][2][2][2][2] = 9;
}
So, how many dimensions can we declare on an array?
What is the limit?
And what is the reason behind it?
ISO/IEC 9899:2011 — C
In C, the C11 standard requires:
5.2.4.1 Translation limits
The implementation shall be able to translate and execute at least one program that
contains at least one instance of every one of the following limits:18)
…
12 pointer, array, and function declarators (in any combinations) modifying an
arithmetic, structure, union, or void type in a declaration.
…
18) Implementations should avoid imposing fixed translation limits whenever possible.
That means a standard-compliant compiler must allow at least 12 array dimensions on a simple type like int, but should avoid imposing any limit if at all possible. The C90 and C99 standards required the same limit.
ISO/IEC 14882:2011 — C++
For C++11, the equivalent information is:
Annex B (informative) Implementation quantities [implimits]
Because computers are finite, C++ implementations are inevitably limited in the size of the programs they
can successfully process. Every implementation shall document those limitations where known. This documentation
may cite fixed limits where they exist, say how to compute variable limits as a function of available
resources, or say that fixed limits do not exist or are unknown.
2 The limits may constrain quantities that include those described below or others. The bracketed number
following each quantity is recommended as the minimum for that quantity. However, these quantities are
only guidelines and do not determine compliance.
…
Pointer, array, and function declarators (in any combination) modifying a class, arithmetic, or incomplete
type in a declaration [256].
…
Thus, in C++, the recommendation is that you should be able to use at least 256 dimensions in an array declaration.
Note that even after you've got the compiler to accept your code, there will ultimately be limits imposed by the memory on the machine where the code is run. The standards specify the minimum number of dimensions that the compiler must allow (over-specify in the C++ standard; the mind boggles at the thought of a 256-dimensional array). The intention is that you shouldn't run into a problem — use as many dimensions as you need. (Can you imagine working with the source code for a 64-dimensional array, let alone anything more — the individual expressions in the source would be horrid to behold, let alone write, read, modify.)
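To make the 12-declarator figure concrete, here is a small sketch that declares a 12-dimensional array of int and checks its rank at compile time. The names `Grid12` and `touch_last_element` are made up for this example. Each extent is 2, so the whole array has 2^12 = 4096 elements.

```cpp
#include <type_traits>

// Twelve array declarators modifying int: the minimum that C11 5.2.4.1
// asks a C compiler to handle, and well below the 256 recommended for C++.
using Grid12 = int[2][2][2][2][2][2][2][2][2][2][2][2];

static_assert(std::rank<Grid12>::value == 12, "expected twelve dimensions");
static_assert(sizeof(Grid12) == 4096 * sizeof(int), "expected 2^12 elements");

// Writes to and reads back the last element of the array.
int touch_last_element() {
    static Grid12 grid = {};  // static storage, zero-initialized
    grid[1][1][1][1][1][1][1][1][1][1][1][1] = 9;
    return grid[1][1][1][1][1][1][1][1][1][1][1][1];
}
```

Even at this modest rank the indexing expression is already awkward to read, which illustrates the readability concern raised above.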
It is not hard to see that, in practice, the number of dimensions is limited only by the amount of memory your machine has. You can declare a 100-dimensional (or n-dimensional) array as well. [1]
Note: accessing memory out of bounds is undefined behavior.
[1] The standard specifies a minimum limit of 12 for C and a recommended minimum of 256 for C++11. (This information was added after a discussion with Jonathan Leffler. My earlier answer only pointed out the maximum limit, which is constrained by machine memory.)
The maximum also depends on the stack size. For example, if the stack size is 1 MB, a local array int a[xx][xx][xx][xx][xx] must be smaller than 1 MB.
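A rough sketch of the memory point: the footprint of a multi-dimensional array is the product of its extents times the element size, so it grows very quickly. The extent of 16 and the names `kExtent`, `flat_index`, and `heap_backed_demo` are assumptions chosen for illustration. With a 4-byte int, five dimensions of 16 already need 4 MiB, more than the 1 MB stack in the example above, so this sketch places the data on the heap and computes the offset by hand.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t kExtent = 16;  // illustrative extent per dimension
constexpr std::size_t kElements = kExtent * kExtent * kExtent * kExtent * kExtent;

// The built-in array type has exactly this many elements.
static_assert(sizeof(int[kExtent][kExtent][kExtent][kExtent][kExtent]) ==
                  kElements * sizeof(int),
              "footprint is the product of the extents times sizeof(int)");

// Row-major offset, the same element order a built-in array uses.
constexpr std::size_t flat_index(std::size_t i, std::size_t j, std::size_t k,
                                 std::size_t l, std::size_t m) {
    return (((i * kExtent + j) * kExtent + k) * kExtent + l) * kExtent + m;
}

// Stores the elements in a heap-allocated vector instead of on the stack.
int heap_backed_demo() {
    std::vector<int> a(kElements, 0);
    a[flat_index(2, 2, 2, 2, 2)] = 9;
    return a[flat_index(2, 2, 2, 2, 2)];
}
```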

Maximum number of arguments

I came across some Fortran 90 code where 68 arguments are passed to a function.
Upon searching the web I only found something about a limit of passing 256 bytes for some CUDA Fortran related stuff (http://www.pgroup.com/userforum/viewtopic.php?t=2235&sid=f241ca3fd406ef89d0ba08a361acd962).
So I wonder: is there a limit to the number of arguments that may be passed to a function for Intel/Visual/GNU fortran compilers?
I came across this discussion of the Fortran 90 standards:
http://www.nag.co.uk/sc22wg5/Guidelines_for_Bindings-b.html
The relevant section is in italics below:
3.4 Statements
Constraints on the length of a Fortran statement impose an upper limit on the length of a procedure call. Although it is unlikely to be a serious imposition, a single Fortran statement in free-form format is subject to an upper limit of 5241 characters, including a possible label (F90 sections 3.3.1, 3.3.1.4); in fixed-form format the upper limit is 1320 characters plus a possible label (F90 section 3.3.2). These limits are subject to the characters being of "default kind", that is in practice single-byte characters; for double-byte characters, used for example for ideographic languages, the limits are processor-dependent but are likely to be of the same order of size.
There is no limit in the Fortran Standard on the maximum number of arguments to a procedure.
I have a security tool that I am developing in my research that analyzes binaries. I knew that the C language has limits on number of arguments (31 in the C90 standard, 127 in the C99 standard), so I thought that I could dimension a vector to hold 128 items pertaining to incoming arguments. I encountered a FORTRAN-derived binary that had 290 arguments passed, which led me to this discussion. The binary is from the SPEC CPU2006 benchmark suite, benchmark 481.wrf, where I see a procedure that is named (in the binary) "solve_interface_" which sets up 290 arguments on the stack and then calls "solve_em_" which actually processes these arguments. You can no doubt find the FORTRAN source code for these procedures online. The binary was produced by the GNU compiler tools for an x86/Linux/ELF system.
I don't believe that the Fortran standards explicitly impose such a limit. However, they do place a limit on the length of a line of code (132 characters) and the number of lines which can together form a single statement (256). I'll leave it to you to figure out how many arguments you could use in a single call to a routine.
Many compilers on the market have more relaxed limits on both statement length and the number of continuation lines that can be used. However, it would not surprise me if a compiler did impose a maximum number of arguments for any routine but the number is probably higher than any realistic need requires.
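For the exercise left to the reader, a back-of-the-envelope estimate can be written as compile-time arithmetic (in C++ here, only to do the sums). It uses the 132-character and 256-line figures quoted above and adds several assumptions of its own: every continued line spends one character on a trailing `&`, the call is spelled `call s(...)`, and every actual argument is a single character. The result is a rough upper bound, not a statement about any particular compiler.

```cpp
// Figures quoted in the answer above.
constexpr long kMaxLineLength = 132;         // characters per line
constexpr long kMaxLinesPerStatement = 256;  // lines forming one statement
constexpr long kMaxStatementChars = kMaxLineLength * kMaxLinesPerStatement;

// Assumption: each line except the last ends with a '&' continuation marker.
constexpr long kContinuationMarkers = kMaxLinesPerStatement - 1;

// Assumption: "call s(" takes 7 characters and the closing ")" takes 1.
constexpr long kCallOverhead = 8;

constexpr long kUsableChars =
    kMaxStatementChars - kContinuationMarkers - kCallOverhead;

// n one-character arguments need n - 1 commas, so 2n - 1 characters in total.
constexpr long kMaxOneCharArguments = (kUsableChars + 1) / 2;

static_assert(kMaxOneCharArguments > 290,
              "the 290-argument call mentioned above fits within this bound");
```

Realistic argument names are longer than one character, so the practical ceiling is lower, but it stays well above the 68 and 290 arguments mentioned in this thread.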

Maximum number of cases that can be addressed using switch statement

This is out of curiosity. What is the maximum number of switch cases I can have in a single switch, including the default: case? I mean like this:
switch(ch)
{
case 1:
//some statement
break;
case 2:
//some statement
break;
.
.
.
.
case n:
//some statement
break;
default:
//default statement
}
My question is what is the maximum value that we can have here? Although this is not programmatically significant, I found this a rather intriguing thought. I searched some blogs and found a statement here.
From a doc I have, it is said that:
Standard C specifies that a switch can have at least 257 case
statements. Standard C++ recommends that at least 16,384 case
statements be supported! The real value must be implementation
dependent.
But I don't know how accurate this information is, can somebody give me an idea? Also what does it mean by implementation dependent? Suppose there is a limit like this, can I somehow change it to a higher or lower value?
The draft C++ standard Annex B (informative) Implementation quantities says (emphasis mine):
Because computers are finite, C++ implementations are inevitably limited in the size of the programs they can successfully process. Every implementation shall document those limitations where known. [...]
The limits may constrain quantities that include those described below or others. The bracketed number following each quantity is recommended as the minimum for that quantity. However, these quantities are only guidelines and do not determine compliance.
and includes the following item:
— Case labels for a switch statement (excluding those for any nested switch statements) [16384].
But these are not hard limits, only recommended minimums.
The implementation is the compiler, standard library, and supporting tools, so "implementation dependent" here basically means that the compiler decides what the limit is, but it should document that limit. The draft standard defines implementation-defined behavior in section 1.3.10 as:
behavior, for a well-formed program construct and correct data, that depends on the implementation and that each implementation documents
We can see that gcc does not impose a limit for C:
GCC is only limited by available memory.
which should also cover C++ in this case and it looks like Visual Studio also does not place a limit:
Microsoft C does not limit the number of case values in a switch statement. The number is limited only by the available memory. ANSI C requires at least 257 case labels be allowed in a switch statement.
I can not find similar documentation for clang.
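If you want to probe a compiler yourself, one way is to generate the case labels with the preprocessor. The sketch below builds a switch with 1024 case labels plus default, which is above the 257 that ANSI C requires and below the 16384 that C++ recommends. The macro names and the function `twice_if_small` are made up for this example; whether a given compiler accepts larger counts is something to test rather than assume.

```cpp
// Each macro doubles the number of generated case labels.
#define CASE_1(n)    case (n): return 2 * (n);
#define CASE_2(n)    CASE_1(n)   CASE_1((n) + 1)
#define CASE_4(n)    CASE_2(n)   CASE_2((n) + 2)
#define CASE_8(n)    CASE_4(n)   CASE_4((n) + 4)
#define CASE_16(n)   CASE_8(n)   CASE_8((n) + 8)
#define CASE_32(n)   CASE_16(n)  CASE_16((n) + 16)
#define CASE_64(n)   CASE_32(n)  CASE_32((n) + 32)
#define CASE_128(n)  CASE_64(n)  CASE_64((n) + 64)
#define CASE_256(n)  CASE_128(n) CASE_128((n) + 128)
#define CASE_512(n)  CASE_256(n) CASE_256((n) + 256)
#define CASE_1024(n) CASE_512(n) CASE_512((n) + 512)

// Returns 2 * x for x in [0, 1023] and -1 for anything else.
int twice_if_small(int x) {
    switch (x) {
        CASE_1024(0)  // expands to case labels 0 through 1023
        default: return -1;
    }
}
```

Adding further doubling macros in the same pattern is a simple way to push the count toward 16384 and see where a particular toolchain gives up, if it does.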
Your question is tagged C++, so per C++98 Annex B/1:
Because computers are finite, C++ implementations are inevitably
limited in the size of the programs they can successfully process.
Every implementation shall document those limitations where known.
This documentation may cite fixed limits where they exist, say how to
compute variable limits as a function of available resources, or say
that fixed limits do not exist or are unknown.
And then Annex B/2:
The limits may constrain quantities that include those described below
or others. The bracketed number following each quantity is recommended
as the minimum for that quantity. However, these quantities are only
guidelines and do not determine compliance.
So as long as the implementation documents what it's doing, any maximum number of case statements is allowed. However, the standard recommends 16384 in the list that follows.
In the C99 standard, section 5.2.4.1 (Translation limits) says:
The implementation shall be able to translate and execute at least one program that
contains at least one instance of every one of the following limits:13)
and includes the following line:
— 1023 case labels for a switch statement (excluding those for any nested switch
statements)
In the C++98 standard, Annex B (informative) Implementation quantities says:
The limits may constrain quantities that include those described below
or others. The bracketed number following each quantity is recommended
as the minimum for that quantity. However, these quantities are only
guidelines and do not determine compliance.
— Case labels for a switch statement (excluding those for any nested
switch statements) [16 384].
In theory the max number of cases a switch statement can have depends on the data type of the variable you use:
data_type x
switch(x)
{
...
}
For char you have 256, for short you have 65536, and so on (assuming an 8-bit char and a 16-bit short): the maximum number of values that data_type can represent.
However, the compiler has to generate code for this switch statement, and the code it usually generates is something like
cmp(R1,$value)
IFT jmp _subroutine
cmp(R1,$value2)
IFT jmp _subroutine2
...
The more cases you add, the higher the pressure on the registers and the larger the code size gets. Since memory and registers are not infinite, and the compiler is written by humans, there has to be a limit, and that is what is meant by implementation dependent. Each compiler can permit a different number of cases for a switch statement.
Implementation dependent means the behaviour is not defined by the standard; it is the decision of the compiler. The C++ standard does not mandate a minimum number of labels that a switch statement must support.

C/C++ Control Structure Limitations?

I have heard of a limitation in VC++ (not sure which version) on the number of nested if statements (somewhere in the ballpark of 300). The code was of the form:
if (a) ...
else if (b) ...
else if (c) ...
...
I was surprised to find out there is a limit to this sort of thing, and that the limit is so small. I'm not looking for comments about coding practice and why to avoid this sort of thing altogether.
Here's a list of things that I'd imagine could have some limitation:
Number of functions in a scope (global, class, or namespace).
Number of expressions in a single statement (e.g., compound conditionals).
Number of cases in a switch.
Number of parameters to a function.
Number of classes in a single hierarchy (either inheritance or containment).
What other control structures/language features have limits such as this? Do the language standards say anything about these limits (perhaps minimum requirements for an implementation)? Has anyone run into a particular language limitation like this with a particular compiler/implementation?
EDIT: Please note that the above form of if statements is indeed "nested." It is equivalent to:
if (a) { //...
}
else {
if (b) { //...
}
else {
if (c) { //...
}
else { //...
}
}
}
Visual C++ Compiler Limits
The C++ standard recommends limits for various language constructs. The following is a list of constructs where the Visual C++ compiler does not implement the recommended limits. The first number is the recommended limit and the second number is the limit implemented by Visual C++:
Nesting levels of compound statements, iteration control structures, and selection control structures [256] (256).
Parameters in one macro definition [256] (127).
Arguments in one macro invocation [256] (127).
Characters in a character string literal or wide string literal (after concatenation) [65536] (65535).
Levels of nested class, structure, or union definitions in a single struct-declaration-list [256] (16).
Member initializers in a constructor definition [6144] (approximately 600, memory dependent, can increase with the /Zm compiler option).
Scope qualifications of one identifier [256] (127).
Nested external specifications [1024] (10).
Template arguments in a template declaration [1024] (64).
Do the language standards say anything about these limits (perhaps minimum requirements for an implementation)?
No, the standard sets no minimum limits on this. But it is good practice for an implementation to set and document a hard limit on such things, rather than fail in some unknown way when the limit is exceeded.
Edit: The standard recommends some minimum limits in Annex B. There are really too many to post here, and they are in any case advisory:
The limits may constrain quantities that include those described below or others. The bracketed number following each quantity is recommended as the minimum for that quantity. However, these quantities are only guidelines and do not determine compliance.
C specifies that implementations must be able to translate a program that contains an instance of each of a number of limits. The first limit is that of 127 nesting levels of blocks. (5.2.4.1 of ISO/IEC 9899:1999)
C doesn't say that any valid program that contains no more than 127 nesting levels must be translated; it could be unreasonably large in other ways. The rationale was to set some level of expectation that portable programs can have, while allowing latitude not to exclude small implementations and implementations targeting small systems.
In short, if you want more than 127 nesting levels it probably means that you should consult your implementation's documentation to see if it guarantees to support a larger number.
Just to put the whole scoping thing to bed, the following is legal C++ code:
int main() {
if ( int x = 1 ) {
}
else if ( int x = 2 ) {
}
}
which it would not be if both the if and the else if were at the same scope. I think there have been a lot of misunderstandings, perhaps engendered by my comment:
The compiler cares a great deal about scope.
which is of course true, but maybe not helpful in this situation.
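A slightly extended sketch of the same point: a variable declared in an if condition remains in scope in the else branch, so the else if really is nested inside the first statement. The function `probe` and its parameters are hypothetical names used only for this illustration.

```cpp
// Demonstrates that `x`, declared in the first condition, is still visible
// inside the `else if` and the final `else`.
int probe(int a, int b) {
    if (int x = a) {
        return x;                  // taken when a is non-zero
    } else if (int y = x + b) {    // x is in scope here and equals 0
        return 100 + y;            // taken when a == 0 and b != 0
    } else {
        return -1;                 // x and y are both in scope and both 0
    }
}
```

For instance, probe(7, 1) takes the first branch, probe(0, 3) takes the second, and probe(0, 0) falls through to the last.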