Is static deprecated when ensuring availability between translation units? - c++

From the following stackoverflow answer, the user says:
It means that the variable is local to a translation unit (simply put,
to a single source file), and cannot be accessed from outside it. This
use of static is in fact deprecated in the current C++ Standard -
instead you are supposed to use anonymous namespaces:
static int x = 0;
should be:
namespace {
int x = 0;
}
I don't disagree that anonymous namespaces are the preferred method,
but is using static really deprecated now?
Where does the standard say this?

No, it is not currently deprecated. It was at one point, but this was reversed due to C compatibility issues. At some point before 1999 it was deprecated, and this led to defect report 174, which says:
The decision to deprecate global static should be reversed.
We cannot deprecate static because it is an important part of C and to abandon it would make C++ unnecessarily incompatible with C.
Because templates may be instantiated on members of unnamed namespaces, some compilation systems may place such symbols in the
global linker space, which could place a significant burden on the
linker. Without static, programmers have no mechanism to avoid the
burden.
This led to defect report 223, in which the meaning of deprecation was revised from:
deprecated is defined as: Normative for the current edition of the Standard, but not guaranteed to be part of the Standard in future revisions.
It was noted that this implies that only non-deprecated features will be supported in future standards:
However, this definition would appear to say that any non-deprecated feature is "guaranteed to be part of the Standard in future revisions." It's not clear that that implication was intended, so this definition may need to be amended.
and changed the meaning of deprecated to:
These are deprecated features, where deprecated is defined as: Normative for the current edition of the Standard, but having been identified as a candidate for removal from future revisions.
and later the feature was undeprecated due to C compatibility issues by defect report 1012:
Although 7.3.1.1 [namespace.unnamed] states that the use of the static keyword for declaring variables in namespace scope is deprecated because the unnamed namespace provides a superior alternative, it is unlikely that the feature will be removed at any point in the foreseeable future, especially in light of C compatibility concerns. The Committee should consider removing the deprecation.
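As a minimal sketch of the two spellings discussed above (all names here are made up for illustration), both declarations below give the variable internal linkage in C++11 and later, so neither is visible from another translation unit:

```cpp
// Two ways to keep a namespace-scope variable private to this translation
// unit. hits_static, hits_unnamed and record_hit are hypothetical names.

static int hits_static = 0;   // internal linkage via static (not deprecated)

namespace {
int hits_unnamed = 0;         // internal linkage via an unnamed namespace
}

// Increments both counters and returns their sum.
int record_hit() {
    ++hits_static;
    ++hits_unnamed;
    return hits_static + hits_unnamed;
}
```

Another source file in the same program could define its own hits_static or hits_unnamed without a clash, which is the property both spellings are meant to provide.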

Related

What's the meaning of "reserved for any use"?

NOTE: This is a c question, though I added c++ in case some C++ expert can provide a rationale or historical reason why C++ is using a different wording than C.
In the C standard library specification, we have this normative text, C17 7.1.3 Reserved identifiers (emphasis mine):
All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.
Now I keep reading answers on SO by various esteemed C experts, where they claim it is fine for a compiler or standard library to use identifiers with underscore + uppercase, or double underscore.
Doesn't "reserved for any use" mean reserved for anyone except future extensions to the C language itself? Meaning that the implementation is not allowed to use them.
While the second phrase above, regarding single leading underscore seems to be directed to the implementation?
In general, the C standard is written in a way that expects compiler vendors/library implementers to be the typical reader - not so much the application programmers.
Notably, C++ has a very different wording:
Each name that contains a double underscore (__) or begins with an underscore followed by an uppercase letter (2.11) is reserved to the implementation for any use.
(See What are the rules about using an underscore in a C++ identifier?)
Is this perhaps a mix-up between C and C++ and the languages are different here?
In the C standard, the meaning of the term "reserved" is defined by 7.1.3p2, immediately below the bullet list you are quoting:
No other identifiers are reserved. If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.
Emphasis mine: reserved identifiers place a restriction on the program, not the implementation. Thus, the common interpretation – reserved identifiers may be used by the implementation to any purpose – is correct for C.
I have not kept up with the C++ standard and no longer feel qualified to interpret it.
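To make the restriction on the program concrete, here is a small sketch of an include guard, a place where reserved names are often used by accident. The header name, guard name and function are hypothetical; the code is written so that it is acceptable to both a C and a C++ compiler.

```cpp
// widget.h (hypothetical header)
// A guard spelled _WIDGET_H or __WIDGET_H__ would be a reserved identifier,
// and defining a reserved identifier as a macro is undefined behavior
// under C 7.1.3.
#ifndef MYLIB_WIDGET_H
#define MYLIB_WIDGET_H

/* static inline keeps the definition usable from both C and C++ sources. */
static inline int widget_area(int width, int height) {
    return width * height;
}

#endif /* MYLIB_WIDGET_H */
```

A project-specific prefix such as MYLIB_ avoids the reserved forms and also makes clashes with other libraries less likely.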
While the Standard is primarily written to guide implementers, it is written as a description of what makes a program well-formed, and what its effect is. That's because the basic definition of a standards-conforming compiler is one that does the correct thing for any standards-conforming program:
A strictly conforming program shall use only those features of the language and library
specified in this International Standard....A conforming
hosted implementation shall accept any strictly conforming program.
Read separately, this is hugely restrictive of extensions to a compiler. For instance, based solely on that clause, a compiler shouldn't get to define any of its own reserved words. After all, any given word a particular compiler might want to reserve, could nevertheless show up in a strictly conforming program, forcing the compiler's hand.
The standard goes on, however:
A conforming implementation may have extensions (including additional
library functions), provided they do not alter the behavior of any strictly conforming
program.
That's the key piece. Compiler extensions need to be written in such a way that they affect nonconforming programs (ones which contain undefined behavior, or which shouldn't even compile at all), allowing them to compile and do fun extra things.
So the purpose of defining "reserved identifiers", when the language doesn't actually need those identifiers for anything, is to give implementations some extra wiggle room by providing them with some things which make a program nonconforming. The reason a compiler can recognize, say, __declspec as part of a declaration is because putting __declspec into a declaration is otherwise illegal, so the compiler is allowed to do whatever it wants!
The importance of "reserved for any use", therefore, is that it leaves no question about a compiler's power to treat such identifiers as having any meaning it cares to. Future compatibility is a comparatively distant concern.
The C++ standard works in a similar way, though it's a bit more explicit about the gambit:
A conforming implementation may have extensions (including additional library functions), provided they do
not alter the behavior of any well-formed program. Implementations are required to diagnose programs that
use such extensions that are ill-formed according to this International Standard. Having done so, however,
they can compile and execute such programs.
I suspect the difference in wording is down to the C++ standard just being clearer about how extensions are meant to work. Nevertheless, nothing in the C standard precludes an implementation from doing the same thing. (And we all basically ignore the requirement that the compiler warn you every time you use __declspec.)
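A rough sketch of how portable code typically takes advantage of such extensions: it tests the predefined macros that each compiler documents and only then uses that compiler's reserved-name keyword. MY_NOINLINE and add_one are hypothetical names; _MSC_VER, __GNUC__, __declspec(noinline) and __attribute__((noinline)) are documented by their respective compilers.

```cpp
// Select a compiler-specific extension through a project-owned macro.
#if defined(_MSC_VER)
#define MY_NOINLINE __declspec(noinline)
#elif defined(__GNUC__)
#define MY_NOINLINE __attribute__((noinline))
#else
#define MY_NOINLINE /* no known extension: expand to nothing */
#endif

// The function behaves the same everywhere; only the inlining hint differs.
MY_NOINLINE int add_one(int x) {
    return x + 1;
}
```

The program itself never declares or defines a reserved identifier here; it only uses names the implementation has chosen to document.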
Regarding the difference in wording in C versus C++, I'm posting my own little research here as reference:
The early K&R C 1st edition has this text:
...names which are intended for use only by functions of the library begin with an underscore so they are less likely to collide with names in a user's program.
K&R 2nd edition added an Appendix B which addresses the standard library, where we can read
External identifiers that begin with an underscore are reserved for use by the library, as are all
other identifiers that begin with an underscore and an upper-case letter or another underscore.
Early ANSI C drafts, as well as "C90" ISO 9899:1990, have the same text as the current ISO standard.
The earliest C++ drafts, however, have a different text, as noted by @hvd, possibly a clarification of the C standard. From DRAFT: 20 September 1994:
17.3.3.1.2 Global names
...
Each name that begins with an underscore and either an uppercase letter or another underscore (2.8) is
reserved to the implementation for any use
So apparently the wording "reserved for any use" was invented by the ANSI/ISO C90 committee, whereas the C++ committee some years later used a clearer wording, similar to the wording in the pre-standard K&R book.
The C99 rationale V5.10 says this below 7.1.3:
Also reserved for the implementor are all external identifiers beginning with an underscore, and
all other identifiers beginning with an underscore followed by a capital letter or an underscore.
This gives a name space for writing the numerous behind-the-scenes non-external macros and
functions a library needs to do its job properly.
This makes the committee's intention quite clear: "reserved for any use" means "reserved for the implementor".
Also of note, the current C standard has the following normative text elsewhere, in 6.2.5:
There may also be
implementation-defined extended signed integer types. 38)
where the informative footnote 38 says:
Implementation-defined keywords shall have the form of an identifier reserved for any use as
described in 7.1.3.
C has multiple contexts in which a symbol can have a definition:
The space of macro names,
The space of formal names of arguments to a macro (this space is specific to each function-like macro),
The space of ordinary identifiers,
The space of tag names,
The space of labels (this space is specific to each function), and
The space of structure/union members (this space is specific to each struct/union).
What "reserved for any use" means is that the user code in a compliant program cannot use[1] symbols that start with an underscore followed by an uppercase letter or another underscore in any of the above contexts. Compare this with identifiers that start with a single underscore but are followed by a lowercase letter or a digit. These fall into the second class of identifiers that start with an underscore. User code can use these identifiers as the names of macro arguments, as labels, or as the names of structure/union members.
"Reserved for any use" does not mean that the implementation cannot use such symbols. The intent of the reservation is to provide a name space that implementations can freely use without concern that the names defined by the implementation will conflict with the names defined by the user code in a compliant program.
[1] The standard does not quite mean "cannot use". The standard encourages the programmatic use of a small number of names that start with a double underscore. For example, a compliant implementation is required to define __STDC_VERSION__, __FILE__, __LINE__, and __func__. The 2011 version of the standard even gives an example of a presumably compliant program that references __func__.
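A small sketch of the second class described above, following this answer's reading of 7.1.3: identifiers with a single leading underscore and a lowercase letter, used only in name spaces other than file scope. All names are hypothetical, and the snippet is written to be acceptable to a C or a C++ compiler.

```cpp
/* _v lives in the macro parameter name space, not at file scope. */
#define SQUARE(_v) ((_v) * (_v))

/* _x and _y live in the member name space of struct point. */
struct point {
    int _x;
    int _y;
};

/* The file-scope names (point, squared_length) avoid leading underscores. */
static int squared_length(struct point p) {
    return SQUARE(p._x) + SQUARE(p._y);
}
```

Renaming the structure to _point or the function to _squared_length would move those names to file scope, where a leading underscore is reserved.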
The C Standard allows implementations to attach any meaning they see fit to reserved identifiers. Most implementations will treat unrecognized identifiers of reserved forms the same as any other recognized identifiers when there is no reason to do otherwise, thus allowing something like:
#ifdef __ACME_COMPILER
#define near __near
#else
#define near
#endif
int near foo;
to declare an identifier foo using a __near qualifier if the code is being processed in an Acme compiler (which would presumably support such a thing), but also be compatible with other compilers that would not require or benefit from the use of such a directive. Nothing would forbid a conforming implementation from defining __ACME_COMPILER and interpreting __near to mean "launch nuclear missiles", but a quality implementation shouldn't go out of its way to break code like the above. If an implementation doesn't know what __ACME_COMPILER is supposed to mean, treating it like any other unknown identifier would allow it to support useful constructs like the above.
This is months late, but one point remains that the others have not addressed.
Your question can be viewed from the opposite direction. The standard allows the implementation (as you have observed) to use a symbol like _Foo but, more importantly, thereby forbids the implementation from using foo. The latter is reserved for your use.
To understand, for discussion's sake, suppose that a future C standard introduced the new keyword _Foo. The hypothetical implementation was already using this symbol, so what happens?
Answer:
At first, the implementation will not yet have implemented the new standard. Until implemented, the new standard lacks practical effect.
Later, as part of implementing the new standard, the implementation quietly changes each _Foo to _Bar.
No problem.
In fact, if you think about it in this manner, you can say that the way the standard reserves such words is almost the only way it could reserve them.

Is it definitely illegal to refer to a reserved name?

On the std-proposals list, the following code was given:
#include <vector>
#include <algorithm>
#include <iostream>
void foo(const std::vector<int> &v) {
#ifndef _ALGORITHM
std::for_each(v.begin(), v.end(), [](int i){ std::cout << i; });
#endif
}
Let's ignore, for the purposes of this question, why that code was given and why it was written that way (as there was a good reason but it's irrelevant here). It supposes that _ALGORITHM is a header guard inside the standard header <algorithm> as shipped with some known standard library implementation. There is no inherent intention of portability here.
Now, _ALGORITHM would of course be a reserved name, per:
[C++11: 2.11/3]: In addition, some identifiers are reserved for use by C++ implementations and standard libraries (17.6.4.3.2) and shall not be used otherwise; no diagnostic is required.
[C++11: 17.6.4.3.2/1]: Certain sets of names and function signatures are always reserved to the implementation:
Each name that contains a double underscore __ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.
Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
I was always under the impression that the intent of this passage was to prevent programmers from defining/mutating/undefining names that fall under the above criteria, so that the standard library implementors may use such names without any fear of conflicts with client code.
But, on the std-proposals list, it was claimed that this code is itself ill-formed for merely referring to such a reserved name. I can now see how the use of the phrase "shall not be used otherwise" from [C++11: 2.11/3] may indeed suggest that.
One practical rationale given was that the macro _ALGORITHM could expand to some code that wipes your hard drive, for example. However, taking into account the likely intention of the rule, I'd say that such an eventuality has more to do with the obvious implementation-defined* nature of the _ALGORITHM name, and less to do with it being outright illegal to refer to it.
* "implementation-defined" in its English language sense, not the C++ standard sense of the phrase
I'd say that, as long as we're happy that we are going to have implementation-defined results and that we should investigate what that macro means on our implementation (if it exists at all!), it should not be inherently illegal to refer to such a macro provided we do not attempt to modify it.
For example, code such as the following is used all over the place to distinguish between code compiled as C and code compiled as C++:
#ifdef __cplusplus
extern "C" {
#endif
and I've never heard a complaint about that.
So, what do you think? Does "shall not be used otherwise" include simply writing such a name? Or is it probably not intended to be so strict (which may point to an opportunity to adjust the standard wording)?
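For reference, the __cplusplus fragment above is normally paired with a matching closing guard. A minimal sketch of the complete pattern, with c_style_add as a hypothetical function used only for illustration:

```cpp
#ifdef __cplusplus
extern "C" {          // seen only when compiled as C++
#endif

int c_style_add(int a, int b);

#ifdef __cplusplus
}                     // closes extern "C"
#endif

// The definition keeps the C language linkage given by the declaration above.
int c_style_add(int a, int b) {
    return a + b;
}
```

The program only tests __cplusplus here; it never defines, undefines or redeclares it.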
Whether it's legal or not is implementation-specific (and identifier-specific).
When the Standard gives the implementation the sole right to use these names, that includes the right to make the names available in user code. If an implementation does so, great.
But if an implementation doesn't expressly give you the right, it is clear from "shall not be used otherwise" that the Standard does not, and you have undefined behavior.
The important part is "reserved to the implementation". It means that the compiler vendor may use those names and even document them. Your code may then use those names as documented. This is often used for extensions like __builtin_expect, where the compiler vendor avoids any clash with your identifiers (that are declared by your code) by using those reserved names. Even the standard uses them for things like __has_include to make sure it doesn't break existing (legal) code when adding new features.
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1882
Each identifier that contains a double underscore __ or begins with an underscore followed by an uppercase letter is reserved to the implementation for any use.
Note the words "any use". (Similar text occurs both before and after that defect fix is applied.)
__cplusplus is defined by the standard. _ALGORITHM is reserved by the standard to be used by implementations. These seem quite different? (The two sections of the standard do conflict, in that one states that __cplusplus is reserved for any use, and another uses it specifically, but I think that the winner of that conflict is clear).
The _ALGORITHM identifier could, under the standard, be used as part of a pre-processing step to say "replace this source code with hard drive deleting code". Its existence (prior to pre-processing, or after) could be sufficient to completely change your program behavior.
Now this is unlikely, but I do not think it results in a non-conforming implementation. It is a matter of quality of implementation only.
An implementation is free to document and define what _ALGORITHM means. For example, it could document that it is a header guard for <algorithm>, and indicates if that header file has been included. Treating your current <algorithm> implementation as documentation is probably going too far.
I'd guess using __cplusplus in C mode is technically "just as bad" as using _ALGORITHM, but this question is a c++ question, not a c question. I haven't delved into the c standard to look for quotes about it.
The names in [cpp.predefined] are different. Those have a specified meaning, so an implementation can't reserve them for any use, and using them in a program has a well-defined portable meaning. Using an implementation-specific identifier like the example of _ALGORITHM is ill-formed because it violates a shall-rule.
Yes, I'm fully aware of multiple examples where the library specification uses "shall" to mean "this is a requirement on user code, and violations are UB, not ill-formed".
Regarding whether it's UB or implementation-defined: running an ill-formed program results in UB. The standard wording clearly says the program is ill-formed; UB occurs if the implementation still chooses to accept the program and run it.
So, if a program uses the identifier _ALGORITHM, that program is ill-formed, and running such a program is UB, but that does not mean it doesn't work fine on an implementation that uses _ALGORITHM as an include guard, nor does it mean that it doesn't work fine on an implementation that doesn't.
If users are concerned about such ill-formedness and potential UB, and said users want to write portable C++, they shouldn't use reserved identifiers in portable C++ programs. If users accept that regardless of the standard prohibiting such a use, no practical implementation will wipe your hard drive, they can freely use such reserved identifiers, but by the letter of the standard, such uses are still ill-formed.
Historically, the purpose for making the use of such tokens "undefined behavior" is that compilers are free to attach any meaning they want to any such tokens that are not defined within the C standard. For example, on some embedded processors, using __xdata as a storage class for a variable will ask that it be stored in an area of RAM which is slower to access than the normal variable-storage area, but is much larger. On typical processors of that family, storage for "normal" variables would be limited to about 100 bytes, but storage for xdata variables may be much larger--up to 64K. The standard says basically nothing about what compilers are allowed to do with such directives, although typically (I'm not sure if the standard mandates this behavior, though I'm unaware of compilers violating it) such tokens are ignored within code that is disabled using a #if or similar directives.
Some libraries' header files will start their own internal identifiers with something that starts with two underscores but includes a pattern that's unlikely to be used by a compiler for any purpose (e.g. version 23 of the Foozle library might prefix its identifiers with __FZ23). It would be perfectly legitimate for a future compiler to use identifiers starting with __FZ23 for other purposes, and if that were to happen the Foozle library would need to be changed to use something else. If, however, a major compiler upgrade would likely necessitate rewrites of the Foozle library for other reasons anyway, that risk may be acceptable compared to the risk of identifiers conflicting with outside code.
Note also that some project header files which are targeted toward a processor that requires __ directives may conditionally define macros with those names when compiled for other processors, for example:
#ifndef USE_XDATA
#define __XDATA
#endif
though a somewhat better pattern would generally be:
#ifdef USE_XDATA
#define XDATA __XDATA
#else
#define XDATA
#endif
When writing new code, the latter pattern is often better, but the former pattern may sometimes be useful when adapting existing code written on a platform that requires __XDATA so that it may be used both on platforms that use/require that directive and on platforms that do not.
Whether or not it is legal is a matter of local law. Whether it means anything, and if so, what, is a matter for the language definition. When you use a name that's reserved to the implementation the behavior of your program is undefined. That means that the language definition does not tell you what the program does. Nothing more, nothing less. If the compiler you're using documents what a particular reserved identifier does, then you can use that identifier with that compiler. If you hunt through headers and guess what various un-documented identifiers mean you might be able to use them, but don't be surprised if your code breaks when a subsequent update changes something.
Don't get hung up on __cplusplus. It's core language, and the stuff about double underscores, etc. is library. If that's not convincing, just consider it a glitch. You can use __cplusplus in C++ programs; its meaning is well defined.

putc implemented as a macro in C++?

I know the macro implementation of putc() in C, but is it the same in C++?
It will depend on your implementation of cstdio. In most cases this is really just a wrapper around stdio.h, with wrappers declared inside the std namespace, and the C and C++ compilers share the same standard library for C functions. For example, VS2010 uses stdio.h for C++, in which putc is implemented as both a macro and a function, depending on environment and other compile-time definitions.
Which version of C++? C++83 (1983)? C++98 (1998)? C++11 (2011)?
The C++98 and C++11 Specifications rely on the ISO C specifications for C Library functions, and do not put additional implementation constraints on them, other than small ones like exposing stdio.h as <cstdio> (no .h suffix) with its names declared in namespace std.
See: C++98 Specification
See: C++11 Specification
Look in your compiler's <cstdio> header (and the stdio.h it typically wraps) if you are interested in your particular compiler.
However, if we dig deeper and look for the original ISO C standard, "ISO/IEC 9899:1990" (C89/C90), we find that it is unavailable for free viewing on the web (not even the final draft standard), so moving on to C99 (ISO/IEC 9899:1999), you find...
...that C99 says putc() MAY be implemented as a macro,
See: C99 Specification
So if you are really developing in Obj-C++ (which uses C99), then C99 is the relevant specification to consider, not C90. Also, since C99 lets the implementer decide whether to make putc() a macro or not, you should consider it an open possibility, and decide whether you really care to know about the C90 spec, which is becoming obsolete now that even C11 (2011) is out.
Yes it is. Both C and C++ use <stdio.h> which has the same scheme in all implementations that I know of.
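If it matters whether the macro or the function gets used, C (7.1.4) lets a program bypass a function-like macro by putting the function name in parentheses, and the same spelling compiles in C++. A minimal sketch, where write_chars is a hypothetical helper and not part of any library:

```cpp
#include <cstdio>

// Writes text one character at a time. Returns the number of characters
// written, or -1 if putc reports an error.
int write_chars(const char* text, std::FILE* stream) {
    int written = 0;
    for (const char* p = text; *p != '\0'; ++p) {
        // A function-like macro is only expanded when its name is directly
        // followed by '(', so (std::putc) always names the real function.
        if ((std::putc)(*p, stream) == EOF) {
            return -1;
        }
        ++written;
    }
    return written;
}
```

Written this way, the code does not depend on which of the two forms a particular stdio.h happens to provide.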

Can't use "not", "or", or "plus" as identifier?

I tried to compile this:
enum class conditional_operator { plus, or, not };
But apparently GCC (4.6) thinks these are special, while I can't find a standard that says they are (neither C++0x n3290 nor C99 n2794). I'm compiling with g++ -pedantic -std=c++0x. Is this a compiler convenience? How do I turn it off? Shouldn't -std=c++0x turn this "feature" off?
PS: Hmmm, apparently, MarkDown code formatting thinks so too...
Look at 2.5. They are alternative tokens for || and !.
There is a bunch of other alternative tokens BTW.
Edit: The rationale for their inclusion is the same as for trigraphs: to allow the use of non-ASCII character sets. The committee has tried to get rid of them (at least of trigraphs; I don't remember for alternative tokens), and has met opposition from people (mostly IBM mainframe users) who are using them.
Edit for completeness: as others have remarked, plus isn't in that class and should not be a problem unless you are using namespace std.
These are actually defined as alternative tokens (and reserved) oddly enough, as alternative representations for operators. I believe this was originally to aid people who were using keyboards which made the relevant symbols hard to produce, although this seems a pretty poor reason to add extra keywords to the language :(
There may be a GCC compiler option to disable them, but I'm not sure.
(As mentioned in comments, plus should be okay unless you're using the std namespace.)
or and not are alternative representations of || and ! respectively. You can't turn them off and you can't use these tokens for anything else, they are part of the language (current C++, not even just C++0x). ( See ISO/IEC 14882:2003 2.5 [lex.digraph] and 2.11 [lex.key] / 2. )
You should be safe with plus unless you use using namespace std; or using std::plus;.
The Standard lists keywords in 2.11. There's also a list of alternative representations separate from the keyword list that is reserved and can't be used otherwise, but aren't keywords. and and or are on that list. Section 17.4.3 describes restrictions on programs that use libraries, and 17.4.3.1.3 describes that names declared with external linkage in a header are reserved both in std:: and the global namespace.
In other words, you don't have to go to C++0x to have those problems. and and or are already reserved, and header <functional> contains plus as a templated struct type, and plus is therefore off-limits if <functional> is directly or indirectly #included.
I'm not sure dumping that much stuff into the global namespace was really wise, but that's what the standard says.
It is a 1995 amendment to the C90 standard. A compiler can probably choose how to behave here. GCC probably includes the header as part of the standard library. Microsoft's compiler doesn't, and you have to include iso646.h.
Here is a link to wikipedia regarding this.
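To tie the answers together, here is a small sketch of a version of the enum that compiles, next to the alternative tokens used as the operators they stand for. The enumerator names or_ and not_ and the function exactly_one are hypothetical choices, not anything the standard suggests.

```cpp
// or and not are alternative tokens, so they cannot be enumerator names;
// a trailing underscore is one possible workaround. plus is not an
// alternative token, so it is an ordinary identifier.
enum class conditional_operator { plus, or_, not_ };

// The alternative tokens behave exactly like ||, && and !.
bool exactly_one(bool a, bool b) {
    return (a or b) and not (a and b);
}
```

This assumes a compiler that treats the alternative tokens as built in, as GCC does in the question.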

Is this valid C code but not valid C++ code?

In some library I'm using (written in C) there is:
StorePGM(image, width, height, filename)
char *image;
int width, height;
char *filename;
{
// something something
}
All functions are defined this way. I have never seen such function definitions in my life. They seem to be valid to MSVC, but when I compile it as C++ it gives errors.
What is it? Some kind of old version of C?
Yes. K&R, pre-standard C. Avoid using it.
Before the 1989 ANSI C standard, C didn't have prototypes (function declarations that specify the types of the parameters); these old-style declarations and definitions were all that was available.
In 1989, the ANSI C standard (which essentially became the 1990 ISO C standard) introduced prototypes. If I recall correctly, the idea actually came from C++ (which had not yet been standardized at the time). Old-style declarations and definitions remained legal, so that old code could still be compiled. The 1989 standard also said that old-style declarations were "obsolescent", meaning that they could be removed in a future version of the standard.
The 1999 ISO C standard, which (officially) superseded the 1990 standard, left this alone; old-style declarations and definitions are still legal, and all conforming compilers must support them (though they're free to warn about them, as they can warn about anything else).
As of the latest C201X draft (large PDF), this still hasn't changed. Old-style function declarations and definitions are still a required part of the language, and all conforming compilers must support them. (Personally, I consider this unfortunate.)
C++, on the other hand, has never (?) supported anything other than prototypes as function declarations; Stroustrup wasn't as concerned about maintaining compatibility with old C code.
But unless you need to maintain very old code and/or use very old compilers, there is no good reason to use old-style function declarations or definitions.
Note that, at least in C, this definition:
int main() {
/* ... */
}
is actually an old-style definition. It's correct for C++, where it's a prototype indicating that main has no parameters, but in C it should be
int main(void) {
/* ... */
}
(C++ also accepts this form for compatibility with C -- but by the time you're writing main, you should already have decided which language you're using.)
Yep, it's K&R-Style. (Kernighan & Ritchie are the inventors of C) See also http://www.lysator.liu.se/c/bwk-tutor.html for examples of this pre-ANSI style.
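For comparison, a sketch of how the definition from the question might look in prototype style, which both C and C++ accept apart from the nullptr spelling used here. The return type is an assumption (old-style C defaulted a missing return type to int), and the body is a placeholder because the original body is not shown.

```cpp
// Prototype-style spelling of the old-style StorePGM definition.
// The parameter types now appear inside the parentheses.
int StorePGM(char *image, int width, int height, char *filename) {
    if (image == nullptr || filename == nullptr || width <= 0 || height <= 0) {
        return -1;  // placeholder error code, not taken from the library
    }
    return 0;       // placeholder success code
}
```

Converting each definition this way is mechanical: move every parameter's type from the lines after the parameter list into the list itself.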