Reasons for double underscores in standard library implementations - c++

Is there any technical reason for standard library (C or C++) implementations to, IMO, abuse underscores the way they do (= prefix everything with two underscores + add a trailing underscore to denote that a variable is a member variable)?
I get that /.*__.*/ and /_[A-Z].*/ (<= regexes) are reserved-by-implementation. But isn't that supposed to refer to the implementation of the compiler rather than a (standard) library?
Couldn't a standard library behave like any other library in terms of choosing its internal names?
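To make the two patterns from the question concrete, here is a minimal sketch that checks an identifier against them with std::regex. The helper name is_reserved_identifier is made up for illustration, and the check only covers the two patterns quoted above (it ignores, for example, the rule about leading underscores in the global namespace).

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `name` matches either pattern from the question.
inline bool is_reserved_identifier(const std::string& name) {
    static const std::regex double_underscore(".*__.*");    // contains "__" anywhere
    static const std::regex underscore_capital("_[A-Z].*"); // "_" followed by a capital
    return std::regex_match(name, double_underscore) ||
           std::regex_match(name, underscore_capital);
}
```

A name with only a trailing underscore, such as value_, matches neither pattern, so ordinary code is free to use that convention.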

There is a good reason for the standard library to start internal names with two underscores: Such names are reserved for the implementation.
Imagine you write the following code:
#include <iostream>
using namespace std;
long square(long x)
{
    return x*x;
}
int main()
{
    cout << square(3) << endl;
}
I guess you would not be happy if this ended up calling some internal function square(int) used in implementing the standard library and doing something completely different, because it's a better match than your square(long) for square(3).
By prefixing all internal names with double underscores and at the same time the standard declaring that you are not allowed to do the same, the standard library authors ensure that something like this cannot happen.
Now you may say that <iostream> is not part of the STL, but every standard library header may include any other standard library header, so iostream may well include an STL header for use in its implementation.
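The hijacking scenario can be reproduced without touching the real library. In this sketch, fake_lib and its square(int) are hypothetical stand-ins for an internal helper that was not given a reserved name; the using-directive mimics that helper leaking into the user's scope.

```cpp
#include <cassert>

// Hypothetical stand-in for a library-internal helper that was NOT uglified.
namespace fake_lib {
    inline int square(int) { return -1; }  // does something unrelated to squaring
}
using namespace fake_lib;  // mimics the helper being visible in user code

// The user's own function, as in the example above.
long square(long x) { return x * x; }

// The argument 3 is an int, so square(int) is an exact match and wins
// overload resolution over the user's square(long).
inline long call_square_with_int() { return square(3); }
```

Calling square(3L) still reaches the user's function, which is exactly why the bug would be so hard to spot.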
Another reason why identifiers with double underscores make sense even in the case of local identifiers that are not seen externally is that you might have defined a macro of the same name. Consider:
#define value 15
#include <iostream>
int main()
{
    std::cout << value;
}
This is legal code and certainly should output 15. But now imagine what would happen if some object in iostream declared a local variable named value. Your code obviously wouldn't compile.
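A small sketch of the same effect, using a stand-in for library code. User code may not declare reserved names itself, so the placeholder uglified_value_ plays the role that something like a double-underscore name would play in a real implementation; lib_identity and user_reads_macro are likewise made-up names.

```cpp
#include <cassert>

#define value 15  // user macro, defined before the "library" code is seen

// Stand-in for library code. Because the parameter is not spelled `value`,
// the macro above cannot rewrite it.
inline int lib_identity(int uglified_value_) { return uglified_value_; }

// Had the library written the following instead, the preprocessor would turn
// it into `int broken(int 15) { return 15; }`, which does not compile:
// inline int broken(int value) { return value; }

// The user's code keeps working and sees the macro as intended.
inline int user_reads_macro() { return value; }

#undef value  // keep the macro from leaking into whatever follows
```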
Note that the standard library is part of the implementation (it's described in the C++ standard, after all), so it can use reserved names however it likes.

Related

What are the reasons that extending the std namespace is considered undefined behavior?

Why is adding names to the std namespace undefined behaviour?
The obvious answer is "because the standard says so," e.g. in C++14 [namespace.std] 17.6.4.2.1/1:
The behavior of a C++ program is undefined if it adds declarations or definitions to namespace std or to a
namespace within namespace std unless otherwise specified. ...
However, I would be really interested in the reasons for this ruling. I can of course understand adding overloads of names already in std could break behaviour; but why is adding new, unrelated names a problem?
Programs can already wreak havoc inside std with macros, which is why pretty much all standard library implementations have to consist solely of reserved names (double-underscore and starting-underscore-followed-by-capital) for all non-public parts.
I would really be interested in a situation in which something like this can be problematic:
namespace std
{
    int foo(int i)
    { return i * 42; }
}
#include <algorithm> // or one or more other standard library headers
when this is perfectly legal and the standard library has to cope:
#define foo %%
#include <algorithm> // or one or more other standard library headers
What is the rationale for this Undefined Behaviour?
Here are a few reasons:
Even if names in headers have to be uglified to avoid interactions with macros, this requirement does not exist for names in the source files actually implementing the code. If an implementation does use ::std::foo(int) as part of its implementation, your definition would violate the one definition rule.
The standard is expected to grow. If names could be added to namespace std, any name added to the standard C++ library would be a likely breaking change. To some extent this is already true in the sense that any such name could be a macro, but it is considered acceptable to break those.
There is actually no need to add names to namespace std: they can be added to an arbitrary other namespace. That is, even if the motivations given above are not particularly strong, the restriction isn't considered to matter in any form. ...and if there is a reason to add a name to namespace std, it clearly does affect the behavior.
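As a sketch of the alternative: the function goes into a namespace of your own, and namespace std is only touched in a way the standard explicitly permits, namely specializing a standard template for a user-defined type. The names my_project, Point and hash_point are hypothetical, and the hash combination is deliberately simplistic.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

namespace my_project {  // hypothetical project namespace
    struct Point { int x; int y; };
    inline int foo(int i) { return i * 42; }  // lives here, not in std
}

namespace std {
    // Permitted: a specialization of a standard template for a user-defined type.
    template <>
    struct hash<my_project::Point> {
        size_t operator()(const my_project::Point& p) const noexcept {
            return hash<int>()(p.x) ^ (hash<int>()(p.y) << 1);
        }
    };
}

// Small helper so the specialization is easy to exercise.
inline std::size_t hash_point(int x, int y) {
    return std::hash<my_project::Point>()(my_project::Point{x, y});
}
```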

C++ Primer Questions

I am currently going through the 5th edition of C++ Primer by Lippman, Lajoie and Moo and have been struggling with a few things.
Firstly, I just wanted to confirm, when using any of the cctype functions, I have to make sure I include the header, right? Because, initially, I forgot to include it and yet it still ran. That's really confused me.
Also, I was browsing for a different problem (which I'll get to) and found another issue haha! When using anything from cctype, am I supposed to qualify it with std:: or write a using declaration for it, e.g. for tolower either write std::tolower at every instance or write a using statement for it? This would make sense as it did say that they are "defined in the std namespace", but I didn't realise and have been writing it without and haven't had an issue. And I'm guessing similar for size_t, right?
Speaking of size_t, I have an issue. This is my code:
// Exercise Section 3.5.2., Exercise 3.30
#include <iostream>
#include <cstddef>
using std::cout; using std::endl;
int main()
{
    constexpr size_t num_size = 10;
    int num[num_size] = {};
    for (size_t n = 0; n < num_size; ++n) {
        num[n] = n;
        cout << num[n] << endl;
    }
    return 0;
}
So the code is supposed to define an array of 10 ints and give each element the same value as its position in the array.
It runs correctly, but I am receiving an error at the num[n]=n part. It says Implicit conversion loses integer precision: size_t (aka 'unsigned long') to int.
I understand what this means, but my issue is that the book says "when we use a variable to subscript an array, we normally should define that variable to type size_t". I have done this and it gives this error. It does run fine but it seems like that sort of thing that can lead to errors.
P.S. In this code, like I asked above, should I have using std::size_t?
Must I include the header even though it works without?
Yes, you must always include at least one header providing each definition / declaration you need, unless the exact prototype / type definition is guaranteed and you put it directly into your source code.
Some standard headers may include others, which might let you get away with being sloppy some of the time, but you will rue it the day you upgrade / port to a different implementation.
I read all the declarations and definitions are "defined in the std namespace" but I didn't realise and have been writing it without and haven't had an issue. And I'm guessing similar for size_t, right?
Yes, it's the same. Due to compatibility, it is possible for the <c...> headers adopted by inclusion from C / Unicode to also provide their symbols in the global namespace.
17.6.1.2 Headers [headers]
1 Each element of the C++ standard library is declared or defined (as appropriate) in a header.
2 The C++ standard library provides 55 C++ library headers, as shown in Table 14.
3 The facilities of the C standard Library are provided in 26 additional headers, as shown in Table 15.
4 Except as noted in Clauses 18 through 30 and Annex D, the contents of each header cname shall be the same as that of the corresponding header name.h, as specified in the C standard library (1.2) or the C Unicode TR, as appropriate, as if by inclusion. In the C++ standard library, however, the declarations (except for names which are defined as macros in C) are within namespace scope (3.3.6) of the namespace std. It is unspecified whether these names are first declared within the global namespace scope and are then injected into namespace std by explicit using-declarations (7.3.3).
5 Names which are defined as macros in C shall be defined as macros in the C++ standard library, even if C grants license for implementation as functions. [ Note: The names defined as macros in C include the following: assert, offsetof, setjmp, va_arg, va_end, and va_start. —end note ]
6 Names that are defined as functions in C shall be defined as functions in the C++ standard library.
7 Identifiers that are keywords or operators in C++ shall not be defined as macros in C++ standard library headers.
8 D.5, C standard library headers, describes the effects of using the name.h (C header) form in a C++ program.
In my example program I used size_t for the array index. That works, though I get a warning. Should I have done so, and could it in general lead to errors?
Well, naturally the same about namespace std applies, as you guessed.
As to the rest, there's a wonderful phrase: "Good advice comes with a rationale".
The reason you should use std::size_t for indices is that this type a) signals to readers that it's a size or index and b) it is guaranteed to be big enough.
In your case, a lowly int, with its guaranteed minimum maximum of 2^15 - 1 (32767), would have been perfectly fine.
An alternative is just casting to the proper type on assignment.
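To put numbers on that, here is a minimal sketch that states the guarantee as a static_assert and checks whether a given index fits in an int before casting. The helper name fits_in_int is hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <limits>

// The standard guarantees that int can hold at least 2^15 - 1.
static_assert(INT_MAX >= 32767, "int must hold at least 2^15 - 1");

// Hypothetical helper: true if `n` can be converted to int without loss.
inline bool fits_in_int(std::size_t n) {
    return n <= static_cast<std::size_t>(std::numeric_limits<int>::max());
}
```

For a 10-element array every index passes this check, so the cast on assignment is safe there.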
You asked:
Firstly, I just wanted to confirm, when using any of the cctype functions, I have to make sure I include the header, right?
Yes, that is right. You might get some function declarations or other declarations indirectly but that is not portable code. You should understand where the standard says a declaration is available and include that header file before using a function, a type, etc.
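As a small illustration of "include the header and qualify the name", here is a sketch that lowercases a string with std::tolower. The helper to_lower_copy is a made-up name; the cast to unsigned char is there because passing a negative char value to std::tolower is undefined.

```cpp
#include <cassert>
#include <cctype>   // included explicitly, not relied on via <iostream> or <string>
#include <string>

// Hypothetical helper: returns a lowercased copy of `s`.
inline std::string to_lower_copy(std::string s) {
    for (char& c : s) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return s;
}
```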
You asked:
And I'm guessing similar for size_t, right?
Yes, you should use std::size_t.
There is an SO post related to the topic. Browse Difference between size_t and std::size_t.
You asked:
In this code, like I asked above, should I have using std::size_t?
In the loop, it's OK to use int too. The suggestion to use std::size_t for indexing an array is a good suggestion but is not inviolable.
If you choose to use size_t for n, it's OK to use static_cast to convert it to an int to get rid of the compiler warning/error.
num[n] = static_cast<int>(n);
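Put together, the loop from the question might look like the sketch below. It swaps the built-in array for std::array so the result can be returned from a function, which is an assumption made for the example rather than something the exercise asks for; fill_with_indices is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fills a 10-element array so that each element equals its own index.
inline std::array<int, 10> fill_with_indices() {
    std::array<int, 10> num{};
    for (std::size_t n = 0; n < num.size(); ++n) {
        num[n] = static_cast<int>(n);  // explicit cast: no narrowing warning
    }
    return num;
}
```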
You declare your array as
int num[num_size] = {};
This means it is an array of int that has 10 elements.
Then you say
for (size_t n = 0; n < num_size; ++n)
num[n] = n;
Note that n is of type size_t aka unsigned long. Therefore you are putting unsigned long values into your int array, so they are being implicitly converted to int values.
The standard headers put a bunch of stuff in the global namespace. Ideally they wouldn't, but they do. Usually that's because something is really a macro rather than a typedef or function.
You can sometimes get away without including headers because some other header that you have included includes the missing one.
I have to make sure I include the header, right? Because, initially, I
forgot to include it and yet it still ran.
Some standard headers can include other headers. However, it is a good idea to include a header explicitly if the program uses declarations from it. Another implementation's headers may not include the required header indirectly, and then the compiler will issue an error.
This would make sense as it did say that they are "defined in the
std namespace" but I didn't realise and have been writing it without
and haven't had an issue.
The C++ Standard allows compilers to place C standard functions in the global namespace. Even so, it is better to qualify the name with std:: explicitly, because the function is guaranteed to be declared there.
As for the last question: the elements of the array have type int, while you assign values of type size_t to them. The problem is that type int cannot accommodate all values of type size_t, and the compiler warns you about this. You can add an explicit cast to tell the compiler that you know what you are doing.
num[n] = ( int )n;
or
num[n] = static_cast<int>( n );

Unknown error perhaps caused by naming conflict?

I wrote a very simple c++ code, where I defined a function called sqrt which just calls
std::sqrt. Unexpectedly, I got a segmentation fault. The problem doesn't exist if I rename
the function sqrt as something else. However, I can not see any naming conflict since
the sqrt function I defined is not in the namespace std so the two should be perfectly
separated. So what is the real cause of the problem? Thanks!
#include <iostream>
#include <cmath>
double sqrt(double d);
double sqrt(double d) {
    return std::sqrt(d);
}
int main() {
    double x = 3.0;
    std::cout << "The square root of " << x << " is " << sqrt(x) << '\n';
    return 0;
}
<cmath> is a funny header. It is allowed to (but not required to) make ::sqrt and
std::sqrt synonyms. If you include it, it's best to assume
that both are present (or just include <math.h>, in which
case, ::sqrt is all that you should get). What's probably
happening in your case is that 1) std::sqrt is in fact a
synonym (via using) for ::sqrt, and 2) the linker is picking
up your ::sqrt first, so you end up with endless recursion.
The only solution, short of changing the name, is to put your
sqrt in a namespace.
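A minimal sketch of the namespace fix, assuming a made-up namespace name my_math. Because the wrapper is my_math::sqrt rather than ::sqrt, it can never be the function that std::sqrt resolves to, so the recursion cannot occur.

```cpp
#include <cassert>
#include <cmath>

namespace my_math {  // hypothetical namespace name
    // Distinct from both ::sqrt and std::sqrt, so this forwards instead of recursing.
    inline double sqrt(double d) { return std::sqrt(d); }
}
```

Callers then write my_math::sqrt(x), or add a using-declaration inside a function where no other sqrt is wanted.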
EDIT:
Just to be clear: the above is C++11. Earlier versions of C++ did not allow <cmath> to introduce anything into global namespace. All implementations did, however, so the standard was changed to bless the practice. (I guess that's one way of getting compilers to be standard compliant.)
EDIT:
Some additional information as to how a library "picks up"
symbols, in response to the question in comments. Formally,
according to the C++ standard, you may not have two definitions
of the same function (same name, namespace and argument types)
in a program. If the two definitions are in separate
translation units, the behavior is undefined. With this in mind,
there are several practical considerations.
The first can be considered the definition of a library (or at
least the traditional definition). A library is a set of
modules—translation units, in terms of the standard.
(Generally, but not always, the modules consist of compiled
object files.) Linking in a library, however, does not bring
in all of the modules in it; a module from a library is
incorporated into your program only if it resolves an unresolved
external. Thus, if ::sqrt is already defined (resolved)
before the linker looks at the library, the module containing
::sqrt in the library will not become part of your program.
In practice, the term library has been abused in recent years,
to the point where one might say that its meaning has changed.
In particular, what Microsoft calls "dynamically loaded
libraries" (and what were called "shared objects" in Unix, long
before), are not libraries in the traditional sense, and the
above doesn't apply to them. Other issues do, however,
depending on how the dynamic loader works. In the case of Unix,
if several shared objects have the same symbol, all will resolve
to the first one loaded (by default—this can be controlled
by options passed to dlopen). In the case of Windows, by
default, a symbol will be resolved within the DLL if possible;
in your case, if std::sqrt is an inline function, or is
specified as using ::sqrt, this will be the DLL which calls
std::sqrt; if in the header, it is __declspec(dllexport),
this will be the DLL that contains the implementation of
std::sqrt.
Finally, almost all linkers today support some form of weak
references. This is usually used for template instantiations:
something like std::vector<int>::vector( size_t, int ) will be
instantiated in every translation unit which uses it, but as
a "weak" symbol. The linker then chooses one (probably the
first it encounters, but it's not specified), and throws out all
of the others. While this technique is mainly used for template
instantiations, a compiler can define any function using weak
references (and will do so if the function is inline). In this
case, if the definitions are different (as in your case of
::sqrt), then we can truly say that the program is illegal,
since it violates the one definition rule. But the results are
undefined behavior, and don't require a diagnostic. If you
define an inline function or a function template differently in
two different translation units, for example, you will almost
never get an error; if the compiler doesn't actually inline
them, the linker will choose one, and use it in both translation
units. In your case (::sqrt), I doubt that this applies;
I would expect this to be a real library function, and not
inlined. (If it were inlined, the definition would be in the
header <cmath>, and you'd get a duplicate definition error,
since both definitions would be in the same translation unit.)
The problem seems to be that <cmath> is bringing in the sqrt name (without the std:: namespace), as well as std::sqrt. I am afraid you need to use another name.
See this example, using a snapshot of GCC 4.8:
#include <iostream>
#include <cmath>
int main() {
    double x = 9.0;
    std::cout << sqrt(x) << '\n'; // look, no std::sqrt
}
Per Paragraph 17.6.1.2/4:
Except as noted in Clauses 18 through 30 and Annex D, the contents of each header cname shall be the same
as that of the corresponding header name.h, as specified in the C standard library (1.2) or the C Unicode
TR, as appropriate, as if by inclusion. In the C++ standard library, however, the declarations (except for
names which are defined as macros in C) are within namespace scope (3.3.6) of the namespace std. It is
unspecified whether these names are first declared within the global namespace scope and are then injected
into namespace std by explicit using-declarations (7.3.3).
Also, per Annex D.5/2:
Every C header, each of which has a name of the form name.h, behaves as if each name placed in the standard
library namespace by the corresponding cname header is placed within the global namespace scope. It is
unspecified whether these names are first declared or defined within namespace scope (3.3.6) of the namespace
std and are then injected into the global namespace scope by explicit using-declarations (7.3.3).
Since the exact technique to be used for making global functions available is left up to implementations, your implementation probably has a using-declaration such as the one below inside the std namespace:
namespace std
{
using ::sqrt;
// ...
}
Which means that std::sqrt actually becomes an alias for ::sqrt, and you are providing a definition of ::sqrt which effectively ends up calling itself recursively.
The only solution is then to pick a different name.

Is it OK to put a standard, pure C header #include directive inside a namespace? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Is it a good idea to wrap an #include in a namespace block?
I've got a project with a class log in the global namespace (::log).
So, naturally, after #include <cmath>, the compiler gives an error message each time I try to instantiate an object of my log class, because <cmath> pollutes the global namespace with lots of three-letter methods, one of them being the logarithm function log().
So there are three possible solutions, each having their unique ugly side-effects.
Move the log class to its own namespace and always access it with its fully qualified name. I really want to avoid this because the logger should be as convenient as possible to use.
Write a mathwrapper.cpp file which is the only file in the project that includes <cmath>, and makes all the required <cmath> functions available through wrappers in a namespace math. I don't want to use this approach because I have to write a wrapper for every single required math function, and it would add additional call penalty (cancelled out partially by the -flto compiler flag)
The solution I'm currently considering:
Replace
#include <cmath>
by
namespace math {
#include "math.h"
}
and then calculating the logarithm function via math::log().
I have tried it out and it does, indeed, compile, link and run as expected. It does, however, have multiple downsides:
It's (obviously) impossible to use <cmath>, because the <cmath> code accesses the functions by their fully qualified names, and using <math.h> is deprecated in C++.
I've got a really, really bad feeling about it, like I'm gonna get attacked and eaten alive by raptors.
So my question is:
Is there any recommendation/convention/etc that forbid putting include directives in namespaces?
Could anything go wrong with
different C standard library implementations (I use glibc),
different compilers (I use g++ 4.7, -std=c++11),
linking?
Have you ever tried doing this?
Are there any alternate ways to banish the math functions from the global namespace?
I've found several similar questions on stackoverflow, but most were about including other C++ headers, which obviously is a bad idea, and those that weren't made contradictory statements about linking behaviour for C libraries. Also, would it be beneficial to additionally put the #include <math.h> inside extern "C" {}?
edit
So I decided to do what probably everyone else is doing, and put all of my code in a project namespace, and to access the logger with its fully qualified name when including <cmath>.
No, the solution that you are considering is not allowed. In practice what it means is that you are changing the meaning of the header file. You are changing all of its declarations to declare differently named functions.
These altered declarations won't match the actual names of the standard library functions so, at link time, none of the standard library functions will resolve calls to the functions declared by the altered declarations unless they happen to have been declared extern "C" which is allowed - but not recommended - for names which come from the C standard library.
ISO/IEC 14882:2011 17.6.2.2/3 [using.headers] applies to the C standard library headers as they are part of the C++ standard library:
A translation unit shall include a header only outside of any external declaration or definition[*], and shall include the header lexically before the first reference in that translation unit to any of the entities declared in that header.
[*] which would include a namespace definition.
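A conforming way to get the math::log() spelling is to include the header at global scope, as the quoted rule requires, and then pull the names into a namespace with using-declarations. This is only a sketch: the namespace name math comes from the question, the selection of functions is arbitrary, and it does not remove ::log from the global namespace on implementations that put it there.

```cpp
#include <cassert>
#include <cmath>   // included outside of any namespace or other declaration

namespace math {
    using std::log;   // math::log now names the standard overloads
    using std::sqrt;
}
```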
Why not put the log class in its own namespace and use typedef namespace::log logger; to avoid name clashes in a more convenient way?
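A sketch of that suggestion, with assumptions spelled out: since namespace is a keyword, the placeholder in the typedef has to be a real name, here the made-up logging, and the class body is invented for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <utility>

namespace logging {  // hypothetical namespace name
    // Illustrative logger; the real class from the question is not shown there.
    class log {
    public:
        explicit log(std::string tag) : tag_(std::move(tag)) {}
        const std::string& tag() const { return tag_; }
    private:
        std::string tag_;
    };
}

typedef logging::log logger;  // short, convenient name without a global ::log class
```

User code then writes logger("net") for the class and std::log(x) for the logarithm, and neither interferes with the other.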
Change your class's name. Not that big of a deal. ;-)
Seriously though, it's not a great idea to put names in the global namespace that collide with names from any standard header. C++03 didn't explicitly permit <cmath> to define ::log. But implementations were chronically non-conforming about that due to the practicalities of defining <cmath> on top of an existing <math.h> (and perhaps also an existing static-link library for some headers, including math). So C++11 ratifies existing practice, and allows <cmath> to dump everything into the global namespace. C++11 also reserves all those names for use with extern "C" linkage, and all function signatures for use with C++ linkage, even if you don't include the header. But more on that later.
Because in C++ any standard header is allowed to define the names from any other standard header (i.e, they're allowed to include each other), this means that any standard header at all can define ::log. So don't use it.
The answer to your question about different implementations is that even if your scheme works to begin with (which isn't guaranteed), in some other implementation there might be a header that you use (or want to use in future in the same TU as your log class), that includes <cmath>, and that you didn't give the namespace math treatment to. Off the top of my head, <random> seems to me a candidate. It provides a whole bunch of continuous random number distributions that plausibly could be implemented inline with math functions.
I suggest Log, but then I like capitalized class names. Partly because they're always distinct from standard types and functions.
Another possibility is to define your class as before and use struct log in place of log. This doesn't clash with the function, for reasons that only become clear if you spend way too much time with the C and C++ standards (you only use log as a class name, not as a function and not as a name with "C" linkage, so you don't infringe on the reserved name. Despite all appearances to the contrary, class names in C++ still inhabit a parallel universe from other names, rather like struct tags do in C).
Unfortunately struct log isn't a simple-type-specifier, so for example you can't create a temporary with struct log(VERY_VERBOSE, TO_FILE). To define a simple-type-specifier:
typedef struct log Log;
Log(VERY_VERBOSE, TO_FILE); // unused temporary object
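Here is a compilable sketch of the struct log approach, under a few assumptions: the class body and the single int constructor argument are invented for the example, and <math.h> is used so that ::log is certain to be declared. It is the same arrangement as struct stat and the function stat() in POSIX headers.

```cpp
#include <cassert>
#include <math.h>   // guarantees that the function ::log is declared

// Hypothetical logger class sharing its name with the math function.
// The function name hides the class name in ordinary lookup.
struct log {
    int verbosity;
    explicit log(int v) : verbosity(v) {}
};

typedef struct log Log;  // a plain type name that can be used in expressions

// Creates a temporary logger through the typedef.
inline int temp_verbosity() { return Log(3).verbosity; }

// An unqualified-by-class call still reaches the math function.
inline double natural_log_of_one() { return ::log(1.0); }
```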
An example of what I say in a comment below, based on a stated example usage. I think this is valid, but I'm not certain:
#include <iostream>
#include <cmath>
using std::log; // to enforce roughly what the compiler does anyway
enum Foo {
    foo, bar
};
std::ostream &log(Foo f) { return std::cout; }
int main() {
    log(foo) << log(10) << "\n";
}
It is an ugly hack too, but I believe it will not cause any linker problems. Just redefine the log name from <math.h>:
#define log math_log
#include <math.h>
#undef log
It could cause problems with inline functions from math using this log, but maybe you'd be lucky...
Math log() is still accessible but it's not easy. Within functions where you want to use it, just repeat its real declaration:
int somefunc() {
double log(double); // not sure if correct
return log(1.1);
}

"built-in" functions in C++

I'm beginner at C++ so if the answer is obvious it possibly is the one I'm looking for. I was reading the second response in this thread and got confused.
#include <algorithm>
#include <cassert>
int main()
{
    using std::swap;
    int a(3), b(5);
    swap(a, b);
    assert(a == 5 && b == 3);
}
What I don't get is "This is just a defined function. What I meant was, why it is not directly built-in" but there was no need to include a new library so isn't it built-in? Does the std library automatically get imported (if yes, why doesn't the namespace automatically get set to std)?
"This is just a defined function. What I meant was, why it is not directly built-in" but there was no need to import a new library so isn't it built-in? Does the std library automatically get imported (if yes, why doesn't the namespace automatically get set to std)?
Well, a "defined function" most likely means that the function is already written and defined in the library. It probably isn't built in by design: only the core essentials are part of the language, and everything else lives in a library so the programmer can include what they want.
By built-in, usually it's a keyword, like for or while.
And no, std isn't automatically imported since it's designed so the programmer can choose what namespaces they want, like custom namespaces or std. An example of this being bad to automatically have std is this:
Say std were opened automatically and you then wrote using namespace foo;. If foo also had a cout, you would run into a huge problem as soon as you wrote this:
// code above
cout << "Hello, World" << endl;
// code below
How would the compiler know which namespace's cout to use: the default one, or the cout from your foo namespace? To prevent this, no default namespace is set, leaving the choice up to the programmer.
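The ambiguity is easy to reproduce with two made-up namespaces, foo and bar, that both provide a function called name. With both opened by using-directives, the unqualified call is rejected by the compiler, and qualifying the name resolves it.

```cpp
#include <cassert>
#include <string>

// Two hypothetical namespaces that happen to offer the same name.
namespace foo { inline std::string name() { return "foo"; } }
namespace bar { inline std::string name() { return "bar"; } }

using namespace foo;
using namespace bar;

// name();  // error: the call is ambiguous between foo::name and bar::name

// Qualifying the call tells the compiler which one is meant.
inline std::string pick_foo() { return foo::name(); }
inline std::string pick_bar() { return bar::name(); }
```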
Hope that helps!
This is just a defined function. What I meant was, why it is not directly built-in" but there was no need to include a new library so isn't it built-in?
The C++ library is a part of C++, by definition. However, it is not a part of the core language. C++ is a huge, huge language. Making it so the compiler knew every nook and cranny of the language right off the bat would make the compiler huge and slow to load. The philosophy instead is to keep the core somewhat small and give programmers the ability to extend the functionality by #including header files that specify what they need.
Why doesn't the namespace automatically get set to std?
That would essentially make all kinds of very common words into keywords. The list of words you shouldn't use (keywords, global function in C, words reserved by POSIX or Microsoft, ...) is already huge. Putting the C++ standard library in namespace std is a feature. Putting all of those names in the global namespace would be a huge misfeature.
In your code, you have the line:
using std::swap;
So the call to swap doesn't need std::. For assert, it was defined as a macro, so it also would not need std::. If you did not use using, then other than macros, you would have to use std:: to refer to functions and objects provided by the standard C++ library.
The standard C++ library normally gets linked into your program when you build it into an executable. From this point of view, you might consider it "built-in". However, the term "built-in" would usually mean the compiler treats the word swap like a keyword, which is not the case here. swap is a function template defined in the algorithm header file (since C++11 its primary home is <utility>), and assert is a macro defined in cassert.
A namespace is a convenience to allow parts of software to be easily partitioned by name. So if you wanted to define your own swap function, you could put it into your own namespace.
namespace mine {
    template <typename T> void swap (T &a, T &b) { /*...*/ }
}
And it would not collide with the standard, or with some library that defined swap without a namespace.
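This is also where the using std::swap; line from the question earns its keep. In the sketch below, the type Widget, the call counter and the helper functions are all hypothetical. An unqualified swap call finds mine::swap through argument-dependent lookup when the arguments are Widgets, and falls back to std::swap for types such as int.

```cpp
#include <cassert>
#include <utility>

namespace mine {
    struct Widget { int id; };

    // Counts calls so the example can show which swap was chosen.
    inline int& swap_calls() { static int calls = 0; return calls; }

    // Custom swap for Widget, found by argument-dependent lookup.
    inline void swap(Widget& a, Widget& b) {
        ++swap_calls();
        const int tmp = a.id;
        a.id = b.id;
        b.id = tmp;
    }
}

template <typename T>
void generic_swap(T& a, T& b) {
    using std::swap;  // fallback for types without their own swap
    swap(a, b);       // unqualified: mine::swap is preferred for Widget
}

inline bool widget_swap_uses_mine() {
    mine::Widget a{1}, b{2};
    const int before = mine::swap_calls();
    generic_swap(a, b);
    return a.id == 2 && b.id == 1 && mine::swap_calls() == before + 1;
}

inline bool int_swap_uses_std() {
    int x = 3, y = 5;
    generic_swap(x, y);
    return x == 5 && y == 3;
}
```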