I want to define private and protected to public.
#define private public
#define protected public
Is this safe in C++?
No, this almost certainly results in undefined behaviour.
From n4296 17.6.4.3.1 [macro.names]/2 (via @james below):
A translation unit shall not #define or #undef names lexically identical to keywords, to the identifiers listed in Table 2, or to the attribute-tokens described in 7.6.
private and public are keywords. Simply doing a #define on one of them is undefined behavior if you use anything in the C++ standard library:
17.6.4.1/1 [constraints.overview]
This section describes restrictions on C++ programs that use the facilities of the C++ standard library.
If you do not, the restriction in 17.6.4.3.1 does not seem to apply.
The other way it could lead to violation is if you use the same structure with two different definitions. While most implementations may not care that the two structures are identical other than public vs private, the standard does not make that guarantee.
Despite that, the most common kind of UB is 'it works', as few compilers care.
But that does not mean it is 'safe'. It may be safe in a particular compiler (examine the docs of said compiler: this would be a strange guarantee to give, however!). Few compilers (if any) will provide the layout guarantees (and mangling guarantees) required for the above to work explicitly if you access the same structure through two different definitions, for example, even if the other possibilities of error are more remote.
Many compilers will 'just work'. That does not make it safe: the next compiler version could make one of a myriad of changes and break your code in difficult (or easy) to detect ways.
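One concrete way the two definitions can differ, as a minimal sketch: under the C++11 through C++20 rules, a class whose non-static data members have mixed access control is not standard-layout, so turning private into public changes an observable property of the type. The type names below are hypothetical, and the second struct simply spells out what the first would look like after the macro is applied.

```cpp
#include <type_traits>

// Hypothetical class as its author wrote it: mixed access control.
struct AsWritten {
public:
    int id;
private:
    int secret;
};

// The same members as they would appear after `#define private public`.
struct AfterDefine {
public:
    int id;
public:
    int secret;
};

// Mixed access control means the type is not standard-layout...
static_assert(!std::is_standard_layout<AsWritten>::value,
              "mixed access: not standard-layout");
// ...while the all-public version is.
static_assert(std::is_standard_layout<AfterDefine>::value,
              "all public: standard-layout");
```

This does not show that any particular compiler lays the two out differently, only that the language treats them as different kinds of type.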
Only do something like this if the payoff is large.
I cannot find evidence that #define-ing a keyword is undefined behavior if you never include any standard library headers and do not use it to make a definition different in two compilation units. So in a highly restricted program, it may be legal. In practice, even if it is legal, it still isn't 'safe', both because that legality is extremely fragile, and because compilers are unlikely to test against that kind of language abuse, or care if it leads to a bug.
As an example of how fragile it is: the undefined behavior caused by #define private foo does not seem to be restricted to doing it before the #include of a standard header.
It is allowed only in a translation unit that doesn't in any way (even indirectly) include a standard header, but if you do there are restrictions such as this:
(17.6.4.3.1) A translation unit shall not #define or #undef names lexically identical to keywords [...]
Regardless of that, it's usually a bad idea - mucking around with access modifiers isn't "safe" by any common meaning of the word, even if it won't cause any immediate problems in itself.
If you're using library code, there are usually good reasons for things being protected.
If you want to make things public temporarily, e.g. for testing purposes, you could use a special conditional macro for that part of the class:
#if TESTING
#define PRIVATE_TESTABLE public
#else
#define PRIVATE_TESTABLE private
#endif
class Foo
{
public:
    Foo();
    void operation();
PRIVATE_TESTABLE:
    int some_internal_operation();
private:
    int some_internal_data;
};
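For illustration, here is a self-contained variant of the class above as a test build might see it. This is only a sketch: the member bodies, the initial value 21 and the data() accessor are hypothetical additions, and TESTING is defined inline here where a real project would more likely pass it on the compiler command line.

```cpp
// Assumption: normally supplied by the test build (e.g. -DTESTING=1).
#define TESTING 1

#if TESTING
#define PRIVATE_TESTABLE public
#else
#define PRIVATE_TESTABLE private
#endif

class Foo
{
public:
    Foo() : some_internal_data(21) {}
    void operation() { some_internal_data = some_internal_operation(); }
    int data() const { return some_internal_data; }  // hypothetical accessor
PRIVATE_TESTABLE:
    // Reachable from test code only when TESTING is non-zero.
    int some_internal_operation() { return some_internal_data * 2; }
private:
    int some_internal_data;
};
```

With TESTING set, a test can call some_internal_operation() directly; in a normal build the same call fails to compile. The macro must be set consistently for every translation unit that includes the header.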
There is nothing illegal about doing this, it's just crazy.
You have to use this definition for all the compilation units (otherwise you might get the linker failing because of the name mangling).
If you are asking out of curiosity then this is perfectly legal (if confusing) C++; if you are asking because you think this is a good idea so you don't have all those pesky access permissions, then you are on a very bad path. There are specific semantic reasons for using protected and private, which serve to reduce the complexity of code and explain the 'contract' that modules have with each other. Using this define makes your code nearly unreadable for practiced C++ programmers.
The syntax is correct but the semantics are wrong.
Access modifiers are for humans only so that you by accident don't use a method or field etc. that you are not supposed to access/change.
You could just as well make everything public when writing new code, but in old code it can definitely break something. It would work, but it would of course also be more error-prone. Imagine what IntelliSense would suggest if you had access to everything every time. Access modifiers not only protect code that could break something if used in the wrong way, they also help IntelliSense show you only the members that are relevant in a particular context.
For example, i want to see the code of function toupper() to understand how it works, is there any way? I have searched and opened string.h library, but didn't find anything.
From a strict language point of view, you cannot "see the code" of a standard function, because the C++ language standard only defines functions' prototypes and behaviours, not how they are implemented.
In fact, from a strict language point of view, a standard function like toupper does not even have to have source code, because a standard header, like <string.h> does not even have to be a file!
Of course, in practice, you will probably never encounter a C++ implementation in which standard headers are not files, because files are just a natural and simple implementation of headers. This means that in practice, for the header <string.h>, there is actually a C++ source file called "string.h" somewhere on your computer. Just find it and open it.
I have searched and opened string.h library, but didn't find anything.
Then you have not looked close enough. Hint: This file most likely includes one or more other header files.
Note that if you actually looked for toupper, that function is not in <string.h> anyway. Look in <ctype.h> instead. cppreference.com is a good online reference to tell you which headers contain which functions.
http://en.cppreference.com/w/c/string/byte/toupper
Again, this does not mean that the corresponding header file of your compiler contains that function directly, but it may directly or indirectly include some other file which contains it.
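As a small usage sketch (the helper name to_upper_copy is hypothetical): in C++ the same function is available as std::toupper from <cctype>. The cast to unsigned char matters because passing a negative char value other than EOF is undefined behavior.

```cpp
#include <cctype>
#include <string>

// Returns an upper-cased copy of `text`, converting one char at a time.
std::string to_upper_copy(std::string text) {
    for (char& ch : text) {
        // Convert via unsigned char so negative char values are never passed.
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
    return text;
}
```

Nothing here depends on how the implementation defines toupper internally, only on its documented interface.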
In any case, beware of what you will see inside of your compiler's header files. It will usually be a lot more complicated than you may think, and, more importantly, it will often use constructs you are not allowed to use in your own code; after all, the code in those files is internal to the compiler implementation, and the compiler has a lot of privileges you don't have, for example using otherwise forbidden identifiers like _STD_BEGIN. Also expect a lot of completely non-standard #pragmas and other non-portable stuff.
Another important thing to keep in mind is that you are not supposed to dig through a function's implementation to find out what it does. In badly written software, i.e. software with confusing interfaces and no documentation (which exists everywhere in the real world), you unfortunately have to do this, provided you have access to the source code.
But C++ standard functions are perfectly documented and have, with some arguable exceptions, well-designed interfaces. It may be interesting, and educating, and sometimes even necessary for debugging, to look into their implementation on your system, but don't let this possibility keep you from learning two important software-engineering skills:
Reading documentation.
Programming to interfaces, not to implementations.
Yes, of course you can (though perhaps not for every implementation). For example, the glibc implementation defines the toupper function as:
#define __ctype_toupper \
((int32_t *) _NL_CURRENT (LC_CTYPE, _NL_CTYPE_TOUPPER) + 128)
int
toupper (int c)
{
return c >= -128 && c < 256 ? __ctype_toupper[c] : c;
}
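The technique in that snippet is a lookup table addressed through a pointer offset by 128, so that indices from -128 to 255 are all valid. A simplified, self-contained sketch of the same idea follows. It is not glibc's code: the names are hypothetical, the table is built at first use instead of coming from locale data, and it assumes an ASCII-compatible character set.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

namespace sketch {

// 384 entries cover every index in [-128, 255] once shifted by +128.
inline const std::array<std::int32_t, 384>& upper_table() {
    static const std::array<std::int32_t, 384> table = [] {
        std::array<std::int32_t, 384> t{};
        for (int c = -128; c < 256; ++c) {
            // Assumption: ASCII, where 'a'..'z' are contiguous.
            int mapped = (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
            t[static_cast<std::size_t>(c + 128)] = mapped;
        }
        return t;
    }();
    return table;
}

inline int to_upper(int c) {
    // Point at the entry for index 0 so negative indices stay in bounds.
    const std::int32_t* base = upper_table().data() + 128;
    return c >= -128 && c < 256 ? base[c] : c;
}

}  // namespace sketch
```

Values outside the table range are returned unchanged, mirroring the range check in the original.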
On the std-proposals list, the following code was given:
#include <vector>
#include <algorithm>
#include <iostream>
void foo(const std::vector<int> &v) {
#ifndef _ALGORITHM
    std::for_each(v.begin(), v.end(), [](int i){ std::cout << i; });
#endif
}
Let's ignore, for the purposes of this question, why that code was given and why it was written that way (as there was a good reason but it's irrelevant here). It supposes that _ALGORITHM is a header guard inside the standard header <algorithm> as shipped with some known standard library implementation. There is no inherent intention of portability here.
Now, _ALGORITHM would of course be a reserved name, per:
[C++11: 2.11/3]: In addition, some identifiers are reserved for use by C++ implementations and standard libraries (17.6.4.3.2) and shall not be used otherwise; no diagnostic is required.
[C++11: 17.6.4.3.2/1]: Certain sets of names and function signatures are always reserved to the implementation:
Each name that contains a double underscore _ _ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.
Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
I was always under the impression that the intent of this passage was to prevent programmers from defining/mutating/undefining names that fall under the above criteria, so that the standard library implementors may use such names without any fear of conflicts with client code.
But, on the std-proposals list, it was claimed that this code is itself ill-formed for merely referring to such a reserved name. I can now see how the use of the phrase "shall not be used otherwise" from [C++11: 2.11/3]: may indeed suggest that.
One practical rationale given was that the macro _ALGORITHM could expand to some code that wipes your hard drive, for example. However, taking into account the likely intention of the rule, I'd say that such an eventuality has more to do with the obvious implementation-defined* nature of the _ALGORITHM name, and less to do with it being outright illegal to refer to it.
* "implementation-defined" in its English language sense, not the C++ standard sense of the phrase
I'd say that, as long as we're happy that we are going to have implementation-defined results and that we should investigate what that macro means on our implementation (if it exists at all!), it should not be inherently illegal to refer to such a macro provided we do not attempt to modify it.
For example, code such as the following is used all over the place to distinguish between code compiled as C and code compiled as C++:
#ifdef __cplusplus
extern "C" {
#endif
and I've never heard a complaint about that.
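For completeness, a minimal sketch of the full pattern, including the closing half of the guard. The function add_ints is hypothetical; in a real project the declaration would live in a header shared by C and C++ sources.

```cpp
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical function that gets C linkage in both languages. */
int add_ints(int a, int b);

#ifdef __cplusplus
}  /* closes extern "C" */
#endif

/* The definition keeps the linkage given by the earlier declaration. */
int add_ints(int a, int b) { return a + b; }
```

The only reserved identifier referred to is __cplusplus, whose meaning is specified by the standard.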
So, what do you think? Does "shall not be used otherwise" include simply writing such a name? Or is it probably not intended to be so strict (which may point to an opportunity to adjust the standard wording)?
Whether it's legal or not is implementation-specific (and identifier-specific).
When the Standard gives the implementation the sole right to use these names, that includes the right to make the names available in user code. If an implementation does so, great.
But if an implementation doesn't expressly give you the right, it is clear from "shall not be used otherwise" that the Standard does not, and you have undefined behavior.
The important part is "reserved to the implementation". It means that the compiler vendor may use those names and even document them. Your code may then use those names as documented. This is often used for extensions like __builtin_expect, where the compiler vendor avoids any clash with your identifiers (that are declared by your code) by using those reserved names. Even the standard uses them for things like __func__ and __has_include to make sure it doesn't break existing (legal) code when adding new features.
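A minimal sketch of using a reserved name as documented: GCC and Clang document __builtin_expect, so code may use it on those compilers and fall back elsewhere. The macro SKETCH_LIKELY and the function clamp_non_negative are hypothetical names chosen for this example.

```cpp
// Use the vendor-documented builtin only where the vendor provides it.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_LIKELY(x) __builtin_expect(!!(x), 1)
#else
#define SKETCH_LIKELY(x) (x)
#endif

// Returns `value`, or 0 if it is negative; the hint marks the common path.
int clamp_non_negative(int value) {
    if (SKETCH_LIKELY(value >= 0)) {
        return value;
    }
    return 0;
}
```

The hint only affects optimization, so the function behaves the same on compilers that take the fallback branch.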
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1882
Each identifier that contains a double underscore __ or begins with an underscore followed by an uppercase letter is reserved to the implementation for any use.
Note the words "any use". (Similar text occurs both before and after that defect fix is applied.)
__cplusplus is defined by the standard. _ALGORITHM is reserved by the standard to be used by implementations. These seem quite different? (The two sections of the standard do conflict, in that one states that __cplusplus is reserved for any use, and another uses it specifically, but I think that the winner of that conflict is clear).
The _ALGORITHM identifier could, under the standard, be used as part of a pre-processing step to say "replace this source code with hard drive deleting code". Its existence (prior to pre-processing, or after) could be sufficient to completely change your program behavior.
Now this is unlikely, but I do not think it results in a non-conforming implementation. It is a matter of quality of implementation only.
An implementation is free to document and define what _ALGORITHM means. For example, it could document that it is a header guard for <algorithm>, and indicates if that header file has been included. Treating your current <algorithm> implementation as documentation is probably going too far.
I'd guess using __cplusplus in C mode is technically "just as bad" as using _ALGORITHM, but this question is a c++ question, not a c question. I haven't delved into the c standard to look for quotes about it.
The names in [cpp.predefined] are different. Those have a specified meaning, so an implementation can't reserve them for any use, and using them in a program has a well-defined portable meaning. Using an implementation-specific identifier like the example of _ALGORITHM is ill-formed because it violates a shall-rule.
Yes, I'm fully aware of multiple examples where the library specification uses "shall" to mean "this is a requirement on user code, and violations are UB, not ill-formed".
Regarding whether it's UB or implementation-defined, running an ill-formed program results in UB. The standard wording clearly says the program is ill-formed, UB occurs if the implementation still chooses to accept the program and run it.
So, if a program uses the identifier _ALGORITHM, that program is ill-formed, and running such a program is UB, but that does not mean it doesn't work fine on an implementation that uses _ALGORITHM as an include guard, nor does it mean that it doesn't work fine on an implementation that doesn't.
If users are concerned about such ill-formedness and potential UB, and said users want to write portable C++, they shouldn't use reserved identifiers in portable C++ programs. If users accept that regardless of the standard prohibiting such a use, no practical implementation will wipe your hard drive, they can freely use such reserved identifiers, but by the letter of the standard, such uses are still ill-formed.
Historically, the purpose for making the use of such tokens "undefined behavior" is that compilers are free to attach any meaning they want to any such token that are not defined within the C standard. For example, on some embedded processors, using __xdata as a storage class for a variable will ask that it be stored in an area of RAM which is slower to access than the normal variable-storage area, but is much larger. On typical processors of that family, storage for "normal" variables would be limited to about 100 bytes, but storage for xdata variables may be much larger--up to 64K. The standard says basically nothing about what compilers are allowed to do with such directives, although typically (I'm not sure if the standard mandates this behavior, though I'm unaware of compilers violating it) such tokens are generally ignored within code that is disabled using a #if or similar directives.
Some libraries' header files will start their own internal identifiers with something that starts with two underscores but includes a pattern that's unlikely to be used by a compiler for any purpose (e.g. version 23 of the Foozle library might prefix its identifiers with __FZ23). It would be perfectly legitimate for a future compiler to use identifiers starting with __FZ23 for other purposes, and if that were to happen the Foozle library would need to be changed to use something else. If, however, a major compiler upgrade would likely necessitate rewrites of the Foozle library for other reasons anyway, that risk may be acceptable compared to the risk of identifiers conflicting with outside code.
Note also that some project header files which are targeted toward a processor that requires __ directives may conditionally define macros with those names when compiled for other processors, for example:
#ifndef USE_XDATA
#define __XDATA
#endif
though a somewhat better pattern would generally be:
#ifdef USE_XDATA
#define XDATA __XDATA
#else
#define XDATA
#endif
When writing new code, the latter pattern is often better, but the former pattern may sometimes be useful when adapting existing code written on a platform that requires __XDATA so that it may be used both on platforms that use/require that directive and on platforms that do not.
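A short usage sketch of the second pattern, with a hypothetical buffer name and size: project code only ever writes XDATA, which expands to the compiler-specific token when USE_XDATA is defined and to nothing otherwise.

```cpp
#include <cstddef>

#ifdef USE_XDATA
#define XDATA __XDATA   // compiler-specific storage directive
#else
#define XDATA           // expands to nothing on other platforms
#endif

// Hypothetical large buffer that should live in the bigger, slower memory
// area where such an area exists.
XDATA unsigned char log_buffer[4096];

std::size_t log_buffer_capacity() { return sizeof log_buffer; }
```

Because the project-level name XDATA is not reserved, this version never defines a reserved identifier itself.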
Whether or not it is legal is a matter of local law. Whether it means anything, and if so, what, is a matter for the language definition. When you use a name that's reserved to the implementation the behavior of your program is undefined. That means that the language definition does not tell you what the program does. Nothing more, nothing less. If the compiler you're using documents what a particular reserved identifier does, then you can use that identifier with that compiler. If you hunt through headers and guess what various un-documented identifiers mean you might be able to use them, but don't be surprised if your code breaks when a subsequent update changes something.
Don't get hung up on __cplusplus. It's core language, and the stuff about double underscores, etc. is library. If that's not convincing, just consider it a glitch. You can use __cplusplus in C++ programs; its meaning is well defined.
The following compiles, links and runs just fine (on Xcode 5.1 / clang):
#include <iostream>
class C { int foo(); };
int main(int argc, const char * argv[])
{
    C c;
    std::cout << "Hello world!";
}
However, C::foo() is not defined anywhere, only declared. I don't get any compiler or linker warnings / errors, apparently because C::foo() is never referenced anywhere.
Is there any way I can emit a warning that in the whole program no definition for C::foo() exists even though it is declared? An error would actually be better.
Thanks!
There are good reasons why it is not easily feasible. A set of header files could declare many functions, some of which are provided by additional libraries. You may want to #include such headers without using all of these functions (for instance, if you only want to use some #define-d constant).
Alternatively, it is legitimate to have some header and to implement (in your library) only a subset of the API defined by the header files.
And a C++ or C header file could also define the interface of code defined by potential plugins, for programs which usually run without plugins. Many programs accepting plugins are declaring the plugin interface in their header file.
If you really wanted to have such a check, you might perhaps consider customizing GCC with MELT; however, such a check is non trivial to implement currently (and you'll need link time optimization too).
Perhaps try calling all functions in your implementation map and adding a try catch that spits out some warning if they segfault.
You don't. Or rather, this is not the job of the compiler
I guess I'm just repeating what other said in the comments, but:
It's actually a feature (see unimplemented private constructor)
It wouldn't really help, as the "bug" here is that no proper code cleanup was done prior to committing the code, and an unimplemented and unused function is really your smallest problem then. What about all the other stuff that hasn't been cleaned up? The likelihood of having an implemented and unused function seems just as high to me, and it's the same mess more or less.
Rather than worry about this specific case, I would check if this was just a one time glitch, or if your development team could improve some procedures that would prevent such things in the future.
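The "unimplemented private constructor" feature mentioned above is the classic pre-C++11 way to make a class non-copyable, and it relies on declared-but-undefined members being accepted. A minimal sketch with hypothetical class names, next to the C++11 spelling of the same intent:

```cpp
#include <type_traits>

// Pre-C++11 idiom: copy operations are declared private and never defined.
class LegacyNoCopy {
public:
    LegacyNoCopy() {}
private:
    LegacyNoCopy(const LegacyNoCopy&);             // intentionally undefined
    LegacyNoCopy& operator=(const LegacyNoCopy&);  // intentionally undefined
};

// C++11 and later: say the same thing explicitly with = delete.
class ModernNoCopy {
public:
    ModernNoCopy() = default;
    ModernNoCopy(const ModernNoCopy&) = delete;
    ModernNoCopy& operator=(const ModernNoCopy&) = delete;
};

static_assert(!std::is_copy_constructible<LegacyNoCopy>::value,
              "copying is blocked by access control");
static_assert(!std::is_copy_constructible<ModernNoCopy>::value,
              "copying is blocked by = delete");
```

A warning for every declared-but-undefined function would fire on the first class even though it is written exactly as intended.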
As far as languages such as C++ are concerned, detecting and reporting undefined functions would defeat one of the important features they provide. Virtual and pure virtual functions are one of the important mechanisms by which C++ implements run-time polymorphism.
If you are developing a library to be used by its clients, you might have declared-but-not-defined virtual functions. You may leave the definition for its clients. In such cases, if the compiler were to report such undefined functions, it won't be beneficial.
I've just been researching the use and benefits/pitfalls of the C++ keyword inline on the Microsoft Website and I understand all of that.
My question is this: if the compiler evaluates functions to see if inlining them will result in the code being more efficient and the inline keyword is only a SUGGESTION to the compiler, why bother with the keyword at all?
EDIT: A lot of people are moaning about my use of __inline instead of inline. I'd like to point out that __inline is the Microsoft specific one: so it's not wrong, it's just not necessarily what you're used to. (Also fixed the website link)
EDIT2: Re-formatted the question to indicate the inline keyword (used across all of C++) instead of the Microsoft-specific __inline keyword.
Firstly, it is not __inline, but inline.
Secondly, the effect inline has within the One Definition Rule is undeniably significant. It allows you to define the functions multiple times and have the compiler to handle it.
Thirdly, with regard to the actual inlining, this is your way to express your opinion about that function not only to the compiler but also to those who might read your code later. In many cases it is a way to let off the steam, so to say. Basically, it is a way for you to tell the others and yourself: "I feel that this function is too small (or too specialized) to justify the calling overhead, so don't hold me responsible for this travesty. I did all I could. If not for your stupid company-wide coding standard, I would've made it a macro". In that regard it is a sort of formalized comment.
Fourthly, seeing that you used an implementation-specific spelling of the keyword, I'd note that some implementations offer you alternative keywords that give you the opportunity to be more... er... persuasive in your desire to have that function inlined. In MS compiler that would be __forceinline.
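A hedged sketch of how such implementation-specific spellings are commonly wrapped. SKETCH_FORCE_INLINE and twice are hypothetical names; __forceinline is the MSVC keyword mentioned above, and __attribute__((always_inline)) is the GCC/Clang counterpart.

```cpp
// Pick the strongest inlining request the current compiler understands.
#if defined(_MSC_VER)
#define SKETCH_FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_FORCE_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_FORCE_INLINE inline  // fall back to the standard keyword
#endif

SKETCH_FORCE_INLINE int twice(int x) { return 2 * x; }
```

Even these stronger forms are requests the compiler can decline in some situations, so they change how hard you ask, not what the function does.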
The keyword is inline and not __inline.
inline is not a mere suggestion: it is the best standard-compliant way to include a function definition in a header file without breaking the One Definition Rule.
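A minimal sketch of that use, written as the contents of a hypothetical header square.h: because the function is declared inline, every source file that includes the header may contain this same definition without a multiple-definition error at link time.

```cpp
// --- contents of a hypothetical header, "square.h" ---
#ifndef SQUARE_H
#define SQUARE_H

// Defined in the header; `inline` permits one identical definition per
// translation unit that includes it.
inline int square(int x) { return x * x; }

#endif  // SQUARE_H
```

Without the keyword, including this header from two source files of the same program would give the linker two definitions of square.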
Judging from the documentation in MSDN, I think it's a purely pragmatic thing, mostly for the benefit of C programs.
Since inline is a valid identifier name in C, another keyword is needed in C.
Personally, I would only consider using it if my source code could end up being shared between C and C++ programs (in Visual Studio, obviously).
What is the proper layout of a C++ .h file?
What I mean is header guard, includes, typedefs, enums, structs, function declarations, class definitions, classes, templates, etc, etc
I am porting a code base that is over 10 years old, and moving from Codewarrior 8 to a modern compiler is proving interesting, as things seem to be all over the place. I get a lot of "does not name a type" errors, errors about declarations without a type being forbidden, etc.
There is no silver bullet regarding how to organize your headers.
However one important rule is to keep it consistent across the project so that all persons involved in the project know what to expect.
In my headers, typedefs and defines usually come at the top of the file, followed by class/template definitions, but that cannot be regarded as a rule.
A rule that I follow for C++ is one header per class, which usually keeps the headers small enough to allow grasping the content and finding things without scrolling too much.
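As an illustration only, one common ordering looks like the sketch below. Every name in it is hypothetical, and it is one convention among many, not a layout the language requires.

```cpp
#ifndef WIDGET_H
#define WIDGET_H

// 1. Includes this header itself depends on.
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// 2. Typedefs and enums.
typedef std::vector<std::string> NameList;
enum Color { Red, Green, Blue };

// 3. Plain structs.
struct Point { int x; int y; };

// 4. Function declarations.
std::size_t count_names(const NameList& names);

// 5. Class definitions.
class Widget {
public:
    explicit Widget(Color color) : color_(color) {}
    Color color() const { return color_; }
private:
    Color color_;
};

// 6. Templates and inline definitions.
template <typename T>
T twice(const T& value) { return value + value; }

inline std::size_t count_names(const NameList& names) { return names.size(); }

}  // namespace sketch

#endif  // WIDGET_H
```

Whatever order is chosen, the header includes everything it uses itself, so it compiles regardless of what was included before it.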
It depends on what you mean by proper. If you mean language-enforced, there really isn't one. In fact, you don't even have to name it ".h". I've seen ".c" files #include'd in working commercial code (name withheld to protect the guilty). #include is just a preprocessor hack to get some kind of rough modularity in the language by allowing files to textually include other files. Anything else you tend to see as standard practice is just useful idioms people have developed over time.
That doesn't help your current issue though.
I'd guess that what you are actually seeing is a lot of missing symbols due to platform differences. Nothing due to weirdly-formed .h files at all.
It is possible that the old code was written to work with an old K&R-style C compiler. They had oddities like implicit function declarations (any reference to an undeclared routine assumed it returned int and all its parameters were int). You could try seeing if your compiler has a K&R flag, but a lot of the flagged stuff may actually be latent errors in the old code.
It sounds like you're running into assumptions made based on the previous implementation (Codewarrior). For example:
#include <iostream>
int main() {
std::cout << "string literal\n";
return 0;
}
This relies on iostream including something it's not required to declare: the operator<<(ostream&, char const*) overload (it's a free function, not a method of ostream like the others). And to be completely unambiguous, #include <ostream> is also required above. In C++, library headers are allowed to include any other library header, in general, so this problem crops up whenever someone inadvertently depends on this.
(That the extra header is required in this particular circumstance is considered a flaw by many, including me, and almost all implementations do provide the declaration of this function in iostream. Since C++11 the standard itself specifies that <iostream> includes <ostream>, so strictly this is only an issue for C++03. It is still the shortest common example I know of to illustrate the problem.)
It's often more subtle and complicated than this simple example, but the core issue is the same. The solution is to check every header to make sure it includes any libraries it requires, starting with the ones giving you the errors. E.g. #include <vector> and make sure you use std::vector (to avoid relying on it being in the global namespace, which is done in some, mostly old and obsolete now, implementations) when you get "vector does not name a type".
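A minimal sketch of the fix for the "vector does not name a type" case, using a hypothetical Inventory type: the header that mentions the type includes <vector> itself and spells the name with its std:: qualification.

```cpp
// Include what this code uses directly instead of relying on another header.
#include <cstddef>
#include <vector>

struct Inventory {
    std::vector<int> item_ids;  // qualified name, no reliance on a global `vector`
    std::size_t size() const { return item_ids.size(); }
};
```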
You might also be running into dependent types, in which case you'd add typename.
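A small sketch of the dependent-type case, with a hypothetical function name: inside a template, a nested type that depends on a template parameter has to be introduced with typename, otherwise the compiler assumes the name is not a type.

```cpp
#include <vector>

// Returns the first element of `c`, or a value-initialized element if empty.
template <typename Container>
typename Container::value_type first_or_default(const Container& c) {
    // `typename` tells the compiler that const_iterator names a type.
    typename Container::const_iterator it = c.begin();
    if (it == c.end()) {
        return typename Container::value_type();
    }
    return *it;
}
```

Leaving out typename in the places marked here is a typical source of "does not name a type" style errors when older template code is moved to a stricter compiler.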
I think the best thing you can do is to look at how the headers of existing libraries are laid out.