Does using leading underscores actually cause trouble? - c++

The C/C++ standard reserves all identifiers that either lead with an underscore (plus an uppercase letter if not in the global namespace) or contain two or more adjacent underscores. Example:
int _myGlobal;
namespace _mine
{
void Im__outta__control() {}
int _LivingDangerously;
}
But what if I just don't care? What if I decide to live dangerously and use these "reserved" identifiers anyway? Just how dangerously would I be living?
Have you ever actually seen a compiler or linker problem resulting from the use of reserved identifiers by user code?
The answers below, so far, amount to, "Why break the rules when doing so might cause trouble?" But imagine that you already had a body of code that broke the rules. At what point would the cost of trouble from breaking the rules outweigh the cost of refactoring the code to comply? Or what if a programmer had developed a personal coding style that called for wild underscores (perhaps by coming from another language, for instance)? Assuming that changing their coding style was more or less painful to them, what would motivate them to overcome the pain?
Or I could ask the same question in reverse. What is it concretely that C/C++ libraries are doing with reserved words that a user is liable to fall afoul of? Are they declaring globals that might create name clashes? Functions? Classes? Each library is different, naturally, but how in general might this collision manifest?
I teach software students who come to me with these kinds of questions, and all I can tell them is, "It's against the rules." It's a superstitious, hand-waving answer. Moreover, in twenty years of C++ programming, I've never seen a compiler or linker error that resulted from breaking the reserved word rules.
A good skeptic, faced with any rule, asks, "Why should I care?" So: why should I care?

I now care because I just encountered a failure with underscores, large and old codebase, mostly aimed at Windows and compiled with VS2005 but some is also cross-compiled to Linux. While analyzing updates to a newer gcc, I rebuilt some under cygwin just for ease of syntax checking. I got totally unintelligible errors (to my tiny brain) out of a line like:
template< size_t _N = 0 > class CSomeForwardRef;
That produced an error like:
error: expected ‘>’ before numeric constant
Google on that error turned up https://svn.boost.org/trac/boost/ticket/2245 and https://svn.boost.org/trac/boost/ticket/7203 both of which hinted that a stray #define could get in the way. Sure enough, an examination of the preprocessed source via -E and a hunt thru the include path turned up some bits-related .h (forget which) that defined _N. Later in that same odyssey I encountered a similar problem with _L.
Edit: Not bits-related but char-related: /usr/include/ctype.h -- here are some samples together with how ctype.h uses them:
#define _L 02
#define _N 04
.
.
.
#define isalpha(__c) (__ctype_lookup(__c)&(_U|_L))
#define isupper(__c) ((__ctype_lookup(__c)&(_U|_L))==_U)
#define islower(__c) ((__ctype_lookup(__c)&(_U|_L))==_L)
#define isdigit(__c) (__ctype_lookup(__c)&_N)
#define isxdigit(__c) (__ctype_lookup(__c)&(_X|_N))
I'll be scanning the source for all underscored identifiers and weeding out via rename all those we created in error ...
Jon

The results may vary according to the specific complier you will use.
Regarding the "danger level" - every time you'll get a bug - you will have to wonder if it is originates from your implemented logic or from the fact you are not using the standard.
But that is not all... let's assume someone tells you: "it is perfectly safe!"
So, you can do that with no problem at all (only assuming..)
Will it redefine your thinking when you get to a bug or still you will be wondering if there is a slight chace he was wrong? :)
So, you see, no matter which answer you will get it can never be a good one.
(which makes me actually like your question)

Just how dangerously would I be living?
Dangerous enough to break your code in next compiler upgrade.
Think of the future, your code might not be portable and might break in future because future enhancement releases from your implementation might have exactly the same symbol name as you use.
Since the question has a pinch of: "It can be wrong yet how wrong can it be and when ever has it been wrong" flavor, I think Murphy's law answers this rather aptly:
"Anything that can go wrong will go wrong (When you are least expecting it)".[#]
[#] The (,) is my invention not Murphy's.

If you try to build your code somewhere where there's actually a conflict you will see strange build errors, or worse, no build error at all and incorrect runtime behavior.
I have seen someone use a reserved identifier which had to be changed when it caused build problems on a new platform.
It's not all that likely, but there's no reason to do it.

Related

Why don't C++ error messages describe the actual problem in the code?

Why do the Visual Studio error logs show the things caused by some error, rather than the error itself? I often find the error messages to be useless and meaningless.
When I make a mistake, like for example a circular dependency, it throws a bunch of errors like
syntax error: missing ';' instead of something like circular dependency detected.
When I forget to include some header and use it in my code, for example the std::map, it only says 'map' is not a member of 'std'
It never shows you what's actually wrong, it only shows the symptoms. I know that sometimes you can clearly see what's wrong based only on that, but I don't want to spend time figuring out what's wrong. I just want to fix it as soon as possible.
Why can't it be like Python with Pycharm IDE which actually shows you the actual error?
Even though this question might be considered out of topic, I'll attempt to give an answer to make it clear why one might find the experssed opinion (useless and meaningless) as "unfair".
Firstly this is not a Visual studio thing. If you've used other C++ compilers (gcc/clang/intel compiler) you'll notice the errors are pretty similar. Even though they might seem "useless", a little experience goes a long way. Not only are the errors precise, exact and reproducible, but they follow a standard specification and adhere to the formal description of the language.
Secondly we have to draw a line when comparing C++ to Python. C++ is a compiled language meaning that prior to creating an executable the entirety of your code must be written according to the language rules. This means that a runtime branch is checked even if it's not executed. Python on the other hand is an interpreted language. This means that it verifies code on the fly (having richer information on the execution context) so the following code will probably run fine:
def main():
a = []
a.append(1)
print(a)
# return 9'999 out of 10'000 times
a.shoot_foot(2) # list object has no attribute 'shoot_foot'
if __name__ == '__main__':
main()
This comes with the price that an error (not even calling this a bug since it violates language rules) may remain hidden and manifest in production.
Thirdly C++ is a strongly typed language. This means that types are statically checked, at the price of requiring you to be correct about them. Unlike the dynamically typed Python the type of an object cannot be modified at runtime which means increased type safety at the price of "being careful" when programming. It's a JS vs TS kind of thing, and people get accustomed to different styles; for me the lack of type safety is the thing I miss the most when working with Python. This power comes at the price of some of the most intricate error messages when doing type checking.
Lastly C++ is actively evolving towards improving error messages, making them more informative and shorter. This is not an easy task but features like concepts allow programmers to be explicit on the restrictions they impose to the type sytem. Additionally defensive programming techniques like static assertions can introduce a "fail switch" that stops compilation early resulting in less errors and keeping only the information about the offensive code.

When is it okay to create 'reserved' identifiers?

I figured one of the best ways I could learn and improve in programming is to look at various source codes. I was looking at Blender's source code and noticed something about the header files. Most of them used #ifndef include guards, where macros were surrounded by double underscores (e.g. __BMESH_CLASS_H__).
It got me thinking, the whole "just don't make anything that starts with underscores at all" advice is good for beginners, but I would think that in order to progress further in programming I should learn when creating my own reserved identifiers is and isn't appropriate.
I would think that in order to progress further in programming I should learn when creating my own reserved identifiers is and isn't appropriate.
Reserved identifiers are reserved for the implementation, which means roughly the compiler, its runtime library, and maybe parts of the operating system.
So, it's appropriate to create your own when your progression has led to you writing your own compiler or operating system. That's pretty much it.
There is only one case I know of where reserved identifiers are allowed to be self-made. All others are considered undefined behavior, and while they will still most likely work completely the same, it's against the standard and shouldn't be done.
That said, reserved identifiers can be made by you when interacting with certain components of your development environment. For example, some compilers may support something like __FILE_NAME__ and others may not, and it may even vary from compiler version to compiler version. If you are making it yourself for, say, backwards compatibility (i.e. adding a preprocessor definition to define said macro), then it is 100% okay, so long its implementation follows the requirements of what is expected of using said identifier exactly (e.g. it should substitute to the file name for __FILE_NAME__, not something else).

Can merely using (stable) third party library render my code not working

Say I have C++ project which has been working for years well.
Say also this project might (need to verify) contain undefined behaviour.
So maybe compiler was kind to us and doesn't make program misbehave even though there is UB.
Now imagine I want to add some features to the project. e.g. add Crypto ++ library to it.
But the actual code I add to it say from Crypto++ is legitimate.
Here I read:
Your code, if part of a larger project, could conditionally call some
3rd party code (say, a shell extension that previews an image type in
a file open dialog) that changes the state of some flags (floating
point precision, locale, integer overflow flags, division by zero
behavior, etc). Your code, which worked fine before, now exhibits
completely different behavior.
But I can't gauge exactly what author means. Does he say even by adding say Crypto ++ library to my project, despite the code from Crypto++ I add is legitimate, my project can suddenly start working incorrectly?
Is this realistic?
Any links which can confirm this?
It is hard for me to explain to people involved that just adding library might increase risks. Maybe someone can help me formulate how to explain this?
When source code invokes undefined behaviour, it means that the standard gives no guarantee on what could happen. It can work perfectly in one compilation run, but simply compiling it again with a newer version of the compiler or of a library could make it break. Or changing the optimisation level on the compiler can have same effect.
A common example for that is reading one element past end of an array. Suppose you expect it to be null and by chance next memory location contains a 0 on normal conditions (say it is an error flag). It will work without problem. But suppose now that on another compilation run after changing something totally unrelated, the memory organization is slightly changed and next memory location after the array is no longer that flag (that kept a constant value) but a variable taking other values. You program will break and will be hard to debug, because if that variable is used as a pointer, you could overwrite memory on random places.
TL/DR: If one version works but you suspect UB in it, the only correct way is to consistently remove all possible UB from the code before any change. Alternatively, you can keep the working version untouched, but beware, you could have to change it later...
Over the years, C has mutated into a weird hybrid of a low-level language and a high-level language, where code provides a low-level description of a way of performing a task, and modern compilers then try to convert that into a high-level description of what the task is and then implement efficient code to perform that task (possibly in a way very different from what was specified). In order to facilitate the translation from the low-level sequence of steps into the higher-level description of the operations being performed, the compiler needs to make certain assumptions about the conditions under which those low-level steps will be performed. If those assumptions do not hold, the compiler may generate code which malfunctions in very weird and bizarre ways.
Complicating the situation is the fact that there are many common programming constructs which might be legal if certain parts of the rules were a little better thought-out, but which as the rules are written would authorize compilers to do anything they want. Identifying all the places where code does things which arguably should be legal, and which have historically worked correctly 99.999% of the time, but might break for arbitrary reasons can be very difficult.
Thus, one may wish for the addition of a new library not to break anything, and most of the time one's wish might come true, but unfortunately it's very difficult to know whether any code may have lurking time bombs within it.

What's the deal with the CRT SECURE warnings/errors on visual studio?

I encountered this issue a dozen, if not a million times already: I compile a c++ program on visual studio and get a dozen, if not a million warnings and/or errors suggesting that I am doing something very dangerous and that there is no way my compiler will let me do that. the warnings/errors tell me that I am using a deprecated function and that I should consider using some other safer function that may or may not do the same thing as this one, but I have no idea what this one does in the first place since I did not write it.
After some research (I do it everytime, I am not a quick learner) I find out I am not the first one facing this particular problem, and I can coerce my compiler to work with this program with the proper macro definition (for the future readers who don't care about my question but want to compile their program, you have to define _CRT_SECURE_NO_DEPRECATE, don't you ever dare following visual studio's advice and using the allegedly safe function).
I have often read in the manual or on this very website, along with the answer, the fact that I should not do that if I don't know precisely what I am doing.
I must confess: I have no idea what I am doing, and I would be very grateful if someone would accept to explain it to me.
So here are my questions:
What are those functions that are unsafe? Why do they exist in the first place?
What is unsafe about them?
Why are they so often found in perfectly honourable libraries?
I have come to the understanding that there is no safe and portable alternative to those functions: why is it so? How about we have some people think about it and try to define a way to do it, and everyone would accept to do it that way, and we would call it standard maybe?
To tackle your questions in order:
They exist in the first place because the standard wrote them in such a way. Standards authors are human so don't think of everything and this left some security weaknesses in the C API. You can find a list of these deprecated functions at http://msdn.microsoft.com/en-us/library/ms235384.aspx.
Many of the functions are unsafe as they allow such things as buffer overruns to occur but other security vulnerabilities may be exposed depending on the function.
Honourable libraries generally try for some cross platform compatibility so I suspect will try to stick to stand C rather than using compiler specific functions and extensions.
The "perfect" standard will probably never exist as in my first point :) Some of the C API problems can be avoided using C++ but that's a big hammer to crack a small nut and brings security vulnerabilities of its own.

C++ Developer Tools: The Dark Areas [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
While C++ Standards Committee works hard to define its intricate but powerful features and maintain its backward compatibility with C, in my personal experience I've found many aspects of programming with C++ cumbersome due to lack of tools.
For example, I recently tried to refactor some C++ code, replacing many shared_ptr by T& to remove pointer usages where not needed within a large library. I had to perform almost the whole refactoring manually as none of the refactoring tools out there would help me do this safely.
Dealing with STL data structures using the debugger is like raking out the phone number of a stranger when she disagrees.
In your experience, what essential developer tools are lacking in C++?
My dream tool would be a compile-time template debugger. Something that'd let me interactively step through template instantiations and examine the types as they get instantiated, just like the regular debugger does at runtime.
In your experience, what essential developer tools are lacking in C++?
Code completion. Seriously. Refactoring is a nice-to-have feature but I think code completion is much more fundamental and more important for API discoverabilty and usabilty.
Basically, tools that require any undestanding of C++ code suck.
Code generation of class methods. When I type in the declaration you should be able to figure out the definition. And while I'm on the topic can we fix "goto declaration / goto definition" always going to the declaration?
Refactoring. Yes I know it's formally impossible because of the pre-processor - but the compiler could still do a better job of a search and replace on a variable name than I can maually. You could also syntax highlight local, members and paramaters while your at it.
Lint. So the variable I just defined shadows a higher one? C would have told me that in 1979, but c++ in 2009 apparently prefers me to find out on my own.
Some decent error messages. If I promise never to define a class with the same name inside the method of a class - do you promise to tell me about a missing "}". In fact can the compiler have some knowledge of history - so if I added an unbalanced "{" or "(" to a previously working file could we consider mentioning this in the message?
Can the STL error messages please (sorry to quote another comment) not look like you read "/dev/random", stuck "!/bin/perl" in front and then ran the tax code through the result?
How about some warnings for useful things? "Integer used as bool performance warning" is not useful, it doesn't make any performance difference, I don't have a choice - it's what the library does, and you have already told me 50 times.
But if I miss a ";" from the end of a class declaration or a "}" from the end of a method definition you don't warn me - you go out of your way to find the least likely (but theoretically) correct way to parse the result.
It's like the built in spell checker in this browser which happily accepts me misspelling wether (because that spelling is an archaic term for a castrated male goat! How many times do I write about soprano herbivores?)
How about spell checking? 40 years ago mainframe Fortran compilers had spell checking so if misspelled "WRITE" you didn't come back the next day to a pile of cards and a snotty error message. You got a warning that "WRIET" had been changed to WRITE in line X. Now the compiler happily continues and spends 10mins building some massive browse file and debugger output before telling you that you misspelled prinft 10,000 lines ago.
ps. Yes a lot of these only apply to Visual C++.
pps. Yes they are coming with my medication now.
If talking about MS Visual Studio C++, Visual Assist is a very handy tool for code completition, some refactorings - e.g. rename all/selected references, find/goto declaration, but I still miss the richness of Java IDEs like JBuilder or IntelliJ.
What I still miss, is a semantic diff tool - you know, one which does not compare the two files line-by-line, but statements/expressions. What I've found on the internet are only some abandoned tries - if you know one, please write in comment
The main problem with C++ is that it is hard to parse. That's why there are so very few tools out there that work on source code. (And that's also why we're stuck with some of the most horrific error messages in the history of compilers.) The result is, that, with very few exceptions (I only know doxygen and Visual Assist), it's down to the actual compiler to support everything needed to assist us writing and massaging the code. With compilers traditionally being rather streamlined command line tools, that's a very weak foundation to build rich editor support on.
For about ten years now, I'm working with VS. meanwhile, its code completion is almost usable. (Yes, I'm working on dual core machines. I wouldn't have said this otherwise, wouldn't I?) If you use Visual Assist, code completion is actually quite good. Both VS itself and VA come with some basic refactoring nowadays. That, too, is almost usable for the few things it aims for (even though it's still notably less so than code completion). Of course, >15 years of refactoring with search & replace being the only tool in the box, my demands are probably much too deteriorated compared to other languages, so this might not mean much.
However, what I am really lacking is still: Fully standard conforming compilers and standard library implementations on all platforms my code is ported to. And I'm saying this >10 years after the release of the last standard and about a year before the release of the next one! (Which just adds this: C++1x being widely adopted by 2011.)
Once these are solved, there's a few things that keep being mentioned now and then, but which vendors, still fighting with compliance to a >10 year old standard (or, as is actually the case with some features, having even given up on it), never got around to actually tackle:
usable, sensible, comprehensible compiler messages (como is actually pretty good, but that's only if you compare it to other C++ compilers); a linker that doesn't just throw up its hands and says "something's wrong, I can't continue" (if you have taught C++ as a first language, you'll know what I mean); concepts ('nuff said)
an IO stream implementation that doesn't throw away all the compile-time advantages which overloading operator<<() gives us by resorting to calling the run-time-parsing printf() under the hood (Dietmar Kühl once set out to do this, unfortunately his implementation died without the techniques becoming widespread)
STL implementations on all platforms that give rich debugging support (Dinkumware is already pretty good in that)
standard library implementations on all platforms that use every trick in the book to give us stricter checking at compile-time and run-time and more performance (wnhatever happened to yasli?)
the ability to debug template meta programs (yes, jalf already mentioned this, but it cannot be said too often)
a compiler that renders tools like lint useless (no need to fear, lint vendors, that's just wishful thinking)
If all these and a lot of others that I have forgotten to mention (feel free to add) are solved, it would be nice to get refactoring support that almost plays in the same league as, say, Java or C#. But only then.
A compiler which tries to optimize the compilation model.
Rather than naively include headers as needed, parsing them again in every compilation unit, why not parse the headers once first, build complete syntax trees for them (which would have to include preprocessor directives, since we don't yet know which macros are defined), and then simply run through that syntax tree whenever the header is included, applying the known #defines to prune it.
It could even be be used as a replacement for precompiled headers, so every header could be precompiled individually, just by dumping this syntax tree to the disk. We wouldn't need one single monolithic and error-prone precompiled header, and would get finer granularity on rebuilds, rebuilding as little as possible even if a header is modified.
Like my other suggestions, this would be a lot of work to implement, but I can't see any fundamental problems rendering it impossible.
It seems like it could dramatically speed up compile-times, pretty much rendering it linear in the number of header files, rather than in the number of #includes.
A fast and reliable indexer. Most of the fancy features come after this.
A common tool to enforce coding standards.
Take all the common standards and allow you to turn them on/off as appropriate for your project.
Currently just a bunch of perl scrips usullay has to supstitute.
I'm pretty happy with the state of C++ tools. The only thing I can think of is a default install of Boost in VS/gcc.
Refactoring, Refactoring, Refactoring. And compilation while typing. For refactorings I am missing at least half of what most modern Java IDEs can do. While Visual Assist X goes a long way, a lot of refactoring is missing. The task of writing C++ code is still pretty much that. Writing C++ code. The more the IDE supports high level refactoring the more it becomes construction, the more mallable the structure is the easier it will be to iterate over the structure and improve it. Pick up a demo version of Intellij and see what you are missing. These are just some that I remember from a couple of years ago.
Extract interface: taken a view classes with a common interface, move the common functions into an interface class (for C++ this would be an abstract base class) and derive the designated functions as abstract
Better extract method: mark a section of code and have the ide write a function that executes that code, constructing the correct parameters and return values
Know the type of each of the symbols that you are working with so that not only command completion can be correct for derived values e.g. symbol->... but also only offer functions that return the type that can be used in the current expression e.g. for
UiButton button = window->...
At the ... only insert functions that actually return a UiButton.
A tool all on it's own: Naming Conventions.
Intelligent Intellisense/Code Completion even for template-heavy code.
When you're inside a function template, of course the compiler can't say anything for sure about the template parameter (at least not without Concepts), but it should be able to make a lot of guesses and estimates. Depending on how the type is used in the function, it should be able to narrow the possible types down, in effect a kind of conservative ad-hoc Concepts. If one line in the function calls .Foo() on a template type, obviously a Foo member method must exist, and Intellisense should suggest it in the rest of the function as well.
It could even look at where the function is invoked from, and use that to determine at least one valid template parameter type, and simply offer Intellisense inside the function based on that.
If the function is called with a int as a template parameter, then obviously, use of int must be valid, and so the IDE could use that as a "sample type" inside the function and offer Intellisense suggestions based on that.
JavaScript just got Intellisense support in VS, which had to overcome a lot of similar problems, so it can be done. Of course, with C++'s level of complexity, it'd be a ridiculous amount of work. But it'd be a nice feature.