Should I really massively introduce the explicit keyword? - c++

When I used the (recently released) Cppcheck 1.69 on my code[1], it showed a whole lot of messages where I expected none. Disabling noExplicitConstructor proved that all of them were of exactly this kind.
But I found that I'm not the only one with a lot of new Cppcheck messages; look, for instance, at the results of the analysis of LibreOffice (which I'm allowed to show in public).
What would an experienced programmer do:
Suppress the check?
Massively introduce the explicit keyword?
[1] This is of course not my code but code I have to work on at work; it's legacy code: a mix of C and C++ in several (pre-)standard flavors (let's say C++98), and it's a pretty large code base.

I've been bitten in the past by performance hits introduced by implicit conversions as well as outright bugs. So I tend to always use explicit for all constructors that I do not want to participate in implicit conversions so that the compiler can help me catch my errors - and I then try to always also add a "// implicit intended" comment to the ctors where I explicitly intend for them to be used as converting ctors implicitly. I find that this helps me write more correct code with fewer surprises.
… So I'd say "yes, go add explicit" - in the long run you'll be glad you did - that's what I did when I first learned about it, and I'm glad I did.
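To make that concrete, here is a minimal sketch of the convention (the class names are made up):
class Socket {
public:
    explicit Socket(int fd);      // an int is not a Socket: no implicit conversion wanted
};

class Name {
public:
    Name(const char* s);          // implicit intended
};

void send_on(const Socket& s);

// send_on(42);          // error with explicit: was 42 really meant to become a Socket?
// send_on(Socket(42));  // fine: the conversion is now visible at the call site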

Related

Why can't constexpr just be the default?

constexpr permits expressions which can be evaluated at compile time to be ... evaluated at compile time.
Why is this keyword even necessary? Why not permit or require that compilers evaluate all expressions at compile time if possible?
The standard library has an uneven application of constexpr which causes a lot of inconvenience. Making constexpr the "default" would address that and likely improve a huge amount of existing code.
It already is permitted to evaluate side-effect-free computations at compile time, under the as-if rule.
What constexpr does is provide guarantees on what data-flow analysis a compliant compiler is required to do to detect[1] compile-time-computable expressions, and also allow the programmer to express that intent so that they get a diagnostic if they accidentally do something that cannot be precomputed.
Making constexpr the default would eliminate that very useful diagnostic ability.
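For instance (a small sketch; read_config is a made-up, runtime-only function):
constexpr int triple(int x) { return 3 * x; }    // guaranteed usable in constant expressions

int read_config();                               // ordinary runtime function

constexpr int bad(int x) { return read_config() + x; }
// typically rejected on the spot: no call to bad() could ever be a constant
// expression, so the compiler diagnoses the broken intent immediately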
[1] In general, requiring "evaluate all expressions at compile time if possible" is a non-starter, because detecting the "if possible" requires solving the Halting Problem, and computer scientists know that this is not possible in the general case. So instead a relaxation is used where the outputs are { "Computable at compile-time", "Not computable at compile-time or couldn't decide" }. And the ability of different compilers to decide would depend on how smart their test was, which would make this feature non-portable. constexpr defines the exact test to use. A smarter compiler can still pre-compute even more expressions than the Standard test dictates, but if they fail the test, they can't be marked constexpr.
Note: despite the below, I admit to liking the idea of making constexpr the default. But you asked why it wasn't already done, so to answer that I will simply elaborate on mattnewport's last comment:
Consider the situation today. You're trying to use some function from the standard library in a context that requires a constant expression. It's not marked as constexpr, so you get a compiler error. This seems dumb, since "clearly" the ONLY thing that needs to change for this to work is to add the word constexpr to the definition.
Now consider life in the alternate universe where we adopt your proposal. Your code now compiles, yay! Next year you decide to add Windows support to whatever project you're working on. How hard can it be? You'll compile using Visual Studio for your Windows users and keep using gcc for everyone else, right?
But the first time you try to compile on Windows, you get a bunch of compiler errors: this function can't be used in a constant expression context. You look at the code of the function in question, and compare it to the version that ships with gcc. It turns out that they are slightly different, and that the version that ships with gcc meets the technical requirements for constexpr by sheer accident, and likewise the one that ships with Visual Studio does not meet those requirements, again by sheer accident. Now what?
No problem you say, I'll submit a bug report to Microsoft: this function should be fixed. They close your bug report: the standard never says this function must be usable in a constant expression, so we can implement however we want. So you submit a bug report to the gcc maintainers: why didn't you warn me I was using non-portable code? And they close it too: how were we supposed to know it's not portable? We can't keep track of how everyone else implements the standard library.
Now what? No one did anything really wrong. Not you, not the gcc folks, nor the Visual Studio folks. Yet you still end up with un-portable code and are not a happy camper at this point. All else being equal, a good language standard will try to make this situation as unlikely as possible.
And even though I used an example of different compilers, it could just as well happen when you try to upgrade to a newer version of the same compiler, or even try to compile with different settings. For example: the function contains an assert statement to ensure it's being called with valid arguments. If you compile with assertions disabled, the assertion "disappears" and the function meets the rules for constexpr; if you enable assertions, then it doesn't meet them. (This is less likely these days now that the rules for constexpr are very generous, but was a bigger issue under the C++11 rules. But in principle the point remains even today.)
Lastly we get to the admittedly minor issue of error messages. In today's world, if I try to do something like stick a cout statement in a constexpr function, I get a nice simple error right away. In your world, we would have the same situation that we have with templates: deep stack traces all the way down to the very bottom of the implementation of output streams. Not fatal, but surely annoying.
This is a year and a half late, but I still hope it helps.
As Ben Voigt points out, compilers are already allowed to evaluate anything at compile time under the as-if rule.
What constexpr also does is lay out clear rules for expressions that can be used in places where a compile time constant is required. That means I can write code like this and know it will be portable:
constexpr int square(int x) { return x * x; }
...
int a[square(4)] = {};
...
Without the keyword and clear rules in the standard I'm not sure how you could specify this portably and provide useful diagnostics on things the programmer intended to be constexpr but don't meet the requirements.

C++ vs. D, Ada and Eiffel (horrible error messages with templates)

One of the problems of C++ is the horrible error messages we get from code that makes intensive use of templates and template metaprogramming. Concepts were designed to solve this problem, but unfortunately they will not be in the next standard.
I'm wondering: is this problem common to all languages that support generic programming, or is something wrong with C++ templates?
Unfortunately I don't know any other language that supports generic programming (Java and C# generics are too simplified and not as powerful as C++ templates).
So I'm asking you: do D, Ada, and Eiffel templates (generics) produce such ugly error messages too? Is it possible to have a language with a powerful generic programming paradigm but without ugly error messages? And if so, how do these languages solve this problem?
Edit: for downvoters. I really love C++ and templates. I'm not saying that templates are bad. Actually I'm a big fan of generic programming and template metaprogramming. I'm just asking why I'm getting such ugly error messages from compilers.
In general I found Ada compiler error messages for generics really not significantly more difficult to read than any other Ada compiler error messages.
C++ template error messages, on the other hand, are notorious for being error novels. The main difference I think is the way C++ does template instantiation. The thing is, C++ templates are much more flexible than Ada generics. It is so flexible, it is almost like a macro preprocessor. The clever folks in Boost have used this to implement things like lambdas and even whole other languages.
Because of that flexibility, the entire template hierarchy basically has to be compiled anew every time its particular permutation of template parameters is first encountered. Thus issues that resolve down to incompatibilities several layers down an API end up being presented to the poor API client to decipher.
In Ada, generics are actually strongly typed, and provide full information hiding to the client, just like normal packages and subroutines do. So if you do get an error message, it typically references just the one generic you are trying to instantiate, not the entire hierarchy used to implement it.
So yes, C++ template error messages are way worse than Ada's.
Now debugging is a different story entirely...
The problem, at heart, is that error recovery is difficult, whatever the context.
And when you factor in C and C++'s horrid grammars, you can only wonder that the error messages are not worse than they are! I am afraid that the C grammar was designed by people who didn't have a clue about the essential properties of a grammar, one of them being that the less reliance on context the better, and the other being that you should strive to make it as unambiguous as possible.
Let us illustrate a common error: forgetting a semi-colon.
struct CType {
    int a;
    char b;
}
foo
bar() { /**/ }
Okay, so this is wrong; where should the missing semi-colon go? Well, unfortunately it's ambiguous: it can go either before or after foo, because:
C considers it normal to declare a variable in stride after defining a struct
C considers it normal not to specify a return type for a function (in which case it defaults to int)
If we reason about it, we can see that:
if foo names a type, then it belongs to the function declaration
if not, it probably denotes a variable... unless of course we made a typo and it was meant to be written fool, which happens to be a type :/
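Spelled out, the two candidate fixes look something like this (a sketch; the first reading only works if foo names a type):
/* reading 1: the semicolon goes after the struct,
   and foo is the return type of bar() */
struct CType { int a; char b; };
foo bar() { /**/ }

/* reading 2: foo is a variable declared right after the struct,
   and bar() falls back to C's implicit int return type */
struct CType { int a; char b; } foo;
bar() { /**/ }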
As you can see, error recovery is downright difficult, because we need to infer what the writer meant, and the grammar is far from helpful. It is not impossible though, and most errors can indeed be diagnosed more or less correctly, and even recovered from... it just takes considerable effort.
It seems that the people working on gcc are more interested in producing fast code (and I mean fast, search for the latest benchmarks on gcc 4.6) and adding interesting features (gcc already implements most - if not all - of C++0x) than in producing easy-to-read error messages. Can you blame them? I can't.
Fortunately there are people who think that accurate error reporting and good error recovery are very worthy goals, and some of them have been working on Clang for quite a while, and they are continuing to do so.
Some nice features, off the top of my head:
Terse but complete error messages, which include the source ranges to expose exactly where the error emanated from
Fix-It notes when it's obvious what was meant
In which case the compiler parses the rest of the file as if the fix had been there already, instead of spewing lines upon lines of gibberish
(recent) avoiding the include stack for notes, to cut down on the cruft
(recent) trying to expose only the template parameter types that the developer actually wrote, and preserving typedefs (thus talking about std::vector<Name> instead of std::vector<std::basic_string<char, std::allocator<char>>, std::allocator<std::basic_string<char, std::allocator<char>>>>, which makes all the difference)
(recent) recovering correctly when the template keyword is missing in a call to a template method from within another template method
But each of those has required several hours to days of work.
They certainly didn't come for free.
Now, Concepts should (normally) have made our lives easier. But they were mostly untested and so it was deemed preferable to remove them from the draft. I must say I am glad for this. Given C++'s relative inertia, it's better not to include features that haven't been thoroughly reviewed, and the concept maps didn't really thrill me. Neither did they thrill Bjarne or Herb, it seems, as they said they would be rethinking Concepts from scratch for the next standard.
The article Generic Programming outlines many of the pros and cons of generics in several languages, including Ada in particular. Although lacking template specialization, all Ada generic instances are "equivalent to the instance declaration…immediately followed by the instance body". As a practical matter, error messages tend to occur at compile-time, and they typically represent familiar violations of type-safety.
D has two features to improve the quality of template error messages: Constraints and static assert.
// Use constraints to only allow a function to operate on random access
// ranges as defined in std.range. If something that doesn't satisfy this
// is passed, the compiler will error before even trying to instantiate
// fun().
void fun(R)(R range) if (isRandomAccessRange!(R)) {
    // Do stuff.
}

// Use static assert to check a high level invariant. If
// the predicate is false, the error message will be
// printed and compilation will stop before a screen
// worth of more confusing errors are encountered.
// This function takes any number of ranges to merge sort
// and the same number of temporary buffers to merge into.
void mergeSort(R...)(R ranges) {
    static assert(R.length % 2 == 0,
        "Must have equal number of ranges to be sorted and temporary buffers.");
    static assert(allSatisfy!(isRandomAccessRange, R),
        "All arguments to mergeSort must be random access ranges.");
    // Implementation
}
Eiffel has the best error messages of all because it has the best of all template systems. It is fully integrated into the language and works well because it is the only language that uses covariance in arguments.
Therefore it is much more than simple compiler copy-and-paste. Unfortunately, explaining the difference in a few lines is impossible; just go and have a look at EiffelStudio.
There are some efforts to improve the error messages. Clang, for example, has put quite a lot of emphasis on generating more easily readable compiler error messages. I've only been using it for a short while, but my experience of it so far has been quite positive compared to GCC's equivalent errors.

Expressions with no side effects in C++

See, what I don't get is, why should programs like the following be legal?
int main()
{
    static const int i = 0;
    i < i > i;
}
I mean, surely, nobody actually has any current programs that have expressions with no side effects in them, since that would be very pointless, and it would make parsing & compiling the language much easier. So why not just disallow them? What benefit does the language actually gain from allowing this kind of syntax?
Another example being like this:
int main() {
    static const int i = 0;
    int x = (i);
}
What is the actual benefit of such statements?
And things like the most vexing parse. Does anybody, ever, declare functions in the middle of other functions? I mean, we got rid of things like implicit function declaration, and things like that. Why not just get rid of them for C++0x?
Probably because banning them would make the specification more complex, which would make compilers more complex.
it would make parsing & compiling the language much easier
I don't see how. Why is it easier to parse and compile i < i > i if you're required to issue a diagnostic, than it is to parse it if you're allowed to do anything you damn well please provided that the emitted code has no side-effects?
The Java compiler forbids unreachable code (as opposed to code with no effect), which is a mixed blessing for the programmer, and requires a little bit of extra work from the compiler than what a C++ compiler is actually required to do (basic block dependency analysis). Should C++ forbid unreachable code? Probably not. Even though C++ compilers certainly do enough optimization to identify unreachable basic blocks, in some cases they may do too much. Should if (foo) { ...} be an illegal unreachable block if foo is a false compile-time constant? What if it's not a compile-time constant, but the optimizer has figured out how to calculate the value, should it be legal and the compiler has to realise that the reason it's removing it is implementation-specific, so as not to give an error? More special cases.
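For instance, a made-up snippet: should the body of this if be an error just because it can never run?
void trace_example() {
    const bool tracing = false;          // effectively a compile-time constant
    if (tracing) { /* logging code */ }  // deliberately "dead", and a very common pattern
}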
nobody actually has any current programs that have expressions with no side effects in them
Loads. For example, if NDEBUG is defined, then assert expands to a void expression with no effect. So that's yet more special cases needed in the compiler to permit some useless expressions, but not others.
The rationale, I believe, is that if it expanded to nothing then (a) compilers would end up throwing warnings for things like if (foo) assert(bar);, and (b) code like this would be legal in release but not in debug, which is just confusing:
assert(foo) // oops, forgot the semi-colon
foo.bar();
things like the most vexing parse
That's why it's called "vexing". It's a backward-compatibility issue really. If C++ now changed the meaning of those vexing parses, the meaning of existing code would change. Not much existing code, as you point out, but the C++ committee takes a fairly strong line on backward compatibility. If you want a language that changes every five minutes, use Perl ;-)
Anyway, it's too late now. Even if we had some great insight that the C++0x committee had missed, why some feature should be removed or incompatibly changed, they aren't going to break anything in the FCD unless the FCD is definitively in error.
Note that for all of your suggestions, any compiler could issue a warning for them (actually, I don't understand what your problem is with the second example, but certainly for useless expressions and for vexing parses in function bodies). If you're right that nobody does it deliberately, the warnings would cause no harm. If you're wrong that nobody does it deliberately, your stated case for removing them is incorrect. Warnings in popular compilers could pave the way for removing a feature, especially since the standard is authored largely by compiler-writers. The fact that we don't always get warnings for these things suggests to me that there's more to it than you think.
It's convenient sometimes to put useless statements into a program and compile it just to make sure they're legal - e.g. that the types involved can be resolved/matched etc.
Especially in generated code (macros, as well as more elaborate external mechanisms, and templates where policies or types may introduce meaningless expansions in some no-op cases), having fewer special uncompilable cases to avoid keeps things simpler.
There may be some temporarily commented code that removes the meaningful usage of a variable, but it could be a pain to have to similarly identify and comment all the variables that aren't used elsewhere.
While in your examples you show the variables being "int" immediately above the pointless usage, in practice the types may be much more complicated (e.g. operator<()) and whether the operations have side effects may even be unknown to the compiler (e.g. out-of-line functions), so any benefit's limited to simpler cases.
C++ needs a good reason to break backwards (and retained C) compatibility.
Why should doing nothing be treated as a special case? Furthermore, whilst the above cases are easy to spot, one could imagine far more complicated programs where it's not so easy to identify that there are no side effects.
As an iteration of the C++ standard, C++0x has to be backward compatible. Nobody can assert that the statements you wrote do not exist in some piece of critical software written/owned by, say, NASA or DoD.
Anyway, regarding your very first example, the parser cannot assert that i is a static constant expression and that i < i > i is a useless expression -- e.g. if i is a templated type, i < i > i is an "invalid variable declaration", not a "useless computation", and still not a parse error.
Maybe the operator was overloaded to have side effects, like cout << i; this is the reason why such expressions cannot be removed now. On the other hand, C# forbids expressions other than assignments and method calls from being used as statements, and I believe this is a good thing as it makes the code clearer and more semantically correct. However, C# had the opportunity to forbid this from the very beginning, which C++ did not.
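A contrived sketch of that point: with overloaded operators, even i < i > i can have observable effects.
#include <iostream>

struct Chatty {
    Chatty operator<(Chatty) const { std::cout << "<"; return *this; }
    Chatty operator>(Chatty) const { std::cout << ">"; return *this; }
};

int main() {
    Chatty i;
    i < i > i;   // prints "<>": an expression statement with real side effects
}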
Expressions with no side effects can turn up more often than you think in templated and macro code. If you've ever declared std::vector<int>, you've instantiated template code with no side effects. std::vector must destroy all its elements when it is released, in case T is a class type. This requires, at some point, a statement similar to ptr->~T(); to invoke the destructor. int has no destructor though, so the call has no side effects and will be removed entirely by the optimizer. It's also likely to be inside a loop; then the entire loop has no side effects, so the whole loop is removed by the optimizer.
So if you disallowed expressions with no side effects, std::vector<int> wouldn't work, for one.
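A rough sketch of the kind of code involved (not the actual library implementation):
template <typename T>
void destroy_range(T* first, T* last) {
    for (; first != last; ++first)
        first->~T();   // for T = int this pseudo-destructor call does nothing at all,
                       // so the whole loop is side-effect-free and optimized away
}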
Another common case is assert(a == b). In release builds you want these asserts to disappear - but you can't redefine assert as an empty macro, otherwise statements like if (x) assert(a == b); suddenly put the next statement into the if statement - a disaster! In this case assert(x) can be redefined as ((void)0), which is a statement that has no side effects. Now the if statement works correctly in release builds too - it just does nothing.
These are just two common cases. There are many more you probably don't know about. So, while expressions with no side effects seem redundant, they're actually functionally important. An optimizer will remove them entirely so there's no performance impact, too.

Strange Pattern: all functions/methods return error-code using the same type in C++

In my last two projects I've seen the strange guideline, "All Methods/Functions should return error-code using some common ERROR_CODE type". In both projects ERROR_CODE is an int typedef.
Is there any good reason doing it in C++? Some MISRA requirement or something like that?
I can see only disadvantages:
If a function needs to return a value, it is returned through a reference argument, e.g.:
string s;
ERROR_CODE err = getString(s);
The importance of a function is not obvious; everything looks the same. The list of errors contains hundreds of entries, from low-level errors to domain-specific errors.
Have you experienced this programming style? Are there good arguments against it or for it?
I think it's a very bad style for several reasons.
Like you've said, it forces you to pass pointers/references to store the actual result of a function.
Like you've said, the unified error code is ugly because it's trying to unify all sorts of errors from all sorts of domains.
It creates an artificial dependency of all the program's modules on the error code system, making it awkward to reuse a single module or small subset of modules in other programs.
Further, since some of the error codes are domain-specific, it's actually introducing dependencies between unrelated object types/modules, since they're all dependent upon a component that's dependent upon the union of all of their possible error types.
My view is that any function/method which has more than a small manageable number of ways it can fail is either overly complex or poorly factored, probably both.
If you really want to return error codes, I would swap things around and pass the pointer to the error code as an argument to the function, and make the actual result the return value. Then I would choose one of these two approaches for implementing the error codes:
The simple way: throw away all abstraction of the error code and simply use int with a few universal error classes.
The heavy object oriented way: Provide a pointer to an internal "error object" where the base class is very abstract and can be shared between all components without introducing any dependency, and where each component defines its own component-specific error objects if needed.
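A hedged sketch of that second approach (all the names here are made up, not taken from the projects in question):
#include <memory>
#include <string>

struct Error {                                   // tiny shared base, no domain knowledge
    virtual ~Error() = default;
    virtual std::string message() const = 0;
};

struct ParseError : Error {                      // component-specific error type
    std::string message() const override { return "parse error"; }
};

// The real result is the return value; failures travel through the out parameter.
std::string getString(std::unique_ptr<Error>& err);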
A better approach if you're using C++ would probably be using exceptions...
I've seen it.
Kernel programming is that way, except when only one error is possible.
It doesn't sound like a great idea, but it's not all that bad of one either.
It's not unusual for teams to agree on a common means of returning errors, since this helps in creating a common 'look and feel' to the project's code, just like any other team-wide coding convention. This could help new team members to understand the overall picture quicker, and make maintenance within the team of other peoples' code a little more intuitive.
It's surprising to me that a C++ project is unifying behind errors rather than exceptions, however. There's a discussion of the pros and cons of using exceptions vs error codes here.
I guess one argument in favour of error code handling is if you are using a C-style API that leads you into this approach (cough... Win32... cough).
This idiom is quite common, especially in the C world.
Even though I don't use it myself and I think it does more harm than good (more on that in the other answers), I do see one advantage to it: a consistent way to report unexpected errors to the call site. Something like the errno variable, but easier to use.
For instance, consider a set of functions:
int a();
std::string b();
double c();
std::list<long> d();
Each of the above functions would indicate failure in a different way: a() could return -1, b() an empty string, c() 0.0, and d() an empty list. That's inconsistent and not quite intuitive. Now imagine a function whose result legitimately covers the entire range of the type it returns; that's even worse.
Some APIs also do:
int x(bool* ok);
But that also pollutes each function with an additional argument.
In C, unfortunately, there aren't many ways to do this nicely if you really need to design an API that indicates different types of failure. In the C++ world, however, you can just use exceptions.
I've seen the argument that exceptions might not work when linking to a C++ library compiled with a different compiler than the one used for your binaries. While that may well be true, in actuality even the linking process itself need not work (although everyone may be sticking to the standards), so theoretically this argument is void. In practice, however, it may be (I don't have experience here, sorry) that name-mangling conflicts rarely arise, alignment conflicts rarely arise, and, well, all other implementation-specific matters are widely agreed upon, except for exceptions.
The second argument I've seen is run-time performance. While stack unwinding in case of an exception is expensive, I've not yet seen a fair benchmark that compared exceptions to a realistic amount of return-code checking.
In my typical C++ I use a mix. I use the slower exceptions for stuff that I really don't expect to happen frequently or code paths that are measured to be rarely executed, but return codes for stuff that is more likely to break and probably called frequently.
Throwing exceptions in a tight loop because some funny condition holds true in every iteration is not cheap (assuming the loop body handles it).

Dangerous ways of removing compiler warnings?

I like to enforce a policy of no warnings when I check someone's code. Any warnings that appear have to be explicitly documented, as sometimes it's not easy to remove a warning, or doing so might require too many cycles or too much memory, etc.
But there is a downside to this policy: warnings may be removed in ways that are potentially dangerous, i.e. the method used actually hides the problem rather than fixing it.
The one I'm most acutely aware of is explicit casts, which might hide a bug.
What other potentially dangerous ways of removing compiler warnings in C(++) are there that I should look out for?
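For example (a made-up snippet): the cast makes a "possible loss of data" warning disappear, but the loss of data stays.
int truncate_offset(long long offset) {
    return (int)offset;   // warning silenced, truncation bug kept
}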
const correctness can cause a few problems for beginners:
// following should have been declared as f(const int & x)
void f( int & x ) {
    ...
}
later:
// n is only used to pass the parameter "4"
int n = 4;
// really wanted to say f(4)
f( n );
Edit1: In a somewhat similar vein, marking all member variables as mutable, because your code often changes them when const correctness says it really shouldn't.
Edit2: Another one I've come across (possibly from Java programmers) is to tack throw() specifications onto functions, whether they could actually throw or not.
Well, there's the obvious way - disabling a specific warning for parts of the code:
#pragma warning( disable : 4507 34 )
EDIT: As has been pointed out in the comments, it is sometimes necessary to use this in cases where you know that the warnings are OK (if it wasn't a useful feature, there would have been no reason to put it in in the first place). However, it is also a very easy way to "ignore" warnings in your code and still get it to compile silently, which is what the original question was about.
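If the disable really is justified, it at least helps to confine it with push/pop (MSVC syntax, shown as a sketch) so the rest of the file is still checked:
#pragma warning( push )
#pragma warning( disable : 4507 34 )
// ... the few lines that legitimately trigger these warnings,
//     with a comment explaining why they are safe here ...
#pragma warning( pop )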
I think it's a delicate issue. My view is that warnings should be checked thoroughly to see if the code is correct / does what is intended. But often there is correct code that will produce warnings, and trying to eliminate them just convolutes the code or forces a rewrite in a less natural way.
I recall that in a previous release I had correct and solid code that produced a couple of warnings, and coworkers started complaining about this. The code was much cleaner and did what it was intended to do. In the end the code went into production with the warnings.
Also, different compiler versions will produce different warnings, so it becomes more pointless to enforce a "no warnings" policy when the result depends on the mood of the compiler developers.
I want to emphasize how important it is to check all warnings at least once.
Btw I develop in C and C++ for embedded systems.
Commenting out (or worse, deleting) the code that generates the warning. Sure, the warning goes away, but you are more than a little likely to end up with code that doesn't do what you intend.
I also enforce a no-warnings rule, but you are right that you can't just remove the warning without careful consideration. And to be honest, at times I've left warnings in for a while because the code was right. Eventually I clean it up somehow, because once you have more than a dozen warnings in the build, people stop paying attention to them.
What you described is not a problem unique to warnings. I can't tell you how many times I've seen someone's bug fix for a crash be "I added a couple of NULL checks". You have to go to the root cause: Should that variable be NULL? If not, why was it?
This is why we have code reviews.
The biggest risk would be that someone would spend hours of development time to solve a minor warning that has no effect on the code. That would be a waste of time. Sometimes it's just easier to keep a warning and add a line of comment explaining why the warning occurs. (Until someone has time to resolve these trivial warnings.)
In my experience, resolving trivial warnings often adds two more days of work for developers. Those two days could make the difference between finishing before or after the deadline.