While I was on a short break, my workplace switched to using a static code analyzer.
They ran it over the project I am working on and one particular problem flagged by the analyzer goes like this (simplified example):
struct calcSomething
{
    int result;
    calcSomething() : result(0) {}
    void operator()(const int v) { /*does something*/ }
    operator int() const { return result; }
};
void foo()
{
    std::vector<int> myvector(10);
    // exercise for reader: stick some values in `myvector`
    int result = std::for_each(myvector.begin(), myvector.end(), calcSomething());
}
The analyzer flags the following issues:
warning: CodeChecker: 'operator int' must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
operator int() const { return result; }
The suggested fix to the functor reads:
struct calcSomething
{
    ...
    explicit operator int() const { return result; }
};
But if I fix my functor as suggested, the static analyzer quickly flags the following issue:
warning: CodeChecker: no viable conversion from '(anonymous namespace)::calcSomething' to 'int' [clang-diagnostic-error]
I now need to add the explicit cast:
void foo()
{
    ...
    int total = static_cast<int>(std::for_each(myvector.begin(), myvector.end(), calcSomething()));
}
The example above is a simplification of the real problem; the real code would only add filler and no substance.
I have seen plenty of examples of functors like the one I describe here in text books and programming reference web pages.
I have never considered these unsafe. I have never seen anyone flag these as unsafe.
So does the code analyzer have a point?
Or is it a little overzealous to make my functor's conversion operator explicit and as a result make me add the static cast?
Purely from an aesthetic point, I feel that a simple problem with an elegant solution now accrues a lot of ugly syntactic padding.
But perhaps that is the price we pay for writing safe(r) code.
side note: TIL that explicit applies not only to ctors
Edit
It seems some people are unable to read beyond the example code I provided (pretty textbook stuff) and still suggest other algorithms/idioms, completely failing to see that the actual question is about conversion operators on functors whose sole purpose is to calculate and return an algorithm's result.
If the question was about how to improve on an adding algorithm, then the title would have said so.
So I decided to hide any implementation details in this edit to make it easier for these people.
Sorry that some of the comments below now no longer make any sense, but the record got stuck so I had to move the needle a bit in order to move things forward (hopefully).
I would do away with any conversion operators altogether. What's wrong with:
int result = std::for_each(...).get();
where get() does the same as your current operator int.
You know that the result of for_each is not an integer; it's your function object. Why would you want to avoid making the conversion from a function object to a value explicit? It is, by all means, a questionable idea. Sure, you can still do it, but you want clean, warning-free code, right? Well, clean, warning-free code, in my book, should not auto-convert functions to integers. I agree static_cast is almost equally ugly; that's why I am suggesting a named function.
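A minimal sketch of that suggestion, assuming the functor simply sums its inputs (the question hides the real computation) and that the accessor is called get() as in the answer. The helper name sum_values is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the question's functor: it accumulates a sum.
struct calcSomething
{
    int result;
    calcSomething() : result(0) {}
    void operator()(const int v) { result += v; }
    // Named accessor instead of a conversion operator.
    int get() const { return result; }
};

// Hypothetical helper: std::for_each returns the functor after it has been
// applied to every element, so get() reads the accumulated state.
int sum_values(const std::vector<int>& values)
{
    return std::for_each(values.begin(), values.end(), calcSomething()).get();
}
```

The call site stays a one-liner, and nothing converts to int unless get() is spelled out.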
Related
Say I have a class C that I want to be able to implicitly cast to bool to use in if statements.
class C {
public:
    ...
    operator bool() { return data ? true : false; }
private:
    void * data;
};
and
C c;
...
if (c) ...
But the cast operator has a conditional which is technically overhead (even if relatively insignificant). If data was public I could do if (c.data) instead which is entirely possible and does not involve any conditionals. I doubt that the compiler will do any implicit conversion involving a conditional in the latter scenario, since it will likely generate a "jump if zero" or "jump if not zero" which doesn't really need any Boolean value, which the CPU will most likely have no notion of anyway.
My question is whether the typecast operator overload will indeed be less efficient than directly using the data member.
Note that I did establish that if the typecast directly returns data it also works, probably using the same type of implicit (hypothetical and not really happening in practice) conversion that would be used in the case of if (c.data).
Edit: Just to clarify, the point of the matter is actually a bit hypothetical. The dilemma is that Boolean is itself a hypothetical construct (which didn't initially exist in C/C++), in reality it is just integers. As I mentioned, the typecast can directly return data or use != instead, but it is really not very readable, but even that is not the issue. I don't really know how to word it to make sense of it better, the C class has a void * that is an integer, the CPU has conditional jumps which use integers, the issue is that abiding to the hypothetical Boolean construct that sits in the middle mandates the extra conditional. Dunno if that "clarification" made things any more clear though...
My question is whether the typecast operator overload will indeed be less efficient than directly using the data member.
Only examining your compiler output, with the specific optimisation flags you'd like to use, can tell you for sure, and even then it might change after some seemingly irrelevant change like adding an extra variable somewhere in the calling context, or with the next compiler release.
More generally, C++ wouldn't be renowned for speed if the optimisers didn't tend to handle this kind of situation perfectly, so your odds are very good.
Further, write working code then profile it and you'll learn a lot more about what performance problems are actually significant.
It depends on how smart your compiler's optimizer is. I think they should be smart enough to remove the useless ? true: false operation, because the typecast operation should be inlined.
Or you could just write this and not worry about it:
operator bool() { return data; }
Since there's a built-in implicit typecast from void* to bool, data gets typecast on the way out the function.
I don't remember if the conditional in if expects bool or void*; at one point, before C++ added bool, it was the latter. (The iostream classes provided operator void* for this purpose back then.)
On modern compilers these two functions produce the same machine code:
bool toBool1(void* ptr) {
    return ptr ? true : false;
}
bool toBool2(void* ptr) {
    return ptr;
}
So it really doesn't matter.
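To tie this back to the class in the question, here is a sketch with the conversion written both ways. The names C1, C2 and branch are hypothetical, and the explicit keyword assumes a C++11 compiler. Both classes can be tested directly in an if statement, because a condition is contextually converted to bool.

```cpp
// Variant with the conditional from the question.
struct C1 {
    void* data;
    explicit C1(void* p) : data(p) {}
    explicit operator bool() const { return data ? true : false; }
};

// Variant that compares against the null pointer directly.
struct C2 {
    void* data;
    explicit C2(void* p) : data(p) {}
    explicit operator bool() const { return data != nullptr; }
};

// Hypothetical helper: uses the object as an if condition.
template <class T>
int branch(const T& c)
{
    if (c) return 1;
    return 0;
}
```

Whether the two variants compile to identical instructions still depends on the compiler and flags, as the answers above point out; the sketch only shows that they behave the same.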
I was watching Bjarne Stroustrup on YouTube and was trying to figure out why this is considered bad; he said it is bad C++98-style code:
void setInt(const unsigned int &i)
void takeaString(const std::string &str)
I mean, you are passing a reference to a constant, so you save yourself the copy operation, and it isn't even like passing a pointer, so it doesn't have to be dereferenced, so why is it bad?
In pre-C++11, the general rule of thumb is if you don't modify the argument, pass a built-in type by value and an object of a class or struct by const&, because objects of classes or structs are typically so big that passing by const& pays in terms of performance.
Now that's a fairly arbitrary rule, and you'll also see exceptions, of course (e.g. iterators in the standard library) but it works well in practice and is an established idiom. When you see f(int const &i) or f(std::string s) in some other programmer's code, then you will want to know the reason, and if there's no apparent reason, people will be confused.
In C++11, the story may be different. A lot of people claim that due to new language features (move semantics and rvalue references), passing big objects by value is not a performance problem anymore and may even be faster. Look at this article: "Want Speed? Pass by Value." However, when you look at past Stack Overflow discussions, you will also find that there are experienced programmers opposed to this view.
Personally, I've not made up my mind on this. I consider C++11 too new for me to really judge what's good and bad.
Nevertheless, C++11 is often irrelevant if you have to use a pre-C++11 compiler for whatever reason, so it's important to know the pre-C++11 rules in any case.
Here is when it is good:
bool session_exists(heavy_key_t const& key)
{
    // why would we ever want to copy the key if we don't need to
    return sessions.find(key) != sessions.end();
}
This is when passing argument by reference is possibly not so good:
struct session {
    heavy_key_t key_;
    session(heavy_key_t const& key):
        key_(key) // <-- we are taking a copy anyway, so why not let the compiler do it
    {}
};
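Under C++11, the usual way to act on that comment is to take the parameter by value and move it into the member. A sketch, assuming heavy_key_t is just an alias for std::string (the answer does not define it):

```cpp
#include <string>
#include <utility>

// Assumption: heavy_key_t stands in for any type that is expensive to copy
// but cheap to move.
typedef std::string heavy_key_t;

struct session {
    heavy_key_t key_;
    // By-value sink parameter: an lvalue argument is copied once into `key`,
    // an rvalue argument is moved into it, and in both cases `key` is then
    // moved into the member.
    explicit session(heavy_key_t key) : key_(std::move(key)) {}
};
```

Callers that pass a temporary pay for moves only, while callers that pass a named key still pay for exactly one copy, the same as with the const& version.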
And another one that works just fine on values thanks to copy elision optimization and RVO:
template <class T, class Merger>
T merge(T state, T update, Merger const& merger) {
    // merger is still by reference, we don't need the copy of it
    return merger(std::move(state), std::move(update));
}
I'm thinking of replacing all instances of the safe bool idiom with explicit operator bool in code which already uses C++11 features (so the fact that older compilers don't recognize explicit conversion operators will not matter), so I'd like to know if it can cause some subtle problems.
Thus, what are all the possible incompatibilities (even the most minute ones) that can be caused by switching from old and dull safe bool idiom to new and shiny explicit operator bool?
EDIT: I know that switching is a good idea anyway, for the latter is a language feature, well-understood by the compiler, so it'll work no worse than what's in fact just a hack. I simply want to know the possible differences.
Probably the biggest difference, assuming your code is free of bugs (I know, not a safe assumption), will be that in some cases, you may want an implicit conversion to exactly bool. An explicit conversion function will not match.
struct S1
{
    operator S1*() { return 0; } /* I know, not the best possible type */
} s1;
struct S2
{
    explicit operator bool() { return false; }
} s2;
void f()
{
    bool b1 = s1; /* okay */
    bool b2 = s2; /* not okay */
}
Only if you've used the safe-bool conversion incorrectly in your code will explicit operator bool be incompatible, as it won't let you do things incorrectly that easily. Otherwise, it should work without any problem. In fact, even if there is a problem, you should still switch to explicit operator bool, because doing so will expose the problem in your use of the safe-bool conversion.
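To see where the difference shows up, here is a sketch using the standard type traits; ExplicitBool, as_int and negate are hypothetical names. An explicit operator bool still works wherever the language performs a contextual conversion (if, while, !, &&, ||, the condition of ?:) and with static_cast, but not in copy-initialization or when passed to a bool parameter.

```cpp
#include <type_traits>

struct ExplicitBool {
    bool ok;
    explicit operator bool() const { return ok; }
};

// Implicit conversion (copy-initialization, argument passing) is rejected...
static_assert(!std::is_convertible<ExplicitBool, bool>::value,
              "bool b = x; does not compile");
// ...while direct-initialization and static_cast<bool>(x) are accepted.
static_assert(std::is_constructible<bool, ExplicitBool>::value,
              "bool b(x); and static_cast<bool>(x) compile");

// Contextual conversions need no cast.
int as_int(const ExplicitBool& x) { return x ? 1 : 0; }  // condition of ?:
bool negate(const ExplicitBool& x) { return !x; }        // operand of !
```

Code that relied on the old idiom in an initializer or a function argument is therefore what needs a static_cast<bool> after the switch.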
According to this article, some compilers emit inefficient instructions for a safe-bool implementation that uses a member function pointer:
When people started using this idiom, it was discovered that there was an efficiency penalty on some compilers — the member function pointer caused a compiler headache resulting in slower execution when the address was fetched. Although the difference is marginal, the current practice is typically to use a member data pointer instead of a member function pointer.
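For readers who have not met the idiom, here is a stripped-down sketch of the member-data-pointer variant the quote describes. Handle, Tag, safe_bool and is_set are hypothetical names, and a production version would also disable == and != between two handles.

```cpp
#include <type_traits>

class Handle {
    struct Tag { int dummy; };
    typedef int Tag::*safe_bool;  // pointer to data member, not to member function
    void* data_;
public:
    explicit Handle(void* p) : data_(p) {}
    // A non-null member pointer means "true"; a null one means "false".
    operator safe_bool() const { return data_ ? &Tag::dummy : nullptr; }
};

// A pointer to member converts to bool but not to int, so `if (h)` works
// while `int n = h;` does not compile.
static_assert(std::is_convertible<Handle, bool>::value, "usable as a condition");
static_assert(!std::is_convertible<Handle, int>::value, "no conversion to int");

// Hypothetical helper: tests the handle in a boolean context.
bool is_set(const Handle& h) { return h ? true : false; }
```

Note that, unlike explicit operator bool, this still allows `bool b = h;`, which is one of the behavioural differences discussed above.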
How can I prevent the last line of this code from compiling?
#include <boost/optional.hpp>
int main()
{
    typedef boost::optional<int> int_opt;
    int_opt opt = 0;
    bool x = opt; // <- I do not want this to compile
}
The last line doesn't examine opt's contained int value, but instead compiles as a type conversion to bool, and doesn't seem to be what the user intended.
The safe bool idiom seems to be relevant here?
The whole point of boost::optional is to enable code like this:
void func(boost::optional<int> optionalArg)
{
    if (optionalArg) {
        doSomething(*optionalArg);
    }
}
So the implicit conversion to bool is a feature, and should not be prevented from compiling.
The problem that you describe used to be the case for older versions of Boost. Since the 1.56 release, boost::optional has an explicit conversion to bool and the code that you show no longer compiles (exactly the way you wanted). See here.
If you're using optional then you need to be able to determine if it's set before using it. The way this is implemented is with the (effectively bool) conversion.
It doesn't in my mind follow that the user didn't want what's actually written there: They should know that it's an optional and that they're checking it for validity.
Since the conversion is a built in part of boost::optional I'm not aware of any way to directly remove it.
You could of course implement a wrapper class for your particular int need that provides just the parts of the optional interface that you want, possibly with an explicit function that checks validity.
Alternately you could always use template<class T> inline T const* get_pointer ( optional<T> const& opt ) ; or its non-const version when working with optionals to make it explicit what's happening.
I've been looking for an example that shows how to implement constraints in C++ (or a boost library that lets me do this easily), but without much luck. The best I could come up with off the top of my head is:
#include <boost/function.hpp>
#include <boost/lambda/lambda.hpp>
template<typename T>
class constrained
{
public:
    constrained(boost::function<bool (T)> constraint, T defaultValue, T value)
    {
        ASSERT(constraint(defaultValue));
        ASSERT(constraint(value));
        this->value = value;
        this->defaultValue = defaultValue;
        this->constraint = constraint;
    }
    // A default argument cannot refer to another parameter, so the
    // two-argument form is a separate overload.
    constrained(boost::function<bool (T)> constraint, T defaultValue)
    {
        ASSERT(constraint(defaultValue));
        this->value = defaultValue;
        this->defaultValue = defaultValue;
        this->constraint = constraint;
    }
    void operator=(const T &assignedValue)
    {
        if(constraint(assignedValue))
            value = assignedValue;
    }
private:
    T value;
    T defaultValue;
    boost::function<bool (T)> constraint;
};
int main(int argc, char* argv[])
{
    constrained<int> foo(boost::lambda::_1 > 0 && boost::lambda::_1 < 100, 5, 10);
    foo = 20; // works
    foo = -20; // fails
    return 0;
}
Of course there's probably some more functionality you'd want from a constraint class. This is just an idea for a starting point.
Anyway, the problem I see is that I have to overload all operators that T defines in order to make it really behave like a T, and there is no way for me to find out what those are. Now, I don't actually need constraints for that many different types, so I could just leave out the template and hard code them. Still, I'm wondering if there's a general (or at least more succinct/elegant) solution or if there's anything seriously wrong with my approach.
Looks good for a tiny example. But be sure to implement all the operators and to handle invalid values somehow.
foo = 100; // works
++foo; // should throw an exception or perform an assert
Use Boost.Operators to help with the operator overloads.
It would probably also be good to make the failure policy a template parameter: either an exception or an assertion.
I'd use such a class. It is always better to have an index parameter that automatically checks the vector range and asserts.
void foo( VectorIndex i );
You don't need to overload all operators as others have suggested, though this is the approach that offers maximum control because expressions involving objects of type constrained<T> will remain of this type.
The alternative is to only overload the mutating operators (=, +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=, pre and post ++, pre and post --) and provide a user-defined conversion to T:
template<typename T>
class constrained {
    ... // As before, plus overloads for all mutating operators
public:
    operator T() const {
        return value;
    }
};
This way, any expression involving a constrained<T> object (e.g. x + y where x is int and y is constrained<int>) will be an rvalue of type T, which is usually more convenient and efficient. No safety is lost, because you don't need to control the value of any expression involving a constrained<T> object -- you only need to check the constraints at a time when a T becomes a constrained<T>, namely in constrained<T>'s constructor and in any of the mutating operators.
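A compact sketch of this alternative, under a few assumptions that are not in the original: it uses std::function (C++11) instead of boost::function, it throws std::out_of_range on a violation rather than silently ignoring the assignment, and it implements only three of the mutating operators. The predicate between_1_and_99 is a hypothetical example.

```cpp
#include <functional>
#include <stdexcept>
#include <utility>

template <typename T>
class constrained {
public:
    constrained(std::function<bool(const T&)> constraint, T value)
        : constraint_(std::move(constraint)), value_(check(std::move(value))) {}

    // Only mutating operators are overloaded; each one re-checks the constraint
    // before the stored value is changed.
    constrained& operator=(const T& v)  { value_ = check(v); return *this; }
    constrained& operator+=(const T& v) { value_ = check(value_ + v); return *this; }
    constrained& operator++()           { value_ = check(value_ + T(1)); return *this; }

    // Every non-mutating expression goes through this conversion and yields a T.
    operator T() const { return value_; }

private:
    // Returns v if it satisfies the constraint, otherwise throws (assumed policy).
    T check(T v) const {
        if (!constraint_(v)) throw std::out_of_range("constraint violated");
        return v;
    }
    std::function<bool(const T&)> constraint_;  // declared first: check() needs it
    T value_;
};

// Hypothetical predicate matching the range used in the question.
inline bool between_1_and_99(const int& v) { return v > 0 && v < 100; }
```

With this, an expression such as 1 + foo is a plain int, and a failed assignment or increment leaves the stored value unchanged because check() throws before value_ is written.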
Boost.Constrained_Value may be of interest to you. It was reviewed last December, but it is not in the latest Boost release. IIRC, the review was mostly positive, but the decision is still pending.
I agree with Mykola Golubyev that boost operators would help.
You should define all the operators that you require for all the types you are using.
If any of the types you are using don't support the operator (for example the operator++()), then code that calls this method will not compile but all other usages will.
If you want to use different implementations for different types then use template specialisation.
I might just be confused, but if you are facing parameters that must not violate specific constraints, wouldn't it be easiest to create a class for them, checking for constraints in constructors and assignment operators?
Boost actually had such a library under discussion (I don't know what became of it). I've also written my own version of such a type, with slightly different behaviour (less flexible, but simpler). I've blogged an admittedly somewhat biased comparison here: Constrained vs. restricted value types
Edit: apparently Eric knows better what happened to boost's implementation.