Automatically determine if user-defined function is equivalent to the implicit one


Sometimes users implement functions that do exactly what the implicitly defined versions would do. For example, a copy constructor that simply calls the copy constructor of each member.
struct A
{
int B;
A(const A& a) : B(a.B) { }
};
This is undesirable because it adds maintenance (for example, when class members are renamed or reordered) and reduces readability. Adding these functions also means that traits such as std::is_trivially_copy_constructible report that the type cannot be trivially copy constructed, even though in practice it could be.
I have a code base where this seems to be a common occurrence, and I would like to fix it by deleting these implementations. However, I am uneasy about removing a function that merely looks identical to the implicit one, in case it is not actually equivalent. Is there a method for determining whether a function is equivalent to its implicit version? (Any toolset, language variation, etc. is acceptable.)

My suggestion is not to try to determine programmatically whether these functions match the default implementation, because any difference might itself be a mistake (the author may have intended the normal default behavior).
Instead, I would write a set of unit tests that check the expected behavior of the various functions, and then make sure they still pass with the default implementations. Then you not only have a test framework for future enhancements, you can also be confident the functions do what you wanted.
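A minimal sketch of that kind of test, assuming two hypothetical types (WithUserCopy and WithDefaultCopy stand in for a class before and after its hand-written copy constructor is removed). The same behavioral check runs against both, and the trait mentioned in the question shows the one observable difference:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class with a hand-written copy constructor.
struct WithUserCopy {
    int b;
    explicit WithUserCopy(int v) : b(v) {}
    WithUserCopy(const WithUserCopy& other) : b(other.b) {}
};

// The same class after the hand-written copy constructor is deleted.
struct WithDefaultCopy {
    int b;
    explicit WithDefaultCopy(int v) : b(v) {}
};

// Behavioral test: a copy must carry the same member value as the original.
template <typename T>
bool copy_preserves_value(int v) {
    T original(v);
    T copy(original);
    return copy.b == original.b;
}

// A user-provided copy constructor is never trivial, even if it only copies members.
static_assert(!std::is_trivially_copy_constructible<WithUserCopy>::value,
              "user-provided copy constructor is not trivial");
static_assert(std::is_trivially_copy_constructible<WithDefaultCopy>::value,
              "implicit copy constructor of an int-only struct is trivial");
```

If copy_preserves_value holds for both types, removing the hand-written constructor did not change the behavior that this test covers. It says nothing about behavior the test does not exercise.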

Related

Will using brace-init syntax change construction behavior when an initializer_list constructor is added later?

Suppose I have a class like this:
class Foo
{
public:
Foo(int something) {}
};
And I create it using this syntax:
Foo f{10};
Then later I add a new constructor:
class Foo
{
public:
Foo(int something) {}
Foo(std::initializer_list<int>) {}
};
What happens to the construction of f? My understanding is that it will no longer call the first constructor but instead now call the init list constructor. If so, this seems bad. Why are so many people recommending using the {} syntax over () for object construction when adding an initializer_list constructor later may break things silently?
I can imagine a case where I'm constructing an rvalue using {} syntax (to avoid most vexing parse) but then later someone adds an std::initializer_list constructor to that object. Now the code breaks and I can no longer construct it using an rvalue because I'd have to switch back to () syntax and that would cause most vexing parse. How would one handle this situation?

What happens to the construction of f? My understanding is that it will no longer call the first constructor but instead now call the init list constructor. If so, this seems bad. Why are so many people recommending using the {} syntax over () for object construction when adding an initializer_list constructor later may break things silently?
On one hand, it's unusual to have the initializer-list constructor and the other one both be viable. On the other hand, "universal initialization" got a bit too much hype around the C++11 standard release, and it shouldn't be used without question.
Braces work best for things like aggregates and containers, so I prefer to use them when surrounding values that will be owned or contained. Parentheses, on the other hand, are good for arguments that merely describe how something new will be generated.
I can imagine a case where I'm constructing an rvalue using {} syntax (to avoid most vexing parse) but then later someone adds an std::initializer_list constructor to that object. Now the code breaks and I can no longer construct it using an rvalue because I'd have to switch back to () syntax and that would cause most vexing parse. How would one handle this situation?
The MVP only happens with ambiguity between a declarator and an expression, and that only happens as long as all the constructors you're trying to call are default constructors. An empty list {} always calls the default constructor, not an initializer-list constructor with an empty list. (This means that it can be used at no risk. "Universal" value-initialization is a real thing.)
If there's any subexpression inside the braces/parens, the MVP problem is already solved.
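A small sketch of the rules discussed above, assuming a hypothetical Foo that records which constructor ran (the Kind enum and the helper functions are illustrative only):

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical class that remembers which constructor was selected.
struct Foo {
    enum Kind { Default, FromInt, FromList };
    Kind kind;
    Foo() : kind(Default) {}
    Foo(int) : kind(FromInt) {}
    Foo(std::initializer_list<int>) : kind(FromList) {}
};

// Non-empty braces prefer the initializer_list constructor when it is viable.
inline Foo::Kind braces_with_value() { Foo f{10}; return f.kind; }

// Parentheses never select the initializer_list constructor for an int argument.
inline Foo::Kind parens_with_value() { Foo f(10); return f.kind; }

// Empty braces value-initialize, which calls the default constructor.
inline Foo::Kind empty_braces() { Foo f{}; return f.kind; }
```

So Foo f{10} does change meaning once the initializer_list constructor is added, while Foo f{} keeps calling the default constructor.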

Retrofitting classes with initializer lists in updated code is something that sounds like it will be a common thing to happen. So people start using {} syntax for existing constructors before the class is updated, and we want to automatically catch any old uses, especially those used in templates where they may be overlooked.
If I had a class like vector that took a size, then arguably using {} syntax is "wrong", but for the transition we want to catch that anyway. Constructing C c1 {val} means take some (one, in this case) values for the collection, and C c2 (arg) means use arg as a descriptive piece of metadata for the class.
In order to support both uses, when the type of element happens to be compatible with the descriptive argument, code that used C c2 {arg} will change meaning. There seems to be no way around it in that case if we want to support both forms with different meanings.
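std::vector<int> is the familiar case where the element type is compatible with the descriptive argument. A short sketch (the helper function names are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Parentheses: the argument is metadata, here the number of elements.
inline std::size_t size_with_parens() {
    std::vector<int> v(3);
    return v.size();
}

// Braces: the argument is a value to be stored in the container.
inline std::size_t size_with_braces() {
    std::vector<int> v{3};
    return v.size();
}

// The single stored element is the value that was written in the braces.
inline int first_with_braces() {
    std::vector<int> v{3};
    return v.front();
}
```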
So what would I do? If the compiler provides some way to issue a warning, I'd make the initializer list with one argument give a warning. That sounds tricky not to mention compiler specific, so I'd make a general template for that, if it's not already in Boost, and promote its use.
Other than containers, what other situations would have initializer list and single argument constructors with different meanings where the single argument isn't something of a very distinct type from what you'd be using with the list? For non-containers, it might suffice to notice that they won't be confused because the types are different or the list will always have multiple elements. But it's good to think about that and take additional steps if they could be confused in this manner.
For a non-container being enhanced with initializer_list features, it might be sufficient to specifically avoid designing a one-argument constructor that can be mistaken. So, the one-arg constructor would be removed in the updated class, or the initializer list would require other (possibly tag) arguments first. That is, don't do that, under penalty of pie-in-face at the code review.
Even for container-like classes, a class that's not a standard library class could impose that the one-arg constructor form is no longer available. E.g. C c3 (size); would have to be written as C c3 (size, C()); or designed to take an enumeration argument also, which is handy to specify initialized to one value vs. reserved size, so you can argue it's a feature and point out code that begins with a separate call to reserve. So again, don't do that if I can reasonably avoid it.

How does c++11 resolve constexpr into assembly?

The basic question:
Edit: v-The question-v
class foo {
public:
constexpr foo() { }
constexpr int operator()(const int& i) { return int(i); }
};
Performance is a non-trivial issue. How does the compiler actually compile the above? I know how I want it to be resolved, but how does the specification actually specify it will be resolved?
1) Seeing that the type int has a constexpr constructor, create an int object and compile the string of bytes that make up the type from memory directly into the code?
2) Replace any calls to the overload with a call to int's constructor, if for some unknown reason int doesn't have constexpr constructors? (Inlining the call.)
3) Create a function, call the function, and have that function call int's constructor?
Why I want to know, and how I plan to use the knowledge
edit:v-Background only-v
The real library I'm working with uses template arguments to decide how a given type should be passed between functions. That is, by reference or by value because the exact size of the type is unknown. It will be a user's responsibility to work within the limits I give them, but I want these limits to be as light and user friendly as I can sanely make them.
I expect a simple single-byte character to be passed around, in which case it should be passed by value. I do not bar a 300-megabyte behemoth that does several minutes of recalculation every time its copy constructor is invoked, in which case passing by reference makes more sense. I have only a list of requirements that a type must comply with, not a set cap on what a type can or cannot do.
I want the answer so that I can, in good faith, make a function object that accepts this unknown template type and then decides how, when, or even how much of an object should be copied, via a virtual member function and a pointer allocated with new if so required. If the compiler resolves constexpr badly, I need to know so I can abandon this line of thought and/or find a new one. Again, it will be a user's responsibility to work within the limits I give them, but I want these limits to be as light and user friendly as I can sanely make them.
Edit: Thank you for your answers. The only real question was the second sentence. It has now been answered. Everything else is background. If more background is required, allow me to restate the above:
I have a template with four arguments. The goal of the template is a routing protocol, be that TCP/IP (unlikely) or node to node within a game (possible). The first two are for data storage. They have no requirements beyond a list of operators for each. The last two define how the data is passed within the template. By default this is by reference. For performance and freedom of use, these can be changed to pass information by value at a user's request.
Each is expected to be a single byte long. However, in the case of a metric for an EIGRP- or OSPF-like protocol, the second template argument could be a compound of a dozen or more different variables, each taking a non-trivial time to copy or recompute.
For ease of use, I am investigating a function object that accepts the third and fourth template arguments to handle special cases and polymorphic classes that would otherwise fail to function or copy correctly. The goal is to not force a user to rebuild their objects from scratch. This would require planning for virtual functions that perform deep copies, or any number of other unknown oddities. The usefulness of the function object depends on how sanely a compiler can be depended on not to generate a cascade of function calls.
More helpful I hope?

The C++11 standard doesn't say anything about how constexpr will be compiled down to machine instructions. The standard just says that expressions that are constexpr may be used in contexts where a compile time constant value is required. How any particular compiler chooses to translate that to executable code is an implementation issue.
Now in general, with optimizations turned on you can expect a reasonable compiler to not execute any code at runtime for many uses of constexpr but there aren't really any guarantees. I'm not really clear on what exactly you're asking about in your example so it's hard to give any specifics about your use case.
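As a rough illustration of the distinction, here is a sketch based on the question's foo (the names kSize and call_at_runtime are made up for this example). Where a constant expression is required, the compiler has to evaluate the call itself. Elsewhere the same call is an ordinary function call that an optimizer may or may not inline:

```cpp
#include <cassert>

struct foo {
    constexpr foo() {}
    // const is implicit for constexpr member functions in C++11; spelling it
    // out keeps the code valid under later standards as well.
    constexpr int operator()(const int& i) const { return i; }
};

// Contexts that require a compile-time constant: no runtime code can be involved.
static_assert(foo()(5) == 5, "evaluated by the compiler");
constexpr int kSize = foo()(4);

// Ordinary context: x is not a constant, so this is a normal call.
// Whether it is inlined is up to the implementation and optimization level.
inline int call_at_runtime(int x) {
    return foo()(x);
}
```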

constexpr expressions are not special. For all intents and purposes, they're basically const unless the context they're used in is constexpr and all variables/functions are also constexpr. It is implementation defined how the compiler chooses to handle this. The Standard never deals with implementation details because it speaks in abstract terms.

Why can I not implement default constructors for structs in D?

Writing code like
struct S
{
this() // compile-time error
{
}
}
gives me an error message saying
default constructor for structs only allowed with @disable and no body.
Why??

This is one of those cases that is much trickier than one would initially expect.
One important and useful feature D has over C++ is that every single type (including all user types) has some initial non-garbage value that can be evaluated at compile time. It is available as T.init and has two important use cases:
Template constraints can use the T.init value to check whether certain operations can be done on a given type (quoting Kenji Hara's snippet):
template isSomething(T) {
enum isSomething = is(typeof({
//T t1; // not good if T is nested struct, or has @disable this()
//T t2 = void; auto x = t2; // not good if T is non-mutable type
T t = T.init; // avoid default construct check
...use t...
}));
}
Your variables are always initialized properly unless you explicitly use int i = void syntax. No garbage possible.
Given that, a difficult question arises: should we guarantee that T() and T.init are the same (as many programmers coming from C++ will expect), or allow default construction that may easily destroy that guarantee? As far as I know, the decision was made that the first approach is safer, despite being surprising.
However, discussions keep popping up with various improvements proposed (for example, allowing a CTFE-able default constructor). One such thread has appeared very recently.

It stems from the fact that all types in D must have a default value. There are quite a few places where a type's init value gets used, including stuff like default-initializing member variables and default-initializing every value in an array when it's allocated, and init needs to be known at compile time for a number of those cases. Having init provides quite a few benefits, but it does get in the way of having a default constructor.
A true default constructor would need to be used in all of the places that init is used (or it wouldn't be the default), but allowing arbitrary code to run in a number of the cases that init is used would be problematic at best. At minimum, you'd probably be forced to make it CTFE-able and possibly pure. And as soon as you start putting restrictions like that on it, pretty soon, you might as well just directly initialize all of the member variables to what you want (which is what happens with init), as you wouldn't be gaining much (if anything) over that, which would make having default constructors pretty useless.
It might be possible to have both init and a default constructor, but then the question comes up as to when one is used over the other, and the default constructor wouldn't really be the default anymore. Not to mention, it could become very confusing to developers as to when the init value was used and when the default constructor was used.
Now, we do have the ability to @disable the init value of a struct (which causes its own set of problems), in which case, it would be illegal to use that struct in any situation that required init. So, it might be possible to then have a default constructor which could run arbitrary code at runtime, but what the exact consequences of that would be, I don't know. However, I'm sure that there are cases where people would want to have a default constructor that would require init and therefore wouldn't work, because it had been @disabled (things like declaring arrays of the type would probably be one of them).
So, as you can see, by doing what D has done with init, it's made the whole question of default constructors much more complicated and problematic than it is in other languages.
The normal way to get something akin to default construction is to use a static opCall. Something like
struct S
{
static S opCall()
{
//Create S with the values that you want and return it.
}
}
Then whenever you use S() - e.g.
auto s = S();
then the static opCall gets called, and you get a value that was created at runtime. However, S.init will still be used any place that it was before (including S s;), and the static opCall will only be used when S() is used explicitly. But if you couple that with @disable this() (which disables the init property), then you get something akin to what I described earlier where we might have default constructors with an @disabled init.
We may or may not end up with default constructors being added to the language eventually, but there are a number of technical problems with adding them due to how init and the language work, and Walter Bright doesn't think that they should be added. So, for default constructors to be added to the language, someone would have to come up with a really compelling design which appropriately resolves all of the issues (including convincing Walter), and I don't expect that to happen, but we'll see.

Macros to disallow class copy and assignment. Google -vs- Qt

To disallow copying or assigning a class it's common practice to make the copy constructor
and assignment operator private. Both Google and Qt have macros to make this easy and visible.
These macros are:
Google:
#define DISALLOW_COPY_AND_ASSIGN(TypeName) \
TypeName(const TypeName&); \
void operator=(const TypeName&)
Qt:
#define Q_DISABLE_COPY(Class) \
Class(const Class &); \
Class &operator=(const Class &);
Questions:
Why are the signatures of the two assignment operators different? It seems like the Qt version is correct.
What is the practical difference between the two?

It doesn't matter. The return type is not part of a function's signature, as it does not participate in overload resolution. So when you attempt to perform an assignment, both declarations will match, regardless of whether you use the return type.
And since the entire point in these macros is that the functions will never get called, it doesn't matter that one returns void.

I'd just like to mention that there is an alternative strategy for implementing an abstraction for disallowing copy and assignment of a class. The idea is to use inheritance instead of the preprocessor. I personally prefer this approach as I follow the rule of thumb that it is best to avoid using the preprocessor when at all possible.
boost::noncopyable is an example implementation. It is used as follows:
class A : noncopyable
{
...
};

See Boost.Utility, specifically boost::noncopyable. It's not a macro but a base class with private copy and assignment. It prevents the compiler from generating implicit copy and assignment in derived classes.
edit: Sorry, this was not an answer to the original question. By the way, boost::noncopyable uses a const reference as return type for the assignment operator. I was under the impression that the type of the return value doesn't matter since it's not supposed to be used. Still, making the operator private doesn't prevent usage inside the class or friends in which case a non-usual return type (like void, a const reference, etc) might lead to compilation errors and catch additional bugs.

There's no practical difference. The assignment operator signatures differ just as a matter of style. It's usual to have an assignment operator returning a reference to allow chaining:
a = b = c;
but a version returning void is also legal and will work just fine for cases when the only purpose is to just declare the operator private and therefore prohibited to use.

From the standard, 12.8, clause 9: "A user-declared copy assignment operator X::operator= is a non-static non-template member function of class X with exactly one parameter of type X, X&, const X&, volatile X&, or const volatile X&." It says nothing about the return type, so any return type is permissible.
Clause 10 says "If the class definition does not explicitly declare a copy assignment operator, one is declared implicitly."
Therefore, declaring any X::operator=(const X&) (or any other of the specified assignment types) is sufficient. Neither the body nor the return type is significant if the operator will never be used.
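A sketch of that point, assuming a C++11 compiler so the standard type traits can do the checking. The two hypothetical classes below hand-expand the two declaration styles in a private section:

```cpp
#include <cassert>
#include <type_traits>

// Assignment operator declared with a void return type.
class ReturnsVoid {
public:
    ReturnsVoid() {}
private:
    ReturnsVoid(const ReturnsVoid&);
    void operator=(const ReturnsVoid&);
};

// Assignment operator declared with the conventional reference return type.
class ReturnsReference {
public:
    ReturnsReference() {}
private:
    ReturnsReference(const ReturnsReference&);
    ReturnsReference& operator=(const ReturnsReference&);
};

// The traits are evaluated from outside the classes, where the private
// declarations are inaccessible, so both styles block copying for clients.
static_assert(!std::is_copy_constructible<ReturnsVoid>::value, "copy blocked");
static_assert(!std::is_copy_assignable<ReturnsVoid>::value, "assignment blocked");
static_assert(!std::is_copy_constructible<ReturnsReference>::value, "copy blocked");
static_assert(!std::is_copy_assignable<ReturnsReference>::value, "assignment blocked");
```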
Therefore, it's a stylistic difference, with one macro doing what we'd likely expect and one saving a few characters and doing the job in a way that's likely to surprise some people. I think the Qt macro is better stylistically. Since we're talking macro, we're not talking about the programmer having to type anything extra, and failing to surprise people is a good thing in a language construct.

Others have already answered why it's legal to have different return values for operator=; IMHO jalf said it best.
However, you might wonder why Google uses a different return type, and I suspect it's this:
You don't have to repeat the type name when disabling the assignment operator like this. Usually the type name is the longest part of the declaration.
Of course, this reason is void given that a macro is used but still - old habits die hard. :-)

Both serve the same purpose
Once you write this one:
Class &operator=(const Class &);
you will get the benefits of chain assignments. But in this case you want the assignment operator to be private. so it doesn't matter.

Qt version is backward compatible, while google's is not.
If you develop your library and deprecate the use of assignment before you completely remove it, in Qt it will most likely retain the signature it originally had. In this case older application will continue to run with new version of library (however, they won't compile with the newer version).
Google's macro doesn't have such a property.

As several other answers have mentioned, the return type of the function doesn't participate in the function signature, so both declarations are equivalent as far as making the assignment operator unusable by clients of the class.
Personally I prefer the idiom of having a class privately inherit from an empty non-copyable base class (like boost::noncopyable, but I have my own so I can use it in projects that don't have boost available). The empty base class optimization takes care of making sure there's zero overhead, and it's simple, readable, and doesn't rely on the dreaded preprocessor macro functionality.
It also has the advantage that copy and assignment can't even be used within class implementation code - it'll fail at compile time while these macros will fail at link time (likely with a less informative error message).
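A minimal sketch of such a home-grown base class (NonCopyable and Connection are hypothetical names, and this is not the Boost implementation). Because the base's copy operations are inaccessible to the derived class, the derived class's implicit copy operations cannot be generated, so even code inside Connection fails at compile time if it tries to copy:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical empty base class with private, undefined copy operations.
class NonCopyable {
protected:
    NonCopyable() {}
    ~NonCopyable() {}
private:
    NonCopyable(const NonCopyable&);
    NonCopyable& operator=(const NonCopyable&);
};

// Private inheritance: clients cannot treat a Connection as a NonCopyable.
class Connection : private NonCopyable {
public:
    Connection() : id(0) {}
    int id;
};

static_assert(!std::is_copy_constructible<Connection>::value, "copy blocked");
static_assert(!std::is_copy_assignable<Connection>::value, "assignment blocked");

// With the empty base class optimization the base contributes no storage.
// Mainstream compilers apply it here; treat the result as an observation.
inline bool base_adds_no_size() {
    return sizeof(Connection) == sizeof(int);
}
```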

Incidentally, if you have access to the Boost libraries (You don't? Why the heck not??), The Utility library has had the noncopyable class for a long time:
class YourNonCopyableClass : boost::noncopyable {
Clearer IMHO.

In practice, I would say that neither should be used anymore if you have a C++11 compiler.
You should use the = delete feature instead; see the question "Meaning of = delete after function declaration" and http://www.stroustrup.com/C++11FAQ.html#default
Why: essentially because the compiler message is much clearer. When the compiler needs the copy constructor or the copy assignment operator, it points directly to the line where the = delete was written.
Better and complete explanations can also be found in Item 11: Prefer deleted functions to private undefined ones from Effective Modern C++ book by Scott Meyers
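For comparison, a short sketch of the deleted-function form (Session is a hypothetical class name). The deleted functions can be public, which is what lets the compiler report the deletion itself instead of an access error:

```cpp
#include <cassert>
#include <type_traits>

class Session {
public:
    Session() {}
    // Declared public and deleted: any attempt to copy is a compile-time
    // error that refers to these declarations.
    Session(const Session&) = delete;
    Session& operator=(const Session&) = delete;
};

static_assert(!std::is_copy_constructible<Session>::value, "copy is deleted");
static_assert(!std::is_copy_assignable<Session>::value, "assignment is deleted");
```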

Functor class doing work in constructor

I'm using C++ templates to pass in Strategy functors to change my function's behavior. It works fine. The functor I pass is a stateless class with no storage and it just overloads the () operator in the classic functor way.
template <typename Operation> int foo(int a)
{
int b=Operation()(a);
/* use b here, etc */
return b;
}
I do this often, and it works well, and often I'm making templates with 6 or 7 templated functors passed in!
However I worry both about code elegance and also efficiency. The functor is stateless so I assume the Operation() constructor is free and the evaluation of the functor is just as efficient as an inlined function, but like all C++ programmers I always have some nagging doubt.
My second question is whether I could use an alternate functor approach.. one that does not override the () operator, but does everything in the constructor as a side effect!
Something like:
struct Operation {
Operation(int a, int &b) { b=a*a; }
};
template <typename Operation> int foo(int a)
{
int b;
Operation(a,b);
/* use b here, etc */
return b;
}
I've never seen anyone use a constructor as the "work" of a functor, but it seems like it should work. Is there any advantage? Any disadvantage? I do like the removal of the strange doubled parentheses in "Operation()(a)", but that's likely just aesthetic.

Any disadvantage?
Ctors do not return any useful value, so they cannot be used in chained calls (e.g. foo(bar())).
They can throw.
Design point of view -- ctors are object creation functions, not really meant to be workhorses.

Compilers actually inline the empty constructor of Operation (at least gcc in similar situations does, except when you turned off optimization)
The disadvantage of doing everything in the constructor is that you cannot create a functor with some internal state this way - eg. functor for counting the number of elements satisfying a predicate. Also, using a method of a real object as a functor allows you to store the instance of it for later execution, something you cannot do with your constructor approach.
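A sketch of the counting case mentioned above, assuming a hypothetical CountEven functor. The state lives in the object, which a constructor-only "functor" has no way to hand back:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical functor that remembers how many values satisfied its test.
struct CountEven {
    int count;
    CountEven() : count(0) {}
    void operator()(int value) {
        if (value % 2 == 0) ++count;
    }
};

inline int count_even(const std::vector<int>& values) {
    // std::for_each returns the functor it was given, including its state.
    CountEven result = std::for_each(values.begin(), values.end(), CountEven());
    return result.count;
}
```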

From a performance point of view, the code shown will get completely optimized by both VC and GCC. However, a better strategy is often to take the functor as a parameter; that way you get a lot more flexibility and identical performance characteristics.

I'd recommend defining functors that work with the STL containers, i.e. they should implement operator(). (Following the API of the language you're using is always a good idea.)
That allows your algorithms to be very generic (pass in functions, functors, stl-bind, boost::function, boost::bind, boost::lambda, ...), which is what one usually wants.
This way, you don't need to specify the functor type as a template parameter, just construct an instance and pass it in:
my_algorithm(foo, bar, MyOperation())
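A minimal sketch of that style, with made-up names (apply_twice, Square, add_one). The callable's type is deduced from the argument, so the same template accepts a functor object, a plain function, or a lambda:

```cpp
#include <cassert>

// Hypothetical stateless functor.
struct Square {
    int operator()(int a) const { return a * a; }
};

// A plain function works as an operation too.
inline int add_one(int a) { return a + 1; }

// The operation is passed as an argument; its type is deduced.
template <typename Operation>
int apply_twice(int a, Operation op) {
    return op(op(a));
}
```

A call such as apply_twice(3, Square()) needs no explicit template argument, and a stateful functor could be passed the same way.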

There does not seem any point in implementing the constructor in another class.
All you are doing is breaking encapsulation and setting up your class for abuse.
The constructor is supposed to initialize the object into a good state as defined by the class. You are allowing another object to initialize your class. What guarantees do you have that this template class knows how to initialize your class correctly? A user of your class can provide any object that could mess with the internal state of your object in ways not intended.
The class should be self contained and initialize itself to a good state. What you seem to be doing is playing with templates just to see what they can do.
