Optional function parameters: Use default arguments (NULL) or overload the function? - c++

I have a function that processes a given vector, but may also create such a vector itself if it is not given.
I see two design choices for such a case, where a function parameter is optional:
Make it a pointer and make it NULL by default:
void foo(int i, std::vector<int>* optional = NULL) {
    if (optional == NULL) {
        optional = new std::vector<int>();
        // fill vector with data
    }
    // process vector
}
Or have two functions with an overloaded name, one of which leaves out the argument:
void foo(int i) {
    std::vector<int> vec;
    // fill vec with data
    foo(i, vec);
}
void foo(int i, const std::vector<int>& optional) {
    // process vector
}
Are there reasons to prefer one solution over the other?
I slightly prefer the second one because I can make the vector a const reference, since it is, when provided, only read, not written. Also, the interface looks cleaner (isn't NULL just a hack?). And the performance difference resulting from the indirect function call is probably optimized away.
Yet, I often see the first solution in code. Are there compelling reasons to prefer it, apart from programmer laziness?

I would not use either approach.
In this context, the purpose of foo() seems to be to process a vector. That is, foo()'s job is to process the vector.
But in the second version of foo(), it is implicitly given a second job: to create the vector. The semantics between foo() version 1 and foo() version 2 are not the same.
Instead of doing this, I would consider having just one foo() function to process a vector, and another function which creates the vector, if you need such a thing.
For example:
void foo(int i, const std::vector<int>& optional) {
    // process vector
}
std::vector<int>* makeVector() {
    return new std::vector<int>;
}
Obviously these functions are trivial, and if all makeVector() needs to do to get its job done is literally just call new, then there may be no point in having the makeVector() function. But I'm sure that in your actual situation these functions do much more than what is shown here, and my code above illustrates a fundamental approach to semantic design: give one function one job to do.
The design I have above for the foo() function also illustrates another fundamental approach that I personally use in my code when it comes to designing interfaces -- which includes function signatures, classes, etc. That is this: I believe that a good interface is 1) easy and intuitive to use correctly, and 2) difficult or impossible to use incorrectly. In the case of the foo() function we are implicitly saying, with my design, that the vector is required to already exist and be 'ready'. By designing foo() to take a reference instead of a pointer, it is both intuitive that the caller must already have a vector, and they are going to have a hard time passing in something that isn't a ready-to-go vector.
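For illustration, here is a minimal sketch of what a call site might look like with that split (the fill logic is assumed to live elsewhere, and makeVector is shown returning by value for simplicity, unlike the pointer version above):
#include <vector>

void foo(int i, const std::vector<int>& data);  // processes an existing vector
std::vector<int> makeVector();                  // builds the default data set

// Hypothetical caller: it decides where the vector comes from,
// so foo() keeps exactly one job.
void process(int i, const std::vector<int>* maybeData)
{
    if (maybeData)
        foo(i, *maybeData);        // caller already has data
    else
        foo(i, makeVector());      // build the default data first, then process it
}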

I would definitely favour the 2nd approach of overloaded methods.
The first approach (optional parameters) blurs the definition of the method as it no longer has a single well-defined purpose. This in turn increases the complexity of the code, making it more difficult for someone not familiar with it to understand it.
With the second approach (overloaded methods), each method has a clear purpose. Each method is well-structured and cohesive. Some additional notes:
If there's code which needs to be duplicated into both methods, this can be extracted out into a separate method and each overloaded method could call this external method.
I would go a step further and name each method differently to indicate the differences between the methods. This will make the code more self-documenting.
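As a rough sketch of both notes combined (the names and the shared helper are illustrative only):
#include <vector>

// Shared logic lives in one place.
static void processVector(int i, const std::vector<int>& data)
{
    // process vector
}

// This name says the caller supplies the data...
void fooWithData(int i, const std::vector<int>& data)
{
    processVector(i, data);
}

// ...and this name says the function builds the data itself.
void fooWithDefaultData(int i)
{
    std::vector<int> data;
    // fill data
    processVector(i, data);
}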

While I do understand the complaints of many people regarding default parameters and overloads, there seems to be a lack of understanding of the benefits these features provide.
Default Parameter Values:
First I want to point out that in the initial design of a project there should be little to no use for defaults if it is well designed. However, where defaults really come into their own is with existing projects and well-established APIs. I work on projects that consist of millions of existing lines of code and do not have the luxury of re-coding them all. So when you wish to add a new feature which requires an extra parameter, the new parameter needs a default; otherwise you will break everyone that uses your project. That would be fine with me personally, but I doubt your company or the users of your product/API would appreciate having to re-code their projects on every update. Simply put, defaults are great for backwards compatibility! This is usually the reason you will see defaults in big APIs or existing projects.
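For example, a hypothetical API evolution might look like this; the defaulted parameter keeps every existing call site compiling:
#include <string>

// v1 (already shipped): void exportReport(const std::string& path);
// v2: a new, defaulted parameter adds the feature without breaking old callers.
void exportReport(const std::string& path, bool compress = false);

void existingCaller()
{
    exportReport("report.txt");        // old code: still compiles, same behaviour
    exportReport("report.txt", true);  // new code: opts into the new feature
}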
Function Overloads:
The benefit of function overloads is that they allow for the sharing of a functionality concept, but with different options/parameters. However, many times I see function overloads lazily used to provide starkly different functionality with just slightly different parameters. In that case there should be separately named functions, each pertaining to its specific functionality (as with the OP's example).
These features of C/C++ are good and work well when used properly, which can be said of almost any programming feature. It is when they are abused/misused that they cause problems.
Disclaimer:
I know that this question is a few years old, but since these answers came up in my search results today (2012), I felt this needed further addressing for future readers.

I agree, I would use two functions. Basically, you have two different use cases, so it makes sense to have two different implementations.
I find that the more C++ code I write, the fewer parameter defaults I have - I wouldn't really shed any tears if the feature was deprecated, though I would have to re-write a shed load of old code!

A reference can't be NULL in C++, so a really good solution would be to use a Nullable template.
This would let you do things like ref.isNull().
Here you can use this:
template<class T>
class Nullable {
public:
    Nullable() {
        m_set = false;
    }
    explicit Nullable(T value) {
        m_value = value;
        m_set = true;
    }
    Nullable(const Nullable &src) {
        m_set = src.m_set;
        if (m_set)
            m_value = src.m_value;
    }
    Nullable & operator =(const Nullable &RHS) {
        m_set = RHS.m_set;
        if (m_set)
            m_value = RHS.m_value;
        return *this;
    }
    bool operator ==(const Nullable &RHS) const {
        if (!m_set && !RHS.m_set)
            return true;
        if (m_set != RHS.m_set)
            return false;
        return m_value == RHS.m_value;
    }
    bool operator !=(const Nullable &RHS) const {
        return !operator==(RHS);
    }
    bool isNull() const {
        return !m_set;
    }
    bool GetSet() const {
        return m_set;
    }
    const T &GetValue() const {
        return m_value;
    }
    T GetValueDefault(const T &defaultValue) const {
        if (m_set)
            return m_value;
        return defaultValue;
    }
    void SetValue(const T &value) {
        m_value = value;
        m_set = true;
    }
    void Clear() {
        m_set = false;
    }
private:
    T m_value;
    bool m_set;
};
Now you can have
void foo(int i, const Nullable<AnyClass> &optional = Nullable<AnyClass>()) {
    // you can do
    if (optional.isNull()) {
        // ...
    }
}
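A hypothetical call site (AnyClass stands in for whatever type you actually pass):
AnyClass value;
foo(1);                              // optional.isNull() is true inside foo
foo(1, Nullable<AnyClass>(value));   // optional.GetValue() returns the caller's value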

I usually avoid the first case. Note that those two functions are different in what they do. One of them fills a vector with some data. The other doesn't (just accept the data from the caller). I tend to name differently functions that actually do different things. In fact, even as you write them, they are two functions:
foo_default (or just foo)
foo_with_values
At least I find this distinction cleaner in the long term, and for the occasional library/function user.

I, too, prefer the second one. While there is not much difference between the two, you are basically reusing the functionality of the primary overload inside the foo(int i) overload, and the primary overload works perfectly well without caring about the existence or absence of the other one, so there is better separation of concerns in the overload version.

In C++ you should avoid allowing valid NULL parameters whenever possible. The reason is that it substantially reduces call-site documentation. I know this sounds extreme, but I work with APIs that take upwards of 10-20 parameters, half of which can validly be NULL. The resulting code is almost unreadable:
SomeFunction(NULL, pName, NULL, pDestination);
If you were to switch it to use const references, the code is forced to be more readable:
SomeFunction(
    Location::Hidden(),
    pName,
    SomeOtherValue::Empty(),
    pDestination);

I'm squarely in the "overload" camp. Others have added specifics about your actual code example but I wanted to add what I feel are the benefits of using overloads versus defaults for the general case.
Any parameter can be "defaulted"
No gotcha if an overriding function uses a different value for its default.
It's not necessary to add "hacky" constructors to existing types in order to allow them to have defaults.
Output parameters can be defaulted without needing to use pointers or hacky global objects.
To put some code examples on each:
Any parameter can be defaulted:
class A {}; class B {}; class C {};

void foo (A const &, B const &, C const &);

inline void foo (A const & a, C const & c)
{
    foo (a, B (), c); // 'B' defaulted
}
No danger of overriding functions having different values for the default:
class A {
public:
    virtual void foo (int i = 0);
};

class B : public A {
public:
    virtual void foo (int i = 100);
};

void bar (A & a)
{
    a.foo (); // Always uses '0', no matter the dynamic type of 'a'
}
It's not necessary to add "hacky" constructors to existing types in order to allow them to be defaulted:
struct POD {
    int i;
    int j;
};

void foo (POD p); // Adding default (other than {0, 0})
                  // would require constructor to be added

inline void foo ()
{
    POD p = { 1, 2 };
    foo (p);
}
Output parameters can be defaulted without needing to use pointers or hacky global objects:
void foo (int i, int & j); // Default requires global "dummy"
                           // or 'j' should be pointer.

inline void foo (int i)
{
    int j;
    foo (i, j);
}
The only exception to the rule re overloading versus defaults is for constructors where it's currently not possible for a constructor to forward to another. (I believe C++ 0x will solve that though).
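For reference, C++11 did add delegating constructors, which cover that case; a minimal sketch:
// C++11 delegating constructors: the one-argument constructor forwards
// to the two-argument one, so the initialisation logic is written once.
class Widget {
public:
    Widget(int i, int j) : m_i(i), m_j(j) {}
    Widget(int i) : Widget(i, 0) {}   // delegates to the constructor above
private:
    int m_i, m_j;
};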

I would favour a third option:
Separate into two functions, but do not overload.
Overloads, by nature, are less usable. They require the user to become aware of two options and figure out what the difference between them is, and if they're so inclined, to also check the documentation or the code to ensure which is which.
I would have one function that takes the parameter,
and one that is called "createVectorAndFoo" or something like that (obviously naming becomes easier with real problems).
While this violates the "two responsibilities for function" rule (and gives it a long name), I believe this is preferable when your function really does do two things (create vector and foo it).
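A rough sketch of that third option (the names are illustrative only):
#include <vector>

// Takes the data it needs; has exactly one job.
void foo(int i, const std::vector<int>& data);

// The long name openly admits that this one does two things.
void createVectorAndFoo(int i)
{
    std::vector<int> data;
    // fill data
    foo(i, data);
}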

Generally I agree with others' suggestion to use a two-function approach. However, if the vector created when the 1-parameter form is used is always the same, you could simplify things by instead making it static and using a default const& parameter instead:
// Either at global scope, or (better) inside a class
// Either at global scope, or (better) inside a class
static std::vector<int> default_vector = populate_default_vector();

void foo(int i, std::vector<int> const& optional = default_vector) {
    ...
}

The first way is poorer because you cannot tell if you accidentally passed in NULL or if it was done on purpose... if it was an accident then you have likely caused a bug.
With the second one you can test (assert, whatever) for NULL and handle it appropriately.


Best way to create a setter function in C++

I want to write a template function that receives parameter by move or by copy.
The most efficient way that I use is:
void setA(A a)
{
    m_a = std::move(a);
}
Here, when we use it:
A a;
setA(a); // <<---- one copy ctor & one move assignment
setA(std::move(a)); // <<---- one move ctor & one move assignment
I recently found out that defining it this way, with two functions:
void setA(A&& a)
{
    m_a = std::move(a);
}

void setA(const A& a)
{
    m_a = a; // of course we could write "m_a = std::move(a);" here too, but a move from a const reference degenerates to a copy anyway
}
Will save a lot!
A a;
setA(a); // <<---- one copy assignment
setA(std::move(a)); // <<---- one move assignment
This is great for one parameter... but what is the best way to create a function with 10 parameters?!
void setAAndBAndCAndDAndEAndF...()
Any one has any ideas?
Thanks!
The two setter versions setA(A&& a) and setA(const A& a) can be combined into a single one using a forwarding reference (a.k.a. perfect forwarding):
template<typename A>
void setA(A&& a)
{
    m_a = std::forward<A>(a);
}
The compiler will then synthesize either the rvalue- or lvalue-reference version as needed depending on the value category.
This also solves the issue of multi-value setters, as the right one will be synthesized depending on the value category of each parameter.
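For instance, a hypothetical two-parameter setter written this way needs only one function instead of four overloads:
#include <utility>

// Assumed members m_a and m_b of types A and B; TA/TB deduce to
// lvalue- or rvalue-reference independently for each argument.
template<typename TA, typename TB>
void setAAndB(TA&& a, TB&& b)
{
    m_a = std::forward<TA>(a);   // copies from an lvalue, moves from an rvalue
    m_b = std::forward<TB>(b);
}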
Having said that, keep in mind that setters are just regular functions; the object is technically already constructed by the time any setter can be called. In case of setA, if A has a non-trivial constructor, then an instance m_a would already have been (default-)constructed and setA would actually have to overwrite it.
That's why in modern C++, the focus is often not so much on move- vs. copy-, but on in-place construction vs. move/copy.
For example:
struct A {
    A(int x) : m_x(x) {}
    int m_x;
};

struct B {
    template<typename T>
    B(T&& a) : m_a(std::forward<T>(a)) {}
    A m_a;
};

int main() {
    B b{ 1 }; // zero copies/moves
}
The standard library also often offers "emplace"-style calls in addition to more traditional "push"/"add"-style calls. For example, vector::emplace takes the arguments needed to construct an element, and constructs one inside the vector, without having to copy or move anything.
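A small sketch of that difference with std::vector:
#include <vector>

struct A {
    A(int x) : m_x(x) {}
    int m_x;
};

int main() {
    std::vector<A> v;
    v.push_back(A{2});   // constructs a temporary A, then moves it into the vector
    v.emplace_back(3);   // constructs the element in place from the arguments: no copy, no move
}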
The best would be to construct the member in place within the constructor. As for setters, there is no single best approach. Taking by value and moving seems to work fine in most cases, but can sometimes be less efficient. Overloading as you showed is maximally efficient, but causes lots of code duplication. Templates can avoid the code duplication with the help of universal references, but then you have to roll your own type checking and it gets complicated. Unless you've identified this as a bottleneck with a profiler, I suggest you stick with take-by-value-then-move, as it's the simplest, causes minimal code duplication and provides good exception safety.
After a lot of research, I have found an answer!
I made an efficient wrapper class that allows you to hold both options and lets you decide in the inner function whether you want to copy or not!
#pragma pack(push, 1)
template<class T>
class CopyOrMove {
public:
    CopyOrMove(T&& t) : m_move(&t), m_isMove(true) {}
    CopyOrMove(const T& t) : m_reference(&t), m_isMove(false) {}

    bool hasInstance() const { return m_isMove; }

    const T& getConstReference() const {
        return *m_reference;
    }

    T extract() && {
        if (hasInstance())
            return std::move(*m_move);
        else
            return *m_reference;
    }

    void fastExtract(T* out) && {
        if (hasInstance())
            *out = std::move(*m_move);
        else
            *out = *m_reference;
    }

private:
    union {
        T* m_move;
        const T* m_reference;
    };
    bool m_isMove;
};
#pragma pack(pop)
Now you can have the function:
void setAAndBAndCAndDAndEAndF(CopyOrMove<A> a, CopyOrMove<B> b, CopyOrMove<C> c, CopyOrMove<D> d, CopyOrMove<E> e, CopyOrMove<F> f)
With zero code duplication! And no redundant copy or move!
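A hypothetical call site for a smaller setter of the same shape (setAAndB, with A and B being the caller's own movable types); lvalues are wrapped as references and rvalues as movable pointers by the implicit CopyOrMove conversions:
A a;
B b;
// Inside the setter you would then call, e.g., std::move(paramA).fastExtract(&m_a);
setAAndB(a, std::move(b));   // 'a' will be copied, 'b' will be moved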
Short answer:
It's a compromise between verbosity and speed. Speed is not everything.
defining it this way, with two functions ... Will save a lot!
It will save a single move-assignment, which often isn't a lot.
Unless you need this specific piece of code to be as fast as possible (e.g. you're writing a custom container), I'd prefer passing by value because it's less verbose.
Other possible approaches are:
Using a forwarding reference, as suggested in the other answers. It'll give you the same amount of copies/moves as a pair of overloads (const T & + T &&), but it makes passing more than one parameter easier, because you only have to write a single function instead of 2^N of them.
Making the setter behave like emplace(). This will give you no performance benefit (because you're assigning to an existing object instead of creating a new one), so it doesn't make much sense.

C++ Get/Set accessors - how do I avoid typing repetitive code?

I'm writing a pretty large library, and I find myself writing almost identical accessors all the time. I already have several dozen accessors such as the one below.
Question: How can I declare/implement accessors to save typing all this repetitive code? (No #defines please; I'm looking for C++ constructs.)
Update: Yes, I do need accessor functions, because I need to take pointers to these accessors for something called Property Descriptors, which enable huge savings in my GUI code (non-library).
.h file
private:
    bool _visible;
public:
    bool GetVisible() const { return _visible; }
    void SetVisible (bool value);
    // Repeat for Get/SetFlashing, Get/SetColor, Get/SetLineWidth, etc.
.cpp file
void Element::SetVisible (bool value)
{
    _visible = value;
    this->InvalidateSelf(); // Call method in base class
    // ...
    // A bit more code here, identical in 90% of my setters.
    // ...
}
// Repeat for Get/SetFlashing, Get/SetColor, Get/SetLineWidth, etc.
I find myself writing almost identical accessors all the time. I already have several dozen accessors such as the one below.
This is a sure design smell that you are writing accessors "for the sake of it". Do you really need them all? Do you really need a low-level public "get" and "set" operation for each one? It's unlikely.
After all, if all you're doing is writing a getter and a setter for each private data member, and each one has the same logic, you may as well have just made the data members public.
Rather your class should have meaningful and semantic operations that, in the course of their duties, may or may not make use of private data members. You will find that each of these meaningful operations is quite different from the rest, and so your problem with repetitive code is vanquished.
As n.m. said:
Easy: avoid accessors. Program your classes to do something, rather than have something.
Even for those operations which have nothing more to them, like controlling visibility, you should have a bool isVisible() const, and a void show(), and a void hide(). You'll find that when you start coding like this it will promote a move away from boilerplate "for the sake of it" getters & setters.
Whilst I think Lightness Races in Orbit makes a very good point, there are also a few techniques for handling the repeating code, assuming we do indeed have a class with many similar things that need to be controlled individually. So, continuing on that assumption, say we have a couple of methods like this:
void Element::Show()
{
    visible = true;
    Invalidate();
    // More code goes here.
}

void Element::Hide()
{
    visible = false;
    Invalidate();
    // More code goes here.
}
Now, to my view, this breaks the DRY (Do not Repeat Yourself) principle, so we should probably do something like this:
void Element::UpdateProperty(bool &property, bool newValue)
{
    property = newValue;
    Invalidate();
    // More code goes here.
}
Now, we can implement Show and Hide, Flash, Unflash, Shaded etc by doing this, avoiding repetition inside each function.
void Element::Show()
{
    UpdateProperty(visible, true);
}
If the type isn't always bool, e.g. there is a position, we can do:
template<typename T> void Element::UpdateProperty(T &property, T newValue)
{
    property = newValue;
    Invalidate();
    // More code goes here.
}
and the MoveTo becomes:
void Element::MoveTo(Point p)
{
    UpdateProperty(position, p);
}
Edit based on previously undisclosed information added to question:
Obviously the above technique can equally be applied to any form of function that does this sort of work:
void Element::SetVisible(bool value)
{
    UpdateProperty(visible, value);
}
will work just as well as for Show described above. It doesn't mean you can get away from declaring the functions, but it reduces the need for code inside the function.
I agree with Lightness. You should design your classes for the task at hand, and if you need so many getters and setters, you may be doing something wrong.
That said, most good IDEs allow you to generate simple getters and setters, and some might even allow you to customize them. You might save the repetitive code as a template and select the code fragment whenever needed.
You may also use a customizable editor like emacs and Vim (with Ultisnips) and create some custom helping functions to make your job easy. The task is ripe for automation.
The only time you should ever write a get/set pair of functions in any language is if they do something other than just read or write a simple variable; don't bother wrapping up access to data if all you're doing is making it harder for people to read. If that's all your accessors do, don't write them at all.
If you ever do want a set of get/set functions, don't call them get and set -- use assignment and type casting (and do it cleverly). That way you can make your code more readable instead of less.
This is very inelegant:
class get_set {
    int value;
public:
    int get() { return value; }
    void set(int v) { value = v; }
};
This is a bit better
class get_set_2 {
    value_type value;
    bool needs_updating;
public:
    operator value_type const & () {
        if (needs_updating) update(); // details to be found elsewhere
        return value;
    }
    get_set_2& operator = (value_type t) {
        update(t); // details to be found elsewhere
        return *this;
    }
};
If you're not doing the second pattern, don't do anything.
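For concreteness, here is a minimal, self-contained variant of the second pattern (value_type collapsed to int, and update() standing in for whatever lazy work you need; the names are illustrative, not from the original):
#include <iostream>

class cached_value {
    int value = 0;
    bool needs_updating = true;
    void update() { value *= 2; needs_updating = false; } // stand-in for real work
public:
    operator int const & () {
        if (needs_updating) update();   // lazy refresh on read
        return value;
    }
    cached_value& operator = (int t) {
        value = t;                      // write path
        needs_updating = true;          // mark stale so the next read recomputes
        return *this;
    }
};

int main() {
    cached_value f;
    f = 21;
    std::cout << f << '\n';   // prints 42: the read triggered update()
}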
I'm a tad late again, but I wanted to answer because I don't totally agree with some other here, and think there's additional points to lay out.
It's difficult to say for sure whether your access methods are code smells without seeing a larger codebase or knowing more about the intent. Everyone here is right about one thing: access methods are generally to be avoided unless they do some 'significant work', or they expose data for the purpose of genericism (particularly in libraries).
So, we can go ahead and call methods like the idiomatic data() from STL containers, 'trivial access method'.
Why not use trivial access methods?
First, as others have noted, this can lead to an over-exposure of implementation details. At its best such exposure makes for tedious code, and at its worst it can lead to obfuscation of ownership semantics, resource leaks, or fatal exceptions. Exposure is fundamentally opposed to object orientation, because each object ought to manage its own data and operations.
Secondly, code tends to become long, hard to test, and hard to maintain, as you have noted.
When to use trivial access methods?
Usually when their intent is specific and non-trivial. For example, the STL containers' data() function exists to intentionally expose implementation details for the purposes of genericism in the standard library.
Procedural style-structs
Breaking away from directly object-oriented styles, as implementations sometimes do, you may want to consider a simple struct (or class, if you prefer) which acts as a data carrier; that is, it has all, or mostly, public properties. I would advise using a struct only for simple holders, as opposed to a class, which ought to establish some invariant in its constructor. In addition to private methods, static methods are a good way to express invariants in a class, for example a validation method. Establishing invariants on public data also works very well for immutable data.
An example:
// just holds some fields
struct simple_point {
    int x, y;
};

// holds the same fields, but enforces the invariant that coordinates
// must be in [0, 10].
class small_point {
public:
    int x, y;

    small_point() noexcept : x{}, y{} {}

    small_point(int u, int v)
    {
        if (!small_point::valid(u) || !small_point::valid(v)) {
            throw std::invalid_argument("small_point: Invalid coordinate.");
        }
        x = u;
        y = v;
    }

    static bool valid(int v) noexcept { return 0 <= v && v <= 10; }
};

several classes implement parent class with varying api

I have a class Feature with a pure virtual method.
class Feature {
public:
    virtual ~Feature() {}
    virtual const float getValue(const vector<int>& v) const = 0;
};
This class is implemented by several classes, for example FeatureA and FeatureB.
A separate class Computer (simplified) uses the getValue method to do some computation.
class Computer {
public:
    const float compute(const vector<Feature*>& features, const vector<int>& v) {
        float res = 0;
        for (int i = 0; i < features.size(); ++i) {
            res += features[i]->getValue(v);
        }
        return res;
    }
};
Now, I would like to implement FeatureC, but I realize that I need additional information in the getValue method. The method in FeatureC looks like
const float getValue(const vector<int>& v, const vector<int>& additionalInfo) const;
I can of course modify the signature of getValue in Feature, FeatureA, FeatureB to take additionalInfo as a parameter and also add additionalInfo as a parameter in the compute method. But then I may have to modify all those signatures again later if I want to implement FeatureD that needs even more additional info. I wonder if there is a more elegant solution to this or if there is a known design pattern that you can point me to for further reading.
You have at least two options:
Instead of passing the single vector to getValue(), pass a struct. In this struct you can put the vector today, and more data tomorrow (a rough sketch of this option follows below). Of course, if some concrete runs of your program don't need the extra fields, the need to compute them might be wasteful. But it will impose no performance penalty if you always need to compute all the data anyway (i.e. if there will always be one FeatureC).
Pass to getValue() a reference to an object having methods to get the necessary data. This object could be the Computer itself, or some simpler proxy. Then the getValue() implementations can request exactly what they need, and it can be lazily computed. The laziness will eliminate wasted computations in some cases, but the overall structure of doing it this way will impose some small constant overhead due to having to call (possibly virtual) functions to get the various data.
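A rough sketch of the first option (FeatureInput is a hypothetical name; the fields are illustrative):
#include <vector>

// Today this carries only the vector; tomorrow it can grow new fields
// without touching getValue's signature.
struct FeatureInput {
    std::vector<int> values;
    std::vector<int> additionalInfo;   // only some features will read this
};

class Feature {
public:
    virtual ~Feature() {}
    virtual float getValue(const FeatureInput& input) const = 0;
};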
Requiring the user of your Feature class hierarchy to call different methods based on class defeats polymorphism. Once you start doing dynamic_cast<>() you know you should be rethinking your design.
If a subclass requires information that it can only get from its caller, you should change the getValue() method to take an additionalInfo argument, and simply ignore that information in classes where it doesn't matter.
If FeatureC can get additionalInfo by calling another class or function, that's usually a better approach, as it limits the number of classes that need to know about it. Perhaps the data is available from an object which FeatureC is given access to via its constructor, or from a singleton object, or it can be calculated by calling a function. Finding the best approach requires a bit more knowledge about the case.
This problem is addressed in item 39 of C++ Coding Standards (Sutter, Alexandrescu), which is titled "Consider making virtual functions nonpublic, and public functions nonvirtual."
In particular, one of the motivations for following the Non-Virtual-Interface design pattern (this is what the item is all about) is stated as
Each interface can take its natural shape: When we separate the public interface
from the customization interface, each can easily take the form it naturally
wants to take instead of trying to find a compromise that forces them to look
identical. Often, the two interfaces want different numbers of functions and/or
different parameters; [...]
This is particularly useful
In base classes with a high cost of change
Another design pattern which is very useful in this case is the Visitor pattern. As with NVI, it applies when base classes (as well as the whole hierarchy) have a high cost of change. You can find plenty of discussion about this design pattern; I suggest you read the related chapter in Modern C++ Design (Alexandrescu), which as a bonus gives you great insight into how to use the (very easy to use) Visitor facilities in Loki.
I suggest you read all of this material and then edit the question so that we can give you a better answer. We can come up with all sorts of solutions (e.g. use an additional method which gives the class the additional parameters, if needed) which might well not suit your case.
Try to address the following questions:
would a template-based solution fit the problem?
would it be feasible to add a new layer of indirection when calling the function?
would a "push argument"-"push argument"-...-"push argument"-"call function" method be of help? (this might seem very odd at first, but
think to something like "cout << arg << arg << arg << endl", where
"endl" is the "call function")
how do you intend to distinguish how to call the function in Computer::compute?
Now that we had some "theory", let's aim for the practice using the Visitor pattern:
#include <iostream>
using namespace std;

class FeatureA;
class FeatureB;

class Computer {
public:
    int visitA(FeatureA& f);
    int visitB(FeatureB& f);
};

class Feature {
public:
    virtual ~Feature() {}
    virtual int accept(Computer&) = 0;
};

class FeatureA : public Feature {
public:
    int accept(Computer& c) {
        return c.visitA(*this);
    }
    int compute(int a) {
        return a + 1;
    }
};

class FeatureB : public Feature {
public:
    int accept(Computer& c) {
        return c.visitB(*this);
    }
    int compute(int a, int b) {
        return a + b;
    }
};

int Computer::visitA(FeatureA& f) {
    return f.compute(1);
}

int Computer::visitB(FeatureB& f) {
    return f.compute(1, 2);
}

int main()
{
    FeatureA a;
    FeatureB b;
    Computer c;
    cout << a.accept(c) << '\t' << b.accept(c) << endl;
}
This is a rough implementation of the Visitor pattern which, as you can see, solves your problem. I strongly advise you not to try to implement it this way; there are obvious dependency problems which can be solved by means of a refinement called the Acyclic Visitor. It is already implemented in Loki, so there is no need to worry about implementing it.
Apart from implementation, as you can see you are not relying on type switches (which, as somebody else pointed out, you should avoid whenever possible) and you are not requiring the classes to have any particular interface (e.g. one argument for the compute function). Moreover, if the visitor class is a hierarchy (make Computer a base class in the example), you won't need to add any new function to the hierarchy when you want to add functionalities of this sort.
If you don't like the visitA, visitB, ... "pattern", worry not: this is just a trivial implementation and you don't need that. Basically, in a real implementation you use template specialization of a visit function.
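As a very rough sketch of that idea (reusing FeatureA and FeatureB from the example above, and replacing visitA/visitB with explicit specializations of a single visit member template):
class Computer {
public:
    template<typename F> int visit(F& f);   // one name, specialized per feature
};

template<> int Computer::visit<FeatureA>(FeatureA& f) { return f.compute(1); }
template<> int Computer::visit<FeatureB>(FeatureB& f) { return f.compute(1, 2); }

// FeatureA::accept would then simply do: return c.visit(*this);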
Hope this helped, I had put a lot of effort into it :)
Virtual functions, to work correctly, need to have exactly the same "signature" (same parameters and same return type). Otherwise, you just get a "new member function", which isn't what you want.
The real question here is "how does the calling code know it needs the extra information".
You can solve this in a few different ways - the first one is to always pass in const vector <int>& additionalInfo, whether it's needed or not.
If that's not possible, because there isn't any additionalInfo except for in the case of FeatureC, you could have an "optional" parameter - which means use a pointer to vector (vector<int>* additionalInfo), which is NULL when the value is not available.
Of course if additionalInfo is a value that is something that can be stored in the FeatureC class, then that would also work.
Another option is to extend the base class Feature to have two more options:
class Feature {
public:
    virtual ~Feature() {}
    virtual const float getValue(const vector<int>& v) const = 0;
    virtual const float getValue(const vector<int>& v, const vector<int>& additionalInfo) { return -1.0; }
    virtual bool useAdditionalInfo() { return false; }
};
and then make your loop something like this:
for (int i = 0; i < features.size(); ++i) {
    if (features[i]->useAdditionalInfo())
    {
        res += features[i]->getValue(v, additionalInfo);
    }
    else
    {
        res += features[i]->getValue(v);
    }
}

Is this a design pattern - returning this from setters?

Is there a name for this:
class A
{
public:
    A* setA()
    {
        // set a
        return this;
    }
    A* setB()
    {
        // set b
        return this;
    }
};
so you can do something like this:
A* a = new A;
a->setA()->setB();
Are there any drawbacks to using this? Advantages?
It's known as method chaining (FAQ link), and is more commonly done with references, not pointers.
Method chaining is strongly associated with the Named Parameter Idiom (FAQ link), as I now, after posting an initial version of this answer, see that Steve Jessop discusses in his answer. The NPI idiom is one simple way to provide a large number of defaulted arguments without forcing complexity into the constructor calls. For example, this is relevant for GUI programming.
One potential problem with the method chaining technique is when you want or need to apply the NPI idiom for classes in an inheritance hierarchy. Then you discover that C++ does not support covariant methods. What that is: when you let your eyes wander up or down the classes in a chain of class inheritance, then a covariant method is one whose definition involves some type that to your wandering eye varies in specificity in the same way as the class it’s defined in.
It is about the same problem as with defining a clone method, which has the same textual definition in all classes, but must be laboriously repeated in each class in order to get the types right.
Solving that problem is hard without language support; it appears to be an inherently complex problem, a kind of conflict with the C++ type system. My “How to do typed optional arguments in C++98” blog post links to relevant source code for automating the generation of covariant definitions, and to an article I wrote about it in Dr. Dobbs Journal. Maybe I'll revisit that for C++11, or sometime, because the complexity and possible brittleness may appear as a larger cost than it’s worth…
I've heard it called something like "method chaining" before, but I wouldn't call it a design pattern. (Some people also talk about implementing a "fluent interface" using this - I'd never seen it called that before though, but Martin Fowler seems to have written about it a while back)
You don't lose much by doing this - you can always ignore the return result quite happily if you don't want to use it like that.
As to is it worth doing I'm less sure. It can be quite cryptic in some circumstances. It is however basically required for things like operator<< for stream based IO though. I'd say it's a call to be made on how it fits in with the rest of the code - is it expected/obvious to people reading it?
(As Steve Jessop pointed out this is almost always done with references though, not pointers)
Another common-ish use is with "parameter objects". Without method chaining, they're quite inconvenient to set up, but with it they can be temporaries.
Instead of:
void complicated_function(P1 param1 = default1, P2 param2 = default2, P3 param3 = default3);
Write:
struct ComplicatedParams {
    P1 mparam1;
    P2 mparam2;
    P3 mparam3;
    ComplicatedParams() : mparam1(default1), mparam2(default2), mparam3(default3) {}
    ComplicatedParams &param1(P1 p) { mparam1 = p; return *this; }
    ComplicatedParams &param2(P2 p) { mparam2 = p; return *this; }
    ComplicatedParams &param3(P3 p) { mparam3 = p; return *this; }
};
void complicated_function(const ComplicatedParams &params);
Now I can call it:
complicated_function(ComplicatedParams().param2(foo).param1(bar));
Which means the caller doesn't have to remember the order of parameters. Without the method chaining that would have to be:
ComplicatedParams params;
params.param1(foo);
params.param2(bar);
complicated_function(params);
I can also call it:
complicated_function(ComplicatedParams().param3(baz));
Which means that without having to define a tonne of overloads, I can specify just the last parameter and leave the rest at default.
The final obvious tweak is to make complicated_function a member of ComplicatedParams:
struct ComplicatedAction {
    P1 mparam1;
    P2 mparam2;
    P3 mparam3;
    ComplicatedAction() : mparam1(default1), mparam2(default2), mparam3(default3) {}
    ComplicatedAction &param1(P1 p) { mparam1 = p; return *this; }
    ComplicatedAction &param2(P2 p) { mparam2 = p; return *this; }
    ComplicatedAction &param3(P3 p) { mparam3 = p; return *this; }
    void run();
};
ComplicatedAction().param3(baz).run();
One downside is that if you derive a class from A, say like this:
class Foo : public A
{
public:
    Foo *setC()
    {
        // set C
        return this;
    }
};
then the order you call the setters is important. You'll need to call all the setters on Foo first. For example, this won't work:
Foo* f = new Foo();
f->setA()->setC();
Whereas this will:
Foo* f = new Foo();
f->setC()->setA();
It is commonly used in, for example, Boost, but most of the time the functions return references instead:
A &setX()
{
    // ...
    return *this;
}

A recurring const-conundrum

I often find myself having to define two versions of a function in order to have one that is const and one which is non-const (often a getter, but not always). The two vary only by the fact that the input and output of one is const, while the input and output of the other is non-const. The guts of the function - the real work, is IDENTICAL.
Yet, for const-correctness, I need them both. As a simple practical example, take the following:
inline const ITEMIDLIST * GetNextItem(const ITEMIDLIST * pidl)
{
    return pidl ? reinterpret_cast<const ITEMIDLIST *>(reinterpret_cast<const BYTE *>(pidl) + pidl->mkid.cb) : NULL;
}

inline ITEMIDLIST * GetNextItem(ITEMIDLIST * pidl)
{
    return pidl ? reinterpret_cast<ITEMIDLIST *>(reinterpret_cast<BYTE *>(pidl) + pidl->mkid.cb) : NULL;
}
As you can see, they do the same thing. I can choose to define one in terms of the other using yet more casts, which is more appropriate if the guts - the actual work, is less trivial:
inline const ITEMIDLIST * GetNextItem(const ITEMIDLIST * pidl)
{
    return pidl ? reinterpret_cast<const ITEMIDLIST *>(reinterpret_cast<const BYTE *>(pidl) + pidl->mkid.cb) : NULL;
}

inline ITEMIDLIST * GetNextItem(ITEMIDLIST * pidl)
{
    return const_cast<ITEMIDLIST *>(GetNextItem(const_cast<const ITEMIDLIST *>(pidl)));
}
So, I find this terribly tedious and redundant. But if I wish to write const-correct code, then I either have to supply both of the above, or I have to litter my "consumer-code" with const-casts to get around the problems of having only defined one or the other.
Is there a better pattern for this? What is the "best" approach to this issue in your opinion:
providing two copies of a given function - the const and non-const versions
or just one version, and then requiring consumers of that code to do their casts as they will?
Or is there a better approach to the issue entirely?
Is there work being done on the language itself to mitigate or obviate this issue entirely?
And for bonus points:
do you find this to be an unfortunate by-product of the C++ const-system
or do you find this to be tantamount to touching the very heights of mount Olympus?
EDIT:
If I supply only the first - takes const returns const, then any consumer that needs to modify the returned item, or hand the returned item to another function that will modify it, must cast off the constness.
Similarly, if I supply only the second definition - takes non-const and returns non-const, then a consumer that has a const pidl must cast off the constness in order to use the above function, which honestly, doesn't modify the constness of the item itself.
Maybe more abstraction is desirable:
THING & Foo(THING & it);
const THING & Foo(const THING & it);
I would love to have a construct:
const_neutral THING & Foo(const_neutral THING & it);
I certainly could do something like:
THING & Foo(const THING & it);
But that's always rubbed me the wrong way. I am saying "I don't modify the contents of your THING, but I'm going to get rid of the constness that you entrusted me with silently for you in your code."
Now, a client, which has:
const THING & it = GetAConstThing();
...
ModifyAThing(Foo(it));
That's just wrong. GetAConstThing's contract with the caller is to give it a const reference. The caller is expected NOT TO MODIFY the thing - only use const-operations on it. Yes, the caller can be evil and wrong and cast away that constness of it, but that's just Evil(tm).
The crux of the matter, to me, is that Foo is const-neutral. It doesn't actually modify the thing its given, but its output needs to propagate the constness of its argument.
NOTE: edited a 2nd time for formatting.
IMO this is an unfortunate by-product of the const system, but it doesn't come up that often: only when functions or methods give out pointers/references to something (whether or not they modify something, a function can't hand out rights that it doesn't have or const-correctness would seriously break, so these overloads are unavoidable).
Normally, if these functions are just one short line, I'd just duplicate them. If the implementation is more complicated, I've used templates to avoid code duplication:
namespace
{
    // here T is intended to be either [int] or [const int]
    // basically you can also assert at compile-time
    // whether the type is what it is supposed to be
    template <class T>
    T* do_foo(T* p)
    {
        return p; // suppose this is something more complicated than that
    }
}

int* foo(int* p)
{
    return do_foo(p);
}

const int* foo(const int* p)
{
    return do_foo(p);
}

int main()
{
    int* p = 0;
    const int* q = foo(p); // non-const version
    foo(q); // const version
}
The real problem here appears to be that you're providing the outside world with (relatively) direct access to the internals of your class. In a few cases (e.g., container classes) that can make sense, but in most it means you're providing low-level access to the internals as dumb data, where you should be looking at the higher-level operations that client code does with that data, and then provide those higher-level operations directly from your class.
Edit: While it's true that in this case, there's apparently no class involved, the basic idea remains the same. I don't think it's shirking the issue either -- I'm simply pointing out that while I agree it is an issue, it's one that arises only rather infrequently.
I'm not sure low-level code justifies such things either. Most of my code is much lower level than most people ever have much reason to work with, and I still only encounter it rather infrequently.
Edit2: I should also mention that C++ 0x has a new definition of the auto keyword, along with a new keyword (decltype) that make a fair number of things like this considerably easier to handle. I haven't tried to implement this exact function with them, but this general kind of situation is the sort of thing for which they're intended (e.g., automatically figuring out a return type based on passed arguments). That said, they normally do just a bit more than you want, so they might be a bit clumsy (if useful at all) for this exact situation.
I don't believe it's the deficiency of const-correctness per se, but rather the lack of convenient ability to generalize a method over cv-qualifiers (in the same way we can generalize over types via templates). Hypothetically, imagine if you could write something like:
template<cvqual CV>
inline CV ITEMIDLIST* GetNextItem(CV ITEMIDLIST * pidl)
{
return pidl ? reinterpret_cast<CV ITEMIDLIST *>(reinterpret_cast<CV BYTE *>(pidl) + pidl->mkid.cb) : NULL;
}
ITEMIDLIST o;
const ITEMIDLIST co;
ITEMIDLIST* po = GetNextItem(&o); // CV is deduced to be nothing
ITEMIDLIST* pco = GetNextItem(&co); // CV is deduced to be "const"
Now you can actually do this kind of thing with template metaprogramming, but this gets
messy real quick:
template<class T, class TProto>
struct make_same_cv_as {
    typedef T result;
};

template<class T, class TProto>
struct make_same_cv_as<T, const TProto> {
    typedef const T result;
};

template<class T, class TProto>
struct make_same_cv_as<T, volatile TProto> {
    typedef volatile T result;
};

template<class T, class TProto>
struct make_same_cv_as<T, const volatile TProto> {
    typedef const volatile T result;
};

template<class CV_ITEMIDLIST>
inline CV_ITEMIDLIST* GetNextItem(CV_ITEMIDLIST* pidl)
{
    return pidl ? reinterpret_cast<CV_ITEMIDLIST*>(reinterpret_cast<typename make_same_cv_as<BYTE, CV_ITEMIDLIST>::result*>(pidl) + pidl->mkid.cb) : NULL;
}
The problem with the above is the usual problem with all templates - it'll let you pass object of any random type so long as it has the members with proper names, not just ITEMIDLIST. You can use various "static assert" implementations, of course, but that's also a hack in and of itself.
Alternatively, you can use the templated version to reuse the code inside your .cpp file, and then wrap it into a const/non-const pair and expose that in the header. That way, you pretty much only duplicate function signature.
Your functions are taking a pointer to a pidl which is either const or non-const. Either your function will be modifying the parameter or it won't - choose one and be done with it. If the function also modifies your object, make the function non-const. I don't see why you should need duplicate functions in your case.
You've got a few workarounds now...
Regarding best practices: Provide a const and a non-const versions. This is easiest to maintain and use (IMO). Provide them at the lowest levels so that it may propagate most easily. Don't make the clients cast, you're throwing implementation details, problems, and shortcomings on them. They should be able to use your classes without hacks.
I really don't know of an ideal solution... I think a keyword would ultimately be the easiest (I refuse to use a macro for it). If I need const and non-const versions (which is quite frequent), I just define it twice (as you do), and remember to keep them next to each other at all times.
I think it's hard to get around; if you look at something like vector in the STL, you have the same thing:
iterator begin() {
    return (iterator(_Myfirst, this));
}
const_iterator begin() const {
    return (const_iterator(_Myfirst, this));
}
/A.B.
During my work I developed a solution similar to what Pavel Minaev proposed. However I use it a bit differently and I think it makes the thing much simpler.
First of all you will need two meta-functions: an identity and const adding. Both can be taken from Boost if you use it (boost::mpl::identity from Boost.MPL and boost::add_const from Boost.TypeTraits). They are however (especially in this limited case) so trivial that they can be defined without referring to Boost.
EDIT: C++0x provides add_const (in type_traits header) meta-function so this solution just became a bit simpler. Visual C++ 2010 provides identity (in utility header) as well.
The definitions are as follows:
template<typename T>
struct identity
{
    typedef T type;
};
and
template<typename T>
struct add_const
{
    typedef const T type;
};
Now, having that, generally you will provide a single implementation of a member function as a private (or protected, if somehow required) static function which takes this as one of its parameters (in the case of a non-member function, this is omitted).
That static function also has a template parameter: the meta-function for dealing with constness. The actual functions will then call this static function, specifying as the template argument either identity (non-const version) or add_const (const version).
Generally this will look like:
class MyClass
{
public:
    Type1* fun(
        Type2& arg)
    {
        return fun_impl<identity>(this, arg);
    }

    const Type1* fun(
        const Type2& arg) const
    {
        return fun_impl<add_const>(this, arg);
    }

private:
    template<template<typename Type> class Constness>
    static typename Constness<Type1>::type* fun_impl(
        typename Constness<MyClass>::type* p_this,
        typename Constness<Type2>::type& arg)
    {
        // Do the implementation using Constness each time constness
        // of the type differs.
    }
};
Note that this trick does not force you to have the implementation in the header file. Since fun_impl is private it should not be used outside of MyClass anyway, so you can move its definition to the source file (leaving the declaration in the class so it has access to class internals) and move the fun definitions to the source file as well.
This is only a bit more verbose; however, in the case of longer, non-trivial functions it pays off.
I think it is natural. After all you just said that you have to repeat the same algorithm (function implementation) for two different types (const one and non-const one). And that is what templates are for. For writing algorithms which work with any type satisfying some basic concepts.
I would posit that if you need to cast off the const of a variable to use it then your "consumer" code is not const correct. Can you provide a test case or two where you are running into this issue?
You don't need two versions in your case. A non-const thing will implicitly convert to a const thing, but not vice versa. From the name of your function, it looks like GetNextItem will have no reason to modify pidl, so you can rewrite it like this:
inline ITEMIDLIST * GetNextItem(const ITEMIDLIST * pidl);
Then clients can call it with a const or non-const ITEMIDLIST and it will just work:
ITEMIDLIST* item1;
const ITEMIDLIST* item2;
item1 = GetNextItem(item1);
item2 = GetNextItem(item2);
From your example, this sounds like a special case of having a pass-through function, where you want the return type to exactly match the parameter's type. One possibility would be to use a template. eg:
template<typename T> // T should be a (possibly const) ITEMIDLIST *
inline T GetNextItem(T pidl)
{
    // The BYTE pointer is only an intermediate for the offset arithmetic; the
    // const_cast keeps the outer reinterpret_cast legal when T is non-const.
    return pidl
        ? reinterpret_cast<T>(const_cast<BYTE *>(reinterpret_cast<const BYTE *>(pidl)) + pidl->mkid.cb)
        : NULL;
}
You could use templates.
template<typename T, typename U>
inline T* GetNextItem(T* pidl)
{
    return pidl ? reinterpret_cast<T*>(reinterpret_cast<U*>(pidl) + pidl->mkid.cb) : NULL;
}
and use them like
ITEMIDLIST* foo = GetNextItem<ITEMIDLIST, BYTE>(bar);
const ITEMIDLIST* constfoo = GetNextItem<const ITEMIDLIST, const BYTE>(constbar);
or use some typedefs if you get fed up with typing.
If your function doesn't use a second type with the same changing constness, the compiler will deduce automatically which function to use and you can omit the template parameters.
But I think there may be a deeper problem hidden in the structure of ITEMIDLIST. Is it possible to derive from ITEMIDLIST? Almost forgot my win32 times... bad memories...
Edit: And you can, of course, always abuse the preprocessor. That's what it's made for. Since you are already on win32, you can completely turn to the dark side; it doesn't matter anymore ;-)