What is the advantage of a single-function-class over a function? - c++

I have often encountered classes with the following structure:
class FooAlgorithm {
public:
    FooAlgorithm(A a, B b);
    void run();

private:
    A _a;
    B _b;
};
So: classes with only one public member function, which is also only called once.
Is there any advantage, in any case, over a single free function foo(A a, B b)?
The latter option is easier to call, potentially has fewer header dependencies, and also has much less boilerplate in the header.

An object has state that can be set up at object construction time. Then the single member function can be called at a later time, without knowing anything about how the object was set up, and the function can refer to the state that was set up earlier. A standalone function cannot do that. Any state must be passed to it as arguments, or be global/static (global/static data is best avoided for a variety of reasons).
A hands-on example is worth a thousand abstract explanations, so here is an exercise. Consider a simple object:
struct Obj {
    std::array<std::string, 42> attributes;
};
How would you sort a vector of such objects, comparing only the attribute number K (K being a run-time parameter in the range 0..41)? Use std::sort and do not use any global or static data. Note how std::sort compares two objects: it calls a user-provided comparator, and passes it two objects to be compared, but it knows nothing about the parameter K and cannot pass it along to the comparator.
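For reference, here is one way to solve the exercise, as a sketch (the names other than Obj and std::sort are illustrative): a free function cannot hold K, but a comparator object can carry it as state.

#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <vector>

struct Obj { // same Obj as above
    std::array<std::string, 42> attributes;
};

// The comparator object stores K at construction time; std::sort later
// calls operator() without knowing that K even exists.
struct CompareByAttribute {
    std::size_t k;
    explicit CompareByAttribute(std::size_t k) : k(k) {}
    bool operator()(const Obj& lhs, const Obj& rhs) const {
        return lhs.attributes[k] < rhs.attributes[k];
    }
};

void sortByAttribute(std::vector<Obj>& v, std::size_t k) {
    std::sort(v.begin(), v.end(), CompareByAttribute(k));
}

A lambda capturing k does the same job in C++11 and later; under the hood the compiler generates exactly such a single-function class.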

For bigger projects it may be easier to structure your program using OOP.
For a simple program like yours it is easier to use a single function.

Classes introduce a new dynamic to your program: better readability, security, modularity, re-usability, and so on.
Functions are used to organize code into blocks.

Related

map to a function pointer with different number of arguments and varying datatypes

In my code, I have several classes, each with different methods. For example:
class A {
public:
    int sum(int a, int b);
    bool isScalable(double d);
};

class B {
public:
    std::string ownerName();
};
My aim is to create a map of all the function names, as below:
std::map<std::string, FnPtr> myMap;
// Add each newly implemented function to this map
myMap["sum"] = &A::sum;
myMap["isScalable"] = &A::isScalable;
myMap["ownerName"] = &B::ownerName;
The issue is that I am not aware of how I can define FnPtr. Can you please help me out?
As comments suggest, it is unlikely you really want to do this.
Assuming that you do, however - perhaps you should reconsider whether C++ is the right language for the task. Another language with better runtime reflection facilities might be more appropriate; perhaps an interpreted language like Python or Perl. Those are typically much more convenient when you need to look up class methods by name at runtime.
If it has to be C++, then perhaps you should relax the class structure somewhat. Use a single class for both A's and B's (let's call it MyCommonObj), and have the class hold a map of strings to function pointers. As for these functions' signatures - it's probably a good idea not to make them member functions, but freestanding ones. In that case, perhaps your function pointer type would be:
using generic_function = std::any (*)(std::vector<std::any>);
That's pretty generic - for storage and for invocation. If you have this map, you can easily look up your function name and pass the arguments. However, you might need to also keep additional information about what type your arguments should be, otherwise you'll always be passing strings. ... which is also an option, I suppose:
using generic_function = std::any (*)(std::vector<std::string>);
Now if the A and B members in your example are really non-static like you listed them, i.e. they use instance fields, then these generic functions must also always take a reference or pointer to an instance of MyCommonObj:
using generic_function = std::any (*)(MyCommonObj&, std::vector<std::string>);
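As a rough sketch of how such a registry could be wired up (C++17; the contents of MyCommonObj and the sum function here are illustrative assumptions, not part of the question):

#include <any>
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct MyCommonObj {
    int stored = 40; // hypothetical instance state
};

using generic_function = std::any (*)(MyCommonObj&, std::vector<std::string>);

// A freestanding function matching the generic signature; it parses its
// string arguments itself.
std::any sum(MyCommonObj& obj, std::vector<std::string> args) {
    return obj.stored + std::stoi(args.at(0));
}

int main() {
    std::map<std::string, generic_function> registry;
    registry["sum"] = &sum;

    MyCommonObj obj;
    std::any result = registry["sum"](obj, {"2"});
    std::cout << std::any_cast<int>(result) << '\n'; // prints 42
}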
Finally, note that code using this type, and run-time lookup of function names etc - will not be very performant.
If you're not using C++17 and don't have access to std::any, you can either:
Use boost::any from the Boost libraries.
Use an any-emulator library (several exist on GitHub).
Use a union of all the types you actually use, e.g. union {int i; double d;} - but then you'll need to protect yourself against passing values of the wrong type, as in the sketch below.
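A C++98-compatible sketch of that last option, where a tag guards against reading the wrong member (Value and the helper functions are illustrative names):

// A tagged union: `kind` records which member of `u` is currently valid.
struct Value {
    enum Kind { Int, Double } kind;
    union { int i; double d; } u;
};

Value makeInt(int x) {
    Value v; v.kind = Value::Int; v.u.i = x; return v;
}

Value makeDouble(double x) {
    Value v; v.kind = Value::Double; v.u.d = x; return v;
}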

Vector of pointers to base type, find all instances of a given derived type stored in a base type

Suppose you have a base class inside of a library:
class A {};
and derived classes
class B: public A {};
class C: public A {};
Now instances of B and C are stored in a std::vector of boost::shared_ptr<A>:
std::vector<boost::shared_ptr<A> > A_vec;
A_vec.push_back(boost::shared_ptr<B>(new B()));
A_vec.push_back(boost::shared_ptr<C>(new C()));
Adding instances of B and C is done by a user, and there is no way to determine in advance the order in which they will be added.
However, inside of the library, there may be a need to perform specific actions on B and C, so the pointer to the base class needs to be cast to B or C.
I can of course do "trial and error" conversions, i.e. try to cast to B and C (and any other derivative of the base class) until I find a conversion that succeeds. However, this method seems very crude and error-prone, and I'm looking for a more elegant (and better performing) way.
I am looking for a solution that will also work with C++98, but may involve boost functionality.
Any ideas?
EDIT:
O.k., thanks for all the answers so far!
I'd like to give some more details regarding the use-case. All of this happens in the context of parametric optimization.
Users define the optimization problem by:
Specifying the parameters, i.e. their types (e.g. "constrained double", "constrained integer", "unconstrained double", "boolean", etc.) and initial values
Specifying the evaluation function, which assigns one or more evaluations (double values) to a given parameter set
Different optimization algorithms then act on the problem definitions, including their parameters.
There are a number of predefined parameter objects for common cases, but users may also create their own parameter objects by deriving from one of my base classes. So from a library perspective, apart from the fact that the parameter objects need to comply with a given (base-class) API, I cannot assume much about them.
The problem definition is a user-defined C++ class, derived from a base class with a std::vector interface. The user adds his (predefined or home-grown) parameter objects and overrides a fitness function.
Access to the parameter objects may happen
from within the optimization algorithms (usually o.k., even for home-grown parameter objects, as derived parameter objects need to provide access functions for their values).
from within the user-supplied fitness function (usually o.k., as the user knows where to find which parameter object in the collection and its value can be accessed easily)
This works fine.
There may however be special cases where
a user wants to access specifics of his home-grown parameter types
a third party has supplied the parameter structure (this is an Open Source library, others may add code for specific optimization problems)
the parameter structure (i.e. which parameters are where in the vector) may be modified as part of the optimization problem --> example: training of the architecture of a neural network
Under these circumstances it would be great to have an easy method to access all parameter objects of a given derived type inside of the collection of base types.
I already have a templated "conversion_iterator". It iterates over the vector of base objects and skips those that do not comply with the desired target type. However, this is based on "trial and error" conversion (i.e. I check whether the converted smart pointer is NULL), which I find very inelegant and error-prone.
I'd love to have a better solution.
NB: The optimization library is targeted at use-cases where the evaluation step for a given parameter set may last arbitrarily long (usually seconds, possibly hours or longer). So speed of access to parameter types is not much of an issue. But stability and maintainability are...
There’s no better general solution than trying to cast and seeing whether it succeeds. You can alternatively derive the dynamic typeid and compare it to all types in turn, but that is effectively the same amount of work.
More fundamentally, your need to do this hints at a design problem: the whole purpose of a base class is to be able to treat children as if they were parents. There are certain situations where this is necessary though, in which case you’d use a visitor to dispatch them.
If possible, add virtual methods to class A to do the "specific actions on B and C".
If that's not possible or not reasonable, use the pointer form of dynamic_cast, so there are no exceptions involved.
for (boost::shared_ptr<A> a : A_vec)
{
    if (B* b = dynamic_cast<B*>(a.get()))
    {
        b->do_something();
    }
    else if (C* c = dynamic_cast<C*>(a.get()))
    {
        something_else(*c);
    }
}
Adding instances of B and C is done by a user, and there is no way to determine in advance the order, in which they will be added.
Okay, so just put them in two different containers?
std::vector<boost::shared_ptr<A> > A_vec;
std::vector<boost::shared_ptr<B> > B_vec;
std::vector<boost::shared_ptr<C> > C_vec;
void add(B* p)
{
    B_vec.push_back(boost::shared_ptr<B>(p));
    A_vec.push_back(B_vec.back());
}

void add(C* p)
{
    C_vec.push_back(boost::shared_ptr<C>(p));
    A_vec.push_back(C_vec.back());
}
Then you can iterate over the Bs or Cs to your heart's content.
I would suggest implementing a method in the base class (e.g. TypeOf()) which returns the type of the particular object. Declare that method as pure virtual, so that every derived type is forced to implement it. As for the type itself, you can define an enum with a value for each class:
enum class ClassType { ClassA, ClassB, ClassC };
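A minimal sketch of that suggestion, using the enum above (note that enum class requires C++11; for the C++98 constraint mentioned in the question, a plain enum works the same way):

class A {
public:
    virtual ~A() {}
    // Pure virtual: every derived class is forced to implement it.
    virtual ClassType TypeOf() const = 0;
};

class B : public A {
public:
    ClassType TypeOf() const { return ClassType::ClassB; }
};

class C : public A {
public:
    ClassType TypeOf() const { return ClassType::ClassC; }
};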
This answer might interest you: Generating an interface without virtual functions?
This shows you both approaches:
variant w/ visitor in a single collection,
separate collections,
as have been suggested by others (Fred and Konrad, notably). The latter is more efficient for iteration; the former could well be more pure and maintainable. It could even be more efficient too, depending on the usage patterns. A variant-based sketch of the first approach follows.
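A C++98-compatible sketch of the variant/visitor route using boost::variant (here B and C are illustrative value types standing in for the real parameter classes):

#include <boost/variant.hpp>
#include <cstddef>
#include <iostream>
#include <vector>

// Illustrative value types standing in for the real parameter classes.
struct B { int id; };
struct C { double weight; };

typedef boost::variant<B, C> Param;

// The visitor: one overload per stored type, checked at compile time.
struct Print : boost::static_visitor<void> {
    void operator()(const B& b) const { std::cout << "B: " << b.id << '\n'; }
    void operator()(const C& c) const { std::cout << "C: " << c.weight << '\n'; }
};

int main() {
    std::vector<Param> params;
    B b; b.id = 1;
    C c; c.weight = 2.5;
    params.push_back(b);
    params.push_back(c);
    for (std::size_t i = 0; i != params.size(); ++i)
        boost::apply_visitor(Print(), params[i]);
}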

Sharing function between classes

I have three classes which each store their own array of double values. To populate the arrays I use a fairly complex function, let's say foo(), which takes several parameters and calculates the appropriate values for the array.
Each of my three classes uses the same function with only minor adjustments (i.e. the input parameters vary slightly). Each of the classes is actually quite similar although they each perform separate logic when retrieving the values of the array.
So I am wondering how should I 'share' the function so that all classes can use it, without having to duplicate the code?
I was thinking of creating a base class which contained the function foo() and a virtual get() method. My three classes could then inherit from this base class. Alternatively, I was also thinking perhaps a global function was the way to go, maybe putting the function into a namespace?
If the classes have nothing in common besides this foo() function, it is silly to put it in a base class; make it a free function instead. C++ is not Java.
Declaring the function in a base class sounds like the most appropriate solution. I'm not sure you need the virtual get() though; instead, just declare the array in the base class and provide access method(s) for descendants.
The more complex part is "the input parameters vary slightly". If the parameters differ by type only, then you may write a template function, as in the sketch below. If the difference is more significant, then the only solution I see is splitting the main function into several logic blocks and using these blocks in descendant classes to produce the final result.
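A minimal sketch of the differs-only-by-type case (fillArray and its body are illustrative stand-ins for the real logic):

#include <cstddef>

// Shared population logic, parameterised on the input type. Anything
// convertible to double works, so slightly different callers reuse it.
template <typename Input>
void fillArray(double* arr, std::size_t size, const Input& scale) {
    for (std::size_t i = 0; i != size; ++i)
        arr[i] = static_cast<double>(scale) * i; // stand-in for the real calculation
}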
If your classes are quite similar, you could create a template class that has the function foo<T>(), instantiated in three different ways.
Implement that function in a base class. If these classes are as similar as you say, they should be derived from one base class anyway! If there are several functions like foo(), it might be reasonable in some cases to combine them into another class which is utilized by/with your classes.
If the underlying data of the class is the same (an array of doubles), consider using a single class and overloading the constructor, or just use 3 different functions:
void PopulateFromString(const string&)
void PopulateFromXml(...)
void PopulateFromInteger(...)
If the data or the behavior is different in each class type, then your base-class solution is good.
You can also define a function in the same namespace as your classes as a utility function, if it has nothing to do with specific class behavior (polymorphism). Bjarne Stroustrup recommends this method, by the way.
For the purpose of this answer, I am assuming the classes you have are not common in any other outwards way; they may load the same data, but they are providing different interfaces.
There are two possible situations here, and you haven't told us which one it is. It could be more like
void foo(double* arr, size_t size) {
    // Some specific code (that probably just does some preparation)
    // Lots of generic code
    // ...
    // Some more specific code (cleanup?)
}
or something similar to
void foo(double* arr, size_t size) {
    // generic_code();
    // ...
    // specific_code();
    // generic_code();
    // ...
}
In the first case, the generic code may very well be easy to put into a separate function, and then making a base class doesn't make much sense: you'll probably be inheriting from it privately, and you should prefer composition over private inheritance unless you have a good reason to. You could put the new function in its own class if it benefits from it, but it's not strictly necessary. Whether you put it in a namespace or not depends on how you're organising your code.
The second case is trickier, and in that case I would advise polymorphism. However, you don't seem to need runtime polymorphism for this, and so you could just as well do it compile-time. Using the fact that this is C++, you can use CRTP:
template<typename IMPL>
class MyBase {
public:
    void foo(double* arr, size_t size) {
        // generic code
        // ...
        double importantResult = IMPL::DoALittleWork(/* args */);
        // more generic code
        // ...
    }
};

class Derived : public MyBase<Derived> {
public:
    static double DoALittleWork(/* params */) {
        double result = 0.0; // my specific stuff
        return result;
    }
};
This gives you the benefit of code organisation and saves you some virtual functions. On the other hand, it does make it slightly less clear what functions need to be implemented (although the error messages are not that bad).
I would only go with the second route if making a new function (possibly within a new class) would clearly be uglier. If you're parsing different formats as Andrey says, then having a parser object (that would be polymorphic) passed in would be even nicer as it would allow you to mock things with less trouble, but you haven't given enough details to say for sure.

C++: should all member variables use accessors and mutators?

I have about 15-20 member variables which need to be accessed, and I was wondering if it would be good just to let them be public instead of giving every one of them get/set functions.
The code would be something like
class A { // a singleton class
public:
    static A* get();
    B x, y, z;
    // ... a lot of other objects that should only have one copy
    // and don't change often
private:
    A();
    virtual ~A();
    static A* a;
};
I have also thought about putting the variables into an array, but I don't know the best way to do a lookup table. Would it be better to put them in an array?
EDIT:
Is there a better way than a singleton class to put them in a collection?
The C++ world isn't quite as hung up on "everything must be hidden behind accessors/mutators/whatever-they-decide-to-call-them-today" as some OO-supporting languages.
With that said, it's a bit hard to say what the best approach is, given your limited description.
If your class is simply a 'bag of data' for some other process, then using a struct instead of a class (the only difference is that all members default to public) can be appropriate.
If the class actually does something, however, you might find it more appropriate to group your get/set routines together by function/aspect or interface.
As I mentioned, it's a bit hard to tell without more information.
EDIT: Singleton classes are not smelly code in and of themselves, but you do need to be a bit careful with them. If a singleton is taking care of preference data or something similar, it only makes sense to make individual accessors for each data element.
If, on the other hand, you're storing generic input data in a singleton, it might be time to rethink the design.
You could place them in a POD structure and provide access to an object of that type:
struct VariablesHolder
{
    int a;
    float b;
    char c[20];
};

class A
{
public:
    A() : vh()
    {
    }

    VariablesHolder& Access()
    {
        return vh;
    }

    const VariablesHolder& Get() const
    {
        return vh;
    }

private:
    VariablesHolder vh;
};
No, that wouldn't be good. Imagine you want to change the way they are accessed in the future - for example, remove one member variable and let the get/set functions compute its value.
It really depends on why you want to give access to them, how likely they are to change, how much code uses them, how problematic having to rewrite or recompile that code is, how fast access needs to be, whether you need/want virtual access, what's more convenient and intuitive in the using code etc.. Wanting to give access to so many things may be a sign of poor design, or it may be 100% appropriate. Using get/set functions has much more potential benefit for volatile (unstable / possibly subject to frequent tweaks) low-level code that could be used by a large number of client apps.
Given your edit, an array makes sense if your client is likely to want to access the values in a loop, or a numeric index is inherently meaningful. For example, if they're chronologically ordered data samples, an index sounds good. In summary, arrays make it easier to provide algorithms to work with any or all of the indices - you have to consider whether that's useful to your clients; if not, try to avoid it, as it may make it easier to mistakenly access the wrong values, particularly if, say, two people branch some code, add an extra value at the end, then try to merge their changes. Sometimes it makes sense to provide arrays and named access, or an enum with meaningful names for indices.
This is a horrible design choice, as it allows any component to modify any of these variables. Furthermore, since access to these variables is done directly, you have no way to impose any invariant on the values, and if suddenly you decide to multithread your program, you won't have a single set of functions that need to be mutex-protected, but rather you will have to go off and find every single use of every single data member and individually lock those usages. In general, one should:
Not use singletons or global variables; they introduce subtle, implicit dependencies between components that allow seemingly independent components to interfere with each other.
Make variables const wherever possible and provide setters only where absolutely required.
Never make variables public (unless you are creating a POD struct, and even then, it is best to create POD structs only as an internal implementation detail and not expose them in the API).
Also, you mentioned that you need to use an array. You can use vector<B> or vector<B*> to create a dynamically-sized array of objects of type B or type B*. Rather than using A::get() to access your singleton instance, it would be better to have functions that need type A take a parameter of type const A&. This makes the dependency explicit, and it also limits which functions can modify the members of that class (pass A* or A& to functions that need to mutate it); a sketch follows.
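A small sketch of that idea (the names and members are illustrative):

#include <cstddef>
#include <vector>

struct B { int value; };

struct A {
    std::vector<B> items; // a dynamically-sized array of B
};

// Read-only access: the dependency on A is explicit, and this function
// cannot mutate it.
std::size_t countItems(const A& a) { return a.items.size(); }

// Mutating access: the call site makes it obvious that `a` may change.
void addItem(A& a, const B& b) { a.items.push_back(b); }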
As a convention, if you want a data structure to hold several public fields (plain old data), I would suggest using a struct (and using it in tandem with other classes: builder, flyweight, memento, and other design patterns).
Classes generally mean that you're defining an encapsulated data type, so the OOP rule is to hide data members.
In terms of efficiency, modern compilers optimize away calls to accessors/mutators, so the impact on performance would be non-existent.
In terms of extensibility, methods are definitely a win because derived classes would be able to override these (if virtual). Another benefit is that logic to check/observe/notify data can be added if data is accessed via member functions.
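For instance, a trivial accessor typically compiles down to a direct member read, while the mutator gives one central place to enforce an invariant (Widget here is an illustrative sketch, C++11):

class Widget {
public:
    int size() const { return size_; } // trivial accessor: typically inlined away
    void setSize(int s) {
        if (s >= 0)                    // one central place to impose the invariant
            size_ = s;
    }
private:
    int size_ = 0;
};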
Public members in a base class are generally difficult to keep track of.

Why bother with virtual functions in c++?

This is not a question about how they work or how they are declared; that, I think, is pretty clear to me. The question is about why to implement this.
I suppose the practical reason is to simplify a bunch of other code: it can declare its variables as the base type and still handle objects, and their specific methods, from many different subclasses?
Could this be done by templating and type-checking, like I do it in Objective-C? If so, which is more efficient? I find it confusing to declare an object as one class and instantiate it as another, even if it is its child.
Sorry for the stupid questions, but I haven't done any real projects in C++ yet, and since I am an active Objective-C developer (it is a much smaller language, thus relying heavily on SDK functionality, as on OS X and iOS) I need a clear view of the parallel ways of both cousins.
Yes, this can be done with templates, but then the caller must know what the actual type of the object is (the concrete class) and this increases coupling.
With virtual functions the caller doesn't need to know the actual class - it operates through a pointer to a base class, so you can compile the client once and the implementor can change the actual implementation as much as it wants and the client doesn't have to know about that as long as the interface is unchanged.
Virtual functions implement polymorphism. I don't know Obj-C, so I cannot compare both, but the motivating use case is that you can use derived objects in place of base objects and the code will work. If you have a compiled and working function foo that operates on a reference to base you need not modify it to have it work with an instance of derived.
You could do that (assuming that you had runtime type information) by obtaining the real type of the argument and then dispatching directly to the appropriate function with a switch of sorts, but that would require either manually modifying the switch for each new type (high maintenance cost) or having reflection (unavailable in C++) to obtain the method pointer. Even then, after obtaining a method pointer you would have to call it, which is as expensive as the virtual call.
As to the cost associated with a virtual call: basically (in all implementations with a virtual method table), a call to a virtual function foo applied to object o, i.e. o.foo(), is translated to o.vptr[ 3 ](), where 3 is the position of foo in the virtual table and is a compile-time constant. This is a double indirection:
from the object o, obtain the pointer to the vtable; index that table to obtain the pointer to the function; then call it. The extra cost compared with a direct non-polymorphic call is just the table lookup. (In fact there can be other hidden costs when using multiple inheritance, as the implicit this pointer might have to be shifted.) But the cost of the virtual dispatch is very small.
I don't know the first thing about Objective-C, but here's why you want to "declare an object as one class and instantiate it as another": the Liskov Substitution Principle.
Since a PDF is a document, and an OpenOffice.org document is a document, and a Word Document is a document, it's quite natural to write
Document *d;
if (ends_with(filename, ".pdf"))
    d = new PdfDocument(filename);
else if (ends_with(filename, ".doc"))
    d = new WordDocument(filename);
else
    /* ... you get the point */;
d->print();
Now, for this to work, print would have to be virtual, or be implemented using virtual functions, or be implemented using a crude hack that reinvents the virtual wheel. The program needs to know at runtime which of the various print methods to apply.
Templating solves a different problem, where you determine at compile time which of the various containers you're going to use (for example) when you want to store a bunch of elements. If you operate on those containers with template functions, then you don't need to rewrite them when you switch containers, or add another container to your program.
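A small sketch of that compile-time flavour (printAll is an illustrative name; the same template works for any container exposing begin()/end()):

#include <iostream>
#include <list>
#include <vector>

// The container type is fixed at compile time; switching containers needs
// no rewrite of this function.
template <typename Container>
void printAll(const Container& c) {
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        std::cout << *it << '\n';
}

int main() {
    std::vector<int> v(3, 1);
    std::list<int> l(2, 7);
    printAll(v); // instantiated once for std::vector<int>
    printAll(l); // instantiated again for std::list<int>
}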
A virtual function is important in inheritance. Think of an example where you have a CMonster class, and then CRaidBoss and CBoss classes that inherit from CMonster.
Both need to be drawn. A CMonster has a Draw() function, but the way a CRaidBoss and a CBoss are drawn is different. Thus, the implementation is left to them by utilizing the virtual function Draw.
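Sketched in code (the rendering bodies are elided placeholders):

class CMonster {
public:
    virtual ~CMonster() {}
    virtual void Draw() const { /* generic monster rendering */ }
};

class CBoss : public CMonster {
public:
    void Draw() const { /* boss-specific rendering */ }
};

class CRaidBoss : public CMonster {
public:
    void Draw() const { /* raid-boss-specific rendering */ }
};

// Through a CMonster reference, the right Draw() runs for the dynamic type.
void Render(const CMonster& m) { m.Draw(); }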
Well, the idea is simply to allow the compiler to perform checks for you.
It's like a lot of features: ways to hide what you don't want to have to do yourself. That's abstraction.
Inheritance, interfaces, etc. allow you to provide an interface to the compiler for the implementation code to match.
If you didn't have the virtual function mechanism, you would have to write:
class A
{
public:
    virtual ~A() {} // a virtual member (here the destructor) is required
                    // for dynamic_cast below to be legal
    void do_something();
};

class B : public A
{
public:
    void do_something(); // this one "hides" A::do_something() and replaces it
};

void DoSomething(A* object)
{
    // calling object->do_something() would ALWAYS call A::do_something() -
    // that's not what you want if object is really a B...
    // so we have to check manually:
    B* b_object = dynamic_cast<B*>(object);
    if (b_object != NULL) // ok, it's a B object: call B::do_something()
    {
        b_object->do_something();
    }
    else
    {
        object->do_something(); // it's an A: call A::do_something()
    }
}
Here there are several problems:
you have to write this for each function redefined in the class hierarchy;
you have one additional if for each child class;
you have to touch this function again each time you add a type to the hierarchy;
it's visible code, so you can easily get it wrong, each time.
So, marking functions virtual does this correctly in an implicit way, automatically rerouting the function call, dynamically, to the correct implementation depending on the final type of the object.
You don't have to write any logic, so you can't get errors in this code or have an additional thing to worry about.
It's the kind of thing you don't want to bother with, as it can be done by the compiler/runtime. For comparison, a sketch of the virtual version follows.
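The same example with virtual functions (same illustrative names as above):

class A
{
public:
    virtual ~A() {}
    virtual void do_something() { /* base behaviour */ }
};

class B : public A
{
public:
    void do_something() { /* B-specific behaviour */ }
};

void DoSomething(A* object)
{
    // Automatically dispatches to B::do_something() when object points to a B.
    object->do_something();
}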
The use of templates is also technically known as polymorphism by theorists (static polymorphism). Yes, both are valid approaches to the problem. The implementation techniques employed explain the better or worse performance of each.
For example, Java implements generics through type erasure. This means that it only appears to be using templates; under the surface it is plain old polymorphism.
C++ has very powerful templates. The use of templates makes code quicker, though each use of a template instantiates it for the given type. This means that, if you use an std::vector for ints, doubles and strings, you'll have three different vector classes: this means that the size of the executable will suffer.