Let's say I have two different implementations of mathematical vectors (and other mathematical structures such as matrices):
class SparseVector {
double dotproduct(SparseVector& other);
};
class DenseVector {
double dotproduct(DenseVector& other);
};
I would like to implement algorithms that use either exclusively sparse or exclusively dense algebra. Of course I would like to implement only a generic version of the algorithm that can deal with either of the two.
The first idea was to create a virtual vector class (code below for the concept, it wouldn't actually work this way):
class Vector {
virtual double dotproduct(Vector& other);
};
class SparseVector : public Vector {
double dotproduct(SparseVector& other) override;
};
class DenseVector : public Vector {
double dotproduct(DenseVector& other) override;
};
However that doesn't work because each of the Vector implementations can only work with other vectors of the same type. (That is, the implementation should not allow a dotproduct between a sparse and a dense vector).
Is there a good implementation strategy or design pattern that prevents me having to implement algorithms twice?
The question is not the same as this one, as I do not want to support dotproducts between a sparse and a dense vector.
I was thinking about using templates:
template<class T>
class algorithm {
};
but I don't know how to restrict T to be one of SparseVector/DenseVector, so the compiler knows that there exists a suitable dotproduct() function.
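(For reference, one way such a restriction can be spelled is a C++20 concept; this is a sketch only, and the DotProductVector name is illustrative:)
#include <concepts>

// T must offer dotproduct(T&) returning something convertible to double.
template<class T>
concept DotProductVector = requires(T a, T b) {
    { a.dotproduct(b) } -> std::convertible_to<double>;
};

template<DotProductVector T>
class algorithm {
    // generic code may freely call dotproduct() on values of type T
};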
The thing you're looking for is called the Visitor Pattern or Double Dispatch. You should be able to easily locate further information online. However, the gist is basically this: with regular virtual functions, the invoked code depends on the type of one object. With the Visitor Pattern, you get to pick the code depending on two objects (hence the name Double Dispatch), which is basically what you need.
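To make that concrete, here is a minimal double-dispatch sketch along the lines of the question's classes; the dotproductWith() hooks are illustrative names, and note that this turns the sparse/dense mismatch into a runtime error rather than a compile-time one:
#include <stdexcept>

class SparseVector;
class DenseVector;

class Vector {
public:
    virtual ~Vector() = default;
    virtual double dotproduct(const Vector& other) const = 0;
    // Second dispatch: mixing sparse and dense is rejected at runtime by default.
    virtual double dotproductWith(const SparseVector&) const { throw std::invalid_argument("mixed dot product"); }
    virtual double dotproductWith(const DenseVector&) const { throw std::invalid_argument("mixed dot product"); }
};

class SparseVector : public Vector {
public:
    double dotproduct(const Vector& other) const override { return other.dotproductWith(*this); }
    double dotproductWith(const SparseVector& other) const override { /* sparse-sparse product */ return 0.0; }
};

class DenseVector : public Vector {
public:
    double dotproduct(const Vector& other) const override { return other.dotproductWith(*this); }
    double dotproductWith(const DenseVector& other) const override { /* dense-dense product */ return 0.0; }
};
That runtime check is the price of having a common Vector base class; the template route keeps the mismatch a compile-time error instead.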
Related
I'm using two external libraries which define classes with identical contents (let's say Armadillo's Arma::vec and Eigen's Eigen::VectorXd). I would like to be able to convert between these classes as cleanly as possible.
If I had defined either class, it would be trivial to include a constructor or conversion operator in that class' definition, to allow me to write e.g.
Arma::vec foo(/*some constructor arguments*/);
Eigen::VectorXd bar = Eigen::VectorXd(foo);
but since both classes are from external libraries, I cannot do this. If I attempt to write a naive conversion function, e.g.
class A{
public:
int value_;
A(int value) : value_(value) {}
};
class B{
public:
int value_;
B(int value) : value_(value) {}
};
A A(const B& b){return A(b.value_);}
int main(void){
A a(1);
B b(2);
a = A(b);
}
then the function shadows the class definition, and suddenly I can't use the A class at all.
I understand that allowing A a=b to be defined would be a bad idea, but I don't see why allowing A a=A(b) would cause any problems.
My question:
Is it possible to write a function or operator to allow the syntax A a=A(b)? And if not, is there a canonical way of doing this kind of conversion?
I've seen A a=toA(b) in a few libraries, but this isn't used consistently, and I dislike the inconsistency with the usual type conversions.
Is it possible to write a function or operator to allow the syntax A a=A(b)?
No, it is not possible. The two classes involved define what conversions are possible and you can't change a class definition after it has been defined.
You will need to use a function as in your given example, although I would avoid repeating the type name and write
auto a = toA(b);
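With the question's A and B, such a function is just (a minimal sketch):
A toA(const B& b) { return A(b.value_); }

int main() {
    B b(2);
    auto a = toA(b);   // the target type is not repeated
}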
TL;DR
Best engineering practice is to use the Factory design pattern by introducing a function (or utility class) that consumes an Eigen::VectorXd and returns an Arma::vec.
Arma::vec createFrom(Eigen::VectorXd from) { ... }
Any other hacking is a waste of time and introduces tight coupling that will strike back sooner or later. Loose coupling is essential in software engineering.
Detailed
You might introduce a descendant of the target class in which you define a constructor like the one you described:
class MyArma : public Arma::vec {
public:
    MyArma(const Eigen::VectorXd& from) : Arma::vec(from.data(), from.size()) {
        /* empty constructor body: the base-class constructor already copied the elements */
    }
};
Then you'd be able to create Arma vectors from Eigen's vectors, e.g. into an Arma-typed array:
Arma::vec vecArray[] = { MyArma(eigenVect1), MyArma(eigenVect2) };
which follows from the principles of inheritance. Alternatively you could use the Decorator design pattern, where the original vector (Eigen) is hidden behind the interface of the target vector (Armadillo). That involves overriding all the methods, there must be no public attributes, and all the methods must have been declared virtual... so a lot of conditions.
However, there are some engineering flaws in the above design. You are adding performance overhead from the virtual method table, and you are committing yourself to maintaining a rather big and sensitive library for this purpose. And most importantly: you'd create a technological dependency, so-called spaghetti. One implementation shouldn't be aware of its alternatives.
The Armadillo documentation gives a nice hint that you should use the design pattern called Factory. A Factory is a standalone class or function that combines knowledge of both implementations and contains the algorithm to extract information from one and construct the other.
Based on http://arma.sourceforge.net/docs.html#imbue, you'd best create a factory that constructs the target vector with the same size as the input vector and then uses the imbue(...) method to set the individual elements from the corresponding elements of the input vector.
class ArmaVecFactory {
public:
    static Arma::vec createFrom(const Eigen::VectorXd& from) {
        Arma::vec armaVec(from.size(), fill::none);
        int currentElement = 0;
        armaVec.imbue( [&]() { return from(currentElement++); } );
        return armaVec;
    }
};
and then simply create objects like
Eigen::VectorXd sourceVector;
Arma::vec targetVector = std::move(ArmaVecFactory::createFrom(sourceVector));
Notes:
You can have the currentElement counter outside of the lambda expression, as it is captured by [&].
I am creating the vector on the stack, but the std::move outside makes sure that the memory is used effectively without excessive copying.
I'm kinda new to OOP, so this question feels a bit weird, but I want to know what I should do in this case.
Say I have a Tup4 class which just holds 4 doubles, and two classes Point4 and Vec4 that extend Tup4. Now, checking for equality in Tup4 is just comparing whether all 4 doubles in each tuple are (approximately) equal, and this holds in both classes extending it. However, it makes no sense to define an equality function in Tup4, because then I would be able to check for equality between a Point and a Vector, which doesn't make much sense. So I can't define a virtual equals method in Tup4; what can I do? The code is exactly the same in both cases, the only difference is the type of the parameter. So I want to know if I can avoid having two methods
bool equals(Point4 p);
bool equals(Vec4 v);
Where they both do the same but are defined in different classes
It looks like you already accepted an answer, but here's what I was going to propose, without going down the template route:
Define an equality method in your Tup4 class, but leave it protected:
class Tup4
{
public:
double a, b, c, d;
protected:
bool EqualityCheck(const Tup4& other) const
{
return (a == other.a && b == other.b && c == other.c && d == other.d);
}
};
Then your Point4 and Vec4 classes can have overloaded equality operators that call the parent's method:
class Point4 : public Tup4
{
public:
bool operator==(const Point4& other) const
{
return EqualityCheck(other);
}
};
You can use templates for this.
It is actually not a good use of OOP to shoehorn value-like types such as mathematical vectors and points into an object hierarchy. Object hierarchies mean using "reference semantics", whereas vectors, tuples, and points want to be values.
Look at, for example, how the C++ standard library implements complex numbers. It implements them as a class template parametrized on the number type you'd like to use e.g. float, double, etc. and then overloads the arithmetic operators to handle complex<T>.
How you would really implement a vector or point class is similar.
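As a rough illustration of that value-type style (a sketch only; basic_tuple4 is an illustrative name, not the poster's code):
// Value-semantics sketch in the spirit of std::complex<T>.
template <typename T>
struct basic_tuple4 {
    T a, b, c, d;
};

template <typename T>
bool operator==(const basic_tuple4<T>& lhs, const basic_tuple4<T>& rhs) {
    return lhs.a == rhs.a && lhs.b == rhs.b && lhs.c == rhs.c && lhs.d == rhs.d;
}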
Tup4 is a concept, not a class. Vec4 and Point4 satisfy that concept.
Most of Vec4 and Point4 are implemented as templates.
In the rare case you need to handle Tup4s in a runtime-polymorphic way, don't use inheritance; use type erasure like std::function does. But you probably won't need to.
struct Tup4Data{
double v[4];
};
template<class D>
struct Tup4Impl:Tup4Data{
// common implementation details of Tup4
// D is derived class (Vec4 or Point4)
};
struct Vec4 : Tup4Impl<Vec4> {
    // extra stuff for Vec4
};

struct Point4 : Tup4Impl<Point4> {
    // extra stuff for Point4
};
Now, code that just wants to work on raw doubles and doesn't care which kind it has can take a Tup4Data. Tup4Impl uses the CRTP (curiously recurring template pattern), if you want to look it up; this provides static polymorphism.
Those that care if it is a vector or a point can take either one.
Code that wants to take both and behave differently can be template code, or can type erase.
This last case -- type erase -- is harder, but in exchange you get massive improvements in every other case. And 99% of code bases don't even need to type erase.
I'm not even certain what kind of situation has code that wants to type erase here.
So just don't worry about it. (If you want to learn, look up example std::function implementations.)
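Building on the Tup4Impl skeleton above, the shared comparison could live in the CRTP base so that only two objects of the same concrete type ever compare; the hidden-friend operator below is a sketch, not part of the original answer:
template<class D>
struct Tup4Impl : Tup4Data {
    // Found by argument-dependent lookup only for D (Vec4 or Point4),
    // so comparing a Vec4 with a Point4 does not compile.
    friend bool operator==(const D& lhs, const D& rhs) {
        for (int i = 0; i < 4; ++i)
            if (lhs.v[i] != rhs.v[i])      // swap in an epsilon test for approximate equality
                return false;
        return true;
    }
};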
You want the two types Point4 and Vector4 to be incompatible with each other, in the sense that they are different types. Now, ask yourself what you need the Tuple4 for. Is it really important that Point4 is a Tuple4? In other words, is the Liskov Substitution Principle important here? My guess is that the answer is that Tuple4 is just a convenient baseclass for code reuse, not for OOP reasons.
If my assumption is correct, using a private baseclass would be a better choice. Since the base is private, it won't allow comparing Vector4 and Point4. For convenient code reuse, you can forward to the baseclass implementations:
class Point4: Tuple4 {
public:
bool operator==(Point4 const& rhs) const {
return static_cast<Tuple4 const&>(*this) == static_cast<Tuple4 const&>(rhs);
}
};
That said, consider using std::array as baseclass instead of writing your own.
The Problem
I want to implement a number of algorithms that work on a graph and return scores for node-pairs indicating whether those nodes are similar. The algorithms should work on a single node-pair and on all possible node-pairs. In the latter case a collection/matrix should be returned.
My Approach
The algorithms derive from
class SimilarityAlgorithm {
public:
    SimilarityAlgorithm(const Graph& G);
    virtual double run(node u, node v) = 0; // indices for nodes in the graph
    virtual ScoreCollection& runAll() = 0;
};
Now the algorithms differ in memory usage. Some algorithms might be symmetric and the scores for (u, v) and (v, u) are identical. This requires different ScoreCollection-types that should be returned. An example would be a sparse-matrix and a triangular matrix that both derive from ScoreCollection.
This would boil down to covariant return types:
class SpecificAlgorithm : SimilarityAlgorithm {
public:
double run(node u, node v);
// The specific algorithm is symmetric and thus uses a symmetric matrix to save memory
SymmetricScoreCollection& runAll();
};
Question
Is this design approach a good idea for this problem?
Should the fact that the collections are all implemented as matrices be exposed?
Your design seems appropriate for the problem you describe.
Problem:
However, there is a problem with your SpecificAlgorithm: runAll() doesn't return the same type as the virtual function of the base class. Unless SymmetricScoreCollection derives from ScoreCollection (making the return type covariant), it doesn't count as an override, and your code won't compile because the pure virtual function remains unimplemented.
Solution:
Also use a polymorphic approach for the ScoreCollection, by making SymmetricScoreCollection a derived class of ScoreCollection:
class SymmetricScoreCollection : public ScoreCollection {
    // define the virtual member functions to access the values
    ...
};
class SpecificAlgorithm : public SimilarityAlgorithm {
public:
double run(node u, node v);
// The specific algorithm is symmetric and thus uses a symmetric matrix to save memory
ScoreCollection& runAll();
};
In fact it's an application of the factory method pattern, with the following roles:
SimilarityAlgorithm is the factory,
SpecificAlgorithm is the concrete factory
ScoreCollection is the product
SymmetricScoreCollection is the concrete product
Additional remark:
Returning a reference to ScoreCollection from runAll() brings in some risks. Suppose sa is a specific algorithm.
In the following statement :
ScoreCollection sc = sa.runAll();
sa.runAll() returns a reference to a SymmetricScoreCollection, but it would copy the referred object to sc, making it a ScoreCollection. Slicing occurs, and polymorphism will fail to work.
The following statement would however succeed:
ScoreCollection& rsc = sa.runAll();
because rsc is a reference and it would still refer to the original SymmetricScoreCollection object returned by sa.runAll(), and everything would work as designed.
You see that it's very easy to make unnoticed mistakes when returning references. I'd suggest returning a pointer instead of a reference.
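For instance, a pointer-returning variant could look like this (a sketch using std::unique_ptr and the names from the question; constructor arguments are omitted):
#include <memory>

class SimilarityAlgorithm {
public:
    virtual ~SimilarityAlgorithm() = default;
    virtual double run(node u, node v) = 0;
    virtual std::unique_ptr<ScoreCollection> runAll() = 0;
};

class SpecificAlgorithm : public SimilarityAlgorithm {
public:
    double run(node u, node v) override;
    std::unique_ptr<ScoreCollection> runAll() override {
        return std::make_unique<SymmetricScoreCollection>(/* ... */);
    }
};
A caller then writes auto sc = sa.runAll(); and can neither slice the object nor accidentally drop the reference.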
I have a class:
class A
{
public:
virtual void func() {…}
virtual void func2() {…}
};
And some classes derived from this one, let's say B, C, D... In 95% of the cases, I want to go through all objects and call func() or func2(), so therefore I have them in a vector, like:
std::vector<std::shared_ptr<A> > myVec;
…
for (auto it = myVec.begin(); it != myVec.end(); ++it)
(*it)->func();
However, in the remaining 5% of the cases I want to do something different to the objects depending on their subclass. And I mean totally different, like calling functions that take other parameters, or not calling functions at all for some subclasses. I have thought of some options to solve this, none of which I really like:
Use dynamic_cast to detect the subclass. Not good: too slow, as I make these calls very often and on limited hardware.
Use a flag in each subclass, like an enum {IS_SUBCLASS_B, IS_SUBCLASS_C}. Not good, as it doesn't feel OO.
Also put the objects in other vectors, each for their specific task. This doesn't feel really OO either, but maybe I'm wrong here. Like:
std::vector<std::shared_ptr<B> > vecForDoingSpecificOperation;
std::vector<std::shared_ptr<C> > vecForDoingAnotherSpecificOperation;
So, can someone suggest a style/pattern that achieves what I want?
Someone intelligent (unfortunately I forgot who) once said about OOP in C++: The only reason for switch-ing over types (which is what all your suggestions propose) is fear of virtual functions. (That's para-paraphrasing.) Add virtual functions to your base class which derived classes can override, and you're set.
Now, I know there are cases where this is hard or unwieldy. For that we have the visitor pattern.
There's cases where one is better, and cases where the other is. Usually, the rule of thumb goes like this:
If you have a rather fixed set of operations, but keep adding types, use virtual functions.
Operations are hard to add to/remove from a big inheritance hierarchy, but new types are easy to add by simply having them override the appropriate virtual functions.
If you have a rather fixed set of types, but keep adding operations, use the visitor pattern.
Adding new types to a large set of visitors is a serious pain in the neck, but adding a new visitor to a fixed set of types is easy.
(If both change, you're doomed either way.)
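To make the visitor option concrete, here is a minimal sketch using the question's A/B/C hierarchy; the Visitor interface and the accept()/visit() functions are illustrative additions, not part of the original code:
class B;
class C;

// One visit() overload per concrete subclass that needs special treatment.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(B&) = 0;
    virtual void visit(C&) = 0;
};

class A {
public:
    virtual ~A() = default;
    virtual void func() {}
    virtual void func2() {}
    virtual void accept(Visitor& v) = 0;   // entry point for the "5 %" special-case code
};

class B : public A {
public:
    void accept(Visitor& v) override { v.visit(*this); }
};

class C : public A {
public:
    void accept(Visitor& v) override { v.visit(*this); }
};

// A concrete visitor gathers the subclass-specific behaviour in one place.
struct SpecificOperation : Visitor {
    void visit(B&) override { /* B-specific work, possibly with extra parameters */ }
    void visit(C&) override { /* C-specific work */ }
};
The trade-off is exactly the one described above: every new subclass has to be added to the Visitor interface, but new operations are just new visitor classes.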
According to your comments, what you have stumbled upon is known (dubiously) as the Expression Problem, as expressed by Philip Wadler:
The Expression Problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).
That is, extending both "vertically" (adding types to the hierarchy) and "horizontally" (adding functions to be overriden to the base class) is hard on the programmer.
There was a long (as always) discussion about it on Reddit in which I proposed a solution in C++.
It is a bridge between OO (great at adding new types) and generic programming (great at adding new functions). The idea is to have a hierarchy of pure interfaces and a set of non-polymorphic types. Free functions are defined on the concrete types as needed, and the bridge to the pure interfaces is provided by a single template class for each interface (supplemented by a template function for automatic deduction).
I have found a single limitation to date: if a function returns a Base interface, the wrapper may have been generated before the new operations existed, even though the actual type it wraps now supports more of them. This is typical of a modular design (the new functions were not available at the call site). I think it illustrates a clean design; however, I understand one could want to "recast" it to a richer interface. Go can, with language support (basically, runtime introspection of the available methods). I don't want to code this in C++.
As I already explained on Reddit... I'll just reproduce and tweak the code I submitted there.
So, let's start with 2 types and a single operation.
struct Square { double side; };
double area(Square const s);
struct Circle { double radius; };
double area(Circle const c);
Now, let's make a Shape interface:
class Shape {
public:
virtual ~Shape();
virtual double area() const = 0;
protected:
Shape(Shape const&) {}
Shape& operator=(Shape const&) { return *this; }
};
typedef std::unique_ptr<Shape> ShapePtr;
template <typename T>
class ShapeT: public Shape {
public:
explicit ShapeT(T const t): _shape(t) {}
virtual double area() const { return ::area(_shape); } // qualified so the free function is found, not this member
private:
T _shape;
};
template <typename T>
ShapePtr newShape(T t) { return ShapePtr(new ShapeT<T>(t)); }
Okay, C++ is verbose. Let's check the use immediately:
double totalArea(std::vector<ShapePtr> const& shapes) {
double total = 0.0;
for (ShapePtr const& s: shapes) { total += s->area(); }
return total;
}
int main() {
    std::vector<ShapePtr> shapes;
    shapes.push_back(newShape(Square{5.0}));   // unique_ptr is move-only, so push_back rather than an initializer list
    shapes.push_back(newShape(Circle{3.0}));

    std::cout << totalArea(shapes) << "\n";
}
So, first exercise, let's add a shape (yep, that's all it takes):
struct Rectangle { double length, height; };
double area(Rectangle const r);
Okay, so far so good, let's add a new function. We have two options.
The first is to modify Shape if it is in our power. This is source compatible, but not binary compatible.
// 1. We need to extend Shape:
virtual double perimeter() const = 0;

// 2. And its adapter, ShapeT:
virtual double perimeter() const { return ::perimeter(_shape); }

// 3. And provide the function for each Shape (obviously)
double perimeter(Square const s);
double perimeter(Circle const c);
double perimeter(Rectangle const r);
It may seem that we fall into the Expression Problem here, but we don't. We needed to add the perimeter for each (already known) class because there is no way to automatically infer it; however it did not require editing each class either!
Therefore, the combination of external interface and free functions lets us neatly (well, it is C++...) sidestep the issue.
sodraz noticed in comments that the addition of a function touched the original interface which may need to be frozen (provided by a 3rd party, or for binary compatibility issues).
The second option therefore is not intrusive, at the cost of being slightly more verbose:
class ExtendedShape: public Shape {
public:
virtual double perimeter() const = 0;
protected:
ExtendedShape(ExtendedShape const&) {}
ExtendedShape& operator=(ExtendedShape const&) { return *this; }
};
typedef std::unique_ptr<ExtendedShape> ExtendedShapePtr;
template <typename T>
class ExtendedShapeT: public ExtendedShape {
public:
    explicit ExtendedShapeT(T const t): _data(t) {}

    virtual double area() const { return ::area(_data); }
    virtual double perimeter() const { return ::perimeter(_data); }

private:
    T _data;
};
template <typename T>
ExtendedShapePtr newExtendedShape(T t) { return ExtendedShapePtr(new ExtendedShapeT<T>(t)); }
And then, define the perimeter function for all the shapes we would like to use with ExtendedShape.
The old code, compiled to work against Shape, still works. It does not need the new function anyway.
The new code can make use of the new functionality, and still interface painlessly with the old code. (*)
There is only one slight issue: if the old code returns a ShapePtr, we do not know whether the shape actually has a perimeter function (note: if the pointer is generated internally, it has not been created with the newExtendedShape mechanism). This is the limitation of the design mentioned at the beginning. Oops :)
(*) Note: painlessly implies that you know who the owner is. A std::unique_ptr<Derived>& and a std::unique_ptr<Base>& are not compatible, however a std::unique_ptr<Base> can be built from a std::unique_ptr<Derived> and a Base* from a Derived*, so make sure your functions are clean ownership-wise and you're golden.
I want to create a class that can use one of four algorithms (and the algorithm to use is only known at run-time). I was thinking that the Strategy design pattern sounds appropriate, but my problem is that each algorithm requires slightly different parameters. Would it be a bad design to use Strategy, but pass the relevant parameters into the constructor?
Here is an example (for simplicity, let's say there are only two possible algorithms) ...
class Foo
{
private:
// At run-time the correct algorithm is used, e.g. a = new Algorithm1(1);
AlgorithmInterface* a;
};
class AlgorithmInterface
{
public:
virtual void DoSomething() = 0;
};
class Algorithm1 : public AlgorithmInterface
{
public:
Algorithm1( int i ) : value(i) {}
virtual void DoSomething() { /* Does something with int value */ }
int value;
};
class Algorithm2 : public AlgorithmInterface
{
public:
Algorithm2( bool b ) : value(b) {}
virtual void DoSomething() { /* Does something with bool value */ }
bool value;
};
It would be a valid design, because the Strategy pattern asks for an interface to be defined, and any class that implements it is a valid candidate to run the strategy code, regardless of how it is constructed.
I think it's correct, as long as you have all the parameters you need when you create the new strategy and what you do is clear to everyone reading the code.
You are right on with this approach. Yes, this is the essence of the Strategy pattern: "vary the algorithm independently of the implementation." You can just give yourself a generic constructor to pass in the parameters you need to initialize your class, such as an object array.
Enjoy!
The Strategy pattern is useful when you want to decide at runtime which algorithm is to be used.
You could also pass parameters in using a single interface of a memory block containing key-value pairs. That way the interface is common between any present and future algorithms. Each algorithm implementation would know how to decode the key-value pairs into its parameters.
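One hedged way to realize that key-value idea in modern C++ (the AlgorithmParams alias is an illustrative name, and the constructor shown decodes only the keys it understands; this is not from the original post):
#include <map>
#include <string>
#include <variant>

// A single parameter block shared by all present and future algorithms.
using AlgorithmParams = std::map<std::string, std::variant<int, bool, double>>;

class Algorithm1 : public AlgorithmInterface {
public:
    // Decode just the keys this implementation cares about.
    explicit Algorithm1(const AlgorithmParams& params)
        : value(std::get<int>(params.at("value"))) {}
    void DoSomething() override { /* does something with value */ }
private:
    int value;
};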
IMHO, you are facing this challenge because you are conflating the creational aspect of the concrete algorithm with the actual running of the algorithm. As long as the DoSomething() interface remains the same, the Strategy pattern can be used. It is only the creation of the different concrete algorithms that varies in your case, which can be handled through a Factory Method design pattern.
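A hedged sketch of how such a factory could look with the question's classes and original constructors (the AlgorithmKind enum, the makeAlgorithm() helper, and the constructor arguments are illustrative):
#include <memory>
#include <stdexcept>

enum class AlgorithmKind { One, Two };

// Creation is isolated here; Foo only ever sees AlgorithmInterface afterwards.
std::unique_ptr<AlgorithmInterface> makeAlgorithm(AlgorithmKind kind) {
    switch (kind) {
    case AlgorithmKind::One: return std::make_unique<Algorithm1>(1);      // int parameter
    case AlgorithmKind::Two: return std::make_unique<Algorithm2>(true);   // bool parameter
    }
    throw std::invalid_argument("unknown algorithm kind");
}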