Get pointer to object given pointer to member, non-standard layout - c++

This is a follow up to Get pointer to object from pointer to some member with the caveat that my structs aren't standard layout.
Consider the following scenario:
struct Thing; // Some struct
struct Holder { // Bigger struct
Thing thing;
static void cb(Thing* thing) {
// How do I get a pointer to Holder here? Say, to access other fields in Holder?
// Can consider storing a void* inside thing, but that's avoiding the problem and not zero overhead either
}
};
// C function which takes a Thing and eventually calls the callback with the same Thing
void cfunc(Thing* thing, void (*cb_ptr)(Thing*));
void run() {
Holder h;
cfunc(&h.thing, &Holder::cb);
}
Now, how do I get a pointer to Holder inside cb? Of course, I am prepared to do unsafe stuff (like reinterpret casting, etc) to tell the compiler my assumptions and accept undefined behaviour if my assumptions are violated.
The main issue seems to be that the info that the callback would always be called on the thing passed in seems to be missing to the compiler. Notwithstanding that, the fact that cb would only ever be called with the holder's own thing member (and by extension, only things inside holders) also seems to be missing. This is important if I have multiple Things and multiple (unique) callbacks associated with them.
Note that inheritance seems to make this pretty simple:
struct Thing; // Some struct
struct Holder : public Thing { // Bigger struct
static void cb(Thing* thing) {
Holder* holder = (Holder*)thing;
}
};
// C function which takes a Thing and eventually calls the callback with the same Thing
void cfunc(Thing* thing, void (*cb_ptr)(Thing*));
void run() {
Holder h;
cfunc((Thing*)&h, &Holder::cb);
}
If I want multiple Things, I just inherit multiple times (probably with intermediate types since I don't know how to cast to the base class if I have multiple of the same type) and that's it.
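A minimal sketch of the "intermediate types" idea mentioned above. The `TaggedThing<N>` wrapper and the `fromFirst`/`fromSecond` helpers are hypothetical names introduced only for illustration, and `Thing` is given a dummy member so the example is self-contained:

```cpp
#include <cassert>

struct Thing { int value = 0; };   // stand-in for the real C struct

// Hypothetical intermediate type: each tag yields a distinct base class,
// so two Thing subobjects can be told apart by name.
template <int Tag>
struct TaggedThing : Thing {};

struct Holder : TaggedThing<0>, TaggedThing<1> {
    int extra = 42;

    // Recover the Holder from the Thing that lives inside TaggedThing<0>.
    static Holder* fromFirst(Thing* t) {
        assert(t != nullptr);
        // Two-step downcast: Thing -> its tagged wrapper -> Holder.
        return static_cast<Holder*>(static_cast<TaggedThing<0>*>(t));
    }

    // Same idea for the Thing inside TaggedThing<1>.
    static Holder* fromSecond(Thing* t) {
        assert(t != nullptr);
        return static_cast<Holder*>(static_cast<TaggedThing<1>*>(t));
    }
};
```

Each callback has to know which tagged base its `Thing` came from. The casts are only valid if the pointer really refers to that base subobject of a `Holder`.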
Coming back to the linked answer, offsetof seems to be a decent solution till you run into the requirement of standard layout which is a no-go since I have both public and private data members.
Is there another way to do this without inheritance?
Bonus points if you can tell me why offsetof requires standard layout and why mixing public and private isn't standard layout. At least theoretically, it seems like the compiler should be able to figure this out anyway, especially if structs ALWAYS have a consistent layout (maybe this isn't true?) in the program.

What about having the Holder pointer inside Thing:
struct Holder;
struct Thing
{
Holder * parent;
};
struct Holder { // Bigger struct
Thing thing;
Holder()
{
thing.parent = this;
}
Holder(const Holder & h)
{
thing.parent = this;
}
Holder& operator= (const Holder & h)
{
//leave thing.parent as is
return *this;
}
static void cb(Thing* thing) {
Holder* holder = thing->parent;
}
};
And here is a good place to learn about standard layout. The short answer is that the standard does not guarantee the order of members across different access control levels.
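The standard-layout rule can be checked directly with `std::is_standard_layout`. A small sketch, using two made-up types (`SameAccess` and `MixedAccess` are illustrative names, not from the question):

```cpp
#include <type_traits>

struct SameAccess {      // every data member has the same access level
    int a;
    int b;
};

class MixedAccess {      // data members split across public and private
public:
    int a;
    int peekB() const { return b; }
private:
    int b = 0;
};

static_assert(std::is_standard_layout<SameAccess>::value,
              "one access level: standard layout");
static_assert(!std::is_standard_layout<MixedAccess>::value,
              "mixed access levels: not standard layout");
```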

Related

Exposing fields from an opaque C struct

I am working with an existing C library (that I can't modify) where some structures have opaque fields that must be accessed through specific setters and getters, like in the following crude example (imagining x is private, even though it's written in C).
struct CObject {
int x;
};
void setCObjectX(CObject* o, int x) {
o->x = x;
}
int getCObjectX(CObject* o) {
return o->x;
}
I am writing classes that privately own these types of structures, kind of like wrappers, albeit more complex. I want to expose the relevant fields in a convenient way. At first, I was simply writing setters and getters wherever necessary. However, I thought of something else, and I wanted to know if there are any downsides to the method. It uses function pointers (std::function) to store the C setter-getter pairs and present them as if directly accessing a field instead of functions.
Here is the generic class I wrote to help define such "fake" fields:
template<typename T>
struct IndirectField {
void operator=(const T& value) {
setter(value);
}
auto operator()() const -> T {
return *this;
}
operator T() const {
return getter();
}
std::function<void(const T&)> setter;
std::function<T()> getter;
};
It is used by defining an instance in the C++ class and setting up setter and getter with the corresponding C functions:
IndirectField<int> x;
// ...
x.setter = [=](int x) {
setCObjectX(innerObject.get(), x);
};
x.getter = [=]() {
return getCObjectX(innerObject.get());
};
Here is a complete, working code for testing.
Are there any disadvantages to using this method? Could it lead to eventual dangerous behaviors or something?
The biggest problem I see with your solution is that std::function objects take space inside each instance of IndirectField inside CPPObject, even when CObject type is the same.
You can fix this problem by making function pointers into template parameters:
template<typename T,typename R,void setter(R*,T),T getter(R*)>
struct IndirectField {
IndirectField(R *obj) : obj(obj) {
}
void operator=(const T& value) {
setter(obj, value);
}
auto operator()() const -> T {
return *this;
}
operator T() const {
return getter(obj);
}
private:
R *obj;
};
Here is how to use this implementation:
class CPPObject {
std::unique_ptr<CObject,decltype(&freeCObject)> obj;
public:
CPPObject()
: obj(createCObject(), freeCObject)
, x(obj.get())
, y(obj.get()) {
}
IndirectField<int,CObject,setCObjectX,getCObjectX> x;
IndirectField<double,CObject,setCObjectY,getCObjectY> y;
};
This approach trades two std::function objects for one CObject* pointer per IndirectField. Unfortunately, storing this pointer is required, because you cannot get it from the context inside the template.
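For reference, a self-contained sketch of the same idea that compiles on its own. The `CObject` definition and its two accessor functions below are stand-ins for the real library, and the template parameters are spelled as function pointers:

```cpp
#include <cassert>

// Stand-ins for the C library (assumed for illustration only).
struct CObject { int x; };
void setCObjectX(CObject* o, int x) { o->x = x; }
int getCObjectX(CObject* o) { return o->x; }

// The accessors are compile-time parameters, so the only per-field
// storage is the pointer to the wrapped object.
template <typename T, typename R, void (*Setter)(R*, T), T (*Getter)(R*)>
class IndirectField {
public:
    explicit IndirectField(R* obj) : obj_(obj) { assert(obj != nullptr); }

    IndirectField& operator=(const T& value) {
        Setter(obj_, value);
        return *this;
    }

    operator T() const { return Getter(obj_); }

private:
    R* obj_;
};
```

With this shape, `IndirectField<int, CObject, setCObjectX, getCObjectX> x(&c); x = 5;` forwards to `setCObjectX(&c, 5)`, and reading `x` as an `int` forwards to `getCObjectX(&c)`.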
Your modified demo.
Are there any disadvantages to using this method?
There's a few things to highlight in your code:
Your getters & setters, being not part of the class, break encapsulation. (Do you really want to tie yourself permanently to this library?)
Your example shows a massive amount of copying being done; which will be slower than it needs to be. (auto operator()(), operator T() to name but 2).
It's taking up more memory than you need to and adds more complexity than just passing around a CObject. If you don't want things to know that it's a CObject, then create an abstract class and pass that abstract class around (see below for example).
Could it lead to eventual dangerous behaviors or something?
The breaking of encapsulation will result in x changing from any number of routes; and force other things to know about how it's stored in the object. Which is bad.
The creation of IndirectField means that every object will have to have getters and setters in this way, which is going to be a maintenance nightmare.
Really I think what you're looking for is something like:
struct xProvider {
virtual int getX() const = 0;
virtual void setX(int x) = 0;
};
struct MyCObject : xProvider {
private:
CObject obj;
public:
int getX() const override {return obj.x;}
CObject& getRawObj() {return obj;}
// etc ...
};
And then you just pass a reference / pointer to an xProvider around.
This will remove the dependence on this external C library; allowing you to replace it with your own test struct or a whole new library if you see fit; without having to re-write all your code using it
In a struct, by default (as in your post), all the fields are public, so they are accessible to client software. If you want to make them accessible to derived classes (you don't need to reimplement anything if you know the field contract and want to access it in a well-defined way), they are made protected. And if you want them to be accessed by nobody, mark them as private.
If the author of such software doesn't want you to touch the fields, he will mark them as private, and then you can do nothing but adapt to this behaviour. Failing to do so will have bad consequences.
Suppose you make a field that is modified with a set_myField() method, that calls a list of listeners anytime you make a change. If you bypass the method accessing function, all the listeners (many of them of unknown origin) will be bypassed and won't be notified of the field change. This is quite common in object programming, so you must obey the rules the authors impose to you.
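The listener scenario above can be sketched as follows. `ObservedField` and its method names are hypothetical, chosen only to show why writing to the field behind the setter's back would skip the notifications:

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class ObservedField {
public:
    // Register a callback that runs on every change made through set().
    void addListener(std::function<void(int)> listener) {
        assert(listener != nullptr);  // reject empty callbacks
        listeners_.push_back(std::move(listener));
    }

    // The only sanctioned way to change the value: update, then notify.
    void set(int value) {
        value_ = value;
        for (const auto& listener : listeners_) listener(value_);
    }

    int get() const { return value_; }

private:
    int value_ = 0;  // private, so nobody can change it without notifying
    std::vector<std::function<void(int)>> listeners_;
};
```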

C++ : Access a sub-object's methods inside an object

I am starting to code bigger objects, having other objects inside them.
Sometimes, I need to be able to call methods of a sub-object from outside the class of the object containing it, from the main() function for example.
So far I was using getters and setters as I learned.
This would give something like the following code:
class SubObject {
public:
bool SetMode(int mode);
int GetMode();
private:
int m_mode = 0;
};
class Object {
public:
bool SetSubMode(int mode);
int GetSubMode();
private:
SubObject subObject;
};
bool Object::SetSubMode(int mode) { return subObject.SetMode(mode); }
int Object::GetSubMode() { return subObject.GetMode(); }
bool SubObject::SetMode(int mode) { m_mode = mode; return true; }
int SubObject::GetMode() { return m_mode; }
This feels very sub-optimal, forces me to write (ugly) code for every method that needs to be accessible from outside. I would like to be able to do something as simple as Object->SubObject->Method(param);
I thought of a simple solution: putting the sub-object as public in my object.
This way I should be able to simply access its methods from outside.
The problem is that when I learned object oriented programming, I was told that putting anything in public besides methods was blasphemy and I do not want to start taking bad coding habits.
Another solution I came across during my research before posting here is to add a public pointer to the sub-object perhaps?
How can I access a sub-object's methods in a neat way?
Is it allowed / a good practice to put an object inside a class as public to access its methods? How to do without that otherwise?
Thank you very much for your help on this.
The problem with both a pointer and public member object is you've just removed the information hiding. Your code is now more brittle because it all "knows" that you've implemented object Car with 4 object Wheel members. Instead of calling a Car function that hides the details like this:
Car.SetRPM(200); // hiding
You want to directly start spinning the Wheels like this:
Car.wheel_1.SetRPM(200); // not hiding! and brittle!
Car.wheel_2.SetRPM(200);
And what if you change the internals of the class? The above might now be broken and need to be changed to:
Car.wheel[0].SetRPM(200); // not hiding!
Car.wheel[1].SetRPM(200);
Also, for your Car you can say SetRPM() and the class figures out whether it is front wheel drive, rear wheel drive, or all wheel drive. If you talk to the wheel members directly that implementation detail is no longer hidden.
Sometimes you do need direct access to a class's members, but one goal in creating the class was to encapsulate and hide implementation details from the caller.
Note that you can have Set and Get operations that update more than one bit of member data in the class, but ideally those operations make sense for the Car itself and not specific member objects.
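As a rough sketch of that point, here is a `Car` whose `SetRPM` decides internally which wheels are driven. The `Drive` enum, the wheel numbering, and `WheelRPM` are assumptions made up for this illustration:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

class Wheel {
public:
    void SetRPM(int rpm) { rpm_ = rpm; }
    int GetRPM() const { return rpm_; }
private:
    int rpm_ = 0;
};

enum class Drive { Front, Rear, All };

class Car {
public:
    explicit Car(Drive drive) : drive_(drive) {}

    // Callers state what they want; the Car works out which wheels spin.
    void SetRPM(int rpm) {
        for (std::size_t i = 0; i < wheels_.size(); ++i) {
            if (IsDriven(i)) wheels_[i].SetRPM(rpm);
        }
    }

    // Read-only view of one wheel, mainly for inspection.
    int WheelRPM(std::size_t i) const {
        assert(i < wheels_.size());
        return wheels_[i].GetRPM();
    }

private:
    // Wheels 0-1 are assumed to be the front axle, 2-3 the rear.
    bool IsDriven(std::size_t i) const {
        if (drive_ == Drive::All) return true;
        return (drive_ == Drive::Front) == (i < 2);
    }

    std::array<Wheel, 4> wheels_{};
    Drive drive_;
};
```

Switching from four named members to an array, or changing the drive type, does not affect any caller of `SetRPM`.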
I was told that putting anything in public besides methods was blasphemy
Blanket statements like this are dangerous; There are pros and cons to each style that you must take into consideration, but an outright ban on public members is a bad idea IMO.
The main problem with having public members is that it exposes implementation details that might be better hidden. For example, let's say you are writing some library:
struct A {
struct B {
void foo() {...}
};
B b;
};
A a;
a.b.foo();
Now a few years down you decide that you want to change the behavior of A depending on the context; maybe you want to make it run differently in a test environment, maybe you want to load from a different data source, etc.. Heck, maybe you just decide the name of the member b is not descriptive enough. But because b is public, you can't change the behavior of A without breaking client code.
struct A {
struct B {
void foo() {...}
};
struct C {
void foo() {...}
};
B b;
C c;
};
A a;
a.c.foo(); // Uh oh, everywhere that uses b needs to change!
Now if you were to let A wrap the implementation:
class A {
public:
void foo() {
if (TESTING) {
b.foo();
} else {
c.foo();
}
}
private:
struct B {
void foo() {...}
};
struct C {
void foo() {...}
};
B b;
C c;
};
A a;
a.foo(); // I don't care how foo is implemented, it just works
(This is not a perfect example, but you get the idea.)
Of course, the disadvantage here is that it requires a lot of extra boilerplate, like you have already noticed. So basically, the question is "do you expect the implementation details to change in the future, and if so, will it cost more to add boilerplate now, or to refactor every call later?" And if you are writing a library used by external users, then "refactor every call" turns into "break all client code and force them to refactor", which will make a lot of people very upset.
Of course instead of writing forwarding functions for each function in SubObject, you could just add a getter for subObject:
SubObject& getSubObject() { return subObject; }
// ...
object.getSubObject().SetMode(0);
Which suffers from some of the same problems as above, although it is a bit easier to work around because the SubObject interface is not necessarily tied to the implementation.
All that said, I think there are certainly times where public members are the correct choice. For example, simple structs whose primary purpose is to act as the input for another function, or who just get a bundle of data from point A to point B. Sometimes all that boilerplate is really overkill.

Is there a way to simulate downcasting by reference

So, I have something along the lines of these structs:
struct Generic {};
struct Specific : Generic {};
At some point I have the need to downcast, i.e.:
Specific s = (Specific) GetGenericData();
This is a problem because I get error messages stating that no user-defined cast was available.
I can change the code to be:
Specific s = (*(Specific *)&GetGenericData())
or using reinterpret_cast, it would be:
Specific s = *reinterpret_cast<Specific *>(&GetGenericData());
But, is there a way to make this cleaner? Perhaps using a macro or template?
I looked at this post C++ covariant templates, and I think it has some similarities, but not sure how to rewrite it for my case. I really don't want to define things as SmartPtr. I would rather keep things as the objects they are.
It looks like GetGenericData() from your usage returns a Generic by-value, in which case a cast to Specific will be unsafe due to object slicing.
To do what you want to do, you should make it return a pointer or reference:
Generic* GetGenericData();
Generic& GetGenericDataRef();
And then you can perform a cast:
// safe, returns nullptr if it's not actually a Specific*
auto safe = dynamic_cast<Specific*>(GetGenericData());
// for references, this will throw std::bad_cast
// if you try the wrong type
auto& safe_ref = dynamic_cast<Specific&>(GetGenericDataRef());
// unsafe, undefined behavior if it's the wrong type,
// but faster if it is
auto unsafe = static_cast<Specific*>(GetGenericData());
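One caveat worth spelling out: `dynamic_cast` for downcasting only compiles when the base class is polymorphic, i.e. has at least one virtual function. A minimal sketch, where the virtual destructor, the `Other` type, and the `zOr` helper are additions for illustration:

```cpp
#include <cassert>

struct Generic {
    virtual ~Generic() = default;  // makes Generic polymorphic
};

struct Specific : Generic { int z = 7; };
struct Other : Generic {};

// Returns z if g really points at a Specific, otherwise the fallback.
int zOr(Generic* g, int fallback) {
    assert(g != nullptr);
    if (Specific* s = dynamic_cast<Specific*>(g)) return s->z;
    return fallback;
}
```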
I assume here that your data is simple.
struct Generic {
int x=0;
int y=0;
};
struct Specific:Generic{
int z=0;
explicit Specific(Generic const&o):Generic(o){}
// boilerplate, some may not be needed, but good habit:
Specific()=default;
Specific(Specific const&)=default;
Specific(Specific &&)=default;
Specific& operator=(Specific const&)=default;
Specific& operator=(Specific &&)=default;
};
and Bob is your uncle. It is somewhat important that int z have a default initializer, so we don't have to repeat it in the from-parent ctor.
I made the ctor explicit so it will be called only explicitly, instead of by accident.
This is a suitable solution for simple data.
So the first step is to realize you have a dynamic state problem. The nature of the state you store changes based off dynamic information.
struct GenericState { virtual ~GenericState() {} }; // data in here
struct Generic;
template<class D>
struct GenericBase {
D& self() { return static_cast<D&>(*this); }
D const& self() const { return static_cast<D const&>(*this); }
// code to interact with GenericState here via self().pImpl
// if you have `virtual` behavior, have a non-virtual method forward to
// a `virtual` method in GenericState.
};
struct Generic:GenericBase<Generic> {
// ctors go here, creates a GenericState in the pImpl below, or whatever
~Generic() {} // not virtual
private:
friend struct GenericBase<Generic>;
std::unique_ptr<GenericState> pImpl;
};
struct SpecificState : GenericState {
// specific stuff in here, including possible virtual method overrides
};
struct Specific : GenericBase<Specific> {
// different ctors, creates a SpecificState in a pImpl
// upcast operators:
operator Generic() && { /* move pImpl into return value */ }
operator Generic() const& { /* copy pImpl into return value */ }
private:
friend struct GenericBase<Specific>;
std::unique_ptr<SpecificState> pImpl;
};
If you want the ability to copy, implement a virtual GenericState* clone() const method in GenericState, and in SpecificState override it covariantly.
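The covariant `clone()` described above might look like this. The data members `x` and `z` and the `copyState` helper are placeholders for illustration:

```cpp
#include <cassert>
#include <memory>

struct GenericState {
    virtual ~GenericState() = default;
    virtual GenericState* clone() const { return new GenericState(*this); }
    int x = 0;
};

struct SpecificState : GenericState {
    // Covariant return type: SpecificState* instead of GenericState*.
    SpecificState* clone() const override { return new SpecificState(*this); }
    int z = 0;
};

// Deep-copies whatever dynamic type the pointer really holds.
std::unique_ptr<GenericState> copyState(const std::unique_ptr<GenericState>& p) {
    assert(p != nullptr);
    return std::unique_ptr<GenericState>(p->clone());
}
```

A copy constructor for `Generic` or `Specific` could then call `clone()` on the other object's pImpl rather than copying the pointer itself.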
What I have done here is regularized the type (or semiregularized if we don't support move). The Specific and Generic types are unrelated, but their back end implementation details (GenericState and SpecificState) are related.
Interface duplication is avoided mostly via CRTP and GenericBase.
Downcasting now can either involve a dynamic check or not. You go through the pImpl and cast it over. If done in an rvalue context, it moves -- if in an lvalue context, it copies.
You could use shared pointers instead of unique pointers if you prefer. That would permit non-copy non-move based casting.
Ok, after some additional study, I am wondering if what is wrong with doing this:
struct Generic {};
struct Specific : Generic {
Specific( const Generic &obj ) : Generic(obj) {}
};
Correct me if I am wrong, but this works using the implicit copy constructors.
Assuming that is the case, I can avoid writing a cast at all: the conversion happens automatically, and I can now write:
Specific s = GetGenericData();
Granted, for large objects, this is probably not a good idea, but for smaller ones, will this be a "correct" solution?

Vector of pointers to instances of a templated class

I am implementing a task runtime system that maintains buffers for user-provided objects of various types. In addition, all objects are wrapped before they are stored into the buffers. Since the runtime doesn't know the types of objects that the user will provide, the Wrapper and the Buffer classes are templated:
template <typename T>
class Wrapper {
private:
T mdata;
public:
Wrapper() = default;
Wrapper(T& user_data) : mdata(user_data) {}
T& GetData() { return mdata; }
...
};
template <typename T>
class Buffer {
private:
std::deque<Wrapper<T>> items;
public:
void Write(Wrapper<T> wd) {
items.push_back(wd);
}
Wrapper<T> Read() {
Wrapper<T> tmp = items.front();
items.pop_front();
return tmp;
}
...
};
Now, the runtime system handles the tasks, each of which operates on a subset of aforementioned buffers. Thus, each buffer is operated by one or more tasks. This means that a task must keep references to the buffers since the tasks may share buffers.
This is where my problem is:
1) each task needs to keep references to a number of buffers (this number is unknown in compile time)
2) the buffers are of different types (based on the templated Buffer class).
3) the task needs to use these references to access buffers.
There is no point in having a base class for the Buffer class and then using base class pointers, since the methods Write and Read of the Buffer class are templated and thus cannot be virtual.
So I was thinking to keep references as void pointers, where the Task class would look something like:
class Task {
private:
vector<void *> buffers;
public:
template<typename T>
void AddBuffer(Buffer<T>* bptr) {
buffers.push_back((void *) bptr);
}
template<typename T>
Buffer<T>* GetBufferPtr(int index) {
return some_way_of_cast(buffers[index]);
}
...
};
The problem with this is that I don't know how to get the valid pointer from the void pointer in order to access the Buffer. Namely, I don't know how to retain the type of the object pointed by buffers[index].
Can you help me with this, or suggest some other solution?
EDIT: The buffers are only the implementation detail of the runtime system and the user is not aware of their existence.
In my experience, when the user types are kept in user code, run-time systems handling buffers do not need to worry about the actual type of these buffers. Users can invoke operations on typed buffers.
class Task {
private:
vector<void *> buffers;
public:
void AddBuffer(char* bptr) {
buffers.push_back((void *) bptr);
}
char *GetBufferPtr(int index) {
return some_way_of_cast(buffers[index]);
}
...
};
class RTTask: public Task {
/* ... */
void do_stuff() {
Buffer<UserType1> b1; b1Id = b1.id();
Buffer<UserType2> b2; b2Id = b2.id();
AddBuffer(cast(&b1));
AddBuffer(cast(&b2));
}
void do_stuff2() {
Buffer<UserType1> *b1 = cast(GetBufferPtr(b1Id));
b1->push(new UserType1());
}
};
In these cases casts are in the user code. But perhaps you have a different problem. Also the Wrapper class may not be necessary if you can switch to pointers.
What you need is something called type erasure. It's way to hide the type(s) in a template.
The basic technique is the following:
- Have an abstract class with the behavior you want declared in a type-independent manner.
- Derive your template class from that class, implement its virtual methods.
Good news: you probably don't need to write your own, since boost::any already exists. Since all you need is to get a pointer and get the object back, that should be enough.
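If you would rather not depend on Boost, the two steps above can be hand-rolled in a few lines. This is only a sketch: `BufferBase` is a hypothetical name, `Buffer` is simplified to store `T` directly instead of `Wrapper<T>`, and the task holds non-owning pointers:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Type-independent base: just enough to store and identify buffers.
struct BufferBase {
    virtual ~BufferBase() = default;
};

template <typename T>
class Buffer : public BufferBase {
public:
    void Write(const T& item) { items_.push_back(item); }
    T Read() {
        assert(!items_.empty());
        T item = items_.front();
        items_.pop_front();
        return item;
    }
private:
    std::deque<T> items_;
};

class Task {
public:
    void AddBuffer(BufferBase* buffer) { buffers_.push_back(buffer); }

    // Returns nullptr if the buffer at index does not hold elements of type T.
    template <typename T>
    Buffer<T>* GetBuffer(std::size_t index) const {
        return dynamic_cast<Buffer<T>*>(buffers_.at(index));
    }

private:
    std::vector<BufferBase*> buffers_;  // non-owning in this sketch
};
```

The caller still has to name `T` when fetching a buffer, but a wrong guess yields `nullptr` instead of silently reinterpreting memory as it would with `void*`.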
Now, working with void* is a bad idea. As perreal mentioned, the code dealing with the buffers should not care about the type though. The good thing to do is to work with char*. That is the type that is commonly used for buffers (e.g. socket APIs). It is safer too: there is a special rule in the standard that allows safer conversion to char* (see aliasing rules).
This isn't exactly an answer to your question, but I just wanted to point out that the way you wrote
Wrapper<T> Read() {
makes it a mutator member function which returns by value, and as such, is not good practice as it forces the user to write exception-unsafe code.
For the same reason the STL stack::pop() member function returns void, not the object that was popped off the stack.
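Following the same split as the standard containers, the buffer could expose a non-mutating accessor and a separate removal step. A sketch, with `Front`, `Pop` and `Empty` as assumed names and `T` stored directly for brevity:

```cpp
#include <cassert>
#include <deque>

template <typename T>
class Buffer {
public:
    void Write(const T& item) { items_.push_back(item); }

    bool Empty() const { return items_.empty(); }

    // Look at the oldest item without removing it.
    const T& Front() const {
        assert(!items_.empty());
        return items_.front();
    }

    // Remove the oldest item; nothing is copied here, so no copy can throw.
    void Pop() {
        assert(!items_.empty());
        items_.pop_front();
    }

private:
    std::deque<T> items_;
};
```

If copying the result of `Front()` throws, the item is still in the buffer, whereas a combined `Read()` that returns by value has already removed it.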

How to implement the factory method pattern in C++ correctly

There's this one thing in C++ which has been making me feel uncomfortable for quite a long time, because I honestly don't know how to do it, even though it sounds simple:
How do I implement Factory Method in C++ correctly?
Goal: to make it possible to allow the client to instantiate some object using factory methods instead of the object's constructors, without unacceptable consequences and a performance hit.
By "Factory method pattern", I mean both static factory methods inside an object or methods defined in another class, or global functions. Just generally "the concept of redirecting the normal way of instantiation of class X to anywhere else than the constructor".
Let me skim through some possible answers which I have thought of.
0) Don't make factories, make constructors.
This sounds nice (and indeed often the best solution), but is not a general remedy. First of all, there are cases when object construction is a task complex enough to justify its extraction to another class. But even putting that fact aside, even for simple objects using just constructors often won't do.
The simplest example I know is a 2-D Vector class. So simple, yet tricky. I want to be able to construct it from both Cartesian and polar coordinates. Obviously, I cannot do:
struct Vec2 {
Vec2(float x, float y);
Vec2(float angle, float magnitude); // not a valid overload!
// ...
};
My natural way of thinking is then:
struct Vec2 {
static Vec2 fromLinear(float x, float y);
static Vec2 fromPolar(float angle, float magnitude);
// ...
};
Which, instead of constructors, leads me to usage of static factory methods... which essentially means that I'm implementing the factory pattern, in some way ("the class becomes its own factory"). This looks nice (and would suit this particular case), but fails in some cases, which I'm going to describe in point 2. Do read on.
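For concreteness, the two static factory methods could be implemented like this. It is a sketch that assumes `Vec2` simply stores Cartesian components and that the angle is in radians:

```cpp
#include <cassert>
#include <cmath>

struct Vec2 {
    float x;
    float y;

    static Vec2 fromLinear(float x, float y) {
        return Vec2{x, y};
    }

    // Angle in radians; magnitude is assumed to be non-negative.
    static Vec2 fromPolar(float angle, float magnitude) {
        assert(magnitude >= 0.0f);
        return Vec2{magnitude * std::cos(angle), magnitude * std::sin(angle)};
    }
};
```

The call sites then read `Vec2::fromLinear(3.0f, 4.0f)` and `Vec2::fromPolar(0.0f, 2.0f)`, so the coordinate system is visible in the name.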
another case: trying to overload by two opaque typedefs of some API (such as GUIDs of unrelated domains, or a GUID and a bitfield), types semantically totally different (so - in theory - valid overloads) but which actually turn out to be the same thing - like unsigned ints or void pointers.
1) The Java Way
Java has it simple, as we only have dynamic-allocated objects. Making a factory is as trivial as:
class FooFactory {
public Foo createFooInSomeWay() {
// can be a static method as well,
// if we don't need the factory to provide its own object semantics
// and just serve as a group of methods
return new Foo(some, args);
}
}
In C++, this translates to:
class FooFactory {
public:
Foo* createFooInSomeWay() {
return new Foo(some, args);
}
};
Cool? Often, indeed. But then- this forces the user to only use dynamic allocation. Static allocation is what makes C++ complex, but is also what often makes it powerful. Also, I believe that there exist some targets (keyword: embedded) which don't allow for dynamic allocation. And that doesn't imply that the users of those platforms like to write clean OOP.
Anyway, philosophy aside: In the general case, I don't want to force the users of the factory to be restrained to dynamic allocation.
2) Return-by-value
OK, so we know that 1) is cool when we want dynamic allocation. Why won't we add static allocation on top of that?
class FooFactory {
public:
Foo* createFooInSomeWay() {
return new Foo(some, args);
}
Foo createFooInSomeWay() {
return Foo(some, args);
}
};
What? We can't overload by the return type? Oh, of course we can't. So let's change the method names to reflect that. And yes, I've written the invalid code example above just to stress how much I dislike the need to change the method name, for example because we cannot implement a language-agnostic factory design properly now, since we have to change names - and every user of this code will need to remember that difference of the implementation from the specification.
class FooFactory {
public:
Foo* createDynamicFooInSomeWay() {
return new Foo(some, args);
}
Foo createFooObjectInSomeWay() {
return Foo(some, args);
}
};
OK... there we have it. It's ugly, as we need to change the method name. It's imperfect, since we need to write the same code twice. But once done, it works. Right?
Well, usually. But sometimes it does not. When creating Foo, we actually depend on the compiler to do the return value optimisation for us, because the C++ standard is benevolent enough to the compiler vendors not to specify when the object will be created in place and when it will be copied when returning a temporary object by value. So if Foo is expensive to copy, this approach is risky.
And what if Foo is not copiable at all? Well, doh. (Note that in C++17 with guaranteed copy elision, not-being-copiable is no problem anymore for the code above)
Conclusion: Making a factory by returning an object is indeed a solution for some cases (such as the 2-D vector previously mentioned), but still not a general replacement for constructors.
3) Two-phase construction
Another thing that someone would probably come up with is separating the issue of object allocation and its initialisation. This usually results in code like this:
class Foo {
public:
Foo() {
// empty or almost empty
}
// ...
};
class FooFactory {
public:
void createFooInSomeWay(Foo& foo, some, args);
};
void clientCode() {
Foo staticFoo;
std::unique_ptr<Foo> dynamicFoo(new Foo());
FooFactory factory;
factory.createFooInSomeWay(staticFoo);
factory.createFooInSomeWay(*dynamicFoo);
// ...
}
One may think it works like a charm. The only price we pay for in our code...
Since I've written all of this and left this as the last, I must dislike it too. :) Why?
First of all... I sincerely dislike the concept of two-phase construction and I feel guilty when I use it. If I design my objects with the assertion that "if it exists, it is in valid state", I feel that my code is safer and less error-prone. I like it that way.
Having to drop that convention AND changing the design of my object just for the purpose of making factory of it is.. well, unwieldy.
I know that the above won't convince many people, so let me give some more solid arguments. Using two-phase construction, you cannot:
initialise const or reference member variables,
pass arguments to base class constructors and member object constructors.
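To make the first point concrete, here is a sketch of a class with a const member and a reference member. `Logger` and the member names are invented for the example:

```cpp
#include <cassert>
#include <type_traits>

struct Logger { int lines = 0; };  // hypothetical collaborator

class Foo {
public:
    Foo(int id, Logger& log) : id_(id), log_(log) {
        assert(id >= 0);  // this sketch assumes non-negative ids
    }

    int id() const { return id_; }
    void touch() { ++log_.lines; }

private:
    const int id_;   // can only be set in the member initialiser list
    Logger& log_;    // must be bound at construction and cannot be reseated
};

static_assert(!std::is_default_constructible<Foo>::value,
              "no 'empty' Foo exists for a second phase to fill in");
```

A factory of the form `createFooInSomeWay(Foo& foo, ...)` has nothing to work with here, because no `Foo` can exist before `id_` and `log_` are known.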
And probably there could be some more drawbacks which I can't think of right now, and I don't even feel particularly obliged to since the above bullet points convince me already.
So: not even close to a good general solution for implementing a factory.
Conclusions:
We want to have a way of object instantiation which would:
allow for uniform instantiation regardless of allocation,
give different, meaningful names to construction methods (thus not relying on by-argument overloading),
not introduce a significant performance hit and, preferably, a significant code bloat hit, especially at client side,
be general, as in: possible to be introduced for any class.
I believe I have proven that the ways I have mentioned don't fulfil those requirements.
Any hints? Please provide me with a solution, I don't want to think that this language won't allow me to properly implement such a trivial concept.
First of all, there are cases when
object construction is a task complex
enough to justify its extraction to
another class.
I believe this point is incorrect. The complexity doesn't really matter. The relevance is what does. If an object can be constructed in one step (not like in the builder pattern), the constructor is the right place to do it. If you really need another class to perform the job, then it should be a helper class that is used from the constructor anyway.
Vec2(float x, float y);
Vec2(float angle, float magnitude); // not a valid overload!
There is an easy workaround for this:
struct Cartesian {
inline Cartesian(float x, float y): x(x), y(y) {}
float x, y;
};
struct Polar {
inline Polar(float angle, float magnitude): angle(angle), magnitude(magnitude) {}
float angle, magnitude;
};
Vec2(const Cartesian &cartesian);
Vec2(const Polar &polar);
The only disadvantage is that it looks a bit verbose:
Vec2 v2(Vec2::Cartesian(3.0f, 4.0f));
But the good thing is that you can immediately see what coordinate type you're using, and at the same time you don't have to worry about copying. If you want copying, and it's expensive (as proven by profiling, of course), you may wish to use something like Qt's shared classes to avoid copying overhead.
As for the allocation type, the main reason to use the factory pattern is usually polymorphism. Constructors can't be virtual, and even if they could, it wouldn't make much sense. When using static or stack allocation, you can't create objects in a polymorphic way because the compiler needs to know the exact size. So it works only with pointers and references. And returning a reference from a factory doesn't work too, because while an object technically can be deleted by reference, it could be rather confusing and bug-prone, see Is the practice of returning a C++ reference variable, evil? for example. So pointers are the only thing that's left, and that includes smart pointers too. In other words, factories are most useful when used with dynamic allocation, so you can do things like this:
class Abstract {
public:
virtual ~Abstract() {} // allows deletion through a base pointer
virtual void run() = 0;
};
class Factory {
public:
Abstract *create();
};
Factory f;
Abstract *a = f.create();
a->run();
In other cases, factories just help to solve minor problems like those with overloads you have mentioned. It would be nice if it was possible to use them in a uniform way, but it doesn't hurt much that it is probably impossible.
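For the polymorphic case discussed above, here is a hedged variation that hands back a smart pointer instead of a raw one, so the caller cannot forget to delete the object. The names Shape, Circle, Square and make_shape are hypothetical and not part of the original answer.

```cpp
#include <memory>
#include <string>

class Shape {
public:
    virtual ~Shape() = default;  // needed to delete through a base pointer
    virtual std::string name() const = 0;
};

class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};

class Square : public Shape {
public:
    std::string name() const override { return "square"; }
};

// The concrete type is chosen at run time, which is why the result
// has to be handed back through a pointer to the base class.
std::unique_ptr<Shape> make_shape(bool round) {
    if (round) {
        return std::unique_ptr<Shape>(new Circle());
    }
    return std::unique_ptr<Shape>(new Square());
}
```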
Simple Factory Example:
// Factory returns object and ownership
// Caller responsible for deletion.
#include <memory>
class FactoryReleaseOwnership{
public:
std::unique_ptr<Foo> createFooInSomeWay(){
return std::unique_ptr<Foo>(new Foo(some, args));
}
};
// Factory retains object ownership
// Thus returning a reference.
#include <boost/ptr_container/ptr_vector.hpp>
class FactoryRetainOwnership{
boost::ptr_vector<Foo> myFoo;
public:
Foo& createFooInSomeWay(){
// The factory must outlive every reference it hands out.
// Could make myFoo static so it lasts as long as the application.
myFoo.push_back(new Foo(some, args));
return myFoo.back();
}
};
Have you thought about not using a factory at all, and instead making nice use of the type system? I can think of two different approaches which do this sort of thing:
Option 1:
struct linear {
linear(float x, float y) : x_(x), y_(y){}
float x_;
float y_;
};
struct polar {
polar(float angle, float magnitude) : angle_(angle), magnitude_(magnitude) {}
float angle_;
float magnitude_;
};
struct Vec2 {
explicit Vec2(const linear &l) { /* ... */ }
explicit Vec2(const polar &p) { /* ... */ }
};
Which lets you write things like:
Vec2 v(linear(1.0, 2.0));
Option 2:
You can use "tags", like the STL does with iterators and such. For example:
struct linear_coord_tag {} linear_coord; // declare type and a global
struct polar_coord_tag {} polar_coord;
struct Vec2 {
Vec2(float x, float y, const linear_coord_tag &) { /* ... */ }
Vec2(float angle, float magnitude, const polar_coord_tag &) { /* ... */ }
};
This second approach lets you write code which looks like this:
Vec2 v(1.0, 2.0, linear_coord);
which is also nice and expressive while allowing you to have unique prototypes for each constructor.
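To make the tag approach concrete, here is a small self-contained sketch. The tag names follow the answer; the x and y members and the polar conversion (angle in radians) are assumptions, since the original leaves the constructor bodies empty.

```cpp
#include <cmath>

// Empty tag types plus one global instance of each, used only to select an overload.
struct linear_coord_tag {};
struct polar_coord_tag {};
const linear_coord_tag linear_coord = {};
const polar_coord_tag polar_coord = {};

struct Vec2 {
    Vec2(float x, float y, const linear_coord_tag &) : x(x), y(y) {}
    Vec2(float angle, float magnitude, const polar_coord_tag &)
        : x(magnitude * std::cos(angle)), y(magnitude * std::sin(angle)) {}
    float x, y;  // Cartesian storage is an assumption
};
```

The tag argument carries no data, so the compiler can typically optimise it away; its only job is to pick the constructor at compile time.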
You can read a very good solution in: http://www.codeproject.com/Articles/363338/Factory-Pattern-in-Cplusplus
The best solution is in the "comments and discussions" section; see "No need for static Create methods".
Based on this idea, I wrote a factory. Note that I'm using Qt, but you can swap QMap and QString for their std equivalents.
#ifndef FACTORY_H
#define FACTORY_H
#include <QMap>
#include <QString>
#include <type_traits> // std::is_base_of
template <typename T>
class Factory
{
public:
template <typename TDerived>
void registerType(QString name)
{
static_assert(std::is_base_of<T, TDerived>::value, "Factory::registerType doesn't accept this type because it doesn't derive from the base class");
_createFuncs[name] = &createFunc<TDerived>;
}
T* create(QString name) {
typename QMap<QString,PCreateFunc>::const_iterator it = _createFuncs.find(name);
if (it != _createFuncs.end()) {
return it.value()();
}
return nullptr;
}
private:
template <typename TDerived>
static T* createFunc()
{
return new TDerived();
}
typedef T* (*PCreateFunc)();
QMap<QString,PCreateFunc> _createFuncs;
};
#endif // FACTORY_H
Sample usage:
Factory<BaseClass> f;
f.registerType<Descendant1>("Descendant1");
f.registerType<Descendant2>("Descendant2");
Descendant1* d1 = static_cast<Descendant1*>(f.create("Descendant1"));
Descendant2* d2 = static_cast<Descendant2*>(f.create("Descendant2"));
BaseClass *b1 = f.create("Descendant1");
BaseClass *b2 = f.create("Descendant2");
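For readers without Qt, the same idea might look roughly like this with std::map and std::string. This is a sketch rather than a drop-in replacement: returning std::unique_ptr instead of a raw pointer is a deliberate change, and the BaseClass, Descendant1 and Descendant2 definitions are hypothetical stand-ins for the types used in the sample.

```cpp
#include <map>
#include <memory>
#include <string>
#include <type_traits>

template <typename T>
class Factory {
public:
    template <typename TDerived>
    void registerType(const std::string &name) {
        static_assert(std::is_base_of<T, TDerived>::value,
                      "TDerived must derive from T");
        createFuncs_[name] = &createFunc<TDerived>;
    }

    // Returns an empty pointer when the name was never registered.
    std::unique_ptr<T> create(const std::string &name) const {
        auto it = createFuncs_.find(name);
        if (it == createFuncs_.end()) {
            return std::unique_ptr<T>();
        }
        return std::unique_ptr<T>(it->second());
    }

private:
    typedef T *(*PCreateFunc)();

    template <typename TDerived>
    static T *createFunc() { return new TDerived(); }

    std::map<std::string, PCreateFunc> createFuncs_;
};

// Hypothetical hierarchy used only to exercise the factory.
struct BaseClass {
    virtual ~BaseClass() = default;
    virtual int id() const = 0;
};
struct Descendant1 : BaseClass { int id() const override { return 1; } };
struct Descendant2 : BaseClass { int id() const override { return 2; } };
```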
I mostly agree with the accepted answer, but there is a C++11 option that has not been covered in existing answers:
Return factory method results by value, and
Provide a cheap move constructor.
Example:
struct sandwich {
// Factory methods.
static sandwich ham();
static sandwich spam();
// Move constructor.
sandwich(sandwich &&);
// etc.
};
Then you can construct objects on the stack:
sandwich mine{sandwich::ham()};
As subobjects of other things:
auto lunch = std::make_pair(sandwich::spam(), apple{});
Or dynamically allocated:
auto ptr = std::make_shared<sandwich>(sandwich::ham());
When might I use this?
If, on a public constructor, it is not possible to give meaningful initialisers for all class members without some preliminary calculation, then I might convert that constructor to a static method. The static method performs the preliminary calculations, then returns a value result via a private constructor which just does a member-wise initialisation.
I say 'might' because it depends on which approach gives the clearest code without being unnecessarily inefficient.
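As an illustration of that last case, here is a small sketch in which the static method does the preliminary calculation and the private constructor only initialises members. The fraction class and its reduced method are hypothetical names chosen for the example.

```cpp
class fraction {
public:
    // Factory method: does the preliminary work (finding the gcd),
    // then hands already-final values to the private constructor.
    static fraction reduced(int num, int den) {
        int a = num < 0 ? -num : num;
        int b = den < 0 ? -den : den;
        while (b != 0) {  // Euclid's algorithm
            int t = a % b;
            a = b;
            b = t;
        }
        // a is now gcd(|num|, |den|); it is zero only for 0/0.
        if (a == 0) {
            a = 1;
        }
        return fraction(num / a, den / a);
    }

    int num() const { return num_; }
    int den() const { return den_; }

private:
    fraction(int num, int den) : num_(num), den_(den) {}  // member-wise only
    int num_;
    int den_;
};
```

Because the constructor is private, every fraction that exists has gone through reduced, so the invariant cannot be bypassed.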
Loki has both a Factory Method and an Abstract Factory. Both are documented (extensively) in Modern C++ Design, by Andrei Alexandrescu. The factory method is probably closer to what you seem to be after, though it's still a bit different (at least if memory serves, it requires you to register a type before the factory can create objects of that type).
I won't try to answer all of your questions, as I believe that would be too broad. Just a couple of notes:
there are cases when object construction is a task complex enough to justify its extraction to another class.
That class is in fact a Builder, rather than a Factory.
In the general case, I don't want to force the users of the factory to be restrained to dynamic allocation.
Then you could have your factory encapsulate it in a smart pointer. I believe this way you can have your cake and eat it too.
This also eliminates the issues related to return-by-value.
Conclusion: Making a factory by returning an object is indeed a solution for some cases (such as the 2-D vector previously mentioned), but still not a general replacement for constructors.
Indeed. All design patterns have their (language specific) constraints and drawbacks. It is recommended to use them only when they help you solve your problem, not for their own sake.
If you are after the "perfect" factory implementation, well, good luck.
This is my C++11-style solution (the deduced auto return types actually require C++14). The template parameter 'base' is the base class of all sub-classes. The creators are std::function objects that create sub-class instances; each might be bound to a sub-class's static member function 'create(some args)'. This may not be perfect, but it works for me, and it is a fairly general solution.
template <class base, class... params> class factory {
public:
factory() {}
factory(const factory &) = delete;
factory &operator=(const factory &) = delete;
auto create(const std::string name, params... args) {
auto key = your_hash_func(name.c_str(), name.size());
return create(key, args...); // already an rvalue, no std::move needed
}
auto create(key_t key, params... args) {
std::unique_ptr<base> obj{creators_[key](args...)};
return obj;
}
void register_creator(const std::string name,
std::function<base *(params...)> &&creator) {
auto key = your_hash_func(name.c_str(), name.size());
creators_[key] = std::move(creator);
}
protected:
std::unordered_map<key_t, std::function<base *(params...)>> creators_;
};
A usage example:
class base {
public:
base(int val) : val_(val) {}
virtual ~base() { std::cout << "base destroyed\n"; }
protected:
int val_ = 0;
};
class foo : public base {
public:
foo(int val) : base(val) { std::cout << "foo " << val << " \n"; }
static foo *create(int val) { return new foo(val); }
virtual ~foo() { std::cout << "foo destroyed\n"; }
};
class bar : public base {
public:
bar(int val) : base(val) { std::cout << "bar " << val << "\n"; }
static bar *create(int val) { return new bar(val); }
virtual ~bar() { std::cout << "bar destroyed\n"; }
};
int main() {
common::factory<base, int> factory;
auto foo_creator = std::bind(&foo::create, std::placeholders::_1);
auto bar_creator = std::bind(&bar::create, std::placeholders::_1);
factory.register_creator("foo", foo_creator);
factory.register_creator("bar", bar_creator);
{
auto foo_obj = factory.create("foo", 80);
foo_obj.reset();
}
{
auto bar_obj = factory.create("bar", 90);
bar_obj.reset();
}
}
Factory Pattern
class Point
{
public:
static Point Cartesian(double x, double y);
private:
};
And if your compiler does not support Return Value Optimization, ditch it; it probably does not do much optimization at all...
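A fuller version of the Point class above might look like the sketch below. Only Cartesian appears in the original; the Polar function, the accessors, and the private constructor are assumptions added to show why named factory functions help when two constructors would otherwise share a signature.

```cpp
#include <cmath>

class Point {
public:
    // Named factory functions; both return by value.
    static Point Cartesian(double x, double y) { return Point(x, y); }
    static Point Polar(double radius, double angle) {  // angle in radians
        return Point(radius * std::cos(angle), radius * std::sin(angle));
    }

    double x() const { return x_; }
    double y() const { return y_; }

private:
    Point(double x, double y) : x_(x), y_(y) {}
    double x_, y_;
};
```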
extern std::pair<std::string_view, Base*(*)()> const factories[2];
decltype(factories) factories{
{"blah", []() -> Base*{return new Blah;}},
{"foo", []() -> Base*{return new Foo;}}
};
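One way such a table could be used is sketched below (C++17, since it relies on std::string_view). The Base, Blah and Foo definitions, the id member, and the make lookup function are all hypothetical; the snippet above only shows the table itself.

```cpp
#include <memory>
#include <string_view>
#include <utility>

struct Base {
    virtual ~Base() = default;
    virtual int id() const = 0;
};
struct Blah : Base { int id() const override { return 1; } };
struct Foo : Base { int id() const override { return 2; } };

// Same shape of table as above: a name paired with a creation function.
using Creator = Base *(*)();
const std::pair<std::string_view, Creator> factories[2] = {
    {"blah", []() -> Base * { return new Blah; }},
    {"foo", []() -> Base * { return new Foo; }},
};

// Linear search is fine for a handful of entries.
std::unique_ptr<Base> make(std::string_view name) {
    for (const auto &entry : factories) {
        if (entry.first == name) {
            return std::unique_ptr<Base>(entry.second());
        }
    }
    return nullptr;  // unknown name
}
```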
I know this question was answered 3 years ago, but this may be what you were looking for.
A couple of weeks ago Google released a library that allows easy and flexible dynamic object allocation. Here it is: http://google-opensource.blogspot.fr/2014/01/introducing-infact-library.html