Task ownership in Factory-Processor model - C++

Factory supplies Tasks of different types to Processor asynchronously. Processor doesn't know the details of the Tasks and executes them via a known Interface. Dynamic allocation is prohibited for performance reasons. Factory should not own the Tasks, because otherwise Processor would need to inform Factory when it finishes executing a Task so that Factory could do the cleanup. Processor should know only the Interface, not the Tasks themselves. Processor may own Tasks as opaque objects while it processes them.
One possible solution is to store every kind of Task inside a union of "Interface & padding buffer". Please consider the following working example (C++11):
#include <iostream>
#include <new>
#include <utility>

struct Interface
{
    virtual void execute() {}
};

union X
{
    X() {}

    Interface i;
    char padding[1024];

    template <class T>
    X& operator= (T &&y)
    {
        static_assert (sizeof(T) <= sizeof(padding), "X capacity is not enough!");
        new (padding) T(y);
        return *this;   // was missing: flowing off the end of a value-returning function is UB
    }
};

struct Task : public Interface
{
    Task() : data(777) {}
    virtual void execute() { std::cout << data << std::endl; }

    int data;
};

int main()
{
    Task t;
    X x;
    x = std::move(t);
    Interface *i = &x.i;
    i->execute();
}
The snippet works as expected (prints 777). But are there any dangers (like virtual inheritance) in such an approach? Is a better solution possible?

Your solution seems to involve both an unnecessary copy operation and assumptions about the layout of your objects in memory that are not guaranteed to be correct in all circumstances. It further invokes undefined behaviour by using memcpy to copy an object with virtual methods, which is explicitly disallowed by the C++ spec. It also has the potential to cause confusion over when object destructors run.
I would use an arrangement like this:
class Processor has an array of buffers, each of which is large enough to contain any defined subclass of your task interface. It has two methods used in submitting tasks:
one to return a pointer to a currently available buffer
one to submit a job
The job interface is extended with a requirement to track the pointer to the buffer that contains it (which will be supplied as a constructor parameter), and has a method to return that pointer.
Submitting a new task is now done like this:
void * buffer = processor.getBuffer();
Task * task = new (buffer) Task(buffer);
processor.submitJob(task);
(this could be simplified using a template method in Processor if required). Then, the processor simply executes jobs, and when it's done with them it asks them for their buffer, runs their destructor, and adds the buffer back into its free buffer list.
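A rough sketch of that arrangement (the names Job, run(), kBufferSize, and the container choices are illustrative, not from the original answer; the std::vectors stand in for whatever fixed-size free list the real code would use if even that allocation is unacceptable):
#include <cstddef>
#include <vector>

struct Job {
    explicit Job(void* buf) : buffer_(buf) {}
    virtual ~Job() {}
    virtual void execute() = 0;
    void* buffer() const { return buffer_; }     // hands the buffer back to the Processor
private:
    void* buffer_;
};

class Processor {
public:
    static const std::size_t kBufferSize  = 1024;  // must fit the largest Job subclass
    static const std::size_t kBufferCount = 16;

    Processor() {
        for (std::size_t i = 0; i < kBufferCount; ++i)
            free_.push_back(storage_ + i * kBufferSize);
    }
    void* getBuffer() {                       // hand out a currently available buffer
        void* b = free_.back();
        free_.pop_back();
        return b;
    }
    void submitJob(Job* j) { pending_.push_back(j); }

    void run() {
        for (std::size_t i = 0; i < pending_.size(); ++i) {
            Job* j = pending_[i];
            j->execute();
            void* b = j->buffer();
            j->~Job();                        // explicit destructor call
            free_.push_back(b);               // recycle the buffer
        }
        pending_.clear();
    }

private:
    alignas(std::max_align_t) char storage_[kBufferSize * kBufferCount];
    std::vector<void*> free_;
    std::vector<Job*>  pending_;
};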

Updated answer.
See: std::aligned_union (en.cppreference.com). It is designed to be used together with placement new and explicit destructor call.
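For instance, a minimal sketch reusing the Interface and Task types from the question (C++11):
#include <new>
#include <type_traits>

// One slot, correctly sized and aligned for every listed Task type.
typedef std::aligned_union<0, Task /*, OtherTaskTypes... */>::type TaskSlot;

void example()
{
    TaskSlot slot;
    Task* t = new (&slot) Task();      // placement new into the slot
    Interface* i = t;
    i->execute();                      // prints 777
    t->~Task();                        // explicit destructor call when done
    // Destroying through an Interface* instead would require a virtual destructor.
}
Unlike a bare char array, std::aligned_union picks a size and alignment suitable for every type listed, so the buffer is always safe to placement-new into.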
Below is the earlier answer, now retracted.
From the design perspective,
Avoiding dynamic allocation seems a drastic requirement. It requires some extraordinary justification.
In case one does not trust the standard allocator for any reason, one could still implement a custom allocator, in order to have full control of its behavior.
If there is a class or method that "owns" all instances of everything: all Factories, all Processors, and all Tasks (as is the case in your main() method), then it is not necessary to copy anything. Just pass references or pointers around, since this "class or method that owns everything" will take care of object lifetime.
My answer is only applicable to the question about "memcpy".
I do not try to cover the issue of memcpy-ing between "Task which has Interface as base class, and X which has Interface as member". This doesn't seem universally valid for all of the C++ compilers, but I don't know offhand which C++ compilers would fail this code.
Short answer, which is applicable to all C++ compilers:
To use memcpy on a type, the type needs to be trivially copyable.
Currently, trivially copyable lists "no virtual functions" as one of the necessary conditions, so the "according to the spec" answer is that your struct Task is not trivially copyable.
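This can be checked at compile time; a small sketch, assuming the Task type from the question and C++11 <type_traits> (note that some older compilers shipped <type_traits> without this particular trait):
#include <type_traits>

// This assertion fails: Task has a virtual function, so it is not trivially
// copyable, and memcpy-ing it is therefore undefined behaviour.
static_assert(std::is_trivially_copyable<Task>::value,
              "Task is not trivially copyable - do not memcpy it");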
The longer, non-standard answer is whether your particular compiler will synthesize the struct and the machine code that would be effectively copyable (i.e. without ill-effects), despite the C++ specification saying no. Obviously this answer will be compiler-specific, and will depend on a lot of circumstances (such as optimization flags and minor code changes).
Remember that compiler optimization and code generation can change from version to version. There is no guarantee that the next version of the compiler will behave exactly the same.
To give an example of something that would be likely to be unsafe for memcpy-ing between two instances, consider:
struct Task : public Interface
{
Task(std::string&& s)
: data(std::move(s))
{}
virtual void execute() { std::cout << data << std::endl; }
std::string data;
};
The reason this is problematic is that, for sufficiently long strings, std::string will allocate dynamic memory to store its content. If memcpy is used to copy the bytes of one Task instance into another (which copies over the internal fields of the std::string), both strings' internal pointers end up pointing to the same address, and therefore both destructors will try to delete the same memory, leading to undefined behavior. In addition, if the instance being overwritten previously held a string, that string's memory is never freed.
Since you have said that "dynamic allocation is prohibited", my guess is that you will not be using std::string or anything similar, instead opting to write C-like code exclusively. So this concern may not be relevant to you.
Speaking of "low level C-like code", here is my idea:
struct TaskBuffer
{
    typedef void (*ExecuteFunc) (TaskBuffer*);
    ExecuteFunc executeFunc;
    char padding[1024];
};

void ProcessMethod(TaskBuffer* tb)
{
    (tb->executeFunc)(tb);
}
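A possible way to use that idea, purely for illustration (PrintTask and fillTask are made-up names, not part of the original suggestion): the factory placement-constructs its payload into padding and installs a matching execute function, so the processor needs neither virtual dispatch nor dynamic allocation. Note that padding should be suitably aligned for whatever payloads are stored in it (e.g. with alignas).
#include <iostream>
#include <new>

struct PrintTask                      // hypothetical concrete task
{
    int data;

    static void execute(TaskBuffer* tb)
    {
        PrintTask* self = reinterpret_cast<PrintTask*>(tb->padding);
        std::cout << self->data << std::endl;
        // PrintTask is trivially destructible, so no explicit destructor call is needed.
    }
};

void fillTask(TaskBuffer& tb)         // what the Factory side might do
{
    static_assert(sizeof(PrintTask) <= sizeof(TaskBuffer::padding), "payload too large");
    new (tb.padding) PrintTask{777};
    tb.executeFunc = &PrintTask::execute;
}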


How to allow a std::unique_ptr to access a class's private destructor or implement a C++ factory class with a private destructor?

I'm quite far into the development of a game using SDL, OpenGL and C++ and am looking for ways to optimize the way the game switches between GLSL shaders for lots of different objects of different types. This is much more of a C++ question than an OpenGL question. However, I still want to provide as much context as I can, as I feel some justification is needed as to why the proposed Shader class I need, needs to be created / deleted the way that it is.
The first four sections are my justifications, journey & attempts leading up to this point, however my question can likely be answered by just the final section alone and I've intentionally written it as a bit of a tldr.
The necessity for a Shader class:
I've seen many implementations online of OpenGL shaders being created, compiled and deleted all in the same function when game objects are created during gameplay. This has proven to be inefficient and far too slow in particular sections of my game. Thus, I've required a system that creates and compiles shaders during a load-time and then intermittently uses/swaps between them during game-time before being deleted later.
This has led to the creation of a class (Shader) that manages OpenGL shaders. Each instance of the class should manage one unique OpenGL shader and contains some complex behavior around the shader type, where it's loaded from, where it's used, the uniform variables it takes, etc.
With this said, the most important role of this class is to store the GLuint variable id that is returned from glCreateShader(), and manage all OpenGL calls that relate to the OpenGL shader with this id. I understand that this is effectively futile given the global nature of OpenGL (anywhere in the program could technically call glDeleteShader() with the matching id and break the class); however, for the purpose of intentionally confining all OpenGL calls to very specific areas of the codebase, this system will drastically reduce code complexity.
Where the problems start...
The most "automatic" way to manage this GLuint id, would be to invoke glCreateShader() on the object's construction and glDeleteShader() on the object's destruction. This guarantees(within OpenGL limits) that the OpenGL shader will exist for the entire lifetime of the C++ Shader object and eliminates the need to call some void createShader() and deleteShader() functions.
This is all well and good, however problems soon arise when considering what happens if this object is copied. What if a copy of this object is destructed? That means that glDeleteShader() will be called and effectively break all copies of the shader object.
What about simple mistakes like accidentally invoking std::vector::push_back() in a vector of Shaders? Various std::vector methods can invoke the constructor / copy constructor / destructor of their type, which can result in the same problem as above.
Okay then... how about we do create some void createShader() and deleteShader() methods even if it's messy? Unfortunately this just defers the above problem, as once again any calls that modify the OpenGL shader will desynchronize / outright break all copies of a shader class with the same id. I've limited the OpenGL calls to glCreateShader() and glDeleteShader() in this example to keep things simple, however I should note that there are many other OpenGL calls in the class that would make creating various instance/static variables that keep track of instance copies far too complicated to justify doing it this way.
The last point I want to make before jumping into the class design below is that for a project as large as a raw C++, OpenGL and SDL Game, I'd prefer if any potential OpenGL mistakes I make generate compiler errors versus graphical issues that are harder to track down. This can be reflected in the class design below.
The first version of the Shader class:
It is for the above reasons that I have elected to:
Make the constructor private.
Provide a public static create function that returns a pointer to a new Shader object in place of a constructor.
Make the copy constructor private.
Make the operator= private (Although this might not be necessary).
Make the destructor private.
Put calls to glCreateShader() in the constructor and glDeleteShader() in the destructor, to have OpenGL shaders exist for the lifetime of this object.
As the create function invokes the new keyword (and returns the resulting pointer), the code that calls Shader::create() must then invoke delete manually (more on this in a second).
To my understanding, the first two bullet points utilize a factory pattern and will generate a compiler error should a non-pointer type of the class be attempted to be created. The third, fourth and fifth bullet points then prevent the object from being copied. The seventh bullet point then ensures that the OpenGL Shader will exist for the same lifetime of the C++ Shader object.
Smart Pointers and the main problem:
The only thing I'm not a huge fan of in the above is the new/delete calls. They also make the glDeleteShader() calls in the destructor of the object feel inappropriate given the encapsulation that the class is trying to achieve. Given this, I opted to:
change the create function to return a std::unique_ptr of the Shader type instead of a Shader pointer.
The create function then looked like this:
std::unique_ptr<Shader> Shader::create() {
    return std::make_unique<Shader>();
}
But then a new problem arose... std::make_unique unfortunately requires that the constructor is public, which interferes with the necessities described in the previous section. Fortunately, I found a solution by changing it to:
std::unique_ptr<Shader> Shader::create() {
    return std::unique_ptr<Shader>(new Shader());
}
But... now std::unique_ptr requires that the destructor is public! This is... better but unfortunately, this means that the destructor can be manually called outside of the class, which in turn means the glDeleteShader() function can be called from outside the class.
Shader* p = Shader::create();
p->~Shader(); // Even though it would be hard to do this intentionally, I don't want to be able to do this.
delete p;
The final class:
For the sake of simplicity, I have removed the majority of instance variables, function/constructor arguments & other attributes, but here's what the final proposed class (mostly) looks like:
class GLSLShader {
public:
    ~GLSLShader() { /* OpenGL delete calls for id */ } // want to make this private.

    static std::unique_ptr<GLSLShader> create() {
        return std::unique_ptr<GLSLShader>(new GLSLShader());
    }

private:
    GLSLShader() { /* OpenGL create calls for id */ }
    GLSLShader(const GLSLShader& glslShader);
    GLSLShader& operator=(const GLSLShader&);

    GLuint id;
};
I'm happy with everything in this class, aside from the fact that the destructor is public. I've put this design to the test and the performance increase is very noticeable. Even though I can't imagine I'd ever accidentally manually call the destructor on a Shader object, I don't like that it is publicly exposed. I also feel that I might accidentally miss something, like the std::vector::push_back consideration in the second section.
I've found two potential solutions to this problem. I'd like some advice on these or other solutions.
Make std::unique_ptr or std::make_unique a friend of the Shader class. I've been reading threads such as this one, however this is to make the constructor accessible, rather than the destructor. I also don't quite understand the downsides / extra considerations needed with making std::unique_ptr or std::make_unique a friend (The top answer to that thread + comments)?
Not use smart pointers at all. Is there perhaps a way to have my static create() function return a raw pointer (using the new keyword) that is automatically deleted inside the class / when the Shader goes out of scope and the destructor is called?
Thank you very much for your time.
This is a context challenge.
You are solving the wrong problem.
GLuint id, would be to invoke glCreateShader() on the object's construction and glDeleteShader()
Fix the problem here.
The Rule of Zero is that you make your resource wrappers manage lifetimes, and you don't do it in business logic types. We can write a wrapper around a GLuint that knows how to clean itself up and is move-only, preventing double destruction, by hijacking std::unique_ptr to store an integer instead of a pointer.
Here we go:
// "pointers" in unique ptrs must be comparable to nullptr.
// So, let us make an integer qualify:
template<class Int>
struct nullable{
Int val=0;
nullable()=default;
nullable(Int v):val(v){}
friend bool operator==(std::nullptr_t, nullable const& self){return !static_cast<bool>(self);}
friend bool operator!=(std::nullptr_t, nullable const& self){return static_cast<bool>(self);}
friend bool operator==(nullable const& self, std::nullptr_t){return !static_cast<bool>(self);}
friend bool operator!=(nullable const& self, std::nullptr_t){return static_cast<bool>(self);}
operator Int()const{return val;}
};
// This both statelessly stores the deleter, and
// tells the unique ptr to use a nullable<Int> instead of an Int*:
template<class Int, void(*deleter)(Int)>
struct IntDeleter{
using pointer=nullable<Int>;
void operator()(pointer p)const{
deleter(p);
}
};
// Unique ptr's core functionality is cleanup on destruction
// You can change what it uses for a pointer.
template<class Int, void(*deleter)(Int)>
using IntResource=std::unique_ptr<Int, IntDeleter<Int,deleter>>;
// Here we statelessly remember how to destroy this particular
// kind of GLuint, and make it an RAII type with move support:
using GLShaderResource=IntResource<GLuint,glDeleteShader>;
Now that type knows it is a shader and cleans itself up if non-null.
GLShaderResource id(glCreateShader());
SomeGLFunction(id.get());
apologies for any typos.
Stuff that in your class, and copy ctors are blocked, move ctors do the right thing, dtors clean up automatically, etc.
struct GLSLShader {
    // public!
    ~GLSLShader() = default;
    GLSLShader() { /* OpenGL create calls for id */ }

private: // does this really need to be private?
    GLShaderResource id;
};
so much simpler.
std::vector<GLSLShader> v;
and that just works. Our GLShaderResource is semi-regular (move only regular type, no sort support), and vector is happy with those. Rule of 0 means that GLSLShader, which owns it, is also semi-regular and supports RAII -- resource allocation is initialization -- which in turn means it cleans up after itself properly when stored in std containers.
A type being "Regular" means it "behaves like an int" -- like the prototypical value type. C++'s standard library, and much of C++, likes it when you are using regular or semi-regular types.
Note that this is basically zero overhead; sizeof(GLShaderResource) is the same as GLuint and nothing goes on the heap. We have a pile of compile-time type machinery wrapping a simple 32-bit integer; that compile-time type machinery generates code, but doesn't make the data more complex than 32 bits.
Live example.
The overhead includes:
Some calling conventions pass a struct wrapping only an int differently than they pass a bare int.
On destruction, we check every one of these to see if it is 0 to decide if we want to call glDeleteShader; compilers can sometimes prove that something is guaranteed zero and skip that check. But it won't tell you if it did manage to pull that off. (OTOH, humans are notoriously bad at proving that they kept track of all resources, so a few runtime checks aren't the worst thing).
If you are doing a completely unoptimized build, there are going to be a few extra instructions when you call an OpenGL function. But after any non-zero level of inlining by the compiler they will disappear.
The type isn't "trivial" (a term in the C++ standard) in a few ways (copyable, destroyable, constructible), which makes doing things like memset illegal under the C++ standard; you can't treat it like raw memory in a few low level ways.
A problem!
Many OpenGL implementations have pointers for glDeleteShader/glCreateShader etc., and the above relies on them being actual functions, not pointers or macros.
There are two easy workarounds. The first is to add a & to the deleter arguments above (two spots). This has the problem that it only works when they are actually pointers now, and not when they are actual functions.
Making code that works in both cases is a bit tricky, but I think almost every GL implementation uses function pointers, so you should be good unless you want to make a "library quality" implementation. In that case, you can write some helper types that create constexpr function pointers that call the function pointer (or not) by name.
Finally, apparently some destructors require extra parameters. Here is a sketch.
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <type_traits>
#include <utility>

using GLuint = std::uint32_t;

GLuint glCreateShaderImpl() { return 7; }
auto glCreateShader = glCreateShaderImpl;

void glDeleteShaderImpl(GLuint x) { std::cout << x << " deleted\n"; }
auto glDeleteShader = glDeleteShaderImpl;

std::pair<GLuint, GLuint> glCreateTextureWrapper() { return {7, 1024}; }
void glDeleteTextureImpl(GLuint x, GLuint size) { std::cout << x << " deleted size [" << size << "]\n"; }
auto glDeleteTexture = glDeleteTextureImpl;

template<class Int>
struct nullable {
    Int val = 0;
    nullable() = default;
    nullable(Int v) : val(v) {}
    nullable(std::nullptr_t) {}
    friend bool operator==(std::nullptr_t, nullable const& self) { return !static_cast<bool>(self); }
    friend bool operator!=(std::nullptr_t, nullable const& self) { return static_cast<bool>(self); }
    friend bool operator==(nullable const& self, std::nullptr_t) { return !static_cast<bool>(self); }
    friend bool operator!=(nullable const& self, std::nullptr_t) { return static_cast<bool>(self); }
    operator Int() const { return val; }
};

template<class Int, auto& deleter>
struct IntDeleter;

template<class Int, class... Args, void(*&deleter)(Int, Args...)>
struct IntDeleter<Int, deleter> :
    std::tuple<std::decay_t<Args>...>
{
    using base = std::tuple<std::decay_t<Args>...>;
    using base::base;
    using pointer = nullable<Int>;

    void operator()(pointer p) const {
        std::apply([&p](std::decay_t<Args> const&... args) -> void {
            deleter(p, args...);
        }, static_cast<base const&>(*this));
    }
};

template<class Int, void(*&deleter)(Int)>
using IntResource = std::unique_ptr<Int, IntDeleter<Int, deleter>>;

using GLShaderResource = IntResource<GLuint, glDeleteShader>;
using GLTextureResource = std::unique_ptr<GLuint, IntDeleter<GLuint, glDeleteTexture>>;

int main() {
    auto res = GLShaderResource(glCreateShader());
    std::cout << res.get() << "\n";

    auto tex = std::make_from_tuple<GLTextureResource>(glCreateTextureWrapper());
    std::cout << tex.get() << "\n";
}
Implement a deleter yourself, and let the deleter be a friend of your class.
Then edit your declaration like this:
static std::unique_ptr<GLSLShader, your_deleter> create();
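A minimal sketch of that suggestion (GLSLShaderDeleter is a placeholder name):
#include <memory>

class GLSLShader;

struct GLSLShaderDeleter
{
    void operator()(GLSLShader* p) const;    // defined below, once GLSLShader is complete
};

class GLSLShader
{
public:
    static std::unique_ptr<GLSLShader, GLSLShaderDeleter> create()
    {
        return std::unique_ptr<GLSLShader, GLSLShaderDeleter>(new GLSLShader());
    }

private:
    friend struct GLSLShaderDeleter;         // only the deleter may destroy instances
    GLSLShader()  { /* OpenGL create calls for id */ }
    ~GLSLShader() { /* OpenGL delete calls for id */ }
    GLSLShader(const GLSLShader&);           // non-copyable
    GLSLShader& operator=(const GLSLShader&);
};

inline void GLSLShaderDeleter::operator()(GLSLShader* p) const { delete p; }
Callers still write auto shader = GLSLShader::create();, but outside the class neither delete nor an explicit destructor call will compile, because only the friended deleter can reach the private destructor.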

Comparison between constant accessors of private members

The main portion of this question is in regard to the proper and most computationally efficient method of creating a public read-only accessor for a private data member inside a class. Specifically, using a const reference to access the variable, such as:
class MyClassReference
{
private:
    int myPrivateInteger;
public:
    const int & myIntegerAccessor;

    // Assign myPrivateInteger to the constant accessor.
    MyClassReference() : myIntegerAccessor(myPrivateInteger) {}
};
However, the current established method for solving this problem is to utilize a constant "getter" function as seen below:
class MyClassGetter
{
private:
    int myPrivateInteger;
public:
    int getMyInteger() const { return myPrivateInteger; }
};
The necessity (or lack thereof) for "getters/setters" has already been hashed out time and again in questions such as Conventions for accessor methods (getters and setters) in C++. That, however, is not the issue at hand.
Both of these methods offer the same functionality using the syntax:
MyClassGetter a;
MyClassReference b;
int SomeValue = 5;
int A_i = a.getMyInteger(); // Allowed.
a.getMyInteger() = SomeValue; // Not allowed.
int B_i = b.myIntegerAccessor; // Allowed.
b.myIntegerAccessor = SomeValue; // Not allowed.
After discovering this, and finding nothing on the internet concerning it, I asked several of my mentors and professors which is appropriate and what the relative advantages/disadvantages of each are. However, all responses I received fell nicely into two categories:
I have never even thought of that, but use a "getter" method as it is "Established Practice".
They function the same (They both run with the same efficiency), but use a "getter" method as it is "Established Practice".
While both of these answers were reasonable, since they both failed to explain the "why" I was left unsatisfied and decided to investigate the issue further. I conducted several tests, such as average character usage (roughly the same) and average typing time (again roughly the same), but one test showed an extreme discrepancy between the two methods: a run-time test of calling the accessor and assigning the result to an integer. Without any -OX flags (in debug mode), MyClassReference performed roughly 15% faster. However, once a -OX flag was added, in addition to both methods running much faster, they ran with the same efficiency.
My question thus has two parts.
How do these two methods differ, and what causes one to be faster/slower than the others only with certain optimization flags?
Why is it that established practice is to use a constant "getter" function, while using a constant reference is rarely known let alone utilized?
As comments pointed out, my benchmark testing was flawed, and irrelevant to the matter at hand. However, for context it can be located in the revision history.
The answer to question #2 is that sometimes, you might want to change class internals. If you made all your attributes public, they're part of the interface, so even if you come up with a better implementation that doesn't need them (say, it can recompute the value on the fly quickly and shave the size of each instance so programs that make 100 million of them now use 400-800 MB less memory), you can't remove it without breaking dependent code.
With optimization turned on, the getter function should be indistinguishable from direct member access when the code for the getter is just a direct member access anyway. But if you ever want to change how the value is derived to remove the member variable and compute the value on the fly, you can change the getter implementation without changing the public interface (a recompile would fix up existing code using the API without code changes on their end), because a function isn't limited in the way a variable is.
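For instance, a hypothetical later revision of the question's MyClassGetter could drop the stored member entirely while keeping the same signature (the replacement members here are made up purely for illustration):
class MyClassGetter
{
private:
    int myBaseValue;   // hypothetical replacement state
    int myScale;
public:
    // Same signature as before, so callers only need a recompile,
    // but the value is now computed on the fly instead of stored.
    int getMyInteger() const { return myBaseValue * myScale; }
};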
There are semantic/behavioral differences that are far more significant than your (broken) benchmarks.
Copy semantics are broken
A live example:
#include <iostream>

class Broken {
public:
    Broken(int i): read_only(read_write), read_write(i) {}

    int const& read_only;

    void set(int i) { read_write = i; }

private:
    int read_write;
};

int main() {
    Broken original(5);
    Broken copy(original);
    std::cout << copy.read_only << "\n";

    original.set(42);
    std::cout << copy.read_only << "\n";

    return 0;
}
Yields:
5
42
The problem is that when doing a copy, copy.read_only points to original.read_write. This may lead to dangling references (and crashes).
This can be fixed by writing your own copy constructor, but it is painful.
Assignment is broken
A reference cannot be reseated (you can alter the content of its referee but not switch it to another referee), leading to:
int main() {
    Broken original(5);
    Broken copy(4);
    copy = original;
    std::cout << copy.read_only << "\n";

    original.set(42);
    std::cout << copy.read_only << "\n";

    return 0;
}
generating an error:
prog.cpp: In function 'int main()':
prog.cpp:18:7: error: use of deleted function 'Broken& Broken::operator=(const Broken&)'
copy = original;
^
prog.cpp:3:7: note: 'Broken& Broken::operator=(const Broken&)' is implicitly deleted because the default definition would be ill-formed:
class Broken {
^
prog.cpp:3:7: error: non-static reference member 'const int& Broken::read_only', can't use default assignment operator
This can be fixed by writing your own copy-assignment operator, but it is painful.
Unless you fix it, Broken can only be used in very restricted ways; you may never manage to put it inside a std::vector for example.
Increased coupling
Giving away a reference to your internals increases coupling. You leak an implementation detail (the fact that you are using an int and not a short, long or long long).
With a getter returning a value, you can switch the internal representation to another type, or even elide the member and compute it on the fly.
This is only significant if the interface is exposed to clients expecting binary/source-level compatibility; if the class is only used internally and you can afford to change all users if it changes, then this is not an issue.
Now that semantics are out of the way, we can speak about performance differences.
Increased object size
While references can sometimes be elided, it is unlikely to ever happen here. This means that each reference member will increase the size of an object by at least sizeof(void*), plus potentially some padding for alignment.
The original class MyClassGetter has a size of 4 on x86 or x86-64 platforms with mainstream compilers.
The Broken class has a size of 8 on x86 and 16 on x86-64 platforms (the latter because of padding, as pointers are aligned on 8-bytes boundaries).
An increased size can blow CPU caches; with a large number of items you may quickly experience slowdowns due to it (well, not that it'll be easy to have vectors of Broken due to its broken assignment operator).
Better performance in debug
As long as the implementation of the getter is inline in the class definition, then the compiler will strip the getter whenever you compile with a sufficient level of optimizations (-O2 or -O3 generally, -O1 may not enable inlining to preserve stack traces).
Thus, the performance of access should only vary in debug code, where performance is least necessary (and otherwise so crippled by plenty of other factors that it matters little).
In the end, use a getter. It's established convention for a good number of reasons :)
When you implement a constant reference (or constant pointer) member, your object also stores a pointer, which makes it bigger. Accessor methods, on the other hand, are instantiated only once per program and are most likely optimized out (inlined), unless they are virtual or part of an exported interface.
By the way, getter method can also be virtual.
To answer question 2:
const_cast<int&>(mcb.myIntegerAccessor) = 4;
Is a pretty good reason to hide it behind a getter function. It is a clever way to do a getter-like operation, but it completely breaks abstraction in the class.

Simplest way to count instances of an object

I would like to know the exact number of instances of certain objects allocated at a certain point of execution. Mostly for hunting possible memory leaks (I mostly use RAII, almost no new, but I could still forget a .clear() on a vector before adding new elements, or something similar). Of course I could have an
atomic<int> cntMyObject;
that I decrement (--) in the destructor and increment (++) in the constructor and copy constructor (I hope I covered everything :)).
But that means hardcoding it for every class. And it is not simple to disable it in "Release" mode.
So is there any simple elegant way that can be easily disabled to count object instances?
Have a "counted object" class that does the proper reference counting in its constructor(s) and destructor, then derive your objects that you want to track from it. You can then use the curiously recurring template pattern to get distinct counts for any object types you wish to track.
// warning: pseudo code
template <class Obj>
class CountedObj
{
public:
    CountedObj() { ++total_; }
    CountedObj(const CountedObj& obj) { ++total_; }
    ~CountedObj() { --total_; }

    static size_t OustandingObjects() { return total_; }

private:
    static size_t total_;
};

// the static member still needs a definition somewhere, e.g.:
template <class Obj>
size_t CountedObj<Obj>::total_ = 0;
class MyClass : private CountedObj<MyClass>
{};
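Usage could then look like this (assuming the static total_ definition shown above is present; private inheritance hides the base from users of MyClass, but the class template itself can still be named directly):
#include <iostream>

int main()
{
    MyClass a;
    MyClass b(a);   // the copy is counted too
    std::cout << CountedObj<MyClass>::OustandingObjects() << "\n"; // prints 2
}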
you can apply this approach
#ifdef DEBUG
#include <iostream>

class ObjectCount {
    static int count;
protected:
    ObjectCount() {
        count++;
    }
public:
    static void showCount() {
        std::cout << count;
    }
};
int ObjectCount::count = 0;

class Employee : public ObjectCount {
#else
class Employee {
#endif
public:
    Employee() {}
    Employee(const Employee& emp) {
    }
};
In DEBUG mode, invoking the ObjectCount::showCount() method will print the count of objects created.
You are better off using memory profiling & leak detection tools like Valgrind or Rational Purify.
If you can't and want to implement your own mechanism then,
You should overload the new and delete operators for your class and then implement the memory diagnostic in them.
Have a look at this C++ FAQ answer to know how to do that and what precautions you should take.
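As a rough sketch of that idea (this is not the FAQ's code; Widget and liveAllocations are made-up names, and only heap allocations of the class itself are counted):
#include <atomic>
#include <cstdlib>
#include <new>

class Widget
{
public:
    static void* operator new(std::size_t size)
    {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        ++liveAllocations;                 // one more Widget on the heap
        return p;
    }

    static void operator delete(void* p) noexcept
    {
        if (p) {
            --liveAllocations;
            std::free(p);
        }
    }

    static std::atomic<long> liveAllocations;
};

std::atomic<long> Widget::liveAllocations(0);
The array forms (operator new[] / operator delete[]) and any placement forms would need the same treatment if they are used.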
This is a sort of working example of something similar: http://www.almostinfinite.com/memtrack.html (just copy the code at the end of the page and put it in Memtrack.h, and then run TrackListMemoryUsage() or one of the other functions to see diagnostics)
It overrides operator new and does some arcane macro stuff to 'stamp' each allocation with information that allows it to count how many instances of an object exist and how much memory they're using. It's not perfect though; the macros it uses break down under certain conditions. If you decide to try this out, make sure to include it after any standard headers.
Without knowing your code and your requirements, I see 2 reasonable options:
a) Use boost::shared_ptr. It has the atomic reference counts you suggested built in and takes care of your memory management (so that you'd never actually care to look at the count). Its reference count is available through the use_count() member.
b) If the implications of a), like dealing with pointers and having shared_ptrs everywhere, or possible performance overhead, are not acceptable for you, I'd suggest to simply use available tools for memory leak detection (e.g. Valgrind, see above) that'll report your loose objects at program exit. And there's no need to use intrusive helper classes for (anyway debug-only) tracking object counts, that just mess up your code, IMHO.
We used to have the solution of a base class with an internal counter and derived from it, but we changed it all to boost::shared_ptr; it keeps a reference counter and it cleans up memory for you. The boost smart pointer family is quite useful:
boost smart pointers
My approach, which outputs leakage count to Debug Output (via the DebugPrint function implemented in our code base, replace that call with your own...)
#include <typeinfo>
#include <string>

class CountedObjImpl
{
public:
    CountedObjImpl(const char* className) : mClassName(className) {}

    ~CountedObjImpl()
    {
        DebugPrint(_T("**##** Leakage count for %hs: %Iu\n"),
                   mClassName.c_str(), mInstanceCount);
    }

    size_t& GetCounter()
    {
        return mInstanceCount;
    }

private:
    size_t mInstanceCount = 0;
    std::string mClassName;
};

template <class Obj>
class CountedObj
{
public:
    CountedObj() { GetCounter()++; }
    CountedObj(const CountedObj& obj) { GetCounter()++; }
    ~CountedObj() { GetCounter()--; }

    static size_t OustandingObjects() { return GetCounter(); }

private:
    static size_t& GetCounter()   // static, so OustandingObjects() can call it
    {
        static CountedObjImpl mCountedObjImpl(typeid(Obj).name());
        return mCountedObjImpl.GetCounter();
    }
};
Example usage:
class PostLoadInfoPostLoadCB : public PostLoadCallback, private CountedObj<PostLoadInfoPostLoadCB>
Adding counters to individual classes was discussed in some of the answers. However, it requires picking the classes to be counted and modifying them in one way or another. The assumption in the following is that you are adding such counters to find bugs where more objects of certain classes are kept alive than expected.
To briefly recap some things already mentioned: for real memory leaks there are certainly valgrind:memcheck and the leak sanitizers. However, for other scenarios without real leaks (uncleared vectors, map entries with keys never accessed, cycles of shared_ptrs, ...) they do not help.
But, since this was not mentioned: In the valgrind tool suite there is also massif, which can provide you with the information about all pieces of allocated memory and where they were allocated. However, let's assume that valgrind:massif is also not an option for you, and you truly want instance counts.
For the purpose of occasional bug hunting - if you are open for some hackish solution and if the above options don't work - you might consider the following: Nowadays, many objects on the heap are effectively held by smart pointers. This could be the smart pointer classes from the standard library, or the smart pointer classes of the respective helper libraries you use. The trick is then the following (picking the shared_ptr as an example): You can get instance counters for many classes at once by patching the shared_ptr implementation, namely by adding instance counts to the shared_ptr class. Then, for some class Foo, the counter belonging to shared_ptr<Foo> will give you an indication of the number of instances of class Foo.
Certainly, it is not quite as accurate as adding the counters to the respective classes directly (instances referenced only by raw pointers are not counted), but possibly it is accurate enough for your case. And, certainly, this is not about changing the smart pointer classes permanently - only during the bug hunting. At least, the smart pointer implementations are not too complex, so patching them is simple.
This approach is much simpler than the rest of the solutions here.
Make a variable for the count and make it static. Increase that variable by +1 inside the constructor and decrease it by -1 inside the destructor.
Make sure you initialize the variable (it cannot be initialized inside the header because it's static).
.h

// Pseudo code warning
class MyObject
{
public:
    MyObject();
    ~MyObject();

    static int totalObjects;
};

.cpp

int MyObject::totalObjects = 0;

MyObject::MyObject()
{
    ++totalObjects;
}

MyObject::~MyObject()
{
    --totalObjects;
}
For every new instance you make, the constructor is called and totalObjects automatically grows by 1.

Lazy/multi-stage construction in C++

What's a good existing class/design pattern for multi-stage construction/initialization of an object in C++?
I have a class with some data members which should be initialized at different points in the program's flow, so their initialization has to be delayed. For example, one argument can be read from a file and another from the network.
Currently I am using boost::optional for the delayed construction of the data members, but it's bothering me that optional is semantically different than delay-constructed.
What I need is reminiscent of boost::bind and lambda partial function application, and using these libraries I could probably design multi-stage construction - but I prefer using existing, tested classes. (Or maybe there's another multi-stage construction pattern I am not familiar with.)
The key issue is whether or not you should distinguish completely populated objects from incompletely populated objects at the type level. If you decide not to make a distinction, then just use boost::optional or similar as you are doing: this makes it easy to get coding quickly. OTOH you can't get the compiler to enforce the requirement that a particular function requires a completely populated object; you need to perform run-time checking of fields each time.
Parameter-group Types
If you do distinguish completely populated objects from incompletely populated objects at the type level, you can enforce the requirement that a function be passed a complete object. To do this I would suggest creating a corresponding type XParams for each relevant type X. XParams has boost::optional members and setter functions for each parameter that can be set after initial construction. Then you can force X to have only one (non-copy) constructor, that takes an XParams as its sole argument and checks that each necessary parameter has been set inside that XParams object. (Not sure if this pattern has a name -- anybody like to edit this to fill us in?)
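A rough sketch of what that could look like (the fields and names are illustrative, not from the original answer):
#include <boost/optional.hpp>
#include <stdexcept>
#include <string>

struct XParams
{
    boost::optional<std::string> name;
    boost::optional<int>         id;

    void setName(const std::string& n) { name = n; }
    void setId(int i)                  { id = i; }
};

class X
{
public:
    explicit X(const XParams& p)    // the only non-copy constructor
    {
        if (!p.name || !p.id)
            throw std::domain_error("XParams is not completely populated");
        name_ = *p.name;
        id_   = *p.id;
    }

private:
    std::string name_;
    int         id_;
};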
"Partial Object" Types
This works wonderfully if you don't really have to do anything with the object before it is completely populated (perhaps other than trivial stuff like get the field values back). If you do have to sometimes treat an incompletely populated X like a "full" X, you can instead make X derive from a type XPartial, which contains all the logic, plus protected virtual methods for performing precondition tests that test whether all necessary fields are populated. Then if X ensures that it can only ever be constructed in a completely-populated state, it can override those protected methods with trivial checks that always return true:
#include <string>
#include <stdexcept>
#include <boost/optional.hpp>
using std::string;
using std::domain_error;
using boost::optional;

class XPartial {
    optional<string> name_;
public:
    void setName(string x) { name_.reset(x); }  // Can add getters and/or ctors

    string makeGreeting(string title) {
        if (checkMakeGreeting_()) {             // Is it safe?
            return string("Hello, ") + title + " " + *name_;
        } else {
            throw domain_error("ZOINKS");       // Or similar
        }
    }

    bool isComplete() const { return checkMakeGreeting_(); }  // All tests here

protected:
    virtual bool checkMakeGreeting_() const { return static_cast<bool>(name_); }  // Populated?
};

class X : public XPartial {
    X();  // Forbid default-construction; or, you could supply a "full" ctor
public:
    explicit X(XPartial const& x) : XPartial(x) {   // Avoid implicit conversion
        if (!x.isComplete()) throw domain_error("ZOINKS");
    }

    X& operator=(XPartial const& x) {
        if (!x.isComplete()) throw domain_error("ZOINKS");
        return static_cast<X&>(XPartial::operator=(x));
    }

protected:
    virtual bool checkMakeGreeting_() const { return true; }  // No checking needed! (const, so it actually overrides)
};
Although it might seem the inheritance here is "back to front", doing it this way means that an X can safely be supplied anywhere an XPartial& is asked for, so this approach obeys the Liskov Substitution Principle. This means that a function can use a parameter type of X& to indicate it needs a complete X object, or XPartial& to indicate it can handle partially populated objects -- in which case either an XPartial object or a full X can be passed.
Originally I had isComplete() as protected, but found this didn't work since X's copy ctor and assignment operator must call this function on their XPartial& argument, and they don't have sufficient access. On reflection, it makes more sense to publicly expose this functionality.
I must be missing something here - I do this kind of thing all the time. It's very common to have objects that are big and/or not needed by a class in all circumstances. So create them dynamically!
struct Big {
    char a[1000000];
};

class A {
public:
    A() : big(0) {}
    ~A() { delete big; }

    void f() {
        makebig();
        big->a[42] = 66;
    }

private:
    Big * big;

    void makebig() {
        if ( ! big ) {
            big = new Big;
        }
    }
};
I don't see the need for anything fancier than that, except that makebig() should probably be const (and maybe inline), and the Big pointer should probably be mutable. And of course A must be able to construct Big, which may in other cases mean caching the contained class's constructor parameters. You will also need to decide on a copying/assignment policy - I'd probably forbid both for this kind of class.
I don't know of any patterns to deal with this specific issue. It's a tricky design question, and one somewhat unique to languages like C++. Another issue is that the answer to this question is closely tied to your individual (or corporate) coding style.
I would use pointers for these members, and when they need to be constructed, allocate them at the same time. You can use auto_ptr for these, and check against NULL to see if they are initialized. (I think of pointers as a built-in "optional" type in C/C++/Java; there are other languages where NULL is not a valid pointer.)
One issue as a matter of style is that you may be relying on your constructors to do too much work. When I'm coding OO, I have the constructors do just enough work to get the object in a consistent state. For example, if I have an Image class and I want to read from a file, I could do this:
image = new Image("unicorn.jpeg"); /* I'm not fond of this style */
or, I could do this:
image = new Image(); /* I like this better */
image->read("unicorn.jpeg");
It can get difficult to reason about how a C++ program works if the constructors have a lot of code in them, especially if you ask the question, "what happens if a constructor fails?" This is the main benefit of moving code out of the constructors.
I would have more to say, but I don't know what you're trying to do with delayed construction.
Edit: I remembered that there is a (somewhat perverse) way to call a constructor on an object at any arbitrary time. Here is an example:
#include <new>   // for placement new

class Counter {
public:
    Counter(int &cref) : c(cref) { }
    void incr(int x) { c += x; }
private:
    int &c;
};

void dontTryThisAtHome() {
    int i = 0, j = 0;
    Counter c(i);        // Call constructor first time on c
    c.incr(5);           // now i = 5
    new(&c) Counter(j);  // Call the constructor AGAIN on c
    c.incr(3);           // now j = 3
}
Note that doing something as reckless as this might earn you the scorn of your fellow programmers, unless you've got solid reasons for using this technique. This also doesn't delay the constructor, just lets you call it again later.
Using boost.optional looks like a good solution for some use cases. I haven't played much with it so I can't comment much. One thing I keep in mind when dealing with such functionality is whether I can use overloaded constructors instead of default and copy constructors.
When I need such functionality I would just use a pointer to the type of the necessary field like this:
public:
    MyClass() : field_(0) { }   // constructor, additional initializers and code omitted

    ~MyClass() {
        if (field_)
            delete field_;      // free the constructed object only if initialized
    }
    ...
private:
    ...
    field_type* field_;
next, instead of using the pointer I would access the field through the following method:
private:
    ...
    field_type& field() {
        if (!field_)
            field_ = new field_type(...);
        return *field_;   // return a reference to the object, not the pointer itself
    }
I have omitted const-access semantics
The easiest way I know is similar to the technique suggested by Dietrich Epp, except it allows you to truly delay the construction of an object until a moment of your choosing.
Basically: reserve the object using malloc instead of new (thereby bypassing the constructor), then call the overloaded new operator when you truly want to construct the object via placement new.
Example:
#include <new>      // for placement new
#include <cstdlib>  // for malloc/free

Object *x = (Object *) malloc(sizeof(Object));

//Use the object member items here. Be careful: no constructors have been called!
//This means you can assign values to ints, structs, etc... but nested objects can wreak havoc!

//Now we want to call the constructor of the object
new(x) Object(params);

//However, you must remember to also manually call the destructor!
x->~Object();
free(x);

//Note: if the malloc and new calls in your development stack
//store in the same heap, you can just call delete(x) instead of the
//destructor followed by free, but the above is the correct way of
//doing it
Personally, the only time I've ever used this syntax was when I had to use a custom C-based allocator for C++ objects. As Dietrich suggests, you should question whether you really, truly must delay the constructor call. The base constructor should perform the bare minimum to get your object into a serviceable state, whilst other overloaded constructors may perform more work as needed.
I don't know if there's a formal pattern for this. In places where I've seen it, we called it "lazy", "demand" or "on demand".

What is the Performance, Safety, and Alignment of a Data member hidden in an embedded char array in a C++ Class?

I have seen a codebase recently that I fear is violating alignment constraints. I've scrubbed it to produce a minimal example, given below. Briefly, the players are:
Pool. This is a class which allocates memory efficiently, for some definition of 'efficient'. Pool is guaranteed to return a chunk of memory that is aligned for the requested size.
Obj_list. This class stores homogeneous collections of objects. Once the number of objects exceeds a certain threshold, it changes its internal representation from a list to a tree. The size of Obj_list is one pointer (8 bytes on a 64-bit platform). Its populated store will of course exceed that.
Aggregate. This class represents a very common object in the system. Its history goes back to the early 32-bit workstation era, and it was 'optimized' (in that same 32-bit era) to use as little space as possible as a result. Aggregates can be empty, or manage an arbitrary number of objects.
In this example, Aggregate items are always allocated from Pools, so they are always aligned. The only occurrences of Obj_list in this example are the 'hidden' members in Aggregate objects, and therefore they are always allocated using placement new. Here are the support classes:
class Pool
{
public:
    Pool();
    virtual ~Pool();

    void *allocate(size_t size);

    static Pool *default_pool(); // returns a global pool
};

class Obj_list
{
public:
    inline void *operator new(size_t s, void * p) { return p; }

    Obj_list(const Args *args);
    // when constructed, Obj_list will allocate representation_p, which
    // can take up much more space.

    ~Obj_list();

private:
    Obj_list_store *representation_p;
};
And here is Aggregate. Note the member declaration member_list_store_d:
// Aggregate is derived from Lesser, which is twelve bytes in size
class Aggregate : public Lesser
{
public:
    inline void *operator new(size_t s) {
        return Pool::default_pool()->allocate(s);
    }
    inline void *operator new(size_t s, Pool *h) {
        return h->allocate(s);
    }

public:
    Aggregate(const Args *args = NULL);
    virtual ~Aggregate() {};

    inline const Obj_list *member_list_store_p() const;

protected:
    char member_list_store_d[sizeof(Obj_list)];
};
It is that data member that I'm most concerned about. Here is the pseudocode for initialization and access:
Aggregate::Aggregate(const Args *args)
{
    if (args) {
        new (static_cast<void *>(member_list_store_d)) Obj_list(args);
    }
    else {
        zero_out(member_list_store_d);
    }
}

inline const Obj_list *Aggregate::member_list_store_p() const
{
    return initialized(member_list_store_d) ? (Obj_list *) &member_list_store_d : 0;
}
You may be tempted to suggest that we replace the char array with a pointer to the Obj_list type, initialized to NULL or an instance of the class. This gives the proper semantics, but just shifts the memory cost around. If memory were still at a premium (and it might be, this is an EDA database representation), replacing the char array with a pointer to an Obj_list would cost one more pointer in the case when Aggregate objects do have members.
Besides that, I don't really want to get distracted from the main question here, which is alignment. I think the above construct is problematic, but can't really find more in the standard than some vague discussion of the alignment behavior of the 'system/library' new.
So, does the above construct do anything more than cause an occasional pipe stall?
Edit: I realize that there are ways to replace the approach using the embedded char array. So did the original architects. They discarded them because memory was at a premium. Now, if I have a reason to touch that code, I'll probably change it.
However, my question, about the alignment issues inherent in this approach, is what I hope people will address. Thanks!
Ok - had a chance to read it properly. You have an alignment problem, and invoke undefined behaviour when you access the char array as an Obj_list. Most likely your platform will do one of three things: let you get away with it, let you get away with it at a runtime penalty or occasionally crash with a bus error.
Your portable options to fix this are:
allocate the storage with malloc or a global allocation function, but you think this is too expensive.
as Arkadiy says, make your buffer an Obj_list member:
Obj_list list;
but you now don't want to pay the cost of construction. You could mitigate this by providing an inline do-nothing constructor to be used only to create this instance - as posted the default constructor would do. If you follow this route, strongly consider invoking the dtor
list.~Obj_list();
before doing a placement new into this storage.
Otherwise, I think you are left with non-portable options: either rely on your platform's tolerance of misaligned accesses, or else use any non-portable alignment options your compiler gives you.
Disclaimer: It's entirely possible I'm missing a trick with unions or some such. It's an unusual problem.
The alignment will be picked by the compiler according to its defaults; this will probably end up as four bytes under GCC / MSVC.
This should only be a problem if there is code (SIMD/DMA) that requires a specific alignment. In this case you should be able to use compiler directives to ensure that member_list_store_d is aligned, or increase the size by (alignment-1) and use an appropriate offset.
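For example, against the question's Aggregate this could be a C++11 alignas, or the equivalent vendor-specific attribute on older compilers; either changes the alignment of the buffer without changing its size (sketch only):
// C++11
class Aggregate : public Lesser
{
    // ...
protected:
    alignas(Obj_list) char member_list_store_d[sizeof(Obj_list)];
};

// Pre-C++11, compiler-specific equivalents:
//   GCC/Clang: char member_list_store_d[sizeof(Obj_list)] __attribute__((aligned(__alignof__(Obj_list))));
//   MSVC:      __declspec(align(8)) char member_list_store_d[sizeof(Obj_list)];  // the constant must be spelled out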
Can you simply have an instance of Obj_list inside Aggregate? IOW, something along the lines of
class Aggregate : public Lesser
{
    ...
protected:
    Obj_list list;
};
I must be missing something, but I can't figure out why this is bad.
As to your question - it's perfectly compiler-dependent. Most compilers, though, will align every member at word boundary by default, even if the member's type does not need to be aligned that way for correct access.
If you want to ensure alignment of your structures, just do a
// MSVC
#pragma pack(push,1)
// structure definitions
#pragma pack(pop)
// *nix
struct YourStruct
{
....
} __attribute__((packed));
to ensure 1-byte alignment of your char array in Aggregate.
Allocate the char array member_list_store_d with malloc or global operator new[], either of which will give storage aligned for any type.
Edit: Just read the OP again - you don't want to pay for another pointer. Will read again in the morning.