Access std::map within operator new called by static constructor - c++

1) I have some static classes in my project that allocate variables within their constructors.
class StaticClass
{
public:
char *var;
StaticClass()
{
var=new char[100];
}
};
static StaticClass staticClass;
2) I have overridden the new and delete operators and made them keep track of all current allocations in a std::unordered_map
unordered_map<void*,size_t> allocations;
void* operator new[](size_t size)
{
void *p=malloc(size);
if (p==0) // did malloc succeed?
throw std::bad_alloc(); // ANSI/ISO compliant behavior
allocations[p]=size;
return p;
}
When my program starts, staticClass's constructor is called before allocations' constructor is, so operator new() tries to insert size into allocations before it has been initialized, which errors.
Previously, when I ran into problems with the order of static construction, I simply replaced the std::map with a pointer initialized to NULL and created the map the first time it was used, ensuring it would be valid the first time I inserted into it:
unordered_map<void*,size_t> *allocations=NULL;
//in code called by static constructor:
if(allocations==NULL)
allocations=new unordered_map<void*,size_t>();
//now safe to insert into allocations
However, this will no longer work since I would be calling new within operator new(), creating an infinite recursive loop.
I am aware that I could probably solve this by making another special version of operator new that takes some token argument to differentiate it, and just use that to initialize allocations, however in a more general (learning) sense, I would prefer to somehow either
a) force allocations to initialize before StaticClass does (best)
b) have some way to call the default operator new instead of my overridden one (which I don't think is possible, but...)
c) some other more general solution?

A simple way to avoid initialization order issues is to wrap your static object inside a function:
unordered_map<void*,size_t> &allocations()
{
static unordered_map<void*,size_t> static_map;
return static_map;
}
Then use it like this:
void* operator new[](size_t size)
{
void *p=malloc(size);
if (p==0) // did malloc succeed?
throw std::bad_alloc(); // ANSI/ISO compliant behavior
allocations()[p]=size;
return p;
}
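However, you still run the risk of std::unordered_map using your operator new internally: its default allocator obtains memory through the global (non-array) operator new, so if you have replaced that one as well, every insertion into the map re-enters your operator.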
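One possible way to remove that risk is to give the tracking map an allocator that goes straight to malloc/free, so inserting into the map never re-enters a replaced operator. The following is a minimal sketch under assumptions, not the asker's code: MallocAllocator, AllocationMap and tracked_size_of_new_array are hypothetical names, and the map is deliberately never destroyed so that it stays usable during static destruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>
#include <unordered_map>
#include <utility>

// Hypothetical helper: a minimal allocator that bypasses operator new entirely.
template <class T>
struct MallocAllocator {
    using value_type = T;
    MallocAllocator() = default;
    template <class U>
    MallocAllocator(const MallocAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        void* p = std::malloc(n * sizeof(T));
        if (p == nullptr) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};
template <class T, class U>
bool operator==(const MallocAllocator<T>&, const MallocAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const MallocAllocator<T>&, const MallocAllocator<U>&) noexcept { return false; }

using AllocationMap = std::unordered_map<
    void*, std::size_t, std::hash<void*>, std::equal_to<void*>,
    MallocAllocator<std::pair<void* const, std::size_t>>>;

// Built on first use in malloc'ed storage and intentionally never destroyed.
AllocationMap& allocations() {
    static AllocationMap* map = [] {
        void* raw = std::malloc(sizeof(AllocationMap));
        if (raw == nullptr) throw std::bad_alloc();
        return new (raw) AllocationMap();  // placement new: no allocation call
    }();
    return *map;
}

void* operator new[](std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (p == nullptr) throw std::bad_alloc();
    allocations()[p] = size;  // map nodes come from malloc, not operator new
    return p;
}

void operator delete[](void* p) noexcept {
    if (p == nullptr) return;
    allocations().erase(p);
    std::free(p);
}

// Hypothetical demo: returns the size recorded for a 100-byte array.
std::size_t tracked_size_of_new_array() {
    std::size_t before = allocations().size();
    char* p = new char[100];
    std::size_t recorded = allocations().count(p) ? allocations()[p] : 0;
    delete[] p;
    assert(allocations().size() == before);  // entry removed again
    return recorded;
}
```

The same allocator could be reused if the non-array operator new is replaced too, since the map never calls either operator.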

Related

How can overloaded 'operator new' cause infinite loops?

I was reading a book named
"Hands-On System Programming with C++". It says on page 320 that overloading the new operator can cause infinite loops, so it should be avoided.
These overloads affect all allocations, including those used by the C++ library, so care should be taken when leveraging these overloads as infinite cyclic recursions could occur if an allocation is performed inside these functions. For example, data structures such as std::vector and std::list, or debugging functions such as std::cout and std::cerr cannot be used as these facilities use the new() and delete() operators to allocate memory.
So, how can this piece of code cause an infinite loop, and why should I not use cout and vector with it? This was the piece of code in the book. I tried to use vector, cout (inside the new operator), push_back, but can't replicate the situation. So, when exactly can this happen?
void* operator new (size_t size){
if(size > 1000) page_counter++;
return malloc(size);
}
Simply telling a std::vector to allocate some memory in operator new should do it:
#include <cstdlib>
#include <iostream>
#include <new>
#include <vector>
void *operator new(std::size_t size) {
    // std::vector<int>::reserve calls std::allocator<int>::allocate, which calls (this) operator new, which calls ...
    std::vector<int>().reserve(999);
    return std::malloc(size);
}
int main() {
    int *p = new int(42);
    // This is undefined behavior, so we *could* still reach the next line, but in practice the unbounded recursion crashes first
    std::cout << "Shouldn't reach this!\n";
}
Godbolt shows it crashing
Note that a) It's not enough to just construct a std::vector, because that might not allocate. std::vector usually only allocates when you somehow tell it to. It will expand when you try to add things to it, or you can say "be at least this big" with reserve. b) You have to call operator new from somewhere to trigger the loop (here it's within the new in main).
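If allocating work inside operator new cannot be avoided, one common workaround is a re-entrancy guard: nested calls made while the hook is running skip the hook and only allocate. The sketch below is an illustration under assumptions, not code from the book; in_hook, hook_runs, HookScope and hook_runs_for_one_allocation are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

namespace {
thread_local bool in_hook = false;  // true while the hook body is running
std::size_t hook_runs = 0;          // how many times the hook body ran

// Sets the flag for the current scope and clears it again, even on exceptions.
struct HookScope {
    HookScope() { in_hook = true; }
    ~HookScope() { in_hook = false; }
};
}  // namespace

void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (p == nullptr) throw std::bad_alloc();
    if (!in_hook) {
        HookScope scope;
        // This vector allocates through operator new, but the nested call
        // sees in_hook == true and skips the hook, so there is no recursion.
        std::vector<int> scratch;
        scratch.reserve(16);
        ++hook_runs;
    }
    return p;
}

void operator delete(void* p) noexcept { std::free(p); }

// Hypothetical demo: counts hook executions for one direct allocation.
std::size_t hook_runs_for_one_allocation() {
    std::size_t before = hook_runs;
    void* raw = ::operator new(64);
    assert(raw != nullptr);
    ::operator delete(raw);
    return hook_runs - before;
}
```

The guard only prevents the recursion; allocations made from inside the hook are simply not observed by it.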

Bypassing constructor in C++

I am attempting to use C++ for AVR programming using gcc-avr. The main issue is that there is no libc++ available and the implementation does not define any of the new or delete operators. Similarly, there is no <new> header to include, so using placement new is not an option.
When attempting to allocate a new dynamic object I am tempted to just do this:
Class* p = reinterpret_cast<Class*>(malloc(sizeof(Class)));
p->Init();
where Init() manually initializes all internal variables. But is this safe or even possible?
I have read that object construction in C++ is somewhat complex but without new or delete how do I initialize a dynamically allocated object?
To expand on the above question.
Using standard g++ and placement new it is possible to subvert constructors in two ways, assuming that C++ uses the same straightforward ways of aligning memory as C (code example below):
1) Using placement new to initialize any allocated memory.
2) Initializing allocated memory directly using class methods.
Of course this only holds if the following assumptions are true:
1) Memory layout of an object is fixed at compile time.
2) Memory allocation is only concerned with class variables and observes normal C rules (allocated in order of declaration, aligned to memory boundary).
If the above holds, could I not just allocate memory using malloc, use a reinterpret_cast to convert it to the correct class, and initialize it manually? Of course this is both non-portable and hack-ish, but the only other way I can see is to work around the problem and not use dynamically allocated memory at all.
Example:
#include <iostream>
#include <new>
using namespace std;
class A {
    int i;
    long l;
public:
    A() : i(1), l(2) {}
    int get_i() { return i; }
    void set_i(int x) { i = x; }
    long get_l() { return l; }
    void set_l(long y) { l = y; }
};
class B {
    /* Identical to class A, except for the constructor. */
    int i;
    long l;
public:
    B() : i(3), l(4) {}
    int get_i() { return i; }
    void set_i(int x) { i = x; }
    long get_l() { return l; }
    void set_l(long y) { l = y; }
};
int main() {
    A* a = (A*) ::operator new(sizeof(A));
    B* b = (B*) ::operator new(sizeof(B));
    /* Allocating A using B's constructor. */
    a = (A*) new (a) B();
    cout << a->get_i() << endl; // prints 3
    cout << a->get_l() << endl; // prints 4
    /* Setting b directly without constructing */
    b->set_i(5);
    b->set_l(6);
    cout << b->get_i() << endl; // prints 5
    cout << b->get_l() << endl; // prints 6
}
If your allegedly C++ compiler does not support operator new, you should be able to simply provide your own, either in the class or as a global definition. Here's a simple one from an article discussing operator new, slightly modified (and the same can be found in many other places, too):
void* operator new(size_t sz) {
void* mem = malloc(sz);
if (mem)
return mem;
else
throw std::bad_alloc();
}
void operator delete(void* ptr) {
free(ptr);
}
A longer discussion of operator new, in particular for class-specific definitions, can also be found here.
From the comments, it seems that given such a definition, your compiler then happily supports the standard object-on-heap creations like these:
auto a = std::make_shared<A>();
A *pa = new A{};
The problem with using Init methods as shown in the code snippet in your question is that it can be a pain to get that to work properly with inheritance, especially multiple or virtual inheritance, at least when something during object construction might throw. The C++ language has elaborate rules to make sure something useful and predictable happens in that situation with constructors; duplicating that with ordinary functions probably will get tricky rather fast.
Whether you can get away with your malloc()-reinterpret_cast<>-init() approach depends on whether you have virtual functions/inheritance. If there is nothing virtual in your class (it's a plain old datatype), your approach will work.
However, if there is anything virtual in it, your approach will fail miserably: In these cases, C++ adds a v-table to the data layout of your class which you cannot access directly without going very deep into undefined behavior. This v-table pointer is usually set when the constructor is run. Since you can't safely mimic the behavior of the constructor in this regard, you must actually call a constructor. That is, you must use placement-new at the very least.
Providing a global (non-member) operator new(), as Christopher Creutzig suggests, is the easiest way to provide full C++ functionality. It is the function that is used internally by new expressions to provide the memory on which the constructors can be called to provide a fully initialized object.
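To illustrate the point about constructors and v-tables, here is a minimal sketch of malloc combined with placement new, so that a real constructor runs on the raw memory. It assumes a toolchain where placement new is available through <new>; Sensor, FakeSensor, create, destroy and read_through_base are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical class hierarchy with a virtual function, so a v-table pointer
// must be set up by a constructor before the object can be used.
struct Sensor {
    virtual ~Sensor() {}
    virtual int read() const { return 1; }
};

struct FakeSensor : Sensor {
    int value;
    explicit FakeSensor(int v) : value(v) {}
    int read() const override { return value; }
};

// Allocate raw memory with malloc, then run T's constructor in place.
template <class T, class Arg>
T* create(Arg arg) {
    void* raw = std::malloc(sizeof(T));
    if (raw == nullptr) return nullptr;
    return new (raw) T(arg);  // placement new: constructs, does not allocate
}

// Run the destructor explicitly, then hand the memory back to free.
template <class T>
void destroy(T* obj) {
    if (obj == nullptr) return;
    obj->~T();
    std::free(obj);
}

// Hypothetical demo: the virtual call dispatches correctly because the
// constructor initialized the object, including its v-table pointer.
int read_through_base() {
    FakeSensor* f = create<FakeSensor>(7);
    assert(f != nullptr);
    Sensor* s = f;
    int v = s->read();
    destroy(f);
    return v;
}
```

Calling an Init() method on memory that was only cast with reinterpret_cast would skip exactly the step that `new (raw) T(arg)` performs here.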
One last point of assurance: as long as you do not use a variable length array at the end of a struct like this
typedef struct foo {
size_t arraySize;
int array[];
} foo;
the size of any class/struct is entirely a compile time constant.

Can the global new operator be overridden based on allocated object's type traits?

I'm experimenting with upgrading our pooled fixed-block memory allocator to take advantage of C++11 type traits.
Currently it is possible to force any allocation of any object anywhere to be dispatched to the correct pool by overriding the global new operator in the traditional way, eg
void* operator new (std::size_t size)
{ // if-cascade just for simplest possible example
if ( size <= 64 ) { return g_BlockPool64.Allocate(); }
else if ( size <= 256 ) { return g_BlockPool256.Allocate(); }
// etc .. else assume arguendo that we know the following will work properly
else return malloc(size);
}
In many cases we could improve performance further if objects could be dispatched to different pools depending on type traits such as is_trivially_destructible. Is it possible to make a templatized global new operator that is aware of the allocated type, not just a requested size? Something equivalent to
template<class T>
void *operator new( size_t size)
{
if ( size < 64 )
{ return std::is_trivially_destructible<T>::value ?
g_BlockPool64_A.Allocate() :
g_BlockPool64_B.Allocate(); } // etc
}
Overriding the member new operator in every class won't work here; we really need this to automatically work for any allocation anywhere. Placement new won't work either: requiring every alloc to look like
Foo *p = new (mempool<Foo>) Foo();
is too cumbersome and people will forget to use it.
The short answer is no. The allocation/deallocation functions have the following signatures:
void* operator new(std::size_t);
void* operator new[](std::size_t);
void operator delete(void*);
void operator delete[](void*);
Most deviations from these signatures will result in your function not being used at all. In a typical implementation you're basically replacing the default implementations at the linker level -- i.e., the existing function has some particular mangled name. If you provide a function with a name that mangles to an identical result, it'll get linked instead. If your function doesn't mangle to the same name, it won't get linked.
A template like you've suggested might get used in some cases, but if so, it would lead to undefined behavior. Depending on how you arranged headers (for example) you could end up with code mixing the use of your template with the default functions, at which point about the best you could hope for would be that it crash quickly and cleanly.
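Because the replaceable allocation functions only ever see a size, type information has to be carried by something other than the global operator. One option, sketched here under assumptions and subject to the "people will forget to use it" concern raised in the question, is a pair of function templates that pick a pool from the type's traits. CountingPool, g_trivialPool, g_nonTrivialPool, pool_for, pool_new and pool_delete are hypothetical stand-ins, not the asker's allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a real fixed-block pool; it only counts requests.
struct CountingPool {
    std::size_t allocations = 0;
    void* Allocate(std::size_t size) { ++allocations; return std::malloc(size); }
    void Free(void* p) { std::free(p); }
};

CountingPool g_trivialPool;     // for trivially destructible types
CountingPool g_nonTrivialPool;  // for everything else

// Select the pool at compile time from the type trait.
template <class T>
CountingPool& pool_for() {
    return std::is_trivially_destructible<T>::value ? g_trivialPool
                                                    : g_nonTrivialPool;
}

// Type-aware replacement for a new-expression.
template <class T, class... Args>
T* pool_new(Args&&... args) {
    void* raw = pool_for<T>().Allocate(sizeof(T));
    if (raw == nullptr) throw std::bad_alloc();
    try {
        return new (raw) T(std::forward<Args>(args)...);
    } catch (...) {
        pool_for<T>().Free(raw);  // constructor threw: release the block
        throw;
    }
}

// Type-aware replacement for a delete-expression.
template <class T>
void pool_delete(T* p) {
    if (p == nullptr) return;
    p->~T();
    pool_for<T>().Free(p);
}

// Hypothetical demo: an int and a std::string land in different pools.
bool dispatches_by_trait() {
    std::size_t trivialBefore = g_trivialPool.allocations;
    std::size_t nonTrivialBefore = g_nonTrivialPool.allocations;
    int* a = pool_new<int>(5);
    std::string* b = pool_new<std::string>("hello");
    bool ok = g_trivialPool.allocations == trivialBefore + 1 &&
              g_nonTrivialPool.allocations == nonTrivialBefore + 1 &&
              *a == 5 && *b == "hello";
    pool_delete(a);
    pool_delete(b);
    return ok;
}
```

This does not intercept plain new-expressions; those still go through the size-only global operator.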

How do I properly overload new?

My code is below. However, before main() is run, something as simple as a static std::string globalvar; will call new, before MyPool mypool is initialized.
MyPool mypool;
void* operator new(size_t s) { return mypool.donew(s); }
Is there any way I can force mypool to be initialized first? I have no idea how overloading new is supposed to work if there is no way to initialize its values, so I am sure there is a solution to this.
I am using both Visual Studio 2010 and gcc (cross-platform).
Make mypool a static variable of your operator new function:
void* operator new(size_t s) {
static MyPool mypool;
return mypool.donew(s);
}
It will be initialized upon first call of the function (i.e. the operator new).
EDIT: As the commenters pointed out, declaring the variable as static in the operator new functions limits its scope and makes it inaccessible in the operator delete. To fix that, you should make an accessor function for your pool object:
MyPool& GetMyPool() {
static MyPool mypool;
return mypool;
}
and invoke it in both operator new and operator delete:
void* operator new(size_t s) {
return GetMyPool().donew(s);
}
// similarly for delete
As before, declaring it as a static local variable guarantees initialization upon the first invocation of the GetMyPool function. Additionally, it will be the same pool object in both operators, which is likely what you want.
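Put together, the accessor approach might look like the sketch below. The MyPool shown here is a hypothetical stand-in that forwards to malloc and counts live blocks; dodelete and live_after_one_allocation are assumed names, since the question only shows donew.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the asker's pool: forwards to malloc/free and
// keeps a count of live blocks so the demo has something to observe.
struct MyPool {
    std::size_t live;
    MyPool() : live(0) {}
    void* donew(std::size_t s) {
        void* p = std::malloc(s ? s : 1);
        if (p == nullptr) throw std::bad_alloc();
        ++live;
        return p;
    }
    void dodelete(void* p) {  // assumed counterpart to donew
        if (p == nullptr) return;
        --live;
        std::free(p);
    }
};

// Constructed the first time either operator needs it, however early.
MyPool& GetMyPool() {
    static MyPool mypool;
    return mypool;
}

void* operator new(std::size_t s) { return GetMyPool().donew(s); }
void operator delete(void* p) noexcept { GetMyPool().dodelete(p); }

// Hypothetical demo: how many more blocks are live after one allocation.
std::size_t live_after_one_allocation() {
    std::size_t before = GetMyPool().live;
    void* raw = ::operator new(32);
    std::size_t delta = GetMyPool().live - before;
    ::operator delete(raw);
    assert(GetMyPool().live == before);  // both operators share one pool
    return delta;
}
```

A pool with a non-trivial destructor would need extra care, because a function-local static is destroyed at exit while other static objects may still free memory afterwards.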
Properly? Best don't. Try Boost.Pool, and just use their allocation mechanics. Or if you insist on using your pool, make a new allocation function. I've seen horrible things done to the operator new, and I'm feeling sorry for it. :(
IMHO, the only time you should overload new is when implementing a memory manager for observation of the allocs / deallocs. Otherwise, just write your own functions and use them instead. Or for most containers, you can give them allocators.
Global initialization occurs in three steps: zero initialization, static initialization and dynamic initialization, in that order. If your operator new uses non-local variables, these variables must depend only on zero or static initialization; as you said, you cannot guarantee that your operator new won't be called before any particular variable with dynamic initialization has been initialized.
If you need objects with dynamic initialization (often the case), there are two ways of handling this:
1) declare a pointer to the object, rather than the object itself, and in operator new, check if the pointer is null, and initialize it there, or
2) call a function which returns a reference to a local instance.
The pointer approach is not thread safe, and neither is the local instance on a pre-C++11 compiler (since C++11, initialization of a function-local static is thread safe), but that's likely not a problem. Both are thread safe once the first call returns, so if there is any invocation of new before threading starts, you're OK. (It's something to keep in mind, however. If unsure, you can always allocate and delete an object manually before the first thread is started, perhaps in the initialization of a static object.)
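The first of the two options, the null-checked pointer, can be sketched as follows. The pointer itself only needs zero initialization, and the object behind it is built with malloc plus placement new so that creating it does not call the replaced operator new again. AllocationStats, g_stats, stats and calls_for_one_allocation are hypothetical names, and the sketch makes no attempt at thread safety.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical object that would normally require dynamic initialization.
struct AllocationStats {
    std::size_t calls;
    AllocationStats() : calls(0) {}
};

// A plain pointer is zero-initialized before any dynamic initialization runs,
// so it is safe to test even during static construction.
static AllocationStats* g_stats = nullptr;

static AllocationStats& stats() {
    if (g_stats == nullptr) {
        // malloc + placement new: constructing the object here does not
        // re-enter the replaced operator new below.
        void* raw = std::malloc(sizeof(AllocationStats));
        if (raw == nullptr) throw std::bad_alloc();
        g_stats = new (raw) AllocationStats();
    }
    return *g_stats;
}

void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (p == nullptr) throw std::bad_alloc();
    ++stats().calls;
    return p;
}

void operator delete(void* p) noexcept { std::free(p); }

// Hypothetical demo: one direct allocation bumps the counter by one.
std::size_t calls_for_one_allocation() {
    std::size_t before = stats().calls;
    void* raw = ::operator new(16);
    assert(raw != nullptr);
    std::size_t after = stats().calls;
    ::operator delete(raw);
    return after - before;
}
```

The object is never destroyed, which keeps it valid for allocations made while other static objects are being torn down.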

How do I prevent a class from being allocated via the 'new' operator? (I'd like to ensure my RAII class is always allocated on the stack.)

I'd like to ensure my RAII class is always allocated on the stack.
How do I prevent a class from being allocated via the 'new' operator?
All you need to do is declare the class' new operator private:
class X
{
private:
// Prevent heap allocation
void * operator new (size_t);
void * operator new[] (size_t);
void operator delete (void *);
void operator delete[] (void*);
// ...
// The rest of the implementation for X
// ...
};
Making 'operator new' private effectively prevents code outside the class from using 'new' to create an instance of X.
To complete things, you should hide 'operator delete' and the array versions of both operators.
Since C++11 you can also explicitly delete the functions:
class X
{
// public, protected, private ... does not matter
static void *operator new (size_t) = delete;
static void *operator new[] (size_t) = delete;
static void operator delete (void*) = delete;
static void operator delete[](void*) = delete;
};
Related Question: Is it possible to prevent stack allocation of an object and only allow it to be instantiated with ‘new’?
I'm not convinced of your motivation.
There are good reasons to create RAII classes on the free store.
For example, I have an RAII lock class. I have a path through the code where the lock is only necessary if certain conditions hold (it's a video player, and I only need to hold the lock during my render loop if I've got a video loaded and playing; if nothing's loaded, I don't need it). The ability to create locks on the free store (with a unique_ptr) is therefore very useful; it allows me to use the same code path regardless of whether I have to take out the lock.
i.e. something like this:
unique_ptr<lock> l;
if(needs_lock)
{
l.reset(new lock(mtx));
}
render();
If I could only create locks on the stack, I couldn't do that....
@DrPizza:
That's an interesting point you have. Note though that there are some situations where the RAII idiom isn't necessarily optional.
Anyway, perhaps a better way to approach your dilemma is to add a parameter to your lock constructor that indicates whether the lock is needed. For example:
class optional_lock
{
mutex& m;
bool dolock;
public:
optional_lock(mutex& m_, bool dolock_)
: m(m_)
, dolock(dolock_)
{
if (dolock) m.lock();
}
~optional_lock()
{
if (dolock) m.unlock();
}
};
Then you could write:
optional_lock l(mtx, needs_lock);
render();
In my particular situation, if the lock isn't necessary the mutex doesn't even exist, so I think that approach would be rather harder to fit.
I guess the thing I'm really struggling to understand is the justification for prohibiting creation of these objects on the free store.