Compiler or Standard C++ Library - new and delete - c++

I am developing C++ code for software (a kernel) without any library. I am confused about the new and delete operators. I have implemented KMalloc() and KFree(). Now, I want to know if the following code will work without any Standard C++ Library.
void *mem = KMalloc(sizeof(__type__));
Object *obj = new (mem) ();
If this will not work, then how do I set up the vtables, or whatever object structure there is, in preallocated space without any standard library?

You should first decide which C++ standard you are targeting. I guess it is at least C++11.
Then, if you code in C++ for some operating system kernel, beware and study carefully the relevant ABI specifications (the details depend even on the version of your C++ compiler, and gory details such as exception handling and stack unwinding matter a lot).
Notice that the Linux kernel ABI is not C++ friendly (it is not the same as the Linux user-land ABI for x86-64). So coding in C++ for the Linux kernel is not reasonable.
You probably want
void *mem = KMalloc(sizeof(Object));
Object *obj = new (mem) Object();
The second statement uses the placement new feature of C++, which runs the constructor on the (more or less "uninitialized") memory zone passed as placement.
(notice that bitwise copy of C++ objects, e.g. with memcpy, is undefined behavior in general (except for PODs, or more precisely trivially copyable types); you need to use constructors and assignment operators)
There is no "placement delete", but you can explicitly run the destructor: obj->~Object(), after which any use of the object pointed to by obj is undefined behavior. You can then release the storage with KFree(mem).
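A minimal sketch of the whole lifecycle described above: allocate, construct in place, destroy explicitly, free. Here `KMalloc` and `KFree` are stand-ins built on `malloc`/`free` so the snippet compiles in user space, and `Object`, `g_live_objects` and `lifecycle_demo` are hypothetical names. It assumes the `<new>` header is available; some bare-metal toolchains omit it, in which case the placement form has to be declared by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>      // declares the placement form of operator new

// Stand-ins for the kernel allocator (assumption: the real ones have this shape).
void* KMalloc(std::size_t size) { return std::malloc(size); }
void  KFree(void* ptr)          { std::free(ptr); }

static int g_live_objects = 0;  // constructed-but-not-yet-destroyed objects

struct Object {
    int value;
    Object() : value(42) { ++g_live_objects; }
    ~Object()            { --g_live_objects; }
};

// Returns the value observed while the object was alive.
int lifecycle_demo() {
    void* mem = KMalloc(sizeof(Object));
    assert(mem != nullptr);

    Object* obj = new (mem) Object();  // runs the constructor in place
    int seen = obj->value;

    obj->~Object();                    // runs the destructor, frees nothing
    KFree(mem);                        // releases the raw storage
    return seen;
}
```

The destructor call and the `KFree()` call are deliberately separate steps: the first ends the object's lifetime, the second gives the bytes back.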
Now, I want to know if that code will work without any Standard C++ Library.
It might be much harder than what you believe. You need to understand all the details of the ABI targeted by your compiler, and that is hard.
Notice that properly running constructors (and destructors), in a good enough order, is of paramount importance for C++; practically speaking, they notably initialize the (implicit) vtable field[s], without which your program can crash (as soon as any virtual member function or destructor gets called).
Read also about the rule of five (for C++11).
Coding your own kernel in C++ practically requires understanding a lot of details about your C++ implementation (and ABI).
NB: practically speaking, bitwise copy with memcpy of smart pointers, of std::stream-s, of std::mutex-es, of std::thread-s - and perhaps even of standard containers and of std::string-s etc...- is very likely to make a disaster. If you dare doing such bad things, you really need to look into the details of your particular implementations...
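One way to keep accidental bitwise copies out of a code base is to route every `memcpy` of an object through a helper that refuses to compile for unsuitable types. The following is only a sketch: `bitwise_copy`, `Point` and `copy_demo` are hypothetical names, and it assumes `<type_traits>` (C++11) is usable in your environment.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Copies the object representation, but only compiles for types where
// the language permits a bitwise copy.
template <typename T>
void bitwise_copy(T& dst, const T& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    std::memcpy(&dst, &src, sizeof(T));
}

struct Point { int x; int y; };  // trivially copyable

int copy_demo() {
    Point a{3, 4};
    Point b{0, 0};
    bitwise_copy(b, a);
    return b.x + b.y;
}
```

Calling `bitwise_copy` on something like a smart pointer or a container then fails at compile time instead of corrupting state at run time.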

In addition to what other answers have already said, you might want to overload operator new and operator delete so that you do not need to do the KMalloc() plus placement new trick all the time.
// In the global namespace.
void* operator new
(
size_t size
)
{
/* You might also check for `KMalloc()`'s return value and throw
* an exception like the standard `operator new`. This, however,
* requires kernel-mode exception support, which is not that easy
* to get up and running.
*/
return KMalloc( size );
}
void* operator new[]
(
size_t size
)
{
return KMalloc( size );
}
void operator delete
(
void* what
)
{
KFree( what );
}
void operator delete[]
(
void* what
)
{
KFree( what );
}
Then, code like the following will work by calling your KMalloc() and KFree() routines when necessary, along with all necessary constructors like placement new would do.
template<typename Type>
class dumb_smart_pointer
{
public:
dumb_smart_pointer()
: pointer( nullptr )
{}
explicit dumb_smart_pointer
(
Type* pointer
)
: pointer( pointer )
{}
~dumb_smart_pointer()
{
if( this->pointer != nullptr )
{
delete this->pointer;
}
}
Type& operator*()
{
return *this->pointer;
}
Type* operator->()
{
return this->pointer;
}
private:
Type* pointer;
};
dumb_smart_pointer<int> my_pointer( new int( 123 ) ); // direct initialization: the constructor is explicit
*my_pointer += 42;
KConsoleOutput << *my_pointer << '\n';
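A class like this is still copyable by default, and a copy would delete the same pointer twice. As a hedged sketch of how the rule of five applies, the variant below forbids copying and allows moving. The name `owning_pointer`, the `empty()` helper and `move_demo` are hypothetical additions, and the code relies on whatever `operator new`/`operator delete` are in effect.

```cpp
#include <cassert>
#include <utility>

template <typename Type>
class owning_pointer {
public:
    explicit owning_pointer(Type* p = nullptr) : pointer(p) {}
    ~owning_pointer() { delete pointer; }   // delete on nullptr is a no-op

    // Copying would lead to a double delete, so forbid it.
    owning_pointer(const owning_pointer&) = delete;
    owning_pointer& operator=(const owning_pointer&) = delete;

    // Moving transfers ownership and leaves the source empty.
    owning_pointer(owning_pointer&& other) noexcept : pointer(other.pointer) {
        other.pointer = nullptr;
    }
    owning_pointer& operator=(owning_pointer&& other) noexcept {
        if (this != &other) {
            delete pointer;
            pointer = other.pointer;
            other.pointer = nullptr;
        }
        return *this;
    }

    Type& operator*()  { return *pointer; }
    Type* operator->() { return pointer; }
    bool empty() const { return pointer == nullptr; }

private:
    Type* pointer;
};

int move_demo() {
    owning_pointer<int> a(new int(123));
    owning_pointer<int> b(std::move(a));  // a is now empty
    *b += 42;
    return a.empty() ? *b : -1;
}
```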

Related

Bypassing constructor in C++

I am attempting to use C++ for AVR programming using gcc-avr. The main issue is that there is no libc++ available and the implementation does not define any of the new or delete operators. Similarly, there is no <new> header to include, so using placement new is not an option.
When attempting to allocate a new dynamic object I am tempted to just do this:
Class* p = reinterpret_cast<Class*>(malloc(sizeof(Class)));
p->Init();
where Init() manually initializes all internal variables. But is this safe or even possible?
I have read that object construction in C++ is somewhat complex but without new or delete how do I initialize a dynamically allocated object?
To expand on the above question.
Using standard g++ and placement new, it is possible to subvert constructors in two ways, assuming that C++ uses the same straightforward way of aligning memory as C (code example below).
Using placement new to initialize any allocated memory.
Initialize allocated memory directly using class methods.
Of course this only holds if the assumptions are true that:
Memory layout of an object is fixed at compile time.
Memory allocation is only concerned with class variables and observes normal C rules (allocated in order of declaration, aligned to memory boundaries).
If the above holds, could I not just allocate memory using malloc, use a reinterpret_cast to convert it to the correct class, and initialize it manually? Of course this is both non-portable and hack-ish, but the only other way I can see is to work around the problem and not use dynamically allocated memory at all.
Example:
class A {
int i;
long l;
public:
A() : i(1), l(2) {}
int get_i() { return i; }
void set_i(int x) { i = x; }
long get_l() { return l; }
void set_l(long y) { l = y; }
};
class B {
/* Identical to class A, except for the constructor. */
public:
B() : i(3), l(4) {}
};
int main() {
A* a = (A*) ::operator new(sizeof(A));
B* b = (B*) ::operator new(sizeof(B));
/* Allocating A using B's constructor. */
a = (A*) new (a) B();
cout << a->get_i() << endl; // prints 3
cout << a->get_l() << endl; // prints 4
/* Setting b directly without constructing */
b->set_i(5);
b->set_l(6);
cout << b->get_i() << endl; // prints 5
cout << b->get_l() << endl; // prints 6
}
If your allegedly C++ compiler does not support operator new, you should be able to simply provide your own, either in the class or as a global definition. Here's a simple one from an article discussing operator new, slightly modified (and the same can be found in many other places, too):
void* operator new(size_t sz) {
void* mem = malloc(sz);
if (mem)
return mem;
else
throw std::bad_alloc();
}
void operator delete(void* ptr) {
free(ptr);
}
A longer discussion of operator new, in particular for class-specific definitions, can also be found here.
From the comments, it seems that given such a definition, your compiler then happily supports the standard object-on-heap creations like these:
auto a = std::make_shared<A>();
A *pa = new A{};
The problem with using Init methods as shown in the code snippet in your question is that it can be a pain to get that to work properly with inheritance, especially multiple or virtual inheritance, at least when something during object construction might throw. The C++ language has elaborate rules to make sure something useful and predictable happens in that situation with constructors; duplicating that with ordinary functions probably will get tricky rather fast.
Whether you can get away with your malloc()-reinterpret_cast<>-init() approach depends on whether you have virtual functions/inheritance. If there is nothing virtual in your class (it's a plain old datatype), your approach will work.
However, if there is anything virtual in it, your approach will fail miserably: In these cases, C++ adds a v-table to the data layout of your class which you cannot access directly without going very deep into undefined behavior. This v-table pointer is usually set when the constructor is run. Since you can't safely mimic the behavior of the constructor in this regard, you must actually call a constructor. That is, you must use placement-new at the very least.
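If no `<new>` header exists, one possible workaround is to give the class its own placement form of `operator new`, which only has to return the pointer it was handed. The sketch below assumes `malloc`/`free` are available; `Shape`, `Triangle` and `placement_demo` are hypothetical names. Declaring this member hides the ordinary `new Triangle` form for that class, which is acceptable here because all storage comes from `malloc`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct Shape {
    virtual int sides() const { return 0; }
};

struct Triangle : Shape {
    // Class-scope placement form: hands back the caller's storage unchanged.
    static void* operator new(std::size_t, void* where) noexcept { return where; }

    int sides() const override { return 3; }
};

int placement_demo() {
    void* mem = std::malloc(sizeof(Triangle));
    assert(mem != nullptr);

    // The constructor runs here; on typical implementations this is
    // where the vtable pointer gets set.
    Shape* s = new (mem) Triangle();
    int n = s->sides();                      // virtual dispatch works

    static_cast<Triangle*>(s)->~Triangle();  // end the lifetime explicitly
    std::free(mem);
    return n;
}
```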
Providing a classless operator new() as Christopher Creutzig suggests, is the easiest way to provide full C++ functionality. It is the function that is used internally by new expressions to provide the memory on which the constructors can be called to provide a fully initialized object.
One last point of assurance: as long as you do not use a variable length array at the end of a struct like this
typedef struct foo {
size_t arraySize;
int array[];
} foo;
the size of any class/struct is entirely a compile time constant.

How to write a C wrapper for delete that would be fast yet free any type given to it with out telling it what type

I have a whole list of C wrappers for OpenCV C++ functions like the one below, and all of them return a "new". I can't change them because they are becoming part of OpenCV, and it would make my library perfect to have a consistently updated skeleton to wrap around.
Mat* cv_create_Mat() {
return new Mat();
}
I can't rewrite the C wrapper for the C++ function, so I wrote a delete wrapper like the one below. The memory I'm trying to free is a Mat*; Mat is an OpenCV C++ class. The delete wrapper below works, and there is no memory leakage at all.
I have a lot of other C wrappers for OpenCV C++ functions, though, that return a new pointer. There are at least 10 or 15, and my intention is to not have to write a separate delete wrapper for each of them. If you can show me how to write one delete wrapper that is fast and frees any pointer without having to be told which type to free, that would be awesome.
In a nutshell: I have CvSVMParams*, Brisk*, RotatedRect* and CVANN_MLP* pointers, and a few others as well, that all need to be freed with one wrapper, a go-to wrapper for C++'s delete that would free anything. Any help with this is greatly valued.
void delete_ptr(void* ptr) {
delete (Mat*)ptr;
}
Edit: I need one of the two of you I messaged to tell me exactly how to run your posted code. The registry version doesn't work when I place it above main in Emacs, compile with g++ and run it with Free(cv_create_Mat); and the version with a new Mat* creator and stub gives 5 error messages when run the same way. I need exact compile instructions. My intention is to be able to compile this to a .so file. This post has really received a lot of attention, though, and I do appreciate it. Thank you.
How about this, and then let the compiler deal with all the specializations:
template <typename T>
void delete_ptr(T *ptr) {
delete ptr;
}
The delete operator doesn't just free memory, it also calls destructors, and it has to be called on a typed pointer (not void*) so that it knows which class's destructor to call. You'll need a separate wrapper for each type.
For POD types that don't have destructors, you can allocate with malloc() instead of new, so that the caller can just use free().
I would advise against having a generic delete_ptr function.
Since creation and deletion come in pairs, I would create one for creation and for deletion of specific types.
Mat* cv_create_Mat();
void cv_delete_Mat(Mat*);
If you do this, there will be less ambiguity about the kind of object you are dealing with. Plus, the implementation of cv_delete_Mat(Mat*) will be less error prone and has to assume less.
void cv_delete_Mat(Mat* m)
{
delete m;
}
Generic operations like this can only be implemented in C by removing type information (void*) or by individually ensuring all of the wrapper functions exist.
C's ABI doesn't allow function overloading, and the C++ delete keyword is exactly the sort of generic wrapper you are asking for.
That said, there are things you can do, but none of them are any simpler than what you are already proposing. Any generic C++ code you write will be uncallable from C.
You could add members to your objects which know how to destroy themselves, e.g.
class CVersionOfFoo : public Foo {
...
static void deleter(CVersionOfFoo* p) { delete p; }
};
But that's not accessible from C.
Your last option is to set up some form of manual registry, where objects register their pointer along with a delete function. But that's going to be more work and harder to debug than just writing wrappers.
--- EDIT ---
Registry example; if you have C++11:
#include <functional>
#include <map>
/* not thread safe */
typedef std::map<void*, std::function<void(void*)>> ObjectMap;
static ObjectMap s_objectMap;
struct Foo {
int i;
};
struct Bar {
char x[30];
};
template<typename T>
T* AllocateAndRegister() {
T* t = new T();
s_objectMap[t] = [](void* ptr) { delete reinterpret_cast<T*>(ptr); };
return t;
}
Foo* AllocateFoo() {
return AllocateAndRegister<Foo>();
}
Bar* AllocateBar() {
return AllocateAndRegister<Bar>();
}
void Free(void* ptr) {
auto it = s_objectMap.find(ptr);
if (it != s_objectMap.end()) {
it->second(ptr); // call the lambda
s_objectMap.erase(it);
}
}
If you don't have C++11... You'll have to create a delete function.
As I said, it's more work than the wrappers you were creating.
It's not a case of C++ can't do this - C++ is perfectly capable of doing this, but you're trying to do this in C and C does not provide facilities for doing this automatically.
The core problem is that delete in C++ requires a type, and passing a pointer through a C interface loses that type. The question is how to recover that type in a generic way. Here are some choices.
Bear in mind that delete does two things: call the destructor and free the memory.
1. Separate functions for each type. Last resort, what you're trying to avoid.
2. For types that have a trivial destructor, you can cast your void pointer to anything you like because all it does is free the memory. That reduces the number of functions. [This is undefined behaviour, but it should work.]
3. Use run-time type information to recover the type_info of the pointer, and then dynamic cast it to the proper type to delete.
4. Modify your create functions to store the pointer in a dictionary with its type_info. On delete, retrieve the type and use it with dynamic cast to delete the pointer.
For all that I would probably use option 1 unless there were hundreds of the things. You could write a C++ template with explicit instantiation to reduce the amount of code needed, or a macro with token pasting to generate unique names. Here is an example (edited):
#define stub(T) T* cv_create_ ## T() { return new T(); } \
void cv_delete_ ## T(void *p) { delete (T*)p; }
stub(Mat);
stub(Brisk);
One nice thing about the dictionary approach is for debugging. You can track new and delete at run-time and make sure they match correctly. I would pick this option if the debugging was really important, but it takes more code to do.
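For reference, here is a compilable sketch of the macro approach with a type whose destructor can be observed. `Gadget`, `g_stub_dtor_calls` and `stub_demo` are hypothetical, and the `extern "C"` linkage is an addition that assumes the wrappers are meant to be linked from C code.

```cpp
#include <cassert>

static int g_stub_dtor_calls = 0;
struct Gadget { ~Gadget() { ++g_stub_dtor_calls; } };  // hypothetical type

// One create/delete pair per type, generated by token pasting.
// The cast back to T* is what lets delete run the right destructor.
#define STUB(T)                                                      \
    extern "C" T*   cv_create_##T()        { return new T(); }       \
    extern "C" void cv_delete_##T(void* p) { delete static_cast<T*>(p); }

STUB(Gadget)

int stub_demo() {
    Gadget* g = cv_create_Gadget();
    cv_delete_Gadget(g);       // runs ~Gadget(), then frees the memory
    return g_stub_dtor_calls;
}
```

Each additional type then costs one `STUB(...)` line, which is about as close to a single generic wrapper as a C interface allows.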

Can the global new operator be overridden based on allocated object's type traits?

I'm experimenting with upgrading our pooled fixed-block memory allocator to take advantage of C++11 type traits.
Currently it is possible to force any allocation of any object anywhere to be dispatched to the correct pool by overriding the global new operator in the traditional way, eg
void* operator new (std::size_t size)
{ // if-cascade just for simplest possible example
if ( size <= 64 ) { return g_BlockPool64.Allocate(); }
else if ( size <= 256 ) { return g_BlockPool256.Allocate(); }
// etc .. else assume arguendo that we know the following will work properly
else return malloc(size);
}
In many cases we could improve performance further if objects could be dispatched to different pools depending on type traits such as is_trivially_destructible. Is it possible to make a templatized global new operator that is aware of the allocated type, not just a requested size? Something equivalent to
template<class T>
void *operator new( size_t size)
{
if ( size < 64 )
{ return std::is_trivially_destructible<T>::value ?
g_BlockPool64_A.Allocate() :
g_BlockPool64_B.Allocate(); } // etc
}
Overriding the member new operator in every class won't work here; we really need this to automatically work for any allocation anywhere. Placement new won't work either: requiring every alloc to look like
Foo *p = new (mempool<Foo>) Foo();
is too cumbersome and people will forget to use it.
The short answer is no. The allocation/deallocation functions have the following signatures:
void* operator new(std::size_t);
void* operator new[](std::size_t);
void operator delete(void*);
void operator delete[](void*);
Most deviations from these signatures will result in your function not being used at all. In a typical implementation you're basically replacing the default implementations at the linker level -- i.e., the existing function has some particular mangled name. If you provide a function with a name that mangles to an identical result, it'll get linked instead. If your function doesn't mangle to the same name, it won't get linked.
A template like you've suggested might get used in some cases, but if so, it would lead to undefined behavior. Depending on how you arranged headers (for example) you could end up with code mixing the use of your template with the default functions, at which point about the best you could hope for would be that it crash quickly and cleanly.
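Since the replaceable global functions only ever see a size, type-aware dispatch has to happen in code that still knows `T`. One option, sketched below under stated assumptions, is a pair of function templates used instead of plain `new`/`delete`. `CountingPool`, `pool_for`, `pool_new` and `pool_delete` are hypothetical names, and the toy pool merely counts allocations and forwards to `malloc`. This does not make a bare `new Foo` type-aware; callers have to opt in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>
#include <utility>

// Toy pool that only counts allocations; a real one would hand out fixed blocks.
struct CountingPool {
    int allocations = 0;
    void* Allocate(std::size_t size) { ++allocations; return std::malloc(size); }
    void  Free(void* p)              { std::free(p); }
};

CountingPool g_trivial_pool;     // for trivially destructible types
CountingPool g_nontrivial_pool;  // for everything else

template <typename T>
CountingPool& pool_for() {
    return std::is_trivially_destructible<T>::value ? g_trivial_pool
                                                    : g_nontrivial_pool;
}

// Typed stand-in for `new T(args...)`.
template <typename T, typename... Args>
T* pool_new(Args&&... args) {
    void* mem = pool_for<T>().Allocate(sizeof(T));
    assert(mem != nullptr);
    return new (mem) T(std::forward<Args>(args)...);
}

// Typed stand-in for `delete p`.
template <typename T>
void pool_delete(T* p) {
    if (!p) return;
    p->~T();
    pool_for<T>().Free(p);
}

struct Plain { int x; };          // trivially destructible
struct Owner { ~Owner() {} };     // user-provided destructor

int dispatch_demo() {
    Plain* a = pool_new<Plain>();
    Owner* b = pool_new<Owner>();
    pool_delete(a);
    pool_delete(b);
    return g_trivial_pool.allocations * 10 + g_nontrivial_pool.allocations;
}
```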

Why does this implementation of the C++ 'new' operator work?

I've found out that the C++ compiler for AVR uCs doesn't support the new and delete operators, but also that there is a quick fix:
void * operator new(size_t size)
{
return malloc(size);
}
void operator delete(void * ptr)
{
free(ptr);
}
I'm assuming that it would now be possible to call new ClassName(args);.
However, I am not really sure how this works. For example, what actually returns a size_t here? I thought that constructors don't return anything...
Could it be that new is now supposed to be used differently (in conjunction with sizeof())?
new T(args); is roughly equivalent to the following.
void* storage = operator new(sizeof(T)); // obtain raw storage
call_constructor<T>(storage, args); // make an object in it
(Here call_constructor is supposed to call the constructor† of T making storage be the this pointer within that constructor.)
The operator new part obtains the requested amount of raw storage, and the constructor call is the one that actually makes an object, by invoking the constructor.
The code in the question only replaces the operator new part, i.e. the retrieval of storage. Both the sizeof part and the constructor invocation are done automatically by the compiler when you use new T(args).
† The language has a way to express this direct constructor invocation called "placement new", but I omitted it for clarity.
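The two steps can be written out by hand, which may make the division of labour easier to see. This is a sketch, assuming `<new>` is available for the placement form; `Account`, `manual_new`, `manual_delete` and `two_step_demo` are hypothetical names. Unlike a real new-expression, `manual_new` does not release the storage if the constructor throws.

```cpp
#include <cassert>
#include <new>

struct Account {
    int balance;
    explicit Account(int start) : balance(start) {}
};

// Roughly what `new Account(start)` does.
Account* manual_new(int start) {
    void* storage = ::operator new(sizeof(Account));  // 1. obtain raw storage
    return new (storage) Account(start);              // 2. construct in it
}

// Roughly what `delete p` does.
void manual_delete(Account* p) {
    if (!p) return;
    p->~Account();          // 1. destroy the object
    ::operator delete(p);   // 2. release the storage
}

int two_step_demo() {
    Account* a = manual_new(100);
    int seen = a->balance;
    manual_delete(a);
    return seen;
}
```

Replacing the global `operator new` and `operator delete` changes only the storage lines marked 1 in `manual_new` and 2 in `manual_delete`; the constructor and destructor calls are untouched.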
From the compiler name (uC), I presume it's for embedded controller. This would make sense as you rarely require dynamic memory management with embedded devices, but might benefit from 'C with classes'. Hopefully it supports 'placement new' so you can actually use C++.
If your compiler doesn't support new & delete, it's not really much of a C++ compiler, is it?
I think the keyword 'new' effectively gets converted to:
Object* pointer = (Object *)operator new(sizeof(Object));
pointer->Object_Constructor(args); // pseudocode: the compiler emits the constructor call here

Issues with C++ 'new' operator?

I've recently come across this rant.
I don't quite understand a few of the points mentioned in the article:
The author mentions the small annoyance of delete vs delete[], but seems to argue that it is actually necessary (for the compiler), without ever offering a solution. Did I miss something?
In the section 'Specialized allocators', in function f(), it seems the problems can be solved with replacing the allocations with: (omitting alignment)
// if you're going to the trouble to implement an entire Arena for memory,
// making an arena_ptr won't be much work. basically the same as an auto_ptr,
// except that it knows which arena to deallocate from when destructed.
arena_ptr<char> string(a); string.allocate(80);
// or: arena_ptr<char> string; string.allocate(a, 80);
arena_ptr<int> intp(a); intp.allocate();
// or: arena_ptr<int> intp; intp.allocate(a);
arena_ptr<foo> fp(a); fp.allocate();
// or: arena_ptr<foo> fp; fp.allocate(a);
// use templates in 'arena.allocate(...)' to determine that foo has
// a constructor which needs to be called. do something similar
// for destructors in '~arena_ptr()'.
In 'Dangers of overloading ::operator new[]', the author tries to do a new(p) obj[10]. Why not this instead (far less ambiguous):
obj *p = (obj *)special_malloc(sizeof(obj[10]));
for(int i = 0; i < 10; ++i, ++p)
new(p) obj;
'Debugging memory allocation in C++'. Can't argue here.
The entire article seems to revolve around classes with significant constructors and destructors located in a custom memory management scheme. While that could be useful, and I can't argue with it, it's pretty limited in commonality.
Basically, we have placement new and per-class allocators -- what problems can't be solved with these approaches?
Also, in case I'm just thick-skulled and crazy, in your ideal C++, what would replace operator new? Invent syntax as necessary -- what would be ideal, simply to help me understand these problems better.
Well, the ideal would probably be to not need delete of any kind. Have a garbage-collected environment, let the programmer avoid the whole problem.
The complaints in the rant seem to come down to
"I liked the way malloc does it"
"I don't like being forced to explicitly create objects of a known type"
He's right about the annoying fact that you have to implement both new and new[], but you're forced into that by Stroustrup's desire to maintain the core of C's semantics. Since you can't tell a pointer from an array, you have to tell the compiler yourself. You could fix that, but doing so would mean changing the semantics of the C part of the language radically; you could no longer make use of the identity
*(a+i) == a[i]
which would break a very large subset of all C code.
So, you could have a language which
implements a more complicated notion of an array, and eliminates the wonders of pointer arithmetic, implementing arrays with dope vectors or something similar.
is garbage collected, so you don't need your own delete discipline.
Which is to say, you could download Java. You could then extend that by changing the language so it
isn't strongly typed, so type checking the void * upcast is eliminated,
...but that means that you can write code that transforms a Foo into a Bar without the compiler seeing it. This would also enable ducktyping, if you want it.
The thing is, once you've done those things, you've got Python or Ruby with a C-ish syntax.
I've been writing C++ since Stroustrup sent out tapes of cfront 1.0; a lot of the history involved in C++ as it is now comes out of the desire to have an OO language that could fit into the C world. There were plenty of other, more satisfying, languages that came out around the same time, like Eiffel. C++ seems to have won. I suspect that it won because it could fit into the C world.
The rant, IMHO, is very misleading and it seems to me that the author does understand the finer details, it's just that he appears to want to mislead. IMHO, the key point that shows the flaw in argument is the following:
void* operator new(std::size_t size, void* ptr) throw();
The standard defines that the above function has the following properties:
Returns: ptr.
Notes: Intentionally performs no other action.
To restate that - this function intentionally performs no other action. This is very important, as it is the key to what placement new does: It is used to call the constructor for the object, and that's all it does. Notice explicitly that the size parameter is not even mentioned.
For those without time, to summarise my point: everything that 'malloc' does in C can be done in C++ using "::operator new". The only difference is that if you have non aggregate types, ie. types that need to have their destructors and constructors called, then you need to call those constructor and destructors. Such types do not explicitly exist in C, and so using the argument that "malloc does it better" is not valid. If you have a struct in 'C' that has a special "initializeMe" function which must be called with a corresponding "destroyMe" then all points made by the author apply equally to that struct as they do to a non-aggregate C++ struct.
Taking some of his points explicitly:
To implement multiple inheritance, the compiler must actually change the values of pointers during some casts. It can't know which value you eventually want when converting to a void * ... Thus, no ordinary function can perform the role of malloc in C++--there is no suitable return type.
This is not correct, again ::operator new performs the role of malloc:
class A1 { };
class A2 { };
class B : public A1, public A2 { };
void foo () {
void * v = ::operator new (sizeof (B));
B * b = new (v) B(); // Placement new calls the constructor for B.
b->~B(); // Call the destructor explicitly before releasing the storage.
::operator delete (v);
v = ::operator new (sizeof(int));
int * i = reinterpret_cast <int*> (v);
::operator delete (v);
}
As I mention above, we need placement new to call the constructor for B. In the case of 'i' we can cast from void* to int* without a problem, although again using placement new would improve type checking.
Another point he makes is about alignment requirements:
Memory returned by new char[...] will not necessarily meet the alignment requirements of a struct intlist.
The standard under 3.7.3.1/2 says:
The pointer returned shall be suitably aligned so that it can be converted to a pointer of any complete object type and then used to access the object or array in the storage allocated (until the storage is explicitly deallocated by a call to a corresponding deallocation function).
That to me appears pretty clear.
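The guarantee can be spot-checked on a given implementation. The sketch below assumes C++11 (`alignof`, `std::max_align_t`) and that `std::uintptr_t` is provided; `new_is_max_aligned` is a hypothetical helper. It says nothing about over-aligned types, which are outside the quoted guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Returns true if ::operator new handed back storage aligned strictly
// enough for any type with fundamental alignment.
bool new_is_max_aligned() {
    void* p = ::operator new(sizeof(std::max_align_t));
    bool ok =
        reinterpret_cast<std::uintptr_t>(p) % alignof(std::max_align_t) == 0;
    ::operator delete(p);
    return ok;
}
```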
Under specialized allocators the author describes potential problems that you might have, e.g. you need to pass the allocator as an argument to any types which allocate memory themselves, and the constructed objects will need to have their destructors called explicitly. Again, how is this different from passing the allocator object through to an "initializeMe" call for a C struct?
Regarding calling the destructor, in C++ you can easily create a special kind of smart pointer, let's call it "placement_pointer" which we can define to call the destructor explicitly when it goes out of scope. As a result we could have:
template <typename T>
class placement_pointer {
// ...
~placement_pointer() {
if (*count == 0) {
m_b->~T();
}
}
// ...
T * m_b;
};
void
f ()
{
arena a;
// ...
foo *fp = new (a) foo; // must be destroyed
// ...
fp->~foo ();
placement_pointer<foo> pfp = new (a) foo; // automatically !!destructed!!
// ...
}
The last point I want to comment on is the following:
g++ comes with a "placement" operator new[] defined as follows:
inline void *
operator new[](size_t, void *place)
{
return place;
}
As noted above, not just implemented this way - but it is required to be so by the standard.
Let obj be a class with a destructor. Suppose you have sizeof (obj[10]) bytes of memory somewhere and would like to construct 10 objects of type obj at that location. (C++ defines sizeof (obj[10]) to be 10 * sizeof (obj).) Can you do so with this placement operator new[]? For example, the following code would seem to do so:
obj *
f ()
{
void *p = special_malloc (sizeof (obj[10]));
return new (p) obj[10]; // Serious trouble...
}
Unfortunately, this code is incorrect. In general, there is no guarantee that the size_t argument passed to operator new[] really corresponds to the size of the array being allocated.
But as he highlights by supplying the definition, the size argument is not used in the allocation function. The allocation function does nothing, and so the only effect of the above placement expression is to call the constructor for the 10 array elements as you would expect.
There are other issues with this code, but not the one the author listed.
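One way to sidestep the array form entirely is to construct the elements one at a time, so no array new-expression (and none of the bookkeeping space it may request) is involved. This is a sketch under stated assumptions: `malloc` stands in for `special_malloc`, and `g_alive`, `construct_n`, `destroy_n` and `array_demo` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

static int g_alive = 0;   // objects currently alive
struct obj {
    obj()  { ++g_alive; }
    ~obj() { --g_alive; }
};

// Constructs `count` objects back to back in `mem`, one placement new each.
obj* construct_n(void* mem, std::size_t count) {
    char* raw = static_cast<char*>(mem);
    obj* first = nullptr;
    for (std::size_t i = 0; i < count; ++i) {
        obj* p = new (raw + i * sizeof(obj)) obj;
        if (i == 0) first = p;
    }
    return first;
}

// Destroys the objects in reverse order of construction.
void destroy_n(obj* first, std::size_t count) {
    while (count > 0) first[--count].~obj();
}

int array_demo() {
    void* mem = std::malloc(sizeof(obj[10]));
    assert(mem != nullptr);
    obj* objs = construct_n(mem, 10);
    int peak = g_alive;          // all ten are alive here
    destroy_n(objs, 10);
    std::free(mem);
    return peak * 100 + g_alive;
}
```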