Is there a C++ allocator that respects an overridden new/delete? - c++

I'm implementing a resource-allocating cloning operation for an array of type T. The straightforward implementation uses new T[sz] followed by a std::copy call from the source into the new array. It walks memory twice.
I'd like to allocate raw memory and then use std::uninitialized_copy so I only walk memory once for performance reasons. I know how to accomplish this when a custom allocator is used (Allocator.allocate followed by std::uninitialized_copy), and I know how to accomplish this using std::allocator (which employs ::operator new following lib.allocator.members in section 20.4.1.1 of the specification). My concern is that a std::allocator-based approach seems wrong for types T where T::operator new has been defined. I know I can detect such a situation using Boost.TypeTraits' has_new_operator.
Is there a simple, standards-compliant way to allocate-and-then-initialize raw memory in a fashion that will respect an overridden new (and does so passing over memory only once)? If not, does using SFINAE to dispatch between an implementation employing std::allocator and one using the overridden operator new seem reasonable? FWIW, grepping through Boost does not show such a use of the has_new_operator trait.
Thanks,
Rhys

Seems it isn't possible. Only operator new[] knows how to store array size (if T has a destructor) in some implementation-specific way (operator delete[] then utilizes this info). Therefore, there is no portable way to store this information without new expression (and without calling elements constructors).

try placement new then.
typedef std::string T;
T src[5];
char* p = new char[sizeof(T)* 5];
T* dest = (T*)p;
for(int i = 0;i < 5; ++i)
{
new(dest + i) T(src[i]); //placement new
}

Related

C++ Buffer pains

I need a class (in C++11) which stores a couple of fields(including a buffer). I started with malloc() in a constructor and free() in a destructor (I didn't touch C/C++ for quite some time so that was all I remembered).
Next thing I remembered (because of crashes) that I need to implement a copy constructor and an assignment operator. Now, I have a full screen of code just for a class with 3 fields (one of which is the buffer).
A question.
What should I use? (I am dazzled by amount of choices - std::vector, std::array, std::shared_ptr, boost::scoped_ptr and may be something else).
Functionality for this buffer which I am looking for are:
as little as possible memory management
getting rid of these copy constructors and assignment operators
ability to use it as void* (I have to pass it to functions which accept "void*")
ability to access read/write it randomly (I may need to get a random range out of it and write a random range to it)
allocate it on the heap (buffer can be reasonably large)
preferably usage of some standard facility
You should just use std::vector<unsigned char> as your buffer. This already has all the necessary constructors, operators, and destructor, so there's nothing special you need to do except use it.
I'm not a C++ expert, so I'd welcome any criticisms to this solution, but a smart pointer to an array might work.
#include <iostream>
#include <memory>
void do_something(void *buffer)
{
char* char_buffer = (char*)buffer;
std::cout << char_buffer[0] << std::endl;
}
int main()
{
size_t size = 10;
// reference counted pointer to auto-delete the buffer
std::shared_ptr<char> buffer(new char[size], std::default_delete<char[]>());
// use the underlying pointer
// http://stackoverflow.com/questions/27819809/why-is-there-no-operator-for-stdshared-ptr
buffer.get()[0] = 'a';
do_something(buffer.get());
// buffer deallocated at the end of scope
}
as little as possible memory management
The smart pointer takes care of that for you
getting rid of these copy constructors and assignment operators
I think the smart pointer handles them
ability to use it as void* (I have to pass it to functions which accept "void*")
Use .get() and a cast.
ability to access read/write it randomly (I may need to get a random range out of it and write a random range to it)
It's still a pointer at the end of this, so I can't see why that wouldn't work (I haven't tried it).
allocate it on the heap (buffer can be reasonably large)
It's on the heap
preferably usage of some standard facility
<memory>
Alternatively, checkout <vector>'s .data() function, which returns a pointer to the vector's underlying array.
I think you need some kind of memory buffer. Check this simple c++ buffer class.
Not using c++11 is intentional because of my company development environment.
https://github.com/jeremyko/CumBuffer

C++ New vs Malloc for dynamic memory array of Objects

I have a class Bullet that takes several arguments for its construction. However, I am using a dynamic memory array to store them. I am using C++ so i want to conform to it's standard by using the new operator to allocate the memory. The problem is that the new operator is asking for the constructor arguments when I'm allocating the array, which I don't have at that time. I can accomplish this using malloc to get the right size then fill in form there, but that's not what i want to use :) any ideas?
pBulletArray = (Bullet*) malloc(iBulletArraySize * sizeof(Bullet)); // Works
pBulletArray = new Bullet[iBulletArraySize]; // Requires constructor arguments
Thanks.
You can't.
And if you truly want to conform to C++ standards, you should use std::vector.
FYI, it would probably be even more expensive than what you're trying to achieve. If you could do this, new would call a constructor. But since you'll modify the object later on anyway, the initial construction is useless.
1) std::vector
A std::vector really is the proper C++ way to do this.
std::vector<Bullet> bullets;
bullets.reserve(10); // allocate memory for bullets without constructing any
bullets.push_back(Bullet(10.2,"Bang")); // put a Bullet in the vector.
bullets.emplace_back(10.2,"Bang"); // (C++11 only) construct a Bullet in the vector without copying.
2) new [] operator
It is also possible to do this with new, but you really shouldn't. Manually managing resources with new/delete is an advanced task, similar to template meta-programming in that it's best left to library builders, who'll use these features to build efficient, high level libraries for you. In fact to do this correctly you'll basically be implementing the internals of std::vector.
When you use the new operator to allocate an array, every element in the array is default initialized. Your code could work if you added a default constructor to Bullet:
class Bullet {
public:
Bullet() {} // default constructor
Bullet(double,std::string const &) {}
};
std::unique_ptr<Bullet[]> b = new Bullet[10]; // default construct 10 bullets
Then, when you have the real data for a Bullet you can assign it to one of the elements of the array:
b[3] = Bullet(20.3,"Bang");
Note the use of unique_ptr to ensure that proper clean-up occurs, and that it's exception safe. Doing these things manually is difficult and error prone.
3) operator new
The new operator initializes its objects in addition to allocating space for them. If you want to simply allocate space, you can use operator new.
std::unique_ptr<Bullet,void(*)(Bullet*)> bullets(
static_cast<Bullet*>(::operator new(10 * sizeof(Bullet))),
[](Bullet *b){::operator delete(b);});
(Note that the unique_ptr ensures that the storage will be deallocated but no more. Specifically, if we construct any objects in this storage we have to manually destruct them and do so in an exception safe way.)
bullets now points to storage sufficient for an array of Bullets. You can construct an array in this storage:
new (bullets.get()) Bullet[10];
However the array construction again uses default initialization for each element, which we're trying to avoid.
AFAIK C++ doesn't specify any well defined method of constructing an array without constructing the elements. I imagine this is largely because doing so would be a no-op for most (all?) C++ implementations. So while the following is technically undefined, in practice it's pretty well defined.
bool constructed[10] = {}; // a place to mark which elements are constructed
// construct some elements of the array
for(int i=0;i<10;i+=2) {
try {
// pretend bullets points to the first element of a valid array. Otherwise 'bullets.get()+i' is undefined
new (bullets.get()+i) Bullet(10.2,"Bang");
constructed = true;
} catch(...) {}
}
That will construct elements of the array without using the default constructor. You don't have to construct every element, just the ones you want to use. However when destroying the elements you have to remember to destroy only the elements that were constructed.
// destruct the elements of the array that we constructed before
for(int i=0;i<10;++i) {
if(constructed[i]) {
bullets[i].~Bullet();
}
}
// unique_ptr destructor will take care of deallocating the storage
The above is a pretty simple case. Making non-trivial uses of this method exception safe without wrapping it all up in a class is more difficult. Wrapping it up in a class basically amounts to implementing std::vector.
4) std::vector
So just use std::vector.
It's possible to do what you want -- search for "operator new" if you really want to know how. But it's almost certainly a bad idea. Instead, use std::vector, which will take care of all the annoying details for you. You can use std::vector::reserve to allocate all the memory you'll use ahead of time.
Bullet** pBulletArray = new Bullet*[iBulletArraySize];
Then populate pBulletArray:
for(int i = 0; i < iBulletArraySize; i++)
{
pBulletArray[i] = new Bullet(arg0, arg1);
}
Just don't forget to free the memory using delete afterwards.
The way C++ new normally works is allocating the memory for the class instance and then calling the constructor for that instance. You basically have already allocated the memory for your instances.
You can call only the constructor for the first instance like this:
new((void*)pBulletArray) Bullet(int foo);
Calling the constructor of the second one would look like this (and so on)
new((void*)pBulletArray+1) Bullet(int bar);
if the Bullet constructor takes an int.
If what you're really after here is just fast allocation/deallocation, then you should look into "memory pools." I'd recommend using boost's implementation, rather than trying to roll your own. In particular, you would probably want to use an "object_pool".

C++ - overload operator new and provide additional arguments

I know you can overload the operator new. When you do, you method gets sent a size_t parameter by default. However, is it possible to send the size_t parameter - as well as additional user-provided parameters, to the overloaded new operator method?
For example
int a = 5;
Monkey* monk = new Monkey(a);
Because I want to override new operator like this
void* Monkey::operator new(size_t size, int a)
{
...
}
Thanks
EDIT: Here's what I a want to accomplish:
I have a chunk of virtual memory allocated at the start of the app (a memory pool). All objects that inherit my base class will inherit its overloaded new operator.
The reason I want to sometimes pass an argument in overloaded new is to tell my memory manager if I want to use the memory pool, or if I want to allocate it with malloc.
Invoke new with that additional operand, e.g.
Monkey *amonkey = new (1275) Monkey(a);
addenda
A practical example of passing argument[s] to your new operator is given by Boehm's garbage collector, which enables you to code
Monkey *acollectedmonkey = new(UseGc) Monkey(a);
and then you don't have to bother about delete-ing acollectedmonkey (assuming its destructor don't do weird things; see this answer). These are the rare situations where you want to pass an explicit Allocator argument to template collections like std::vector or std::map.
When using memory pools, you often want to have some MemoryPool class, and pass instances (or pointers to them) of that class to your new and your delete operations. For readability reasons, I won't recommend referencing memory pools by some obscure integer number.

How should I change this declaration?

I have been given a header with the following declaration:
//The index of 1 is used to make sure this is an array.
MyObject objs[1];
However, I need to make this array dynamically sized one the program is started. I would think I should just declare it as MyObject *objs;, but I figure if the original programmer declared it this way, there is some reason for it.
Is there anyway I can dynamically resize this? Or should I just change it to a pointer and then malloc() it?
Could I use some the new keyword somehow to do this?
Use an STL vector:
#include <vector>
std::vector<MyObject> objs(size);
A vector is a dynamic array and is a part of the Standard Template Library. It resizes automatically as you push back objects into the array and can be accessed like a normal C array with the [] operator. Also, &objs[0] is guaranteed to point to a contiguous sequence in memory -- unlike a list -- if the container is not empty.
You're correct. If you want to dynamically instantiate its size you need to use a pointer.
(Since you're using C++ why not use the new operator instead of malloc?)
MyObject* objs = new MyObject[size];
Or should I just change it to a
pointer and then malloc() it?
If you do that, how are constructors going to be called for the objects in on the malloc'd memory? I'll give you a hint - they won't be - you need to use a std::vector.
I have only seen an array used as a pointer inside a struct or union. This was ages ago and was used to treat the len and first char of a string as a hash to improve the speed of string comparisons for a scripting language.
The code was similar to this:
union small_string {
struct {
char len;
char buff[1];
};
short hash;
};
Then small_string was initialised using malloc, note the c cast is effectively a reinterpret_cast
small_string str = (small_string) malloc(len + 1);
strcpy(str.buff, val);
And to test for equality
int fast_str_equal(small_string str1, small_string str2)
{
if (str1.hash == str2.hash)
return strcmp(str1.buff, str2.buff) == 0;
return 0;
}
As you can see this is not a very portable or safe style of c++. But offered a great speed improvement for associative arrays indexed by short strings, which are the basis of most scripting languages.
I would probably avoid this style of c++ today.
Is this at the end of a struct somewhere?
One trick I've seen is to declare a struct
struct foo {
/* optional stuff here */
int arr[1];
}
and malloc more memory than sizeof (struct foo) so that arr becomes a variable-sized array.
This was fairly commonly used in C programs back when I was hacking C, since variable-sized arrays were not available, and doing an additional allocation was considered too error-prone.
The right thing to do, in almost all cases, is to change the array to an STL vector.
Using the STL is best if you want a dynamically sizing array, there are several options, one is std::vector. If you aren't bothered about inserting, you can also use std::list.
Its seems - yes, you can do this change.
But check your code on sizeof( objs );
MyObj *arr1 = new MyObj[1];
MyObj arr2[1];
sizeof(arr1) != sizeof(arr2)
Maybe this fact used somewhere in your code.
That comment is incredibly bad. A one-element array is an array even though the comment suggests otherwise.
I've never seen anybody try to enforce "is an array" this way. The array syntax is largely syntactic sugar (a[2] gives the same result as 2[a]: i.e., the third element in a (NOTE this is an interesting and valid syntax but usually a very bad form to use because you're going to confuse programmers for no reason)).
Because the array syntax is largely syntactic sugar, switching to a pointer makes sense as well. But if you're going to do that, then going with new[] makes more sense (because you get your constructors called for free), and going with std::vector makes even more sense (because you don't have to remember to call delete[] every place the array goes out of scope due to return, break, the end of statement, throwing an exception, etc.).

Issues with C++ 'new' operator?

I've recently come across this rant.
I don't quite understand a few of the points mentioned in the article:
The author mentions the small annoyance of delete vs delete[], but seems to argue that it is actually necessary (for the compiler), without ever offering a solution. Did I miss something?
In the section 'Specialized allocators', in function f(), it seems the problems can be solved with replacing the allocations with: (omitting alignment)
// if you're going to the trouble to implement an entire Arena for memory,
// making an arena_ptr won't be much work. basically the same as an auto_ptr,
// except that it knows which arena to deallocate from when destructed.
arena_ptr<char> string(a); string.allocate(80);
// or: arena_ptr<char> string; string.allocate(a, 80);
arena_ptr<int> intp(a); intp.allocate();
// or: arena_ptr<int> intp; intp.allocate(a);
arena_ptr<foo> fp(a); fp.allocate();
// or: arena_ptr<foo>; fp.allocate(a);
// use templates in 'arena.allocate(...)' to determine that foo has
// a constructor which needs to be called. do something similar
// for destructors in '~arena_ptr()'.
In 'Dangers of overloading ::operator new[]', the author tries to do a new(p) obj[10]. Why not this instead (far less ambiguous):
obj *p = (obj *)special_malloc(sizeof(obj[10]));
for(int i = 0; i < 10; ++i, ++p)
new(p) obj;
'Debugging memory allocation in C++'. Can't argue here.
The entire article seems to revolve around classes with significant constructors and destructors located in a custom memory management scheme. While that could be useful, and I can't argue with it, it's pretty limited in commonality.
Basically, we have placement new and per-class allocators -- what problems can't be solved with these approaches?
Also, in case I'm just thick-skulled and crazy, in your ideal C++, what would replace operator new? Invent syntax as necessary -- what would be ideal, simply to help me understand these problems better.
Well, the ideal would probably be to not need delete of any kind. Have a garbage-collected environment, let the programmer avoid the whole problem.
The complaints in the rant seem to come down to
"I liked the way malloc does it"
"I don't like being forced to explicitly create objects of a known type"
He's right about the annoying fact that you have to implement both new and new[], but you're forced into that by Stroustrups' desire to maintain the core of C's semantics. Since you can't tell a pointer from an array, you have to tell the compiler yourself. You could fix that, but doing so would mean changing the semantics of the C part of the language radically; you could no longer make use of the identity
*(a+i) == a[i]
which would break a very large subset of all C code.
So, you could have a language which
implements a more complicated notion of an array, and eliminates the wonders of pointer arithmetic, implementing arrays with dope vectors or something similar.
is garbage collected, so you don't need your own delete discipline.
Which is to say, you could download Java. You could then extend that by changing the language so it
isn't strongly typed, so type checking the void * upcast is eliminated,
...but that means that you can write code that transforms a Foo into a Bar without the compiler seeing it. This would also enable ducktyping, if you want it.
The thing is, once you've done those things, you've got Python or Ruby with a C-ish syntax.
I've been writing C++ since Stroustrup sent out tapes of cfront 1.0; a lot of the history involved in C++ as it is now comes out of the desire to have an OO language that could fit into the C world. There were plenty of other, more satisfying, languages that came out around the same time, like Eiffel. C++ seems to have won. I suspect that it won because it could fit into the C world.
The rant, IMHO, is very misleading and it seems to me that the author does understand the finer details, it's just that he appears to want to mislead. IMHO, the key point that shows the flaw in argument is the following:
void* operator new(std::size_t size, void* ptr) throw();
The standard defines that the above function has the following properties:
Returns: ptr.
Notes: Intentionally performs no other action.
To restate that - this function intentionally performs no other action. This is very important, as it is the key to what placement new does: It is used to call the constructor for the object, and that's all it does. Notice explicitly that the size parameter is not even mentioned.
For those without time, to summarise my point: everything that 'malloc' does in C can be done in C++ using "::operator new". The only difference is that if you have non aggregate types, ie. types that need to have their destructors and constructors called, then you need to call those constructor and destructors. Such types do not explicitly exist in C, and so using the argument that "malloc does it better" is not valid. If you have a struct in 'C' that has a special "initializeMe" function which must be called with a corresponding "destroyMe" then all points made by the author apply equally to that struct as they do to a non-aggregate C++ struct.
Taking some of his points explicitly:
To implement multiple inheritance, the compiler must actually change the values of pointers during some casts. It can't know which value you eventually want when converting to a void * ... Thus, no ordinary function can perform the role of malloc in C++--there is no suitable return type.
This is not correct, again ::operator new performs the role of malloc:
class A1 { };
class A2 { };
class B : public A1, public A2 { };
void foo () {
void * v = ::operator new (sizeof (B));
B * b = new (v) B(); // Placement new calls the constructor for B.
delete v;
v = ::operator new (sizeof(int));
int * i = reinterpret_cast <int*> (v);
delete v'
}
As I mention above, we need placement new to call the constructor for B. In the case of 'i' we can cast from void* to int* without a problem, although again using placement new would improve type checking.
Another point he makes is about alignment requirements:
Memory returned by new char[...] will not necessarily meet the alignment requirements of a struct intlist.
The standard under 3.7.3.1/2 says:
The pointer returned shall be suitably aligned so that it can be converted to a
pointer of any complete object type and then used to access the object or array in the storage allocated (until
the storage is explicitly deallocated by a call to a corresponding deallocation function).
That to me appears pretty clear.
Under specialized allocators the author describes potential problems that you might have, eg. you need to use the allocator as an argument to any types which allocate memory themselves and the constructed objects will need to have their destructors called explicitly. Again, how is this different to passing the allocator object through to an "initalizeMe" call for a C struct?
Regarding calling the destructor, in C++ you can easily create a special kind of smart pointer, let's call it "placement_pointer" which we can define to call the destructor explicitly when it goes out of scope. As a result we could have:
template <typename T>
class placement_pointer {
// ...
~placement_pointer() {
if (*count == 0) {
m_b->~T();
}
}
// ...
T * m_b;
};
void
f ()
{
arena a;
// ...
foo *fp = new (a) foo; // must be destroyed
// ...
fp->~foo ();
placement_pointer<foo> pfp = new (a) foo; // automatically !!destructed!!
// ...
}
The last point I want to comment on is the following:
g++ comes with a "placement" operator new[] defined as follows:
inline void *
operator new[](size_t, void *place)
{
return place;
}
As noted above, not just implemented this way - but it is required to be so by the standard.
Let obj be a class with a destructor. Suppose you have sizeof (obj[10]) bytes of memory somewhere and would like to construct 10 objects of type obj at that location. (C++ defines sizeof (obj[10]) to be 10 * sizeof (obj).) Can you do so with this placement operator new[]? For example, the following code would seem to do so:
obj *
f ()
{
void *p = special_malloc (sizeof (obj[10]));
return new (p) obj[10]; // Serious trouble...
}
Unfortunately, this code is incorrect. In general, there is no guarantee that the size_t argument passed to operator new[] really corresponds to the size of the array being allocated.
But as he highlights by supplying the definition, the size argument is not used in the allocation function. The allocation function does nothing - and so the only affect of the above placement expression is to call the constructor for the 10 array elements as you would expect.
There are other issues with this code, but not the one the author listed.