Related
How can I combine the variable length struct idiom
struct Data
{
std::size_t size;
char data[];
};
with the make_shared idiom which does essentially the same thing so that I can end up with a shared_ptr to one contiguous block of memory which contains the ref count structure header and structure data.
i.e. something like
// allocate an extra 30 bytes for the data storage
shared_ptr<Data> ptr = allocate_shared<Data>( vls_allocator(30) );
You can achieve this using boosts intrusive shared pointers, (Not sure if this is supported directly by C++11).
http://www.boost.org/doc/libs/1_55_0/libs/smart_ptr/intrusive_ptr.html
You'd need to create a larger struct with the reference count in it though and your pointers would point at the laregr structure rather than its contained Data member.
You also need to hook up your deallocation function into to these pointers.
NOTE: I suspect there are ways to get a shared pointer into the Data member, rather than the wrapper - but last time i did that with boost shared_ptr code it required some interesting hacks.
This is an important requirement in my opinion especially when writing code which must interact with or provide interfaces to low level system functionality (outside the world of modern C++ but still compliant with modern code).
There are two parts to the solution. First allocating the buffer as a smart pointer and secondly casting it to the right type in a safe way which passes the code analysis/guideline checkers.
The second part is generally available with "std::reinterpret_pointer_cast<T>" however this will not accept a cast from any array type without spewing out warnings=errors from the compiler or guideline checkers.
The first part which is the missing link until C++20 is to create a safe shared_ptr to a buffer of variable length, for example "shared_ptr(size_t length)" created by "std::make_shared<T[]>(size_t)" but sadly forgotten in the compiler/standard library versions before C++20 (even though they did implement "make_unique<T>(size_t)" and the request for change for the same in make_shared has been around for years).
I stumbled upon this macro which when combined with std::reinterpret_pointer_cast gives you a nice "polyfill" to achieve the same result in clean code until C++20 is generally available.
// C++20 polyfill for missing "make_shared<T[]>(size_t size)" overload.
template<typename T>
inline std::shared_ptr<T> make_shared_array(size_t bufferSize)
{
return std::shared_ptr<T>(new T[bufferSize], [](T* memory) { delete[] memory; });
}
// Creates a smart pointer to a type backed by a variable length buffer, e.g. system structures.
template<typename T>
inline std::shared_ptr<T> CreateSharedBuffer(size_t byteSize)
{
return std::reinterpret_pointer_cast<T>(make_shared_array<uint8_t>(byteSize));
}
Example:
size_t privilegesSize = sizeof(TOKEN_PRIVILEGES) + (sizeof(LUID_AND_ATTRIBUTES) * privilegesCount);
auto tokenPrivileges = CreateSharedBuffer<TOKEN_PRIVILEGES>(privilegesSize);
The only side effect is apparently a second memory access will occur during allocation but given C++20 is around the corner it's probably more efficient and robust for your solution to get running now with clean robust code than worrying about an extra processor cycle or two for what will be a short period of time.
All you have to do when C++20 is available is delete the "polyfill" template and rename or replace the call to it from "make_shared_array" to "make_shared" (with the new C++20 overload). Optionally ditch the whole CreateSharedBuffer macro if the C++20 solution below is acceptable:
auto data = std::reinterpret_cast<Data>(std::make_shared<uint8_t>(sizeof(Data) + (sizeof(char) * 30)));
I have a questions about recommended coding technique. I have a tool for model analysis and I sometimes need to pass a big amount of data (From a factory class to one that holds multiple heterogeneous chunks).
My question is whether there is some consensus about if I should rather use pointers or move the ownership (I need to avoid copying when possible as the size of a data-block may be as big as 1 GB).
The pointer version would look like this:
class FactoryClass {
...
public:
static Data * createData() {
Data * data = new Data;
...
return data;
}
};
class StorageClass {
unique_ptr<Data> data_ptr;
...
public:
void setData(Data * _data_ptr) {
data_ptr.reset(_data_ptr);
}
};
void pass() {
Data * data = FactoryClass::createData();
...
StorageClass storage;
storage.setData(data);
}
Whereas the move version is like this:
class FactoryClass {
...
public:
static Data createData() {
Data data;
...
return data;
}
};
class StorageClass {
Data data;
...
public:
void setData(Data _data) {
data = move(_data);
}
};
void pass() {
Data data = FactoryClass::createData();
...
StorageClass storage;
storage.setData(move(data));
}
I like the move version better - yes, I need to add move commands to the main code, but then I in the end have just the objects in the storage and I do not have to care about pointer semantics anymore.
However I am not quite relaxed when using the move semantics whom I do not understand in detail. (I do not care about the C++11 requirement though, as the code is already only Gcc4.7+ compilable).
Would someone have a reference that would support either version? Or is there some other, preferred version of how to pass data?
I was not able to Google anything as the keywords usually led to other topics.
Thanks.
EDIT NOTE:
The second example got refactored to incorporate suggestions from the comments, the semantics remained unchanged.
When you are passing an object to a function, what you pass depends in part on how that function is going to use it. A function can use an object in one of three general ways:
It can simply reference the object for the duration of the function call, with the calling function (or it's eventual parent up the call stack) maintaining ownership of the object. The reference in this case may be a constant reference or a modifiable reference. The function will not store this object long-term.
It can copy the object directly. It doesn't gain ownership of the original, but it does acquire a copy of the original, so as to store, modify, or do with the copy what it will. Note that the difference between #1 and this is that the copy is made explicit in the parameter list. For example, taking a std::string by value. But this could also be as simple as taking an int by value.
It can gain some form of ownership of the object. The function then has some responsibility over the object's destruction. This also allows the function to store the object long-term.
My general recommendation for the parameter types for these paradigms are as follows:
Take the object by an explicit language reference where possible. If that's not possible, try a std::reference_wrapper. If that can't work, and no other solutions seem reasonable, then use a pointer. A pointer would be for things like optional parameters (though C++14's std::optional will make that less useful. Pointers will still have uses though), language arrays (though again, we have objects that cover most of the uses of these), and so forth.
Take the object by value. That one's pretty non-negotiable.
Take the object either by value-move (ie: move it into a by-value parameter) or by a smart-pointer to the object (which will also be taken by value, since you're going to copy/move it anyway). The problem with your code is that you're transferring ownership via a pointer, but with a raw pointer. Raw pointers have no ownership semantics. The moment you allocate any pointer, you should immediately wrap it in some kind of smart pointer. So your factory function should have returned a unique_ptr.
Your case appears to be #3. Which you use between value-move and smart pointer is entirely up to you. If you have to heap allocate Data for some reason, then the choice is pretty much made for you. If Data can be stack allocated, then you have some options.
I would generally do this based on an estimation of Data's internal size. If internally, it's just a few pointers/integers (and by "few", I mean like 3-4), then putting it on the stack is fine.
Indeed, it can better because you'll have less chance of a double-cache-miss. If your Data functions often just access data from another pointer, if you store Data by pointer, then every function call on it will have to dereference your stored pointer to fetch the internal one, then dereference the internal one. That's two potential cache misses, since neither pointer has any locality with StorageClass.
If you store Data by value, it's much more likely that Data's internal pointer will already be in the cache. It has better locality with StorageClass's other members; if you accessed some of StorageClass before now, you already paid for a cache miss, so you are likely to already have Data in the cache.
But movement is not free. It's cheaper than a full copy, but it's not free. You're still copying the internal data (and possibly nulling out any pointers on the original). But then again, allocating memory on the heap isn't free either. Nor is deallocating it.
But then again, if you're not moving it around very often (you move it around to get it to its final location, but little more after that), even moving a larger object would be fine. If you're using it more than you're moving it, then the cache locality of the object's storage will probably win out over the cost of moving.
There ultimately aren't a lot of technical reasons to pick one or the other. I would say to default to movement where reasonable.
I am trying to write a simple game using C++ and SDL. My question is, what is the best practice to store class member variables.
MyObject obj;
MyObject* obj;
I read a lot about eliminating pointers as much as possible in similar questions, but I remember that few years back in some books I read they used it a lot (for all non trivial objects) . Another thing is that SDL returns pointers in many of its functions and therefor I would have to use "*" a lot when working with SDL objects.
Also am I right when I think the only way to initialize the first one using other than default constructor is through initializer list?
Generally, using value members is preferred over pointer members. However, there are some exceptions, e.g. (this list is probably incomplete and only contains reason I could come up with immediately):
When the members are huge (use sizeof(MyObject) to find out), the difference often doesn't matter for the access and stack size may be a concern.
When the objects come from another source, e.g., when there are factory function creating pointers, there is often no alternative to store the objects.
If the dynamic type of the object isn't known, using a pointer is generally the only alternative. However, this shouldn't be as common as it often is.
When there are more complicated relations than direct owner, e.g., if an object is shared between different objects, using a pointer is the most reasonable approach.
In all of these case you wouldn't use a pointer directly but rather a suitable smart pointer. For example, for 1. you might want to use a std::unique_ptr<MyObject> and for 4. a std::shared_ptr<MyObject> is the best alternative. For 2. you might need to use one of these smart pointer templates combined with a suitable deleter function to deal with the appropriate clean-up (e.g. for a FILE* obtained from fopen() you'd use fclose() as a deleter function; of course, this is a made up example as in C++ you would use I/O streams anyway).
In general, I normally initialize my objects entirely in the member initializer list, independent on how the members are represented exactly. However, yes, if you member objects require constructor arguments, these need to be passed from a member initializer list.
First I would like to say that I completely agree with Dietmar Kühl and Mats Petersson answer. However, you have also to take on account that SDL is a pure C library where the majority of the API functions expect C pointers of structs that can own big chunks of data. So you should not allocate them on stack (you shoud use new operator to allocate them on the heap). Furthermore, because C language does not contain smart pointers, you need to use std::unique_ptr::get() to recover the C pointer that std::unique_ptr owns before sending it to SDL API functions. This can be quite dangerous because you have to make sure that the std::unique_ptr does not get out of scope while SDL is using the C pointer (similar problem with std::share_ptr). Otherwise you will get seg fault because std::unique_ptr will delete the C pointer while SDL is using it.
Whenever you need to call pure C libraries inside a C++ program, I recommend the use of RAII. The main idea is that you create a small wrapper class that owns the C pointer and also calls the SDL API functions for you. Then you use the class destructor to delete all your C pointers.
Example:
class SDLAudioWrap {
public:
SDLAudioWrap() { // constructor
// allocate SDL_AudioSpec
}
~SDLAudioWrap() { // destructor
// free SDL_AudioSpec
}
// here you wrap all SDL API functions that involve
// SDL_AudioSpec and that you will use in your program
// It is quite simple
void SDL_do_some_stuff() {
SDL_do_some_stuff(ptr); // original C function
// SDL_do_some_stuff(SDL_AudioSpec* ptr)
}
private:
SDL_AudioSpec* ptr;
}
Now your program is exception safe and you don't have the possible issue of having smart pointers deleting your C pointer while SDL is using it.
UPDATE 1: I forget to mention that because SDL is a C library, you will need a custom deleter class in order to proper manage their C structs using smart pointers.
Concrete example: GSL GNU scientific library. Integration routine requires the allocation of a struct called "gsl_integration_workspace". In this case, you can use the following code to ensure that your code is exception safe
auto deleter= [](gsl_integration_workspace* ptr) {
gsl_integration_workspace_free(ptr);
};
std::unique_ptr<gsl_integration_workspace, decltype(deleter)> ptr4 (
gsl_integration_workspace_alloc (2000), deleter);
Another reason why I prefer wrapper classes
In case of initialization, it depends on what the options are, but yes, a common way is to use an initializer list.
The "don't use pointers unless you have to" is good advice in general. Of course, there are times when you have to - for example when an object is being returned by an API!
Also, using new will waste quite a bit of memory and CPU-time if MyObject is small. Each object created with new has an overhead of around 16-48 bytes in a typical modern OS, so if your object is only a couple of simple types, then you may well have more overhead than actual storage. In a largeer application, this can easily add up to a huge amount. And of course, a call to new or delete will most likely take some hundreds or thousands of cycles (above and beyond the time used in the constructor). So, you end up with code that runs slower and takes more memory - and of course, there's always some risk that you mess up and have memory leaks, causing your program to potentially crash due to out of memory, when it's not REALLY out of memory.
And as that famous "Murphy's law states", these things just have to happen at the worst possible and most annoying times - when you have just done some really good work, or when you've just succeeded at a level in a game, or something. So avoiding those risks whenever possible is definitely a good idea.
Well, creating the object is a lot better than using pointers because it's less error prone. Your code doesn't describe it well.
MyObj* foo;
foo = new MyObj;
foo->CanDoStuff(stuff);
//Later when foo is not needed
delete foo;
The other way is
MyObj foo;
foo.CanDoStuff(stuff);
less memory management but really it's up to you.
As the previous answers claimed the "don't use pointers unless you have to" is a good advise for general programming but then there are many issues that could finally make you select the pointers choice. Furthermore, in you initial question you are not considering the option of using references. So you can face three types of variable members in a class:
MyObject obj;
MyObject* obj;
MyObject& obj;
I use to always consider the reference option rather than the pointer one because you don't need to take care about if the pointer is NULL or not.
Also, as Dietmar Kühl pointed, a good reason for selecting pointers is:
If the dynamic type of the object isn't known, using a pointer is
generally the only alternative. However, this shouldn't be as common
as it often is.
I think this point is of particular importance when you are working on a big project. If you have many own classes, arranged in many source files and you use them in many parts of your code you will come up with long compilation times. If you use normal class instances (instead of pointers or references) a simple change in one of the header file of your classes will infer in the recompilation of all the classes that include this modified class. One possible solution for this issue is to use the concept of Forward declaration, which make use of pointers or references (you can find more info here).
I've been looking into this for the past few days, and so far I haven't really found anything convincing other than dogmatic arguments or appeals to tradition (i.e. "it's the C++ way!").
If I'm creating an array of objects, what is the compelling reason (other than ease) for using:
#define MY_ARRAY_SIZE 10
// ...
my_object * my_array=new my_object [MY_ARRAY_SIZE];
for (int i=0;i<MY_ARRAY_SIZE;++i) my_array[i]=my_object(i);
over
#define MEMORY_ERROR -1
#define MY_ARRAY_SIZE 10
// ...
my_object * my_array=(my_object *)malloc(sizeof(my_object)*MY_ARRAY_SIZE);
if (my_object==NULL) throw MEMORY_ERROR;
for (int i=0;i<MY_ARRAY_SIZE;++i) new (my_array+i) my_object (i);
As far as I can tell the latter is much more efficient than the former (since you don't initialize memory to some non-random value/call default constructors unnecessarily), and the only difference really is the fact that one you clean up with:
delete [] my_array;
and the other you clean up with:
for (int i=0;i<MY_ARRAY_SIZE;++i) my_array[i].~T();
free(my_array);
I'm out for a compelling reason. Appeals to the fact that it's C++ (not C) and therefore malloc and free shouldn't be used isn't -- as far as I can tell -- compelling as much as it is dogmatic. Is there something I'm missing that makes new [] superior to malloc?
I mean, as best I can tell, you can't even use new [] -- at all -- to make an array of things that don't have a default, parameterless constructor, whereas the malloc method can thusly be used.
I'm out for a compelling reason.
It depends on how you define "compelling". Many of the arguments you have thus far rejected are certainly compelling to most C++ programmers, as your suggestion is not the standard way to allocate naked arrays in C++.
The simple fact is this: yes, you absolutely can do things the way you describe. There is no reason that what you are describing will not function.
But then again, you can have virtual functions in C. You can implement classes and inheritance in plain C, if you put the time and effort into it. Those are entirely functional as well.
Therefore, what matters is not whether something can work. But more on what the costs are. It's much more error prone to implement inheritance and virtual functions in C than C++. There are multiple ways to implement it in C, which leads to incompatible implementations. Whereas, because they're first-class language features of C++, it's highly unlikely that someone would manually implement what the language offers. Thus, everyone's inheritance and virtual functions can cooperate with the rules of C++.
The same goes for this. So what are the gains and the losses from manual malloc/free array management?
I can't say that any of what I'm about to say constitutes a "compelling reason" for you. I rather doubt it will, since you seem to have made up your mind. But for the record:
Performance
You claim the following:
As far as I can tell the latter is much more efficient than the former (since you don't initialize memory to some non-random value/call default constructors unnecessarily), and the only difference really is the fact that one you clean up with:
This statement suggests that the efficiency gain is primarily in the construction of the objects in question. That is, which constructors are called. The statement presupposes that you don't want to call the default constructor; that you use a default constructor just to create the array, then use the real initialization function to put the actual data into the object.
Well... what if that's not what you want to do? What if what you want to do is create an empty array, one that is default constructed? In this case, this advantage disappears entirely.
Fragility
Let's assume that each object in the array needs to have a specialized constructor or something called on it, such that initializing the array requires this sort of thing. But consider your destruction code:
for (int i=0;i<MY_ARRAY_SIZE;++i) my_array[i].~T();
For a simple case, this is fine. You have a macro or const variable that says how many objects you have. And you loop over each element to destroy the data. That's great for a simple example.
Now consider a real application, not an example. How many different places will you be creating an array in? Dozens? Hundreds? Each and every one will need to have its own for loop for initializing the array. Each and every one will need to have its own for loop for destroying the array.
Mis-type this even once, and you can corrupt memory. Or not delete something. Or any number of other horrible things.
And here's an important question: for a given array, where do you keep the size? Do you know how many items you allocated for every array that you create? Each array will probably have its own way of knowing how many items it stores. So each destructor loop will need to fetch this data properly. If it gets it wrong... boom.
And then we have exception safety, which is a whole new can of worms. If one of the constructors throws an exception, the previously constructed objects need to be destructed. Your code doesn't do that; it's not exception-safe.
Now, consider the alternative:
delete[] my_array;
This can't fail. It will always destroy every element. It tracks the size of the array, and it's exception-safe. So it is guaranteed to work. It can't not work (as long as you allocated it with new[]).
Of course, you could say that you could wrap the array in an object. That makes sense. You might even template the object on the type elements of the array. That way, all the desturctor code is the same. The size is contained in the object. And maybe, just maybe, you realize that the user should have some control over the particular way the memory is allocated, so that it's not just malloc/free.
Congratulations: you just re-invented std::vector.
Which is why many C++ programmers don't even type new[] anymore.
Flexibility
Your code uses malloc/free. But let's say I'm doing some profiling. And I realize that malloc/free for certain frequently created types is just too expensive. I create a special memory manager for them. But how to hook all of the array allocations to them?
Well, I have to search the codebase for any location where you create/destroy arrays of these types. And then I have to change their memory allocators accordingly. And then I have to continuously watch the codebase so that someone else doesn't change those allocators back or introduce new array code that uses different allocators.
If I were instead using new[]/delete[], I could use operator overloading. I simply provide an overload for operators new[] and delete[] for those types. No code has to change. It's much more difficult for someone to circumvent these overloads; they have to actively try to. And so forth.
So I get greater flexibility and reasonable assurance that my allocators will be used where they should be used.
Readability
Consider this:
my_object *my_array = new my_object[10];
for (int i=0; i<MY_ARRAY_SIZE; ++i)
my_array[i]=my_object(i);
//... Do stuff with the array
delete [] my_array;
Compare it to this:
my_object *my_array = (my_object *)malloc(sizeof(my_object) * MY_ARRAY_SIZE);
if(my_object==NULL)
throw MEMORY_ERROR;
int i;
try
{
for(i=0; i<MY_ARRAY_SIZE; ++i)
new(my_array+i) my_object(i);
}
catch(...) //Exception safety.
{
for(i; i>0; --i) //The i-th object was not successfully constructed
my_array[i-1].~T();
throw;
}
//... Do stuff with the array
for(int i=MY_ARRAY_SIZE; i>=0; --i)
my_array[i].~T();
free(my_array);
Objectively speaking, which one of these is easier to read and understand what's going on?
Just look at this statement: (my_object *)malloc(sizeof(my_object) * MY_ARRAY_SIZE). This is a very low level thing. You're not allocating an array of anything; you're allocating a hunk of memory. You have to manually compute the size of the hunk of memory to match the size of the object * the number of objects you want. It even features a cast.
By contrast, new my_object[10] tells the story. new is the C++ keyword for "create instances of types". my_object[10] is a 10 element array of my_object type. It's simple, obvious, and intuitive. There's no casting, no computing of byte sizes, nothing.
The malloc method requires learning how to use malloc idiomatically. The new method requires just understanding how new works. It's much less verbose and much more obvious what's going on.
Furthermore, after the malloc statement, you do not in fact have an array of objects. malloc simply returns a block of memory that you have told the C++ compiler to pretend is a pointer to an object (with a cast). It isn't an array of objects, because objects in C++ have lifetimes. And an object's lifetime does not begin until it is constructed. Nothing in that memory has had a constructor called on it yet, and therefore there are no living objects in it.
my_array at that point is not an array; it's just a block of memory. It doesn't become an array of my_objects until you construct them in the next step. This is incredibly unintuitive to a new programmer; it takes a seasoned C++ hand (one who probably learned from C) to know that those aren't live objects and should be treated with care. The pointer does not yet behave like a proper my_object*, because it doesn't point to any my_objects yet.
By contrast, you do have living objects in the new[] case. The objects have been constructed; they are live and fully-formed. You can use this pointer just like any other my_object*.
Fin
None of the above says that this mechanism isn't potentially useful in the right circumstances. But it's one thing to acknowledge the utility of something in certain circumstances. It's quite another to say that it should be the default way of doing things.
If you do not want to get your memory initialized by implicit constructor calls, and just need an assured memory allocation for placement new then it is perfectly fine to use malloc and free instead of new[] and delete[].
The compelling reasons of using new over malloc is that new provides implicit initialization through constructor calls, saving you additional memset or related function calls post an malloc And that for new you do not need to check for NULL after every allocation, just enclosing exception handlers will do the job saving you redundant error checking unlike malloc.
These both compelling reasons do not apply to your usage.
which one is performance efficient can only be determined by profiling, there is nothing wrong in the approach you have now. On a side note I don't see a compelling reason as to why use malloc over new[] either.
I would say neither.
The best way to do it would be:
std::vector<my_object> my_array;
my_array.reserve(MY_ARRAY_SIZE);
for (int i=0;i<MY_ARRAY_SIZE;++i)
{ my_array.push_back(my_object(i));
}
This is because internally vector is probably doing the placement new for you. It also managing all the other problems associated with memory management that you are not taking into account.
You've reimplemented new[]/delete[] here, and what you have written is pretty common in developing specialized allocators.
The overhead of calling simple constructors will take little time compared the allocation. It's not necessarily 'much more efficient' -- it depends on the complexity of the default constructor, and of operator=.
One nice thing that has not been mentioned yet is that the array's size is known by new[]/delete[]. delete[] just does the right and destructs all elements when asked. Dragging an additional variable (or three) around so you exactly how to destroy the array is a pain. A dedicated collection type would be a fine alternative, however.
new[]/delete[] are preferable for convenience. They introduce little overhead, and could save you from a lot of silly errors. Are you compelled enough to take away this functionality and use a collection/container everywhere to support your custom construction? I've implemented this allocator -- the real mess is creating functors for all the construction variations you need in practice. At any rate, you often have a more exact execution at the expense of a program which is often more difficult to maintain than the idioms everybody knows.
IMHO there both ugly, it's better to use vectors. Just make sure to allocate the space in advance for performance.
Either:
std::vector<my_object> my_array(MY_ARRAY_SIZE);
If you want to initialize with a default value for all entries.
my_object basic;
std::vector<my_object> my_array(MY_ARRAY_SIZE, basic);
Or if you don't want to construct the objects but do want to reserve the space:
std::vector<my_object> my_array;
my_array.reserve(MY_ARRAY_SIZE);
Then if you need to access it as a C-Style pointer array just (just make sure you don't add stuff while keeping the old pointer but you couldn't do that with regular c-style arrays anyway.)
my_object* carray = &my_array[0];
my_object* carray = &my_array.front(); // Or the C++ way
Access individual elements:
my_object value = my_array[i]; // The non-safe c-like faster way
my_object value = my_array.at(i); // With bounds checking, throws range exception
Typedef for pretty:
typedef std::vector<my_object> object_vect;
Pass them around functions with references:
void some_function(const object_vect& my_array);
EDIT:
IN C++11 there is also std::array. The problem with it though is it's size is done via a template so you can't make different sized ones at runtime and you cant pass it into functions unless they are expecting that exact same size (or are template functions themselves). But it can be useful for things like buffers.
std::array<int, 1024> my_array;
EDIT2:
Also in C++11 there is a new emplace_back as an alternative to push_back. This basically allows you to 'move' your object (or construct your object directly in the vector) and saves you a copy.
std::vector<SomeClass> v;
SomeClass bob {"Bob", "Ross", 10.34f};
v.emplace_back(bob);
v.emplace_back("Another", "One", 111.0f); // <- Note this doesn't work with initialization lists ☹
Oh well, I was thinking that given the number of answers there would be no reason to step in... but I guess I am drawn in as the others. Let's go
Why your solution is broken
C++11 new facilities for handling raw memory
Simpler way to get this done
Advices
1. Why your solution is broken
First, the two snippets you presented are not equivalent. new[] just works, yours fails horribly in the presence of Exceptions.
What new[] does under the cover is that it keeps track of the number of objects that were constructed, so that if an exception occurs during say the 3rd constructor call it properly calls the destructor for the 2 already constructed objects.
Your solution however fails horribly:
either you don't handle exceptions at all (and leak horribly)
or you just try to call the destructors on the whole array even though it's half built (likely crashing, but who knows with undefined behavior)
So the two are clearly not equivalent. Yours is broken
2. C++11 new facilities for handling raw memory
In C++11, the comittee members have realized how much we liked fiddling with raw memory and they have introduced facilities to help us doing so more efficiently, and more safely.
Check cppreference's <memory> brief. This example shows off the new goodies (*):
#include <iostream>
#include <string>
#include <memory>
#include <algorithm>
int main()
{
const std::string s[] = {"This", "is", "a", "test", "."};
std::string* p = std::get_temporary_buffer<std::string>(5).first;
std::copy(std::begin(s), std::end(s),
std::raw_storage_iterator<std::string*, std::string>(p));
for(std::string* i = p; i!=p+5; ++i) {
std::cout << *i << '\n';
i->~basic_string<char>();
}
std::return_temporary_buffer(p);
}
Note that get_temporary_buffer is no-throw, it returns the number of elements for which memory has actually been allocated as a second member of the pair (thus the .first to get the pointer).
(*) Or perhaps not so new as MooingDuck remarked.
3. Simpler way to get this done
As far as I am concered, what you really seem to be asking for is a kind of typed memory pool, where some emplacements could not have been initialized.
Do you know about boost::optional ?
It is basically an area of raw memory that can fit one item of a given type (template parameter) but defaults with having nothing in instead. It has a similar interface to a pointer and let you query whether or not the memory is actually occupied. Finally, using the In-Place Factories you can safely use it without copying objects if it is a concern.
Well, your use case really looks like a std::vector< boost::optional<T> > to me (or perhaps a deque?)
4. Advices
Finally, in case you really want to do it on your own, whether for learning or because no STL container really suits you, I do suggest you wrap this up in an object to avoid the code sprawling all over the place.
Don't forget: Don't Repeat Yourself!
With an object (templated) you can capture the essence of your design in one single place, and then reuse it everywhere.
And of course, why not take advantage of the new C++11 facilities while doing so :) ?
You should use vectors.
Dogmatic or not, that is exactly what ALL the STL container do to allocate and initialize.
They use an allocator then allocates uninitialized space and initialize it by means of the container constructors.
If this (like many people use to say) "is not c++" how can be the standard library just be implemented like that?
If you just don't want to use malloc / free, you can allocate "bytes" with just new char[]
myobjet* pvext = reinterpret_cast<myobject*>(new char[sizeof(myobject)*vectsize]);
for(int i=0; i<vectsize; ++i) new(myobject+i)myobject(params);
...
for(int i=vectsize-1; i!=0u-1; --i) (myobject+i)->~myobject();
delete[] reinterpret_cast<char*>(myobject);
This lets you take advantage of the separation between initialization and allocation, still taking adwantage of the new allocation exception mechanism.
Note that, putting my first and last line into an myallocator<myobject> class and the second ands second-last into a myvector<myobject> class, we have ... just reimplemented std::vector<myobject, std::allocator<myobject> >
What you have shown here is actually the way to go when using a memory allocator different than the system general allocator - in that case you would allocate your memory using the allocator (alloc->malloc(sizeof(my_object))) and then use the placement new operator to initialize it. This has many advantages in efficient memory management and quite common in the standard template library.
If you are writing a class that mimics functionality of std::vector or needs control over memory allocation/object creation (insertion in array / deletion etc.) - that's the way to go. In this case, it's not a question of "not calling default constructor". It becomes a question of being able to "allocate raw memory, memmove old objects there and then create new objects at the olds' addresses", question of being able to use some form of realloc and so on. Unquestionably, custom allocation + placement new are way more flexible... I know, I'm a bit drunk, but std::vector is for sissies... About efficiency - one can write their own version of std::vector that will be AT LEAST as fast ( and most likely smaller, in terms of sizeof() ) with most used 80% of std::vector functionality in, probably, less than 3 hours.
my_object * my_array=new my_object [10];
This will be an array with objects.
my_object * my_array=(my_object *)malloc(sizeof(my_object)*MY_ARRAY_SIZE);
This will be an array the size of your objects, but they may be "broken". If your class has virtual funcitons for instance, then you won't be able to call those. Note that it's not just your member data that may be inconsistent, but the entire object is actully "broken" (in lack of a better word)
I'm not saying it's wrong to do the second one, just as long as you know this.
I've got a lightweight templated class that contains a couple of member objects that are very rarely used, and so I'd like to avoid calling their constructors and destructors except in the rare cases when I actually use them.
To do that, I "declare" them in my class like this:
template <class K, class V> class MyClass
{
public:
MyClass() : wereConstructorsCalled(false) {/* empty */}
~MyClass() {if (wereConstructorsCalled) MyCallPlacementDestructorsFunc();}
[...]
private:
bool wereConstructorsCalled;
mutable char keyBuf[sizeof(K)];
mutable char valBuf[sizeof(V)];
};
... and then I use placement new and placement delete to set up and tear down the objects only when I actually need to do so.
Reading the C++ FAQ it said that when using placement new, I need to be careful that the placement is properly aligned, or I would run into trouble.
My question is, will the keyBuf and valBuf arrays be properly aligned in all cases, or is there some extra step I need to take to make sure they will be aligned properly? (if so, a non-platform-dependent step would be preferable)
There's no guarantee that you'll get the appropriate alignment. Arrays are in general only guaranteed to be aligned for the member type. A char array is aligned for storage of char.
The one exception is that char and unsigned char arrays allocated with new are given maximum alignment, so that you can store arbitrary types into them. But this guarantee doesn't apply in your case as you're avoiding heap allocation.
TR1 and C++0x add some very helpful types though:
std::alignment_of and std::aligned_storage together give you a portable (and functioning) answer.
std::alignment_of<T>::value gives you the alignment required for a type T. std::aligned_storage<A, S>::type gives you a POD type with alignment A and size S. That means that you can safely write your object into a variable of type std::aligned_storage<A, S>::type.
(In TR1, the namespace is std::tr1, rather than just std)
May I ask why you want to place them into a char buffer? Why not just create pointer objects of K and V then instantiate it when you need it.
Maybe I didn't understand your question, but can't you just do char *keyBuf[..size..];, set it initially to NULL (not allocated) and allocate it the first time you need it?
What you're trying to do with placement new seems risky business and bad coding style.
Anyway, code alignment is implementation dependent.
If you want to change code alignment use pragma pack
#pragma pack(push,x)
// class code here
#pragma pack(pop) // to restore original pack value
if x is 1, there will be no padding between your elements.
Heres a link to read
http://www.cplusplus.com/forum/general/14659/
I found this answer posted by SiCrane at http://www.gamedev.net/community/forums/topic.asp?topic_id=455233 :
However, for static allocations, it's less wasteful to declare the memory block in a union with other types. Then the memory block will be guaranteed to be aligned to the alignment of the most restrictive type in the union. It's still pretty ugly either way.
Sounds like a union might do the trick!
I recommend that you look at the boost::optional template. It does what you need, even if you can't use it you should probably look at its implementation.
It uses alignment_of and type_with_alignment for its alignment calculations and guarantees.
To make a very very long story very very short this isn't going to help your performance any and will cause lots of headaches and it won't be long before you get sucked into writing your own memory managemer.
Placement new is fine for a POD (but won't save you anything) but if you have a constructor at all then it's not going to work at all.
You also can't depend on the value of your boolean variable if you use placement new.
Placement new has uses but not really for this.