Deleting an element from a vector of pointers in C++ - c++

I remember hearing that the following code is not C++ compliant and was hoping someone with much more C++ legalese than me would be able to confirm or deny it.
std::vector<int*> intList;
intList.push_back(new int(2));
intList.push_back(new int(10));
intList.push_back(new int(17));
for(std::vector<int*>::iterator i = intList.begin(); i != intList.end(); ++i) {
delete *i;
}
intList.clear();
The rationale was that it is illegal for a vector to contain pointers to invalid memory. Now obviously my example will compile and it will even work on all compilers I know of, but is it standard compliant C++ or am I supposed to do the following, which I was told is in fact the standard compliant approach:
while(!intList.empty()) {
int* element = intList.back();
intList.pop_back();
delete element;
}

Your code is valid, but a better solution would be to use smart pointers.
All of the requirements on std::vector are in section 23.2.4 of the C++ Standard, and none of them restricts invalid pointers. std::vector treats int* like any other element type (leaving aside the special case of vector<bool>); it doesn't care where the pointers point.
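A minimal sketch of the smart-pointer alternative, assuming C++11 or later is available. The helper names make_int_list and sum are made up for illustration; the only point is that each std::unique_ptr releases its int when the vector is cleared or destroyed, so no delete loop is needed:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical helper: same three values as in the question,
// but every element owns the int it points to.
std::vector<std::unique_ptr<int>> make_int_list() {
    std::vector<std::unique_ptr<int>> list;
    list.push_back(std::unique_ptr<int>(new int(2)));
    list.push_back(std::unique_ptr<int>(new int(10)));
    list.push_back(std::unique_ptr<int>(new int(17)));
    return list;
}

// Hypothetical helper: reads through the owning pointers.
int sum(const std::vector<std::unique_ptr<int>>& list) {
    int total = 0;
    for (const auto& element : list) {
        assert(element);  // every element is expected to own an int
        total += *element;
    }
    return total;
}
```

Calling clear() on such a vector (or letting it go out of scope) deletes every int, so the vector never holds a dangling pointer at any point.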

Your code is fine. If you're worried for some reason about the elements being invalid momentarily, then change the body of the loop to
int* tmp = 0;
std::swap(tmp, *i);
delete tmp;

The C++ philosophy is to allow the programmer as much latitude as possible, and to only ban things that are actually going to cause harm. Invalid pointers do no harm in themselves, and therefore you can have them around freely. What will cause harm is using the pointer in any way, and that therefore invokes undefined behavior.

Ultimately, this is a question of personal taste more than anything. It's not "standards non-compliant" to have a vector that contains invalid pointers, but it is dangerous, just like it's dangerous to have any pointer that points to invalid memory. Your latter example will ensure that your vector never contains a bad pointer, yes, so it's the safest choice.
But if you knew that the vector would never be used during your former example's loop (if the vector is locally scoped, for example), it's perfectly fine.

Where did you hear that? Consider this:
std::vector<int *> intList(5);
I just created a vector filled with 5 null pointers, none of which points to valid memory.

If you store raw pointers in a container (which I wouldn't recommend) and therefore need a two-phase delete, I would choose your first option over the second.
I believe container::clear() will remove the contents of the vector more efficiently than popping one item at a time.
You could probably turn the for loop into a nice (pseudo) forall(begin(), end(), delete) and make it more generic, so that it wouldn't matter if you changed from vector to some other container.
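The pseudo forall idea could be sketched roughly like this, assuming C++11 lambdas. delete_all is a made-up name, and Counted is a small instrumented type that exists only to make the effect observable; the helper works for sequence containers of raw owning pointers, not for maps:

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <vector>

// Instrumented type, only here so the number of live objects can be observed.
struct Counted {
    static int live;
    Counted() { ++live; }
    ~Counted() { assert(live > 0); --live; }
};
int Counted::live = 0;

// Hypothetical helper: deletes every pointee, then empties the container.
// Works for any sequence container whose value_type is a raw owning pointer.
template <typename Container>
void delete_all(Container& pointers) {
    typedef typename Container::value_type pointer_type;
    std::for_each(pointers.begin(), pointers.end(),
                  [](pointer_type p) { delete p; });
    pointers.clear();
}
```

The same call then works unchanged whether the container is a std::vector<Counted*> or a std::list<Counted*>.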

I don't believe this is an issue of standards compliance. The C++ standards define the syntax of the language and implementation requirements. You are using the STL which is a powerful library, but like all libraries it is not part of C++ itself...although I guess it could be argued that when used aggressively, libraries like STL and Qt extend the language into a different superset language.
Invalid pointers are perfectly compliant with the C++ standards, the computer just won't like it when you dereference them.
What you are asking is more of a best practices question. If your code is multi-threaded and intList is potentially shared, then your first approach may be more dangerous, but as Greg suggested if you know that intList can't be accessed then the first approach may be more efficient. That said, I believe safety should usually win in a trade-off until you know there is a performance problem.
As suggested by the Design by Contract concept, all code defines a contract, whether implicit or explicit. The real issue with code like this is what you are promising the user: preconditions, postconditions, invariants, etc. The libraries make a certain contract, and each function you write defines its own contract. You just need to pick the appropriate balance for your code, and as long as you make it clear to the user (or yourself six months from now) what is safe and what isn't, it will be okay.
If there are best practices documented with an API, then use them whenever possible. They are probably best practices for a reason. But remember, a best practice may be in the eye of the beholder; that is, it may not be a best practice in all situations.

it is illegal for a vector to contain pointers to invalid memory
This is what the Standard has to say about the contents of a container:
(23.3) : The type of objects stored in these components must meet the requirements of CopyConstructible types (20.1.3), and the additional requirements of Assignable types.
(20.1.3.1, CopyConstructible) : In the following Table 30, T is a type to be supplied by a C++ program instantiating a template, t is a value of type T, and u is a value of type const T.
expression    return type    requirement
----------    -----------    -----------------------------
T(t)                         t is equivalent to T(t)
T(u)                         u is equivalent to T(u)
t.~T()
&t            T*             denotes the address of t
&u            const T*       denotes the address of u
(23.1.4, Assignable) : In Table 64, T is the type used to instantiate the container, t is a value of T, and u is a value of (possibly const) T.
expression    return type    requirement
----------    -----------    -----------------------------
t = u         T&             t is equivalent to u
That's all it says about the contents of an STL collection. It says nothing about pointers, and in particular nothing about whether they must point to valid memory.
Therefore, deleting pointers in a vector, while most likely a very bad architectural decision and an invitation to pain and suffering with the debugger at 3:00 AM on a Saturday night, is perfectly legal.
EDIT:
Regarding Kranar's comment that "assigning a pointer to an invalid pointer value results in undefined behavior." No, this is incorrect. This code is perfectly valid:
Foo* foo = new Foo();
delete foo;
Foo* foo_2 = foo; // This is legal
What is illegal is trying to do something with that pointer (or foo, for that matter):
delete foo_2; // UB
foo_2->do_something(); // UB
Foo& foo_ref = *foo_2; // UB
Simply creating a wild pointer is legal according to the Standard. Probably not a good idea, but legal nonetheless.
EDIT2:
More from the Standard regarding pointer types.
So sayeth the Standard (3.9.2.3) :
... A valid value of an object pointer type represents either the address of a byte in memory (1.7) or a null pointer (4.10)...
...and regarding "a byte in memory," (1.7.1) :
The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set and is composed of a contiguous sequence of bits, the number of which is implementation-defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit. The memory available to a C++ program consists of one or more sequences of contiguous bytes. Every byte has a unique address.
There is nothing here about that byte being part of a living Foo, about you having access to it, or anything of the sort. It's just a byte in memory.

Related

Is FlatBuffers C++ reinterpret_cast access actually undefined behavior? Is it practically OK to do that?

Recently I tried to use FlatBuffers in C++. I found that FlatBuffers seems to use a lot of type punning with things like reinterpret_cast. This makes me a little uncomfortable, because I've learned that it's undefined behavior in many cases.
e.g. Rect in fbs file:
struct Rect {
left:int;
top:int;
right:int;
bottom:int;
}
turns into this C++ code for reading it from a table:
const xxxxx::Rect *position() const {
return GetStruct<const xxxxx::Rect *>(VT_POSITION);
}
and the definition of GetStruct simply uses reinterpret_cast.
My questions are:
Is this really undefined behavior in C++?
In practice, will this kind of usage actually be problematic?
Update:
The buffer can simply come from the network or from disk. I don't know whether it makes a difference if the buffer actually comes from the same memory, written by the writer side of the same C++ program.
But the writer's auto-generated method is:
void add_position(const xxxxx::Rect *position) {
fbb_.AddStruct(Char::VT_POSITION, position);
}
which goes through two further methods and so also ends up using reinterpret_cast.
I haven't analyzed the whole FlatBuffers source code, but I don't see where these objects are created. I see no new expression that would create the P objects here:
template<typename P> P GetStruct(voffset_t field) const {
auto field_offset = GetOptionalFieldOffset(field);
auto p = const_cast<uint8_t *>(data_ + field_offset);
return field_offset ? reinterpret_cast<P>(p) : nullptr;
}
So, it seems that this code does have undefined behavior.
However, this is only true for C++17 and earlier. C++20 introduces implicit-lifetime objects (for example, scalar types and aggregates are implicit-lifetime types). If P has implicit lifetime, then this code can be well defined, provided that the same memory area is always accessed through a type that doesn't violate the type-punning rules (for example, it is always accessed as the same type).
I think both your questions are answered by the Flatbuffers: Use in C++ page:
Direct memory access
As you can see from the above examples, all elements in a buffer are accessed through generated accessors. This is because everything is stored in little endian format on all platforms (the accessor performs a swap operation on big endian machines), and also because the layout of things is generally not known to the user.
For structs, layout is deterministic and guaranteed to be the same across platforms (scalars are aligned to their own size, and structs themselves to their largest member), and you are allowed to access this memory directly by using sizeof() and memcpy on the pointer to a struct, or even an array of structs.
These paragraphs guarantee that – given a valid flatbuffer – all memory accesses are valid, as the memory at that specific location will match the expected layout.
If you are processing untrusted flatbuffers, you first need to use the verifier functions to ensure the flatbuffer is valid:
This verifier will check all offsets, all sizes of fields, and null termination of strings to ensure that when a buffer is accessed, all reads will end up inside the buffer.
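For the struct case, the memcpy route mentioned in the quoted documentation can be sketched as below. This is not FlatBuffers' own API: read_rect and this Rect are stand-ins written for illustration, and the sketch assumes the bytes in the buffer already use the host's byte order and the same four-int layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Stand-in for the generated struct: four 32-bit integers, no padding expected.
struct Rect {
    std::int32_t left;
    std::int32_t top;
    std::int32_t right;
    std::int32_t bottom;
};
static_assert(std::is_trivially_copyable<Rect>::value,
              "memcpy is only appropriate for trivially copyable types");

// Hypothetical helper: copies a Rect out of a raw byte buffer.
// Copying the bytes into a real Rect object sidesteps the aliasing question,
// because the buffer itself is never accessed through a Rect pointer.
Rect read_rect(const unsigned char* buffer, std::size_t size, std::size_t offset) {
    assert(offset <= size && size - offset >= sizeof(Rect));
    Rect result;
    std::memcpy(&result, buffer + offset, sizeof result);
    return result;
}
```

Compilers typically turn a fixed-size memcpy like this into plain loads, so the copy is usually not a measurable cost.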

Convert an array of doubles to an array of structs with only double members without copying data

I am using a third-party C++ library to do some heavy lifting in Julia. On the Julia side, data is stored in an object of type Array{Float64, 2} (this is roughly similar to a 2D array of doubles). I can pass this to C++ using a pointer to a double. On the C++ side, however, data is stored in a struct called vector3:
typedef struct _vector3
{
double x, y, z;
} vector3;
My quick and dirty approach is a five-step process:
Dynamically allocate an array of structs on the C++ side
Copy input data from double* to vector3*
Do heavy lifting
Copy output data from vector3* to double*
Delete dynamically allocated arrays
Copying large amounts of data is very inefficient. Is there some arcane trickery I can use to avoid copying data from double to struct and back? I want to somehow interpret a 1D array of double (with a size that is a multiple of 3) as a 1D array of a struct with 3 double members.
Unfortunately, no, you cannot. This is because of C++'s aliasing rule. In short, if you have an object of type T, you cannot legally access it through a pointer to an incompatible type U. In this sense, you cannot access an object of type double through a pointer of type struct _vector3*, or vice versa.
If you dig deep enough you will find reinterpret_cast and maybe think "Oh, this is exactly what I need", but it is not. No matter what trickery you use (reinterpret_cast or otherwise) to bypass the language restrictions (also known as "just make it compile"), the fact remains that you can legally access an object of type double only through pointers to double.
One trick that is often used to type-pun is a union. This is legal in C but illegal in C++, although some compilers allow it. In your case, however, I don't think there is a way to use a union.
The ideal situation would be to do the heavy lifting on the double* data directly, if that is feasible in your workflow.
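If the heavy lifting can be written against double* at all, one way to keep the x/y/z vocabulary without any aliasing is a small non-owning view. This is only a sketch: Vec3Ref, triple_at and scale_all are invented names, and it assumes the Julia array is laid out so that each (x, y, z) triple is contiguous.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view of one (x, y, z) triple inside a flat buffer.
// It only ever reads and writes doubles through a double*, so no object is
// accessed through an incompatible type.
struct Vec3Ref {
    double* p;
    double& x() const { return p[0]; }
    double& y() const { return p[1]; }
    double& z() const { return p[2]; }
};

// Returns a view of the index-th triple of the flat array.
inline Vec3Ref triple_at(double* data, std::size_t index) {
    assert(data != nullptr);
    return Vec3Ref{data + 3 * index};
}

// Example of "heavy lifting" done in place, without copying to vector3.
void scale_all(double* data, std::size_t triple_count, double factor) {
    for (std::size_t i = 0; i < triple_count; ++i) {
        Vec3Ref v = triple_at(data, i);
        v.x() *= factor;
        v.y() *= factor;
        v.z() *= factor;
    }
}
```

This does not help when the third-party library insists on a vector3*; in that case the copy (or an implementation-specific guarantee) is still needed.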
Strictly speaking, you cannot. I asked a similar question some time ago (Aliasing struct and array the C++ way), and the answers explained why direct aliasing would invoke undefined behaviour, and gave some hints on possible workarounds.
That being said, you are already in a corner case, because the original data comes from a different language. That means that the processing of that data is not covered by the C++ standard and is only defined by the implementation that you are using (gcc/version or clang/version or...).
For your implementation, it might be legal to alias an external array to a C++ struct or an external struct to a C++ array. You should carefully check the documentation on mixed-language programming for your precise implementation.
The other answers mention real issues (the conversion may not work properly). I will add a small runtime check which verifies that aliasing works, and provide a placeholder where you would call/use your copy-heavy code.
int aliasing_supported_internal() {
double testvec[6];
_vector3* testptr = (_vector3*)(void*)testvec;
// check that the pointer wasn't changed
if (testvec != (void*)testptr) return 0;
// check for structure padding
if (testvec+3 != (void*)(testptr+1)) return 0;
// TODO other checks?
return 1;
}
int aliasing_supported() {
static int cached_result = aliasing_supported_internal();
return cached_result;
}
This code converts a small array of doubles to an array of structures aliasing (not copying), then checks if it is valid. If the conversion works (the function returns 1), you are likely to be able to use the same kind of aliasing (converting via void pointer) yourself.
BEWARE THAT THE CODE CAN STILL BE BROKEN IN UNEXPECTED WAYS. The strict aliasing rules state that even the above check is undefined behavior. This may work or may fail horribly. The only conversions that are guaranteed to work properly are to void* and back to the original pointer type. Also, the above check may be completely wrong in multiple inheritance hierarchies or with virtual base classes (both are in a sense unsafe to convert to void*, because the actual pointer value may be shifted, that is, may change due to constraints other than alignment, and by quite a few bytes).

How to use Minimal GC in VC++ 2013? [duplicate]

According to here, VC++ 2013 supports Minimal GC.
Could you guys give me some examples to illustrate its usage?
In other words, with VC++ 2013, how to use GC?
The code example I want might look like this:
auto p = gcnew int;
Are there any?
You may be disappointed by what Minimal GC in C++11 actually is: it doesn't do garbage collection! The minimal garbage collection support in C++11 consists of two parts:
There is a mandate for everybody not to "hide" pointers. When you have a pointer, you are not allowed to obfuscate it from the system, e.g., by writing it to a file to be read back later or by using the xor-trick to create a doubly linked list while storing just one pointer. The standard speaks about safely derived pointers (the relevant clause is 3.7.4.3 [basic.stc.dynamic.safety]).
The standard C++ library provides a set of interfaces which can be used to identify pointers that can't be tracked as being reachable or, once they are no longer reachable, to say so. That is, you can define a set of root objects which are considered to be usable and shouldn't be considered released by any garbage collection system.
There is, however, nothing standardized which actually makes use of these facilities. That nothing is standardized doesn't mean that these promises and interfaces go unused, of course.
The relevant functions for the API outlined above are defined in 20.6.4 [util.dynamic.safety] and the header to include is <memory>. The functions are, briefly:
void std::declare_reachable(void* p) states that, if p is a non-null pointer, the object p points to is reachable even if a garbage collector has decided that it isn't. The function may allocate memory and, thus, throw.
template <typename T> T* std::undeclare_reachable(T* p) states that, if p is a non-null pointer, the object p points to is no longer declared reachable. The number of calls to undeclare_reachable(p) shall not exceed the number of calls to declare_reachable(p) with the same pointer.
void std::declare_no_pointers(char* p, size_t n) declares that the range of n bytes starting at p does not contain any pointers, even if a garbage collector has decided that there are pointers inside.
void std::undeclare_no_pointers(char* p, size_t n) undoes the declaration that there are no pointers in the n bytes starting at p.
std::pointer_safety std::get_pointer_safety() noexcept returns whether the implementation has strict pointer safety.
I think that all of these functions can basically be implemented to do nothing and return a default value, or an argument where a return type is specified. The point of these functions is that there is a portable way to inform garbage collectors about pointers to consider reachable and memory areas not to trace.
In the future some level of garbage collection or, more likely, litter collection may be added, but I'm not sure if there is a concrete proposal on the table. If something is added, it will probably be something dubbed litter collection, because it doesn't actually clean up all garbage: litter collection would just reclaim the memory of unreachable objects but not try to destroy the objects! That is, the system would give the view of an indefinitely living object, although it may reuse the memory where it was located.

malloc & placement new vs. new

I've been looking into this for the past few days, and so far I haven't really found anything convincing other than dogmatic arguments or appeals to tradition (i.e. "it's the C++ way!").
If I'm creating an array of objects, what is the compelling reason (other than ease) for using:
#define MY_ARRAY_SIZE 10
// ...
my_object * my_array=new my_object [MY_ARRAY_SIZE];
for (int i=0;i<MY_ARRAY_SIZE;++i) my_array[i]=my_object(i);
over
#define MEMORY_ERROR -1
#define MY_ARRAY_SIZE 10
// ...
my_object * my_array=(my_object *)malloc(sizeof(my_object)*MY_ARRAY_SIZE);
if (my_array==NULL) throw MEMORY_ERROR;
for (int i=0;i<MY_ARRAY_SIZE;++i) new (my_array+i) my_object (i);
As far as I can tell the latter is much more efficient than the former (since you don't initialize memory to some non-random value/call default constructors unnecessarily), and the only difference really is the fact that one you clean up with:
delete [] my_array;
and the other you clean up with:
for (int i=0;i<MY_ARRAY_SIZE;++i) my_array[i].~my_object();
free(my_array);
I'm out for a compelling reason. Appeals to the fact that it's C++ (not C) and therefore malloc and free shouldn't be used isn't -- as far as I can tell -- compelling as much as it is dogmatic. Is there something I'm missing that makes new [] superior to malloc?
I mean, as best I can tell, you can't even use new [] -- at all -- to make an array of things that don't have a default, parameterless constructor, whereas the malloc method can thusly be used.
I'm out for a compelling reason.
It depends on how you define "compelling". Many of the arguments you have thus far rejected are certainly compelling to most C++ programmers, as your suggestion is not the standard way to allocate naked arrays in C++.
The simple fact is this: yes, you absolutely can do things the way you describe. There is no reason that what you are describing will not function.
But then again, you can have virtual functions in C. You can implement classes and inheritance in plain C, if you put the time and effort into it. Those are entirely functional as well.
Therefore, what matters is not whether something can work. But more on what the costs are. It's much more error prone to implement inheritance and virtual functions in C than C++. There are multiple ways to implement it in C, which leads to incompatible implementations. Whereas, because they're first-class language features of C++, it's highly unlikely that someone would manually implement what the language offers. Thus, everyone's inheritance and virtual functions can cooperate with the rules of C++.
The same goes for this. So what are the gains and the losses from manual malloc/free array management?
I can't say that any of what I'm about to say constitutes a "compelling reason" for you. I rather doubt it will, since you seem to have made up your mind. But for the record:
Performance
You claim the following:
As far as I can tell the latter is much more efficient than the former (since you don't initialize memory to some non-random value/call default constructors unnecessarily), and the only difference really is the fact that one you clean up with:
This statement suggests that the efficiency gain is primarily in the construction of the objects in question. That is, which constructors are called. The statement presupposes that you don't want to call the default constructor; that you use a default constructor just to create the array, then use the real initialization function to put the actual data into the object.
Well... what if that's not what you want to do? What if what you want to do is create an empty array, one that is default constructed? In this case, this advantage disappears entirely.
Fragility
Let's assume that each object in the array needs to have a specialized constructor or something called on it, such that initializing the array requires this sort of thing. But consider your destruction code:
for (int i=0;i<MY_ARRAY_SIZE;++i) my_array[i].~my_object();
For a simple case, this is fine. You have a macro or const variable that says how many objects you have. And you loop over each element to destroy the data. That's great for a simple example.
Now consider a real application, not an example. How many different places will you be creating an array in? Dozens? Hundreds? Each and every one will need to have its own for loop for initializing the array. Each and every one will need to have its own for loop for destroying the array.
Mis-type this even once, and you can corrupt memory. Or not delete something. Or any number of other horrible things.
And here's an important question: for a given array, where do you keep the size? Do you know how many items you allocated for every array that you create? Each array will probably have its own way of knowing how many items it stores. So each destructor loop will need to fetch this data properly. If it gets it wrong... boom.
And then we have exception safety, which is a whole new can of worms. If one of the constructors throws an exception, the previously constructed objects need to be destructed. Your code doesn't do that; it's not exception-safe.
Now, consider the alternative:
delete[] my_array;
This can't fail. It will always destroy every element. It tracks the size of the array, and it's exception-safe. So it is guaranteed to work. It can't not work (as long as you allocated it with new[]).
Of course, you could say that you could wrap the array in an object. That makes sense. You might even template the object on the element type of the array. That way, all the destructor code is the same. The size is contained in the object. And maybe, just maybe, you realize that the user should have some control over the particular way the memory is allocated, so that it's not just malloc/free.
Congratulations: you just re-invented std::vector.
Which is why many C++ programmers don't even type new[] anymore.
Flexibility
Your code uses malloc/free. But let's say I'm doing some profiling. And I realize that malloc/free for certain frequently created types is just too expensive. I create a special memory manager for them. But how to hook all of the array allocations to them?
Well, I have to search the codebase for any location where you create/destroy arrays of these types. And then I have to change their memory allocators accordingly. And then I have to continuously watch the codebase so that someone else doesn't change those allocators back or introduce new array code that uses different allocators.
If I were instead using new[]/delete[], I could use operator overloading. I simply provide an overload for operators new[] and delete[] for those types. No code has to change. It's much more difficult for someone to circumvent these overloads; they have to actively try to. And so forth.
So I get greater flexibility and reasonable assurance that my allocators will be used where they should be used.
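As a rough sketch of that hook: class-specific operator new[] and operator delete[] are standard C++, but the Particle type, its counters, and the use of malloc as a stand-in for a real custom memory manager are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical type whose arrays should come from a special allocator.
struct Particle {
    double x, y, z;
    Particle() : x(0.0), y(0.0), z(0.0) {}

    // Counters exist only to make the calls observable in this sketch.
    static int array_allocs;
    static int array_frees;

    // Used by every `new Particle[n]`; malloc stands in for a real pool.
    static void* operator new[](std::size_t bytes) {
        void* block = std::malloc(bytes);
        if (block == nullptr) throw std::bad_alloc();
        ++array_allocs;
        return block;
    }

    // Used by every `delete[] p` where p points to a Particle array.
    static void operator delete[](void* block) noexcept {
        if (block == nullptr) return;
        assert(array_frees < array_allocs);  // never more frees than allocations
        ++array_frees;
        std::free(block);
    }
};

int Particle::array_allocs = 0;
int Particle::array_frees = 0;
```

Call sites keep writing new Particle[n] and delete[] p; only the class definition knows where the memory comes from.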
Readability
Consider this:
my_object *my_array = new my_object[10];
for (int i=0; i<MY_ARRAY_SIZE; ++i)
my_array[i]=my_object(i);
//... Do stuff with the array
delete [] my_array;
Compare it to this:
my_object *my_array = (my_object *)malloc(sizeof(my_object) * MY_ARRAY_SIZE);
if(my_array==NULL)
throw MEMORY_ERROR;
int i;
try
{
for(i=0; i<MY_ARRAY_SIZE; ++i)
new(my_array+i) my_object(i);
}
catch(...) //Exception safety.
{
for(; i>0; --i) //The i-th object was not successfully constructed
my_array[i-1].~my_object();
free(my_array); //Release the raw block before propagating the exception
throw;
}
//... Do stuff with the array
for(i=MY_ARRAY_SIZE; i>0; --i) //Destroy in reverse order of construction
my_array[i-1].~my_object();
free(my_array);
Objectively speaking, which one of these is easier to read and understand what's going on?
Just look at this statement: (my_object *)malloc(sizeof(my_object) * MY_ARRAY_SIZE). This is a very low level thing. You're not allocating an array of anything; you're allocating a hunk of memory. You have to manually compute the size of the hunk of memory to match the size of the object * the number of objects you want. It even features a cast.
By contrast, new my_object[10] tells the story. new is the C++ keyword for "create instances of types". my_object[10] is a 10 element array of my_object type. It's simple, obvious, and intuitive. There's no casting, no computing of byte sizes, nothing.
The malloc method requires learning how to use malloc idiomatically. The new method requires just understanding how new works. It's much less verbose and much more obvious what's going on.
Furthermore, after the malloc statement, you do not in fact have an array of objects. malloc simply returns a block of memory that you have told the C++ compiler to pretend is a pointer to an object (with a cast). It isn't an array of objects, because objects in C++ have lifetimes. And an object's lifetime does not begin until it is constructed. Nothing in that memory has had a constructor called on it yet, and therefore there are no living objects in it.
my_array at that point is not an array; it's just a block of memory. It doesn't become an array of my_objects until you construct them in the next step. This is incredibly unintuitive to a new programmer; it takes a seasoned C++ hand (one who probably learned from C) to know that those aren't live objects and should be treated with care. The pointer does not yet behave like a proper my_object*, because it doesn't point to any my_objects yet.
By contrast, you do have living objects in the new[] case. The objects have been constructed; they are live and fully-formed. You can use this pointer just like any other my_object*.
Fin
None of the above says that this mechanism isn't potentially useful in the right circumstances. But it's one thing to acknowledge the utility of something in certain circumstances. It's quite another to say that it should be the default way of doing things.
If you do not want to get your memory initialized by implicit constructor calls, and just need an assured memory allocation for placement new then it is perfectly fine to use malloc and free instead of new[] and delete[].
The compelling reasons for using new over malloc are that new provides implicit initialization through constructor calls, saving you an additional memset or similar call after malloc, and that with new you do not need to check for NULL after every allocation; an enclosing exception handler does the job, saving you the redundant error checking that malloc requires.
Neither of these compelling reasons applies to your usage.
Which one performs better can only be determined by profiling; there is nothing wrong with the approach you have now. On a side note, I don't see a compelling reason to use malloc over new[] either.
I would say neither.
The best way to do it would be:
std::vector<my_object> my_array;
my_array.reserve(MY_ARRAY_SIZE);
for (int i=0;i<MY_ARRAY_SIZE;++i)
{
my_array.push_back(my_object(i));
}
This is because internally vector is probably doing the placement new for you. It is also managing all the other memory-management problems that you are not taking into account.
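To tie this back to the question's point about types without a default constructor, here is a small sketch. The my_object below is a stand-in defined only for this example (the real class is not shown in the question), and make_objects is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the question's class: note that it has no default constructor.
struct my_object {
    int id;
    explicit my_object(int i) : id(i) {}
};

// Hypothetical helper: one allocation up front, then each element is
// constructed in the reserved storage from a temporary; no default
// construction ever happens.
std::vector<my_object> make_objects(std::size_t count) {
    std::vector<my_object> objects;
    objects.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        objects.push_back(my_object(static_cast<int>(i)));
    }
    assert(objects.size() == count);
    return objects;
}
```

If a constructor throws part-way through, the vector destroys the elements built so far and frees the storage when it goes out of scope.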
You've reimplemented new[]/delete[] here, and what you have written is pretty common in developing specialized allocators.
The overhead of calling simple constructors will take little time compared to the allocation. It's not necessarily 'much more efficient' -- it depends on the complexity of the default constructor, and of operator=.
One nice thing that has not been mentioned yet is that the array's size is known by new[]/delete[]. delete[] just does the right thing and destructs all elements when asked. Dragging an additional variable (or three) around so you know exactly how to destroy the array is a pain. A dedicated collection type would be a fine alternative, however.
new[]/delete[] are preferable for convenience. They introduce little overhead, and could save you from a lot of silly errors. Are you compelled enough to take away this functionality and use a collection/container everywhere to support your custom construction? I've implemented this allocator -- the real mess is creating functors for all the construction variations you need in practice. At any rate, you often have a more exact execution at the expense of a program which is often more difficult to maintain than the idioms everybody knows.
IMHO they're both ugly; it's better to use vectors. Just make sure to allocate the space in advance for performance.
Either:
std::vector<my_object> my_array(MY_ARRAY_SIZE);
If you want to initialize with a default value for all entries.
my_object basic;
std::vector<my_object> my_array(MY_ARRAY_SIZE, basic);
Or if you don't want to construct the objects but do want to reserve the space:
std::vector<my_object> my_array;
my_array.reserve(MY_ARRAY_SIZE);
Then, if you need to access it as a C-style pointer array (just make sure you don't add elements while keeping the old pointer, but you couldn't do that with regular C-style arrays anyway):
my_object* carray = &my_array[0];
my_object* carray = &my_array.front(); // Or the C++ way
Access individual elements:
my_object value = my_array[i]; // The non-safe c-like faster way
my_object value = my_array.at(i); // With bounds checking, throws range exception
Typedef for pretty:
typedef std::vector<my_object> object_vect;
Pass them around functions with references:
void some_function(const object_vect& my_array);
EDIT:
In C++11 there is also std::array. The problem with it, though, is that its size is a template parameter, so you can't make differently sized ones at runtime and you can't pass it into functions unless they expect that exact same size (or are function templates themselves). But it can be useful for things like buffers.
std::array<int, 1024> my_array;
EDIT2:
Also in C++11 there is a new emplace_back as an alternative to push_back. It constructs the object directly in the vector from the arguments you pass (or moves an existing object in if you pass an rvalue), which saves you a copy.
std::vector<SomeClass> v;
SomeClass bob {"Bob", "Ross", 10.34f};
v.emplace_back(bob); // bob is an lvalue, so this still copies it into the vector
v.emplace_back("Another", "One", 111.0f); // <- Note this doesn't work with initialization lists ☹
Oh well, I was thinking that given the number of answers there would be no reason to step in... but I guess I am drawn in as the others. Let's go
Why your solution is broken
C++11 new facilities for handling raw memory
Simpler way to get this done
Advice
1. Why your solution is broken
First, the two snippets you presented are not equivalent. new[] just works; yours fails horribly in the presence of exceptions.
What new[] does under the cover is that it keeps track of the number of objects that were constructed, so that if an exception occurs during say the 3rd constructor call it properly calls the destructor for the 2 already constructed objects.
Your solution however fails horribly:
either you don't handle exceptions at all (and leak horribly)
or you just try to call the destructors on the whole array even though it's half built (likely crashing, but who knows with undefined behavior)
So the two are clearly not equivalent. Yours is broken
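For illustration, this is roughly the bookkeeping a hand-rolled version has to do to match new[]. Widget, construct_all and the fail_at parameter are invented for the sketch; the point is only the rollback in the catch block.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical element type whose constructor can throw.
struct Widget {
    static int alive;
    explicit Widget(bool fail) {
        if (fail) throw std::runtime_error("construction failed");
        ++alive;
    }
    ~Widget() { --alive; }
};
int Widget::alive = 0;

// Constructs `count` Widgets in raw storage; the one at index `fail_at`
// throws. On failure, the already built ones are destroyed in reverse
// order before the exception is rethrown.
Widget* construct_all(void* raw, std::size_t count, std::size_t fail_at) {
    assert(raw != nullptr);
    Widget* first = static_cast<Widget*>(raw);
    std::size_t built = 0;
    try {
        for (; built < count; ++built)
            ::new (static_cast<void*>(first + built)) Widget(built == fail_at);
    } catch (...) {
        while (built > 0) first[--built].~Widget();
        throw;
    }
    return first;
}
```

The caller still owns the raw storage and has to release it whether or not construction succeeded.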
2. C++11 new facilities for handling raw memory
In C++11, the committee members realized how much we like fiddling with raw memory, and they introduced facilities to help us do so more efficiently and more safely.
Check cppreference's <memory> brief. This example shows off the new goodies (*):
#include <iostream>
#include <string>
#include <memory>
#include <algorithm>
int main()
{
const std::string s[] = {"This", "is", "a", "test", "."};
std::string* p = std::get_temporary_buffer<std::string>(5).first;
std::copy(std::begin(s), std::end(s),
std::raw_storage_iterator<std::string*, std::string>(p));
for(std::string* i = p; i!=p+5; ++i) {
std::cout << *i << '\n';
i->~basic_string<char>();
}
std::return_temporary_buffer(p);
}
Note that get_temporary_buffer is no-throw, it returns the number of elements for which memory has actually been allocated as a second member of the pair (thus the .first to get the pointer).
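(*) Or perhaps not so new, as MooingDuck remarked: get_temporary_buffer and raw_storage_iterator were already in C++98, and both were later deprecated in C++17 and removed in C++20.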
3. Simpler way to get this done
As far as I am concerned, what you really seem to be asking for is a kind of typed memory pool, where some slots may not have been initialized.
Do you know about boost::optional ?
It is basically an area of raw memory that can fit one item of a given type (the template parameter) but defaults to holding nothing. It has an interface similar to a pointer and lets you query whether or not the memory is actually occupied. Finally, using the In-Place Factories, you can safely use it without copying objects, if that is a concern.
Well, your use case really looks like a std::vector< boost::optional<T> > to me (or perhaps a deque?)
4. Advice
Finally, in case you really want to do it on your own, whether for learning or because no STL container really suits you, I do suggest you wrap this up in an object to avoid the code sprawling all over the place.
Don't forget: Don't Repeat Yourself!
With an object (templated) you can capture the essence of your design in one single place, and then reuse it everywhere.
And of course, why not take advantage of the new C++11 facilities while doing so :) ?
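A minimal sketch of what such a wrapper could look like, assuming a fixed capacity and no over-aligned types. RawBuffer is a made-up name, and a real container would need much more (copying, growth, iterators).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical fixed-capacity buffer: memory is allocated once in the
// constructor, objects are constructed later, one at a time.
template <typename T>
class RawBuffer {
public:
    explicit RawBuffer(std::size_t capacity)
        : data_(static_cast<T*>(::operator new(capacity * sizeof(T)))),
          capacity_(capacity), size_(0) {}

    RawBuffer(const RawBuffer&) = delete;
    RawBuffer& operator=(const RawBuffer&) = delete;

    ~RawBuffer() {
        while (size_ > 0) data_[--size_].~T();   // destroy in reverse order
        ::operator delete(data_);
    }

    template <typename... Args>
    T& emplace_back(Args&&... args) {
        assert(size_ < capacity_);
        T* slot = ::new (static_cast<void*>(data_ + size_))
            T(std::forward<Args>(args)...);
        ++size_;   // counted only after construction succeeded
        return *slot;
    }

    std::size_t size() const { return size_; }
    T& operator[](std::size_t i) { assert(i < size_); return data_[i]; }

private:
    T* data_;
    std::size_t capacity_;
    std::size_t size_;
};
```

Because size_ is bumped only after a constructor returns, the destructor never touches a half-built element, and the placement new lives in exactly one place.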
You should use vectors.
Dogmatic or not, that is exactly what ALL the STL containers do to allocate and initialize.
They use an allocator that allocates uninitialized space, and then initialize it by means of the container constructors.
If this (as many people like to say) "is not C++", how can the standard library be implemented just like that?
If you just don't want to use malloc / free, you can allocate "bytes" with just new char[]
myobject* pvext = reinterpret_cast<myobject*>(new char[sizeof(myobject)*vectsize]);
for(int i=0; i<vectsize; ++i) new(pvext+i) myobject(params);
...
for(int i=vectsize-1; i>=0; --i) (pvext+i)->~myobject();
delete[] reinterpret_cast<char*>(pvext);
This lets you take advantage of the separation between initialization and allocation, while still taking advantage of the exception mechanism of new.
Note that, putting my first and last lines into a myallocator<myobject> class and the second and second-to-last into a myvector<myobject> class, we have ... just reimplemented std::vector<myobject, std::allocator<myobject> >
What you have shown here is actually the way to go when using a memory allocator different from the system's general allocator. In that case you would allocate your memory using the allocator (alloc->malloc(sizeof(my_object))) and then use the placement new operator to initialize it. This has many advantages for efficient memory management and is quite common in the standard template library.
If you are writing a class that mimics functionality of std::vector or needs control over memory allocation/object creation (insertion in array / deletion etc.) - that's the way to go. In this case, it's not a question of "not calling default constructor". It becomes a question of being able to "allocate raw memory, memmove old objects there and then create new objects at the olds' addresses", question of being able to use some form of realloc and so on. Unquestionably, custom allocation + placement new are way more flexible... I know, I'm a bit drunk, but std::vector is for sissies... About efficiency - one can write their own version of std::vector that will be AT LEAST as fast ( and most likely smaller, in terms of sizeof() ) with most used 80% of std::vector functionality in, probably, less than 3 hours.
my_object * my_array=new my_object [10];
This will be an array with objects.
my_object * my_array=(my_object *)malloc(sizeof(my_object)*MY_ARRAY_SIZE);
This will be an array the size of your objects, but they may be "broken". If your class has virtual functions, for instance, you won't be able to call them. Note that it's not just your member data that may be inconsistent; the entire object is actually "broken" (for lack of a better word).
I'm not saying it's wrong to do the second one, just as long as you know this.

C++: What are scenarios where using pointers is a "Good Idea"(TM)? [duplicate]

Closed 12 years ago.
Possible Duplicate:
Common Uses For Pointers?
I am still learning the basics of C++ but I already know enough to do useful little programs.
I understand the concept of pointers and the examples I see in tutorials make sense to me. However, on the practical level, and being a (former) PHP developer, I am not yet confident to actually use them in my programs.
In fact, so far I have not felt the need to use any pointer. I have my classes and functions and I seem to be doing perfectly fine without using any pointer (let alone pointers to pointers). And I can't help feeling a bit proud of my little programs.
Still, I am aware that I am missing out on one of C++'s most important features, a double-edged one: pointers and memory management can create havoc, seemingly random crashes, hard-to-find bugs and security holes... but at the same time, properly used, they must allow for clever and efficient programming.
So: do tell me what I am missing by not using pointers.
What are good scenarios where using pointers is a must?
What do they allow you to do that you couldn't do otherwise?
In which way do they make your programs more efficient?
And what about pointers to pointers???
[Edit: All the various answers are useful. One problem at SO is that we cannot "accept" more than one answer. I often wish I could. Actually, it's all the answers combined that help to understand better the whole picture. Thanks.]
I use pointers when I want to give a class access to an object, without giving it ownership of that object. Even then, I can use a reference, unless I need to be able to change which object I am accessing and/or I need the option of no object, in which case the pointer would be NULL.
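A small sketch of that pattern, with invented Log and Worker types: the pointer is non-owning, may be null, and can be re-pointed, which is exactly what a reference cannot do.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sink that collects messages; owned by the caller.
struct Log {
    std::vector<std::string> lines;
};

// Worker only observes a Log: the pointer may be null (no logging)
// and may be re-pointed at another Log later. It never deletes it.
class Worker {
public:
    explicit Worker(Log* log = nullptr) : log_(log) {}
    void set_log(Log* log) { log_ = log; }
    void run(const std::string& task) {
        if (log_) log_->lines.push_back("ran " + task);
    }
private:
    Log* log_;   // non-owning
};

// Usage sketch: the caller owns both Log objects.
void worker_usage() {
    Log a, b;
    Worker w;          // starts with no log
    w.run("x");        // silently skipped
    w.set_log(&a);
    w.run("y");
    w.set_log(&b);     // re-pointed; a is left untouched
    w.run("z");
    assert(a.lines.size() == 1 && b.lines.size() == 1);
}
```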
This question has been asked on SO before. My answer from there:
I use pointers about once every six lines in the C++ code that I write. Off the top of my head, these are the most common uses:
When I need to dynamically create an object whose lifetime exceeds the scope in which it was created.
When I need to allocate an object whose size is unknown at compile time.
When I need to transfer ownership of an object from one thing to another without actually copying it (like in a linked list/heap/whatever of really big, expensive structs)
When I need to refer to the same object from two different places.
When I need to slice an array without copying it.
When I need to use compiler intrinsics to generate CPU-specific instructions, or work around situations where the compiler emits suboptimal or naive code.
When I need to write directly to a specific region of memory (because it has memory-mapped IO).
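As one concrete case from the list, slicing an array without copying can be sketched with a pair of pointers. sum_range is just an illustrative name.

```cpp
#include <cassert>

// Sums the half-open range [first, last) without copying any elements.
int sum_range(const int* first, const int* last) {
    assert(first <= last);
    int total = 0;
    for (const int* p = first; p != last; ++p) total += *p;
    return total;
}
```

Given int data[] = {1, 2, 3, 4, 5, 6}, the call sum_range(data + 2, data + 5) looks only at the middle three elements of the same array.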
Pointers are commonly used in C++. Becoming comfortable with them will help you understand a broader range of code. That said, if you can avoid them, that is great; however, as your programs become more complex over time, you will likely need them, even if only to interface with other libraries.
Primarily pointers are used to refer to dynamically allocated memory (returned by new).
They allow functions to take arguments that cannot be copied onto the stack either because they are too big or cannot be copied, such as an object returned by a system call. (I think also stack alignment, can be an issue, but too hazy to be confident.)
In embedded programming they are used to refer to things like hardware registers, which require that the code write to a very specific address in memory.
Pointers are also used to access objects through their base class interfaces. That is, if I have a class B that is derived from class A (class B : public A {}), an instance of B can be accessed as if it were an A by storing its address in a pointer to A, i.e.: A *a = &b_obj;
It is a C idiom to use pointers as iterators on arrays. This may still be common in older C++ code, but is probably considered a poor cousin to the STL iterator objects.
If you need to interface with C code, you will invariably need to handle pointers, which are used to refer to dynamically allocated objects, as C has no references. C strings are just pointers to an array of characters terminated by the nul '\0' character.
Once you feel comfortable with pointers, pointers to pointers won't seem so awful. The most obvious example is the argument list to main(). This is typically declared as char *argv[], but I have seen it declared (legally I believe) as char **argv.
The declaration is C style, but it says that I have an array of pointers to char (equivalently, a pointer to pointers to char). It is interpreted as an arbitrarily sized array (the size is carried by argc) of C-style strings (character arrays terminated by the nul '\0' character).
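To make the char **argv layout concrete, here is a sketch that walks it. collect_args is a made-up helper, and skipping argv[0] follows the usual convention that it holds the program name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Copies the C-style argument strings into std::string objects.
// argv[0] (conventionally the program name) is skipped.
std::vector<std::string> collect_args(int argc, char** argv) {
    assert(argc >= 0 && argv != nullptr);
    std::vector<std::string> args;
    for (int i = 1; i < argc; ++i) args.push_back(argv[i]);   // argv[i] is a char*
    return args;
}
```

Calling it as collect_args(argc, argv) from main works with either spelling of the argv parameter.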
If you haven't felt a need for pointers, I wouldn't spend a lot of time worrying about them until a need arises.
That said, one of the primary ways pointers can contribute to more efficient programming is by avoiding copies of actual data. For example, let's assume you were writing a network stack. You receive an Ethernet packet to be processed. You successively pass that data up the stack from the "raw" Ethernet driver to the IP driver to the TCP driver to, say, the HTTP driver to something that processes the HTML it contains.
If you're making a new copy of the contents for each of those, you end up making at least four copies of the data before you actually get around to rendering it at all.
Using pointers can avoid a lot of that -- instead of copying the data itself, you just pass around a pointer to the data. Each successive layer of the network stack looks at its own header, and passes a pointer to what it considers the "payload" up to the next higher layer in the stack. That next layer looks at its own header, modifies the pointer to show what it considers the payload, and passes it on up the stack. Instead of four copies of the data, all four layers work with one copy of the real data.
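A toy version of that idea, with an invented View type and header sizes chosen by the caller (real protocol headers are of course more involved):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A non-owning view of part of a received packet.
struct View {
    const std::uint8_t* data;
    std::size_t size;
};

// Each layer skips its own header by advancing the pointer; the bytes
// themselves are never copied. The header size is supplied by the caller.
View strip_header(View packet, std::size_t header_size) {
    assert(header_size <= packet.size);
    return View{packet.data + header_size, packet.size - header_size};
}
```

Every layer ends up with a View into the same buffer, so the payload is read where the driver first put it.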
A big use for pointers is dynamic sizing of arrays. When you don't know the size of the array at compile time, you will need to allocate it at run-time.
int *array = new int[dynamicSize];
If your solution to this problem is to use std::vector from the STL, note that it uses dynamic memory allocation behind the scenes.
There are several scenarios where pointers are required:
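If you are using abstract base classes with virtual methods. You can hold a std::vector of pointers to the base class, loop through all these objects, and call a virtual method. This REQUIRES pointers (a vector cannot hold references, and storing base objects by value would slice them).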
You can pass a pointer to a buffer to a method reading from a file etc.
You need a lot of memory allocated on the heap.
It's a good thing to care about memory problems right from the start. So if you start using pointers, you might as well take a look at smart pointers, like boost's shared_ptr for example.
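A sketch of the first scenario, using C++11's std::unique_ptr as the smart pointer rather than boost's shared_ptr (my substitution, to keep the example free of dependencies). Shape, Square and Rect are invented types.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Shape {
    virtual ~Shape() {}
    virtual double area() const = 0;
};
struct Square : Shape {
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
    double side;
};
struct Rect : Shape {
    Rect(double w, double h) : width(w), height(h) {}
    double area() const override { return width * height; }
    double width, height;
};

// The vector stores pointers to the base class, so each call to area()
// is dispatched to the derived class at run time.
double total_area(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double total = 0.0;
    for (const auto& s : shapes) {
        assert(s != nullptr);
        total += s->area();
    }
    return total;
}
```

Elements are added with shapes.push_back(std::unique_ptr<Shape>(new Square(2.0))), and the vector deletes them when it goes out of scope.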
What are good scenarios where using pointers is a must?
Interviews. Implement strcpy.
What do they allow you to do that you couldn't do otherwise?
Use of inheritance hierarchy. Data structures like Binary trees.
In which way do they make your programs more efficient?
They give more control to the programmer, for creating and deleting resources at run time.
And what about pointers to pointers???
A frequently asked interview question. How will you create two dimensional array on heap.
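For what it's worth, the usual interview answer looks something like the sketch below (make_grid and free_grid are made-up names). It does not try to clean up if one of the row allocations throws.

```cpp
#include <cassert>
#include <cstddef>

// Allocates a rows x cols grid: an array of row pointers, each pointing
// to its own zero-initialized array of ints.
int** make_grid(std::size_t rows, std::size_t cols) {
    int** grid = new int*[rows];
    for (std::size_t r = 0; r < rows; ++r) grid[r] = new int[cols]();
    return grid;
}

// Releases the rows first, then the array of row pointers.
void free_grid(int** grid, std::size_t rows) {
    assert(grid != nullptr);
    for (std::size_t r = 0; r < rows; ++r) delete[] grid[r];
    delete[] grid;
}
```

Elements are then addressed as grid[r][c]. Outside an interview, a single std::vector<int> of rows * cols elements is usually simpler.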
A pointer has a special value, NULL, that a reference cannot have. I use pointers wherever NULL is a valid and useful value.
I just want to say that I rarely use pointers. I use references and STL objects (deque, list, map, etc).
A good idea is when you need to return an object that the calling function should free, or when you don't want to return by value.
List<char*>* fileToList(char* filename); // don't want to return the list by value
ClassName* DataToMyClass(DbConnectionOrSomeType& data);
// alternatively you can do the below, which doesn't require pointers
void DataToMyClass(DbConnectionOrSomeType& data, ClassName& myClass);
That's pretty much the only situation where I use them, but I am not thinking that hard. Also, if I want a function to modify a variable and can't use the return value (say I need more than one):
bool SetToFiveIfPositive(int** v);
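A simpler illustration of the "more than one result" case, using a hypothetical divide function instead: the return value reports success, and the two results come back through pointers.

```cpp
#include <cassert>

// Returns false (and leaves the outputs untouched) when b is zero;
// otherwise writes both results through the pointers.
bool divide(int a, int b, int* quotient, int* remainder) {
    assert(quotient != nullptr && remainder != nullptr);
    if (b == 0) return false;
    *quotient = a / b;
    *remainder = a % b;
    return true;
}
```

The caller passes addresses, as in divide(7, 2, &q, &r), which makes it visible at the call site that q and r may be modified.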
You can use them for linked lists, trees, etc.
They're very important data structures.
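A bare-bones singly linked list, as a sketch of why pointers are needed here: each node has to refer to the next one, and the end is marked by a null pointer. All names are illustrative.

```cpp
#include <cassert>

struct Node {
    int value;
    Node* next;   // nullptr marks the end of the list
};

// Prepends a value and returns the new head.
Node* push_front(Node* head, int value) {
    return new Node{value, head};
}

// Returns the first value; the list must not be empty.
int front(const Node* head) {
    assert(head != nullptr);
    return head->value;
}

// Walks the list by following the next pointers.
int length(const Node* head) {
    int n = 0;
    for (const Node* p = head; p != nullptr; p = p->next) ++n;
    return n;
}

// Deletes every node.
void clear(Node* head) {
    while (head != nullptr) {
        Node* next = head->next;
        delete head;
        head = next;
    }
}
```

In real code std::list or std::forward_list already does this, but the pointers are still there underneath.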
In general, pointers are useful as they can hold the address of a chunk of memory. They are especially useful in some low-level drivers, where they are used to operate efficiently on a piece of memory byte by byte. They are the most powerful invention that C++ inherits from C.
As to pointer to pointer, here is a "hello-world" example showing you how to use it.
#include <iostream>
int main()
{
int i = 1;
int j = 2;
int *pInt = &i; // "pInt" points to "i"
std::cout<<*pInt<<std::endl; // prints: 1
*pInt = 6; // modify i, i = 6
std::cout<<i<<std::endl; // prints: 6
int **ppInt = &pInt; // "ppInt" points to "pInt"
std::cout<<**ppInt<<std::endl; // prints: 6
**ppInt = 8; // modify i, i = 8
std::cout<<i<<std::endl; // prints: 8
*ppInt = &j; // now pInt points to j
*pInt = 10; // modify j, j = 10
std::cout<<j<<std::endl; // prints: 10
}
As we see, "pInt" is a pointer to an integer which points to "i" at the beginning. With it, you can modify "i". "ppInt" is a pointer to a pointer which points to "pInt". With it, you can modify "pInt", whose value is an address. As a result, "*ppInt = &j" makes "pInt" point to "j". That gives all the results above.