<vector> push_back implementation internal - c++

In std::vector's push_back implementation, when size() == capacity(), the vector allocates a larger block (often twice the size) and copies the old elements over. I want to ask: what is the most efficient way to do that copying?

This is not really specified in the standard, but left to the implementation. Thus, different C++ library versions will do this differently. However, the vector's allocator must be used for all allocations and de-allocations (hence no usage of realloc()).
What the standard does specify, though, is that if objects can be moved (instead of copied), then they should be moved (since C++11). Typically, a new block of raw memory is allocated, usually twice the size of the previous one, and the elements are then moved (or copied and destroyed) from the old block to the new one. For the generic case, this moving/copying must be done one element at a time (since otherwise the integrity of the moves/copies cannot be guaranteed without knowledge of the internals of the respective move/copy constructors). Of course, for POD (plain old data) types, optimisations via std::memcpy() are possible (and most likely implemented).
You can try to look at your C++ library's implementation, but be warned that the code can be quite opaque to those unfamiliar with template metaprogramming: it is complicated because of the various optimisations (for POD, movable types, etc.).
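For illustration, here is a minimal sketch of how such a growth step might look; this is not any particular library's code (the names grow, data, etc. are made up), and error handling (e.g. cleanup if a copy constructor throws mid-loop) is omitted:
#include <cstddef>
#include <new>
#include <utility>

template <typename T>
void grow(T*& data, std::size_t size, std::size_t& capacity) {
    std::size_t new_capacity = capacity ? 2 * capacity : 1;
    // Allocate raw memory; a real implementation would go through the allocator.
    T* fresh = static_cast<T*>(::operator new(new_capacity * sizeof(T)));
    for (std::size_t i = 0; i < size; ++i) {
        // Move if the move constructor is noexcept, otherwise copy,
        // so the strong exception guarantee can be preserved.
        ::new (static_cast<void*>(fresh + i)) T(std::move_if_noexcept(data[i]));
        data[i].~T();
    }
    ::operator delete(data);
    data = fresh;
    capacity = new_capacity;
}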

Sounds a bit as if you wanted to implement std::vector<> yourself.
The most efficient way is to avoid reallocations in the first place, which often but not always is possible.
If you have prior knowledge about how many items you will push_back() next, you can use std::vector<>::reserve() and avoid reallocations altogether.
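For example, a minimal sketch:
#include <vector>

int main() {
    std::vector<int> values;
    values.reserve(1000);            // one allocation up front
    for (int i = 0; i < 1000; ++i)
        values.push_back(i);         // no reallocation happens inside this loop
}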
If you really were to implement it yourself, you might want to use realloc(), in spite of the fact that it is a horrible function which, depending on its parameter values, can do anything from malloc() to free() to realloc().
The reason this function can be helpful is that there is a good chance there is still room behind the data block on the heap. realloc() is the only function which can avoid moving the data every single time: if there is enough room, it expands the heap block in place without touching the data at all.
The drawback of using realloc() is that your "special performance" implementation would forego the concept of allocators and would fail to handle element types which require copy constructor calls, etc. So you would best not name your vector vector but something like "SimpleDataVector", and possibly build in some mechanism to detect misuse.

Related

Is it better to save object pointer in STL container rather than object itself?

Suppose I have a class A, and I need a vector of objects with class A.
Is it better to use std::vector<A*> or std::vector<A>?
I came across some lectures that mentioned the former doesn't require the definition of a copy constructor and copy assignment operator, while the latter requires definitions of both. Is this correct?
The lecture notes are not fully correct: using a vector of A uses a copy constructor / a copy assignment operator, but if the default implementation provided by the compiler works for you, you are not required to provide your own definition.
The decision to define a copy constructor, an assignment operator, and a destructor is made independently of the decision to place your objects in a container. You define these three when your class allocates its resources manually. Otherwise, default implementations should work.
Back to the main question, the decision to store a pointer vs. an object depends mostly on the semantic of your collection, and on the need to store objects with polymorphic behavior.
If polymorphic behavior is not needed, and creating copies of your objects is relatively inexpensive, using vector<A> is a better choice, because it lets the container manage resources for you.
If copying is expensive and you need polymorphic behavior, you need to use pointers. They don't necessarily need to be raw pointers, though: the C++ Standard Library provides smart pointers that will deal with cleanup for you, so you won't have to destroy your objects manually.
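A minimal sketch (the Shape/Circle types are made up for illustration):
#include <memory>
#include <vector>

struct Shape { virtual ~Shape() = default; virtual double area() const = 0; };
struct Circle : Shape {
    double r;
    explicit Circle(double r) : r(r) {}
    double area() const override { return 3.14159265 * r * r; }
};

int main() {
    // Polymorphic elements: store (smart) pointers, not objects, to avoid slicing.
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Circle>(1.0));  // make_unique needs C++14
    double a = shapes.front()->area();
    (void)a;
}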
Unlike many other modern languages, C++ thrives on value types.
If you store pointers, you have to manage the resulting objects' lifetimes. Unless you use a unique_ptr or similar smart pointer, the language won't help you with that. Losing track of objects is a leak; keeping track of them after you have disposed of them is a dangling reference/pointer. Both are very common bugs.
If you store values (or value-like types), and you teach your data how to move itself efficiently, a vector will store it contiguously in memory.
On modern computers, CPUs are fast, and memory is amazingly slow. A typical computer will have 3 levels of cache in order to try to make memory faster, but if your data is scattered throughout memory (because you used the free store to store the objects), the CPU has little chance to figure out where you are going to access next.
If your data is in a contiguous buffer, not only will one fetch from memory get more than one object, but the CPU will be able to guess that you are going to want the next chunk of memory in the buffer, and pre-fetch it for you.
So the short version is: if your objects are modest in size, use a vector of actual copies of the object. If they are modestly larger, keep the frequently accessed stuff in the object, put the big, less frequently accessed part in a vector within the object, and implement efficient move semantics. Then store the object itself in a vector.
There are a few exceptions to this.
First, polymorphism in value types is hard, so you end up using the free store a lot.
Second, some objects end up having their location part of their identity. Vector moves objects around, and the cost of "rehoming" an object can be not worth the bother.
Third, often performance doesn't matter much. So you do what is easy. At the same time, value types are not that hard, and while premature optimization is a bad idea, so is premature deoptimization. Learning how to work with value types and contiguous vectors is important.
Finally, learn the rule of zero. The rule of zero says that objects which manage resources should have their copy/move constructors, copy/move assignment operators, and destructors carefully written to follow value-semantics rules (or deleted). Objects that use those resource-management objects then typically do not need their own copy/move/assignment/destructor written at all: they can be omitted, =defaulted, or similar.
And code you don't write tends to have fewer bugs than code you write.
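A minimal sketch of the rule of zero in action (the names are illustrative):
#include <string>
#include <vector>

// Resource management is delegated entirely to members that already follow
// value semantics, so this class needs no user-written special member functions.
class Person {
    std::string name_;
    std::vector<int> scores_;
public:
    Person(std::string name, std::vector<int> scores)
        : name_(std::move(name)), scores_(std::move(scores)) {}
    // No destructor, no copy/move constructors, no assignment operators:
    // the compiler-generated ones are correct because the members manage
    // their own resources (rule of zero).
};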
This is correct, and it depends.
Storing pointers in your container gives you additional flexibility, since you don't need these operators, or you can have operators with side effects and/or a high cost. The container itself will, at worst, perform copies of the pointers, the cost of which is quite low. In addition, you can store objects of different sizes (instances of different classes in the same inheritance hierarchy come to mind, especially if they have virtual methods).
On the other hand, storing the objects themselves lowers the access overhead, as you won't need to dereference a pointer every time you access an element. Moreover, it improves data locality (and thus lowers cache misses, page misses, etc.) and reduces memory consumption and fragmentation.
There's no general rule of thumb here. Small objects with no side effects in their constructors and copy operators probably belong directly in a container that's read often and rarely modified, while large objects with constructors and copy operators that have expensive side effects probably fit better outside of containers that are often modified or resized.
You have to consider your use case and weigh the pros and cons of each approach.
Do you need to check for identity? If so, store them via unique_ptr, because then you don't need to take care of deleting them later.
Yes, you need copy operations, because if you store an A in the vector, the vector copies the object.
With A*, the vector only copies the address (the pointer).

Why use new and delete at all?

I'm new to C++ and I'm wondering why I should even bother using new and delete? It can cause problems (memory leaks) and I don't get why I shouldn't just initialize a variable without the new operator. Can someone explain it to me? It's hard to google that specific question.
For historical and efficiency reasons, C++ (and C) memory management is explicit and manual.
Sometimes, you might allocate on the call stack (e.g. by using VLAs or alloca(3)). However, that is not always possible, because
stack size is limited (depending on the platform, to a few kilobytes or a few megabytes).
memory need is not always FIFO or LIFO. It does happen that you need to allocate memory which will be freed (or becomes useless) much later during execution, in particular because it might be the result of some function (and the caller, or its caller, would release that memory).
You definitely should read about garbage collection and dynamic memory allocation. In some languages (Java, Ocaml, Haskell, Lisp, ....) or systems, a GC is provided, and is in charge of releasing memory of useless (more precisely unreachable) data. Read also about weak references. Notice that most GCs need to scan the call stack for local pointers.
Notice that it is possible, but difficult, to have quite efficient garbage collectors (though usually not in C++). For some programs, OCaml (with a generational copying GC) is faster than the equivalent C++ code (with explicit memory management).
Managing memory explicitly has the advantage (important in C++) that you don't pay for something you don't need. It has the inconvenience of putting more burden on the programmer.
In C or C++ you might sometimes consider using Boehm's conservative garbage collector. With C++ you might sometimes need to use your own allocator instead of the default std::allocator. Read also about smart pointers, reference counting, std::shared_ptr, std::unique_ptr, std::weak_ptr, the RAII idiom, and the rule of three (in C++11, becoming the rule of five). The recent wisdom is to avoid explicit new and delete (e.g. by using standard containers and smart pointers).
Be aware that the most difficult situations in managing memory are arbitrary, possibly circular, graphs of references.
On Linux and some other systems, valgrind is a useful tool to hunt memory leaks.
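A typical invocation looks like this (where ./my_program stands for your executable):
valgrind --leak-check=full ./my_program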
The alternative, allocating on the stack, will cause you trouble, as stack sizes are often limited to megabyte magnitudes and you'll get lots of value copies. You'll also have problems sharing stack-allocated data between function calls.
There are alternatives: using std::shared_ptr (C++11 onwards) will do the delete for you once the shared pointer is no longer being used. A technique referred to by the hideous acronym RAII (Resource Acquisition Is Initialization) is exploited by the shared-pointer implementation. I mention it explicitly since most resource-cleanup idioms are RAII-based. You can also make use of the comprehensive data structures available in the C++ Standard Template Library, which eliminate the need to get your hands too dirty with explicit memory management.
But formally, every new must be balanced with a delete. Similarly for new[] and delete[].
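A minimal sketch contrasting the two styles:
#include <memory>

int main() {
    int* raw = new int(42);   // manual: must be balanced...
    delete raw;               // ...with exactly one delete

    int* arr = new int[8];    // the array form...
    delete[] arr;             // ...needs delete[]

    auto smart = std::make_shared<int>(42);  // RAII: no manual delete needed
}   // smart's pointee is freed here, when the last shared_ptr goes away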
Indeed, in many cases new and delete are not needed; you can just use standard containers instead, leaving the allocation/deallocation management to them.
One of the reasons for which you may need to use allocation explicitly is for objects where the identity is important (i.e. they are not just values that can be copied around).
For example, if you have a GUI "window" object, then making copies probably doesn't make sense, and thus you're more or less ruling out all standard containers (they're designed for objects that can be copied and assigned). In this case, if the object needs to survive the function that creates it, probably the simplest solution is to allocate it explicitly on the heap, possibly using a smart pointer to avoid leaks or use-after-delete.
In other cases it may be important to avoid copies not because they're illegal, but simply because they're not very efficient (big objects), and explicitly handling the instance's lifetime may be a better (faster) solution.
Another case where explicit allocation/deallocation may be the best option are complex data structures that cannot be represented by the standard library (for example a tree in which each node is also part of a doubly-linked list).
Modern C++ styles often frown on explicit calls to new and delete outside of specialized resource management code.
This is not because the stack/automatic storage is sufficient, but rather because RAII smart resource owners (be they containers, shared pointers, or something else) make almost all direct memory wrangling unnecessary. And as memory management is often error prone, this makes your code more robust, easier to read, and sometimes faster (as the fancy resource owners can use techniques you might not bother with everywhere).
This is exemplified by the rule of zero: write no destructor, copy/move assign, copy/move constructor. Store state in smart storage, and have it handle it for you.
None of the above applies when you yourself are writing smart memory owning classes. This is a rare thing to need to do, however. It also requires C++14 (for make_unique) to get rid of the penultimate excuse to call new.
Now, the free store is still used, just not directly, under the above style. The free store (aka heap) is needed because automatic storage (aka the stack) only supports really simple object lifetime rules (scope based, compile time deterministic size and count, FILO order). As runtime sized and counted data is common, and object lifetime is often not that simple, the free store is used by most programs. Sometimes copying an object around on the stack is enough to make the simple lifetime less of a problem, but at other times identity is important.
The final reason is stack overflow. On some C++ implementations the stack/automatic storage is seriously constrained in size. What's more, there is rarely, if ever, a reliable failure mode when you put too much stuff in it. By storing large data on the free store, we can reduce the chance that the stack will overflow.
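A minimal illustration (the sizes are arbitrary):
#include <vector>

void risky() {
    double big[10000000];     // ~80 MB of automatic storage: likely stack overflow
    big[0] = 1.0;             // touch it so it isn't entirely optimized away
}

void safe() {
    std::vector<double> big(10000000);  // the buffer lives on the free store
    big[0] = 1.0;
}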
First, if you don't need dynamic allocation, don't use it.
The most frequent reason for needing dynamic allocation is that the object will have a lifetime which is determined by the program logic rather than lexical scope. The new and delete operators are designed to support explicitly managed lifetimes.
Another common reason is that the size or structure of the "object" is determined at runtime. For simple cases (arrays, etc.) there are standard classes (std::vector) which will handle this for you, but for more complicated structures (e.g. graphs and trees), you'll have to do this yourself. (The usual technique here is to create a class representing the graph or tree, and have it manage the memory.)
And there is the case where the object must be polymorphic, and the actual type won't be known until runtime. (There are some tricky ways of handling this without dynamic allocation in the simplest cases, but in general, you'll need dynamic allocation.) In this case, std::unique_ptr might be appropriate to handle the delete, or if the object must be shared, std::shared_ptr (although usually, objects which must be shared fall into the first category above, and so smart pointers aren't appropriate).
There are probably other reasons as well, but these are the three that I've encountered the most often.
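As an illustration of the second case, a minimal sketch of a tree class that owns its nodes (the names are made up):
#include <memory>

// A binary tree whose structure is determined at runtime; the tree class
// owns its nodes, so users never touch new/delete directly.
class Tree {
    struct Node {
        int value;
        std::unique_ptr<Node> left, right;
        explicit Node(int v) : value(v) {}
    };
    std::unique_ptr<Node> root_;

    static void insert(std::unique_ptr<Node>& slot, int v) {
        if (!slot) slot = std::make_unique<Node>(v);  // make_unique needs C++14
        else insert(v < slot->value ? slot->left : slot->right, v);
    }
public:
    void insert(int v) { insert(root_, v); }
};  // all nodes are released automatically when the Tree is destroyed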
Only in simple programs can you know beforehand how much memory you'll use; in general, you cannot foresee it.
However with modern C++11 you generally rely on standard libraries like vector and map for memory allocation, and the use of smart pointers helps you avoid memory leaks, so you don't really need to use new and delete explicitly by hand.
When you use new, your object is stored on the heap, and it remains there until you manually delete it. Without new, your object goes on the stack and is destroyed automatically when it goes out of scope.
The stack has a fixed size, so if there is no block left to hold a new object, a stack overflow occurs. This often happens when a lot of nested functions are called, or when there is an infinite recursive call. If the current size of the heap is too small to accommodate new memory, more memory can be added to the heap by the operating system.
Another reason may be that you are explicitly calling an external library or API with a C-style interface. Setting up a callback in such cases often means context data must be supplied and returned in the callback, and such an interface usually provides only a 'simple' void* or int*. Allocating an object or struct with new is appropriate for such actions (you can delete it later in the callback, should you need to).
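A minimal sketch of that pattern (register_callback stands in for a hypothetical external C-style API):
#include <cstdio>

// Hypothetical C-style interface: it only hands an opaque void* back to us.
using callback_t = void (*)(void* context);
void register_callback(callback_t cb, void* context);  // provided by the external library

struct JobContext { int job_id; };

void on_done(void* context) {
    auto* ctx = static_cast<JobContext*>(context);
    std::printf("job %d done\n", ctx->job_id);
    delete ctx;  // we allocated it with new, so we delete it in the callback
}

void start_job() {
    // The context must outlive this function, so it cannot live on the stack.
    register_callback(on_done, new JobContext{42});
}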

std::list of objects efficiency

Say you have a std::list of some class. There are two ways you can make this list:
1)
std::list<MyClass> myClassList;
MyClass myClass;
myClassList.push_front(myClass);
Using this method, the copy constructor will be called when you pass the object to the list. If the class has many member variables and you are making this call many times it could become costly.
2)
std::list<MyClass*> myClassList;
MyClass* myClass = new MyClass();
myClassList.push_front(myClass);
This method will not call the copy constructor for the class. I'm not exactly sure what happens in this case, but I think the list will create a new MyClass* and assign it the address of the parameter. In fact, if you make myClass on the stack instead of the heap and let it go out of scope, then myClassList.front() is invalid, so that must be the case.
If I am wrong about this please contradict me, but I believe the 2nd method is much more efficient for certain classes.
The important point to consider here is much more subtle than the performance issue.
Standard library containers work on copy semantics: they create a copy of the element you add to the container.
In general it is always better to stay away from dynamic memory allocation in C++ unless you absolutely need it. The first option is better because you do not have to bother with allocations and deallocations: the container takes ownership of the object you add to it and does the management for you.
In the second case the container does not take ownership of the element you add; you have to manage it yourself. And if you must, then you should use a smart pointer as the container element rather than a raw pointer.
With respect to performance, you will need to profile the code samples on your system to see whether the performance difference is notable enough to select one approach over the other.
This is always a tough question.
First of all, it really depends on whether your compiler supports C++11 move semantics or not, as this dramatically changes the aspects of the problem.
For those stuck in C++03
There are multiple choices:
std::list<MyClass> list;
list.push_front(MyClass());
Even though semantically there is a copy, the optimizer might remove most of the redundant/dead stores. Most optimizers will require that the definitions of the default constructor and copy constructor be available.
boost::ptr_deque<MyClass> deque;
std::auto_ptr<MyClass> p(new MyClass());
deque.push_front(p);
ptr_vector could be used should you replace push_front with push_back, otherwise it's a bit wasteful. This avoids most of the memory overhead of a std::list<MyClass*> and has the added bonus of automatically handling memory.
boost::stable_vector<MyClass> svec;
svec.push_back(MyClass());
There is one copy (as with list) but a guarantee that no further copy should be made within the container (as with list). It also allows a few more operations than list (for example, random access), at the cost of being slower for insertion in the middle for large containers.
For those enjoying C++11
std::list<MyClass> list;
list.push_front(MyClass());
does not generate any copy, instead a move operation occurs.
It is also possible to use the new operations provided to construct objects in place:
std::list<MyClass> list;
list.emplace_front();
will create a new MyClass directly within the node, no copy, no move.
And finally, you may wish for a more compact representation or other operations on the container, in which case:
std::vector<std::unique_ptr<MyClass>> vec;
vec.emplace_back(new MyClass());
Offers you random access and a lower memory overhead.
If you are really concerned about performance but still need to use linked lists, consider using boost::intrusive::list. The main problem with using std::list is that you'll need to allocate new memory from the heap, and that's probably more costly than even the copy construction in most cases. Since boost::intrusive::list leaves allocation to you, you could keep your objects in a std::vector and allocate them in batches. This way you'd also have better cache locality, another performance concern. Alternatively, you could use a custom allocator with std::list to do the same. Since using a custom allocator for std::list is probably about as messy as using a Boost intrusive list, I'd go with Boost, because you get many other useful features with it (such as keeping the same object in multiple lists, etc.).
BTW, don't be concerned about the copy construction; the compiler will probably optimize away any unnecessary copying (unnecessary given the way you use it).
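A minimal sketch of the suggested combination (assuming Boost.Intrusive is available; the default safe-mode hooks require unlinking before the objects are destroyed):
#include <boost/intrusive/list.hpp>
#include <vector>

struct MyClass : boost::intrusive::list_base_hook<> {
    int value;
    explicit MyClass(int v) : value(v) {}
};

int main() {
    std::vector<MyClass> storage;   // objects live contiguously, allocated in one batch
    storage.reserve(3);             // must not reallocate while objects are linked
    for (int i = 0; i < 3; ++i)
        storage.emplace_back(i);

    boost::intrusive::list<MyClass> list;  // the links live inside the objects
    for (auto& obj : storage)
        list.push_back(obj);        // no allocation, no copy

    list.clear();                   // unlink before the vector destroys the objects
}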
The problems with the first approach are low performance when MyClass is large, and the inability to have the same object in two data structures (in two lists, each with different semantics; in a list and a tree; etc.). If these downsides do not bother you, go with the first approach.
The second approach is more efficient, but may be harder to manage. For example, you need to correctly destroy MyClass objects once they are no longer accessible. This may be non-trivial in the presence of exceptions (read about C++ exception safety). I would recommend looking at Boost Smart Pointers, which are intended to ease C++ pointer management. C++11 has these built in, so you don't need Boost if you use a modern compiler. Read Wikipedia for a short introduction.

Compacting garbage collector implementation in C++0x

I'm implementing a compacting garbage collector for my own personal use in C++0x, and I've got a question. Obviously the mechanics of the collector depend upon moving objects, and I've been wondering how to implement this in terms of the smart pointer types that point to them. I've been thinking about either a pointer-to-pointer in the pointer type itself, or having the collector maintain a list of the pointers that point to each object so that they can be modified, removing the need for a double dereference when accessing the pointer but adding some extra overhead during collection and additional memory overhead. What's the best way to go here?
Edit: My primary concern is for speedy allocation and access. I'm not concerned with particularly efficient collections or other maintenance, because that's not really what the GC is intended for.
There's nothing straightforward about grafting extra GC onto C++, let alone a compacting algorithm. It isn't clear exactly what you're trying to do and how it will interact with the rest of the C++ code.
I have actually written a GC in C++ which works with existing C++ code, and it had a compactor at one stage (though I dropped it because it was too slow). But there are many nasty semantic problems. I mentioned to Bjarne only a few weeks ago that C++ lacks the operator required to do it properly, and the situation is that it is unlikely to ever exist because it has limited utility.
What you actually need is a "re-address-me" operator. With it, you do not actually move objects around; you just use mmap to change the object's address. This is much faster and, in effect, uses the VM features to provide handles.
Without this facility you have to have a way to perform an overlapping move of an object, which you cannot do efficiently in C++: you'd have to move to a temporary first. In C it is much easier: you can use memmove. At some stage, all the pointers to or into the moved objects have to be adjusted.
Using handles does not solve this problem; it just reduces the problem from arbitrarily sized objects to constant-sized ones: these are easier to manage in an array, but the same problem exists: you have to manage the storage. If you remove lots of handles from the array randomly, you still have a problem with fragmentation.
So don't bother with handles, they don't work.
This is what I did in Felix: you call new(shape, collector) T(args). Here the shape is a descriptor of the type, including a list of offsets which contain (GC) pointers, and the address of a routine to finalise the object (by default, it calls the destructor).
It also contains a flag saying if the object can be moved with memmove. If the object is big or immobile, it is allocated by malloc. If the object is small and mobile, it is allocated in an arena, provided there is space in the arena.
The arena is compacted by moving all the objects in it, and using the shape information to globally adjust all the pointers to or into these objects. Compaction can be done incrementally.
The downside for a C++ programmer is the need to construct a correct shape object to pass. This doesn't bother me because I'm implementing a language which can generate the shape information automatically.
Now, the key point: to do compaction, you must use a precise collector. Compaction cannot work with a conservative collector. This is very important. It is fine to allow some leakage if you see a value that looks like a pointer but happens to be an integer: some object won't be collected, but this is usually no big deal. But for compaction you have to adjust the pointers, and you'd better not change that integer: so you have to know for sure when something is a pointer. Your collector has to be precise: the shape must be known.
In OCaml this is relatively simple: everything is either a pointer or an integer, and the low bit is used at run time to tell them apart. Objects pointed at carry a code giving the type, and there are only a few kinds: either a scalar (don't scan it) or an aggregate (scan it; it only contains integers or pointers).
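For concreteness, a minimal sketch of what such a "shape" descriptor might look like (the field names are made up; this is not Felix's actual code):
#include <cstddef>
#include <vector>

// Illustrative only: the type descriptor a precise, compacting collector needs.
struct Shape {
    std::size_t size;                   // object size in bytes
    std::vector<std::size_t> offsets;   // byte offsets of GC pointers within the object
    void (*finalise)(void*);            // finaliser; by default calls the destructor
    bool movable;                       // may the object be relocated with memmove?
};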
This is a pretty straightforward question, so here's a straightforward answer:
Mark-and-sweep (and occasionally mark-and-compact, to avoid heap fragmentation) is the fastest when it comes to allocation and access (avoiding double dereferences). It's also very easy to implement. Since you're not worried about the collection's performance impact (mark-and-sweep tends to freeze up the process nondeterministically), this should be the way to go.
Implementation details found at:
http://www.brpreiss.com/books/opus5/html/page424.html#secgarbagemarksweep
http://www.brpreiss.com/books/opus5/html/page428.html
A nursery generation will give you the best possible allocation performance because it is just a pointer bump.
You could implement pointer updates without using double indirection by using techniques like a shadow stack but this will be slow and very error prone if you're writing this C++ code by hand.

Why is there no reallocation functionality in C++ allocators?

In C the standard memory handling functions are malloc(), realloc() and free(). However, C++ stdlib allocators only parallel two of them: there is no reallocation function. Of course, it would not be possible to do exactly the same as realloc(), because simply copying memory is not appropriate for non-aggregate types. But would there be a problem with, say, this function:
bool reallocate (pointer ptr, size_type num_now, size_type num_requested);
where
ptr is previously allocated with the same allocator for num_now objects;
num_requested >= num_now;
and semantics as follows:
if the allocator can expand the given memory block at ptr from the size for num_now objects to the size for num_requested objects, it does so (leaving the additional memory uninitialized) and returns true;
otherwise it does nothing and returns false.
Granted, this is not very simple, but allocators, as I understand, are mostly meant for containers and containers' code is usually complicated already.
Given such a function, std::vector, say, could grow as follows (pseudocode):
if (allocator.reallocate (buffer, capacity, new_capacity))
capacity = new_capacity; // That's all we need to do
else
... // Do the standard reallocation by using a different buffer,
// copying data and freeing the current one
Allocators that are incapable of changing the memory size altogether could just implement such a function with an unconditional return false;.
Are there so few reallocation-capable allocator implementations that it wouldn't be worth the bother? Or are there some problems I overlooked?
From:
http://www.sgi.com/tech/stl/alloc.html
This is probably the most questionable design decision. It would have probably been a bit more useful to provide a version of reallocate that either changed the size of the existing object without copying or returned NULL. This would have made it directly useful for objects with copy constructors. It would also have avoided unnecessary copying in cases in which the original object had not been completely filled in. Unfortunately, this would have prohibited use of realloc from the C library. This in turn would have added complexity to many allocator implementations, and would have made interaction with memory-debugging tools more difficult. Thus we decided against this alternative.
This is actually a design flaw that Alexandrescu points out with the standard allocators (not operator new[]/delete[], but what were originally the STL allocators used to implement, e.g., std::vector).
A realloc can be significantly faster than a malloc, memcpy, and free. However, while the actual memory block can be resized in place, it can also be moved to a new location. In the latter case, if the memory block consists of non-PODs, all objects will need to be destroyed and copy-constructed after the realloc.
The main thing the standard library needs in order to accommodate this as a possibility is a reallocate function as part of the standard allocator's public interface. A class like std::vector could certainly use it, even if the default implementation were simply to malloc the newly sized block and free the old one. It would need to be a function capable of destroying and copy-constructing the objects in memory, though; it cannot treat the memory in an opaque fashion if it does this. There's a little complexity involved there, and it would require some more template work, which may be why it was omitted from the standard library.
std::vector<...>::reserve is not sufficient: it addresses a different case where the size of the container can be anticipated. For truly variable-sized lists, a realloc solution could make contiguous containers like std::vector a lot faster, especially if it can deal with realloc cases where the memory block was successfully resized without being moved, in which case it can omit calling copy constructors and destructors for the objects in memory.
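A sketch of how such a grow path might look (try_reallocate is hypothetical; no standard allocator provides it):
#include <cstddef>
#include <new>
#include <utility>

// Hedged sketch: Alloc::try_reallocate() is a made-up member mirroring the
// reallocate() proposed in the question; everything else is standard.
template <typename T, typename Alloc>
void grow(Alloc& alloc, T*& data, std::size_t size,
          std::size_t& capacity, std::size_t new_capacity) {
    if (alloc.try_reallocate(data, capacity, new_capacity)) {
        capacity = new_capacity;   // expanded in place: no copies, no destructions
        return;
    }
    T* fresh = alloc.allocate(new_capacity);   // fallback: the classic slow path
    for (std::size_t i = 0; i < size; ++i) {
        ::new (static_cast<void*>(fresh + i)) T(std::move_if_noexcept(data[i]));
        data[i].~T();
    }
    alloc.deallocate(data, capacity);
    data = fresh;
    capacity = new_capacity;
}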
What you're asking for is essentially what vector::reserve does. Without move semantics for objects, there's no way to reallocate memory and move the objects around without doing a copy and destroy.
I guess this is one of the things where god went wrong, but I was just too lazy to write to the standards committee.
There should have been a realloc for array allocations:
p = renew(p) [128];
or something like that.
Because of the object-oriented nature of C++, and the inclusion of the various standard container types, I think it's simply that less focus was placed on direct memory management than in C. I agree that there are cases where a realloc() would be useful, but the pressure to remedy this is minimal, as almost all of the resulting functionality can be gained by using containers instead.