std::string and its automatic memory resizing - c++

I'm pretty new to C++, but I know you can't just use memory willy nilly like the std::string class seems to let you do. For instance:
std::string f = "asdf";
f += "fdsa";
How does the string class handle getting larger and smaller? I assume it allocates a default amount of memory and if it needs more, it news a larger block of memory and copies itself over to that. But wouldn't that be pretty inefficient to have to copy the whole string every time it needed to resize? I can't really think of another way it could be done (but obviously somebody did).
And for that matter, how do all the stdlib classes like vector, queue, stack, etc handle growing and shrinking so transparently?

Your analysis is correct — it is inefficient to copy the string every time it needs to resize. That's why common advice discourages that use pattern. Use the string's reserve function to ask it to allocate enough memory for what you intend to store in it. Then further operations will fill that memory. (But if your hint turns out to be too small, the string will still grow automatically, too.)
Containers will also usually try to mitigate the effects of frequent re-allocation by allocating more memory than they need. A common algorithm is that when a string finds that it's out of space, it doubles its buffer size instead of just allocating the minimum required to hold the new value. If the string is being grown one character at a time, this doubling algorithm reduces the time complexity to amortized linear time (instead of quadratic time). It also reduces the program's susceptibility to memory fragmentation.

Usually, there's a doubling algorithm. In other words, when it fills the current buffer, it allocates a new buffer that's twice as big, and then copies the current data over. This results in fewer allocate/copy operations than the alternative of growing by a single allocation block.

Although I do not know the exact implementation of std::string, most data structures that need to handle dynamic memory growth do so by doing exactly what you say - allocate a default amount of memory, and if more is needed then create a bigger block and copy yourself over.
The way you get around the obvious inefficiency problem is to allocate more memory than you need. The ratio of used memory:total memory of a vector/string/list/etc is often called the load factor (also used for hash tables in a slightly different meaning). Usually it's a 1:2 ratio - that is, you assign twice the memory you need. When you run out of space, you assign a new amount of memory twice your current amount and use that. This means that over time, if you continue to add things to the vector/string/etc, you need to copy over the item less and less (as the memory creation is exponential, and your inserting of new items is of course linear), and so the time taken for this method of memory handling is not as large as you might think. By the principles of Amortized Analysis, you can then see that inserting m items into a vector/string/list using this method is only Big-Theta of m, not m2.

Related

Can you predict where in memory a vector might move when growing?

I'm learning about C++ and have a conceptual question. Let's say I have a vector. I know that my vector is stored in contiguous memory, but let's say my vector keeps growing and runs out of room to keep the memory contiguous. How can I predict where in memory the vector will go? I'm excluding the option of using functions that tell the vector where it should be in memory.
If it "runs out of room to keep the memory contiguous", then it simply won't grow. Attempting to add items past the currently allocated size will (typically) result in its throwing an exception (though technically, it's up to the allocator object to decide what to do--it's responsible for memory allocation, and responding when that's not possible.
Note, however, that this could result from running out of address space (especially on a 32-bit machine) rather than running out of actual memory. A typical virtual memory manager can reallocate physical pages (e.g., 4 KB or 8 KB chunks) and write data to the paging file if necessary to free physical memory if needed--but when/if there's not enough contiguous address space, there's not much that can be done.
The answer depends highly on your allocation strategy, but in general, the answer is no. Most allocators do not provide you with information where the next allocation will occur. If you were writing a custom allocator, then you could potentially make this information accessible, but doing so is not necessarily a good idea unless your use case specifically requires this knowledge.
The realloc function is the only C function which will attempt to grow your memory in place, and it makes no guarantees that it will do so.
Neither new nor malloc provide any information for where the "next" allocation will take place. You could potentially guess, if you knew the exact implementation details for your specific compiler, but this would be very unwise to rely on in a real program. Regarding specifically the std::allocator used for std::vector, it also does not provide details about where future allocations will take place.
Even if you could predict it in a particular situation, it would be extremely fragile - all it takes is one function you call to change to make another call to new or malloc [unless you are using a very specific allocation method - which is different from the "usual" method] to "break" where the next allocation is made.
If you KNOW that you need a certain size, you can use std::vector::resize() to set the size of the vector [or std::vector<int> vec(10000); to create a pre-sized to 10000, for example] - which of course is not guaranteed to work, but it guarantees that you never need "enough space to hold 3x the current content", which is what happens with std::vector when you grow it using push_back [and if you are REALLY unlucky, that means that your vector will use 2*n-1 elements, leaving n-1 unused, because your size is n-1 and you add ONE more element, which doubles the size, so now 2*n, and you only actually require one more element...
The internal workings of STL containers are kept private for good reasons. You should never be accessing any container elements through any mechanism other than the appropriate iterators; and it is not possible to acquire one of those on an element that does not yet exist.
You could however, supply an allocator and use that to deterministically place future allocations.
Can you predict where in memory a vector might move when growing?
As others like EJP, Jerry and Mats have said, you cannot determine the location of a "grown" vector until after it grows. There are some corner cases, like the allocator providing a block of memory that's larger than required so that the vector does not actually move after a grow. But its not something you should depend on.
In general, stacks grow down and heaps grow up. This is an artifact from the old memory days. Your code segment was sandwiched between them, and it ensured your program would overwrite its own code segment and eventually cause an illegal instruction. So you might be able to guess the new vector is going to be higher in memory than the old vector because the vector is probably using heap memory. But its not really useful information.
If you are devising a strategy for locating elements after a grow, then use an index and not an iterator. Iterators are invalidated after inserts and deletes (including the grow).
For example, suppose you are parsing the vector and you are looking for the data that follows -----BEGIN CERTIFICATE-----. Once you know the offset of the data (byte 27 in the vector), then you can always relocate it in constant time with v.begin() + 26. If you only have part of the certificate and later add the tail of the data and the -----END CERTIFICATE----- (and the vector grows), then the data is still located at v.begin() + 26.
No, in practical terms you can't predict where it will go if it has to move due to resizing. However, it isn't so random that you could use it as a random number generator (;

c++ unordered_map is there a way to pre-allocate memory for elements if max size known in advance

Looks like reserve/rehash functions only pre-allocate the number of buckets, not memory for the elements(key,vlaue) pairs to be inserted.
Is there a way we can pre-allocate memory for elements also, so low-latency apps dont need to waste time on dynamic memory allocation.
One possibility would be to write your own allocator. This can be especially effective if you have at least a fair idea of how many items are likely to go in the table (so you can pre-allocate space for all of them) and don't care about re-using space for items if they're removed from the table (so your bookkeeping is simple).
In such a case, you can basically pre-allocate space for N objects, and simply keep track of the position of the next item to be allocated. Allocating the object consists of simply returning the address and incrementing the pointer, as in return *next++;
Of course, this doesn't truly eliminate the dynamic allocation--it just makes it cheap enough that you probably don't care about it any more (and since it's supplied as a template parameter, there's a good chance of its being expanded inline, so you don't even get the overhead of a function call in the process.
Even if you can't put up with quite that restrictive of an allocator, a general-purpose allocator for fixed-size objects will still usually be (at least somewhat) faster than one for variable sizes of objects. It still won't eliminate the dynamic allocation, but it may give enough improvement in speed to work quite a bit better for your purpose.

Should I use boost fast pool allocator for following?

I have a server that throughout the course of 24 hours keeps adding new items to a set. Elements are not deleted over the 24 period, just new elements keep getting inserted.
Then at end of period the set is cleared, and new elements start getting added again for another 24 hours.
Do you think a fast pool allocator would be useful here as to reuse the memory and possibly help with fragmentation?
The set grows to around 1 million elements. Each element is about 1k.
It's highly unlikely …but you are of course free to test it in your program.
For a collection of that size and allocation pattern (more! more! more! + grow! grow! grow!), you should use an array of vectors. Just keep it in contiguous blocks and reserve() when they are created and you never need to reallocate/resize or waste space and bandwidth traversing lists. vector is going to be best for your memory layout with a collection that large. Not one big vector (which would take a long time to resize), but several vectors, each which represent chunks (ideal chunk size can vary by platform -- I'd start with 5MB each and measure from there). If you follow, you see there is no need to resize or reuse memory; just create an allocation every few minutes for the next N objects -- there is no need for high frequency/speed object allocation and recreation.
The thing about a pool allocator would suggest you want a lot of objects which have discontiguous allocations, lots of inserts and deletes like a list of big allocations -- this is bad for a few reasons. If you want to create an implementation which optimizes for contiguous allocation at this size, just aim for the blocks with vectors approach. Allocation and lookup will both be close to minimal. At that point, allocation times should be tiny (relative to the other work you do). Then you will also have nothing unusual or surprising about your allocation patterns. However, the fast pool allocator suggests you treat this collection as a list, which will have terrible performance for this problem.
Once you implement that block+vector approach, a better performance comparison (at that point) would be to compare boost's pool_allocator vs std::allocator. Of course, you could test all three, but memory fragmentation is likely going to be reduced far more by that block of vectors approach, if you implement it correctly. Reference:
If you are seriously concerned about performance, use fast_pool_allocator when dealing with containers such as std::list, and use pool_allocator when dealing with containers such as std::vector.

Linked list vs dynamic array for implementing a stack using vector class

I was reading up on the two different ways of implementing a stack: linked list and dynamic arrays. The main advantage of a linked list over a dynamic array was that the linked list did not have to be resized while a dynamic array had to be resized if too many elements were inserted hence wasting alot of time and memory.
That got me wondering if this is true for C++ (as there is a vector class which automatically resizes whenever new elements are inserted)?
It's difficult to compare the two, because the patterns of their memory usage are quite different.
Vector resizing
A vector resizes itself dynamically as needed. It does that by allocating a new chunk of memory, moving (or copying) data from the old chunk to the new chunk, the releasing the old one. In a typical case, the new chunk is 1.5x the size of the old (contrary to popular belief, 2x seems to be quite unusual in practice). That means for a short time while reallocating, it needs memory equal to roughly 2.5x as much as the data you're actually storing. The rest of the time, the "chunk" that's in use is a minimum of 2/3rds full, and a maximum of completely full. If all sizes are equally likely, we can expect it to average about 5/6ths full. Looking at it from the other direction, we can expect about 1/6th, or about 17% of the space to be "wasted" at any given time.
When we do resize by a constant factor like that (rather than, for example, always adding a specific size of chunk, such as growing in 4Kb increments) we get what's called amortized constant time addition. In other words, as the array grows, resizing happens exponentially less often. The average number of times items in the array have been copied tends to a constant (usually around 3, but depends on the growth factor you use).
linked list allocations
Using a linked list, the situation is rather different. We never see resizing, so we don't see extra time or memory usage for some insertions. At the same time, we do see extra time and memory used essentially all the time. In particular, each node in the linked list needs to contain a pointer to the next node. Depending on the size of the data in the node compared to the size of a pointer, this can lead to significant overhead. For example, let's assume you need a stack of ints. In a typical case where an int is the same size as a pointer, that's going to mean 50% overhead -- all the time. It's increasingly common for a pointer to be larger than an int; twice the size is fairly common (64-bit pointer, 32-bit int). In such a case, you have ~67% overhead -- i.e., obviously enough, each node devoting twice as much space to the pointer as the data being stored.
Unfortunately, that's often just the tip of the iceberg. In a typical linked list, each node is dynamically allocated individually. At least if you're storing small data items (such as int) the memory allocated for a node may be (usually will be) even larger than the amount you actually request. So -- you ask for 12 bytes of memory to hold an int and a pointer -- but the chunk of memory you get is likely to be rounded up to 16 or 32 bytes instead. Now you're looking at overhead of at least 75% and quite possibly ~88%.
As far as speed goes, the situation is rather similar: allocating and freeing memory dynamically is often quite slow. The heap manager typically has blocks of free memory, and has to spend time searching through them to find the block that's most suited to the size you're asking for. Then it (typically) has to split that block into two pieces, one to satisfy your allocation, and another of the remaining memory it can use to satisfy other allocations. Likewise, when you free memory, it typically goes back to that same list of free blocks and checks whether there's an adjoining block of memory already free, so it can join the two back together.
Allocating and managing lots of blocks of memory is expensive.
cache usage
Finally, with recent processors we run into another important factor: cache usage. In the case of a vector, we have all the data right next to each other. Then, after the end of the part of the vector that's in use, we have some empty memory. This leads to excellent cache usage -- the data we're using gets cached; the data we're not using has little or no effect on the cache at all.
With a linked list, the pointers (and probable overhead in each node) are distributed throughout our list. I.e., each piece of data we care about has, right next to it, the overhead of the pointer, and the empty space allocated to the node that we're not using. In short, the effective size of the cache is reduced by about the same factor as the overall overhead of each node in the list -- i.e., we might easily see only 1/8th of the cache storing the date we care about, and 7/8ths devoted to storing pointers and/or pure garbage.
Summary
A linked list can work well when you have a relatively small number of nodes, each of which is individually quite large. If (as is more typical for a stack) you're dealing with a relatively large number of items, each of which is individually quite small, you're much less likely to see a savings in time or memory usage. Quite the contrary, for such cases, a linked list is much more likely to basically waste a great deal of both time and memory.
Yes, what you say is true for C++. For this reason, the default container inside std::stack, which is the standard stack class in C++, is neither a vector nor a linked list, but a double ended queue (a deque). This has nearly all the advantages of a vector, but it resizes much better.
Basically, an std::deque is a linked list of arrays of sorts internally. This way, when it needs to resize, it just adds another array.
First, the performance trade-offs between linked-lists and dynamic arrays are a lot more subtle than that.
The vector class in C++ is, by requirement, implemented as a "dynamic array", meaning that it must have an amortized-constant cost for inserting elements into it. How this is done is usually by increasing the "capacity" of the array in a geometric manner, that is, you double the capacity whenever you run out (or come close to running out). In the end, this means that a reallocation operation (allocating a new chunk of memory and copying the current content into it) is only going to happen on a few occasions. In practice, this means that the overhead for the reallocations only shows up on performance graphs as little spikes at logarithmic intervals. This is what it means to have "amortized-constant" cost, because once you neglect those little spikes, the cost of the insert operations is essentially constant (and trivial, in this case).
In a linked-list implementation, you don't have the overhead of reallocations, however, you do have the overhead of allocating each new element on freestore (dynamic memory). So, the overhead is a bit more regular (not spiked, which can be needed sometimes), but could be more significant than using a dynamic array, especially if the elements are rather inexpensive to copy (small in size, and simple object). In my opinion, linked-lists are only recommended for objects that are really expensive to copy (or move). But at the end of the day, this is something you need to test in any given situation.
Finally, it is important to point out that locality of reference is often the determining factor for any application that makes extensive use and traversal of the elements. When using a dynamic array, the elements are packed together in memory one after the other and doing an in-order traversal is very efficient as the CPU can preemptively cache the memory ahead of the reading / writing operations. In a vanilla linked-list implementation, the jumps from one element to the next generally involves a rather erratic jumps between wildly different memory locations, which effectively disables this "pre-fetching" behavior. So, unless the individual elements of the list are very big and operations on them are typically very long to execute, this lack of pre-fetching when using a linked-list will be the dominant performance problem.
As you can guess, I rarely use a linked-list (std::list), as the number of advantageous applications are few and far between. Very often, for large and expensive-to-copy objects, it is often preferable to simply use a vector of pointers (you get basically the same performance advantages (and disadvantages) as a linked list, but with less memory usage (for linking pointers) and you get random-access capabilities if you need it).
The main case that I can think of, where a linked-list wins over a dynamic array (or a segmented dynamic array like std::deque) is when you need to frequently insert elements in the middle (not at either ends). However, such situations usually arise when you are keeping a sorted (or ordered, in some way) set of elements, in which case, you would use a tree structure to store the elements (e.g., a binary search tree (BST)), not a linked-list. And often, such trees store their nodes (elements) using a semi-contiguous memory layout (e.g., a breadth-first layout) within a dynamic array or segmented dynamic array (e.g., a cache-oblivious dynamic array).
Yes, it's true for C++ or any other language. Dynamic array is a concept. The fact that C++ has vector doesn't change the theory. The vector in C++ actually does the resizing internally so this task isn't the developers' responsibility. The actual cost doesn't magically disappear when using vector, it's simply offloaded to the standard library implementation.
std::vector is implemented using a dynamic array, whereas std::list is implemented as a linked list. There are trade-offs for using both data structures. Pick the one that best suits your needs.
As you indicated, a dynamic array can take a larger amount of time adding an item if it gets full, as it has to expand itself. However, it is faster to access since all of its members are grouped together in memory. This tight grouping also usually makes it more cache-friendly.
Linked lists don't need to resize ever, but traversing them takes longer as the CPU must jump around in memory.
That got me wondering if this is true for c++ as there is a vector class which automatically resizes whenever new elements are inserted.
Yes, it still holds, because a vector resize is a potentially expensive operation. Internally, if the pre-allocated size for the vector is reached and you attempt to add new elements, a new allocation takes place and the old data is moved to the new memory location.
From the C++ documentation:
vector::push_back - Add element at the end
Adds a new element at the end of the vector, after its current last element. The content of val is copied (or moved) to the new element.
This effectively increases the container size by one, which causes an automatic reallocation of the allocated storage space if -and only if- the new vector size surpasses the current vector capacity.
http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/Keynote-Bjarne-Stroustrup-Cpp11-Style
Skip to 44:40. You should prefer std::vector whenever possible over a std::list, as explained in the video, by Bjarne himself. Since std::vector stores all of it's elements next to each other, in memory, and because of that it will have the advantage of being cached in memory. And this is true for adding and removing elements from std::vector and also searching. He states that std::list is 50-100x slower than a std::vector.
If you really want a stack, you should really use std::stack instead of making your own.

Uses for the capacity value of a string

In the C++ Standard Library, std::string has a public member function capacity() which returns the size of the internal allocated storage, a value greater than or equal to the number of characters in the string (according to here). What can this value be used for? Does it have something to do with custom allocators?
You are more likely to use the reserve() member function, which sets the capacity to at least the supplied value.
The capacity() member function itself might be used to avoid allocating memory. For instance, you could recycle used strings through a pool, and put each one in a different size bucket based on its capacity. A client of the pool could then ask for a string that already has some minimum capacity.
The string::capacity() function return the number of total characters the std::string may contain before it has to reallocate memory, which is quite an expensive operation.
A std::vector works in the same way so I suggest you look up std::vector here for an in detail explanation of the difference between allocated and initialized memory.
Update
Hmm I can see I misread the question, actually I think I've never used capacity() myself on either std::string or std::vector, seems to seldomely be of any reason as you have to call reserve anyway.
It gives you the number of characters the string could contain without having to re-allocate. I suppose this might be important in a situation where allocation was expensive, and you wanted to avoid it, but I must say this is one string member function I've never used in real code.
It could be used for some performance tuning if you are about to add a lot of characters to the string. Before starting the string manipulation, you can check for the capacity and if it is too small, reserve the desired length in a single step (instead of letting it reallocate successively bigger chunks of memory several times, which would be a performance hog).
Strings have a capacity and a size. The capacity indicates how many characters that the string can hold before it will have to allocate more memory. The size indicates how many characters that it currently holds. reserve() can be used to set the minimum capacity of the string (it will allocate memory for at least that number of characters but could allocate more).
This is primarily of importance if you're increasing the size of the string. When you concatenate onto the string with += or append(), the characters from the given string will be added to the end of the current one. If increasing the string to that size does not exceed the capacity, then it's just going to use the capacity that it has. However, if the new size would exceed the current capacity, then the string will have to reallocate memory internally and copy its internals into the new memory. If you're going to be doing that a lot, it can get expensive (though it is done in amortized constant time), so in such a case, you could use reserve() to preallocate enough memory to reduce how often reallocations have to take place.
vector functions in basically the same way with the same functions.
Personally, while I've dealt with capacity() and reserve() with vector from time to time, I've never seen much need to do so with string - probably because I don't generally do enough string concatenations in my code for it to be worth it. In most cases, a particular string might get a few concatenations but not enough to worry about its capacity. Worrying about capacity is generally something you do when trying to optimize your code.
There's hardly any relevant use. It is similar to std::vector::capacity. However, one of the most common uses of strings is assignment. When assigning to a std::string, its .capacity may change. This means that an implementation has the right to ignore the old capacity and allocate precisely enough memory.
It genuinely isn't very useful, and is probably there only for symmetry with vector (under the assumption that both will operate internally in the same way).
The capacity of a vector is guaranteed to affect the behaviour of a resize. Resizing a vector to a value less than or equal to the capacity will not induce a reallocation, and will not therefore invalidate iterators or pointers referring to elements in the vector. This means you can pre-allocate some storage by calling reserve on a vector, then (with care) add elements to it by resizing or pushing back (etc.), safe in the knowledge that the underlying buffer won't move.
There is no such guarantee for string, though. It seems that the capacity is for informational purposes only -- though even that's a stretch, as it doesn't look like there's any useful information to be taken from it anyway. (Worse yet, contiguity of string chars isn't guaranteed either, so the only way you can get at the string as a linear buffer is c_str() -- which may induce a reallocation.)
At a guess, string was presumably originally intended to be implemented as some kind of a special case of vector, but over time the two grew apart...