Memory allocation of C++ vector<bool>

The vector<bool> class in the C++ STL is optimized for memory so that it allocates one bit per stored bool rather than one byte. Every time I output sizeof(x) for a vector<bool> x, the result is 40 bytes, which I take to be the vector structure itself. sizeof(x.at(0)) always returns 16 bytes, which must be allocated memory for many bool values, not just the one at position zero. How many elements do those 16 bytes cover? Exactly 128? What if my vector has more or fewer elements?
I would like to measure the size of the vector and all of its contents. How would I do that accurately? Is there a C++ library available for viewing allocated memory per variable?

I don't think there's any standard way to do this. The only information a vector<bool> implementation gives you about how it works is the reference member type, but there's no reason to assume that this has any congruence with how the data are actually stored internally; it's just that you get a reference back when you dereference an iterator into the container.
So you've got the size of the container itself, and that's fine, but to get the amount of memory taken up by the data, you're going to have to inspect your implementation's standard library source code and derive a solution from that. Though, honestly, this seems like a strange thing to want in the first place.
Actually, using vector<bool> is kind of a strange thing to want in the first place. All of the above is essentially why its use is frowned upon nowadays: it's almost entirely incompatible with conventions set by other standard containers… or even those set by other vector specialisations.
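If a rough figure is all you need, you can estimate the footprint yourself under the assumption that the implementation packs CHAR_BIT booleans per byte, which is what mainstream standard libraries do. The helper below is only a sketch of that estimate, not a standard-guaranteed measurement, and the name approx_vector_bool_bytes is made up for illustration:

#include <climits>
#include <cstddef>
#include <iostream>
#include <vector>

// Rough estimate only: assumes the usual bit-packed layout (CHAR_BIT bools per byte)
// plus the size of the container object itself. Real allocations may be rounded up
// to whole words or blocks by the allocator.
std::size_t approx_vector_bool_bytes(const std::vector<bool>& v)
{
    std::size_t packed = (v.capacity() + CHAR_BIT - 1) / CHAR_BIT;  // bytes for the bits
    return sizeof(v) + packed;                                      // container + storage
}

int main()
{
    std::vector<bool> x(1000, true);
    std::cout << approx_vector_bool_bytes(x) << " bytes (approx.)\n";
}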

Related

Which C++ STL container should I use?

Imagine the following requirements:
measurement data should be logged and the user should be able to iterate through the data.
have a timestamp (uint32_t), a place (uint16_t) and some data in a struct:
uint32_t timestamp;
uint16_t place;
struct SomeData someData;
have a constant number of datasets. If a new one arrives, the oldest is thrown away.
the number of "place" is dynamic, the user can insert new ones during runtime
it should be possible to iterate through the data to the next newer or older dataset but only if the place is the same
need to insert at the end only
memory should be allocated once at program start
insertion need not be fast, but it should not block other threads (which might be iterating through the container) for a long time
memory requirement should be low
EDIT: The container should use all the memory which is not needed otherwise, therefore it can be large.
I am not sure which container I should use. It is an embedded system and should not use boost etc.
I see the following possibilities:
std::vector - drawbacks: The insertion at the end requires that all objects are copied, and during this time another thread cannot access the vector. Edit: This can be avoided by implementing it as a circular buffer - see comments below. When iterating through the vector, I have to test the place ID. Maybe it might also be a problem to allocate that much memory as one block - because the memory could be segmented?
std::deque - compared to std::vector, insertion (and pop_back) is faster, but what about the memory requirements? Iterators do not become invalid if the insertion is at the end. But I still have to iterate and test the second ID ("place"). I think it does not need to allocate all the memory in one big block, as is the case with vector or array. If an element is added in front and another one is removed at the end (or removed first and added after), I guess no memory allocation takes place?
std::queue - instead of deque, should I rather use a queue? Is it true that in many implementations a queue is implemented just as a deque?
std::map - Like deque any iterators to existing elements will not become invalid. If I make the key a combination of place and timestamp, then iteration through the map is maybe faster because it is already sorted? Memory requirements of a map?
std::multimap - as the number of places is not constant I cannot make a multimap with "place" as the index.
std::list - has no advantage over deque here?
Some suggested the use of a circular buffer. If I do not want the memory to be allocated as one big block, I still have to use a container, and most of the questions above remain valid.
Update:
I will use a ring buffer as suggested here, but with a deque as the underlying container. In order to be able to scroll quickly through the datasets with the preselected "place", I will eventually introduce two additional indices into the data struct which point to the previous and the next index with the same place (a rough sketch follows below).
How much memory will be used? In my special case the size of the struct is 56 bytes. The GNU library uses 512 bytes as the minimum block size, the IAR compiler 16 bytes. Hence the block size used will be 512 or 56 bytes, respectively. Besides two iterators (four pointers each) and the size, one pointer is stored per block. Therefore, in the IAR implementation (block size 56 bytes) there will be about 7 % overhead (on a 32-bit system) compared to the use of a std::vector or array. In the GCC implementation, 9 objects fit in a block (504 bytes) while 512 + 4 bytes are needed per block, which is about 2 % more.
The block size is not large, but the contiguous memory needed for the array of block pointers is already relatively large, especially in the implementation where one block holds a single struct.
A std::list would need 2 pointers per struct which is 14 % overhead in my case on 32 bit systems.
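As an illustration of the scheme described in this update, here is a minimal sketch of what the per-record bookkeeping might look like; Dataset, prevSamePlace and nextSamePlace are made-up names, and the code that maintains the links is left out:

#include <cstdint>

struct SomeData { /* payload fields */ };

// Hypothetical record layout: the two extra indices link records that share the
// same "place", so iteration per place can skip over unrelated entries.
struct Dataset {
    uint32_t timestamp;
    uint16_t place;
    SomeData someData;
    uint32_t prevSamePlace;  // index of the previous record with the same place
    uint32_t nextSamePlace;  // index of the next record with the same place
};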
std::vector
... the memory could be segmented?
No, std::vector allocates contiguous memory, as its documentation states. Arrays are also contiguous, but you might just as well use vector for this.
std::deque is segmented, which you said you didn't want. Or do you want to avoid a single large allocated block? It's not clear.
Anyway, it has no benefit over vector if you really want a circular buffer (because you'll never be adding/removing elements from the front/back anyway), and you can't control the block size.
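For completeness, a fixed-capacity circular buffer over a preallocated std::vector can be as small as the sketch below; RingBuffer is a made-up name, all memory is allocated once in the constructor, and push_back overwrites the oldest element when the buffer is full:

#include <cstddef>
#include <vector>

template <typename T>
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity) : buf_(capacity), head_(0), size_(0) {}

    void push_back(const T& value) {
        buf_[(head_ + size_) % buf_.size()] = value;
        if (size_ < buf_.size())
            ++size_;
        else
            head_ = (head_ + 1) % buf_.size();   // buffer full: drop the oldest element
    }

    std::size_t size() const { return size_; }
    const T& operator[](std::size_t i) const { return buf_[(head_ + i) % buf_.size()]; }

private:
    std::vector<T> buf_;   // storage, allocated once at construction
    std::size_t head_;     // index of the oldest element
    std::size_t size_;     // number of valid elements
};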
std::queue
... Is it true that in many implementations a queue is implemented just as a deque?
Yes, that's the default in all implementations. See the documentation for std::queue or any decent book.
It doesn't sound like you want a FIFO queue, so I don't know why you're considering this one - the interface doesn't match your stated requirement.
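To make that concrete, std::queue is only a container adaptor; its second template parameter is the underlying container and defaults to std::deque:

#include <deque>
#include <list>
#include <queue>

int main()
{
    std::queue<int> q1;                   // same as std::queue<int, std::deque<int>>
    std::queue<int, std::list<int>> q2;   // the underlying container can be swapped out
    q1.push(1);
    q2.push(2);
}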
std::map
... iteration through the map is maybe faster because it is already sorted?
On most modern server/desktop architectures, map will be slower because advancing an iterator involves a pointer chase (which impairs pipelining) and a likely cache miss. Your anonymous embedded architecture may be less sensitive to these effects, so map may be faster for you.
... Memory requirements of a map?
Higher. You have the node size (at least a couple of pointers) added to each element.
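As a rough, implementation-specific illustration (the real node layout varies), a red-black-tree node typically carries a parent, left and right pointer plus a colour flag on top of each key/value pair, so you can guess at the per-element overhead like this:

#include <iostream>

// Made-up struct mimicking the bookkeeping part of a typical tree node; the
// printed figure reflects the pointer size of whatever machine you compile for.
struct NodeOverheadGuess {
    void* parent;
    void* left;
    void* right;
    bool  colour;
};

int main()
{
    std::cout << "approx. per-node overhead: " << sizeof(NodeOverheadGuess) << " bytes\n";
}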

Why does std::vector<bool> have no .data()?

The specialisation of std::vector<bool>, as specified in C++11 23.3.7/1, doesn't declare a data() member.
The question is: Why does a std::vector<bool> have no .data()? This is the very same question as why is a vector of bools not stored contiguously in memory. What are the benefits in not doing so?
Why can a pointer to an array of bools not be returned?
Why does a std::vector<bool> have no .data()?
Because std::vector<bool> stores multiple values in 1 byte.
Think of it like a compressed storage system, where every boolean value needs 1 bit. So, instead of having one element per memory cell (one element per array slot), the implementation may pack eight boolean values into each byte.
Assuming that you want to index into such storage to get a value, how would operator[] work? It can't return bool& (since each byte stores more than one bool), so you couldn't take a bool* to an element. In other words, bool* bool_ptr = &v[0]; is not valid code and would result in a compilation error.
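A quick way to see the difference is to compare it with an ordinary vector; the snippet below is only meant to show what compiles and what does not:

#include <vector>

int main()
{
    std::vector<char> c(8, 0);
    char* pc = &c[0];           // fine: operator[] returns char&

    std::vector<bool> b(8, false);
    // bool* pb = &b[0];        // error: operator[] returns a proxy object, not bool&
    auto ref = b[0];            // std::vector<bool>::reference, a proxy type
    ref = true;                 // writes back into the packed bits
    (void)pc;
}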
Moreover, the standard does not prescribe exactly how a conforming implementation stores the packed data, so the internal layout can differ between implementations. A data() member would then have to copy the data into the expected return type (or the standard would have to mandate a particular layout instead of merely allowing the optimization).
Why can a pointer to an array of bools not be returned?
Because std::vector<bool> is not stored as an array of bool, no pointer to such an array can be returned in a straightforward way. It could copy the data into an array and return that, but it is a design choice not to do so (if it did, one might assume it works like data() for all the other containers, which would be misleading).
What are the benefits in not doing so?
Memory optimization.
Usually about 8 times less memory usage, since it stores multiple bools in a single byte. To be exact, CHAR_BIT times less.

Difference in size between std::vector and std::deque

Right after I declare a vector<int> and a deque<int>, if I print out sizeof for both of them, the std::vector is 12 bytes (I guess begin, end and size) while the deque is 40 bytes. Where do those extra bytes come from?
I'm using Code::Blocks IDE 13.12 and I did select the C++11 standard to be used.
The size is implementation dependent; but it's not surprising that the control structure for a more complicated structure like deque might be larger than a simple one like vector (which, as you say, can be easily managed by three pointers, or a pointer and two sizes).
Looking at the GNU implementation, it stores precalculated "start" and "end" iterators, presumably because they're difficult to calculate on demand, and easier to update when the container changes. Each of those is quite complicated, containing four pointers: the current position, the start and end of the current block, and a pointer to a map structure needed to move between blocks. The deque also has a (possibly redundant?) pointer to that map, and the size (again, presumably difficult to calculate on demand), giving a total of ten pointers/sizes, as you observed.
The sizeof operator takes the size of the object and not the size of its contents.
So if an object has three data member variables, the sizeof operator will give you the size of those member variables, but if one of them is a pointer to more data then you only get the size of the pointer and not the length of the data it points to.
sizeof does not give you the number of elements in a container; rather the compile time evaluated size of the structure.
So the actual value is down to the implementation of the standard library and, as such, can vary from platform to platform.
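A small test like the following makes the difference easy to see; the exact numbers will vary with the standard library and the pointer size of the target:

#include <deque>
#include <iostream>
#include <vector>

int main()
{
    // Only the control structures are measured here, never the element storage.
    std::cout << "sizeof(std::vector<int>) = " << sizeof(std::vector<int>) << '\n';
    std::cout << "sizeof(std::deque<int>)  = " << sizeof(std::deque<int>) << '\n';

    std::vector<int> v(1000);
    std::cout << "sizeof(v) with 1000 elements is still " << sizeof(v) << '\n';
}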

Vector of double to a vector of bytes

I have a std::vector<double> and need to work with a library that takes a const vector<uint8_t>. I specify what type the data is to the library with an enum.
Is there any way that I can avoid copying the data completely and have the byte vector internally refer to the same data as the double vector? Since the byte vector is const and the double vector won't change during the lifetime of the byte vector, this seems like it would be pretty safe. There is a lot of data, so copying it really isn't an option.
If your "byte vector" were actually a vector of bytes then you would have a chance, because you can legally examine pretty much anything as a char array. However, uint32_ts are not bytes and they are certainly not chars. So, no, you basically can't do this without horrible hacky magic whose safety will be entirely implementation dependent.
Either way, you can't do it with the vector types: you'd have to cast and pass the result of std::vector::data(), i.e. a pointer.
Sorry but I have to recommend revisiting your design. If the library you're using really takes a vector of integers that is actually supposed to be a vector of doubles, then its developers have put you in an awkward position.
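If the library also offers (or can be given) an entry point that takes a raw pointer and a length, the copy can be avoided along the lines of the sketch below; library_consume_bytes is a made-up stand-in for that hypothetical API:

#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the real library call, which would also take the enum type tag.
void library_consume_bytes(const std::uint8_t* data, std::size_t size)
{
    (void)data;
    (void)size;
}

void pass_without_copying(const std::vector<double>& values)
{
    // View the double storage as bytes; the bytes are only read, never written
    // or reinterpreted as another object type.
    const auto* bytes = reinterpret_cast<const std::uint8_t*>(values.data());
    library_consume_bytes(bytes, values.size() * sizeof(double));
}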

Using read() directly into a C++ std::vector

I'm wrapping up user space linux socket functionality in some C++ for an embedded system (yes, this is probably reinventing the wheel again).
I want to offer a read and write implementation using a vector.
Doing the write is pretty easy, I can just pass &myvec[0] and avoid unnecessary copying. I'd like to do the same and read directly into a vector, rather than reading into a char buffer then copying all that into a newly created vector.
Now, I know how much data I want to read, and I can allocate appropriately (vec.reserve()). I can also read into &myvec[0], though this is probably a VERY BAD IDEA. Obviously doing this doesn't allow myvec.size() to return anything sensible. Is there any way of doing this that:
Doesn't completely feel yucky from a safety/C++ perspective
Doesn't involve two copies of the data block - once from kernel to user space and once from a C char * style buffer into a C++ vector.
Use resize() instead of reserve(). This will set the vector's size correctly -- and after that, &myvec[0] is, as usual, guaranteed to point to a contiguous block of memory.
Edit: Using &myvec[0] as a pointer to the underlying array for both reading and writing is safe and guaranteed to work by the C++ standard. Here's what Herb Sutter has to say:
So why do people continually ask whether the elements of a std::vector (or std::array) are stored contiguously? The most likely reason is that they want to know if they can cough up pointers to the internals to share the data, either to read or to write, with other code that deals in C arrays. That’s a valid use, and one important enough to guarantee in the standard.
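Put together, the pattern looks roughly like the sketch below; the descriptor handling and error reporting are reduced to the bare minimum, and read_into_vector is a made-up name:

#include <cstddef>
#include <unistd.h>   // POSIX read()
#include <vector>

// Reads up to 'want' bytes from fd directly into a vector, with no intermediate
// char buffer. Sized construction / resize() (not reserve()) makes the elements
// legally accessible.
std::vector<unsigned char> read_into_vector(int fd, std::size_t want)
{
    std::vector<unsigned char> buf(want);              // value-initialized to 0
    ssize_t got = ::read(fd, buf.data(), buf.size());
    if (got < 0)
        got = 0;                                       // real code would report the error
    buf.resize(static_cast<std::size_t>(got));         // shrink to what was actually read
    return buf;
}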
I'll just add a short clarification, because the answer was already given. resize() with an argument greater than the current size will add elements to the collection and value-initialize them. If you create
std::vector<unsigned char> v;
and then resize
v.resize(someSize);
all the unsigned chars will be initialized to 0. By the way, you can do the same with a constructor:
std::vector<unsigned char> v(someSize);
So theoretically it may be a little bit slower than a raw array, but if the alternative is to copy the array anyway, it's better.
reserve() only prepares the memory, so that no reallocation is needed when new elements are added to the collection, but you can't access that memory.
You still have to keep track of the number of elements actually written to your vector; the vector itself won't know anything about it.
Assuming it's a POD struct, call resize rather than reserve. You can define an empty default constructor if you really don't want the data zeroed out before you fill the vector.
It's somewhat low level, but the semantics of construction of POD structs are purposely murky. If memmove is allowed to copy-construct them, I don't see why a socket read shouldn't.
EDIT: ah, bytes, not a struct. Well, you can use the same trick, and define a struct with just a char and a default constructor which neglects to initialize it… if I'm guessing correctly that you care, and that's why you wanted to call reserve instead of resize in the first place.
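A rough sketch of that trick might look like this; uninit_byte is a made-up name, and whether skipping the zero-fill actually buys anything is something to measure rather than assume:

#include <cstddef>
#include <unistd.h>
#include <vector>

// A byte-sized type whose default constructor deliberately does nothing, so
// resize() does not pay for zero-filling the whole buffer before the read.
struct uninit_byte {
    unsigned char value;
    uninit_byte() {}   // intentionally empty: leaves 'value' uninitialized
};

void read_uninitialized(int fd, std::size_t want)
{
    if (want == 0)
        return;
    std::vector<uninit_byte> buf;
    buf.resize(want);                              // no zero-fill thanks to the empty ctor
    ssize_t got = ::read(fd, &buf[0], buf.size()); // fill the storage directly
    (void)got;                                     // real code would check and shrink here
}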
If you want the vector to reflect the amount of data read, call resize() twice. Once before the read, to give yourself space to read into. Once again after the read, to set the size of the vector to the number of bytes actually read. reserve() is no good, since calling reserve doesn't give you permission to access the memory allocated for the capacity.
The first resize() will zero the elements of the vector, but this is unlikely to create much of a performance overhead. If it does then you could try Potatoswatter's suggestion, or you could give up on the size of the vector reflecting the size of the data read, and instead just resize() it once, then re-use it exactly as you would an allocated buffer in C.
Performance-wise, if you're reading from a socket in user mode, most likely you can easily handle data as fast as it comes in. Maybe not if you're connecting to another machine on a gigabit LAN, or if your machine is frequently running 100% CPU or 100% memory bandwidth. A bit of extra copying or memsetting is no big deal if you are eventually going to block on a read call anyway.
Like you, I'd want to avoid the extra copy in user-space, but not for performance reasons, just because if I don't do it, I don't have to write the code for it...