Compacting bools feature of std::vector in C++

Does std::vector in C++ compact bools? I have read that std::vector<bool> can pack 8 booleans into 1 byte. However, when I tried this code in Visual Studio,
#include <cstdio>
#include <vector>
#include <iostream>
using namespace std;
int main()
{
    vector<bool> array {true, false, false, true, true, false, false, true};
    cout << sizeof(array) << endl;
    cout << sizeof(array[0]) << endl;
    getchar();
    return 0;
}
it printed:
24
16
while in another IDE, such as Code::Blocks, it printed 20 and 8.
I don't quite get what it does with booleans here.

Does the std::vector in C++ compact bools?
Yes, it is allowed to do so, and typically does.
I don't quite get what it does with booleans here.
You actually don't get what array[0] evaluates to.
It does not evaluate to a bit. It evaluates to a proxy object that correctly handles both conversion to bool and assignment from bool.
The sizeof of this proxy does not have much significance. It is not the size of a bit or a bool; it is the size of an object programmed to act on a specific bit. Likewise, sizeof(array) is the size of the vector object itself (its internal pointers and bookkeeping), not of the heap storage that holds the bits.

std::vector usually uses dynamic allocation internally by default. If you define your own allocator that tracks actual allocation size, you'll see that the number of bytes allocated for vector<bool> implies values are stored as bits:
#include <cstddef>
#include <iostream>
#include <memory>
#include <typeinfo>
#include <vector>
template<typename T>
class my_allocator : public std::allocator<T> {
public:
    // Log every allocation request so the real storage size becomes visible.
    T * allocate(const std::size_t count) {
        std::cout << "Allocated " << count << " * " << typeid(T).name() << std::endl;
        std::cout << "Total size: " << count * sizeof(T) << std::endl;
        return std::allocator<T>::allocate(count);
    }
    T * allocate(const std::size_t count, const void *) {
        return allocate(count);
    }
    template<typename U>
    struct rebind {
        typedef my_allocator<U> other;
    };
    my_allocator() noexcept {}
    my_allocator(const my_allocator<T>&) noexcept = default;
    template<typename Other>
    my_allocator(const my_allocator<Other>&) noexcept {}
};
int main() {
    std::vector<int, my_allocator<int>> v1 { 0 };
    std::vector<bool, my_allocator<bool>> v2 { 0 };
    v1.reserve(100);
    v2.reserve(100);
    return 0;
}
Relevant output:
Allocated 100 * int
Total size: 400
Allocated 4 * unsigned int
Total size: 16
Demo: https://wandbox.org/permlink/WHTD0k3sMvd3E4ag

Related

C++ constexpr constructor initializes garbage values

I wanted a simple class which would encapsulate a pointer and a size, like C++20's std::span will be, I think. I am using C++ (g++ 11.2.1 to be precise)
I want it so I can have some constant, module-level data arrays without having to calculate the size of each one.
However my implementation 'works' only sometimes, dependent on the optimization flags and compiler (I tried on godbolt). Hence, I've made a mistake. How do I do this correctly?
Here is the implementation plus a test program which prints out the number of elements (which is always correct) and the elements (which is usually wrong)
#include <iostream>
#include <algorithm>
using std::cout;
using std::endl;
class CArray {
public:
    template <size_t S>
    constexpr CArray(const int (&e)[S]) : els(&e[0]), sz(S) {}
    const int* begin() const { return els; }
    const int* end() const { return els + sz; }
private:
    const int* els;
    size_t sz;
};
const CArray gbl_arr{{3,2,1}};
int main() {
    CArray arr{{1,2,3}};
    cout << "Global size: " << std::distance(gbl_arr.begin(), gbl_arr.end()) << endl;
    for (auto i : gbl_arr) {
        cout << i << endl;
    }
    cout << "Local size: " << std::distance(arr.begin(), arr.end()) << endl;
    for (auto i : arr) {
        cout << i << endl;
    }
    return 0;
}
Sample output:
Global size: 3
32765
0
0
Local size: 3
1
2
3
In this case the 'local' variable is correct, but the 'global' is not, should be 3,2,1.

I think the issue is that your initialization creates a temporary array, and you then store a pointer to that array after it has been destroyed.
const CArray gbl_arr{{3,2,1}};
When the constructor above is invoked, the array argument is created just for the call itself, but gbl_arr still refers to it after its lifetime has ended. The local arr has the same problem; it only appears to work by chance. Change it to this:
const int gbl_arrdata[]{3,2,1};
const CArray gbl_arr{gbl_arrdata};
And it should work, because the array now lives at least as long as the object that refers to it.

C++11 compatible Linear Allocator Implementation

I have implemented a C++11 compatible linear or arena allocator. The code follows.
linear_allocator.hpp:
#pragma once
#include <cstddef>
#include <cassert>
#include <new>
#include "aligned_mallocations.hpp"
template <typename T>
class LinearAllocator
{
public:
    using value_type = T;
    using pointer = T*;
    using const_pointer = const T*;
    using reference = T&;
    using const_reference = const T&;
    //using propagate_on_container_copy_assignment = std::true_type;
    //using propagate_on_container_move_assignment = std::true_type;
    //using propagate_on_container_swap = std::true_type;
    LinearAllocator(std::size_t count = 64)
        : m_memUsed(0),
        m_memStartAddress(nullptr)
    {
        allocate(count);
    }
    ~LinearAllocator()
    {
        clear();
    }
    template <class U>
    LinearAllocator(const LinearAllocator<U>&) noexcept
    {}
    /// \brief allocates memory equal to # count objects of type T
    pointer allocate(std::size_t count)
    {
        if (count > std::size_t(-1) / sizeof(T))
        {
            throw std::bad_alloc{};
        }
        if (m_memStartAddress != nullptr)
        {
            alignedFree(m_memStartAddress);
        }
        m_memUsed = count * sizeof(T);
        m_memStartAddress = static_cast<pointer>(alignedMalloc(m_memUsed, alignof(T)));
        return m_memStartAddress;
    }
    /// \brief deallocates previously allocated memory
    /// \brief Linear/arena allocators do not support free() operations. Use clear() instead.
    void deallocate([[maybe_unused]] pointer p, [[maybe_unused]] std::size_t count) noexcept
    {
        //assert(false);
        clear();
    }
    /// \brief simply resets memory
    void clear()
    {
        if (m_memStartAddress != nullptr)
        {
            alignedFree(m_memStartAddress);
            m_memStartAddress = nullptr;
        }
        this->m_memUsed = 0;
    }
    /// \brief GETTERS
    pointer getStartAddress() const
    {
        return this->m_memStartAddress;
    }
    std::size_t getUsedMemory() const
    {
        return this->m_memUsed;
    }
private:
    std::size_t m_memUsed;
    pointer m_memStartAddress;
};
template <class T, class U>
bool operator==(const LinearAllocator<T> &, const LinearAllocator<U> &)
{
    return true;
}
template <class T, class U>
bool operator!=(const LinearAllocator<T> &, const LinearAllocator<U> &)
{
    return false;
}
Don't worry about alignedMalloc and alignedFree. They are correct.
This is my test program (linear_allocator.cpp):
#include "linear_allocator.hpp"
#include <vector>
#include <deque>
#include <iostream>
#include <string>
#include <typeinfo>
int main()
{
    [[maybe_unused]]
    LinearAllocator<int> a{1024};
    std::cout << a.getStartAddress() << '\n';
    std::cout << a.getUsedMemory() << '\n';
    std::vector<std::string, LinearAllocator<std::string>> v;
    v.reserve(100);
    std::cout << "Vector capacity = " << v.capacity() << '\n';
    //std::cout << v.get_allocator().getStartAddress() << '\n';
    //std::cout << v.get_allocator().getUsedMemory() << '\n';
    v.push_back("Hello");
    v.push_back("w/e");
    v.push_back("whatever");
    v.push_back("there is ist sofi j");
    v.push_back("wisdom");
    v.push_back("fear");
    v.push_back("there's more than meets the eye");
    for (const auto &s : v)
    {
        std::cout << s << '\n';
    }
    std::cout << typeid(v.get_allocator()).name() << '\n';
    std::deque<int, LinearAllocator<int>> dq;
    dq.push_back(23);
    dq.push_back(90);
    dq.push_back(38794);
    dq.push_back(7);
    dq.push_back(0);
    dq.push_back(2);
    dq.push_back(13);
    dq.push_back(24323);
    dq.push_back(0);
    dq.push_back(1234);
    for (const auto &i : dq)
    {
        std::cout << i << '\n';
    }
    std::cout << typeid(dq.get_allocator()).name() << '\n';
}
Compiling with g++ -std=c++17 -O2 -march=native -Wall linear_allocator.cpp -o linear_allocator.gpp.exe and running linear_allocator.gpp.exe gives output:
0x4328b8
4096
Vector capacity = 100
Hello
w/e
whatever
there is ist sofi j
wisdom
fear
there's more than meets the eye
15LinearAllocatorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEE
As you can see deque's output isn't there at all. If I uncomment these 2 lines:
//std::cout << v.get_allocator().getStartAddress() << '\n';
//std::cout << v.get_allocator().getUsedMemory() << '\n';
vector's output will also not be displayed.
Compilation with MSVS cl gives the following output:
000000B47A1CAF88
4096
which is even worse.
There must be something I am missing, as there appears to be UB, but I am unable to pinpoint where it is. My allocator design was based on C++11+ guidelines. I wonder what I am doing wrong.

While an allocator takes care of providing and releasing memory for storing a container's data, it does so only at the container's request. That is, the actual management of the provided storage (in particular, its lifetime) remains the container's responsibility. Imagine what happens when a vector relocates its elements:
1. A new chunk of memory, larger by a given factor than the current (old) one, is requested.
2. The elements stored in the "old" chunk are copied/moved to the new chunk.
3. Only then can the "old" memory chunk be released.
In your implementation, only one memory chunk can be active at a time: the old memory chunk is released before a new one is allocated (specifically, this happens as soon as a container requests a new chunk of memory to relocate its elements to). You invoke UB as soon as the vector tries to relocate elements from the previous storage, because the memory where they lived has already been released.
Additionally, because you do not provide a copy constructor for your allocator type, the compiler-generated one performs a shallow copy (i.e., it copies the pointer, not the data stored at that address), and the copy's destructor then releases that pointer. That is, the call:
v.get_allocator()
makes a shallow copy of the allocator, creating a prvalue of your allocator type, and releases the stored pointer as soon as the temporary object's lifetime ends (i.e., at the end of the full-expression that includes the cout call), leading to a double call to alignedFree on the same pointer.

Prevent vector items from being moved

This is a learning question for me and hopefully for others as well. My problem boils down to having a pointer that points to an element of a vector. The issue occurs when I erase the first element of the vector. I'm not quite sure what I was expecting, I somehow assumed that, when removing items, the vector would not start moving objects in memory.
The question I have is: is there a way to keep the objects in place in memory? For example, by changing the underlying container of the vector? In my particular example, I will remove the pointer access and just use an ID for the object, since the class needs an ID anyway.
Here is a simplified example:
#include <iostream>
#include <vector>
class A
{
public:
    A(unsigned int id) : id(id) {};
    unsigned int id;
};
int main()
{
    std::vector<A> aList;
    aList.push_back(A(1));
    aList.push_back(A(2));
    A * ptr1 = &aList[0];
    A * ptr2 = &aList[1];
    aList.erase(aList.begin());
    std::cout << "Pointer 1 points to \t" << ptr1 << " with content " << ptr1->id << std::endl;
    std::cout << "Pointer 2 points to \t" << ptr2 << " with content " << ptr2->id << std::endl;
    std::cout << "Element 1 is stored at \t" << &aList[0] << " with content " << aList[0].id << std::endl;
}
What I get is:
Pointer 1 points to 0xf69320 with content 2
Pointer 2 points to 0xf69324 with content 2
Element 1 is stored at 0xf69320 with content 2

While you can't achieve what you want exactly, there are two easy alternatives. The first is to use std::vector<std::unique_ptr<T>> instead of std::vector<T>. The actual instance of each object will not be moved when the vector resizes. This implies changing any use of &aList[i] to aList[i].get() and aList[i].id to aList[i]->id.
#include <iostream>
#include <memory>
#include <vector>
class A
{
public:
    A(unsigned int id) : id(id) {};
    unsigned int id;
};
int main()
{
    std::vector<std::unique_ptr<A>> aList;
    aList.push_back(std::make_unique<A>(1));
    aList.push_back(std::make_unique<A>(2));
    A * ptr1 = aList[0].get();
    A * ptr2 = aList[1].get();
    aList.erase(aList.begin());
    // This output is undefined behavior, ptr1 points to a deleted object
    //std::cout << "Pointer 1 points to \t" << ptr1 << " with content " << ptr1->id << std::endl;
    std::cout << "Pointer 2 points to \t" << ptr2 << " with content " << ptr2->id << std::endl;
    std::cout << "Element 1 is stored at \t" << aList[0].get() << " with content " << aList[0]->id << std::endl;
}
Note that ptr1 will point to a deleted object, so it is still undefined behavior to dereference it.
Another solution might be to use a different container that does not invalidate references and pointers. std::list never invalidates a node unless it's specifically erased. However, random access is not supported, so your example can't be directly modified to use std::list. You would have to iterate through the list to obtain your pointers.

Not sure if this is what you want, but how about this:
(only basic layout, you need to fill in the details, also: haven't tested the design, might have some flaws)
#include <algorithm>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>
template <class T>
class MagicVector {
public:
    class MagicPointer {
        friend class MagicVector<T>;
    private:
        MagicVector* parent;
        std::size_t position;
        bool valid;
        MagicPointer(MagicVector* par, std::size_t pos); // yes, private!
    public:
        T& content();
        void handle_erase(std::size_t erase_position);
    };
private:
    std::vector<T> data;
    // weak_ptr: the vector tracks its pointers without keeping them alive
    std::vector<std::weak_ptr<MagicPointer>> associated_pointers;
public:
    // (all the methods you need from vector), for example:
    void push_back(const T& value) { data.push_back(value); }
    ~MagicVector();
    void erase(std::size_t position);
    std::shared_ptr<MagicPointer> create_pointer(std::size_t position);
};
template <class T>
MagicVector<T>::~MagicVector() {
    // a MagicPointer can outlive its parent, so mark the survivors invalid
    for (auto& weak : associated_pointers) {
        if (auto ptr = weak.lock()) {
            ptr->valid = false;
        }
    }
}
template <class T>
void MagicVector<T>::erase(std::size_t position) {
    data.erase(data.begin() + static_cast<std::ptrdiff_t>(position));
    for (auto& weak : associated_pointers) {
        if (auto ptr = weak.lock()) {
            ptr->handle_erase(position);
        }
    }
}
template <class T>
std::shared_ptr<typename MagicVector<T>::MagicPointer>
MagicVector<T>::create_pointer(std::size_t position) {
    // drop entries whose MagicPointer has already been destroyed
    associated_pointers.erase(
        std::remove_if(associated_pointers.begin(), associated_pointers.end(),
                       [](const std::weak_ptr<MagicPointer>& w) { return w.expired(); }),
        associated_pointers.end());
    std::shared_ptr<MagicPointer> ptr(new MagicPointer(this, position));
    associated_pointers.push_back(ptr);
    return ptr;
}
template <class T>
MagicVector<T>::MagicPointer::MagicPointer(MagicVector* par, std::size_t pos)
    : parent(par), position(pos), valid(pos < par->data.size()) {}
template <class T>
T& MagicVector<T>::MagicPointer::content() {
    if (!valid) {
        // (handle this somehow; throwing is just one option)
        throw std::logic_error("MagicPointer no longer refers to an element");
    }
    return parent->data[position];
}
template <class T>
void MagicVector<T>::MagicPointer::handle_erase(std::size_t erase_position) {
    if (erase_position < position) {
        position--;
    }
    else if (erase_position == position) {
        valid = false;
    }
}
Basic idea: you have your own classes for vectors and pointers, with the pointer storing a position in the vector. The vector knows its pointers and updates them whenever something is erased.
I'm not completely satisfied myself; that shared_ptr over MagicPointer looks ugly, but I'm not sure how to simplify this. Maybe we need to work with three classes: MagicVector, MagicPointerCore (which stores the parent and position), and MagicPointer : public shared_ptr<MagicPointerCore>, with MagicVector having vector<MagicPointerCore> associated_pointers.
Note that the destructor of MagicVector has to set all of its associated pointers to invalid, since a MagicPointer can outlive its parent.

I was expecting, I somehow assumed that, when removing items, the vector would not start moving objects in memory.
How so? What else did you expect? A std::vector guarantees that its elements are stored contiguously in memory. So if something is removed, the remaining elements have to be moved to keep that memory contiguous.

sizeof std::aligned_storage and std::aligned_union

Given the following code:
#include <iostream>
#include <type_traits>
int main() {
    std::aligned_storage<sizeof(double), alignof(double)> storage;
    std::aligned_union<sizeof(double), double> union_storage;
    std::cout << sizeof(storage) << '\n';
    std::cout << sizeof(union_storage) << '\n';
    std::cout << sizeof(double) << '\n';
}
I expect sizeof(storage) and sizeof(union_storage) to be greater or equal to sizeof(double) since they have to be able to hold a double. However, I get the output
1
1
8
clang-3.8 and gcc-5.3 both produce this output.
Why does sizeof return an incorrect size?
If I use placement new to put a double into storage or union_storage would that be undefined behavior?

std::aligned_storage and std::aligned_union are type traits which provide a member type which is the actual type of the storage.
Thus, placing a double in the memory of the actual trait type would indeed be UB because they're empty types with just a typedef member.
#include <iostream>
#include <type_traits>
int main()
{
    using storage_type =
        std::aligned_storage<sizeof(double), alignof(double)>::type;
    using union_storage_type =
        std::aligned_union<sizeof(double), double>::type;
    storage_type storage;
    union_storage_type union_storage;
    std::cout << sizeof(storage_type) << '\n';
    std::cout << sizeof(union_storage_type) << '\n';
    std::cout << sizeof(storage) << '\n';
    std::cout << sizeof(union_storage) << '\n';
    std::cout << sizeof(double) << '\n';
    return 0;
}
This gives:
8
8
8
8
8
Note: As @T.C. correctly noted, C++14 provides alias templates ending in _t for the std type traits (i.e. std::aligned_storage<L, A>::type is the same type as std::aligned_storage_t<L, A>). The benefits are:
No typename in template-dependent contexts.
Less typing. ;)

How can a template function 'know' the size of the array given as template argument?

In the C++ code below, the templated Check function gives an output that is not what I would like: it's 1 instead of 3. I suspect that K is mapped to int*, not to int[3] (is that a type?). I would like it to give me the same output as the second (non-templated) function, to which I explicitly give the size of the array...
Short of using macros, is there a way to write a Check function that accepts a single argument but still knows the size of the array?
#include <iostream>
using namespace std;
int data[] = {1,2,3};
template <class K>
void Check(K data) {
    cout << "Deduced size: " << sizeof(data)/sizeof(int) << endl;
}
void Check(int*, int sizeofData) {
    cout << "Correct size: " << sizeofData/sizeof(int) << endl;
}
int main() {
    Check(data);
    Check(data, sizeof(data));
}
Thanks.
PS: In the real code, the array is an array of structs that must be iterated upon for unit tests.

template<class T, size_t S>
void Check(T (&)[S]) {
    cout << "Deduced size: " << S << endl;
}
