posix_memalign for std::vector - c++

Is there a way to posix_memalign a std::vector without creating a local instance of the vector first?
The problem I'm encountering is that I need to tell posix_memalign how much space to allocate and I don't know how to say
sizeof(std::vector<type>(n))
without actually creating a new vector.
Thanks

Well, there are two sizes here. The vector itself is typically no more than a pointer or two to some allocated memory, and unsigned integers keeping track of size and capacity. There is also the allocated memory itself, which is what I think you want.
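To make the "two sizes" point concrete, here is a minimal sketch. The helper name `vector_object_size` is made up for this illustration; it only shows that `sizeof` reports the size of the vector object itself, which does not depend on how many elements the vector holds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the size of the vector object itself,
// not the size of the heap buffer it points to.
inline std::size_t vector_object_size(std::size_t element_count)
{
    std::vector<int> v(element_count);
    assert(v.size() == element_count);
    return sizeof(v); // identical to sizeof(std::vector<int>), a compile-time constant
}
```

Calling it with 0 and with 1000 elements gives the same number, so `sizeof` cannot be used to size the element buffer.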
What you want to do is make a custom allocator that the vector will use. When it comes time, it will use your allocator and you can have your own special functionality. I won't go over the full details of an allocator, but the specifics:
template <typename T>
struct aligned_allocator
{
// ...
pointer allocate(size_type pCount, const_pointer = 0)
{
void* mem = 0; // posix_memalign expects a void**, so allocate into a void* first
if (posix_memalign(&mem, YourAlignment, sizeof(T) * pCount) != 0)
{
throw std::bad_alloc(); // or something
}
return static_cast<pointer>(mem);
}
void deallocate(pointer pPtr, size_type)
{
free(pPtr);
}
// ...
};
And then you'd use it like:
typedef std::vector<T, aligned_allocator<T> > aligned_T_vector;
aligned_T_vector vec;
vec.push_back( /* ... */ ); // array is aligned
But to reiterate the first point, the size of a vector is the same regardless of how many elements it's holding, as it only points to a buffer. Only the size of that buffer changes.
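For reference, a fuller version of the allocator could look roughly like the following. This is a sketch assuming C++11 or later and a POSIX system; the `Alignment` template parameter, its default of 64, and the `is_aligned` helper are my own additions for illustration and are not part of the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdlib.h> // posix_memalign, free (POSIX, not standard C++)
#include <vector>

template <typename T, std::size_t Alignment = 64>
struct aligned_allocator
{
    using value_type = T;

    // posix_memalign needs a power of two that is a multiple of sizeof(void*).
    static_assert(Alignment >= sizeof(void*) && (Alignment & (Alignment - 1)) == 0,
                  "Alignment must be a power of two and at least sizeof(void*)");

    // Needed explicitly because of the non-type template parameter.
    template <typename U>
    struct rebind { using other = aligned_allocator<U, Alignment>; };

    aligned_allocator() = default;
    template <typename U>
    aligned_allocator(const aligned_allocator<U, Alignment>&) noexcept {}

    T* allocate(std::size_t count)
    {
        if (count > static_cast<std::size_t>(-1) / sizeof(T))
            throw std::bad_alloc(); // sizeof(T) * count would overflow
        void* mem = nullptr;
        if (posix_memalign(&mem, Alignment, sizeof(T) * count) != 0)
            throw std::bad_alloc();
        return static_cast<T*>(mem);
    }

    void deallocate(T* ptr, std::size_t) noexcept { free(ptr); }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U, std::size_t A>
bool operator==(const aligned_allocator<T, A>&, const aligned_allocator<U, A>&) noexcept { return true; }
template <typename T, typename U, std::size_t A>
bool operator!=(const aligned_allocator<T, A>&, const aligned_allocator<U, A>&) noexcept { return false; }

// Hypothetical helper for checking the result.
inline bool is_aligned(const void* p, std::size_t alignment)
{
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

With this, `std::vector<double, aligned_allocator<double, 64>>` keeps its element buffer 64-byte aligned across reallocations, since every buffer comes from `allocate`.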

Related

Problems that may arise when initializing arrays on stack inside a function scope with an N size_t parameter?

Say for example I have a function that takes some argument and a size_t length to initialize an array on stack inside a function.
Considering the following:
The length is strictly in the range of 1 to 30 (using a fixed max buffer length of 30 is not allowed).
The array only stays inside the function and is only used to compute a result.
int foo(/*some argument, ..., ... */ size_t length) {
uint64_t array[length];
int some_result = 0;
// some code that uses the array to compute something ...
return some_result;
}
Normally I would use a std::vector, new or the *alloc functions for this, but I'm trying to optimize: this function is called repeatedly throughout the lifetime of the program, which makes the heap allocations a large overhead.
Initially using an array on stack with fixed size is the solution that I have come up with, but I cannot do this, for some reasons that I cannot tell since it would be rude.
Anyway, I wonder if I can get away with this approach without running into problems in the future?
In the rare cases where I've done some image processing with large fixed sized temp buffers or just wanted to avoid the runtime for redundant alloc/free calls, I've made my own heap.
It doesn't make a lot of sense for small allocations, where you could just use the stack, but you indicated your instructor said not to do this. So you could try something like this:
template<typename T>
struct ArrayHeap {
unordered_map<size_t, list<shared_ptr<T[]>>> available;
unordered_map<T*, pair<size_t, shared_ptr<T[]>>> inuse; // keyed by T*, not uint64_t*, so the template works for any T
T* Allocate(size_t length) {
auto &l = available[length];
shared_ptr<T[]> ptr;
if (l.size() == 0) {
ptr.reset(new T[length]);
} else {
ptr = l.front();
l.pop_front();
}
inuse[ptr.get()] = {length, ptr};
return ptr.get();
}
void Deallocate(T* allocation) {
auto itor = inuse.find(allocation);
if (itor == inuse.end()) {
// assert
} else {
auto &p = itor->second;
size_t length = p.first;
shared_ptr<T[]> ptr = p.second;
inuse.erase(allocation);
// optional - you can choose not to push the pointer back onto the available list
// if you have some criteria by which you want to reduce memory usage
available[length].push_back(ptr);
}
}
};
In the above code, you can Allocate a buffer of a specific length. The first call for a given length incurs the overhead of allocating with new. Once that buffer has been returned to the heap with Deallocate, later allocations of the same length reuse it and are fast.
Then your function can be implemented like this:
ArrayHeap<uint64_t> global_heap;
int foo(/*some argument, ..., ... */ size_t length) {
uint64_t* array = global_heap.Allocate(length);
int some_result = 0;
// some code that uses the array to compute something ...
global_heap.Deallocate(array);
return some_result;
}
Personally I would use a fixed size array on the stack, but if there are reasons to prohibit that then check if there are any against the alloca() method.
man 3 alloca
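For illustration, an alloca()-based version might look like the sketch below. `sum_of_squares` is a hypothetical stand-in for foo(), and `<alloca.h>` is non-standard: it is available with glibc and some other C libraries, but not everywhere. Treat this as a sketch under those assumptions.

```cpp
#include <alloca.h> // non-standard header; assumed available (e.g. glibc)
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for foo(): uses a stack scratch array of `length`
// elements to compute a result.
std::uint64_t sum_of_squares(std::size_t length)
{
    assert(length >= 1 && length <= 30); // keep the stack usage small and bounded

    // The memory lives in this function's stack frame and is released on return.
    auto* array = static_cast<std::uint64_t*>(alloca(length * sizeof(std::uint64_t)));

    for (std::size_t i = 0; i < length; ++i)
        array[i] = i * i;

    std::uint64_t total = 0;
    for (std::size_t i = 0; i < length; ++i)
        total += array[i];
    return total;
}
```

Unlike `uint64_t array[length];`, this does not rely on the variable-length-array extension, but it shares the same risk: an unchecked length can overflow the stack, hence the range check.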

Pointer wrapper for insertion

Can someone tell me if there is a datatype in C++/STL that allows me to solve the following problem comfortably:
I have a preallocated contiguous area of memory representing an array of objects of type T.
I have a raw pointer ptrEnd to this area which points right after the last object of the area.
I have a pointer ptrCurrent that points to some position inside this area.
Now what I want is some kind of wrapper class that helps me insert new elements into this area. It should have some kind of "append" function which basically does the following things
Assign *ptrCurrent the value of object to insert
Increment ptrCurrent by one.
Omit the aforementioned steps if ptrCurrent >= ptrEnd. Return an error instead (or a false to indicate failure).
I could write something like this myself, but I wanted to ask first if there is a class in C++ STL that allows me to solve this problem more elegantly.
Thanks for your help.
There is a convenient feature for exactly this in C++17, polymorphic allocators. More specifically, this is what you want:
std::pmr::monotonic_buffer_resource buffer(sizeof(T) * 256);
// Buffer that can hold 256 objects of type `T`.
std::pmr::vector<T> vec(&buffer);
// The vector will use `buffer` as the backing storage.
live godbolt.org example
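Note that the single-argument constructor used above only sets an initial size; the resource still obtains its memory from the default upstream resource. To place the vector in an area you already own, you can pass the buffer and its size instead. The sketch below assumes C++17; `points_into` and `demo_fixed_buffer` are hypothetical names, and `std::pmr::null_memory_resource()` is used as the upstream so that running out of space throws instead of silently falling back to the heap.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>
#include <vector>

// Hypothetical helper: true if p points into [begin, begin + bytes).
inline bool points_into(const void* p, const std::byte* begin, std::size_t bytes)
{
    auto* c = static_cast<const std::byte*>(p);
    return c >= begin && c < begin + bytes;
}

inline bool demo_fixed_buffer()
{
    // Stand-in for the preallocated area: room for 16 ints.
    alignas(int) std::byte storage[16 * sizeof(int)];

    std::pmr::monotonic_buffer_resource buffer(
        storage, sizeof storage, std::pmr::null_memory_resource());
    std::pmr::vector<int> vec(&buffer);

    vec.reserve(8); // reserve up front: a monotonic resource never reuses freed space
    for (int i = 0; i < 8; ++i)
        vec.push_back(i);
    assert(vec.size() == 8);

    const bool inside = points_into(vec.data(), storage, sizeof storage);

    bool threw = false;
    try {
        vec.reserve(1000); // does not fit, and the upstream refuses to allocate
    } catch (const std::bad_alloc&) {
        threw = true;
    }
    return inside && threw && vec.size() == 8;
}
```

Running out of space is reported as an exception here, not as a false return value, so this only approximates the "append or fail" behaviour asked for.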
You'd need to write an Allocator that hands out Ts from your array, and then std::vector can use it.
template <typename T>
class ArrayAllocator
{
T* current;
T* end;
public:
using value_type = T;
ArrayAllocator(T* start, T* end) : current(start), end(end) {}
T* allocate(size_t n)
{
if (n > static_cast<size_t>(end - current)) throw std::bad_alloc(); // filling the area exactly up to end is still valid
T * result = current;
current += n;
return result;
}
void deallocate(T* what, size_t n)
{
if (what + n != current) throw std::runtime_error("bad deallocate");
current = what;
}
size_t max_size() { return end - current; }
};
You'd have to immediately reserve the whole amount, because when vector reallocates it needs to copy the old values into the new space, which will result in a "bad deallocate".
I ended up writing an AppendHelper class that takes the start and end pointer and otherwise reproduces the std::vector interface. I realized that using std::vector with a custom allocator meant not having full control over when allocation and deallocation is performed, so the result could behave differently from my original intention.
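A stripped-down version of such a helper, covering only the three steps from the question, might look like this. It is a sketch, not the actual class: the name `AppendHelper` comes from the text above, but the members and the `appended()` function are assumptions.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
class AppendHelper
{
    T* start_;   // where appending began
    T* current_; // next free slot
    T* end_;     // one past the last slot of the area

public:
    AppendHelper(T* current, T* end)
        : start_(current), current_(current), end_(end)
    {
        assert(current <= end);
    }

    // Assigns value to *current and advances; returns false if the area is full.
    bool append(const T& value)
    {
        if (current_ >= end_)
            return false;
        *current_ = value;
        ++current_;
        return true;
    }

    // Number of elements appended through this helper.
    std::size_t appended() const { return static_cast<std::size_t>(current_ - start_); }
};
```

Because `append` assigns to `*current`, this assumes the area already holds constructed objects of type T, as described in the question.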

C++: Array with custom size in class

I want to do this:
class Graphic
{
int *array;
Graphic( int size )
{
int temp_array[size];
array = temp_array;
glGenTextures( size, array );
}
}
Will this work? And even if it will, is there a better way to do this?
Thanks.
Using new means you have to remember to delete [] it; using compiler-dependent variable-size arrays means you lose portability.
It's much better to use a vector.
#include <vector>
class Graphic
{
std::vector<int> array;
public: // the constructor must be public to be usable from outside the class
Graphic( int size )
{
array.resize(size);
glGenTextures( size, &array[0] );
}
};
The language guarantees that vector elements will be contiguous in memory so it's safe to do &array[0] here.
No, the memory for temp_array is allocated on the stack. When the function ends then that memory is deallocated and all you'll be left with is a dangling pointer. If you want to keep the array valid beyond the point that the constructor returns then allocate it dynamically using new. Example:
array = new int[size];
And then remember to delete it. Typically this is done in the destructor like this:
delete[] array;

How is vector implemented in C++

I am thinking of how I can implement std::vector from the ground up.
How does it resize the vector?
realloc only seems to work for plain old structs, or am I wrong?
It is a simple templated class which wraps a native array. It does not use malloc/realloc. Instead, it uses the passed allocator (which by default is std::allocator).
Resizing is done by allocating a new array and copy constructing each element in the new array from the old one (this way it is safe for non-POD objects). To avoid frequent allocations, often they follow a non-linear growth pattern.
UPDATE: in C++11, the elements will be moved instead of copy constructed if it is possible for the stored type.
In addition to this, it will need to store the current "size" and "capacity". Size is how many elements are actually in the vector. Capacity is how many could be in the vector.
So as a starting point a vector will need to look somewhat like this:
template <class T, class A = std::allocator<T> >
class vector {
public:
// public member functions
private:
T* data_;
typename A::size_type capacity_;
typename A::size_type size_;
A allocator_;
};
The other common implementation is to store pointers to the different parts of the array. This cheapens the cost of end() (which no longer needs an addition) ever so slightly at the expense of a marginally more expensive size() call (which now needs a subtraction). In which case it could look like this:
template <class T, class A = std::allocator<T> >
class vector {
public:
// public member functions
private:
T* data_; // points to first element
T* end_capacity_; // points to one past internal storage
T* end_; // points to one past last element
A allocator_;
};
I believe gcc's libstdc++ uses the latter approach, but both approaches are equally valid and conforming.
NOTE: This is ignoring a common optimization where the empty base class optimization is used for the allocator. I think that is a quality of implementation detail, and not a matter of correctness.
Resizing the vector requires allocating a new chunk of space, and copying the existing data to the new space (thus, the requirement that items placed into a vector can be copied).
Note that it does not use new [] either -- it uses the allocator that's passed, but that's required to allocate raw memory, not an array of objects like new [] does. You then need to use placement new to construct objects in place. [Edit: well, you could technically use new char[size], and use that as raw memory, but I can't quite imagine anybody writing an allocator like that.]
When the current allocation is exhausted and a new block of memory needs to be allocated, the size must be increased by a constant factor compared to the old size to meet the requirement for amortized constant complexity for push_back. Though many web sites (and such) call this doubling the size, a factor around 1.5 to 1.6 usually works better. In particular, this generally improves chances of re-using freed blocks for future allocations.
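Putting these pieces together, a deliberately tiny sketch could look like the following. `mini_vector` is a made-up name, it assumes the allocator's pointer type is a plain `T*`, it supports only `push_back` and indexing, and it uses the 1.5x growth factor mentioned above. It is nowhere near a conforming std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

template <class T, class A = std::allocator<T>>
class mini_vector
{
    using traits = std::allocator_traits<A>;

    T* data_ = nullptr;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
    A allocator_;

    // Destroys all elements and returns the raw memory to the allocator.
    void release() noexcept
    {
        for (std::size_t i = 0; i < size_; ++i)
            traits::destroy(allocator_, data_ + i);
        if (data_)
            traits::deallocate(allocator_, data_, capacity_);
    }

    void grow()
    {
        // Roughly 1.5x growth: 0, 1, 2, 3, 4, 6, 9, ...
        const std::size_t new_capacity =
            capacity_ < 2 ? capacity_ + 1 : capacity_ + capacity_ / 2;
        T* new_data = traits::allocate(allocator_, new_capacity); // raw memory only
        std::size_t done = 0;
        try {
            // Move if the move constructor cannot throw, otherwise copy.
            for (; done < size_; ++done)
                traits::construct(allocator_, new_data + done,
                                  std::move_if_noexcept(data_[done]));
        } catch (...) {
            while (done > 0)
                traits::destroy(allocator_, new_data + --done);
            traits::deallocate(allocator_, new_data, new_capacity);
            throw;
        }
        release(); // old elements and old buffer
        data_ = new_data;
        capacity_ = new_capacity;
    }

public:
    mini_vector() = default;
    mini_vector(const mini_vector&) = delete;            // copying omitted for brevity
    mini_vector& operator=(const mini_vector&) = delete;
    ~mini_vector() { release(); }

    // Taking the argument by value keeps v.push_back(v[0]) safe across a reallocation.
    void push_back(T value)
    {
        if (size_ == capacity_)
            grow();
        traits::construct(allocator_, data_ + size_, std::move(value));
        ++size_;
    }

    T& operator[](std::size_t i) { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
};
```

The allocator hands out raw memory and `construct`/`destroy` handle object lifetime separately, which is the split between allocation and construction described above.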
From Wikipedia, as good an answer as any.
A typical vector implementation consists, internally, of a pointer to
a dynamically allocated array,[2] and possibly data members holding
the capacity and size of the vector. The size of the vector refers to
the actual number of elements, while the capacity refers to the size
of the internal array. When new elements are inserted, if the new size
of the vector becomes larger than its capacity, reallocation
occurs.[2][4] This typically causes the vector to allocate a new
region of storage, move the previously held elements to the new region
of storage, and free the old region. Because the addresses of the
elements change during this process, any references or iterators to
elements in the vector become invalidated.[5] Using an invalidated
reference causes undefined behaviour
Like this:
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_vector.h
(official gcc mirror on github)
// Implement Vector class
#include <iostream>
using std::cout;
using std::endl;
class MyVector {
int *int_arr;
int capacity;
int current;
public:
MyVector() {
int_arr = new int[1];
capacity = 1;
current = 0;
}
~MyVector() { delete[] int_arr; } // release the buffer (copying a MyVector is still unsafe)
void Push(int nData);
void PushData(int nData, int index);
void PopData();
int GetData(int index);
int GetSize();
void Print();
};
void MyVector::Push(int data)
{
if (current == capacity){
int *temp = new int[2 * capacity];
for (int i = 0; i < capacity; i++)
{
temp[i] = int_arr[i];
}
delete[] int_arr;
capacity *= 2;
int_arr = temp;
}
int_arr[current] = data;
current++;
}
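void MyVector::PushData(int data, int index)
{
if (index == current){
Push(data); // append the value (the original pushed the index by mistake)
}
else if (index >= 0 && index < current)
int_arr[index] = data; // overwrite an existing element
}
void MyVector::PopData(){
if (current > 0) // do nothing on an empty vector
current--;
}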
int MyVector::GetData(int index)
{
if (index >= 0 && index < current){
return int_arr[index];
}
return -1; // out of range: sentinel value so that every path returns something
}
int MyVector::GetSize()
{
return current;
}
void MyVector::Print()
{
for (int i = 0; i < current; i++) {
cout << int_arr[i] << " ";
}
cout << endl;
}
int main()
{
MyVector vect;
vect.Push(10);
vect.Push(20);
vect.Push(30);
vect.Push(40);
vect.Print();
std::cout << "\nTop item is "
<< vect.GetData(3) << std::endl;
vect.PopData();
vect.Print();
cout << "\nTop item is "
<< vect.GetData(1) << endl;
return 0;
}
It allocates a new array and copies everything over. So, expanding it is quite inefficient if you have to do it often. Use reserve() if you have to use push_back().
You'd need to define what you mean by "plain old structs."
realloc by itself only creates a block of uninitialized memory. It does no object allocation. For C structs, this suffices, but for C++ it does not.
That's not to say you couldn't use realloc. But if you were to use it (note you wouldn't be reimplementing std::vector exactly in this case!), you'd need to:
Make sure you're consistently using malloc/realloc/free throughout your class.
Use "placement new" to initialize objects in your memory chunk.
Explicitly call destructors to clean up objects before freeing your memory chunk.
This is actually pretty close to what vector does in my implementation (GCC/libstdc++), except it uses the C++ low-level routines ::operator new and ::operator delete to do the raw memory management instead of malloc and free, rewrites the realloc routine using these primitives, and delegates all of this behavior to an allocator object that can be replaced with a custom implementation.
Since vector is a template, you actually should have its source to look at if you want a reference – if you can get past the preponderance of underscores, it shouldn't be too hard to read. If you're on a Unix box using GCC, try looking for /usr/include/c++/version/vector or thereabouts.
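The three steps listed above can be sketched as follows. The function `raw_storage_demo` and the helper `destroy_in_place` are hypothetical names for this illustration. The block is grown by hand (allocate, move, destroy, free) instead of calling realloc, because realloc may relocate the bytes without running any constructors, which is not valid for a type like std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper: explicit destructor call for an object built with placement new.
template <typename T>
void destroy_in_place(T* p) noexcept { p->~T(); }

inline std::string raw_storage_demo()
{
    std::size_t capacity = 2;
    std::size_t size = 0;

    // 1. Raw, uninitialized memory from malloc: no std::string exists yet.
    auto* items = static_cast<std::string*>(std::malloc(capacity * sizeof(std::string)));
    if (!items) throw std::bad_alloc();

    // 2. Placement new constructs objects inside that memory.
    new (items + size++) std::string("alpha");
    new (items + size++) std::string("beta");
    assert(size == capacity);

    // Grow by hand: new block, move each element over, destroy the old one.
    const std::size_t new_capacity = capacity * 2;
    auto* bigger = static_cast<std::string*>(std::malloc(new_capacity * sizeof(std::string)));
    if (!bigger) {
        for (std::size_t i = 0; i < size; ++i) destroy_in_place(items + i);
        std::free(items);
        throw std::bad_alloc();
    }
    for (std::size_t i = 0; i < size; ++i) {
        new (bigger + i) std::string(std::move(items[i]));
        destroy_in_place(items + i);
    }
    std::free(items);
    items = bigger;
    capacity = new_capacity;

    new (items + size++) std::string("gamma");

    std::string joined;
    for (std::size_t i = 0; i < size; ++i) joined += items[i];

    // 3. Explicit destructor calls before the memory is freed.
    for (std::size_t i = 0; i < size; ++i) destroy_in_place(items + i);
    std::free(items);
    return joined;
}
```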
You can implement it with a resizing array.
When the array becomes full, create an array of twice the size and copy all the contents to the new array. Do not forget to delete the old array.
As for deleting elements from the vector, shrink the array (halve it) when it becomes a quarter full. This strategy prevents performance glitches when insertions and deletions alternate right at a resize boundary.
It can be mathematically proved that the total time for n insertions is still linear, i.e. the amortized (average) time per insertion is constant, which is asymptotically the same as you get when filling a normal static array.
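The linear bound can be checked with a small simulation. The helper below is hypothetical: it does not store anything, it only counts how many element copies a doubling array would perform while n elements are appended, starting from capacity 1.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of element copies caused by resizing
// during n appends to a doubling array that starts with capacity 1.
inline std::size_t copies_for_doubling(std::size_t n)
{
    std::size_t capacity = 1;
    std::size_t size = 0;
    std::size_t copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (size == capacity) {
            copies += size; // every existing element is copied to the new array
            capacity *= 2;
        }
        ++size;
    }
    return copies;
}
```

The copies add up to 1 + 2 + 4 + ..., which stays below 2n, so n appends cost O(n) in total.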
realloc only works on heap memory. In C++ you usually want to use the free store.

Improvements for this C++ stack allocator?

Any suggestions for my stack based allocator?
(Except for suggestions to use a class with private/public members)
struct Heap
{
void* heap_start;
void* heap_end;
size_t max_end;
Heap(size_t size)
{
heap_start = malloc(size);
heap_end = heap_start;
max_end = size + (size_t) heap_start;
}
~Heap()
{
::free(heap_start);
}
void* allocate(size_t bytes)
{
size_t new_end = ((size_t) heap_end) + bytes;
if( new_end > max_end )
throw std::bad_alloc();
void* output = heap_end;
heap_end = (void*) new_end;
return output;
}
};
You've implemented a stack based allocator. You can't free up without leaving gaps. Usually a pool refers to a block of contiguous memory with fixed sized slots, which are doubly linked to allow constant time add and delete.
Here's one you can use as a guide. It's along the same lines as yours but includes basic iterators over allocated nodes, and uses templates to be type aware.
size_t new_end = ((size_t) heap_end) + bytes;
Not good. Never do things like that: you assume that sizeof(size_t)==sizeof(void*). Also, what happens if bytes==(size_t)(-1)? The addition wraps around, so the bounds check passes when it should fail.
Additionally, you need to make sure that the pointers you return are aligned; otherwise you will have problems.
So make sure that bytes is a multiple of 4 or 8, according to your platform.
class {...
char *max_end,*heap_end,*heap_start;
};
...
max_end=heap_start+size;
...
bytes=align_to_platform_specific_value(bytes);
if(max_end-heap_end >= bytes) {
void* output = (void*)heap_end;
heap_end+=bytes;
return output;
}
throw std::bad_alloc();
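Filled in, the fragment above might look like this. It is a sketch: `BumpHeap` and `align_up` are made-up names, and rounding every request up to `alignof(std::max_align_t)` is one possible choice for the platform-specific alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

class BumpHeap
{
    char* heap_start_;
    char* heap_end_; // next free byte
    char* max_end_;  // one past the last usable byte

    // Rounds bytes up to a multiple of the strictest fundamental alignment.
    static std::size_t align_up(std::size_t bytes)
    {
        const std::size_t a = alignof(std::max_align_t);
        if (bytes > static_cast<std::size_t>(-1) - (a - 1))
            throw std::bad_alloc(); // rounding up would overflow
        return (bytes + a - 1) / a * a;
    }

public:
    explicit BumpHeap(std::size_t size)
        : heap_start_(static_cast<char*>(std::malloc(size))),
          heap_end_(heap_start_),
          max_end_(heap_start_ ? heap_start_ + size : nullptr)
    {
        if (!heap_start_) throw std::bad_alloc();
    }
    ~BumpHeap() { std::free(heap_start_); }
    BumpHeap(const BumpHeap&) = delete;
    BumpHeap& operator=(const BumpHeap&) = delete;

    void* allocate(std::size_t bytes)
    {
        bytes = align_up(bytes);
        // Compare against the remaining space; this form cannot wrap around.
        if (static_cast<std::size_t>(max_end_ - heap_end_) < bytes)
            throw std::bad_alloc();
        void* output = heap_end_;
        heap_end_ += bytes;
        return output;
    }

    std::size_t used() const
    {
        assert(heap_end_ >= heap_start_);
        return static_cast<std::size_t>(heap_end_ - heap_start_);
    }
};
```

Since malloc returns suitably aligned memory and every allocation is a multiple of that alignment, each returned pointer stays aligned.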
Suggestion? Do not reinvent the wheel. There are many and good pool libraries.
Two obvious problems:
1/ You don't have a deallocate().
2/ A deallocate() will be very hard to write with your current strategy unless you're always going to deallocate in the exact reverse order of allocating. You'll need to cater for the case where a client wants to deallocate memory in the middle of your used section.
Of course, if you do deallocate in reverse order, (2) is not a problem. And if you never free memory at all, (1) is also not a problem.
It depends on what you want it to do.
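If reverse-order deallocation is enough for your use case, a deallocate() that only accepts the most recent allocation is straightforward. The sketch below uses a hypothetical `StackArena` class; alignment handling is left out to keep it short, and the caller has to pass the allocation size back in.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

class StackArena
{
    std::unique_ptr<char[]> base_;
    std::size_t capacity_;
    std::size_t top_ = 0; // offset of the next free byte

public:
    explicit StackArena(std::size_t capacity)
        : base_(new char[capacity]), capacity_(capacity) {}

    void* allocate(std::size_t bytes)
    {
        if (bytes > capacity_ - top_)
            throw std::bad_alloc();
        void* p = base_.get() + top_;
        top_ += bytes;
        return p;
    }

    // Succeeds only if ptr is the most recent live allocation (LIFO order).
    bool deallocate(void* ptr, std::size_t bytes) noexcept
    {
        char* p = static_cast<char*>(ptr);
        if (bytes > top_ || p != base_.get() + (top_ - bytes))
            return false; // not on top of the stack: freeing it would leave a gap
        top_ -= bytes;
        return true;
    }

    std::size_t used() const
    {
        assert(top_ <= capacity_);
        return top_;
    }
};
```

Freeing anything other than the top block is refused, which is exactly the restriction described in point 2.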
Your heap doesn't allow deallocation. How will you use it for objects allocated with new in C++?