Wrapping dynamic array into STL/Boost container? - c++

I need to wrap a dynamically allocated array (from a = new double[100], for example) into a std::vector (preferably) without copying the array.
This restriction exists because the array I want to wrap is mmapped from a file, so just doing vector(a, a+size) would double the memory usage.
Are there any tricks to do that?

One of the best solutions for this is something like STLSoft's array_proxy<> template. Unfortunately, the doc page generated from the source code by doxygen isn't a whole lot of help understanding the template. The source code might actually be a bit better:
http://www.stlsoft.org/doc-1.9/array__proxy_8hpp-source.html
The array_proxy<> template is described nicely in Matthew Wilson's book, Imperfect C++. The version I've used is a cut-down version of what's on the STLSoft site so I didn't have to pull in the whole library. My version's not as portable, but that makes it much simpler than what's on STLSoft (which jumps through a whole lot of portability hoops).
If you set up a variable like so:
int myArray[100];
array_proxy<int> myArrayProx( myArray);
The variable myArrayProx has many of the STL interfaces - begin(), end(), size(), iterators, etc.
So in many ways, the array_proxy<> object behaves just like a vector (though push_back() isn't there since the array_proxy<> can't grow - it doesn't manage the array's memory, it just wraps it in something a little closer to a vector).
One really nice thing with array_proxy<> is that if you use them as function parameter types, the function can determine the size of the array passed in, which isn't true of native arrays. And the size of the wrapped array isn't part of the template's type, so it's quite flexible to use.
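To make the idea concrete, here is a minimal sketch of such a non-owning wrapper. It is not the STLSoft implementation: the name array_view, its members, and the helper functions sum and demo_sum are all hypothetical and only illustrate the technique described above.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

// Hypothetical, minimal non-owning view over an existing array.
// It never allocates or frees; the caller keeps ownership of the memory.
template <class T>
class array_view {
public:
    typedef T*       iterator;
    typedef const T* const_iterator;

    array_view(T* first, std::size_t count) : first_(first), count_(count) {}

    // Deduces the length of a native array, so callees learn its size.
    template <std::size_t N>
    explicit array_view(T (&arr)[N]) : first_(arr), count_(N) {}

    iterator begin() { return first_; }
    iterator end() { return first_ + count_; }
    const_iterator begin() const { return first_; }
    const_iterator end() const { return first_ + count_; }
    std::size_t size() const { return count_; }
    bool empty() const { return count_ == 0; }
    T& operator[](std::size_t i) { return first_[i]; }
    const T& operator[](std::size_t i) const { return first_[i]; }

private:
    T* first_;
    std::size_t count_;
};

// The function sees the element count without a separate size parameter.
inline double sum(const array_view<double>& values) {
    return std::accumulate(values.begin(), values.end(), 0.0);
}

// A heap block stands in for the mmap'ed region here.
inline double demo_sum() {
    double* raw = new double[4];
    for (std::size_t i = 0; i < 4; ++i) raw[i] = static_cast<double>(i + 1);
    array_view<double> view(raw, 4);
    view[0] = 10.0;            // writes through to raw[0], nothing is copied
    assert(raw[0] == 10.0);
    double total = sum(view);  // 10 + 2 + 3 + 4
    delete[] raw;              // the view never frees anything itself
    return total;
}
```

Because the size is a run-time member rather than a template parameter, the same array_view<double> type can wrap arrays of any length.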

A boost::iterator_range provides a container-like interface:
// Memory map an array of doubles:
size_t number_of_doubles_to_map = 100;
double* from_mmap = mmap_n_doubles(number_of_doubles_to_map);
// Wrap that in an iterator_range
typedef boost::iterator_range<double*> MappedDoubles;
MappedDoubles mapped(from_mmap, from_mmap + number_of_doubles_to_map);
// Use the range
MappedDoubles::iterator b = mapped.begin();
MappedDoubles::iterator e = mapped.end();
mapped[0] = 1.1;
double first = mapped(0);
if (mapped.empty()){
std::cout << "empty";
}
else{
std::cout << "We have " << mapped.size() << " elements. Here they are:\n"
<< mapped;
}

I was once determined to accomplish the exact same thing. After a few days of thinking and trying I decided it wasn't worth it. I ended up creating my own custom vector that behaved like std::vector but only had the functionality I actually needed, such as bounds checking, iterators, etc.
If you still desire to use std::vector, the only way I could think of back then was to create a custom allocator. I've never written one but seeing as this is the only way to control STL's memory management maybe there is something that can be done there.

No, that is not possible using a std::vector.
But, if possible, you can create the vector with this size and map the file into its storage instead.
std::vector<double> v(100);
mmapfile_double(&v[0], 100);

What about a vector of pointers that point to the elements of your mapped area? Note that this only reduces memory consumption where sizeof(double*) < sizeof(double), for example on 32-bit platforms; on typical 64-bit platforms both are 8 bytes. Is this OK for you?
There are some drawbacks (the primary one is that you need special predicates for sorting) but some benefits too: you can, for example, delete elements without changing the actual mapped content, or even keep several such arrays with different element orders without any change to the actual values.
There is a problem common to all std::vector-based solutions on a mapped file: keeping the vector's content 'nailed' to the mapped area. This can't be enforced; you can only take care not to use any operation that could make the vector reallocate its content. So be careful in any case.
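The "special predicate" mentioned above could look like the following sketch. The names less_by_pointee, make_index and sorted_index are hypothetical, and a plain array stands in for the mapped region.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Comparator that orders pointers by the values they point at,
// not by their addresses.
struct less_by_pointee {
    bool operator()(const double* a, const double* b) const { return *a < *b; }
};

// Builds one pointer per element of an existing block (e.g. a mapped region).
inline std::vector<double*> make_index(double* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    std::vector<double*> index;
    index.reserve(count);
    for (std::size_t i = 0; i < count; ++i) index.push_back(data + i);
    return index;
}

// Sorts the pointers only; the doubles they point at stay where they are.
inline std::vector<double*> sorted_index(double* data, std::size_t count) {
    std::vector<double*> index = make_index(data, count);
    std::sort(index.begin(), index.end(), less_by_pointee());
    return index;
}
```

Erasing an entry from the index or building a second index with another ordering leaves the underlying block untouched, which is the benefit described above.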

You could go with array_proxy<>, or take a look at Boost.Array . It gives you size(), front(), back(), at(), operator[], etc. Personally, I'd prefer Boost.Array since Boost is more prevalent anyway.

Well, the vector template allows you to provide your own memory allocator. I never did it myself, but I guess it is not that difficult to get it to point to your array, maybe with the placement new operator... Just a guess; I'll write more if I try and succeed.

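Here's the solution to your question. I had been attempting this off and on for quite some time before I came up with a workable solution. It relies on libstdc++ internals (_Vector_impl, _M_start and friends), so it is not portable to other standard library implementations. The caveat is that you have to zero out the pointers after use in order to avoid double-freeing the memory.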
#include <vector>
#include <iostream>
template <class T>
void wrapArrayInVector( T *sourceArray, size_t arraySize, std::vector<T, std::allocator<T> > &targetVector ) {
typename std::_Vector_base<T, std::allocator<T> >::_Vector_impl *vectorPtr =
(typename std::_Vector_base<T, std::allocator<T> >::_Vector_impl *)((void *) &targetVector);
vectorPtr->_M_start = sourceArray;
vectorPtr->_M_finish = vectorPtr->_M_end_of_storage = vectorPtr->_M_start + arraySize;
}
template <class T>
void releaseVectorWrapper( std::vector<T, std::allocator<T> > &targetVector ) {
typename std::_Vector_base<T, std::allocator<T> >::_Vector_impl *vectorPtr =
(typename std::_Vector_base<T, std::allocator<T> >::_Vector_impl *)((void *) &targetVector);
vectorPtr->_M_start = vectorPtr->_M_finish = vectorPtr->_M_end_of_storage = NULL;
}
int main() {
int tests[6] = { 1, 2, 3, 6, 5, 4 };
std::vector<int> targetVector;
wrapArrayInVector( tests, 6, targetVector);
std::cout << std::hex << &tests[0] << ": " << std::dec
<< tests[1] << " " << tests[3] << " " << tests[5] << std::endl;
std::cout << std::hex << &targetVector[0] << ": " << std::dec
<< targetVector[1] << " " << targetVector[3] << " " << targetVector[5] << std::endl;
releaseVectorWrapper( targetVector );
}
Alternatively you could just make a class that inherits from vector and nulls out the pointers upon destruction:
template <class T>
class vectorWrapper : public std::vector<T>
{
public:
vectorWrapper() {
this->_M_impl._M_start = this->_M_impl._M_finish = this->_M_impl._M_end_of_storage = NULL;
}
vectorWrapper(T* sourceArray, int arraySize)
{
this->_M_impl._M_start = sourceArray;
this->_M_impl._M_finish = this->_M_impl._M_end_of_storage = sourceArray + arraySize;
}
~vectorWrapper() {
// Null out the pointers so the base-class destructor does not free the wrapped array
this->_M_impl._M_start = this->_M_impl._M_finish = this->_M_impl._M_end_of_storage = NULL;
}
void wrapArray(T* sourceArray, int arraySize)
{
this->_M_impl._M_start = sourceArray;
this->_M_impl._M_finish = this->_M_impl._M_end_of_storage = sourceArray + arraySize;
}
};

Related

How do I create an array of objects?

I only know Java, and I am learning how to do c++ right now. I currently have an object called "node". I want to make an array of those elements in a different class, and I have to perform many operations on this array. Because of this, I am trying to declare a global array variable that gets initialized in my constructor. In Java, this would've been done by
ObjectName[] variableName = new ObjectName[size];
but I am not sure how to do it in c++. I've tried declaring it similar to how I declared the other global arrays, with
Node* nodes;
and then in my constructor:
nodes = new Node[size]
but I got a bunch of compiler errors. How am I supposed to do this? This is only my second week of coding in c++, so try to keep answers basic.
In C++ you use vector more often than array. You also distinguish between creating objects on the stack and on the heap (you already mentioned that concept; in C++ you are more actively involved in thinking about that).
You also may want to pay attention which C++ Standard you are using. Some concepts are not available in older standards. I tried to mention some in the example code below.
When dealing with arrays in C/C++ you should understand the notion of pointers, which I believe is the probable cause of your confusion. new creates an object on the heap and returns a pointer. When creating an array, then the returned pointer points to the first element.
Avoid new if you can. In newer C++ standards there are better concepts of smart pointers (e.g. std::unique_ptr<...>); I will not dive into that since you are just beginning. Be patient with learning C++, I am sure you will succeed, it takes time really.
#include <iostream>
#include <array>
#include <string>
#include <vector>
struct Node {
std::string name = "node";
};
int main() {
const size_t size = 10;
// you can create it on the stack
// will be deleted when leaving the block/scope
Node nodes1[size];
nodes1[0].name = "first node1";
std::cout << nodes1[0].name << std::endl;
// you can create it on the heap
// you have to delete the objects yourself then
Node *nodes2 = new Node[size];
nodes2[0].name = "first node2";
std::cout << nodes2[0].name << std::endl;
// in C++ 11 and later you can use std::array<...>
// you have to include the header <array> for that
std::array<Node, size> nodes3;
nodes3[0].name = "first node3";
std::cout << nodes3[0].name << std::endl;
// in C++ you use array "seldom"
// instead you use the containers quite a lot as far as I have learned
// e.g. you can include <vector>; can be used like an array
std::vector<Node> nodes4(size);
nodes4[0].name = "first node4";
std::cout << nodes4[0].name << std::endl;
// you can iterate over a vector like you know it from an array
for (size_t i = 0; i < nodes4.size(); ++i) {
if (i == 0) {
std::cout << nodes4[i].name << std::endl;
}
}
// in C++ you will soon learn about iterators too
for (auto iter = nodes4.begin(); iter != nodes4.end(); iter++) {
if (iter == nodes4.begin()) {
std::cout << iter->name << std::endl;
}
}
delete[] nodes2; // release the heap array created with new[] above
return 0;
}
How do I create an array of objects?
Given a type named ObjectName, you can define an array variable with name variableName and a compile time constant size size like this:
ObjectName variableName[size]{};
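Since the question asks for an array that lives in another class and is set up in its constructor, here is a sketch of that arrangement using std::vector as a data member. The class name Graph and its member functions are hypothetical; only Node comes from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Node {
    std::string name = "node";
};

// Hypothetical owner class: the "array" is a data member, not a global.
class Graph {
public:
    // The member initializer list builds `size` default-constructed nodes,
    // much like `new Node[size]` would, but without a manual delete[].
    explicit Graph(std::size_t size) : nodes(size) {}

    std::size_t count() const { return nodes.size(); }
    Node& at(std::size_t i) { return nodes.at(i); }  // bounds-checked access

private:
    std::vector<Node> nodes;  // released automatically together with the Graph
};

inline std::string demo_graph() {
    Graph g(10);
    g.at(0).name = "first";
    assert(g.count() == 10);
    return g.at(0).name + "," + g.at(9).name;
}
```

Unlike in Java, the ten Node objects already exist after construction; there is no array of null references to fill in.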

Can we use conventional pointer arithmetic with std::array?

I want to work out how to use old style pointer arithmetic on pointers to elements of the std::array class. The following code (unsurprisingly perhaps) does not compile:
int main(int argc, char *argv[])
{
double* data1 = new double[(int)std::pow(2,20)];
std::cout << *data1 << " " << *(data1 +1) << std::endl;
delete[] data1; // memory from new[] must be released with delete[]
data1 = NULL;
double* data2 = new std::array<double, (int)std::pow(2,20)>;
std::cout << *data2 << " " << *(data2 +1) << std::endl;
delete data2;
data2 = NULL;
return 0;
}
As an exercise, I want to use all the conventional pointer arithmetic, but instead of pointing at an old style double array, I want it to point to the elements of a std::array. My thinking with this line:
double* data2 = new std::array<double, (int)std::pow(2,20)>;
is to instruct the compiler that data2 is a pointer to the first element of the heap allocated std::array<double,(int)std::pow(2,20)>.
I have been taught that the double* name = new double[size]; means EXACTLY the following: «Stack allocate memory for a pointer to ONE double and name the pointer name, then heap allocate an array of doubles of size size, then set the pointer to point to the first element of the array.» Since the above code does not compile, I must have been taught something incorrect, since the same syntax doesn't work for std::arrays.
This raises a couple of questions:
What is the actual meaning of the statement type* name = new othertype[size];?
How can I achieve what I want using std::array?
Finally, how can I achieve the same using std::unique_ptr and std::make_unique?
I have been taught that the double* name = new double[size]; means EXACTLY the following: «Stack allocate memory for a pointer to ONE double and name the pointer name, then heap allocate an array of doubles of size size, then set the pointer to point to the first element of the array.» Since the above code does not compile, I must have been taught something incorrect, since the same syntax doesn't work for std::arrays.
You are correct about that statement, but keep in mind that the way this works is that new[] is a different operator from new. When you dynamically allocate an std::array, you're calling the single-object new, and the returned pointer points to the std::array object itself.
You can do pointer arithmetic on the contents of an std::array. For example, data2.data() + 1 is a pointer to data2[1]. Note that you have to call .data() to get a pointer to the underlying array.
Anyway, don't dynamically allocate std::array objects. Avoid dynamic allocation if possible, but if you need it, then use std::vector.
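For the std::unique_ptr and std::make_unique part of the question, a sketch along these lines should work, assuming a C++14 compiler. The constant kCount and the two demo functions are hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>

constexpr std::size_t kCount = std::size_t(1) << 20;  // 2^20 elements

inline double demo_unique_array() {
    // Heap-allocates ONE std::array object with single-object new.
    // make_unique value-initializes it, so every element starts at 0.
    auto owner = std::make_unique<std::array<double, kCount>>();
    double* p = owner->data();  // pointer to the first element
    *(p + 1) = 2.5;             // conventional pointer arithmetic
    assert(&(*owner)[1] == p + 1);
    return (*owner)[0] + (*owner)[1];
}   // the array is released automatically here

inline double demo_unique_raw() {
    // Array form: allocates kCount doubles with new[], also zero-initialized,
    // and releases them with delete[].
    std::unique_ptr<double[]> owner = std::make_unique<double[]>(kCount);
    double* p = owner.get();
    *(p + 1) = 4.0;
    return owner[0] + owner[1];
}
```

In both variants the pointer obtained from data() or get() must not outlive the owning unique_ptr.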
Can we use conventional pointer arithmetic with std::array?
Yes, sure you can - but not on the array itself, which is an object. Rather, you use the address of the data within the array, which you get with the std::array's data() method, like so:
std::array<double, 2> data2 { 12.3, 45.6 };
double* raw_data2 = data2.data(); // or &(*data2.begin());
std::cout << *raw_data2 << " " << *(raw_data2 + 1) << std::endl;
and this compiles and runs fine. But you probably don't really need to use pointer arithmetic and could just write your code differently, utilizing the nicer abstraction of an std::array.
PS - Avoid using explicit memory allocation with new and delete(see the C++ Core Guidelines item about this issue). In your case you don't need heap allocation at all - just like you don't need it with the regular array.
You can have access to the "raw pointer" view of std::array using the data() member function. However, the point of std::array is that you don't have to do this:
int main(int argc, char *argv[])
{
std::array<double, 2> myArray = {1.5, 2.5}; // initialized so the reads below are well-defined
double* data = myArray.data();
// Note that the builtin a[b] operator is exactly the same as
// doing *(a+b).
// But you can just use the overloaded operator[] of std::array.
// All of these thus print the same thing:
std::cout << *(data) << " " << *(data+1) << std::endl;
std::cout << data[0] << " " << data[1] << std::endl;
std::cout << myArray[0] << " " << myArray[1] << std::endl;
return 0;
}
The meaning of a generalized:
type* name = new othertype[size];
Ends up being "I need a variable name that's a pointer to type and initialize that with a contiguous allocation of size instances of othertype using new[]". Note that this involves casting and might not even work as othertype and type might not support that operation. A std::array of double is not equivalent to a pointer to double. It's a pointer to a std::array, period, but if you want to pretend that's a double and you don't mind if your program crashes due to undefined behaviour you can proceed. Your compiler should warn you here, and if it doesn't your warnings aren't strict enough.
Standard Library containers are all about iterators, not pointers, and especially not pointer arithmetic. Iterators are far more flexible and capable than pointers, they can handle exotic data structures like linked lists, trees and more without imposing a lot of overhead on the caller.
Some containers like std::vector and std::array support "random access iterators" which are a form of direct pointer-like access to their contents: a[1] and so on. Read the documentation of any given container carefully as some permit this, and many don't.
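The difference between iterator categories can be seen in a small sketch like this one (the two function names are hypothetical): std::vector accepts pointer-style arithmetic on its iterators, while std::list has to be stepped through.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

inline int third_of_vector(const std::vector<int>& v) {
    assert(v.size() >= 3);
    // Random access iterator: constant-time arithmetic, like a pointer.
    return *(v.begin() + 2);
}

inline int third_of_list(const std::list<int>& l) {
    assert(l.size() >= 3);
    // Bidirectional iterator: `l.begin() + 2` does not compile.
    // std::next advances one element at a time instead.
    return *std::next(l.begin(), 2);
}
```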
Remember that "variable" and "allocated on stack" are not necessarily the same thing. An optimizing compiler can and will put that pointer wherever it wants, including registers instead of memory, or nowhere at all if it thinks it can get away with it without breaking the expressed behaviour of your code.
If you want std::array, which you really do as Standard Library containers are almost always better than C-style arrays:
std::array<double, 2> data2;
If you need to share this structure you'll need to consider if the expense of using std::unique_ptr is worth it. The memory footprint of this thing will be tiny and copying it will be trivial, so it's pointless to engage a relatively expensive memory management function.
If you're passing around a larger structure, consider using a reference instead and locate the structure in the most central data structure you have so it doesn't need to be copied by design.
Sure, these are all legal:
template<class T, std::size_t N>
T* alloc_array_as_ptr() {
auto* arr = new std::array<T,N>;
if (!arr) return nullptr;
return arr->data();
}
template<class T, std::size_t N>
T* placement_array_as_ptr( void* ptr ) {
auto* arr = ::new(ptr) std::array<T,N>;
return arr->data();
}
template<std::size_t N, class T>
std::array<T, N>* ptr_as_array( T* in ) {
if (!in) return nullptr;
return reinterpret_cast<std::array<T,N>*>(in); // legal if created with either above 2 functions!
}
// does not delete!
template<std::size_t N, class T>
void destroy_array_as_ptr( T* t ) {
if (!t) return;
typedef std::array<T,N> array_type; // an explicit destructor call needs a plain type name
ptr_as_array<N>(t)->~array_type();
}
// deletes
template<std::size_t N, class T>
void delete_array_as_ptr(T* t) {
delete ptr_as_array<N>(t);
}
The above is, shockingly, actually legal if used perfectly. The pointer to the first element of the array is pointer-interconvertible with the entire std::array.
You do have to keep track of the array size yourself.
I wouldn't advise doing this.
std::array is a STL container after all!
auto storage = std::array<double, 1 << 20>{};
auto data = storage.begin();
std::cout << *data << " " << *(data + 1) << std::endl;

Memory size of the new allocated int in c++, Is there a different and better way to see it?

In this program I am trying to find out how much memory is allocated for my pointer. I can see that it should be 1 gibibyte, which is 1 073 741 824 bytes. My problem is that the only way I can get this number is by taking the size of int, which is 4, and multiplying it by that constant. Is there a different way?
#include "stdafx.h"
#include <iostream>
#include <new>
int main(){
const int gib = 268435456; //Created a constant int so I could allocate 1
//GiB of memory (268435456 ints * 4 bytes = 1073741824 bytes)
int *ptr = new int [gib];
std::cout << sizeof (int)*gib << std::endl;
std::cout << *ptr << std::endl;
std::cout << ptr << std::endl;
try {
}catch (std::bad_alloc e) {
std::cerr << e.what() << std::endl;
}
system("PAUSE");
delete[] ptr;
return 0;
}
No, there is no way. The compiler internally adds information about how much memory was allocated and how many elements were created by new[], because otherwise it couldn't perform delete[] correctly. However, there is no portable way in C++ to get that information and use it directly.
So you have to store the size separately while you still know it.
Actually, you don't, because std::vector does it for you:
#include <iostream>
#include <vector>
#include <new>
int main() {
const int gib = 268435456; // 1 GiB / 4 bytes per int
try {
std::vector<int> v(gib);
std::cout << (v.capacity() * sizeof(int)) << '\n';
} catch (std::bad_alloc const& e) {
std::cerr << e.what() << '\n';
}
}
You should practically never use new[]. Use std::vector.
Note that I've used capacity and not size, because size tells you how many items the vector represents, and that number can be smaller than the number of elements supported by the vector's currently allocated memory.
There is also no way to avoid the sizeof, because the size of an int can vary among implementations. But that's not a problem, either, because a std::vector cannot lose its type information, so you always know how big one element is.
You wouldn't need the multiplication if it was a std::vector<char>, a std::vector<unsigned char> or a std::vector<signed char>, because those three character types' sizeof is guaranteed to be 1.
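The distinction between size and capacity can be sketched as follows. The helper names bytes_in_use, bytes_reserved and demo_capacity are hypothetical; the exact capacity after reserve is implementation-defined, only the lower bound is guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bytes occupied by the elements the vector currently holds.
template <class T>
std::size_t bytes_in_use(const std::vector<T>& v) {
    return v.size() * sizeof(T);
}

// Bytes of element storage the vector has allocated, used or not.
template <class T>
std::size_t bytes_reserved(const std::vector<T>& v) {
    return v.capacity() * sizeof(T);
}

inline bool demo_capacity() {
    std::vector<int> v;
    v.reserve(1000);   // capacity is now at least 1000, size is still 0
    v.push_back(42);
    assert(v.size() == 1);
    return bytes_in_use(v) == sizeof(int) &&
           bytes_reserved(v) >= 1000 * sizeof(int);
}
```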
There is no way to retrieve the amount of allocated memory from the pointer. Let's forget for a moment that standard containers (and smart pointers) exist; then you could use a struct that encapsulates the pointer and the size. The simplest dynamic array I can imagine is this:
template <typename T>
struct my_dynamic_array {
size_t capacity;
T* data;
my_dynamic_array(size_t capacity) : capacity(capacity),data(new T[capacity]) {}
~my_dynamic_array() { delete[] data; }
const T& operator[](int i) const { return data[i];}
T& operator[](int i) { return data[i];}
};
Note that this is just a basic example for the sake of demonstration, e.g. you shouldn't copy instances of this struct or bad things will happen (both copies would delete[] the same memory). However, it can be used like this:
my_dynamic_array<int> x(5);
x[3] = 1;
std::cout << x[3];
i.e. no pointers and no manual memory allocation in the code using the array, which is a good thing. Actually, this alone is already a big deal, because now you can make use of RAII and cannot forget to delete the memory.
Next you may want to resize your array, which requires just a bit more boilerplate (again: take it with a grain of salt!):
template <typename T>
struct my_dynamically_sized_array : my_dynamic_array<T> {
size_t size;
my_dynamically_sized_array(size_t size, size_t capacity) :
my_dynamic_array<T>(capacity),size(size) {}
void push(const T& t) {
my_dynamic_array<T>::data[size] = t;
++size;
}
};
It can be used like this:
my_dynamically_sized_array<int> y(0,3);
y.push(3);
std::cout << y[0];
Of course the memory would need to be reallocated when the size grows bigger than the capacity and many more things would be required to make this wrapper really functional (eg being able to copy would be nice).
The bottom line is: Don't do any of this! Writing a good full-blown container class requires much more than I can outline here, and most of that is boilerplate that doesn't really add value to your code base, because std::vector already is a thin wrapper around dynamically allocated memory that offers all you need while not imposing overhead for stuff you don't use.

STL container with different contained types?

Let's say I have different types of components which are structs. Maybe I have TransformComponent and RigidBodyComponent
Now, this is the problem: I want something like an std::map where you map a component type and an id to a component. The ids are what link components together. What kind of container should I use for this? I can't use an std::map<std::type_index, std::map<id_t, T>> since the type T depends on which type_index you use to index the first map.
Your use case sounds like a typical use of polymorphism. You should know that any attempt to store "non-homogeneous" types in a single container will come with the performance penalty of polymorphism. As for whether you use the "out of the box" polymorphism that C++ provides or go for a custom solution, it's entirely up to you.
BTW, to cite one of the questions from the comments on the question:
Suppose you can have such a container. What will you do with it? Can
you show some intended usage examples?
This is a very good question, because revealing your particular usage scenario will allow others to answer your question in much more detail; right now it sounds like you don't really know what you are doing or need to do. So, if you need further guidance, you should really clarify and build on your question.
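As a rough illustration of the "out of the box" polymorphism route, the components could share a base class and be stored through std::unique_ptr. Everything here beyond the two component names is an assumption: the Component base class, the ComponentStore class with its add and find members, and entity_id as a stand-in for the question's id_t.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

typedef unsigned int entity_id;  // stand-in for the question's id_t

struct Component {
    virtual ~Component() {}
    virtual std::string describe() const = 0;
};

struct TransformComponent : Component {
    float x = 0.0f, y = 0.0f;
    std::string describe() const override { return "transform"; }
};

struct RigidBodyComponent : Component {
    float mass = 1.0f;
    std::string describe() const override { return "rigid body"; }
};

class ComponentStore {
public:
    // Creates a default-constructed T for the entity and returns it.
    template <class T>
    T& add(entity_id id) {
        std::unique_ptr<T> created(new T());
        T& ref = *created;
        table_[std::type_index(typeid(T))][id] = std::move(created);
        return ref;
    }

    // Returns nullptr when the entity has no component of type T.
    template <class T>
    T* find(entity_id id) {
        auto by_type = table_.find(std::type_index(typeid(T)));
        if (by_type == table_.end()) return nullptr;
        auto by_id = by_type->second.find(id);
        if (by_id == by_type->second.end()) return nullptr;
        // Safe: entries under this key were all created as T in add<T>.
        return static_cast<T*>(by_id->second.get());
    }

private:
    std::map<std::type_index,
             std::map<entity_id, std::unique_ptr<Component>>> table_;
};

inline std::string demo_store() {
    ComponentStore store;
    store.add<TransformComponent>(7).x = 3.0f;
    store.add<RigidBodyComponent>(7);
    assert(store.find<RigidBodyComponent>(8) == nullptr);  // unknown entity
    return store.find<TransformComponent>(7)->describe();
}
```

The price is one heap allocation and one virtual table pointer per component, which is the performance penalty mentioned above.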
If you need to work with a container that holds different types, look at some Boost libraries:
Any: Safe, generic container for single values of different value types. (http://www.boost.org/doc/libs/1_54_0/doc/html/any.html)
Variant: Safe, generic, stack-based discriminated union container (http://www.boost.org/doc/libs/1_54_0/doc/html/variant.html)
Use variant, if list of your types is well defined and does not change.
So your code can look like this:
typedef boost::variant<TransformComponent, RigidBodyComponent> my_struct;
std::map<std::type_index, std::map<id_t, my_struct> > cont;
...
std::type_index index = std::type_index(typeid(TransformComponent));
std::map<id_t, my_struct> & m = cont[index];
id_t id = ...;
TransformComponent & component = boost::get<TransformComponent>(m[id]);
This code is quite ugly, so think about changing the architecture. Maybe it will be simpler with boost::any or with boost::variant.
P.S.
If you write template code, then it may be better to look at boost::mpl.
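If a C++17 compiler is available, std::variant from the standard library plays the same role as boost::variant. The following is only a sketch under that assumption: the component members, the Describe visitor, the entity_id typedef and the multimap layout are all hypothetical choices, not part of the answer above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>

struct TransformComponent { float x = 0.0f; float y = 0.0f; };
struct RigidBodyComponent { float mass = 1.0f; };

typedef unsigned int entity_id;  // stand-in for the question's id_t
typedef std::variant<TransformComponent, RigidBodyComponent> AnyComponent;

// Visitor with one overload per alternative of the variant.
struct Describe {
    std::string operator()(const TransformComponent&) const { return "transform"; }
    std::string operator()(const RigidBodyComponent&) const { return "rigid body"; }
};

inline std::string demo_variant() {
    // One entity id may map to several components of different types.
    std::multimap<entity_id, AnyComponent> components;
    components.emplace(7, TransformComponent{1.0f, 2.0f});
    components.emplace(7, RigidBodyComponent{5.0f});

    std::string out;
    auto range = components.equal_range(7);
    for (auto it = range.first; it != range.second; ++it) {
        // std::get_if yields nullptr unless the variant holds that type.
        if (const RigidBodyComponent* body =
                std::get_if<RigidBodyComponent>(&it->second)) {
            assert(body->mass == 5.0f);
        }
        out += std::visit(Describe(), it->second) + ";";
    }
    return out;
}
```

The components are stored by value inside the map nodes, so no per-component heap allocation or virtual dispatch is needed, at the cost of fixing the list of types at compile time.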
So you can solve this if you don't mind writing a custom container that uses ye olde C hacking.
I've written you an example here:
#include <iostream>
using namespace std;
struct ent
{
int myInt;
};
struct floats
{
float float1;
float float2;
};
struct container
{
bool isTypeFloats;
union
{
ent myEnt;
floats myFloats;
};
};
int main()
{
ent a = { 13 };
floats b = { 1.0f, 2.0f };
container c;
container d;
cout << b.float1 << " " << b.float2 << endl;
c.isTypeFloats = false;
c.myEnt = a;
d.isTypeFloats = true;
d.myFloats = b;
//correct accessor
if( c.isTypeFloats )
{
cout << c.myFloats.float1 << " " << c.myFloats.float2 << endl;
}
else
{
cout << c.myEnt.myInt << endl;
}
if( d.isTypeFloats )
{
cout << d.myFloats.float1 << " " << d.myFloats.float2 << endl;
}
else
{
cout << d.myEnt.myInt << endl;
}
}
To put these structs in a container you'd just do: std::vector< container >
A couple things you should know about this:
unions allocate space for the largest type. So in my example if an int takes 4 bytes and a float takes 4 bytes, then even when I'm storing just an ent it will allocate space for a floats so I'll be wasting 4 bytes each time I'm storing an ent. Depending on the purpose of your application and the size of the types you're storing this may be negligible.
If there is a significant disparity in size between the types you're storing, you can store pointers instead of the objects themselves, using a void*. So you would do: std::vector< void* > myVec and you would insert like this: myVec.push_back( &x ) where x is your object, for example the ent from our example (x must outlive the vector). But to read it out you'd have to know what you were pointing at, so you'd have to know to do something like: cout << ( ( ent* )myVec[0] )->myInt << endl;
Because you probably wouldn't know what type it was unless you had some predefined writing pattern you'd probably just end up wanting to use a container struct like so:
struct container2
{
bool isTypeFloats;
void* myUnion;
};
How about boost::any or boost::any_cast ??
http://www.boost.org/doc/libs/1_54_0/doc/html/any.html

Apply std::begin() on an dynamically allocated array in a unique_ptr?

I have a unique pointer to a dynamically allocated array like this:
const int quantity = 6;
unique_ptr<int[]> numbers(new int[quantity]);
This should be correct so far (I think, the [] in the template parameter is important, right?).
By the way: Is it possible to initialize the elements like in int some_array[quantity] = {}; here?
Now I was trying to iterate over the array like this:
for (auto it = begin(numbers); it != end(numbers); ++it)
cout << *it << endl;
But I cannot figure out the right syntax. Is there a way?
Alternatively I can use the index like:
for (int i = 0; i < quantity; ++i)
cout << numbers[i] << endl;
Is one of these to be preferred?
(Not directly related to the title: As a next step I would like to reduce that to a range-based for loop but I just have VS2010 right now and cannot try that. But would there be something I have to take care of?)
Thank you! Gerrit
The compiler is supposed to apply this prototype for std::begin:
template< class T, size_t N >
T* begin( T (&array)[N] );
That means the parameter type is int(&)[N], which is neither std::unique_ptr nor int *. And even if this were possible, how could std::end calculate the position of the last element?
But why not use a raw pointer directly, or an STL container?
const int quantity = 6;
std::unique_ptr<int[]> numbers{new int[quantity]};
// assignment
std::copy_n(numbers.get(), quantity,
std::ostream_iterator<int>(std::cout, "\n"));
const int quantity = 6;
std::vector<int> numbers(quantity, 0);
// assignment
std::copy(cbegin(numbers), cend(numbers),
std::ostream_iterator<int>(std::cout, "\n"));
Dynamically allocated arrays in C++ (ie: the result of new []) do not have sizing information. Therefore, you can't get the size of the array.
You could implement std::begin like this:
namespace std
{
template<typename T> T* begin(const std::unique_ptr<T[]>& ptr) {return ptr.get();}
}
But there's no way to implement end.
Have you considered using std::vector? With move support, it shouldn't be any more expensive than a unique_ptr to an array.
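If the unique_ptr has to stay, the raw pointer from get() together with the separately stored quantity can serve as a begin/end pair. This sketch also addresses the side question about initialization: a trailing () after new int[quantity] value-initializes the elements. The function name demo_unique_ptr_array is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>

inline int demo_unique_ptr_array() {
    const std::size_t quantity = 6;
    // The trailing () value-initializes the elements, so every int is 0.
    std::unique_ptr<int[]> numbers(new int[quantity]());
    assert(numbers[0] == 0);

    int* first = numbers.get();    // plays the role of begin()
    int* last = first + quantity;  // plays the role of end(); the size is ours to track
    std::iota(first, last, 1);     // fill with 1, 2, ..., 6

    int total = 0;
    for (int* it = first; it != last; ++it) total += *it;
    return total;
}
```

A range-based for loop needs begin and end to be discoverable from the object itself, which is one more reason to prefer std::vector<int> here.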