So I have a struct as shown below, I would like to create an array of that structure and allocate memory for it (using malloc).
typedef struct {
float *Dxx;
float *Dxy;
float *Dyy;
} Hessian;
My first instinct was to allocate memory for the whole structure, but then, I believe the internal arrays (Dxx, Dxy, Dyy) won't be assigned. If I assign internal arrays one by one, then the structure of arrays would be undefined. Now I think I should assign memory for internal arrays and then for the structure array, but it seems just wrong to me. How should I solve this issue?
I require a logic for using malloc in this situation instead of new / delete because I have to do this in cuda and memory allocation in cuda is done using cudaMalloc, which is somewhat similar to malloc.
In C++ you should not use malloc at all and instead use new and delete if actually necessary. From the information you've provided it is not, because in C++ you also rather use std::vector (or std::array) over C-style-arrays. Also the typedef is not needed.
So I'd suggest rewriting your struct to use vectors and then generate a vector of this struct, i.e.:
struct Hessian {
std::vector<float> Dxx;
std::vector<float> Dxy;
std::vector<float> Dyy;
};
std::vector<Hessian> hessianArray(2); // vector containing two instances of your struct
hessianArray[0].Dxx.push_back(1.0); // example accessing the members
Using vectors you do not have to worry about allocation most of the time, since the class handles that for you. Every Hessian contained in hessianArray is automatically allocated for you, stored on the heap and destroyed when hessianArray goes out of scope.
It seems like problem which could be solved using STL container. Regarding the fact you won't know sizes of arrays you may use std::vector.
It's less error-prone, easier to maintain/work with and standard containers free their resources them self (RAII). #muXXmit2X already shown how to use them.
But if you have/want to use dynamic allocation, you have to first allocate space for array of X structures
Hessian *h = new Hessian[X];
Then allocate space for all arrays in all structures
for (int i = 0; i < X; i++)
{
h[i].Dxx = new float[Y];
// Same for Dxy & Dyy
}
Now you can access and modify them. Also dont forget to free resources
for (int i = 0; i < X; i++)
{
delete[] h[i].Dxx;
// Same for Dxy & Dyy
}
delete[] h;
You should never use malloc in c++.
Why?
new will ensure that your type will have their constructor called. While malloc will not call constructor. The new keyword is also more type safe whereas malloc is not typesafe at all.
As other answers point out, the use of malloc (or even new) should be avoided in c++. Anyway, as you requested:
I require a logic for using malloc in this situation instead of new / delete because I have to do this in cuda...
In this case you have to allocate memory for the Hessian instances first, then iterate throug them and allocate memory for each Dxx, Dxy and Dyy. I would create a function for this like follows:
Hessian* create(size_t length) {
Hessian* obj = (Hessian*)malloc(length * sizeof(Hessian));
for(size_t i = 0; i < length; ++i) {
obj[i].Dxx = (float*)malloc(sizeof(float));
obj[i].Dxy = (float*)malloc(sizeof(float));
obj[i].Dyy = (float*)malloc(sizeof(float));
}
return obj;
}
To deallocate the memory you allocated with create function above, you have to iterate through Hessian instances and deallocate each Dxx, Dxy and Dyy first, then deallocate the block which stores the Hessian instances:
void destroy(Hessian* obj, size_t length) {
for(size_t i = 0; i < length; ++i) {
free(obj[i].Dxx);
free(obj[i].Dxy);
free(obj[i].Dyy);
}
free(obj);
}
Note: using the presented method will pass the responsibility of preventing memory leaks to you.
If you wish to use the std::vector instead of manual allocation and deallocation (which is highly recommended), you can write a custom allocator for it to use cudaMalloc and cudaFree like follows:
template<typename T> struct cuda_allocator {
using value_type = T;
cuda_allocator() = default;
template<typename U> cuda_allocator(const cuda_allocator<U>&) {
}
T* allocate(std::size_t count) {
if(count <= max_size()) {
void* raw_ptr = nullptr;
if(cudaMalloc(&raw_ptr, count * sizeof(T)) == cudaSuccess)
return static_cast<T*>(raw_ptr);
}
throw std::bad_alloc();
}
void deallocate(T* raw_ptr, std::size_t) {
cudaFree(raw_ptr);
}
static std::size_t max_size() {
return std::numeric_limits<std::size_t>::max() / sizeof(T);
}
};
template<typename T, typename U>
inline bool operator==(const cuda_allocator<T>&, const cuda_allocator<U>&) {
return true;
}
template<typename T, typename U>
inline bool operator!=(const cuda_allocator<T>& a, const cuda_allocator<U>& b) {
return !(a == b);
}
The usage of an custom allocator is very simple, you just have to specify it as second template parameter of std::vector:
struct Hessian {
std::vector<float, cuda_allocator<float>> Dxx;
std::vector<float, cuda_allocator<float>> Dxy;
std::vector<float, cuda_allocator<float>> Dyy;
};
/* ... */
std::vector<Hessian, cuda_allocator<Hessian>> hessian;
Related
I'm creating pooled_allocator to alloc memory for my components of specific type in one memory block. How can I update already allocated ptrs stored in some container?
I implemented ECS for my game with common approach. I used
std::unordered_map<TypeIndex, vector<BaseComponent*>>
to store my components by type.
And it is how I allocates memory for components now.
template<typename T, typename... Args>
T* Entity::assign(Args&&... args)
{
//....
// if there are no components of this type yet
std::allocator<T> alloc;
T* comp = std::allocator_traits<std::allocator<T>>::allocate(alloc, 1);
std::allocator_traits<std::allocator<T>>::construct(alloc, comp, T(args...));
// get container for type and add pointer to newly allocated component, that has to be stored in pool
container.push_back(comp);
components.insert({ getTypeIndex<T>(), container });
}
So, now I wanna implement some pooled_allocator, that should meet all std::allocator_traits requirenments to use it as allocator for my components. But I wanna to make it dynamic so it should be able to extend its internal memory block.
What I have for now?
template<typename T, unsigned int Size = 10>
class pooled_allocator
{
// some typedefs
typedef value_type * pointer;
static void* m_memoryBlock; // malloc(SIZE*Size) or nullptr to reproduce problem
static std::list<MyBlockNode> m_allocated;
static const size_t SIZE; // sizeof(T)
// some methods
static pointer allocate(const size_t amount)
{
if (!m_memoryBlock)
m_memoryBlock = malloc(amount * SIZE);
else
{
// here realloc can return another pointer
m_memoryBlock = realloc(m_memoryBlock, m_allocated.size() * SIZE + amount * SIZE);
}
int index = m_allocated.size();
m_allocated.push_back(MyBlockNode { });
pointer elPointer = (pointer)((pointer)m_memoryBlock + index * SIZE);
return elPointer;
}
template <class Up, class... Args>
void construct(Up* p, Args&&... args)
{
new((void*)p) Up(std::forward<Args>(args)...);
}
}
The problem is where realloc returns the pointer to another allocated block ( different pointer ) already allocated objects have copied to a new place and all ptrs stored inside components[typeIndex()] becomes invalid(
How I can fix it? One of the options is to return some ComponentHandle instead of T* and if internal memory block has moved, update all already returned handles with new pointers, but it will decrease allocation speed and makes some restrictions.
And I know about boost::pool_allocator, but my game is too small to integrate such libs.
You created your problem defining your own memory allocation like that.
A solution is to not realloc but to allocate a new block when needed, and to manage all these blocks.
But the real question is why do you do that ? You probably choose that way so solve an other problem, which one ?
I currently have a c++ class as follows:
template<class T>
class MyQueue {
T** m_pBuffer;
unsigned int m_uSize;
unsigned int m_uPendingCount;
unsigned int m_uAvailableIdx;
unsigned int m_uPendingndex;
public:
MyQueue(): m_pBuffer(NULL), m_uSize(0), m_uPendingCount(0), m_uAvailableIdx(0),
m_uPendingndex(0)
{
}
~MyQueue()
{
delete[] m_pBuffer;
}
bool Initialize(T *pItems, unsigned int uSize)
{
m_uSize = uSize;
m_uPendingCount = 0;
m_uAvailableIdx = 0;
m_uPendingndex = 0;
m_pBuffer = new T *[m_uSize];
for (unsigned int i = 0; i < m_uSize; i++)
{
m_pBuffer[i] = &pItems[i];
}
return true;
}
};
So, I have this pointer to arrays m_pBuffer object and I was wondering if it is possible to replace this way of doing things with the c++ smart pointer perhaps? I know I can do things like:
std::unique_ptr<T> buffer(new T[size]);
Is using a vector of smart pointers the way to go? Is this recommended and safe?
[EDIT]
Based on the answers and the comments, I have tried to make a thread-safe buffer array. Here it is. Please comment.
#ifndef __BUFFER_ARRAY_H__
#define __BUFFER_ARRAY_H__
#include <memory>
#include <vector>
#include <mutex>
#include <thread>
#include "macros.h"
template<class T>
class BufferArray
{
public:
class BufferArray()
:num_pending_items(0), pending_index(0), available_index(0)
{}
// This method is not thread-safe.
// Add an item to our buffer list
void add(T * buffer)
{
buffer_array.push_back(std::unique_ptr<T>(buffer));
}
// Returns a naked pointer to an available buffer. Should not be
// deleted by the caller.
T * get_available()
{
std::lock_guard<std::mutex> lock(buffer_array_mutex);
if (num_pending_items == buffer_array.size()) {
return NULL;
}
T * buffer = buffer_array[available_index].get();
// Update the indexes.
available_index = (available_index + 1) % buffer_array.size();
num_pending_items += 1;
return buffer;
}
T * get_pending()
{
std::lock_guard<std::mutex> lock(buffer_array_mutex);
if (num_pending_items == 0) {
return NULL;
}
T * buffer = buffer_array[pending_index].get();
pending_index = (pending_index + 1) % buffer_array.size();
num_pending_items -= 1;
}
private:
std::vector<std::unique_ptr<T> > buffer_array;
std::mutex buffer_array_mutex;
unsigned int num_pending_items;
unsigned int pending_index;
unsigned int available_index;
// No copy semantics
BufferArray(const BufferArray &) = delete;
void operator=(const BufferArray &) = delete;
};
#endif
Vector of smart pointers is good idea. It is safe enough inside your class - automatic memory deallocation is provided.
It is not thread-safe though, and it's not safe in regard of handling external memory given to you by simple pointers.
Note that you current implementation does not delete pItems memory in destructor, so if after refactoring you mimic this class, you should not use vector of smart pointers as they will delete memory referenced by their pointers.
On the other side you cannot garantee that noone outside will not deallocate memory for pItems supplied to your Initialize. IF you want to use vector of smart pointers, you should formulate contract for this function that clearly states that your class claims this memory etc. - and then you should rework outside code that calls your class to fit into new contract.
If you don't want to change memory handling, vector of simple pointers is the way to go. Nevertheless, this piece of code is so simple, that there is no real benefit of vector.
Note that overhead here is creation of smart pointer class for each buffer and creation of vector class. Reallocation of vector can take up more memory and happens without your direct control.
The code has two issues:
1) Violation of the rule of zero/three/five:
To fix that you do not need a smart pointer here. To represent a dynamic array with variable size use a std:vector<T*>. That allows you to drop m_pBuffer, m_uSize and the destructor, too.
2) Taking the addresses of elements of a possible local array
In Initialize you take the addresses of the elements of the array pItems passed as argument to the function. Hence the queue does not take ownership of the elements. It seems the queue is a utility class, which should not be copyable at all:
template<class T>
class MyQueue
{
std::vector<T*> m_buffer;
unsigned int m_uPendingCount;
unsigned int m_uAvailableIdx;
unsigned int m_uPendingndex;
public:
MyQueue(T *pItems, unsigned int uSize)
: m_buffer(uSize, nullptr), m_uPendingCount(0), m_uAvailableIdx(0), m_uPendingndex(0)
{
for (unsigned int i = 0; i < uSize; i++)
{
m_buffer[i] = &pItems[i];
}
}
private:
MyQueue(const MyQueue&); // no copy (C++11 use: = delete)
MyQueue& operator = (const MyQueue&); // no copy (C++11 use: = delete)
};
Note:
The red herring is the local array.
You may consider a smart pointer for that, but that is another question.
I have a class that contains several arrays whose sizes can be determined by parameters to its constructor. My problem is that instances of this class have sizes that can't be determined at compile time, and I don't know how to tell a new method at run time how big I need my object to be. Each object will be of a fixed size, but different instances may be different sizes.
There are several ways around the problem:- use a factory- use a placement constructor- allocate arrays in the constructor and store pointers to them in my object.
I am adapting some legacy code from an old application written in C. In the original code, the program figures out how much memory will be needed for the entire object, calls malloc() for that amount, and proceeds to initialize the various fields.
For the C++ version, I'd like to be able to make a (fairly) normal constructor for my object. It will be a descendant of a parent class, and some of the code will be depending on polymorphism to call the right method. Other classes descended from the same parent have sizes known at compile time, and thus present no problem.
I'd like to avoid some of the special considerations necessary when using placement new, and I'd like to be able to delete the objects in a normal way.
I'd like to avoid carrying pointers within the body of my object, partially to avoid ownership problems associated with copying the object, and partially because I would like to re-use as much of the existing C code as possible. If ownership were the only issue, I could probably just use shared pointers and not worry.
Here's a very trimmed-down version of the C code that creates the objects:
typedef struct
{
int controls;
int coords;
} myobject;
myobject* create_obj(int controls, int coords)
{
size_t size = sizeof(myobject) + (controls + coords*2) * sizeof(double);
char* mem = malloc(size);
myobject* p = (myobject *) mem;
p->controls = controls;
p->coords = coords;
return p;
}
The arrays within the object maintain a fixed size of the life of the object. In the code above, memory following the structure of myobject will be used to hold the array elements.
I feel like I may be missing something obvious. Is there some way that I don't know about to write a (fairly) normal constructor in C++ but be able to tell it how much memory the object will require at run time, without resorting to a "placement new" scenario?
How about a pragmatic approach: keep the structure as is (if compatibility with C is important) and wrap it into a c++ class?
typedef struct
{
int controls;
int coords;
} myobject;
myobject* create_obj(int controls, int coords);
void dispose_obj(myobject* obj);
class MyObject
{
public:
MyObject(int controls, int coords) {_data = create_obj(controls, coords);}
~MyObject() {dispose_obj(_data);}
const myobject* data() const
{
return _data;
}
myobject* data()
{
return _data;
}
int controls() const {return _data->controls;}
int coords() const {return _data->coords;}
double* array() { return (double*)(_data+1); }
private:
myobject* _data;
}
While I understand the desire to limit the changes to the existing C code, it would be better to do it correctly now rather than fight with bugs in the future. I suggest the following structure and changes to your code to deal with it (which I suspect would mostly be pulling out code that calculates offsets).
struct spots
{
double x;
double y;
};
struct myobject
{
std::vector<double> m_controls;
std::vector<spots> m_coordinates;
myobject( int controls, int coordinates ) :
m_controls( controls ),
m_coordinates( coordinates )
{ }
};
To maintain the semantics of the original code, where the struct and array are in a single contigious block of memory, you can simply replace malloc(size) with new char[size] instead:
myobject* create_obj(int controls, int coords)
{
size_t size = sizeof(myobject) + (controls + coords*2) * sizeof(double);
char* mem = new char[size];
myobject* p = new(mem) myobject;
p->controls = controls;
p->coords = coords;
return p;
}
You will have to use a type-cast when freeing the memory with delete[], though:
myobject *p = create_obj(...);
...
p->~myobject();
delete[] (char*) p;
In this case, I would suggest wrapping that logic in another function:
void free_obj(myobject *p)
{
p->~myobject();
delete[] (char*) p;
}
myobject *p = create_obj(...);
...
free_obj(p);
That being said, if you are allowed to, it would be better to re-write the code to follow C++ semantics instead, eg:
struct myobject
{
int controls;
int coords;
std::vector<double> values;
myobject(int acontrols, int acoords) :
controls(acontrols),
coords(acoords),
values(acontrols + acoords*2)
{
}
};
And then you can do this:
std::unique_ptr<myobject> p = std::make_unique<myobject>(...); // C++14
...
std::unique_ptr<myobject> p(new myobject(...)); // C++11
...
std::auto_ptr<myobject> p(new myobject(...)); // pre C++11
...
New Answer (given comment from OP):
Allocate a std::vector<byte> of the correct size. The array allocated to back the vector will be contiguous memory. This vector size can be calculated and the vector will manage your memory correctly. You will still need to be very careful about how you manage your access to that byte array obviously, but you can use iterators and the like at least (if you want).
By the way here is a little template thing I use to move along byte blobs with a little more grace (note this has aliasing issues as pointed out by Sergey in the comments below, I'm leaving it here because it seems to be a good example of what not to do... :-) ) :
template<typename T>
T readFromBuf(byte*& ptr) {
T * const p = reinterpret_cast<T*>(ptr);
ptr += sizeof(T);
return *p;
}
Old Answer:
As the comments suggest, you can easily use a std::vector to do what you want. Also I would like to make another suggestion.
size_t size = sizeof(myobject) + (controls + coords*2) * sizeof(double);
The above line of code suggests to me that you have some "hidden structure" in your code. Your myobject struct has two int values from which you are calculating the size of what you actually need. What you actually need is this:
struct ControlCoord {
double control;
std::pair<double, double> coordinate;
};
std::vector<ControlCoord>> controlCoords;
When the comments finally scheded some light on the actual requirements, the solution would be following:
allocate a buffer large enough to hold your object and the array
use placement new in the beginning of the buffer
Here is how:
class myobject {
myobject(int controls, int coords) : controls(controls), coords(coords) {}
~myobject() {};
public:
const int controls;
const int coords;
static myobject* create(int controls, int coords) {
std::unique_ptr<char> buffer = new char[sizeof(myobject) + (controls + coords*2) * sizeof(double)];
myobject obj* = new (buffer.get()) myobject(controls, coords);
buffer.release();
return obj;
}
void dispose() {
~myobject();
char* p = (char*)this;
delete[] p;
}
};
myobject *p = myobject::create(...);
...
p->dispose();
(or suitably wrapped inside deleter for smart pointer)
The question is that: is there a way to use the class "vector" in Cuda kernels? When I try I get the following error:
error : calling a host function("std::vector<int, std::allocator<int> > ::push_back") from a __device__/__global__ function not allowed
So there a way to use a vector in global section?
I recently tried the following:
create a new Cuda project
go to properties of the project
open Cuda C/C++
go to Device
change the value in "Code Generation" to be set to this value:
compute_20,sm_20
........ after that I was able to use the printf standard library function in my Cuda kernel.
is there a way to use the standard library class vector in the way printf is supported in kernel code? This is an example of using printf in kernel code:
// this code only to count the 3s in an array using Cuda
//private_count is an array to hold every thread's result separately
__global__ void countKernel(int *a, int length, int* private_count)
{
printf("%d\n",threadIdx.x); //it's print the thread id and it's working
// vector<int> y;
//y.push_back(0); is there a possibility to do this?
unsigned int offset = threadIdx.x * length;
int i = offset;
for( ; i < offset + length; i++)
{
if(a[i] == 3)
{
private_count[threadIdx.x]++;
printf("%d ",a[i]);
}
}
}
You can't use the STL in CUDA, but you may be able to use the Thrust library to do what you want. Otherwise just copy the contents of the vector to the device and operate on it normally.
In the cuda library thrust, you can use thrust::device_vector<classT> to define a vector on device, and the data transfer between host STL vector and device vector is very straightforward. you can refer to this useful link:http://docs.nvidia.com/cuda/thrust/index.html to find some useful examples.
you can't use std::vector in device code, you should use array instead.
I think you can implement a device vector by youself, because CUDA supports dynamic memory alloction in device codes. Operator new/delete are also supported. Here is an extremely simple prototype of device vector in CUDA, but it does work. It hasn't been tested sufficiently.
template<typename T>
class LocalVector
{
private:
T* m_begin;
T* m_end;
size_t capacity;
size_t length;
__device__ void expand() {
capacity *= 2;
size_t tempLength = (m_end - m_begin);
T* tempBegin = new T[capacity];
memcpy(tempBegin, m_begin, tempLength * sizeof(T));
delete[] m_begin;
m_begin = tempBegin;
m_end = m_begin + tempLength;
length = static_cast<size_t>(m_end - m_begin);
}
public:
__device__ explicit LocalVector() : length(0), capacity(16) {
m_begin = new T[capacity];
m_end = m_begin;
}
__device__ T& operator[] (unsigned int index) {
return *(m_begin + index);//*(begin+index)
}
__device__ T* begin() {
return m_begin;
}
__device__ T* end() {
return m_end;
}
__device__ ~LocalVector()
{
delete[] m_begin;
m_begin = nullptr;
}
__device__ void add(T t) {
if ((m_end - m_begin) >= capacity) {
expand();
}
new (m_end) T(t);
m_end++;
length++;
}
__device__ T pop() {
T endElement = (*m_end);
delete m_end;
m_end--;
return endElement;
}
__device__ size_t getSize() {
return length;
}
};
You can't use std::vector in device-side code. Why?
It's not marked to allow this
The "formal" reason is that, to use code in your device-side function or kernel, that code itself has to be in a __device__ function; and the code in the standard library, including, std::vector is not. (There's an exception for constexpr code; and in C++20, std::vector does have constexpr methods, but CUDA does not support C++20 at the moment, plus, that constexprness is effectively limited.)
You probably don't really want to
The std::vector class uses allocators to obtain more memory when it needs to grow the storage for the vectors you create or add into. By default (i.e. if you use std::vector<T> for some T) - that allocation is on the heap. While this could be adapted to the GPU - it would be quite slow, and incredibly slow if each "CUDA thread" would dynamically allocate its own memory.
#Now, you could say "But I don't want to allocate memory, I just want to read from the vector!" - well, in that case, you don't need a vector per se. Just copy the data to some on-device buffer, and either pass a pointer and a size, or use a CUDA-capable span, like in cuda-kat. Another option, though a bit "heavier", is to use the [NVIDIA thrust library]'s 3 "device vector" class. Under the hood, it's quite different from the standard library vector though.
I have a class that is described that way :
class Foo {
int size;
int data[0];
public:
Foo(int _size, int* _data) : size(_size) {
for (int i = 0 ; i < size ; i++) {
data[i] = adapt(_data[i]);
}
}
// Other, uninteresting methods
}
I cannot change the design of that class.
How can I create an instance of that class ? Before calling the constructor, I have to make it reserve enough memory to store its data, so it has to be on the heap, not on the stack. I guess I want something like
Foo* place = static_cast<Foo*>(malloc(sizeof(int) + sizeof(int) * size));
*place = new Foo(size, data); // I mean : "use the memory allocated in place to do your stuff !"
But I can't find a way to make it work.
EDIT : as commentators have noticed, this is not a very good overall design (with non-standards tricks such as data[0]), alas this is a library I am forced to use...
You could malloc the memory for the object and then use a placement new to create the object in the previously allocated memory :
void* memory = malloc(sizeof(Foo) + sizeof(int) * size);
Foo* foo = new (memory) Foo(size, data);
Note that in order to destroy this object, you can't use delete. You would have to manually call the destructor and then use free on the memory allocated with malloc :
foo->~Foo();
free(memory); //or free(foo);
Also note that, as #Nikos C. and #GManNickG suggested, you can do the same in a more C++ way using ::operator new :
void* memory = ::operator new(sizeof(Foo) + sizeof(int) * size);
Foo* foo = new (memory) Foo(size, data);
...
foo->~Foo();
::operator delete(memory); //or ::operator delete(foo);
You have a library that does this thing but doesn't supply a factory function? For shame!
Anyway, while zakinster's method is right (I'd directly call operator new instead of newing an array of chars, though), it's also error-prone, so you should wrap it up.
struct raw_delete {
void operator ()(void* ptr) {
::operator delete(ptr);
}
};
template <typename T>
struct destroy_and_delete {
void operator ()(T* ptr) {
if (ptr) {
ptr->~T();
::operator delete(ptr);
}
}
};
template <typename T>
using dd_unique_ptr = std::unique_ptr<T, destroy_and_delete<T>>;
using FooUniquePtr = dd_unique_ptr<Foo>;
FooUniquePtr CreateFoo(int* data, int size) {
std::unique_ptr<void, raw_delete> memory{
::operator new(sizeof(Foo) + size * sizeof(int))
};
Foo* result = new (memory.get()) Foo(size, data);
memory.release();
return FooUniquePtr{result};
}
Yes, there's a bit of overhead here, but most of this stuff is reusable.
If you really want to be lazy simply use a std::vector<Foo>. It will use more space (I think 3 pointers instead of 1) but you get all the benefits of a container and really no downsides if you know it is never going to change in size.
Your objects will be movable given your definition so you can safely do the following to eliminate reallocation of the vector during initial fill...
auto initialFooValue = Foo(0, 0)
auto fooContainer = std::vector<Foo>(size, initialFooValue);
int i = 0;
for (auto& moveFoo : whereverYouAreFillingFrom)
{
fooContainer[i] = std::move(moveFoo);
++i;
}
Since std::vector is contiguous you can also just memcopy into it safely since your objects are trivially-copyable.
I think a good C++ solution is to get raw memory with new and then use placement new to embed your class into it.
getting raw memory works like this:
Foo *f = static_cast<Foo *>(operator new(sizeof(Foo));
constructing the object in received memory works like this:
new (f) Foo(size, data); // placement new
remember that this also means that you have to manually clean up the place.
f->~Foo(); // call the destructor
operator delete(f); // free the memory again
My personal opinion is, that it is bad to use malloc and free in newly written C++ code.