Dynamically creating std::vector // creating a pointer to a vector

I am new, so I am more than likely missing something key.
I am using std::vector to store data from a ReadFile operation.
Currently I have a structure called READBUFF that contains a vector of byte. READBUFF is then instantiated via a private type in a class called Reader.
class Reader{
public:
void Read();
typedef struct {
std::vector<byte> buffer;
} READBUFF;
private:
READBUFF readBuffer;
};
Within Read() I currently resize the array to my desired size as the default allocator creates a really large vector [4292060576]
void Reader::Read()
{
readBuffer.buffer.resize(8192);
}
This all works fine, but then I got to thinking I'd rather dynamically new the vector inline so that I control the allocation management of the pointer. I changed buffer to be std::vector<byte>* buffer. When I try the following, buffer is not set to a new vector; it's clear from the debugger that it is not initialized.
void Reader::Read()
{
key.buffer = new std::vector<byte>(bufferSize);
}
So then I tried, but this behaves the same as above.
void Reader::Read()
{
std::vector<byte> *pvector = new std::vector<byte>(8192);
key.buffer = pvector;
}
My main question is: why doesn't this work? Why can't I assign a valid pointer to the buffer pointer? Also, how do I define the size in the inline allocation instead of having to resize?
My ultimate goal is to "new up" buffers and then store them in a deque. Right now I am doing this to reuse the above buffer, but I am in essence copying the buffer into another new buffer when all I want is to store a pointer to the original buffer that was created.
std::vector<byte> newBuffer(pKey->buffer);
pKey->ptrFileReader->getBuffer()->Enqueue(newBuffer);
Thanks in advance. I realize as I post this that I am missing something fundamental, but I am at a loss.

You shouldn't be using new in this case. It forces you to manage the memory manually, which you should avoid for several reasons [1]. You said you want to manage the lifetime of the vector by using new; in reality, the lifetime of the vector is already managed, because it is the same as that of the object that holds it. So the lifetime of that vector is the lifetime of the instance of your Reader class.
To set the size of the vector as it is constructed, give READBUFF a constructor (this requires declaring it as a named struct, struct READBUFF { ... };, because an anonymous typedef struct cannot declare constructors):
// inside the READBUFF struct (assuming you're using a normal variable, not a pointer)
READBUFF() { } // default constructor
READBUFF(int size) : buffer(size) { } // immediately sets buffer's size to the argument
and use an initialization list in Reader's constructor:
// inside the Reader class
Reader() : readBuffer(8192) { }
This sets the size of readBuffer.buffer to 8192.
If you really want to use new just for learning:
key.buffer = new std::vector<byte>(bufferSize);
This will work fine, but you shouldn't do it in the Read function; do it in the object's constructor instead. That way any member function can use the buffer without having to check whether it's NULL.
Regarding "as the default allocator creates a really large vector [4292060576]":
No, it doesn't (if it did, you could have one vector on your entire computer and probably your computer would crash). It incrementally resizes the storage up when you add things and exceed the capacity. Using resize like you are doing is still good though, because instead of allocating a small one, filling it, allocating a bigger one and copying everything over, filling it, allocating a bigger one and copying everything over, etc. you are just allocating the size you need once, which is much faster.
[1] Some reasons are:
You have to make sure to allocate it before anyone else uses it, where with a normal member variable it's done automatically before your object has a chance to use it.
You have to remember to delete it in the destructor.
If you don't do the above 2 things, you have either a segfault or a memory leak.
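For the stated goal of queueing buffers without copying them, one option that avoids new entirely is to move the vector into the deque. The sketch below is only an illustration under assumptions: byte_t stands in for the question's byte type, and enqueueCurrent and buffer are hypothetical names, not part of the original Reader. It requires C++11 for std::move.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

typedef unsigned char byte_t;  // stand-in for the question's `byte` type (assumption)

class Reader {
public:
    // The initializer list sizes the buffer before any member function runs.
    explicit Reader(std::size_t bufferSize)
        : bufferSize_(bufferSize), buffer_(bufferSize) {}

    // Hands the filled buffer to the queue without copying its bytes,
    // then gives the Reader a fresh buffer of the same size.
    void enqueueCurrent(std::deque<std::vector<byte_t> >& queue) {
        queue.push_back(std::move(buffer_));         // transfers the storage
        buffer_ = std::vector<byte_t>(bufferSize_);  // new, zero-filled buffer
    }

    std::vector<byte_t>& buffer() { return buffer_; }

private:
    std::size_t bufferSize_;
    std::vector<byte_t> buffer_;
};
```

Because the deque owns each vector by value, nothing has to be deleted by hand, and the bytes read into the buffer are never copied a second time.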

I think you may be misinterpreting the result of calling max_size() on a vector:
#include <vector>
#include <iostream>
int main() {
std::cout << std::vector<char>().max_size() << std::endl;
std::cout << std::vector<char>::size_type(~0) << std::endl;
}
This program prints the maximum possible size of the vector, not the current size. size() on the other hand does print the current size (ignoring anything that's been reserved).
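To make the distinction concrete, here is a small sketch that reports all three numbers for a vector side by side. VectorStats and describe are hypothetical helper names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct VectorStats {
    std::size_t size;      // elements currently stored
    std::size_t capacity;  // elements that fit before the next reallocation
    std::size_t maxSize;   // theoretical upper bound, not an allocation
};

// Collects the three values that are easy to confuse in a debugger.
VectorStats describe(const std::vector<char>& v) {
    VectorStats s = { v.size(), v.capacity(), v.max_size() };
    return s;
}
```

For a default-constructed vector, size is 0 while maxSize is a huge implementation-defined number. After resize(8192), size becomes 8192 and maxSize stays the same.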

Related

How to transfer ownership from a C++ std::vector<char> to a char* pointer

I have a situation where I get a std::vector<char> as output from a third-party library. The data is very large, and I need to convert it to a pybind11::array object. I don't want to allocate memory and memcpy, because that's not efficient at all.
I know I can get the address of the std::vector<char> buffer, but I do not know how to release the vector's ownership so that the buffer is not freed when the vector object is destroyed. Is there a way to achieve this?
I wrote the test code below, but it failed:
#include <cstdlib>   // atoi
#include <iostream>
#include <utility>   // std::move
#include <vector>
int *got_vec(int len){// the len in actually scene is decided by the thirdparty lib
std::vector<int> vec;
for(int i =0;i<len;i++){
vec.push_back(i);
}
int *p_vec = &vec[0];
std::move(vec);
return p_vec;
}
int main(int argc,const char **argv){
int len=atoi(argv[1]);
std::cout<<"will go allocate memory size:"<<len<<", before allocation"<<std::endl;
int *p_vec = got_vec(len);
std::cout<<"after allocation, go to print value"<<std::endl;
for(int i = 0; i < len;i++)
std::cout<<p_vec[i]<<",";
std::cout<<std::endl;
delete p_vec;
std::cout<<"deleted"<<std::endl;
}
The program crashed at std::cout<<p_vec[i]<<",";

A std::vector does not allow detaching the underlying buffer. You can also have a look at taking over memory from std::vector for a similar question.

std::vector doesn't provide a way to transfer its buffer.
You have to make a copy (or not use std::vector for the original buffer).

You can, of course, because this is what the std::vector class does for a move operation. But you cannot do it in a portable way (*). You must find in the internals of your implementation's std::vector class how the buffer is stored (which member is used). Ideally you should look at what the move constructor does with the source vector and do the same. It is likely to just set the internal pointer to NULL, but you should verify that.
(*) Since you would be relying on unspecified internals, it would only be guaranteed to work with that version of that compiler.
From a philosophical point of view, we as programmers are expected to use the classes of the standard library as they are. There are few extension points, and few classes are designed to be derived from. It is just the opposite of Java, which provides abstract classes to help programmers build their own custom classes.

You can't directly 'steal' memory from a std::vector, but maybe you don't need to steal it in the first place.
I'm not familiar with pybind11::array, but since you want the data pointer from std::vector, I'm guessing you can construct it from data that was allocated somewhere else.
Maybe all you need is a wrapper class, that would keep your data inside a std::vector, while providing a view into it through pybind11::array.
class Wrapper {
public:
Wrapper(std::vector<char>);
pybind11::array asArray();
private:
std::vector<char> m_data;
};
You can then use std::move to efficiently transfer data between std::vectors, without copying it around.
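A minimal sketch of that wrapper idea follows. Since the pybind11 side is not shown in this thread, ByteView is a hypothetical stand-in for whatever non-owning array type the binding layer accepts, and view() replaces asArray() for the same reason. It requires C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a non-owning array type (assumption).
struct ByteView {
    const char* data;
    std::size_t size;
};

class Wrapper {
public:
    // Taking the vector by value and moving it into the member means a
    // caller who passes an rvalue pays for no element copy.
    explicit Wrapper(std::vector<char> data) : m_data(std::move(data)) {}

    // The view is valid only while this Wrapper is alive and unmodified.
    ByteView view() const {
        ByteView v = { m_data.data(), m_data.size() };
        return v;
    }

private:
    std::vector<char> m_data;
};
```

Constructing it as Wrapper w(std::move(vec)) keeps the bytes at the same address they had in vec, so the view points at the original allocation rather than at a copy.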

Can't make arrays with unspecified length

Basically I am trying to make an array which will get larger every time the user enters a value. Which means I never know how long the array will be. I don't know if this is something to do with my class or anything.
#pragma once
#include <iostream>
#include <string>
#define dexport __declspec(dllexport)
//Im making an DLL
using namespace std;
class dexport API {
private:
string users[] = {"CatMan","ManCat"}; //Line With Error Incomplete Type Is Not Allowed
public:
string getAllUsers(string list[]) {
for (unsigned int a = 0; a < sizeof(list) / sizeof(list[0]); a = a + 1) {
return list[a];
}
}
};
It gives me an error Incomplete type is not allowed. I really have no idea what to do.

There are a few things wrong with your code. For starters, an array has a fixed size, so even if your code compiled, it would not grow the way you want. For an ordinary variable, the compiler infers the array size from the length of the initializer, but for a non-static class member the size must be stated explicitly, hence the error.
This will solve your compilation problem:
string users[2] = {"CatMan","ManCat"};
But then your array has a fixed size, and that is not what you want, so you need an std::vector:
#include <vector>
[...]
vector<string> users = {"CatMan","ManCat"};
Now you can use the '[]' operator to access the strings, and users.push_back to add new users.
The next problem you need to solve is the way you are trying to return your value: you shouldn't use an argument as an out value (although you can, with either a reference or a pointer). You should decide whether you want to return a reference to your vector, a copy of your vector, or a const reference, for example:
// Returning a copy
vector<string> getAllUsers() {
return users;
}
// Returning a reference
vector<string>& getAllUsers() {
return users;
}
Finally, you are creating a library: you should know that if you want to share memory between different processes, you need to use some kind of shared memory. Currently, every program will keep its own copy of the API.

What you are looking for is an std::vector.
You can find more info here.
It's somewhat similar to an array, except that it allows variable length.

You can use std::vector. It allocates new storage and copies the elements over when it runs out of space.
If you want to write the class yourself for educational reasons, here is a basic approach to try:
Allocate some memory up front and store its length as the capacity. You also need a variable (size) that stores the number of elements already added, e.g. via a push_back function. Once size reaches capacity, allocate a larger block, copy all the elements over, and then delete the old memory.
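The steps above can be sketched roughly as follows. This is an educational illustration only, not a replacement for std::vector: GrowableStrings is a hypothetical name, the initial capacity of 4 and the doubling factor are arbitrary choices, and copying is disabled to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class GrowableStrings {
public:
    GrowableStrings() : data_(new std::string[4]), size_(0), capacity_(4) {}
    ~GrowableStrings() { delete[] data_; }

    void push_back(const std::string& value) {
        if (size_ == capacity_) {
            grow();  // out of room: move to a bigger block first
        }
        data_[size_++] = value;
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
    const std::string& operator[](std::size_t i) const { return data_[i]; }

private:
    // Declared but not defined, so the class cannot be copied.
    GrowableStrings(const GrowableStrings&);
    GrowableStrings& operator=(const GrowableStrings&);

    // Allocate a block twice as large, copy the elements, free the old block.
    void grow() {
        std::size_t newCapacity = capacity_ * 2;
        std::string* bigger = new std::string[newCapacity];
        for (std::size_t i = 0; i < size_; ++i) {
            bigger[i] = data_[i];
        }
        delete[] data_;
        data_ = bigger;
        capacity_ = newCapacity;
    }

    std::string* data_;
    std::size_t size_;
    std::size_t capacity_;
};
```

Unlike std::vector, this version default-constructs every slot up front and is not exception-safe during growth, which is part of why the standard container is the better choice in real code.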

Adding element to Array of Objects in C++

How do I add an element to the end of an array dynamically in C++?
I'm accustomed to using vectors to dynamically add an element. However, vectors do not seem to want to handle an array of objects.
So, my main goal is having an array of objects and then being able to add an element to the end of the array to take another object.
EDIT**
Sorry, it's push_back() that causes me the problems.
class classex
{
private:
int i;
public:
classex() { }
void exmethod()
{
cin >> i;
}
};
void main()
{
vector <classex> vectorarray;
cout << vectorarray.size();
cout << vectorarray.push_back();
}
Now I know push_back must have an argument, but What argument?

"Now I know push_back must have an argument, but what argument?"
The argument is the thing that you want to append to the vector. What could be simpler or more expected?
BTW, you really, really, really do not want exmethod as an actual method of classex in 99% of cases. That's not how classes work. Gathering the information to create an instance is not part of the class's job. The class just creates the instance from that information.

Arrays are fixed-size containers, so enlarging them is not possible. You can work around this by copying the array into a bigger one, which gives you free space after the old end, but that's it.
You can create an array larger than you currently need and remember which elements are empty. Of course they are never really empty (they at least contain 0's), but that's a different story.
Besides arrays there are many other containers, and some are able to grow, like the STL containers: lists, vectors, deques, sets and so on.
Add a constructor that sets i (just to give your example a real-world touch) to your example classex, like this:
class classex {
public:
classex(int v) : i(v) {} // by value, so that classex(1) compiles
private:
int i;
};
An example for a growing container looks like this:
vector <classex> c; // c for container
// c is empty now. c.size() == 0
c.push_back(classex(1));
c.push_back(classex(2));
c.push_back(classex(3));
// c.size() == 3

EDIT: The question was how to add an element to a dynamically allocated array, but the OP actually meant std::vector. Below the separator is my original answer.
std::vector<int> v;
v.push_back( 5 ); // 5 is added to the back of v.
----
You could always use C's realloc and free. EDIT: (Assuming your objects are PODs.)
When compared to the requirement of manually allocating, copying, and reallocating using new and delete, it's a wonder Stroustrup didn't add a keyword like renew.

C++ Allocate Memory Without Activating Constructors

I'm reading values from a file and storing them in memory as I read them. I've read on here that the correct way to handle memory allocation in C++ is to always use new/delete, but if I do:
DataType* foo = new DataType[sizeof(DataType) * numDataTypes];
Then that's going to call the default constructor for each instance created, and I don't want that. I was going to do this:
DataType* foo;
char* tempBuffer=new char[sizeof(DataType) * numDataTypes];
foo=(DataType*) tempBuffer;
But I figured that would be something poo-poo'd for some kind of type-unsafeness. So what should I do?
And in researching for this question now I've seen that some people are saying arrays are bad and vectors are good. I was trying to use arrays more because I thought I was being a bad boy by filling my programs with (what I thought were) slower vectors. What should I be using???

Use vectors! Since you know the number of elements, reserve the memory first by calling myVector.reserve(numObjects) before you insert the elements.
By doing this, you will not call the default constructors of your class.
So use
std::vector<DataType> myVector; // does not reserve anything
...
myVector.reserve(numObjects); // tells vector to reserve memory
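One way to check that claim for yourself is to count default constructions. In this sketch, Counted, fillWithReserve and fillWithSize are hypothetical names made up for the demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how often its default constructor runs.
struct Counted {
    static int defaultConstructions;
    int value;
    Counted() : value(0) { ++defaultConstructions; }
    explicit Counted(int v) : value(v) {}
};
int Counted::defaultConstructions = 0;

// Returns how many default constructions a reserve-then-push_back fill triggers.
int fillWithReserve(std::size_t n) {
    Counted::defaultConstructions = 0;
    std::vector<Counted> v;
    v.reserve(n);  // raw capacity only; size() stays 0
    assert(v.size() == 0 && v.capacity() >= n);
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(Counted(static_cast<int>(i)));  // built from a value, not defaulted
    }
    return Counted::defaultConstructions;
}

// For contrast: constructing the vector with a size runs the default constructor.
int fillWithSize(std::size_t n) {
    Counted::defaultConstructions = 0;
    std::vector<Counted> v(n);
    (void)v;
    return Counted::defaultConstructions;
}
```

fillWithReserve reports zero default constructions, while fillWithSize reports at least one (one per element in C++11 and later).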

You can use ::operator new to allocate an arbitrarily sized hunk of memory.
DataType* foo = static_cast<DataType*>(::operator new(sizeof(DataType) * numDataTypes));
The main advantage of using ::operator new over malloc here is that it throws on failure and will integrate with any new_handlers etc. You'll need to clean up the memory with ::operator delete:
::operator delete(foo);
Regular new Something will of course invoke the constructor, that's the point of new after all.
It is one thing to avoid extra constructions (e.g. default constructor) or to defer them for performance reasons, it is another to skip any constructor altogether. I get the impression you have code like
DataType dt;
read(fd, &dt, sizeof(dt));
If you're doing that, you're already throwing type safety out the window anyway.
What are you trying to accomplish by not invoking the constructor?

You can allocate memory with new char[], call the constructor you want for each element in the array, and then everything will be type-safe. Read What are uses of the C++ construct "placement new"?
That's how std::vector works underneath, since it allocates a little extra memory for efficiency, but doesn't construct any objects in the extra memory until they're actually needed.
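As a rough sketch of the technique, the following allocates raw storage and then constructs each element in place with placement new. Record, buildRecords and destroyRecords are hypothetical names, and the sketch ignores what should happen if a constructor throws halfway through.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical record type with no default constructor.
struct Record {
    int id;
    explicit Record(int i) : id(i) {}
};

// Allocates raw storage for `count` records and constructs each one in place.
Record* buildRecords(std::size_t count) {
    void* raw = ::operator new(sizeof(Record) * count);  // no constructors run here
    Record* records = static_cast<Record*>(raw);
    for (std::size_t i = 0; i < count; ++i) {
        new (records + i) Record(static_cast<int>(i));   // construct at this address
    }
    return records;
}

// Destroys the records in reverse order, then releases the raw storage.
void destroyRecords(Record* records, std::size_t count) {
    for (std::size_t i = count; i > 0; --i) {
        records[i - 1].~Record();
    }
    ::operator delete(records);
}
```

Every object built with placement new needs an explicit destructor call before the storage is released, which is exactly the bookkeeping std::vector does for you.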

You should be using a vector. It will allow you to construct its contents one-by-one (via push_back or the like), which sounds like what you're wanting to do.

I don't think you need to worry about vector's efficiency as long as you only insert new elements at the end, since a vector stores its elements in one contiguous memory block.

vector<DataType> dataTypeVec(numDataTypes);
And as you've been told, your first line there contains a bug (no need to multiply by sizeof).

Building on what others have said, if you ran this program while piping in a text file of integers that would fill the data field of the below class, like:
./allocate < ints.txt
Then you can do:
#include <vector>
#include <iostream>
using namespace std;
class MyDataType {
public:
int dataField;
};
int main() {
const int TO_RESERVE = 10;
vector<MyDataType> everything;
everything.reserve( TO_RESERVE );
MyDataType temp;
while( cin >> temp.dataField ) {
everything.push_back( temp );
}
for( unsigned i = 0; i < everything.size(); i++ ) {
cout << everything[i].dataField;
if( i < everything.size() - 1 ) {
cout << ", ";
}
}
}
Which, for me with a list of 4 integers, gives:
5, 6, 2, 6

How can I make my char buffer more performant?

I have to read a lot of data into:
vector<char>
A 3rd party library reads this data in many turns. In each turn it calls my callback function whose signature is like this:
CallbackFun ( int CBMsgFileItemID,
unsigned long CBtag,
void* CBuserInfo,
int CBdataSize,
void* CBdataBuffer,
int CBisFirst,
int CBisLast )
{
...
}
Currently I have implemented a buffer container on top of an STL container, with insert() to store an incoming buffer and getBuff() to retrieve the stored data. I would still like better-performing code that minimizes allocations and deallocations:
template<typename T1>
class buffContainer
{
private:
class atomBuff
{
private:
atomBuff(const atomBuff& arObj);
atomBuff operator=(const atomBuff& arObj);
public:
int len;
char *buffPtr;
atomBuff():len(0),buffPtr(NULL)
{}
~atomBuff()
{
if(buffPtr!=NULL)
delete []buffPtr;
}
};
public :
buffContainer():_totalLen(0){}
void insert(const char const *aptr,const unsigned long &alen);
unsigned long getBuff(T1 &arOutObj);
private:
std::vector<atomBuff*> moleculeBuff;
int _totalLen;
};
template<typename T1>
void buffContainer< T1>::insert(const char const *aPtr,const unsigned long &aLen)
{
if(aPtr==NULL,aLen<=0)
return;
atomBuff *obj=new atomBuff();
obj->len=aLen;
obj->buffPtr=new char[aLen];
memcpy(obj->buffPtr,aPtr,aLen);
_totalLen+=aLen;
moleculeBuff.push_back(obj);
}
template<typename T1>
unsigned long buffContainer<T1>::getBuff(T1 &arOutObj)
{
std::cout<<"Total Lenght of Data is: "<<_totalLen<<std::endl;
if(_totalLen==0)
return _totalLen;
// Note : Logic pending for case size(T1) > T2::Value_Type
int noOfObjRqd=_totalLen/sizeof(T1::value_type);
arOutObj.resize(noOfObjRqd);
char *ptr=(char*)(&arOutObj[0]);
for(std::vector<atomBuff*>::const_iterator itr=moleculeBuff.begin();itr!=moleculeBuff.end();itr++)
{
memcpy(ptr,(*itr)->buffPtr,(*itr)->len);
ptr+= (*itr)->len;
}
std::cout<<arOutObj.size()<<std::endl;
return _totalLen;
}
How can I make this more performant?

If my wild guess about your callback function makes sense, you don't need anything more than a vector:
std::vector<char> foo;
foo.reserve(MAGIC); // this is the important part. Reserve the right amount here.
// and you don't have any reallocs.
setup_callback_fun(CallbackFun, &foo);
CallbackFun ( int CBMsgFileItemID,
unsigned long CBtag,
void* CBuserInfo,
int CBdataSize,
void* CBdataBuffer,
int CBisFirst,
int CBisLast )
{
std::vector<char>* pFoo = static_cast<std::vector<char>*>(CBuserInfo);
char* data = static_cast<char*>(CBdataBuffer);
pFoo->insert(pFoo->end(), data, data+CBdataSize);
}
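Here is a self-contained version of the same idea that can be compiled and run. The void return type of CallbackFun is an assumption (the question does not show it), and fakeLibraryRead is a hypothetical stand-in for the third-party library that simply delivers a string in chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Callback in the shape the question describes; unused parameters are kept
// only to mirror that signature. The void return type is an assumption.
void CallbackFun(int /*CBMsgFileItemID*/, unsigned long /*CBtag*/, void* CBuserInfo,
                 int CBdataSize, void* CBdataBuffer, int /*CBisFirst*/, int /*CBisLast*/) {
    std::vector<char>* out = static_cast<std::vector<char>*>(CBuserInfo);
    const char* data = static_cast<const char*>(CBdataBuffer);
    out->insert(out->end(), data, data + CBdataSize);  // append this chunk
}

typedef void (*Callback)(int, unsigned long, void*, int, void*, int, int);

// Hypothetical stand-in for the third-party reader: delivers `text` in
// pieces of at most `chunk` bytes (chunk must be greater than zero).
void fakeLibraryRead(std::string text, std::size_t chunk, Callback cb, void* userInfo) {
    for (std::size_t pos = 0; pos < text.size(); pos += chunk) {
        std::size_t n = (text.size() - pos < chunk) ? text.size() - pos : chunk;
        cb(0, 0UL, userInfo, static_cast<int>(n), &text[pos],
           pos == 0, pos + n == text.size());
    }
}
```

If the vector passed as user info was reserved with enough room beforehand, its capacity is unchanged after all chunks arrive, meaning no reallocation took place.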

Depending on how you plan to use the result, you might try putting the incoming data into a rope datastructure instead of vector, especially if the strings you expect to come in are very large. Appending to the rope is very fast, but subsequent char-by-char traversal is slower by a constant factor. The tradeoff might work out for you or not, I don't know what you need to do with the result.
EDIT: I see from your comment that this is not an option, then. I don't think you can be much more efficient in the general case, when the size of the incoming data is totally arbitrary. Otherwise you could try to reserve enough space in the vector up front so that, in the average case, the data fits with no reallocation or at most one.
One thing I noticed about your code:
if(aPtr==NULL,aLen<=0)
I think you mean
if(aPtr==NULL || aLen<=0)

The main thing you can do is avoid doing quite so much copying of the data. Right now, when insert() is called, you're copying the data into your buffer. Then, when getbuff() is called, you're copying the data out to a buffer they've (hopefully) specified. So, to get data from outside to them, you're copying each byte twice.
This part:
arOutObj.resize(noOfObjRqd);
char *ptr=(char*)(&arOutObj[0]);
Seems to assume that arOutObj is really a vector. If so, it would be a whole lot better to rewrite getbuff as a normal function taking a (reference to a) vector instead of being a template that really only works for one type of parameter.
From there, it becomes a fairly simple matter to completely eliminate one copy of the data. In insert(), instead of manually allocating memory and tracking the size, put the data directly into a vector. Then, when getbuff() is called, instead of copying the data into their buffer, just give them a reference to your existing vector.
class buffContainer {
std::vector<char> moleculeBuff;
public:
void insert(char const *p, unsigned long len) {
// Edit: reserve the extra space first so the copy below reallocates at most once.
moleculeBuff.reserve(moleculeBuff.size()+len);
std::copy(p, p+len, std::back_inserter(moleculeBuff));
}
void getbuff(vector<char> &output) {
output = moleculeBuff;
}
};
Note that I've changed the result of getbuff to void -- since you're giving them a vector, its size is known, and there's no point in returning the size. In reality, you might want to actually change the signature a bit, to just return the buffer:
vector<char> getbuff() {
vector<char> temp;
temp.swap(moleculeBuff);
return temp;
}
Since it's returning a (potentially large) vector by value, this depends heavily on your compiler implementing the named return value optimization (NRVO), but 1) the worst case is that it does about what you were doing before anyway, and 2) virtually all reasonably current compilers DO implement NRVO.
This also addresses one other detail your original code didn't (seem to). As it was, getbuff returns some data, but if you call it again, it (apparently) doesn't keep track of what data has already been returned, so it returns it all again. It keeps allocating data, but never deletes any of it. That's what the swap is for: it creates an empty vector, and then swaps that with the one that's being maintained by buffContainer, so buffContainer now has an empty vector, and the filled one is handed over to whatever called getbuff().
Another way to do things would be to take the swap a step further: basically, you have two buffers:
one owned by buffContainer
one owned by whatever calls getbuff()
In the normal course of things, we can probably expect that the buffer sizes will quickly reach some maximum size. From there on, we'd really like to simply re-cycle that space: read some data into one, pass it to be processed, and while that's happening, read data into the other.
As it happens, that's pretty easy to do too. Change getbuff() to look something like this:
void getbuff(vector<char> &output) {
swap(moleculeBuff, output);
moleculeBuff.clear();
}
This should improve speed quite a bit: instead of copying data back and forth, it just swaps one vector's pointer to the data with the other's (along with a couple of other details, like the current allocation size and the used size of the vector). The clear is normally really fast: for a vector<char> (or a vector of any type without a dtor) it will just set the number of items in the vector to zero (if the items have dtors, it has to destroy them, of course). From there, the next time insert() is called, the new data will just be copied into the memory the vector already owns (until/unless it needs more space than the vector had allocated).
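Put together, the swap-based variant might look like the sketch below. BuffContainer mirrors the class from this answer, while pending() and capacity() are hypothetical accessors added only so the recycling can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class BuffContainer {
public:
    void insert(const char* p, std::size_t len) {
        if (p == NULL || len == 0) {
            return;
        }
        moleculeBuff.insert(moleculeBuff.end(), p, p + len);
    }

    // Trades storage with the caller: `output` receives the accumulated bytes,
    // and this container keeps output's old allocation, emptied for reuse.
    void getbuff(std::vector<char>& output) {
        moleculeBuff.swap(output);
        moleculeBuff.clear();
    }

    std::size_t pending() const { return moleculeBuff.size(); }
    std::size_t capacity() const { return moleculeBuff.capacity(); }

private:
    std::vector<char> moleculeBuff;
};
```

If the caller passes in a vector that already has capacity from a previous round, the container inherits that capacity after getbuff(), so subsequent insert() calls can fill it without allocating.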
