I'm reading in values from a file, which I will store in memory as I read them. I've read on here that the correct way to handle memory allocation in C++ is to always use new/delete, but if I do:
DataType* foo = new DataType[sizeof(DataType) * numDataTypes];
Then that's going to call the default constructor for each instance created, and I don't want that. I was going to do this:
DataType* foo;
char* tempBuffer=new char[sizeof(DataType) * numDataTypes];
foo=(DataType*) tempBuffer;
But I figured that would be something poo-poo'd for some kind of type-unsafeness. So what should I do?
And in researching for this question now I've seen that some people are saying arrays are bad and vectors are good. I was trying to use arrays more because I thought I was being a bad boy by filling my programs with (what I thought were) slower vectors. What should I be using???
Use vectors! Since you know the number of elements, make sure that you reserve the memory first by calling myVector.reserve(numObjects) before you insert the elements.
By doing this, you will not call the default constructors of your class.
So use
std::vector<DataType> myVector; // does not reserve anything
...
myVector.reserve(numObjects); // tells vector to reserve memory
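To make the reserve-then-insert idea concrete, here is a minimal sketch. The Sample type and the load_samples function are made up for illustration; Sample deliberately has no default constructor, so the code would not compile if the vector tried to default-construct elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record type with no default constructor, to show that
// reserve() followed by emplace_back() never needs one.
struct Sample {
    explicit Sample(int v) : value(v) {}
    int value;
};

// Builds a vector of `count` samples with a single allocation.
std::vector<Sample> load_samples(std::size_t count) {
    std::vector<Sample> samples;
    samples.reserve(count);     // allocates raw storage, constructs nothing
    assert(samples.empty());    // size is still 0 after reserve()
    for (std::size_t i = 0; i < count; ++i) {
        // Each element is constructed in place from the constructor argument.
        samples.emplace_back(static_cast<int>(i) * 10);
    }
    return samples;
}
```

In a real reader the constructor argument would come from the file instead of the loop counter.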
You can use ::operator new to allocate an arbitrarily sized hunk of memory.
DataType* foo = static_cast<DataType*>(::operator new(sizeof(DataType) * numDataTypes));
The main advantage of using ::operator new over malloc here is that it throws on failure and will integrate with any new_handlers etc. You'll need to clean up the memory with ::operator delete
::operator delete(foo);
Regular new Something will of course invoke the constructor, that's the point of new after all.
It is one thing to avoid extra constructions (e.g. the default constructor) or to defer them for performance reasons; it is another to skip any constructor altogether. I get the impression you have code like
DataType dt;
read(fd, &dt, sizeof(dt));
If you're doing that, you're already throwing type safety out the window anyway.
What are you trying to accomplish by not invoking the constructor?
You can allocate memory with new char[], call the constructor you want for each element in the array, and then everything will be type-safe. Read What are uses of the C++ construct "placement new"?
That's how std::vector works underneath, since it allocates a little extra memory for efficiency, but doesn't construct any objects in the extra memory until they're actually needed.
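A rough sketch of that approach, assuming a made-up Reading type and helper names (make_readings, destroy_readings). It uses ::operator new for the raw storage rather than new char[], and it ignores exception safety (if a constructor threw, the storage would leak), which is one reason to prefer std::vector in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type used only for illustration.
struct Reading {
    explicit Reading(int v) : value(v) {}
    int value;
};

// Allocates raw storage for `count` objects and constructs each one in place.
Reading* make_readings(std::size_t count) {
    void* raw = ::operator new(sizeof(Reading) * count);  // no constructors run here
    Reading* items = static_cast<Reading*>(raw);
    for (std::size_t i = 0; i < count; ++i) {
        new (items + i) Reading(static_cast<int>(i) + 1); // placement new
    }
    return items;
}

// Destroys the objects in reverse order, then releases the raw storage.
void destroy_readings(Reading* items, std::size_t count) {
    assert(items != nullptr);
    for (std::size_t i = count; i > 0; --i) {
        items[i - 1].~Reading();  // explicit destructor call, no deallocation
    }
    ::operator delete(items);
}
```

The caller has to remember both the pointer and the count, and must not use delete[] on the result.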
You should be using a vector. It will allow you to construct its contents one-by-one (via push_back or the like), which sounds like what you're wanting to do.
I think you shouldn't care about efficiency using vector if you will not insert new elements anywhere but at the end of the vector (since elements of vector are stored in a contiguous memory block).
vector<DataType> dataTypeVec(numDataTypes);
And as you've been told, your first line there contains a bug (no need to multiply by sizeof).
Building on what others have said, if you ran this program while piping in a text file of integers that would fill the data field of the below class, like:
./allocate < ints.txt
Then you can do:
#include <vector>
#include <iostream>
using namespace std;
class MyDataType {
public:
int dataField;
};
int main() {
const int TO_RESERVE = 10;
vector<MyDataType> everything;
everything.reserve( TO_RESERVE );
MyDataType temp;
while( cin >> temp.dataField ) {
everything.push_back( temp );
}
for( unsigned i = 0; i < everything.size(); i++ ) {
cout << everything[i].dataField;
if( i < everything.size() - 1 ) {
cout << ", ";
}
}
}
Which, for me with a list of 4 integers, gives:
5, 6, 2, 6
Related
I'm trying to populate a vector of doubles in C++ and pass the associated array to Fortran. But I'm having trouble freeing the rest of the memory associated with the vector. I'd like to avoid copying. Here's what I have:
std::vector<double> *vec = new std::vector<double>();
(*vec).push_back(1.0);
(*vec).push_back(2.0);
*arr = (*vec).data(); //arr goes to Fortran
How do I delete vec while keeping arr intact? Is there a way to nullify the pointer to arr in vec so that I can then delete vec?
Update
I see that I didn't give enough information here. A couple things:
I'm actually calling a C++ function in Fortran using iso_c_binding
I don't know how large the vec needs to be. The vector class looks good for this situation
I might try Guillaume's suggestion eventually, but for now, I'm passing vec to the Fortran and calling another C++ function to delete it once I'm done with the data
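The approach described in the last point, handing the vector over as an opaque handle and calling a second C++ function to delete it, might look roughly like the sketch below. The function names (buffer_create and so on) are invented for illustration, and the Fortran side is assumed to hold the handle as a type(c_ptr) via iso_c_binding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-compatible interface; all names are illustrative.
extern "C" {

// Creates and fills the vector, returning it as an opaque handle.
void* buffer_create() {
    auto* vec = new std::vector<double>();
    vec->push_back(1.0);
    vec->push_back(2.0);
    return vec;
}

// Pointer to the contiguous data; valid until buffer_destroy() is called
// and as long as the vector is not resized.
double* buffer_data(void* handle) {
    assert(handle != nullptr);
    return static_cast<std::vector<double>*>(handle)->data();
}

// Number of elements behind buffer_data().
std::size_t buffer_size(void* handle) {
    assert(handle != nullptr);
    return static_cast<std::vector<double>*>(handle)->size();
}

// Called once the Fortran side is done with the data.
void buffer_destroy(void* handle) {
    delete static_cast<std::vector<double>*>(handle);
}

}  // extern "C"
```

This keeps the vector alive for exactly as long as the data is in use, at the cost of requiring the caller to pair every buffer_create with a buffer_destroy.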
You need to rethink your program design.
Somehow, somewhere, you need to keep an array alive while Fortran is using it. So whatever context you're using to access Fortran should probably be responsible for ownership of this array.
class fortran_context {
/*Blah blah blah whatever API you're using to access Fortran*/
double ** arr; //Where Fortran expects the data pointer; a void* could not be dereferenced below
std::vector<double> vec; //Don't allocate this as a pointer; makes no sense!
public:
fortran_context() {
arr = //Do whatever is necessary to setup Fortran stuff. I'm assuming your
//api has some kind of "get_array_pointer" function that you'll use.
}
~fortran_context() {
//Do the cleanup for the fortran stuff
}
//If you want to spend time figuring out a robust copy constructor, you may.
//Personally, I suspect it's better to just delete it, and make this object non-copyable.
fortran_context(fortran_context const&) = delete;
std::vector<double> & get_vector() {
return vec;
}
std::vector<double> const& get_vector() const {
return vec;
}
void assign_vector_to_array() {
*arr = vec.data();
}
void do_stuff_with_fortran() {
assign_vector_to_array();
//???
}
};
int main() {
fortran_context context;
auto & vec = context.get_vector();
vec.push_back(1.0);
vec.push_back(2.0);
context.do_stuff_with_fortran();
return 0;
} //Cleanup happens automatically due to correct implementation of ~fortran_context()
I've abstracted a lot of this because I don't know what API you're using to access Fortran, and I don't know what kind of work you're doing with this array. But this is, by far, the safest way to ensure that
The vector's allocated memory exists so long as you are doing stuff in Fortran
The memory associated with the vector will be cleaned up properly when you're done.
How do I delete vec while keeping arr intact? Is there a way to nullify the pointer to arr in vec so that I can then delete vec?
The library does not provide any built-in capability to do that. You have to do the bookkeeping work yourself.
Allocate memory for the data and copy data from the vector.
Send the data to FORTRAN.
Decide when it is safe to deallocate the data and then delete them.
// Fill up data in vec
std::vector<double> vec;
vec.push_back(1.0);
vec.push_back(2.0);
// Allocate memory for arr and copy the data from vec
double* arr = new double[vec.size()];
std::copy(vec.begin(), vec.end(), arr);
// use arr
// delete arr
delete [] arr;
What you are asking for is not possible. std::vector, being a well-behaved template class, will release the internals that it owns and manages when it is destroyed.
You will have to keep vector alive while you are using its contents, which makes perfect sense. Otherwise, you will have to make a copy.
Also, I don't see why you are allocating the vector on the heap, it doesn't seem needed at all.
How do I delete vec while keeping arr intact? Is there a way to nullify the pointer to arr in vec so that I can then delete vec?
You don't.
I think you misuse or misunderstand what vector is for. It is not meant to expose memory management of the underlying array, but to represent a dynamically sized array as a regular type.
If you need to explicitly manage memory, I'd suggest you use std::unique_ptr<T[]>. Since unique pointers offer a way to manage memory and to release their resource without deleting it, I think it's a good candidate to meet your needs.
auto arr = std::make_unique<double[]>(2);
arr[0] = 1.;
arr[1] = 2.;
auto data = arr.release();
// You have to manage `data` memory manually,
// since the unique pointer released its resource.
// arr is null here
// data is a pointer to an array, must be deleted manually later.
delete[] data;
But suppose you want to avoid 5000 (or more) copies of some potentially ugly, heavy objects. Purely for performance, and ignoring readability, you can do this:
#include <iostream>
#include <vector>
int main(int argc, char* argv[])
{
std::vector<int> v;
int* a = (int*) malloc(10*sizeof(int));
//int a[10];
for( int i = 0; i<10;++i )
{
*(a + i) = i;
}
delete v._Myfirst; // ? not sure
v._Myfirst = a;
for( int i = 0; i<10;++i )
{
std::cout << v[i] << std::endl;
}
system("PAUSE");
return 0;
}
Simply replace the _Myfirst underlying "array". But be very careful: this will be deleted by the vector.
Does my delete/free of my first fail in some cases?
Does this depend on the allocator?
Is there a swap for that? There is the brand new swap for vectors, but how about from an array?
I had to interface with code I cannot modify (political reasons), which provides me with arrays of objects, and I have to turn them into vectors of objects.
The pointer seems the right solution, but it's a lot of memory jumps and I have to perform the loop anyway. I was just wondering if there was some way to avoid the copy and tell the vector, "now this is your underlying array".
I agree that the above is highly ugly and this is why I actually reserve and push in my real code. I just wanted to know if there was a way to avoid the loop.
Please don't answer with a loop and a push back, as obviously that's what my "real" code is already doing. If there is no function to perform this swap can we implement a safe one?
Question 1: does my delete/free of my first fail in some cases ?
Yes. You do not know how the memory for _Myfirst is allocated (apart from the fact that it uses the allocator). The standard does not specify that the default allocator should use malloc to allocate the memory, so you do not know if delete will work.
Also you are mixing allocation schemes.
You are allocating with malloc(). But expecting the std::vector<> to allocate with new (because you are calling delete).
Also, even if the allocator did use new, it would be using the array version of new, and thus you would need to use the array version of delete to reclaim the memory.
Question 2: does this depend on the allocator ?
Yes.
Question 3: is there a swap for that ? there is the brand new swap for vectors, but from an array ?
No.
To prevent expensive copies, use emplace_back().
std::vector<int> v;
v.reserve(10); // reserve space for 10 objects (min).
for( int i = 0; i<10;++i )
{
v.emplace_back(i); // construct in place in the array (no copy).
}
std::vector<SomeUglyHeavyObject*> vect(5000, NULL);
for (int i=0; i<5000; i++)
{
vect[i] = &arr[i];
}
There, avoids copying and doesn't muck with std::vector internals.
If you don't like the way the vector behaves, don't use a vector. Your code depends on implementation details that you MUST not rely upon.
Yes it's ugly as you said. Make it pretty by using the correct container.
Which container you use depends on what operations you need on the container. It may be something as simple as a C-style array will suit your needs. It may be a vector of pointers, or shared pointers if you want the vector to manage the lifetime of the HeavyObjects.
In response to the need to increase the size of collection (see comments below):
std::vector guarantees that the underlying storage is contiguous, so if you increase the size, it will allocate a new larger buffer and copy the old one into the new (using the copy constructor for the objects if one exists.) Then swap the new buffer into place -- freeing the old one using its own allocator.
This will trash your memory unless the buffer you brute forced into the vector was allocated in a manner consistent with the vector's allocator.
So if your original array is big enough, all you need is a "maxUsed" size_t to tell you how big the collection is and where to put new entries. If it's not big enough, then your technique will either fail horribly, or incur the copy costs you are trying to avoid.
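A minimal sketch of the "maxUsed" idea, assuming the externally supplied array has spare capacity. ArrayView is a made-up name; the wrapper does not own the memory, so it never allocates or frees, and whoever supplied the array remains responsible for releasing it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning wrapper around an array someone else allocated.
template <typename T>
class ArrayView {
public:
    ArrayView(T* data, std::size_t capacity, std::size_t used)
        : data_(data), capacity_(capacity), used_(used) {
        assert(used <= capacity);
    }

    // Appends by assignment into the next free slot; returns false when the
    // array is full instead of reallocating (which is what would force a copy).
    bool push_back(const T& value) {
        if (used_ == capacity_) {
            return false;
        }
        data_[used_++] = value;
        return true;
    }

    T& operator[](std::size_t i) {
        assert(i < used_);
        return data_[i];
    }

    std::size_t size() const { return used_; }
    std::size_t capacity() const { return capacity_; }

private:
    T* data_;               // borrowed, never deleted here
    std::size_t capacity_;  // total slots in the borrowed array
    std::size_t used_;      // the "maxUsed" counter
};
```

If the array ever needs to grow past its capacity, the copy into a larger buffer cannot be avoided, as described above.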
I am new, so I am more than likely missing something key.
I am using std::vector to store data from a ReadFile operation.
Currently I have a structure called READBUFF that contains a vector of byte. READBUFF is then instantiated via a private type in a class called Reader.
class Reader{
public:
void Read();
typedef struct {
std::vector<byte> buffer;
} READBUFF;
private:
READBUFF readBuffer;
};
Within Read() I currently resize the array to my desired size as the default allocator creates a really large vector [4292060576]
void Reader::Read()
{
readBuffer.buffer.resize(8192);
}
This all works fine, but then I got to thinking I'd rather dynamically new the vector inline so I control the allocation management of the pointer. I changed buffer to be: std::vector<byte>* buffer. When I try to do the following, buffer is not set to a new buffer. It's clear from the debugger that it is not initialized.
void Reader::Read()
{
key.buffer = new std::vector<byte>(bufferSize);
}
So then I tried, but this behaves the same as above.
void Reader::Read()
{
std::vector<byte> *pvector = new std::vector<byte>(8192);
key.buffer = pvector;
}
My first question is: why doesn't this work? Why can't I assign the buffer pointer to a valid pointer? Also, how do I define the size of the inline allocation vs. having to resize?
My ultimate goal is to "new up" buffers and then store them in a deque. Right now I am doing this to reuse the above buffer, but I am in essence copying the buffer into another new buffer when all I want is to store a pointer to the original buffer that was created.
std::vector<byte> newBuffer(pKey->buffer);
pKey->ptrFileReader->getBuffer()->Enqueue(newBuffer);
Thanks in advance. I realize as I post this that I am missing something fundamental, but I am at a loss.
You shouldn't be using new in this case. It forces you to manage the memory manually, which is something you should avoid for many reasons [1]. You said you want to manage the lifetime of the vector by using new; in reality, the lifetime of the vector is already managed because it's the same as that of the object that holds it. So the lifetime of that vector is the lifetime of the instance of your Reader class.
To set the size of the vector before it gets constructed, you'll have to make a constructor for READBUFF:
// inside the READBUFF struct (assuming you're using a normal variable, not a pointer)
READBUFF() { } // default constructor
READBUFF(int size) : buffer(size) { } // immediately sets buffer's size to the argument
and use an initialization list in Reader's constructor:
// inside the Reader class
Reader() : readBuffer(8192) { }
This sets readBuffer.buffer's size to 8192, matching the resize(8192) call in the question.
If you really want to use new just for learning:
key.buffer = new std::vector<byte>(bufferSize);
This will work fine, but you shouldn't be doing it in the Read function, you should be doing it in the object's constructor. That way any member function can use it without having to check if it's NULL.
as the default allocator creates a really large vector [4292060576]
No, it doesn't (if it did, you could have one vector on your entire computer and probably your computer would crash). It incrementally resizes the storage up when you add things and exceed the capacity. Using resize like you are doing is still good though, because instead of allocating a small one, filling it, allocating a bigger one and copying everything over, filling it, allocating a bigger one and copying everything over, etc. you are just allocating the size you need once, which is much faster.
[1] Some reasons are:
You have to make sure to allocate it before anyone else uses it, where with a normal member variable it's done automatically before your object has a chance to use it.
You have to remember to delete it in the destructor.
If you don't do the above 2 things, you have either a segfault or a memory leak.
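On the question's ultimate goal of queuing buffers without copying them, one option that avoids new entirely is to move each vector into the deque. This is only a sketch: BufferQueue and the Byte alias are made-up names (Byte stands in for the question's byte type, assumed to be an 8-bit unsigned type), not the asker's actual classes.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

using Byte = unsigned char;  // assumption: stand-in for the question's `byte`

// Hypothetical queue that takes ownership of buffers by move.
class BufferQueue {
public:
    // Moving transfers the vector's heap storage; the bytes are not copied.
    void enqueue(std::vector<Byte>&& buffer) {
        queue_.push_back(std::move(buffer));
    }

    // Hands the oldest buffer back to the caller, again without copying.
    std::vector<Byte> dequeue() {
        assert(!queue_.empty());
        std::vector<Byte> front = std::move(queue_.front());
        queue_.pop_front();
        return front;
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::deque<std::vector<Byte>> queue_;
};
```

After q.enqueue(std::move(buf)) the original buf is left in a valid but unspecified state, so the reader would assign or resize it again before the next read rather than reuse its old contents.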
I think you may be misinterpreting the result of calling max_size() on a vector:
#include <vector>
#include <iostream>
int main() {
std::cout << std::vector<char>().max_size() << std::endl;
std::cout << std::vector<char>::size_type(~0) << std::endl;
}
This program prints the maximum possible size of the vector, not the current size. size() on the other hand does print the current size (ignoring anything that's been reserved).
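To see the three quantities side by side, here is a small sketch; VectorStats and describe are made-up helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical snapshot of the three numbers that are easy to confuse.
struct VectorStats {
    std::size_t size;      // elements currently stored
    std::size_t capacity;  // elements that fit before the next reallocation
    std::size_t max_size;  // theoretical upper limit, not memory in use
};

VectorStats describe(const std::vector<char>& v) {
    assert(v.size() <= v.capacity());  // always holds for std::vector
    return VectorStats{v.size(), v.capacity(), v.max_size()};
}
```

For a freshly constructed vector, size is 0; after reserve(100), size is still 0 but capacity is at least 100; after resize(8192), size is 8192. The huge number seen in the debugger corresponds to max_size, which does not change in any of these steps.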
Take a variable length struct (if this were a real program, an int array would be better):
#include <vector>
struct list_of_numbers {
int length;
int *numbers; //length elements.
};
typedef std::vector<list_of_numbers> list_nums; //just a writing shortcut
(...)
And build a vector out of it:
list_nums lst(10); //make 10 lists.
lst[0].length = 7; //make the first one 7 long.
lst[0].numbers = new int[7]; //allocate it with new[]
(...)
The above works for g++ on Ubuntu. The new[] calls are needed to avoid segfaults. Can the lst vector be deleted all at once when it is no longer needed, or will the new[] calls cause a memory leak? It would be tedious to manually delete[] all of the parts allocated with new[].
The typical ways to do this in C++ would be to define constructors and destructors and assignment operators for the list_of_numbers struct that take care of the memory management, or (much better) use a std::vector<int> for the numbers field and get rid of the length field.
But if you do that, you may as well get rid of the struct entirely, and just do this:
#include <vector>
typedef std::vector<int> list_ints;
typedef std::vector<list_ints> list_lists;
(...)
list_lists lst(10); // make 10 lists.
lst[0].resize(7); // set length of the zeroth list to 7
Why not just use a vector of vector of int? That's its job. You should not be calling new outside of a dedicated class.
In general, you would want to put cleanup code in the destructor of the object (~list_of_numbers()) and memory creating code in the constructor (list_of_numbers()). That way these things are handled for you when the destructor is called (or when the object is created).
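If the struct has to keep its raw pointer, a sketch of that constructor/destructor approach could look like this. The member names follow the question; the copy constructor, the copy-and-swap assignment and the make_lists helper are my additions, needed because std::vector copies or moves its elements.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class list_of_numbers {
public:
    list_of_numbers() = default;

    // Allocates `length` zero-initialized ints.
    explicit list_of_numbers(std::size_t length)
        : length_(length), numbers_(length ? new int[length]() : nullptr) {}

    // Deep copy, so two objects never share (and double-delete) one array.
    list_of_numbers(const list_of_numbers& other)
        : length_(other.length_),
          numbers_(other.length_ ? new int[other.length_] : nullptr) {
        std::copy(other.numbers_, other.numbers_ + other.length_, numbers_);
    }

    // Copy-and-swap: `other` is already a copy; its destructor frees our old array.
    list_of_numbers& operator=(list_of_numbers other) {
        std::swap(length_, other.length_);
        std::swap(numbers_, other.numbers_);
        return *this;
    }

    ~list_of_numbers() { delete[] numbers_; }

    std::size_t length() const { return length_; }

    int& operator[](std::size_t i) {
        assert(i < length_);
        return numbers_[i];
    }

private:
    std::size_t length_ = 0;
    int* numbers_ = nullptr;
};

// Mirrors the question: ten lists, the first one seven elements long.
std::vector<list_of_numbers> make_lists() {
    std::vector<list_of_numbers> lst(10);
    lst[0] = list_of_numbers(7);
    return lst;  // each element frees its own array when the vector is destroyed
}
```

With this in place the vector can simply go out of scope; no manual delete[] per element is needed. A std::vector<int> member would give the same behavior with none of this code.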
I am doing a project converting some Pascal (Delphi) code to C++ and would like to write a function that is roughly equivalent to the Pascal "SetLength" method. This takes a reference to a dynamic array, as well as a length and allocates the memory and returns the reference.
In C++ I was thinking of something along the lines of
void* setlength(void* pp, int array_size, int pointer_size, int target_size, ....) {
void * p;
// Code to allocate memory here via malloc/new
// something like: p = reinterpret_cast<typeid(pp)>(p);
// p=(target_size) malloc(array_size);
return p;
}
My question is this: is there a way to pass the pointer type to a function like this and to successfully allocate the memory (perhaps via a typeid parameter?)? Can I use reinterpret_cast somehow? The ultimate aim would be something like the following in terms of usage:
float*** p;
p=setlength(100,sizeof(float***),sizeof(float**),.....);
class B;
B** cp;
cp=setlength(100,sizeof(B**),sizeof(B*),.....);
Any help would be most welcome. I am aware my suggested code is all wrong, but wanted to convey the general idea. Thanks.
Use std::vector instead of raw arrays.
Then you can simply call its resize() member method.
And make the function a template to handle arbitrary types:
If you want to use your function, it could look something like this:
template <typename T>
std::vector<T>& setlength(std::vector<T>& v, int new_size) {
v.resize(new_size);
return v;
}
But now it's so simple you might want to eliminate the function entirely and just call resize to begin with.
I'm not entirely sure what you're trying to do with the triple-pointers in your example, but it looks like you don't want to resize though, you want to initialize to a certain size, which can be done with the vector constructor:
std::vector<float> v(100);
If you wanted to do it literally, you would do it like this:
template <typename T>
T* SetLength(T* arr, size_t len) {
return static_cast<T*>(realloc(arr, sizeof(T) * len));
}
Note that the array must have been allocated with malloc, calloc, or realloc, and that T must be trivially copyable, since realloc moves raw bytes without calling constructors. Also note that realloc may either grow the block in place or allocate a new block, copy the contents, and free the old one; either way, any other pointers to the array being passed in must be treated as invalid afterwards.
You're really better off using a more idiomatic C++ solution, like std::vector.
For a multidimensional array, probably the best option would be to use boost's multi_array library:
typedef boost::multi_array<float, 3> array_type;
array_type p(boost::extents[100][100][100]); // make an 100x100x100 array of floats
p[1][2][3] = 4.2;
This lets you completely abstract away the allocation and details of setting up the multidimensional array. Plus, because it uses linear storage, you get the efficiency benefits of linear storage with the ease of access of indirections.
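If Boost is not available, the linear-storage idea can be sketched by hand with one std::vector and an index computation. Grid3D is a made-up name and the row-major layout is my choice for the example; it is not how multi_array is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 3-D array of floats stored in one contiguous block.
class Grid3D {
public:
    Grid3D(std::size_t nx, std::size_t ny, std::size_t nz)
        : nx_(nx), ny_(ny), nz_(nz), data_(nx * ny * nz, 0.0f) {}

    // Row-major indexing: z varies fastest, x slowest.
    float& at(std::size_t x, std::size_t y, std::size_t z) {
        assert(x < nx_ && y < ny_ && z < nz_);
        return data_[(x * ny_ + y) * nz_ + z];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::size_t nx_, ny_, nz_;
    std::vector<float> data_;  // single allocation, freed automatically
};
```

Usage is close to the multi_array example: Grid3D p(100, 100, 100); p.at(1, 2, 3) = 4.2f;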
Failing that, you have three other major options.
The most C++-y option without using external libraries would be to use a STL container:
std::vector<float **> p;
p.resize(100);
As with multi_array, p will then automatically be freed when it goes out of scope. You can get the vector bounds with p.size(). However the vector will only handle one dimension for you, so you'll end up doing nested vectors (ick!).
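For reference, the nested-vector version could be sketched as below, using vectors all the way down instead of float** elements. Cube and set_length are made-up names, written in the spirit of the question's setlength.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Cube = std::vector<std::vector<std::vector<float>>>;

// Resizes every level; existing elements are kept and new floats are zeroed.
void set_length(Cube& cube, std::size_t nx, std::size_t ny, std::size_t nz) {
    cube.resize(nx);
    for (auto& plane : cube) {
        plane.resize(ny);
        for (auto& row : plane) {
            row.resize(nz);
        }
    }
    assert(cube.size() == nx);  // sanity check on the outer dimension
}
```

Element access is then p[i][j][k], and all the memory is released when the outer vector goes out of scope. Each row is a separate allocation, so this trades the locality of linear storage for convenience.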
You can also use new directly:
float ***p = new float**[100];
To deallocate:
delete [] p;
This has all the disadvantages of std::vector, plus it won't free it for you, and you can't get the size later.
The above three methods will all throw an exception of type std::bad_alloc if they fail to allocate enough memory.
Finally, for completeness, there's the C route, with calloc():
float ***p = (float ***)calloc(100, sizeof(*p));
To free:
free((void*)p);
This comes from C and is a bit uglier with all the casts. For C++ classes it will not call the constructors for you, either. Also, there's no checking that the sizeof in the argument is consistent with the cast.
If calloc() fails to allocate memory it will return NULL; you'll need to check for this and handle it.
To do this the C++ way:
1) As jalf stated, prefer std::vector if you can
2) Don't do void* p. Prefer instead to make your function a template of type T.
The new operator itself is essentially what you are asking for, with the exception that to appropriately allocate for double/triple pointers you must do something along the following lines:
float** data = new float*[size_of_dimension_1];
for ( size_t i=0 ; i<size_of_dimension_1 ; ++i )
data[i] = new float[size_of_dimension_2];
...
// to delete:
for ( size_t i=0 ; i<size_of_dimension_1 ; ++i )
delete [] data[i];
delete [] data;
Edit: I would suggest using one of the many C++ math/matrix libraries out there. I would suggest uBlas.