Is there a fast (CPU) way to cast a char array into a std::vector<char>?
My function looks like this:
void someFunction(char* data, int size)
{
// Now here I need to make the std::vector<char>
// using the data and size.
}
You can't "cast" anything here, but you can easily construct a vector from the char buffer:
std::vector<char> v(data, data + size);
This will create a copy though.
The general rule with STL containers is that they make copies of their contents. With C++11, there are ways of moving elements into a container (push_back with std::move) or constructing them in place (the emplace_back() member function of std::vector), but in your example the elements are char objects, so you are still going to copy each of the size char objects.
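As a side note, here is a small illustration of those C++11 facilities with a move-aware element type such as std::string (moving a char is just copying it, so they buy nothing in your case):

#include <string>
#include <utility>
#include <vector>

void demo()
{
    std::vector<std::string> v;
    std::string s = "some long string";
    v.push_back(std::move(s));  // moves s's buffer into the container
    v.emplace_back(5, 'x');     // constructs "xxxxx" in place, no temporary
}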
Think of a std::vector as a wrapper around a pointer to an array together with the length of that array. The closest equivalent of "casting a char* to a std::vector<char>" would be to swap out the vector's pointer and length for a given pointer and length, however the length is represented (two possibilities are a pointer to one past the last element, or a size_t value). There is no standard member function of std::vector that swaps its internal data with a given pointer and length.
This is for good reason, though. std::vector implements ownership semantics for every element that it contains. Its underlying array is allocated with some allocator (the second template parameter), which is std::allocator by default. If you were allowed to swap out the internal members, then you would need to ensure that the same set of heap allocation routines were used. Also, your STL implementation would need to fix the method of storing "length" of the vector rather than leaving this detail unspecified. In the world of OOP, specifying more details than necessary is generally frowned upon because it can lead to higher coupling.
But, assume that such a member function exists for your STL implementation. In your example, you simply do not know how data was allocated, so you could inadvertently give a std::vector a pointer to heap memory that was not allocated with the expected allocator. For example, data could have been allocated with malloc whereas the vector could be freeing the memory with delete. Using mismatched allocation and deallocation routines leads to Undefined Behavior. You might require that someFunction() only accept data allocated with a particular allocation routine, but this is specifying more details than necessary again.
Hopefully I have made my case that a std::vector member function that swaps out the internal data members is not a good idea. If you really need a std::vector<char> from data and size, you should construct one with:
std::vector<char> vec(data, data + size);
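Plugged into the function from the question, that looks like:

#include <vector>

void someFunction(char* data, int size)
{
    // Copies the bytes; the vector owns (and frees) its own buffer.
    std::vector<char> v(data, data + size);
    // ... work with v ...
}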
Related
I want to grow a ::std::vector at runtime, like this:
::std::vector<int> i;
i.push_back(100);
i.push_back(10);
At some point, the vector is finished, and I do not need the extra functionality ::std::vector provides any more, so I want to convert it to a C array:
int* v = i.data();
Because I will do this more than once, I want to deallocate all the heap memory ::std::vector reserved, but I want to keep the data, like this (pseudo-code):
free(/*everything from ::std::vector but the list*/);
Can anybody give me a few pointers on that?
Thanks in advance,
Jack
In all the implementations I have seen, the allocated part of a vector is... the array itself. If you know that it will no longer grow, you can try to shrink it, to release possibly unused memory:
i.shrink_to_fit();
int* v = i.data();
Of course, nothing guarantees that the implementation will actually do anything, but the only foolproof alternative would be to allocate a new array, copy the data from the vector into it, and clear the vector.
int* v = new int[i.size()];
memcpy(v, i.data(), i.size() * sizeof(int));  // needs <cstring>
i.clear();           // note: clear() keeps the capacity allocated
i.shrink_to_fit();   // ask the implementation to actually release it
But I really doubt that you will have a major gain that way...
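As another side note, if you ever need to guarantee that a vector's buffer is actually released (clear() alone keeps the capacity), the classic copy-and-swap idiom works even before C++11's shrink_to_fit; a minimal sketch:

#include <vector>

void release_buffer(std::vector<int>& i)
{
    // The empty temporary takes over i's buffer and frees it
    // when it is destroyed at the end of the statement.
    std::vector<int>().swap(i);
}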
You can use the contents of a std::vector as a C array without the need to copy or release anything. Just make sure that the std::vector outlives the need for your pointer to be valid (and that no further modifications that could trigger a reallocation are performed on the vector).
When you obtain a pointer to internal storage through data() you're not actually allocating anything, just referring to the already allocated memory.
The only additional precaution you could use to save memory is to use shrink_to_fit() to release any excess memory used as spare capacity (though it's not guaranteed).
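Putting the pieces together, a minimal sketch (legacy_api is a hypothetical stand-in for a C-style function, not part of any real library):

#include <cstddef>
#include <vector>

void legacy_api(const int*, std::size_t) {}  // hypothetical stand-in for a C API

void demo()
{
    std::vector<int> v = {1, 2, 3};
    v.shrink_to_fit();               // optional: request that spare capacity be dropped
    legacy_api(v.data(), v.size());  // pointer stays valid while v is alive and unmodified
}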
You have two options:
Keep the data in the vector, but call shrink_to_fit. All the overhead you'll have is an extra pointer (to the vector's end). shrink_to_fit has been available since C++11.
Copy the data to an external array and destroy the vector object. Here is an example:
#include <algorithm>
#include <memory>
#include <vector>

std::vector<int> vec;
// ... fill the vector ...
std::unique_ptr<int[]> arr(new int[vec.size()]);  // unique_ptr<T[]> calls delete[]
std::copy(vec.begin(), vec.end(), arr.get());
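If C++14 is available, the raw new can be avoided with std::make_unique (note it value-initializes the array, a minor behavioral difference):

auto arr = std::make_unique<int[]>(vec.size());  // C++14; still calls delete[]
std::copy(vec.begin(), vec.end(), arr.get());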
I'm dabbling around with C++ and trying to wrap my head around the pointer thing. When creating an array of objects, is it better to create the array of the object type or of pointers?
For instance:
Block grid[size]; or Block* grid[size];
From what I've been reading, I was under the impression that when using objects it's almost always better to use pointers to save memory. Eventually I'm going to create a dynamically allocated 2D array of these objects, and I'm wondering if it's worth the hassle to store them as pointers in the multidimensional array.
NOTE: I have been told to look into 2D vectors; when it comes to dynamically creating and allocating them, I'm struggling just as much as with these arrays.
Thanks in advance!
I think you are mixing two things up. A classic C array decays to a pointer to its first element. What you are thinking of are classes that are expensive to (deep-)copy when passed to a function or similar. Such expensive function arguments (strings, vectors, and the like) are better passed by reference (or pointer).
'Modern C++' provides you with enough utility that you shouldn't do any allocation on your own unless you are writing your own containers. You usually use pointers to point to things that you do not own in a given context. I also recommend not using classic C arrays anymore. Use
array<Block, size> grid; // compile-time constant size
instead of
Block grid[size]; // compile-time constant size
array<> will try to live on the stack. If the size is not known at compile time, you have to allocate the memory on the heap. The standard library calls these heap arrays vector instead:
vector<Block> grid(size); // runtime variable size
if (grid.size() > 0)
cout << grid[0]; // use grid like an array
The old C-style way for heap arrays was to allocate the memory manually and free it once you no longer need it. This is very error-prone and should be avoided unless you know what you are doing.
As mentioned, the difference from plain old arrays (which decay to pointers) shows up when calling a function. Before, you had to pass a pointer and potentially the size of the array:
void fancy_old_function(Block* grid, size_t size) // BAD practice
{
// do things
}
// ...
{
Block grid[size];
fancy_old_function(grid, size); // error prone
...
// maybe alot later
fancy_old_function(grid, 13); // ohoh! 13 < size?
}
In C++, big objects like this should be passed as a reference (or pointer); otherwise your vector gets deep-copied:
void fancy_new_function(vector<Block>& grid) // GOOD
{
// do fancy stuff and potentially change grid
}
The drawback is that we now have different signatures for dynamic and static arrays:
template <size_t N>
void fancy_new_function(array<Block, N>& grid) // GOOD
{
// do fancy stuff and potentially change grid
}
A huge improvement over classic C arrays is that we know the size of an array implicitly at all times.
You can call your fancy_new_function like this:
{
array<Block,size> grid1;
fancy_new_function(grid1); // size is implicitly deduced
// or use grid1.size();
vector<Block> grid2(size);
fancy_new_function(grid2); // developer can use grid2.size()
}
HTH
I would use Block grid[size], because why would you create an array of pointers? (The only reason I can think of is an array whose size is unknown at compile time.) In terms of memory, the pointers are a bit more expensive, because both the pointer and the pointee are stored (on the stack and on the heap, respectively). And you must not forget to deallocate them when you are done, otherwise you will get a memory leak.
The better alternative would be std::array<Block, size> grid, or even std::vector<Block> grid.
std::vector is like an std::array, but it's dynamic, without a fixed size (so you can change it at runtime, for example).
Assuming this is in a function, it doesn't 'save' memory.
One implementation uses more stack memory.
The other uses less stack memory, but uses more heap.
This entirely uses the stack:
Block grid[size];
This uses the stack to store your pointers to objects.
The memory used to store the objects themselves (once you allocate them with new) is the heap:
Block* grid[size];
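To make that concrete, here is a sketch of what the pointer variant entails (assuming size is a compile-time constant and Block is default-constructible):

Block* grid[size];              // 'size' pointers on the stack
for (int i = 0; i < size; ++i)
    grid[i] = new Block();      // each Block object lives on the heap
// ... use grid ...
for (int i = 0; i < size; ++i)
    delete grid[i];             // manual cleanup, easy to forget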
However, if these are global declarations, then yes, this:
Block grid[size];
does potentially waste memory because it is allocated for the lifetime of your application, and cannot be deallocated/resized.
The answer to "better" depends upon how you would use the array.
If Block is small (2-8 bytes), or you need to use every element of grid, you'd be better off with
Block grid[size];
in almost all cases.
Suppose instead that sizeof(Block) > 512, you do not use all the elements of grid, and
const int size = 10;
Then
Block grid[size];
would still probably make the most sense. On the other hand, suppose
const int size = 2000000;
Then
Block* grid[size];
makes more sense.
Better is rather a matter of taste :-)
An array of objects may be simpler to use in simple use cases because you create and destroy the full array at once. It also looks a little more C++ idiomatic.
The exceptions I can think of would be:
performance reasons with objects that are expensive to create. Here you only create an array of pointers with the maximum size, and create the objects themselves when you need them
polymorphism with derived classes. IMHO this is the best use case for using arrays of pointers, because if you try to copy a derived object into its base class, you will get object slicing: the base-class copy will lose all attributes that are only present in the derived class. So in that case, you create an array of pointers to the base class (it can even be abstract) and populate it with concrete implementations. You can then manipulate any object in the array through the virtual methods of the base class (see the sketch below).
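Here is a small sketch of that second case; Shape, Circle, and Square are illustrative names (not from the question), and std::make_unique assumes C++14:

#include <memory>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};
struct Circle : Shape {
    double r = 1.0;
    double area() const override { return 3.14159 * r * r; }
};
struct Square : Shape {
    double s = 2.0;
    double area() const override { return s * s; }
};

void demo()
{
    // Storing Shape objects by value would slice Circle/Square down to Shape
    // (and is ill-formed here anyway, since Shape is abstract).
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Circle>());
    shapes.push_back(std::make_unique<Square>());
    for (const auto& p : shapes)
        p->area();  // virtual dispatch selects the right override
}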
If you want to have an array as a member variable of a class there are two main options:
A: Allocate the memory on the heap
class X
{
    int* arr;
public:
    X(int numNodes)   // the constructor must bear the class's name
    {
        arr = new int[numNodes];
    }
};
B: Use a vector
class X
{
    vector<int> arr;
public:
    X(int numNodes)   // the constructor must bear the class's name
    {
        arr.resize(numNodes);
    }
};
Which of these is the preferred method? I know one drawback of heap allocated arrays is that you need to take care of deleting the memory yourself.
As a small side question, if an object of X is created on the heap is vector <int> arr also in the heap within the object? If so, how come vector <int> arr does not manually need to be deleted?
When you have the choice between a dynamically allocated C-style array and a std::vector<>, choose the vector.
It is safe, does all the alloc/realloc/resizing for you
It makes your code more flexible, readable, and easier to maintain
It is extremely efficient in most use cases
It provides explicit iterators, and plenty of member functions, including size()
Many implementations will do index checking in debug mode to catch out-of-bounds errors
Note that std::array exists for most of the cases where a C-array would be preferable (e.g., when allocation on the stack is preferred)
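Applied to the class from the question, a minimal sketch of the vector option (the vector can be sized directly in the member initializer list, so no resize call is needed):

#include <vector>

class X
{
    std::vector<int> arr;
public:
    explicit X(int numNodes) : arr(numNodes) {}
    // no destructor needed: arr releases its memory automatically
};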
You should prefer vector:
the vector and vector's elements' destructors are guaranteed to run at the appropriate times
things like .push_back are massively easier and more concise to use correctly than coding your own checks on "capacity" and resizing/copy-constructing/moving in a C++ object-aware fashion
it's easier to apply algorithms to Standard containers, use the iterators etc.
it will be easier to move from vector to another Standard container (list, stack, map, unordered_map, deque etc) if evolving application/library needs suggest it
vector has some housekeeping information that's useful: size(), capacity()
before C++11 there was a single performance issue compared to using new[] and delete[]: you couldn't do an up-front "sizing" of the vector to hold however-many elements without copy-constructing their values from a prototypical element (the fill constructor and the corresponding resize overload) - that meant the constructor or resize had to iterate over every element doing copy construction, even if the default constructor was a no-op (e.g. deliberately leaving all members uninitialised)
this is very rarely relevant or problematic, and indeed the C++ behaviour was generally safer
because it's a proper class, you can (whether you should is another matter) overload operator<<, operator>> for conveniently streaming arbitrary vectors
if an object of X is created on the heap is vector <int> arr also in the heap within the object? If so, how come vector <int> arr does not manually need to be deleted?
Yes, the vector object itself will be embedded within X, so will be on the heap too (similarly, it could be embedded in an automatic/stack variable, a global/namespace/static variable, a thread-specific variable etc.). That said, the vector object contains a pointer which tracks any further memory needed for elements in the vector, and that memory is by default dynamically allocated (i.e. on the heap) regardless of where the vector object itself is held.
Member variables with destructors (such as any vector) have them called automatically by the containing class's destructor.
I think I will first state the entire question and then comment on my problems below it.
Specifications
You should design two classes, ConstArray and Array, which represent a kind of dynamic array of int values (similar to the standard library's valarray template).
ConstArray
The class ConstArray should be designed so that it behaves like a constant array whose contents cannot be changed. Memory is to be allocated and deallocated dynamically. The size and the contents of the array are determined when an object of the class is constructed. At least the following members should be implemented:
size_type, element_type
- public type definitions (typedef) of the corresponding types.
size()
- returns the number of elements of the array.
operator [] (int) const
- returns the value of the k-th element.
A copy constructor performing a deep copy operation.
A constructor that can be used to initialize the array with data from a C-style array.
A destructor that frees any dynamically allocated memory.
Array
The class Array is to be derived from ConstArray. In addition to that class's functionality, the size and the contents of Array objects can be changed. The following additional members should be provided:
operator=(const ConstArray&)
- assignment of a ConstArray to an Array
operator=(const Array&)
- assignment of an Array
operator[](int)
- returning a reference to the k-th element
resize()
- Changes the number of elements in the array. The current contents of the array may be destroyed by this operation.
Now for my problems
I do not really get how to implement size_type and element_type here. I understand that they are typedefs, so basically aliases, but beyond that I do not really understand them.
What is a "deep copy operation"?
What is a C-style array? I do not really understand how to set up the constructor to initialize the size of the array. Where exactly do I determine the size? It doesn't say to ask the user to input it or anything.
Thank you.
I do not really get how to implement size_type and element_type into this.
As you said, they are just typedefs. You can define types in the scope of the class, for example:
class ConstArray {
public:
    typedef int size_type;      // type used for sizes and indices
    typedef int element_type;   // type of the stored values
};
After that you can use size_type whenever you want to get or return the size of the array.
What is a "deep copy operation"?
As you allocate the data memory dynamically, you will end up with a pointer to the array data (of type element_type*). A shallow copy would just copy the pointer, so that you end up with two objects referring to the same shared data (dangerous, as it is unclear who is responsible for releasing the memory). A deep copy, on the other hand, allocates a new array and copies all the data, so that you end up with two objects that each have their own data array.
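As a sketch, assuming two hypothetical members data_ and size_ (the assignment does not name them), a deep copy constructor could look like this:

#include <cstddef>

class ConstArray {
public:
    typedef std::size_t size_type;
    typedef int element_type;

    // Deep copy: allocate a fresh buffer and copy every element into it.
    ConstArray(const ConstArray& other)
        : data_(new element_type[other.size_]), size_(other.size_)
    {
        for (size_type i = 0; i < size_; ++i)
            data_[i] = other.data_[i];
    }

    ~ConstArray() { delete[] data_; }  // free the dynamically allocated buffer

    // (other members omitted for brevity)
private:
    element_type* data_;
    size_type size_;
};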
What is a C-style array?
A pointer to the data, along with the size of the array. In your assignment, implement a constructor ConstArray(element_type* data, size_type size).
Where exactly do I determine the size?
You need to ask for it, using a separate constructor argument, see above. Whoever uses the constructor also knows the size of the existing (!) data array.
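Continuing the sketch above (same hypothetical data_/size_ members, and assuming the constructor is declared in the class), the constructor taking a C-style array deep-copies the caller's data:

ConstArray::ConstArray(element_type* data, size_type size)
    : data_(new element_type[size]), size_(size)
{
    for (size_type i = 0; i < size; ++i)
        data_[i] = data[i];  // copy the caller's values into our own buffer
}

// usage: the caller passes the pointer together with the size
int raw[3] = {1, 2, 3};
ConstArray a(raw, 3);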
I'd need a class like std::auto_ptr for an array of unsigned char, allocated with new[]. But auto_ptr only calls delete and not delete[], so I can't use it.
I also need a function which creates and returns the array. I came up with my own implementation in a class ArrayDeleter, which I use as in this example:
#include <Utils/ArrayDeleter.hxx>

typedef Utils::ArrayDeleter<unsigned char> Bytes;

std::auto_ptr<Bytes> f()   // was void, but the function returns a value
{
    // Create array with new
    unsigned char* xBytes = new unsigned char[10];
    // pass the array to the constructor of ArrayDeleter and
    // wrap it into auto_ptr
    return std::auto_ptr<Bytes>(new Bytes(xBytes));
}
...
// usage of return value
{
auto_ptr<Bytes> xBytes(f());
}// unsigned char* is destroyed with delete[] in destructor of ArrayDeleter
Is there a more elegant way to solve this? (Even using another "popular" library)
Boost has a variety of smart pointers, including ones for arrays (boost::scoped_array and boost::shared_array). Have you considered whether std::vector is sufficient? Vectors are guaranteed to be contiguous in memory, and if you know the size and allocate memory ahead of time via reserve() or resize(), the location in memory will not change.
I then have to call a method that takes unsigned char* as an argument.
std::vector<unsigned char> vec;
// ... fill vec ...
legacy_function(&vec[0], vec.size());  // or vec.data() since C++11; note &vec[0] is invalid for an empty vector
How about using std::basic_string<unsigned char>? Or maybe std::vector<unsigned char>?
You're talking about an array of int, not complex C++ types that have a destructor.
For such an array, calling delete instead of delete[] happens to behave the same on many implementations, although strictly speaking mixing them is undefined behavior. With that caveat, std::auto_ptr can be used here.
The method you suggest is very barbaric, IMHO. You actually allocate memory twice: once for the array you need, and then again for an instance of ArrayDeleter, which encapsulates the pointer to the allocated array.
The drawbacks of such a method are:
Worse performance. Heap operations are expensive, and they also carry significant memory overhead.
Slower access. To access an element of your array via the std::auto_ptr<Bytes>, the compiler generates two indirections: one to get your Bytes object, and another to access the element.
In simple words: the std::auto_ptr has a pointer to the Bytes object, which in turn has a pointer to the array.
Worse error/exception consistency. Imagine what happens if operator new fails to allocate the Bytes object. It will throw an exception, which may be handled; but at this point you have already allocated the array, and that allocation is leaked.
I'd do one of the following:
If you're talking about an ordinary type, just use std::auto_ptr<type>. This should do the job; however, you should check it with your compiler.
For complex types: you may create your own wrapper instead of the std::auto_ptr.
Another variant: similar to what you did, but you should get rid of the extra memory allocation and indirection.
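For completeness: the question predates C++11, but since C++11 std::unique_ptr has an array specialization that calls delete[] and needs no extra wrapper allocation, which makes the whole problem go away:

#include <memory>

std::unique_ptr<unsigned char[]> f()
{
    std::unique_ptr<unsigned char[]> xBytes(new unsigned char[10]);
    return xBytes;  // ownership moves to the caller; delete[] runs automatically
}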