In my application there's a part that requires me to make a copy of a container. At the moment I use std::vector (but might consider using something else). The application is very sensitive with regard to the latency. So, I was looking for a way to make a copy of a vector as fast as possible. I found that memcpy does it better than anything else. However, it does not change the internal size of the vector. I.e. vector.size() would still give me 0.
I know what slippery path I am on. I do not mind throwing away safety checks. I do know how many elements are being copied. I do not want to use vector.resize() to change the size of the destination vector (it is a slow operation).
Question:
std::vector<my_struct> destination_vector;
destination_vector.reserve(container_length);
std::memcpy(destination_vector.data(), original_vector.data(), sizeof(my_struct)*container_length);
After the code above I need to tell my destination_vector what size it is. How do I do this?
Thank you.
You must actually resize() the vector before you copy data into it using memcpy():
destination_vector.resize(container_length);
But it would be better to avoid memcpy() in the first place and use the copying mechanisms that vector itself offers, as suggested in the other answers:
std::vector<my_struct> destination_vector(original_vector);
or if the destination_vector instance already exists:
destination_vector.insert(destination_vector.begin(), original_vector.begin(), original_vector.end());
or, the fastest if you do not need the original content any more:
destination_vector.swap(original_vector);
All of these variants will be as fast as, or even faster than, your memcpy() variant. If you experience slowness, see the next point:
You probably have a non-trivial default constructor in my_struct. Remove it, or insert a trivial (empty) default constructor to speed things up (to avoid construction of many elements which you never use).
If my_struct contains non-POD data members (like std::string) you cannot use memcpy() at all.
(Side note: You rarely want to call reserve(). The vector maintains its own internal storage in such a way that it always allocates more than is actually needed, exponentially, to avoid frequent resizes/copying when frequently appending elements.)
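A minimal sketch of the resize-then-memcpy variant described above. The fields of `my_struct` and the helper name `copy_with_memcpy` are assumptions made up for illustration; the `static_assert` guards against the non-POD case, where memcpy() is not allowed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical element type; the fields are placeholders for illustration.
struct my_struct {
    int id;
    double value;
};

// memcpy() is only valid for trivially copyable types.
static_assert(std::is_trivially_copyable<my_struct>::value,
              "my_struct must be trivially copyable to use memcpy");

// Hypothetical helper: copies the first container_length elements.
std::vector<my_struct> copy_with_memcpy(const std::vector<my_struct>& original_vector,
                                        std::size_t container_length) {
    assert(container_length <= original_vector.size());
    std::vector<my_struct> destination_vector;
    // resize() creates the elements, so size() is already correct before the copy.
    destination_vector.resize(container_length);
    if (container_length != 0) {
        std::memcpy(destination_vector.data(), original_vector.data(),
                    sizeof(my_struct) * container_length);
    }
    return destination_vector;
}
```

Whether this beats the plain copy constructor is something only a measurement on the target platform can tell.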
resize() is not a slow operation. It is as fast as any memory allocation.
Does my_struct have a non-trivial default constructor? Remove it and take care of initialization manually. This might be the reason why you say resize() is slow. It will actually construct your objects. But since you can apparently memcpy() your objects you can probably get away with a trivial (empty) default constructor.
How to manually assign vector's size?
You can't. You can only modify vector's size through the modification functions that add or remove elements such as insert, clear, resize etc.
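To illustrate the difference, here is a small sketch (the function names are made up for this example) showing that reserve() only changes the capacity, while resize() actually changes the size:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size of a vector after reserve() only: capacity grows, size stays 0.
std::size_t size_after_reserve(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    assert(v.capacity() >= n);
    return v.size();
}

// Size of a vector after resize(): n value-initialized elements now exist.
std::size_t size_after_resize(std::size_t n) {
    std::vector<int> v;
    v.resize(n);
    return v.size();
}
```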
After the code above I need to tell my destination_vector what size it is. How do I do this?
The mentioned code above has undefined behaviour, so it doesn't matter what you do after it.
A simple and efficient way to copy a vector is:
std::vector<my_struct> destination_vector(original_vector);
Your snippet has undefined behaviour: you can't memcpy into an empty vector, even if you have reserved space. It may also be undefined behaviour to memcpy any my_struct objects, if it isn't a TriviallyCopyable type.
You can construct the vector as a copy of the source directly. Most likely your compiler will emit code identical to (or faster than) your original snippet, if my_struct is TriviallyCopyable.
std::vector<my_struct> destination_vector(original_vector.begin(), original_vector.begin() + container_length);
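A sketch of copying with the vector's own mechanisms, assuming a hypothetical `my_struct` with placeholder fields. The helper names are invented; `assign()` is used for the already-existing destination because it replaces the old contents (it is a swapped-in alternative, not something the answers above mention).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type; the fields are placeholders for illustration.
struct my_struct {
    int id;
    double value;
};

// Copies the first container_length elements via the iterator-range constructor.
std::vector<my_struct> copy_prefix(const std::vector<my_struct>& original_vector,
                                   std::size_t container_length) {
    assert(container_length <= original_vector.size());
    return std::vector<my_struct>(
        original_vector.begin(),
        original_vector.begin() + static_cast<std::ptrdiff_t>(container_length));
}

// Replaces the contents of an existing vector with a copy of the source.
void copy_into_existing(std::vector<my_struct>& destination_vector,
                        const std::vector<my_struct>& original_vector) {
    destination_vector.assign(original_vector.begin(), original_vector.end());
}
```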
Related
I'm introducing myself to C++, and sadly it's starting to seem like the support for dynamically created arrays of fixed size (but with the size known only at run time) is very poor in C++, as new[] can't call an arbitrary user-specified constructor with user-set arguments.
Consider class A which has a number of constructors, each with some parameters. Assume that a constructor without parameters would be useless (I don't want to have to write one if I essentially don't need it). I guess the following doesn't matter, but, just in case: assume that A contains only a possibly large std::vector<Internal> (Internal is a private class, T and S parameterize A) and an integer counter as far as data members go. Also, A is parameterized.
Assume we want n instances of A stored contiguously in memory as an array, where n is determined at run time and constant afterwards. We want to be able to create and initialize the structure with a single call that passes arguments to a constructor of A, or something similar. So each instance in the array gets the same, but programmatic initialization. EDIT: sorry, I didn't mean to say I want O(1) initialization, as that's impossible, I just wanted O(n) initialization, but so that I can create the array in one statement. I.e., so that I don't have to write an initialization loop for every array I create.
A possible, but suboptimal, solution is std::vector<A<T,S>>, but assume we can't live with the inefficiency. (Remember that std::vector supports resizing.)
How to implement and/or use an efficient solution with a nice API?
I would prefer a solution that doesn't reimplement half of the standard library, i.e. consider C++20 features and the standard library available for the implementation. Also, don't make me violate the C++ aliasing rules.
A possibly related question is why is such a "fixed_size_vector" class missing from the standard library?
(BTW: not that it matters, but please don't say "just use vector", because in this case I'm indeed going to go with the mentioned suboptimal solution, as the performance is not significant for my toy program, but in the real world the performance will matter one day and I want to be prepared. EDIT: I did not mean I want to optimize my toy program, rather I was referring to the fact that one day I will have to optimize some other program.)
EDIT: answering to some commenters: wrapping std::vector could provide the right abstraction, but it would be unnecessarily inefficient. A comment linked a question whose top answer explains this nicely:
dynarray is smaller and simpler than vector, because it doesn't need
to manage separate size and capacity values, and it doesn't need to
store an allocator
(dynarray here was a proposed addition to stdlib that seems to be what I wanted, except that it was also supposed to rely on special compiler support for some of its semantics). Of course, this difference compared to std::vector won't matter most of the time, but it would still be good if I was able to simply use the right tool for the job.
There is a proposal to add a fixed capacity vector to the standard.
Note that this proposal proposes the capacity be known at compile-time, so it's not applicable in your case.
There are also some open source libraries that implement one, e.g., Boost's static_vector. If you really want a fixed-capacity vector, you can use one of these existing implementations.
If you really know what you're doing, you could write one on your own, but that's not the case for >99% of C++ users.
However, it should be noted that reserve()ing space on a vector will probably have the effect you want, and there's probably no need for an actual fixed capacity vector.
Since you mention that the size is only known at runtime this is exactly what std::vector is meant to be used for.
template <typename T, typename...Args>
auto make_vector(std::size_t size, const Args&...args) -> std::vector<T>
{
auto result = std::vector<T>{};
result.reserve(size); // whatever the known size is
for (std::size_t i = 0; i < size; ++i) { // index type matches size; avoids a signed/unsigned comparison
result.emplace_back(args...);
}
return result;
}
// Use like:
auto vec = make_vector<std::string>(20, "hello world");
This will pre-allocate enough room for size entries of type T, and the loop will call T's constructor with whatever arguments you pass it.
Be aware that:
No additional constructors are called.
No extra memory is used.
No copies or relocations are performed.
The returned vector is not copied (or even moved) with C++17 or above thanks to guaranteed copy elision.
Doing this is as optimal as you can get whether you use a specialized container or otherwise. This is why every experienced C++ developer will tell you the same thing: std::vector is the solution.[2]
Note: The above function uses const Args&... for propagation and not proper forwarding references, since rvalue references could result in use-after-move bugs.[1]
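To make the use-after-move concern concrete, here is a deliberately flawed sketch that forwards the arguments instead of taking them by const reference. `Holder` and `make_vector_forwarding` are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical element type that takes ownership of a pointer.
struct Holder {
    explicit Holder(std::unique_ptr<int> p) : ptr(std::move(p)) {}
    std::unique_ptr<int> ptr;
};

// Deliberately flawed variant: forwards the same arguments on every iteration.
template <typename T, typename... Args>
std::vector<T> make_vector_forwarding(std::size_t size, Args&&... args) {
    std::vector<T> result;
    result.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        // After the first iteration an rvalue argument has already been moved from.
        result.emplace_back(std::forward<Args>(args)...);
    }
    return result;
}
```

Calling `make_vector_forwarding<Holder>(3, std::make_unique<int>(7))` leaves only the first element with a valid pointer; the remaining elements hold nullptr, because a moved-from unique_ptr is null.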
A specialized container like a fixed_size_vector that you mention will either be one of two things:
Fixed at compile-time on the max size, in which case it wouldn't work for you since you mentioned the size is only known at runtime
Fixed at runtime on the max size, in which case it will do exactly what I suggested above, since it will reserve the storage space up-front.
It is not possible at the language level to dynamically construct N objects only known at runtime using a custom constructor. Full stop. This could be done if the sequence is known at compile-time, but not runtime.
C++ is statically compiled, so we cannot variadically expand a runtime n value into a pack of T{...} constructor calls; it's simply not possible. This means there will be a loop every time. Thus the most optimal thing you can do is allocate n objects once, and call T's constructor n times.
[1] A short-hand syntax for passing a list of arguments to all of a sequence's constructors is not a good general solution in C++. In fact, it would be suboptimal. This would either force copies via const lvalue references, or it would allow for rvalues -- in which case only the first object constructed will get a valid value, and everything after will receive a moved-from object! Just imagine passing a unique_ptr to a sequence of T's. Only the first instance will get a valid pointer, and everything else will receive nullptr.
[2] Honestly, about the only real optimization you might be able to make on this solution would be to use a custom allocator, such as a std::pmr::vector with a stack-allocated memory buffer resource.
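A rough sketch of what footnote [2] describes, assuming C++17 and a standard library that ships `<memory_resource>`. The function name and the 1024-byte buffer size are arbitrary choices for illustration; if the buffer runs out, the monotonic resource falls back to its upstream (by default, the heap).

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Sums 0..n-1 using a vector whose storage comes from a stack buffer.
int sum_with_stack_buffer(std::size_t n) {
    alignas(std::max_align_t) std::byte buffer[1024];  // arbitrary size for the sketch
    std::pmr::monotonic_buffer_resource resource(buffer, sizeof(buffer));

    std::pmr::vector<int> values(&resource);
    values.reserve(n);  // single allocation, served from the stack buffer if it fits
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(static_cast<int>(i));
    }

    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}
```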
Footnote
I strongly advise you to get over the "efficiency first" mentality. Most developers' intuition on what is and is not efficient is wrong; this is why profilers are so important. Things like speculative execution, cache locality, and pipelining play a huge role in performance -- and these things are far more complex than simply constructing a dynamic array of objects.
Real software is written for other developers, not for the machine. It's better to have code that is maintainable and scalable, and optimized in places where bottlenecks have been identified through proper tooling.
Let's assume that I have a class
class Foo
{
public:
Foo (const std::string&);
virtual ~Foo()=default;
private:
//some private properties
};
And I want to create many instances of this class. Since I aim for good performance, I want to allocate the memory at once for all of them (at this point, I know the exact number but only at runtime). However, each object shall be constructed with an individual constructor parameter from a vector of parameters
std::vector<std::string> parameters;
Question: How can this be achieved?
My first try was to start with a std::vector<Foo> and then reserve(parameters.size()) and use emplace_back(...) in a loop. However I cannot use this approach because I use pointers to the individual objects and want to be sure that they are not moved to a different location in memory by the internal methods of std::vector. To avoid this I tried to delete the copy constructor of Foo to be sure at compile time that no methods can be called that might copy the objects to a different location but then I cannot use emplace_back(...) anymore. The reason is that in this method, the vector might want to grow and copy all the elements to the new location, it does not know that I reserved enough space.
I see three possibilities:
Use vector with reserve + emplace_back. You have the guarantee that your elements don't get moved as long as you don't exceed the capacity.
Use malloc + placement new. This allows you to allocate raw memory and then construct each element one by one e.g. in a loop.
If you already have a range of parameters from which to construct your objects as in the example, you can probably (depending on your implementation of std::vector) use std::vector's iterator based constructor like this:
std::vector<Foo> v(parameters.begin(),parameters.end());
First solution has the advantage to be much simpler and has all the other goodies of a vector like taking care of destruction, keeping the size around etc.
The second solution might be faster, because you don't need to do the housekeeping stuff of vector emplace_back and it works even with a deleted move / copy constructor if that is important to you, but it leaves you with dozens of possibilities for errors.
The third solution - if applicable - is imho the best. It also works with deleted copy / move constructors, should not have any performance overhead and it gives you all the advantages of using a standard container.
It does however rely on the constructor first determining the size of the range (e.g. via std::distance) and I'm not sure if this is guaranteed for any kind of iterators (in practice, all implementations do this at least for random access iterators). Also in some cases, providing appropriate iterators requires writing some boilerplate code.
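A sketch of the third solution with copying and moving disabled. The `name_` member and the `make_foos` helper are assumptions added for illustration. As far as I know, the standard only requires the element type to be constructible from the dereferenced iterator when forward (or stronger) iterators are passed to this constructor, which is the case for `std::vector<std::string>` iterators.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Variant of Foo with copying (and therefore moving) disabled.
class Foo {
public:
    Foo(const std::string& name) : name_(name) {}
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;
    virtual ~Foo() = default;

    const std::string& name() const { return name_; }

private:
    std::string name_;  // illustrative private property
};

// Allocates once and constructs each Foo in place from its own parameter.
std::vector<Foo> make_foos(const std::vector<std::string>& parameters) {
    return std::vector<Foo>(parameters.begin(), parameters.end());
}
```

Because the vector can never copy or move a `Foo`, pointers to its elements stay valid for as long as the vector itself lives.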
I'm writing C++ on an Arduino. I've run into a problem trying to copy an array using memcpy.
Character characters[5] = {
Character("Bob", 40, 20),
Character("Joe", 30, 10),
...
};
I then pass this array into a constructor like so:
Scene scene = Scene(characters, sizeof(characters)/sizeof(Character));
Inside this constructor I attempt to copy the characters using memcpy:
memcpy(this->characters, characters, characters_sz);
This seems to lock up my application. Upon research it appears that memcpy is not the right tool for this job. If I comment that line out the rest of the application continues to freeze.
I can't use vectors because they're not supported on the Arduino, neither is std::copy. Debugging is a pain.
Is there any way to do this?
Edit
The reason why I am copying is because multiple objects will get their own copy of the characters. Each class can modify and destroy them accordingly because they're copies. I don't want to have the Scene class responsible for creating the characters, so I'd rather pass them in.
You will have to copy the members individually, or create a copy constructor in the Character class / struct
It's very unclear what's going on in your code.
First of all, you aren't using std::array as your question title suggests, you are using a built-in array.
You could conceivably use std::array instead, and just use the copy constructor of std::array. But that brings us to the second question.
When you are doing memcpy in the constructor of Scene, what is the actual size of this->characters? It's not a good thing to have a constructor that takes characters_sz dynamically if in fact there is a static limit on how many it can accept.
If I were you and really trying to avoid dynamic allocations and std::vector, I would use std::array for both things, the member of Scene and the temporary variable you are passing, and I would make the constructor a template, so that it can accept arbitrary sized std::array of characters. But, I would put a static assert so that if the size of the passed array is too large, it fails at compile time.
Also assuming you are in C++11 here.
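A minimal sketch of that idea, assuming `std::array` is available on the target toolchain. The `Character` fields, the limit of 8, and the accessor names are all invented for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Character; it needs a default constructor
// so that unused slots in the fixed-size member array can exist.
struct Character {
    const char* name = "";
    int x = 0;
    int y = 0;
    Character() = default;
    Character(const char* n, int px, int py) : name(n), x(px), y(py) {}
};

class Scene {
public:
    static constexpr std::size_t max_characters = 8;  // assumed static limit

    // Accepts any std::array size; too-large arrays fail at compile time.
    template <std::size_t N>
    explicit Scene(const std::array<Character, N>& input) : count_(N) {
        static_assert(N <= max_characters, "too many characters for Scene");
        std::copy(input.begin(), input.end(), characters_.begin());
    }

    std::size_t size() const { return count_; }

    const Character& at(std::size_t i) const {
        assert(i < count_);
        return characters_[i];
    }

private:
    std::array<Character, max_characters> characters_{};
    std::size_t count_;
};
```

Each `Scene` ends up with its own copy, so later changes to the caller's array do not affect it.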
I guess depending on your application, this strategy wouldn't be appropriate. It might be that the size of the arrays really needs to be variable at run-time, but you still don't want to make dynamic allocations. In that case you could have a look at boost::static_vector.
http://www.boost.org/doc/libs/1_62_0/doc/html/container/non_standard_containers.html
boost::static_vector basically behaves like an in-place buffer (no heap allocation) large enough to hold N objects, but it won't default-construct all N of them; you may have only one or two alive, etc. It keeps track of how many of them are actually alive, and basically acts like a stack-allocated std::vector with a capacity limit of N.
Use std::copy_n:
std::copy_n(characters, num_characters, this->characters);
Note that the order of arguments is different from memcpy and the number is the number of elements, not the size of those elements. You'll also need #include <algorithm> in the top of your source file.
That said, you're probably better off using a std::vector rather than a fixed size array. That way you can just use a simple assignment to copy it, and you can grow and shrink it dynamically.
Of course I would like to know some magic fix to this but I am open to restructuring.
So I have a class DeviceDependent, with the following constructor
DeviceDependent(Device& device);
which stores a reference to the device. The device can change state, which will necessitate a change in all DeviceDependent instances dependent on that device. (You guessed it, this is my paltry attempt to ride the DirectX beast.)
To handle this I have the functions DeviceDependent::createDeviceResources(), DeviceDependent::onDeviceLost().
I planned to register each DeviceDependent instance to the device specified in the DeviceDependent constructor. The Device would keep a std::vector<DeviceDependent*> of all DeviceDependent instances so registered. It would then iterate through that vector and call the above functions when appropriate.
This seemed simple enough, but what I especially liked about it was that I could have a std::vector<DeviceDependent (or child)> somewhere else in the code and iterate over them quickly. For instance, I have a class Renderable which, as the name suggests, represents a renderable object. I need to iterate over these at least once a frame, and because of this I did not want the objects to be scattered throughout memory.
Down to business, here is the problem:
When I create the solid objects I relied on move semantics. This was purely by instinct; I did not consider copying large objects like these to add them to the std::vector<DeviceDependent (or child)> collection (and still abhor the idea).
However, with move semantics (and I have tested this for those who don't believe it) the address of the object changes. What's more, it changes after the default constructor is called. That means my code inside the constructor of DeviceDependent calling device.registerDeviceDependent(this) compiles and runs fine, but the device accumulates a list of pointers which are invalidated as soon as the object is moved into the vector.
I want to know if there is some way I can stick to this plan and make it work.
Things I thought of:
Making the 'real' vector a collection of shared pointers, no issue copying. The object presumably will not change address. I don't like this plan because I am afraid that leaving things out on the heap will harm iteration performance.
Calling register after the object has been moved. It's what I'm doing provisionally, but I don't like it because I feel the constructor is the proper place to do this. There should not exist an instance of DeviceDependent that is not on some device's manifest.
Writing my own move constructor or move assignment functions. This way I could remove the old address from the device and change it to the new one. I don't want to do this because I don't want to keep updating it as the class evolves.
This has nothing to do with move constructors. The issue is std::vector. When you add a new item to that vector, it may reallocate its memory, and that will cause all the DeviceDependent objects to be transferred to a new memory block internal to the vector. Then new versions of each item will be constructed, and the old ones destroyed. Whether the construction is copy-construction or move-construction is irrelevant; the objects effectively change their address either way.
To make your code correct, DeviceDependent objects need to unregister themselves in their destructor, and register themselves in both copy- and move-constructors. You should do this regardless of what else you decide about storage, if you have not deleted those constructors. Otherwise those constructors, if called, will do the wrong thing.
One approach not on your list would be to prevent the vector reallocating by calling reserve() with the maximum number of items you will store. This is only practical if you know a reasonable upper-bound to the number of DeviceDependent objects. However, you may find that reserving an estimate, while not eliminating the vector reallocations entirely, makes it rare enough that the cost of un-registering and re-registering becomes insignificant.
It sounds like your goal is getting cache-coherency for the DeviceDependents. You might find that using a std::deque as main storage avoids the re-allocations while still giving enough cache-coherency. Or you could gain cache-coherency by writing a custom allocator or operator new().
As an aside, it sounds like your design is being driven by performance costs that you are only guessing at. If you actually measure it, you might find that using std::vector<std::unique_ptr<DeviceDependent>> is fine, and doesn't significantly affect the time it takes to iterate over them. (Note you don't need shared pointers here, since the vector is the only owner, so you can avoid the overheads of reference-counting.)
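A minimal sketch of the register/unregister approach described in this answer. Only `registerDeviceDependent` comes from the question; `unregisterDeviceDependent`, `isRegistered` and `registeredCount` are hypothetical helpers added for illustration, and the Device is assumed to outlive all of its dependents.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class DeviceDependent;

class Device {
public:
    void registerDeviceDependent(DeviceDependent* d) { dependents_.push_back(d); }

    // Hypothetical counterpart to registerDeviceDependent().
    void unregisterDeviceDependent(DeviceDependent* d) {
        dependents_.erase(std::remove(dependents_.begin(), dependents_.end(), d),
                          dependents_.end());
    }

    bool isRegistered(const DeviceDependent* d) const {
        return std::find(dependents_.begin(), dependents_.end(), d) != dependents_.end();
    }

    std::size_t registeredCount() const { return dependents_.size(); }

private:
    std::vector<DeviceDependent*> dependents_;
};

class DeviceDependent {
public:
    explicit DeviceDependent(Device& device) : device_(&device) {
        device_->registerDeviceDependent(this);
    }
    // Copies and moves create a new object at a new address, so they register too.
    DeviceDependent(const DeviceDependent& other) : device_(other.device_) {
        device_->registerDeviceDependent(this);
    }
    DeviceDependent(DeviceDependent&& other) : device_(other.device_) {
        device_->registerDeviceDependent(this);
    }
    // Assignment is disabled to keep the sketch short.
    DeviceDependent& operator=(const DeviceDependent&) = delete;

    // The old object unregisters its (now stale) address when it is destroyed.
    ~DeviceDependent() { device_->unregisterDeviceDependent(this); }

private:
    Device* device_;
};
```

With this in place, a reallocating `std::vector<DeviceDependent>` keeps the device's list in sync, at the cost of one register and one unregister per relocated element.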
I'd like to use a std::vector to control a given piece of memory. First of all I'm pretty sure this isn't good practice, but curiosity has the better of me and I'd like to know how to do this anyway.
The problem I have is a method like this:
vector<float> getRow(unsigned long rowIndex)
{
float* row = _m->getRow(rowIndex); // row is now a piece of memory (of a known size) that I control
vector<float> returnValue(row, row+_m->cols()); // construct a new vec from this data
delete [] row; // delete the original memory
return returnValue; // return the new vector
}
_m is a DLL interface class which returns an array of float which it is the caller's responsibility to delete. So I'd like to wrap this in a vector and return that to the user... but this implementation allocates new memory for the vector, copies the data into it, deletes the returned memory, and then returns the vector.
What I'd like to do is to straight up tell the new vector that it has full control over this block of memory so when it gets deleted that memory gets cleaned up.
UPDATE: The original motivation for this (memory returned from a DLL) has been fairly firmly squashed by a number of responders :) However, I'd love to know the answer to the question anyway... Is there a way to construct a std::vector using a given chunk of pre-allocated memory T* array, and the size of this memory?
The obvious answer is to use a custom allocator, however you might find that is really quite a heavyweight solution for what you need. If you want to do it, the simplest way is to take the allocator defined (as the default second template argument to vector<>) by the implementation, copy that and make it work as required.
Another solution might be to define a template specialisation of vector, define as much of the interface as you actually need and implement the memory customisation.
Finally, how about defining your own container with a conforming STL interface, defining random access iterators etc. This might be quite easy given that underlying array will map nicely to vector<>, and pointers into it will map to iterators.
Comment on UPDATE: "Is there a way to construct a std::vector using a given chunk of pre-allocated memory T* array, and the size of this memory?"
Surely the simple answer here is "No". Provided you want the result to be a vector<>, then it has to support growing as required, such as through the reserve() method, and that will not be possible for a given fixed allocation. So the real question is really: what exactly do you want to achieve? Something that can be used like vector<>, or something that really does have to in some sense be a vector, and if so, what is that sense?
Vector's default allocator doesn't provide this type of access to its internals. You could do it with your own allocator (vector's second template parameter), but that would change the type of the vector.
It would be much easier if you could write directly into the vector:
vector<float> getRow(unsigned long rowIndex) {
vector<float> row (_m->cols());
_m->getRow(rowIndex, &row[0]); // writes _m->cols() values into &row[0]
return row;
}
Note that &row[0] is a float* and it is guaranteed for vector to store items contiguously.
The most important thing to know here is that different DLL/Modules have different Heaps. This means that any memory that is allocated from a DLL needs to be deleted from that DLL (it's not just a matter of compiler version or delete vs delete[] or whatever). DO NOT PASS MEMORY MANAGEMENT RESPONSIBILITY ACROSS A DLL BOUNDARY. This includes creating a std::vector in a dll and returning it. But it also includes passing a std::vector to the DLL to be filled by the DLL; such an operation is unsafe since you don't know for sure that the std::vector will not try a resize of some kind while it is being filled with values.
There are two options:
Define your own allocator for the std::vector class that uses an allocation function that is guaranteed to reside in the DLL/Module from which the vector was created. This can easily be done with dynamic binding (that is, make the allocator class call some virtual function). Since dynamic binding will look-up in the vtable for the function call, it is guaranteed that it will fall in the code from the DLL/Module that originally created it.
Don't pass the vector object to or from the DLL. You can use, for example, a function getRowBegin() and getRowEnd() that return iterators (i.e. pointers) in the row array (if it is contiguous), and let the user std::copy that into its own, local std::vector object. You could also do it the other way around, pass the iterators begin() and end() to a function like fillRowInto(begin, end).
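A sketch of the second option, where the module only writes into a range supplied by the caller. The `Matrix` class here is a hypothetical stand-in for the DLL side, and the exact signature of `fillRowInto` is an assumption; only the general shape follows the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the DLL side: it owns its data and only
// writes into caller-provided ranges, so no memory changes owner.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols) : cols_(cols), data_(rows * cols) {
        for (std::size_t i = 0; i < data_.size(); ++i) {
            data_[i] = static_cast<float>(i);
        }
    }

    std::size_t cols() const { return cols_; }

    // Copies one row into [begin, end); the range must hold cols() floats.
    void fillRowInto(std::size_t rowIndex, float* begin, float* end) const {
        assert(static_cast<std::size_t>(end - begin) == cols_);
        const float* row = data_.data() + rowIndex * cols_;
        std::copy(row, row + cols_, begin);
    }

private:
    std::size_t cols_;
    std::vector<float> data_;
};

// Caller side: the vector is created, sized and destroyed by the caller only.
std::vector<float> getRow(const Matrix& m, std::size_t rowIndex) {
    std::vector<float> row(m.cols());
    m.fillRowInto(rowIndex, row.data(), row.data() + row.size());
    return row;
}
```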
This problem is very real, although many people neglect it without knowing. Don't underestimate it. I have personally suffered silent bugs related to this issue and it wasn't pretty! It took me months to resolve it.
I have checked in the source code, and boost::shared_ptr and boost::shared_array use dynamic binding (first option above) to deal with this.. however, they are not guaranteed to be binary compatible. Still, this could be a slightly better option (usually binary compatibility is a much lesser problem than memory management across modules).
Your best bet is probably a std::vector<shared_ptr<MatrixCelType>>.
Lots more details in this thread.
If you're trying to change where/how the vector allocates/reallocates/deallocates memory, the allocator template parameter of the vector class is what you're looking for.
If you're simply trying to avoid the overhead of construction, copy construction, assignment, and destruction, then allow the user to instantiate the vector, then pass it to your function by reference. The user is then responsible for construction and destruction.
It sounds like what you're looking for is a form of smart pointer. One that deletes what it points to when it's destroyed. Look into the Boost libraries or roll your own in that case.
The Boost.SmartPtr library contains a whole lot of interesting classes, some of which are dedicated to handle arrays.
For example, behold scoped_array:
int main(int argc, char* argv[])
{
boost::scoped_array<float> array(_m->getRow(atoi(argv[1])));
return 0;
}
The issue, of course, is that scoped_array cannot be copied, so if you really want a std::vector<float>, @Fred Nurk's answer is probably the best you can get.
In the ideal case you'd want the equivalent of unique_ptr but in array form; since C++11 the standard provides exactly that as std::unique_ptr<T[]>.
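A short sketch of taking ownership of such an array with `std::unique_ptr<float[]>` (available since C++11). `allocate_row` is a hypothetical stand-in for `_m->getRow()`, and this is only safe when the array really was allocated with `new[]` in the same module that will run `delete[]`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for _m->getRow(): returns a new[]-allocated array
// that the caller is responsible for releasing with delete[].
float* allocate_row(std::size_t cols) {
    float* row = new float[cols];
    for (std::size_t i = 0; i < cols; ++i) {
        row[i] = static_cast<float>(i);
    }
    return row;
}

// Takes ownership of the raw array without copying it; delete[] runs
// automatically when the returned unique_ptr is destroyed.
std::unique_ptr<float[]> get_row_owned(std::size_t cols) {
    return std::unique_ptr<float[]>(allocate_row(cols));
}
```

Unlike a vector, the unique_ptr does not remember the element count, so the caller still has to carry the number of columns alongside it.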