How can I use something like std::vector<std::mutex>? - c++

I have a large, but potentially varying, number of objects which are concurrently written to. I want to protect that access with mutexes. To that end, I thought I'd use a std::vector<std::mutex>, but this doesn't work, since std::mutex has no copy or move constructor, while std::vector::resize() requires one.
What is the recommended solution to this conundrum?
edit:
Do all C++ random-access containers require copy or move constructors for re-sizing? Would std::deque help?
edit again
First, thanks for all your thoughts. I'm not interested in solutions that avoid mutices and/or move them into the objects (I refrain from giving details/reasons). So, given that I want an adjustable number of mutices (where the adjustment is guaranteed to occur when no mutex is locked), there appear to be several solutions.
1 I could use a fixed number of mutices and use a hash function to map from objects to mutices (as in Captain Oblivous's answer). This will result in collisions, but the number of collisions should be small as long as the number of mutices is much larger than the number of threads, even if it is still smaller than the number of objects.
2 I could define a wrapper class (as in ComicSansMS's answer), e.g.
struct mutex_wrapper : std::mutex
{
    mutex_wrapper() = default;
    mutex_wrapper(mutex_wrapper const&) noexcept : std::mutex() {}
    bool operator==(mutex_wrapper const& other) const noexcept { return this == &other; }
};
and use a std::vector<mutex_wrapper>.
3 I could use std::unique_ptr<std::mutex> to manage individual mutexes (as in Matthias's answer). The problem with this approach is that each mutex is individually allocated and de-allocated on the heap. Therefore, I prefer
4 std::unique_ptr<std::mutex[]> mutices( new std::mutex[n_mutex] );
where a certain number n_mutex of mutices is allocated initially. Should this number later be found insufficient, I simply
if (need_mutex > n_mutex) {
    mutices.reset( new std::mutex[need_mutex] );
    n_mutex = need_mutex;
}
So which of these (1,2,4) should I use?

std::vector requires that its values are movable in order to maintain a contiguous array of values as it grows. You could create a vector containing mutexes, but you couldn't do anything with it that might need to resize it.
Other containers don't have that requirement; either deque or [forward_]list should work, as long as you construct the mutexes in place either during construction, or by using emplace() or resize(). Functions such as insert() and push_back() will not work.
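For instance, a minimal sketch (not from the original answer) that grows a pool of mutexes without ever moving one:
#include <cstddef>
#include <deque>
#include <mutex>

std::deque<std::mutex> mutexes;   // elements are constructed in place and never relocated

void grow_to(std::size_t n)
{
    while (mutexes.size() < n)
        mutexes.emplace_back();   // default-constructs a new std::mutex at the end
}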
Alternatively, you could add an extra level of indirection and store std::unique_ptr<std::mutex>; but your comment on another answer indicates that you consider the extra cost of dynamic allocation unacceptable.

If you want to create a vector of a certain length:
std::vector<std::mutex> mutexes;
...
size_t count = 4;
std::vector<std::mutex> list(count);
mutexes.swap(list);

You could use std::unique_ptr<std::mutex> instead of std::mutex. unique_ptrs are movable.

I suggest using a fixed mutex pool. Keep a fixed array of std::mutex and select which one to lock based on the address of the object like you might do with a hash table.
std::array<std::mutex, 32> mutexes;
std::mutex &m = mutexes[hashof(objectPtr) % mutexes.size()];
m.lock();
The hashof function could be something simple that shifts the pointer value over by a few bits. This way you only have to initialize the mutexes once, and you avoid the copying that resizing a vector would involve.
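hashof is not a standard function; a minimal sketch of one possibility, assuming the objects are at least 16-byte aligned so the low bits carry no information:
#include <cstddef>
#include <cstdint>

// Hypothetical pointer hash: drop the low alignment bits so that
// neighbouring objects map to different mutexes in the pool.
std::size_t hashof(const void* p)
{
    return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(p) >> 4);
}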

If efficiency is such a problem, I assume that you have only very small data structures which are changed very often. It is then probably better to use atomic compare-and-swap (and other atomic operations) instead of mutexes, specifically std::atomic_compare_exchange_strong or the compare_exchange_strong member of std::atomic.
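A minimal sketch (not from the original answer) of the retry loop such an update usually takes, using a single atomic int as the "small data structure":
#include <atomic>

std::atomic<int> value{0};

// Lock-free update: retry until no other thread changed the value
// between our load and our compare-exchange.
void add(int delta)
{
    int expected = value.load();
    while (!value.compare_exchange_strong(expected, expected + delta))
    {
        // expected was refreshed with the current value; try again
    }
}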

I'll sometimes use a solution along the lines of your 2nd option when I want a std::vector of classes or structs that each have their own std::mutex. Of course, it is a bit tedious, as I have to write my own copy/move constructors and assignment operators.
#include <mutex>    // std::mutex, std::lock_guard, std::lock
#include <utility>  // std::exchange, std::swap

struct MyStruct {
    MyStruct() : value1(0), value2(0) {}

    MyStruct(const MyStruct& other) {
        std::lock_guard<std::mutex> l(other.mutex);
        value1 = other.value1;
        value2 = other.value2;
    }

    MyStruct(MyStruct&& other) {
        std::lock_guard<std::mutex> l(other.mutex);
        value1 = std::exchange(other.value1, 0);
        value2 = std::exchange(other.value2, 0);
    }

    MyStruct& operator=(MyStruct&& other) {
        // Lock both mutexes together to avoid deadlock, then adopt them.
        std::lock(mutex, other.mutex);
        std::lock_guard<std::mutex> l1(mutex, std::adopt_lock), l2(other.mutex, std::adopt_lock);
        std::swap(value1, other.value1);
        std::swap(value2, other.value2);
        return *this;
    }

    MyStruct& operator=(const MyStruct& other) {
        // you get the idea
    }

    int value1;
    double value2;
    mutable std::mutex mutex;
};
You don't need to "move" the std::mutex. You just need to hold a lock on it while you "move" everything else.

How about declaring each mutex as a pointer?
std::vector<std::mutex*> my_mutexes(10);
// Initialize mutexes
for (int i = 0; i < 10; ++i) my_mutexes[i] = new std::mutex();
// Release mutexes
for (int i = 0; i < 10; ++i) delete my_mutexes[i];

Related

Is std::push_back relatively expensive to use?

I want to improve the performance of the following code. What aspect might affect the performance of the code when it's executed?
Also, considering that there is no limit to how many objects you can add to the container, what improvements could be made to “Object” or “addToContainer” to improve the performance of the program?
I was wondering whether std::vector::push_back in C++ affects the performance of the code in any way, especially if there is no limit to how much gets added to the list.
#include <string>
#include <vector>
using namespace std;

struct Object {
    string name;
    string description;
};

vector<Object> container;

void addToContainer(Object object) {
    container.push_back(object);
}

int main() {
    addToContainer({ "Fira", "+5 ATTACK" });
    addToContainer({ "Potion", "+10 HP" });
}
Before you do ANYTHING profile the code and get a benchmark. After you make a change profile the code and get a benchmark. Compare the benchmarks. If you do not do this, you're rolling dice. Is it faster? Who knows.
Profile profile profile.
With push_back you have two main concerns:
Resizing the vector when it fills up, and
Copying the object into the vector.
There are a number of improvements you can make to the resizing cost of push_back, depending on how items are being added.
Strategic use of reserve to minimize the amount of resizing, for example. If you know how many items are about to be added, you can check the capacity and size to see if it's worth your time to reserve to avoid multiple resizes. Note this requires knowledge of vector's expansion strategy and that is implementation-specific. An optimization for one vector implementation could be a terribly bad mistake on another.
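For instance, a minimal sketch (reusing Object and container from the question; the function name is illustrative), assuming you know how many items are about to arrive:
#include <cstddef>
#include <vector>

void prepareForBatch(std::vector<Object>& container, std::size_t incoming)
{
    // Pay for at most one reallocation, and only when one would happen anyway.
    if (container.capacity() - container.size() < incoming)
        container.reserve(container.size() + incoming);
}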
You can use insert to add multiple items at a time. Of course this is close to useless if you need to add another container into the code in order to bulk-insert.
If you have no idea how many items are incoming, you might as well let vector do its job and optimize HOW the items are added.
For example
void addToContainer(Object object) // pass by value. Possible copy
{
    container.push_back(object); // copy
}
Those copies can be expensive. Get rid of them.
void addToContainer(Object && object) // no copy and can still handle temporaries
{
    container.push_back(std::move(object)); // moves rather than copies
}
std::string is often very cheap to move.
This variant of addToContainer can be used with
addToContainer({ "Fira", "+5 ATTACK" });
addToContainer({ "Potion", "+10 HP" });
and might just migrate a pointer and a few book-keeping variables per string. They are temporaries, so no one cares if moving rips their guts out and throws away the corpses.
As for existing Objects
Object o{"Pizza pop", "+5 food"};
addToContainer(std::move(o));
If they are expendable, they get moved as well. If they aren't expendable...
void addToContainer(const Object & object) // no copy
{
    container.push_back(object); // copy
}
You have an overload that does it the hard way.
Tossing this one out there
If you already have a number of items you know are going to be in the list, rather than appending them all one at a time, use an initializer list:
vector<Object> container{
    {"Vorpal Cheese Grater", "Many little pieces"},
    {"Holy Hand Grenade", "OMG Damage"}
};
push_back can be extremely expensive, but as with everything, it depends on the context. Take for example this terrible code:
std::vector<float> slow_func(const float* ptr)
{
    std::vector<float> v;
    for (size_t i = 0; i < 256; ++i)
        v.push_back(ptr[i]);
    return v;
}
each call to push_back has to do the following:
Check to see if there is enough space in the vector
If not, allocate new memory, and copy the old values into the new vector
copy the new item to the end of the vector
increment end
Now there are two big problems here wrt performance. Firstly each push_back operation depends upon the previous operation (since the previous operation modified end, and possibly the entire contents of the array if it had to be resized). This pretty much destroys any vectorisation possibilities in the code. Take a look here:
https://godbolt.org/z/RU2tM0
The func that uses push_back does not make for very pretty asm. It's effectively hamstrung into being forced to copy a single float at a time. Now if you compare that to an alternative approach where you resize first, and then assign; the compiler just replaces the whole lot with a call to new, and a call to memcpy. This will be a few orders of magnitude faster than the previous method.
std::vector<float> fast_func(const float* ptr)
{
    std::vector<float> v(256);
    for (size_t i = 0; i < 256; ++i)
        v[i] = ptr[i];
    return v;
}
BUT, and it's a big but, the relative performance of push_back very much depends on whether the items in the array can be trivially copied (or moved). If, for example, you do something silly like:
struct Vec3 {
    float x = 0;
    float y = 0;
    float z = 0;
};
Well now when we did this:
std::vector<Vec3> v(256);
The compiler will allocate memory, but also be forced to set all the values to zero (which is pointless if you are about to overwrite them again!). The obvious way around this is to use a different constructor:
std::vector<Vec3> v(ptr, ptr + 256);
So really, only use push_back (well, really you should prefer emplace_back in most cases) when either:
additional elements are added to your vector occasionally
or, the objects you are adding are complex to construct (in which case, use emplace_back! See the sketch below.)
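A minimal sketch of that second case (the Item type is hypothetical and has a constructor, since emplace_back forwards its arguments to a constructor rather than using aggregate initialization):
#include <string>
#include <utility>
#include <vector>

struct Item {
    std::string name;
    std::string description;
    Item(std::string n, std::string d) : name(std::move(n)), description(std::move(d)) {}
};

std::vector<Item> items;

void demo()
{
    items.push_back(Item{"Fira", "+5 ATTACK"}); // builds a temporary, then moves it in
    items.emplace_back("Potion", "+10 HP");     // constructs directly inside the vector
}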
without any other requirements, unfortunately this is the most efficient:
void addToContainer(Object) { }
To answer the rest of your question: in general, push_back will just add to the end of the allocated vector in O(1), but it will occasionally need to grow the vector, which is O(N), although that cost is amortized over many insertions.
Also, it would likely be more efficient not to use string but to keep a char*, although memory management might be tricky unless it is always a literal being added.

Custom allocators as alternatives to vector of smart pointers?

This question is about owning pointers, consuming pointers, smart pointers, vectors, and allocators.
I am a little bit lost in my thoughts about code architecture. Furthermore, if this question already has an answer somewhere, 1. sorry, but I haven't found a satisfying answer so far, and 2. please point me to it.
My problem is the following:
I have several "things" stored in a vector and several "consumers" of those "things". So, my first try was like follows:
std::vector<thing> i_am_the_owner_of_things;

thing* get_thing_for_consumer() {
    // some thing-selection logic
    return &i_am_the_owner_of_things[5]; // 5 is just an example
}

...

// somewhere else in the code:
class consumer {
    consumer() {
        m_thing = get_thing_for_consumer();
    }

    thing* m_thing;
};
In my application, this would be safe because the "things" outlive the "consumers" in any case. However, more "things" can be added during runtime and that can become a problem because if the std::vector<thing> i_am_the_owner_of_things; gets reallocated, all the thing* m_thing pointers become invalid.
A fix to this scenario would be to store unique pointers to "things" instead of "things" directly, i.e. like follows:
std::vector<std::unique_ptr<thing>> i_am_the_owner_of_things;

thing* get_thing_for_consumer() {
    // some thing-selection logic
    return i_am_the_owner_of_things[5].get(); // 5 is just an example
}

...

// somewhere else in the code:
class consumer {
    consumer() {
        m_thing = get_thing_for_consumer();
    }

    thing* m_thing;
};
The downside here is that memory coherency between "things" is lost. Can this memory coherency be re-established by using custom allocators somehow? I am thinking of something like an allocator which would always allocate memory for, e.g., 10 elements at a time and, whenever required, would add more 10-element-sized chunks of memory.
Example:
initially:
v = ☐☐☐☐☐☐☐☐☐☐
more elements:
v = ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐
and again:
v = ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐
Using such an allocator, I wouldn't even have to use std::unique_ptrs of "things" because at std::vector's reallocation time, the memory addresses of the already existing elements would not change.
As alternative, I can only think of referencing the "thing" in "consumer" via a std::shared_ptr<thing> m_thing, as opposed to the current thing* m_thing but that seems like the worst approach to me, because a "thing" shall not own a "consumer" and with shared pointers I would create shared ownership.
So, is the allocator-approach a good one? And if so, how can it be done? Do I have to implement the allocator by myself or is there an existing one?
If you are able to treat thing as a value type, do so. It simplifies things, you don't need a smart pointer for circumventing the pointer/reference invalidation issue. The latter can be tackled differently:
If new thing instances are inserted via push_front and push_back during the program, use std::deque instead of std::vector. Then, no pointers or references to elements in this container are invalidated (iterators are invalidated, though - thanks to #odyss-jii for pointing that out). If you fear that you heavily rely on the performance benefit of the completely contiguous memory layout of std::vector: create a benchmark and profile.
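A minimal sketch of that first suggestion, reusing the names from the question but with std::deque as the backing container:
#include <deque>

std::deque<thing> i_am_the_owner_of_things;

thing* get_thing_for_consumer() {
    // some thing-selection logic
    // deque never relocates existing elements on push_back/push_front,
    // so this pointer stays valid as more things are added at the ends.
    return &i_am_the_owner_of_things[5]; // 5 is just an example
}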
If new thing instances are inserted in the middle of the container during the program, consider using std::list. No pointers/iterators/references are invalidated when inserting or removing container elements. Iteration over a std::list is much slower than a std::vector, but make sure this is an actual issue in your scenario before worrying too much about that.
There is no single right answer to this question, since it depends a lot on the exact access patterns and desired performance characteristics.
Having said that, here is my recommendation:
Continue storing the data contiguously as you are, but do not store aliasing pointers to that data. Instead, consider a safer alternative (this is a proven method) where you fetch the pointer based on an ID right before using it -- as a side-note, in a multi-threaded application you can lock attempts to resize the underlying store whilst such a weak reference lives.
So your consumer will store an ID, and will fetch a pointer to the data from the "store" on demand. This also gives you control over all "fetches", so that you can track them, implement safety measures, etc.
void consumer::foo() {
    thing *t = m_thing_store.get(m_thing_id);
    if (t) {
        // do something with t
    }
}
Or a more advanced alternative, to help with synchronization in a multi-threaded scenario:
void consumer::foo() {
    reference<thing> t = m_thing_store.get(m_thing_id);
    if (!t.empty()) {
        // do something with t
    }
}
Where reference would be some thread-safe RAII "weak pointer".
There are multiple ways of implementing this. You can use an open-addressing hash table with the ID as the key; this will give you roughly O(1) access time if you balance it properly.
Another alternative (best-case O(1), worst-case O(N)) is to use a "reference" structure, with a 32-bit ID and a 32-bit index (so same size as 64-bit pointer) -- the index serves as a sort-of cache. When you fetch, you first try the index, if the element in the index has the expected ID you are done. Otherwise, you get a "cache miss" and you do a linear scan of the store to find the element based on ID, and then you store the last-known index value in your reference.
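For the simpler variant, a minimal sketch using std::unordered_map in place of a hand-rolled open-addressing table (the thing_store name and its interface are illustrative, not a fixed API):
#include <cstdint>
#include <unordered_map>
#include <utility>

// Illustrative store: maps a stable ID to the owned thing.
class thing_store
{
public:
    std::uint32_t add(thing t)
    {
        const std::uint32_t id = next_id_++;
        things_.emplace(id, std::move(t));
        return id;
    }

    // Returns nullptr if the thing has been removed in the meantime.
    thing* get(std::uint32_t id)
    {
        auto it = things_.find(id);
        return it == things_.end() ? nullptr : &it->second;
    }

private:
    std::uint32_t next_id_ = 0;
    std::unordered_map<std::uint32_t, thing> things_;
};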
IMO the best approach would be to create a new container which behaves in a safe way.
Pros:
the change is done at a separate level of abstraction
changes to old code will be minimal (just replace std::vector with the new container)
it is the "clean code" way to do it
Cons:
it may look like there is a bit more work to do
Another answer proposes the use of std::list, which will do the job, but with a larger number of allocations and slower random access. So IMO it is better to compose your own container from a couple of std::vectors.
So it may start to look more or less like this (minimal example):
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

template<typename T>
class cluster_vector
{
public:
    static constexpr size_t cluster_size = 16;

    cluster_vector() {
        clusters.reserve(1024);
        add_cluster();
    }

    ...

    size_t size() const {
        if (clusters.empty()) return 0;
        return (clusters.size() - 1) * cluster_size + clusters.back().size();
    }

    T& operator[](size_t index) {
        throwIfIndexTooBig(index);
        return clusters[index / cluster_size][index % cluster_size];
    }

    void push_back(T&& x) {
        if_last_is_full_add_cluster();
        clusters.back().push_back(std::forward<T>(x));
    }

private:
    void throwIfIndexTooBig(size_t index) const {
        if (index >= size()) {
            throw std::out_of_range("cluster_vector out of range");
        }
    }

    void add_cluster() {
        clusters.push_back({});
        clusters.back().reserve(cluster_size);
    }

    void if_last_is_full_add_cluster() {
        if (clusters.back().size() == cluster_size) {
            add_cluster();
        }
    }

private:
    std::vector<std::vector<T>> clusters;
};
This way you provide a container which will not reallocate its items. It doesn't matter what T does.
[A shared pointer] seems like the worst approach to me, because a "thing" shall not own a "consumer" and with shared pointers I would create shared ownership.
So what? Maybe the code is a little less self-documenting, but it will solve all your problems.
(And by the way you are muddling things by using the word "consumer", which in a traditional producer/consumer paradigm would take ownership.)
Also, returning a raw pointer in your current code is already entirely ambiguous as to ownership. In general, I'd say it's good practice to avoid raw pointers if you can (e.g. so you never need to call delete). I would return a reference if you go with unique_ptr:
std::vector<std::unique_ptr<thing>> i_am_the_owner_of_things;

thing& get_thing_for_consumer() {
    // some thing-selection logic
    return *i_am_the_owner_of_things[5]; // 5 is just an example
}

Copying vector elements to a vector pair

In my C++ code,
vector <string> strVector = GetStringVector();
vector <int> intVector = GetIntVector();
So I combined these two vectors into a single one,
void combineVectors(vector<string>& strVector, vector<int>& intVector, vector<pair<string, int>>& pairVector)
{
    for (int i = 0; i < strVector.size() || i < intVector.size(); ++i)
    {
        pairVector.push_back(pair<string, int>(strVector.at(i), intVector.at(i)));
    }
}
Now this function is called like this,
vector<string> strVector = GetStringVector();
vector<int> intVector = GetIntVector();
vector<pair<string, int>> pairVector;
combineVectors(strVector, intVector, pairVector);
//rest of the implementation
The combineVectors function uses a loop to add the elements of the other two vectors to the vector of pairs. I doubt this is an efficient way, as this function gets called hundreds of times with different data. This might cause a performance issue because it goes through the loop every time.
My goal is to copy both vectors into the vector of pairs in "one go", i.e. without using a loop. I am not sure whether that's even possible.
Is there a better way of achieving this without compromising the performance?
You have clarified that the arrays will always be of equal size. That's a prerequisite condition.
So, your situation is as follows. You have vector A over here, and vector B over there. You have no guarantees whether the actual memory that vector A uses and the actual memory that vector B uses are next to each other. They could be anywhere.
Now you're combining the two vectors into a third vector, C. Again, no guarantees where vector C's memory is.
So, you have really very little to work with, in terms of optimizations. You have no additional guarantees whatsoever. This is pretty much fundamental: you have two chunks of bytes, and those two chunks need to be copied somewhere else. That's it. That's what has to be done, that's what it all comes down to, and there is no other way to get it done, other than doing exactly that.
But there is one thing that can be done to make things a little bit faster. A vector will typically allocate memory for its values in incremental steps, reserving some extra space, initially, and as values get added to the vector, one by one, and eventually reach the vector's reserved size, the vector has to now grab a new larger block of memory, copy everything in the vector to the larger memory block, then delete the older block, and only then add the next value to the vector. Then the cycle begins again.
But you know, in advance, how many values you are about to add to the vector, so you simply instruct the vector to reserve() enough size in advance, so it doesn't have to repeatedly grow itself, as you add values to it. Before your existing for loop, simply:
pairVector.reserve(pairVector.size()+strVector.size());
Now, the for loop will proceed and insert new values into pairVector which is guaranteed to have enough space.
A couple of other things are possible. Since you have stated that both vectors will always have the same size, you only need to check the size of one of them:
for (int i = 0; i < strVector.size(); ++i )
Next step: at() performs bounds checking. This loop ensures that i will never be out of bounds, so at()'s bound checking is also some overhead you can get rid of safely:
pairVector.push_back(pair<string, int> (strVector[i], intVector[i]));
Next: with a modern C++ compiler, the compiler should be able to optimize away, automatically, several redundant temporaries, and temporary copies here. It's possible you may need to help the compiler, a little bit, and use emplace_back() instead of push_back() (assuming C++11, or later):
pairVector.emplace_back(strVector[i], intVector[i]);
Going back to the loop condition, strVector.size() gets evaluated on each iteration of the loop. It's very likely that a modern C++ compiler will optimize it away, but just in case you can also help your compiler check the vector's size() only once:
int n = strVector.size();
for (int i = 0; i < n; ++i)
This is really a stretch, but it might eke out a few extra quantums of execution time. And that's pretty much all the obvious optimizations here. Realistically, the most to be gained is by using reserve(). The other optimizations might help things a little bit more, but it all boils down to moving a certain number of bytes from one area of memory to another. There aren't really special ways of doing that that are faster than other ways.
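Putting those steps together, the function might look like this (a sketch based on the points above, not code from the original answer; it still assumes equal-sized inputs):
void combineVectors(const vector<string>& strVector, const vector<int>& intVector,
                    vector<pair<string, int>>& pairVector)
{
    const size_t n = strVector.size();           // both inputs have the same size
    pairVector.reserve(pairVector.size() + n);   // at most one reallocation
    for (size_t i = 0; i < n; ++i)
        pairVector.emplace_back(strVector[i], intVector[i]); // no temporaries, no bounds checks
}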
We can use std::generate() to achieve this:
#include <bits/stdc++.h>
using namespace std;

vector<string> strVector{ "hello", "world" };
vector<int> intVector{ 2, 3 };

pair<string, int> f()
{
    static int i = -1;
    ++i;
    return make_pair(strVector[i], intVector[i]);
}

int main() {
    int min_Size = min(strVector.size(), intVector.size());
    vector<pair<string, int>> pairVector(min_Size);
    generate(pairVector.begin(), pairVector.end(), f);
    for (int i = 0; i < 2; i++)
        cout << pairVector[i].first << " " << pairVector[i].second << endl;
}
I'll try and summarize what you want, with some possible answers depending on your situation. You say you want a new vector that is essentially a zipped version of two other vectors which contain two heterogeneous types, where you can access the two types as some sort of pair?
If you want to make this more efficient, you need to think about what you are using the new vector for? I can see three scenarios with what you are doing.
1) The new vector is a copy of your data, so you can do stuff with it without affecting the original vectors (i.e. you still need the original two vectors).
2) The new vector is now the storage mechanism for your data (i.e. you no longer need the original two vectors).
3) You are simply coupling the vectors together to make use and representation easier (i.e. where they are stored doesn't actually matter).
1) Not much you can do aside from copying the data into your new vector. Explained more in Sam Varshavchik's answer.
3) You do something like Shakil's answer or here or some type of customized iterator.
2) Here you can make some optimisations where you do zero copying of the data, with the use of a wrapper class. Note: A wrapper class works if you don't need to use the actual std::vector<std::pair> class. You can make a class where you move the data into it and create access operators for it. If you can do this, it also allows you to decompose the wrapper back into the original two vectors without copying. Something like this might suffice.
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class StringIntContainer {
public:
    StringIntContainer(std::vector<std::string>& _string_vec, std::vector<int>& _int_vec)
        : string_vec_(std::move(_string_vec)), int_vec_(std::move(_int_vec))
    {
        assert(string_vec_.size() == int_vec_.size());
    }

    std::pair<std::string, int> operator[](std::size_t _i) const
    {
        return std::make_pair(string_vec_[_i], int_vec_[_i]);
    }

    /* You may want methods that return references to the data so you can edit it */

    std::pair<std::vector<std::string>, std::vector<int>> Decompose()
    {
        return std::make_pair(std::move(string_vec_), std::move(int_vec_));
    }

private:
    std::vector<std::string> string_vec_;
    std::vector<int> int_vec_;
};

Copying std::vector between threads without locking

I have a vector that is modified in one thread, and I need to use its contents in another. Locking between these threads is unacceptable due to performance requirements. Since iterating over the vector while it is changing will cause a crash, I thought to copy the vector and then iterate over the copy. My question is, can this way also crash?
struct Data
{
    int A;
    double B;
    bool C;
};

std::vector<Data> DataVec;

void ModifyThreadFunc()
{
    // Here the vector is changed, which includes adding and erasing elements
    ...
}

void ReadThreadFunc()
{
    auto temp = DataVec; // Will this crash?
    for (auto& data : temp)
    {
        // Do stuff with the data
        ...
    }

    // This definitely can crash
    /*for (auto& data : DataVec)
    {
        // Do stuff with the data
        ...
    }*/
}
The basic exception-safety guarantee for vector::operator= is:
"if an exception is thrown, the container is in a valid state."
What types of exceptions are possible here?
EDIT:
I solved this using double buffering, and posted my answer below.
As has been pointed out by the other answers, what you ask for is not doable. If you have concurrent access, you need synchronization, end of story.
That being said, it is not unusual to have requirements like yours where synchronization is not an option. In that case, what you can still do is get rid of the concurrent access. For example, you mentioned that the data is accessed once per frame in a game-loop like execution. Is it strictly required that you get the data from the current frame or could it also be the data from the last frame?
In that case, you could work with two vectors, one that is being written to by the producer thread and one that is being read by all the consumer threads. At the end of the frame, you simply swap the two vectors. Now you no longer need *(1) fine-grained synchronization for the data access, since there is no concurrent data access any more.
This is just one example how to do this. If you need to get rid of locking, start thinking about how to organize data access so that you avoid getting into the situation where you need synchronization in the first place.
*(1): Strictly speaking, you still need a synchronization point that ensures that when you perform the swapping, all the writer and reader threads have finished working. But this is far easier to do (usually you have such a synchronization point at the end of each frame anyway) and has a far lesser impact on performance than synchronizing on every access to the vector.
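A minimal sketch of that idea, reusing the Data type from the question (the function and variable names are illustrative):
#include <utility>
#include <vector>

std::vector<Data> writeVec;  // written by the modifying thread during a frame
std::vector<Data> readVec;   // read by the consumer thread(s) during a frame

// Called at the frame boundary, when no thread is touching either vector.
void swapBuffers()
{
    std::swap(writeVec, readVec);  // O(1): only the internal pointers are exchanged
}
Whether the producer then rebuilds writeVec from scratch or copies readVec forward depends on how the data is produced each frame.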
My question is, can this way also crash?
Yes, you still have a data race. If thread A modifies the vector while thread B is creating a copy, all iterators to the vector are invalidated.
What types of exceptions are possible here?
std::vector::operator=(const vector&) will throw on memory allocation failure, or if the contained elements throw on copy. The same thing applies to copy construction, which is what the line in your code marked "Will this crash?" is actually doing.
The fundamental problem here is that std::vector is not thread-safe. You have to either protect it with a lock/mutex, or replace it with a thread-safe container (such as the lock-free containers in Boost.Lockfree or libcds).
I have a vector that is modified in one thread, and I need to use its contents in another. Locking between these threads is unacceptable due to performance requirements.
this is an impossible requirement to meet.
In any case, sharing data between two threads will require some kind of locking, be it explicit or provided by the implementation (ultimately by the hardware). You should examine your actual requirements again: it may be unacceptable to suspend one thread until the other one ends, but you could lock short sequences of instructions. And/or you could use a different architecture. For example, erasing an item in a vector is a costly operation (linear time, because you have to move all the data above the removed item), while marking it as invalid is much quicker (constant time, because it is a single write). If you really have to erase in the middle of a vector, maybe a list would be more appropriate.
But if you can put a lock around the copy of the vector in ReadThreadFunc and around any vector modification in ModifyThreadFunc, it could be enough. To give priority to the modifying thread, you could just try to lock in the reading thread and immediately give up if you cannot.
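A minimal sketch of that "try and give up" idea, reusing the names from the question (vecMutex is hypothetical):
#include <mutex>

std::mutex vecMutex;

void ReadThreadFunc()
{
    std::unique_lock<std::mutex> lock(vecMutex, std::try_to_lock);
    if (!lock.owns_lock())
        return;                 // the modifying thread holds the lock: give up immediately
    auto temp = DataVec;        // copy under the lock
    lock.unlock();
    // ... iterate over temp without holding the lock
}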
Maybe you should rethink your design!
Each thread should have its own vector (list, queue, whatever fits your needs) to work on. Thread A can then do some work and pass the result to thread B. You simply have to lock when writing the data from thread A into thread B's queue.
Without some kind of locking it's not possible.
So I solved this using double buffering, which guarantees no crashing, and the reading thread will always have usable data, even if it might not be correct:
struct Data
{
    int A;
    double B;
    bool C;
};

const int MAXSIZE = 100;
Data Buffer[MAXSIZE];

std::vector<Data> DataVec;

void ModifyThreadFunc()
{
    // Here the vector is changed, which includes adding and erasing elements
    ...

    // Copy from the vector to the buffer (this assumes DataVec.size() <= MAXSIZE)
    size_t numElements = DataVec.size();
    memcpy(Buffer, DataVec.data(), sizeof(Data) * numElements);
    memset(&Buffer[numElements], 0, sizeof(Data) * (MAXSIZE - numElements));
}

void ReadThreadFunc()
{
    Data* p = Buffer;
    for (int i = 0; i < MAXSIZE; ++i)
    {
        // Use the data
        ...
        ++p;
    }
}

fastest way to convert a std::vector to another std::vector

What is the fastest way (if there is any other) to convert a std::vector from one datatype to another (with the idea to save space)? For example:
std::vector<unsigned short> ----> std::vector<bool>
we obviously assume that the first vector only contains 0s and 1s. Copying element by element is highly inefficient in case of a really large vector.
Conditional question:
If you think there is no way to do it faster, is there a complex datatype which actually allows fast conversion from one datatype to another?
std::vector<bool>
Stop.
A std::vector<bool> is... not. std::vector has a specialization for the use of the type bool, which causes certain changes in the vector. Namely, it stops acting like a std::vector.
There are certain things that the standard guarantees you can do with a std::vector. And vector<bool> violates those guarantees. So you should be very careful about using them.
Anyway, I'm going to pretend you said vector<int> instead of vector<bool>, as the latter really complicates things.
Copying element by element is highly inefficient in case of a really large vector.
Only if you do it wrong.
Converting a vector to the type you want needs to be done carefully to be efficient.
If the source T type is convertible to the destination T, then this works just fine:
vector<Tnew> vec_new(vec_old.begin(), vec_old.end());
Decent implementations should recognize when they've been given random-access iterators and optimize the memory allocation and loop appropriately.
For non-convertible simple types, the biggest pitfall is doing this:
std::vector<int> newVec(oldVec.size());
That's bad. That will allocate a buffer of the proper size, but it will also fill it with data, namely default-constructed ints (int()).
Instead, you should do this:
std::vector<int> newVec;
newVec.reserve(oldVec.size());
This reserves capacity equal to the original vector, but it also ensures that no default construction takes place. You can now push_back to your heart's content, knowing that you will never cause a reallocation in your new vector.
From there, you can just loop over each entry in the old vector, doing the conversion as needed.
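For instance, a minimal sketch of that final loop, assuming the old vector holds unsigned shorts and we are converting to int as above:
std::vector<int> newVec;
newVec.reserve(oldVec.size());              // capacity only, no default construction
for (unsigned short s : oldVec)
    newVec.push_back(static_cast<int>(s));  // convert each element as it is appended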
There's no way to avoid the copy, since a std::vector<T> is a distinct type from std::vector<U>, and there's no way for them to share the memory. Other than that, it depends on how the data is mapped. If the mapping corresponds to an implicit conversion (e.g. unsigned short to bool), then simply creating a new vector using the begin and end iterators from the old one will do the trick:
std::vector<bool> newV( oldV.begin(), oldV.end() );
If the mapping isn't just an implicit conversion (and this includes cases where you want to verify things; e.g. that the unsigned short does contain only 0 or 1), then it gets more complicated. The obvious solution would be to use std::transform:
std::vector<TargetType> newV;
newV.reserve( oldV.size() ); // avoids unnecessary reallocations
std::transform( oldV.begin(), oldV.end(),
                std::back_inserter( newV ),
                TransformationObject() );
where TransformationObject is a functional object which does the transformation, e.g.:
struct ToBool
{
    bool operator()( unsigned short original ) const
    {
        if ( original != 0 && original != 1 )
            throw Something();
        return original != 0;
    }
};
(Note that I'm just using this transformation function as an example. If the only thing which distinguishes the transformation function from an implicit conversion is the verification, it might be faster to verify all of the values in oldV first, using std::for_each, and then use the two-iterator constructor above.)
Depending on the cost of default constructing the target type, it may be faster to create the new vector with the correct size, then overwrite it:
std::vector<TargetType> newV( oldV.size() );
std::transform( oldV.begin(), oldV.end(),
                newV.begin(),
                TransformationObject() );
Finally, another possibility would be to use a boost::transform_iterator. Something like:
std::vector<TargetType> newV(
    boost::make_transform_iterator( oldV.begin(), TransformationObject() ),
    boost::make_transform_iterator( oldV.end(), TransformationObject() ) );
In many ways, this is the solution I prefer; depending on how boost::transform_iterator has been implemented, it could also be the fastest.
You should be able to use assign like this:
vector<unsigned short> v;
//...
vector<bool> u;
//...
u.assign(v.begin(), v.end());
class A { ... };
class B { ... };

B convert_A_to_B(const A& a) { ... }

void convertVector_A_to_B(const vector<A>& va, vector<B>& vb)
{
    vb.clear();
    vb.reserve(va.size());
    std::transform(va.begin(), va.end(), std::back_inserter(vb), convert_A_to_B);
}
The fastest way to do it is to not do it. For example, if you know in advance that your items only need a byte for storage, use a byte-size vector to begin with. You'll find it difficult to find a faster way than that :-)
If that's not possible, then just absorb the cost of the conversion. Even if it's a little slow (and that's by no means certain, see Nicol's excellent answer for details), it's still necessary. If it wasn't, you would just leave it in the larger-type vector.
First, a warning: Don't do what I'm about to suggest. It's dangerous and must never be done. That said, if you just have to squeeze out a tiny bit more performance No Matter What...
First, there are some caveats. If you don't meet these, you can't do this:
The vector must contain plain-old-data. If your type has pointers, or uses a destructor, or needs an operator = to copy correctly ... do not do this.
The sizeof() of both vectors' contained types must be the same. That is, vector< A > can copy from vector< B > only if sizeof(A) == sizeof(B).
Here is a fairly stable method:
vector< A > a;
vector< B > b;

a.resize( b.size() );
assert( sizeof(vector< A >::value_type) == sizeof(vector< B >::value_type) );

if( b.size() == 0 )
    a.clear();
else
    memcpy( &(*a.begin()), &(*b.begin()), b.size() * sizeof(B) );
This does a very fast, block copy of the memory contained in vector b, directly smashing whatever data you have in vector a. It doesn't call constructors, it doesn't do any safety checking, and it's much faster than any of the other methods given here. An optimizing compiler should be able to match the speed of this in theory, but unless you're using an unusually good one, it won't (I checked with Visual C++ a few years ago, and it wasn't even close).
Also, given these constraints, you could forcibly (via void *) cast one vector type to the other and swap them -- I had a code sample for that, but it started oozing ectoplasm on my screen, so I deleted it.
Copying element by element is not highly inefficient. std::vector provides constant access time to any of its elements, hence the operation will be O(n) overall. You will not notice it.
#ifdef VECTOR_H_TYPE1
#ifdef VECTOR_H_TYPE2
#ifdef VECTOR_H_CLASS
/* Other methods can be added as needed, provided they likewise carry out the same operations on both */
#include <vector>
using namespace std;
class VECTOR_H_CLASS {
public:
    vector<VECTOR_H_TYPE1> *firstVec;
    vector<VECTOR_H_TYPE2> *secondVec;

    VECTOR_H_CLASS(vector<VECTOR_H_TYPE1> &v1, vector<VECTOR_H_TYPE2> &v2) { firstVec = &v1; secondVec = &v2; }
    ~VECTOR_H_CLASS() {}

    void init() { // Use this to copy a full vector into an empty (or garbage) vector to equalize them
        secondVec->clear();
        for (vector<VECTOR_H_TYPE1>::iterator it = firstVec->begin(); it != firstVec->end(); it++)
            secondVec->push_back((VECTOR_H_TYPE2)*it);
    }

    void push_back(void *value) {
        firstVec->push_back((VECTOR_H_TYPE1)value);
        secondVec->push_back((VECTOR_H_TYPE2)value);
    }

    void pop_back() {
        firstVec->pop_back();
        secondVec->pop_back();
    }

    void clear() {
        firstVec->clear();
        secondVec->clear();
    }
};
#undef VECTOR_H_CLASS
#endif
#undef VECTOR_H_TYPE2
#endif
#undef VECTOR_H_TYPE1
#endif