Order a list depending on other list - c++

Given:
struct Object {
int id;
...
};
list<Object> objectList;
list<int> idList;
What is the best way to order objectList depending on order of idList?
Example (pseudo code):
INPUT
objectList = {o1, o2, o3};
idList = {2, 3, 1};
ACTION
sort(objectList, idList);
OUTPUT
objectList = {o2, o3, o1};
I searched the documentation, but I only found methods that order elements by comparing them with each other.

You can store the objects in a std::map, with id as key. Then traverse idList and look up each object in the map by its id.
std::map<int, Object> objectMap;
for (auto itr = objectList.begin(); itr != objectList.end(); itr++)
{
objectMap.insert(std::make_pair(itr->id, *itr));
}
std::list<Object> newObjectList;
for (auto itr = idList.begin(); itr != idList.end(); itr++)
{
// note: if idList contains an id that does not appear in objectList, operator[] silently inserts a default-constructed Object
newObjectList.push_back(objectMap[*itr]);
}
// now newObjectList is sorted as order in idList
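The lookup idea above can also be applied without copying any Object. The following is a minimal sketch, not part of the original answer: it assumes ids are unique within objectList, and the function name reorder_by_ids is made up for illustration. It stores list iterators in a std::unordered_map and relinks the nodes with std::list::splice.

```cpp
#include <cassert>
#include <list>
#include <unordered_map>

struct Object {
    int id;
};

// Reorders objectList in place so that its elements follow the order of idList.
// Assumption: ids are unique within objectList. Ids in idList without a matching
// object are skipped; objects whose id is not listed stay at the front.
void reorder_by_ids(std::list<Object>& objectList, const std::list<int>& idList) {
    // Map each id to the list node that holds it (O(n) on average).
    std::unordered_map<int, std::list<Object>::iterator> position;
    for (auto it = objectList.begin(); it != objectList.end(); ++it) {
        position.emplace(it->id, it);
    }
    // Move each requested node to the back, in idList order. splice only
    // relinks the node, so no Object is copied and stored iterators stay valid.
    for (int id : idList) {
        auto found = position.find(id);
        if (found != position.end()) {
            objectList.splice(objectList.end(), objectList, found->second);
        }
    }
}
```

With objectList = {o1, o2, o3} and idList = {2, 3, 1}, the list ends up as {o2, o3, o1}.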

Here is another variant, which works in O(n log n). This is asymptotically optimal.
#include <list>
#include <vector>
#include <algorithm>
#include <iostream>
#include <cassert>
int main() {
struct O {
int id;
};
std::list<O> object_list{{1}, {2}, {3}, {4}};
std::list<int> index_list{4, 2, 3, 1};
assert(object_list.size() == index_list.size());
// this vector is optional. It is needed if sizeof(O) is quite large.
std::vector<std::pair<int, O*>> tmp_vector(object_list.size());
// this is O(n)
std::transform(begin(object_list), end(object_list), begin(tmp_vector),
[](auto& o) { return std::make_pair(o.id, &o); });
// this is O(n log n)
std::sort(begin(tmp_vector), end(tmp_vector),
[](const auto& o1, const auto& o2) {
return o1.first < o2.first;
});
// at this point, tmp_vector holds pairs in increasing index order.
// Note that this may not be a contiguous list.
std::list<O> tmp_list(object_list.size());
// this is again O(n log n), because each lower_bound call is O(log n)
// we then insert the objects into a new list (you may also use some
// move semantics here).
std::transform(begin(index_list), end(index_list), begin(tmp_list),
[&tmp_vector](const auto& i) {
return *std::lower_bound(begin(tmp_vector), end(tmp_vector),
std::make_pair(i, nullptr),
[](const auto& o1, const auto& o2) {
return o1.first < o2.first;
})->second;
});
// As we just created a new list, we swap the new list with the old one.
std::swap(object_list, tmp_list);
for (const auto& o : object_list)
std::cout << o.id << std::endl;
}
I assumed that O is quite large and not easily movable. Therefore I first create tmp_vector, which contains only (id, pointer) pairs, and then sort this vector.
Afterwards I can simply go through index_list and find the matching entries using binary search.
Let me elaborate on why a map is not the best solution even though it gives you quite a small piece of code. If you use a map, the tree has to be rebalanced after each insertion. This doesn't cost anything asymptotically (n rebalancing insertions cost the same as sorting once), but the constant is much larger. A map that never changes after construction does not make much sense (except that accessing it may be easier).
I then timed the "simple" map-approach against my "not-so-simple" vector-approach. I created a randomly sorted index_list with N entries. And this is what I get (in us):
N         map      vector
1000      90       75
10000     1400     940
100000    24500    15000
1000000   660000   250000
NOTE: This test shows the worst case for the map, because in my case only index_list was randomly ordered, while object_list (which is inserted into the map in order) is sorted. So rebalancing shows its full effect. If object_list is in more or less random order, the two approaches will perform more similarly, even though the map will still be slower. The vector approach will even behave better when the object list is completely random.
So the difference is already quite large with 1000 entries, and I would strongly vote for a vector-based approach.

Assuming the data is handed to you externally and you don't have a choice of containers:
assert( objectList.size() == idList.size() );
std::vector<std::pair<int,Object>> wrapper( idList.size() );
auto idList_it = std::begin( idList );
auto objectList_it = std::begin( objectList );
for( auto& e: wrapper )
e = std::make_pair( *idList_it++, *objectList_it++ );
std::sort(
std::begin(wrapper),
std::end(wrapper),
[]
(const std::pair<int,Object>& a, const std::pair<int,Object>& b) -> bool
{ return a.first<b.first; }
);
Then, copy back to original container.
{
auto objectList_it = std::begin( objectList );
for( const auto& e: wrapper )
*objectList_it++ = e.second; // copy back only the Object, not the (id, Object) pair
}
But this solution is not optimal; I'm sure somebody will come up with a better one.
Edit: The default comparison operator for pairs requires that it is defined for both the first and second members. Thus the easiest way is to provide a lambda.
Edit2: this doesn't build if you use a std::list for the wrapper, because std::sort requires random-access iterators. It is fine with a std::vector.

std::list has a sort member function you can use with a custom comparison functor.
That custom functor has to look up an object's id in the idList and can then use std::distance to calculate the position of the element in idList. It does so for both objects to be compared and returns true if the first position is smaller than the second.
Here is an example:
#include <iostream>
#include <list>
#include <algorithm>
#include <stdexcept>
struct Object
{
int id;
};
int main()
{
Object o1 = { 1 };
Object o2 = { 2 };
Object o3 = { 3 };
std::list<Object> objectList = { o1, o2, o3 };
std::list<int> const idList = { 2, 3, 1 };
objectList.sort([&](Object const& first, Object const& second)
{
auto const id_find_iter1 = std::find(begin(idList), end(idList), first.id);
auto const id_find_iter2 = std::find(begin(idList), end(idList), second.id);
if (id_find_iter1 == end(idList) || id_find_iter2 == end(idList))
{
throw std::runtime_error("ID not found");
}
auto const pos1 = std::distance(begin(idList), id_find_iter1);
auto const pos2 = std::distance(begin(idList), id_find_iter2);
return pos1 < pos2;
});
for (auto const& object : objectList)
{
std::cout << object.id << '\n';
}
}
It's probably not terribly efficient, but chances are you will never notice. If it still bothers you, you might want to look for a solution with std::vector, which unlike std::list provides random-access iterators. That turns std::distance from O(n) to O(1).
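If the repeated std::find calls do become a problem, one option is to compute each id's position once and let the comparator do constant-time lookups. This is only a sketch under the same assumptions as the example above; the function name sort_by_id_order and the rank map are introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <stdexcept>
#include <unordered_map>

struct Object {
    int id;
};

// Sorts objectList so that it follows the order of idList.
// Throws std::runtime_error if an object's id is missing from idList.
void sort_by_id_order(std::list<Object>& objectList, const std::list<int>& idList) {
    // rank[id] = position of id in idList, built once in O(n).
    std::unordered_map<int, std::size_t> rank;
    std::size_t next = 0;
    for (int id : idList) {
        rank.emplace(id, next++);  // keeps the first position if an id repeats
    }
    // Validate up front so that the comparator never throws in mid-sort.
    for (const Object& object : objectList) {
        if (rank.find(object.id) == rank.end()) {
            throw std::runtime_error("ID not found");
        }
    }
    // O(n log n) comparisons, each with average O(1) lookups.
    objectList.sort([&rank](Object const& first, Object const& second) {
        return rank.at(first.id) < rank.at(second.id);
    });
}
```

Because std::list::sort is stable, objects sharing an id keep their relative order.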

I would find it strange to end up in this situation, as I would use pointers instead of ids. Still, there might be use cases for this.
Note that in all examples below, I assume that the ids list contains every id exactly once.
Writing it yourself
The problem you want to solve is creating/sorting a list of objects based on the order of the ids in another list.
The naive way of doing this is simply to write it yourself:
void sortByIdVector(std::list<Object> &list, const std::list<int> &ids)
{
auto oldList = std::move(list);
list = std::list<Object>{};
for (auto id : ids)
{
auto itElement = std::find_if(oldList.begin(), oldList.end(), [id](const Object &obj) { return id == obj.id; });
list.emplace_back(std::move(*itElement));
oldList.erase(itElement);
}
}
If you use a sorted vector as input, you can optimize this code to get the best performance out of it. I'm leaving it up to you to do so.
Using sort
For this implementation, I'm going to assume the ids are in a std::vector instead of a std::list, as this is the better container for requesting the index of an element. (With some more code, you can do the same for a list.)
size_t getIntendedIndex(const std::vector<int> &ids, const Object &obj)
{
auto itElement = std::find_if(ids.begin(), ids.end(), [&obj](int id) { return id == obj.id; }); // capture by reference to avoid copying obj
return itElement - ids.begin();
}
void sortByIdVector(std::list<Object> &list, const std::vector<int> &ids)
{
list.sort([&ids](const Object &lhs, const Object &rhs){ return getIntendedIndex(ids, lhs) < getIntendedIndex(ids, rhs); });
}
Insertion
Another approach, also more suitable for std::vector, is to simply move each element to its final position, which will be faster than sorting.
void sortByIdVector(std::vector<Object> &list, const std::vector<int> &ids)
{
auto oldList = std::move(list);
list = std::vector<Object>{};
list.resize(oldList.size());
for (Object &obj : oldList)
{
auto &newLocation = list[getIntendedIndex(ids, obj)];
newLocation = std::move(obj);
}
}
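The placement above still calls getIntendedIndex, which is a linear search per object. As a hedged sketch (the name place_by_id_order and the index_of map are not from the answer), the positions can be looked up in a std::unordered_map built once, which makes the whole placement linear on average. It keeps the assumption that ids contains every object's id exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

struct Object {
    int id;
};

// Returns the objects arranged in the order given by ids.
// Assumption: ids holds every object's id exactly once.
// Throws std::out_of_range (from at) if an object's id is not in ids.
std::vector<Object> place_by_id_order(std::vector<Object> objects,
                                      const std::vector<int>& ids) {
    assert(objects.size() == ids.size());
    // index_of[id] = target position, built in O(n) on average.
    std::unordered_map<int, std::size_t> index_of;
    index_of.reserve(ids.size());
    for (std::size_t i = 0; i < ids.size(); ++i) {
        index_of.emplace(ids[i], i);
    }
    // Move every object straight to its final slot.
    std::vector<Object> result(objects.size());
    for (Object& obj : objects) {
        result[index_of.at(obj.id)] = std::move(obj);
    }
    return result;
}
```

Like the version above, this needs Object to be default constructible because the result vector is sized up front.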

objectList.sort([&idList] (const Object& o1, const Object& o2) -> bool
{ return std::find(++std::find(idList.begin(), idList.end(), o1.id),
idList.end(), o2.id)
!= idList.end();
});
The idea is to check if we find o1.id before o2.id in the idList.
We search o1.id, increment the found position then we search o2.id: if found, that implies o1 < o2.
Test
#include <iostream>
#include <string>
#include <list>
#include <algorithm>
struct Object {
int id;
std::string name;
};
int main()
{
std::list<Object> objectList {{1, "one_1"}, {2, "two_1"}, {3, "three_1"}, {2, "two_2"}, {1, "one_2"}, {4, "four_1"}, {3, "Three_2"}, {4, "four_2"}};
std::list<int> idList {3, 2, 4, 1};
objectList.sort([&idList] (const Object& o1, const Object& o2) -> bool
{ return std::find(++std::find(idList.begin(), idList.end(), o1.id), idList.end(), o2.id) != idList.end(); });
for(const auto& o: objectList) std::cout << o.id << " " << o.name << "\n";
}
/* OUTPUT:
3 three_1
3 Three_2
2 two_1
2 two_2
4 four_1
4 four_2
1 one_1
1 one_2
*/

Related

If find_if() takes too long, are there alternatives that can be used for better program performance?

I'm working on a D* Lite path planner in C++. The program maintains a priority queue of cells (U). Each cell has two cost values, and a key can be calculated for a cell, which determines its order in the priority queue.
using Cost = float;
using HeapKey = pair<Cost, Cost>;
using KeyCompare = std::greater<std::pair<HeapKey, unsigned int>>;
vector<pair<HeapKey, unsigned int>> U;
When a cell is added it is done so by using:
U.push_back({ k, id });
push_heap(U.begin(), U.end(), KeyCompare());
As part of the path planning algorithm cells sometimes need to be removed, and here lies the current problem as far as I can see. I recently had help on this site to speed my program up quite a bit by using push_heap instead of make_heap, but now it seems that the part of the program that removes cells is the slowest part. Cells are removed from the priority queue by:
void DstarPlanner::updateVertex(unsigned int id) {
...
...
auto it = find_if(U.begin(), U.end(), [=](auto p) { return p.second == id; });
U.erase(it);
...
...
}
From my tests this seems to take roughly 80% of the time my program uses for path planning. I came here hoping that a more time-saving method exists.
Thank you.
EDIT - Extra information.
void DstarPlanner::insertHeap(unsigned int id, HeapKey k) {
U.push_back({ k, id });
push_heap(U.begin(), U.end(), KeyCompare());
in_U[id]++;
}
void DstarPlanner::updateVertex(unsigned int id) {
Cell* u = graph.getCell(id);
if (u->id != id_goal) {
Cost mincost = infinity;
for (auto s : u->neighbors) {
mincost = min(mincost, graph.getEdgeCost(u->id, s->id) + s->g);
}
u->rhs = mincost;
}
if (in_U[id]) {
auto it = find_if(U.begin(), U.end(), [=](auto p) { return p.second == id; });
U.erase(it);
in_U[id]--;
}
if (u->g != u->rhs) {
insertHeap(id, u->calculateKey());
}
}
vector<int> DstarPlanner::ComputeShortestPath() {
vector<int> bestPath;
vector<int> emptyPath;
Cell* n = graph.getCell(id_start);
while (U.front().first < n->calculateKey() || n->rhs != n->g) {
auto uid = U.front().second;
Cell* u = graph.getCell(uid);
auto kold = U.front().first;
pop_heap(U.begin(), U.end(), KeyCompare());
U.pop_back();
in_U[u->id]--;
if (kold < u->calculateKey()) {
insertHeap(u->id, u->calculateKey());
} else if (u->g > u->rhs) {
u->g = u->rhs;
for (auto s : u->neighbors) {
if (!occupied(s->id)) {
updateVertex(s->id);
}
}
} else {
u->g = infinity;
for (auto s : u->neighbors) {
if (!occupied(s->id)) {
updateVertex(s->id);
}
}
updateVertex(u->id);
}
}
bestPath=constructPath();
return bestPath;
}
find_if does a linear search. It may be faster to use:
std::map/std::set -> Standard binary search tree implementations
std::unordered_map/std::unordered_set -> Standard hash table implementations
These may use a lot of memory if your elements (key-value pairs) are small integers. To avoid that you can use 3rd party alternatives like boost::unordered_flat_map.
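To make the ordered-container suggestion concrete, here is a minimal sketch of a queue built on std::set with an id-to-key index, so that removal by id costs O(log n) instead of a linear find_if. The class name IndexedQueue and its interface are assumptions made for this illustration, not part of the planner in the question. It keeps at most one entry per id and does not handle NaN keys.

```cpp
#include <cassert>
#include <set>
#include <unordered_map>
#include <utility>

using Cost = float;
using HeapKey = std::pair<Cost, Cost>;

// Hypothetical replacement for the vector-based heap U: an ordered set of
// (key, id) entries plus an id -> key index for fast removal by id.
class IndexedQueue {
public:
    void insert(unsigned int id, HeapKey k) {
        remove(id);  // keep at most one entry per id
        entries_.emplace(k, id);
        key_of_[id] = k;
    }
    // Removes the entry for id, if any. O(log n).
    bool remove(unsigned int id) {
        auto it = key_of_.find(id);
        if (it == key_of_.end()) return false;
        entries_.erase(std::make_pair(it->second, id));
        key_of_.erase(it);
        return true;
    }
    bool empty() const { return entries_.empty(); }
    // Smallest key first, like the std::greater-based min-heap.
    std::pair<HeapKey, unsigned int> top() const {
        assert(!entries_.empty());
        return *entries_.begin();
    }
    void pop() {
        assert(!entries_.empty());
        auto first = entries_.begin();
        key_of_.erase(first->second);
        entries_.erase(first);
    }

private:
    std::set<std::pair<HeapKey, unsigned int>> entries_;
    std::unordered_map<unsigned int, HeapKey> key_of_;
};
```

The trade-off is one node allocation per entry, so it is worth measuring against the vector-based heap before switching.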
How do you re-heapify after U.erase(it)? Do you ever delete multiple nodes at once?
If deletions need to be atomic between searches, then you can
swap it with end() - 1,
erase end() - 1, and
re-heapify.
Erasing end() - 1 is O(1) while erasing it is linear in std::distance(it, end).
void DstarPlanner::updateVertex(unsigned int id) {
...
// take the id by reference since this is synchronous
auto it = find_if(U.begin(), U.end(), [&](const auto& p) { return p.second == id; });
*it = std::move(*(U.end() - 1));
U.erase((U.end() - 1));
std::make_heap(U.begin(), U.end(), KeyCompare()); // expensive!!! at most 3*distance(begin, end) comparisons; must use the same comparator as push_heap
...
}
If you can delete multiple nodes between searches, then you can use a combination of erase + remove_if to perform only one mass re-heapify. This is important because heapify is expensive.
it = remove_if(begin, end, lambda)
erase(it, end)
re-heapify
void DstarPlanner::updateVertex(const std::vector<unsigned int>& sorted_ids) {
...
auto it = remove_if(U.begin(), U.end(), [&](const auto& p) { return std::binary_search(sorted_ids.begin(), sorted_ids.end(), p.second); });
U.erase(it, U.end());
std::make_heap(U.begin(), U.end(), KeyCompare()); // expensive!!! at most 3*distance(begin, end) comparisons
...
}
Doing better
You can possibly improve on this by replacing std::make_heap (which makes no assumptions about the heapiness of [begin(), end())) with a custom method that re-heapifies a former heap around "poison points": it only needs to initially inspect the elements around the elements that were swapped. This sounds like a pain to write and I'd only do it if the resulting program was still too slow.
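For a single removal, such a local repair could look roughly like the sketch below. It is not taken from the answer: heap_erase_at is a made-up helper that overwrites the removed slot with the last element and then sifts that element up or down, which takes O(log n) once the index is known. It relies on the implicit binary-tree layout that std::push_heap and std::pop_heap maintain (parent of i at (i - 1) / 2).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>  // std::greater, for callers that keep a min-heap
#include <utility>
#include <vector>

// Removes heap[i] from a vector that is a heap with respect to comp.
template <class T, class Compare>
void heap_erase_at(std::vector<T>& heap, std::size_t i, Compare comp) {
    assert(i < heap.size());
    const std::size_t last = heap.size() - 1;
    if (i != last) {
        heap[i] = std::move(heap[last]);
    }
    heap.pop_back();
    if (i >= heap.size()) return;  // the last slot was removed, nothing to fix

    // Sift up while the parent orders before the moved element.
    std::size_t pos = i;
    while (pos > 0) {
        const std::size_t parent = (pos - 1) / 2;
        if (!comp(heap[parent], heap[pos])) break;
        std::swap(heap[parent], heap[pos]);
        pos = parent;
    }
    if (pos != i) return;  // it moved up, so the subtree below i is still valid

    // Otherwise sift down: swap with the child that belongs nearer the top.
    const std::size_t n = heap.size();
    for (;;) {
        std::size_t best = pos;
        const std::size_t left = 2 * pos + 1;
        const std::size_t right = left + 1;
        if (left < n && comp(heap[best], heap[left])) best = left;
        if (right < n && comp(heap[best], heap[right])) best = right;
        if (best == pos) break;
        std::swap(heap[pos], heap[best]);
        pos = best;
    }
}
```

Finding the index is still a linear search unless the program also tracks each id's position in the heap.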
Have you thought of...
Just not even removing elements from the heap? The fact you're using a heap tells me that the algorithm designers suggested a heap. If they suggested a heap, then they likely didn't envision random removals. This is speculation on my part. I'm otherwise not familiar with D* lite.
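One common way to avoid removals altogether is lazy deletion: leave outdated entries in the heap and skip them when they reach the top. The sketch below is an illustration under assumptions, not something taken from D* Lite itself. LazyQueue and its per-id stamp counter are hypothetical names; an entry counts as live only if its stamp matches the latest stamp recorded for its id.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <tuple>
#include <unordered_map>
#include <utility>
#include <vector>

using Cost = float;
using HeapKey = std::pair<Cost, Cost>;

class LazyQueue {
    using Entry = std::tuple<HeapKey, unsigned int, unsigned int>;  // key, id, stamp

public:
    void insert(unsigned int id, HeapKey k) {
        const unsigned int stamp = ++stamp_of_[id];  // older entries for id become stale
        heap_.emplace(k, id, stamp);
    }
    // O(1): only bumps the stamp, the heap itself is not touched.
    void remove(unsigned int id) { ++stamp_of_[id]; }

    // Pops the smallest live entry into out; returns false if none is left.
    bool pop(std::pair<HeapKey, unsigned int>& out) {
        while (!heap_.empty()) {
            const Entry e = heap_.top();
            heap_.pop();
            if (std::get<2>(e) == stamp_of_[std::get<1>(e)]) {
                out = std::make_pair(std::get<0>(e), std::get<1>(e));
                return true;
            }
            // Stale entry: it was removed or superseded, so drop it.
        }
        return false;
    }

private:
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap_;
    std::unordered_map<unsigned int, unsigned int> stamp_of_;
};
```

The cost is that stale entries stay in memory until they are popped, so the heap can grow larger than the number of live cells.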

How to remove non contiguous elements from a vector in c++

I have a vector std::vector<inputInfo> inputList and another vector std::vector<int> selection.
inputInfo is a struct that has some information stored.
The vector selection corresponds to positions inside inputList vector.
I need to remove elements from inputList which correspond to entries in the selection vector.
Here's my attempt on this removal algorithm.
Assuming the selection vector is sorted and using some (unavoidable ?) pointer arithmetic, this can be done in one line:
template <class T>
inline void erase_selected(std::vector<T>& v, const std::vector<int>& selection)
{
v.resize(std::distance(
v.begin(),
std::stable_partition(v.begin(), v.end(),
[&selection, &v](const T& item) {
return !std::binary_search(
selection.begin(), selection.end(),
static_cast<int>(static_cast<const T*>(&item) - &v[0]));
})));
}
This is based on an idea of Sean Parent (see this C++ Seasoning video) to use std::stable_partition ("stable" keeps elements sorted in the output array) to move all selected items to the end of an array.
The line with pointer arithmetic
static_cast<int>(static_cast<const T*>(&item) - &v[0])
can, in principle, be replaced with STL algorithms and index-free expression
std::distance(std::begin(v), std::find(v.begin(), v.end(), item))
but this way we have to spend O(n) in std::find.
The shortest way to remove non-contiguous elements:
template <class T> void erase_selected(std::vector<T>& v, const std::vector<int>& selection)
{
std::vector<int> sorted_sel = selection;
std::sort(sorted_sel.begin(), sorted_sel.end());
// 1) Define checker lambda
// This assumes 'filter' is called exactly once for every element
// and that the calls follow the original order of the array
// (the standard does not guarantee the call order).
// We manually keep track of the index of the item being filtered
// so that we can look this index up in the 'sorted_sel' array
int itemIndex = 0;
auto filter = [&itemIndex, &sorted_sel](const T& item) {
return !std::binary_search(
sorted_sel.begin(),
sorted_sel.end(),
itemIndex++);
};
// 2) Move all 'not-selected' items to the front; the selected ones end up at the end
auto end_of_selected = std::stable_partition(
v.begin(),
v.end(),
filter);
// 3) Cut off the end of the std::vector
v.resize(std::distance(v.begin(), end_of_selected));
}
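A plain loop avoids depending on how the algorithm calls the predicate. The following is a sketch rather than the answer's method: erase_at_indices is a hypothetical name, and it walks the vector once with a read and a write position while stepping through the sorted selection, for O(n + k log k) overall.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Erases the elements of v whose positions are listed in selection.
// selection is taken by value so it can be sorted and de-duplicated here;
// indices beyond v.size() are ignored.
template <class T>
void erase_at_indices(std::vector<T>& v, std::vector<std::size_t> selection) {
    std::sort(selection.begin(), selection.end());
    selection.erase(std::unique(selection.begin(), selection.end()), selection.end());

    std::size_t write = 0;     // next slot for an element that is kept
    std::size_t next_sel = 0;  // next entry of selection still to be matched
    for (std::size_t read = 0; read < v.size(); ++read) {
        if (next_sel < selection.size() && selection[next_sel] == read) {
            ++next_sel;  // selected for removal: skip it
            continue;
        }
        if (write != read) v[write] = std::move(v[read]);
        ++write;
    }
    // Drop the leftover tail; erase does not need T to be default constructible.
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(write), v.end());
}
```

For the data { 0, 0, 3, 0, 2, 4, 5, 0, 7 } and the selection { 0, 1, 3, 7 }, the elements 3 2 4 5 7 remain.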
Original code & test
If for some reason the code above does not work due to strangely behaving std::stable_partition(), then below is a workaround (wrapping the input array values with selected flags).
I do not assume that inputInfo structure contains the selected flag, so I wrap all the items in the T_withFlag structure which keeps pointers to original items.
#include <algorithm>
#include <iostream>
#include <vector>
template <class T>
std::vector<T> erase_selected(const std::vector<T>& v, const std::vector<int>& selection)
{
std::vector<int> sorted_sel = selection;
std::sort(sorted_sel.begin(), sorted_sel.end());
// Packed (data+flag) array
struct T_withFlag
{
T_withFlag(const T* ref = nullptr, bool sel = false): src(ref), selected(sel) {}
const T* src;
bool selected;
};
std::vector<T_withFlag> v_with_flags;
// should be like
// { {0, true}, {0, true}, {3, false},
// {0, true}, {2, false}, {4, false},
// {5, false}, {0, true}, {7, false} };
// for the input data in main()
v_with_flags.reserve(v.size());
// No "beautiful" way to iterate a vector
// and keep track of element index
// We need the index to check if it is selected
// The check takes O(log(n)), so the loop is O(n * log(n))
int itemIndex = 0;
for (auto& ii: v)
v_with_flags.emplace_back(
T_withFlag(&ii,
std::binary_search(
sorted_sel.begin(),
sorted_sel.end(),
itemIndex++)
));
// I. (The bulk of ) Removal algorithm
// a) Define checker lambda
auto filter = [](const T_withFlag& ii) { return !ii.selected; };
// b) Move every item marked as 'not-selected'
// to the front; selected items end up at the end of the array
auto end_of_selected = std::stable_partition(
v_with_flags.begin(),
v_with_flags.end(),
filter);
// c) Cut off the end of the std::vector
v_with_flags.resize(
std::distance(v_with_flags.begin(), end_of_selected));
// II. Output
std::vector<T> v_out(v_with_flags.size());
std::transform(
// since C++17 you can parallelize this
// with 'std::execution::par' as first parameter
v_with_flags.begin(),
v_with_flags.end(),
v_out.begin(),
[](const T_withFlag& ii) { return *(ii.src); });
return v_out;
}
The test function is
int main()
{
// Obviously, I do not know the structure
// used by the topic starter,
// so I just declare a small structure for a test
// The 'erase_selected' does not assume
// this structure to be 'light-weight'
struct inputInfo
{
int data;
inputInfo(int v = 0): data(v) {}
};
// Source selection indices
std::vector<int> selection { 0, 1, 3, 7 };
// Source data array
std::vector<inputInfo> v{ 0, 0, 3, 0, 2, 4, 5, 0, 7 };
// Output array
auto v_out = erase_selected(v, selection);
for (auto ii : v_out)
std::cout << ii.data << ' ';
std::cout << std::endl;
}

Finding and erasing a value from a std::vector holding std::map elements

First, I have the following two objects, both filled with data:
std::vector<std::map<std::uint8_t, std::uint8_t>> x1;
std::vector<std::map<std::uint8_t, std::uint8_t>> x2;
My objective is to search inside x2 (by the key), checking if any value from x1 doesn't exist inside x2, and then erase it from x1.
I tried with the following code snippet, but to no avail (it doesn't compile!):
for (auto i = x1.begin(); i != x1.end(); ++i)
{
auto it = std::find(x2.begin(), x2.end(), i);
if (it == x2.end())
{
x1.erase(i);
}
}
What am I doing wrong? Could you please share some insights on how to solve this problem?
There are several problems with your code:
std::find() searches for a single matching element, which in this case means you have to give it a std::map to search for. But you are passing in the i iterator itself, not the std::map that it refers to. You need to dereference i, eg:
auto it = std::find(x2.cbegin(), x2.cend(), *i);
When calling x1.erase(i), i becomes invalidated, which means the loop cannot use i anymore - not for ++i, not for i != x1.end(). You need to save the new iterator that erase() returns, which refers to the next element after the one being erased. Which means you also need to update your loop logic to NOT increment i when erase() is called, eg:
for (auto i = x1.cbegin(); i != x1.cend(); )
{
auto it = std::find(x2.cbegin(), x2.cend(), *i);
if (it == x2.cend())
i = x1.erase(i);
else
++i;
}
Lastly, when using std::find(), you are comparing entire std::map objects to each other. If you are interested in comparing only the keys, try something more like this:
for (auto i = x1.cbegin(); i != x1.cend(); )
{
const auto &m1 = *i;
auto it = std::find_if(m1.cbegin(), m1.cend(),
[&](const std::map<std::uint8_t, std::uint8_t>::value_type &m1_pair) { // or (const auto &m1_pair) in C++14...
// true if any map in x2 contains this key
return std::find_if(x2.cbegin(), x2.cend(),
[&](const decltype(x2)::value_type &m2){ // or (const auto &m2) in C++14...
return m2.find(m1_pair.first) != m2.cend();
}
) != x2.cend();
}
);
if (it == m1.cend())
i = x1.erase(i);
else
++i;
}
You can also go a little bit functional:
#include <algorithm>
#include <functional>
// removes maps from x1, that are equal to none of x2 maps
auto remove_start = std::remove_if(x1.begin(), x1.end(), [&](const auto& x1_map){
return std::none_of(x2.begin(), x2.end(),
std::bind(std::equal_to<>(), x1_map, std::placeholders::_1));
});
x1.erase(remove_start, x1.end());
EDIT: To check keys only, change std::equal_to to a custom lambda
auto keys_equal = [](auto& m1, auto& m2){
return m1.size() == m2.size()
&& std::equal(m1.begin(), m1.end(), m2.begin(),
[](auto& kv1, auto& kv2){ return kv1.first == kv2.first; });
};
// removes maps from x1, that are equal to none of x2 maps
auto remove_start =
std::remove_if(x1.begin(), x1.end(), [&](const auto& x1_map){
return std::none_of(x2.begin(), x2.end(),
std::bind(keys_equal, x1_map, std::placeholders::_1));
});
x1.erase(remove_start, x1.end());
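If the goal is the key-based reading (drop a map from x1 when none of its keys occurs in any map of x2), the keys of x2 can be collected once instead of searching x2 for every key. This is a sketch under that assumed interpretation; ByteMap and erase_maps_without_shared_key are names made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using ByteMap = std::map<std::uint8_t, std::uint8_t>;

// Assumed semantics: erase a map from x1 when none of its keys occurs as a
// key in any map of x2. An empty map has no shared key, so it is erased too.
void erase_maps_without_shared_key(std::vector<ByteMap>& x1,
                                   const std::vector<ByteMap>& x2) {
    // Collect every key that appears anywhere in x2.
    std::set<std::uint8_t> x2_keys;
    for (const ByteMap& m : x2) {
        for (const ByteMap::value_type& kv : m) {
            x2_keys.insert(kv.first);
        }
    }
    auto has_no_shared_key = [&x2_keys](const ByteMap& m) {
        return std::none_of(m.begin(), m.end(),
                            [&x2_keys](const ByteMap::value_type& kv) {
                                return x2_keys.count(kv.first) != 0;
                            });
    };
    // Erase-remove idiom: one pass, no iterator invalidation issues.
    x1.erase(std::remove_if(x1.begin(), x1.end(), has_no_shared_key), x1.end());
}
```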

Inserting multiple values into a vector at specific positions

Say I have a vector of integers like this std::vector<int> _data;
I know that if I want to remove multiple items from _data, then I can simply call
_data.erase( std::remove_if( _data.begin(), _data.end(), [condition] ), _data.end() );
This is much faster than erasing multiple elements one by one, as less movement of data is required within the vector. I'm wondering if there's something similar for insertions.
For example, if I have the following pairs
auto pair1 = std::make_pair( _data.begin() + 5, 5 );
auto pair2 = std::make_pair( _data.begin() + 12, 12 );
Can I insert both of these in one iteration using some existing std function? I know I can do something like:
_data.insert( pair2.first, pair2.second );
_data.insert( pair1.first, pair1.second );
But this is (very) slow for large vectors (talking 100,000+ elements).
EDIT: Basically, I have a custom set (and map) which use a vector as the underlying containers. I know I can just use std::set or std::map, but the number of traversals I do far outweighs the insertion/removals. Switching from a set and map to this custom set/map already cut 20% of run-time off. Currently though, insertions take approximately 10% of the remaining run time, so reducing that is important.
The order is also required, unfortunately. As much as possible, I use the unordered_ versions, but in some places the order does matter.
One way is to create another vector with capacity equal to the original size plus the number of the elements being inserted and then do an insert loop with no reallocations, O(N) complexity:
template<class T>
std::vector<T> insert_elements(std::vector<T> const& v, std::initializer_list<std::pair<std::size_t, T>> new_elements) {
std::vector<T> u;
u.reserve(v.size() + new_elements.size());
auto src = v.begin();
size_t copied = 0;
for(auto const& element : new_elements) {
auto to_copy = element.first - copied;
auto src_end = src + to_copy;
u.insert(u.end(), src, src_end);
src = src_end;
copied += to_copy;
u.push_back(element.second);
}
u.insert(u.end(), src, v.end());
return u;
}
int main() {
std::vector<int> v{1, 3, 5};
for(auto e : insert_elements(v, {{1,2}, {2,4}}))
std::cout << e << ' ';
std::cout << '\n';
}
Output:
1 2 3 4 5
Ok, we need some assumptions. Let old_end be a reverse iterator to the last element of your vector. Assume that your _data has been resized to exactly fit both its current content and what you want to insert. Assume that inp is a container of std::pair containing your data to be inserted that is ordered reversely (so first the element that is to be inserted at the hindmost position and so on). Then we can do:
// sketch only, under the assumptions stated above
std::merge(old_end, _data.rend(), inp.begin(), inp.end(), _data.rbegin(), [i = inp.size() - 1](const T& t, const std::pair<Iter, T>& p) mutable {
if( std::distance(_data.begin(), p.first) == i ) {
--i;
return false;
}
return true;
});
But I don't think that is any clearer than using a good old for loop. The problem with the STL algorithms is that the predicates work on values and not on iterators, which is a bit annoying for this problem.
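For comparison, a "good old for" version might look like the sketch below. It is an illustration under assumptions: insert_many is a made-up name, the insertions are given as (pre-insertion index, value) pairs sorted by index, and T must be default constructible because the vector is resized first. Walking from the back means every old element is moved at most once.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// insertions holds (pre-insertion index, value) pairs sorted by index in
// ascending order; an index equal to data.size() appends at the end.
template <class T>
void insert_many(std::vector<T>& data,
                 const std::vector<std::pair<std::size_t, T>>& insertions) {
    std::size_t read = data.size();                // one past the next old element to move
    data.resize(data.size() + insertions.size());  // requires default-constructible T
    std::size_t write = data.size();               // one past the next free slot

    for (std::size_t k = insertions.size(); k-- > 0;) {
        assert(insertions[k].first <= read);  // sorted and within range
        // Shift the old elements that sit at or after the insertion index.
        while (read > insertions[k].first) {
            data[--write] = std::move(data[--read]);
        }
        data[--write] = insertions[k].second;
    }
    // Elements before the first insertion index are already in place.
}
```

With data = {1, 3, 5} and insertions {(1, 2), (2, 4)} this produces 1 2 3 4 5, using a single resize.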
Here's my take:
template<class Key, class Value>
class LinearSet
{
public:
using Node = std::pair<Key, Value>;
template<class F>
void insert_at_multiple(F&& f)
{
std::queue<Node> queue;
std::size_t index = 0;
for (auto it = _kvps.begin(); it != _kvps.end(); ++it)
{
// The container size is left untouched here, no iterator invalidation.
if (std::optional<Node> toInsert = f(index))
{
queue.push(*it);
*it = std::move(*toInsert);
}
else
{
++index;
// Replace current node with queued one.
if (!queue.empty())
{
queue.push(std::move(*it));
*it = std::move(queue.front());
queue.pop();
}
}
}
// We now have as many displaced items in the queue as were inserted,
// add them to the end.
while (!queue.empty())
{
_kvps.emplace_back(std::move(queue.front()));
queue.pop();
}
}
private:
std::vector<Node> _kvps;
};
https://godbolt.org/z/EStKgQ
This is a linear time algorithm that doesn't need to know the number of inserted elements a priori. For each index, it asks for an element to insert there. If it gets one, it pushes the corresponding existing vector element to a queue and replaces it with the new one. Otherwise, it extracts the current item to the back of the queue and puts the item at the front of the queue into the current position (noop if no elements were inserted yet). Note that the vector size is left untouched during all this. Only at the end do we push back all items still in the queue.
Note that the indices we use for determining inserted item locations here are all pre-insertion. I find this a point of potential confusion (and it is a limitation - you can't add an element at the very end with this algorithm. Could be remedied by calling f during the second loop too, working on that...).
Here's a version that allows inserting arbitrarily many elements at the end (and everywhere else). It passes post-insertion indices to the functor!
template<class F>
void insert_at_multiple(F&& f)
{
std::queue<Node> queue;
std::size_t index = 0;
for (auto it = _kvps.begin(); it != _kvps.end(); ++it)
{
if (std::optional<Node> toInsert = f(index))
queue.push(std::move(*toInsert));
if (!queue.empty())
{
queue.push(std::move(*it));
*it = std::move(queue.front());
queue.pop();
}
++index;
}
// We now have as many displaced items in the queue as were inserted,
// add them to the end.
while (!queue.empty())
{
if (std::optional<Node> toInsert = f(index))
{
queue.push(std::move(*toInsert));
}
_kvps.emplace_back(std::move(queue.front()));
queue.pop();
++index;
}
}
https://godbolt.org/z/DMuCtJ
Again, this leaves potential for confusion over what it means to insert at indices 0 and 1 (do you end up with an original element in between the two? In the first snippet you would, in the second you wouldn't). Can you insert at the same index multiple times? With pre-insertion indices that makes sense, with post-insertion indices it doesn't. You could also write this in terms of passing the current *it (i.e. key value pair) to the functor, but that alone seems not too useful...
This is an attempt I made, which inserts in reverse order. I did get rid of the iterators/indices for this.
template<class T>
void insert( std::vector<T> &vector, const std::vector<T> &values ) {
size_t last_index = vector.size() - 1;
vector.resize( vector.size() + values.size() ); // relies on T being default constructible
size_t move_position = vector.size() - 1;
size_t last_value_index = values.size() - 1;
size_t values_size = values.size();
bool isLastIndex = false;
while ( !isLastIndex && values_size ) {
if ( values[last_value_index] > vector[last_index] ) {
vector[move_position] = std::move( values[last_value_index--] );
--values_size;
} else {
isLastIndex = last_index == 0;
vector[move_position] = std::move( vector[last_index--] );
}
--move_position;
}
if ( isLastIndex && values_size ) {
while ( values_size ) {
vector[move_position--] = std::move( values[last_value_index--] );
--values_size;
}
}
}
Tried with ICC, Clang, and GCC on Godbolt, and vector's insert was faster (for 5 numbers inserted). On my machine, MSVC, same result but less severe. I also compared with Maxim's version from his answer. I realize using Godbolt isn't a good method for comparison, but I don't have access to the 3 other compilers on my current machine.
https://godbolt.org/z/vjV2wA
Results from my machine:
My insert: 659us
Maxim insert: 712us
Vector insert: 315us
Godbolt's ICC
My insert: 470us
Maxim insert: 139us
Vector insert: 127us
Godbolt's GCC
My insert: 815us
Maxim insert: 97us
Vector insert: 97us
Godbolt's Clang:
My insert: 477us
Maxim insert: 188us
Vector insert: 96us

Copy values (mapped_type) from map to vector given a mapping from keys to indices

Suppose we have a function key_to_index that maps keys of a map to indices of a vector. For an example, let's just make it trivial:
std::map<int, int> source = {{1,55}, {4, 20}, {6, 25}};
std::vector<int> target;
int key_to_index(int key) {return key;}
What would be a version of the following loop that uses STL algorithms?
for (const auto &el: source) {
int index = key_to_index(el.first);
if (index > (int)target.size() - 1) target.resize(index + 1);
target[index] = el.second;
}
@Edgar's answer is good; however, I do not like the creation of a second map. Assuming that key_to_index is reasonably fast, it is better to just call it twice per element than to create a map with converted indices.
An obvious optimization for your code (unless key_to_index is too complex) is to avoid more than one resize. Then apply std::for_each to the original map:
auto max = std::max_element(source.cbegin(), source.cend(), [](auto& lhs, auto& rhs) {
return key_to_index(lhs.first) < key_to_index(rhs.first); });
target.resize(key_to_index(max->first) + 1);
std::for_each(source.cbegin(), source.cend(), [&target](const auto& e) {
target[key_to_index(e.first)] = e.second; });
Basically you can create a new map storing the same values with transformed keys:
std::map<int, int> transformed;
std::transform(std::cbegin(source), std::cend(source),
std::inserter(transformed, transformed.end()),
[](const auto& e) {
return std::make_pair(key_to_index(e.first), e.second);
}
);
And then fill the target:
std::vector<int> target;
target.resize(transformed.rbegin()->first + 1);
std::for_each(std::cbegin(transformed), std::cend(transformed),
[&target](const auto& e) {
target[e.first] = e.second;
}
);
Anyway I believe that the initial version is better. STL does not always make your code more efficient or even more readable.
You can create an output iterator, very similar to std::insert_iterator, with the value type of std::pair<int,int> and operator= that mutates your array. Then your function can be written as std::transform.
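A rough sketch of such an output iterator is shown below. The class name IndexedAssignIterator and the helper copy_to_indices are invented for this illustration, and the iterator is hard-wired to std::vector<int> and a plain function pointer to keep it short. Assigning a (key, value) pair writes the value at the mapped index and grows the vector when needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Hypothetical output iterator: *it = (key, value) stores value at
// target[map_key(key)], resizing the target when the index is out of range.
class IndexedAssignIterator {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    IndexedAssignIterator(std::vector<int>& target, int (*map_key)(int))
        : target_(&target), map_key_(map_key) {}

    IndexedAssignIterator& operator=(const std::pair<const int, int>& kv) {
        const auto index = static_cast<std::size_t>(map_key_(kv.first));
        if (index >= target_->size()) target_->resize(index + 1);
        (*target_)[index] = kv.second;
        return *this;
    }
    // Dereference and increment are no-ops, as in std::insert_iterator.
    IndexedAssignIterator& operator*() { return *this; }
    IndexedAssignIterator& operator++() { return *this; }
    IndexedAssignIterator operator++(int) { return *this; }

private:
    std::vector<int>* target_;
    int (*map_key_)(int);
};

int key_to_index(int key) { return key; }

// The loop from the question, expressed as a single std::copy.
std::vector<int> copy_to_indices(const std::map<int, int>& source) {
    std::vector<int> target;
    std::copy(source.begin(), source.end(),
              IndexedAssignIterator(target, key_to_index));
    return target;
}
```

Since std::map iterates in ascending key order, a monotonic key_to_index means the resize branch only ever grows the vector at its end.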