Is there a BOOST pool fixed-sized allocator? - c++

I want to create an unordered_map (because I specifically want a hash map) and allocate its maximum size (according to my constraints) up front.
So, if I want to allocate 256 entries and each entry is 1 byte (just an example; say the 1 byte covers both the key and the value), then the total size of my unordered_map keys and entries is 256 bytes. I want to pre-allocate those 256 bytes in the allocator.
Then, when the unordered_map calls allocate()/deallocate(), the allocator hands it 1 byte from the already-allocated memory.
typedef boost::unordered::unordered_map<int, MyClass, boost::hash<int>, std::equal_to<int>, ??? > myMap;
Does this exist in Boost, or somewhere else?
---- edit ----
As I see it (Thanks to the answers here) - there are two solutions for my problem:
Implement an allocator, which holds a boost::pool<>. This pool is built in the allocator constructor. When allocate() is being called from unordered_map, it actually calls pool.malloc(), and when deallocate() is called from unordered_map, it actually calls pool.free().
Use an already implemented allocator, such as pool_allocator like this:
typedef pool_allocator<std::pair<MyKey, MyClass>, boost::default_user_allocator_new_delete, boost::mutex, 1024> MyAllocator;
typedef unordered_map<MyKey, MyClass, hash, eq, MyAllocator> MyUnorderedMap;
The second option is still unclear to me, because:
a. Can I declare only one MyUnorderedMap?
b. How can I declare a new MyUnorderedMap with different next_block size than 1024 in run time?

What you describe can actually only be achieved with something like Boost Intrusive "maps" (actually sets, then).
However, to get truly 1-byte allocated elements, you'd need to define custom stateful value traits, so that you can store the node-index metadata separately from the element payload.
However, since you claim the element type to be 1 byte (which can obviously never be true for a concrete key and value type), I'll assume you don't actually want this contrived solution for "some reason".
Instead, let me suggest three more mundane approaches:
Using a flat_map
Using a Boost Intrusive unordered set
Using an unordered set with Boost Pool fixed size allocator¹
Boost flat_map
If hash lookup is not mandatory, you can simplify a lot by just reserving contiguous element storage up front and storing an ordered map instead:
Live On Coliru
#include <boost/container/flat_map.hpp>
#include <iostream>
#include <string>
using Elements = boost::container::flat_map<std::string, std::string>;
int main() {
Elements map;
map.reserve(256); // pre-allocate 256 "nodes"!
map.insert({
{ "one", "Eins" },
{ "two", "Zwei" },
{ "three", "Drei" },
{ "four", "Vier" },
{ "five", "Fuenf" },
});
for (auto& e : map) {
std::cout << "Entry: " << e.first << " -> " << e.second << "\n";
}
std::cout << "map[\"three\"] -> " << map["three"] << "\n";
}
Prints
Entry: five -> Fuenf
Entry: four -> Vier
Entry: one -> Eins
Entry: three -> Drei
Entry: two -> Zwei
map["three"] -> Drei
Boost Intrusive
CAVEAT: Intrusive containers come with their own set of trade-offs. Managing the underlying storage of the elements can be error-prone. The auto-unlink behaviour of the hooks inhibits a constant-time implementation of size() and similar (empty() on some of the unordered set configurations), so this might not be your thing.
Live On Coliru
#include <boost/intrusive/unordered_set.hpp>
#include <boost/intrusive/unordered_set_hook.hpp>
#include <boost/functional/hash.hpp>
#include <array>
#include <iostream>
#include <string>
namespace bi = boost::intrusive;
struct Element;
namespace boost {
template <> struct hash<Element> {
size_t operator()(Element const& e) const;
};
}
struct Element : bi::unordered_set_base_hook<> {
std::string key;
mutable std::string value;
Element(std::string k = "", std::string v = "")
: key(std::move(k)), value(std::move(v)) { }
bool operator==(Element const& other) const { return key == other.key; }
};
size_t boost::hash<Element>::operator()(Element const& e) const {
return hash_value(e.key);
}
using Elements = bi::unordered_set<Element>;
int main() {
std::array<Element, 256> storage; // reserved 256 entries
std::array<Elements::bucket_type, 100> buckets; // buckets for the hashtable
Elements hashtable(Elements::bucket_traits(buckets.data(), buckets.size()));
storage[0] = { "one", "Eins" };
storage[1] = { "two", "Zwei" };
storage[2] = { "three", "Drei" };
storage[3] = { "four", "Vier" };
storage[4] = { "five", "Fuenf" };
hashtable.insert(storage.data(), storage.data() + 5);
for (auto& e : hashtable) {
std::cout << "Hash entry: " << e.key << " -> " << e.value << "\n";
}
std::cout << "hashtable[\"three\"] -> " << hashtable.find({"three"})->value << "\n";
}
Prints
Hash entry: two -> Zwei
Hash entry: four -> Vier
Hash entry: five -> Fuenf
Hash entry: three -> Drei
Hash entry: one -> Eins
hashtable["three"] -> Drei
Pool fixed size allocator¹
If you absolutely require the node-based storage, consider using a custom allocator.
¹ You'll note that (at least with Boost's unordered_map implementation) the allocator is used for two types (bucket pointers and value nodes) and as such there are two fixed size allocations possible.
(See the cleanup calls at the bottom of the sample)
Live On Coliru
#include <boost/pool/pool_alloc.hpp>
#include <boost/unordered/unordered_map.hpp>
#include <iostream>
#include <string>
using RawMap = boost::unordered_map<std::string, std::string>;
using Elements = boost::unordered_map<
std::string, std::string,
RawMap::hasher, RawMap::key_equal,
boost::fast_pool_allocator<RawMap::value_type>
>;
int main() {
{
Elements hashtable;
hashtable.insert({
{ "one", "Eins" },
{ "two", "Zwei" },
{ "three", "Drei" },
{ "four", "Vier" },
{ "five", "Fuenf" },
});
for (auto& e : hashtable) {
std::cout << "Hash entry: " << e.first << " -> " << e.second << "\n";
}
std::cout << "hashtable[\"three\"] -> " << hashtable.find("three")->second << "\n";
}
// OPTIONALLY: free up system allocations in fixed size pools
// The two sizes are implementation specific. My 64-bit system has the following:
boost::singleton_pool<boost::fast_pool_allocator_tag, 8>::release_memory(); // the bucket pointer allocation
boost::singleton_pool<boost::fast_pool_allocator_tag, 32>::release_memory(); // the ptr_node<std::pair<std::string const, std::string> >
}

Related

How can I construct one map from another map?

I am trying to create a map from another map, using a comparator function so that a key-value pair is only stored if its value differs from the values already stored in the map.
I get a compilation error when compiling the code below. What is the issue with that code? Is there also a better way to accomplish this?
#include <iostream>
#include <map>
#include <set>
#include <algorithm>
#include <functional>
int main() {
// Creating & Initializing a map of String & Ints
std::map<std::string, int> mapOfWordCount = { { "aaa", 10 }, { "ddd", 41 },
{ "bbb", 62 }, { "ccc", 10} };
// Declaring the type of Predicate that accepts 2 pairs and return a bool
typedef std::function<bool(std::pair<std::string, int>, std::pair<std::string, int>)> Comparator;
// Defining a lambda function to compare two pairs. It will compare two pairs using second field
Comparator compFunctor =
[](std::pair<std::string, int> elem1 ,std::pair<std::string, int> elem2)
{
return elem1.second != elem2.second;
};
// Declaring a set that will store the pairs using above comparision logic
std::map<std::string, int, Comparator> setOfWords(
mapOfWordCount.begin(), mapOfWordCount.end(), compFunctor);
return 0;
}
The expected output of the second map is:
{ "aaa", 10 }
{ "ddd", 41 }
{ "bbb", 62 }
This means, that { "ccc", 10 } has to be ignored.
Excerpt from the error:
sortMap.cpp:25:70: required from here
/opt/tools/installs/gcc-4.8.3/include/c++/4.8.3/bits/stl_tree.h:1422:8:
error: no match for call to
‘(std::function, int>,
std::pair, int>)>) (const
std::basic_string&, const key_type&)’
&& _M_impl._M_key_compare(_S_key(_M_rightmost()), __k))
^ In file included from /opt/tools/installs/gcc-4.8.3/include/c++/4.8.3/bits/stl_algo.h:66:0,
from /opt/tools/installs/gcc-4.8.3/include/c++/4.8.3/algorithm:62,
from sortMap.cpp:4: /opt/tools/installs/gcc-4.8.3/include/c++/4.8.3/functional:2174:11:
note: candidate is:
class function<_Res(_ArgTypes...)>
^ /opt/tools/installs/gcc-4.8.3/include/c++/4.8.3/functional:2466:5:
note: _Res std::function<_Res(_ArgTypes ...)>::operator()(_ArgTypes
...) const [with _Res = bool; _ArgTypes =
{std::pair,
std::allocator >, int>, std::pair, std::allocator >, int>}]
function<_Res(_ArgTypes...)>::
^
This is a solution according to the intention described by OP.
Sample code:
#include <iostream>
#include <map>
#include <set>
#include <vector>
int main()
{
// Creating & Initializing a map of String & Ints
std::map<std::string, int> mapOfWordCount = {
{ "aaa", 10 }, { "ddd", 41 }, { "bbb", 62 }, { "ccc", 10 }
};
// auxiliary set of values
std::set<int> counts;
// creating a filtered map
std::vector<std::pair<std::string, int> > mapOfWordCountFiltered;
for (const std::map<std::string, int>::value_type &entry : mapOfWordCount) {
if (!counts.insert(entry.second).second) continue; // skip duplicate counts
mapOfWordCountFiltered.push_back(entry);
}
// output
for (const std::pair<std::string, int> &entry : mapOfWordCountFiltered) {
std::cout << "{ \"" << entry.first << "\", " << entry.second << " }\n";
}
// done
return 0;
}
Output:
{ "aaa", 10 }
{ "bbb", 62 }
{ "ddd", 41 }
Live Demo on coliru
There is no custom predicate used as the standard predicate (std::less<Key>) is sufficient for the solution (for map as well as set).
The filtered map doesn't even use a std::map as there is no necessity for this. (The entries are already sorted, the filtering is done by an extra std::set<int>.)
Actually, I have no idea how to perform this with a custom predicate as I don't know how to keep the (required) order of map with the extra check for duplicated values.
Isn't there a way to create a comparator that makes sure that another "key, value" is not inserted, if the value is already present in the map previously corresponding to a different key? This would save extra space that I would use by creating another set.
I have thought about this for a while. Yes, it is possible, but I wouldn't recommend it for production code.
std::map::insert() probably calls std::map::lower_bound() to find the insertion point (i.e. iterator). (The std::map::lower_bound() in turn will use our custom predicate.) If the returned iterator is end() the entry is inserted at end. Otherwise, the key at this iterator is compared with the one which is provided as new (to be inserted). If it is equal the insertion will be denied otherwise the new entry is inserted there.
So, to deny insertion of an entry with duplicated value, the predicate has to return false regardless of comparison of keys. For this, the predicate has to do extra checks.
To perform these extra checks, the predicate needs access to the whole map as well as to the value of the entry to be inserted. To solve the first issue, the predicate gets a reference to the map it is used in. For the second issue, I had no better idea than to use a std::set<std::pair<std::string, int> > instead of the original std::map<std::string, int>. As there is already a custom predicate involved, the sorting behavior can be adjusted sufficiently.
So, this is what I got:
#include <iostream>
#include <map>
#include <set>
#include <vector>
typedef std::pair<std::string, int> Entry;
struct CustomLess;
typedef std::set<Entry, CustomLess> Set;
struct CustomLess {
Set &set;
CustomLess(Set &set): set(set) { }
bool operator()(const Entry &entry1, const Entry &entry2) const;
};
bool CustomLess::operator()(
const Entry &entry1, const Entry &entry2) const
{
/* check whether entry1.first is already in the set
* (but don't use find() as this may cause recursion)
*/
bool entry1InSet = false;
for (const Entry &entry : set) {
if ((entry1InSet = entry.first == entry1.first)) break;
}
/* If entry1 is not in the set, check whether it could be added.
* If not any call of this predicate should return false.
*/
if (!entry1InSet) {
for (const Entry &entry : set) {
if (entry.second == entry1.second) return false;
}
}
/* check whether entry2.first is already in the set
* (but don't use find() as this may cause recursion)
*/
bool entry2InSet = false;
for (const Entry &entry : set) {
if ((entry2InSet = entry.first == entry2.first)) break;
}
/* If entry2 is not in the set, check whether it could be added.
* If not any call of this predicate should return false.
*/
if (!entry2InSet) {
for (const Entry &entry : set) {
if (entry.second == entry2.second) return false;
}
}
/* fall back to regular behavior of a less predicate
* for entry1.first and entry2.first
*/
return entry1.first < entry2.first;
}
int main()
{
// Creating & Initializing a map of String & Ints
// with very specific behavior
Set mapOfWordCount({
{ "aaa", 10 }, { "ddd", 41 }, { "bbb", 62 }, { "ccc", 10 }
},
CustomLess(mapOfWordCount));
// output
for (const Entry &entry : mapOfWordCount) {
std::cout << "{ \"" << entry.first << "\", " << entry.second << " }\n";
}
// done
return 0;
}
Output:
{ "aaa", 10 }
{ "bbb", 62 }
{ "ddd", 41 }
Live Demo on coliru
My collaborator would call this a Frankenstein solution and IMHO this is sufficient in this case.
The point of a std::map/std::set is usually logarithmic-time insert() and find(). That benefit is probably lost entirely, as CustomLess must iterate (in the worst case) over the whole set several times before a value can be returned. (The possible early-outs from the iterations in some cases don't help much.)
So this was a nice puzzle, and I solved it after a fashion, but mainly to present a counterexample.
As @Galik mentioned in the comments, the problem with your code is that the compare function of a map expects two keys as parameters and not key-value pairs. Consequently, you don't have access to the values within the comparator.
Similar to @Scheff, I also don't see a way to make your solution using a custom comparator work in a practical or recommended way. But instead of using a set and a vector, you could also invert your map. The filtering can then be performed by the map::insert() function:
#include <map>
#include <string>
#include <iostream>
int main() {
// Creating & Initializing a map of String & Ints
std::map<std::string, int> mapOfWordCount = { { "aaa", 10 }, { "ddd", 41 },
{ "bbb", 62 }, { "ccc", 10} };
std::map<int, std::string> inverseMap;
for(const auto &kv : mapOfWordCount)
inverseMap.insert(std::make_pair(kv.second, kv.first));
for(const auto& kv : inverseMap)
std::cout << "{ \"" << kv.second << "\", " << kv.first << " }" << std::endl;
}
The function map::insert() only inserts an item if its key doesn't exist in the map, yet. Output:
{ "aaa", 10 }
{ "ddd", 41 }
{ "bbb", 62 }
However, if you require your target map setOfWords to be of the type std::map<std::string, int>, then you can invert the inverted map from the code above once again in the following way:
std::map<std::string, int> setOfWords;
for(const auto& kv : inverseMap)
setOfWords[kv.second] = kv.first;
for(const auto& kv : setOfWords)
std::cout << "{ \"" << kv.first << "\", " << kv.second << " }" << std::endl;
As a result (even if this isn't your requirement), setOfWords becomes sorted by the key. Output:
{ "aaa", 10 }
{ "bbb", 62 }
{ "ddd", 41 }
Code on Ideone

How to use find_first_not_of with a vector of string?

Let's say I have the following object:
vector<string> data = {"12","12","12","12","13","14","15", "15", "15", "15", "18"};
I'm trying to find the first non-repeating entry in the data object.
For example, data.find_first_not_of(data.at(0)) would work if data were a plain string rather than a container.
How can I achieve the same thing with an object of type vector?
I looked at adjacent_find and find_if_not from the algorithm library, but to no avail.
Your suggestions are much appreciated.
What problem did you have with adjacent_find? You should be able to use that with an inverse predicate:
std::vector<std::string> data = {"12","12","12","12","13","14","15", "15", "15", "15", "18"};
// Sort data here if necessary
auto itr = std::adjacent_find(data.cbegin(), data.cend(), std::not_equal_to<std::string>{});
if (itr != data.cend()) {
std::cout << "First mismatch: " << *itr << " " << *std::next(itr) << std::endl;
} else {
std::cout << "All elements equal" << std::endl;
}
Wandbox
Since you have to go through the list at least once, and you don't know when or where you will encounter the duplicate of a number (if there is one), one way to solve this is to first gather "statistics" and then from what you've gathered you can determine the first non-duplicate.
Here is an example using std::unordered_map:
#include <algorithm>
#include <unordered_map>
#include <iostream>
#include <vector>
#include <string>
// struct to hold some information on the numbers
struct info
{
std::string number;
int count;
int position;
info(const std::string& n, int c, int p) : number(n), count(c), position(p) {}
};
int main()
{
std::vector<std::string> data = {"12","12","12","12","13","14","15", "15", "15", "15", "18"};
std::unordered_map<std::string, info> infoMap;
std::vector<info> vInfo;
int pos = 0;
// loop for each data element
std::for_each(data.begin(), data.end(), [&](const std::string& n)
{
// insert entry into the map
auto pr = infoMap.insert(std::make_pair(n, info(n, 0, pos)));
// bump up the count for this entry.
++pr.first->second.count;
// bump up the position number
++pos;
});
// create a vector of the information with a count of 1 item.
std::for_each(infoMap.begin(), infoMap.end(), [&](std::unordered_map<std::string, info>::value_type& vt) { if (vt.second.count == 1) vInfo.push_back(vt.second); });
// sort this by position
std::sort(vInfo.begin(), vInfo.end(), [&](const info& pr1, const info &pr2){return pr1.position < pr2.position; });
// output the results
if ( vInfo.empty() )
std::cout << "All values are duplicated\n";
else
std::cout << "The first number that isn't repeated is " << vInfo.front().number << "\n";
}
Live Example
First, we just simply go through all the entries in the vector and just tally up the count for each item. In addition, we store the position in the original list of where the item was found.
After that we filter out the ones with a count of exactly 1 and copy them to a vector. We then sort this vector based on the position they were found in the original list.

Can I use a vector with the same functionality as a static array?

just a quick one:
I plan to have an array of AVL Trees (for an assignment, as you imagined - does anyone ever use AVL trees apart from data structures students anyway?) and I was wondering if I could use a nice vector - and take advantage of the for(auto i : vect) c++ 11 functionality.
What I want to do: AVLTree array of 1.000.000 elements so I can check in CONSTANT time if the tree exists or not (array position will be NULL or not)
AVLTree_GeeksforGeeks **AVLArray = new AVLTree_GeeksforGeeks*[1000000];
for(int i=0; i<1000000; i++){ AVLArray[i] = nullptr; } //init everything to null
//do stuff with AVL trees
//...
if(AVLArray[52000]!=nullptr)
{
cout << "tree exists!\n";
}
Is there an equivalent with vectors, that will allow me constant time of searching for a tree? All the examples I've seen use vector.push_back() and vector.find() to search.
You can use std::vector as suggested by Exceptyon:
std::vector<std::unique_ptr<AVLTree>> trees(1000000);
together with the smart pointers introduced in C++11. If your concern is dynamic resizing, keep in mind that you can create the vector with its full size up front (by passing the element count to the constructor) or later via the resize member; reserve only sets the capacity and creates no elements.
If your concern is random access to its objects, rest assured that operator[] has O(1) complexity.
If you know the total capacity of the container at compile time, you could also consider C++11's std::array, which provides the same range-based for support as well as the same constant-time access to its elements.
std::array<std::unique_ptr<AVLTree>, 1000000> trees;
vector will work because it has an overloaded operator[] that guarantees constant-time access to the nth element.
But your code is not clear:
AVLTree_GeeksforGeeks *AVLArray = new AVLTree_GeeksforGeeks[1000000];
for(int i=0; i<1000000; i++){ AVLArray[i] = nullptr; } //init everything to null
If you assign nullptr, then you need a pointer. Is AVLTree_GeeksforGeeks a typedef for a pointer type? I assume it is not, and that there is a typo; otherwise you just have to drop that typedef and use std::unique_ptr<TheRealTyp>. So, to clarify, I suppose your code is really:
AVLTree_GeeksforGeeks **AVLArray = new AVLTree_GeeksforGeeks*[1000000];
for(int i=0; i<1000000; i++){ AVLArray[i] = nullptr; } //init everything to null
In that case, as suggested, you should use a std::vector<std::unique_ptr<AVLTree_GeeksforGeeks>>. You no longer have to initialize the elements to nullptr, and the null check becomes a direct boolean test of the std::unique_ptr:
std::vector<std::unique_ptr<AVLTree_GeeksforGeeks>> AVLArray(1000000);
// Do stuff with AVL trees
if (AVLArray[52000])
{
cout << "tree exists!\n";
}
Now, how do you use a std::vector<std::unique_ptr<X>>?
Setting a value in the already allocated range: AVLArray[5200] = std::unique_ptr<AVLTree_GeeksforGeeks>(new AVLTree_GeeksforGeeks);
Setting an entry to null: AVLArray[5200].reset();
If you need to add something (the vector will grow): AVLArray.push_back(std::unique_ptr<AVLTree_GeeksforGeeks>(new AVLTree_GeeksforGeeks));
To iterate, use for (auto& elem : AVLArray). The & is mandatory; otherwise the copy constructor would be called, and std::unique_ptr forbids that.
Here is an example:
#include <iostream>
#include <vector>
#include <memory>
// boost
#include <boost/range/algorithm/for_each.hpp>
#include <boost/range/adaptor/filtered.hpp>
class A {};
int main(int argc, char const *argv[])
{
std::vector<std::unique_ptr<A>> vector;
vector.resize(10000);
// Adding some values
if (!vector[100])
{
std::cout << "Adding vector[100]" << std::endl;
vector[100] = std::unique_ptr<A>(new A);
}
if (!vector[1000])
{
std::cout << "Adding vector[1000]" << std::endl;
vector[1000] = std::unique_ptr<A>(new A);
}
// Removing one
if (vector[100])
{
std::cout << "Removing vector[100]" << std::endl;
vector[100].reset();
}
std::cout << "Testing element." << std::endl;
auto printer = [](const std::unique_ptr<A>& elem) {
std::cout << "There is an elem !" << std::endl; };
// use auto&; plain auto would need unique_ptr(const unique_ptr&),
// which is deleted
for (auto& elem: vector)
{
if (elem)
{
printer(elem);
}
}
std::cout << "for_each element with filtering." << std::endl;
auto is_not_null = [](const std::unique_ptr<A>& elem) { return (bool) elem; };
// Just because I love boost range !
boost::for_each(vector | boost::adaptors::filtered(is_not_null), printer);
std::cout << "end !" << std::endl;
return 0;
}

Merging two lists efficiently with limited bound

I am trying to merge two arrays/lists in which every element has to be compared. If an identical element appears in both, I increase its total occurrence count by one. Both arrays are 2D: each element carries a counter for its occurrences. I know the two arrays can be compared with a double for loop in O(n^2); however, I am limited to a bound of O(n log n). The final array should contain all of the elements from both lists, with their counters increased wherever an element occurs more than once.
Array A[][] = [[8,1],[5,1]]
Array B[][] = [[2,1],[8,1]]
After the merge is complete I should get an array like so
Array C[][] = [[2,1],[8,2],[8,2],[5,1]]
The order of the elements does not matter.
From readings, Mergesort takes O(nlogn) to merge two lists however I am currently at a roadblock with my bound problem. Any pseudo code visual would be appreciated.
I quite like Stepanov's Efficient Programming lectures, although they are rather slow. In sessions 6 and 7 (if I recall correctly) he discusses the algorithms add_to_counter() and reduce_counter(). Both algorithms are entirely trivial, of course, but can be used to implement a non-recursive merge-sort without too much effort. The only possibly non-obvious insight is that the combining operation can reduce the two elements into a sequence rather than just one element. To do the operations in-place you'd actually store iterators (i.e., pointers in case of arrays) using a suitable class to represent a partial view of an array.
I haven't watched the sessions beyond session 7 (and actually not even the complete session 7, yet) but I would fully expect that he actually presents how to use the counter produced in session 7 to implement, e.g., merge-sort. Of course, the run-time complexity of merge-sort is O(n ln n) and, when using the counter approach it will use O(ln n) auxiliary space.
A simple algorithm that requires twice as much memory would be to order both inputs (O(n log n)) and then sequentially pick the elements from the head of both lists and do the merge (O(n)). The overall cost would be O(n log n) with O(n) extra memory (additional size of the smallest of both inputs)
Here's my algorithm based on bucket counting
time complexity: O(n)
memory complexity: O(max), where max is the maximum element in the arrays
Output:
[8,2] [5,1] [2,1] [8,2]
Code:
#include <iostream>
#include <vector>
#include <iterator>
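// Increments and returns the counter for value `in`, growing the table on demand.
// CAUTION: resize() may reallocate, which invalidates references handed out earlier;
// the sample input happens to need only one resize, before any reference is stored.
int &refreshCount(std::vector<int> &counters, int in) {
if(counters.size() <= static_cast<std::size_t>(in)) {
counters.resize(in + 1);
}
return ++counters[in];
}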
void copyWithCounts(std::vector<std::pair<int, int> >::iterator it,
std::vector<std::pair<int, int> >::iterator end,
std::vector<int> &counters,
std::vector<std::pair<int, int&> > &result
) {
while(it != end) {
int &count = refreshCount(counters, (*it).first);
std::pair<int, int&> element((*it).first, count);
result.push_back(element);
++it;
}
}
void countingMerge(std::vector<std::pair<int, int> > &array1,
std::vector<std::pair<int, int> > &array2,
std::vector<std::pair<int, int&> > &result) {
auto array1It = array1.begin();
auto array1End = array1.end();
auto array2It = array2.begin();
auto array2End = array2.end();
std::vector<int> counters = {0};
copyWithCounts(array1It, array1End, counters, result);
copyWithCounts(array2It, array2End, counters, result);
}
int main()
{
std::vector<std::pair<int, int> > array1 = {{8, 1}, {5, 1}};
std::vector<std::pair<int, int> > array2 = {{2, 1}, {8, 1}};
std::vector<std::pair<int, int&> > result;
countingMerge(array1, array2, result);
for(auto it = result.begin(); it != result.end(); ++it) {
std::cout << "[" << (*it).first << "," << (*it).second << "] ";
}
return 0;
}
Short explanation:
Since you mentioned that the final arrangement does not matter, I did a simple merge with counting (no sorting, since nobody asked for it). The result holds references to the counters, so there is no need to walk through the array again to update the counts.
You could write an algorithm to merge them by walking both sequences sequentially in order, inserting where appropriate.
I've chosen a (seemingly more apt) datastructure here: std::map<Value, Occurence>:
#include <map>
using namespace std;
using Value = int;
using Occurence = unsigned;
using Histo = map<Value, Occurence>;
If you insist on contiguous storage, boost::container::flat_map<> should be your friend here (and a drop-in replacement).
The algorithm (tested with your inputs, read comments for explanation):
void MergeInto(Histo& target, Histo const& other)
{
auto left_it = begin(target), left_end = end(target);
auto right_it = begin(other), right_end = end(other);
auto const& cmp = target.value_comp();
while (right_it != right_end)
{
if ((left_it == left_end) || cmp(*right_it, *left_it))
{
// insert at left_it
target.insert(left_it, *right_it);
++right_it; // and carry on
} else if (cmp(*left_it, *right_it))
{
++left_it; // keep left_it first, so increment it
} else
{
// keys match!
left_it->second += right_it->second;
++left_it;
++right_it;
}
}
}
It's really quite straight-forward!
A test program: See it Live On Coliru
#include <iostream>
// for debug output
static inline std::ostream& operator<<(std::ostream& os, Histo::value_type const& v) { return os << "{" << v.first << "," << v.second << "}"; }
static inline std::ostream& operator<<(std::ostream& os, Histo const& v) { for (auto& el : v) os << el << " "; return os; }
//
int main(int argc, char *argv[])
{
Histo A { { 8, 1 }, { 5, 1 } };
Histo B { { 2, 1 }, { 8, 1 } };
std::cout << "A: " << A << "\n";
std::cout << "B: " << B << "\n";
MergeInto(A, B);
std::cout << "merged: " << A << "\n";
}
Printing:
A: {5,1} {8,1}
B: {2,1} {8,1}
merged: {2,1} {5,1} {8,2}
You could shuffle the interface a tiny bit in case you really wanted to merge into a new object (C):
// convenience
Histo Merge(Histo const& left, Histo const& right)
{
auto copy(left);
MergeInto(copy, right);
return copy;
}
Now you can just write
Histo A { { 8, 1 }, { 5, 1 } };
Histo B { { 2, 1 }, { 8, 1 } };
auto C = Merge(A, B);
See that Live on Coliru, too

How to find the index of current object in range-based for loop?

Assume I have the following code:
vector<int> list;
for(auto& elem:list) {
int i = elem;
}
Can I find the position of elem in the vector without maintaining a separate iterator?
Yes you can, it just takes some massaging ;)
The trick is to use composition: instead of iterating over the container directly, you "zip" it with an index along the way.
Specialized zipper code:
template <typename T>
struct iterator_extractor { typedef typename T::iterator type; };
template <typename T>
struct iterator_extractor<T const> { typedef typename T::const_iterator type; };
template <typename T>
class Indexer {
public:
class iterator {
typedef typename iterator_extractor<T>::type inner_iterator;
typedef typename std::iterator_traits<inner_iterator>::reference inner_reference;
public:
typedef std::pair<size_t, inner_reference> reference;
iterator(inner_iterator it): _pos(0), _it(it) {}
reference operator*() const { return reference(_pos, *_it); }
iterator& operator++() { ++_pos; ++_it; return *this; }
iterator operator++(int) { iterator tmp(*this); ++*this; return tmp; }
bool operator==(iterator const& it) const { return _it == it._it; }
bool operator!=(iterator const& it) const { return !(*this == it); }
private:
size_t _pos;
inner_iterator _it;
};
Indexer(T& t): _container(t) {}
iterator begin() const { return iterator(_container.begin()); }
iterator end() const { return iterator(_container.end()); }
private:
T& _container;
}; // class Indexer
template <typename T>
Indexer<T> index(T& t) { return Indexer<T>(t); }
And using it:
#include <iostream>
#include <iterator>
#include <limits>
#include <vector>
// Zipper code here
int main() {
std::vector<int> v{1, 2, 3, 4, 5, 6, 7, 8, 9};
for (auto p: index(v)) {
std::cout << p.first << ": " << p.second << "\n";
}
}
You can see it at ideone, though it lacks the for-range loop support so it's less pretty.
EDIT:
Just remembered that I should check Boost.Range more often. Unfortunately there is no zip range, but I did find a pearl: boost::adaptors::indexed. However, it requires access to the iterator to pull off the index. Shame :x
Otherwise with the counting_range and a generic zip I am sure it could be possible to do something interesting...
In the ideal world I would imagine:
int main() {
std::vector<int> v{1, 2, 3, 4, 5, 6, 7, 8, 9};
for (auto tuple: zip(iota(0), v)) {
std::cout << tuple.at<0>() << ": " << tuple.at<1>() << "\n";
}
}
With zip automatically creating a view as a range of tuples of references and iota(0) simply creating a "false" range that starts from 0 and just counts toward infinity (or well, the maximum of its type...).
jrok is right : range-based for loops are not designed for that purpose.
However, in your case it is possible to compute it using pointer arithmetic since vector stores its elements contiguously (*)
vector<int> list;
for(auto& elem:list) {
int i = elem;
int pos = &elem-&list[0]; // pos contains the position in the vector
// also a &-operator overload proof alternative (thanks to ildjarn) :
// int pos = addressof(elem)-addressof(list[0]);
}
But this is clearly bad practice, since it obfuscates the code and makes it more fragile: it easily breaks if someone changes the container type, overloads the & operator, or replaces 'auto&' with 'auto'. Good luck debugging that!
(*) NOTE: Contiguity is guaranteed for vector since C++03, and for array and string since C++11.
No, you can't (at least, not without effort). If you need the position of an element, you shouldn't use range-based for. Remember that it's just a convenience tool for the most common case: execute some code for each element. In the less-common circumstances where you need the position of the element, you have to use the less-convenient regular for loop.
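For reference, a minimal sketch of the regular index-based loop this answer recommends. The function name list_with_positions is made up for the example, and it assumes a random-access container such as std::vector:

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Formats every element together with its position, one per line.
std::string list_with_positions(std::vector<std::string> const& items) {
    std::ostringstream out;
    for (std::size_t i = 0; i < items.size(); ++i) {
        out << i << ": " << items[i] << "\n";  // i is the position, items[i] the element
    }
    return out.str();
}
```

The position is simply the loop variable, so nothing has to be derived from the element's address.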
Based on the answer from @Matthieu, there is a very elegant solution using the mentioned boost::adaptors::indexed:
std::vector<std::string> strings{10, "Hello"};
int main(){
strings[5] = "World";
for(auto const& el: strings| boost::adaptors::indexed(0))
std::cout << el.index() << ": " << el.value() << std::endl;
}
This works pretty much like the "ideal world solution" mentioned, has pretty syntax and is concise. Note that the type of el in this case is something like boost::foobar<const std::string&, int>, so it holds a reference and no copying is performed. It is even incredibly efficient: https://godbolt.org/g/e4LMnJ (the generated code is equivalent to keeping your own counter variable, which is as good as it gets).
For completeness the alternatives:
size_t i = 0;
for(auto const& el: strings) {
std::cout << i << ": " << el << std::endl;
++i;
}
Or using the contiguous property of a vector:
for(auto const& el: strings) {
size_t i = &el - &strings.front();
std::cout << i << ": " << el << std::endl;
}
The first generates the same code as the boost adapter version (optimal) and the last is 1 instruction longer: https://godbolt.org/g/nEG8f9
Note: If you only want to know whether you have reached the last element, you can use:
for(auto const& el: strings) {
bool isLast = &el == &strings.back();
std::cout << isLast << ": " << el << std::endl;
}
This works for every standard container that provides back(), but auto&/auto const& must be used (same as above), which is recommended anyway. Depending on the input this might also be pretty fast (especially when the compiler knows the size of your vector).
Replace the &foo by std::addressof(foo) to be on the safe side for generic code.
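To illustrate why std::addressof matters here, a small sketch assuming C++11. Weird and positions are hypothetical names invented for this example; Weird overloads unary & so that the plain &el version would no longer yield real addresses:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical type whose unary & is overloaded and does not return the address.
struct Weird {
    int value;
    Weird const* operator&() const { return nullptr; }
};

// Position of every element, derived from addresses via std::addressof,
// which bypasses any overloaded operator&.
template <typename T>
std::vector<std::size_t> positions(std::vector<T> const& v) {
    std::vector<std::size_t> out;
    for (auto const& el : v) {
        out.push_back(static_cast<std::size_t>(
            std::addressof(el) - std::addressof(v.front())));
    }
    return out;
}
```

As with the variants above, this relies on the elements being stored contiguously, so it is only meant for containers like std::vector.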
If you have a compiler with C++14 support you can do it in a functional style:
#include <iostream>
#include <string>
#include <vector>
#include <functional>
template<typename T>
void for_enum(T& container, std::function<void(int, typename T::value_type&)> op)
{
int idx = 0;
for(auto& value : container)
op(idx++, value);
}
int main()
{
std::vector<std::string> sv {"hi", "there"};
for_enum(sv, [](auto i, auto v) {
std::cout << i << " " << v << std::endl;
});
}
Works with clang 3.4 and gcc 4.9 (not with 4.8); both need -std=c++1y. C++14 is required because of the auto parameters in the lambda.
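A possible variation, not part of the answer above: taking the callable as a template parameter instead of std::function avoids the type erasure and lets the callback receive the element by reference. This is only a sketch; tagged is a hypothetical helper used to show a call, and the explicitly typed lambda keeps it valid for C++11 as well:

```cpp
#include <cstddef>
#include <vector>

// Same idea as for_enum above, but the callable type is deduced.
template <typename Container, typename Op>
void for_enum(Container& container, Op op) {
    std::size_t idx = 0;
    for (auto& value : container) {
        op(idx++, value);  // value is passed as an lvalue, so op may modify it
    }
}

// Usage: multiply each element by 10 and add its position.
std::vector<int> tagged(std::vector<int> v) {
    for_enum(v, [](std::size_t i, int& x) { x = x * 10 + static_cast<int>(i); });
    return v;
}
```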
If you insist on using a range-based for and need the index, it is pretty trivial to maintain one yourself, as shown below.
I do not think there is a cleaner or simpler solution for range-based for loops. But really, why not use a standard for(;;)? That would probably make your intent and code the clearest.
vector<int> list;
int idx = 0;
for(auto& elem:list) {
int i = elem;
//TODO whatever made you want the idx
++idx;
}
There is a surprisingly simple way to do this:
vector<int> list;
for(auto& elem:list) {
int i = (&elem-&*(list.begin()));
}
where i will be your required index.
This takes advantage of the fact that C++ vectors are always contiguous.
Here's a quite beautiful solution using c++20:
#include <array>
#include <iostream>
#include <ranges>
template<typename T>
struct EnumeratedElement {
std::size_t index;
T& element;
};
auto enumerate(std::ranges::range auto& range)
-> std::ranges::view auto
{
return range | std::views::transform(
[i = std::size_t{}](auto& element) mutable {
return EnumeratedElement{i++, element};
}
);
}
auto main() -> int {
auto const elements = std::array{3, 1, 4, 1, 5, 9, 2};
for (auto const [index, element] : enumerate(elements)) {
std::cout << "Element " << index << ": " << element << '\n';
}
}
The major features used here are c++20 ranges, c++20 concepts, c++11 mutable lambdas, c++14 lambda capture initializers, and c++17 structured bindings. Refer to cppreference.com for information on any of these topics.
Note that element in the structured binding is in fact a reference and not a copy of the element (not that it matters here). This is because any qualifiers around the auto only affect a temporary object that the fields are extracted from, and not the fields themselves.
The generated code is identical to the code generated by this (at least by gcc 10.2):
#include <array>
#include <iostream>
#include <ranges>
auto main() -> int {
auto const elements = std::array{3, 1, 4, 1, 5, 9, 2};
for (auto index = std::size_t{}; auto& element : elements) {
std::cout << "Element " << index << ": " << element << '\n';
index++;
}
}
Proof: https://godbolt.org/z/a5bfxz
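The remark about element being a reference can be checked with a small C++17 sketch. EnumeratedElement is repeated from the code above; bump_through_binding is a hypothetical name used only for this demonstration:

```cpp
#include <cstddef>

template <typename T>
struct EnumeratedElement {
    std::size_t index;
    T& element;
};

// Writes through a structured binding that was declared `auto const`.
int bump_through_binding() {
    int x = 1;
    auto const [index, element] = EnumeratedElement<int>{0, x};
    element += 41;  // compiles despite `const`: element still has type int&
    (void)index;    // index is a const std::size_t and is unused here
    return x;       // x was modified through the binding
}
```

The const applies to the hidden object the bindings are taken from; a reference member of that object still refers to the original, non-const int.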
I read from your comments that one reason you want to know the index is to know if the element is the first/last in the sequence. If so, you can do
for(auto& elem:list) {
// loop code ...
if(&elem == &*std::begin(list)){ ... special code for first element ... }
if(&elem == &*std::prev(std::end(list))){ ... special code for last element ... }
// if(&elem == &*std::rbegin(list)){... (C++14 only) special code for last element ...}
// loop code ...
}
EDIT: For example, this prints a container, omitting the separator after the last element. It works for most containers I can imagine, including arrays (online demo: http://coliru.stacked-crooked.com/a/9bdce059abd87f91):
#include <initializer_list>
#include <iostream>
#include <iterator> // std::prev, std::end
#include <vector>
#include <list>
#include <set>
using namespace std;
template<class Container>
void print(Container const& c){
for(auto& x:c){
std::cout << x;
if(&x != &*std::prev(std::end(c))) std::cout << ", "; // separator after every element except the last
}
std::cout << std::endl;
}
int main() {
std::vector<double> v{1.,2.,3.};
print(v); // prints 1, 2, 3
std::list<double> l{1.,2.,3.};
print(l); // prints 1, 2, 3
std::initializer_list<double> i{1.,2.,3.};
print(i); // prints 1, 2, 3
std::set<double> s{1.,2.,3.};
print(s); // prints 1, 2, 3
double a[3] = {1.,2.,3.}; // works for C-arrays as well
print(a); // prints 1, 2, 3
}
Tobias Widlund wrote a nice MIT-licensed, Python-style, header-only enumerate (requires C++17, though):
GitHub
Blog Post
Really nice to use:
std::vector<int> my_vector {1,3,3,7};
for(auto [i, my_element] : en::enumerate(my_vector))
{
// do stuff
}
If you want to avoid having to write an auxiliary function while keeping the index variable local to the loop, you can use a lambda with a mutable variable:
#include <algorithm> // std::for_each
#include <cstddef>
#include <iostream>
#include <vector>
int main() {
std::vector<char> values = {'a', 'b', 'c'};
std::for_each(begin(values), end(values), [i = size_t{}] (auto x) mutable {
std::cout << i << ' ' << x << '\n';
++i;
});
}
Here's a macro-based solution that probably beats most others on simplicity, compile time, and code generation quality:
#include <iostream>
#define fori(i, ...) if(size_t i = -1) for(__VA_ARGS__) if(i++, true)
int main() {
fori(i, auto const & x : {"hello", "world", "!"}) {
std::cout << i << " " << x << std::endl;
}
}
Result:
$ g++ -o enumerate enumerate.cpp -std=c++11 && ./enumerate
0 hello
1 world
2 !
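To see why the macro works, here is what the fori line above expands to, written out by hand as a sketch. The function name expanded_by_hand and the seen vector are made up for the example:

```cpp
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hand-written equivalent of
//   fori(i, auto const& x : {"hello", "world", "!"}) { ... }
// that records the value of i seen by each iteration.
std::vector<std::size_t> expanded_by_hand() {
    std::vector<std::size_t> seen;
    if (std::size_t i = -1)      // largest size_t value: non-zero, so the branch is always taken
        for (auto const& x : {"hello", "world", "!"})
            if (i++, true) {     // the increment wraps i to 0 before the first body runs
                (void)x;         // the element itself is not needed here
                seen.push_back(i);
            }
    return seen;
}
```

The counter stays scoped to the whole statement, and unsigned wrap-around is well defined, so the first body sees 0. One caveat of this shape: an else written directly after the loop body would attach to the macro's inner if.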