C++: boost range iterator pointing to wrong element

I ran into a strange problem. I have a vector<pair<bool, int>> from which I need to read (and possibly write) only the elements whose boolean value is true. I am using the Boost.Range filtered and reversed adaptors to do that.
However, I noticed that the order of the adaptors, i.e. whether I use reversed | filtered or filtered | reversed, produces different results. With filtered | reversed, if I use an iterator into the adapted range to change the boolean value of a pair, that iterator afterwards refers to a different vector element. This does not happen with reversed | filtered. Below is the code demonstrating the issue. Any ideas as to why this is happening are much appreciated!
#include <boost/range/adaptors.hpp>
#include <vector>
#include <utility>
#include <iostream>
using namespace boost::adaptors;
using container_type = std::vector<std::pair<bool,int>>;
struct to_include {
bool operator()(const std::pair<bool,int>& x) const {
return x.first;
}
};
int main() {
container_type container;
/* element0: 1, 1 */
/* element1: 1, 2 */
/* element2: 1, 3 */
for(size_t i=0; i!=3; ++i) container.push_back(std::make_pair(true, i+1));
container_type container_cpy = container;
/* filter and then reverse */
auto fr = container | filtered(to_include()) | reversed;
auto fr_it1 = fr.begin();
auto fr_it2 = std::next(fr_it1);
fr_it2->first = false;
std::cout << "FILTER AND THEN REVERSE\n";
std::cout << fr_it2->first << " " << fr_it2->second << '\n'; /* prints (1,1) instead of (0,2) */
/* reverse and then filter */
auto rf = container_cpy | reversed | filtered(to_include());
auto rf_it1 = rf.begin();
auto rf_it2 = std::next(rf_it1);
rf_it2->first = false;
std::cout << "\nREVERSE AND THEN FILTER\n";
std::cout << rf_it2->first << " " << rf_it2->second << '\n'; /* prints (0,2) */
return 0;
}

This is a subtle issue. fr is a lazy view on the original range: the filter predicate is not evaluated once up front but every time an iterator moves. A reversed iterator is dereferenced by stepping back one position through the filtered range, so after you set fr_it2->first to false, the next dereference of fr_it2 skips the element it used to refer to and lands on the one before it. This is a very non-intuitive property: for eager STL containers, writing through an iterator does not change which element the iterator refers to, but for lazy ranges this is no longer guaranteed!
If you print the entire fr and rf ranges using "fresh" iterators, you will see that their contents are the same.
fr_it2->first = false;
for (auto e : fr) std::cout << e.first << e.second << ";"; // prints 13;11;
...
rf_it2->first = false;
for (auto e : rf) std::cout << e.first << e.second << ";"; // prints 13;11;
So the middle element is indeed filtered out in both cases!
I think you should not modify elements through iterators into the adapted range, but rather through iterators into your primary container, like this:
auto fr_it1 = container.begin();
...
auto rf_it1 = container_cpy.begin();
If you do that, you get consistent results that show "0 2" for both approaches.
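The effect can be modelled without Boost. The sketch below is only an illustrative model, not Boost's code: it assumes that dereferencing a reversed iterator means stepping back from a stored base position, and that the filter skips elements whose flag is false while stepping. The helper name deref_reversed_filtered and the index-based base are made up for this illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Item = std::pair<bool, int>;

// Illustrative model only (hypothetical helper, not a Boost API).
// `base` plays the role of the position stored inside the reversed iterator.
// Dereferencing walks backwards from `base` to the nearest element whose
// flag is still true, so the result depends on the flags at call time.
// Precondition: at least one element before `base` has its flag set.
const Item& deref_reversed_filtered(const std::vector<Item>& items, std::size_t base) {
    std::size_t i = base;
    do {
        --i;
    } while (!items[i].first);
    return items[i];
}
```

With the items (1,1), (1,2), (1,3) and base 2, the call yields (1,2). After clearing the flag of that element, the same call with the same base yields (1,1), which matches the "1 1" printed in the question.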

Related

How to get the elements of a tuple

I am creating a Scrabble game and I need to assign a basic score to each word in the dictionary.
I used make_tuple and stored the result in a tuple. Is there a way to access the elements of a tuple as if they were in a vector?
#include <iostream>
#include <tuple>
#include <string>
#include <fstream>
void parseTextFile()
{
std::ifstream words_file("scrabble_words.txt"); //File containing the words in the dictionary (english) with words that do not exist
std::ofstream new_words_file("test.txt"); //File where only existing words will be saved
std::string word_input;
std::tuple<std::string, int> tupleList;
unsigned int check_integrity;
int counter = 0;
while(words_file >> word_input)
{
check_integrity = 0;
for (unsigned int i = 0; i < word_input.length(); i++)
{
if(word_input[i] >= 'a' && word_input[i] <= 'z') //if the letter of the word belongs to the lowercase alphabet
{
check_integrity++;
}
}
if(word_input.length() == check_integrity)
{
new_words_file << word_input << std::endl; //add the word to the new file
tupleList = std::make_tuple(word_input, getScore(word_input)); //make tuple with the basic score and the word
counter++; //to check if the amount of words in the new file are correct
std::cout << std::get<0>(tupleList) << ": " << std::get<1>(tupleList) << std::endl;
}
}
std::cout << counter << std::endl;
}
One would generally use a tuple when there are more than two values of different types to store. For just two values a pair is a better choice.
In your case, what you want to achieve seems to be a list of word-value pairs. You can store them in a container like a vector, but you can also store them as key-value pairs in a map. A std::map is literally a collection of std::pair objects, and tuples are a generalization of pairs.
For completeness, if my understanding of your code purpose is correct, these are additions to your code for storing each tuple in a vector - declarations,
std::tuple<std::string, int> correct_word = {};
std::vector<std::tuple<std::string, int>> existing_words = {};
changes in the loop that saves existing words - here you want to add each word-value tuple to the vector,
if(word_input.length() == check_integrity)
{
// ...
correct_word = std::make_tuple(word_input, getScore(word_input));
existing_words.push_back(correct_word);
// ...
}
...and finally an example of usage outside the construction loop:
for (size_t iv=0; iv<existing_words.size(); ++iv)
{
correct_word = existing_words[iv];
std::cout << std::get<0>(correct_word) << ": " << std::get<1>(correct_word) << std::endl;
}
std::cout << counter << std::endl;
The same code with a map would look like:
The only declaration would be a map from strings to values (instead of a tuple and vector of tuples),
std::map<std::string, int> existing_words = {};
In the construction loop you would be creating the map pair in a single line like this,
if(word_input.length() == check_integrity)
{
// ...
existing_words[word_input] = getScore(word_input);
// ...
}
After construction, you access map elements using .first for the word and .second for the score. Below is a printing example that also uses a range-based for loop with auto:
for (const auto& correct_word : existing_words)
std::cout << correct_word.first << ": " << correct_word.second << std::endl;
std::cout << counter << std::endl;
Note that a std::map is ordered by key by default (alphabetically for std::string keys); you can provide your own ordering rule, or use a std::unordered_map if you don't need any ordering.
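As a compact illustration of the pair-based variant, here is a minimal sketch. Since getScore is not shown in the question, placeholder_score is a hypothetical stand-in (one point per letter), and the names ScoredWord, score_words and score_at are likewise made up for the example.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using ScoredWord = std::pair<std::string, int>;

// Hypothetical stand-in for the question's getScore(): one point per letter.
int placeholder_score(const std::string& word) {
    return static_cast<int>(word.size());
}

// Pairs every word with its score, preserving the input order.
std::vector<ScoredWord> score_words(const std::vector<std::string>& words) {
    std::vector<ScoredWord> scored;
    scored.reserve(words.size());
    for (const std::string& word : words) {
        scored.emplace_back(word, placeholder_score(word));
    }
    return scored;
}

// Index-based access, exactly as with any other vector.
// at() throws std::out_of_range for an invalid index.
int score_at(const std::vector<ScoredWord>& scored, std::size_t index) {
    return scored.at(index).second;
}
```

Each entry is then reachable by position (scored[i].first, scored[i].second), which is probably the closest equivalent to "a tuple accessed like a vector".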

Slice vectors inside a map

I have a std::map<std::string, std::vector<int>>. Is there a way to provide a "view" of that map to a function that takes a variable of the same type? Specifically, is there a way to slice the vectors within the map, yet provide a view (sliced) that is compliant to the std::map interface? Something similar to boost range adapters or indexes, but for nested structures.
I am mainly looking for something via boost, but I am open to other suggestions as well.
[UPDATE] the goal is to "NOT" copy or move the map, only access its vectors according to the slicing criteria. And the function that takes the map as a variable should not be aware of the slicing. I hope this makes the question clearer.
Here's a pseudo example:
map<string, vector<int>> my_map;
my_map["a"] = {0,1,2,3,4,5};
my_map["b"] = {0,1,2,3,4,5};
my_map["c"] = {0,1,2,3,4,5};
map<string, pair<int,int>> slices;
slices["a"] = {1,4};
slices["b"] = {2,3};
slices["c"] = {0,5};
map_view = magic(my_map, slices);
cout << "a: " << print_vector(map_view["a"]) << endl;
cout << "b: " << print_vector(map_view["b"]) << endl;
cout << "c: " << print_vector(map_view["c"]) << endl;
//desired output
a: 1,2,3
b: 2
c: 0,1,2,3,4
No magic needed; try this:
auto& ref_map = my_map["a"];
auto& ref_slice = slices["a"];
std::cout << "a: ";
std::copy (
ref_map.begin() + ref_slice.first,
ref_map.begin() + ref_slice.second,
std::ostream_iterator<int> (std::cout,", ")
);
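The same idea can be wrapped into a small helper that hands back an iterator pair instead of printing. This is a sketch under assumptions: slice_view and Iter are names invented here, the bounds are treated as half-open [from, to) as in the desired output of the question, and out-of-range bounds are clamped (the snippet above does no such checking).

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Iter = std::vector<int>::const_iterator;

// Returns iterators delimiting [from, to) inside `values`; nothing is copied.
// Bounds beyond the vector are clamped, and from > to yields an empty range.
std::pair<Iter, Iter> slice_view(const std::vector<int>& values,
                                 std::size_t from, std::size_t to) {
    const std::size_t hi = std::min(to, values.size());
    const std::size_t lo = std::min(from, hi);
    return std::make_pair(values.begin() + static_cast<std::ptrdiff_t>(lo),
                          values.begin() + static_cast<std::ptrdiff_t>(hi));
}
```

The returned iterators stay valid only as long as the underlying vector is neither destroyed nor reallocated, so this is a view in the loose sense, not an object with the std::map interface.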
I ended up adding a function that takes one of the map values and a slicing criteria, then returns a joined boost range over multiple boost slices, depending on the slicing criteria. In addition to that, I resorted to using auto return type inference of C++14 to avoid messing with the actual return types of boost adaptors and ranges.
Here's a quick snippet:
const auto get_map_view(
string key,
const map<string, vector<int>> & my_map,
const pair<int,int> & slice, bool exclude=false) {
const auto & values = my_map.at(key);
if (!exclude) {
return boost::range::join(
values | boost::adaptors::sliced(0, 0),
values | boost::adaptors::sliced(slice.first, slice.second));
} else {
return boost::range::join(
values | boost::adaptors::sliced(0, slice.first),
values | boost::adaptors::sliced(slice.second, values.size()));
}
}
auto front_edge = [&](auto a) {
auto& bounds = slices[a];
auto& v = my_map[a]; // reference, not a copy, so the returned iterator stays valid
return std::find(v.begin(), v.end(), bounds.first);
};
This, for example, will get you an iterator to the front edge based on your criterion; the rest should be trivial with lambdas.

Iterating through two maps in c++

I would like to loop through two maps at the same time, how could I achieve this?
I have two maps and want to print both. Can I use (auto it : mymap) twice within one for? Something like:
for (auto it: mymap && auto on: secondMap)
is this even allowed?
I am trying to print values like (value1, value2) where each of the values is in a different map. The maps do not necessarily contain the exact same items but the key is an Instruction and the value is an integer, so if I have a element in the map for value2, then not necessarily there is a value1 corresponding to the same key, but in that case it should be 0 which is the default integer value.
Any ideas?
Perhaps it is possible to combine two iterators, one for each map?
Kind regards,
Guus Leijsten
You can use a regular for-loop for this:
#include <cstdlib>
#include <iostream>
#include <map>
#include <string>
int main(int argc, char* argv[]) {
std::map<int, std::string> m1, m2;
m1.insert({15, "lala"});
m1.insert({10, "hey!"});
m1.insert({99, "this"});
m2.insert({50, "foo"});
m2.insert({51, "bar"});
for(auto it_m1 = m1.cbegin(), end_m1 = m1.cend(),
it_m2 = m2.cbegin(), end_m2 = m2.cend();
it_m1 != end_m1 || it_m2 != end_m2;)
{
if(it_m1 != end_m1) {
std::cout << "m1: " << it_m1->first << " " << it_m1->second << " | ";
++it_m1;
}
if(it_m2 != end_m2) {
std::cout << "m2: " << it_m2->first << " " << it_m2->second << std::endl;
++it_m2;
}
}
return EXIT_SUCCESS;
}
Note that because you want to iterate over maps of different sizes, you have to use the || operator in the loop condition. The direct consequence is that you cannot increment in the last part of the for-loop, as one of the iterators may already be at the end at that time; incrementing it would be undefined behavior (and may lead to a segmentation fault).
You have to check iterator validity inside the loop and increment it when it's valid, as shown in the sample above.
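The loop above prints the two maps side by side by position. If the goal is instead to pair values by key, with 0 for a key missing on one side as the question describes, a merge-style walk over both ordered maps is one option. The sketch below is an assumption-laden illustration: zip_by_key is a made-up name, and int stands in for the Instruction key type, which is not shown in the question.

```cpp
#include <map>
#include <tuple>
#include <vector>

// Walks two ordered maps in key order and emits (key, value1, value2).
// A key that is missing from one map contributes 0 for that side.
std::vector<std::tuple<int, int, int>> zip_by_key(const std::map<int, int>& a,
                                                  const std::map<int, int>& b) {
    std::vector<std::tuple<int, int, int>> rows;
    auto ia = a.begin();
    auto ib = b.begin();
    while (ia != a.end() || ib != b.end()) {
        if (ib == b.end() || (ia != a.end() && ia->first < ib->first)) {
            rows.emplace_back(ia->first, ia->second, 0);  // key only in a
            ++ia;
        } else if (ia == a.end() || ib->first < ia->first) {
            rows.emplace_back(ib->first, 0, ib->second);  // key only in b
            ++ib;
        } else {
            rows.emplace_back(ia->first, ia->second, ib->second);  // key in both
            ++ia;
            ++ib;
        }
    }
    return rows;
}
```

This relies on both maps using the same default key ordering; with a custom comparator, the key comparisons would have to use that comparator instead of operator<.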

Combinations of N Boost interval_set

I have a service which has outages in 4 different locations. I am modeling each location's outages as a Boost ICL interval_set. I want to know when at least N locations have an active outage.
Therefore, following this answer, I have implemented a combination algorithm, so I can create combinations between elements via interval_set intersections.
When this process is over, I should have a certain number of interval_sets, each one defining the outages for N locations simultaneously, and the final step will be joining them to get the desired full picture.
The problem is that I'm currently debugging the code, and when it comes to printing each intersection, the output goes haywire (even when I step through with gdb): I can't read the intersections, and CPU usage spikes.
I guess that somehow I'm sending a larger portion of memory to the output than I should, but I can't see where the problem is.
This is an SSCCE:
#include <boost/icl/interval_set.hpp>
#include <algorithm>
#include <iostream>
#include <vector>
int main() {
// Initializing data for test
std::vector<boost::icl::interval_set<unsigned int> > outagesPerLocation;
for(unsigned int j=0; j<4; j++){
boost::icl::interval_set<unsigned int> outages;
for(unsigned int i=0; i<5; i++){
outages += boost::icl::discrete_interval<unsigned int>::closed(
(i*10), ((i*10) + 5 - j));
}
std::cout << "[Location " << (j+1) << "] " << outages << std::endl;
outagesPerLocation.push_back(outages);
}
// So now we have a vector of interval_sets, one per location. We will combine
// them so we get an interval_set defined for those periods where at least
// 2 locations have an outage (N)
unsigned int simultaneusOutagesRequired = 2; // (N)
// Create a bool vector in order to filter permutations, and only get
// the sorted permutations (which equals the combinations)
std::vector<bool> auxVector(outagesPerLocation.size());
std::fill(auxVector.begin() + simultaneusOutagesRequired, auxVector.end(), true);
// Create a vector where combinations will be stored
std::vector<boost::icl::interval_set<unsigned int> > combinations;
// Get all the combinations of N elements
unsigned int numCombinations = 0;
do{
bool firstElementSet = false;
for(unsigned int i=0; i<auxVector.size(); i++){
if(!auxVector[i]){
if(!firstElementSet){
// First location, insert to combinations vector
combinations.push_back(outagesPerLocation[i]);
firstElementSet = true;
}
else{
// Intersect with the other locations
combinations[numCombinations] &= outagesPerLocation[i];
}
}
}
numCombinations++;
std::cout << "[-INTERSEC-] " << combinations[numCombinations] << std::endl; // The problem appears here
}
while(std::next_permutation(auxVector.begin(), auxVector.end()));
// Get the union of the intersections and see the results
boost::icl::interval_set<unsigned int> finalOutages;
for(std::vector<boost::icl::interval_set<unsigned int> >::iterator
it = combinations.begin(); it != combinations.end(); it++){
finalOutages += *it;
}
std::cout << finalOutages << std::endl;
return 0;
}
Any help?
As I surmised, there's a "highlevel" approach here.
Boost ICL containers are more than just containers of "glorified pairs of interval starting/end points". They are designed to implement just that business of combining, searching, in a generically optimized fashion.
So you don't have to.
If you let the library do what it's supposed to do:
using TimePoint = unsigned;
using DownTimes = boost::icl::interval_set<TimePoint>;
using Interval = DownTimes::interval_type;
using Records = std::vector<DownTimes>;
Using functional domain typedefs invites a higher level approach. Now, let's ask the hypothetical "business question":
What do we actually want to do with our records of per-location downtimes?
Well, we essentially want to
tally them for all discernable time slots and
filter those where tallies are at least 2
finally, we'd like to show the "merged" time slots that remain.
Ok, engineer: implement it!
Hmm. Tallying. How hard could it be?
❕ The key to elegant solutions is the choice of the right datastructure
using Tally = unsigned; // or: bit mask representing affected locations?
using DownMap = boost::icl::interval_map<TimePoint, Tally>;
Now it's just bulk insertion:
// We will do a tally of affected locations per time slot
DownMap tallied;
for (auto& location : records)
for (auto& incident : location)
tallied.add({incident, 1u});
Ok, let's filter. We just need the predicate that works on our DownMap, right
// define threshold where at least 2 locations have an outage
auto exceeds_threshold = [](DownMap::value_type const& slot) {
return slot.second >= 2;
};
Merge the time slots!
Actually. We just create another DownTimes set, right. Just, not per location this time.
The choice of data structure wins the day again:
// just printing the union of any criticals:
DownTimes merged;
for (auto&& slot : tallied | filtered(exceeds_threshold) | map_keys)
merged.insert(slot);
Report!
std::cout << "Criticals: " << merged << "\n";
Note that nowhere did we come close to manipulating array indices, overlapping or non-overlapping intervals, closed or open boundaries. Or, [eeeeek!] brute force permutations of collection elements.
We just stated our goals, and let the library do the work.
Full Demo
#include <boost/icl/interval_set.hpp>
#include <boost/icl/interval_map.hpp>
#include <boost/range.hpp>
#include <boost/range/algorithm.hpp>
#include <boost/range/adaptors.hpp>
#include <boost/range/numeric.hpp>
#include <boost/range/irange.hpp>
#include <algorithm>
#include <iostream>
#include <vector>
using TimePoint = unsigned;
using DownTimes = boost::icl::interval_set<TimePoint>;
using Interval = DownTimes::interval_type;
using Records = std::vector<DownTimes>;
using Tally = unsigned; // or: bit mask representing affected locations?
using DownMap = boost::icl::interval_map<TimePoint, Tally>;
// Just for fun, removed the explicit loops from the generation too. Obviously,
// this is bit gratuitous :)
static DownTimes generate_downtime(int j) {
return boost::accumulate(
boost::irange(0, 5),
DownTimes{},
[j](DownTimes accum, int i) { return accum + Interval::closed((i*10), ((i*10) + 5 - j)); }
);
}
int main() {
// Initializing data for test
using namespace boost::adaptors;
auto const records = boost::copy_range<Records>(boost::irange(0,4) | transformed(generate_downtime));
for (auto location : records | indexed()) {
std::cout << "Location " << (location.index()+1) << " " << location.value() << std::endl;
}
// We will do a tally of affected locations per time slot
DownMap tallied;
for (auto& location : records)
for (auto& incident : location)
tallied.add({incident, 1u});
// We will combine them so we get an interval_set defined for those periods
// where at least 2 locations have an outage
auto exceeds_threshold = [](DownMap::value_type const& slot) {
return slot.second >= 2;
};
// just printing the union of any criticals:
DownTimes merged;
for (auto&& slot : tallied | filtered(exceeds_threshold) | map_keys)
merged.insert(slot);
std::cout << "Criticals: " << merged << "\n";
}
Which prints
Location 1 {[0,5][10,15][20,25][30,35][40,45]}
Location 2 {[0,4][10,14][20,24][30,34][40,44]}
Location 3 {[0,3][10,13][20,23][30,33][40,43]}
Location 4 {[0,2][10,12][20,22][30,32][40,42]}
Criticals: {[0,4][10,14][20,24][30,34][40,44]}
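For readers without Boost at hand, the tally-then-filter idea can also be approximated with a plain sweep line. This is a different technique from what the demo above uses, and a rough sketch only: covered_at_least and Closed are names invented here, intervals are inclusive unsigned pairs, the intervals of a single location are assumed not to overlap, the threshold is assumed to be at least 1, and no interval may end at the maximum unsigned value.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Closed = std::pair<unsigned, unsigned>;  // inclusive [first, second]

// Library-free sweep line (an alternative technique, not Boost ICL).
// Returns the merged stretches during which at least `threshold`
// locations are down at the same time.
std::vector<Closed> covered_at_least(const std::vector<std::vector<Closed>>& records,
                                     int threshold) {
    // +1 where an interval starts, -1 one past its inclusive end.
    std::vector<std::pair<unsigned, int>> events;
    for (const auto& location : records) {
        for (const Closed& incident : location) {
            events.emplace_back(incident.first, +1);
            events.emplace_back(incident.second + 1, -1);
        }
    }
    std::sort(events.begin(), events.end());

    std::vector<Closed> result;
    int active = 0;      // locations down at the current position
    bool open = false;   // currently inside a qualifying stretch?
    unsigned start = 0;
    std::size_t i = 0;
    while (i < events.size()) {
        const unsigned position = events[i].first;
        // Apply every event at this position before testing the threshold,
        // so touching intervals are merged instead of split.
        while (i < events.size() && events[i].first == position) {
            active += events[i].second;
            ++i;
        }
        if (!open && active >= threshold) {
            open = true;
            start = position;
        } else if (open && active < threshold) {
            open = false;
            result.emplace_back(start, position - 1);
        }
    }
    return result;
}
```

Fed with the same four locations and a threshold of 2, this should yield the five intervals listed under Criticals above. It is noticeably more bookkeeping than the interval_map version, which is rather the point the answer makes.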
At the end of the permutation loop, you write:
numCombinations++;
std::cout << "[-INTERSEC-] " << combinations[numCombinations] << std::endl; // The problem appears here
My debugger tells me that on the first iteration numCombinations was 0 before the increment. Incrementing it put it out of range for the combinations container (which at that point holds a single element, at index 0).
Did you mean to increment it after the use? Was there any particular reason not to use
std::cout << "[-INTERSEC-] " << combinations.back() << "\n";
or, for C++03
std::cout << "[-INTERSEC-] " << combinations[combinations.size()-1] << "\n";
or even just:
std::cout << "[-INTERSEC-] " << combinations.at(numCombinations) << "\n";
which would have thrown std::out_of_range?
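To make the difference between operator[] and at() concrete, here is a minimal sketch. The helper name index_out_of_range is made up for this example; it only relies on the documented behavior that std::vector::at throws std::out_of_range for an invalid index, whereas operator[] with such an index is undefined behavior.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true when `index` is not a valid position in `values`,
// relying on at() to detect it instead of reading past the end.
bool index_out_of_range(const std::vector<int>& values, std::size_t index) {
    try {
        (void)values.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With a vector holding one element, index 0 is fine and index 1 is reported as out of range, which mirrors the state of combinations and numCombinations right after the first increment.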
On a side note, I think Boost ICL has vastly more efficient ways to get the answer you're after. Let me think about this for a moment. Will post another answer if I see it.
UPDATE: Posted the other answer showcasing high-level coding with Boost ICL.

Does g++'s std::list::sort invalidate iterators?

According to SGI, cplusplus.com, and every other source I've got, the sort() member function of the std::list should not invalidate iterators. However, that doesn't seem to be the case when I run this code (c++11):
#include <list>
#include <chrono>
#include <functional>
#include <random>
#include <iostream>
#include "print.hpp"
unsigned int seed = std::chrono::system_clock::now().time_since_epoch().count();
std::default_random_engine generator(seed);
std::uniform_int_distribution<unsigned int> distribution(1, 1000000000);
auto rng = std::bind(distribution, generator);
// C++11 RNG stuff. Basically, rng() now gives some unsigned int [1, 1000000000]
int main() {
unsigned int values(0);
std::cin >> values; // Determine the size of the list
std::list<unsigned int> c;
for (unsigned int n(0); n < values; ++n) {
c.push_front(rng());
}
auto c0(c);
auto it(c.begin()), it0(c0.begin());
for (unsigned int n(0); n < 7; ++n) {
++it; // Offset these iterators so I can print 7 values
++it0;
}
std::cout << "With seed: " << seed << "\n";
std::cout << "Unsorted list: \n";
print(c.begin(), c.end()) << "\n";
print(c.begin(), it) << "\n\n";
auto t0 = std::chrono::steady_clock::now();
c0.sort();
auto d0 = std::chrono::steady_clock::now() - t0;
std::cout << "Sorted list: \n";
print(c0.begin(), c0.end()) << "\n";
print(c0.begin(), it0) << "\n"; // My own print function, given further below
std::cout << "Seconds: " << std::chrono::duration<double>(d0).count() << std::endl;
return 0;
}
In print.hpp:
#include <iostream>
template<class InputIterator>
std::ostream& print(InputIterator begin, const InputIterator& end,
std::ostream& out = std::cout) {
bool first(true);
out << "{";
for (; begin != end; ++begin) {
if (first) {
out << (*begin);
first = false;
} else {
out << ", " << (*begin);
}
}
out << "}";
return out;
}
Sample input/output:
11
With seed: 3454921017
Unsorted list:
{625860546, 672762972, 319409064, 8707580, 317964049, 762505303, 756270868, 249266563, 224065083, 843444019, 523600743}
{625860546, 672762972, 319409064, 8707580, 317964049, 762505303, 756270868}
Sorted list:
{8707580, 224065083, 249266563, 317964049, 319409064, 523600743, 625860546, 672762972, 756270868, 762505303, 843444019}
{8707580, 224065083}
Seconds: 2.7e-05
Everything works as expected, except for the printing. It is supposed to show 7 elements, but instead the actual number is fairly haphazard, provided "values" is set to more than 7. Sometimes it gives none, sometimes it gives 1, sometimes 10, sometimes 7, etc.
So, is there something observably wrong with my code, or does this indicate that g++'s std::list (and std::forward_list) is not standards conforming?
Thanks in advance!
The iterators remain valid and still refer to the same elements of the list, which have been re-ordered.
So I don't think your code does what you think it does. It prints the list from the beginning, to wherever the 7th element ended up after the list was sorted. The number of elements it prints therefore depends on the values in the list, of course.
Consider the following code:
#include <list>
#include <iostream>
int main() {
std::list<int> l;
l.push_back(1);
l.push_back(0);
std::cout << (void*)(&*l.begin()) << "\n";
l.sort();
std::cout << (void*)(&*l.begin()) << "\n";
}
The two addresses printed differ, showing that (unlike std::sort) std::list::sort has sorted by changing the links between the elements, not by assigning new values to the elements.
I've always assumed that this is mandated (likewise for reverse()). I can't actually find explicit text to say so, but if you look at the description of merge, and consider that the reason for list::sort to exist is presumably because mergesort works nicely with lists, then I think it's "obviously" intended. merge says, "Pointers and references to the moved elements of x now refer to those same elements but as members of *this" (23.3.5.5./23), and the start of the section that includes merge and sort says, "Since lists allow fast insertion and erasing from the middle of a list, certain operations are provided specifically for them" (23.3.5.5/1).
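The effect described in the question can be reproduced in a few lines. This is a small sketch with a made-up helper name, position_after_sort; it assumes, as argued above, that the iterator keeps referring to the same element while that element moves within the list.

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Sorts the list, then reports how many elements now precede the element
// that `it` referred to before the sort.
std::size_t position_after_sort(std::list<int>& values,
                                std::list<int>::iterator it) {
    values.sort();
    return static_cast<std::size_t>(std::distance(values.begin(), it));
}
```

For the list 5, 1, 9, 3 and an iterator to the first element, the function returns 2: the iterator still refers to the value 5, which now sits third. Printing from begin() up to such an iterator therefore shows a number of elements that depends on the values, as observed in the question.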