Discrepancy between size and number of iterations C++

I create a pointer p_dataStatStorage_ to an object of type:
STK::Array2D<std::vector<std::pair<int, float> > >
where STK::Array2D is a type of 2D container, with elt() as an accessor. At some point in my code I fill the vector of a particular cell (ind, j) with pushbacks of the form:
p_dataStatStorage_->elt(ind, j).push_back(std::pair<int, float>(currMod, currProba));
Later in the code, in another function, I need to read the couples that were pushed earlier. I then use the following code:
for (std::vector<std::pair<int, float> >::const_iterator itVec = p_dataStatStorage->elt(i, j).begin();
itVec != p_dataStatStorage->elt(i, j).end();
++itVec)
{
std::cout << "itVec->first: " << itVec->first << ", itVec->second: " << itVec->second << std::endl;
}
The problems are:
1. For every given couple (i, j), the first itVec->first always outputs 0, no matter what was set earlier, while the remaining values (int or float) of the std::pair<int, STK::Real> are output correctly.
2. When I output the size() of each vector prior to the above loop, I always get the number of elements corresponding to the data originally pushed. However, in some cases, the loop above runs for an infinite number of iterations after the iterations described in 1., and the values output for the int are random, while the values output for the float are near zero.
Are there standard things I should check immediately? I am a bit puzzled because if I check the elements immediately after pushing them, in the function where I pushed them, I get no errors, while if I do this later in another function I get the behaviour described above.
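One standard thing to check, offered only as a guess because the declaration of elt() is not shown: if elt() returns the vector by value rather than by reference, every call yields a separate temporary, so begin() and end() would belong to different vectors, which could produce both symptoms. The sketch below uses a hypothetical stand-in container (Grid2D and sum_first are made-up names, not part of STK) and binds the cell to a const reference once before iterating.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<std::pair<int, float> > Cell;

// Hypothetical stand-in for a 2D container; NOT the real STK::Array2D.
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols)
        : cols_(cols), cells_(rows * cols) {}

    // Returning a reference matters: a by-value return would hand back
    // a fresh temporary copy on every call.
    Cell& elt(std::size_t i, std::size_t j) { return cells_[i * cols_ + j]; }
    const Cell& elt(std::size_t i, std::size_t j) const { return cells_[i * cols_ + j]; }

private:
    std::size_t cols_;
    std::vector<Cell> cells_;
};

// Sums the int halves of one cell. begin() and end() come from the same
// object because the cell is bound to a reference exactly once.
int sum_first(const Grid2D& grid, std::size_t i, std::size_t j) {
    const Cell& cell = grid.elt(i, j);
    int total = 0;
    for (Cell::const_iterator it = cell.begin(); it != cell.end(); ++it) {
        total += it->first;
    }
    return total;
}
```

If the real accessor does return by value, taking one local copy of the vector and iterating over that copy would be the equivalent fix.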

Why iterator is not dereferenced as an lvalue

Apologies if my question does not contain all relevant info. Please comment and I will amend accordingly.
I use CLion on Win7 with MinGW and gcc
I have been experimenting with circular buffers and came across boost::circular_buffer, but for the size of my project I want to use circular buffer by Pete Goodlife, which seems like a solid implementation in just one .hpp.
Note: I am aware of how to reduce boost dependencies thanks to Boost dependencies and bcp.
However, the following example with Pete's implementation does not behave as expected, i.e. the result of std::adjacent_difference(cbuf.begin(),cbuf.end(),df.begin()); comes out empty. I would like to understand why and possibly correct its behaviour.
Follows a MWE:
#include "circular.h"
#include <iostream>
#include <algorithm>
typedef circular_buffer<int> cbuf_type;
void print_cbuf_contents(cbuf_type &cbuf){
std::cout << "Printing cbuf size("
<<cbuf.size()<<"/"<<cbuf.capacity()<<") contents...\n";
for (size_t n = 0; n < cbuf.size(); ++n)
std::cout << " " << n << ": " << cbuf[n] << "\n";
if (!cbuf.empty()) {
std::cout << " front()=" << cbuf.front()
<< ", back()=" << cbuf.back() << "\n";
} else {
std::cout << " empty\n";
}
}
int main()
{
cbuf_type cbuf(5);
for (int n = 0; n < 3; ++n) cbuf.push_back(n);
print_cbuf_contents(cbuf);
cbuf_type df(5);
std::adjacent_difference(cbuf.begin(),cbuf.end(),df.begin());
print_cbuf_contents(df);
}
Which prints the following:
Printing cbuf size(3/5) contents...
0: 0
1: 1
2: 2
front()=0, back()=2
Printing cbuf size(0/5) contents...
empty
Unfortunately, being new to C++, I can't figure out why the df.begin() iterator is not dereferenced as an lvalue.
I suspect the culprit is (or I don't completely understand) the member call of the circular_buffer_iterator on line 72 in Pete's circular.h:
elem_type &operator*() { return (*buf_)[pos_]; }
Any help is very much appreciated.
The iterator you pass as the output iterator is dereferenced and treated as an lvalue, and most probably the data you expect is actually stored in the circular buffer's buffer.
The problem is that, apart from the actual storage buffer, most containers also hold some internal bookkeeping state that has to be maintained (for instance, how many elements are in the buffer, how much free space is left, etc.).
Dereferencing and incrementing an iterator doesn't update that internal state, so the container does not "know" that new data has been added.
Consider the following code (an illustration only: writing through an iterator past size() is formally undefined behavior, even into reserved capacity):
std::vector<int> v;
v.reserve(3);
auto i = v.begin();
*(i++) = 1; // this simply writes to memory
*(i++) = 2; // but doesn't update the internal
*(i++) = 3; // state of the vector
assert(v.size() == 0); // so the vector still "thinks" it's empty
Using push_back would work as expected:
std::vector<int> v;
v.reserve(3);
v.push_back(1); // adds to the storage AND updates internal state
v.push_back(2);
v.push_back(3);
assert(v.size() == 3); // so the vector "knows" it has 3 elements
In your case, you should use std::back_inserter, an output iterator that calls push_back on the container every time a value is assigned through it:
std::adjacent_difference(
cbuf.begin(), cbuf.end(),
std::back_inserter(df));
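As a self-contained illustration of the same idea, here is a minimal sketch that uses std::vector in place of the circular buffer (the function name differences is made up for this example). The output vector starts empty and grows as the algorithm writes to it.

```cpp
#include <iterator>
#include <numeric>
#include <vector>

// Returns the adjacent differences of `in`. The output vector starts empty;
// std::back_inserter grows it through push_back on every assignment.
std::vector<int> differences(const std::vector<int>& in) {
    std::vector<int> out;
    std::adjacent_difference(in.begin(), in.end(), std::back_inserter(out));
    return out;
}
```

For the input 0, 1, 2 this yields 0, 1, 1: the first element is copied unchanged and each later one is the difference from its predecessor.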
std::adjacent_difference writes to the result iterator. In your case, that result iterator points into df, which has a size of 0 and a capacity of 5. Those writes go into the reserved memory of df but do not change the size of the container, so size will still be 0, even though the first 3 ints of the reserved space hold your differences. In order to see the results, the container being written into must already have data stored in the slots being written to.
So to see the results you must put data into the circular buffer before the difference, then resize the container to the appropriate size (based on the iterator returned by adjacent_difference).
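The pre-size-then-trim approach can be sketched as follows, again with std::vector standing in for the circular buffer. The function name differences_presized and the slots parameter are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Writes the adjacent differences of `in` into a vector that already
// holds `slots` elements, then trims it using the returned iterator.
std::vector<int> differences_presized(const std::vector<int>& in, std::size_t slots) {
    assert(slots >= in.size());      // writing past size() would be undefined
    std::vector<int> out(slots, 0);  // elements exist, so plain writes are valid
    std::vector<int>::iterator last =
        std::adjacent_difference(in.begin(), in.end(), out.begin());
    out.erase(last, out.end());      // drop the unused filler elements
    return out;
}
```

Compared with std::back_inserter this costs an extra initialization of the filler elements, so it is mainly of interest when the destination has no push_back.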

Show the percentage of process completion in a C++ program

I am making a set of C++ library as a part of my Data Structures assignment, which includes custom implementation of vector, sorting algorithms, stacks, etc. I am supposed to work on the running time of sorting algorithms, bubble sort, selection sort, quick sort, etc., which are part of my library.
Now the data set given to test the algorithms is of the order of 10^6. I ran bubble sort on 2*10^6 elements, and it took about 138 minutes for the program to run. In all this time, I did not know whether my sorting algorithm was working correctly, or whether it was working at all. I would like to add another feature to the sorting functions, i.e. they could display the percentage of sorting done, and I think this is possible, since algorithms like bubble sort are deterministic.
I need a message like the following to appear as soon as I start the process:
Bubble sort under progress. Done: 17%
This percentage is to be determined by the algorithm. Consider the example of bubble sort with 10000 elements. If you look at the bubble sort algorithm (see https://en.wikipedia.org/wiki/Bubble_sort), it has two loops, and after each iteration of the outer loop, one element is fixed in its correct position in the sorted array. So after one iteration, the percentage should increase by 0.01%.
This calculation has the problem that the time taken for each percentage step keeps decreasing, but something like this would do.
Also, this number should update in place, on the same spot on the screen. But I have no idea how to implement it.
You can pass a callback function of a generic type to your bubblesort function and call the function at reasonable intervals.
This will impact performance, but this shouldn't be a concern when you're using bubblesort anyway.
First we'll need some includes:
#include <iostream>
#include <vector>
#include <random>
#include <chrono>
And then the bubblesort function, which I essentially took from wikipedia: https://en.wikipedia.org/wiki/Bubble_sort#Optimizing_bubble_sort
template <typename T, typename Func>
void bubblesort(std::vector<T> &v, Func callback) {
size_t const len = v.size();
size_t n = v.size();
while(n > 0) {
size_t newn = 0;
for(size_t i = 1; i <= n-1; ++i) {
if (v[i - 1] > v[i]) {
std::swap(v[i-1], v[i]);
newn = i;
}
}
n = newn;
callback(100-static_cast<int>(n*100/len));
}
}
We call the given callback function (or use operator() on an object) each time a pass of the outer loop finishes.
The parameter we pass is an integer percentage of how far we've come. Note that because of integer arithmetic the multiplication must come before the division: n*100/len works, whereas n/len*100 would yield 0 whenever n is smaller than len.
using namespace std::chrono; //to avoid the horrible line becoming even longer
int main() {
std::vector<int> vec;
/* fill vector with some data */
std::mt19937 generator(static_cast<unsigned long>(duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count())); //oh god
for(int i = 0; i < 100000; ++i) {
vec.push_back(static_cast<int>(generator()));
}
For initialization we create a random number generator and seed it with the current time. Then we put some elements in the vector.
char const *prefix = "Bubble sort under progress. Done: ";
int lastp = -1;
bubblesort(vec, [&lastp,prefix](int p){
//if progress has changed, update it
if(p != lastp) {
lastp = p;
std::cout << "\r" << prefix << p << "%" << std::flush;
/*std::flush is needed when we don't start a new line
'\r' puts the cursor to the start of the line */
}
});
std::cout << "\r" << prefix << "100%" << std::endl;
//make sure we always end on 100% and end the line
}
Now the core part: we pass a C++ lambda function to our bubblesort function as a callback. Our bubblesort function will then call this lambda with the percentage value and write it to the screen.
And voilà, we got ourselves some neat output:
https://youtu.be/iFGN8Wy9T3o
Closing notes:
You can of course integrate the lambda function into the sort function itself, however I would not recommend this as you lose a lot of flexibility. But it's a design choice that's up to you: if you don't need the flexibility, just hardcode it.
The percentage is not very accurate. In fact, knowing you're at 20% (and how long it took to get there) does not tell you much about the time it will take to get to 100%. It could very well be that the last 20% of the vector was already sorted (and thus quick to handle with bubblesort, O(n)), while the remaining 80% is random and takes O(n^2) to sort.
All it really tells you is that you're making progress, but that's all you wanted in the first place, so I guess that's okay.
If you want a more accurate percentage adjust your program like this:
#include <iomanip>
/* ... */
callback(10000-static_cast<int>(n*10000/len));
/* ... */
std::cout.fill('0'); //to fill leading zero of p%100
std::cout << "\r" << prefix << p/100 << "." << std::setw(2) << p%100 << "%" << std::flush;
If you decide to use floating-point values instead, remember to clear remnant characters from previous outputs: "\r" only resets the cursor position and does not clear the line.
Use std::fixed together with std::cout.precision(3); for a fixed number of decimals (precision alone does not fix the width of the output), or write some spaces after your message to overwrite previous runs.
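A minimal sketch of the floating-point variant, assuming a helper named format_progress (a made-up name) that always produces a string of the same width, so that a later, shorter value fully overwrites an earlier one after "\r":

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats a percentage with one decimal, right-aligned in 5 columns,
// followed by '%'. Values from 0.0 to 100.0 all yield 6 characters.
std::string format_progress(double percent) {
    std::ostringstream os;
    os << std::fixed << std::setprecision(1) << std::setw(5) << percent << '%';
    return os.str();
}
```

Inside the callback one would then write something like std::cout << "\r" << prefix << format_progress(p) << std::flush;, where p is now a double.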
For the special case of bubblesort, you can take the number of elements you have, then divide that by 100. If you have 552 elements, then you will get 5. (integers make sense to work with). Then, have a counter in your loop. If the counter is a multiple of 5, (you've so far sorted 5 elements) then you can increase the percentage by 1 and print it. As far as printing it so that the percentage appears on the spot instead of printing below, you can print backspaces! Either that or try using the ncurses library, though that might be overkill. Finally, a different way to do this might be to use a linux style progress bar that is 50 characters long or something similar.
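The counter idea described above can be sketched like this. ProgressCounter is a hypothetical helper, and the cap at 100 is an addition: with 552 elements and a step of 5 the raw count would otherwise reach 110.

```cpp
#include <cstddef>

// Counts placed elements and bumps a percentage every total/100 ticks.
// Coarse by design; intended for inputs with at least 100 elements.
class ProgressCounter {
public:
    explicit ProgressCounter(std::size_t total)
        : step_(total >= 100 ? total / 100 : 1), count_(0), percent_(0) {}

    // Call once per element placed. Returns true when the percentage
    // changed, which is the moment to reprint the progress line.
    bool tick() {
        ++count_;
        if (count_ % step_ == 0 && percent_ < 100) {
            ++percent_;
            return true;
        }
        return false;
    }

    int percent() const { return percent_; }

private:
    std::size_t step_;
    std::size_t count_;
    int percent_;
};
```

When tick() returns true, printing "\r" followed by the message and std::flush redraws the line in place, as an alternative to backspaces.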

Setting vector elements in range-based for loop [duplicate]

I have come across what I consider weird behaviour with the c++11 range-based for loop when assigning to elements of a dynamically allocated std::vector. I have the following code:
int arraySize = 1000;
std::string fname = "aFileWithLoadsOfNumbers.bin";
CTdata = new std::vector<short int>(arraySize, 0);
std::ifstream dataInput(fname.c_str(), std::ios::binary);
if(dataInput.is_open())
{
std::cout << "File opened successfully" << std::endl;
for(auto n: *CTdata)
{
dataInput.read(reinterpret_cast<char*>(&n), sizeof(short int));
// If I do "cout << n << endl;" here, I get sensible results
}
// However, if I do something like "cout << CTdata->at(500) << endl;" here, I get 0
}
else
{
std::cerr << "Failed to open file." << std::endl;
}
If I change the loop to a more traditional for(int i=0; i<arraySize; i++) and use &CTdata->at(i) in place of &n in the read function, things do as I would expect.
What am I missing?
Change this loop statement
for(auto n: *CTdata)
to
for(auto &n : *CTdata)
That is, you have to use references to the elements of the vector.
You have to write
for( auto& n : *CTdata )
because auto n means short int n when you need short int& n.
I recommend reading up on the difference between decltype and auto.
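The difference between the two loop forms can be seen in this small sketch (the function names are made up for the example):

```cpp
#include <vector>

// Iterating by value: n is a copy of each element, so the assignment
// changes only the copy and the vector is left untouched.
void fill_by_value(std::vector<short>& v, short x) {
    for (auto n : v) {
        n = x;
        (void)n;  // silence "set but not used" warnings
    }
}

// Iterating by reference: n aliases each element, so the write sticks.
void fill_by_reference(std::vector<short>& v, short x) {
    for (auto& n : v) {
        n = x;
    }
}
```

After fill_by_value the vector still holds its old contents; only fill_by_reference modifies it.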
The reason your loop fails is that you iterate over the vector elements by value, so each read lands in a copy. However, in this case you can eliminate the loop altogether:
dataInput.read(reinterpret_cast<char*>(CTdata->data()), arraySize*sizeof(short int));
This reads the content into the vector in a single call.
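A slightly more defensive sketch of the single-call read, assuming a helper named read_shorts (a made-up name) that works on any std::istream and trims the result if the stream holds fewer values than requested:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>  // only needed for the std::istringstream in the usage note
#include <vector>

// Reads up to `count` shorts from a binary stream in one call.
// The result is trimmed to the number of complete values actually read.
std::vector<short> read_shorts(std::istream& in, std::size_t count) {
    std::vector<short> data(count, 0);
    in.read(reinterpret_cast<char*>(data.data()),
            static_cast<std::streamsize>(count * sizeof(short)));
    data.resize(static_cast<std::size_t>(in.gcount()) / sizeof(short));
    return data;
}
```

It can be tried without a file by wrapping raw bytes in a std::istringstream opened with std::ios::binary; with a real file, pass the std::ifstream directly.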
Vlad's answer perfectly answers your question.
However, consider this for a moment. Instead of filling your array with zeroes from the beginning, you could call vector<>::reserve(), which preallocates the backing buffer without changing the size of the vector.
You can then call vector<>::push_back() as normal, without triggering reallocations, while keeping the logic clear in your source code. Coming from a C# background, looping over your vector like that looks like an abomination to me, not to mention that you set each element twice. Plus, if at any point your element generation fails, you'll have a bunch of zeroes that weren't supposed to be there in the first place.
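The reserve-then-push_back variant might look like the sketch below. The name read_shorts_incrementally is made up, and any std::istream stands in for the file.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>  // only needed for the std::istringstream in the usage note
#include <vector>

// Reads at most `expected` shorts one at a time. The vector only ever
// contains values that were really read, never placeholder zeroes.
std::vector<short> read_shorts_incrementally(std::istream& in, std::size_t expected) {
    std::vector<short> data;
    data.reserve(expected);  // one allocation up front, size() stays 0
    short value = 0;
    while (data.size() < expected &&
           in.read(reinterpret_cast<char*>(&value), sizeof(value))) {
        data.push_back(value);
    }
    return data;
}
```

If the stream ends early, the loop simply stops and size() reports how many values were read.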

Finding the intersection of two vectors of strings

I have two vectors of strings and want to find the strings which are present in both, filling a third vector with the common elements. EDIT: I've added the complete code listing with the respective output so that things are clear.
std::cout << "size " << m_HLTMap->size() << std::endl;
/// Vector to store the wanted, present and found triggers
std::vector<std::string> wantedTriggers;
wantedTriggers.push_back("L2_xe25");
wantedTriggers.push_back("L2_vtxbeamspot_FSTracks_L2Star_A");
std::vector<std::string> allTriggers;
// Push all the trigger names to a vector
std::map<std::string, int>::iterator itr = m_HLTMap->begin();
std::map<std::string, int>::iterator itrLast = m_HLTMap->end();
for(;itr!=itrLast;++itr)
{
allTriggers.push_back((*itr).first);
}; // End itr
/// Sort the list of trigger names and find the intersection
/// Build a typedef to make things clearer
std::vector<std::string>::iterator wFirst = wantedTriggers.begin();
std::vector<std::string>::iterator wLast = wantedTriggers.end();
std::vector<std::string>::iterator aFirst = allTriggers.begin();
std::vector<std::string>::iterator aLast = allTriggers.end();
std::vector<std::string> foundTriggers;
for(;aFirst!=aLast;++aFirst)
{
std::cout << "Found:" << (*aFirst) << std::endl;
};
std::vector<std::string>::iterator it;
std::sort(wFirst, wLast);
std::sort(aFirst, aLast);
std::set_intersection(wFirst, wLast, aFirst, aLast, back_inserter(foundTriggers));
std::cout << "Found this many triggers: " << foundTriggers.size() << std::endl;
for(it=foundTriggers.begin();it!=foundTriggers.end();++it)
{
std::cout << "Found in both" << (*it) << std::endl;
}; // End for intersection
Here is part of the output; there are over 1000 elements in the vector, so I didn't include all of it:
Found:L2_te1400
Found:L2_te1600
Found:L2_te600
Found:L2_trk16_Central_Tau_IDCalib
Found:L2_trk16_Fwd_Tau_IDCalib
Found:L2_trk29_Central_Tau_IDCalib
Found:L2_trk29_Fwd_Tau_IDCalib
Found:L2_trk9_Central_Tau_IDCalib
Found:L2_trk9_Fwd_Tau_IDCalib
Found:L2_vtxbeamspot_FSTracks_L2Star_A
Found:L2_vtxbeamspot_FSTracks_L2Star_B
Found:L2_vtxbeamspot_activeTE_L2Star_A_peb
Found:L2_vtxbeamspot_activeTE_L2Star_B_peb
Found:L2_vtxbeamspot_allTE_L2Star_A_peb
Found:L2_vtxbeamspot_allTE_L2Star_B_peb
Found:L2_xe25
Found:L2_xe35
Found:L2_xe40
Found:L2_xe45
Found:L2_xe45T
Found:L2_xe55
Found:L2_xe55T
Found:L2_xe55_LArNoiseBurst
Found:L2_xe65
Found:L2_xe65_tight
Found:L2_xe75
Found:L2_xe90
Found:L2_xe90_tight
Found:L2_xe_NoCut_allL1
Found:L2_xs15
Found:L2_xs30
Found:L2_xs45
Found:L2_xs50
Found:L2_xs60
Found:L2_xs65
Found:L2_zerobias_NoAlg
Found:L2_zerobias_Overlay_NoAlg
Found this many triggers: 0
Possible Reason
I am starting to think that the way in which I compile my code is to blame. I am currently compiling with ROOT (the physics data analysis framework) instead of doing a standalone compile. I get the feeling that it doesn't work all that well with the STL Algorithm library and that's the cause of the issue, especially given how many people seem to have the code working for them. I will try to do a stand-alone compilation and re-running.
Passing foundTriggers.begin(), with foundTriggers empty, as the output argument will not cause the output to be pushed onto foundTriggers. Instead, it will increment the iterator past the end of the vector without resizing it, randomly corrupting memory.
You want to use an insert iterator:
std::set_intersection(wFirst, wLast, aFirst, aLast,
std::back_inserter(foundTriggers));
UPDATE: As pointed out in the comments, the vector is resized to be at least large enough for the result, so your code should work. Note that you should use the iterator returned from set_intersection to indicate the end of the intersection - your code ignores it, so you will also iterate over the empty strings left at the end of the output.
Could you post a complete test case so that we can see whether the intersection is actually empty or not?
Your allTriggers vector is empty, after all. You never reset itr to the beginning of the map when you're filling it.
EDIT:
Actually, you never reset aFirst:
for(;aFirst!=aLast;++aFirst)
{
std::cout << "Found:" << (*aFirst) << std::endl;
};
// here aFirst == aLast
std::vector<std::string>::iterator it;
std::sort(wFirst, wLast);
std::sort(aFirst, aLast); // **** sorting empty range ****
std::set_intersection(wFirst, wLast, aFirst, aLast, back_inserter(foundTriggers));
// ^^^^^^^^^^^^^^
// ***** empty range *****
I hope you can now see why it is good practice to narrow down the scope of your variables.
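One way to narrow the scope, sketched here with a made-up function name, is to avoid storing iterators at all and call begin() and end() right where each algorithm needs them:

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Returns the strings present in both inputs, in sorted order.
// Both vectors are taken by value so the caller's order is left alone.
std::vector<std::string> common_strings(std::vector<std::string> a,
                                        std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<std::string> out;
    // Fresh iterators at the call site: no stale aFirst == aLast possible.
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(out));
    return out;
}
```

Called with the wanted and available trigger names, it returns just the names that occur in both lists.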
You never use the return value of set_intersection. When writing into a pre-sized vector, you could use it to resize foundTriggers after set_intersection has returned, or as the upper limit of the for loop. Otherwise your code seems to work. Can we see a full compilable program and its actual output please?

Memoization Recursion C++

I was implementing a recursive function with memoization for speed ups. The point of the program is as follows:
I shuffle a deck of cards (with an equal number of red and black
cards) and start dealing them face up.
After any card you can say “stop”, at which point I pay you $1 for
every red card dealt and you pay me $1 for every black card dealt.
What is your optimal strategy, and how much would you pay to play
this game?
My recursive function is as follows:
double Game::Value_of_game(double number_of_red_cards, double number_of_black_cards)
{
double value, key;
if(number_of_red_cards == 0)
{
Card_values.insert(Card_values.begin(), pair<double, double> (Key_hash_table(number_of_red_cards, number_of_black_cards), number_of_black_cards));
return number_of_black_cards;
}
else if(number_of_black_cards == 0)
{
Card_values.insert(Card_values.begin(), pair<double, double> (Key_hash_table(number_of_red_cards, number_of_black_cards), 0));
return 0;
}
card_iter = Card_values.find(Key_hash_table(number_of_red_cards, number_of_black_cards));
if(card_iter != Card_values.end())
{
cout << endl << "Debug: [" << number_of_red_cards << ", " << number_of_black_cards << "] and value = " << card_iter->second << endl;
return card_iter->second;
}
else
{
number_of_total_cards = number_of_red_cards + number_of_black_cards;
prob_red_card = number_of_red_cards/number_of_total_cards;
prob_black_card = number_of_black_cards/number_of_total_cards;
value = max(((prob_red_card*Value_of_game(number_of_red_cards - 1, number_of_black_cards)) +
(prob_black_card*Value_of_game(number_of_red_cards, number_of_black_cards - 1))),
(number_of_black_cards - number_of_red_cards));
cout << "Check: value = " << value << endl;
Card_values.insert(Card_values.begin(), pair<double, double> (Key_hash_table(number_of_red_cards, number_of_black_cards), value));
card_iter = Card_values.find(Key_hash_table(number_of_red_cards , number_of_black_cards ));
if(card_iter != Card_values.end());
return card_iter->second;
}
}
double Game::Key_hash_table(double number_of_red_cards, double number_of_black_cards)
{
double key = number_of_red_cards + (number_of_black_cards*91);
return key;
}
The third if statement is the "memoization" part of the code; it stores all the necessary values. The values kept in the map can be thought of as a matrix, where each value corresponds to a certain number of red cards and black cards. What is really weird is that when I execute the code for 8 cards in total (4 black and 4 red), I get an incorrect answer. But when I execute the code for 10 cards, my answer for 10 cards is wrong, while my answer for 4 black and 4 red (8 cards) is now correct! The same can be said for 12 cards, where I get the wrong answer for 12 cards but the correct answer for 10 cards, and so on. There is some bug in the code; however, I can't figure it out.
Nobody actually answered this question with an answer. So I will give it a try, though nneonneo actually put his or her finger on the likely source of your problem.
The first problem is probably not actually a problem in this case, but it sticks out like a sore thumb: you are using double to hold a value that you mostly treat as an integer. In this case, on most systems, this is probably OK. But as a general practice, it is very bad, in particular because you check whether a double is exactly equal to 0. It probably will be, since on most systems, with most compilers, a double can hold integer values up to a fairly large size with perfect precision, as long as you restrict yourself to adding, subtracting and multiplying by other integers (or doubles masquerading as integers) to get a new value.
But that's likely not the source of the error you're seeing; it just trips every good programmer's alarm bells for smelly code. It should be fixed. The only time you really need them to be doubles is when you're calculating the relative probability of red or black.
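To illustrate why exact comparison of doubles is only safe for integer-valued data, here is a tiny sketch (the function name sums_exactly is made up; the stated behavior assumes a typical IEEE 754 platform):

```cpp
// True when a + b compares exactly equal to `expected`.
// Holds for small integer-valued doubles such as 3.0 + 4.0 == 7.0,
// but typically fails for fractions such as 0.1 + 0.2 == 0.3.
bool sums_exactly(double a, double b, double expected) {
    return a + b == expected;
}
```

Card counts are always whole numbers, so storing them as int removes the question entirely.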
And that brings me to the thing that probably is your problem. You have these two statements in your code:
number_of_total_cards = number_of_red_cards + number_of_black_cards;
prob_red_card = number_of_red_cards/number_of_total_cards;
prob_black_card = number_of_black_cards/number_of_total_cards;
which, of course, should read:
number_of_total_cards = number_of_red_cards + number_of_black_cards;
prob_red_card = number_of_red_cards/double(number_of_total_cards);
prob_black_card = number_of_black_cards/double(number_of_total_cards);
because you've been a good programmer and declared those variables as integers.
Presumably prob_red_card and prob_black_card are variables of type double. But they are not declared anywhere in the code you show us. This means that no matter where they are declared, or what their types are, they must be effectively shared by all sub-calls in the recursive call tree for Game::Value_of_game.
This is almost certainly not what you want. It makes it extremely difficult to reason about what values those variables have and what those values represent during any given call in the recursive call tree for your function. They really have to be local variables in order for the algorithm to be tractable to analyze. Luckily, they seem to be used only within the else clause of a particular if statement, so they can be declared when they are initially assigned values. Here is probably how this code should read:
unsigned const int number_of_total_cards = number_of_red_cards + number_of_black_cards;
const double prob_red_card = number_of_red_cards/double(number_of_total_cards);
const double prob_black_card = number_of_black_cards/double(number_of_total_cards);
Note that I also declare them const. It is good practice to declare as const any variable whose value you don't expect to change during its lifetime. It helps you write more correct code by asking the compiler to tell you when you accidentally modify it. It can also help the compiler generate better code, though in this case even a trivial analysis reveals that the variables are not modified during their lifetimes, so most decent optimizers will effectively treat them as const anyway. That still will not give you the benefit of having the compiler tell you if you accidentally use them in a non-const way.
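Putting the suggestions together, a minimal sketch of the memoized recursion with integer card counts and local const variables could look like this. CardGame and its member names are made up for the example, and the std::pair key replaces the question's Key_hash_table purely for simplicity; the recurrence itself is the one from the question.

```cpp
#include <algorithm>
#include <map>
#include <utility>

class CardGame {
public:
    // Expected value under optimal play with `red` red and `black` black
    // cards still in the deck, assuming the deal began with equal counts.
    double value(int red, int black) {
        if (red == 0) return black;  // only black cards remain: stop now
        if (black == 0) return 0.0;  // only red cards remain: play to the end

        const std::pair<int, int> key(red, black);
        const std::map<std::pair<int, int>, double>::const_iterator hit =
            memo_.find(key);
        if (hit != memo_.end()) return hit->second;

        // Locals, so nested recursive calls cannot overwrite them.
        const int total = red + black;
        const double prob_red = red / double(total);
        const double prob_black = black / double(total);
        const double keep_going = prob_red * value(red - 1, black)
                                + prob_black * value(red, black - 1);
        const double stop_now = black - red;  // reds dealt minus blacks dealt
        const double best = std::max(keep_going, stop_now);
        memo_[key] = best;
        return best;
    }

private:
    std::map<std::pair<int, int>, double> memo_;
};
```

As a sanity check, a deck of one red and one black card gives 0.5, and two of each gives 2/3.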