EDIT:
I've fixed the insertion. As Blastfurnace kindly pointed out, the insertion invalidated the iterators. I believe the loop is still needed to compare performance (see my comment on Blastfurnace's answer). My code is updated. I have identical code for the list, just with vector replaced by list. However, with this code I find that the list performs better than the vector both for small and large data types, and even for linear search (if I remove the insertion). According to http://java.dzone.com/articles/c-benchmark-%E2%80%93-stdvector-vs and other sites, that should not be the case. Any clues as to how that can be?
I am taking a course on programming of mathematical software (exam on Monday) and for it I would like to present a graph comparing the performance of random insertion of elements into a vector versus a list. However, when I test the code I get random slowdowns. For instance, I might have 2 iterations where inserting 10 elements at random into a vector of size 500 takes 0.01 seconds, and then 3 similar iterations that each take roughly 12 seconds. This is my code:
void AddRandomPlaceVector(vector<FillSize> &myContainer, int place)
{
    int i = 0;
    vector<FillSize>::iterator iter = myContainer.begin();
    while (iter != myContainer.end())
    {
        if (i == place)
        {
            FillSize myFill;
            iter = myContainer.insert(iter, myFill); // insert returns a valid iterator
        }
        else
            ++iter;
        ++i;
    }
    //cout << i << endl;
}
double testVector(int containerSize, int iterRand)
{
    cout << endl;
    cout << "Size: " << containerSize << endl << "Random inserts: " << iterRand << endl;
    vector<FillSize> myContainer(containerSize);
    boost::timer::auto_cpu_timer tid;
    for (int i = 0; i != iterRand; i++)
    {
        int randPlace = (int)(myContainer.size()*((double)rand()/RAND_MAX)); // random insertion point
        AddRandomPlaceVector(myContainer, randPlace);
    }
    double wallTime = tid.elapsed().wall/1e9;
    cout << "New size: " << myContainer.size();
    return wallTime;
}
int main()
{
    int testSize = 500;
    int measurementIters = 20;
    int numRand = 1000;
    int repetitionIters = 100;
    ofstream tidOutput1_sum("VectorTid_8bit_sum.txt");
    ofstream tidOutput2_sum("ListTid_8bit_sum.txt");
    for (int i = 0; i != measurementIters; i++)
    {
        double time = 0;
        for (int j = 0; j != repetitionIters; j++) {
            time += testVector((i+1)*testSize, numRand);
        }
        std::ostringstream strs;
        strs << double(time/repetitionIters);
        tidOutput1_sum << ((i+1)*testSize) << "," << strs.str() << endl;
    }
    for (int i = 0; i != measurementIters; i++)
    {
        double time = 0;
        for (int j = 0; j != repetitionIters; j++) {
            time += testList((i+1)*testSize, numRand);
        }
        std::ostringstream strs;
        strs << double(time/repetitionIters);
        tidOutput2_sum << ((i+1)*testSize) << "," << strs.str() << endl;
    }
    return 0;
}
struct FillSize
{
    double fill1;
};
The struct is just there so I can easily add more members and test elements of different sizes. I know that this code is probably not perfect as far as performance testing goes, but they would rather have me build a simple example than simply reference something I found.
I've tested this code on two computers now, and both show the same issue. How can that be? And can you help me with a fix so I can graph it and present it on Monday? Perhaps adding a few seconds of wait time between iterations would help?
Kind regards,
Bjarke
Your AddRandomPlaceVector function has a serious flaw: insert() invalidates iterators, so the loop as originally written was invalid code.
If you know the desired insertion point there's no reason to iterate over the vector at all.
void AddRandomPlaceVector(vector<FillSize> &myContainer, int place)
{
    FillSize myFill;
    myContainer.insert(myContainer.begin() + place, myFill);
}
I have 2 arrays, where arr1 stores a number (the salary) and arr2 stores a string (the employee's name). Since the two arrays are linked, I cannot change the order of arr1 or sort it. I am looking for a more efficient way to solve the problem, which is to find whether there are any duplicates in the array. There might be more than one duplicate, but if none are found it should print "no duplicates found".
int count = 0;
for (int i = 0; i < arr_size; i++)
{
    for (int j = 0; j < arr_size && i != j; j++)
    {
        if (arr1[i] == arr1[j])
        {
            cout << arr2[i] << " " << arr1[i] << endl;
            cout << arr2[j] << " " << arr1[j] << endl;
            count++;
        }
    }
}
if (count == 0)
{
    cout << "No employees have the same salary" << endl;
}
I don't want to use such an inefficient way to solve the problem. Is there any better suggestion? Thanks for the help :)
The question also requires me to print out all the duplicated employee and salary pairs.
You can use an unordered_set, which has average constant-time insertion and retrieval:
#include <unordered_set>

// ...set up arr1 (the salaries)

int count = 0;
std::unordered_set<int> salaries;
for (int i = 0; i < arr_size; i++) {
    if (salaries.count(arr1[i]) > 0) {
        // it's a duplicate
    }
    salaries.insert(arr1[i]);
}
// do more stuff
Create a hash map using unordered_map and store each salary together with its index.
Then, whenever the same salary already exists in the map, increase the count.
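A minimal sketch of that idea (assuming, as in the question, arr1 holds the salaries and arr2 the matching names; the example data here is made up):

#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

int main() {
    int arr1[] = {100, 200, 100, 300};                    // salaries (example data)
    std::string arr2[] = {"Ann", "Bob", "Cory", "Dana"};  // names (example data)
    int arr_size = 4;

    std::unordered_map<int, std::vector<int>> by_salary;  // salary -> indices into the arrays
    for (int i = 0; i < arr_size; ++i)
        by_salary[arr1[i]].push_back(i);

    int count = 0;
    for (const auto& entry : by_salary) {
        if (entry.second.size() > 1) {                    // the same salary occurred more than once
            ++count;
            for (int idx : entry.second)
                std::cout << arr2[idx] << " " << entry.first << std::endl;
        }
    }
    if (count == 0)
        std::cout << "No duplicates found" << std::endl;
}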
You can reduce the time complexity of the algorithm to O(n) by using an unordered_set, at the expense of additional space.
#include <iostream>
#include <string>
#include <unordered_set>
using namespace std;

int main(){
    // Initialise your arrays (arr1: salaries, arr2: names, arr_size: element count)
    unordered_set<string> unique;
    bool flag = false;
    for(int i = 0; i < arr_size; i++){
        // unordered_set does not support pair out of the box, so convert the pair to
        // a string key; the separator avoids collisions like (12, "3x") vs (123, "x")
        string key = to_string(arr1[i]) + "-" + arr2[i];
        // If the key is already in the set, we have found a duplicate
        if(unique.find(key) != unique.end()){
            // mark that a duplicate was found
            flag = true;
            // Print the duplicate
            cout << "Duplicate: " + to_string(arr1[i]) + "-" + arr2[i] << endl;
        }
        else{
            unique.insert(key);
        }
    }
    if(!flag){
        cout << "No duplicates found" << endl;
    } else cout << "Duplicates found" << endl;
    return 0;
}
I am currently studying different search algorithms, and I have made a little program to see the difference in efficiency. Binary search should be faster than linear search, but the time measurements show otherwise. Did I make some mistake in the code, or is this some special case?
#include <iostream>
#include <chrono>
#include <unistd.h>
using namespace std;

const int n = 1001;
int a[n];

void assign() {
    for (int i=0; i<n; i++) {
        a[i] = i;
    }
}

void print() {
    for (int i=0; i<n; i++) {
        cout << a[i] << endl;
    }
}

bool find1(int x) {
    for (int i=0; i<n; i++) {
        if (x == a[i]) {
            return true;
        }
    }
    return false;
}

bool binsearch(int x) {
    int l=0, m;
    int r=n-1;
    while (l <= r) {
        m = ((l+r)/2);
        if (a[m]==x) return true;
        if (a[m]<x) l=m+1;
        if (a[m]>x) r=m-1;
    }
    return false;
}

int main() {
    assign();
    //print();

    auto start1 = chrono::steady_clock::now();
    cout << binsearch(500) << endl;
    auto end1 = chrono::steady_clock::now();

    auto start2 = chrono::steady_clock::now();
    cout << find1(500) << endl;
    auto end2 = chrono::steady_clock::now();

    cout << "binsearch: " << chrono::duration_cast<chrono::nanoseconds>(end1 - start1).count()
         << " ns " << endl;
    cout << "linsearch: " << chrono::duration_cast<chrono::nanoseconds>(end2 - start2).count()
         << " ns " << endl;
    return 0;
}
Your test dataset is too small (1001 integers). It will fit entirely in the fastest (L1) cache when you fill it; consequently, you're bound by branch complexity, not memory.
The binary search version exhibits more branch mispredictions, resulting in more pipeline stalls than a simple linear pass.
I increased n to 1000001 and also increased the number of test passes:
auto start1 = chrono::steady_clock::now();
for (int i = 0; i < n; i += n/13) {
    if (!binsearch(i%n)) {
        std::cerr << i << std::endl;
    }
}
auto end1 = chrono::steady_clock::now();

auto start2 = chrono::steady_clock::now();
for (int i = 0; i < n; i += n/13) {
    if (!find1(i%n)) {
        std::cerr << i << std::endl;
    }
}
auto end2 = chrono::steady_clock::now();
and I'm getting different results:
binsearch: 10300 ns
linsearch: 3129600 ns
Note also that you should not call cout in a timed loop, but you do need to use the result of the find in order for it to not get optimized away.
To my mind, N=1001 is enough to notice that binary search has better performance. Specific implementations of linear search could be faster only for small N (approximately < 100). However, in your case the reason for such strange results is an incorrect measurement methodology: all your data was cached while the first algorithm (binary search) ran, which dramatically improved the performance of the second (linear search).
If you just swap their calls, you will get an opposite result:
binsearch: 6019 ns
linsearch: 77587 ns
For precise measurements you should use a dedicated framework (Google Benchmark, for example), which ensures 'fair conditions' for both algorithms.
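For example, a minimal sketch of such a benchmark with Google Benchmark (the BM_* names are mine; binsearch and find1 are assumed to be the functions from the question):

#include <benchmark/benchmark.h>

static void BM_BinSearch(benchmark::State& state) {
    for (auto _ : state)
        benchmark::DoNotOptimize(binsearch(500));  // use the result so it is not optimized away
}
BENCHMARK(BM_BinSearch);

static void BM_LinSearch(benchmark::State& state) {
    for (auto _ : state)
        benchmark::DoNotOptimize(find1(500));
}
BENCHMARK(BM_LinSearch);

BENCHMARK_MAIN();

Each benchmark is timed in isolation over many repetitions, so one function's run no longer warms the cache for the other.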
Another online benchmarking tool (it runs the testing code on a pool of many AWS machines whose load is unknown and returns an averaged result) gives these charts for your code without changes (with the same n=1001):
Get the best of both!
Do a binary search down to some level, then switch to linear. Think of it this way: a binary search has a bunch of bookkeeping; a linear search is faster because it is 'simpler'.
When I first experimented with this (back in the 1970s, in assembly language), I deduced that doing binary search down to about 4 items, then going linear, was about optimal. However, YMMV; it depends on the hardware, the complexity of comparing two items (float / int / string / whatever), etc.
Tip: Count the number of operations in your code. I see about twice as many operations are needed for each step in your binsearch() routine versus the linear scan.
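For illustration, a sketch of the hybrid idea over a sorted int array (the crossover of 4 is the value suggested above; the optimal value depends on your hardware):

// Binary phase narrows the range; a linear scan finishes it off.
bool hybrid_search(const int a[], int n, int x) {
    int l = 0, r = n - 1;
    while (r - l > 4) {          // binary search until the range is small
        int m = l + (r - l) / 2;
        if (a[m] < x)
            l = m + 1;
        else
            r = m;
    }
    for (int i = l; i <= r; ++i) // then a simple linear scan
        if (a[i] == x)
            return true;
    return false;
}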
I am currently trying to learn some C++, and now I am stuck on an exercise with vectors. The task is to read ints from a text file and store them in a vector, which should be dynamic.
I guess there is something wrong with the while loop?
If I run this, the program fails, and if I set the vector size to 6, I get
6 0 0 0 0 0 as output.
Thanks for any hints.
int main()
{
    const string filename = "test.txt";
    int s = 0;
    fstream f;
    f.open(filename, ios::in);
    vector<int> v;
    if (f){
        while(f >> s){
            int i = 0;
            v[i] = s;
            i = i+1;
        }
        f.close();
    }
    for(int i = 0; i < 6; i++){
        cout << v[i] << "\n";
    }
}
You don't grow the vector. It is empty and cannot hold any ints. You'll need to either resize it every time you want to add another int, or use push_back, which automatically enlarges the vector.
You also set i = 0 on every iteration, so you would overwrite the first element of the vector each time instead of moving on to the next one.
Go for:
v.push_back(s);
in your loop and
for(int i = 0; i < v.size(); i++) { // ...
Remark:
You normally don't hardcode vector sizes/bounds. One major point about using std::vector is its ability to behave dynamically with respect to its size. Thus, the code dealing with vectors should not impose any restrictions about the size of the vector onto the respective object.
Example:
for(int i = 0; i < 6; i++){ cout << v[i] << "\n"; }
requires the vector to have at least 6 elements, otherwise (less than 6 ints) you access values out of bounds (and you potentially miss elements if v contains more than 6 values).
Use either
for(int i = 0; i < v.size(); i++){ cout << v[i] << "\n"; }
or
for(std::vector<int>::const_iterator i = v.begin(); i != v.end(); ++i)
{
cout << *i << "\n";
}
or
for(auto i = v.begin(); i != v.end(); ++i)
{
cout << *i << "\n";
}
or
for(int x : v){ cout << x << "\n"; }
or
for(auto && x : v){ cout << x << "\n"; }
or
std::for_each(v.begin(), v.end(), [](int x){ std::cout << x << "\n"; });
or variants of the above which possibly pre-store v.size() or v.end()
or whatever you like as long as you don't impose any restriction on the dynamic size of your vector.
The issue is in the line int i = 0;. Fixing that will expose an issue in the line v[i] = s.
You always initialise i to 0 inside the while loop, and that is responsible for the current output. You should move it out of the while loop.
After fixing that: you have not allocated memory for the vector, so v[i] doesn't make sense; it accesses memory out of bounds, which gives a segmentation fault. Instead, use v.push_back(s), which appends elements to the end of the vector and allocates memory as needed.
If you are using std::vector you can use v.push_back(s) to fill the vector.
The error is in the line int i = 0;
because you declare i = 0 on every pass through the while loop.
To correct this, move the line outside the loop.
Note: that alone only works if you declare v like a normal array, for example int v[101].
When you use std::vector you can simply push elements onto the end of the vector with v.push_back(element);
v[i] = s; // error: no room has been allocated in the vector
Change it to: v.push_back(s);
This code works only when any of the lines under /* debug messages */ are uncommented, or if the list being mapped over has fewer than 30 elements.
func_map is a linear implementation of Lisp-style mapping and can be assumed to work.
It is used as follows: func_map(FUNC_PTR foo, std::vector<int>* list, locs* start_and_end).
FUNC_PTR is a pointer to a function that returns void and takes an int pointer.
For example, &foo, where foo is defined as:
void foo (int* num){ (*num) = (*num) * (*num); }
locs is a struct with two members, int_start and int_end; I use it to tell func_map which elements it should iterate over.
void par_map(FUNC_PTR func_transform, std::vector<int>* vector_array) //function for mapping a function over a list, a la Lisp
{
    int array_size = (*vector_array).size(); //retain the number of elements in our vector
    int num_threads = std::thread::hardware_concurrency(); //figure out the number of cores
    int array_sub = array_size/num_threads; //number we use to figure out how many elements should be assigned per thread
    std::vector<std::thread> threads; //the vector in which we will initialize threads
    std::vector<locs> vector_locs; //the vector in which we will store the start and end position for each thread
    for(int i = 0; i < num_threads && i < array_size; i++)
    {
        locs cur_loc; //the locs struct that we will create using the power of LOGIC
        if(array_sub == 0) //the LOGIC
        {
            cur_loc.int_start = i; //if there are fewer elements than cores, just assign one core to each element
        }
        else
        {
            cur_loc.int_start = (i * array_sub); //otherwise figure out the starting point given the number of cores
        }
        if(i == (num_threads - 1))
        {
            cur_loc.int_end = array_size; //make sure all elements will be iterated over
        }
        else if(array_sub == 0)
        {
            cur_loc.int_end = (i + 1); //ditto
        }
        else
        {
            cur_loc.int_end = ((i+1) * array_sub); //otherwise use the number of threads to determine our ending point
        }
        vector_locs.push_back(cur_loc); //store the created locs struct so it doesn't get changed during reference
        threads.push_back(std::thread(func_map,
                                      func_transform,
                                      vector_array,
                                      (&vector_locs[i]))); //create a thread
        /*debug messages*/ // <--- whenever any of these are uncommented the code works
        //cout << "i = " << i << endl;
        //cout << "int_start == " << cur_loc.int_start << endl;
        //cout << "int_end == " << cur_loc.int_end << endl << endl;
        //cout << "Thread " << i << " initialized" << endl;
    }
    for(int i = 0; i < num_threads && i < array_size; i++)
    {
        (threads[i]).join(); //make sure all the threads are done
    }
}
I think the issue might be in how vector_locs[i] is used and how the threads are resolved. But I thought using a vector to maintain the locs instances referenced by the threads would prevent that from being an issue; I'm really stumped.
You're giving the thread function a pointer, &vector_locs[i], that may become invalidated as you push_back more items into the vector.
Since you know beforehand how many items vector_locs will contain - min(num_threads, array_size) - you can reserve that space in advance to prevent reallocation.
As to why it doesn't crash if you uncomment the output, I would guess that the output is so slow that the thread you just started will finish before the output is done, so the next iteration can't affect it.
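For example, a one-line fix (a sketch; it would go just before the loop that fills vector_locs, and std::min needs <algorithm>):

int num_tasks = std::min(num_threads, array_size);
vector_locs.reserve(num_tasks); // push_back will never reallocate now,
threads.reserve(num_tasks);     // so the &vector_locs[i] pointers stay valid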
I think you should move this loop inside the main one:
...
for(int i = 0; i < num_threads && i < array_size; i++)
{
(threads[i]).join(); //make sure all the threads are done
}
}
I'm using this approach: first find the largest among the 5 numbers, then save the array subscript of the largest number in a variable. After displaying the largest number, do
array[ivariable] = 0;
so the first largest is set to zero and is no longer in the array.
Then do the same again to find the new largest, but I'm not getting what I'm trying to get.
It's a logical error.
Thanks
#include <iostream>
using namespace std;

int main(void)
{
    int counter, large, number, det_2, i, large3, det_3 = 0;
    int det[5] = {0,0,0,0,0};
    for(int k(0); k < 5 ; k++)
    {
        cout << "Enter the number " << endl ;
        cin >> det[k] ;
    }
    for( i; i<5; i++)
    {
        large = det[i] ;
        if (large > det_2)
        {
            det_2 = large ;
            counter = i ;
        }
        else
        {
        }
    }
    cout << "Largest among all is " << det_2 << endl;
    det[i] = 0 ;
    for( int j(0); j<5; j++)
    {
        large3 = det[j] ;
        if(large3 > det_3)
        {
            det_3 = large3 ;
        }
        else
        {
        }
    }
    cout << "Second largest " << large3 << endl ;
    system("PAUSE");
}
You've got initialization errors and some unconventional syntax. Fix those first:
for(int k(0); k < 5 ; k++): this direct-initialization form is legal C++, but the conventional spelling is:
for (int k = 0; k < 5; k++) (Same with the last loop.)
Also,
for( i; i<5; i++)
The variable i is uninitialized. Variables are not initialized to any default value in C++. Because you've left it uninitialized, it might execute 5 times, no times, or 25,899 times. You don't know.
This should be:
for (i = 0; i < 5; i++)
But the whole thing could probably be a bit clearer anyway:
#include <iostream>
using namespace std;

int main(void)
{
    int largest = -1;
    int second_largest = -1;
    int index_of_largest = -1;
    int index_of_second_largest = -1;
    int det[5] = {0, 0, 0, 0, 0};

    for (int i = 0; i < 5; i++)
    {
        cout << "Enter the number " << endl;
        cin >> det[i]; // assuming non-negative integers!
    }
    for (int j = 0; j < 5; j++) // find the largest
    {
        if (det[j] >= largest)
        {
            largest = det[j];
            index_of_largest = j;
        }
    }
    for (int k = 0; k < 5; k++) // find the second largest
    {
        if (k != index_of_largest) // skip over the largest one
        {
            if (det[k] >= second_largest)
            {
                second_largest = det[k];
                index_of_second_largest = k;
            }
        }
    }
    cout << "Largest is " << largest << " at index " << index_of_largest << endl;
    cout << "Second largest is " << second_largest <<
        " at index " << index_of_second_largest << endl;
    return 0;
}
Always give your variables values before you use them
det_2 = det[0];
counter = 0;
for (i = 1; i < 5; i++)
The first problem I saw was that you are iterating using i as an index, but you don't initialize i.
The code should be:
for(i = 0; i<5; i++)
The same goes for det_2: you compare elements against it but never initialize it. Set it to det[0] before the loop where you use it.
The third problem: your "set largest value to zero after printing" step sounds like it is there so that you can apply the same algorithm a second time.
You should create a separate function that returns the index of the largest element, and call it like this:
int index = find_largest_index(a);
cout << "largest element: " << a[index] << endl;
a[index] = 0;
cout << "second largest element: " << a[ find_largest_index(a) ] << endl;
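A minimal sketch of such a helper, assuming a is a 5-element int array as in the question (find_largest_index is the name used in the call above):

int find_largest_index(const int a[], int n = 5) {
    int index = 0;                 // assume the first element is largest so far
    for (int i = 1; i < n; ++i)
        if (a[i] > a[index])
            index = i;
    return index;
}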
GCC 4.7.3: g++ -Wall -Wextra -std=c++0x largest.cpp
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

int main() {
    std::cout << "Enter 5 numbers: ";

    // Read 5 numbers.
    std::vector<int> v;
    for (auto i = 0; i < 5; ++i) {
        int x = 0;
        while (!(std::cin >> x)) {
            // Error. Reset and try again.
            std::cin.clear();
            std::cin.ignore();
        }
        v.push_back(x);
    }

    // Partition on element 3 (the 4th number).
    std::nth_element(std::begin(v), std::next(std::begin(v), 3), std::end(v));

    std::cout << "Two largest are: ";
    std::copy(std::next(std::begin(v), 3), std::end(v), std::ostream_iterator<int>(std::cout, " "));
}
In the specific case of 5 elements, the algorithm you use is unlikely to make any real difference.
That said, the standard algorithm specifically designed for this kind of job is std::nth_element.
It allows you to find the (or "an", if there are duplicates) element that would end up at position N if you were to sort the entire collection.
That much is pretty obvious from the name. What's not so obvious (but is still required) is that nth_element also arranges the elements into two (or three, depending on how you look at it) groups: the elements that would sort before that element, the element itself, and the elements that would sort after it. Although the elements are not sorted within each group, they are arranged into those groups: all the elements that sort before it come first, then the element itself, then the elements that sort after it.
That gives you exactly what you want -- the 4th and 5th elements of the 5 you supply.
As I said originally, in the case of just 5 elements, it won't matter much -- but if you wanted (say) the top 50000 out of ten million, choosing the right algorithm would make a much bigger difference.
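For instance, a sketch of that larger case (the numbers are just the ones mentioned above; std::greater puts the winners at the front):

#include <algorithm>
#include <functional>
#include <vector>

// After the call, the 50,000 largest values occupy v[0..49999], unsorted.
void keep_top_50000_in_front(std::vector<int>& v) {
    const std::ptrdiff_t k = 50000;
    if (static_cast<std::ptrdiff_t>(v.size()) > k)
        std::nth_element(v.begin(), v.begin() + k, v.end(), std::greater<int>());
}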
nth_element isn't always suitable (or as efficient as it could be), as it needs to rearrange the input elements.
It's very common to want just the top two elements, and this can be done efficiently in one pass by keeping the best and second-best values seen so far. Whenever a value you iterate over is better than the second-best, it either replaces the second-best or becomes the new best; in the latter case the old best is demoted to second-best. That can look like this:
#include <functional>
#include <utility>

template <typename It, typename EndIt, typename Less = std::less<>>
auto top_two(It it, EndIt end, Less less = Less{}) -> std::pair<It, It>
{
    It first = it;
    if (it == end || ++it == end)
        return {first, end};
    std::pair<It, It> results = less(*it, *first) ? std::pair{first, it} : std::pair{it, first};
    while (++it != end)
        if (less(*results.second, *it))
            results.second = less(*results.first, *it)
                ? std::exchange(results.first, it) : it;
    return results;
}
(See it running at http://coliru.stacked-crooked.com/a/a7fa0c9f1945b3fe)
I return iterators so the caller can know where in the input the top two elements are, should they care (e.g. to erase them from a container, or calculate their distance from begin(), or modify their values).
If you want the two lowest values, just pass std::greater<>{} as your "less" argument.
Some convenience functions to make it easier to call with containers or initializer_lists:
template <typename Container, typename Less = std::less<>>
auto top_two(const Container& c, Less less = Less{})
{
    return top_two(begin(c), end(c), less);
}

template <typename T, typename Less = std::less<>>
auto top_two(const std::initializer_list<T>& il, Less less = Less{})
{
    return top_two(begin(il), end(il), less);
}
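A small usage sketch of the overloads above (the printed values assume this example data):

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v{3, 9, 2, 7, 5};
    auto [best, second] = top_two(v);
    std::cout << *best << ' ' << *second << '\n';           // prints "9 7"

    auto [low, second_low] = top_two(v, std::greater<>{});  // two lowest instead
    std::cout << *low << ' ' << *second_low << '\n';        // prints "2 3"
}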
If you want a general solution for the top-N elements, it's better to make N an argument and keep a multiset of iterators to the best N values seen so far, ordered by a dereferencing comparison type. Put the initial N elements in; then, whenever a new element is greater than the current worst kept value, **top_n.begin(), do a top_n.insert(it); followed by top_n.erase(top_n.begin()); to drop the worst element. These operations are O(log N), so they remain reasonably efficient even in pathological cases, such as input consisting of incrementing numbers.
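A sketch of that generalisation (top_n and the comparator are my own names; the multiset stores iterators and compares through them):

#include <cstddef>
#include <functional>
#include <set>

template <typename It, typename Less = std::less<>>
auto top_n(It first, It last, std::size_t n, Less less = Less{})
{
    // Order the kept iterators by the values they point at, worst first.
    auto deref_less = [less](It a, It b) { return less(*a, *b); };
    std::multiset<It, decltype(deref_less)> best(deref_less);
    for (It it = first; it != last; ++it) {
        if (best.size() < n) {
            best.insert(it);
        } else if (n > 0 && less(**best.begin(), *it)) {  // better than the current worst
            best.insert(it);            // O(log N)
            best.erase(best.begin());   // drop the worst, O(log N)
        }
    }
    return best;  // iterators to the top n elements
}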