I have a function that randomly deletes elements from a map called bunnies (the map contains class objects) and from a list called names (the list contains the keys to the map) when the number of elements reaches more than 250. However, at random, the map element will be deleted but the list entry will not (I think this is what is going on, though clearly part of the map element survives). The outcome is that when I use the second section of code to iterate through the list and display the mapped values associated with those keys, I get large negative values like the example at the bottom.
Clearly the list element isn't being deleted, but why?
void cull(std::map<std::string, Bunny> &bunnies, std::list<std::string> &names,int n)
{
int number = n, position = 0;
for (number = n; number > 125; number--)
{
position = rand() % names.size();
std::list<std::string>::iterator it = names.begin();
std::advance(it, position);
bunnies.erase(*it);
names.erase(it);
it = names.begin();
}
std::cout << "\n" << n - 125 << " rabbits culled";
}
I use this code to print out the map values.
for (std::list<std::string>::iterator it = names.begin(); it != names.end(); it++)
{
n++;
std::cout << n << "\t" << " " << *it << "\t" << bunnies[*it].a() << "\t" << bunnies[*it].s() << "\t" << bunnies[*it].c() << "\t" << bunnies[*it].st() << "\n";
}
This is the output. The top is what it should display, the bottom is what happens when the program fails.
165 Tom_n 14 1 0 1
166 Lin_c -842150451 -842150451 -842150451 -842150451
The problem seems to be this:
std::advance(it, number);
This should be position, not number.
The other problem is that a map stores unique keys. What if there is more than one Bunny with the same name? For example, if the list has 3 bunnies named "John", the map will be able to hold only one "John", since the key in a map must be unique.
Either use a multimap if names can be duplicated, or use a std::set instead of a std::list if Bunnies must have unique names.
Maybe overall, you can just use std::map<std::string, Bunny>, and forget about the std::list. The map by itself has all the information you need. Unless there is something I'm missing, I don't see the need for a std::list to do redundant work.
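As a rough sketch of the map-only idea: the helper name cull_to, the target parameter, and the use of <random> are assumptions for illustration and do not appear in the original code. Erasing through an iterator into the map means there is no second container that can fall out of sync.

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <random>
#include <string>

// Hypothetical helper: erase random entries until at most `target` remain.
// It works on the map alone, so there is no list of keys to keep in sync.
template <typename Value>
std::size_t cull_to(std::map<std::string, Value>& bunnies,
                    std::size_t target,
                    std::mt19937& rng)
{
    std::size_t removed = 0;
    while (bunnies.size() > target) {
        // Pick a uniformly random position among the remaining entries.
        std::uniform_int_distribution<std::size_t> pick(0, bunnies.size() - 1);
        auto it = bunnies.begin();
        // Linear walk to the chosen entry; acceptable for a few hundred entries.
        std::advance(it, static_cast<std::ptrdiff_t>(pick(rng)));
        bunnies.erase(it);   // erase by iterator, not by key
        ++removed;
    }
    return removed;
}
```

When printing, iterating the map directly (or using find()) avoids operator[], which would otherwise insert a default-constructed Bunny for any key that is no longer present.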
I know unordered_set doesn't guarantee to maintain order.
But I was wondering: if a simple ascending sequence, say N natural numbers, was inserted into a set, will it maintain the order?
It did.
unordered_set<int> nums;
for (int i=10; i>0; --i) {
nums.insert(i);
}
nums.erase(4);
for (auto i : nums) {
cout<< i<< endl;
}
This prints:
1
2
3
5
6
7
8
9
10
I ran this several times with consistent result.
Why did it maintain reverse order of insertion?
Is there a solid reason for this behaviour? (If this works, it can make my code super efficient ;) )
if simple ascending sequence say N natural numbers was inserted to a set, will it maintain the order. It did...
But it didn't, it reversed the order. Like you say, in fact:
Why did it maintain reverse order of insertion?
This is just an implementation detail. How any implementation of std::unordered_set stores its elements is not specified by the standard, except for the fact that it has buckets. Another implementation could very well store these 10 integers in order of insertion (not in the reverse order), or really any order at all.
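If the code needs a predictable order, one option is to not depend on the container's iteration order at all. The sketch below uses a hypothetical helper, sorted_elements, that copies the elements out and sorts them; using a std::set in the first place would be another way to get ordered iteration.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Hypothetical helper: copy the elements out and sort them, so the result
// does not depend on how the implementation lays out its buckets.
std::vector<int> sorted_elements(const std::unordered_set<int>& s)
{
    std::vector<int> out(s.begin(), s.end());
    std::sort(out.begin(), out.end());
    return out;
}
```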
Is there a solid reason for this behaviour?
Kind of. I will take GCC's implementation as an example.
std::unordered_set stores elements in buckets. When inserting a first element, it allocates enough space for a small number, say 13. This space is divided into 13 buckets, and each corresponds to the hash of one of the integers 0 through 12 (the hash being the integer itself). That way, inserting the next few elements, which are all integers between 0 and 12, does not cause any rehash or collision, and each element takes its own bucket. Again, the reason they end up in the reverse order is an implementation detail and not relevant to this part.
However, if you insert more than 13 elements, the set needs to reallocate memory and move the elements, and at that point their order can change. In the case of GCC's implementation, it turns out the elements end up in the insertion order after the move, and so inserting the integers from 0 to 13 gives this sequence: 13 0 1 2 3 4 5 6 7 8 9 10 11 12.
You can look at the buckets yourself:
#include <unordered_set>
#include <iostream>
int main() {
std::unordered_set<int> s;
auto print_s = [&s](){
std::cout << "s = [ ";
for (auto i : s) {
std::cout << i << " ";
}
std::cout << "]\n";
};
auto print_bucket = [&s](int bucket){
for (auto it = s.begin(bucket); it != s.end(bucket); ++it) {
std::cout << *it << " ";
}
std::cout << "\n";
};
for (int i = 0; i < 20; ++i) {
std::cout << "i=" << i << "\n";
s.insert(i);
std::cout << "bucket_count=" << s.bucket_count() << "\n";
for (auto b = 0; b < s.bucket_count(); ++b) {
if (s.bucket_size(b) != 0) {
std::cout << "\tb=" << b << ": ";
print_bucket(b);
} else {
std::cout << "\tb=" << b << ": empty\n";
}
}
print_s();
}
}
Demo
Clang (with libc++ - thanks Miles Budnek) and MSVC do something completely different from GCC (libstdc++).
That the container keeps reverse insertion order is probably a side effect.
The primary goal would be to avoid iterating through all the empty buckets, which can be inefficient; as a consequence of that design, the insertion order is somewhat preserved.
From what it appears, the implementation simply keeps a pointer (or id) to the last bucket used and links to it whenever a new bucket is used, which results in buckets linked in reverse.
I have the following program which finds the frequency of the numbers.
map<int,int> mp;
vector<int> x(4);
x[0] = x[2] = x[3] = 6;
x[1] = 8;
for(int i=0;i<x.size();++i)
mp[x[i]]++;
cout<<"size:"<<mp.size()<<endl; //Prints 2 as expected
for(int i=0;i<mp.size();++i) //iterates from 0->8 inclusive
cout<<i<<":"<<mp[i]<<endl;
The output is as follows:
size:2
0:0
1:0
2:0
3:0
4:0
5:0
6:3
7:0
8:1
Why does it iterate over 9 times? I also tried using insert instead of [] operator while inserting elements, but the result is the same. I also tested by iterating over the map using iterator.
Before your printing loop, the populated mp elements are [6] and [8]. When you call cout ... << mp[i] to print with i equal to 0, it inserts a new element [0] with the default value 0, returning a reference to that element which then gets printed; then your loop test i < mp.size() actually compares against the new size of 3. Other iterations add further elements.
You should actually do:
for (std::map<int,int>::const_iterator i = mp.begin();
i != mp.end(); ++i)
std::cout << i->first << ':' << i->second << '\n';
...or, for C++11...
for (auto& e : mp)
std::cout << e.first << ':' << e.second << '\n';
When you access mp[i], the element is added to the map if it doesn't already exist. So the first iteration of the loop will attempt to read mp[0]. This will create the element, so now mp.size() == 3. The size will continue to increase whenever an iteration attempts to access an element that doesn't exist.
When you get to i == 8, the element exists, so it doesn't increase the size. When it gets back to the top of the loop and tests 9 < mp.size(), it will fail and the loop ends.
Because when you use mp[i] in the printing loop, you create those items that don't exist in the map.
Use iterators instead.
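For looking up a single key without the insert-on-access side effect, find() can be used instead of operator[]. The following is only a sketch; the helper names count_frequencies and frequency_of are made up for illustration.

```cpp
#include <map>
#include <vector>

// Hypothetical helper: build the frequency table as in the question.
std::map<int, int> count_frequencies(const std::vector<int>& values)
{
    std::map<int, int> freq;
    for (int v : values)
        ++freq[v];   // operator[] is intended here: insert-or-increment
    return freq;
}

// Hypothetical helper: read a count without modifying the map.
int frequency_of(const std::map<int, int>& freq, int key)
{
    auto it = freq.find(key);   // find() never inserts
    return it == freq.end() ? 0 : it->second;
}
```

Taking the map by const reference in frequency_of also makes an accidental mp[i] a compile error, since operator[] is not available on a const std::map.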
I have 4 vectors with about 45,000 records each right now. I am looking for an efficient method to run through these 4 vectors and output how many times they match the user's input. The data needs to match at the same index of each vector.
Multiple for loops? Vector find?
Thanks!
If the elements need to match at the same location, it seems that a std::find() or std::find_if() combined with a check for the other vectors at the position is a reasonable approach:
std::vector<A> a(...);
std::vector<B> b(...);
std::vector<C> c(...);
std::vector<D> d(...);
std::size_t match(0);
for (auto it = a.begin(), end = a.end(); it != end; ) {
it = std::find_if(it, end, conditionA);
if (it != end) {
if (conditionB(b[it - a.begin()])
&& conditionC(c[it - a.begin()])
&& conditionD(d[it - a.begin()])) {
++match;
}
++it;
}
}
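A concrete version of this approach might look like the sketch below. It assumes, purely for illustration, that all four vectors hold int, have equal length, and that "matching" means equality with four user-supplied values; count_matches is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: count the indices i where a[i], b[i], c[i] and d[i]
// all equal the corresponding wanted value.
// Assumes the four vectors have the same length.
std::size_t count_matches(const std::vector<int>& a,
                          const std::vector<int>& b,
                          const std::vector<int>& c,
                          const std::vector<int>& d,
                          int wantA, int wantB, int wantC, int wantD)
{
    std::size_t matches = 0;
    auto it = a.begin();
    const auto end = a.end();
    // Jump from one candidate in `a` to the next, then check the other
    // three vectors at the same index.
    while ((it = std::find(it, end, wantA)) != end) {
        const std::size_t i = static_cast<std::size_t>(it - a.begin());
        if (b[i] == wantB && c[i] == wantC && d[i] == wantD)
            ++matches;
        ++it;
    }
    return matches;
}
```

This is a single linear pass, which for 45,000 records per vector is likely to be fast enough unless the search is repeated very often.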
What I understood from the description is that you have 4 vectors and a lot of user data, and you need to find out how many times the user data matches the vectors at the same index.
So here goes the code (I am writing c++4.3.2 code):
#include<iostream>
#include<vector>
#include<algorithm>
using namespace std;
int main(){
vector<typeT>a;
vector<typeT>b;
vector<typeT>c;
vector<typeT>d;
vector<typeT>matched;
/*i am assuming you have initialized a,b,c and d;
now we are going to do pre-calculation for matching user data and store
that in vector matched */
int minsize=min(min(a.size(),b.size()),min(c.size(),d.size()));
for(int i=0;i<minsize;i++)
{
if(a[i]==b[i]&&b[i]==c[i]&&c[i]==d[i])matched.push_back(a[i]);
}
return 0;
}
That was the precalculation part. What comes next depends on the data type you are using: use binary search with a little extra counting, or use a better data structure that stores a pair (value, recurrence) and then apply binary search.
Time complexity will be O(n + n*log(n) + m*log(n)), where n is minsize in the code and m is the number of user inputs.
Honestly, I would have a couple of methods to maintain your database(vectors).
Essentially, do a QuickSort to start out with.
Then, every so often, run an insertion sort (faster than QuickSort for partially sorted lists).
Then just run binary search on those vectors.
edit:
I think a better way to store this is, instead of using multiple vectors per entry, to have one vector of a class that stores all the values (the ones in your current vectors).
class entry {
public:
variable data1;
variable data2;
variable data3;
variable data4;
};
Make this into a single vector. Then use the method I described above to sort it.
You will have to sort by the type of data you want to search on first. After that, call binary search on that data.
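One way this could look in practice is sketched below. It assumes, for illustration only, that the four fields are int and that a record matches when all four fields are equal; Entry, sort_entries and count_sorted are hypothetical names, and std::sort stands in for the QuickSort mentioned above.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// One record per index, replacing the four parallel vectors.
struct Entry {
    int data1;
    int data2;
    int data3;
    int data4;
};

// Lexicographic ordering over all four fields.
bool operator<(const Entry& l, const Entry& r)
{
    return std::tie(l.data1, l.data2, l.data3, l.data4)
         < std::tie(r.data1, r.data2, r.data3, r.data4);
}

// Sort once up front (and again after bulk changes).
void sort_entries(std::vector<Entry>& entries)
{
    std::sort(entries.begin(), entries.end());
}

// Binary search on the sorted vector: the number of records equal to `wanted`.
std::size_t count_sorted(const std::vector<Entry>& sorted, const Entry& wanted)
{
    auto range = std::equal_range(sorted.begin(), sorted.end(), wanted);
    return static_cast<std::size_t>(range.second - range.first);
}
```

Each lookup is then O(log n) after the one-time O(n log n) sort, which pays off only when there are many queries.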
You can create a lookup table for the vector with std::unordered_multimap in O(n). Then you can use unordered_multimap::count() to get the number of times the item appears in the vector and unordered_multimap::equal_range() to get the indices of the items inside your vector.
std::vector<std::string> a = {"ab", "ba", "ca", "ab", "bc", "ba"};
std::vector<std::string> b = {"fg", "fg", "ba", "eg", "gf", "ge"};
std::vector<std::string> c = {"pq", "qa", "ba", "fg", "de", "gf"};
std::unordered_multimap<std::string,int> lookup_table;
for (int i = 0; i < a.size(); i++) {
lookup_table.insert(std::make_pair(a[i], i));
lookup_table.insert(std::make_pair(b[i], i));
lookup_table.insert(std::make_pair(c[i], i));
}
// count
std::string userinput;
std::cin >> userinput;
int count = lookup_table.count(userinput);
std::cout << userinput << " shows up " << count << " times" << std::endl;
// print all the places where the key shows up
auto range = lookup_table.equal_range(userinput);
for (auto it = range.first; it != range.second; it++) {
int ind = it->second;
std::cout << " " << it->second << " "
<< a[ind] << " "
<< b[ind] << " "
<< c[ind] << std::endl;
}
This will be the most efficient if you will be searching the lookup table many times. If you only need to search one time, then Dietmar Kühl's approach would be most efficient.
I'm currently trying to display the total no. of topicids and testids based on the name.
However I'm having trouble doing that display. I initially had a vector containing all the data.
For e.g.
user1:name:topic1:test1
user1:name:topic2:test1
user2:name:topic1:test1
user2:name:topic2:test1
Due to the multiple duplicates in the vector, I want to display in the following format:
username:name:numofTopics:numofTests
user1:name:2:2
user2:name:2:2
Therefore, I thought of comparing each name against the next name in the vector and pushing the element into a new vector called singleAcc. The purpose of this is to display the duplicate elements as ONE element.
Below is my code for displaying the data
vector<AccDetails> singleAcc;
for (vector<AccDetails>::iterator itr=accInfo.begin();itr!=accInfo.end()-1; ++itr) {
if (itr->name == itr[1].name) {
//cout << (*itr) << endl;
singleAcc.push_back(*itr);
}
}
for (vector<AccDetails>::iterator itr = singleAcc.begin();itr!=singleAcc.end();++itr) {
cout << left
<< setfill(' ')
<< setw(20) << itr[0].username
<< setw(20) << itr[0].name
<< setw(20) << countTopics(itr->name)
<< setw(20) << countTests()
<< endl;
}
Problem:
On the first vector iteration, the name will not be compared against the last element because of accInfo.end()-1.
How do I display the duplicate elements as ONE element? Is what I'm doing in the 2nd iteration the right thing?
Hope someone can help me with this. Or is there a better way of doing this?
Thanks!
Why this won't work
Your proposed solution simply won't work as intended. Consider three elements that are considered duplicates in a consecutive subsequence (I am using numbers to simplify the concept):
[1,1,1]
The loop will first compare the first 1 to the second 1, and then push_back the first one.
Then it will compare the second 1 to the third one, which again returns true, and the result that was supposed to have no duplicates ends up as:
[1,1]
So it's clearly not something you want to do. In general, it looks like a rather weird problem, but to solve this one part you've posted here, I suggest using std::multiset.
A better solution
Create a comparator that tests for the name field just like you do here.
Then, recovering unique fields is rather simple:
std::multiset<AccDetails> s;
for (auto element_it = s.begin(); element_it != s.end(); element_it = s.upper_bound(*element_it)) {
auto er = s.equal_range(*element_it);
// use this to get all the elements with given name
for (auto i = er.first; i != er.second; ++i)
cout << *i << " ";
// use this to get the number of them
cout << std::distance(er.first, er.second);
}
See a working sample on Coliru.
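For completeness, here is a self-contained sketch of the same idea with the comparator spelled out. The field layout of AccDetails (username, name, topic, test) is guessed from the sample data in the question, and ByName and count_by_name are hypothetical names.

```cpp
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Assumed record layout, based on the "user:name:topic:test" lines.
struct AccDetails {
    std::string username;
    std::string name;
    std::string topic;
    std::string test;
};

// Comparator that looks only at the name field, so records with the same
// name are treated as equivalent by the multiset.
struct ByName {
    bool operator()(const AccDetails& l, const AccDetails& r) const
    {
        return l.name < r.name;
    }
};

// Returns one (name, number of records) pair per distinct name.
std::vector<std::pair<std::string, std::size_t>>
count_by_name(const std::vector<AccDetails>& accInfo)
{
    std::multiset<AccDetails, ByName> s(accInfo.begin(), accInfo.end());
    std::vector<std::pair<std::string, std::size_t>> result;
    // upper_bound jumps past the whole group of equivalent records.
    for (auto it = s.begin(); it != s.end(); it = s.upper_bound(*it)) {
        auto range = s.equal_range(*it);
        result.emplace_back(it->name,
            static_cast<std::size_t>(std::distance(range.first, range.second)));
    }
    return result;
}
```

Because the multiset is ordered by name, the groups come out sorted by name, and the input vector does not need to be sorted beforehand.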
Bonus
You are right that the iterator in the first loop would go out of bounds. The solution for that is rather simple: a zip iterator, which can handle this automatically.
for (auto elem_pair : zip(v, v | drop(1)))
if (elem_pair.first == elem_pair.second)
...
Boost.Range has tools that allow that code to work. It would still suffer from the problems I've mentioned, though.
For example, map1 has values 1 to 10 with some addresses (begin to end).
I want to have values 10 to 1 with the corresponding addresses in map2 (begin to end).
map<long , int* > v;
map<long , int* > rv;
int i,a[10];
for(i=0; i<10; i++)
{
a[i] = i+1;
v.insert(pair<long, int *>(i+1,&a[i]));
}
map<long , int* >::iterator itr = v.begin();
while(itr != v.end())
{
cout << itr->first << " "<<itr->second;
cout << endl;
itr++;
}
rv.insert(v.rbegin(),v.rend());
cout << "copied array: "<<endl;
itr = rv.begin();
while(itr != rv.end())
{
cout << itr->first << " "<<itr->second;
cout << endl;
itr++;
}
I tried the above, but I am getting values 1 to 10 only; my expected values are 10 to 1.
Please help me find out what is wrong.
STL map is an ordered container. The order of items that you get during the iteration is independent of the order in which you insert the items into the container.
The order of iteration is determined by two things:
The value of the key, and
The Compare class passed as the template parameter to the map
You can iterate the map in reverse order (your code snippet shows that you already know how it is done). The performance penalty for reverse-iterating a map, if any, is negligible. You can also provide a non-default Compare (std::greater<long> instead of the default std::less<long>) to have the default order of iteration altered.
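A small sketch of the non-default Compare option follows; descending_copy is a hypothetical helper name. Copying into a map that uses std::greater<long> makes plain begin()-to-end() iteration run from the largest key to the smallest.

```cpp
#include <functional>
#include <map>

// Hypothetical helper: copy a map into one ordered by descending key.
std::map<long, int*, std::greater<long>>
descending_copy(const std::map<long, int*>& v)
{
    // The element order comes from the comparator, not from the order in
    // which the range is inserted.
    return std::map<long, int*, std::greater<long>>(v.begin(), v.end());
}
```

If only the traversal order matters, iterating the original map with rbegin() and rend() gives the same sequence without a copy.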
This is impossible because std::map is an ordered associative container. If you want to preserve the order of insertion, use another container such as std::list or std::vector.
Maps sort by increasing key (as dictated by operator<), so no matter how you insert the elements, they will be returned in sorted order. You are certainly doing the insertion in reverse, but every element placed is being duly sorted into proper ascending order.