C++ map showing values as 0 for all keys - c++

I am inserting numbers from a vector into a map, with the number as the key and its (index + 1) as the value.
But when I print the contents of the map, every value is shown as 0, even though I pass the integer i.
// taking input of n integers in vector s;
vector<int> s;
for(int i=0;i<n;i++){
int tmp;cin>>tmp;
s.push_back(tmp);
}
//creating map int to int
map<int,int> m;
bool done = false;
for(int i=1;i<=s.size();i++){
//check if number already in map
if (m[s[i-1]]!=0){
if (i-m[s[i-1]]>1){
done = true;
break;
}
}
// if number was not in map then insert the number and its index + 1
else{
m.insert({s[i-1],i});
}
}
for(auto it=m.begin();it!=m.end();it++){
cout<<endl<<it->first<<": "<<it->second<<endl;
}
For input
n = 3
and numbers as
1 2 1 in vector s, I expect the output to be
1: 1
2: 2
but output is
1: 0
2: 0
Why 0? What's wrong?

Your code block following the comment:
// check if number already in map
is logically faulty, because operator[] will actually insert an element, using value initialisation(a), if it does not currently exist.
If you were to instead use:
if (m.find(s[i-1]) != m.end())
that would get rid of this problem.
(a) I believe(b) that value initialisation for classes involves one of the constructors; for arrays, value initialisation of each item in the array; and, for other types (this scenario), zero initialisation. That would mean using your method creates an entry for your key, with a zero value, and returns that zero value.
It would then move to the else block (because the value is zero) and try to do the insert. However, this snippet from the standard (C++20, [map.modifiers] discussing insert) means that nothing happens:
If the map already contains an element whose key is equivalent to k, there is no effect.
(b) Though, as my kids will point out frequently, and without much prompting, I've been wrong before :-)

std::map::operator[] will create a default element if the key doesn't exist. Because you evaluate m[s[i-1]] in the if condition, the key is always present by the time m.insert({s[i-1],i}); runs in the else branch, so that insert never has any effect.
To check whether a key is already present in the map, use find(), count() or contains() (if your compiler supports C++20):
//either will work instead of `if (m[s[i-1]]!=0)`
if (m.find(s[i-1]) != m.end())
if (m.count(s[i-1]) == 1)
if (m.contains(s[i-1])) //C++20
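Putting the two answers together, the loop from the question could be rewritten roughly as follows. This is only a sketch: the function name record_positions and the done out-parameter are made up for illustration, and the loop logic is otherwise kept as in the question.

```cpp
#include <map>
#include <vector>

// Hypothetical helper: same logic as the loop in the question, but the
// lookup uses find(), which never inserts a missing key.
std::map<int, int> record_positions(const std::vector<int>& s, bool& done) {
    std::map<int, int> m;
    done = false;
    for (int i = 1; i <= static_cast<int>(s.size()); ++i) {
        auto it = m.find(s[i - 1]);        // read-only lookup
        if (it != m.end()) {
            // number already in map: compare against the stored position
            if (i - it->second > 1) {
                done = true;
                break;
            }
        } else {
            // number not in map yet: store its (index + 1)
            m.insert({s[i - 1], i});
        }
    }
    return m;
}
```

For the input 1 2 1 this leaves the entries 1: 1 and 2: 2 in the map, which is the output the question expected.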

Related

Most efficient algorithm for Two-sum problem (involving indices)

The problem statement is given an array and a given sum "T", find all the pairs of indices of the elements in the array which add up to T. Additional requirements/constraints:
Indexing starts from 0
The indices must be displayed with lower index first (Ex: 24, 30 instead of 30, 24)
The indices must be displayed in ascending order (Ex: if we find (1,3), (0,2) and (5,8) the output must be (0,2) (1,3) (5,8)
There can be duplicate elements in the array, which also have to be considered
Here's my code in C++, I used the hash-table approach using unordered_set:
void Twosum(vector <int> res, int T){
int temp; int ti = -1;
unordered_set<int> s;
vector <int> res2 = res; //Just a copy of the input vector
vector <tuple<int, int>> indices; //Result to be output
for (int i = 0; i < (int)res.size(); i++){
temp = T - res[i];
if (s.find(temp) != s.end()){
while(ti < (int)res.size()){ //While loop for finding all the instances of temp in the array,
//not part of the original hash-table algorithm, something I added
ti = find(res2.begin(), res2.end(), temp) - res2.begin();
//Here find() takes O(n) time which is an issue
res2[ti] = lim; //To remove that instance of temp so that new instances
//can be found in the while loop, here lim = 10^9
if(i <= ti) indices.push_back(make_tuple(i, ti));
else indices.push_back(make_tuple(ti, i));
}
}
s.insert(res[i]);
}
if(ti == -1)
{cout<<"-1 -1"; //if no indices were found
return;}
sort(indices.begin(), indices.end()); //sorting since unordered_set stores elements randomly
for(int i=0; i<(int)indices.size(); i++)
cout<<get<0>(indices[i])<<" "<<get<1>(indices[i])<<endl;
}
This has multiple issues:
firstly that while loop doesn't work as intended, instead it shows SIGABRT error (free(): invalid pointer). The ti index is also somehow going beyond the vector bounds, even though I have that check in the while loop.
Secondly the find() function works in O(n) time, which increases the overall complexity to O(n^2), which is causing my program to timeout during execution. However that function is required since we have to output indices.
Lastly this unordered-set implementation doesn't seem to work when there are many duplicate elements in the array (since sets only take unique elements), which is one of the main constraints of the problem. This makes me think we need some sort of hash function or hashmap to deal with the duplicates? I'm not sure...
All the different algorithms I've found for this on the internet have dealt with just printing the elements and not the indices, hence I've had no luck with this problem.
If any of you know an optimal algorithm for this while also satisfying the constraints and running under O(n) time, your help would be highly appreciated. Thank you in advance.
Here is some pseudo-code answering your question, using hash tables (or maps) and sets. I will leave the translation to C++ to you; the standard hash maps and sets will do the job well.
Notations: we will denote A the array, n its length, and T the "sum".
// first we build a map element -> {set of indices corresponding to this element}
Let M be an empty map; // or hash map, or hash table, or dictionary
for i from 0 to n-1 do {
Let e = A[i];
if e is not a key of M then {
M[e] = new_set()
}
M[e].add(i)
}
// Now we iterate over the elements
for each key e of M do {
if T-e is a key of M then {
display_combinations(M[e], M[T-e]);
}
}
// The helper function display_combinations
function display_combinations(set1, set2) {
for each element e1 of set1 do {
for each element e2 of set2 do {
if e1 < e2 then {
display "(e1, e2)";
} else if e1 > e2 then {
display "(e2, e1)";
}
}
}
}
As said in the comments, the worst-case complexity of this algorithm is O(n²). One way to see that we cannot go below this complexity is that the size of the output itself may be O(n²), for example when all elements of the array have the value T/2.
Edit: this pseudo-code does not output the pairs in the required order. Just store them in an array of pairs and sort this array before displaying it. Likewise, I did not treat the case where a pair (i, i) may satisfy the requirement. You may have to consider it (just change e1 > e2 to e1 >= e2 in the last loop).
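One possible C++ rendering of this idea is sketched below. It is not a line-by-line translation: the function name two_sum_pairs is invented here, the index sets are stored as vectors, and each pair is emitted once with the lower index first by only accepting partners j > i. Because i and the stored indices are both visited in ascending order, the result is already sorted and needs no extra sort. The keys are widened to long long so that T - a[i] cannot overflow.

```cpp
#include <unordered_map>
#include <utility>
#include <vector>

// Returns every index pair (i, j) with i < j and a[i] + a[j] == target,
// in ascending order.
std::vector<std::pair<int, int>> two_sum_pairs(const std::vector<int>& a, int target) {
    // value -> all indices holding that value, in ascending order
    std::unordered_map<long long, std::vector<int>> indices_of;
    for (int i = 0; i < static_cast<int>(a.size()); ++i) {
        indices_of[a[i]].push_back(i);
    }

    std::vector<std::pair<int, int>> result;
    for (int i = 0; i < static_cast<int>(a.size()); ++i) {
        const long long wanted = static_cast<long long>(target) - a[i];
        auto it = indices_of.find(wanted);
        if (it == indices_of.end()) continue;
        for (int j : it->second) {
            if (j > i) result.emplace_back(i, j);  // lower index first, no (i, i)
        }
    }
    return result;
}
```

The running time is O(n + k) on average, where k is the number of pairs reported, so it still degrades to O(n²) when the output itself is that large.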

How does insertion in an unordered_map in C++ work?

int main()
{
auto n=0, sockNumber=0, pairs=0;
unordered_map<int, int> numberOfPairs;
cin >> n;
for(int i=0; i<n; ++i)
{
cin >> sockNumber;
numberOfPairs.insert({sockNumber, 0}); // >>>>>> HERE <<<<<<
numberOfPairs.at(sockNumber) += 1;
if(numberOfPairs.at(sockNumber) % 2 == 0)
{
pairs += 1;
}
}
cout << pairs;
return 0;
}
This code counts the number of pairs in the given input and prints it. I want to know how the insert method of an unordered_map works. Every time I see a number, I've inserted it with a value '0'.
Does the insert method skip inserting the value '0' when it sees the same number again? How does it work?
Input -
9
10 20 20 10 10 30 50 10 20
Output -
3
Does the insert method skip inserting the value '0' when it sees the
same number again?
Yes, it does.
From the cppreference.com page on unordered_map:
Unordered map is an associative container that contains key-value
pairs with unique keys. Search, insertion, and removal of elements
have average constant-time complexity.
And from the cppreference.com page on unordered_map::insert:
Inserts element(s) into the container, if the container doesn't
already contain an element with an equivalent key.
How does it work?
The details depend on the particular standard library implementation.
Basically, unordered_map is implemented as a hash table in which elements are organized into buckets according to their hash. When you try to insert a key-value pair, the hash of the key is computed. If the bucket corresponding to that hash is empty, or it contains no element with an equivalent key, the new pair is inserted into the unordered_map; otherwise nothing is inserted.
A std::unordered_map holds elements with unique keys. If you want to keep inserting the same key, use std::unordered_multimap.
Also, you should realize that std::unordered_map::insert returns a value that denotes whether the insertion was successful.
if ( !numberOfPairs.insert({sockNumber, 0}).second )
{
// insertion didn't work
}
You could have used the above to confirm that the item wasn't inserted, since the same key existed already in the map.
unordered_map does not allow duplicate keys, so if you use the .insert() method with a key that is already present, the insertion fails and the operation is skipped. However, if you use unorderedMap[key] = value with an existing key, it is not skipped: the value stored for that key is updated to the new value.
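To make the difference concrete, here is a small sketch of the sock-counting loop written as a function (count_pairs is a name invented for this example). It relies on the pair returned by insert, whose first member is an iterator to the element with that key whether or not the insertion took place.

```cpp
#include <unordered_map>
#include <vector>

// Counts how many pairs of equal numbers appear in socks.
int count_pairs(const std::vector<int>& socks) {
    std::unordered_map<int, int> seen;
    int pairs = 0;
    for (int sock : socks) {
        // Inserts {sock, 0} only for a new key; an existing count is untouched.
        auto result = seen.insert({sock, 0});
        int& count = result.first->second;  // element for this key either way
        count += 1;
        if (count % 2 == 0) {
            pairs += 1;
        }
    }
    return pairs;
}
```

With the input from the question (10 20 20 10 10 30 50 10 20) this returns 3. Writing seen[sock] = 0 instead of the insert call would reset the count every time and no pair would ever be found.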

Sort std::vector<int> but ignore a certain number

I have an std::vector<int> of the size 10 and each entry is initially -1. This vector represents a leaderboard for my game (high scores), and -1 just means there is no score for that entry.
std::vector<int> myVector;
myVector.resize(10, -1);
When the game is started, I want to load the high scores from a file. I load each line (up to 10 lines), convert the value that is found to an int with std::stoi, and if the number is >0 I use it to replace the -1 currently in the vector at the current position.
All this works. Now to the problem:
Since the values in the file aren't necessarily sorted, I want to sort myVector after I load all entries. I do this with
std::sort(myVector.begin(), myVector.end());
This sorts it in ascending order (lower score is better in my game).
The problem is that, since the vector is initially filled with -1 and there aren't necessarily 10 entries saved in the high scores file, the vector might contain a few -1 in addition to the player's scores.
That means when sorting the vector with the above code, all the -1 will appear before the player's scores.
My question is: How do I sort the vector (in ascending order), but all entries with -1 will be put at the end (since they don't represent a real score)?
Combine partitioning and sorting:
std::sort(v.begin(),
std::partition(v.begin(), v.end(), [](int n){ return n != -1; }));
If you store the iterator returned from partition, you already have a complete description of the range of non-trivial values, so you don't need to look for −1s later.
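A minimal sketch of that idea, wrapped in a hypothetical helper (sort_scores is not a standard function) that returns the stored iterator:

```cpp
#include <algorithm>
#include <vector>

// Moves all real scores to the front, sorts them ascending, and returns an
// iterator to the first -1 placeholder (or end() if there is none).
std::vector<int>::iterator sort_scores(std::vector<int>& scores) {
    auto first_placeholder = std::partition(
        scores.begin(), scores.end(), [](int n) { return n != -1; });
    std::sort(scores.begin(), first_placeholder);
    return first_placeholder;
}
```

The range from scores.begin() to the returned iterator then holds exactly the real scores, so code that displays the leaderboard can stop there without testing for -1.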
You can provide lambda as parameter for sort:
std::sort(myVector.begin(), myVector.end(),[]( int i1, int i2 ) {
if( i1 == -1 ) return false;
if( i2 == -1 ) return true;
return i1 < i2; }
);
Here is the demo (copied from Kerrek).
However, it is not clear how you would tell which score belongs to which entry after the sort.
From your description, it appears that a score can never be negative. In that case, I'd recommend making the scores a vector of unsigned int. You can define a constant
const unsigned int NO_SCORE = -1; // not named INFINITY, which is already a macro in <cmath>
and load your vector with NO_SCORE initially. Converting -1 to unsigned int yields the maximum value that type can hold (the same bit pattern as -1 in 2's complement).
Then you could simply sort using
sort(v.begin(),v.end());
All NO_SCORE entries will be at the end after the sort.
std::sort supports using your own comparison function with the signature bool cmp(const T& a, const T& b);. So write your own function similar to this:
bool sort_negatives(const int& a, const int& b)
{
if (a == -1) {
return false;
}
if (b == -1) {
return true;
}
return a < b;
}
And then call sort like std::sort(myVector.begin(), myVector.end(), sort_negatives);.
EDIT: Fixed the logic courtesy of Slava. If you are using a compiler with C++11 support, use the lambda or partition answers, but this should work on compilers pre C++11.
For the following, I assume that the -1 values are all placed at the end of the vector. If they are not, use KerrekSB's method, or make sure that you do not skip the indices in the vector for which no valid score is in the file (by using an extra index / iterator for writing to the vector).
std::sort takes a pair of iterators. Simply pass the sub-range that contains only real scores (values other than -1). You already know the end of this range after reading from the file. If you already use iterators to fill the vector, like in
auto it = myVector.begin();
while (...) {
*it = stoi(...);
++it;
}
then simply use it instead of myVector.end():
std::sort(myVector.begin(), it);
Otherwise (i.e., when using indices to fill up the vector, let's say i is the number of values), use
std::sort(myVector.begin(), myVector.begin() + i);
An alternative approach is to use reserve() instead of resize().
std::vector<int> myVector;
myVector.reserve(10);
for each line in file:
int number_in_line = ...;
myVector.push_back(number_in_line);
std::sort(myVector.begin(), myVector.end());
This way, the vector holds only the numbers that are actually in the file, with no extra (spurious) values such as -1. If the vector needs to be passed later to another module or function for further processing, that code does not need to know about the special meaning of -1.

Using a hash to find one duplicated and one missing number in an array

I had this question during an interview and am curious to see how it would be implemented.
Given an unsorted array of integers from 0 to x. One number is missing and one is duplicated. Find those numbers.
Here is what I came up with:
std::vector<int> counts(x + 1, 0); // histogram, every count starts at zero
for(int i =0;i<=x; i++){
counts[a[i]]++;
if(counts[a[i]] == 2)
cout<<"Duplicate element: "<<a[i]; //I realized I could find this here
}
for(int j=0; j<=x; j++){
if(counts[j] == 0)
cout<<"Missing element: "<<j;
//if(counts[j] == 2)
// cout<<"Duplicate element: "<<j; //No longer needed here.
}
My initial solution was to create another array of size x+1, loop through the given array and index into my array at the values of the given array and increment. If after the increment any value in my array is two, that is the duplicate. However, I then had to loop through my array again to find any value that was 0 for the missing number.
I pointed out that this might not be the most time efficient solution, but wasn't sure how to speed it up when I was asked. I realized I could move finding the duplicate into the first loop, but that didn't help with the missing number. After waffling for a bit, the interviewer finally gave me the idea that a hash would be a better/faster solution. I have not worked with hashes much, so I wasn't sure how to implement that. Can someone enlighten me? Also, feel free to point out any other glaring errors in my code... Thanks in advance!
If the range of values is about the same as, or smaller than, the number of values in the array, then using a hash table will not help. In this case, there are x+1 possible values in an array of size x+1 (one missing, one duplicate), so a hash table isn't needed, just a histogram, which you've already coded.
If the assignment were changed to be looking for duplicate 32 bit values in an array of size 1 million, then the second array (a histogram) could need to be 2^32 = 4 billion counts long. This is when a hash table would help, since the hash table size is a function of the array size, not the range of values. A hash table of size 1.5 to 2 million would be large enough. In this case, you would have 2^32 - 2^20 = 4293918720 "missing" values, so that part of the assignment would go away.
Wiki article on hash tables:
Hash Table
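For the modified assignment (duplicates among arbitrary 32-bit values), a hash-based version could be sketched as follows. The function name find_duplicates is made up, and std::unordered_set stands in for the hash table; its memory use grows with the number of elements in the array, not with the 2^32 possible values.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Returns each value that occurs more than once, in order of first repeat.
std::vector<std::uint32_t> find_duplicates(const std::vector<std::uint32_t>& values) {
    std::unordered_set<std::uint32_t> seen;
    seen.reserve(values.size());  // table size follows the array size
    std::unordered_set<std::uint32_t> reported;
    std::vector<std::uint32_t> duplicates;
    for (std::uint32_t v : values) {
        const bool is_new = seen.insert(v).second;
        // report a repeated value only the first time it repeats
        if (!is_new && reported.insert(v).second) {
            duplicates.push_back(v);
        }
    }
    return duplicates;
}
```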
If x were small enough (such that the sum of 0..x can be represented), you could compute the sum of the unique values in a, and subtract that from the sum of 0..x, to get the missing value, without needing the second loop.
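A sketch of that single-pass variant is below. It assumes the input really does contain x+1 values from 0..x with exactly one duplicate and one missing value; the function name and the use of std::vector<bool> as the "already seen" record are choices made for this example.

```cpp
#include <utility>
#include <vector>

// Returns {duplicate, missing} for an array of x+1 values drawn from 0..x
// in which one value occurs twice and one value is absent.
std::pair<int, int> find_duplicate_and_missing(const std::vector<int>& a) {
    const long long x = static_cast<long long>(a.size()) - 1;
    std::vector<bool> seen(a.size(), false);
    long long unique_sum = 0;  // sum of the distinct values in a
    int duplicate = -1;
    for (int v : a) {
        if (seen[v]) {
            duplicate = v;     // second occurrence
        } else {
            seen[v] = true;
            unique_sum += v;
        }
    }
    // 0 + 1 + ... + x minus the distinct values leaves the missing one
    const int missing = static_cast<int>(x * (x + 1) / 2 - unique_sum);
    return {duplicate, missing};
}
```

The asymptotic cost is still O(x); the only saving is the second loop over the histogram.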
Here is a stab at a solution that uses an index (a true key-value hash doesn't make sense when the array is guaranteed to include only integers). Sorry OP, it's in Ruby:
values = mystery_array.sort.map.with_index { |n,i| n if n != i }.compact
missing_value,duplicate_value = mystery_array.include?(values[0] - 1) ? \
[values[-1] + 1, values[0]] : [values[0] - 1, values[-1]]
The functions used likely employ a non-trivial amount of looping behind the scenes, and this will create a (possibly very large) variable values which contains a range between the missing and/or duplicate value, as well as a second lookup loop, but it works.
Perhaps the interviewer meant to say Set instead of hash?
Sorting allowed?
auto first = std::begin(a);
auto last = std::end(a);
// sort it
std::sort( first, last );
// find duplicates
auto first_duplicate = *std::adjacent_find( first, last );
// find missing value
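auto missing = std::adjacent_find(first, last, [](int x, int y) {return x+2 == y;});
int missing_number = 0;
if (missing != last)
{
// a gap of two between neighbours: the value in between is missing
missing_number = 1+ *missing;
}
else
{
if (*first != 0)
{
// no gap and the sorted range does not start at 0, so 0 is missing
missing_number = 0;
}
else
{
// no gap and 0 is present, so the largest value is missing
missing_number = *(last - 1) + 1;
}
}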
Both could be done in a single hand-written loop, but I wanted to use only stl algorithms. Any better idea for handling the corner cases?
for (i=0 to length) { // first loop
for( j=0 to length ){ // second loop
if (t[i]==j+1) {
if (counter==0){//make sure duplicated number has not been found already
for( k=i+1 to length ) { //search for duplicated number
if(t[k]==j+1){
j+1 is the duplicated number ;
if(missingIsFound)
exit // exit program, missing and dup are found
counter=1 ;
}//end if t[k]..
}//end loop for duplicated number
} // end condition to search
continue ; // continue to first loop
}
else{
j+1 is the missing number ;
if(duplicatedIsFound)
exit // exit program, missing and dup are found
continue ; //continue to first loop
}//end second loop
} //end first loop

simple hash map with vectors in C++

I'm in my first semester of studies and as a part of my comp. science assignment I have to implement a simple hash map using vectors, but I have some problems understanding the concept.
First of all I have to implement a hash function. To avoid collisions I thought it would be better to use double hashing, as follows:
do {
h = (k % m + j*(1 + (k % (m-2)))) % m;
j++;
} while ( j % m != 0 );
where h is the hash to be returned, k is the key and m is the size of hash_map (and a prime number; they are all of type int).
This was easy, but then I need to be able to insert or remove a pair of key and the corresponding value in the map.
The return type of the two functions should be bool, so I have to return either true or false, and I'm guessing that I should return true when there is no element at position h in the vector. (But I have no idea why remove should be bool as well.)
My problem is what to do when the insert function returns false (i.e. when there is already a key-value pair saved on position h - I implemented this as a function named find). I could obviously move it to the next free place by simply increasing j, but then the hash calculated by my hash function wouldn't tell us anymore at which place a certain key is saved, causing wrong behaviour of remove function.
Is there any good example online that doesn't use the predefined std methods? (My Google has been behaving weirdly for the past few days and only returns useless hits in the local language.)
I've been told to move my comment to an answer, so here it is. I am presuming your get method takes the value you are looking for as an argument.
What we are going to do is a process called linear probing.
When we insert the value we hash it as normal. Let's say our hash value is 4:
[x,x,x,,,x,x]
as we can see we can simply insert it in:
[x,x,x,x,,x,x]
however if 4 is taken when we insert it we will simply move to the next slot that is empty
[x,x,x,**x**,x,,x,x]
In linear probing, if we reach the end we loop back round to the beginning until we find a slot. You shouldn't run out of space as long as you grow the vector (and re-insert the existing entries) before it gets close to full.
This will cause problems when you are searching, because the value that hashes to 4 may not be at 4 anymore (in this case it's at 5). To solve this we do a little bit of a hack. Note that we still get expected O(1) run time for inserting and retrieval as long as the load factor is kept well below 1.
In our get method, instead of returning whatever is in the array at 4, we start looking for our value at 4. If it's there, we return it. If not, we look at the value at 5, and so on, until we find the value.
In pseudo-code the new parts look like this:
bool insert(value){
h = hash(value);
while(node[h] != null){
h++;
if( h == node.length){
h = 0;
}
}
node[h] = value;
return true;
}
get
get(value){
h = hash(value);
roundTrip = 0; //used to see if we keep going round the hashmap
while(true){
if(node[h] == value)
return node[h];
h++;
if( h == node.length){
h = 0;
roundTrip++;
}
if(roundTrip > 1){ //we can't find it after going round list once
return -1;
}
}
}
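For reference, here is one way the linear probing idea could look in C++ with a std::vector as storage. Everything about the interface is an assumption made for the sketch: the class name ProbingMap, int keys and values, a fixed capacity chosen at construction, and the use of a "Deleted" marker so that remove does not break the probe sequence of other keys (a detail the pseudo-code above does not cover). It needs C++17 for std::optional.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

class ProbingMap {
public:
    explicit ProbingMap(std::size_t capacity) : slots_(capacity) {}

    // Returns false if the key already exists or the table is full.
    bool insert(int key, int value) {
        const std::size_t n = slots_.size();
        std::size_t target = n;  // first reusable slot seen; n means "none yet"
        std::size_t h = n ? home(key) : 0;
        for (std::size_t probes = 0; probes < n; ++probes, h = (h + 1) % n) {
            const Slot& s = slots_[h];
            if (s.state == State::Used && s.key == key) return false;
            if (s.state != State::Used && target == n) target = h;
            if (s.state == State::Empty) break;  // key cannot be further along
        }
        if (target == n) return false;
        slots_[target] = Slot{State::Used, key, value};
        return true;
    }

    // Returns the stored value, or std::nullopt if the key is absent.
    std::optional<int> find(int key) const {
        const std::size_t i = locate(key);
        if (i == slots_.size()) return std::nullopt;
        return slots_[i].value;
    }

    // Returns true if the key was present and has been removed.
    bool remove(int key) {
        const std::size_t i = locate(key);
        if (i == slots_.size()) return false;
        slots_[i].state = State::Deleted;  // keep later entries reachable
        return true;
    }

private:
    enum class State { Empty, Used, Deleted };
    struct Slot {
        State state = State::Empty;
        int key = 0;
        int value = 0;
    };

    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % slots_.size();
    }

    // Index of the slot holding key, or slots_.size() if it is not stored.
    std::size_t locate(int key) const {
        const std::size_t n = slots_.size();
        std::size_t h = n ? home(key) : 0;
        for (std::size_t probes = 0; probes < n; ++probes, h = (h + 1) % n) {
            const Slot& s = slots_[h];
            if (s.state == State::Empty) break;
            if (s.state == State::Used && s.key == key) return h;
        }
        return n;
    }

    std::vector<Slot> slots_;
};
```

Here insert returns false for a duplicate key or a full table, and remove returns false when there was nothing to remove, which is one plausible reading of why both functions are required to return bool.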