How to initialize unordered_map directly with fixed element? - c++

I want to initialize an unordered_map with a fixed number of elements (100). The keys run from 0 to 99, and every value is 0.
using HashMap = unordered_map < int, int > ;
HashMap map;
for (int idx = 0; idx < 100; ++idx) {
map[idx] = 0;
}
Question 1:
Is there a direct way to do that, like the following Python code?
d = {x: x % 2 == 0 for x in range(1, 11)}
Question 2:
With the initialization code above, I expected all elements to be in ascending order, but the results are:
Why is the first element 8 and the second element 64, while all the remaining elements are in ascending order?

This is not quite so pretty as the Python expression, but it should do the trick.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <unordered_map>
int main() {
std::unordered_map<int, bool> m;
int i = -1;
std::generate_n(std::inserter(m, m.begin()),
10,
[&i](){
++i;
return std::make_pair(i, i % 2 == 0);
});
for (auto const &p: m)
std::cout << '<' << p.first << ", " << p.second << ">\n";
return 0;
}
There is a reason unordered maps are called unordered maps. Since they are implemented as hash maps, the keys are not in any predictable order. Using an std::unordered_map for a dense collection of integer keys is probably not the most efficient solution to any problem, particularly if you expect to be able to extract the keys in order.
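For the literal case in the question (keys 0 to 99, all values 0), a minimal sketch might look like the following. The helper names `make_zero_map` and `make_zero_table` are made up for illustration, and `count` is assumed to be non-negative. The second helper shows the dense alternative hinted at above, where the vector index plays the role of the key.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: builds a map with keys 0..count-1, every value 0.
// Assumes count >= 0.
std::unordered_map<int, int> make_zero_map(int count) {
    std::unordered_map<int, int> m;
    m.reserve(static_cast<std::size_t>(count));  // avoid rehashing while inserting
    for (int key = 0; key < count; ++key) {
        m.emplace(key, 0);
    }
    return m;
}

// Dense alternative: the index itself is the key, so no hashing is involved.
// Assumes count >= 0.
std::vector<int> make_zero_table(int count) {
    return std::vector<int>(static_cast<std::size_t>(count), 0);
}
```

Calling `make_zero_map(100)` gives the same contents as the loop in the question; iteration order over the map is still unspecified.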

Consider boost::irange.
The internal data structure of an unordered map is a hash table, which does not preserve key order.
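If the keys are needed in ascending order even though they live in a hash table, one option is to copy them out and sort them. This is only a sketch; `sorted_keys` is a hypothetical helper name, not something from the answers above.

```cpp
#include <algorithm>
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns the keys of m in ascending order.
// The map itself is left untouched and keeps its unspecified order.
std::vector<int> sorted_keys(const std::unordered_map<int, int>& m) {
    std::vector<int> keys;
    keys.reserve(m.size());
    for (const auto& entry : m) {
        keys.push_back(entry.first);
    }
    std::sort(keys.begin(), keys.end());
    return keys;
}
```

If sorted iteration is needed often, std::map (which keeps keys ordered at all times) is probably the simpler choice.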

Related

STL container to use for deleting a range of values

I have n integers. I am given a starting value (SV) and an ending value (EV), and I need to delete the values that lie within that range. The starting and ending values may not themselves exist in the set of n integers. I need to do this in O(number of elements deleted). A container such as a vector of integers fails because I need a binary search to get the iterator to the first element greater than or equal to SV, and likewise for EV, which takes an additional O(log n) time. Any approach appreciated.
Edit: I was even thinking of using a map, storing the values as keys and erasing by key. But the problem again is that lower_bound and upper_bound take logarithmic time.
If you need to keep the container ordered, just use:
set http://www.cplusplus.com/reference/set/set/
or multiset http://www.cplusplus.com/reference/set/multiset/ if values can repeat.
Both have lower_bound and upper_bound functionality.
You can use the erase-remove idiom with std::remove_if to check if each value is between the two bounds.
#include <algorithm>
#include <iostream>
#include <vector>
int main()
{
std::vector<int> values = {1,2,3,4,5,6,7,8,9};
int start = 3;
int stop = 7;
values.erase(std::remove_if(values.begin(),
values.end(),
[start, stop](int i){ return i > start && i < stop; }),
values.end());
for (auto i : values)
{
std::cout << i << " ";
}
}
Output
1 2 3 7 8 9
As Marek R suggested there is std::multiset.
Complexity of the whole exercise is O(log(values.size()) + std::distance(first, last)), where that distance is the number of elements erased.
It's difficult to see how you can beat that. In the end there is always going to be some term that grows with the size of the container, and its logarithm is a good deal!
#include <iostream>
#include <set>
void dump(const std::multiset<int>& values);
int main() {
std::multiset<int> values;
values.insert(5);
values.insert(7);
values.insert(9);
values.insert(11);
values.insert(8);
values.insert(8);
values.insert(76);
dump(values);
auto first=values.lower_bound(7);
auto last=values.upper_bound(10);
values.erase(first,last);
dump(values);
return 0;
}
void dump(const std::multiset<int>& values){
auto flag=false;
std::cout<<'{';
for(auto curr : values){
if(flag){
std::cout<<',';
}else{
flag=true;
}
std::cout<< curr;
}
std::cout<<'}'<<std::endl;
}
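The same lower_bound/upper_bound idea also works on a std::vector, provided the vector is already sorted in ascending order. The sketch below uses a hypothetical helper name, `erase_closed_range`, and treats both bounds as inclusive (matching the multiset example, not the remove_if example). Note that vector::erase also has to shift the elements after the erased range, so the cost is not purely logarithmic plus the number erased.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: erases every x with lo <= x <= hi from a vector that
// is assumed to be sorted ascending. Returns the number of erased elements.
std::size_t erase_closed_range(std::vector<int>& sorted, int lo, int hi) {
    auto first = std::lower_bound(sorted.begin(), sorted.end(), lo);  // first x >= lo
    auto last = std::upper_bound(first, sorted.end(), hi);            // first x > hi
    const std::size_t erased = static_cast<std::size_t>(last - first);
    sorted.erase(first, last);
    return erased;
}
```

With the values from the multiset example stored sorted, `erase_closed_range(v, 7, 10)` removes 7, 8, 8 and 9.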

Which STL to use to find index by value in O(1) in C++

Say I have an array arr[] = {1 , 3 , 5, 12124, 24354, 12324, 5}
I want to know the index of the value 5(i.e, 2) in O(1).
How should I go about this?
P.S :
1. Throughout my program, I shall be finding only indices and not the vice versa (getting the value by index).
2. The array can have duplicates.
If you can guarantee there are no duplicates in the array, your best bet is probably creating an unordered_map where the map key is the array value, and the map value is its index.
I wrote a method below that converts an array to an unordered_map.
#include <unordered_map>
#include <iostream>
template <typename T>
void arrayToMap(const T arr[], size_t arrSize, std::unordered_map<T, int>& map)
{
for(int i = 0; i < arrSize; ++i) {
map[arr[i]] = i;
}
}
int main()
{
int arr[] = { 1 , 3 , 5, 12124, 24354, 12324, 5 };
std::unordered_map<int, int> map;
arrayToMap(arr, sizeof(arr)/sizeof(*arr), map);
std::cout << "Value" << '\t' << "Index" << std::endl;
for(auto it = map.begin(), e = map.end(); it != e; ++it) {
std::cout << it->first << "\t" << it->second << std::endl;
}
}
However, in your example the value 5 appears twice, which produces a surprising result in the code above: the map has no entry with index 2, because the later occurrence (index 6) overwrites it. Even if you use an array, you would be confronted with a similar problem (i.e. should you use the value at 2 or 6?).
If you really need both indices, you could use unordered_multimap, but accessing elements isn't as easy as using operator[] (you have to use unordered_multimap::find(), which returns an iterator).
template <typename T>
void arrayToMap(const T arr[], size_t arrSize, std::unordered_multimap<T, int>& map)
{
for(int i = 0; i < arrSize; ++i) {
map.emplace(arr[i], i);
}
}
Finally, you should consider that unordered_map's fast O(1) look-up time comes with some overhead, so it uses more memory than a simple array. But if you end up using an array (which is comparatively much more memory efficient), searching for a specific value is a linear scan, O(n), where n is the index of the value.
Edit - If you need the duplicate with the lowest index to be kept instead of the highest, you can just reverse the order of insertion:
template <typename T>
void arrayToMap(const T arr[], size_t arrSize, std::unordered_map<T, int>& map)
{
for(int i = static_cast<int>(arrSize) - 1; i >= 0; --i) {
map[arr[i]] = i;
}
}
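An alternative to reversing the loop is to iterate forward and insert with emplace, which leaves an existing key untouched, so the lowest index wins. This is a sketch under the assumption that the data is held in a std::vector<int>; `first_index_of` is a hypothetical name.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: maps each distinct value to the index of its FIRST
// occurrence. emplace does nothing when the key is already present.
std::unordered_map<int, std::size_t> first_index_of(const std::vector<int>& values) {
    std::unordered_map<int, std::size_t> index;
    index.reserve(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        index.emplace(values[i], i);
    }
    return index;
}
```

For the array in the question, looking up 5 in the returned map yields 2 rather than 6.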
Use std::unordered_map from C++11 to map elements as keys and indices as values. Then you can answer your query in average O(1) time. std::unordered_map works as long as there are no duplicates, but it costs linear extra space.
If the range of your values is not too large, you can use an array as well. This yields even better Θ(1) complexity.
Use unordered_multimap (C++11 and later) with the value as the key and the position index as the value.
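To read all indices of a duplicated value back out of an unordered_multimap, equal_range is usually more convenient than find. The following is a minimal sketch; `build_index` and `indices_of` are hypothetical helper names, and the result is sorted only because the multimap does not promise any particular order among equal keys.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: value -> every index at which it occurs.
std::unordered_multimap<int, std::size_t> build_index(const std::vector<int>& values) {
    std::unordered_multimap<int, std::size_t> index;
    index.reserve(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        index.emplace(values[i], i);
    }
    return index;
}

// Hypothetical helper: all indices stored for `value`, in ascending order.
std::vector<std::size_t> indices_of(const std::unordered_multimap<int, std::size_t>& index,
                                    int value) {
    std::vector<std::size_t> result;
    auto range = index.equal_range(value);  // [first, second) holds the matches
    for (auto it = range.first; it != range.second; ++it) {
        result.push_back(it->second);
    }
    std::sort(result.begin(), result.end());
    return result;
}
```

For the array in the question, `indices_of(build_index(arr), 5)` contains 2 and 6, and a value that never occurs gives an empty vector.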

remove duplicates int number in a vector c++

I'm trying to remove duplicate integers from a vector. My aim is to keep only one copy of each. I wrote some simple code, but it doesn't work properly. Can anyone help? Thanks in advance.
#include <iostream>
#include <vector>
using namespace std;
int main()
{
int a = 10, b = 10 , c = 8, d = 8, e = 10 , f = 6;
vector<int> vec;
vec.push_back(a);
vec.push_back(b);
vec.push_back(c);
vec.push_back(d);
vec.push_back(e);
vec.push_back(f);
for (int i=vec.size()-1; i>=0; i--)
{
for(int j=vec.size()-1; j>=0; j--)
{
if(vec[j] == vec[i-1])
vec.erase(vec.begin() + j);
}
}
for(int i=0; i<vec.size(); i++)
{
cout<< "vec: "<< vec[i]<<endl;
}
return 0;
}
Don't use a list for this. Use a set:
#include <set>
...
set<int> vec;
This will ensure you will have no duplicates by not adding an element if it already exists.
To remove duplicates it's easier if you sort the array first. The code below uses two different methods for removing the duplicates: one using the built-in C++ algorithms and the other using a loop.
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
using namespace std;
int main() {
int a = 10, b = 10 , c = 8, d = 8, e = 10 , f = 6;
vector<int> vec;
vec.push_back(a);
vec.push_back(b);
vec.push_back(c);
vec.push_back(d);
vec.push_back(e);
vec.push_back(f);
// Sort the vector
std::sort(vec.begin(), vec.end());
// Remove duplicates (v1)
std::vector<int> result;
std::unique_copy(vec.begin(), vec.end(), std::back_inserter(result));
// Print results
std::cout << "Result v1: ";
std::copy(result.begin(), result.end(), std::ostream_iterator<int>(cout, " "));
std::cout << std::endl;
// Remove duplicates (v2)
std::vector<int> result2;
for (int i = 0; i < vec.size(); i++) {
if (i > 0 && vec[i] == vec[i - 1])
continue;
result2.push_back(vec[i]);
}
// Print results (v2)
std::cout << "Result v2: ";
std::copy(result2.begin(), result2.end(), std::ostream_iterator<int>(cout, " "));
std::cout << std::endl;
return 0;
}
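When a second vector is not needed, the same sort-then-deduplicate idea can be done in place with std::unique followed by erase. A minimal sketch, with `sort_and_deduplicate` as a hypothetical name:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: sorts v, then removes adjacent duplicates in place.
// std::unique only moves the unique elements to the front and returns the
// new logical end; erase then drops the leftover tail.
void sort_and_deduplicate(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
}
```

Applied to the values from the question (10, 10, 8, 8, 10, 6), this leaves 6, 8, 10.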
If you need to preserve the initial order of the numbers, you can write a function that removes duplicates using a helper set<int>:
void removeDuplicates( vector<int>& v )
{
set<int> s;
vector<int> res;
for( int i = 0; i < v.size(); i++ ) {
int x = v[i];
if( s.find(x) == s.end() ) {
s.insert(x);
res.push_back(x);
}
}
swap(v, res);
}
The problem with your code is here:
for(int j=vec.size()-1; j>=0; j--)
{
if(vec[j] == vec[i-1])
vec.erase(vec.begin() + j);
}
There's going to be a time when j == i-1, so an element is compared with itself and erased, which breaks your algorithm. There will also be a time when i-1 < 0, so you will access the vector out of bounds (undefined behavior with operator[]).
What you can do is change your for loop conditions:
for (int i = vec.size() - 1; i>0; i--){
for(int j = i - 1; j >= 0; j--){
//do stuff
}
}
This way, the two elements you're comparing will never be the same element, and your indices will always be at least 0.
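One way to fill in the loop body with those bounds is sketched below. It is an assumption about what "do stuff" should be, not the answerer's code: each earlier duplicate of vec[i] is erased, and i is decremented after every erase because the element it refers to shifts one position to the left. The function name is hypothetical. This keeps the last occurrence of each value and is quadratic, so it only suits small vectors.

```cpp
#include <vector>

// Hypothetical helper: for each element, erases all EARLIER equal elements,
// so the last occurrence of every value survives.
void erase_earlier_duplicates(std::vector<int>& vec) {
    for (int i = static_cast<int>(vec.size()) - 1; i > 0; --i) {
        for (int j = i - 1; j >= 0; --j) {
            if (vec[j] == vec[i]) {
                vec.erase(vec.begin() + j);
                --i;  // the element formerly at i now sits at i - 1
            }
        }
    }
}
```

For the values in the question (10, 10, 8, 8, 10, 6), the vector ends up as 8, 10, 6.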
Others have already pointed to std::set. This is certainly simple and easy, but it can be fairly slow (quite a bit slower than std::vector), largely because, like a linked list, it consists of individually allocated nodes, linked together via pointers to form a balanced tree [1].
You can (often) improve on that by using a std::unordered_set instead of a std::set. This uses a hash table [2] instead of a tree to store the data, so it normally uses contiguous storage for its buckets, and gives O(1) expected access time instead of the O(log N) expected for a tree.
An alternative that's often faster is to collect the data in the vector, then sort the data and use std::unique to eliminate duplicates. This tends to be best when you have two distinct phases of operation: first you collect all the data, then you need duplicates removed. If you frequently alternate between adding/deleting data, and needing a duplicate free set, then something like std::set or std::unordered_set that maintain the set without duplicates at all times may be more useful.
All of these also affect the order of the items. A std::set always maintains the items sorted in a defined order. With std::unique you need to explicitly sort the data. With std::unordered_set you get the items in an arbitrary order that is neither their original order nor sorted.
If you need to maintain the original order, but without duplicates, you normally end up needing to store the data twice. For example when you need to add a new item, you attempt to insert it into an std::unordered_set, then if and only if that succeeds, add it to the vector as well.
[1] Technically, implementation as a tree isn't strictly required, but it's about the only possibility of which I'm aware that can meet the requirements, and all the implementations of which I'm aware are based on trees.
[2] Again, other implementations might be theoretically possible, but all of which I'm aware use hashing. In this case, enough of the implementation is exposed that avoiding a hash table would probably be even more difficult.
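The order-preserving approach described in this answer (try to insert into a std::unordered_set, and keep the item only if the insert succeeds) can be sketched as follows. The function name is hypothetical, and the sketch copies into a new vector rather than maintaining both containers incrementally.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the elements of input with duplicates removed,
// keeping the first occurrence of each value in its original position order.
std::vector<int> unique_in_original_order(const std::vector<int>& input) {
    std::unordered_set<int> seen;
    std::vector<int> result;
    for (int x : input) {
        // insert() returns a pair; .second is true only for a new element.
        if (seen.insert(x).second) {
            result.push_back(x);
        }
    }
    return result;
}
```

For 10, 10, 8, 8, 10, 6 this yields 10, 8, 6.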
The body of a range-based for loop must not change the size of the sequence over which it is iterating.
You can also filter out duplicates before calling push_back:
void push(std::vector<int> & arr, int n)
{
for(int i = 0; i != arr.size(); ++i)
{
if(arr[i] == n)
{
return;
}
}
arr.push_back(n);
}
... ...
push(vec, a);
push(vec, b);
push(vec, c);
...

Is there a sorted container in the STL?

Is there a sorted container in the STL?
What I mean is following: I have an std::vector<Foo>, where Foo is a custom made class. I also have a comparator of some sort which will compare the fields of the class Foo.
Now, somewhere in my code I am doing:
std::sort( myvec.begin(), myvec.end(), comparator );
which will sort the vector according to the rules I defined in the comparator.
Now I want to insert an element of class Foo into that vector. If I could, I would like to just write:
mysortedvector.push_back( Foo() );
and what would happen is that the vector will put this new element according to the comparator to its place.
Instead, right now I have to write:
myvec.push_back( Foo() );
std::sort( myvec.begin(), myvec.end(), comparator );
which is just a waste of time, since the vector is already sorted and all I need is to place the new element appropriately.
Now, because of the nature of my program, I can't use std::map<> as I don't have a key/value pairs, just a simple vector.
If I use std::list, I again need to call sort after every insertion.
Yes, std::set, std::multiset, std::map, and std::multimap are all sorted using std::less as the default comparison operation. The underlying data structure used is typically a balanced binary search tree such as a red-black tree. So if you add an element to these data structures and then iterate over the contained elements, the output will be in sorted order. The complexity of adding N elements to the data structure will be O(N log N), the same as sorting a vector of N elements using any common O(N log N) complexity sort.
In your specific scenario, since you don't have key/value pairs, std::set or std::multiset is probably your best bet.
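If staying with std::vector is preferred, a new element can be placed directly at its sorted position instead of re-sorting the whole vector. The sketch below is an alternative to the set-based suggestion, not part of it; `insert_sorted` is a hypothetical name, and the vector is assumed to be already sorted with respect to the same comparator. The search is O(log N), but the insert itself still shifts elements, so it is O(N) overall.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper: inserts value into an already sorted vector so that it
// stays sorted. upper_bound places the new element after any equivalent ones.
template <typename T, typename Compare = std::less<T>>
void insert_sorted(std::vector<T>& sorted, const T& value, Compare comp = Compare()) {
    auto pos = std::upper_bound(sorted.begin(), sorted.end(), value, comp);
    sorted.insert(pos, value);
}
```

The same comparator that was passed to std::sort would be passed as the third argument here.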
I'd like to expand on Jason's answer. I agree with Jason that either std::set or std::multiset is the best choice for your specific scenario. I'd like to provide an example to help you narrow down the choice further.
Let's assume that you have the following class Foo:
class Foo {
public:
Foo(int v1, int v2) : val1(v1), val2(v2) {};
bool operator<(const Foo &foo) const { return val2 < foo.val2; }
int val1;
int val2;
};
Here, Foo overloads the < operator. This way, you don't need to specify an explicit comparator function. As a result, you can simply use a std::multiset instead of a std::vector in the following way. You just have to replace push_back() by insert():
int main()
{
std::multiset<Foo> ms;
ms.insert(Foo(1, 6));
ms.insert(Foo(1, 5));
ms.insert(Foo(3, 4));
ms.insert(Foo(2, 4));
for (auto const &foo : ms)
std::cout << foo.val1 << " " << foo.val2 << std::endl;
return 0;
}
Output:
3 4
2 4
1 5
1 6
As you can see, the container is sorted by the member val2 of the class Foo, based on the < operator. However, if you use std::set instead of a std::multiset, then you will get a different output:
int main()
{
std::set<Foo> s;
s.insert(Foo(1, 6));
s.insert(Foo(1, 5));
s.insert(Foo(3, 4));
s.insert(Foo(2, 4));
for (auto const &foo : s)
std::cout << foo.val1 << " " << foo.val2 << std::endl;
return 0;
}
Output:
3 4
1 5
1 6
Here, the second Foo object where val2 is 4 is missing, because a std::set only allows for unique entries. Whether entries are unique is decided based on the provided < operator. In this example, the < operator compares the val2 members to each other. Therefore, two Foo objects are equal, if their val2 members have the same value.
So, your choice depends on whether or not you want to store Foo objects that may be equal based on the < operator.
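If std::set is wanted but objects that differ only in val1 should still count as distinct, the comparison has to look at both members. The following sketch uses a separate hypothetical struct `Item` and comparator `ByVal2ThenVal1` (neither is from the answer above) and orders by val2 first, then val1.

```cpp
#include <set>
#include <tuple>

// Hypothetical stand-in for Foo, kept as a plain aggregate.
struct Item {
    int val1;
    int val2;
};

// Hypothetical comparator: orders by val2, breaking ties with val1, so two
// items are equivalent only when both members match.
struct ByVal2ThenVal1 {
    bool operator()(const Item& a, const Item& b) const {
        return std::tie(a.val2, a.val1) < std::tie(b.val2, b.val1);
    }
};

// Inserts the same four value pairs used in the examples above.
std::set<Item, ByVal2ThenVal1> make_items() {
    std::set<Item, ByVal2ThenVal1> s;
    s.insert(Item{1, 6});
    s.insert(Item{1, 5});
    s.insert(Item{3, 4});
    s.insert(Item{2, 4});
    return s;
}
```

With this comparator all four items are kept, and the two with val2 equal to 4 are ordered by val1.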
C++ does have sorted containers, e.g. std::set and std::map:
#include <iostream>
#include <set>
using namespace std;
int main()
{
//ordered set
set<int> s;
s.insert(5);
s.insert(1);
s.insert(6);
s.insert(3);
s.insert(7);
s.insert(2);
cout << "Elements of set in sorted order: ";
for (auto it : s)
cout << it << " ";
return 0;
}
Output:
Elements of set in sorted order: 1 2 3 5 6 7
#include <iostream>
#include <map>
int main()
{
// Ordered map
std::map<int, int> order;
// Mapping values to keys
order[5] = 10;
order[3] = 5;
order[20] = 100;
order[1] = 1;
// Iterating the map and printing ordered values
for (auto i = order.begin(); i != order.end(); i++) {
std::cout << i->first << " : " << i->second << '\n';
}
return 0;
}
Output:
1 : 1
3 : 5
5 : 10
20 : 100

how would I sort a list and get the top K elements? (STL)

I have a vector of doubles. I want to sort it from highest to lowest, and get the indices of the top K elements. std::sort just sorts in place, and does not return the indices I believe. What would be a quick way to get the top K indices of largest elements?
You could use the nth_element STL algorithm: it partitions the range so that the K largest elements end up together (this is the fastest way using the STL), and you can then sort just those. Or you could use the partial_sort algorithm if you want the first K elements to be sorted.
Using just sort is wasteful here, since it is slow for this purpose. sort is a great STL algorithm, but it is meant for sorting the whole container, not just the first K elements. It's no accident that nth_element and partial_sort exist.
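Since the question asks for indices and not just values, one way to apply partial_sort is to sort a vector of indices with a comparator that looks up the values. This is a sketch with a hypothetical function name, `top_k_indices`; the order among equal values is not specified.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the indices of the k largest values, ordered
// from largest to smallest value. If k exceeds the size, all indices are returned.
std::vector<std::size_t> top_k_indices(const std::vector<double>& values, std::size_t k) {
    std::vector<std::size_t> idx(values.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});  // 0, 1, 2, ...
    k = std::min(k, idx.size());
    // Only the first k positions are put in order; the rest stay unsorted.
    std::partial_sort(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(k), idx.end(),
                      [&values](std::size_t a, std::size_t b) {
                          return values[a] > values[b];
                      });
    idx.resize(k);
    return idx;
}
```

For the values 1.0, 4.0, 2.0, 3.0 and k = 2, the result is the indices 1 and 3.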
The first thing that comes to mind is somewhat hackish, but you could define a struct that stored both the double and its original index, then overload the < operator to sort based on the double:
struct s {
double d;
int index;
bool operator < (const s &other) const {
return d < other.d;
}
};
Then you could retrieve the original indices from the struct.
Fuller example:
vector<double> orig;
vector<s> v;
...
for (int i=0; i < orig.size(); ++i) {
s s_temp;
s_temp.d = orig[i];
s_temp.index = i;
v.push_back(s_temp);
}
sort(v.begin(), v.end());
//now just retrieve v[i].index
This will leave them sorted from smallest to largest, but you could overload the > operator instead and then pass in greater to the sort function if wanted.
OK, how about this?
bool isSmaller (std::pair<double, int> x, std::pair<double, int> y)
{
return x.first< y.first;
}
int main()
{
//...
//you have your vector<double> here, say name is d;
std::vector<std::pair<double, int> > newVec(d.size());
for(int i = 0; i < newVec.size(); ++i)
{
newVec[i].first = d[i];
newVec[i].second = i; //store the initial index
}
std::sort(newVec.begin(), newVec.end(), &isSmaller);
//now you can iterate through first k elements and the second components will be the initial indices
}
Not sure about pre-canned algorithms, but take a look at selection algorithms; if you need the top K elements of a set of N values and N is much larger than K, there are much more efficient methods.
If you can create an indexing class (like @user470379's answer, basically a class that encapsulates a read-only pointer/index to the "real" data), then use a priority queue of maximum size K, and add each unsorted element to the priority queue, popping off the bottom-most element when the queue reaches size K+1. In cases like N = 10^6, K = 100, this is much simpler and more efficient than a full sort.
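The bounded priority queue idea can be sketched with std::priority_queue configured as a min-heap, so the smallest of the current candidates is the one that gets popped. Instead of an indexing class, this sketch stores (value, index) pairs; the function name is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper: indices of the k largest values, largest first.
// The heap never holds more than k + 1 entries.
std::vector<std::size_t> top_k_indices_heap(const std::vector<double>& values, std::size_t k) {
    using Entry = std::pair<double, std::size_t>;  // (value, original index)
    // std::greater turns the queue into a min-heap: top() is the smallest entry.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (std::size_t i = 0; i < values.size(); ++i) {
        heap.emplace(values[i], i);
        if (heap.size() > k) {
            heap.pop();  // discard the smallest of the k + 1 candidates
        }
    }
    std::vector<std::size_t> result;
    while (!heap.empty()) {
        result.push_back(heap.top().second);  // comes out smallest first
        heap.pop();
    }
    std::reverse(result.begin(), result.end());
    return result;
}
```

This runs in O(N log K) time and uses O(K) extra memory, which is where the advantage over a full sort comes from when K is much smaller than N.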
So you actually need a structure that maps indices to corresponding doubles.
You could use the std::multimap class to perform this mapping. As Jason has noted, std::map does not allow duplicate keys.
std::vector<double> v; // assume it is populated already
std::multimap<double, int> m;
for (int i = 0; i < v.size(); ++i)
m.insert(std::make_pair(v[i], i));
...
After you've done this you can iterate over the last K elements (for example with reverse iterators) to get the largest values, since the map keeps its entries sorted by key in ascending order.
Use multimap for vector's (value, index) to handle dups. Use reverse iterators to walk results in descending order.
#include <cstddef>
#include <iostream>
#include <map>
#include <utility>
#include <vector>
using namespace std;
int main()
{
multimap<double, size_t> indices;
vector<double> values;
values.push_back(1.0);
values.push_back(2.0);
values.push_back(3.0);
values.push_back(4.0);
size_t i = 0;
for(vector<double>::const_iterator iter = values.begin();
iter != values.end(); ++iter, ++i)
{
// Let make_pair deduce its types; explicit template arguments fail to compile in C++11.
indices.insert(make_pair(*iter, i));
}
i = 0;
size_t limit = 2;
for (multimap<double, size_t>::const_reverse_iterator iter = indices.rbegin();
iter != indices.rend() && i < limit; ++iter, ++i)
{
cout << "Value " << iter->first << " index " << iter->second << endl;
}
return 0;
}
Output is
Value 4 index 3
Value 3 index 2
If you just want the vector indices after sort, use this:
#include <algorithm>
#include <vector>
using namespace std;
vector<double> values;
values.push_back(1.0);
values.push_back(2.0);
values.push_back(3.0);
values.push_back(4.0);
sort(values.rbegin(), values.rend());
The top K entries are at indices 0 to K-1 and appear in descending order. This uses reverse iterators combined with the standard sort (which uses less<double>) to achieve descending order when iterated forward. Equivalently:
sort(values.rbegin(), values.rend(), less<double>());
Sample code for the excellent nth_element solution suggested by @Kiril here (K = 125000, N = 500000). I wanted to try this out, so here it is.
vector<double> values;
for (size_t i = 0; i < 500000; ++i)
{
values.push_back(rand());
}
nth_element(values.begin(), values.begin()+375000, values.end());
sort(values.begin()+375000, values.end());
vector<double> results(values.rbegin(), values.rbegin() + values.size() - 375000);