Is there a sorted container in the STL? - c++

What I mean is following: I have an std::vector<Foo>, where Foo is a custom made class. I also have a comparator of some sort which will compare the fields of the class Foo.
Now, somewhere in my code I am doing:
std::sort( myvec.begin(), myvec.end(), comparator );
which will sort the vector according to the rules I defined in the comparator.
Now I want to insert an element of class Foo into that vector. If I could, I would like to just write:
mysortedvector.push_back( Foo() );
and what would happen is that the vector will put this new element according to the comparator to its place.
Instead, right now I have to write:
myvec.push_back( Foo() );
std::sort( myvec.begin(), myvec.end(), comparator );
which is just a waste of time, since the vector is already sorted and all I need is to place the new element appropriately.
Now, because of the nature of my program, I can't use std::map<> as I don't have a key/value pairs, just a simple vector.
If I use std::list, I again need to call sort after every insertion.

Yes, std::set, std::multiset, std::map, and std::multimap are all sorted, using std::less as the default comparison operation. The underlying data structure is typically a balanced binary search tree such as a red-black tree. So if you add elements to these containers and then iterate over them, the output will be in sorted order. Adding N elements takes O(N log N), the same as sorting a vector of N elements with any common O(N log N) sort.
In your specific scenario, since you don't have key/value pairs, std::set or std::multiset is probably your best bet.
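Since the question already has a comparator, here is a minimal sketch of passing one to std::multiset so that every insert lands in its sorted position. The Foo fields and the ByPriority comparator are hypothetical stand-ins, not the asker's actual class.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for the asker's class; the field names are assumptions.
struct Foo {
    int id;
    int priority;
};

// Hypothetical comparator: orders Foo objects by their priority field.
struct ByPriority {
    bool operator()(const Foo& a, const Foo& b) const {
        return a.priority < b.priority;
    }
};

// A multiset keeps its elements ordered by ByPriority at all times,
// so no call to std::sort is needed after an insertion.
using SortedFoos = std::multiset<Foo, ByPriority>;

// Collects the ids in iteration order, which is the sorted order.
std::vector<int> idsInOrder(const SortedFoos& foos) {
    std::vector<int> ids;
    for (const Foo& f : foos) ids.push_back(f.id);
    return ids;
}
```

Each insert costs O(log N), compared with re-sorting the whole vector after every push_back.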

I'd like to expand on Jason's answer. I agree with Jason that either std::set or std::multiset is the best choice for your specific scenario. I'd like to provide an example to help you narrow down the choice further.
Let's assume that you have the following class Foo:
class Foo {
public:
Foo(int v1, int v2) : val1(v1), val2(v2) {};
bool operator<(const Foo &foo) const { return val2 < foo.val2; }
int val1;
int val2;
};
Here, Foo overloads the < operator. This way, you don't need to specify an explicit comparator function. As a result, you can simply use a std::multiset instead of a std::vector in the following way. You just have to replace push_back() by insert():
int main()
{
std::multiset<Foo> ms;
ms.insert(Foo(1, 6));
ms.insert(Foo(1, 5));
ms.insert(Foo(3, 4));
ms.insert(Foo(2, 4));
for (auto const &foo : ms)
std::cout << foo.val1 << " " << foo.val2 << std::endl;
return 0;
}
Output:
3 4
2 4
1 5
1 6
As you can see, the container is sorted by the member val2 of the class Foo, based on the < operator. However, if you use std::set instead of a std::multiset, then you will get a different output:
int main()
{
std::set<Foo> s;
s.insert(Foo(1, 6));
s.insert(Foo(1, 5));
s.insert(Foo(3, 4));
s.insert(Foo(2, 4));
for (auto const &foo : s)
std::cout << foo.val1 << " " << foo.val2 << std::endl;
return 0;
}
Output:
3 4
1 5
1 6
Here, the second Foo object where val2 is 4 is missing, because a std::set only allows unique entries. Whether entries are unique is decided by the provided < operator. In this example, the < operator compares the val2 members. Therefore, two Foo objects are considered equivalent if their val2 members have the same value.
So, your choice depends on whether or not you want to store Foo objects that may be equal based on the < operator.
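If you go with std::set, you can detect a rejected element at the point of insertion. The sketch below relies on std::set::insert returning a pair whose bool member reports whether the element was stored. The helper name tryInsert is an assumption for illustration, and Foo is reduced to a plain struct with the same ordering.

```cpp
#include <set>

// Same ordering as the Foo class above: only val2 takes part in the comparison.
struct Foo {
    int val1;
    int val2;
    bool operator<(const Foo& other) const { return val2 < other.val2; }
};

// Hypothetical helper: returns true if foo was stored, false if an
// equivalent element (same val2) was already in the set.
bool tryInsert(std::set<Foo>& s, const Foo& foo) {
    return s.insert(foo).second;
}
```

With this, inserting Foo{3, 4} and then Foo{2, 4} stores only the first object, and the second call reports false.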

C++ does have sorted containers, e.g. std::set and std::map:
int main()
{
//ordered set
set<int> s;
s.insert(5);
s.insert(1);
s.insert(6);
s.insert(3);
s.insert(7);
s.insert(2);
cout << "Elements of set in sorted order: ";
for (auto it : s)
cout << it << " ";
return 0;
}
Output:
Elements of set in sorted order: 1 2 3 5 6 7
int main()
{
// Ordered map
std::map<int, int> order;
// Mapping values to keys
order[5] = 10;
order[3] = 5;
order[20] = 100;
order[1] = 1;
// Iterating the map and printing ordered values
for (auto i = order.begin(); i != order.end(); i++) {
std::cout << i->first << " : " << i->second << '\n';
}
return 0;
}
Output:
1 : 1
3 : 5
5 : 10
20 : 100

Related

How does std::sort determine the sorting basis?

I have the following code:
#include <bits/stdc++.h>
using namespace std;
int main () {
pair<int, int> p[4];
p[0] = pair<int, int>(5, 2);
p[1] = pair<int, int>(40, -2);
p[2] = pair<int, int>(-3, 2);
p[3] = pair<int, int>(4, 45);
auto print_pairii = [](pair<int, int> pp[]) {
for (int i = 0; i < 4; i++) {
cout << pp[i].first << " ";
}
cout << endl;
};
print_pairii(p);
sort(p, p + 4);
print_pairii(p);
return 0;
}
The first print_pairii shows 5 40 -3 4.
After sorting the array of pairs, print_pairii shows -3 4 5 40, meaning that the sorting was done on the basis of the first element of each pair.
Why does this happen instead of the basis of the second element?
How does sort work in this sense?
Because when using std::sort without specifying comparator, elements are compared using operator<.
1) Elements are compared using operator<.
And the overloaded operator< for std::pair compares the first elements first, and compares the second elements only if the first elements are equal.
Compares lhs and rhs lexicographically, that is, compares the first elements and only if they are equivalent, compares the second elements.
Why does this happen instead of the basis of the second element? How does sort work in this sense?
Because by default std::sort orders pairs by std::pair::first, then by std::pair::second.
If you want to sort by the second element, you have to provide a custom comparison function. Something like:
sort(p, p + 4,
[](const std::pair<int, int> &x, const std::pair<int, int> &y) {
return x.second < y.second;
});
It's useful to break this down into its components.
std::pair compares lexicographically, as you've just seen: https://en.cppreference.com/w/cpp/utility/pair. std::sort compares (all types) using operator< by default: https://en.cppreference.com/w/cpp/algorithm/sort. Put these two together, and you sort pairs in increasing order of first then second elements.
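To see both behaviours side by side, here is a small sketch using the data from the question. The function names sortedDefault and sortedBySecond are made up for this example; the second one breaks ties on first, which is an extra assumption beyond the comparator shown above.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using IntPair = std::pair<int, int>;

// Default ordering: operator< on std::pair, i.e. first, then second.
std::vector<IntPair> sortedDefault(std::vector<IntPair> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Custom ordering: second element first, ties broken by the first element.
std::vector<IntPair> sortedBySecond(std::vector<IntPair> v) {
    std::sort(v.begin(), v.end(),
              [](const IntPair& x, const IntPair& y) -> bool {
                  if (x.second != y.second) return x.second < y.second;
                  return x.first < y.first;
              });
    return v;
}
```

For the pairs (5, 2), (40, -2), (-3, 2), (4, 45), the default order starts with (-3, 2), while the custom order starts with (40, -2).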

Find equals value into an array in c++

Is there a faster way to find equal values in an array than comparing every element one by one with all the other elements of the array?
for(int i = 0; i < arrayLenght; i ++)
{
for(int k = i + 1; k < arrayLenght; k ++)
{
if(array[i] == array[k])
{
sprintf(message,"There is a duplicate of %s",array[i]);
ShowMessage(message);
break;
}
}
}
Since sorting your container is a possible solution, std::unique is the simplest solution to your problem:
std::vector<int> v {0,1,0,1,2,0,1,2,3};
std::sort(begin(v), end(v));
v.erase(std::unique(begin(v), end(v)), end(v));
First, the vector is sorted. You can use anything; std::sort is just the simplest. After that, std::unique moves the first element of each run of equal values to the front of the container and returns an iterator to the new logical end; the elements past that iterator are left in a valid but unspecified state. That iterator is then passed to erase, which removes the leftover tail from the vector.
You could use std::multiset and then count duplicates afterwards like this:
#include <iostream>
#include <set>
int main()
{
const int arrayLenght = 14;
int array[arrayLenght] = { 0,2,1,3,1,4,5,5,5,2,2,3,5,5 };
std::multiset<int> ms(array, array + arrayLenght);
for (auto it = ms.begin(), end = ms.end(); it != end; it = ms.equal_range(*it).second)
{
int cnt = 0;
if ((cnt = ms.count(*it)) > 1)
std::cout << "There are " << cnt << " of " << *it << std::endl;
}
}
https://ideone.com/6ktW89
There are 2 of 1
There are 3 of 2
There are 2 of 3
There are 5 of 5
If the value_type of this array can be sorted by operator< (a strict weak order), doing as YSC answered is a good choice.
If not, maybe you can define a hash function that hashes the objects to different values. Then you can do this in O(n) time on average, like:
struct ValueHash
{
size_t operator()(const Value& rhs) const{
//do_something
}
};
struct ValueCmp
{
bool operator()(const Value& lhs, const Value& rhs) const{
//do_something
}
};
unordered_set<Value,ValueHash,ValueCmp> myset;
for(int i = 0; i < arrayLenght; i ++)
{
if(myset.find(array[i])==myset.end())
myset.insert(array[i]);
else
dosomething();
}
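To make the placeholders above concrete, here is a minimal sketch with a hypothetical Value type. The fields, the way the two member hashes are combined, and the name countRepeats are all assumptions for illustration; the hash is not tuned.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical value type; the answer above leaves Value unspecified.
struct Value {
    std::string name;
    int code;
};

struct ValueHash {
    std::size_t operator()(const Value& v) const {
        // Simple combination of the member hashes (an assumption, not a tuned hash).
        return std::hash<std::string>()(v.name) ^ (std::hash<int>()(v.code) << 1);
    }
};

// Equality predicate: the third template argument of unordered_set.
struct ValueEqual {
    bool operator()(const Value& lhs, const Value& rhs) const {
        return lhs.name == rhs.name && lhs.code == rhs.code;
    }
};

// Returns how many elements repeat a value seen earlier in the input.
std::size_t countRepeats(const std::vector<Value>& values) {
    std::unordered_set<Value, ValueHash, ValueEqual> seen;
    std::size_t repeats = 0;
    for (const Value& v : values) {
        if (!seen.insert(v).second) ++repeats;  // insert fails when v is already present
    }
    return repeats;
}
```

Using the bool returned by insert avoids the separate find call, so each element is hashed once.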
In case you have a large amount of data, you can first sort the array (quick sort gives you a first pass in O(n*log(n))) and then do a second pass by comparing each value with the next (as they might be all together) to find duplicates (this is a sequential pass in O(n)) so, sorting in a first pass and searching the sorted array for duplicates gives you O(n*log(n) + n), or finally O(n*log(n)).
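The two passes described above can be sketched as follows, assuming int elements. The function name findDuplicates is made up for this example, and it takes its argument by value so the caller's data is not reordered.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns each value that occurs more than once, reported once, in ascending order.
std::vector<int> findDuplicates(std::vector<int> values) {
    std::sort(values.begin(), values.end());            // first pass: O(n log n)
    std::vector<int> duplicates;
    for (std::size_t i = 1; i < values.size(); ++i) {   // second pass: O(n) over neighbours
        bool sameAsPrevious = values[i] == values[i - 1];
        bool alreadyReported = !duplicates.empty() && duplicates.back() == values[i];
        if (sameAsPrevious && !alreadyReported) duplicates.push_back(values[i]);
    }
    return duplicates;
}
```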
EDIT
An alternative has been suggested in the comments: using a set to check for already processed data. The algorithm just goes element by element, checking whether the element has been seen before. This can lead to an O(n) algorithm, but only if you take care to use a hash set. If you use a sorted set (std::set), each lookup costs O(log(n)) and you end up with the same O(n*log(n)). If you pick std::unordered_set instead, you avoid that extra access time per search and get a final O(n) on average. Of course, you have to account for the hash table growing automatically, or for the memory wasted if you oversize it up front.
Thanks to @freakish, who pointed out the set solution in the comments to the question.
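A minimal sketch of the hash-set variant, assuming int elements. The name hasDuplicate is hypothetical, and the reserve call is one way to limit the rehashing mentioned above, at the cost of allocating the buckets up front.

```cpp
#include <unordered_set>
#include <vector>

// Returns true as soon as a value is seen for the second time.
bool hasDuplicate(const std::vector<int>& values) {
    std::unordered_set<int> seen;
    seen.reserve(values.size());  // pre-size the table so it does not rehash while growing
    for (int v : values) {
        if (!seen.insert(v).second) return true;  // v was already in the set
    }
    return false;
}
```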

Split vector to unique and duplicates c++

My goal is to split a vector into two parts: with unique values and with duplicates.
For example, I have a sorted vector myVec=(1,1,3,4,4,7,7,8,9,9) which should be split into myVecDuplicates=(1,7,4,9) and myVecUnique=(1,4,7,9,3,8). So myVecDuplicates contains all values that have duplicates, while myVecUnique contains every value exactly once.
The order does not matter. My idea was to use unique as it splits a vector into two parts. But I have a problem running my code.
vector<int> myVec(8)={1,1,3,4,4,7,8,9};
vector<int>::iterator firstDuplicate=unique(myVec.begin(),myVec.end());
vector<int> myVecDuplicate=myVec(firstDuplicate,myVec.end()); // error occurs here: no match for call to '(std::vector<int>) (std::vector<int>::iterator&, std::vector<int>::iterator)'
vector<int> myVecUnique=myVec(myVec.begin()+firstDuplicate-1,myVec.end());
After running this code I get an error that says (2nd line) 'no match for call to '(std::vector) (std::vector::iterator&, std::vector::iterator)'
Please help me to understand the source of error or maybe suggest some more elegant and fast way to solve my problem (without hash tables)!
Ahh..Too many edits in your question for anyone's liking. Just keep it simple by using map.
In C++, map comes in really handy for storing the unique, sorted values together with their respective counts.
map<int, int> m;
for(auto &t : myVec){
m[t]++;
}
vector<int> myVecDuplicate, myVecUnique;
for(map<int, int>::iterator it = m.begin(); it != m.end(); it++){
if(it->second > 1) myVecDuplicate.push_back(it->first);
myVecUnique.push_back(it->first);
}
Edit:
maybe suggest some more elegant and fast way to solve my problem (without hash tables)!
Sort the vector
Traverse through the sorted vector,
and do
if (current_value == previous_value){
if(previous_value != previous_previous_value)
myVecDuplicate.push_back(current_value);
}
else{
myVecUnique.push_back(current_value);
}
To start, initialize previous_value = current_value - 1
and previous_previous_value as current_value - 2.
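The traversal above can be sketched as runnable code. This version assumes the vector is already sorted and uses index checks instead of the sentinel initial values; the function name splitSorted and the returned pair layout are choices made for this example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// first:  every distinct value, once.
// second: every value that occurs more than once, once.
// Assumes the input is already sorted.
std::pair<std::vector<int>, std::vector<int>>
splitSorted(const std::vector<int>& sorted) {
    std::vector<int> uniques;
    std::vector<int> duplicates;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        bool sameAsPrev = i >= 1 && sorted[i] == sorted[i - 1];
        bool prevSameAsPrevPrev = i >= 2 && sorted[i - 1] == sorted[i - 2];
        if (!sameAsPrev) {
            uniques.push_back(sorted[i]);      // first element of a run
        } else if (!prevSameAsPrevPrev) {
            duplicates.push_back(sorted[i]);   // second element of a run, reported once
        }
    }
    return std::make_pair(uniques, duplicates);
}
```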
While this may be frowned upon (for not using standard algorithms and such), I would write some simple solution like this:
vector<int> myVec = {1,1,3,4,4,7,8,9};
unordered_set<int> duplicates;
unordered_set<int> unique;
for(int & v : myVec)
{
if(unique.count(v) > 0)
duplicates.insert(v);
else
unique.insert(v);
}
O(n) complexity solution (it relies on the vector already being sorted):
#include <iostream>
#include <vector>
int main()
{
std::vector<int> myVec = {1,1,3,4,4,7,7,8,9,9};
std::vector<int> myVecDuplicatec;
std::vector<int> myVecUnique;
for(int &x : myVec)
{
if(myVecUnique.size() == 0 || myVecUnique.back() != x)
myVecUnique.push_back(x);
else
myVecDuplicatec.push_back(x);
}
std::cout << "V = ";
for(int &x : myVec)
{
std::cout << x << ",";
}
std::cout << std::endl << "U = ";
for(int &x : myVecUnique)
{
std::cout << x << ",";
}
std::cout << std::endl << "D = ";
for(int &x : myVecDuplicatec)
{
std::cout << x << ",";
}
}
cpp.sh/4i45x
std::vector has a constructor that accepts two iterators for the half-open range [first, last). You cannot call a constructor on an existing object, since it has already been created, so your code
myVec(firstDuplicate,myVec.end());
actually tries to use myVec as a functor, but std::vector does not have operator(), hence the error.
You have two options. Pass the two iterators to the constructor directly:
vector<int> myVecDuplicate(firstDuplicate,myVec.end());
or use copy initialization with temporary vector:
vector<int> myVecDuplicate = vector<int>(firstDuplicate,myVec.end());
Same for the second vector:
vector<int> myVecUnique(myVec.begin(),firstDuplicate);
As pointed out by Logman, std::unique does not guarantee the values of the elements left past the returned iterator, so they cannot be used as the list of duplicates. A working solution can use std::set instead (and you do not have to presort the source vector):
std::set<int> iset;
vector<int> myVecUnique, myVecDuplicate;
for( auto val : myVec )
( iset.insert( val ).second ? myVecUnique : myVecDuplicate ).push_back( val );

Which STL to use to find index by value in O(1) in C++

Say I have an array arr[] = {1 , 3 , 5, 12124, 24354, 12324, 5}
I want to know the index of the value 5(i.e, 2) in O(1).
How should I go about this?
P.S :
1. Throughout my program, I shall be finding only indices and not the vice versa (getting the value by index).
2. The array can have duplicates.
If you can guarantee there are no duplicates in the array, your best bet is probably creating an unordered_map where the map key is the array value and the map value is its index.
I wrote a method below that converts an array to an unordered_map.
#include <unordered_map>
#include <iostream>
template <typename T>
void arrayToMap(const T arr[], size_t arrSize, std::unordered_map<T, int>& map)
{
for(int i = 0; i < arrSize; ++i) {
map[arr[i]] = i;
}
}
int main()
{
int arr[] = { 1 , 3 , 5, 12124, 24354, 12324, 5 };
std::unordered_map<int, int> map;
arrayToMap(arr, sizeof(arr)/sizeof(*arr), map);
std::cout << "Value" << '\t' << "Index" << std::endl;
for(auto it = map.begin(), e = map.end(); it != e; ++it) {
std::cout << it->first << "\t" << it->second << std::endl;
}
}
However, in your example you use the value 5 twice. This causes a strange output in the above code. The outputted map does not have a value with an index 2. Even if you use an array, you would be confronted with a similar problem (i.e. should you use the value at 2 or 6?).
If you really need both values, you could use unordered_multimap, but the syntax for accessing elements isn't as easy as using operator[] (you have to use unordered_multimap::find(), which returns an iterator).
template <typename T>
void arrayToMap(const T arr[], size_t arrSize, std::unordered_multimap<T, int>& map)
{
for(int i = 0; i < arrSize; ++i) {
map.emplace(arr[i], i);
}
}
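To read back every index stored for a value, equal_range is usually more convenient than find, because it returns the whole range of entries that share the key. A minimal sketch, with the hypothetical helper name indicesOf:

```cpp
#include <algorithm>
#include <unordered_map>
#include <vector>

// Returns all indices recorded for value, in ascending order.
std::vector<int> indicesOf(const std::unordered_multimap<int, int>& map, int value) {
    std::vector<int> indices;
    auto range = map.equal_range(value);  // [first, second) covers every entry with this key
    for (auto it = range.first; it != range.second; ++it) {
        indices.push_back(it->second);
    }
    // The order of equal keys inside the container is unspecified, so sort the result.
    std::sort(indices.begin(), indices.end());
    return indices;
}
```

For the array in the question, looking up 5 yields the indices 2 and 6, and a value that is not present yields an empty vector.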
Finally, you should consider that unordered_map's fast O(1) average look-up time comes with some overhead, so it uses more memory than a simple array. But if you end up using an array (which is comparatively much more memory efficient), searching for a specific value takes O(n) time, where n is the length of the array (a linear scan stops at the index of the value).
Edit - If you need the duplicate with the lowest index to be kept instead of the highest, you can just reverse the order of insertion:
template <typename T>
void arrayToMap(const T arr[], size_t arrSize, std::unordered_map<T, int>& map)
{
for(int i = static_cast<int>(arrSize) - 1; i >= 0; --i) {
map[arr[i]] = i;
}
}
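An alternative sketch that keeps the lowest index without reversing the loop: emplace does nothing when the key is already present, so the first index inserted for a value is the one that stays. The function name firstIndexMap and the return-by-value signature are assumptions made for this example.

```cpp
#include <cstddef>
#include <unordered_map>

// Maps each distinct value to the index of its first occurrence.
template <typename T>
std::unordered_map<T, int> firstIndexMap(const T arr[], std::size_t arrSize) {
    std::unordered_map<T, int> map;
    for (std::size_t i = 0; i < arrSize; ++i) {
        // emplace leaves an existing entry untouched, unlike operator[] assignment.
        map.emplace(arr[i], static_cast<int>(i));
    }
    return map;
}
```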
Use std::unordered_map (C++11) to map each element (the key) to its index (the value). You can then answer each query in O(1) on average. std::unordered_map will work because, as you said, there are no duplicates, but it will cost you linear extra space.
If the range of your values is not too large, you can use an array as well. This yields an even better Θ(1) complexity.
Use unordered_multimap (C++11 and later) with the value as the key and the position index as the value.
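A minimal sketch of the array idea, assuming all values are non-negative and no larger than a known maxValue. The name buildIndexTable, the -1 marker for absent values, and the choice to keep the first occurrence are assumptions for illustration.

```cpp
#include <cstddef>
#include <vector>

// table[v] holds the index of the first occurrence of v, or -1 if v is absent.
// Assumes every value lies in [0, maxValue].
std::vector<int> buildIndexTable(const std::vector<int>& values, int maxValue) {
    std::vector<int> table(static_cast<std::size_t>(maxValue) + 1, -1);
    for (std::size_t i = 0; i < values.size(); ++i) {
        int& slot = table[static_cast<std::size_t>(values[i])];
        if (slot == -1) slot = static_cast<int>(i);  // keep the first occurrence
    }
    return table;
}
```

A lookup is then a single array access, table[value], at the cost of maxValue + 1 slots of memory.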

How to initialize unordered_map directly with fixed element?

I want to initialize an unordered_map with a fixed 100 elements. The keys run from 0 to 99, and the value for every key is 0.
using HashMap = unordered_map < int, int > ;
HashMap map;
for (int idx = 0; idx < 100; ++idx) {
map[idx] = 0;
}
Question 1:
Is there a direct way to do that, like the following code in Python?
d = {x: x % 2 == 0 for x in range(1, 11)}
Question 2:
With the initialization code above, I expected all elements to be sorted in ascending order, but the results are:
Why is the first element 8 and the second element 64, while all the remaining elements are in ascending order?
This is not quite so pretty as the Python expression, but it should do the trick.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <unordered_map>
int main() {
std::unordered_map<int, bool> m;
int i = -1;
std::generate_n(std::inserter(m, m.begin()),
10,
[&i](){
++i;
return std::make_pair(i, i % 2 == 0);
});
for (auto const &p: m)
std::cout << '<' << p.first << ", " << p.second << ">\n";
return 0;
}
There is a reason unordered maps are called unordered maps. Since they are implemented as hash maps, the keys are not in any predictable order. Using an std::unordered_map for a dense collection of integer keys is probably not the most efficient solution to any problem, particularly if you expect to be able to extract the keys in order.
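If you do need to walk the keys in ascending order, std::map gives that guarantee. Here is a minimal sketch; the function names are made up for this example, and for keys that are exactly 0..N-1 a plain std::vector<int>(N, 0) would likely be simpler still.

```cpp
#include <map>
#include <vector>

// Builds an ordered map with keys 0..count-1, every value set to 0.
std::map<int, int> zeroedOrderedMap(int count) {
    std::map<int, int> m;
    for (int key = 0; key < count; ++key) {
        // Keys arrive in ascending order, so hinting at end() keeps each insert cheap.
        m.emplace_hint(m.end(), key, 0);
    }
    return m;
}

// Collects the keys in iteration order, which for std::map is ascending.
std::vector<int> keysInIterationOrder(const std::map<int, int>& m) {
    std::vector<int> keys;
    for (const auto& entry : m) keys.push_back(entry.first);
    return keys;
}
```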
consider boost::irange
The internal data structure of an unordered map is a hash table, which does not preserve key order when hashing.