How to convert vector to set? - c++

I have a vector, in which I save objects. I need to convert it to set. I have been reading about sets, but I still have a couple of questions:
How to correctly initialize it? Honestly, some tutorials say it is fine to initialize it like set<ObjectName> something. Others say that you need an iterator there too, like set<Iterator, ObjectName> something.
How to insert them correctly. Again, is it enough to just write something.insert(object) and that's all?
How to get a specific object (for example, an object which has a named variable in it, which is equal to "ben") from the set?
I have to convert the vector itself to be a set (a.k.a. I have to use a set rather than a vector).

Suppose you have a vector of strings, to convert it to a set you can:
std::vector<std::string> v;
std::set<std::string> s(v.begin(), v.end());
For other types, you must have operator< defined (or pass a custom comparator as the set's second template argument).
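As a minimal sketch of what the range constructor does (the helper name to_set is made up for illustration): the vector is left untouched, while the set ends up sorted and without duplicates.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: copies every element of the vector into a set.
// Duplicates collapse into one element and the result is ordered by operator<.
std::set<std::string> to_set(const std::vector<std::string>& v) {
    return std::set<std::string>(v.begin(), v.end());
}
```

Calling to_set on {"pear", "apple", "pear"} should give a set holding "apple" and "pear", in that order.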

All of the answers so far have copied a vector to a set. Since you asked to 'convert' a vector to a set, I'll show a more optimized method which moves each element into a set instead of copying each element:
std::vector<T> v = /*...*/;
std::set<T> s(std::make_move_iterator(v.begin()),
std::make_move_iterator(v.end()));
Note, you need C++11 support for this.
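Here is a small sketch of the move approach for strings, assuming you no longer need the vector's contents afterwards. The helper name move_to_set is hypothetical. After the move the vector still has its old size, but its elements are in a valid-but-unspecified state, so the sketch clears it.

```cpp
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: moves the strings out of v into a set (C++11).
// The moved-from strings left in v are only safe to destroy or reassign,
// so v is cleared to avoid accidental use.
std::set<std::string> move_to_set(std::vector<std::string>& v) {
    std::set<std::string> s(std::make_move_iterator(v.begin()),
                            std::make_move_iterator(v.end()));
    v.clear();
    return s;
}
```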

You can initialize a set using the objects in a vector in the following manner:
vector<T> a;
... some stuff ...
set<T> s(a.begin(), a.end());
This is the easy part. Now, you have to realize that in order to have elements stored in a set, you need to have the bool operator<(const T& a, const T& b) operator overloaded. Also, in a set you can have no more than one element with a given value according to the operator definition. So in the set s you cannot have two elements for which neither operator<(a,b) nor operator<(b,a) is true. As long as you know and realize that, you should be good to go.
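To illustrate the uniqueness rule, here is a sketch with a made-up type Item whose operator< looks only at id. Two items with the same id are equivalent as far as the set is concerned, even if their other fields differ, so the second insert is rejected.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical type for illustration, ordered by id only.
struct Item {
    int id;
    std::string label;
};

bool operator<(const Item& a, const Item& b) {
    return a.id < b.id;
}

// Inserts two items with the same id and returns the resulting set size.
std::size_t insert_equivalent_items() {
    std::set<Item> s;
    s.insert(Item{1, "first"});
    s.insert(Item{1, "second"});  // neither a<b nor b<a holds, so not inserted
    return s.size();
}
```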

If all you want to do is store the elements you already have in a vector, in a set:
std::vector<int> vec;
// fill the vector
std::set<int> myset(vec.begin(), vec.end());

You haven't told us much about your objects, but suppose you have a class like this:
class Thing
{
public:
int n;
double x;
string name;
};
You want to put some Things into a set, so you try this:
Thing A;
set<Thing> S;
S.insert(A);
This fails, because sets are sorted, and there's no way to sort Things, because there's no way to compare two of them. You must provide either an operator<:
class Thing
{
public:
int n;
double x;
string name;
bool operator<(const Thing &Other) const;
};
bool Thing::operator<(const Thing &Other) const
{
return(Other.n<n);
}
...
set<Thing> S;
or a comparison function object:
class Thing
{
public:
int n;
double x;
string name;
};
struct ltThing
{
bool operator()(const Thing &T1, const Thing &T2) const
{
return(T1.x < T2.x);
}
};
...
set<Thing, ltThing> S;
To find the Thing whose name is "ben", you can iterate over the set, but it would really help if you told us more specifically what you want to do.
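A rough sketch of the "iterate over the set" option, assuming the set is ordered by n (ascending here) and that a linear scan is acceptable. The helper find_by_name is made up for this example; it returns nullptr when nothing matches.

```cpp
#include <algorithm>
#include <set>
#include <string>

struct Thing {
    int n;
    double x;
    std::string name;
    bool operator<(const Thing& other) const { return n < other.n; }
};

// Hypothetical helper: linear search for the first Thing with the given name.
// This is O(N) because the set is ordered by n, not by name.
const Thing* find_by_name(const std::set<Thing>& things, const std::string& name) {
    auto it = std::find_if(things.begin(), things.end(),
                           [&name](const Thing& t) { return t.name == name; });
    if (it == things.end()) {
        return nullptr;
    }
    return &*it;
}
```

If lookups by name are the common case, ordering the set by name instead (or keeping a std::map keyed on the name) would probably be a better fit.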

How to correctly initialize it?
std::set<YourType> set;
The only condition is that YourType must have bool operator<(const YourType&) const and be copyable (copy constructor + assignment operator). For std::vector copyable is enough.
How to insert them correctly.
set.insert(my_elem);
How to get specific object (for example object, which has name variable in it, which is equal to "ben") from set?
That's maybe the point. A set is just a bunch of objects: you can check whether an object is inside, or iterate through the whole set.
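For the "check that an object is inside" part, a minimal sketch using find (the helper name contains is made up; lookup is logarithmic in the size of the set):

```cpp
#include <set>
#include <string>

// Hypothetical helper: true if value is an element of s.
bool contains(const std::set<std::string>& s, const std::string& value) {
    return s.find(value) != s.end();
}
```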

Creating a set is just like creating a vector. Where you have
std::vector<int> my_vec;
(or some other type rather than int) replace it with
std::set<int> my_set;
To add elements to the set, use insert:
my_set.insert(3);
my_set.insert(2);
my_set.insert(1);
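One difference from a vector is worth seeing in code: the set keeps its elements sorted and ignores repeated values. The function name below is just for illustration.

```cpp
#include <set>
#include <vector>

// Inserts 3, 2, 1 (and 2 a second time) and returns the set's contents
// in iteration order.
std::vector<int> insert_and_list() {
    std::set<int> my_set;
    my_set.insert(3);
    my_set.insert(2);
    my_set.insert(1);
    my_set.insert(2);  // already present, so the set is unchanged
    return std::vector<int>(my_set.begin(), my_set.end());
}
```

The returned vector holds 1, 2, 3 regardless of the insertion order.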

Related

Make object searchable with two different keys

Given a class with two keys:
class A {
int key1;
int key2;
byte x[]; // large array
}
If multiple objects of class A are instantiated and I want to sort them by key1, I can insert them into an std::set.
But if I want to sort these objects both by key1 and by key2, how would I do that?
I could create two sets where one set sorts by key1 and the other set sorts by key2, but that doubles the amount of memory used. How can I avoid this?
Edit 1:
As far as I know, when an object is inserted into a set, the object is copied. So if I create two sets (one sorted by key1 and one sorted by key2), that means two versions of the object will exist: one in set1 and one in set2. This means that member x also exists twice, which unnecessarily doubles the amount of memory used.
Edit 2:
To give a more specific example: given the class Person.
class Person {
std::string name;
std::string address;
// other fields
}
I want to be able to find people either by their name or by their address. Both keys won't be used at the same time: I want to be able to call find(name) and find(address).
Also, objects of the Person class won't be added or removed from the datastructure that often, but lookups will happen often. So lookups should ideally be fast.
Edit 3:
Storing pointers to the objects in the set instead of the objects themselves seems like a good solution. But would it be possible to store pointers in both sets? I.e.
std::set<A*> set_sorted_by_key1;
std::set<A*> set_sorted_by_key2;
A *obj_p = new A();
set_sorted_by_key1.insert(obj_p);
set_sorted_by_key2.insert(obj_p);
Finding an element in a sorted vector via binary search is O(log(N)), just as std::set::find is O(log(N)). Hence, if you want to stay with standard containers, the type of container you choose isn't that important as far as the time complexity of lookups is concerned.
Concerning the additional memory, you won't get it any cheaper than storing an additional pointer to the elements somewhere.
So what you can do is
std::vector<A> sorted1;
// ... fill with the objects
std::sort(sorted1.begin(),sorted1.end(),
[](const A& a,const A& b) { return a.key1 < b.key1; });
std::vector<A*> sorted2;
// ... fill with pointers to elements in sorted1 (after sorting it, because sorting moves the elements)
std::sort(sorted2.begin(),sorted2.end(),
[](A* a, A* b) { return a->key2 < b->key2; });
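To show how lookups would work on top of this, here is a sketch under a few assumptions: the members of A are public, the large payload is left out, and the data is not modified after the index is built (growing the vector would invalidate the pointers). TwoKeyIndex, find_key1 and find_key2 are made-up names.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct A {  // payload omitted, members public for brevity
    int key1;
    int key2;
};

// Hypothetical index: the objects sorted by key1, plus pointers sorted by key2.
struct TwoKeyIndex {
    std::vector<A> by_key1;
    std::vector<const A*> by_key2;

    explicit TwoKeyIndex(std::vector<A> items) : by_key1(std::move(items)) {
        std::sort(by_key1.begin(), by_key1.end(),
                  [](const A& a, const A& b) { return a.key1 < b.key1; });
        // Take the addresses only after sorting, since sorting moves elements.
        for (const A& a : by_key1) by_key2.push_back(&a);
        std::sort(by_key2.begin(), by_key2.end(),
                  [](const A* a, const A* b) { return a->key2 < b->key2; });
    }

    // Copying would leave the pointers aimed at the original's vector.
    TwoKeyIndex(const TwoKeyIndex&) = delete;
    TwoKeyIndex& operator=(const TwoKeyIndex&) = delete;

    // Binary search by key1; returns nullptr if there is no match.
    const A* find_key1(int k) const {
        auto it = std::lower_bound(by_key1.begin(), by_key1.end(), k,
                                   [](const A& a, int key) { return a.key1 < key; });
        return (it != by_key1.end() && it->key1 == k) ? &*it : nullptr;
    }

    // Binary search by key2 through the pointer vector.
    const A* find_key2(int k) const {
        auto it = std::lower_bound(by_key2.begin(), by_key2.end(), k,
                                   [](const A* a, int key) { return a->key2 < key; });
        return (it != by_key2.end() && (*it)->key2 == k) ? *it : nullptr;
    }
};
```

Both lookups are O(log(N)), and the only extra memory is one pointer per object.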
Storing pointers to the objects in the set instead of the objects themselves seems like a good solution. But would it be possible to store pointers in both sets?
Sure, your sets seem to share ownership of those objects, so:
struct Person { // struct, so the comparators below can access the members
std::string name;
std::string address;
// other fields
};
using PersonPtr = std::shared_ptr<Person>;
Now you want to sort them by name:
struct CmpName {
using is_transparent = void;
bool operator()( const PersonPtr &p1, const PersonPtr &p2 ) const { return p1->name < p2->name; }
bool operator()( const std::string &s, const PersonPtr &p2 ) const { return s < p2->name; }
bool operator()( const PersonPtr &p1, const std::string &s ) const { return p1->name < s; }
};
std::set<PersonPtr,CmpName> byName;
Note that the type alias using is_transparent = void; and the two additional overloads are there to enable heterogeneous lookup in std::set (available since C++14); otherwise you would have to create an instance of std::shared_ptr<Person> just to do a lookup. Details can be found here: What are transparent comparators?
And search it:
auto f = byName.find( "John" );
Searching by address can be done very similar way, just add another comparator struct and initialize std::set with it.
Alternatively, you can store the objects themselves and have multiple indexes using Boost.MultiIndex, but it has a learning curve.
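Putting the shared_ptr idea together, a sketch with both indexes might look like the following. It assumes C++14 (for the transparent comparators), public members, and that names and addresses are unique. The Directory wrapper and its member functions are invented for this example.

```cpp
#include <memory>
#include <set>
#include <string>

struct Person {
    std::string name;
    std::string address;
};
using PersonPtr = std::shared_ptr<Person>;

struct CmpName {
    using is_transparent = void;
    bool operator()(const PersonPtr& a, const PersonPtr& b) const { return a->name < b->name; }
    bool operator()(const std::string& s, const PersonPtr& b) const { return s < b->name; }
    bool operator()(const PersonPtr& a, const std::string& s) const { return a->name < s; }
};

struct CmpAddress {
    using is_transparent = void;
    bool operator()(const PersonPtr& a, const PersonPtr& b) const { return a->address < b->address; }
    bool operator()(const std::string& s, const PersonPtr& b) const { return s < b->address; }
    bool operator()(const PersonPtr& a, const std::string& s) const { return a->address < s; }
};

// Hypothetical wrapper: each Person exists once and is referenced by both sets.
struct Directory {
    std::set<PersonPtr, CmpName> byName;
    std::set<PersonPtr, CmpAddress> byAddress;

    void add(const std::string& name, const std::string& address) {
        PersonPtr p = std::make_shared<Person>(Person{name, address});
        byName.insert(p);
        byAddress.insert(p);
    }

    // Both lookups return an empty pointer when nothing matches.
    PersonPtr findByName(const std::string& name) const {
        auto it = byName.find(name);
        if (it == byName.end()) return nullptr;
        return *it;
    }

    PersonPtr findByAddress(const std::string& address) const {
        auto it = byAddress.find(address);
        if (it == byAddress.end()) return nullptr;
        return *it;
    }
};
```

Since the sets order by name and address, those fields should not be changed while a Person is stored in them.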

c++ - sorting a vector of custom structs based on frequency

I need to find the most frequent element in an array of custom structs. There is no custom ID to them just matching properties.
I was thinking of sorting my vector by frequency but I have no clue how to do that.
I'm assuming by frequency you mean the number of times an identical structure appears in the array.
You probably want to make a hash function (or overload std::hash<> for your type) for your custom struct. Then iterate over your array, incrementing the value on an unordered_map<mytype, int> for every struct in the array. This will give you the frequency in the value field. Something like the below would work:
std::vector<mytype> elements; // your array of structs
std::unordered_map<mytype, int> freq;
mytype most_frequent;
int max_frequency = 0;
for (const mytype &el : elements) {
freq[el]++;
if (freq[el] > max_frequency) {
max_frequency = freq[el]; // remember the best count so far
most_frequent = el;
}
}
For this to work, the map will need to be able to create a hash for your type. By default, it tries to use std::hash<>. You are expressly allowed by the standard to specialize this template in the standard namespace for your own types. The map also needs to compare keys for equality, so your type needs an operator== as well. You could do the hash part as follows:
struct mytype {
std::string name;
double value;
};
namespace std {
template <> struct hash<mytype> {
size_t operator()(const mytype &t) const noexcept {
// Use standard library hash implementations of member variable types
return hash<string>()(t.name) ^ hash<double>()(t.value);
}
};
}
The primary goal is that two variables holding different values should, as far as possible, generate different hashes (collisions are allowed, but they slow lookups down), while equal values must always generate the same hash. The above XORs the results of the standard library's hash function for each type together, which according to Mark Nelson is probably as good as the individual hashing algorithms XOR'd together. An alternative algorithm suggested by cppreference's hash reference is the Fowler-Noll-Vo hash function.
Look at std::sort and the example provided in the ref, where you actually pass your own comparator to do the trick you want (in your case, use the frequencies). Of course, a lambda function can be used too, if you wish.
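Combining the two suggestions, one possible sketch counts the occurrences in an unordered_map and then sorts the (element, count) pairs with std::sort and a lambda. The operator== and the function name by_frequency are additions for this example, and the hash uses a shifted XOR as a simple way to combine the two member hashes.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct mytype {
    std::string name;
    double value;
};

// unordered_map needs equality in addition to a hash.
bool operator==(const mytype& a, const mytype& b) {
    return a.name == b.name && a.value == b.value;
}

namespace std {
template <> struct hash<mytype> {
    size_t operator()(const mytype& t) const noexcept {
        size_t h1 = hash<string>()(t.name);
        size_t h2 = hash<double>()(t.value);
        return h1 ^ (h2 << 1);
    }
};
}

// Hypothetical helper: distinct elements with their counts, most frequent first.
// The relative order of elements with equal counts is unspecified.
std::vector<std::pair<mytype, int>> by_frequency(const std::vector<mytype>& elements) {
    std::unordered_map<mytype, int> freq;
    for (const mytype& el : elements) {
        ++freq[el];
    }
    std::vector<std::pair<mytype, int>> result(freq.begin(), freq.end());
    std::sort(result.begin(), result.end(),
              [](const std::pair<mytype, int>& a, const std::pair<mytype, int>& b) {
                  return a.second > b.second;
              });
    return result;
}
```

The most frequent element is then the first entry of the result, provided the input was not empty.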

Comparing a vector with an array assuming the elements are in different order

I would like to compare a vector with an array assuming that elements are in different order.
I have got a struct like below:
struct A
{
int index;
A(int i = 0) : index(i) {} // takes the index, so that A(1) below compiles
};
The size of the vector and the array is the same:
std::vector<A> l_v = {A(1), A(2), A(3)};
A l_a[3] = {A(3), A(1), A(2)};
The function to compare elements is:
bool isTheSame()
{
return std::equal(l_v.begin(), l_v.end(), l_a,
[](const A& lhs, const A& rhs){
return lhs.index == rhs.index;
});
}
The problem is that my function will return false, because the elements are the same, but not in the same order.
A solution is to sort the elements in the vector and array before "std::equal", but is there any better solution?
Using sort would be the way to go. Sorting in general is a good idea. And as far as I know it would result in the best performance.
Note: I would recommend passing the vector and the array as arguments rather than using member variables. After that change, this would be a typical function that is very well suited to inlining. You might also want to consider taking it out of the class and/or making it static.
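For comparison, here are two sketches that take the containers as arguments. The first uses std::is_permutation (C++11), which leaves both inputs untouched but may need O(N^2) comparisons in the worst case. The second sorts copies, which is O(N log N) at the cost of extra memory. The function names, and the constructor taking an int, are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct A {
    int index;
    explicit A(int i = 0) : index(i) {}
};

// Order-insensitive comparison that does not modify either input.
bool same_elements(const std::vector<A>& v, const A* arr, std::size_t n) {
    if (v.size() != n) return false;
    return std::is_permutation(v.begin(), v.end(), arr,
                               [](const A& lhs, const A& rhs) {
                                   return lhs.index == rhs.index;
                               });
}

// Same result by sorting copies of both sequences and comparing them in order.
bool same_elements_sorted(std::vector<A> v, const A* arr, std::size_t n) {
    if (v.size() != n) return false;
    std::vector<A> w(arr, arr + n);
    auto less = [](const A& lhs, const A& rhs) { return lhs.index < rhs.index; };
    std::sort(v.begin(), v.end(), less);
    std::sort(w.begin(), w.end(), less);
    return std::equal(v.begin(), v.end(), w.begin(),
                      [](const A& lhs, const A& rhs) {
                          return lhs.index == rhs.index;
                      });
}
```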

Find in Vector of a Struct

I made the following program where there is a struct
struct data
{
int integers; //input of integers
int times; //number of times of appearance
};
and there is a vector of this struct
std::vector<data> inputs;
and then I'll get from a file an integer of current_int
std::fstream openFile("input.txt");
int current_int; //current_int is what I want to check if it's in my vector of struct (particularly in inputs.integers)
openFile >> current_int;
and I wanna check if current_int is already stored in my vector inputs.
I've tried researching about finding data in a vector and supposedly you use an iterator like this:
it = std::find(inputs.begin(),inputs.end(),current_int)
but will this work if it's in a struct? Please help.
There are two variants of find:
find() searches for a plain value. In your case you have a vector of data, so the values passed to find() should be data.
find_if() takes a predicate, and returns the first position where the predicate returns true.
Using the latter, you can easily match one field of your struct:
auto it = std::find_if(inputs.begin(), inputs.end(),
[current_int] (const data& d) {
return d.integers == current_int;
});
Note that the above uses a C++11 lambda function. Doing this in earlier versions of C++ requires you to create a functor instead.
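For reference, a functor version that should also compile before C++11 could look like this. MatchesInteger and record are made-up names, and the "bump the counter or append a new entry" logic is only a guess at what you want to do once the search has finished.

```cpp
#include <algorithm>
#include <vector>

struct data {
    int integers;  // input of integers
    int times;     // number of times of appearance
};

// Function object doing the same job as the lambda above.
struct MatchesInteger {
    int wanted;
    explicit MatchesInteger(int w) : wanted(w) {}
    bool operator()(const data& d) const { return d.integers == wanted; }
};

// Hypothetical helper: counts one more appearance of current_int.
void record(std::vector<data>& inputs, int current_int) {
    std::vector<data>::iterator it =
        std::find_if(inputs.begin(), inputs.end(), MatchesInteger(current_int));
    if (it != inputs.end()) {
        ++it->times;  // already stored: bump its counter
    } else {
        data d;
        d.integers = current_int;
        d.times = 1;
        inputs.push_back(d);  // first appearance: append a new entry
    }
}
```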

Sorting Array of Struct's based on String

I've been reading all the topics related to sorting arrays of structs, but haven't had any luck as of yet, so I'll just ask. I have a struct:
struct question{
string programNum;
string programDesc;
string programPoints;
string programInput;
string programQuestion;
};
And I populate an array of question in main, and now have an array called questions[], so now I need to write a sort that will sort questions[] based on question.programQuestion. Based on what I've read, this is where I'm at, but I'm not sure if it's even close:
int myCompare (const void *v1, const void *v2 ) {
const struct question* p1 = static_cast<const struct question*>(v1);
const struct question* p2 = static_cast<const struct question*>(v2);
if (p1->programQuestion > p2->programQuestion){
return(+1);}
else if (p1->programQuestion < p2->programQuestion){
return(-1);}
else{
return(0);}
}
If this is right I'm not sure how to call it in main. Thanks for any help!
If you're intending to use std::sort to sort this array, you likely want to declare an operator< as a method in this struct. Something like this:
struct question{
string programNum;
string programDesc;
string programPoints;
string programInput;
string programQuestion;
bool operator<( const question &rhs) const;
};
bool question::operator<( const question &rhs ) const
{
return programQuestion < rhs.programQuestion;
}
The comparison function you were attempting to declare above appears to be the type qsort expects, and I would not recommend trying to qsort an array of these struct questions.
Just use std::sort. It's safer, nearly always faster (sometimes by huge margins), and generally easier to get right.
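std::sort works on a plain array too, since pointers are valid random-access iterators. A minimal sketch, with the struct trimmed to two fields and a made-up helper name:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

struct question {
    std::string programNum;
    std::string programQuestion;  // other fields omitted for brevity
    bool operator<(const question& rhs) const {
        return programQuestion < rhs.programQuestion;
    }
};

// Hypothetical helper: sorts a plain array of n questions using operator<.
void sort_questions(question* questions, std::size_t n) {
    std::sort(questions, questions + n);
}
```

In main you would call it as sort_questions(questions, count), where count is the number of elements you actually filled in.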
Unless there is some important reason not to do so, I would use a std::vector instead of a plain array. It is easier and safer. You could use the following code to sort your vector:
std::vector<question> questions;
// add some elements to the vector
std::sort(begin(questions), end(questions),
[](const question& q1, const question& q2) {
return q1.programQuestion < q2.programQuestion;
});
This code uses some C++11 features. But you could achieve the same in previous versions of C++ by using a function object, or simply by implementing operator< in the struct (assuming you always want to sort such a struct based on that field).
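The function object variant could be sketched like this. ByProgramQuestion and sort_by_question are hypothetical names, and the struct is trimmed to two fields.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct question {
    std::string programNum;
    std::string programQuestion;  // other fields omitted for brevity
};

// Comparison function object, usable in place of the lambda.
struct ByProgramQuestion {
    bool operator()(const question& q1, const question& q2) const {
        return q1.programQuestion < q2.programQuestion;
    }
};

// Hypothetical helper: sorts the vector by programQuestion.
void sort_by_question(std::vector<question>& questions) {
    std::sort(questions.begin(), questions.end(), ByProgramQuestion());
}
```

This keeps the struct free of an operator<, which is handy if you sometimes want to sort by a different field.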