string vector not getting properly assigned using set_union - c++

I think I'm lacking some basic understanding of assignment in C/C++ here! I have a function that computes the set union between two string vectors. The reason I do this is because the algorithm library's function set_union requires that both vectors are sorted first and if I do it the following way then I can't forget to sort:
vector<string> SetOperations::my_set_union(vector<string> set1,
vector<string> set2) {
sort(set1.begin(), set1.end());
sort(set2.begin(), set2.end());
vector<string> v;
set_union(set1.begin(), set1.end(), set2.begin(), set2.end(), back_inserter(v));
return v;
}
I then do the following:
vector<string> vec = set_ops.my_set_union(vec1, vec2);
where vec1 and vec2 are string vectors containing a single "a" and "a" each and set_ops is an instantiation of a class that I have these set operations in (like the one above). They both definitely have these elements - I have printed the two vectors out.
For some (simple?) reason, vec ends up having a single element of "a" instead of two elements ("a" and "a").
Any ideas what I'm doing wrong? Am I meant to use a copy function or something?
Thank you :).

You have to use merge instead of set_union. set_union will eliminate duplicate entries.
See the merge and set_union references.

I think you've misunderstood what set_union is supposed to do.
It sounds like you want std::merge instead.

If I remember my set theory well, that is what the union of two sets is. So it's expected behavior.
The reason is that a set cannot have duplicate elements. Since the union of two sets also produces a valid set, then it will only have a single "a" value.

This is desired behavior, as mathematical sets do not contain duplicates. When you call set_union, an element that appears in both inputs is written to the output only once (in this case your two cases of "a"). Try it on two vectors containing (respectively) ("a", "b"), and ("a", "c"). You should get a vector with ("a", "b", "c") back, with "a" appearing only once.
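To see the two algorithms side by side, here is a minimal sketch. The helper names `union_of` and `merged` are made up for this illustration and are not part of the question's class; both take their arguments by value so they can sort them, as the question's function does.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: set union of two vectors (an element present in both
// inputs is written once).
std::vector<std::string> union_of(std::vector<std::string> a,
                                  std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<std::string> out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::back_inserter(out));
    return out;
}

// Hypothetical helper: sorted merge of two vectors (every element of both
// inputs is kept, duplicates included).
std::vector<std::string> merged(std::vector<std::string> a,
                                std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<std::string> out;
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(out));
    return out;
}
```

With one "a" in each input, `union_of` returns a single "a" while `merged` returns two.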

Related

How to remove duplicates from a vector whose numbers might be in different positions?

How do you remove elements from a vector of vectors that are identical to another vector but whose elements are not in the same indices?
For example:
std::vector<std::vector<int>> vectA = {{1,3,4}, {1,2,3}, {3,2,1}};
I want it so that {3,2,1} is removed from vectA and it becomes:
vectA = {{1,3,4}, {1,2,3}}
Any idea how to proceed efficiently?
Sort the elements of each vector
Drop duplicates (this is an easy look-up)
If you need to retain the original element order, then build any correspondence you wish: parallel arrays of vectors (original and sorted), pairs of (unsorted, sorted) vectors, etc. Drop duplicates based on the sorted ones.
I trust that you can take it from here.
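The steps above can be sketched roughly as follows. The function name `drop_permuted_duplicates` is hypothetical, and this version assumes the original order of the elements and of the inner vectors does not need to be preserved.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: treats inner vectors with the same elements in any
// order as duplicates and keeps one of each.
std::vector<std::vector<int>> drop_permuted_duplicates(
        std::vector<std::vector<int>> rows) {
    // Step 1: sort the elements of each inner vector.
    for (auto& row : rows) {
        std::sort(row.begin(), row.end());
    }
    // Step 2: sort the outer vector so equal rows become adjacent,
    // then erase the adjacent duplicates.
    std::sort(rows.begin(), rows.end());
    rows.erase(std::unique(rows.begin(), rows.end()), rows.end());
    return rows;
}
```

For the input in the question this returns {{1,2,3}, {1,3,4}}, which is the sorted form rather than the original layout.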
What you are describing is the behavior of std::set, ie. this solves your problem:
set<set<int>> input = {{1,3,4}, {1,2,3}, {3,2,1}};
// input is now {{1,2,3},{1,3,4}}
This works because a set is basically equal to a sorted vector with no duplicates.
If you really want to, you can now convert to std::vector:
vector<vector<int>> nums;
for(auto & s : input) nums.emplace_back(s.begin(), s.end());

How to use lower_bound on vector of vectors?

I am relatively new to C++ and I have a small problem. I have a vector, and in that vector are vectors of 3 integers.
Each inner vector represents one person. The 3 integers inside it are the distance from the start, the velocity, and the original index (because the input isn't sorted, and in the output I need to print the original index, not the index in the sorted vector).
Now I am given some points, each a distance from the start, and I need to find which person will be first at each point. My first step would be to find the person closest to the given point, so basically I need lower_bound/upper_bound.
How can I use lower_bound if I want to find the lower_bound of first item in inner vectors? Or should I use struct/class instead of inner vectors?
You would use the version of std::lower_bound which takes a custom comparator (the versions marked "(2)" at the link); and you would write a comparator of vectors which compares vectors by their first item (or whatever other way you like).
However:
As #doctorlove points out, std::lower_bound doesn't compare the vectors to each other, it compares them to a given value (be it a vector or a scalar). So it's possible you actually want to do something else.
It's usually not a good idea to keep fixed-length sequences of elements in std::vector's. Have you considered std::array?
It's very likely that your "vectors with 3 integers" actually stand for something else, e.g. points in a 3-dimensional geometric space; in which case, yes, they should be in some sort of class.
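As a rough sketch of the comparator version of std::lower_bound: the `Person` alias, the field order (distance, velocity, original index, taken from the question) and the function name `first_at_or_after` are assumptions made for this example. It also assumes the vector is already sorted by distance, which std::lower_bound requires.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Assumed layout: [0] distance from start, [1] velocity, [2] original index.
using Person = std::array<int, 3>;

// Hypothetical helper: returns an iterator to the first person whose distance
// is not less than `distance`, or end() if there is none.
// Precondition: `sorted_people` is sorted by element [0].
std::vector<Person>::const_iterator first_at_or_after(
        const std::vector<Person>& sorted_people, int distance) {
    return std::lower_bound(
        sorted_people.begin(), sorted_people.end(), distance,
        // The comparator takes (element, value), in that order.
        [](const Person& p, int d) { return p[0] < d; });
}
```

Because the searched-for value is a plain int rather than a Person, the comparator compares an element against a scalar, which is the point raised above.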
I am not sure that your inner things should be std::vector-s of 3 elements.
I believe that they should be std::array-s of 3 elements (because you know that the size is 3 and won't change).
So you probably want to have
typedef std::array<double,3> element_ty;
then use std::vector<element_ty> and for the rest (your lower_bound point) do like in einpoklum's answer.
BTW, you probably want to use std::min_element with an explicit compare.
Maybe you want something like:
std::vector<element_ty> vec;
auto minit =
std::min_element(vec.begin(), vec.end(),
[](const element_ty& x, const element_ty& y) {
return x[0] < y[0];  // compare by the first item only
});

How to remove almost duplicates from a vector in C++

I have an std::vector of floats that I want to not contain duplicates but the math that populates the vector isn't 100% precise. The vector has values that differ by a few hundredths but should be treated as the same point. For example here's some values in one of them:
...
X: -43.094505
X: -43.094501
X: -43.094498
...
What would be the best/most efficient way to remove duplicates from a vector like this.
First sort your vector using std::sort. Then use std::unique with a custom predicate to remove the duplicates.
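v.erase(std::unique(v.begin(), v.end(),
[](double l, double r) { return std::abs(l - r) < 0.01; }),
v.end());
// treats any numbers that differ by less than 0.01 as equal;
// std::unique only moves the kept values to the front, so erase() drops the tail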
Sorting is always a good first step. Use std::sort().
Remove not sufficiently unique elements: std::unique().
Last step, call resize() and maybe also shrink_to_fit().
If you want to preserve the order, do the previous 3 steps on a copy (omit shrinking though).
Then use std::remove_if on the original with a lambda that looks each element up in the copy (binary search): retain the element only if it is found in the copy, and don't forget to remove it from the copy once found so that a later identical value is not kept again.
I say std::sort() it, then go through it one by one and remove the values within certain margin.
You can have a separate write iterator to the same vector and one resize operation at the end - instead of calling erase() for each removed element or having another destination copy for increased performance and smaller memory usage.
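A minimal sketch of that write-position approach, assuming the result may be left sorted and that a value counts as a duplicate when it is less than `tolerance` above the last value kept. The function name and the `tolerance` parameter are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: sorts v, then compacts it in place so that each kept
// value is at least `tolerance` above the previously kept one.
void remove_near_duplicates(std::vector<double>& v, double tolerance) {
    if (v.empty()) {
        return;
    }
    std::sort(v.begin(), v.end());
    std::size_t write = 0;  // index of the last value kept so far
    for (std::size_t read = 1; read < v.size(); ++read) {
        if (v[read] - v[write] >= tolerance) {
            ++write;
            v[write] = v[read];
        }
    }
    v.resize(write + 1);  // one resize at the end, no erase() per element
}
```

Comparing against the last kept value (rather than the immediate neighbour) means a long run of closely spaced values is split into several representatives instead of collapsing into one; whether that is the desired behavior depends on the data.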
If your vector cannot contain duplicates, it may be more appropriate to use an std::set. You can then use a custom comparison object to consider small changes as being inconsequential.
Hi, you could compare like this:
bool isAlmostEquals(const double &f1, const double &f2)
{
double allowedDif = xxxx;
return (std::abs(f1 - f2) <= allowedDif);  // std::abs from <cmath>, so the double overload is used
}
but it depends on your comparison range, and double precision is not on your side.
If your vector is sorted, you could use std::unique with this function as the predicate.
I would do the following:
Create a set<double>
go through your vector in a loop or using a functor
Round each element and insert into the set
Then you can swap your vector with an empty vector
Copy all elements from the set to the empty vector
The complexity of this approach will be n * log(n) but it's simpler and can be done in a few lines of code. The memory consumption will double from just storing the vector. In addition set consumes slightly more memory per each element than vector. However, you will destroy it after using.
std::vector<double> v;
v.push_back(-43.094505);
v.push_back(-43.094501);
v.push_back(-43.094498);
v.push_back(-45.093435);
std::set<double> s;
std::vector<double>::const_iterator it = v.begin();
for(;it != v.end(); ++it)
s.insert(floor(*it));
std::vector<double>().swap(v);  // swap with an empty temporary; v.swap(temporary) does not compile in standard C++
v.resize(s.size());
std::copy(s.begin(), s.end(), v.begin());
The problem with most answers so far is that you have an unusual "equality". If A and B are similar but not identical, you want to treat them as equal. Basically, A and A+epsilon still compare as equal, but A+2*epsilon does not (for some unspecified epsilon). Or, depending on your algorithm, A*(1+epsilon) does and A*(1+2*epsilon) does not.
That does mean that A+epsilon compares equal to A+2*epsilon. Thus A = B and B = C does not imply A = C. This breaks common assumptions in <algorithm>.
You can still sort the values, that is a sane thing to do. But you have to consider what to do with a long range of similar values in the result. If the range is long enough, the difference between the first and last can still be large. There's no simple answer.
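The lost transitivity is easy to demonstrate. This is only an illustrative sketch; the function name `almost_equal` and the explicit `tolerance` parameter are assumptions, not something from the answers above.

```cpp
#include <cmath>

// Hypothetical tolerance-based "equality": true when a and b differ by less
// than `tolerance`.
bool almost_equal(double a, double b, double tolerance) {
    return std::fabs(a - b) < tolerance;
}
```

With a tolerance of 0.01, `almost_equal(0.0, 0.006, 0.01)` and `almost_equal(0.006, 0.012, 0.01)` are both true, yet `almost_equal(0.0, 0.012, 0.01)` is false, so the relation is not an equivalence relation in the sense that algorithms such as std::unique expect.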

Selectively populated vectors with substrings extracted from a source string

I have a char array, in which its contents look something like the following:
char buffer[] = "I1 I2 V1 V2 I3 V3 I4 DO V4";
As you may see, it's a typical blank separated character string. I want to put all sub-string(s) starting with a letter "I" into a vector (IVector), and sort its elements in ascending order. At the same time, I'd also want to put all sub-string(s) starting with a letter "V" into another vector (VVector), and sort its elements in ascending order. The other(s) (e.g. "DO" in this example) will be ignored.
I'm not familiar with the STL algorithm library. Are there any functions to help me achieve the above-mentioned job?
Thank you!
You can iterate over all the substrings using an std::istream_iterator<std::string>:
std::stringstream s(buffer);
std::istream_iterator<std::string> begin(s);
std::istream_iterator<std::string> end;
for( ; begin != end; ++begin) {
switch((*begin)[0]) { // switch on first character
// insert into appropriate vector here
}
}
Then you can use std::sort to sort the vectors, as #Billy has already pointed out. You could also consider using an std::set, as that will always keep your items sorted in the first place.
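Put together, the approach might look like the sketch below. The function name `split_by_prefix` and its parameter names are made up for this example; the two output vectors play the role of IVector and VVector from the question.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on whitespace, appends tokens starting
// with 'I' to i_tokens and tokens starting with 'V' to v_tokens, ignores the
// rest, then sorts both vectors in ascending order.
void split_by_prefix(const std::string& text,
                     std::vector<std::string>& i_tokens,
                     std::vector<std::string>& v_tokens) {
    std::istringstream stream(text);
    std::istream_iterator<std::string> it(stream);
    std::istream_iterator<std::string> end;
    for (; it != end; ++it) {
        switch ((*it)[0]) {  // switch on the first character
            case 'I': i_tokens.push_back(*it); break;
            case 'V': v_tokens.push_back(*it); break;
            default: break;  // e.g. "DO" is ignored
        }
    }
    std::sort(i_tokens.begin(), i_tokens.end());
    std::sort(v_tokens.begin(), v_tokens.end());
}
```

Note that std::sort on strings orders them lexicographically, so a token such as "I10" would sort before "I2"; a custom comparator would be needed if numeric order matters.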
Are there any functions to help me achieve the above-mentioned job?
Yes. Have a look at std::find and std::sort.

The best-practice solution for finding the differences between two STL vectors

I already have two STL vectors. For instance:
vector<int> MyList;
MyList.push_back(10);
MyList.push_back(20);
MyList.push_back(30);
MyList.push_back(40);
MyList.push_back(50);
vector<int> MyListSub;
MyListSub.push_back(20);
MyListSub.push_back(30);
MyListSub.push_back(40);
And I want to get the number of elements which are in MyList and aren't in MyListSub.
For this example, the result is "2" (the elements 10 and 50).
You can use std::set_difference for this:
std::vector<int> diff;
std::set_difference(MyList.begin(), MyList.end(),
MyListSub.begin(), MyListSub.end(),
std::back_inserter(diff));
As #Jan points out, the vectors have to be sorted. If they are not, use std::sort to sort them:
std::sort(MyList.begin(), MyList.end());
Alternatively you can consider storing your elements in an std::set in the first place, thus they will already be sorted.
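To get the count asked for in the question, the pieces above can be combined into one function. This is a sketch; the name `count_missing` is hypothetical, and taking both vectors by value is a design choice so that the caller's vectors are not reordered.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: counts the elements of `full` that are not in `sub`.
// Both arguments are copies, so sorting them does not affect the caller.
std::size_t count_missing(std::vector<int> full, std::vector<int> sub) {
    std::sort(full.begin(), full.end());
    std::sort(sub.begin(), sub.end());
    std::vector<int> diff;
    std::set_difference(full.begin(), full.end(),
                        sub.begin(), sub.end(),
                        std::back_inserter(diff));
    return diff.size();
}
```

Called with the vectors from the question, `count_missing(MyList, MyListSub)` gives 2.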