Inplace versions of set_difference, set_intersection and set_union


I implemented versions of set_union, set_intersection and set_difference that take a sorted container and a sorted range (that must not be within the container), and write the result of the operation into the container.
template<class Container, class Iter>
void assign_difference(Container& cont, Iter first, Iter last)
{
auto new_end = std::set_difference( // (1)
cont.begin(), cont.end(), first, last, cont.begin());
cont.erase(new_end, cont.end());
}
template<class Container, class Iter>
void assign_intersection(Container& cont, Iter first, Iter last)
{
auto new_end = std::set_intersection( // (2)
cont.begin(), cont.end(), first, last, cont.begin());
cont.erase(new_end, cont.end());
}
template<class Container, class Iter>
void assign_union(Container& cont, Iter first, Iter last)
{
auto insert_count = last - first;
cont.resize(cont.size() + insert_count); // T must be default-constructible
auto rfirst1 = cont.rbegin() + insert_count, rlast1 = cont.rend();
auto rfirst2 = std::make_reverse_iterator(last);
auto rlast2 = std::make_reverse_iterator(first);
rlast1 = std::set_union( // (3)
rfirst1, rlast1, rfirst2, rlast2, cont.rbegin(), std::greater<>());
cont.erase(std::copy(rlast1.base(), cont.end(), cont.begin()), cont.end());
}
The goal was:
No allocation is performed if the container has enough capacity to hold the result.
Otherwise exactly one allocation is performed to give the container the capacity to hold the result.
As you can see in the lines marked (1), (2) and (3), the same container is used as input and output for those STL algorithms. Assuming a usual implementation of those STL algorithms, this code works, since it only writes to parts of the container that have already been processed.
As pointed out in the comments, the standard does not guarantee that this works: set_union, set_intersection and set_difference require that the resulting range does not overlap with either of the input ranges.
However, can there be an STL implementation that breaks the code?
If your answer is yes, please provide a conforming implementation of one of the three STL algorithms used that breaks the code.

A conforming implementation could check whether arguments 1 and 5 of set_intersection are equal and, if they are, format your hard drive.
If you violate the requirements, the behaviour of your program is not constrained by the standard; your program has undefined behaviour.
There are situations where UB may be worth the risk and cost (auditing all compiler changes and assembly output). I do not see the point here; write your own. Any fancy optimizations that the std library comes up with could cause problems when you violate requirements as you are doing, and as you have noted the naive implementation is simple.

As a rule of thumb, I do not write to a container while I am iterating over it. Anything can happen, and in general it's odd.
As @Yakk said, it sounds wrong. That's it. It's something to remove from your code base so you can sleep peacefully.
If you really need those functions, I would suggest writing the inner loop yourself (e.g. the body of std::set_intersection) so that it handles the constraints your algorithm needs.
I don't think that looking for an STL implementation on which it doesn't work is the right approach. It doesn't sound like a long-term solution. For the long term, the standard should be your reference and, as someone already pointed out, your solution doesn't seem to comply with it.
My 2 cents
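A minimal sketch of the "write the inner loop yourself" suggestion, for the intersection case only. The name assign_intersection_manual is hypothetical, and the sketch assumes both inputs are sorted with operator< and that the range does not alias the container. Because the loop is hand-written, the only guarantee it relies on is its own: the write position never gets ahead of the read position.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not a standard algorithm): keeps in `cont` only the
// elements that also occur in the sorted range [first, last).
// Assumption: `cont` and [first, last) are sorted by operator<, and the
// range does not point into `cont`.
template <class Container, class Iter>
void assign_intersection_manual(Container& cont, Iter first, Iter last)
{
    assert(std::is_sorted(cont.begin(), cont.end()));
    assert(std::is_sorted(first, last));

    auto read = cont.begin();   // next element of cont to examine
    auto write = cont.begin();  // next slot to fill; never ahead of `read`
    while (read != cont.end() && first != last) {
        if (*read < *first) {
            ++read;                         // only in cont: drop it
        } else if (*first < *read) {
            ++first;                        // only in the range: skip it
        } else {
            if (write != read)
                *write = std::move(*read);  // in both: keep it
            ++write;
            ++read;
            ++first;
        }
    }
    cont.erase(write, cont.end());          // no allocation, only shrinking
}
```

Called as assign_intersection_manual(a, b.begin(), b.end()), it leaves in a the elements common to both, with the same multiplicity rule as std::set_intersection (each match consumes one element from each side).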

Related

Is there a std::unique-style library algorithm that has user-defined collision handler?

I have a basic std::vector of key/value pairs. It is sorted by key. I would like to reduce all of the adjacent duplicate key entries using a user-defined binary operator while compacting the vector.
This is basically a std::unique application where the user can decide how to handle the collision rather than just keeping the first entry.
Is there a library algorithm that satisfies this requirement? I can write my own but I would prefer to rely on something that an expert has written.
The map-as-sorted-vector is core to other parts of the algorithm and can't be changed. I am limited to C++14.

I can't think of a standard algo for this. std::unique almost satisfies the requirement, but unfortunately the BinaryPredicate you supply to compare elements isn't allowed to modify them ("binary_pred shall not apply any non-constant function through the dereferenced iterators." - [algorithms.requirements] paragraph 7 in the C++17 Standard) - a requirement that lets the implementation optimise more freely (e.g. parallel processing of different parts of the vector).
An implementation's not too hard though...
template <typename Iterator, typename BinaryPredicate, typename Compaction>
Iterator compact(Iterator begin, Iterator end, BinaryPredicate equals, Compaction compaction)
{
if (begin == end) return begin;
Iterator compact_to = begin;
while (++begin != end)
if (equals(*begin, *compact_to))
compaction(*compact_to, *begin);
else
*++compact_to = *begin;
return ++compact_to;
}
The return value will be the new "end" for the compacted vector - you can erase therefrom like you would for remove_if.
You can see it running here.
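A usage sketch for the compact function above, under a few assumptions: the entries are std::pair<int, int>, the vector is sorted by key, and the collision policy is to sum the values. The names Entry and sum_duplicate_keys are hypothetical. The compact function is repeated so the example compiles on its own under C++14.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Same algorithm as in the answer, repeated so the sketch is self-contained.
template <typename Iterator, typename BinaryPredicate, typename Compaction>
Iterator compact(Iterator begin, Iterator end, BinaryPredicate equals, Compaction compaction)
{
    if (begin == end) return begin;
    Iterator compact_to = begin;
    while (++begin != end)
        if (equals(*begin, *compact_to))
            compaction(*compact_to, *begin);
        else
            *++compact_to = *begin;
    return ++compact_to;
}

using Entry = std::pair<int, int>;  // hypothetical key/value type

// Hypothetical caller: merges adjacent entries with equal keys by summing
// their values, then erases the leftover tail.
void sum_duplicate_keys(std::vector<Entry>& v)
{
    auto key_less = [](const Entry& a, const Entry& b) { return a.first < b.first; };
    assert(std::is_sorted(v.begin(), v.end(), key_less));

    auto new_end = compact(v.begin(), v.end(),
        [](const Entry& a, const Entry& b) { return a.first == b.first; },
        [](Entry& kept, const Entry& dup) { kept.second += dup.second; });
    v.erase(new_end, v.end());
}
```

Swapping the last lambda changes the collision policy (keep the maximum, keep the last entry, and so on) without touching the compaction loop.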

Why is there no std::inplace_merge_unique?

I tried looking for an algorithm that would do what std::inplace_merge followed by std::unique would do. It seems more efficient to do it in one pass than in two.
I could not find it in the standard library or by googling.
So is there an implementation somewhere in Boost, maybe under a different name?
Is such an algorithm possible (in the sense that it has the same complexity guarantees as normal inplace_merge)?

It doesn't operate in-place, but assuming that neither range contains duplicates beforehand, std::set_union will find the same result as merge followed by unique.
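A small sketch of that observation, assuming two sorted std::vector<int> inputs that each contain no duplicates. The helper name merged_unique is hypothetical, and the output goes to a new vector, so this is not in-place.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: union of two sorted, duplicate-free vectors.
// For such inputs the result equals std::merge followed by std::unique.
std::vector<int> merged_unique(const std::vector<int>& a, const std::vector<int>& b)
{
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    std::vector<int> out;
    out.reserve(a.size() + b.size());  // upper bound on the result size
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::back_inserter(out));
    return out;
}
```

If either input may contain duplicates, std::set_union keeps the larger multiplicity, so the result can then differ from merge followed by unique.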

There are many interesting algorithms missing from the algorithms section. The original submission of STL was incomplete from Stepanov's view and some algorithms were even removed. The proposal by Alexander Stepanov and Meng Lee doesn't seem to include an algorithm inplace_merge_unique() or any variation thereof.
One of the potential reasons why there is no such algorithm is that it isn't clear which of the elements should be dropped: since the comparison is only a strict weak ordering, the choice of element matters. One approach to implement inplace_merge_unique() is to
Use std::remove_if() to remove any element which is a duplicate from the second range.
Use inplace_merge() to do the actual merge.
The predicate to std::remove_if() would track the current position in the first part of the sequence to be merged. The code below isn't tested but something like that should work:
template <typename BiDirIt, typename Comp>
BiDirIt inplace_merge_unique(BiDirIt begin, BiDirIt middle, BiDirIt end, Comp comp) {
using reference = typename std::iterator_traits<BiDirIt>::reference;
BiDirIt result = std::remove_if(middle, end, [=](reference other) mutable -> bool {
begin = std::find_if(begin, middle, [=](reference arg)->bool {
return !comp(arg, other);
});
return begin != middle && !comp(other, *begin);
});
std::inplace_merge(begin, middle, result, comp);
return result;
}

Obtaining `std::priority_queue` elements in reverse order?

I've written some K-nearest-neighbor query methods which build a list of points that are nearest to a given query point. To maintain that list of neighbors, I use the std::priority_queue such that the top element is the farthest neighbor to the query point. This way I know if I should push the new element that is currently being examined (if at a lesser distance than the current farthest neighbor) and can pop() the farthest element when my priority-queue has more than K elements.
So far, all is well. However, when I output the elements, I would like to order them from the closest to the farthest. Currently, I simply pop all the elements from the priority-queue and put them on the output-container (through an iterator), which results in a sequence of points ordered from farthest to closest, so then, I call std::reverse on the output iterator range.
As a simple example, here is a linear-search that uses the priority-queue (obviously, the actual nearest-neighbor query methods I use are far more complicated):
template <typename DistanceValue,
typename ForwardIterator,
typename OutputIterator,
typename GetDistanceFunction,
typename CompareFunction>
inline
OutputIterator min_dist_linear_search(ForwardIterator first,
ForwardIterator last,
OutputIterator output_first,
GetDistanceFunction distance,
CompareFunction compare,
std::size_t max_neighbors = 1,
DistanceValue radius = std::numeric_limits<DistanceValue>::infinity()) {
if(first == last)
return output_first;
typedef std::priority_queue< std::pair<DistanceValue, ForwardIterator>,
std::vector< std::pair<DistanceValue, ForwardIterator> >,
detail::compare_pair_first<DistanceValue, ForwardIterator, CompareFunction> > PriorityQueue;
PriorityQueue output_queue = PriorityQueue(detail::compare_pair_first<DistanceValue, ForwardIterator, CompareFunction>(compare));
for(; first != last; ++first) {
DistanceValue d = distance(*first);
if(!compare(d, radius))
continue;
output_queue.push(std::pair<DistanceValue, ForwardIterator>(d, first));
while(output_queue.size() > max_neighbors)
output_queue.pop();
if(output_queue.size() == max_neighbors)
radius = output_queue.top().first;
};
OutputIterator it = output_first;
while( !output_queue.empty() ) {
*it = *(output_queue.top().second);
output_queue.pop(); ++it;
};
std::reverse(output_first, it);
return it;
};
The above is all dandy except for one thing: it requires the output-iterator type to be bidirectional and essentially be pointing to a pre-allocated container. Now, this practice of storing the output in a range prescribed by some output iterator is great and pretty standard too (e.g. std::copy and other STL algorithms are good examples of that). However, in this case I would like to be able to only require a forward output-iterator type, which would make it possible to use back-inserter iterators like those provided for STL containers and iostreams.
So, this boils down to reversing the priority-queue before dumping its content in the output iterator. So, these are the better options I've been able to come up with:
Create a std::vector, dump the priority-queue content in it, and dump the elements into the output-iterator using a reverse-iterator on the vector.
Replace the std::priority_queue with a sorted container (e.g. std::multimap), and then dump the content into the output-iterator using the appropriate traversal order.
Are there any other reasonable options?
I used to employ a std::multimap in a previous implementation of this algorithm and others, as of my second option above. However, when I switched to std::priority_queue, the performance gain was significant. So, I'd rather not use the second option, as it really seems that using a priority-queue for maintaining the list of neighbors is much better than relying on a sorted array. Btw, I also tried a std::vector that I maintain sorted with std::inplace_merge, which was better than multimap, but didn't match up to the priority-queue.
As for the first option, which is my best option at this point, it just seems wasteful to me to have to do this double transfer of data (queue -> vector -> output). I'm just inclined to think that there must be a simpler way to do this... something that I'm missing..
The first option really isn't that bad in this application (considering the complexity of the algorithm that precedes it), but if there is a trick to avoid this double memory transfer, I'd like to know about it.

Problem solved!
I'm such an idiot... I knew I was missing something obvious. In this case, the std::sort_heap() function. The reference page even has an example that does exactly what I need (and since the std::priority_queue is just implemented in terms of a random-access container and the heap-functions (pop_heap, push_heap, make_heap) it makes no real difference to use these functions directly in-place of the std::priority_queue class). I don't know how I could have missed that.
Anyways, I hope this helps anyone who had the same problem.
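A minimal sketch of the heap-function approach described above, reduced to plain distances. The function name k_smallest and the use of double values are assumptions for illustration; the real query would store (distance, iterator) pairs and a custom comparison. A std::vector is kept as a max-heap with std::push_heap and std::pop_heap, and std::sort_heap turns it into ascending order in place, so no reversal or second buffer is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the k smallest values of `input`,
// ordered from smallest (closest) to largest (farthest).
std::vector<double> k_smallest(const std::vector<double>& input, std::size_t k)
{
    std::vector<double> heap;  // max-heap: heap.front() is the largest kept value
    if (k == 0) return heap;
    heap.reserve(k + 1);

    for (double d : input) {
        if (heap.size() == k && !(d < heap.front()))
            continue;                                 // not closer than the current farthest
        heap.push_back(d);
        std::push_heap(heap.begin(), heap.end());
        if (heap.size() > k) {
            std::pop_heap(heap.begin(), heap.end());  // moves the largest to the back
            heap.pop_back();
        }
    }

    std::sort_heap(heap.begin(), heap.end());         // heap becomes an ascending range
    assert(std::is_sorted(heap.begin(), heap.end()));
    return heap;
}
```

The sorted vector can then be written to any output iterator with std::copy, including a std::back_inserter, which removes the bidirectional-iterator requirement from the output.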

One dirty idea, which should work as long as the underlying container is a std::vector (contiguous storage), would be the following:
std::priority_queue<int, std::vector<int>, std::less<int> > queue;
queue.push(3);
queue.push(5);
queue.push(9);
queue.push(2);
// Prints in reverse order.
int* front = const_cast<int*>(&queue.top());
int* back = front + queue.size();
std::sort(front, back);
while (front < back) {
printf("%i ", *front);
++front;
}
It may be noted that the in-place sorting will likely break the queue.

Why don't you just specify the opposite comparison function in the declaration:
#include <iostream>
#include <queue>
#include <vector>
#include <functional>
int main() {
std::priority_queue<int, std::vector<int>, std::greater<int> > pq;
pq.push(1);
pq.push(10);
pq.push(15);
std::cout << pq.top() << std::endl;
}

what is the better way to write iterators for a loop in C++

For a very simple thing, like for example to print each element in a vector, what is the better way to use in C++?
I have been using this:
for (vector<int>::iterator i = values.begin(); i != values.end(); ++i)
before, but in one of the Boost::filesystem examples I have seen this way:
for (vec::const_iterator it(v.begin()), it_end(v.end()); it != it_end; ++it)
For me it looks more complicated, and I don't understand why it is better than the one I have been using.
Can you tell me why this version is better? Or does it not matter for simple things like printing the elements of a vector?
Does i != values.end() make the iterating slower?
Or is it const_iterator vs iterator? Is const_iterator faster in a loop like this?

Foo x = y; and Foo x(y); are equivalent, so use whichever you prefer.
Hoisting the end out of the loop may or may not be something the compiler would do anyway; in any event, it makes it explicit that the container end isn't changing.
Use const-iterators if you aren't going to modify the elements, because that's what they mean.
for (MyVec::const_iterator it = v.begin(), end = v.end(); it != end; ++it)
{
/* ... */
}
In C++0x, use auto+cbegin():
for (auto it = v.cbegin(), end = v.cend(); it != end; ++it)
(Perhaps you'd like to use a ready-made container pretty-printer?)

for (vector<int>::iterator i = values.begin(); i != values.end(); ++i)
...vs...
for (vec::const_iterator it(v.begin()), it_end(v.end()); it != it_end; ++it)
For me [the latter, seen in boost] looks more complicated, and I don't understand why it is better than the one I have been using.
I'd say it would look more complicated to anybody who hasn't got some specific reason for liking the latter to the extent that it distorts perception. But let's move on to why it might be better....
Can you tell me why is this version better? Or it doesn't matter for simple things like printing elements of a vector?
Does i != values.end() make the iterating slower?
it_end
Performance: it_end gets the end() value just once, at the start of the loop. For any container where calculating end() was vaguely expensive, calling it only once may save CPU time. For any halfway decent real-world C++ Standard library, all the end() functions perform no calculations and can be inlined for equivalent performance. In practice, unless there's some chance you may need to drop in a non-Standard container that's got a more expensive end() function, there's no benefit to explicitly "caching" end() in optimised code. This is interesting, as it means for vector that size() may require a small calculation: conceptually subtracting begin() from end(), then dividing by sizeof(value_type) (compilers scale by size implicitly during pointer arithmetic), e.g. GCC 4.5.2:
size_type size() const
{ return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }
Maintenance: if the code evolves to insert or erase elements inside the loop (obviously in such a way that the iterator itself isn't invalidated, which is plausible for maps / sets / lists etc.) it's one more point of maintenance (and hence error-proneness) if the cached end() value also needs to be explicitly recalculated.
A small detail, but here vec must be a typedef, and IMHO it's often best to use typedefs for containers as it loosens the coupling of container type with access to the iterator types.
type identifier(expr)
Style and documentary emphasis: type identifier(expr) is more directly indicative of a constructor call than type identifier = expr, which is the main reason some people prefer the form. I generally prefer the latter, as I like to emphasise the sense of assignment... it's visually unambiguous whereas function call notation is used for many things.
Near equivalence: For most classes, both invoke the same constructor anyway, but if type has an explicit constructor from the type of expr, it will be passed over if = is used. Worse still, some other conversion may allow a less ideal constructor to be used instead. For example, X x = 3.14; would pass over explicit X::X(double); to match X::X(int), so you could get a less precise (or just plain wrong) result. I have yet to be bitten by such an issue, though, so it's pretty theoretical!
Or is it const_iterator vs iterator? Is const_iterator faster in a loop like this?
For Standard containers, const_iterator and iterator perform identically, but the latter implies you want the ability to modify the elements as you iterate. Using const_iterator documents that you don't intend to do that, and the compiler will catch any contradictory uses of the iterator that attempt modification. For example, you won't be able to accidentally increment the value the iterator addresses when you intend to increment the iterator itself.
Given C++0x has been mentioned in other answers - but only the incremental benefit of auto and cbegin/cend - there's also a new notation supported:
for (const Foo& foo: container)
// use foo...

To print the items in a vector, you shouldn't be using any of the above (at least IMO).
I'd recommend something like this:
std::copy(values.begin(), values.end(),
std::ostream_iterator<T>(std::cout, "\n"));

You could just access them by index
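#include <cstddef>
#include <cstdio>
#include <vector>
int main()
{
std::vector<int> test;
test.push_back(10);
test.push_back(11);
test.push_back(12);
// size() is unsigned, so use std::size_t for the index
for(std::size_t i = 0; i < test.size(); i++)
std::printf("%d\n", test[i]);
}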
prints out:
10
11
12

I don't think it matters. Internally, they do the same thing, so your compiler should optimise it anyway. I would personally use the first version, as I find it much clearer because it closely follows the for-loop structure.
for (vector<int>::iterator i = values.begin(); i != values.end(); ++i)

Determining if an unordered vector<T> has all unique elements

Profiling my CPU-bound code has suggested that I spend a long time checking whether a container contains completely unique elements. Assuming that I have some large container of unsorted elements (with < and == defined), I have two ideas on how this might be done:
The first using a set:
template <class T>
bool is_unique(vector<T> X) {
set<T> Y(X.begin(), X.end());
return X.size() == Y.size();
}
The second looping over the elements:
template <class T>
bool is_unique2(vector<T> X) {
typename vector<T>::iterator i,j;
for(i=X.begin();i!=X.end();++i) {
for(j=i+1;j!=X.end();++j) {
if(*i == *j) return 0;
}
}
return 1;
}
I've tested them the best I can, and from what I can gather from reading the documentation about STL, the answer is (as usual), it depends. I think that in the first case, if all the elements are unique it is very quick, but if there is a large degeneracy the operation seems to take O(N^2) time. For the nested iterator approach the opposite seems to be true: it is lightning fast if X[0]==X[1] but takes (understandably) O(N^2) time if all the elements are unique.
Is there a better way to do this, perhaps an STL algorithm built for this very purpose? If not, are there any suggestions to eke out a bit more efficiency?

Your first example should be O(N log N) as set takes log N time for each insertion. I don't think a faster O is possible.
The second example is obviously O(N^2). The coefficient and memory usage are low, so it might be faster (or even the fastest) in some cases.
It depends what T is, but for generic performance, I'd recommend sorting a vector of pointers to the objects.
template< class T >
bool dereference_less( T const *l, T const *r )
{ return *l < *r; }
template <class T>
bool is_unique(vector<T> const &x) {
vector< T const * > vp;
vp.reserve( x.size() );
for ( size_t i = 0; i < x.size(); ++ i ) vp.push_back( &x[i] );
sort( vp.begin(), vp.end(), ptr_fun( &dereference_less<T> ) ); // O(N log N)
return adjacent_find( vp.begin(), vp.end(),
not2( ptr_fun( &dereference_less<T> ) ) ) // "opposite functor"
== vp.end(); // true if no adjacent pair (vp_n,vp_n+1) has !(*vp_n < *vp_n+1), i.e. no equal neighbours
}
or in STL style,
template <class I>
bool is_unique(I first, I last) {
typedef typename iterator_traits<I>::value_type T;
…
And if you can reorder the original vector, of course,
template <class T>
bool is_unique(vector<T> &x) {
sort( x.begin(), x.end() ); // O(N log N)
return adjacent_find( x.begin(), x.end() ) == x.end();
}

You must sort the vector if you want to quickly determine if it has only unique elements. Otherwise the best you can do is O(n^2) runtime or O(n log n) runtime with O(n) space. I think it's best to write a function that assumes the input is sorted.
template<class Fwd>
bool is_unique(Fwd first, Fwd last)
{
return adjacent_find(first, last) == last;
}
Then have the client sort the vector, or make a sorted copy of the vector. This opens the door to reusing earlier work: if the client sorted the vector in the past, they have the option to keep and refer to that sorted vector so they can repeat this operation in O(n) runtime.

The standard library has std::unique, but that would require you to make a copy of the entire container (note that in both of your examples you make a copy of the entire vector as well, since you unnecessarily pass the vector by value).
template <typename T>
bool is_unique(std::vector<T> vec)
{
std::sort(vec.begin(), vec.end());
return std::unique(vec.begin(), vec.end()) == vec.end();
}
Whether this would be faster than using a std::set would, as you know, depend :-).

Is it infeasible to just use a container that provides this "guarantee" from the get-go? Would it be useful to flag a duplicate at the time of insertion rather than at some point in the future? When I've wanted to do something like this, that's the direction I've gone; just using the set as the "primary" container, and maybe building a parallel vector if I needed to maintain the original order, but of course that makes some assumptions about memory and CPU availability...

For one thing you could combine the advantages of both: stop building the set, if you have already discovered a duplicate:
template <class T>
bool is_unique(const std::vector<T>& vec)
{
std::set<T> test;
for (typename std::vector<T>::const_iterator it = vec.begin(); it != vec.end(); ++it) {
if (!test.insert(*it).second) {
return false;
}
}
return true;
}
BTW, Potatoswatter makes a good point that in the generic case you might want to avoid copying T, in which case you might use a std::set<const T*, dereference_less> instead.
You could of course potentially do much better if it wasn't generic. E.g if you had a vector of integers of known range, you could just mark in an array (or even bitset) if an element exists.

You can use std::unique, but it requires the range to be sorted first:
template <class T>
bool is_unique(vector<T> X) {
std::sort(X.begin(), X.end());
return std::unique(X.begin(), X.end()) == X.end();
}
std::unique modifies the sequence and returns an iterator to the end of the unique set, so if that's still the end of the vector then it must be unique.
This runs in O(n log n), the same as your set example. I don't think you can theoretically guarantee to do it faster, although using a C++0x std::unordered_set instead of std::set would do it in expected linear time. That requires your elements to be hashable as well as having operator== defined, which might not be so easy.
Also, if you're not modifying the vector in your examples, you'd improve performance by passing it by const reference, so you don't make an unnecessary copy of it.
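A minimal sketch of the std::unordered_set variant mentioned above, assuming T has a std::hash specialisation and operator==. The name is_unique_hashed is hypothetical. It takes the vector by const reference and stops at the first duplicate, so the expected cost is linear in the number of elements examined.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical name; assumes std::hash<T> and operator== exist for T.
template <class T>
bool is_unique_hashed(const std::vector<T>& x)
{
    std::unordered_set<T> seen;
    seen.reserve(x.size());            // avoid rehashing while inserting
    for (const T& value : x) {
        // insert().second is false when an equal element is already present
        if (!seen.insert(value).second)
            return false;
    }
    assert(seen.size() == x.size());   // every element was inserted exactly once
    return true;
}
```

This still copies each T into the set; for types that are expensive to copy, the pointer-based approaches from the other answers avoid that at the price of a custom hash and equality on the pointed-to values.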

If I may add my own 2 cents.
First of all, as @Potatoswatter remarked, unless your elements are cheap to copy (built-in/small PODs) you'll want to use pointers to the original elements rather than copying them.
Second, there are 2 strategies available.
Simply ensure there is no duplicate inserted in the first place. This means, of course, controlling the insertion, which is generally achieved by creating a dedicated class (with the vector as attribute).
Whenever the property is needed, check for duplicates
I must admit I would lean toward the first. Encapsulation, clear separation of responsibilities and all that.
Anyway, there are a number of ways depending on the requirements. The first question is:
Do we have to leave the elements in the vector in a particular order, or can we "mess" with them?
If we can mess with them, I would suggest keeping the vector sorted: Loki::AssocVector should get you started.
If not, then we need to keep an index on the structure to ensure this property... wait a minute: Boost.MultiIndex to the rescue ?
Thirdly: as you remarked yourself, a simple nested linear search yields O(N^2) complexity on average, which is no good.
If < is already defined, then sorting is the obvious choice, with its O(N log N) complexity.
It might also be worth making T Hashable, because a std::tr1::unordered_set could yield a better time (I know, you need a RandomAccessIterator, but if T is Hashable then it's easy to make T* Hashable too ;) )
But in the end the real issue here is that our advice is necessarily generic because we lack data.
What is T, do you intend the algorithm to be generic ?
What is the number of elements ? 10, 100, 10.000, 1.000.000 ? Because asymptotic complexity is kind of moot when dealing with a few hundreds....
And of course: can you ensure uniqueness at insertion time? Can you modify the vector itself?

Well, your first one should only take N log(N), so it clearly has the better worst-case behaviour for this application.
However, you should be able to get a better best case if you check as you add things to the set:
template <class T>
bool is_unique3(const vector<T>& X) { // by reference: copying X would already cost O(N)
set<T> Y;
typename vector<T>::const_iterator i;
for(i=X.begin(); i!=X.end(); ++i) {
if (Y.find(*i) != Y.end()) {
return false;
}
Y.insert(*i);
}
return true;
}
This should have O(1) best case, O(N log(N)) worst case, and average case depends on the distribution of the inputs.

If the type T you store in your vector is large and copying it is costly, consider creating a vector of pointers or iterators to your vector elements. Sort it based on the element pointed to and then check for uniqueness.
You can also use std::set for that. The template looks like this:
template <class Key,class Traits=less<Key>,class Allocator=allocator<Key> > class set
I think you can provide an appropriate Traits parameter and insert raw pointers for speed, or implement a simple wrapper class for pointers with a < operator.
Don't use the constructor for inserting into the set. Use the insert method. One of its overloads has the signature
pair <iterator, bool> insert(const value_type& _Val);
By checking the result (the second member) you can often detect a duplicate much sooner than if you inserted all the elements.

In the (very) special case of non-negative discrete values with a known, not too big, maximum value N, you should be able to start a bucket sort and simply check that the number of values in each bucket is below 2.
bool is_unique(const vector<int>& X, int N)
{
vector<int> buckets(N + 1, 0); // one bucket per value in [0, N]
vector<int>::const_iterator i;
for(i = X.begin(); i != X.end(); ++i)
if(++buckets[*i] > 1)
return false;
return true;
}
The complexity of this would be O(n + N): O(n) for the scan plus O(N) to initialise the buckets.

Using the current C++ standard containers, you have a good solution in your first example. But if you can use a hash container, you might be able to do better, as building the hash set takes expected O(n) time instead of O(n log n) for a standard set. Of course everything will depend on the size of n and your particular library implementation.
