In place randomized selection algorithm - c++

We are currently studying algorithms, so I marked this question as “homework” even though it is not a homework task. Just to be safe.
We just studied the randomized selection algorithm, and the logic seems simple: choose an element from a list (the pivot) and put it in its correct place. Then repeat the process in one sub-list until the element at the requested index is in its place, where the index is the position of the element you want in the sorted list.
This is essentially a modified version of quicksort, but we only recurse into one sub-list, not both. Hence a performance boost (in big-O terms).
I can successfully implement this algorithm using external storage (C++, zero-based arrays):
int r_select2(vector<int>& list, int i)
{
int p = list[0];
vector<int> left, right;
for (int k = 1; k < list.size(); ++k)
{
if (list[k] < p) left.push_back(list[k]);
else right.push_back(list[k]);
}
int j = left.size();
if (j > i) p = r_select2(left, i);
else if (j < i) p = r_select2(right, i - j - 1);
return p;
}
However, I want to implement the algorithm in situ (in place), without extra sub-arrays. I believe this should be an easy, even trivial, task, but somewhere my in-place version goes wrong. Maybe it’s just late and I need to sleep, but I can’t see the root cause of why the following version fails:
int r_select(vector<int>& list, int begin, int end, int i)
{
i = i + begin;
int p = list[begin];
if (begin < end)
{
int j = begin;
for (int k = begin + 1; k < end; ++k)
{
if (list[k] < p)
{
++j;
swap(list[j], list[k]);
}
}
swap(list[begin], list[j]);
if (j > i) p = r_select(list, begin, j, i);
else if (j < i) p = r_select(list, j + 1, end, i - j);
}
return p;
}
In both examples, the first element is used as the pivot to keep the design simple. In both examples, i is the index of the element I want.
Any ideas where the 2nd example is failing? Is it a simple off-by-one error?
Thank you all!

This sounds fishy:
i = i + begin;
...
r_select(list, begin, j, i);
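To make the hint concrete: i is shifted by begin at the top of every call, and the recursive calls then pass on an i that has already been shifted, so the offset is applied more than once. A minimal sketch of one way around this, assuming i is always an absolute index into list and [begin, end) is a half-open range (the name r_select_fixed is mine, not from the question):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns the element that would sit at absolute index i if list were sorted.
// Assumes begin <= i < end; [begin, end) is the half-open range still in play.
int r_select_fixed(std::vector<int>& list, int begin, int end, int i)
{
    assert(begin <= i && i < end);
    const int p = list[begin];            // first element as pivot, as in the question
    int j = begin;                        // last position holding a value < p
    for (int k = begin + 1; k < end; ++k) {
        if (list[k] < p) {
            ++j;
            std::swap(list[j], list[k]);
        }
    }
    std::swap(list[begin], list[j]);      // pivot lands at index j
    if (i < j) return r_select_fixed(list, begin, j, i);    // i is never re-based
    if (i > j) return r_select_fixed(list, j + 1, end, i);
    return list[j];
}
```

A call such as r_select_fixed(v, 0, static_cast<int>(v.size()), i) then plays the role of r_select2(v, i).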

Maintain an unordered_map but at the same time need the lowest of its mapped values at every step

I have an unordered_map<int, int> which is updated at every step of a for loop. But at the end of the loop, I also need the lowest of the mapped values. Traversing it to find the minimum in O(n) is too slow. I know there is a MultiIndex container in Boost, but I can't use Boost. What is the simplest way to do this using only the STL?
Question:
Given an array A of positive integers, call a (contiguous, not
necessarily distinct) subarray of A good if the number of different
integers in that subarray is exactly K.
(For example, [1,2,3,1,2] has 3 different integers: 1, 2, and 3.)
Return the number of good subarrays of A.
My code:
class Solution {
public:
int subarraysWithKDistinct(vector<int>& A, int K) {
int left, right;
unordered_map<int, int> M;
for (left = right = 0; right < A.size() && M.size() < K; ++right)
M[A[right]] = right;
if (right == A.size())
return 0;
int smallest, count;
smallest = numeric_limits<int>::max();
for (auto p : M)
smallest = min(smallest, p.second);
count = smallest - left + 1;
for (; right < A.size(); ++right)
{
M[A[right]] = right;
while (M.size() > K)
{
if (M[A[left]] == left)
M.erase(A[left]);
++left;
}
smallest = numeric_limits<int>::max();
for (auto p : M)
smallest = min(smallest, p.second);
count += smallest - left + 1;
}
return count;
}
};
Link to the question: https://leetcode.com/problems/subarrays-with-k-different-integers/
O(n) is not slow, in fact it is the theoretically fastest possible way to find the minimum, as it's obviously not possible to find the minimum of n items without actually considering each of them.
You could update the minimum during the loop. That is trivial if the loop only adds new items to the map, but becomes much harder if the loop may change existing items (and may increase the value of the item that was the minimum until then). Ultimately this also adds O(n) work or more, so complexity-wise it is no different from doing an extra loop at the end. (The constant can obviously differ: the extra loop may be slower than reusing the original loop, but the complexity is the same.)
As you said, there are data structures that make it more efficient (O(log n) or even O(1)) to retrieve the minimum item, but at the cost of extra work to maintain the data structure during insertion. These data structures only make sense if you frequently need to check the minimum item while inserting or changing items, not if you only need to know the minimum at the end of the loop, as you described.
I made a simple class to make it work. It's far from perfect, but it's good enough for the question linked above.
class BiMap
{
public:
void insert(int key, int value)
{
auto itr = M.find(key);
if (itr == M.cend())
M.emplace(key, S.insert(value).first);
else
{
S.erase(itr->second);
M[key] = S.insert(value).first;
}
}
void erase(int key)
{
auto itr = M.find(key);
S.erase(itr->second);
M.erase(itr);
}
int operator[] (int key)
{
return *M.find(key)->second;
}
int size()
{
return M.size();
}
int minimum()
{
return *S.cbegin();
}
private:
unordered_map<int, set<int>::const_iterator> M;
set<int> S;
};
class Solution {
public:
int subarraysWithKDistinct(vector<int>& A, int K) {
int left, right;
BiMap M;
for (left = right = 0; right < A.size() && M.size() < K; ++right)
M.insert(A[right], right);
if (right == A.size())
return 0;
int count = M.minimum() - left + 1;
for (; right < A.size(); ++right)
{
M.insert(A[right], right);
while (M.size() > K)
{
if (M[A[left]] == left)
M.erase(A[left]);
++left;
}
count += M.minimum() - left + 1;
}
return count;
}
};

fastest way to traverse a tree structure in cpp with a stack

I've got a tree structure and a running example of how to iterate over it but as I'm a beginner when it comes to performance in coding, I wanted to ask if somebody knows a way to make it faster.
Eigen3 is used to build vectors.
The struct:
struct linkedBoxes{
std::vector<Vector3r> points;
std::vector<linkedBoxes*> boxes;
int sideLength;
};
And the algorithm:
std::vector<Vector3r> integPoints;
std::vector<linkedBoxes*> stack;
stack.push_back(firstBox);
vector<linkedBoxes*>::iterator iterator = stack.begin();
while (iterator != stack.end()){
if((*iterator)->boxes.size() == 0){
for (int j = 0; j < (*iterator)->points.size(); ++j) {
integPoints.push_back(point + (*iterator)->points[j]);
}
} else {
for (int k = 0; k < (*iterator)->points.size(); ++k) {
Vector3r tmpPoint = point + (*iterator)->points[k];
if(computeDistance(tmpPoint) < 0){
const size_t diff = iterator - stack.begin();
stack.push_back((*iterator)->boxes[k]);
iterator = stack.begin() + diff;
}
}
}
++iterator;
}
point is a vector that is passed to the function, and computeDistance returns a value between -1 and 1.
Does somebody know a way to make this faster?
Bruno
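Not a measured answer, but one thing that can be simplified in the loop above: walking the stack by index removes the need to recompute the iterator after every push_back, because an index stays valid when the vector reallocates. The sketch below assumes a plain Vec3 struct as a stand-in for the Eigen-based Vector3r and takes the distance function as a parameter; both are placeholders of mine, and whether this is actually faster would need profiling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the Eigen-based Vector3r from the question (assumption).
struct Vec3 {
    double x, y, z;
};

inline Vec3 operator+(const Vec3& a, const Vec3& b)
{
    return {a.x + b.x, a.y + b.y, a.z + b.z};
}

struct linkedBoxes {
    std::vector<Vec3> points;
    std::vector<linkedBoxes*> boxes;
    int sideLength;
};

// Walks the tree from firstBox using an index-based worklist.
// distance is a hypothetical callable standing in for computeDistance.
template <typename DistanceFn>
std::vector<Vec3> collectPoints(linkedBoxes* firstBox, const Vec3& point,
                                DistanceFn distance)
{
    std::vector<Vec3> integPoints;
    std::vector<linkedBoxes*> stack;
    stack.push_back(firstBox);
    for (std::size_t n = 0; n < stack.size(); ++n) {
        const linkedBoxes* box = stack[n];   // copy of the pointer, safe across push_back
        if (box->boxes.empty()) {
            // Leaf: collect every point, shifted by the query point.
            for (const Vec3& p : box->points)
                integPoints.push_back(point + p);
        } else {
            // As in the question, points[k] is assumed to pair with boxes[k].
            assert(box->boxes.size() >= box->points.size());
            for (std::size_t k = 0; k < box->points.size(); ++k) {
                if (distance(point + box->points[k]) < 0)
                    stack.push_back(box->boxes[k]);
            }
        }
    }
    return integPoints;
}
```

The visiting order is the same as in the original loop: boxes are processed in the order they were pushed.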

Quicksort c++ first element as pivot

I have something like this, and I want to use the first element as the pivot.
Why is this program still not working?
void algSzyb1(int tab[],int l,int p)
{
int x,w,i,j;
i=l; // l is the left bound and p is the right bound (not the pivot); i, j = scan counters
j=p;
x=tab[l];
do
{
while(tab[i]<x) i++;
while(tab[j]>x) j--;
if(i<=j)
{
w=tab[i];
tab[i]=tab[j];
tab[j]=w;
i++;
j--;
}
}
while(!(i<j));
if(l<j) algSzyb1(tab,l,j);
if(i<p) algSzyb1(tab,i,p);
}
Looking at the code, not really checking what it does, just looking at the individual lines, this one line stands out:
while(!(i<j));
I look at that line, and I think: There is a bug somewhere round here. I haven't actually looked at the code so I don't know what the bug is, but I look at this single line and it looks wrong.
I think you need to decrement j before incrementing i.
while (tab[j]>x ) j--;
while (tab[i]<x && i < j) i++;
Also I have added an extra condition to ensure that i doesn't sweep past j. (Uninitialized memory read).
The name pivot is slightly misleading, since it suggests the element ends up in its final sorted position. Both this code and the Wikipedia quicksort page move the pivot into the higher partition and don't guarantee that it lands in its correct place.
The end condition is when you have swept through the list
while( i < j ); /* not !(i<j) */
At the end of the search, you need to test a smaller set. The code you had created a stack overflow, because it repeatedly tried the same test.
if (l<j) algSzyb1(tab, l, j);
if (j+1<p) algSzyb1(tab, j+1, p);
Full code
void algSzyb1(int tab[], int l, int p)
{
int x, w, i, j;
i = l;
j = p;
x = tab[l]; // come back here later :D
do
{
while (tab[j]>x ) j--;
while (tab[i]<x && i < j) i++;
if (i < j)
{
w = tab[i];
tab[i] = tab[j];
tab[j] = w;
i++;
j--;
}
} while ((i<j));
if (l<j) algSzyb1(tab, l, j);
if (j+1<p) algSzyb1(tab, j+1, p);
}
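One case worth testing against the full code above is {2, 3, 1}: the loop stops as soon as i and j meet after a swap, so the element where they meet is never compared with the pivot, and the array comes back as {1, 3, 2}. A minimal sketch of the classic Hoare-style scheme instead, which keeps the scan order from the question and only changes the loop condition to i <= j (the name algSzyb1_sketch is mine, and l and p are assumed to be inclusive bounds):

```cpp
#include <cassert>
#include <utility>

// Sorts tab[l..p] (both bounds inclusive) using tab[l] as the pivot value.
void algSzyb1_sketch(int tab[], int l, int p)
{
    assert(l <= p);
    int i = l;
    int j = p;
    const int x = tab[l];                 // pivot value, first element
    do {
        while (tab[i] < x) i++;           // stops at the first element >= x
        while (tab[j] > x) j--;           // stops at the first element <= x
        if (i <= j) {
            std::swap(tab[i], tab[j]);
            i++;
            j--;
        }
    } while (i <= j);                     // keep going until the indices cross
    if (l < j) algSzyb1_sketch(tab, l, j);
    if (i < p) algSzyb1_sketch(tab, i, p);
}
```

Because the loop only ends once i and j have crossed, both recursive ranges are strictly smaller than [l, p], so the repeated-call problem described above does not occur.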

c++ iterate through all neighbor permutations

I have a vector of N objects, and I would like to iterate through all neighbor permutations of this vector. What I call a neighbor permutation is a permutation where only two elements of the original vector would be changed :
if I have a vector with 'a','b','c','d' then :
'b','a','c','d' //is good
'a','c','b','d' //is good
'b','a','d','c' //is not good (2 permutations)
If I use std::next_permutation(myVector.begin(), myVector.end()), I will get all the possible permutations, not only the "neighbor" ones...
Do you have any idea how that could be achieved ?
Initially, I thought I would filter out the permutations that have a Hamming distance greater than 2.
However, if you only need to generate the vectors that result from swapping one pair, it is more efficient to do it like this:
for(int i = 0; i < n; i++)
for(int j = i + 1; j < n; j++)
// swap i and j
Depending on whether you need to collect all the results or not, you should either make a copy of the vector before the swap, or swap i and j back after you have processed the current permutation.
Collect all the results:
std::vector< std::vector<T> > neighbor_permutations;
for(int i = 0; i < n; i++) {
for(int j = i + 1; j < n; j++) {
std::vector<T> perm(v);
std::swap(perm[i], perm[j]);
neighbor_permutations.push_back(perm);
}
}
Faster version - do not collect results:
for(int i = 0; i < n; i++) {
for(int j = i + 1; j < n; j++) {
std::swap(v[i], v[j]);
process_permutation(v);
std::swap(v[i], v[j]);
}
}
Perhaps it's a good idea to divide this into two parts:
How to generate the "neighbor permutations"
How to iterate over them
Regarding the first, it's easy to write a function:
std::vector<T> make_neighbor_permutation(
const std::vector<T> &orig, std::size_t i, std::size_t j);
which swaps i and j. I did not understand from your question if there's an additional constraint that j = i + 1, in which case you could drop a parameter.
Armed with this function, you now need an iterator that iterates over all legal combinations of i and j (again, I'm not sure of the interpretation of your question. It might be that there are n - 1 values).
This is very easy to do using boost::iterator_facade. You simply need to define an iterator that takes in the constructor your original iterator, and sets i (and possibly j) to initial values. As it is incremented, it needs to update the index (or indices). The dereference method needs to call the above function.
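For readers who cannot use Boost, the two parts can also be sketched with plain loops. The body of make_neighbor_permutation below is my guess at what the declaration above is meant to do, and for_each_neighbor_permutation is a hypothetical helper that visits every pair i < j instead of exposing an iterator:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns a copy of orig with the elements at positions i and j exchanged.
template <typename T>
std::vector<T> make_neighbor_permutation(const std::vector<T>& orig,
                                         std::size_t i, std::size_t j)
{
    assert(i < orig.size() && j < orig.size());
    std::vector<T> perm(orig);
    std::swap(perm[i], perm[j]);
    return perm;
}

// Hypothetical helper: calls f once for every permutation that differs
// from orig by exactly one swap of positions i < j.
template <typename T, typename F>
void for_each_neighbor_permutation(const std::vector<T>& orig, F f)
{
    for (std::size_t i = 0; i < orig.size(); ++i)
        for (std::size_t j = i + 1; j < orig.size(); ++j)
            f(make_neighbor_permutation(orig, i, j));
}
```

If only adjacent swaps are wanted (j = i + 1), the inner loop collapses to a single call per i.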
Another way to get it, just a try.
int main()
{
std::vector<char> vec={'b','a','c','d'};
std::vector<int> vec_in={1,1,0,0};
do{
auto it =std::find(vec_in.begin(),vec_in.end(),1);
if( *(it++) ==1)
{
for(auto &x : vec)
{
std::cout<<x<<" ";
}
std::cout<<"\n";
}
} while(std::next_permutation(vec_in.begin(),vec_in.end()),
std::next_permutation(vec.begin(),vec.end()) );
}

Implementing selection sort with vectors

I am attempting to implement a function that sorts a randomly generated vector using selection sort. I am trying a naive way just to see if I can get it working correctly. Here is my attempt:
void selection_sort(std::vector<int>& v)
{
int pos, min, i;
//std::vector<int>::iterator pos, min, i;
for( pos = v[0]; pos < v[30]; ++pos)
{
min = pos;
for( i = v[pos + 1]; i < v[30]; ++i)
{
if( i < min)
{
min = i;
}
}
if( min != pos)
{
std::swap(v.at(min), v.at(pos));
}
}
}
For some reason however when I display the vector again, all of the elements are in the exact same order as they were originally. I am not sure if I am not using std::swap correctly or if my selection sort is not written correctly. I am sure the answer is trivially easy, but I can not see it. Thanks for your help in advance.
Your problem is that you're trying to base your loops around the actual values in the vector, not the indexes in the vector.
So if your vector is randomly generated, and you say this:
for( pos = v[0]; pos < v[30]; ++pos)
There is a chance that the value at v[0] is greater than v[30]. Thus the loop would never run. I see the same problem in this loop:
for( i = v[pos + 1]; i < v[30]; ++i)
So I'd recommend using indexes for the actual looping. Try something like:
for( pos = 0; pos < 30; ++pos)
{
min = v[pos];
etc...
EDIT: As mentioned below, it would also be better to base your loop on the size of the vector. size() is a cheap, constant-time call on std::vector, so grabbing it once before the loop starts is optional, but it keeps the loop condition tidy. For example:
size_t size = v.size();
for(size_t pos = 0; pos < size; ++pos)
You should use 0, pos+1 and v.size() as end points in your for-loops.
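Putting the advice above together, a minimal index-based sketch could look like the following. It is my own completion rather than the answerers' code, and the name selection_sort_sketch is hypothetical. Note that min holds an index here, which is what the v.at(min) in the question's swap expects, and that elements are compared through v[i] and v[min]:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

void selection_sort_sketch(std::vector<int>& v)
{
    const std::size_t size = v.size();
    for (std::size_t pos = 0; pos + 1 < size; ++pos) {
        std::size_t min = pos;                 // index of the smallest element seen so far
        for (std::size_t i = pos + 1; i < size; ++i) {
            if (v[i] < v[min])                 // compare values, remember the index
                min = i;
        }
        if (min != pos)
            std::swap(v[min], v[pos]);
    }
    assert(std::is_sorted(v.begin(), v.end()));
}
```

Since the loops run over indices from 0 to v.size(), the function no longer depends on the vector having exactly 31 elements.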