Clustering example in C++

I have an increasing input vector like {0, 1, 3, 5, 6, 7, 9} and want to cluster the inputs like {{0, 1}, {3}, {5, 6, 7}, {9}}, i.e. cluster only the integers that are neighbors. The function signature is std::vector<std::vector<int>> solution(const std::vector<int>& input).

I usually advocate for not giving away solutions, but it looks like you're getting bogged down with indices and temporary vectors. Instead, standard iterators and algorithms make this task a breeze:
std::vector<std::vector<int>> solution(std::vector<int> const &input) {
std::vector<std::vector<int>> clusters;
// Special-casing to avoid returning {{}} in case of an empty input
if(input.empty())
return clusters;
// Loop-and-a-half, no condition here
for(auto it = begin(input);;) {
// Find the last element of the current cluster
auto const last = std::adjacent_find(
it, end(input),
[](int a, int b) { return b - a > 1; }
);
if(last == end(input)) {
// We reached the end: register the last cluster and return
clusters.emplace_back(it, last);
return clusters;
}
// One past the end of the current cluster
auto const gap = next(last);
// Register the cluster
clusters.emplace_back(it, gap);
// One past the end of a cluster is the beginning of the next one
it = gap;
}
}
See it live on Coliru (lame output formatting free of charge)
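For comparison, here is a minimal sketch of the same clustering written as a plain single-pass loop. The name cluster_runs is made up for this illustration, and it assumes the input is sorted in increasing order, as in the question. It uses the same gap rule as the adjacent_find version (a difference greater than 1 starts a new cluster).

```cpp
#include <vector>

// Hypothetical helper (not from the original answer): groups a sorted
// input into runs of neighboring integers.
std::vector<std::vector<int>> cluster_runs(const std::vector<int>& input) {
    std::vector<std::vector<int>> clusters;
    for (int value : input) {
        // Start a new cluster when there is none yet or when a gap appears.
        if (clusters.empty() || value - clusters.back().back() > 1) {
            clusters.emplace_back();
        }
        clusters.back().push_back(value);
    }
    return clusters;
}
```

An empty input needs no special case here, because no cluster is created until the first value arrives.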

Related

using std::remove_if to iterate and remove entries in deque

I am using std::remove_if to iterate through a deque and remove items found in a separate list. The logic seems straightforward, but my test code leaves items in the deque that should have been removed. I'm failing to see where the fault is in my logic.
int main()
{
const int32_t STATUS_OK = 1;
const int32_t STATUS_NOT_OK = 2;
const int32_t STATUS_TEST = 3;
std::deque<int32_t> error_codes;
error_codes.push_front(STATUS_OK);
error_codes.push_front(STATUS_NOT_OK);
error_codes.push_front(STATUS_TEST);
std::list<int32_t> ignored_codes;
ignored_codes.push_back(STATUS_OK);
ignored_codes.push_back(STATUS_NOT_OK);
error_codes.erase(
std::remove_if(
begin(error_codes),
end(error_codes),
[ignored_codes](int32_t error_code) {
for (auto ignore_code : ignored_codes)
{
if (ignore_code == error_code)
return true;
}
return false;
}),
// edit - this is what was missing:
end(error_codes));
}
At the end of execution, error_codes shows 2 items {3, 1} remaining in the deque. If I add an additional STATUS_NOT_OK to the error codes, I am left with {3, 2, 1} in the error_codes deque. What am I doing wrong?
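The likely cause is the missing second argument to erase: std::remove_if only shifts the kept elements to the front and returns the new logical end, and the single-iterator overload of erase then removes just one element at that position. Below is a minimal sketch of the full erase-remove idiom; the helper name drop_ignored is made up for this illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>
#include <list>

// Hypothetical helper: erase every code in `codes` that also appears in `ignored`.
void drop_ignored(std::deque<std::int32_t>& codes,
                  const std::list<std::int32_t>& ignored) {
    auto is_ignored = [&ignored](std::int32_t code) {
        return std::find(ignored.begin(), ignored.end(), code) != ignored.end();
    };
    // remove_if returns the new logical end; the two-iterator erase then
    // drops everything from there up to the real end of the deque.
    codes.erase(std::remove_if(codes.begin(), codes.end(), is_ignored),
                codes.end());
}
```

Capturing the list by reference also avoids the copy that [ignored_codes] makes when the lambda is created.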

How to remove non contiguous elements from a vector in c++

I have a vector std::vector<inputInfo> inputList and another vector std::vector<int> selection.
inputInfo is a struct that has some information stored.
The vector selection corresponds to positions inside inputList vector.
I need to remove elements from inputList which correspond to entries in the selection vector.
Here's my attempt on this removal algorithm.
Assuming the selection vector is sorted and using some (unavoidable?) pointer arithmetic, this can be done in one statement:
template <class T>
inline void erase_selected(std::vector<T>& v, const std::vector<int>& selection)
{
v.resize(std::distance(
v.begin(),
std::stable_partition(v.begin(), v.end(),
[&selection, &v](const T& item) {
return !std::binary_search(
selection.begin(), selection.end(),
static_cast<int>(static_cast<const T*>(&item) - &v[0]));
})));
}
This is based on an idea of Sean Parent (see this C++ Seasoning video): use std::stable_partition ("stable" means the elements keep their relative order) to move all selected items to the end of the array.
The line with pointer arithmetic
static_cast<int>(static_cast<const T*>(&item) - &v[0])
can, in principle, be replaced with an index-free expression built from STL algorithms
std::distance(std::begin(v), std::find(v.begin(), v.end(), item))
but then every call spends O(n) in std::find, and the result is only right when the elements are unique.
The shortest way to remove non-contiguous elements:
template <class T> void erase_selected(std::vector<T>& v, const std::vector<int>& selection)
{
    std::vector<int> sorted_sel = selection;
    std::sort(sorted_sel.begin(), sorted_sel.end());
    // 1) Define checker lambda
    // This assumes 'filter' is called exactly once for every element
    // and that the calls follow the original order of the array;
    // the standard does not promise that order.
    // We manually keep track of the index of the item being filtered
    // so that we can look this index up in the 'sorted_sel' array
    int itemIndex = 0;
    auto filter = [&itemIndex, &sorted_sel](const T&) {
        return !std::binary_search(
            sorted_sel.begin(),
            sorted_sel.end(),
            itemIndex++);
    };
    // 2) Move all selected items to the end (the kept ones stay in front, in order)
    auto end_of_kept = std::stable_partition(
        v.begin(),
        v.end(),
        filter);
    // 3) Cut off the end of the std::vector
    v.resize(std::distance(v.begin(), end_of_kept));
}
Original code & test
If for some reason the code above does not work due to a strangely behaving std::stable_partition(), then below is a workaround (wrapping the input array values with selected flags).
I do not assume that inputInfo structure contains the selected flag, so I wrap all the items in the T_withFlag structure which keeps pointers to original items.
#include <algorithm>
#include <iostream>
#include <vector>
template <class T>
std::vector<T> erase_selected(const std::vector<T>& v, const std::vector<int>& selection)
{
std::vector<int> sorted_sel = selection;
std::sort(sorted_sel.begin(), sorted_sel.end());
// Packed (data+flag) array
struct T_withFlag
{
T_withFlag(const T* ref = nullptr, bool sel = false): src(ref), selected(sel) {}
const T* src;
bool selected;
};
std::vector<T_withFlag> v_with_flags;
// should be like
// { {0, true}, {0, true}, {3, false},
// {0, true}, {2, false}, {4, false},
// {5, false}, {0, true}, {7, false} };
// for the input data in main()
v_with_flags.reserve(v.size());
// No "beautiful" way to iterate a vector
// and keep track of element index
// We need the index to check if it is selected
// The check takes O(log(n)), so the loop is O(n * log(n))
int itemIndex = 0;
for (auto& ii: v)
v_with_flags.emplace_back(
T_withFlag(&ii,
std::binary_search(
sorted_sel.begin(),
sorted_sel.end(),
itemIndex++)
));
// I. (The bulk of the) removal algorithm
// a) Define checker lambda: true for items we keep
auto filter = [](const T_withFlag& ii) { return !ii.selected; };
// b) Move every item marked as 'selected'
// to the end of the array
auto end_of_selected = std::stable_partition(
v_with_flags.begin(),
v_with_flags.end(),
filter);
// c) Cut off the end of the std::vector
v_with_flags.resize(
std::distance(v_with_flags.begin(), end_of_selected));
// II. Output
std::vector<T> v_out(v_with_flags.size());
std::transform(
// since C++17 you can parallelize this
// with 'std::execution::par' as first parameter
v_with_flags.begin(),
v_with_flags.end(),
v_out.begin(),
[](const T_withFlag& ii) { return *(ii.src); });
return v_out;
}
The test function is
int main()
{
// Obviously, I do not know the structure
// used by the topic starter,
// so I just declare a small structure for a test
// The 'erase_selected' does not assume
// this structure to be 'light-weight'
struct inputInfo
{
int data;
inputInfo(int v = 0): data(v) {}
};
// Source selection indices
std::vector<int> selection { 0, 1, 3, 7 };
// Source data array
std::vector<inputInfo> v{ 0, 0, 3, 0, 2, 4, 5, 0, 7 };
// Output array
auto v_out = erase_selected(v, selection);
for (auto ii : v_out)
std::cout << ii.data << ' ';
std::cout << std::endl;
}
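If relying on the order in which std::stable_partition calls its predicate feels too fragile, a hand-written compaction loop is another option. The following is only a sketch under stated assumptions: the name erase_by_index is made up, the selection is taken by value so it can be sorted locally, and out-of-range or duplicate indices are simply ignored.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: remove the elements of `v` whose positions are
// listed in `selection`, keeping the remaining elements in order.
template <class T>
void erase_by_index(std::vector<T>& v, std::vector<int> selection) {
    std::sort(selection.begin(), selection.end());
    std::size_t next_sel = 0;  // position in the sorted selection
    std::size_t write = 0;     // slot for the next kept element
    for (std::size_t read = 0; read < v.size(); ++read) {
        const int index = static_cast<int>(read);
        // Skip selection entries that lie behind the read position
        // (duplicates and negative values end up here).
        while (next_sel < selection.size() && selection[next_sel] < index) {
            ++next_sel;
        }
        if (next_sel < selection.size() && selection[next_sel] == index) {
            continue;  // selected for removal: do not copy it forward
        }
        if (write != read) {
            v[write] = std::move(v[read]);
        }
        ++write;
    }
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(write), v.end());
}
```

The loop touches every element once and does no binary search, so after the sort it runs in linear time, and it does not require T to be default-constructible.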

BFS paths C++ implementation returning an extra value

I'm trying to implement a function that uses Breadth First Search to find the paths between given start and end nodes. I'm new to C++; I already implemented the same thing in Python and it works.
With the following graph, it should give the paths {{1, 3, 6}, {1, 2, 5, 6}}:
map<int, vector<int> > aGraph = {
{1, {2, 3}},
{2, {1, 4, 5}},
{3, {1, 6}},
{4, {2}},
{5, {2, 6}},
{6, {3, 5}}
};
I created a function called BFSPaths to solve the problem, but I keep getting an extra digit in the answer: {{1, 2, 3, 6}, {1, 2, 4, 5, 6}}. I haven't been able to figure out why the 2 and the 4 are being added to the answer. This is what the function looks like:
vector<vector<int>> BFSPaths(map<int, vector<int>> &graph, int head, int tail)
{
vector<vector<int>> out;
vector<int> init {head};
vector<tuple<int, vector<int>>> queue = { make_tuple(head, init) };
while (queue.size() > 0)
{
int vertex = get<0>(queue.at(0));
vector<int> path = get<1>(queue.at(0));
queue.erase(queue.begin());
vector<int> difference;
sort(graph.at(vertex).begin(), graph.at(vertex).end());
sort(path.begin(), path.end());
set_difference(
graph.at(vertex).begin(), graph.at(vertex).end(),
path.begin(), path.end(),
back_inserter( difference )
);
for (int v : difference)
{
if (v == tail)
{
path.push_back(v);
out.push_back(path);
}
else
{
path.push_back(v);
tuple<int, vector<int>> temp (v, path);
queue.push_back(temp);
}
}
}
return out;
}
This is how I'm calling my function (to print to the shell):
void testBFSPaths(map<int, vector<int>> &graph, int head, int tail)
{
vector<vector<int>> paths = BFSPaths(graph, head, tail);
for (int i=0; i<paths.size(); i++)
{
print(paths.at(i));
}
}
int main ()
{
// graph definition goes here ....
testBFSPaths(aGraph, 1, 6);
}
I would appreciate if someone can give me a push in the right direction.
As far as I understand, you're calculating the set difference between the reachable vertices and the path to the current vertex here:
set_difference(
graph.at(vertex).begin(), graph.at(vertex).end(),
path.begin(), path.end(),
back_inserter( difference )
);
But that does not make sense for BFS. As the loop further down shows, you add every vertex from this difference to the path, whether or not it lies on a path from head to tail.
You should take another approach and change your algorithm a little.
Steps that I would recommend:
Add the head vertex to the queue as you do, but without a path.
Extract the queue's head and add all of its unvisited adjacent vertices to the queue, each with a link to its predecessor.
Repeat until the queue is empty or the tail is reached.
Get the path from head to tail by following the links to predecessors.
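The steps above can be sketched roughly as follows. Note that this variant returns a single shortest path rather than every path, and that the function name shortestPath and the use of a std::map for the predecessor links are choices made for this illustration.

```cpp
#include <algorithm>
#include <map>
#include <queue>
#include <vector>

// Hypothetical helper: one shortest path from head to tail,
// or an empty vector when tail cannot be reached.
std::vector<int> shortestPath(const std::map<int, std::vector<int>>& graph,
                              int head, int tail) {
    std::map<int, int> predecessor;  // vertex -> vertex it was reached from
    std::queue<int> pending;
    predecessor[head] = head;        // marks head as visited
    pending.push(head);
    while (!pending.empty()) {
        const int vertex = pending.front();
        pending.pop();
        if (vertex == tail) break;
        const auto found = graph.find(vertex);
        if (found == graph.end()) continue;
        for (int next : found->second) {
            if (predecessor.count(next) == 0) {  // not visited yet
                predecessor[next] = vertex;
                pending.push(next);
            }
        }
    }
    std::vector<int> path;
    if (predecessor.count(tail) == 0) return path;  // tail never reached
    // Follow the predecessor links back from tail to head.
    for (int v = tail; v != head; v = predecessor[v]) path.push_back(v);
    path.push_back(head);
    std::reverse(path.begin(), path.end());
    return path;
}
```

Because each vertex is queued at most once, the queue holds plain vertex ids instead of whole paths.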
By the way, I would recommend not using queue.erase(queue.begin()) to remove the head of the queue; use a std::queue and its pop() instead. You can also replace map.at(key) with the simpler map[key], keeping in mind that operator[] inserts a default value when the key is missing.
One last thing: it is not clear to me why you store adjacent vertices in a vector<int> if you have to sort them so often. Use something like set<int> instead so you do not have to worry about it.
The problem was that path was being updated in place inside the loop (path.push_back(v) in the code above). I needed a temporary path so that the original path was not changed by the other nodes visited during the iteration.
I also used sets (not much of a difference; it only removes the sorting step).
The fixed function looks like this:
vector<set<int>> BFSPaths(map<int, set<int>> &graph, int head, int tail)
{
vector<set<int>> out;
set<int> init {head};
queue<tuple<int, set<int>>> aQueue;
aQueue.push( make_tuple(head, init) );
while (aQueue.size() > 0)
{
int vertex = get<0>(aQueue.front());
set<int> path = get<1>(aQueue.front());
aQueue.pop();
vector<int> difference;
set_difference(
graph[vertex].begin(), graph[vertex].end(),
path.begin(), path.end(),
back_inserter( difference )
);
for (int v : difference)
{
set<int> tempPath;
tempPath.insert(path.begin(), path.end());
tempPath.insert(tempPath.end(), v);
if (v == tail)
{
out.push_back(tempPath);
}
else
{
tuple<int, set<int>> temp (v, tempPath);
aQueue.push(temp);
}
}
}
return out;
}
The tempPath is now what's passed to the queue or added to the out vector.
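One caveat with set<int> as the path type: a set iterates in sorted order, so the order in which vertices were visited is lost (the two example paths happen to be sorted already). If the visiting order matters, a sketch along these lines keeps each path in a vector and copies it per branch. The name allSimplePaths is made up for this illustration.

```cpp
#include <algorithm>
#include <map>
#include <queue>
#include <vector>

// Hypothetical helper: every simple path from head to tail, found breadth-first.
std::vector<std::vector<int>> allSimplePaths(
        const std::map<int, std::vector<int>>& graph, int head, int tail) {
    std::vector<std::vector<int>> out;
    // Each queue entry is a partial path; its last element is the current vertex.
    std::queue<std::vector<int>> pending;
    pending.push(std::vector<int>{head});
    while (!pending.empty()) {
        const std::vector<int> path = pending.front();
        pending.pop();
        const auto found = graph.find(path.back());
        if (found == graph.end()) continue;
        for (int next : found->second) {
            // Skip vertices that are already on this path.
            if (std::find(path.begin(), path.end(), next) != path.end()) continue;
            // Copy per branch, so siblings do not see each other's vertices.
            std::vector<int> extended = path;
            extended.push_back(next);
            if (next == tail) {
                out.push_back(extended);
            } else {
                pending.push(extended);
            }
        }
    }
    return out;
}
```

The linear std::find per neighbor is fine for small graphs; for long paths a visited set kept alongside each path would be faster.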

Generate all permutation of an array without duplicate result

This is the LeetCode question permutation2.
Given an array num (elements are not unique, such as 1,1,2), return all permutations without duplicate results. For example, num = {1,1,2} should have the permutations {1,1,2},{1,2,1},{2,1,1}.
I came up with the following solution. Basically, I generate the permutations recursively: assuming [0, begin-1] is fixed, recursively generate the permutations of [begin, lastElement].
vector<vector<int> > permuteUnique(vector<int> &num) {
vector<vector<int> > res;
if(num.empty())
return res;
helper(num, 0, res);
return res;
}
//0...begin-1 is already permutated
void helper(vector<int> &num, int begin, vector<vector<int> > &res)
{
if(begin == num.size())
{
res.push_back(num);//This is a permutation
return;
}
for(int i = begin; i<num.size(); ++i)
{
if(i!=begin&&num[i]==num[begin])//if equal, then [begin+1,lastElement] would have same permutation, so skip
continue;
swap(num[i], num[begin]);
helper(num, begin+1, res);
swap(num[i], num[begin]);
}
}
I was wondering whether this is the right solution, since the LeetCode OJ gave me Output Limit Exceeded while Xcode returns the right answer for several cases.
My main concern: can if(i!=begin&&num[i]==num[begin])continue; really skip the duplicate results? If not, what is a counterexample?
Thanks for sharing your thoughts!
With STL, the code may be:
std::vector<std::vector<int> > permuteUnique(std::vector<int> num) {
std::sort(num.begin(), num.end());
std::vector<std::vector<int> > res;
if(num.empty()) {
return res;
}
do {
res.push_back(num);
} while (std::next_permutation(num.begin(), num.end()));
return res;
}
Live demo
Your test is not sufficient to skip duplicates. For the input {2, 1, 1}, you get:
{2, 1, 1}
{1, 2, 1}
{1, 1, 2}
{1, 1, 2}
{1, 2, 1}
So there are 2 duplicates.
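If you want to keep the swap-based recursion, one common fix is to remember which values have already been placed at position begin during the current call, instead of comparing only against num[begin]. This is a sketch, not the only possible repair; the names permuteFrom and permuteUniqueSwap are made up for this illustration.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper: positions [0, begin) are fixed, permute the rest.
void permuteFrom(std::vector<int>& num, std::size_t begin,
                 std::vector<std::vector<int>>& res) {
    if (begin == num.size()) {
        res.push_back(num);  // a complete permutation
        return;
    }
    std::set<int> tried;  // values already placed at `begin` in this call
    for (std::size_t i = begin; i < num.size(); ++i) {
        // insert().second is false when the value was tried before.
        if (!tried.insert(num[i]).second) continue;
        std::swap(num[begin], num[i]);
        permuteFrom(num, begin + 1, res);
        std::swap(num[begin], num[i]);
    }
}

std::vector<std::vector<int>> permuteUniqueSwap(std::vector<int> num) {
    std::vector<std::vector<int>> res;
    if (num.empty()) return res;
    permuteFrom(num, 0, res);
    return res;
}
```

Unlike the std::next_permutation version, this does not need the input to be sorted first, at the cost of one small set per recursion level.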

Good C++ solutions to the "Bring all the zeros to the back of the array" interview challenge

I had an interview for a Jr. development job and he asked me to write a procedure that takes an array of ints and shoves the zeroes to the back. Here are the constraints (which he didn't tell me at the beginning .... As often happens in programming interviews, I learned the constraints of the problem while I solved it lol):
Have to do it in-place; no creating temporary arrays, new arrays, etc.
Don't have to preserve the order of the nonzero numbers (I wish he would've told me this at the beginning)
Setup:
int arr[] = {0, -2, 4, 0, 19, 69};
/* Transform arr to {-2, 4, 19, 69, 0, 0} or {69, 4, -2, 19, 0, 0}
or anything that pushes all the zeros to the back and keeps
all the nonzeros in front */
My answer:
bool f (int a, int b) {return a == 0;}
std::sort(arr, arr+sizeof(arr)/sizeof(int), f);
What are some other good answers?
Maybe the interviewer was looking for this answer:
#include <algorithm>
//...
std::partition(std::begin(arr), std::end(arr), [](int n) { return n != 0; });
If the order needs to be preserved, then std::stable_partition should be used:
#include <algorithm>
//...
std::stable_partition(std::begin(arr), std::end(arr), [](int n) { return n != 0; });
For pre C++11:
#include <functional>
#include <algorithm>
//...
std::partition(arr, arr + sizeof(arr)/sizeof(int),
std::bind1st(std::not_equal_to<int>(), 0));
Live Example
Basically, if the situation is that you need to move items that satisfy a condition to "one side" of a container, then the partition algorithm functions should be high up on the list of solutions to choose (if not the solution to use).
An approach that sorts is O(N log N). There is a linear solution that goes like this:
Set up two pointers, readPtr and writePtr, both initially pointing to the beginning of the array.
Make a loop that walks readPtr up the array to the end. If *readPtr is not zero, copy it to *writePtr and advance both pointers; otherwise, advance only readPtr.
Once readPtr is at the end of the array, walk writePtr to the end of the array, writing zeros to the remaining elements.
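A minimal sketch of those steps, assuming the array is passed as a pointer plus a size (the function name zerosToBack is made up for this illustration):

```cpp
#include <cstddef>

// Hypothetical helper: move all zeros to the back in one pass,
// keeping the nonzero values in their original order.
void zerosToBack(int* arr, std::size_t size) {
    int* const end = arr + size;
    int* writePtr = arr;
    // Copy each nonzero value down to the next free slot.
    for (int* readPtr = arr; readPtr != end; ++readPtr) {
        if (*readPtr != 0) {
            *writePtr = *readPtr;
            ++writePtr;
        }
    }
    // Everything from writePtr to the end is now leftover space: zero it.
    for (; writePtr != end; ++writePtr) {
        *writePtr = 0;
    }
}
```

This version happens to preserve the order of the nonzero values, which the interviewer did not require.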
This is O(n) so it may be what he's looking for:
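auto arrBegin = std::begin(arr);
const auto arrEnd = std::end(arr);
// i counts the slots at the back that have already been given up
for(int i = 0; arrBegin < arrEnd - i;){
    if(*arrBegin == 0){
        // Overwrite the zero with the last unexamined element and
        // stay put: that element may itself be zero
        i++;
        *arrBegin = *(arrEnd - i);
    } else {
        ++arrBegin;
    }
}
std::fill(arrBegin, arrEnd, 0);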