Efficiently mark nodes in a graph that must no longer be considered - c++

I have a graph and iterate over every node multiple times until I eventually mark it as finished and ignore it in future iterations. This process is repeated until all nodes are marked.
So far, I have a std::vector that stores the status of every node: finished[v] = 1 when the node is finished and 0 otherwise. The code looks like this:
for every node v {
if finished[v] == 0 {
[...]
}
}
The problem is that near the end of the computation only a few nodes are still unmarked, but I still check every single one for finished[v] == 0.
Would it be better to save all node ids in a vector and then remove them until the vector is empty (I heard that removing elements from a vector is not really efficient)?
Since I already store the number of finished nodes as a simple integer, I could just move all marked nodes to the end of the vector and cut it (at position totalNumberOfNodes - numberOfFinishedNodes), in case moving elements is more efficient than deleting them. Or is a vector simply inferior to other data structures in this scenario?

Using std::list<T>:
#include <list>
std::list<int> unvisited_nodes;
// fill in unvisited_nodes with all nodes' ids
while (!unvisited_nodes.empty()) // outer loop of your algorithm
{
// iterate only over unvisited nodes
for (auto it = unvisited_nodes.begin(); it != unvisited_nodes.end(); )
{
visit(*it);
if (shouldNotBeVisitedAgain(*it))
{
unvisited_nodes.erase(it++);
}
else
{
++it;
}
}
}
Using your std::vector<T>:
#include <vector>
std::vector<int> unvisited_nodes;
// fill in unvisited_nodes with all nodes' ids
while (!unvisited_nodes.empty()) // outer loop of your algorithm
{
// iterate only over unvisited nodes
for (std::size_t i = 0; i < unvisited_nodes.size(); ) // size() is unsigned, so use an unsigned index
{
visit(unvisited_nodes[i]);
if (shouldNotBeVisitedAgain(unvisited_nodes[i]))
{
std::swap(unvisited_nodes[i], unvisited_nodes.back());
unvisited_nodes.pop_back();
}
else
{
++i;
}
}
}
Regarding removal of elements from std::vector<T>: this operation has O(N) complexity only if you want to preserve the original order of the elements. If the order after removal does not need to stay the same, it can be done in O(1):
std::vector<int> v = { 1, 2, 3, 4, 5, 6, 7 } ;
// now, let's remove element under index 3, v[3] == 4:
std::swap(v[3], v.back());
v.pop_back();
// now v == { 1, 2, 3, >7<, 5, 6 }

If you need them to stay in a particular order: a linked list may be the only efficient solution (you can consider other data structures like "ropes" if you want, but I suspect you won't want to implement them).
If you only need them to stay in sorted order: std::multiset should also work; just remove the elements that you've visited.
If you don't care about order at all: just keep a vector of the indices of all the nodes to be processed, but instead of actually erasing an element from the middle, swap it with the last element and then pop the back of the vector.
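If removals are batched once per pass instead of being done one at a time, a plain vector can also keep its order: the erase-remove idiom compacts the surviving ids in a single sweep over the nodes that are still active. The following is only a minimal sketch under assumptions: `process_all` and the per-node counter `remaining_work` are hypothetical stand-ins for the real visit and the real "is this node finished?" test.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model: node v is finished once remaining_work[v] drops to 0.
// Returns the total number of node visits performed.
std::size_t process_all(std::vector<int>& remaining_work) {
    // Ids of the nodes that still have to be visited.
    std::vector<int> active(remaining_work.size());
    for (std::size_t i = 0; i < active.size(); ++i) {
        active[i] = static_cast<int>(i);
    }

    std::size_t visits = 0;
    while (!active.empty()) {
        // One pass: visit only the nodes that are still active.
        for (int v : active) {
            --remaining_work[v];
            ++visits;
        }
        // Drop all finished nodes in one sweep; the relative order is kept.
        active.erase(std::remove_if(active.begin(), active.end(),
                                    [&](int v) { return remaining_work[v] <= 0; }),
                     active.end());
    }
    return visits;
}
```

With counters {1, 3, 2} this performs 6 visits in total, whereas three passes over a finished[] array of all three nodes would perform 9 checks.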

Related

Most efficient algorithm for Two-sum problem (involving indices)

The problem statement is given an array and a given sum "T", find all the pairs of indices of the elements in the array which add up to T. Additional requirements/constraints:
Indexing starts from 0
The indices must be displayed with lower index first (Ex: 24, 30 instead of 30, 24)
The indices must be displayed in ascending order (Ex: if we find (1,3), (0,2) and (5,8) the output must be (0,2) (1,3) (5,8)
There can be duplicate elements in the array, which also have to be considered
Here's my code in C++, I used the hash-table approach using unordered_set:
void Twosum(vector <int> res, int T){
int temp; int ti = -1;
unordered_set<int> s;
vector <int> res2 = res; //Just a copy of the input vector
vector <tuple<int, int>> indices; //Result to be output
for (int i = 0; i < (int)res.size(); i++){
temp = T - res[i];
if (s.find(temp) != s.end()){
while(ti < (int)res.size()){ //While loop for finding all the instances of temp in the array,
//not part of the original hash-table algorithm, something I added
ti = find(res2.begin(), res2.end(), temp) - res2.begin();
//Here find() takes O(n) time which is an issue
res2[ti] = lim; //To remove that instance of temp so that new instances
//can be found in the while loop, here lim = 10^9
if(i <= ti) indices.push_back(make_tuple(i, ti));
else indices.push_back(make_tuple(ti, i));
}
}
s.insert(res[i]);
}
if(ti == -1)
{cout<<"-1 -1"; //if no indices were found
return;}
sort(indices.begin(), indices.end()); //sorting since unordered_set stores elements randomly
for(int i=0; i<(int)indices.size(); i++)
cout<<get<0>(indices[i])<<" "<<get<1>(indices[i])<<endl;
}
This has multiple issues:
firstly that while loop doesn't work as intended, instead it shows SIGABRT error (free(): invalid pointer). The ti index is also somehow going beyond the vector bounds, even though I have that check in the while loop.
Secondly the find() function works in O(n) time, which increases the overall complexity to O(n^2), which is causing my program to timeout during execution. However that function is required since we have to output indices.
Lastly this unordered-set implementation doesn't seem to work when there are many duplicate elements in the array (since sets only take unique elements), which is one of the main constraints of the problem. This makes me think we need some sort of hash function or hashmap to deal with the duplicates? I'm not sure...
All the different algorithms I've found for this on the internet have dealt with just printing the elements and not the indices, hence I've had no luck with this problem.
If any of you know an optimal algorithm for this while also satisfying the constraints and running under O(n) time, your help would be highly appreciated. Thank you in advance.
Here is some pseudocode answering your question, using hash tables (or maps) and sets. I leave it to you to translate this to C++ with suitable data structures (in this case, the standard hash maps and sets will do the job well).
Notations: we will denote A the array, n its length, and T the "sum".
// first we build a map element -> {set of indices corresponding to this element}
Let M be an empty map; // or hash map, or hash table, or dictionary
for i from 0 to n-1 do {
Let e = A[i];
if e is not a key of M then {
M[e] = new_set()
}
M[e].add(i)
}
// Now we iterate over the elements
for each key e of M do {
if e <= T-e and T-e is a key of M then { // e <= T-e: handle each pair of values only once
display_combinations(M[e], M[T-e]);
}
}
// The helper function display_combinations
function display_combinations(set1, set2) {
for each element e1 of set1 do {
for each element e2 of set2 do {
if e1 < e2 then {
display "(e1, e2)";
} else if e1 > e2 then {
display "(e2, e1)";
}
}
}
}
As said in the comments, the complexity in the worst case of this algorithm is in O(n²). A way to see that we cannot go below this complexity is that the size of the output may be in O(n²), in the case where all elements of the array have the value T/2.
Edit: this pseudocode does not output the pairs in the required order. Just store them in an array of pairs and sort this array before displaying it. Pairs can also be produced more than once (for example when e = T-e, where both sets are the same), so remove duplicates after sorting. Also, I did not treat the case where a pair (i, i) may satisfy the requirement. You may have to consider it (just replace e1 > e2 with e1 >= e2 in the last loop).
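One possible C++ translation of this idea is sketched below. It is an illustration rather than a definitive implementation: the name `two_sum_pairs` is made up, pairs of the form (i, i) are assumed not to count, and the result is returned as a sorted vector instead of being printed. Each pair of values is visited once by only handling a value whose complement is not smaller than itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Returns all index pairs (i, j) with i < j and a[i] + a[j] == target,
// sorted in ascending order.
std::vector<std::pair<int, int>> two_sum_pairs(const std::vector<int>& a, int target) {
    // value -> indices where it occurs (ascending, because we scan left to right).
    // long long keys avoid overflow when computing target - value.
    std::unordered_map<long long, std::vector<int>> indices_of;
    for (std::size_t i = 0; i < a.size(); ++i) {
        indices_of[a[i]].push_back(static_cast<int>(i));
    }

    std::vector<std::pair<int, int>> pairs;
    for (const auto& entry : indices_of) {
        const long long value = entry.first;
        const long long complement = static_cast<long long>(target) - value;
        if (complement < value) continue;  // handled when the smaller value is visited
        const auto partner = indices_of.find(complement);
        if (partner == indices_of.end()) continue;

        const std::vector<int>& first = entry.second;
        const std::vector<int>& second = partner->second;
        if (complement == value) {
            // Same value on both sides: pair up distinct indices of one list.
            for (std::size_t x = 0; x < first.size(); ++x)
                for (std::size_t y = x + 1; y < first.size(); ++y)
                    pairs.emplace_back(first[x], first[y]);
        } else {
            for (int i : first)
                for (int j : second)
                    pairs.emplace_back(std::min(i, j), std::max(i, j));
        }
    }
    std::sort(pairs.begin(), pairs.end());  // ascending output order
    return pairs;
}
```

For the array {1, 3, 2, 2, 3} and T = 4 this yields (0,1) (0,4) (2,3). The worst case remains quadratic, because the output itself can contain a quadratic number of pairs.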

how to get pairs of elements efficiently in a linked list (Haxe)

I have a list of objects and I would like to return each possible unique pair of objects within this list. Is the following the most efficient way to do that in Haxe?
for (elem1 in my_list)
{
for (elem2 in my_list)
{
if (elem1 == elem2)
{
break;
}
trace(elem1, elem2);
}
}
I would rather avoid the equality check if possible. The reason that I am not using arrays or vectors is that these lists will be added to/removed from very frequently and I have no need for index level access.
If you want to be efficient (the fewest iterations), you could loop like this:
for (i in 0 ... my_list.length-1) // loop to total minus 1
for (j in i+1 ... my_list.length) // start 1 further than i, loop to end
if (my_list[i] != my_list[j]) // not match
[my_list[i], my_list[j]]; // make pair
By the way, whether a linked list or an array is actually faster depends on the content, since this now uses indexes. You should test and measure it in your case (don't assume anything if it's a performance-critical piece of code).
try online:
http://try.haxe.org/#2Ab3F
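The same pairing can also be done on a linked list without indexes and without an equality check, by starting the inner iterator one position after the outer one. The sketch below shows the idea in C++ with std::list rather than in Haxe; the function name `unique_pairs` is made up for illustration.

```cpp
#include <iterator>
#include <list>
#include <utility>
#include <vector>

// Collects every unordered pair of distinct positions exactly once.
template <typename T>
std::vector<std::pair<T, T>> unique_pairs(const std::list<T>& items) {
    std::vector<std::pair<T, T>> pairs;
    for (auto first = items.begin(); first != items.end(); ++first) {
        // Start right after `first`, so no pair is repeated or mirrored.
        for (auto second = std::next(first); second != items.end(); ++second) {
            pairs.emplace_back(*first, *second);
        }
    }
    return pairs;
}
```

For a list of n elements this visits n(n-1)/2 pairs, which is the minimum, since every pair has to be produced once.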

Removing multiple elements from stl list while iterating

This is not similar to Can you remove elements from a std::list while iterating through it?. Mine is a different scenario.
Let's say I have a list like this.
1 2 3 1 2 2 1 3
I want to iterate over this STL list in such a way that
when I first encounter an element X, I do some activity, then remove all the elements X from the list and continue iterating. What's an efficient way of doing this in C++?
I am worried that when I do a remove or an erase I will invalidate the iterators. If it were only one element, I could increment the iterator and then erase. But in my scenario I would need to erase all the occurrences.
I was thinking of something like this:
while (!list.empty()) {
int num = list.front();
// Do some activity and if successful
list.remove(num);
}
I don't know if this is the best.
Save a set of seen numbers and if you encounter a number in the set ignore it. You can do as follows:
list<int> old_list = {1, 2, 3, 1, 2, 2, 1, 3};
list<int> new_list;
set<int> seen_elements;
for(int el : old_list) {
if (seen_elements.find(el) == seen_elements.end()) {
seen_elements.insert(el);
new_list.push_back(el);
}
}
return new_list;
This will process each value only once and the new_list will only contain the first copy of each element in the old_list. This runs in O(n*log(n)) because each iteration performs a set lookup (you can make this O(n) by using a hashset). This is significantly better than the O(n^2) that your approach runs in.
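The same idea can also be applied in place, without building a second list. std::list::erase returns the iterator that follows the removed element, so the loop never uses an invalidated iterator. This is a minimal sketch that assumes, like the code above, that the first copy of each value stays in the list; `process_first_occurrences` and the callback `do_activity` are hypothetical names.

```cpp
#include <list>
#include <unordered_set>

// Calls do_activity once per distinct value and erases all later duplicates.
template <typename Action>
void process_first_occurrences(std::list<int>& values, Action do_activity) {
    std::unordered_set<int> seen;
    for (auto it = values.begin(); it != values.end(); ) {
        if (seen.insert(*it).second) {
            // First time this value shows up: handle it and keep it.
            do_activity(*it);
            ++it;
        } else {
            // Value already handled: erase it and continue with the next element.
            it = values.erase(it);
        }
    }
}
```

With a hash set the whole pass takes O(n) expected time for n list elements.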

Finding maximum values of rests of array

For example:
array[] = {3, 9, 10, 12, 1, 4, 7, 2, 6, 5}
First I need the maximum value, 12. Then I need the maximum value among the rest of the array (1, 4, 7, 2, 6, 5), which is 7, then the maximum of the rest after that, 6, and then 5. In the end I need the series of these values, so the result is (12, 7, 6, 5).
How to get these numbers?
I have tried the following code, but it seems to go on endlessly.
I think I'll need a recursive function, but how can I do this?
max=0; max2=0;...
for(i=0; i<array_length; i++){
if (matrix[i] >= max)
max=matrix[i];
else {
for (j=i; j<array_length; j++){
if (matrix[j] >= max2)
max2=matrix[j];
else{
...
...for if else for if else
...??
}
}
}
}
This is how you would do that in C++11 by using the std::max_element() standard algorithm:
#include <iterator>
#include <algorithm>
#include <iostream>
int main()
{
int arr[] = {3,5,4,12,1,4,7,2,6,5};
auto m = std::begin(arr);
while (m != std::end(arr))
{
m = std::max_element(m, std::end(arr));
std::cout << *(m++) << std::endl;
}
}
This is an excellent spot to use the Cartesian tree data structure. A Cartesian tree is a data structure built out of a sequence of elements with these properties:
The Cartesian tree is a binary tree.
The Cartesian tree obeys the heap property: every node in the Cartesian tree is greater than or equal to all its descendants.
An inorder traversal of a Cartesian tree gives back the original sequence.
For example, given the sequence
4 1 0 3 2
The Cartesian tree would be
4
 \
  3
 / \
1   2
 \
  0
Notice that this obeys the heap property, and an inorder walk gives back the sequence 4 1 0 3 2, which was the original sequence.
But here's the key observation: notice that if you start at the root of this Cartesian tree and start walking down to the right, you get back the number 4 (the biggest element in the sequence), then 3 (the biggest element in what comes after that 4), and the number 2 (the biggest element in what comes after the 3). More generally, if you create a Cartesian tree for the sequence, then start at the root and keep walking to the right, you'll get back the sequence of elements that you're looking for!
The beauty of this is that a Cartesian tree can be constructed in time Θ(n), which is very fast, and walking down the spine takes time only O(n). Therefore, the total amount of time required to find the sequence you're looking for is Θ(n). Note that the approach of "find the largest element, then find the largest element in the subarray that appears after that, etc." would run in time Θ(n²) in the worst case if the input was sorted in descending order, so this solution is much faster.
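As a rough illustration of the construction and the walk described above, the sketch below builds the tree with the usual stack-based method and then follows the right children from the root. The function name `right_spine_maxima` and the index-based child arrays are choices made for this example only.

```cpp
#include <cstddef>
#include <vector>

// Builds a max-Cartesian tree of seq in linear time and returns the values
// found by walking from the root along right children.
std::vector<int> right_spine_maxima(const std::vector<int>& seq) {
    const int none = -1;
    std::vector<int> left(seq.size(), none), right(seq.size(), none);
    std::vector<int> spine_stack;  // indices on the current right spine, root first

    for (std::size_t k = 0; k < seq.size(); ++k) {
        const int i = static_cast<int>(k);
        int last_popped = none;
        // Everything smaller than seq[i] at the end of the spine moves below i.
        while (!spine_stack.empty() && seq[spine_stack.back()] < seq[i]) {
            last_popped = spine_stack.back();
            spine_stack.pop_back();
        }
        left[i] = last_popped;  // the popped chain becomes the left subtree of i
        if (!spine_stack.empty()) right[spine_stack.back()] = i;
        spine_stack.push_back(i);
    }

    std::vector<int> result;
    if (seq.empty()) return result;
    // The bottom of the stack is the root; follow right children to the end.
    for (int node = spine_stack.front(); node != none; node = right[node]) {
        result.push_back(seq[node]);
    }
    return result;
}
```

For the array from the question, {3, 9, 10, 12, 1, 4, 7, 2, 6, 5}, the walk yields 12, 7, 6, 5. Each index is pushed and popped at most once, which is where the linear bound comes from.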
Hope this helps!
If you can modify the array, your code will become simpler. Whenever you find a max, output that and change its value inside the original array to some small number, for example -MAXINT. Once you have output the number of elements in the array, you can stop your iterations.
std::vector<int> output;
for (auto i : array)
{
auto pos = std::find_if(output.rbegin(), output.rend(), [i](int n) { return n > i; }).base();
output.erase(pos,output.end());
output.push_back(i);
}
Hopefully you can understand that code. I'm much better at writing algorithms in C++ than describing them in English, but here's an attempt.
Before we start scanning, output is empty. This is the correct state for an empty input.
We start by looking at the first unlooked at element I of the input array. We scan backwards through the output until we find an element G which is greater than I. Then we erase starting at the position after G. If we find none, that means that I is the greatest element so far of the elements we've searched, so we erase the entire output. Otherwise, we erase every element after G, because I is the greatest starting from G through what we've searched so far. Then we append I to output. Repeat until the input array is exhausted.

std::list - Sort one item

Is it possible to sort a list based off one item?
For instance, if I have
1,3,2,4,5,6,7 ... 1000000
And I know that 3 is the second element, is it possible to efficiently sort 3 into its correct position between 2 and 4 without re-sorting the entire list?
EDIT: I should also note that, in this scenario, it is assumed that the rest of the list is already sorted; it is simply the 3 that is now out of place.
You could simply find that unordered object (O(n)), take the object out (O(1)), find the correct position (O(n)), then insert it again (O(1)).
Assuming C++11,
#include <list>
#include <algorithm>
#include <iostream>
int main() {
std::list<int> values {1, 2, 3, 9, 4, 5, 6, 7, 12, 14};
auto it = std::is_sorted_until(values.begin(), values.end());
auto val = *--it;
// ^ find that object.
auto next = values.erase(it);
// ^ remove it.
auto ins = std::lower_bound(next, values.end(), val);
// ^ find where to put it back.
values.insert(ins, val);
// ^ insert it.
for (auto value : values) {
std::cout << value << std::endl;
}
}
Before C++11 you need to implement std::is_sorted_until yourself.
For this very limited case, writing your own bubblesort would probably be faster than std::sort.
If you have that level of knowledge, why don't you just swap the items yourself rather than trying to coerce sort to do it for you?
Surely that's a better way.
Even if you don't know where it has to go, you can find that out quickly, remove it, then insert it at the correct location.
I suppose you could use the insert method to move the element, but I'd like to know more about the way you calculate its "correct" position: there could be a better suited algorithm.
If you think about the traversal possible for a list, it's clearly only end-to-end. So:
if you don't know where the mis-ordered element is you have to first scan through the elements one by one until you find it, then
you can remember the value and delete the out-of-order element from the list, then
there are two possibilities:
that element is greater in your sorting order than any other element you've yet encountered, in which case you need to keep going through the remaining elements until you find the correct place to insert it.
the element would belong somewhere amongst the elements you've already passed over, in which case:
you can move backwards, or forwards from the first element again, until you find the correct place to put it.
if you've created some records from your earlier traversal you can instead use it to find the insertion place faster, for example: if you've created a vector of list iterators, you can do a binary search in the vector. Vectors of every Nth element, hash tables etc. are all other possibilities.
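The steps above can be sketched with std::list::splice, which relinks the node instead of erasing and reinserting a copy. This is only a sketch under assumptions: the position of the misplaced element is already known, the rest of the list is sorted in ascending order, and `reposition` is a made-up name. It scans from the front, so it works whether the element has to move forwards or backwards.

```cpp
#include <cassert>
#include <list>

// Moves the single out-of-place element to its sorted position.
// Precondition: all other elements are already in ascending order.
void reposition(std::list<int>& values, std::list<int>::iterator misplaced) {
    assert(misplaced != values.end());
    // Find the first other element that is greater than the misplaced value.
    auto dest = values.begin();
    while (dest != values.end() && (dest == misplaced || *dest <= *misplaced)) {
        ++dest;
    }
    // Relink the node in front of dest: O(1), no copy, iterators stay valid.
    values.splice(dest, values, misplaced);
}
```

The search is still O(n), as it has to be for a list, but the move itself is constant time.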
This applies if you don't use std::list.
With a selection sort algorithm, you simply sort items 0 to 3 (selectionSort(list, 3)) if you know that that's the range,
not the entire range to the end.
Sample code :
#include <iostream>
using namespace std;
void selectionSort(int *array,int length)//selection sort function
{
int i,j,min,minat;
for(i=0;i<(length-1);i++)
{
minat=i;
min=array[i];
for(j=i+1;j<(length);j++) //select the min of the rest of array
{
if(min>array[j]) //ascending order for descending reverse
{
minat=j; //the position of the min element
min=array[j];
}
}
int temp=array[i] ;
array[i]=array[minat]; //swap
array[minat]=temp;
}
}
void printElements(int *array,int length) //print array elements
{
int i=0;
for(i=0;i<length;i++) cout<<array[i]<<" ";
cout<<" \n ";
}
int main(void)
{
int list[]={1,3,2,4,5,6}; // array to sort
int number_of_elements=6;
cout<<" \nBefore selectionSort\n";
printElements(list,number_of_elements);
selectionSort(list,3);
cout<<" \nAfter selectionSort\n";
printElements(list,number_of_elements);
cout<<" \nPress any key to continue\n";
cin.ignore();
cin.get();
return 0;
}
Output:
Before selectionSort
1 3 2 4 5 6
After selectionSort
1 2 3 4 5 6
Press any key to continue