Copy elements from vector based on condition C++ - c++

I'm using C++ to create Hopcroft's algorithm for DFA Minimization.
Part of Hopcroft's algorithm is to initially divide the states into two sets (P with accept and non-accept states and Q with non-accept states only). I already have group P, and from P I'm trying to extract Q. I'm using the following code to do it:
for(int i=0; i<groupP.size(); i++)
    if(groupP[i]->final)
        groupQ.push_back(groupP[i]);
in which groupP and groupQ are:
vector<node*> groupQ;
vector<node*> groupP;
and node is a structure that I've created to represent a node of my automaton. It's guaranteed that the boolean attribute "final" is already correctly set (false for non-final states, true for final states).
Finally, my question is: is it correct to copy one element from a vector to another by doing what I've done? If I modify the content of a copied element from groupP, will this same element be modified in groupQ as well?

Right now, you have vectors of pointers. When you copy from one vector to another, you're copying the pointers, not the elements themselves.
Since you have two pointers referring to the same node, any modification made to a node through one group will be visible through the other; i.e., if you make a change to groupP[i]->foo, then the same change will be visible in groupQ[j]->foo (provided that groupP[i] is one of the elements you copied from groupP to groupQ).
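A minimal sketch of that sharing, assuming a stand-in node type (only the final member comes from the question; foo and the helper name extract_final are made up for illustration):

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the asker's node type.
struct node {
    bool final = false;
    int foo = 0;  // illustrative member, not from the question
};

// Same loop as in the question: copies the *pointers* of final nodes.
std::vector<node*> extract_final(const std::vector<node*>& groupP) {
    std::vector<node*> groupQ;
    for (node* n : groupP) {
        assert(n != nullptr);
        if (n->final)
            groupQ.push_back(n);  // copies the pointer, not the node
    }
    return groupQ;
}
```

After the call, an element of the returned vector and the matching element of groupP compare equal as pointers, so writing through one (say groupP[0]->foo = 42) is observable through the other.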
If you don't want that, you have a couple of choices. One would be to leave groupP and groupQ in the same vector, but partition the vector based on the state of an element's final member:
auto P_end = std::partition(groupP.begin(), groupP.end(),
    [](node *n) { return n->final;});
Then [groupP.begin(), P_end) is groupP (i.e., final==true) and [P_end, groupP.end()) is groupQ (i.e., final==false).
This moves the pointers around (and gives you an iterator so you know the dividing line between the two) so you have exactly one pointer to each element, but they're separated into the two relevant groups.
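As a hedged, compilable version of the partition idea, the sketch below wraps the call in a hypothetical helper (partition_by_final is not from the answer) that returns the split position as an index. Note that std::partition does not preserve the relative order of elements; std::stable_partition does, if that matters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct node {
    bool final = false;  // only this member comes from the question
};

// Reorders groupP so that final nodes come first and returns the index of
// the first non-final node, i.e. the dividing line between the two groups.
std::size_t partition_by_final(std::vector<node*>& groupP) {
    auto P_end = std::partition(groupP.begin(), groupP.end(),
                                [](const node* n) {
                                    assert(n != nullptr);
                                    return n->final;
                                });
    return static_cast<std::size_t>(P_end - groupP.begin());
}
```

Elements at indices [0, split) are then the final states and those at [split, size()) the non-final ones, with exactly one pointer per node.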
As a final possibility, you might want to actually copy elements from groupP to groupQ, creating a new element in the process, so after you copy items from groupP to groupQ, each item you copied exists in two places: there's one element in groupP and one in groupQ. Either one can be modified, but they're separate from each other, so a modification to one has no effect on the other.
The most obvious way to achieve that would be to just use vectors of nodes:
vector<node> groupQ;
vector<node> groupP;
This way, when you copy from one group to the other, you're copying the nodes themselves rather than pointers to nodes, so each copy creates a new, independent node with the same value as an existing node.
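A small sketch of the by-value variant, again with a stand-in node type (foo and copy_final are illustrative names). One caveat worth keeping in mind: if a real node holds pointers to other nodes, such as transitions, the copies would still point at the original targets.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

struct node {
    bool final = false;
    int foo = 0;  // illustrative member, not from the question
};

// Copies the final nodes themselves, so groupQ owns independent objects.
std::vector<node> copy_final(const std::vector<node>& groupP) {
    std::vector<node> groupQ;
    std::copy_if(groupP.begin(), groupP.end(), std::back_inserter(groupQ),
                 [](const node& n) { return n.final; });
    assert(groupQ.size() <= groupP.size());
    return groupQ;
}
```

Here a later change to an element of groupP leaves the corresponding element of groupQ untouched.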

You could use std::copy_if which does the same thing:
std::copy_if(groupP.cbegin(), groupP.cend(),
             std::back_inserter(groupQ),
             [](node* n){ return n->final; });
Since you are manipulating pointers, the elements themselves are shared, so a modification made to a node through one of the containers can be seen from the other.
Note that manipulating raw pointers like you are doing is very error-prone, and you may want to use shared pointers, for instance.
Edit: Adding missing std::back_inserter.
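One way the shared-pointer suggestion might look, as a sketch only: the alias node_ptr and the helper select_final are assumptions, and whether shared ownership fits the automaton's design is a separate question.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <memory>
#include <vector>

struct node {
    bool final = false;  // only this member comes from the question
};

using node_ptr = std::shared_ptr<node>;  // hypothetical alias

// Same copy_if as above, but both vectors share ownership of the nodes.
std::vector<node_ptr> select_final(const std::vector<node_ptr>& groupP) {
    std::vector<node_ptr> groupQ;
    std::copy_if(groupP.cbegin(), groupP.cend(),
                 std::back_inserter(groupQ),
                 [](const node_ptr& n) {
                     assert(n != nullptr);
                     return n->final;
                 });
    return groupQ;
}
```

The nodes are still shared between the two vectors, but a node now stays alive until the last vector holding it lets go, so clearing groupP does not leave dangling pointers in groupQ.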

Related

Loop in std::set by index

I'm making a program that handles dynamic graphs. Nodes and arcs are two classes; they are stored in an array in the graph object, and are all indexed by custom ids (which are that item's position in the array).
Each arc has the ids of the 2 nodes it connects, and each node has a list of the ids of all arcs it's connected to (all stored in sets).
An arc's destructor removes its id from the arcs set of the nodes it connected.
Now I'm writing node's destructor. It should call each arc's destructor until its set is empty. I cannot iterate through the set with an iterator, since at every step the arc destructor removes its id from the set itself.
Hence I'd need to always access the last element until the set is empty; but std::set does not allow indexing like arrays and vectors, and it doesn't have a "back" like lists and stacks. How can I do that?
relevant code:
graph::arc::~arc()
{
    owner->list_node[n1]->remove_arc(id);
    owner->list_node[n2]->remove_arc(id);
    owner->list_arc[id] = nullptr;
}
graph::node::~node()
{
    while (!list_arc.empty())
    {
        owner->remove_arc(list_arc[list_arc.size()-1]); //invalid, that's roughly what i want to achieve
    }
    owner->list_node[id] = nullptr;
}
Notes
owner is the graph object. owner->list_(node/arc) holds the actual pointers. Each item's id is equal to its position in graph's list.
This feels like an error-prone cleanup strategy, but an improvement would probably require rewriting significantly more than what is provided in the question. (I suspect that shared and weak pointers could simplify things.) Since I lack enough information to suggest a better data structure:
For a set, it's easier to access the first element, *list_arc.begin(), than the last. (Not a huge difference, but still easier.)
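The loop could be sketched as follows, with a plain std::set<int> standing in for list_arc and an erase call standing in for owner->remove_arc (both are simplifications; the real removal happens inside the arc destructor). If the last element really is wanted, *list_arc.rbegin() reads it, since std::set provides reverse iterators.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Empties the set one id at a time, always taking the first element.
// Returns the ids in the order they were removed.
std::vector<int> drain(std::set<int>& arcs) {
    std::vector<int> removed;
    while (!arcs.empty()) {
        int id = *arcs.begin();  // copy the id before the set changes
        arcs.erase(id);          // stand-in for owner->remove_arc(id)
        removed.push_back(id);
    }
    assert(arcs.empty());
    return removed;
}
```

Copying the id out before the removal matters: the element it came from is destroyed by the erase, so holding a reference or iterator to it across the call would be invalid.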

C++ How to update member value via map value pointers

I have an unusual graph structure that's made up of a few classes, and I'm trying to set a boolean member value in one of them for the sake of traversal. Let's say the classes are Graph, Node, and Edge. Graph holds an unordered map with string labels as keys and Nodes as values. The graph has bounded degree, so fixed sized arrays of pointers to Edges are kept at each node, and each Edge also has pointers to the Nodes at each end.
My aim is to visit every Edge once, so I maintain a boolean 'marked' flag inside every Edge initially set to false. Since the map in the Graph lets me iterate over Nodes, I wish to iterate over all Edges from every Node and mark each to avoid repeated visits from opposite ends. However, I find that the marks are failing to be recorded and can't seem to get it to work.
My iteration code looks like this:
for(auto it = nodeMap.begin(); it != nodeMap.end(); ++it){
    Node* node = &it->second;
    for (i=0; i< node->EdgeArray.size(); i++){
        if (node->EdgeArray[i]){
            Edge & edge = *(node->EdgeArray[i]);
            if(edge.getMark()) continue;
            [...do needed processing...]
            edge.setMark(true);
        }
    }
}
I am more comfortable with pointers than I am with references, so my original version had 'edge' as a pointer into EdgeArray without the dereferencing. However, some digging led me to understand that passing by reference is used to effect changes on function caller's values. My suspicion was that some kind of similar tweak is needed here, but in this case all of the iteration occurs in a method in the Graph class where the nodeMap is stored. I've tried basically all variations of pointers (dereferenced or not) and references I could think of, but can't seem to get the marked values to persist outside the loop. I.e., if I add a print that depends on the second if conditional, I never see a result from it.
If your previous version worked, have you tried replacing only edge.setMark(true) with node->EdgeArray[i]->setMark(true)?
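For comparison, here is a self-contained sketch in which the marks do persist. The Edge and Node definitions are guesses at the asker's classes (only EdgeArray, getMark, and setMark appear in the question), and the degree bound of 3 is arbitrary. Marking through the stored pointer (or through a reference bound to *pointer) changes the one shared Edge; what would lose the mark is working on a copy, e.g. `Edge edge = *ptr;`.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical minimal versions of the asker's classes.
struct Edge {
    bool marked = false;
    bool getMark() const { return marked; }
    void setMark(bool m) { marked = m; }
};

struct Node {
    std::array<Edge*, 3> EdgeArray{};  // null entries mean "no edge"
};

// Visits every edge once and returns how many edges were processed.
int visit_edges(std::unordered_map<std::string, Node>& nodeMap) {
    int processed = 0;
    for (auto& entry : nodeMap) {               // reference: no Node copies
        for (Edge* edge : entry.second.EdgeArray) {
            if (edge == nullptr || edge->getMark()) continue;
            ++processed;                        // stand-in for the processing
            edge->setMark(true);                // marks the shared Edge
        }
    }
    return processed;
}
```

If a loop shaped like this still shows no marks, the Edge objects reachable from the two end nodes are probably not the same objects, which would point at how the edges are created and stored rather than at the traversal.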

Copy Construction For hashMap in C++

In a recent assignment, we were asked to implement a hashmap in C++ without using the facilities provided by the STL.
I'm stuck on one of the functions, the copy constructor. After searching Google, I found a valid solution in the question:
Writing a valid copy constructor for a hash map in C++
But I can't totally understand it. Could anyone please help explain
1. why we need to use a pointer-to-pointer Node** p = &hashTable[i]; ?
2. what is the logic in the while loop?
3. especially, what does this code p=&c->next; mean?
Firstly, there are many different types of hash table implementations, so any specific one you find online may or may not yield insights into what you'll need to do for your own implementation. That said...
p initially points at the head pointer of the bucket, i.e. the Node* stored at [this->]hashTable[i], and is first used to set that pointer to NULL. As you're dealing with Node*s, a Node** is a natural way to keep track of their locations.
Each iteration of the while loop duplicates the next Node in bucket [i] of hm; the duplicate is created in new memory at c, and *p (which tracks the position in the linked list being built for the *this object under construction) is updated to point to it.
p=&c->next; means p is set to the address of the next member of the newly created Node (at address c): that next pointer must be initialised by the Node(const Node&) constructor to nullptr/NULL/0, or the linked lists created wouldn't terminate properly. Only if there are more colliding elements in the linked list to be copied will the next iteration overwrite *p, and therefore the next member of the previously added Node, with the next value of c.
In summary, you're looking at a loop that copies amountOfBuckets linked lists. If you're not familiar with linked list operations, you'd be better off writing a linked list class first and getting that working, then use it to help implement the hash table.
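The linked solution itself is not reproduced here, so the following is only a sketch of the same pointer-to-pointer technique under assumptions: a fixed bucket count (kBuckets), string keys with int values, and std::hash and std::string used purely to keep the example short (an assignment that forbids the STL would need its own hash and string handling).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

class HashMap {
public:
    HashMap() = default;

    HashMap(const HashMap& hm) {
        for (std::size_t i = 0; i < kBuckets; ++i) {
            Node** p = &hashTable[i];  // slot that receives the next copy
            *p = nullptr;              // start with an empty bucket
            for (const Node* src = hm.hashTable[i]; src != nullptr; src = src->next) {
                Node* c = new Node{src->key, src->value, nullptr};
                *p = c;                // fill the slot: bucket head or previous next
                p = &c->next;          // the following copy goes after c
            }
        }
    }
    HashMap& operator=(const HashMap&) = delete;  // omitted for brevity

    ~HashMap() {
        for (Node* head : hashTable) {
            while (head != nullptr) {
                Node* next = head->next;
                delete head;
                head = next;
            }
        }
    }

    // Prepends to the bucket; duplicate keys are not checked in this sketch.
    void insert(const std::string& key, int value) {
        Node*& head = hashTable[bucket(key)];
        head = new Node{key, value, head};
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    int* find(const std::string& key) {
        for (Node* n = hashTable[bucket(key)]; n != nullptr; n = n->next)
            if (n->key == key) return &n->value;
        return nullptr;
    }

private:
    struct Node {
        std::string key;
        int value;
        Node* next;
    };
    static constexpr std::size_t kBuckets = 8;
    static std::size_t bucket(const std::string& key) {
        return std::hash<std::string>{}(key) % kBuckets;
    }
    Node* hashTable[kBuckets] = {};
};
```

Because p always holds the address of the pointer that should receive the next node, the first node of a bucket and every later node are handled by the same two statements, with no special case for an empty list.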

Difference between ways of sorting linked lists c++

I was thinking about ways of sorting a linked list and I came up with two different ways (using BubbleSort, because I'm relatively new at programming and it is the simplest algorithm for me). Example struct:
struct node {
    int value;
    node *next;
};
The two different methods:
Rearranging the list elements
Doing something like swap(root->value, root->next->value)
I did some Google searches on the subject, and from the looks of it, the first method seems to be more popular. From my experience, such as it is, rearranging the list is more complicated than simply swapping the actual node values. Is there any benefit in rearranging the whole list, and if yes, what is it?
I can think of two advantages:
1) Other pointers might exist, pointing to nodes in this list. If you rearrange the list, these pointers will still point to the same values they pointed to before the sorting; if you swap values, they won't. (Which one of these two is better depends on the details of your design, but there are designs in which it is better if they remain pointing to the same values.)
2) It doesn't matter much for a list of mere ints, but eventually you might be sorting a list of more complex things, so that swapping values is very expensive or even impossible.
As answered by Beta, it's better to rearrange the nodes (via the next pointers) than it is to swap node data.
If actually using a bubble sort or any sort that "swaps" nodes via the pointers, first swap the next (or head) pointers that point to the two nodes to be swapped, then swap those two nodes' next pointers. This handles both the adjacent node case where 3 pointers are rotated, and the normal case where 2 pairs of pointers are swapped.
Another simple option is to create a new empty list (node * pNew = NULL;) for the sorted list. Remove a node from the original list one at a time and insert that node into the sorted list in order, or scan the original list for the largest node, remove that node and prepend the sorted list with that node.
If the list is large and speed is important, then bottom-up merge sorts are much faster.
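The first variant of the new-list approach (remove a node, insert it in order) might look like the sketch below. It uses the node struct from the question; the function name sort_list is made up. Only next pointers change, so outside pointers to a node keep seeing the same value.

```cpp
#include <cassert>

struct node {
    int value;
    node *next;
};

// Insertion sort that relinks nodes into a new list; no values are
// swapped or copied. Returns the head of the sorted list.
node* sort_list(node* head) {
    node* pNew = nullptr;              // sorted list, initially empty
    while (head != nullptr) {
        node* cur = head;              // detach the first remaining node
        head = head->next;

        node** p = &pNew;              // find the link where cur belongs
        while (*p != nullptr && (*p)->value <= cur->value)
            p = &(*p)->next;

        cur->next = *p;                // splice cur in at that link
        *p = cur;
    }
    return pNew;
}
```

Using <= when scanning keeps equal values in their original relative order. The cost is quadratic in the worst case, like bubble sort, which is why the merge sort suggestion matters for large lists.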

Does appending a list to another list in F# incur copying of underlying objects or just the pointers?

I've always thought that appending a list to another one meant copying the objects from the first list and then pointing to the appended list as described for example here.
However, in this blog post and in its comment, it says that it is only the pointers that are copied and not the underlying objects.
So what is correct?
Drawing from Snowbear's answer, a more accurate image of combining two lists (than the one presented in the first referred article in the question) would be as shown below.
let FIRST = [1;2;3]
let SECOND = [4;5;6]
let COMBINED = FIRST @ SECOND
In the functional world, lists are immutable. This means that node sharing is possible because the original lists will never change. Because the first list ends with the empty list, its nodes must be copied in order to point its last node to the second list.
If you mean this statement then the answer seems to be pretty simple. The author of the first article is talking about list node elements when he says nodes. A node element is not the same as the list item itself. Take a look at the pictures in the first article. There are arrows going from every element to the next node. These arrows are pointers. But the integer type (which is put into the list) has no such pointers. There is probably some list node type which wraps those integers and stores the pointers. When the author says that nodes must be copied he is talking about these wrappers being copied. The underlying objects (if they were not value types as in this case) would not be cloned; the new wrappers will point to the same objects as before.
F# lists hold references (not to be confused with F#'s ref) to their elements; list operations copy those references (pointers), but not the elements themselves.
There are two ways you might append items to an existing list, which is why there seems to be a discrepancy between the articles (though they both look to be correct):
Cons operator (::): The cons operator prepends a single item to an F# list, producing a new list. It's very fast (O(1)), since it only needs to call a very simple constructor to produce the new list.
Append operator (@): The append operator appends two F# lists together, producing a new list. It's not as fast (O(n)) because in order for the elements of the combined list to be ordered correctly, it needs to traverse the entire list on the left-hand-side of the operator (so copying can start at the first element of that list). You'll still see this used in production if the list on the left-hand-side is known to be very small, but in general you'll get much better performance from using ::.
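To make the node-versus-element distinction concrete, here is a rough C++ analogy (not F#, and not how the F# runtime is implemented): an immutable singly linked list whose cells hold a reference to the element and a reference to the tail. All names (Cell, List, Item, cons, append) are invented for the sketch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Cell;
using List = std::shared_ptr<const Cell>;         // empty list == nullptr
using Item = std::shared_ptr<const std::string>;  // reference to an element

struct Cell {
    Item item;  // reference to the element, never a copy of it
    List tail;  // rest of the list
};

// Analogue of (::): one new cell, O(1); the tail is shared as is.
List cons(Item item, List tail) {
    return std::make_shared<Cell>(Cell{std::move(item), std::move(tail)});
}

// Analogue of (@): copies the cells of `left`, O(length of left);
// the elements and the whole `right` list are shared, not copied.
List append(const List& left, const List& right) {
    if (!left) return right;
    return cons(left->item, append(left->tail, right));
}
```

With first = [a; b] and second = [c], append(first, second) allocates two new cells, both of which refer to the original a and b, and its last new cell points directly at second.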