Single linked lists & time complexity - c++

I'm trying to write my own (as close to standard as possible) single linked list implementation. However I am wondering what time complexity people expect of such a list?
Especially for inserting I am wondering how I should implement it. I've read some locations around the internet, where some say inserting is O(1) while others say O(n) - all agree that a double linked list is O(1). However I think O(1) is the case for single linked lists too?
As long as you know the preceding node, you just let the preceding node point to the new node, and the new node points to where the preceding node originally pointed.
That said it makes me wonder how people expect insert to behave? Normally it inserts elements BEFORE the given iterator. However with a single-linked-list it is hard to do so (one would have to go through O(n) time to get the preceding element & then use above method). Is it common in such lists to make insert place items behind the current iterator? Or -probably better- is there another common function for this?

The complexity of the insertion depends on what you need to do. If you know preceding node (actually, a suitable handle to change the preceding node's "next pointer" is all you need), the complexity is O(1). If you need to find the location where to insert, the complexity is O(n).
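A minimal sketch of the O(1) case, assuming a hand-rolled node type (Node, insert_after and destroy are hypothetical names for illustration, not part of any library):

```cpp
// Hypothetical node type for illustration only.
struct Node {
    int value;
    Node* next;
};

// Insert a new node directly after `prev` in O(1).
// Only prev->next is read and rewritten; no traversal is needed.
Node* insert_after(Node* prev, int value) {
    Node* node = new Node{value, prev->next};
    prev->next = node;
    return node;
}

// Free every node reachable from `head`.
void destroy(Node* head) {
    while (head) {
        Node* next = head->next;
        delete head;
        head = next;
    }
}
```

If the caller only has the node that should follow the new element, it first has to walk from the head to find the predecessor, and that walk is the O(n) part.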
With respect to the expectation, I would expect insert() to behave the same as for doubly linked lists, but I also realize that you can't achieve this: you either need a different time complexity (to find the predecessor node) or different iterator invalidation semantics (i.e., iterators to other nodes get invalidated). I think the C++11 std::forward_list class template went for a different interface (insert_after(), erase_after(), before_begin()) while retaining the guarantees on iterator validity.
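For illustration, a short sketch of how the std::forward_list interface is typically used (forward_list_demo is a made-up function name; the member functions are the standard ones):

```cpp
#include <forward_list>
#include <vector>

// Returns the list contents after a few constant-time insertions.
std::vector<int> forward_list_demo() {
    std::forward_list<int> fl{10, 30};

    auto first = fl.begin();                 // refers to 10
    fl.insert_after(first, 20);              // 10 20 30
    fl.insert_after(fl.before_begin(), 5);   // 5 10 20 30 (insert at the front)

    // `first` was not invalidated by the insertions and still refers to 10.
    *first += 1;                             // 5 11 20 30
    return std::vector<int>(fl.begin(), fl.end());
}
```

before_begin() exists precisely so that "insert after" can also express insertion in front of the first element.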
To briefly explain why iterator validity can be affected: An iterator doesn't have to know only about the current node. Instead, it could, for example, point to the predecessor's next pointer. When dereferencing the iterator, it would first dereference its pointer to the next pointer and then that pointer to get hold of the actual node. In return, it is possible to insert in front of the iterator because the iterator knows which next pointer to update. Unfortunately, this means that iterators may get invalidated because the pointer they point to may have changed and they would reference a different node (when erasing nodes, the iterator may even become entirely invalid although the node it referenced is still there).
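A rough sketch of that idea, assuming a hypothetical Position handle that stores the address of the link leading to the current node (none of these names come from the standard library):

```cpp
struct Node {
    int value;
    Node* next;
};

// Hypothetical handle: holds the address of the pointer that leads to the
// current node (either the list head or some node's `next` member).
struct Position {
    Node** link;
    // Two dereferences: first the link, then the node it points to.
    Node* node() const { return *link; }
};

// Insert in front of the node the position refers to, in O(1).
void insert_before(Position pos, int value) {
    *pos.link = new Node{value, *pos.link};
}
```

After insert_before(p, ...), p.node() yields the newly inserted node rather than the node it referred to before, which is the kind of changed validity semantics described above.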

Related

Check for end-of-list in boost::intrusive::list without container?

I'm getting started with Boost.Intrusive, specifically interested in the doubly-linked list (boost::intrusive::list).
This would be trivial to do in a "hand-rolled" linked list, but so far I can't find a Boost equivalent:
Given a node that belongs to a list, how do I check to see if it represents the end of the list, without needing the owning container.
In a hand-made list, this would be as simple as checking if the "next" pointer is NULL.
With boost::intrusive::list, there is the s_iterator_to function, which converts a plain node to an iterator. And you can check that against mylist.end(), which gives the desired result, but it requires a reference to the list container itself.
I also note that using operator++ on such an iterator simply produces a garbage value once it is moved past the end — no error or assert from Boost.
After some more research and thought, it seems that there is no way to do what I want with the standard boost::intrusive::list functionality.
The list provided is, in fact, a circular linked list, not a linear one. So, there is no "null pointer" at the end.
The implementation seems to follow a similar design to the Linux kernel's list.h. You always need a reference to the container object because that contains the "head" of the circular list, which is a special node containing no user data. This is also the node that represents end() during traversal.
As to why this design is chosen, I haven't found any hard evidence. Seemingly, the circular list design allows a simpler implementation, with fewer branches. See, for example, this old article, which says "The circular nature of the list makes inserting and removing nodes simple and branch free."
I am not fully convinced by that, since I think using "pointer-to-pointer" style handling can avoid the branches, too. But that's how it's done in boost::intrusive::list, regardless.
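To make the circular design concrete, here is a minimal sketch with a sentinel head node. Hook, init, insert_before, unlink and is_last are hypothetical names; this is not Boost.Intrusive's actual implementation, only an illustration of the same idea:

```cpp
// Hypothetical minimal hook with the two links of a doubly-linked list.
struct Hook {
    Hook* prev;
    Hook* next;
};

// An empty list is a sentinel that points to itself.
void init(Hook& head) { head.prev = head.next = &head; }

// Link `node` in front of `pos`. No null checks are needed because every
// node, including the sentinel, always has valid neighbours.
void insert_before(Hook& pos, Hook& node) {
    node.prev = pos.prev;
    node.next = &pos;
    pos.prev->next = &node;
    pos.prev = &node;
}

// Unlink `node` from whatever list it is in, again without branches.
void unlink(Hook& node) {
    node.prev->next = node.next;
    node.next->prev = node.prev;
}

// Detecting the last element requires the sentinel, i.e. the container.
bool is_last(const Hook& head, const Hook& node) { return node.next == &head; }
```

The sketch shows both sides of the trade-off: insertion and removal have no special cases, but a node alone cannot tell whether its neighbour is the sentinel.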

C++: can I reuse / move an std::list element from middle to end?

I'm optimising constant factors of my LRU-cache implementation, where I use std::unordered_map to store ::iterators to std::list, which are guaranteed to remain valid even as nearby elements are added or removed. This results in O(n) runtime overall (O(1) per operation), so I'm going after the constant factors.
I understand that each iterator is basically a pointer to the structure that holds my stuff. Currently, to move a given element to the back of the linked list, I call l.erase(it) with the iterator, and then allocate a new pair w/ make_pair(key, value) to l.push_back() or l.emplace_back() (not too sure of the difference), and get the new iterator back for insertion into the map w/ prev(l.end()) or --l.end().
Is there a way to re-use an existing iterator and the underlying doubly-linked list structure that it points to, instead of having to destroy it each time as per above?
(My runtime is currently 56ms (beats 99.78%), but the best C++ submission on leetcode is 50ms.)
As pointed out by HolyBlackCat, the solution is to use std::list::splice.
l.splice(l.end(), l, it);
This avoids any need to call l.erase, make_pair(), and l.push_back / l.emplace_back(), as well as getting prev(l.end()) / --l.end() to update the underlying std::unordered_map.
Sadly, though, it doesn't result in a better runtime speed, but, oh well, possibly a measurement variation, then, or an implementation using more specialised data structures.
Update: actually, I fixed the final instance of reusing the "removed" elements from l.begin(), and got 52ms / 100%! :-)
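A minimal sketch of how the splice call might fit into such a cache. LruSketch and its members are hypothetical names, and capacity handling and eviction are deliberately left out:

```cpp
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache skeleton; not the original poster's code.
struct LruSketch {
    using Entry = std::pair<int, std::string>;
    std::list<Entry> order;   // least recently used element at the front
    std::unordered_map<int, std::list<Entry>::iterator> index;

    void put(int key, std::string value) {
        auto found = index.find(key);
        if (found != index.end()) {
            found->second->second = std::move(value);
            touch(found->second);
            return;
        }
        order.emplace_back(key, std::move(value));
        index[key] = std::prev(order.end());
    }

    // Relink the existing node at the back. Nothing is allocated or
    // destroyed, and the iterator stored in `index` stays valid.
    void touch(std::list<Entry>::iterator it) {
        order.splice(order.end(), order, it);
    }
};
```

Because splice only rewires links, the map entry does not need to be updated after touch().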

why Run-time for add-before for doubly-linked lists is O(1)?

In data structures, we say that pushing an element before a node in a singly-linked list is an O(n) operation: since there are no backward pointers, we have to walk all the way through the elements to reach the node before the one we want to insert in front of. Therefore, it has a linear run time.
Then, when we introduce doubly-linked lists, we say the problem is resolved: since we now have pointers in both directions, pushing before becomes a constant-time operation, O(1).
I understand the logic, but something is still confusing to me. Since we DO NOT have constant-time access to the elements of the list, we have to walk through the preceding elements to find the element we want to add before. It is true that in a doubly-linked list it is now faster to implement the add-before command, but the action of finding the key of interest is still O(n). So why do we say that with a doubly-linked list the add-before operation becomes O(1)?
Thanks,
In C++, the std::list::insert() function takes an iterator to indicate where the insert should occur. That means the caller already has this iterator, and the insert operation is not doing a search and therefore runs in constant time.
The find() algorithm, however, is linear, and is the normal way to search for a list element. If you need to find+insert, the combination is O(n).
However, there is no requirement to do a search before an insert. For example, if you have a cached (valid) iterator, you can insert in front of (or delete) the element it corresponds with in constant time.
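A small sketch separating the two costs (insert_demo is a made-up function name; std::find and std::list::insert are the standard facilities mentioned above):

```cpp
#include <algorithm>
#include <list>
#include <vector>

std::vector<int> insert_demo() {
    std::list<int> l{10, 20, 40};

    // Linear: locate the element once.
    auto pos = std::find(l.begin(), l.end(), 40);

    // Constant: insert in front of an iterator we already hold.
    l.insert(pos, 30);   // 10 20 30 40

    // `pos` is still valid and still refers to 40, so it can be reused
    // for another constant-time insert without searching again.
    l.insert(pos, 35);   // 10 20 30 35 40

    return std::vector<int>(l.begin(), l.end());
}
```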

How can I point to a member of a std::set in such a way that I can tell if the element has been removed?

An iterator into a std::set becomes invalidated if the item it's pointing to is erased. (It does not get invalidated if the set is modified in any other way, which is nice.) However, there is no way to detect whether an iterator has been invalidated or not.
I'm implementing an algorithm that requires me to be able to keep track of members of a std::set in such a way that I can erase them in constant time, but without risking undefined behaviour if I try to delete the same one twice. If I have two iterators pointing to the same member of a set, Bad Things will happen if I try to erase both of them.
My question is, how can I avoid this? Is there some way to implement something that behaves like an iterator into a set, but which knows when it has been invalidated?
Incidentally, I'm using std::set because this is a performance critical situation and I need the complexity guarantees that set provides. I'm happy to accept answers that suggest a different data structure, but only if it allows me to (a) access and remove the smallest element in constant time, (b) remove the pointed-to elements in constant time, and (c) insert elements in O(log(N)) time or better. C++11 is OK.
You could keep a set of shared pointers. And every time you store an iterator, pair it with a weak pointer to the element. When you want to erase the element, first check the weak pointer to see if the object still exists.
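One way this suggestion might look in code. ByValue, Handle, tracked_insert and erase_if_alive are hypothetical names, and the sketch assumes the set holds the only shared_ptr to each element (otherwise the weak pointer would not expire on erase):

```cpp
#include <memory>
#include <set>

// Order shared_ptr<int> by pointed-to value rather than by address.
struct ByValue {
    bool operator()(const std::shared_ptr<int>& a,
                    const std::shared_ptr<int>& b) const {
        return *a < *b;
    }
};

using Set = std::set<std::shared_ptr<int>, ByValue>;

// Hypothetical handle: an iterator plus a weak_ptr that reveals whether
// the element has already been erased.
struct Handle {
    Set::iterator it;
    std::weak_ptr<int> alive;
};

Handle tracked_insert(Set& s, int value) {
    auto result = s.insert(std::make_shared<int>(value));
    return Handle{result.first, *result.first};
}

// Erase only if the element still exists; returns true if it was erased.
bool erase_if_alive(Set& s, const Handle& h) {
    if (h.alive.expired()) return false;  // already erased: never touch h.it
    s.erase(h.it);                        // amortized constant time
    return true;
}
```

With two copies of the same handle, the first erase_if_alive call erases the element and the second returns false instead of invoking undefined behaviour.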

Get pointer to node in std::list or std::forward_list

I am planning to use std::list in my code. I decided not to use std::forward_list because, for deletions (I figured), the whole list would have to be traversed, giving O(N) complexity for std::forward_list (being a singly linked list). However, when I looked into the documentation, I noticed that both STL containers have O(N) complexity to remove an item.
http://www.cplusplus.com/reference/forward_list/forward_list/remove/
http://www.cplusplus.com/reference/list/list/remove/
After some thinking I figured out why (I think). It's because in both cases, the whole list has to be scanned to find the node first, and then delete it. Is this right?
I then looked into the "erase" and "erase_after" methods, and their complexity is "Linear in the number of elements erased (destructions).". It's because, I am passing an iterator to the node (which is kind of like a "pointer"). However, I cannot (or prefer not to) pass this iterator around in my code to access the data in the node. I am not sure if this iterator will be valid if the list is modified? Thoughts?
My question is, is there a way I can get a pointer to the node in the list. That way, I know it will be valid throughout the lifetime of my program, pass it around. And I can just look into it to get access to my data.
However, I cannot (or prefer not to) pass this iterator around in my code to access the data in the node.
Why not? Iterators are easy to use and are quite lightweight. A pointer isn't better in any way.
I am not sure if this iterator will be valid if the list is modified?
For list, any iterator will remain valid, even if the list is modified. Except, of course, if you erase the particular element that the iterator points to. But that's kind of obvious: you can't expect to have an iterator (or pointer) to something that doesn't exist any more.
(vector is more dangerous. One small change to a vector can invalidate all its iterators.)
You can take a pointer to any individual element in the list.
list<int>::iterator it = find(l.begin(), l.end(), 7); // get an iterator
int * ptr = &*it; // get a pointer to the same element.
The pointer is similar to the iterator in many respects. But the iterator is a little more powerful. An iterator can be incremented or decremented, to access neighbouring elements in the list. And an iterator can be used to delete an element from the list. A pointer cannot do either of those things.
Both the iterator and pointer remain valid as long as that particular element isn't removed.
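A self-contained sketch of the same idea (pointer_demo is a made-up function name), showing that both handles survive modifications to other elements:

```cpp
#include <algorithm>
#include <list>

int pointer_demo() {
    std::list<int> l{3, 7, 9};

    std::list<int>::iterator it = std::find(l.begin(), l.end(), 7);
    int* ptr = &*it;   // pointer to the same element

    // Modify the list around the element: none of this touches the 7.
    l.push_front(1);
    l.push_back(11);
    l.remove(3);       // list is now 1 7 9 11

    *ptr += 1;         // write through the pointer
    return *it;        // read through the iterator: same element
}
```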
I am not sure if this iterator will be valid if the list is modified
Yeah, in the general case, storing iterators is risky unless you keep a close eye on the operations performed on your container.
Problem is, this is just the same for a pointer. In fact, for many containers, iterators are implemented as pointers.
So either store an iterator or a pointer if you like but, either way, keep an eye on the iterator invalidation rules:
Iterator invalidation rules
For lists, an iterator stays valid even if other items in the list are erased. It becomes invalid only when the item it references is removed from the list.
So, as long as you know the iterator you're passing around isn't being removed by some other piece of code, they're safe to hold onto. This seems fragile though.
Even if there was a construct outside of iterators to reference a node in the list, it would suffer from the same fragility.
However, you can have each node contain an std::shared_ptr to the data it stores instead of the object itself and then pass around std::weak_ptr's to those objects and check for expired before accessing those weak_ptr's.
eg
instead of
std::list<MyClass> foo;
you would have
std::list<std::shared_ptr<MyClass>> foo;
have a look here for info on weak_ptr's
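A brief sketch of checking a weak_ptr before use. MyClass is given a placeholder member here, and read_id and weak_ref_demo are hypothetical helpers:

```cpp
#include <list>
#include <memory>

struct MyClass { int id; };   // placeholder definition for illustration

// Returns the id if the object is still alive, or -1 if it is gone.
int read_id(const std::weak_ptr<MyClass>& ref) {
    if (auto locked = ref.lock()) return locked->id;
    return -1;
}

// Returns true if the weak reference behaved as expected.
bool weak_ref_demo() {
    std::list<std::shared_ptr<MyClass>> foo;
    foo.push_back(std::make_shared<MyClass>(MyClass{42}));

    std::weak_ptr<MyClass> ref = foo.back();   // hand this around freely
    bool before = (read_id(ref) == 42);

    foo.clear();                               // the element is destroyed
    bool after = ref.expired() && read_id(ref) == -1;
    return before && after;
}
```

Using lock() rather than checking expired() and then dereferencing keeps the object alive for the duration of the access.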
is there a way I can get a pointer to the node in the list
Yes, in your particular implementation.
No, in a standard-compliant way.
If you look at the std::list documentation, there is not a single word about a node. While it is hard to imagine a different way to implement the std::list other than using a doubly linked list, there is nothing that prevents it.
You should almost never come into any contact with undocumented internals of libraries.
Adding, removing and moving the elements within the list or across several lists does not invalidate the iterators or references. An iterator is invalidated only when the corresponding element is deleted.
Source: https://en.cppreference.com/w/cpp/container/list
So a std::list<>::iterator is only invalidated when the corresponding element is deleted. So yes, as long as you make sure that the corresponding element exists (which you will anyway have to do in your scenario of storing/passing around a pointer to anything) you can save and/or pass around the iterator throughout the lifetime of your program.
Now, an iterator is nothing but a pointer in disguise. So, if you prefer to save/pass around the corresponding pointer instead of the iterator, you can always first convert the iterator to a pointer, as @Aaron McDaid suggested.
int * ptr = &*it; // get a pointer to the same element.