Why does the reverse function for the std::list class in the C++ standard library have linear runtime? I would think that for doubly-linked lists the reverse function should have been O(1).
Reversing a doubly-linked list should just involve switching the head and the tail pointers.
Hypothetically, reverse could have been O(1). There could (again hypothetically) have been a boolean list member indicating whether the direction of the linked list is currently the same as, or opposite to, the direction it was created with.
Unfortunately, that would reduce the performance of basically any other operation (albeit without changing the asymptotic runtime). In each operation, a boolean would need to be consulted to consider whether to follow a "next" or "prev" pointer of a link.
Since this was presumably considered a relatively infrequent operation, the standard (which does not dictate implementations, only complexity) specified that the complexity could be linear. This allows "next" pointers to always mean the same direction unambiguously, speeding up common-case operations.
It could be O(1) if the list stored a flag that swaps the meaning of the “prev” and “next” pointers each node has. If reversing the list were a frequent operation, such an addition might in fact be useful, and I don't know of any reason why implementing it would be prohibited by the current standard. However, having such a flag would make ordinary traversal of the list more expensive (if only by a constant factor) because instead of
current = current->next;
in the operator++ of the list iterator, you would get
if (reversed)
current = current->prev;
else
current = current->next;
which is not something you'd decide to add easily. Given that lists are usually traversed much more often than they are reversed, it would be very unwise for the standard to mandate this technique. Therefore, the reverse operation is allowed to have linear complexity. Do note, however, that t ∈ O(1) ⇒ t ∈ O(n) so, as mentioned earlier, implementing your “optimization” technically would be permitted.
If you come from a Java or similar background, you might wonder why the iterator has to check the flag each time. Couldn't we instead have two distinct iterator types, both derived from a common base type, and have std::list::begin and std::list::rbegin polymorphically return the appropriate iterator? While possible, this would make the whole thing even worse because advancing the iterator would be an indirect (hard to inline) function call now. In Java, you're paying this price routinely anyway, but then again, this is one of the reasons many people reach for C++ when performance is critical.
As pointed out by Benjamin Lindley in the comments, since reverse is not allowed to invalidate iterators, the only approach permitted by the standard seems to be to store a pointer back to the list inside the iterator, which causes a double-indirect memory access.
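To make this concrete, here is a minimal sketch (entirely hypothetical; real implementations do not do this) of what such a flag-based list and its flag-consulting iterator would look like:

struct node { node* prev; node* next; int value; };

struct flagged_list {
    node* head = nullptr;
    bool reversed = false; // a hypothetical O(1) reverse() would just flip this
};

struct flagged_iterator {
    const flagged_list* owner; // back-pointer, so reverse() can keep iterators valid
    node* current;

    flagged_iterator& operator++() {
        // every increment pays a flag check plus the double-indirect access
        current = owner->reversed ? current->prev : current->next;
        return *this;
    }
};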
Surely since all containers that support bidirectional iterators have the concept of rbegin() and rend(), this question is moot?
It's trivial to build a proxy that reverses the iterators and access the container through that.
This non-operation is indeed O(1).
such as:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <list>
#include <string>

template<class Container>
struct reverse_proxy
{
    reverse_proxy(Container& c)
    : _c(c)
    {}

    auto begin() { return std::make_reverse_iterator(std::end(_c)); }
    auto end() { return std::make_reverse_iterator(std::begin(_c)); }

    auto begin() const { return std::make_reverse_iterator(std::end(_c)); }
    auto end() const { return std::make_reverse_iterator(std::begin(_c)); }

    Container& _c;
};

template<class Container>
auto reversed(Container& c)
{
    return reverse_proxy<Container>(c);
}

int main()
{
    using namespace std;
    list<string> l { "the", "cat", "sat", "on", "the", "mat" };

    auto r = reversed(l);
    copy(begin(r), end(r), ostream_iterator<string>(cout, "\n"));

    return 0;
}
expected output:
mat
the
on
sat
cat
the
Given this, it seems to me that the standards committee have not taken time to mandate O(1) reverse-ordering of the container because it's not necessary, and the standard library is largely built on the principle of mandating only what is strictly necessary while avoiding duplication.
Just my 2c.
Because it has to traverse every node (n in total) and update each one's pointers (the update step itself is O(1)). This makes the whole operation O(n·1) = O(n).
It also swaps the previous and next pointers of every node; that's why it takes linear time. It could be done in O(1) if the code using the list also received a flag telling it whether the list is being accessed in normal or reversed order.
Only an algorithm explanation.
Imagine you have an array of elements that you need to invert.
The basic idea is to iterate, swapping the element in the first position with the one in the last position, the element in the second position with the one in the penultimate position, and so on. When you reach the middle of the array, all elements have been swapped, so it takes n/2 iterations, which is considered O(n).
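A minimal sketch of that idea in C++, swapping inward from both ends:

#include <cstddef>
#include <utility>
#include <vector>

// Reverses the vector in place using n/2 swaps.
void reverse_in_place(std::vector<int>& a) {
    for (std::size_t i = 0, j = a.size(); i + 1 < j; ++i, --j)
        std::swap(a[i], a[j - 1]); // swap a mirrored pair
}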
It is O(n) simply because every node must be visited and relinked. Each individual node operation is O(1), but there are n of them in the entire list.
Of course there are some constant-time operations involved, such as adjusting the head and tail pointers afterwards, but the O notation doesn't consider individual constants once you include a first-order n factor.
Suppose I have a non-primitive data type that contains duplicates as far as the comparator is concerned, and I attempt to sort it using std::sort. Does it give the same sorted array every time (if we compare the sorted arrays across runs, will they be identical)? I know std::sort is not stable (it may change the relative order of equal elements), but is the result for the same input array guaranteed to be deterministic (reliable and reproducible)?
#include <algorithm>
#include <string>
#include <vector>

struct Data {
    std::string str;
    int data;
};

struct {
    bool operator()(Data a, Data b) const { return a.data > b.data; }
} customLess;

int main() {
    std::vector<Data> v = {
        {"Rahul", 100},
        {"Sachin", 200},
        {"Saurav", 200},
        {"Rohit", 300},
        // .....
    };
    for (unsigned k = 0; k < 1000; k++) {
        auto v2 = v;
        std::sort(v2.begin(), v2.end(), customLess);
    }
}
If I read you correctly, you're asking whether, despite the lack of stability, std::sort guarantees repeatability; if the same input is provided in the same order, and there are elements that are equal on the compared components, but unequal on others, will said elements always get sorted the same relative to one another?
The answer is No, std::sort makes no such guarantees. Doing so would impose restrictions on implementations that might cause them to perform worse (e.g. implementations based on quicksort couldn't use a random pivot to minimize the occurrence of quicksort's pathological case, where performance is O(n²) rather than the average case O(n log n)). While a plain quicksort of that design is banned in C++11 (where std::sort now requires O(n log n) comparisons period, not merely O(n log n) average case), it can still form the top-level sort for an introsort-based std::sort implementation (a reasonable strategy when the inputs are received from possibly malicious sources and you want to reduce their ability to force excessive recursion followed by a fallback to the slower heapsort), so requiring repeatability would prevent implementations from using a random pivot (or any other sorting strategy with a random component), for a benefit virtually no one cares about.
Using std::sort means you don't care about the order of elements that are unequal but compare equal according to the comparator; implementers aren't going to limit potential optimizations to provide a useless guarantee. Many implementations might, in practice, produce a repeatable sort order in this scenario, but it's not something code should rely on; if you need repeatability, either:
Use std::stable_sort (and get an ordering for equal inputs that is repeatable across implementations, where std::sort, being implemented differently by different vendors, would almost certainly not be repeatable across implementations that chose different algorithms), or
Expand your custom comparator to perform fallback comparisons encompassing all fields of the input elements, so there is no ambiguity unless the fields are 100% equal, not merely equivalent under the main comparison. That gets you not only repeatability for equal inputs, but repeatability for inputs containing the same elements in a different order. The actual results might still put two completely equal elements in a different order (e.g. you might be able to check .data() on a std::string and discover that two strings with the same characters end up sorting in different orders), but that's almost never important (and if it is, again, use std::stable_sort). In this case, you'd change your comparator to (adding #include <tuple> if you're not already using it):
struct {
bool operator()(const Data& a, const Data& b) const {
return std::tie(a.data, a.str) > std::tie(b.data, b.str);
}
} customLess;
so all fields are compared. Note that I changed the arguments to be const references (so you're not copying two Data objects for each comparison) and I used std::tie to make the fallback comparison efficient and easy to code (std::tie lets you use std::tuple's lexicographic sort without having to reimplement lexicographic sorting from scratch, an error-prone process, while still sticking to reference semantics to avoid copies).
I am implementing a container that presents a map-like interface. The physical implementation is a std::vector<std::pair<K*, T>>. A K object remembers its assigned position in the vector. It is possible for a K object to get destroyed; in that case its remembered index is used to zero out its corresponding key pointer within the vector, creating a tombstone.
I would like to expose the full traditional collection of iterators, though I think that they need only claim to be forward_iterators (see next).
I want range-based for loop iteration to visit only the non-tombstoned elements. Further, I would like the implementation of my iterators to be a single pointer (i.e. no back pointer to the container).
Since the range-based for loop is pretested I think that I can implement tombstone skipping within the inequality predicate.
bool operator!=(MyIterator& cursor, MyIterator stop) {
    while (cursor != stop) {
        if (cursor->first)
            return true;
        ++cursor;
    }
    return false;
}
Is this a reasonable approach? If yes, is there a simple way for me to override the inequality operator of std::vector's iterators instead of implementing my iterators from scratch?
If this is not a reasonable approach, what would be better?
Is this a reasonable approach?
No. (Keep in mind that operator!= can be used outside a range-based for loop.)
Your operator does not accept a const object as its first parameter (meaning a const vector::iterator).
You have undefined behavior if the first parameter comes after the second (e.g. if someone tests end != cur instead of cur != end).
You get this weird case where, given iterators a and b, it might be that *a is different from *b, but if you check if (a != b) you find that the iterators are equal, and then *a is the same as *b. This probably wreaks havoc with the multipass guarantee of forward iterators (but the situation is bizarre enough that I would want to check the standard's precise wording before passing judgement). Messing with people's expectations is inadvisable.
There is no simple way to override the inequality operator of std::vector's iterators.
If this is not a reasonable approach, what would be better?
You already know what would be better. You're just shying away from it.
Implement your own iterators from scratch. Wrapping your vector in your own class has the benefit that only the code for that class has to be aware that tombstones exist.
Caveat: Document that the conditions that create a tombstone also invalidate iterators to that element. (Invalid iterators are excluded from most iterator requirements, such as the multipass guarantee.)
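A minimal sketch of such a hand-rolled iterator (all names are made up; note it has to know where the range ends, so it carries two pointers, giving up the single-pointer goal from the question):

#include <utility>

template <class K, class T>
class live_iterator {
    using Entry = std::pair<K*, T>;
    Entry* cur_;
    Entry* end_;

    void skip() {
        while (cur_ != end_ && cur_->first == nullptr) // null key == tombstone
            ++cur_;
    }

public:
    live_iterator(Entry* cur, Entry* end) : cur_(cur), end_(end) { skip(); }

    Entry& operator*() const { return *cur_; }
    live_iterator& operator++() { ++cur_; skip(); return *this; }
    bool operator==(const live_iterator& o) const { return cur_ == o.cur_; }
    bool operator!=(const live_iterator& o) const { return cur_ != o.cur_; }
};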
OR
While your implementation makes a poor operator!=, it could be a fine update or check function. There's this little-known secret that C++ has more looping structures than just range-based for loops. You could make use of one of these, for example:
for (cur = vec.begin(); skip_tombstones(cur, vec.end()); ++cur) {
    auto& element = *cur;
    // ... use element ...
}
where skip_tombstones() is basically your operator!= renamed. If not much code needs to iterate over the vector, this might be a reasonable option, even in the long term.
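For completeness, a sketch of skip_tombstones under the question's assumptions (elements are std::pair<K*, T>, a null key marks a tombstone):

template <class It>
bool skip_tombstones(It& cur, It stop) {
    while (cur != stop) {
        if (cur->first)  // live element: stop here
            return true;
        ++cur;           // tombstone: step past it
    }
    return false;        // range exhausted
}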
A very simple example is multiplication. Suppose I have a vector:
std::vector<int> ints = {1,2,3,4};
With a naive approach I can just use std::accumulate (or std::reduce) and it looks like this:
int result = std::accumulate(ints.begin(), ints.end(), int{}, [](const int &a, const int &b){return a*b;});
but since the initial value is zero, the result becomes zero as well. (For this specific case, one way I could fix it is by passing 1 as the initial value.)
I would rather use an algorithm that does the above but without an initial-value 'side effect' (i.e. one that just multiplies the numbers in the vector).
A similar problem is often encountered within string handling where a delimiter must be inserted between elements.
What you're talking about can be reframed as a generalisation of accumulate over the last N-1 elements of your range, with the 1st element being the initial value.
So you can just write:
std::accumulate(std::next(std::begin(ints)), std::end(ints), *std::begin(ints), OP);
You have to assume that ints is non-empty, though, which raises my main point: what should a hypothetical standard function return when the range is empty? Should its results simply be undefined? Is that sensible?
(current draft, footnote 237) accumulate is similar to the APL reduction operator and Common Lisp reduce function, but it avoids the difficulty of defining the result of reduction on an empty sequence by always requiring an initial value
Accumulate sidesteps this issue and provides a boatload of flexibility, by doing things the way it does. I think that's a good thing.
Combined with the ability to simply provide an appropriate initial value like 1 for your operation over the whole range, I'm not convinced there's much need for this hypothetical alternative in the standard.
It might also be difficult to come up with two names for it that mirror the already-asymmetrically-named "accumulate" and "reduce".
#include <iterator> // std::iterator_traits
#include <numeric>  // std::accumulate
#include <utility>  // std::move

template <class InputIt,
          class T = typename std::iterator_traits<InputIt>::value_type,
          class BinaryOperation>
T fold_if_you_really_want_to(InputIt first, InputIt last, BinaryOperation op)
{
    // UB if the range is empty. Whatevs.
    T init = *first;
    return std::accumulate(++first, last, std::move(init), std::move(op));
}
…or something like that anyway. Note that this necessarily copies the first element; you could avoid that copy if you weren't lazy like me and wrote the loop yourself instead of delegating to std::accumulate. 😊
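With T defaulted from the iterator as above, a call might look like this (std::multiplies lives in <functional>):

std::vector<int> ints = {1, 2, 3, 4};
int product = fold_if_you_really_want_to(ints.begin(), ints.end(),
                                         std::multiplies<>{}); // 24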
In addition to @Lightness Races in Orbit's answer, consider the case in Haskell:
For cases like the one you described (most prominently finding the maximum element of a list), Haskell provides the functions foldl1 and foldr1, which perform the fold over the collection and implicitly take the first element as the initial value.
Yes, for the empty list this makes no sense, hence for this problem you have to provide a list with at least one element.
Is there a specific data structure that a deque in the C++ STL is supposed to implement, or is a deque just this vague notion of an array growable from both the front and the back, to be implemented however the implementation chooses?
I used to always assume a deque was a circular buffer, but I was recently reading a C++ reference here, and it sounds like a deque is some kind of array of arrays. It doesn't seem like it's a plain old circular buffer. Is it a gap buffer, then, or some other variant of growable array, or is it just implementation-dependent?
UPDATE AND SUMMARY OF ANSWERS:
It seems the general consensus is that a deque is a data structure such that:
the time to insert or remove an element should be constant at the beginning or end of the list and at most linear elsewhere. If we interpret this to mean true constant time (not amortized constant time), as someone commented, this seems challenging. Others have argued that we need not interpret it as non-amortized constant time.
"A deque requires that any insertion shall keep any reference to a member element valid. It's OK for iterators to be invalidated, but the members themselves must stay in the same place in memory." As someone comments: This is easy enough by just copying the members to somewhere on the heap and storing T* in the data structure under the hood.
"Inserting a single element either at the beginning or end of a deque always takes constant time and causes a single call to a constructor of T." The single constructor of T will also be achieved if the data structure stores T* under the hood.
The data structure must have random access.
It seems no one knows how to get a combination of the 1st and 4th conditions if we take the first condition to be "non-amortized constant time". A linked list achieves 1) but not 4), whereas a typical circular buffer achieves 4) but not 1). I think I have an implementation that fulfills both below. Comments?
We start with an implementation someone else suggested: we allocate an array and start placing elements from the middle, leaving space in both the front and back. In this implementation, we keep track of how many elements there are from the center in both the front and back directions, call those values F and B.

Then, let's augment this data structure with an auxiliary array that is twice the size of the original array (so now we're wasting a ton of space, but no change in asymptotic complexity). We will also fill this auxiliary array from its middle and give it similar values F' and B'. The strategy is this: every time we add one element to the primary array in a given direction, if F > F' or B > B' (depending on the direction), up to two values are copied from the primary array to the auxiliary array until F' catches up with F (or B' with B). So an insert operation involves putting 1 element into the primary array and copying up to 2 from the primary to the auxiliary, but it's still O(1).

When the primary array becomes full, we free the primary array, make the auxiliary array the primary array, and make another auxiliary array that's yet 2 times bigger. This new auxiliary array starts out with F' = B' = 0 and having nothing copied to it (so the resize op is O(1) if a heap allocation is O(1) complexity). Since the auxiliary copies 2 elements for every element added to the primary and the primary starts out at most half-full, it is impossible for the auxiliary to not have caught up with the primary by the time the primary runs out of space again. Deletions likewise just need to remove 1 element from the primary and either 0 or 1 from the auxiliary.

So, assuming heap allocations are O(1), this implementation fulfills condition 1). We make the array be of T* and use new whenever inserting to fulfill conditions 2) and 3). Finally, 4) is fulfilled because we are using an array structure and can easily implement O(1) access.
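For what it's worth, here is a bare-bones sketch of the incremental-copy idea for the back direction only (the front is symmetric); all names are made up, and deletions, ownership, and the front-direction bookkeeping are omitted:

#include <cstddef>
#include <vector>

class incremental_deque_back {
    std::vector<int*> primary, aux; // both filled from the middle outward
    std::size_t B = 0;              // elements behind the middle of primary
    std::size_t Baux = 0;           // elements already mirrored into aux

public:
    incremental_deque_back() : primary(8, nullptr), aux(16, nullptr) {}

    void push_back(int* p) {
        primary[primary.size() / 2 + B] = p;
        ++B;
        // copy at most two elements so aux steadily catches up with primary
        for (int i = 0; i < 2 && Baux < B; ++i, ++Baux)
            aux[aux.size() / 2 + Baux] = primary[primary.size() / 2 + Baux];
        // primary full at the back: promote aux, allocate a fresh empty one
        if (primary.size() / 2 + B == primary.size()) {
            primary.swap(aux);
            aux.assign(primary.size() * 2, nullptr);
            Baux = 0; // nothing has been copied into the new aux yet
        }
    }
};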
It's implementation specific. All a deque requires is constant time insertion/deletion at the start/end, and at most linear elsewhere. Elements are not required to be contiguous.
Most implementations use what can be described as an unrolled list. Fixed-sized arrays get allocated on the heap and pointers to these arrays are stored in a dynamically sized array belonging to the deque.
A deque is typically implemented as a dynamic array of arrays of T.
(a) (b) (c) (d)
+-+ +-+ +-+ +-+
| | | | | | | |
+-+ +-+ +-+ +-+
^ ^ ^ ^
| | | |
+---+---+---+---+
| 1 | 8 | 8 | 3 | (reference)
+---+---+---+---+
The arrays (a), (b), (c) and (d) are generally of fixed capacity, and the inner arrays (b) and (c) are necessarily full. (a) and (d) are not full, which gives O(1) insertion at both ends.
Imagine that we do a lot of push_front: (a) will fill up. When it is full and an insertion is performed, we first need to allocate a new array, then grow the (reference) vector and push the pointer to the new array at the front.
This implementation trivially provides:
Random Access
Reference Preservation on push at both ends
Insertion in the middle that is proportional to min(distance(begin, it), distance(it, end)) (the Standard is slightly more stringent than what you required)
However it fails the requirement of amortized O(1) growth. Because the arrays have fixed capacity, whenever the (reference) vector needs to grow we have O(N/capacity) pointer copies. Because pointers are trivially copied, a single memcpy call is possible, so in practice this is mostly constant... but this is insufficient to pass with flying colors.
Still, push_front and push_back are more efficient than for a vector (unless you are using the MSVC implementation, which is notoriously slow because of the very small capacity of its arrays...)
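As an aside, the random access promised above is just two O(1) indexings; a sketch with hypothetical names, assuming fixed chunk capacity B:

#include <cstddef>
#include <vector>

template <class T, std::size_t B>
struct chunked_view {
    std::vector<T*> chunks;   // the (reference) vector of chunk pointers
    std::size_t front_offset; // index of the first element within chunks[0]

    T& operator[](std::size_t i) {
        std::size_t pos = front_offset + i;
        return chunks[pos / B][pos % B]; // pick the chunk, then the slot
    }
};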
Honestly, I know of no data structure, or data structure combination, that could satisfy both:
Random Access
and
O(1) insertion at both ends
I do know a few "near" matches:
Amortized O(1) insertion can be done with a dynamic array in which you write in the middle; this is incompatible with the "reference preservation" semantics of the deque
A B+ Tree can be adapted to provide access by index instead of by key; the times are close to constant, but the complexity is O(log N) for access and insertion (with a small constant); it requires using Fenwick Trees in the intermediate-level nodes.
Finger Trees can be adapted similarly, once again it's really O(log N) though.
A deque<T> could be implemented correctly by using a vector<T*>. All the elements are copied onto the heap and the pointers stored in a vector. (More on the vector later).
Why T* instead of T? Because the standard requires that
"An insertion at either end of the deque invalidates all the iterators
to the deque, but has no effect on the validity of references to
elements of the deque."
(my emphasis). The T* helps to satisfy that. It also helps us to satisfy this:
"Inserting a single element either at the beginning or end of a deque always ..... causes a single call to a constructor of T."
Now for the (controversial) bit. Why use a vector to store the T*? It gives us random access, which is a good start. Let's forget about the complexity of vector for a moment and build up to this carefully:
The standard talks about "the number of operations on the contained objects.". For deque::push_front this is clearly 1 because exactly one T object is constructed and zero of the existing T objects are read or scanned in any way. This number, 1, is clearly a constant and is independent of the number of objects currently in the deque. This allows us to say that:
'For our deque::push_front, the number of operations on the contained objects (the Ts) is fixed and is independent of the number of objects already in the deque.'
Of course, the number of operations on the T* will not be so well-behaved. When the vector<T*> grows too big, it'll be realloced and many T*s will be copied around. So yes, the number of operations on the T* will vary wildly, but the number of operations on T will not be affected.
Why do we care about this distinction between counting operations on T and counting operations on T*? It's because the standard says:
All of the complexity requirements in this clause are stated solely in terms of the number of operations on the contained objects.
For the deque, the contained objects are the T, not the T*, meaning we can ignore any operation which copies (or reallocs) a T*.
I haven't said much about how the vector would behave in a deque. Perhaps we would interpret it as a circular buffer (with the vector always taking up its maximum capacity()), and then realloc everything into a bigger buffer when the vector is full. The details don't matter.
In the last few paragraphs, we have analyzed deque::push_front and the relationship between the number of objects in the deque already and the number of operations performed by push_front on contained T-objects. And we found they were independent of each other. As the standard mandates that complexity is in terms of operations-on-T, then we can say this has constant complexity.
Yes, the Operations-On-T*-Complexity is amortized (due to the vector), but we're only interested in the Operations-On-T-Complexity and this is constant (non-amortized).
Epilogue: the complexity of vector::push_back or vector::push_front is irrelevant in this implementation; those costs involve operations on T*, which, as we established, do not count.
(Making this answer a community-wiki. Please get stuck in.)
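Here is the idea in miniature (a rough sketch only; the circular-buffer management is waved away and ownership handling is crude), showing why push_front performs exactly one operation on a contained T:

#include <cstddef>
#include <vector>

template <class T>
class ptr_deque {
    std::vector<T*> ptrs_; // imagine this managed as a circular buffer

public:
    void push_front(const T& value) {
        // one T constructor call; however many T* copies the vector
        // makes internally, they are not operations on contained objects
        ptrs_.insert(ptrs_.begin(), new T(value));
    }
    void push_back(const T& value) { ptrs_.push_back(new T(value)); }
    T& operator[](std::size_t i) { return *ptrs_[i]; } // random access, double indirection
    ~ptr_deque() { for (T* p : ptrs_) delete p; }
};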
First things first: a deque requires that any insertion at the front or back keeps references to member elements valid. It's OK for iterators to be invalidated, but the members themselves must stay in the same place in memory. This is easy enough by just copying the members to somewhere on the heap and storing T* in the data structure under the hood. See this other Stack Overflow question, "About deque<T>'s extra indirection".
(vector doesn't guarantee to preserve either iterators or references, whereas list preserves both).
So let's just take this 'indirection' for granted and look at the rest of the problem. The interesting bit is the time to insert or remove from the beginning or end of the list. At first, it looks like a deque could trivially be implemented with a vector, perhaps by interpreting it as a circular buffer.
BUT a deque must satisfy "Inserting a single element either at the beginning or end of a deque always takes constant time and causes a single call to a constructor of T."
Thanks to the indirection we've already mentioned, it's easy to ensure there is just one constructor call, but the challenge is to guarantee constant time. It would be easy if we could just use constant amortized time, which would allow the simple vector implementation, but it must be constant (non-amortized) time.
My understanding of deque
It allocates 'n' empty contiguous objects from the heap as the first sub-array.
The objects in it are added exactly once by the head pointer on insertion.
When the head pointer comes to the end of an array, it
allocates/links a new non-contiguous sub-array and adds objects there.
They are removed exactly once by the tail pointer on extraction.
When the tail pointer finishes a sub-array of objects, it moves
on to the next linked sub-array, and deallocates the old.
The intermediate objects between the head and tail are never moved in memory by deque.
A random access first determines which sub-array has the object, then accesses it at its relative offset within the sub-array.
This is an answer to user gravity's challenge to comment on the 2-array-solution.
Some details are discussed here
A suggestion for improvement is given
Discussion of details:
The user "gravity" has already given a very neat summary. "gravity" also challenged us to comment on the suggestion of balancing the number of elements between two arrays in order to achieve O(1) worst case (instead of average case) runtime. Well, the solution works efficiently if both arrays are ringbuffers, and it appears to me that it is sufficient to split the deque into two segments, balanced as suggested.
I also think that for practical purposes the standard STL implementation is at least good enough, but under realtime requirements, and with properly tuned memory management, one might consider using this balancing technique. There is also a different implementation given by Eric Demaine in an older Dr. Dobb's article, with similar worst-case runtime.
Balancing the load of both buffers requires moving between 0 and 3 elements, depending on the situation. For instance, a pushFront(x) must, if we keep the front segment in the primary array, move the last 3 elements from the primary ring to the auxiliary ring in order to keep the required balance. A pushBack(x) at the rear must track the load difference and then decide when it is time to move one element from the primary to the auxiliary array.
Suggestion for improvement:
There is less work and bookkeeping to do if the front and rear are both stored in the auxiliary ring. This can be achieved by cutting the deque into three segments q1, q2, q3, arranged in the following manner: the front part q1 is in the auxiliary ring (the double-sized one) and may start at any offset from which the elements are arranged clockwise in subsequent order. The number of elements in q1 is exactly half of all elements stored in the auxiliary ring. The rear part q3 is also in the auxiliary ring, located exactly opposite to q1, also clockwise in subsequent order. This invariant has to be kept between all deque operations. Only the middle part q2 is located (clockwise in subsequent order) in the primary ring.
Now, each operation will either move exactly one element, or allocate a new empty ring buffer when one of them gets empty. For instance, a pushFront(x) stores x before q1 in the auxiliary ring. In order to keep the invariant, we move the last element from q2 to the front of the rear q3; so both q1 and q3 get an additional element at their fronts and thus stay opposite to each other. PopFront() works the other way round, and the rear operations work the same way. The primary ring (same as the middle part q2) goes empty exactly when q1 and q3 touch each other and form a full circle of subsequent elements within the auxiliary ring. Also, when the deque shrinks, q1 and q3 go empty exactly when q2 forms a proper circle in the primary ring.
The data in a deque is stored in chunks of fixed-size vector, which are pointed to by a map (the map is also a chunk of vector, but its size may change).
The main part of the deque iterator's code is below:
/*
buff_size is the length of the chunk
*/
template <class T, size_t buff_size>
struct __deque_iterator{
    typedef __deque_iterator<T, buff_size> iterator;
    typedef __deque_iterator<T, buff_size> self;
    typedef T** map_pointer;

    // pointers into the current chunk
    T* cur;   // the current element
    T* first; // the beginning of the chunk
    T* last;  // the end of the chunk

    // because the iterator may skip to another chunk,
    // it also keeps a pointer into the map
    map_pointer node; // pointer to the map
};
The main part of the deque's code is below:
/*
buff_size is the length of the chunk
*/
template<typename T, size_t buff_size = 0>
class deque{
    public:
        typedef T value_type;
        typedef T& reference;
        typedef T* pointer;
        typedef __deque_iterator<T, buff_size> iterator;

        typedef size_t size_type;
        typedef ptrdiff_t difference_type;

    protected:
        typedef pointer* map_pointer;

        // allocates memory for the chunks
        typedef allocator<value_type> dataAllocator;

        // allocates memory for the map
        typedef allocator<pointer> mapAllocator;

    private:
        // data members
        iterator start;
        iterator finish;

        map_pointer map;
        size_type map_size;
};
Below I will give the core code of the deque, mainly covering two parts:
iterator
Simple functions of deque
1. iterator(__deque_iterator)
The main problem with the iterator is that, when you increment or decrement it, it may need to skip to another chunk (if it points to the edge of a chunk). For example, suppose there are three data chunks: chunk 1, chunk 2, chunk 3.
Say pointer1 points to the beginning of chunk 2; after operator-- it points to the end of chunk 1. The same applies to pointer2.
Below I will give the main function of __deque_iterator:
Firstly, skip to any chunk:
void set_node(map_pointer new_node){
node = new_node;
first = *new_node;
last = first + chunk_size();
}
Note the chunk_size() function, which computes the chunk size; for simplicity, you can assume it returns 8 here.
operator* gets the data in the chunk:
reference operator*()const{
return *cur;
}
operator++, --
// prefix form of increment
self& operator++(){
    ++cur;
    if (cur == last){        // if it reaches the end of the chunk
        set_node(node + 1);  // skip to the next chunk
        cur = first;
    }
    return *this;
}
// postfix form of increment
self operator++(int){
    self tmp = *this;
    ++*this; // invoke prefix ++
    return tmp;
}
self& operator--(){
    if(cur == first){        // if it points to the beginning of the chunk
        set_node(node - 1);  // skip to the previous chunk
        cur = last;
    }
    --cur;
    return *this;
}
self operator--(int){
    self tmp = *this;
    --*this;
    return tmp;
}
2. Simple functions of deque
Common functions of deque:
iterator begin(){ return start; }
iterator end(){ return finish; }

reference front(){
    // invoke __deque_iterator's operator*
    // return start's member *cur
    return *start;
}

reference back(){
    // can't use *finish: finish is past-the-end
    iterator tmp = finish;
    --tmp;
    return *tmp; // return finish's *cur
}

reference operator[](size_type n){
    // random access; uses __deque_iterator's operator[]
    return start[n];
}
If you want to understand deque more deeply you can also see this question https://stackoverflow.com/a/50959796/6329006
I'm still new to C++, so I run into new problems daily.
Today came the [] operator's turn:
I'm making myself a new generic List class because I don't really like the std one. I'm trying to give it the warm and fuzzy look of C#'s Collections.Generic List, so I do want to be able to access elements by index. To cut to the chase:
Extract from the template:
T& operator[](int offset)
{
int translateVal = offset - cursorPos;
MoveCursor(translateVal);
return cursor->value;
}
const T& operator[](int offset) const
{
int translateVal = offset - cursorPos;
MoveCursor(translateVal);
return cursor->value;
}
That's the code for the operators. The class is declared with template<typename T>, so as far as I saw in some tutorials, that's the correct way to do operator overloading.
Nevertheless, when I'm trying to access by index, e.g.:
Collections::List<int> *myList;
myList = new Collections::List<int>();
myList->SetCapacity(11);
myList->Add(4);
myList->Add(10);
int a = myList[0];
I get the
no suitable conversion function from "Collections::List<int>" to "int" exists
error, referring to the "int a = myList[0]" line. Basically, myList[0]'s type is still Collections::List<int>, although it should have been just int. How come?
Since myList is a pointer, myList[0] doesn't invoke your operator[]; it subscripts the pointer, yielding the pointed-to Collections::List<int> object itself. What you want is (*myList)[0]. Or better still, Collections::List<int>& myRef = *myList; and then use myRef[0]. (Another option is to not allocate myList on the heap: create it on the stack with Collections::List<int> myList and then use the . operator on it.)
myList has type pointer to List, not List. In the case where an expression of pointer type is followed by an integral value enclosed in square brackets, (such as myList[0]), the result is identical to "add 0 to the pointer value and dereference it". The result of adding 0 to the address of the list and dereferencing it simply yields the list.
It is common for programmers used to C# and Java to overuse C++ new. In the posted example it is better to use Collections::List<int> myList; and the . operator instead of ->.
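The whole pitfall in a few lines, using the questioner's hypothetical Collections::List:

Collections::List<int> myList; // on the stack, no new required
myList.Add(4);
int a = myList[0];             // calls List<int>::operator[](int)

Collections::List<int>* p = &myList;
int b = (*p)[0];               // what the original code intended
// int c = p[0];               // error: p[0] is *(p + 0), the List itself, not an int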
Good that you want to learn C++ and practise writing your own collections, but your logic is probably flawed.
std::vector and std::deque already allow constant-time random access. std::deque is more like a list in that it allows constant-time insertion and removal at either end, and insertions at the ends do not invalidate references to elements (iterators, however, are invalidated).
You also seem to be mixing your collection with its iterator into one class, so that a collection contains a current position. I am pretty certain C# collections are not implemented that way.
Finally, I would imagine your MoveCursor operation is O(N), which means you do not really have random access at all.
If you want fast random access and insertion, the best you can manage is O(log N) by using a tree structure, with each node indicating the number of elements in each branch below it. You can then find the nth element by recursing down the appropriate path (a sketch follows below). Insertion is also O(log N), as you have to recurse up the tree modifying the counts, and you will of course have to regularly balance the tree.
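For illustration, a minimal sketch of index lookup in such a counted tree (balancing and count maintenance on insertion are omitted):

#include <cstddef>

struct Node {
    int value;
    std::size_t left_count; // number of elements in the left subtree
    Node* left;
    Node* right;
};

Node* nth(Node* root, std::size_t n) {
    while (root) {
        if (n < root->left_count) {
            root = root->left;               // the nth element is on the left
        } else if (n == root->left_count) {
            return root;                     // this node is the nth element
        } else {
            n -= root->left_count + 1;       // skip the left subtree and this node
            root = root->right;
        }
    }
    return nullptr;                          // n out of range
}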