Fastest way to remove an element from an array? - c++

I've coded an algorithm designed to produce a list of the closest triplets from three sorted arrays (one element contributed by each array). The algorithm finds the closest triplet, removes those elements, then repeats the process. The algorithm itself runs quite fast; however, the code I'm using to "remove" the elements from each array slows it down significantly. Is there a more efficient method, or is random removal necessarily always O(n)?
My current strategy (effectively identical to std::move):
for(int i = 6; i < n; ++i)
array[i] = array[i+1];
Where n is the size of the array, and 6 is the index of the element to be removed.

Fastest way to remove an element from an array?
Note that there is no way to erase elements from an array. The size of an array cannot change. So to be clear, we are considering an algorithm where the resulting array contains the remaining elements (everything except the "removed" value) at the beginning, with some irrelevant value left in the last slot.
The algorithm that you show¹ is the optimal one if there is an additional constraint that the order of the other elements must not change. It can be slightly improved by using move assignment if the element type is non-trivial, but that doesn't improve asymptotic complexity. There is no need to write the loop, since there is a standard algorithm: std::move (the three-argument, iterator-range overload from <algorithm>).
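For example, a minimal sketch assuming the array, n, and the removal index 6 from the question:

#include <algorithm>

// Shift the elements after index 6 one slot to the left, preserving their order.
// Does the same work as the loop in the question, without the out-of-bounds read.
std::move(array + 7, array + n, array + 6);
// array[n - 1] now holds a moved-from / irrelevant value.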
If there is no constraint of stable order, then there is a more efficient algorithm: Only write the last element over the "removed" one.
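A minimal sketch of that unstable removal, again assuming array, n, and a removal index idx:

array[idx] = array[n - 1];   // overwrite the "removed" slot with the last element (use move assignment for non-trivial types)
--n;                         // n now counts only the live elements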
is random removal [from array] necessarily always O(n)?
Only when the remaining elements need to have a stable order.
¹ However, there is a bug in your implementation:
for(int i = 6; i < n; ++i)
array[i] = array[i+1];
Where n is the size of the array
If n is the size of the array, then array[n-1+1] is outside the bounds of the array.

There are a few more options you can consider.
Validity masks
You can have an additional array of bool, with everything initially set to false to say that values are not deleted. To delete a value you set the corresponding bool to true (or the other way around if that makes more sense in your code).
This requires some tweaks in the rest of the code to skip values that are marked as deleted.
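A minimal sketch of the mask, assuming the array and its size n from the question:

#include <vector>

std::vector<bool> deleted(n, false);   // parallel validity mask
deleted[6] = true;                     // "removes" array[6] in O(1)
for (int i = 0; i < n; ++i) {
    if (deleted[i]) continue;          // skip entries marked as deleted
    // ... process array[i] ...
}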
Tombstones
Similar to the solution above, but it doesn't require additional memory. If there is a value that is never used (say all the values are supposed to be positive, so we can use -1), you can set the entry to that value. This also requires tweaks in the rest of the code to skip such entries.
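For example, if -1 can never occur as a real value:

array[6] = -1;   // tombstone: marks the entry as deleted in O(1)
// later, while iterating: if (array[i] == -1) continue;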
Delayed deletion
This one is a bit more complicated. I'd only use it if iterating over the deleted entries significantly affects performance or complexity.
The idea is to tombstone or otherwise mark the entries as deleted. The next time you iterate over the array you also do the swaps. This makes the code somewhat complex; the easiest way to do it, I think, is with custom iterators.
This is still O(N), but it's amortized O(1) within the overall algorithm.
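A hedged sketch of that compaction pass (essentially what std::remove does), assuming -1 is used as the tombstone value:

int write = 0;
for (int read = 0; read < n; ++read) {
    if (array[read] == -1) continue;   // skip tombstoned entries
    array[write] = array[read];        // close the gap while iterating
    // ... process array[write] ...
    ++write;
}
n = write;                             // new logical size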
Also note that if you do one O(N) loop to find the element to delete and then another O(N) loop to delete it, the overall solution is still O(N).

Related

Insert a sorted range into std::set with hint

Assume I have a std::set (which is by definition sorted), and I have another range of sorted elements (for the sake of simplicity, in a different std::set object). Also, I have a guarantee that all values in the second set are larger than all the values in the first set.
I know I can efficiently insert one element into std::set - if I pass a correct hint, this will be O(1). I know I can insert any range into std::set, but as no hint is passed, this will be O(k log N) (where k is the number of new elements, and N the number of old elements).
Can I insert a range in a std::set and provide a hint? The only way I can think of is to do k single inserts with a hint, which does push the complexity of the insert operations in my case down to O(k):
std::set <int> bigSet{1,2,5,7,10,15,18};
std::set <int> biggerSet{50,60,70};
for(auto bigElem : biggerSet)
bigSet.insert(bigSet.end(), bigElem);
First of all, to do the merge you're talking about, you probably want to use set's (or map's) merge member function, which will let you merge some existing set into this one. The advantage of doing this (and the reason you might not want to, depending on your usage pattern) is that the items being merged in are actually moved from one set to the other, so you don't have to allocate new nodes (which can save a fair amount of time). The disadvantage is that the nodes then disappear from the source set, so if you need each local histogram to remain intact after being merged into the global histogram, you don't want to do this.
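A small illustration of the member function (C++17); the histogram names are made up for the example:

#include <set>

std::set<int> globalHist{1, 2, 5};
std::set<int> localHist{3, 5, 9};
globalHist.merge(localHist);   // nodes are moved over, no new allocations
// globalHist == {1, 2, 3, 5, 9}; the duplicate 5 stays behind in localHist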
You can typically do better than O(log N) when searching a sorted vector. Assuming a reasonably predictable distribution, you can use an interpolating search to do a search in (typically) around O(log log N), often called "pseudo-constant" complexity.
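A hedged sketch of an interpolating search over a sorted std::vector<int>, just to illustrate the idea (not code from the answer):

#include <cstddef>
#include <vector>

// Returns the index of key in the sorted vector a, or -1 if it is absent.
int interpolation_search(const std::vector<int>& a, int key) {
    std::size_t lo = 0, hi = a.size();
    while (lo < hi) {
        if (a[lo] == a[hi - 1])                          // remaining values are all equal
            return a[lo] == key ? static_cast<int>(lo) : -1;
        // Estimate the position from the value distribution instead of halving.
        double frac = (double(key) - a[lo]) / (double(a[hi - 1]) - a[lo]);
        if (frac < 0.0 || frac > 1.0) return -1;         // key lies outside the remaining range
        std::size_t mid = lo + static_cast<std::size_t>(frac * (hi - 1 - lo));
        if (a[mid] == key) return static_cast<int>(mid);
        if (a[mid] < key) lo = mid + 1; else hi = mid;
    }
    return -1;
}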
Given that you only do insertions relatively infrequently, you might also consider a hybrid structure. This starts with a small chunk of data that you don't keep sorted. When you reach an upper bound on its size, you sort it and insert it into a sorted vector. Then you go back to adding items to your unsorted area. When it reaches the limit, again sort it and merge it with the existing sorted data.
Assuming you limit the unsorted chunk to no larger than log(N) elements, search complexity is still O(log N): one log(N) binary search (or log log N interpolating search) on the sorted chunk, and one linear search over the log(N)-sized unsorted chunk. Once you've verified that an item doesn't exist yet, adding it has constant complexity (just tack it onto the end of the unsorted chunk). The big advantage is that this can still easily use a contiguous structure such as a vector, so it's much more cache friendly than a typical tree structure.
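A rough sketch of such a hybrid container; the names and the fixed limit are illustrative assumptions, not part of the answer:

#include <algorithm>
#include <cstddef>
#include <vector>

struct HybridSet {
    std::vector<int> sorted;    // bulk of the data, kept sorted
    std::vector<int> pending;   // small unsorted chunk
    std::size_t limit = 16;     // in practice, tie this to log2(sorted.size())

    bool contains(int x) const {
        return std::binary_search(sorted.begin(), sorted.end(), x)
            || std::find(pending.begin(), pending.end(), x) != pending.end();
    }

    void insert(int x) {
        if (contains(x)) return;               // keep elements unique
        pending.push_back(x);                  // constant-time append
        if (pending.size() >= limit) {         // flush: sort the chunk, merge it in
            std::sort(pending.begin(), pending.end());
            std::size_t mid = sorted.size();
            sorted.insert(sorted.end(), pending.begin(), pending.end());
            std::inplace_merge(sorted.begin(), sorted.begin() + mid, sorted.end());
            pending.clear();
        }
    }
};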
Since your global histogram is (apparently) only ever populated with data coming from the local histograms, it might be worth considering just keeping it in a vector, and when you need to merge in the data from one of the local chunks, just use std::merge to take the existing global histogram and the local histogram, and merge them together into a new global histogram. This has O(N + M) complexity (N = size of global histogram, M = size of local histogram). Depending on the typical size of a local histogram, this could pretty easily work out as a win.
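Roughly like this, assuming both histograms are kept as sorted vectors (the names are illustrative):

#include <algorithm>
#include <iterator>
#include <vector>

std::vector<int> globalHist{1, 3, 5, 7};   // existing global histogram (sorted)
std::vector<int> localHist{2, 3, 8};       // one local histogram (sorted)

std::vector<int> merged;
merged.reserve(globalHist.size() + localHist.size());
std::merge(globalHist.begin(), globalHist.end(),
           localHist.begin(), localHist.end(),
           std::back_inserter(merged));    // single O(N + M) pass
globalHist.swap(merged);                   // merged becomes the new global histogram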
Merging two sorted containers is much quicker than sorting. Its complexity is O(N), so in theory what you say makes sense. It's the reason why merge sort is one of the quickest sorting algorithms. If you follow the link, you will also find pseudo-code; what you are doing is just one pass of the main loop.
You will also find the algorithm implemented in the STL as std::merge. It takes iterators into any container as input; I would suggest using std::vector as the default container for the new elements. Sorting a vector is a very fast operation. You may even find it better to use a sorted vector instead of a set for the output. You can always use std::lower_bound to get O(log(N)) lookups from a sorted vector.
Vectors have many advantages compared with set/map. Not least of which is they are very easy to visualise in a debugger :-)
(The code at the bottom of the std::merge reference page shows an example of using vectors.)
You can merge the sets more efficiently using special functions for that.
In case you insist, insert returns information about the inserted location.
iterator insert( const_iterator hint, const value_type& value );
Code:
std::set <int> bigSet{1,2,5,7,10,15,18};
std::set <int> biggerSet{50,60,70};
auto hint = bigSet.cend();
for(auto& bigElem : biggerSet)
hint = bigSet.insert(hint, bigElem);
This assumes, of course, that you are inserting new elements that will end up together or close in the final set. Otherwise there is not much to gain, only the fact that since the source is a set (so it is ordered), about half of the tree will not be looked up.
There is also a member function
template< class InputIt > void insert( InputIt first, InputIt last );
That might or might not do something like this internally.

Efficiently inserting values into a map. Better incrementing or decrementing keys?

I have a vector of pairs ordered by key in decreasing order.
I want to efficiently transform it to a map.
This is what I currently do:
int size = vect.size();
for (int i = 0; i < size; i++)
map[vect[i].key] = vect[i];
Is there a point in traversing the vector backwards and inserting values with lowest keys first? I'm not sure how insert works internally and whether it even matters...
How about using map constructor and just passing the vector into that instead of looping? This would be recreating the map, vs doing map.clear() that I currently do between runs.
I read a few other SO answers about [key]=val being about the same as insert() but none deal with insertion order.
std::map is usually implemented as Red-Black Tree. Therefore, it doesn't really matter whether you increment or decrement the keys. It will still perform a search with O(log n) complexity and rebalancing.
What you can do to speed up your insertion is use either insert or emplace_hint with "hint", which is an iterator used as a suggestion as to where to insert the new element.
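A hedged sketch of the hinted approach, assuming the vector holds (key, value) pairs sorted by key in decreasing order as in the question, so each new key belongs at the front of the map:

#include <map>
#include <string>
#include <utility>
#include <vector>

std::vector<std::pair<int, std::string>> vect{{30, "c"}, {20, "b"}, {10, "a"}};
std::map<int, std::string> m;
auto hint = m.begin();
for (const auto& kv : vect)                              // keys arrive largest-first
    hint = m.emplace_hint(hint, kv.first, kv.second);    // amortized O(1) per insertion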
Constructing map with a range won't make a difference.
It is hard to recommend the best data structure for you without knowing details about the program and data it handles. Generally, RB-tree is the best you can get for general case (and that's why it is an implementation of choice for std::map).
Hope it helps. Good Luck!
I decided this was interesting enough (an outright bug in the standard that lasted 13 years) to add as an answer.
Section 23.1.2 of the C++03 specification says, concerning the "hinted" version insert(p,t), that the complexity is:
logarithmic in general, but amortized constant if t is inserted right after p
What this means is that if you insert n elements in sorted order, providing the correct hint each time, then the total time will be O(n), not O(n log n). Even though some individual insertions will take logarithmic time, the average time per insertion will still be constant.
C++11 finally fixed the wording to read "right before p" instead of "right after p", which is almost certainly what was meant in the first place... And the corrected wording actually makes it possible to use the "hint" when inserting elements in either forward or reverse order (i.e. passing container.end() or container.begin() as the hint).
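For illustration, under the C++11 wording (both loops run in O(n) total):

#include <set>
#include <vector>

std::vector<int> ascending{1, 2, 3, 4};
std::vector<int> descending{4, 3, 2, 1};

std::set<int> a, b;
for (int x : ascending)  a.insert(a.end(), x);     // each element belongs right before end()
for (int x : descending) b.insert(b.begin(), x);   // each element belongs right before begin()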

Are vectors in c++ so slow?

I am making a program that solves this problem here: http://opc.iarcs.org.in/index.php/problems/BOOKLIST
and the only places I am using vectors are:
for(int i =0;i < total_books; i++){
int temp;
cin >> temp;
books_order.push_back(temp);
}
and
for(int i = 0;i < total_entries; i++){
int index;
cin >> index;
index--;
cout << books_order[index] << endl;
books_order.erase(books_order.begin()+index);
}
(here is my full code: http://cpaste.org/1377/)
The only functions I am using from vectors are vector::erase, vector::push_back and vector::begin.
For large inputs my code takes more than 3 seconds (which is the time limit for that problem), but when I remove the vector functions, it runs much faster (but gives a wrong answer, of course).
I understand that using vectors this way is the wrong approach for this problem and that my algorithm is very slow, but I do not understand why it is slow.
If you know why the functions I am using are slow, please explain.
Thank you.
The only functions i am using from vectors are vector::erase, vector::push_back and vector::begin
I would say this is your problem. vector::erase is O(n), the others are constant time and should not cause efficiency issues in online judge problems. Avoid the erase method.
However, in general, if you know the size of your array in advance, I would just use a simple array. Don't add any unnecessary overhead if you can help it.
To solve the problem you linked to though, consider using a set: its erase method is O(log n), as are its insertion methods. That is what you should use generally if you need random removals.
Edit: I forgot that C++ had priority queues too, so look into those as well. Since you only care about the max here, it might be faster than a set, although they both have the same theoretical complexity in this case (a heap can retrieve the min/max in O(1), but removing it, which you must do for this problem, is O(log n)), and either one should work just fine.
IMO the culprit is vector::erase, as it will shift all the elements after the one removed. So unless you're always removing the last element, it can be quite slow.
Unless you have called reserve to size your vector to the number of elements you have, calls to push_back will reallocate memory every time the new size exceeds the vector's capacity. Consider using reserve to allocate memory sufficient for your elements and you should see a performance speed-up, particularly if the vector is large. Also take heed of the other answers regarding use of erase.
Your problem is the call to erase. If you check the documentation here, you'll note that:
Because vectors keep an array format, erasing on positions other than the vector end also moves all the elements after the segment erased to their new positions, which may not be a method as efficient as erasing in other kinds of sequence containers (deque, list).
either run the loop from the end or erase after the loop.
Removing elements from the middle of a vector (with vector::erase) is slow. This is because all the higher elements of the vector must be moved down to fill the gap left by the element you've removed.
So this is not a case of vectors being slow, but of your algorithm being slow because you've chosen an inappropriate data structure.
If you know that the vector has a certain number of elements in it in advance you can reserve enough space to contain everything to save yourself reallocations of space within the vector:
books_order.reserve(total_books);
for(int i = 0; i < total_books; ++i){
    int temp;
    cin >> temp;
    books_order.push_back(temp);
}
However, that's not the issue here; I'm sure if you profiled your code you would see a large overhead from the .erase() usage. Removing elements from the middle of a vector is slow because of how a vector is implemented: it needs to move all the elements past the one that is removed. Consider using a different data structure instead, such as std::array.
Do you really need a vector to achieve what you are doing?
You have to keep in mind how vectors work: They're essentially dynamic arrays that resize as you add more items in them. Resizing a vector is an O(n) operation (where n is the number of items in your vector) as it allocates new memory and copies the items over to the new vector.
You can read about what vectors are good at here.
I would recommend using a standard array instead -- which seems completely plausible since you already know how many total elements there are (which is total_books)

std::set<T>::insert, duplicate elements

What would be an efficient implementation for a std::set insert member function? Because the data structure sorts elements based on std::less (operator < needs to be defined for the element type), it is conceptually easy to detect a duplicate.
How does it actually work internally? Does it make use of a red-black tree data structure (an implementation detail mentioned in Josuttis's book)?
Implementations of the standard data structures may vary...
I have a problem where I am forced to have (generally speaking) sets of integers which should be unique. The length of the sets varies, so I need a dynamic data structure (based on my narrow knowledge, this narrows things down to list and set). The elements do not necessarily need to be sorted, but there may be no duplicates. Since the candidate sets always have a lot of duplicates (the sets are small, up to 64 elements), will trying to insert duplicates into a std::set with the insert member function cause a lot of overhead compared to a std::list plus some other algorithm that doesn't rely on keeping the elements sorted?
Additional: the output set has a fixed size of 27 elements. Sorry, I forgot this... this works for a special case of the problem. For other cases, the length is arbitrary (lower than the input set).
If you're creating the entire set all at once, you could try using std::vector to hold the elements, std::sort to sort them, and std::unique to prune out the duplicates.
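A minimal sketch of that approach (the input values are just placeholders):

#include <algorithm>
#include <vector>

std::vector<int> v{7, 3, 7, 1, 3, 9};
std::sort(v.begin(), v.end());
v.erase(std::unique(v.begin(), v.end()), v.end());   // v is now {1, 3, 7, 9}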
The complexity of std::set::insert is O(log n), or amortized O(1) if you use the "positional" insert and get the position correct (see e.g. http://cplusplus.com/reference/stl/set/insert/).
The underlying mechanism is implementation-dependent. It's often a red-black tree, but this is not mandated. You should look at the source code for your favourite implementation to find out what it's doing.
For small sets, it's possible that e.g. a simple linear search on a vector will be cheaper, due to spatial locality. But the insert itself will require all the following elements to be copied. The only way to know for sure is to profile each option.
When you only have 64 possible values known ahead of time, just take a bit field and flip on the bits for the elements actually seen. That works in n+O(1) steps, and you can't get less than that.
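Sketch, assuming the 64 possible values map onto bit indices 0 through 63:

#include <cstdint>
#include <vector>

std::vector<int> input{3, 17, 3, 42};     // example input containing duplicates
std::uint64_t seen = 0;
for (int v : input)                       // each v is assumed to be in [0, 64)
    seen |= std::uint64_t{1} << v;        // one O(1) bit flip per element
bool has42 = (seen >> 42) & 1;            // membership test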
Inserting into a std::set of size m takes O(log(m)) time and comparisons, meaning that using an std::set for this purpose will cost O(n*log(n)) and I wouldn't be surprised if the constant were larger than for simply sorting the input (which requires additional space) and then discarding duplicates.
Doing the same thing with an std::list would take O(n^2) average time, because finding the insertion place in a list needs O(n).
Inserting one element at a time into an std::vector would also take O(n^2) average time – finding the insertion place is doable in O(log(m)), but elements need to be moved to make room. If the number of elements in the final result is much smaller than the input, that drops down to O(n*log(n)), with close to no space overhead.
If you have a C++11 compiler or use boost, you could also use a hash table. I'm not sure about the insertion characteristics, but if the number of elements in the result is small compared to the input size, you'd only need O(n) time – and unlike the bit field, you don't need to know the potential elements or the size of the result a priori (although knowing the size helps, since you can avoid rehashing).
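For example, with the C++11 hash set (duplicates are dropped on insertion):

#include <unordered_set>
#include <vector>

std::vector<int> input{3, 17, 3, 42};
std::unordered_set<int> unique(input.begin(), input.end());   // unique holds {3, 17, 42} in some order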

STL: Set of natural numbers from A to B

I want to add natural numbers from A to B in a set. Currently I am inserting each and every number from A to B, one by one in the set like this,
set<int> s;
for(int j=A; j<=B; j++)
s.insert(j);
But it takes O(n) time (here n = (B - A)+1). Is there any pre-defined way in STL to do it in O(1) time?
Thanks
Allocating memory to hold n numbers is always going to take at least O(n), so I think you're out of luck.
Technically I believe this is O(n log n), because set.insert is O(log n). O(n) is the best you can do, I think, but for that you would need to use an unsorted container like a vector or list.
No. The shortest amount of time it takes to fill a container with sequential values is O(n) time.
With the STL set container you will never get O(1) time. You may be able to reduce the running time by using the set(InputIterator f, InputIterator l, const key_compare& comp) constructor and passing in a custom iterator that iterates over the given integer range. The reason this may run faster (depending on the STL implementation, compiler, etc.) is that you are reducing the call stack depth. In your snippet, you go all the way down from your .insert() call to the actual insertion and back for each integer. Using the alternate constructor, the increment operation is moved down into the frame in which the insertion is performed. The increment operation would now have the possible overhead of a function call if your compiler can't inline it. You should benchmark this before taking this approach, though; it may be slower if your STL implementation has a shallow call stack for .insert().
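A hedged sketch of the constructor-based idea, using C++20's std::views::iota in place of a hand-written counting iterator:

#include <ranges>
#include <set>

int A = 5, B = 12;
auto range = std::views::iota(A, B + 1);        // lazily yields A, A+1, ..., B
std::set<int> s(range.begin(), range.end());    // one constructor call; still O(n) overall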
In general though, if you need a set of a contiguous range of integers, you could see massive performance gains by implementing a specialized set class that can store and compare only the upper and lower bounds of each set.
O(1) is only true for default constructor.
O(n) for the copy constructors and sorted sequence insertion using iterators.
O(log n!) for unsorted sequence insertion using iterators.
Well, if you want to go completely outside the box, you could design a "lazy-loaded" array, custom to this task. Basically, upon access, if the value had not been previously set, it would determine the correct value.
This would allow the setup to be O(1) (assuming initializing the "not previously set" flags is itself O(1)), but it wouldn't speed up the overall operation; it would just scatter that time over the rest of the run (it would probably take longer overall).