Why is removing the first element of an ArrayList slow?

Somewhere I've read that removing the first element, arrayList.remove(0), is slower than removing the last one, arrayList.remove(arrayList.size() - 1). Can someone provide a detailed explanation? Thanks in advance.

In an ArrayList the elements reside in contiguous memory locations.
So when you remove the first element, all elements from 2 to n have to be shifted, which takes O(n) time.
E.g. if you remove 1 from [1, 2, 3, 4], then 2, 3 and 4 have to be shifted to the left to maintain contiguous memory allocation.
This makes it slower.
On the other hand, if you remove the last element, no shifting is required, since all the remaining elements are already in their proper places.

Implementation of remove:
public E remove(int index) {
    rangeCheck(index);

    modCount++;
    E oldValue = elementData(index);

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index + 1, elementData, index,
                         numMoved);
    elementData[--size] = null; // Let gc do its work

    return oldValue;
}
The values are stored in an array, so removing the last one only sets its slot in the array to null (elementData[--size] = null). But if the element is anywhere else, all the elements after it have to be moved with arraycopy. So in the code above you can see it clearly: any index < size - 1 implies the arraycopy call (the extra time used).
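The cost difference is easy to demonstrate with any contiguous container. Here is a minimal sketch in C++ using std::vector, which lays its elements out the same way (the element count is arbitrary):

#include <chrono>
#include <iostream>
#include <vector>

int main()
{
    using clk = std::chrono::steady_clock;
    const int n = 50000;
    std::vector<int> a(n, 1), b(n, 1);

    auto t0 = clk::now();
    while (!a.empty()) a.erase(a.begin()); // remove first: shifts all the rest
    auto t1 = clk::now();
    while (!b.empty()) b.pop_back();       // remove last: no shifting
    auto t2 = clk::now();

    auto ms = [](auto d) {
        return std::chrono::duration_cast<std::chrono::milliseconds>(d).count();
    };
    std::cout << "front: " << ms(t1 - t0) << " ms, back: "
              << ms(t2 - t1) << " ms\n";
}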

Related

Shrink QVector to last 10 elements & rearrange one item

I'm trying to find the most "efficient" way, or at least one fast enough for a 10k-item vector, to shrink it to its last 10 items and move the last selected item to the end of it.
I initially thought of using this method for shrinking:
QVector<QModelIndex> newVec(listPrimary.end() - 10, listPrimary.end());
But that does not work, and I'm not sure how to use the Qt iterators / std to get it to work...
And then once that's done, do this test:
if (newVec.contains(lastItem))
{
    newVec.insert(newVec[vewVec.indexOf(newVec)], newVec.size());
}
else
{
    newVec.push_back(lastItem);
}
QVector Class has a method that does what you want:
QVector<T> QVector::mid(int pos, int length = -1) const
Returns a sub-vector which contains elements from this vector, starting at position pos. If length is -1 (the default), all elements after pos are included; otherwise length elements (or all remaining elements if there are fewer than length of them) are included.
So as suggested in the comments, you can do something like this:
auto newVec = listPrimary.mid(listPrimary.size() - 10);
You do not have to pass length because its default value ensures that all elements after pos are included.
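Putting both steps together, a minimal sketch (QVector<int> stands in for the asker's QVector<QModelIndex> for brevity, and lastItem stands for the last selected item):

#include <QDebug>
#include <QVector>

int main()
{
    QVector<int> listPrimary;
    for (int i = 0; i < 10000; ++i)
        listPrimary.push_back(i);
    int lastItem = 9995; // hypothetical last selected item

    // Keep only the last 10 elements.
    QVector<int> newVec = listPrimary.mid(listPrimary.size() - 10);

    // Move lastItem to the end: remove it if present, then append.
    int i = newVec.indexOf(lastItem);
    if (i != -1)
        newVec.removeAt(i);
    newVec.push_back(lastItem);

    qDebug() << newVec;
    return 0;
}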

Is this code a bubble sorting program?

I made a simple bubble sorting program; the code works, but I do not know if it's correct.
What I understand about the bubble sorting algorithm is that it checks an element against the element beside it.
#include <iostream>
#include <array>
#include <cstdlib> // for system()
using namespace std;

int main()
{
    int a, b, c, d, e, smaller = 0, bigger = 0;
    cin >> a >> b >> c >> d >> e;
    int test1[5] = { a, b, c, d, e };
    for (int test2 = 0; test2 != 5; ++test2)
    {
        for (int cntr1 = 0, cntr2 = 1; cntr2 != 5; ++cntr1, ++cntr2)
        {
            if (test1[cntr1] > test1[cntr2]) /* if first is bigger than second */
            {
                bigger = test1[cntr1];
                smaller = test1[cntr2];
                test1[cntr1] = smaller;
                test1[cntr2] = bigger;
            }
        }
    }
    for (auto test69 : test1)
    {
        cout << test69 << endl;
    }
    system("pause");
}
It is a bubble sort implementation. It is just a very basic one.
Two improvements:
each pass (the inner loop) can be made one iteration shorter each time, since you're guaranteed that the largest element of the previous pass has settled at the end.
when no swap is done during a pass, you're finished (which is part of the definition of bubble sort on Wikipedia).
Some comments:
use better variable names (test2?)
use the size of the container or the range, don't hardcode 5.
using std::swap() to swap variables leads to simpler code.
Below is a more generic sketch using (random access) iterators with these suggested improvements, including the refinement proposed by Yves Daoust (iterate only up to the last swap).
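This is a sketch, not a canonical implementation; the iterator interface and the name bubbleSort are my own choices:

#include <algorithm>
#include <iostream>
#include <vector>

// Bubble sort with both improvements: each pass only runs up to the
// position of the last swap of the previous pass, and the sort stops
// when a pass performs no swap at all.
template <typename RandomIt>
void bubbleSort(RandomIt first, RandomIt last)
{
    auto end = last;
    while (end - first > 1)
    {
        RandomIt lastSwap = first;          // everything after this is sorted
        for (RandomIt it = first; it + 1 != end; ++it)
        {
            if (*(it + 1) < *it)
            {
                std::iter_swap(it, it + 1); // std::swap on the elements
                lastSwap = it + 1;
            }
        }
        end = lastSwap;                     // shrink the next pass
    }
}

int main()
{
    std::vector<int> v{3, 5, 1, 8, 2};
    bubbleSort(v.begin(), v.end());
    for (int x : v) std::cout << x << ' ';  // 1 2 3 5 8
    std::cout << '\n';
}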
The correctness of your algorithm can be explained as follows.
In the first pass (inner loop), the comparison T[i] > T[i+1] with a possible swap makes sure that the larger of T[i], T[i+1] ends up on the right. Repeating this for all pairs from left to right makes sure that in the end T[N-1] holds the largest element. (The fact that the array is only modified by swaps ensures that no element is lost or duplicated.)
In the second pass, by the same reasoning, the largest of the N-1 first elements goes to T[N-2], and it stays there because T[N-1] is larger.
More generally, in the Kth pass, the largest of the N-K+1 first elements goes to T[N-K], stays there, and the elements after it are left unchanged (because they are already in increasing order).
Thus, after N passes, all elements are in place.
This hints at a simple optimization: all elements following the last swap in a pass are already in place (otherwise that swap wouldn't have been the last). So you can record the position of the last swap and run the next pass only up to that location.
Though this change doesn't seem like much, it can significantly reduce the number of passes. Indeed, with this procedure the number of passes equals the largest leftward displacement, i.e. the largest number of steps an element has to take to reach its proper place (elements that belong further left move only one position per pass).
In some configurations this number can be small. For instance, sorting an already sorted array takes a single pass, and sorting an array with all elements swapped in pairs takes two. This is an improvement from O(N²) to O(N)!
Yes. Your code works just like Bubble Sort.
Input: 3 5 1 8 2
Output after each iteration:
3 1 5 2 8
1 3 2 5 8
1 2 3 5 8
1 2 3 5 8
1 2 3 5 8
1 2 3 5 8
Actually, in the inner loop, we don't need to go all the way to the end of the array from the second iteration onwards, because the heaviest element of the previous iteration is already at the end. But that doesn't improve the worst-case time complexity much. So, you are good to go.
Small Informal Proof:
The idea behind your sorting algorithm is that you go through the array of values (left to right). Let's call it a pass. During the pass, pairs of values are checked and swapped into the correct order (higher on the right).
During the first pass, the maximum value will be reached. Once reached, the max is higher than the value next to it, so they get swapped. This means the max becomes part of the next pair in the pass. This repeats until the pass is completed and the max arrives at the right end of the array.
During the second pass, the same holds for the second-highest value in the array. The only difference is that it will not be swapped with the max at the end. Now the two rightmost values are correctly placed.
In every subsequent pass, one more value is sorted out to the right.
There are N values and N passes. This means that after N passes all N values will be sorted, like:
{Nth largest, (N-1)th largest, ..., 2nd largest, largest}
No it isn't. It is worse. There is no point whatsoever in carrying two inner-loop counters: cntr2 is always cntr1 + 1, so a single index would do. And you should be referring to one of the many canonical implementations of bubble sort rather than trying to make it up for yourself.

C++ vector.erase() function bug

I have this vector:
list.push_back("one");
list.push_back("two");
list.push_back("three");
I use list.erase(list.begin() + 1) to delete the "two" and it works. But when I try to output the list again:
cout<<list[0]<<endl;
cout<<list[1]<<endl;
cout<<list[2]<<endl;
produces:
one
three
three
I tried targeting the last element for erasing with list.erase(list.begin() + 2), but the duplicate "three"s remain. I imagined index 2 should have been shifted and list[2] should have output nothing. list[3] outputs nothing, as it should.
I'm trying to erase the "two" and output the list as only:
one
three
When using cout << list[2] << endl; you assume that you still have three elements, but in fact you are accessing leftover data in a part of memory that is no longer in use.
You should use list.size() to obtain the number of elements. So, something like:
for (size_t i = 0; i < list.size(); i++)
{
    cout << list[i] << endl;
}
But you erased the element, thus the size of your container was decreased by one, i.e. from 3 to 2.
So, after the erase, you shouldn't do this:
cout<<list[0]<<endl;
cout<<list[1]<<endl;
cout<<list[2]<<endl; // Undefined Behaviour!!
but this:
cout<<list[0]<<endl;
cout<<list[1]<<endl;
In your case, the "three" is just copied to index 1, which is expected, and vector.size() == 2 now.
This is because vector pre-allocates storage, which helps improve performance.
To keep from having to resize with every change, vector grabs a block of memory bigger than it needs and keeps it until forced to get bigger or instructed to get smaller.
To brutally simplify, think of it as
string* array = new string[100];
int capacity = 100;
int size = 0;
In this case you can write all through that 100 element array without the program crashing because it is good and valid memory, but only values beneath size have been initialized and are meaningful. What happens when you read above size is undefined. Because reading out of bounds is a bad idea and preventing it has a performance cost that should not be paid by correct usage, the C++ standard didn't waste any time defining what the penalty for doing so is. Some debug or security critical versions will test and throw exceptions or mark unused portions with a canary value to assist in detecting faults, but most implementations are aiming for maximum speed and do nothing.
Now you push_back "one", "two", and "three". The array is still 100 elements, capacity is still 100, but size is now 3.
You erase array[1], and every element after index 1 up to size is copied one position toward the front (note the potentially huge performance cost here: vector is not the right data structure choice if you are adding and removing items at random locations), and size is reduced by one, resulting in "one", "three", and "three". The array is still 100 elements, capacity is still 100, but size is now 2.
Say you add another 99 strings. Each added string pushes size up by one, and when size would exceed capacity, a new array is made, the old array is copied to the new one, and the old one is freed. Something along the lines of:
capacity *= 1.5;
string* temp = new string[capacity];
for (int index = 0; index < size; index++)
{
    temp[index] = array[index];
}
delete[] array; // note: array form of delete
array = temp;
The array is now 150 elements, capacity is now 150, and size is now 101.
Result:
There is usually a bit of fluff around the end of a vector that will allow reading out of bounds without the program crashing, but do not confuse this with the program working.
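To watch size and capacity diverge with the real container, a small sketch (the capacity values printed are implementation-defined):

#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> list;
    list.push_back("one");
    list.push_back("two");
    list.push_back("three");
    std::cout << "size=" << list.size()
              << " capacity=" << list.capacity() << '\n';

    list.erase(list.begin() + 1); // removes "two"; "three" is copied down
    std::cout << "size=" << list.size()
              << " capacity=" << list.capacity() << '\n';

    for (const auto& s : list)    // iterate only over valid elements
        std::cout << s << '\n';   // prints: one, three
}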

Find a single element in an array of consecutive duplicate elements

Given an array of elements where every element is repeated except a single element, and all the repeated elements are consecutive to each other, we need to find the index of that single element.
Note:
the array may not be sorted
expected time is O(log n)
the range of elements can be anything
O(n) is trivial, but how can I figure out O(log n)?
I gave a thought to bitwise operators too, but nothing worked out.
Also, I am unable to make use of the statement that all the repeated elements are consecutive to each other.
Ex: 2 2 3 3 9 9 1 1 5 6 6
output: 5
It can be done in O(log n) by checking whether arr[2k] == arr[2k+1] for some k >= 0: if it is, then the distinct element is AFTER index 2k+1; if it's not, then it is at or before index 2k (note that the single element always sits at an even index, since everything before it comes in pairs).
This lets you effectively trim half of the array at each step by checking the middle value and recursing only on a problem half as big, which is O(log n) overall.
Python code:
def findUnique(arr, l, r):
    # Call as findUnique(arr, 0, len(arr) - 1); l stays even throughout.
    if r - l < 2:
        return (arr[l], l)
    mid = (r - l) // 2 + l          # integer division (was / in Python 2)
    if mid % 2 != 0:
        flag = -1                   # mid + flag is the even start of a pair
    else:
        flag = 0
    # mid itself is the unique element if it matches neither neighbour.
    if (mid == 0 or arr[mid-1] != arr[mid]) and (mid == len(arr)-1 or arr[mid] != arr[mid+1]):
        return (arr[mid], mid)
    if arr[mid+flag] == arr[mid+1+flag]:
        # pairing intact: the unique element is to the right of this pair
        return findUnique(arr, mid + flag + 2, r)
    # pairing broken: the unique element is at or before the pair's start
    return findUnique(arr, l, mid + flag)
Assuming each element is repeated exactly twice except one, it is easy.
The first answer is correct; I just feel I could elaborate a bit on it.
So, lets take your example array.
a = [2 2 3 3 9 9 1 1 5 6 6];
If all elements were paired, then you can take an even index and know for sure that the next element will be the same.
a[0] = 2;
a[1] = 2; //as well
a[2] = 3;
a[3] = 3; //as well
General case:
a[k] = a[k+1] = x;
where k is even, and x is some value.
BUT, in your case, we know that there is one index that doesn't follow this rule.
In order to find it, we can use binary search (just for reference), with a bit of extra computation in the middle.
We go somewhere in the middle and grab an element with an even index.
If that element's value equals the next element's value, then your lonely value is in the second part of the array, because the pairing hasn't been broken yet.
If those values are not equal, then either your lonely value is in the first half OR you are at it (it is in the middle).
You will need to check a couple of elements before and after to make sure.
By cutting your array in half with each iteration, you achieve O(log n) time.
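The same idea, written iteratively in C++ (findUniqueIndex is my name for it; it assumes every element appears exactly twice except one, so the array length is odd):

#include <iostream>
#include <vector>

// Invariant: the unique element lies in [lo, hi] and lo is even.
int findUniqueIndex(const std::vector<int>& a)
{
    int lo = 0, hi = (int)a.size() - 1;
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;
        if (mid % 2 != 0)
            --mid;            // align mid to an even index
        if (a[mid] == a[mid + 1])
            lo = mid + 2;     // pairing intact: unique is to the right
        else
            hi = mid;         // pairing broken at or before mid
    }
    return lo;
}

int main()
{
    std::vector<int> a{2, 2, 3, 3, 9, 9, 1, 1, 5, 6, 6};
    int i = findUniqueIndex(a);
    std::cout << "unique " << a[i] << " at index " << i << '\n';
}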

Find the element with the longest distance in a given array where each element appears twice?

Given an array of int, each int appears exactly TWICE in the array. Find and return the int such that this pair of ints has the max distance between each other in this array.
e.g. [2, 1, 1, 3, 2, 3] (positions are 1-based):
2: d = 5-1 = 4;
1: d = 3-2 = 1;
3: d = 6-4 = 2;
return 2
My ideas:
Use a hashmap: the key is a[i], and the value is its index. Scan a[] and put each number into the hash. If a number is hit a second time, subtract the stored index from the current index and use the result to update that number's value in the hash.
After that, scan the hash and return the key with the largest value (distance).
This is O(n) in time and space.
How to do it in O(n) time and O(1) space?
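For reference, the hashmap approach described above, sketched in C++ (maxDistanceValue is my name for it):

#include <iostream>
#include <unordered_map>
#include <vector>

// O(n) time, O(n) space: map each value to the index of its first
// occurrence; on the second occurrence, update the best distance.
int maxDistanceValue(const std::vector<int>& a)
{
    std::unordered_map<int, int> firstIndex;
    int best = -1, bestDist = -1;
    for (int i = 0; i < (int)a.size(); ++i)
    {
        auto it = firstIndex.find(a[i]);
        if (it == firstIndex.end())
            firstIndex.emplace(a[i], i);
        else if (i - it->second > bestDist)
        {
            bestDist = i - it->second;
            best = a[i];
        }
    }
    return best;
}

int main()
{
    std::vector<int> a{2, 1, 1, 3, 2, 3};
    std::cout << maxDistanceValue(a) << '\n'; // prints 2
}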
You would like to have the maximal distance, so I assume the number you search for is more likely to be near the start and the end of the array. This is why I would loop over the array from the start and the end at the same time.
[2, 1, 1, 3, 2, 3]
Check if 2 == 3?
Store a map of numbers and positions: [2 => 1, 3 => 6]
Check if 1 or 2 is in [2 => 1, 3 => 6]?
I know that is not even pseudocode and not complete, but it's just to give out the idea.
Set the iLeft index to the first element and the iRight index to the second element.
Increment iRight until you find a copy of the left item or reach the end of the array. In the first case, remember the distance.
Increment iLeft. Start searching from the new iRight.
The start value of iRight never decreases.
Delphi code:
iLeft := 0;
iRight := 1;
while iRight < Len do begin // Len = array size
  while (iRight < Len) and (A[iRight] <> A[iLeft]) do
    Inc(iRight); // iRight++
  if iRight < Len then begin
    BestNumber := A[iLeft];
    MaxDistance := iRight - iLeft;
  end;
  Inc(iLeft); // iLeft++
  iRight := iLeft + MaxDistance;
end;
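A direct C++ translation of that sketch (the names are mine; maxDistance is seeded to 1 so the first search starts at iLeft + 1, matching iRight := 1 above):

#include <iostream>
#include <vector>

// For each left index, the search for its copy starts no earlier than
// iLeft + the best distance so far, so the start position never decreases.
int maxDistanceTwoPointer(const std::vector<int>& A)
{
    const int len = (int)A.size();
    int bestNumber = -1;
    int maxDistance = 1;
    int iLeft = 0, iRight = 1;
    while (iRight < len)
    {
        while (iRight < len && A[iRight] != A[iLeft])
            ++iRight;
        if (iRight < len)   // found a copy at least maxDistance away
        {
            bestNumber = A[iLeft];
            maxDistance = iRight - iLeft;
        }
        ++iLeft;
        iRight = iLeft + maxDistance;
    }
    return bestNumber;
}

int main()
{
    std::vector<int> a{2, 1, 1, 3, 2, 3};
    std::cout << maxDistanceTwoPointer(a) << '\n'; // prints 2
}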
This algorithm is O(1) space (with some cheating), O(n) time (on average), needs the source array to be non-const, and destroys it at the end. It also limits the possible values in the array (three bits of each value must be reserved for the algorithm).
Half of the answer is already in the question: use a hashmap. If a number is hit twice, use the index difference, update the best-so-far result, and remove this number from the hashmap to free space. To make it O(1) space, just reuse the source array: convert the array to a hashmap in-place.
Before turning an array element into a hashmap cell, remember its value and position. After this it may be safely overwritten. Then use this value to calculate its position in the hashmap and overwrite that cell. Elements are shuffled this way until an empty cell is found. To continue, select any element that has not yet been reordered. When everything is reordered, every int pair has definitely been hit twice; at this point we have an empty hashmap and an updated best-result value.
One reserved bit is used while converting array elements to hashmap cells. At the beginning it is cleared. When a value is reordered into its hashmap cell, this bit is set. If this bit is not set for the overwritten element, that element is simply taken to be processed next. If this bit is set for the element to be overwritten, there is a conflict: pick the first unused element (with this bit not set) and overwrite it instead.
2 more reserved bits are used to chain conflicting values. They encode the positions where a chain is started/ended/continued. (It may be possible to optimize this algorithm so that only 2 reserved bits are needed...)
A hashmap cell should contain these 3 reserved bits, the original value's index, and some information to uniquely identify the element. To make this possible, the hash function should be reversible, so that part of the value may be restored given its position in the table. In the simplest case, the hash function is just the ceil(log(n)) least significant bits. The value in the table consists of 3 fields:
3 reserved bits
32 - 3 - (ceil(log(n))) high-order bits from the original value
ceil(log(n)) bits for element's position in the original array
Time complexity is O(n) only on average; worst case complexity is O(n^2).
Another variant of this algorithm is to transform the array into a hashmap sequentially: at each step m, the first 2^m elements of the array have been converted to hashmap cells. A constant-sized array may be interleaved with the hashmap to improve performance when m is low. When m is high, there should be enough int pairs that are already processed and no longer need space.
There is no way to do this in O(n) time and O(1) space.