Insertion into STL vector - c++

With a C++ STL vector we are building a vector of N elements and for some reason we chose to insert them at the front of the vector. Every element insertion at the front of a vector forces the shift of all existing elements by 1. This results in (1+2+3+...+N) overall shifts of vector elements, which is (N/2)(N+1) shifts.
My question is how the author came up with (1+2+3+...+N). I thought it should be 1+1+1+...+1 (N times), since each insertion moves the elements by one position to make room at the beginning?
Thanks!

From [vector.modifiers]/2 (which describes vector::insert):
Complexity: The complexity is linear in the number of elements inserted plus the distance to the end of the vector.
Each time that you add an element the distance to the end of the vector is increased by one.
The first time that you add an element, there is 1 to be inserted and the distance to the end is 0, so the complexity is 1 + 0 = 1. The second time, there is 1 to be inserted, and the distance to the end is 1, so the complexity is 1 + 1 = 2. The third time, the distance to the end is 2, so the complexity is 1 + 2 = 3. This is what creates the 1 + 2 + 3 + ... + N pattern that the author is describing.

At insertion n, there are n elements currently in the vector that need to be shifted.
vector<int> values;
for (size_t i = 0; i < N; ++i)
{
//At this point there are `i` elements in the vector that need to be moved
//to make room for the new element
values.insert(values.begin(), 0);
}

The first value is shifted N-1 times: each time a new value is inserted, it has to move. The second value is shifted N-2 times because only N-2 values are added after it. The next value is shifted N-3 times, and so on. The last value is not shifted.
I don't know why the author speaks about N and not N-1. But the reason for your confusion is that the author counts the shifts of each single value, while you count the number of shift processes, each of which involves more than one single-value shift.
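To make the counting concrete, here is a minimal sketch that tallies the single-element shifts while inserting at the front. The function name count_front_insert_shifts is made up for this illustration. It counts only the moves of already-present elements, so it yields 0+1+...+(N-1) = N(N-1)/2; if you also count placing the new element itself (as the [vector.modifiers] wording does), you get 1+2+...+N = N(N+1)/2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns how many single-element shifts happen
// when n values are inserted one by one at the front of a vector.
std::size_t count_front_insert_shifts(std::size_t n) {
    std::vector<int> values;
    std::size_t shifts = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Every element already in the vector moves one slot to the right.
        shifts += values.size();
        values.insert(values.begin(), 0);
    }
    assert(values.size() == n);
    return shifts;
}
```

Either way of counting grows quadratically in N, which is the point the author is making.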

Related

How to erase elements more efficiently from a vector or set?

Problem statement:
Input:
First two inputs are integers n and m. n is the number of knights fighting in the tournament (2 <= n <= 100000, 1 <= m <= n-1). m is the number of battles that will take place.
The next line contains n power levels.
The next m lines contain two integers l and r, indicating the range of knight positions to compete in the ith battle.
After each battle, all knights apart from the one with the highest power level will be eliminated.
The range for each battle is given in terms of the new positions of the knights, not the original positions.
Output:
Output m lines, the ith line containing the original positions (indices) of the knights from that battle. Each line is in ascending order.
Sample Input:
8 4
1 0 5 6 2 3 7 4
1 3
2 4
1 3
0 1
Sample Output:
1 2
4 5
3 7
0
Here is a visualisation of this process.
1 2
[(1,0),(0,1),(5,2),(6,3),(2,4),(3,5),(7,6),(4,7)]
-----------------
4 5
[(1,0),(6,3),(2,4),(3,5),(7,6),(4,7)]
-----------------
3 7
[(1,0),(6,3),(7,6),(4,7)]
-----------------
0
[(1,0),(7,6)]
-----------
[(7,6)]
I have solved this problem. My program produces the correct output; however, it is O(n*m) = O(n^2). I believe that if I erase knights more efficiently from the vector, efficiency can be increased. Would it be more efficient to erase elements using a set? I.e. erase contiguous segments rather than individual knights. Is there an alternative way to do this that is more efficient?
#define INPUT1(x) scanf("%d", &x)
#define INPUT2(x, y) scanf("%d%d", &x, &y)
#define OUTPUT1(x) printf("%d\n", x);
int main(int argc, char const *argv[]) {
int n, m;
INPUT2(n, m);
vector< pair<int,int> > knights(n);
for (int i = 0; i < n; i++) {
int power;
INPUT1(power);
knights[i] = make_pair(power, i);
}
while(m--) {
int l, r;
INPUT2(l, r);
int max_in_range = knights[l].first;
for (int i = l+1; i <= r; i++) if (knights[i].first > max_in_range) {
max_in_range = knights[i].first;
}
int offset = l;
int range = r-l+1;
while (range--) {
if (knights[offset].first != max_in_range) {
OUTPUT1(knights[offset].second);
knights.erase(knights.begin()+offset);
}
else offset++;
}
printf("\n");
}
}
Well, removing from vector wouldn't be efficient for sure. Removing from set, or unordered set would be more effective (use iterators instead of indexes).
Yet the problem will still remain O(n^2), because you have two nested whiles running n*m times.
--EDIT--
I believe I understand the question now :)
First let's calculate the complexity of your code above. Your worst case is when every battle has a range of 1 (two knights per battle) and the battles are not ordered by position. That means you have m battles (in this case m = n-1, which is O(n)).
The first while loop runs n times.
The for loop runs once each time, which makes n*1 = n in total.
The second while loop runs once each time, which makes n again.
Deleting from a vector means up to n-1 shifts, which makes each deletion O(n).
Thus, with the cost of vector deletion, the total complexity is O(n^2).
First of all, you don't really need the inner for loop. Take the first knight as the max in range, compare the rest in the range one-by-one and remove the defeated ones.
Now, I believe it can be done in O(n log n) using std::map. The key of the map is the position and the value is the level of the knight.
Before proceeding, note that finding and removing an element in a map is logarithmic, and stepping an iterator to the next element is amortized constant.
Finally, your code should look like:
while(m--) // n times
strongest = map.find(first_position); // find is log(n) --> n*log(n)
for (opponent = next of strongest; // this will run 1 times, since every range is 1
opponent in range;
opponent = next opponent) // iterating is constant
// removing from map is log(n) --> n * 1 * log(n)
if strongest < opponent
remove strongest, opponent is the new strongest
else
remove opponent, (be careful to remove it after iterating to next)
Ok, now the upper bound would be O(2*nlogn) = O(nlogn). If the ranges increases, that makes the run time of upper loop decrease but increases the number of remove operations. I'm sure the upper bound won't change, let's make it a homework for you to calculate :)
A solution with a treap is pretty straightforward.
For each query, you need to split the treap by implicit key to obtain the subtree that corresponds to the [l, r] range (it takes O(log n) time).
After that, you can iterate over the subtree and find the knight with the maximum strength. After that, you just need to merge the [0, l) and [r + 1, end) parts of the treap with the node that corresponds to this knight.
It's clear that all parts of the solution except for the subtree traversal and printing work in O(log n) time per query. However, each operation reinserts only one knight and erases the rest of the range, so the size of the output (and the sum of the sizes of the subtrees) is linear in n. So the total time complexity is O(n log n).
I don't think you can solve it with standard STL containers, because there is no standard container that supports both getting an iterator by index quickly and removing arbitrary elements.
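If a treap feels like too much code, one alternative is a Fenwick (binary indexed) tree over "still alive" flags. This is a different technique from the treap described above, not an implementation of it. It gives the missing "get the k-th remaining knight" operation in O(log n), and since every knight other than the winner is removed after being visited, the total work is O((n + m) log n). The names AliveIndex and run_battles are invented for this sketch, and it returns the eliminated indices instead of printing them.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Fenwick tree over 0/1 flags: 1 means the knight is still in the tournament.
class AliveIndex {
public:
    explicit AliveIndex(int size) : n_(size), tree_(size + 1, 0) {
        // Linear-time build with every flag set to 1.
        for (int i = 1; i <= n_; ++i) {
            tree_[i] += 1;
            int parent = i + (i & -i);
            if (parent <= n_) tree_[parent] += tree_[i];
        }
    }
    // Mark the knight at this original (0-based) position as eliminated.
    void remove(int original_pos) {
        for (int i = original_pos + 1; i <= n_; i += i & -i) tree_[i] -= 1;
    }
    // Original position of the k-th (0-based) knight that is still alive.
    int kth(int k) const {
        int idx = 0, remaining = k + 1, step = 1;
        while (step * 2 <= n_) step *= 2;
        for (; step > 0; step /= 2) {
            int next = idx + step;
            if (next <= n_ && tree_[next] < remaining) {
                idx = next;
                remaining -= tree_[next];
            }
        }
        return idx;
    }
private:
    int n_;
    std::vector<int> tree_;
};

// For each battle [l, r] (current positions), returns the original
// indices of the eliminated knights in ascending order.
std::vector<std::vector<int>> run_battles(
        const std::vector<int>& power,
        const std::vector<std::pair<int, int>>& battles) {
    AliveIndex alive(static_cast<int>(power.size()));
    std::vector<std::vector<int>> eliminated;
    for (const auto& battle : battles) {
        const int l = battle.first, r = battle.second;
        assert(0 <= l && l <= r);
        // Resolve all current positions before removing anyone.
        std::vector<int> members;
        for (int k = l; k <= r; ++k) members.push_back(alive.kth(k));
        int winner = members.front();
        for (int m : members)
            if (power[m] > power[winner]) winner = m;
        std::vector<int> losers;
        for (int m : members) {
            if (m != winner) {
                alive.remove(m);
                losers.push_back(m);
            }
        }
        eliminated.push_back(losers);
    }
    return eliminated;
}
```

On the sample from the question (powers 1 0 5 6 2 3 7 4 and battles 1 3, 2 4, 1 3, 0 1) this produces the groups {1, 2}, {4, 5}, {3, 7} and {0}.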

Is this code a bubble sorting program?

I made a simple bubble sorting program. The code works, but I do not know if it's correct.
What I understand about the bubble sorting algorithm is that it checks an element and the other element beside it.
#include <iostream>
#include <array>
using namespace std;
int main()
{
int a, b, c, d, e, smaller = 0,bigger = 0;
cin >> a >> b >> c >> d >> e;
int test1[5] = { a,b,c,d,e };
for (int test2 = 0; test2 != 5; ++test2)
{
for (int cntr1 = 0, cntr2 = 1; cntr2 != 5; ++cntr1,++cntr2)
{
if (test1[cntr1] > test1[cntr2]) /*if first is bigger than second*/{
bigger = test1[cntr1];
smaller = test1[cntr2];
test1[cntr1] = smaller;
test1[cntr2] = bigger;
}
}
}
for (auto test69 : test1)
{
cout << test69 << endl;
}
system("pause");
}
It is a bubblesort implementation. It just is a very basic one.
Two improvements:
the outerloop iteration may be one shorter each time since you're guaranteed that the last element of the previous iteration will be the largest.
when no swap is done during an iteration, you're finished. (which is part of the definition of bubblesort in wikipedia)
Some comments:
use better variable names (test2?)
use the size of the container or the range, don't hardcode 5.
using std::swap() to swap variables leads to simpler code.
Here is a more generic example using (random access) iterators with my suggested improvements and comments and here with the improvement proposed by Yves Daoust (iterate up to last swap) with debug-prints
The correctness of your algorithm can be explained as follows.
In the first pass (inner loop), the comparison T[i] > T[i+1] with a possible swap makes sure that the largest of T[i], T[i+1] is on the right. Repeating for all pairs from left to right makes sure that in the end T[N-1] holds the largest element. (The fact that the array is only modified by swaps ensures that no element is lost or duplicated.)
In the second pass, by the same reasoning, the largest of the N-1 first elements goes to T[N-2], and it stays there because T[N-1] is larger.
More generally, in the Kth pass, the largest of the first N-K+1 elements goes to T[N-K], stays there, and the following elements are left unchanged (because they are already in increasing order).
Thus, after N passes, all elements are in place.
This hints at a simple optimization: all elements following the last swap in a pass are in place (otherwise the swap wouldn't be the last). So you can record the position of the last swap and perform the next pass up to that location only.
Though this change doesn't seem to improve a lot, it can reduce the number of passes. Indeed by this procedure, the number of passes equals the largest displacement, i.e. the number of steps an element has to take to get to its proper place (elements too much on the right only move one position at a time).
In some configurations, this number can be small. For instance, sorting an already sorted array takes a single pass, and sorting an array with all elements swapped in pairs takes two. In those cases this is an improvement from O(N²) to O(N)!
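A minimal sketch of the last-swap optimization, assuming a std::vector<int> instead of the fixed-size array from the question. The function name bubble_sort and the returned pass count are additions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts v in ascending order and returns the number of passes performed.
int bubble_sort(std::vector<int>& v) {
    int passes = 0;
    // Elements at index >= unsorted_end are already in their final place.
    std::size_t unsorted_end = v.size();
    while (unsorted_end > 1) {
        std::size_t last_swap = 0;
        for (std::size_t i = 1; i < unsorted_end; ++i) {
            if (v[i - 1] > v[i]) {
                std::swap(v[i - 1], v[i]);
                last_swap = i;  // everything from index i onward is in place
            }
        }
        // The next pass only needs to reach the last swap position.
        unsorted_end = last_swap;
        ++passes;
    }
    assert(unsorted_end <= 1);
    return passes;
}
```

With this version a pass without any swap sets unsorted_end to 0, so the "stop when nothing was swapped" rule falls out of the same bookkeeping.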
Yes. Your code works just like Bubble Sort.
Input: 3 5 1 8 2
Output after each iteration:
3 1 5 2 8
1 3 2 5 8
1 2 3 5 8
1 2 3 5 8
1 2 3 5 8
1 2 3 5 8
Actually, in the inner loop, we don't need to go to the end of the array from the second iteration onwards, because the largest element of the previous iteration is already in the last position. But that doesn't improve the time complexity, which stays O(N^2). So, you are good to go.
Small Informal Proof:
The idea behind your sorting algorithm is that you go through the array of values (left to right). Let's call it a pass. During the pass, pairs of values are checked and swapped to be in the correct order (higher on the right).
During the first pass the maximum value will be reached. When reached, the max will be higher than the value next to it, so they will be swapped. This means that the max will become part of the next pair in the pass. This repeats until the pass is completed and the max has moved to the right end of the array.
During second pass the same is true for the second highest value in the array. Only difference is it will not be swapped with the max at the end. Now two most right values are correctly set.
In every next pass one value will be sorted out to the right.
There are N values and N passes. This means that after N passes all N values will be sorted like:
{Nth largest, (N-1)th largest, ..., 2nd largest, largest}
No it isn't. It is worse. There is no point whatsoever in the variable cntr1. You should be using test1 here, and you should be referring to one of the many canonical implementations of bubblesort rather than trying to make it up for yourself.

Big O notation for duplicate function, C++

What is the Big O notation for the function description in the screenshot.
It would take O(n) to go through all the numbers but once it finds the numbers and removes them what would that be? Would the removed parts be a constant A? and then would the function have to iterate through the numbers again?
This is what I am thinking for Big O
T(n) = n + a + (n-a) or something involving having to iterate through (n-a) number of steps after the first duplicate is found, then would big O be O(n)?
Big O notation is considering the worst case. Let's say we need to remove all duplicates from the array A=[1..n]. The algorithm will start with the first element and check every remaining element - there are n-1 of them. Since all values happen to be different it won't remove any from the array.
Next, the algorithm selects the second element and checks the remaining n-2 elements in the array. And so on.
When the algorithm arrives at the final element it is done. The total number of comparisons is the sum (n-1) + (n-2) + ... + 2 + 1 + 0. Through the power of maths, this sum becomes (n-1)*n/2 and the dominating term is n^2, so the algorithm is O(n^2).
This algorithm is O(n^2). Because for each element in the array you are iterating over the array and counting the occurrences of that element.
foreach item in array
count = 0
foreach other in array
if item == other
count += 1
if count > 1
remove item
As you see there are two nested loops in this algorithm which results in O(n*n).
Removed items don't affect the worst case. Consider an array containing only unique elements: no element is removed from this array.
Note: A naive implementation of this algorithm could result in O(n^3) complexity.
You start with the first element and go through all the other elements in the vector, which is n-1 comparisons. You do that for each of the n elements, so the worst case is n*(n-1)/2 comparisons. The best case is about n comparisons (for example, when all elements are 4).
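The function description from the screenshot is not reproduced here, so the following is only a sketch under the assumption that the function keeps the first occurrence of each value and erases later copies. The name remove_duplicates and the comparison counter are added purely to make the counts from the answers above checkable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Removes later duplicates in place and returns the number of
// element comparisons that were performed.
std::size_t remove_duplicates(std::vector<int>& v) {
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        std::size_t j = i + 1;
        while (j < v.size()) {
            ++comparisons;
            if (v[j] == v[i]) {
                // erase shifts the tail left, so j already points at the next element
                v.erase(v.begin() + static_cast<std::ptrdiff_t>(j));
            } else {
                ++j;
            }
        }
    }
    assert(comparisons == 0 || !v.empty());
    return comparisons;
}
```

For five distinct values this performs 4+3+2+1 = 10 comparisons, i.e. n(n-1)/2, while for four equal values it performs only n-1 = 3.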

spoj dp lsort approach

http://www.spoj.com/problems/LSORT/ It is a problem on spoj
It states that
You are given a permutation of n numbers that are between 1 to n and having no duplicates.
The task is to sort that permutation in ascending order. There is another array Q into which we insert elements from the given permutation P.
You have to implement N steps to sort P. In the i-th step, P has N-i+1 remaining elements, Q has i-1 elements and you have to choose some x-th element (from the N-i+1 available elements) of P and put it to the left or to the right of Q. The cost of this step is equal to x * i. The total cost is the sum of costs of individual steps. After N steps, Q must be an ascending sequence. Your task is to minimize the total cost.
Input
The first line of the input file is T (T ≤ 10), the number of test cases. Then descriptions of T test cases follow. The description of each test case consists of two lines. The first line contains a single integer N (1 ≤ N ≤ 1000). The second line contains N distinct integers from the set {1, 2, .., N}, the N-element permutation P.
Output
For each test case your program should write one line, containing a single integer - the minimum total cost of sorting.
Now I have figured out the DP.
My recurrence relation states that, to get the optimal cost for the elements with values i to j, I have to insert either i at the front or j at the back.
Cost of inserting i at front = dp[i+1][j] + cost of adding element i at front
Cost of inserting j at back = dp[i][j-1] + cost of adding element j at back
and I have to take the minimum of these. The answer would be dp[1][n].
for(l=1;l<=n;l++) //length of current permutation Q
{
for(i=1;i<=n-l+1;i++) //starting value of permutation Q
{
j=i+l-1; //ending value of permutation Q
dp[i][j]=min(dp[i+1][j]+l*xi,dp[i][j-1]+l*xj);//choosing whether to insert i at start or j at end
}
}
Here xi = index of element i from the start of permutation P,
and xj = index of element j from the start of permutation P.
The answer would be dp[1][n].
But I am unable to figure out xi and xj.
Please help.
You can try re-thinking your DP state.
For me, I would use dp[startQ][endQ], where dp[startQ][endQ] means the cost I have incurred so far to 'sort' values startQ to endQ in the array Q.
If you know what is in the array Q (integers startQ to endQ inclusive), one can easily re-construct the array of P by just removing/ignoring all the integers within startQ and endQ.
For each state, dp[startQ][endQ], since one can only add to the front or the back of Q,
dp[startQ][endQ] can only be:
dp[startQ][endQ-1] + cost of adding endQ
dp[startQ+1][endQ] + cost of adding startQ
with the base cases being
dp[i][i] = 0;
These states can be computed and the answer can be found at dp[1][n] (assuming it is one-indexed).
However, I haven't thought of an efficient way to compute x if it were to be coded in a top-down manner, whereas the whole computation can be performed in O(N^2 log N) using bottom-up DP with a data structure to compute x at every state.
I will leave the final details for you to code out :) but I can help more if required.
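One possible way to get x, sketched under my own assumptions: when Q holds the values lo..hi, the index of value v among the elements still in P is its original 1-based position minus the number of values in lo..hi that sit before it in P. A prefix-count table makes that lookup O(1), giving O(N^2) overall instead of a per-state data structure. The names min_sort_cost, pos and before are invented for this sketch. Note that it charges the first step too (cost x * 1, as in the problem statement), so a single-value range costs the index of that value in P rather than 0.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// p is a permutation of 1..n; returns the minimum total cost.
long long min_sort_cost(const std::vector<int>& p) {
    const int n = static_cast<int>(p.size());
    assert(n >= 1);

    // pos[v] = 1-based index of value v in the original permutation P.
    std::vector<int> pos(n + 1, 0);
    for (int idx = 0; idx < n; ++idx) pos[p[idx]] = idx + 1;

    // before[v][u] = number of values w in [1, u] with pos[w] < pos[v].
    std::vector<std::vector<int>> before(n + 1, std::vector<int>(n + 1, 0));
    for (int v = 1; v <= n; ++v)
        for (int u = 1; u <= n; ++u)
            before[v][u] = before[v][u - 1] + (pos[u] < pos[v] ? 1 : 0);

    // Index of v among the elements left in P once values lo..hi moved to Q.
    auto x = [&](int v, int lo, int hi) -> long long {
        if (lo > hi) return pos[v];  // Q is still empty
        return pos[v] - (before[v][hi] - before[v][lo - 1]);
    };

    // dp[i][j] = minimum cost to build Q out of the values i..j.
    // Empty ranges (i > j) keep their initial cost of 0.
    std::vector<std::vector<long long>> dp(n + 2, std::vector<long long>(n + 2, 0));
    for (int len = 1; len <= n; ++len) {
        for (int i = 1; i + len - 1 <= n; ++i) {
            const int j = i + len - 1;
            const long long add_front = dp[i + 1][j] + len * x(i, i + 1, j);
            const long long add_back = dp[i][j - 1] + len * x(j, i, j - 1);
            dp[i][j] = std::min(add_front, add_back);
        }
    }
    return dp[1][n];
}
```

The table takes O(N^2) memory, which should be fine for N up to 1000.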

Find the element with the longest distance in a given array where each element appears twice?

Given an array of int, each int appears exactly TWICE in the
array. find and return the int such that this pair of int has the max
distance between each other in this array.
e.g. [2, 1, 1, 3, 2, 3]
2: d = 5-1 = 4;
1: d = 3-2 = 1;
3: d = 6-4 = 2;
return 2
My ideas:
Use a hashmap where the key is a[i] and the value is the index. Scan a[] and put each number into the hash. If a number is hit twice, subtract the old index from its current index and use the result to update the element's value in the hash.
After that, scan hash and return the key with largest element (distance).
it is O(n) in time and space.
How to do it in O(n) time and O(1) space ?
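For reference, the hash-map idea described above (O(n) time, O(n) space) could look roughly like this. The function name farthest_pair_value is made up, and erasing a value once its second occurrence has been seen is just a small space saving, not a way to reach O(1) space.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Assumes every value in a appears exactly twice.
// Returns the value whose two occurrences are farthest apart.
int farthest_pair_value(const std::vector<int>& a) {
    assert(!a.empty());
    std::unordered_map<int, std::size_t> first_seen;  // value -> first index
    int best_value = a[0];
    std::size_t best_distance = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        auto it = first_seen.find(a[i]);
        if (it == first_seen.end()) {
            first_seen.emplace(a[i], i);
        } else {
            const std::size_t distance = i - it->second;
            if (distance > best_distance) {
                best_distance = distance;
                best_value = a[i];
            }
            first_seen.erase(it);  // the pair is complete, free its slot
        }
    }
    return best_value;
}
```

For the example [2, 1, 1, 3, 2, 3] the distances are 4, 1 and 2, so the function returns 2.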
You would like to have the maximal distance, so I assume the numbers you are searching for are more likely to be at the start and the end. This is why I would loop over the array from the start and the end at the same time.
[2, 1, 1, 3, 2, 3]
Check if 2 == 3?
Store a map of numbers and position: [2 => 1, 3 => 6]
Check if 1 or 2 is in [2 => 1, 3 => 6] ?
I know, that is not even pseudo code and not complete but just to give out the idea.
Set iLeft index to the first element, iRight index to the second element.
Increment iRight index until you find a copy of the left item or meet the end of the array. In the first case - remember distance.
Increment iLeft. Start searching from new iRight.
Start value of iRight will never be decreased.
Delphi code:
iLeft := 0;
iRight := 1;
while iRight < Len do begin //Len = array size
while (iRight < Len) and (A[iRight] <> A[iLeft]) do
Inc(iRight); //iRight++
if iRight < Len then begin
BestNumber := A[iLeft];
MaxDistance := iRight - iLeft;
end;
Inc(iLeft); //iLeft++
iRight := iLeft + MaxDistance;
end;
This algorithm is O(1) space (with some cheating), O(n) time (average), needs the source array to be non-const and destroys it at the end. Also it limits possible values in the array (three bits of each value should be reserved for the algorithm).
Half of the answer is already in the question. Use a hashmap. If a number is hit twice, use the index difference, update the best result so far, and remove this number from the hashmap to free space. To make it O(1) space, just reuse the source array. Convert the array to a hashmap in-place.
Before turning an array element to the hashmap cell, remember its value and position. After this it may be safely overwritten. Then use this value to calculate a new position in the hashmap and overwrite it. Elements are shuffled this way until an empty cell is found. To continue, select any element, that is not already reordered. When everything is reordered, every int pair is definitely hit twice, here we have an empty hashmap and an updated best result value.
One reserved bit is used while converting array elements to the hashmap cells. At the beginning it is cleared. When a value is reordered to the hashmap cell, this bit is set. If this bit is not set for overwritten element, this element is just taken to be processed next. If this bit is set for element to be overwritten, there is a conflict here, pick first unused element (with this bit not set) and overwrite it instead.
2 more reserved bits are used to chain conflicting values. They encode positions where the chain is started/ended/continued. (It may be possible to optimize this algorithm so that only 2 reserved bits are needed...)
A hashmap cell should contain these 3 reserved bits, original value index, and some information to uniquely identify this element. To make this possible, a hash function should be reversible so that part of the value may be restored given its position in the table. In simplest case, hash function is just ceil(log(n)) least significant bits. Value in the table consists of 3 fields:
3 reserved bits
32 - 3 - (ceil(log(n))) high-order bits from the original value
ceil(log(n)) bits for element's position in the original array
Time complexity is O(n) only on average; worst case complexity is O(n^2).
Other variant of this algorithm is to transform the array to hashmap sequentially: on each step m having 2^m first elements of the array converted to hashmap. Some constant-sized array may be interleaved with the hashmap to improve performance when m is low. When m is high, there should be enough int pairs, which are already processed, and do not need space anymore.
There is no way to do this in O(n) time and O(1) space.