Problems counting number of exchanges and comparisons in bubble sort - c++

I know that a reverse ordered list should yield theta(n^2) number of comparisons and theta(n^2) number of exchanges for bubble sort. In my sample code I am using a list of size n = 10. I implemented counters for the numComparisons and numExchanges, and although this doesn't seem very complicated, I can't figure out why my results don't yield 100 comparisons and 100 exchanges. Am I really far off target?
void testList::bubbleSort()
{
    int k = 10;
    bool flag = true;
    while(flag)
    {
        k = k - 1;
        flag = false;
        for(int j = 0; j < k; j++)
        {
            if( vecPtr[j] > vecPtr[j+1])
            {
                int temp = vecPtr[j];
                vecPtr[j] = vecPtr[j+1];
                vecPtr[j+1] = temp;
                numExchanges += 1;
                flag = true;
            }
            numComparisons++;
        }
    }
}
The resulting output:
Original List: 10 9 8 7 6 5 4 3 2 1
Sorted List: 1 2 3 4 5 6 7 8 9 10
Comparisons: 45
Exchanges: 45
I also tried this implementation, but my results were the same:
void testList::bubbleSort()
{
    int temp;
    for(long i = 0; i < 10; i++)
    {
        for(long j = 0; j < 10-i-1; j++)
        {
            if (vecPtr[j] > vecPtr[j+1])
            {
                temp = vecPtr[j];
                vecPtr[j] = vecPtr[j+1];
                vecPtr[j+1] = temp;
                numExchanges++;
            }
            numComparisons++;
        }
    }
}

Approximately N^2/2 comparisons and exchanges are expected.
In particular, the bound of the inner loop depends on the current value of the outer loop's counter. So, on the first iteration, it traverses the entire array. On each subsequent iteration, it traverses one fewer item in the array.
So, the number of iterations of the inner loop is N + (N-1) + (N-2) + ... + 1. On average, that's approximately N/2 iterations per pass, or about N^2/2 over all N passes.
If you want to get more precise, there's one more detail to consider: the inner loop compares vecPtr[j] with vecPtr[j+1], and N items have only N-1 adjacent pairs, so its longest pass is N-1 iterations, not N iterations.
Therefore, instead of being precisely N^2/2, it's really N * (N-1)/2. In your case, that's 10*9/2 = 45.
That's the count for the number of comparisons. For swaps, you get some percentage of that, depending on the number of items that are out of order. In your specific case, all items are always out of order (because you're starting with reverse order) so you do a swap for every comparison. With any other ordering, you'd expect the number of swaps to be reduced.
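To check the N * (N-1)/2 figure without the rest of the class, here is a minimal standalone sketch of the second implementation from the question. The names SortCounts and countBubbleSort are made up for this example and are not part of the original testList class.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper type: holds the two counters from the question.
struct SortCounts {
    std::size_t comparisons = 0;
    std::size_t exchanges = 0;
};

// Sorts a copy of the input with the plain two-loop bubble sort and
// reports how many comparisons and exchanges were performed.
SortCounts countBubbleSort(std::vector<int> v) {
    SortCounts counts;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        // Pass i looks at n - i - 1 adjacent pairs.
        for (std::size_t j = 0; j + i + 1 < n; ++j) {
            ++counts.comparisons;
            if (v[j] > v[j + 1]) {
                std::swap(v[j], v[j + 1]);
                ++counts.exchanges;
            }
        }
    }
    return counts;
}
```

For the reverse-ordered list 10 9 8 ... 1 this reports 45 comparisons and 45 exchanges; for an already sorted list of 10 items it still reports 45 comparisons but 0 exchanges.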

45 = 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1, so for the exchanges this is correct, but for the comparisons I think there must be a mistake somewhere. Edit: You implemented a slightly more intelligent version than the standard bubble sort, that's why you have only 45 comparisons instead of 90 (it's not 100, one iteration takes 9 comparisons).


Tell me the Input in which this code will give incorrect Output

There's a problem which I have to solve in C++. I've written the whole code and it works on the given test cases, but when I submit it, it is judged as a wrong answer. I can't understand why.
Please tell me an input for which the given code produces incorrect output, so that I can modify my code further.
Shrink The Array
You are given an array of positive integers A[] of length L. If A[i] and A[i+1] both are equal replace them by one element with value A[i]+1. Find out the minimum possible length of the array after performing such operation any number of times.
Note:
After each such operation, the length of the array will decrease by one and elements are renumerated accordingly.
Input format:
The first line contains a single integer L, denoting the initial length of the array A.
The second line contains L space integers A[i] − elements of array A[].
Output format:
Print an integer - the minimum possible length you can get after performing the operation described above any number of times.
Example:
Input
7
3 3 4 4 4 3 3
Output
2
Sample test case explanation
3 3 4 4 4 3 3 -> 4 4 4 4 3 3 -> 4 4 4 4 4 -> 5 4 4 4 -> 5 5 4 -> 6 4.
Thus the length of the array is 2.
My code:
#include <bits/stdc++.h>
using namespace std;
int main()
{
    bool end = false;
    int l;
    cin >> l;
    int arr[l];
    for(int i = 0; i < l; i++){
        cin >> arr[i];
    }
    int len = l, i = 0;
    while(i < len - 1){
        if(arr[i] == arr[i + 1]){
            arr[i] = arr[i] + 1;
            if((i + 1) <= (len - 1)){
                for(int j = i + 1; j < len - 1; j++){
                    arr[j] = arr[j + 1];
                }
            }
            len--;
            i = 0;
        }
        else{
            i++;
        }
    }
    cout << len;
    return 0;
}
THANK YOU
As noted in the comments: Just picking the first two neighbours that have the same value and combining those will lead to suboptimal results.
You will need to investigate which two neighbours you should combine somehow. When you have combined two neighbours you then need to investigate which neighbours to combine on the next level. The number of combinations may become plentiful.
One way to solve this is through recursion.
If you've followed the advice in the comments, you now have all your input data in std::vector<unsigned> A(L).
You can now do std::cout << solve(A) << '\n'; where solve has the signature size_t solve(const std::vector<unsigned>& A) and is described below:
1. Find the indices of all neighbour pairs in A that have the same value and put the indices in a std::vector<size_t> neighbours. Example: If A contains 2 2 2 3, put 0 and 1 in neighbours.
2. If no neighbours are found (neighbours.empty() == true), return A.size().
3. Define a minimum variable and initialize it with A.size() - 1, which is the worst result you know you can get at this point. So, size_t minimum = A.size() - 1;
4. Loop over all indices stored in neighbours (for(size_t idx : neighbours)):
* Copy A into a new std::vector<unsigned>. Let's call it cpy.
* Increase cpy[idx] by one and remove cpy[idx+1].
* Call size_t result = solve(cpy). This is where recursion comes in.
* Is result less than minimum? If so, assign result to minimum.
5. Return minimum.
I don't think I ruined the programming exercise by providing one algorithm for solving this. It should still have plenty of things to deal with. Recursion won't be possible with big data etc.
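The steps above can be sketched roughly as follows. This is only an exhaustive search with no memoization, so it is an illustration for small inputs rather than something that would pass large test cases.

```cpp
#include <cstddef>
#include <vector>

// Exhaustive recursive search following the steps described above.
std::size_t solve(const std::vector<unsigned>& A) {
    // Indices i where A[i] == A[i + 1].
    std::vector<std::size_t> neighbours;
    for (std::size_t i = 0; i + 1 < A.size(); ++i) {
        if (A[i] == A[i + 1]) neighbours.push_back(i);
    }
    // Nothing can be merged: the current length is final.
    if (neighbours.empty()) return A.size();

    // Any single merge yields at worst this length.
    std::size_t minimum = A.size() - 1;
    for (std::size_t idx : neighbours) {
        std::vector<unsigned> cpy = A;
        ++cpy[idx];                                                 // merged value
        cpy.erase(cpy.begin() + static_cast<std::ptrdiff_t>(idx) + 1);  // drop the partner
        const std::size_t result = solve(cpy);                      // recurse
        if (result < minimum) minimum = result;
    }
    return minimum;
}
```

For the sample input 3 3 4 4 4 3 3 this returns 2, matching the expected output in the problem statement.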

To make array identical by swapping elements

There are two input arrays. They are identical when they contain exactly the same numbers. To make them identical, we can swap elements between them. Each swap has a cost: if we swap elements a and b, then cost = min(a, b).
While making the arrays identical, the total cost should be minimal.
If it is not possible to make the arrays identical, then print -1.
i/p:
3 6 6 2
2 7 7 3
o/p :
4
Here I have swapped (2,7) and (2,6). So min Cost = 2 + 2 = 4.
Logic:
Make two maps that store the frequency of each input array's elements.
If element "a" in aMap is also present in bMap, then the number of swaps to consider for a = abs(freq(a) in aMap - freq(a) in bMap).
If the remaining frequency of an element is odd, then it is not possible to make the arrays identical.
Otherwise, add the total swaps from both maps and find the cost using
cost = smallest element * total swaps
Here is the code:
#include<iostream>
#include<algorithm>
#include<map>
using namespace std;
int main()
{
int t;
cin >> t;
while(t--)
{
int size;
long long int cost = 0;
cin >> size;
bool flag = false;
map<long long int, int> aMap;
map<long long int, int> bMap;
// storing frequency of elements of 1st input array in map
for( int i = 0 ; i < size; i++)
{
long long int no;
cin >> no;
aMap[no]++;
}
// storing frequency of elements of 2nd input array in map
for(int i = 0 ; i < size; i++)
{
long long int no;
cin >> no;
bMap[no]++;
}
// fetching smallest element (i.e. 1st element) from both map
long long int firstNo = aMap.begin()->first;
long long int secondNo = bMap.begin()->first;
long long int smallestNo;
// finding smallest element from both maps
if(firstNo < secondNo)
smallestNo = firstNo;
else
smallestNo = secondNo;
map<long long int, int> :: iterator itr;
// trying to find out total number of swaps we have to perform
int totalSwapsFromA = 0;
int totalSwapsFromB = 0;
// trversing a map
for(itr = aMap.begin(); itr != aMap.end(); itr++)
{
// if element "a" in aMap is also present in bMap, then we have to consider
// number of swapping = abs(freq(a) in aMap - freq(a) in bMap)
auto newItr = bMap.find(itr->first);
if(newItr != bMap.end())
{
if(itr->second >= newItr->second)
{
itr->second -= newItr->second;
newItr->second = 0;
}
else
{
newItr->second -= itr->second;
itr->second = 0;
}
}
// if freq is "odd" then, this input is invalid as it can not be swapped
if(itr->second & 1 )
{
flag = true;
break;
}
else
{
// if freq is even, then we need to swap only for freq(a)/ 2 times
itr->second /= 2;
// if swapping element is smallest element then we required 1 less swap
if(itr->first == smallestNo && itr->second != 0)
totalSwapsFromA += itr->second -1;
else
totalSwapsFromA += itr->second;
}
}
// traversing bMap to check whether there any number is present which is
// not in aMap.
if(!flag)
{
for(itr = bMap.begin(); itr != bMap.end(); itr++)
{
auto newItr = aMap.find(itr->first);
if( newItr == aMap.end())
{
// if frew is odd , then i/p is invalid
if(itr->second & 1)
{
flag = true;
break;
}
else
{
itr->second /= 2;
// if swapping element is smallest element then we required 1 less swap
if(itr->first == smallestNo && itr->second != 0)
totalSwapsFromB += itr->second -1;
else
totalSwapsFromB += itr->second;
}
}
}
}
if( !flag )
{
cost = smallestNo * (totalSwapsFromB + totalSwapsFromA);
cout<<"cost "<<cost <<endl;
}
else
cout<<"-1"<<endl;
}
return 0;
}
The above code compiles without errors, but it gives a wrong answer and is not accepted.
Can anyone improve this code or logic?
Suppose you have 2 arrays:
A: 1 5 5
B: 1 4 4
We know that we want to move a 5 down and a 4 up, so we have two options: swapping 4 with 5 directly (with cost min(4, 5) = 4), or using the minimum element to achieve the same result with 2 swaps:
A: 1 5 5 swap 1 by 4 (cost 1)
B: 1 4 4
________
A: 4 5 5 swap 1 by 5 (cost 1)
B: 1 1 4
________
A: 4 1 5 total cost: 2
B: 5 1 4
So the question to ask at every swap is this: is it better to swap directly, or to swap twice using the minimum element as a pivot?
In a nutshell, let m be the minimum element across both arrays, and suppose you want to swap i for j. The cost of the swap will be
min( min(i,j), 2 * m )
So just find out all the swaps you need to do, apply this formula and sum the results to get your answer.
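As a rough sketch of how this formula could be applied end to end: the function name minSwapCost is hypothetical, and pairing the smallest surplus elements of one array with the largest surplus elements of the other is an assumption of this sketch, not something stated in the answer above.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

// Hypothetical helper: minimum total cost, or -1 if the arrays
// cannot be made identical.
long long minSwapCost(const std::vector<long long>& a,
                      const std::vector<long long>& b) {
    if (a.size() != b.size()) return -1;
    std::map<long long, long long> diff;  // count in a minus count in b
    for (long long v : a) ++diff[v];
    for (long long v : b) --diff[v];
    if (diff.empty()) return 0;
    const long long m = diff.begin()->first;  // smallest element overall

    std::vector<long long> fromA, fromB;  // elements that must change sides
    for (const auto& entry : diff) {
        if (entry.second % 2 != 0) return -1;  // odd total count of this value
        const long long surplus = entry.second < 0 ? -entry.second : entry.second;
        for (long long k = 0; k < surplus / 2; ++k)
            (entry.second > 0 ? fromA : fromB).push_back(entry.first);
    }
    // fromA is already ascending (map order); sort fromB descending.
    std::sort(fromB.begin(), fromB.end(), std::greater<long long>());

    long long cost = 0;
    for (std::size_t i = 0; i < fromA.size(); ++i)
        cost += std::min(std::min(fromA[i], fromB[i]), 2 * m);
    return cost;
}
```

With the arrays from the question (3 6 6 2 and 2 7 7 3) this returns 4, and with 1 5 5 and 1 4 4 it returns 2, matching the walkthrough above.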
@user1745866 You can simplify detecting the -1 case by using only one variable:
Take int x = 0 and XOR all the input integers into it, like this:
int x = 0;
for(int i=0;i<n;i++){
    cin>>a[i];
    x = x^a[i];
}
for(int i=0;i<n;i++){
    cin>>b[i];
    x = x^b[i];
}
if(x!=0)
    cout<<-1;
else{
    ...do code for remain 2 condition...
}
This works because every number must occur an even number of times across both arrays, and XOR-ing any number with itself an even number of times gives 0; so if x is nonzero, the arrays can't be made identical. Note that x == 0 is only a necessary condition: distinct values can also XOR to 0 (for example 1 ^ 2 ^ 3 == 0), so the per-element counts still have to be checked.
For the 2nd condition (which gives the answer 0) you can use a multimap, so that you can compare both arrays directly in O(n) time once they are built; if all elements of both arrays are the same, you can output 0.
(Note: I am suggesting a multimap because 1: you would have both arrays sorted with all elements present, including duplicates.
2: because they are sorted, if they contain the same element at every position we can output 0; otherwise you have to proceed to your 3rd condition and swap elements.)
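To illustrate how far the XOR shortcut goes, here is a small sketch that puts it next to an exact frequency check. Both function names are made up for this example. A nonzero XOR proves the answer is -1, but a zero XOR does not by itself prove that every value occurs an even number of times.

```cpp
#include <map>
#include <vector>

// XOR of every element of both arrays, as in the check described above.
int xorOfAll(const std::vector<int>& a, const std::vector<int>& b) {
    int x = 0;
    for (int v : a) x ^= v;
    for (int v : b) x ^= v;
    return x;
}

// Exact test: every value must occur an even number of times in total.
bool allCountsEven(const std::vector<int>& a, const std::vector<int>& b) {
    std::map<int, int> count;
    for (int v : a) ++count[v];
    for (int v : b) ++count[v];
    for (const auto& entry : count) {
        if (entry.second % 2 != 0) return false;
    }
    return true;
}
```

For a = 1 2 and b = 4 7 the XOR is 0 (1 ^ 2 ^ 4 ^ 7 == 0), yet every value occurs exactly once, so allCountsEven returns false and the answer is still -1.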
For reducing the swap cost, see Daniel's answer. For finding out whether the swap is actually possible, do the following. The swaps are only possible if every number occurs an even number of times in total, so that its copies can be split evenly between the arrays: if you have 2, 4 or 6 5's you are good, but if you have 1, 3 or 5 5's, return -1. For actually solving the problem, there is a very simple solution I can think of, though it is a little bit expensive. You just need to make sure that there are the same number of each element on each side, so the simple way to do that would be to declare a new array:
std::vector<int> temp;
// Go through both arrays and store all their elements in temp
Count each element, so something like:
// minElement and maxElement are the smallest and largest values in temp
std::vector<int> count(maxElement - minElement + 1, 0);
for(std::size_t i = 0; i < temp.size(); i++){
    count[temp[i] - minElement]++;
}
Halve each count; that is how many copies of each value each array should end up with. Then walk through an array: whenever you see a value, decrement its entry in the count array, so for a 1 something like count[1]--; assuming count is indexed from 0. If the entry is already zero when you see the value, a swap needs to be done; in that case find the next minimum in the other array and swap them. It is a little bit expensive, but it is the simplest way I can think of. So for example in your case:
i/p:
3 6 6 2
2 7 7 3
o/p :
4
We would need to store the minimum as 2, because that is the smallest element. So we would have a count array that looks like the following:
1 1 0 0 1 1
// one 2, one 3, zero 4s, zero 5s, one 6 and one 7
You would go through the first array. When you see the second 6, the count for 6 is already zero, so you know you need to swap it: you find the minimum in the other array, which is 2, and swap 6 with 2. Afterwards you can go through the rest of the array smoothly. Finally you go through the second array; when you see the last 7, it looks for the minimum on the other side, which is 2, and swaps them. Note that if you had three 2s on one side and one 2 on the other, chances are the three 2s will go to the other side and two of them will come back, because we are always swapping the minimum, so there will always be an even number of ways we can rearrange the elements.
Problem link: https://www.codechef.com/JULY20B/problems/CHFNSWPS
To calculate the minimum swap cost, we have two cases.
Take this example:
l1=[1,2,2]
l2=[1,5,5]
Case 1. Swap each pair through the overall minimum, min(l1,l2)=1.
Step 1: swap one 2 of the pair of 2s in l1 with the 1 in l2 -> l1=[1,1,2]
l2=[2,5,5] cost is 1
Step 2: swap one 5 of the pair of 5s in l2 with a 1 in l1 -> l1=[1,5,2]
l2=[2,1,5] cost is 1
Total cost is 2.
Case 2. Swap the min of l1 with the max of l2 (repeat until both lists end).
If we sort the first list in increasing order and the other in decreasing order, we can minimize the cost.
l1=[1,2,2]
l2=[5,5,1]
The trick is that we only need to store min(l1,l2) in a variable, say mn. Then remove all common elements from both lists.
Now the lists become l1=[2,2]
l2=[5,5]
Then swap the elements at indices 0 to len(l1)-1 with a jump of 2, like 0,2,4,6..., because each odd-indexed neighbour will be the same as the previous number.
After performing the swap, the cost will be 2 and
l1=[5,2]
l2=[2,5] cost is 2
Total cost is 2.
Let's take another example:
l1=[2,2,5,5]
l2=[3,3,4,4]
Solving through min(l1,l2), the total cost will be 2+2+2=6,
but after sorting the lists, the cost of swapping (2,4) and (5,3) is 2+3=5,
so the minimum cost to make the lists identical is min(5,6)=5.
# code (l1 and l2 hold only the leftover elements; mn is the overall minimum)
l1.sort()
l2.sort(reverse=True)
sums=0
for i in range(0, len(l1), 2):
    sums+=min(min(l1[i],l2[i]),2*mn)
print(sums)
# print -1 if you get an odd count of a key in total (i.e. the sum of the key's counts in both lists)

Efficiency of Sieve of Eratosthenes algorithm

I am trying to understand the "Sieve of Eratosthenes". Here is my algorithm (code below), and a list of features that I cannot understand (in order).
Why is i * i more efficient than i * 2? Yes, I can understand it would be less iterations, therefore more efficient, but then doesn't it skip some numbers (for example i = 9 => j = 81 skips 18 27 36 ...)?
On Wikipedia I found that space complexity is equal to O(n) and that's understandable; whatever number we enter it creates an array of the size entered, but time complexity here is where things get confusing. I found this notation O(n(logn)(loglogn)) -- what is that? According to my understanding we have 2 full iterations and 1 partial iteration, therefore O(n^2 * logn).
#include <iostream>
using namespace std;
int main() {
    cout << "Enter number:" << endl;
    int arrSize;
    cin >> arrSize;
    bool primesArr[arrSize];
    primesArr[0] = false;
    for (int i = 1; i < arrSize; i++) primesArr[i] = true;
    for (int i = 2; i < arrSize; i++)
        if (primesArr[i - 1]) {
            cout << i << endl;
            /* for (int j = i * 2; j < arrSize; j += i) less efficient */
            for (int j = i * i; j < arrSize; j += i)
                primesArr[j - 1] = false;
        }
    return 0;
}
Why i * i more efficient than i * 2? Yes, I can understand it would be less iteration, therefore more efficiency, but then doesn't it skip some numbers (for example i = 9 => j = 81 skip 18 27 36 ...)?
You are referring to
for (int j = i * i; j < arrSize; j += i)
Note that i * i is the initial value for the loop counter j. So the values of j greater than i * i will all be marked off. The values which we skip from i * 2 to i * i have already been marked off during previous iterations. Let's think about the first few:
When i == 2, we mark off all multiples of 2 (2, 4, 6, 8, etc.). When i == 3, if we start j = 3 * 2 = 6 then we will mark off 6 again before reaching 9, 12, 15, etc. Since 6 is a multiple of 2 and was already marked off, we can skip straight to 3 * 3 == 9.
When we reach i == 5, if we start at j == 5 * 2 == 10, then we will mark off 10, which was already taken care of since it is a multiple of 2; 15, which is a multiple of 3; and 20, which is also a multiple of 2, before we finally reach 25, which is not a multiple of any prime less than 5.
time complexity here is where things get confusing. I found this notation O(n(logn)(loglogn)) -- what is that? According to my understanding we have 2 full iterations and 1 partial iteration, therefore O(n^2 * logn).
Your analysis reaches a correct result: this algorithm is O(n^2 * logn). However, a more detailed analysis proves a tighter upper bound of O(n(logn)(loglogn)). Note that O(n(logn)(loglogn)) is a subset of O(n^2 * logn).
Why i * i more efficient than i * 2? Doesn't it skip some numbers?
No, it doesn't, because the smaller multiples of i are already covered (for example, 18, 27 etc. in your case are marked while running the loop for i = 2, i = 3 etc.).
Every number has a unique prime factorization. If i is a prime number, any multiple of i greater than i and smaller than i * i is a multiple of one or more primes smaller than i.
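A quick way to convince yourself is to run both variants side by side. The sketch below is a hypothetical helper (not the code from the question) that sieves the numbers below a limit, starting the inner loop either at i * i or at i * 2, and counts how many times an entry is marked.

```cpp
#include <cstddef>
#include <vector>

// Sieve the numbers below `limit`. The inner loop starts at i * i when
// startAtSquare is true and at i * 2 otherwise. `writes` receives the
// number of times an entry was marked as composite.
std::vector<int> sievePrimes(std::size_t limit, bool startAtSquare,
                             std::size_t& writes) {
    std::vector<bool> isPrime(limit, true);
    std::vector<int> primes;
    writes = 0;
    for (std::size_t i = 2; i < limit; ++i) {
        if (!isPrime[i]) continue;
        primes.push_back(static_cast<int>(i));
        for (std::size_t j = startAtSquare ? i * i : i * 2; j < limit; j += i) {
            isPrime[j] = false;
            ++writes;
        }
    }
    return primes;
}
```

Both variants return the same list of primes (25 of them below 100); the i * i variant simply performs fewer writes.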
nasty notation O(n(logn)(loglogn))
From this answer
The number of operations is n * (1/2 + 1/3 + 1/5 + 1/7 + ...) = O(n log log n)
If you count bit operations, since you're dealing with numbers up to n, they have about log n bits, which is where the factor of log n comes in, giving O(n log n log log n) bit operations.

Why do we make n-1 iterations in bubble sort algorithm

The most common form of the bubble sort algorithm has two for loops, the inner one running from j=0 until j < n-i-1. I assume we subtract i because when we reach the last element we don't compare it, since there is no element after it. But why do we use n-1? Why don't we run the outer loop from i=0 until i < n and the inner one from j=0 until n-i? Could someone explain it to me? Tutorials on the internet do not emphasize this.
for (int i = 0; i < n - 1; i++) // Why do we have n-1 here?
{
    swapped = false;
    for (int j = 0; j < n - i - 1; j++)
    {
        countComparisons++;
        if (arr[j] > arr[j + 1])
        {
            countSwaps++;
            swap(&arr[j], &arr[j + 1]);
            swapped = true;
        }
    }
}
For example, if I have an array with 6 elements, why do I only need to make 5 iterations?
Because a swap requires at least two elements.
So if you have 6 elements, you only need to consider 5 consecutive pairs.
For comparison purposes in an array, two adjacent cells are needed; in an array of 6 elements, you do 5 comparisons only; in an array of 10 elements, 9 comparisons, and so on:
[Figure: array and comparisons between adjacent cells]
So for 7 elements, just 6 comparisons are done. Each pass also puts one more element into its final position, so after n-1 passes the one remaining element is already in place; hence the general rule of n-1 in the outer for loop.
About the n-1-i expression, remember that the highest (or lowest, depending on the ordering criterion) value in the bubble sort goes to the last position in the array after the first cycle, so there is no need to compare that value with anything else. Therefore the array has to be "shortened" 1 cell at a time, and the value of i in the outer loop is the counter responsible for that in the inner loop:
5 | 3 | 9 | 20 | elements (n) = 4
After the first cycle (i = 0), 20 has reached its correct position within the array (using an ascending order), leaving us with an array of 3 elements to do comparisons on. In the next cycle, i will be equal to 1, and as n-1 remains the same, we need to subtract i in that expression to "shorten" the array:
n-1-i = 4-1-1 = 2, which is the index of the last element in that new array as well as the quantity of comparisons needed.
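To see that an n-th pass would add nothing, here is a small sketch with a made-up helper, bubblePasses, that runs a chosen number of outer iterations and returns the number of comparisons made. The inner bound is the same as j < n - i - 1, only rearranged so that it cannot underflow with unsigned indices.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: run `passes` outer iterations of bubble sort on v
// and return the number of comparisons that were made.
std::size_t bubblePasses(std::vector<int>& v, std::size_t passes) {
    const std::size_t n = v.size();
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i < passes; ++i) {
        // Equivalent to j < n - i - 1.
        for (std::size_t j = 0; j + i + 1 < n; ++j) {
            ++comparisons;
            if (v[j] > v[j + 1]) std::swap(v[j], v[j + 1]);
        }
    }
    return comparisons;
}
```

On the 6-element list 6 5 4 3 2 1, five passes sort the list with 15 comparisons. Asking for six passes gives the same 15, because the inner loop of the sixth pass has nothing left to compare, while four passes leave the list as 2 1 3 4 5 6.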
Hope it helps!

Optimizing algorithm to find number of six digit numbers satisfying certain property

Problem: "An algorithm to find the number of six digit numbers where the sum of the first three digits is equal to the sum of the last three digits."
I came across this problem in an interview and want to know the best solution. This is what I have till now.
Approach 1: The Brute force solution is, of course, to check for each number (between 100,000 and 999,999) whether the sum of its first three and last three digits are equal. If yes, then increment certain counter which keeps count of all such numbers.
But this checks for all 900,000 numbers and so is inefficient.
Approach 2: Since we are asked "how many" such numbers and not "which numbers", we could do better. Divide the number into two parts: First three digits (these go from 100 to 999) and Last three digits (these go from 000 to 999). Thus, the sum of three digits in either part of a candidate number can range from 1 to 27.
* Maintain a std::map<int, int> for each part where key is the sum and value is number of numbers (3 digit) having that sum in the corresponding part.
* Now, for each number in the first part find out its sum and update the corresponding map.
* Similarly, we can get updated map for the second part.
* Now by multiplying the corresponding pairs (e.g. value in map 1 of key 4 and value in map 2 of key 4) and adding them up we get the answer.
In this approach, we end up checking 1K numbers.
My question is how could we further optimize? Is there a better solution?
For 0 <= s <= 18, there are exactly 10 - |s - 9| ways to obtain s as the sum of two digits.
So, for the first part
int first[28] = {0};
for(int s = 0; s <= 18; ++s) {
    int c = 10 - (s < 9 ? (9 - s) : (s - 9));
    for(int d = 1; d <= 9; ++d) {
        first[s+d] += c;
    }
}
That's 19*9 = 171 iterations. For the second half, do it similarly, with the inner loop starting at 0 instead of 1; that's 19*10 = 190 iterations. Then sum first[i]*second[i] for 1 <= i <= 27.
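Put together, the approach might look like the sketch below. The function names are invented for this example, and the brute-force version is only there as a cross-check of the counting argument.

```cpp
#include <cstdlib>

// Counts six-digit numbers (first digit 1..9) whose first three digits
// have the same sum as the last three, using the two-digit-sum formula.
long countBalancedSixDigit() {
    long first[28] = {0};   // first[t]: three digits summing to t, leading digit 1..9
    long second[28] = {0};  // second[t]: three digits summing to t, leading digit 0..9
    for (int s = 0; s <= 18; ++s) {
        const int c = 10 - std::abs(s - 9);  // ways to get s from two digits
        for (int d = 1; d <= 9; ++d) first[s + d] += c;
        for (int d = 0; d <= 9; ++d) second[s + d] += c;
    }
    long total = 0;
    for (int i = 1; i <= 27; ++i) total += first[i] * second[i];
    return total;
}

// Brute-force cross-check over all 900,000 candidates.
long countBalancedBruteForce() {
    long total = 0;
    for (int n = 100000; n <= 999999; ++n) {
        const int hi = n / 1000, lo = n % 1000;
        const int hiSum = hi / 100 + hi / 10 % 10 + hi % 10;
        const int loSum = lo / 100 + lo / 10 % 10 + lo % 10;
        if (hiSum == loSum) ++total;
    }
    return total;
}
```

Both functions return 50412.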
Generate all three-digit numbers; partition them into sets based on their sum of digits. (Actually, all you need to do is keep a vector that counts the size of the sets). For each set, the number of six-digit numbers that can be generated is the size of the set squared. Sum up the squares of the set sizes to get your answer.
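int sumOfDigits(int n) { // helper used by the loop below
    int s = 0;
    for (; n > 0; n /= 10) s += n % 10;
    return s;
}
int sumCounts[28] = {0}; // sums can go from 0 through 27
for (int i = 0; i < 1000; ++i) {
    sumCounts[sumOfDigits(i)]++;
}
int total = 0;
for (int i = 0; i < 28; ++i) {
    int count = sumCounts[i];
    total += count * count;
}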
EDIT Variation to eliminate counting leading zeroes:
int sumCounts[28] = {0};  // all of 000..999
int sumCounts2[28] = {0}; // only 000..099 (leading zero)
for (int i = 0; i < 100; ++i) {
    int s = sumOfDigits(i);
    sumCounts[s]++;
    sumCounts2[s]++;
}
for (int i = 100; i < 1000; ++i) {
    sumCounts[sumOfDigits(i)]++;
}
int total = 0;
for (int i = 0; i < 28; ++i) {
    int count = sumCounts[i];
    total += (count - sumCounts2[i]) * count;
}
Python Implementation
def equal_digit_sums():
    dists = {}
    for i in range(1000):
        digits = [int(d) for d in str(i)]
        dsum = sum(digits)
        if dsum not in dists:
            dists[dsum] = [0,0]
        dists[dsum][0 if len(digits) == 3 else 1] += 1
    def prod(dsum):
        t = dists[dsum]
        return (t[0]+t[1])*t[0]
    return sum(prod(dsum) for dsum in dists)
print(equal_digit_sums())
Result: 50412
One idea: For each number from 0 to 27, count the number of three-digit numbers that have that digit sum. This should be doable efficiently with a DP-style approach.
Now you just sum the squares of the results, since for each answer, you can make a six-digit number with one of those on each side.
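One way the DP-style counting could look is sketched below: the table for k digits is built from the table for k-1 digits by appending one more digit. The function names are invented here. Note that summing plain squares counts strings with leading zeros; excluding them by subtracting the strings whose first digit is 0 is an addition of this sketch.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// ways[s] = number of digit strings of the given length (leading zeros
// allowed) whose digits sum to s.
std::vector<long> digitSumCounts(int length) {
    std::vector<long> ways(1, 1);  // length 0: only the empty string, sum 0
    for (int k = 0; k < length; ++k) {
        std::vector<long> next(ways.size() + 9, 0);
        for (std::size_t s = 0; s < ways.size(); ++s)
            for (std::size_t d = 0; d <= 9; ++d)
                next[s + d] += ways[s];  // append digit d
        ways = std::move(next);
    }
    return ways;
}

// Number of balanced six-digit strings. With allowLeadingZero == false,
// strings starting with 0 are excluded: those have a two-digit remainder
// on the left, so there are two[s] * three[s] of them for each sum s.
long balancedCount(bool allowLeadingZero) {
    const std::vector<long> three = digitSumCounts(3);
    const std::vector<long> two = digitSumCounts(2);
    long total = 0;
    for (std::size_t s = 0; s < three.size(); ++s) {
        long left = three[s];
        if (!allowLeadingZero && s < two.size()) left -= two[s];
        total += left * three[s];
    }
    return total;
}
```

With leading zeros allowed this gives 55252; without them it gives 50412.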
Assuming leading 0's aren't allowed, you want to calculate how many different ways there are to sum to n with 3 digits. To calculate that you can have a for loop inside a for loop. So:
firstHalf = 0
for i in range(1, min(9, n) + 1):  # first digit
    for j in range(max(0, n - i - 9), min(9, n - i) + 1):  # second digit
        firstHalf += 1  # Will only be one possible third digit
secondHalf = firstHalf + max(0, 10 - abs(n - 9))
If you are trying to sum to a number, then the last digit is always uniquely determined. Thus, in the case where the first digit is 0, we are just calculating how many different values are possible for the second digit. This will be n+1 if n is less than 10. If n is greater, up until 18, it will be 19-n. Over 18 there are no ways to form the sum.
If you loop over all n, 1 through 27, and add up firstHalf * secondHalf, you will have your total.