Have no idea how to implement Quicksort [closed] - c++

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 11 years ago.
I am trying to implement quicksort, just for research, but I have no idea how quicksort works. I am looking at the algorithm below but don't understand what to implement or how. Right now I am using bubble sort; how do I go about implementing quicksort instead?
# choose pivot
swap a[1,rand(1,n)]
# 2-way partition
k = 1
for i = 2:n, if a[i] < a[1], swap a[++k,i]
swap a[1,k]
→ invariant: a[1..k-1] < a[k] <= a[k+1..n]
# recursive sorts
sort a[1..k-1]
sort a[k+1..n]
This is my code below
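#include <cstdlib>
#include <ctime>
#include <fstream>
#include <utility>
#include <vector>
using namespace std;
int main()
{
    srand(time(NULL));
    int length = 250000;
    // heap-backed storage: a 250000-element array on the stack is a non-standard VLA and can overflow the stack
    vector<double> arr(length);
    for(int i = 0; i<length; ++i) arr[i] = rand();
    // mergeSort2(arr, arr+length-1);
    for(int i = 0; i < (length-1); i++)
    {
        for(int j = i+1; j < length; j++)
        {
            if( arr[i] > arr[j])
            {
                swap(arr[i], arr[j]);
            }
        }
    }
    ofstream ofs("input.txt");
    for(int i = 0; i<length; ++i) ofs << i << " " << arr[i] << endl;
}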

This animation is really helpful.
The heart of quick sort (Divide and Conquer) is simply the following:
Pick an element as pivot
In practice, most of the time we don't care which one. As the analysis shows, picking the pivot at random (instead of always the front, end, or middle) makes it unlikely that you run into the worst case.
For testing purposes, picking the middle element is a good choice.
Partitions (l = left, r = right)
The goal is to partition your original array into two sets (virtually!!)
S_left = { x in S - {pivot} : x less than or equal to pivot }
S_right = { x in S - {pivot} : x greater than or equal to pivot }
To ease the notation:
a[l]....a[i-1] are less than or equal to a[i]
a[i+1]...a[r] are greater than or equal to a[i]
and hence, at the end, a[i] (the pivot element) should be in its proper place.
Strategy
There are two common ways to program QS, but in general QS takes three parameters (Array, low, high), where low is the left-end index of the range to sort and high is the right-end index (they could be, say, 2 and 3 or 5 and 10, not necessarily 0 and length-1).
Initially
l = low-1
r = high
advance l while A[l] is less than or equal to the pivot
decrement r while A[r] is greater than or equal to the pivot
swap A[l] and A[r]
repeat this process until l is greater than or equal to r, then swap (pivot, A[l])
Call QuickSort(A, low, l-1)
Call QuickSort(A, l+1, high)
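The steps above can be sketched in C++ roughly as follows. This is a minimal sketch under stated assumptions, not the answerer's exact code: it assumes the pivot is first parked at A[high] (which is what starting with l = low-1 and r = high suggests), it stops both scans on elements equal to the pivot (a common variant that keeps the scans inside the range), and the names partitionRange and quickSort are made up for illustration.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: partitions A[low..high] around the middle element and
// returns the final index of the pivot. Assumes low < high.
int partitionRange(std::vector<int>& A, int low, int high) {
    int mid = low + (high - low) / 2;
    std::swap(A[mid], A[high]);              // park the pivot at the right end
    const int pivot = A[high];
    int l = low - 1;
    int r = high;
    while (true) {
        while (A[++l] < pivot) {}            // stops at A[high] at the latest
        while (r > low && A[--r] > pivot) {} // guard keeps r inside the range
        if (l >= r) break;
        std::swap(A[l], A[r]);
    }
    std::swap(A[l], A[high]);                // put the pivot in its final place
    return l;
}

void quickSort(std::vector<int>& A, int low, int high) {
    if (low >= high) return;                 // zero or one element: nothing to do
    int p = partitionRange(A, low, high);
    quickSort(A, low, p - 1);
    quickSort(A, p + 1, high);
}
```

Calling quickSort(myArray, 0, 8) on the nine-element array suggested for the paper exercise is a convenient way to compare the hand trace with what the code does.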
I am giving you most of the implementation.
Just work out with a small example (size 9 seems reasonable to me) on paper. Don't use rand until your implementation is correct.
Try this array:
myArray = 9,6,2,5,11,4,20,1,3
I can write more tomorrow.

Related

Count number of sub-sequences of given array such that their sum is less or equal to given number?

I have an array of size n of integer values and a given number S.
1<=n<=30
I want to find the total number of sub-sequences whose element sum is less than or equal to S.
For example: let n = 3, S = 5 and the array's elements be {1,2,3}. It has 7 non-empty sub-sequences in total:
{1},{2},{3},{1,2},{1,3},{2,3},{1,2,3}
but the required sub-sequences are:
{1},{2},{3},{1,2},{1,3},{2,3}
That is, {1,2,3} is not taken because its element sum (1+2+3) = 6 is greater than S. The others are taken because their element sums are at most S.
So the total number of qualifying sub-sequences is 6.
So my answer is the count, which is 6.
I have tried recursive method but its time complexity is 2^n.
Please help us to do it in polynomial time.
You can solve this in reasonable time (probably) using the pseudo-polynomial algorithm for the knapsack problem, if the numbers are restricted to be positive (or, technically, zero, but I'm going to assume positive). It is called pseudo polynomial because it runs in nS time. This looks polynomial. But it is not, because the problem has two complexity parameters: the first is n, and the second is the "size" of S, i.e. the number of digits in S, call it M. So this algorithm is actually n 2^M.
To solve this problem, let's define a two dimensional matrix A. It has n rows and S columns. We will say that A[i][j] is the number of sub-sequences that can be formed using the first i elements and with a maximum sum of at most j. Immediately observe that the bottom-right element of A is the solution, i.e. A[n][S] (yes we are using 1 based indexing).
Now, we want a formula for A[i][j]. Observe that all subsequences using the first i elements either include the ith element, or do not. The number of subsequences that don't is just A[i-1][j]. The number that do, together with at least one earlier element, is A[i-1][j-v[i]], where v[i] is the value of the ith element: by including the ith element, we need to keep the remainder of the sum at or below j-v[i]. Since the empty subsequence is not counted in A, the subsequence consisting of the ith element alone adds one more whenever v[i] <= j. Adding these together combines the subsequences that do and don't include the ith element into the total. This leads us to the following algorithm (note: I use zero based indexing for elements and i, but 1 based for j):
std::vector<int> elements{1,2,3};
int S = 5;
auto N = elements.size();
std::vector<std::vector<int>> A;
A.resize(N);
for (auto& v : A) {
v.resize(S+1); // 1 based indexing for j/S, otherwise too annoying
}
// Number of subsequences using only first element is either 0 or 1
for (int j = 1; j != S+1; ++j) {
A[0][j] = (elements[0] <= j);
}
for (int i = 1; i != N; ++i) {
for (int j = 1; j != S+1; ++j) {
A[i][j] = A[i-1][j]; // sequences that don't use ith element
auto leftover = j - elements[i];
if (leftover >= 0) ++A[i][j]; // sequence with only ith element, if i fits
if (leftover >= 1) { // sequences with i and other elements
A[i][j] += A[i-1][leftover];
}
}
}
Running this program and then outputting A[N-1][S] yields 6 as required. If this program does not run fast enough you can significantly improve performance by using a single vector instead of a vector of vectors (and you can save a bit of space/perf by not wasting a column in order to 1-index, as I did).
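One possible reading of the single-vector suggestion is a rolling one-dimensional table that is updated in place, which is a different layout from flattening the 2-D matrix into one buffer. The sketch below assumes every element is positive; the function name countAtMost is made up for illustration. Unlike the matrix A above, it tracks exact sums and includes the empty subsequence internally, then removes it at the end.

```cpp
#include <cstdint>
#include <vector>

// Counts non-empty subsequences whose sum is at most S.
// Assumes every element is positive.
std::uint64_t countAtMost(const std::vector<int>& elements, int S) {
    if (S < 0) return 0;
    // ways[j] = number of subsequences (including the empty one) with sum exactly j
    std::vector<std::uint64_t> ways(S + 1, 0);
    ways[0] = 1;  // the empty subsequence
    for (int v : elements) {
        // iterate downwards so each element is used at most once
        for (int j = S; j >= v; --j) {
            ways[j] += ways[j - v];
        }
    }
    std::uint64_t total = 0;
    for (int j = 0; j <= S; ++j) total += ways[j];
    return total - 1;  // drop the empty subsequence
}
```

For the example in the question, countAtMost({1, 2, 3}, 5) gives 6. Memory use is S+1 counters instead of n rows of them.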
Yes. This problem can be solved in pseudo-polynomial time.
Let me redefine the problem statement as "Count the number of subsets that have SUM <= K".
Given below is a solution that works in O(N * K),
where N is the number of elements and K is the target value.
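#include <vector>
// Counts non-empty subsets of set[0..N-1] (positive values) whose sum is <= K.
long long countSubsets (const int set[], int N, int K) {
    if (N == 0 || K < 1) return 0;
    // dp[i][k] = number of non-empty subsets of the first i+1 elements with sum exactly k
    std::vector<std::vector<long long>> dp(N, std::vector<long long>(K + 1, 0));
    //1. Iterate through all the elements in the set.
    for (int i = 0; i < N; i++) {
        if (set[i] <= K) dp[i][set[i]] = 1; // the subset {set[i]} on its own
        if (i == 0) continue;
        //2. Include the count of subsets that don't include the element set[i]
        for (int k = 1; k <= K; k++) {
            dp[i][k] += dp[i-1][k];
        }
        //3. Now count subsets that include element set[i]
        for (int k = 1; k + set[i] <= K; k++) {
            dp[i][k+set[i]] += dp[i-1][k];
        }
    }
    //4. Return the sum of the last row of the dp table.
    long long count = 0;
    for (int k = 1; k <= K; k++) {
        count += dp[N-1][k];
    }
    // the empty subset is never stored in the table, so nothing has to be subtracted
    return count;
}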

C++ algorithm optimization: find K combination from N elements

I am pretty noobie with C++ and am trying to do some HackerRank challenges as a way to work on that.
Right now I am trying to solve Angry Children problem: https://www.hackerrank.com/challenges/angry-children
Basically, it asks to create a program that, given a set of N integers, finds the smallest possible "unfairness" for a K-length subset of that set. Unfairness is defined as the difference between the max and min of a K-length subset.
The way I'm going about it now is to find all K-length subsets and calculate their unfairness, keeping track of the smallest unfairness.
I wrote the following C++ program that seems to solve the problem correctly:
#include <algorithm> // min, max
#include <cmath>
#include <cstdio>
#include <iostream>
using namespace std;
int unfairness = -1;
int N, K, minc, maxc, ufair;
int *candies, *subset;
void check() {
ufair = 0;
minc = subset[0];
maxc = subset[0];
for (int i = 0; i < K; i++) {
minc = min(minc,subset[i]);
maxc = max(maxc, subset[i]);
}
ufair = maxc - minc;
if (ufair < unfairness || unfairness == -1) {
unfairness = ufair;
}
}
void process(int subsetSize, int nextIndex) {
if (subsetSize == K) {
check();
} else {
for (int j = nextIndex; j < N; j++) {
subset[subsetSize] = candies[j];
process(subsetSize + 1, j + 1);
}
}
}
int main() {
cin >> N >> K;
candies = new int[N];
subset = new int[K];
for (int i = 0; i < N; i++)
cin >> candies[i];
process(0, 0);
cout << unfairness << endl;
return 0;
}
The problem is that HackerRank requires the program to come up with a solution within 3 seconds and that my program takes longer than that to find the solution for 12/16 of the test cases. For example, one of the test cases has N = 50 and K = 8; the program takes 8 seconds to find the solution on my machine. What can I do to optimize my algorithm? I am not very experienced with C++.
All you have to do is to sort all the numbers in ascending order and then get minimal a[i + K - 1] - a[i] for all i from 0 to N - K inclusively.
That is true, because in optimal subset all numbers are located successively in sorted array.
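The sort-then-scan idea can be sketched as below. This is an illustrative sketch, not a submitted solution: the function name minUnfairness is made up, and it assumes 1 <= K <= N and that the differences fit in an int.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical helper: smallest (max - min) over all K-element subsets.
// Assumes 1 <= K <= values.size(). Takes the vector by value so the
// caller's data is left unsorted.
int minUnfairness(std::vector<int> values, int K) {
    std::sort(values.begin(), values.end());
    int best = INT_MAX;
    // Every candidate subset is a window of K neighbours in sorted order.
    for (std::size_t i = 0; i + K <= values.size(); ++i) {
        best = std::min(best, values[i + K - 1] - values[i]);
    }
    return best;
}
```

The work is dominated by the sort, O(N log N), followed by a single linear pass over the N - K + 1 windows.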
One suggestion I'd give is to sort the integer list before selecting subsets. This will dramatically reduce the number of subsets you need to examine. In fact, you don't even need to create subsets: simply look at the elements at index i (starting at 0) and i+k-1, and the lowest difference over all i [in valid bounds] is your answer. So now instead of n choose k subsets (combinatorial growth, I believe) you just have to look at ~n windows (linear runtime), and sorting (n log n) becomes your bottleneck in performance.

N choose k for large n and k

I have n elements stored in an array and a number k, the size of the subsets to draw from them (n choose k).
I have to find all the possible combinations of k elements in the array of length n and, for each set (of length k), make some calculations on the elements chosen.
I have written a recursive algorithm (in C++) that works fine, but for large numbers it crashes, running out of heap space.
How can I fix the problem? How can I calculate all the sets of n choose k for large n and k?
Is there any library for C++ that can help me?
I know it is an NP problem, but I would like to write the best code in order to handle the biggest numbers possible.
Approximately what are the biggest numbers (n and k) beyond which it becomes unfeasible?
I am only asking for the best algorithm, not for unfeasible space/work.
Here my code
vector<int> people;
vector<int> combination;
void pretty_print(const vector<int>& v)
{
static int count = 0;
cout << "combination no " << (++count) << ": [ ";
for (int i = 0; i < v.size(); ++i) { cout << v[i] << " "; }
cout << "] " << endl;
}
void go(int offset, int k)
{
if (k == 0) {
pretty_print(combination);
return;
}
for (int i = offset; i <= people.size() - k; ++i) {
combination.push_back(people[i]);
go(i+1, k-1);
combination.pop_back();
}
}
int main() {
int n = #, k = #;
for (int i = 0; i < n; ++i) { people.push_back(i+1); }
go(0, k);
return 0;
}
Here is a non-recursive algorithm:
const int n = ###;
const int k = ###;
int currentCombination[k];
for (int i=0; i<k; i++)
currentCombination[i]=i;
currentCombination[k-1] = k-1-1; // start one step before the first real combination, because the loop increments the last number before printing
do
{
if (currentCombination[k-1] == (n-1) ) // the last number has reached its maximum and cannot be increased
{
int i = k-1-1;
while (currentCombination[i] == (n-k+i))
i--;
currentCombination[i]++;
for (int j=(i+1); j<k; j++)
currentCombination[j] = currentCombination[i]+j-i;
}
else
currentCombination[k-1]++;
for (int i=0; i<k; i++)
_tprintf(_T("%d "), currentCombination[i]);
_tprintf(_T("\n"));
} while (! ((currentCombination[0] == (n-1-k+1)) && (currentCombination[k-1] == (n-1))) );
Your recursive algorithm might be blowing the stack. If you make it non-recursive, then that would help, but it probably won't solve the problem if your case is really 100 choose 10. You have two problems. Few, if any, computers in the world have 17+ terabytes of memory. Going through 17 trillion+ iterations to generate all the combinations will take way too long. You need to rethink the problem and either come up with an N choose K case that is more reasonable, or process only a certain subset of the combinations.
You probably do not want to be processing more than a billion or two combinations at the most - and even that will take some time. That translates to around 41 choose 10 to about 44 choose 10. Reducing either N or K will help. Try editing your question and posting the problem you are trying to solve and why you think you need to go through all of the combinations. There may be a way to solve it without going through all of the combinations.
If it turns out you do need to go through all those combinations, then maybe you should look into using a search technique like a genetic algorithm or simulated annealing. Both of these hill climbing search techniques provide the ability to search a large space in a relatively small time for a close to optimal solution, but neither guarantees that it will find the optimal solution.
You can use std::next_permutation() from the <algorithm> header to generate all possible combinations.
Here is some example code:
vector<bool> is_chosen(n, false);
fill(is_chosen.begin() + n - k, is_chosen.end(), true);
do
{
for(int i = 0; i < n; i++)
{
if(is_chosen[i])
cout << some_array[i] << " ";
}
cout << endl;
} while( next_permutation(is_chosen.begin(), is_chosen.end()) );
Don't forget to include <algorithm>.
As I said in a comment, it's not clear what you really want.
If you want to compute (n choose k) for relatively small values, say n,k < 100 or so, you may want to use a recursive method, using Pascal's triangle.
If n,k are large (say n=1000000, k=500000), you may be happy with an approximate result using Stirling's formula for the factorial: (n choose k) = exp(loggamma(n+1)-loggamma(k+1)-loggamma(n-k+1)), computing loggamma(x) via Stirling's formula.
If you want (n choose k) for all or many k but the same n, you can simply iterate over k and use (n choose k+1) = ((n choose k)*(n-k))/(k+1).
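The last two cases can be sketched in C++ as below. Treat it as a rough illustration rather than the answerer's code: it uses the standard library's std::lgamma instead of a hand-written Stirling series, and the names logChoose and chooseRow are made up. The identity used is lgamma(x + 1) = ln(x!).

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// ln C(n, k) via the log-gamma function, using lgamma(x + 1) = ln(x!).
// Staying in log space avoids overflow for large n and k.
double logChoose(double n, double k) {
    return std::lgamma(n + 1.0) - std::lgamma(k + 1.0) - std::lgamma(n - k + 1.0);
}

// All C(n, 0..n) for one n, using C(n, k+1) = C(n, k) * (n - k) / (k + 1).
// Exact only while the intermediate product fits in 64 bits
// (assumption: n up to about 60 is safe).
std::vector<std::uint64_t> chooseRow(unsigned n) {
    std::vector<std::uint64_t> row(n + 1);
    row[0] = 1;
    for (unsigned k = 0; k < n; ++k) {
        row[k + 1] = row[k] * (n - k) / (k + 1);  // the division is exact
    }
    return row;
}
```

For very large arguments, logChoose(1000000, 500000) returns the natural logarithm of the result, which is usually the only representable form of such a number in a double.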

Quick sort code explanation [closed]

Closed 9 years ago.
This is code that I came across implementing the quick sort algorithm. Can you please explain how the recursion works here?
void quickSort(int arr[], int left, int right)
{
int i = left, j = right;
int tmp;
int pivot = arr[(left + right) / 2];
/* partition */
while (i <= j) {
while (arr[i] < pivot)
i++;
while (arr[j] > pivot)
j--;
if (i <= j) {
tmp = arr[i];
arr[i] = arr[j];
arr[j] = tmp;
i++;
j--;
}
}
/* recursion */
if (left < j)
quickSort(arr, left, j);
if (i < right)
quickSort(arr, i, right);
}
And please note, this is not homework.
Not sure what you mean with "explain how the recursion is working". But here you go:
The function you posted takes an array of ints and two indexes. It will not sort the whole array, but only the part of it between the two indexes, ignoring anything that is outside them. This means the same function can sort the whole array if you pass the first and last indexes, or just a sub array if you pass a left value that is not the index of the first element of the array and/or a right value that is not the index of the last element.
The sorting algorithm is the well-known quicksort. As the pivot it uses the central element (it could just as well have used any other element). It partitions the array into the less than (or equal to) pivot subarray and the greater than (or equal to) pivot subarray, possibly leaving an element equal to the pivot between the two partitions.
Then it recursively calls itself to sort the two partitions, but only does it if it is necessary (hence the ifs before the recursive calls).
The implementation works, but is sub-optimal in many ways, and could be improved.
Here are some possible improvements:
1. switch to another sorting algorithm if the array is sufficiently short
2. choose the pivot value as the median of three values (generally first, last and mid)
3. initially move one pivot value out of the array (put it in first or last position and reduce the focus to the rest of the array), then change the tests to pass over values that are equal to the pivot to reduce the number of swaps involving them. You'll put the pivot value back in with a final exchange at the end. This is especially useful if you do not follow suggestion 2 and choose the first/last element instead of the mid one as in this implementation.
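The first two suggestions could look roughly like this. It is a sketch under assumptions: the cutoff of 16 is an arbitrary illustrative value that would need measuring, the names insertionSort, medianOfThree and quickSortTuned are made up, and the third suggestion (parking the pivot) is not applied here.

```cpp
#include <utility>

const int kCutoff = 16;  // assumed threshold; tune by measurement

// Insertion sort for short ranges arr[left..right].
void insertionSort(int arr[], int left, int right) {
    for (int i = left + 1; i <= right; ++i) {
        int value = arr[i];
        int j = i - 1;
        while (j >= left && arr[j] > value) {
            arr[j + 1] = arr[j];
            --j;
        }
        arr[j + 1] = value;
    }
}

// Median of the first, middle and last element of arr[left..right].
int medianOfThree(const int arr[], int left, int right) {
    int a = arr[left], b = arr[left + (right - left) / 2], c = arr[right];
    if (a > b) std::swap(a, b);
    if (b > c) std::swap(b, c);
    if (a > b) std::swap(a, b);
    return b;
}

void quickSortTuned(int arr[], int left, int right) {
    if (right - left < kCutoff) {        // suggestion 1: short range
        insertionSort(arr, left, right);
        return;
    }
    int pivot = medianOfThree(arr, left, right);  // suggestion 2
    int i = left, j = right;
    while (i <= j) {                     // same partition loop as the original
        while (arr[i] < pivot) i++;
        while (arr[j] > pivot) j--;
        if (i <= j) {
            std::swap(arr[i], arr[j]);
            i++;
            j--;
        }
    }
    if (left < j) quickSortTuned(arr, left, j);
    if (i < right) quickSortTuned(arr, i, right);
}
```

Because the median is always a value that occurs in the range, both inner scans still stop inside the range, just as with the middle-element pivot.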
late reply but I just added some prints and it might help whoever comes across this understand the code.
#include<iostream>
using namespace std;
void quickSort(int arr[], int left, int right)
{
int i = left, j = right;
int tmp;
int pivot = arr[abs((left + right) / 2)];
cout<<"pivot is"<<pivot<<endl;
/* partition */
while (i <= j) {
while (arr[i] < pivot)
i++;
while (arr[j] > pivot)
j--;
if (i <= j) {
cout<<"i and j are"<<i<<" "<<j<<"and corresponding array value is"<<arr[i]<<" " <<arr[j]<<endl;
tmp = arr[i];
arr[i] = arr[j];
arr[j] = tmp;
i++;
j--;
cout<<"entering first big while loop"<<endl;
for(int i=0;i<7;i++)
cout<<arr[i]<<" "<<endl ;
}
}
cout<<"recursion"<<endl;
/* recursion */
if (left < j)
quickSort(arr, left, j);
if (i< right)
quickSort(arr, i, right);
}
int main(){
int arr[7]= {2,3,8,7,4,9,1};
for(int i=0;i<7;i++)
cout<<arr[i]<<" " ;
quickSort(arr,0,6);
cout<<endl;
for(int i=0;i<7;i++)
cout<<arr[i]<<" " ;
int wait;
cin>>wait;
return 0;
}
Here is your answer -- in the common case both the recursive calls will be executed because the conditions above them will be true. However, in the corner case you could have the pivot element be the largest (or the smallest) element. In which case you have to make only one recursive call which will basically attempt the process one more time by choosing a different pivot after removing the pivot element from the array.

Interesting Problem (Currency arbitrage)

Arbitrage is the process of using discrepancies in currency exchange values to earn profit.
Consider a person who starts with some amount of currency X, goes through a series of exchanges and finally ends up with a larger amount of X than he initially had.
Given n currencies and an n x n table of exchange rates, devise an algorithm that a person should use to obtain the maximum profit, assuming that he doesn't perform any one exchange more than once.
I have thought of a solution like this:
Use modified Dijkstra's algorithm to find single source longest product path.
This gives longest product path from source currency to each other currency.
Now, iterate over every other currency and multiply its maximum product so far by w(curr,source) (the weight of the edge back to the source).
Select the maximum of all such paths.
While this appears good, I still doubt the correctness of this algorithm and the completeness of the problem (i.e. is the problem NP-complete?), as it somewhat resembles the traveling salesman problem.
Looking for your comments and better solutions(if any) for this problem.
Thanks.
EDIT:
A Google search for this topic took me to this here, where arbitrage detection has been addressed but finding the exchanges for maximum arbitrage has not. This may serve as a reference.
Dijkstra's cannot be used here because there is no way to modify Dijkstra's to return the longest path, rather than the shortest. In general, the longest path problem is in fact NP-complete as you suspected, and is related to the Travelling Salesman Problem as you suggested.
What you are looking for (as you know) is a cycle whose product of edge weights is greater than 1, i.e. w1 * w2 * w3 * ... > 1. We can reimagine this problem to change it to a sum instead of a product if we take the logs of both sides:
log (w1 * w2 * w3 ... ) > log(1)
=> log(w1) + log(w2) + log(w3) ... > 0
And if we take the negative log...
=> -log(w1) - log(w2) - log(w3) ... < 0 (note the inequality flipped)
So we are now just looking for a negative cycle in the graph, which can be solved using the Bellman-Ford algorithm (or, if you don't need to know the path, the Floyd-Warshall algorithm)
First, we transform the graph:
for (int i = 0; i < N; ++i)
for (int j = 0; j < N; ++j)
w[i][j] = -log(w[i][j]);
Then we perform a standard Bellman-Ford
double dis[N];
int pre[N]; // predecessor indices, so an integer type
for (int i = 0; i < N; ++i)
dis[i] = INF, pre[i] = -1;
dis[source] = 0;
for (int k = 0; k < N; ++k)
for (int i = 0; i < N; ++i)
for (int j = 0; j < N; ++j)
if (dis[i] + w[i][j] < dis[j])
dis[j] = dis[i] + w[i][j], pre[j] = i;
Now we check for negative cycles:
for (int i = 0; i < N; ++i)
for (int j = 0; j < N; ++j)
if (dis[i] + w[i][j] < dis[j])
// a negative cycle exists, and node j is on it or reachable from it
You can then use the pre array to find the negative cycle: start from a node j flagged by this check and follow pre backwards until a node repeats.
The fact that it is an NP-hard problem doesn't really matter when there are only about 150 currencies currently in existence, and I suspect your FX broker will only let you trade at most 20 pairs anyway. My algorithm for n currencies is therefore:
1. Make a tree of depth n and branching factor n. The nodes of the tree are currencies and the root of the tree is your starting currency X. Each link between two nodes (currencies) has weight w, where w is the FX rate between the two currencies.
2. At each node you should also store the cumulative FX rate (calculated by multiplying all the FX rates above it in the tree together). This is the FX rate between the root (currency X) and the currency of this node.
3. Iterate through all the nodes in the tree that represent currency X (maybe you should keep a list of pointers to these nodes to speed up this stage of the algorithm). There will only be n^n of these (very inefficient in terms of big-O notation, but remember your n is about 20). The one with the highest cumulative FX rate is your best FX rate and (if it is greater than 1) the path through the tree between these nodes represents an arbitrage cycle starting and ending at currency X.
Note that you can prune the tree (and so reduce the complexity from O(n^n) to O(n)) by following these rules when generating the tree in step 1:
If you get to a node for currency X, don't generate any child nodes.
To reduce the branching factor from n to 1, at each node generate all n child nodes and only add the child node with the greatest cumulative FX rate (when converted back to currency X).
Imho, there is a simple mathematical structure to this problem that lends itself to a very simple O(N^3) Algorithm. Given a NxN table of currency pairs, the reduced row echelon form of the table should yield just 1 linearly independent row (i.e. all the other rows are multiples/linear combinations of the first row) if no arbitrage is possible.
We can just perform gaussian elimination and check if we get just 1 linearly independent row. If not, the extra linearly independent rows will give information about the number of pairs of currency available for arbitrage.
Take the log of the conversion rates. Then you are trying to find the cycle starting at X with the largest sum in a graph with positive, negative or zero-weighted edges. This is an NP-hard problem, as the simpler problem of finding the longest cycle in an unweighted graph is NP-hard.
Unless I totally messed this up, I believe my implementation works using Bellman-Ford algorithm:
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>
std::vector<std::vector<double>> transform_matrix(std::vector<std::vector<double>>& matrix)
{
int n = matrix.size();
int m = matrix[0].size();
for (int i = 0; i < n; ++i)
{
for (int j = 0; j < m; ++j)
{
matrix[i][j] = -log(matrix[i][j]); // negate, so that a cycle with rate product > 1 becomes a negative cycle
}
}
return matrix;
}
bool is_arbitrage(std::vector<std::vector<double>>& currencies)
{
std::vector<std::vector<double>> tm = transform_matrix(currencies);
// Bellman-ford algorithm
int src = 0;
int n = tm.size();
std::vector<double> min_dist(n, INFINITY);
min_dist[src] = 0.0;
for (int i = 0; i < n - 1; ++i)
{
for (int j = 0; j < n; ++j)
{
for (int k = 0; k < n; ++k)
{
if (min_dist[k] > min_dist[j] + tm[j][k])
min_dist[k] = min_dist[j] + tm[j][k];
}
}
}
for (int j = 0; j < n; ++j)
{
for (int k = 0; k < n; ++k)
{
if (min_dist[k] > min_dist[j] + tm[j][k])
return true;
}
}
return false;
}
int main()
{
std::vector<std::vector<double>> currencies = { {1, 1.30, 1.6}, {.68, 1, 1.1}, {.6, .9, 1} };
if (is_arbitrage(currencies))
std::cout << "There exists an arbitrage!" << "\n";
else
std::cout << "There does not exist an arbitrage!" << "\n";
std::cin.get();
}