I have the following code and I need to derive its execution-time growth rate, but I have no idea where to start. My question is: how do I go about doing this? Any help would be appreciated.
Thank you.
// function to merge two sorted arrays
int merge(int smax, char sArray[], int tmax, char tArray[], char target[])
{
    int m, s, t;
    // merge while both arrays still have elements left
    for (m = s = t = 0; s < smax && t < tmax; m++)
    {
        if (sArray[s] <= tArray[t])
        {
            target[m] = sArray[s];
            s++;
        }
        else
        {
            target[m] = tArray[t];
            t++;
        }
    }
    int compCount = m; // number of element comparisons performed
    // copy whatever remains of sArray
    for (; s < smax; m++)
    {
        target[m] = sArray[s++];
    }
    // copy whatever remains of tArray
    for (; t < tmax; m++)
    {
        target[m] = tArray[t++];
    }
    return compCount;
}
It's actually very simple.
Look at the first for loop: each iteration increments either s or t, so it runs at most smax + tmax times, which makes it O(smax + tmax). The second loop is obviously O(smax), the third O(tmax). Altogether we get O(smax + tmax).
(There are cleverer ways to prove this, but I've intentionally left them out.)
All loops are bounded in their number of iterations by smax + tmax. So you could say the algorithm is O(max(smax, tmax)), which is the same thing as O(smax + tmax).
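If you want to see that bound empirically, you can use the fact that merge already returns its comparison count. Below is a minimal driver sketch of mine, not part of the original question: the makeSorted helper and the sizes are arbitrary choices, and it assumes the merge function above is pasted into the same file.
#include <iostream>
#include <vector>

int merge(int smax, char sArray[], int tmax, char tArray[], char target[]); // the function above

// hypothetical helper: a non-decreasing array of n chars in 'a'..'z'
std::vector<char> makeSorted(int n)
{
    std::vector<char> v(n);
    for (int i = 0; i < n; ++i)
        v[i] = 'a' + (i * 26) / n;
    return v;
}

int main()
{
    for (int n = 1000; n <= 8000; n *= 2)
    {
        std::vector<char> s = makeSorted(n), t = makeSorted(n);
        std::vector<char> out(2 * n);
        int comparisons = merge(n, s.data(), n, t.data(), out.data());
        // the count stays proportional to smax + tmax = 2n
        std::cout << "n=" << n << "  comparisons=" << comparisons << "\n";
    }
}
Doubling n should roughly double the reported count, which is exactly what linear growth in smax + tmax looks like.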
I was working on a Codingame challenge: the horse racing dual.
The goal is to find the minimum difference between two elements of a list.
I started with this first algorithm, which I think is O(n log n), but the execution was timing out for large arrays.
int array[N];
int min = numeric_limits<int>::max();
for (int i = 0; i < N; i++) {
    int value;
    cin >> value;
    cin.ignore();
    array[i] = value;
    // compare the new value against every element read so far
    for (int j = i - 1; j >= 0; --j) {
        int diff = abs(array[j] - value);
        if (diff < min) {
            min = diff;
        }
    }
}
I then tried this other algorithm, which is also O(n log n), and this time the execution finishes in time.
int array[N];
int min = numeric_limits<int>::max();
for (int i = 0; i < N; i++) {
    int value;
    cin >> value;
    cin.ignore();
    array[i] = value;
}
sort(array, array + N);
// after sorting, the closest pair must be adjacent
for (int i = 1; i < N; ++i) {
    int diff = abs(array[i - 1] - array[i]);
    if (diff < min) {
        min = diff;
    }
}
Am I wrong about the first code's complexity? Is there a difference that I did not notice?
Thanks for your help.
Am I wrong about the first code's complexity?
Yes, you are wrong: the complexity is not O(n log n) but O(n^2).
The outer loop runs n (N) times, while the inner loop runs n/2 times on average. The complexity is therefore O(n * n/2), which is O(n^2), since multiplicative constants don't matter in complexity calculations.
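To make that count concrete: the inner loop performs i comparisons on the i-th outer iteration, so the total number of comparisons is
\sum_{i=0}^{N-1} i = \frac{N(N-1)}{2} = O(N^2).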
Is there a difference that I did not notice?
Yes, there is. Even if two algorithms have the very same complexity, such as O(n log n), they can run in very different times because of the constant factors that asymptotic complexity ignores.
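As a rough way to see both the O(n^2) versus O(n log n) gap and those constant factors for yourself, you could time the two approaches on random data. This is only a sketch of mine, not code from the post; the helper names and the array size are arbitrary:
#include <algorithm>
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <limits>
#include <vector>

// compare every pair: the first approach, O(n^2)
int minDiffQuadratic(const std::vector<int>& a)
{
    int best = std::numeric_limits<int>::max();
    for (std::size_t i = 1; i < a.size(); ++i)
        for (std::size_t j = 0; j < i; ++j)
            best = std::min(best, std::abs(a[i] - a[j]));
    return best;
}

// sort first; then only adjacent elements can be closest, O(n log n)
int minDiffSorted(std::vector<int> a)
{
    std::sort(a.begin(), a.end());
    int best = std::numeric_limits<int>::max();
    for (std::size_t i = 1; i < a.size(); ++i)
        best = std::min(best, a[i] - a[i - 1]);
    return best;
}

int main()
{
    std::vector<int> a(20000);
    for (int& x : a)
        x = std::rand();

    auto t0 = std::chrono::steady_clock::now();
    int r1 = minDiffQuadratic(a);
    auto t1 = std::chrono::steady_clock::now();
    int r2 = minDiffSorted(a);
    auto t2 = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::milli> q = t1 - t0, s = t2 - t1;
    std::cout << "quadratic: " << q.count() << " ms (min diff " << r1 << ")\n"
              << "sorted:    " << s.count() << " ms (min diff " << r2 << ")\n";
}
Both functions return the same answer; only the time differs, and on large inputs the quadratic version dominates the total by orders of magnitude.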
Reader,
Well, I think I just got my brain twisted a bit.
I'm implementing knapsack, and I realized I've only ever written the brute-force algorithm once or twice, so I decided to write another one.
And here's where I got stuck.
Let's say W is the maximum weight and w(min) is the lightest element we can put in the knapsack, so we can place at most k = W/w(min) items. I'm explaining this so that you, reader, know why I need to ask my question.
Now, imagine we have 3 types of items we can put in the knapsack, our knapsack can store 15 units of mass, and each type's unit weight equals its number. Then we can put in 15 items of type 1, or 7 items of type 2 and 1 item of type 1. But selections like 22222221 and 12222222 (seven 2's and one 1 either way) mean the same thing for us, and counting both is a waste of whatever resource we pay for the computation. (That's a bit of a joke, since brute force is already a waste when a cheaper algorithm exists, but I'm very curious.)
As far as I can tell, the kind of selection we need in order to walk through all possible combinations is called "combinations with repetition". Their number C'(n,k) is (n+k-1)!/((n-1)!k!).
(While typing this message I spotted a hole in my theory: we will probably need to add an empty, zero-weight, zero-price item to stand for free space; that probably just increases n by 1.)
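As a quick sanity check of that formula (my own example, not from the original post): for n = 3 item types and k = 2 slots,
C'(3,2) = \frac{(3+2-1)!}{(3-1)!\,2!} = \frac{4!}{2!\,2!} = 6,
which matches the six multisets 11, 12, 13, 22, 23, 33.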
So, here's the issue.
https://rosettacode.org/wiki/Combinations_with_repetitions
Since the problem is well described at the link above, I'll only add that I don't really want to use a stack the way that solution does; I want to generate the selections in a single loop running from i = 0 to i < C'(n,k).
So, if I can build that, how should it work?
we have
int prices[n];  // appear mystically
int weights[n]; // same as previous, and I guess we place (0,0) in both of them
int W, k;       // W initialized by our lord and savior
k = W / min(weights);
int road[k], finalroad[k]; // all 0
int curP = 0, curW = 0, maxP = 0, maxW = 0;
for (int i = 0; i < rCombNumber(n, k); i++) {
    /* guys, please help me figure out how to generate this mask, which
       consists of indices from 0 to n (the meaning of each element);
       k is the size of the mask. */
    curW = 0;
    for (int j = 0; j < k; j++)
        curW += weights[road[j]];
    if (curW < W) {
        curP = 0;
        for (int l = 0; l < k; l++)
            curP += prices[road[l]];
        if (curP > maxP) {
            maxP = curP;
            maxW = curW;
            memcpy(finalroad, road, sizeof road); // arrays can't be assigned directly
        }
    }
}
mask (road) is an array of indices, each ranging from 0 to n (each element's meaning is an item index); it has to enumerate the C'(n,k) selections (see the link above) drawn from { 0, 1, 2, ..., n } with k elements per selection (combinations with repetition, where order is unimportant).
That's it. Prove me wrong or help me. Many thanks in advance.
And yes, of course the algorithm will take a hell of a long time, but it looks like it should work, and I'm very interested in it.
UPDATE:
What am I missing?
http://pastexen.com/code.php?file=EMcn3F9ceC.txt
The answer was provided by Minoru here: https://gist.github.com/Minoru/745a7c19c7fa77702332cf4bd3f80f9e
It's enough to increment a single element and then propagate the carries: we record where the carries happened, take as the reset value the maximum of the elements to the left of the reset point, and reset with it.
Here's my code:
#include <iostream>
using namespace std;

// naive factorial; note: overflows a 64-bit integer past 20!
static long long FactNaive(int n)
{
    long long r = 1;
    for (int i = 2; i <= n; ++i)
        r *= i;
    return r;
}

// number of combinations with repetition: C'(n,k) = (n+k-1)!/((n-1)!k!)
static long long CrNK(long n, long k)
{
    long long u, l;
    u = FactNaive(n + k - 1);
    l = FactNaive(k) * FactNaive(n - 1);
    return u / l;
}

int main()
{
    const int numberOFchoices = 7, kountOfElementsInCombination = 4;
    int arrayOfSingleCombination[kountOfElementsInCombination] = {0, 0, 0, 0};
    int leftmostResetPos = kountOfElementsInCombination;
    int resetValue = 1;
    for (long long iterationCounter = 0; iterationCounter < CrNK(numberOFchoices, kountOfElementsInCombination); iterationCounter++)
    {
        leftmostResetPos = kountOfElementsInCombination;
        if (iterationCounter != 0)
        {
            // increment the last position, then propagate carries leftwards
            arrayOfSingleCombination[kountOfElementsInCombination - 1]++;
            for (int anotherIterationCounter = kountOfElementsInCombination - 1; anotherIterationCounter > 0; anotherIterationCounter--)
            {
                if (arrayOfSingleCombination[anotherIterationCounter] == numberOFchoices)
                {
                    leftmostResetPos = anotherIterationCounter;
                    arrayOfSingleCombination[anotherIterationCounter - 1]++;
                }
            }
        }
        if (leftmostResetPos != kountOfElementsInCombination)
        {
            // reset everything from the leftmost carry onward to the maximum
            // of the digits to its left, keeping the sequence non-decreasing
            resetValue = 1;
            for (int j = 0; j < leftmostResetPos; j++)
            {
                if (arrayOfSingleCombination[j] > resetValue)
                {
                    resetValue = arrayOfSingleCombination[j];
                }
            }
            for (int j = leftmostResetPos; j != kountOfElementsInCombination; j++)
            {
                arrayOfSingleCombination[j] = resetValue;
            }
        }
        for (int j = 0; j < kountOfElementsInCombination; j++)
        {
            cout << arrayOfSingleCombination[j] << " ";
        }
        cout << "\n";
    }
    return 0;
}
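Tracing the code for n = 7 choices and k = 4 positions, it prints the multisets in non-decreasing order: 0 0 0 0, then 0 0 0 1 through 0 0 0 6, then 0 0 1 1, 0 0 1 2, and so on, for C'(7,4) = 210 lines in total.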
Thanks a lot, Minoru!
I am trying to solve this problem:
Given a string array words, find the maximum value of length(word[i]) * length(word[j]) where the two words do not share common letters. You may assume that each word will contain only lower case letters. If no such two words exist, return 0.
https://leetcode.com/problems/maximum-product-of-word-lengths/
You can create a bitmap of the letters in each word to check whether two words share characters, and then compute the max product.
I have two methods that are almost identical, but the first passes the checks while the second is too slow. Can you see why?
class Solution {
public:
    int maxProduct2(vector<string>& words) {
        int len = words.size();
        int *num = new int[len];
        // compute one bitmask per word, bit 0..25 for 'a'..'z' -- O(n)
        // (shifting by the raw char value instead of value - 'a' would be UB)
        for (int i = 0; i < len; i++) {
            int k = 0;
            for (int j = 0; j < words[i].length(); j++) {
                k = k | (1 << (words[i].at(j) - 'a'));
            }
            num[i] = k;
        }
        int c = 0;
        // O(n^2)
        for (int i = 0; i < len - 1; i++) {
            for (int j = i + 1; j < len; j++) {
                if ((num[i] & num[j]) == 0) { // if no common letters
                    int x = words[i].length() * words[j].length();
                    if (x > c) {
                        c = x;
                    }
                }
            }
        }
        delete[] num;
        return c;
    }

    int maxProduct(vector<string>& words) {
        vector<int> bitmap(words.size());
        for (int i = 0; i < words.size(); ++i) {
            int k = 0;
            for (int j = 0; j < words[i].length(); ++j) {
                k |= 1 << (words[i][j] - 'a');
            }
            bitmap[i] = k;
        }
        int maxProd = 0;
        for (int i = 0; i < words.size() - 1; ++i) {
            for (int j = i + 1; j < words.size(); ++j) {
                if (!(bitmap[i] & bitmap[j])) {
                    int x = words[i].length() * words[j].length();
                    if (x > maxProd)
                        maxProd = x;
                }
            }
        }
        return maxProd;
    }
};
Why is the second function (maxProduct) too slow for LeetCode?
Solution
The second method makes repeated calls to words.size(). If you save that in a variable, it runs fine.
Since my comment turned out to be correct, I'll turn it into an answer and try to explain what I think is happening.
I wrote some simple code to benchmark on my own machine, with two solutions of two loops each. The only difference is that the call to words.size() is inside the loop in one and outside it in the other. The first solution took approximately 13.87 seconds versus 16.65 seconds for the second. This isn't huge, but it's about 20% slower.
Even though vector::size() is a constant-time operation, that doesn't mean it's as fast as checking against a variable that's already in a register. Constant time can still hide large differences, and inside nested loops that adds up.
The other thing that could be happening (someone much smarter than me will probably chime in and let us know) is that you're hurting CPU optimizations like branch prediction and pipelining. Every time execution reaches the end of the loop it has to stop, wait for the call to size() to return, and then check the loop variable against that return value. If the CPU can look ahead and guess that j is still going to be less than len, because it hasn't seen len change (len isn't even modified inside the loop!), it can make a good branch prediction each time and not have to wait.
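A minimal sketch of the fix described above: hoist words.size() into a local before the loops. This is a hypothetical rewrite of the slow method using the same bitmask idea; maxProductHoisted is my name for it, not part of the original post:
int maxProductHoisted(vector<string>& words) {
    int len = words.size(); // call size() once and keep it in a local
    vector<int> bitmap(len);
    for (int i = 0; i < len; ++i) {
        int k = 0;
        for (int j = 0; j < (int)words[i].length(); ++j)
            k |= 1 << (words[i][j] - 'a');
        bitmap[i] = k;
    }
    int maxProd = 0;
    for (int i = 0; i < len - 1; ++i)
        for (int j = i + 1; j < len; ++j)
            if (!(bitmap[i] & bitmap[j])) { // no letters in common
                int x = words[i].length() * words[j].length();
                if (x > maxProd)
                    maxProd = x;
            }
    return maxProd;
}
With the bound in a plain local variable, the compiler can keep it in a register and the loop condition no longer depends on a function call.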
Fairly simple: I want to loop over every index of an array of size arraysize using only one variable for the loop. I have a way to do it with two variables, i and j, where i tracks the actual index and wraps around, and j counts up to arraysize and terminates the loop:
for (unsigned int i = start, j = 0; //start is the starting index
j < arraysize;
++i == arraysize ? i = 0 : 0, ++j)
{
//do stuff
}
Is there some nifty way to do this with only i? Order doesn't matter, if backward iteration makes sense for some reason.
Clarification: I want to loop from start to arraysize - 1, then from 0 to start - 1.
At least as I understand it, you want to loop through the entire array, but you want to start from somewhere other than the beginning, then when you reach the end, you want to start back at the beginning and keep going until you reach the original starting point.
Assuming that's correct, it's pretty easy:
for (size_t i = 0; i < arraysize; i++)
    process(array[(i + start) % arraysize]);
I would prefer to abstract that algorithm into a generic function (which would work even on things like std::forward_list), without superfluous modulo and addition operations (though they may be acceptable in many cases):
#include <algorithm>
#include <iostream>
#include <iterator>

template<typename FwdIter, typename F>
F for_each_shifted(FwdIter first, FwdIter start, FwdIter last, F f)
{
    using std::for_each;
    return for_each(first, start, for_each(start, last, f));
}

int main()
{
    using namespace std;
    int v[] = { 1, 1, 2, 6, 24 };
    for_each_shifted(begin(v), begin(v) + 3, end(v), [](int x)
    {
        cout << x << endl;
    });
}
Output is:
6
24
1
1
2
for ( i=start; i<start+arraysize; i++ ) {
    // do stuff with (i % arraysize) in place of i
}
size_t i = start;
do {
    // stuff
    i = (i + 1) % arraysize;
} while (i != start);
This would get you there:
for (unsigned int i = start; i < start + arraySize; i++)
{
    DoSomething(array[i % arraySize]);
}
Alternatively:
for (unsigned int i = 0; i < arraySize; i++)
{
    DoSomething(array[(i + start) % arraySize]);
}
For example, you can use the following loop statement:
for ( int i = start; i < arraysize + start; i++ )
and inside the loop body use the expression i % arraysize instead of i.
I'm using OpenMP to write a parallel version of Dijkstra's algorithm. My code consists of two parts. The first part is executed by only one thread (the master). This thread chooses new nodes from the list. The second part is executed by the other threads. These threads update the distances from the source to the other nodes. Unfortunately there is an error in my code, because one of the many threads executing the second part suddenly "disappears". There is probably a data synchronization problem, but I don't know where. I would be grateful if someone could tell me where my mistake is. Here is the code:
map<int, int> C;
map<int, int> S;
map<int, int> D;
int init;
int nu;
int u;
int p = 3; //omp_get_num_threads();
int d;
int n = graph->getNodesNum();
#pragma omp parallel shared(n, C, d, S, init, nu, u, D, graph, p) num_threads(p)
{
    int myId = omp_get_thread_num();
    if (myId == 0)
    {
        init = 0;
        nu = 0;
        u = to;
        while (init < p - 1) // busy-wait until every worker has initialized
        {
        }
        while (u != 0)
        {
            S[u] = 1;
            while (nu < p - 1) // busy-wait until every worker has finished its pass
            {
            }
            u = 0;
            d = INFINITY;
            for (int i = 1; i <= p - 1; ++i)
            {
                int j = C[i];
                if ((j != 0) && (D[j] < d))
                {
                    d = D[j];
                    u = j;
                }
            }
            nu = 0;
        }
    }
    else
    {
        for (int i = myId; i <= n; i += p - 1)
        {
            D[i] = INFINITY;
            S[i] = 0;
        }
        D[u] = 0;
        ++init; // unsynchronized read-modify-write on a shared counter
        while (init < p - 1)
        {
        }
        while (u != 0)
        {
            C[myId] = 0;
            int d = INFINITY;
            for (int i = myId; i <= n; i += p - 1)
            {
                if (S[i] == 0)
                {
                    if (i != u)
                    {
                        int cost = graph->getCostBetween(u, i);
                        if (cost != INFINITY)
                        {
                            D[i] = min(D[i], D[u] + cost);
                        }
                    }
                    if (d > D[i])
                    {
                        d = D[i];
                        C[myId] = i;
                    }
                }
            }
            ++nu; // unsynchronized read-modify-write on a shared counter
            while (nu != 0) // busy-wait for the master to reset nu
            {
            }
        }
    }
}
}
I don't know what prior information you have, but parallelising an irregular, highly synchronized algorithm with small tasks is among the toughest parallel problems one can face. Research teams can dedicate themselves to such tasks and get limited speedups, or get nowhere. Often such algorithms only work on specific architectures tailored to the parallelisation, where quirky overheads such as false sharing have been eliminated by designing the data structures appropriately.
An algorithm such as this needs a lot of time and effort to profile, measure, and reason about. See for example this paper:
ww2.cs.fsu.edu/~flin/ppq_report.pdf
Now, onto your direct question: since your algorithm is highly synchronized and its tasks are small, you are experiencing the side effects of data races. Removing these from your parallel algorithm is going to be very tricky, and no-one here can do it for you.
So your first port of call is to look at tools which can help you detect data races, such as Valgrind and the Intel thread checker.
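To make the kind of race concrete (this is a sketch of one pattern from the question, not a fix for the whole algorithm): ++nu in the posted code is an unsynchronized read-modify-write on a shared int, and the empty while loops busy-wait on values whose updates by other threads they are not guaranteed to observe. OpenMP's atomic and barrier constructs are the standard tools for this, along these lines:
#include <omp.h>
#include <cstdio>

int main()
{
    const int p = 4;
    int counter = 0; // shared counter, stands in for 'nu' in the question

    #pragma omp parallel num_threads(p) shared(counter)
    {
        // RACY alternative would be a plain counter++: two threads can read
        // the same old value and both write back old + 1, losing an increment

        // SAFE: the increment is performed as a single indivisible update
        #pragma omp atomic
        counter++;

        // every thread waits here until all of them have incremented
        #pragma omp barrier

        #pragma omp single
        printf("counter = %d (expected %d)\n", counter, p);
    }
    return 0;
}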