What is the practical use of Kadane's algorithm? - c++

This algorithm is interesting. I know that it is used in image processing (and probably other scenarios), but what I find strange is that it also operates on negative values, and negative values are virtually nonexistent in the imaging world, where pixel values are usually stored as unsigned integers.
Could you offer a practical example that takes advantage of Kadane's algorithm?

One obvious application is business analysis, where you need to find the period in which the company experienced its maximum growth, or the period with its minimum growth. This helps the company work out what it did well or badly during those periods, so that it can repeat or avoid those actions in the future.

It can handle problems like:
Station Travel in Order
Given a list of n gas stations of the form P(D,X), where D is the distance from this station to the next station and X is the amount of petrol available at this station, identify the starting station from which you can complete the journey to each station in order from 1.....N. You can only go in one direction, i.e. from P(i) to P(i+1). Suppose you need to find the max sum.
Kadane's algorithm (solution)
Initialize:
    max_so_far = 0
    max_ending_here = 0
Loop for each element of the array
    (a) max_ending_here = max_ending_here + a[i]
    (b) if(max_ending_here < 0)
            max_ending_here = 0
    (c) if(max_so_far < max_ending_here)
            max_so_far = max_ending_here
return max_so_far
Hotels Along the Croatian Coast.
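The Kadane pseudocode above can be sketched in C++ roughly as follows. This is only an illustration: the `MaxSubarray` struct and the `maxSubarray` name are made up for this example, and it assumes a non-empty input. It also differs from the pseudocode in one way: the running values start from the first element rather than 0, so an all-negative array yields its largest element instead of 0. The index range is tracked as well, which is what the "period of maximum growth" use case needs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: the best sum and the inclusive index range producing it.
struct MaxSubarray {
    long long sum;
    std::size_t first;
    std::size_t last;
};

// Kadane's algorithm for a non-empty array (assumption: a.size() >= 1).
// Runs in O(n) time and O(1) extra space.
MaxSubarray maxSubarray(const std::vector<int>& a) {
    long long endingHere = a[0];   // best sum of a subarray ending at the current index
    std::size_t start = 0;         // where that subarray begins
    MaxSubarray best{a[0], 0, 0};
    for (std::size_t i = 1; i < a.size(); ++i) {
        if (endingHere < 0) {
            // A negative running sum can only hurt: restart at i.
            endingHere = a[i];
            start = i;
        } else {
            endingHere += a[i];
        }
        if (endingHere > best.sum) {
            best = {endingHere, start, i};
        }
    }
    return best;
}
```

For period-over-period growth figures, `first` and `last` would then mark the stretch with the largest cumulative growth.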

I am quoting directly from Wikipedia:
The problem was first posed by Ulf Grenander of Brown University in
1977, as a simplified model for maximum likelihood estimation of
patterns in digitized images. A linear time algorithm was found soon
afterwards by Jay Kadane of Carnegie-Mellon University (Bentley 1984).
If you don't already know what maximum likelihood estimation is, it is worth reading up on before going further.

Possible way to find the actual billboard locations in the Billboard Highway Problem [Dynamic Programming]

I've been learning about dynamic programming over the past few days and came across the Highway Billboard Problem. From what I understand, we can find the maximum revenue given the possible sites, their revenues, the length of the highway and the minimum distance between any two billboards. Is there a way to also find the actual billboard locations alongside the maximum revenue?
For the code, I've been looking at this: https://www.geeksforgeeks.org/highway-billboard-problem/
Yes, it is possible to write down the sequence of the chosen sites.
There are two max function calls. Replace each of them with an explicit if, and inside the branch where the current site is used, add the current position to a list (in the first max clause, to a freshly emptied list, as far as I understand).
For example,
maxRev[i] = max(maxRev[i-1], revenue[nxtbb]);
change to this (pseudocode, did not check validity)
if (revenue[nxtbb] > maxRev[i-1]) {
    maxRev[i] = revenue[nxtbb];
    sitelist.clear();
    sitelist.push(i);
}
else
    maxRev[i] = maxRev[i-1];
and
maxRev[i] = max(maxRev[i-t-1]+revenue[nxtbb], maxRev[i-1]);
change to
if (maxRev[i-t-1]+revenue[nxtbb] > maxRev[i-1]) {
    maxRev[i] = maxRev[i-t-1]+revenue[nxtbb];
    sitelist.push(i);
}
else
    maxRev[i] = maxRev[i-1];
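An alternative to maintaining the list during the forward pass is to record, for each mile, whether the DP chose the site there, and then walk backwards from the end of the highway. The sketch below takes that approach. It is not the GeeksforGeeks code; the function name `placeBillboards` and the `used` array are made up for this example. It assumes site positions are distinct integers in 1..m and that two billboards must be more than t miles apart.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Returns {maximum revenue, chosen site positions in increasing order}.
// m: highway length; x: site positions (distinct, each in 1..m);
// revenue: revenue per site; t: billboards must be more than t miles apart.
std::pair<int, std::vector<int>> placeBillboards(
    int m, const std::vector<int>& x, const std::vector<int>& revenue, int t) {
    std::vector<int> siteAt(m + 1, -1);  // index of the site at mile i, or -1
    for (std::size_t k = 0; k < x.size(); ++k) {
        siteAt[x[k]] = static_cast<int>(k);
    }

    std::vector<int> maxRev(m + 1, 0);   // best revenue using miles 1..i
    std::vector<bool> used(m + 1, false);  // true if the optimum for 1..i places a billboard at i
    for (int i = 1; i <= m; ++i) {
        maxRev[i] = maxRev[i - 1];
        if (siteAt[i] >= 0) {
            const int before = (i - t - 1 >= 0) ? maxRev[i - t - 1] : 0;
            const int withSite = before + revenue[siteAt[i]];
            if (withSite > maxRev[i]) {
                maxRev[i] = withSite;
                used[i] = true;
            }
        }
    }

    // Walk back: a used mile contributes a billboard and skips t + 1 miles.
    std::vector<int> chosen;
    for (int i = m; i >= 1;) {
        if (used[i]) {
            chosen.push_back(i);
            i -= t + 1;
        } else {
            --i;
        }
    }
    std::reverse(chosen.begin(), chosen.end());
    return {maxRev[m], chosen};
}
```

The backward walk only trusts decisions that are part of the final optimum, so a site that looked best at some intermediate mile but was later superseded never ends up in the list.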

linear programming problem for minimum cost

A construction company has 6 projects, for each they need $d_i$ workers. The company has no workers at the beginning of project 1.
Whenever new workers are hired, they must take a safety course that costs 300, plus 50 more for each new worker.
If there are no new workers, there is no course.
Firing a worker does not cost any money, and a worker can't be rehired.
Given that the salary of a worker is 100 per project, formulate a linear programming problem that minimizes the workers costs.
What I tried:
Let $x_i$ be the number of new workers for project $i$.
Let $y_i$ be the number of old workers remaining from previous projects until project $i$ (all the workers hired - all the workers that were fired)
Let $z_i$ be an indicator such that $z_i =0 \iff x_i>0$
The function I'm trying to solve is:
$\min(\sum_{i=1}^6 150x_i + 300(1-z_i) + 100y_i)$
s.t:
\begin{align}
x_i,y_i,z_i &\ge 0 \\
z_i &\ge 1-x_i \\
y_i + x_i &\ge d_i \\
y_i &\ge y_{i-1} + x_i
\end{align}
Something doesn't feel right to me. The main reason is that I tried to solve this in MATLAB and it failed.
What did I do wrong? How can I solve this question?
If I see this correctly, you have two small mistakes in your constraints.
The first appears in $z_i \ge 1 - x_i$. This allows $z_i$ to take the value 1 all the time, so the extra cost of 300 is never charged. You need to bound $z_i$ from above so that it cannot be 1 when $x_i > 0$. This calls for a big-M constraint: for a sufficiently large $M$, use $z_i \le 1 - x_i/M$. This way, when $x_i = 0$ you can have $z_i = 1$; otherwise the right-hand side is smaller than 1 and, because $z_i$ is integral, it has to be zero. Note that you usually want to choose $M$ as tight as possible, so in your case $d_i$ might be a good choice.
The second small mistake lies in $y_i \ge y_{i-1} + x_i$. This way you can increase $y_i$ over $y_{i-1}$ without having to set any $x_i$. To force $x_i$ to increase, you need to flip the inequality. Additionally, by the way you defined $y_i$, this inequality should refer to $x_{i-1}$. Thus you should end up with $y_i \le y_{i-1} + x_{i-1}$. You also need to take care of corner cases (i.e. $y_1 = 0$).
I think it should work with these two changes. Let me know whether it helped you. If it still doesn't work, I might have missed something.
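Putting the two suggested changes together with the question's original objective, the model would look roughly like this. It is only a consolidation of the answer, not a verified formulation; it assumes $z_i$ is declared binary and that $M$ is an upper bound on any $x_i$:
\begin{align}
\min\ & \sum_{i=1}^{6} \left( 150x_i + 300(1-z_i) + 100y_i \right) \\
\text{s.t.}\quad x_i, y_i &\ge 0, \quad z_i \in \{0,1\} \\
z_i &\le 1 - x_i/M \\
y_i + x_i &\ge d_i \\
y_i &\le y_{i-1} + x_{i-1}, \quad y_1 = 0
\end{align}
Because $z_i$ is integer-valued, the model is a mixed-integer program rather than a pure linear program, so it needs a solver that supports integer variables.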

Search similar object

Assume I have the following array of objects:
Object 0:
[0]=1.1344
[1]=2.18
...
[N]=1.86
-----------
Object 1 :
[0]=1.1231
[1]=2.16781
...
[N]=1.8765
-------------
Object 2 :
[0]=1.2311
[1]=2.14781
...
[N]=1.5465
--------
Object 17:
[0]=1.31
[1]=2.55
...
[N]=0.75
How can I compare those objects?
You can see that object 0 and object 1 are very similar, but object 17 is not like any of them.
I would like an algorithm that will give me all the similar objects in my array.
You tagged this question with Algorithm (and I am not an expert in C++), so let me give pseudocode.
First, set a threshold: two values that differ by less than the threshold count as similar. The second step is to loop over all pairs of elements and check for similarity.
Let A be the array with n objects and m the number of fields in each object.
threshold = 0.1
for i in (0, n):
    for j in (i+1, n):
        flag = true
        for k in (0, m):
            if (abs(A[i][k] - A[j][k]) > threshold)
                flag = false // if the absolute value of the diff is above the threshold, the objects are not similar
                break // no need to continue checks
        if (flag)
            print: element i and j similar // and do whatever
Time complexity is O(m * n^2).
Notice that you can use the same idea to sort the objects array: declare the compare function as the maximum difference between fields and then sort accordingly.
Hope that helps!
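Since the question is about C++, the pseudocode above might translate to something like the following sketch. The function name `similarPairs` is made up for this example, and it assumes every object has the same number of fields.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Returns every index pair (i, j) with i < j whose objects differ by at most
// `threshold` in every field. Assumes all objects have the same field count.
// Runs in O(m * n^2) for n objects with m fields each.
std::vector<std::pair<std::size_t, std::size_t>> similarPairs(
    const std::vector<std::vector<double>>& objects, double threshold) {
    std::vector<std::pair<std::size_t, std::size_t>> result;
    for (std::size_t i = 0; i < objects.size(); ++i) {
        for (std::size_t j = i + 1; j < objects.size(); ++j) {
            bool similar = true;
            for (std::size_t k = 0; k < objects[i].size(); ++k) {
                if (std::fabs(objects[i][k] - objects[j][k]) > threshold) {
                    similar = false;  // one field too far apart is enough
                    break;
                }
            }
            if (similar) {
                result.emplace_back(i, j);
            }
        }
    }
    return result;
}
```

With a threshold of 0.1 and the first three fields shown in the question, only objects 0 and 1 would be reported as similar.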
Your problem essentially boils down to nearest neighbor search, which is a well-researched problem in data mining.
There are different approaches to this problem.
I would suggest deciding first how many similar elements you want, OR setting a threshold for the similarity. Then you have to iterate through all the vectors and compute a distance function between the query vector and each vector in the database.
I would suggest using Euclidean distance in your case, since you have real-valued data.
Both nearest neighbor search and Euclidean distance are well documented if you want to read more. Good luck!
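A brute-force version of this idea, as a rough sketch: compute the Euclidean distance from the query object to every other object and keep the closest one. The names `euclideanDistance` and `nearestNeighbor` are made up for this example; it assumes at least two objects, all with the same number of fields.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Euclidean distance between two vectors of equal length.
double euclideanDistance(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) {
        const double d = a[k] - b[k];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Linear-scan nearest neighbor: index of the object closest to objects[query],
// excluding the query itself. Assumes objects.size() >= 2.
std::size_t nearestNeighbor(const std::vector<std::vector<double>>& objects,
                            std::size_t query) {
    std::size_t best = query;
    double bestDist = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < objects.size(); ++i) {
        if (i == query) continue;
        const double d = euclideanDistance(objects[query], objects[i]);
        if (d < bestDist) {
            bestDist = d;
            best = i;
        }
    }
    return best;
}
```

A linear scan costs O(n * m) per query, which is usually fine for small arrays; spatial index structures only start to pay off for much larger data sets.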
What you need is a classifier. Two algorithms fit your problem, depending on what you want.
If you need to find which object is most similar to a chosen object m, you can use the nearest neighbor algorithm; if you need to find sets of similar objects, you can use the k-means algorithm to find k sets.

Better alternative to divide and conquer algorithm

First, let me explain the problem I'm trying to solve. I'm integrating my code with a third-party library that does quite complicated financial predictions. For the purposes of this question, let's just say I have a black box that returns y when I pass in x.
Now, what I need to do is find the input (x) for a given output (y). Since I know the lowest and highest possible input values, I wrote the following algorithm:
1. define starting input range (minimum input value to maximum input value)
2. divide the range into two equal parts and find output for a middle value
3. find which half output falls into
4. repeat steps 2 and 3 until range is too small to divide any further
This algorithm does the job nicely, I don't see any problems with it. However, is there a faster way to solve this problem?
It sounds like x and y are strongly correlated (i.e. as x increases, so does y), as otherwise your divide and conquer algorithm wouldn't work.
Assuming this is the case, and you could work out a correlation factor, then you might be able to use that factor, rather than the plain midpoint, to home in on the expected value more quickly.
Please note that I've not tested this idea at all, but it's something to think about. Possible improvements would be to make the correlationFactor a moving average, or to precompute it based on, say, the deciles between xLow and xHigh.
Also, this assumes that calling f(x) is relatively inexpensive. If it is expensive, then the increased number of calls to f(x) would dwarf any savings. In fact, I'm starting to think this is a stupid idea...
Hopefully the following pseudo-code illustrates what I mean:
DivideAndConquer(xLow, xHigh, correlationFactor, expectedValue)
    // correlationFactor estimates the change in x per unit change in y,
    // so step from xLow towards the expected value
    xMid = xLow + (expectedValue - f(xLow)) * correlationFactor
    // Add some range checks to make sure that xMid is within xLow and xHigh!!
    y = f(xMid)
    if (y == expectedValue)
        return xMid
    elseif (y > expectedValue)
        // y increases with x, so the answer lies below xMid
        correlationFactor = (xMid - xLow) / (f(xMid) - f(xLow))
        return DivideAndConquer(xLow, xMid, correlationFactor, expectedValue)
    else
        correlationFactor = (xHigh - xMid) / (f(xHigh) - f(xMid))
        return DivideAndConquer(xMid, xHigh, correlationFactor, expectedValue)

Given a collection of stacks of different heights, how can I select every combination possible?

Input: total cost.
Output: all the combinations of levels that give the desired cost.
Every level of each stack costs a different amount (level 1 in stack 1 doesn't cost the same as level 1 in stack 2). I have a function that converts the level to the actual cost based on the base cost (level 1), which I entered manually (hard coded).
I need to find the combinations of levels that give me the input cost. I realize there is more than one possible solution, but I only need a way to iterate through every possibility.
Here is what I need:
input = 224, this is one of the solutions:
I'm making a simple program that needs to select levels of different stacks and then calculate the cost, and I need to know every possible cost that exists... Each level of each stack costs a different amount of money, but that is not the problem, the problem is how to select one level for each stack.
I probably explained that very vaguely, so here's a picture (you'll have to excuse my poor drawing skills):
So, all stacks have the level 0, and level 0 always costs 0 money.
Additional info:
I have an array called "maxLevels", length of that array is the number of stacks, and each element is the number of the highest level in that stack (for example, maxLevels[0] == 2).
You can iterate from the 1st level because the level 0 doesn't matter at all.
The selected levels should be saved in an array (name: "currentLevels") that is similar to maxLevels (same length) but, instead of containing the maximum level of a stack, contains the selected level of a stack (for example: currentLevels[3] == 2).
I'm programming in C++, but pseudocode is fine as well.
This isn't homework, I'm doing it for fun (it's basically for a game).
I'm not sure I understand the question, but here's how to churn through all the possible combinations of selecting one item from each stack (in this case 3*1*2*3*1 = 18 possibilities):
void visit_each_combination(size_t *maxLevels, size_t num_of_stacks, Visitor &visitor, std::vector<size_t> &choices_so_far) {
    if (num_of_stacks == 0) {
        visitor.visit(choices_so_far);
    } else {
        for (size_t pos = 0; pos <= maxLevels[0]; ++pos) {
            choices_so_far.push_back(pos);
            visit_each_combination(maxLevels+1, num_of_stacks-1, visitor, choices_so_far);
            choices_so_far.pop_back();
        }
    }
}
You can replace visitor.visit with whatever you want to do with each combination, to make the code more specific. I've used a vector choices_so_far instead of your array currentLevels, but it could just as well work with an array.
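If recursion is not wanted, the same enumeration can be done iteratively by treating the selected levels like the digits of an odometer. The sketch below is an alternative to the recursive function above, not a rewrite of it; `Levels` and `forEachCombination` are names made up for this example, and the callback stands in for the `Visitor`.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Levels = std::vector<std::size_t>;

// Calls visit(currentLevels) once for every way of picking a level
// 0..maxLevels[s] from each stack s. Works like an odometer: stack 0
// advances fastest, and a stack that wraps around carries into the next one.
void forEachCombination(const Levels& maxLevels,
                        const std::function<void(const Levels&)>& visit) {
    Levels current(maxLevels.size(), 0);
    while (true) {
        visit(current);
        std::size_t stack = 0;
        // Reset every stack that is already at its maximum level (the carry).
        while (stack < current.size() && current[stack] == maxLevels[stack]) {
            current[stack] = 0;
            ++stack;
        }
        if (stack == current.size()) {
            return;  // every stack wrapped around: all combinations visited
        }
        ++current[stack];
    }
}
```

Inside the callback, one could convert each selected level to its cost, sum the costs, and keep only the combinations whose total equals the target.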
This is very simple, if I've understood it correctly. The minimum cost is 0, and the maximum cost is just the sum of the heights of the stacks. To achieve any specific cost between these limits, you can start from the left, selecting the maximum level for each stack until your target is achieved, and then select level 0 for the remaining stacks. (You may have to adjust the last non-zero stack if you overshoot the target.)
I solved it, I think. @Steve Jessop gave me the idea to use recursion.
Algorithm:
circ(currentStack)
{
    for (i = 0; i <= allStacks[currentStack]; i++)
        if (currentStack == lastStack && i == allStacks[currentStack])
            return 0;
        else if (currentStack != lastStack)
            circ(currentStack + 1); // pass the next stack without modifying currentStack
}