Finding a submatrix with largest sum [duplicate] - c++

I have an NxN matrix. I want to find the MxM submatrix of it whose elements have the largest sum.
What is an efficient algorithm for this?

I have an algorithm whose running time stays constant as M grows and grows quadratically with N.
Compute the sum of the first submatrix the usual way and save it. Then move one column to the right: the two MxM matrices overlap, so you only need to sum the two non-overlapping columns (subtract the one that left, add the one that entered). Save all the sums. Now you can choose the largest sum for that row of submatrices.
Move to the next row. Remember the saved sums? The MxM matrices of the first and second rows overlap again, so you only need to sum the two non-overlapping matrix rows to compute the first sum of the second row.
Now move to the second sum of the second row. Do the same thing as above, but notice that the non-overlapping rows used for the first sum and for the second sum of the second row overlap as well.
I know this is somewhat confusing. If you don't get it, let me know and I will draw a picture of it. The algorithm is based on the paper in this answer.
EDIT: I know I promised a picture, but this should be enough:
A AB AB AB AB B
AC ABCD ABCD ABCD ABCD BD
AC ABCD ABCD ABCD ABCD BD
AC ABCD ABCD ABCD ABCD BD
AC ABCD ABCD ABCD ABCD BD
C CD CD CD CD D
These are four submatrices, A, B, C, D, positioned like this:
AB
CD
First you compute the sum of the A submatrix: sum(A). No optimizations here. Now you want the sum of B: sum(B). You could do the same as for A, but notice that A and B overlap. So you assign sum(A) to sum(B), compute the sum of the vertical vector A AC AC AC AC and subtract it from sum(B), then compute the sum of the vertical vector B BD BD BD BD and add it to sum(B).
sum(B) = sum(A) - sum(A AC AC AC AC) + sum(B BD BD BD BD)
You have sum(B). Now you can continue and compute the whole first row of submatrices.
Move to the second row: matrices C and D. You don't have to sum the whole matrix C, because in the previous row you saved sum(A). Notice they overlap again. You just need to add the difference between A and C:
//code (1)
subC = sum([A AB AB AB AB]) // the row to subtract for C
addC = sum([C CD CD CD CD]) // the row to add for C
sum(C) = sum(A) - subC + addC
You have sum(C). Now you can get sum(D) like this:
//code (2)
subD = sum([AB AB AB AB B]) // the row to subtract for D
addD = sum([CD CD CD CD D]) // the row to add for D
sum(D) = sum(B) - subD + addD
But compare subD with subC and addD with addC. They overlap! So you can do it this way:
//code (3)
subD = subC - A + B // from subC, subtract the value at A and add the value at B
addD = addC - C + D // same as above
sum(D) = sum(B) - subD + addD
You see that instead of 25 additions for computing the sum of one submatrix, we do 6. For every possible size MxM, we have MxM additions for the first submatrix, M*2+2 additions for each other submatrix in the first row and in the first column, and 6 additions for each of the rest.
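The same sliding idea can be sketched in Python. This is a simplified variant, not the exact bookkeeping described above: it keeps one running sum per column (the hypothetical list col_sums) and slides a window over those, which gives the same O(N^2) behaviour. The function name max_sum_submatrix is made up for illustration.

```python
def max_sum_submatrix(matrix, m):
    """Return (best_sum, top_row, left_col) of the m x m window with the largest sum."""
    n = len(matrix)
    if not 1 <= m <= n:
        raise ValueError("m must be between 1 and the matrix size")

    # col_sums[c] holds the sum of column c over the current band of m rows.
    col_sums = [sum(matrix[r][c] for r in range(m)) for c in range(n)]
    best = None

    for top in range(n - m + 1):
        if top > 0:
            # Slide the band down: add the entering row, drop the leaving row.
            for c in range(n):
                col_sums[c] += matrix[top + m - 1][c] - matrix[top - 1][c]

        # First window of this band is summed directly.
        window = sum(col_sums[:m])
        if best is None or window > best[0]:
            best = (window, top, 0)

        # Slide right: add the entering column, drop the leaving column.
        for left in range(1, n - m + 1):
            window += col_sums[left + m - 1] - col_sums[left - 1]
            if window > best[0]:
                best = (window, top, left)

    return best


grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
print(max_sum_submatrix(grid, 2))  # (28, 1, 1): the window 5 6 / 8 9
```

Keeping column sums trades a little extra memory (one number per column) for simpler code than tracking the subC/addC rows individually.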

How to compare and calculate each row to all other rows in the same column and table

My data looks like this, a place with their coordinates details
Place Latitude Longitude
A 2.314 97.6110288
B 3.425 98.6925504
C 4.1231 99.774072
D 5.096466667 100.8555936
E 6.001016667 101.9371152
F 6.905566667 103.0186368
G 7.810116667 104.1001584
H 8.714666667 105.18168
I 9.619216667 106.2632016
J 10.52376667 107.3447232
K 11.42831667 108.4262448
L 12.33286667 109.5077664
M 13.23741667 110.589288
N 14.14196667 111.6708096
O 15.04651667 112.7523312
P 15.95106667 113.8338528
So the table looks like this.
What I want to do is compare each place to all the other places by computing the distance between them, and if the distance fulfills a criterion, add one to the output.
So, for example, we compare the distance from Place A to B, C, D, E, F, G and so on:
A-B, distance = 100
A-C, distance = 70
A-D, distance = 50
A-E, distance = 120
A-F, distance = 140
A-G, distance = 175
A-H, distance = 80
A-I, distance = 40
A-J, distance = 190
A-K, distance = 209
A-L, distance = 109
A-M, A-N, A-O, A-P = 150
Then we apply a condition: if I only want to count the ones larger than 151, it will return 3 for this row.
This is calculated for all rows in the table.
Expected output:
Place Latitude Longitude Bigger Than 151
A 2.314 97.6110288 3
B 3.425 98.6925504 5
C 4.1231 99.774072 1
D 5.096466667 100.8555936 3
E 6.001016667 101.9371152 2
F 6.905566667 103.0186368 1
G 7.810116667 104.1001584 5
H 8.714666667 105.18168 2
I 9.619216667 106.2632016 4
J 10.52376667 107.3447232 1
K 11.42831667 108.4262448 0
L 12.33286667 109.5077664 0
M 13.23741667 110.589288 0
N 14.14196667 111.6708096 0
O 15.04651667 112.7523312 0
P 15.95106667 113.8338528 0
I can also use Python in Power BI if Power Query or DAX cannot solve this.
Thank you
Start by cross-joining your Places table with itself: Cross join
Next calculate the (haversine) distances between all places: Use Power Query to Calculate Distance
Finally filter the Distance column > 151 and GroupBy Place, counting the rows.
Of course everything can be done in DAX as well, but then all calculations run "live" in the report, which will hurt performance with 100k x 100k rows.
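Since Python was mentioned as an option, here is a minimal sketch of the same three steps (pair every place with every other, compute the haversine distance, count those above the threshold). It assumes the distances are wanted in kilometres and uses a mean Earth radius of 6371 km. The function names and the tuple layout are made up for illustration, and the counts it produces will not necessarily match the illustrative numbers in the question.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius, distances in km


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_far_places(places, threshold_km):
    """For each (name, lat, lon), count the other places farther than threshold_km."""
    counts = {}
    for name, lat, lon in places:
        counts[name] = sum(
            1
            for other, other_lat, other_lon in places
            if other != name
            and haversine_km(lat, lon, other_lat, other_lon) > threshold_km
        )
    return counts


# First rows of the table from the question.
places = [
    ("A", 2.314, 97.6110288),
    ("B", 3.425, 98.6925504),
    ("C", 4.1231, 99.774072),
    ("D", 5.096466667, 100.8555936),
]
for name, count in count_far_places(places, 151).items():
    print(name, count)
```

This loop is quadratic in the number of places, just like the cross join, so for very large tables the same performance caveat applies.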

Regressing state-level coefficient on state-level law

I have the following.
county state employment county_shock state_law
1 NY 70 3 10
2 NY 80 4 10
4 IL 100 2 5
7 IL 60 9 5
3 TX 90 8 2
I ran the regression for all counties:
regress employment county_shock
But now I am curious about how the state-level law affects the degree to which county_shock affects employment.
I am not sure whether adding an interaction term achieves this.
But what I am trying to do is the following:
Run
regress employment county_shock
for "each state"
Then I will have coefficient for county_shock for each state. I could get those coefficients by "_b"
Then, regress those coefficients on state-level law.
How should I do this?
This is what I thought at first:
The initial model is y = b0 + b1*x + u, where x is county_shock for short. You mean that b1 is a function of z, which is state_law. For this you can regress y on x, z and x*z. (Here you want to let the intercept differ across states as well. The model without z looks strange because it means that the intercept is the same for all states.)
Now, let the model with interaction be y = c0 + c1*x + c2*z + c3*x*z + u. This is written as y = (c0 + c2*z) + (c1+c3*z)*x + u. Thus, c3 is the coefficient you want.
In Stata,
reg employment c.county_shock##c.state_law
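To see how c3 relates to the two-step idea from the question, here is a small Python sketch on made-up, noise-free data. The coefficient values, the county_shock values and the helper ols_line are all hypothetical. It fits a separate line per state, then regresses the per-state slopes on state_law; with no noise, the second-step slope equals c3 and its intercept equals c1.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical true coefficients of y = c0 + c1*x + c2*z + c3*x*z (no error term).
C0, C1, C2, C3 = 50.0, 2.0, 1.5, 0.25

# state -> (state_law z, made-up county_shock values x)
states = {
    "NY": (10.0, [3.0, 4.0, 6.0]),
    "IL": (5.0, [2.0, 9.0, 5.0]),
    "TX": (2.0, [8.0, 1.0, 4.0]),
}

laws, slopes = [], []
for z, xs in states.values():
    ys = [C0 + C1 * x + C2 * z + C3 * x * z for x in xs]
    _, b1 = ols_line(xs, ys)  # step 1: slope of y on x within one state
    laws.append(z)
    slopes.append(b1)

# Step 2: regress the per-state slopes on state_law.
intercept, slope = ols_line(laws, slopes)
print(intercept, slope)  # close to C1 and C3
```

With real, noisy data the two approaches will generally not give identical estimates or standard errors, so this only illustrates the algebra, not an equivalence of the procedures.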
-
After reading the second part of your question, I am not sure if this is what you want.

Rectangle in a triangle

I am trying to build a program in C++ that will procedurally generate cities.
For the moment, the city is represented by an array of blocks either Quad blocks or Triangle blocks.
I can't find an effective way to subdivide a triangle (TBlock) into a rectangle (QBlock) and three triangles.
So picture a triangle ABC. We have two points T and T', which lie at one third and two thirds of the line segment BC.
Now I need to find P and P', which lie on the line segments AB and AC respectively.
P is the intersection of AB and the normal of BC passing through T.
P' is the intersection of AC and the normal of BC passing through T'.
I know how to find T and T' and the inward normal of vector BC, but I can't find a way to compute the normal passing through T or T'.
Thanks !
Given your (BC) vector is (x, y), a normal vector of BC is (-y, x). Now offset the normal vector by coordinates of T', and you will get the normal of BC passing through T'.
The normal to a line y = m*x + c is the line y = (-1/m)*x + d, where c and d are constants.
You have two lines with a common point (T or T'), so you can solve them simultaneously to find m and d for both T and T'.
You know that the angle formed by CBA is the same angle formed by TBP. Let's call that O. You also know the distance between B and T. Call it D. Using this we can find P using trigonometry.
Tan(O) = X / D
where X is the distance between T and P (the y-axis distance if BC lies along the x-axis). Just solve for X, since you know O and D: X = D * Tan(O).
Once you know X, you can add X to the y value of T to find P.
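The normal-vector approach can be sketched as follows. Python is used here for brevity, and the arithmetic carries over to C++ directly. It assumes T and T' sit at one third and two thirds of BC and that the triangle is not degenerate; the function names are hypothetical. Each point is found by intersecting the line through T (or T') along the normal (-y, x) with the line AB (or AC).

```python
def cross(u, v):
    """2D cross product (z component)."""
    return u[0] * v[1] - u[1] * v[0]


def intersect_normal(t, normal, a, other):
    """Intersect the line t + s*normal with the line through a and other."""
    d = (other[0] - a[0], other[1] - a[1])
    denom = cross(normal, d)
    if denom == 0:
        raise ValueError("normal is parallel to the triangle side")
    s = cross((a[0] - t[0], a[1] - t[1]), d) / denom
    return (t[0] + s * normal[0], t[1] + s * normal[1])


def rectangle_corners(a, b, c):
    """Return (T, T', P, P') for triangle ABC with T, T' on BC."""
    bc = (c[0] - b[0], c[1] - b[1])
    t1 = (b[0] + bc[0] / 3, b[1] + bc[1] / 3)          # T: one third of BC
    t2 = (b[0] + 2 * bc[0] / 3, b[1] + 2 * bc[1] / 3)  # T': two thirds of BC
    normal = (-bc[1], bc[0])                           # a normal of BC
    p1 = intersect_normal(t1, normal, a, b)            # P on line AB
    p2 = intersect_normal(t2, normal, a, c)            # P' on line AC
    return t1, t2, p1, p2


print(rectangle_corners((3.0, 6.0), (0.0, 0.0), (6.0, 0.0)))
```

Note that T, T', P' and P only form a true rectangle when P and P' end up at the same height above BC, which holds in this isosceles example but not for an arbitrary triangle.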

Euclidean algorithm on Sage for more than 2 elements

I'm trying to write an exercise which takes a list of numbers and returns a list of coefficients like this: if A=[a0,a1,a2] then there is U=[u0,u1,u2] such that a0*u0 + a1*u1 + a2*u2 = d, where d is the gcd of A.
For 2 elements is a pretty simple thing, as Sage has a function to retrieve u0 and u1 out of a0 and a1:
A=[15,21]
(d,u0,u1)=xgcd(A[0],A[1])
I just don't understand how could I do this with a list of n elements.
Note that gcd(a, b, c) = gcd(gcd(a, b), c). This means that you can use the built-in function repeatedly to calculate the coefficients that you want.
You helped me a lot, came to this:
x1=[1256,5468,5552,1465]
n=-1
for i in x1:
    n=n+1
(d,w,x)=xgcd(x1[n-1],x1[n])
u1=[w,x]
n=n-2
while n>=0:
    div=d
    (d,u,v)=xgcd(x1[n],div)
    position=0
    for j in u1:
        a=j*v
        u1[position]=a
        position=position+1
    u1=[u]+u1
    n=n-1
u1
And it works ;)
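A more compact way to write the same idea is to fold over the list from the left. The sketch below is plain Python, so it defines its own extended_gcd instead of Sage's xgcd. Both function names are hypothetical, and it assumes a non-empty list of positive integers.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def xgcd_list(values):
    """Return (d, coeffs) with sum(v*c for v, c in zip(values, coeffs)) == d."""
    g, coeffs = values[0], [1]
    for v in values[1:]:
        # g_new = s*g + t*v, so scale the existing coefficients by s and append t.
        g, s, t = extended_gcd(g, v)
        coeffs = [c * s for c in coeffs] + [t]
    return g, coeffs


numbers = [1256, 5468, 5552, 1465]
d, u = xgcd_list(numbers)
print(d, u, sum(a * b for a, b in zip(numbers, u)))
```

The coefficients are not unique, so this may return a different U than the loop above while still satisfying the same identity.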

How to solve Linear Diophantine equations in programming?

I have read that equations such as ax+by=c are called linear Diophantine equations and have an integer solution only if gcd(a,b) divides c.
These equations are of great importance in programming contests. I was searching the Internet when I came across this problem. I think it's a variation of Diophantine equations.
Problem :
I have two persons, Person X and Person Y, both standing in the middle of a rope. Person X can jump either A or B units to the left or right in one move. Person Y can jump either C or D units to the left or right in one move. Now, I'm given a number K and I have to find the number of possible positions on the rope in the range [-K,K] such that both persons can reach that position using their respective moves any number of times. (A, B, C, D and K are given in the question.)
My solution:
I think the problem can be solved mathematically using diophantine equations.
I can form an equation for Person X like A x_1 + B y_1 = C_1 where C_1 belongs to [-K,K] and similarly for Person Y like C x_2 + D y_2 = C_2 where C_2 belongs to [-K,K].
Now my search space reduces to just finding the number of possible values for which C_1 and C_2 are same. This will be my answer for this problem.
To find those values I'm just finding gcd(A,B) and gcd(C,D) and then taking the lcm of these two gcd's to get LCM(gcd(A,B),gcd(C,D)) and then simply calculating the number of points in the range [1,K] which are multiples of this lcm.
My final answer will be 2*no_of_multiples in [1,K] + 1.
I tried using the same technique in my C++ code, but it's not working(Wrong Answer).
This is my code :
http://pastebin.com/XURQzymA
My question is: can anyone please tell me if I'm using diophantine equations correctly ?
If yes, can anyone tell me possible cases where my logic fails.
These are some of the test cases which were given on the site with problem statement.
A B C D K are given as input in same sequence and the corresponding output is given on next line :
2 4 3 6 7
3
1 2 4 5 1
3
10 12 3 9 16
5
This is the link to original problem. I have written the original question in simple language. You might find it difficult, but if you want you can check it:
http://www.codechef.com/APRIL12/problems/DUMPLING/
Please give me some test cases so that I can figure out where I am going wrong.
Thanks in advance.
Solving Linear Diophantine equations
ax + by = c and gcd(a, b) divides c.
Divide a, b and c by gcd(a,b).
Now gcd(a,b) == 1
Find solution to aU + bV = 1 using Extended Euclidean algorithm
Multiply equation by c. Now you have a(Uc) + b (Vc) = c
You found solution x = U*c and y = V * c
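These steps can be sketched in Python as follows. The function names are hypothetical and the sketch assumes a and b are positive integers. It returns one particular solution, or None when gcd(a, b) does not divide c.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_diophantine(a, b, c):
    """Return one integer pair (x, y) with a*x + b*y == c, or None if none exists."""
    g, _, _ = extended_gcd(a, b)
    if c % g != 0:
        return None  # no integer solution
    # Divide everything by gcd(a, b); now gcd(a1, b1) == 1.
    a1, b1, c1 = a // g, b // g, c // g
    # Solve a1*U + b1*V == 1, then multiply the equation by c1.
    _, u, v = extended_gcd(a1, b1)
    return u * c1, v * c1


print(solve_linear_diophantine(6, 9, 15))
print(solve_linear_diophantine(2, 4, 7))   # None: gcd(2, 4) = 2 does not divide 7
```

All other solutions follow from the returned pair by adding t*b/g to x and subtracting t*a/g from y for any integer t.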
The problem is that the input values are 64-bit (up to 10^18), so the LCM can be up to 128 bits large and therefore l can overflow. Since k is 64-bit, an overflowing l means that l is larger than k, so k / l = 0 (and the answer is 1). You need to check this case.
For instance:
unsigned long long l=g1/g; // cannot overflow
unsigned long long res;
if ((l * g2) / g2 != l)
{
    // overflow case - l*g2 is very large, so k/(l*g2) is 0
    res = 0;
}
else
{
    l *= g2;
    res = k / l;
}
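For cross-checking a C++ submission, the counting approach from the question can be sketched in Python, where integers have arbitrary precision and the overflow guard is not needed. The function name is hypothetical, and the three sample cases are the ones quoted in the question.

```python
import math


def count_common_positions(a, b, c, d, k):
    """Count positions in [-k, k] reachable by both persons.

    Person X reaches multiples of gcd(a, b), person Y multiples of gcd(c, d),
    so the shared positions are the multiples of their least common multiple.
    """
    g1 = math.gcd(a, b)
    g2 = math.gcd(c, d)
    lcm = g1 // math.gcd(g1, g2) * g2  # exact: Python ints do not overflow
    return 2 * (k // lcm) + 1          # positive and negative multiples, plus 0


print(count_common_positions(2, 4, 3, 6, 7))      # 3
print(count_common_positions(1, 2, 4, 5, 1))      # 3
print(count_common_positions(10, 12, 3, 9, 16))   # 5
```

Running both programs on large random inputs (values near 10^18) is one way to find the cases where the 64-bit version goes wrong.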