SymPy unable to simplify solution - sympy

I am trying to solve the following system of equation using sympy.
from sympy import *
n = 4
K = 2
a = symbols(f"a_:{int(n)}", real=True)
b = symbols(f"b_:{int(n)}", real=True)
X = symbols(f"X_:{int(K)}", real=True)
Y = symbols(f"Y_:{int(K)}", real=True)
lambda_ = symbols("lambda",real=True)
mu = symbols(f"mu_:{int(K)}", real=True)
list_eq = [
# (1)
Eq(a[0] + a[1] + a[2] + a[3], 0),
Eq(a[0] + a[1], X[0]),
Eq(a[2] + a[3], X[1]),
# (2)
Eq(b[0] + b[1] + b[2] + b[3], 0),
Eq(b[0] + b[1], Y[0]),
Eq(b[2] + b[3], Y[1]),
# (3)
Eq(b[0], a[0] - lambda_ - mu[0]),
Eq(b[1], a[1] - lambda_ - mu[0]),
Eq(b[2], a[2] - lambda_ - mu[1]),
Eq(b[3], a[3] - lambda_ - mu[1]),
]
solve(list_eq, dict=True)
[{X_0: -b_2 - b_3 + mu_0 - mu_1,
X_1: b_2 + b_3 - mu_0 + mu_1,
Y_0: -b_2 - b_3,
Y_1: b_2 + b_3,
a_0: -b_1 - b_2 - b_3 + mu_0/2 - mu_1/2,
a_1: b_1 + mu_0/2 - mu_1/2,
a_2: b_2 - mu_0/2 + mu_1/2,
a_3: b_3 - mu_0/2 + mu_1/2,
b_0: -b_1 - b_2 - b_3,
lambda: -mu_0/2 - mu_1/2}]
The analytical solution for b is
b_0 = a_0 + (1/2)*(Y_0 - X_0)
b_1 = a_1 + (1/2)*(Y_0 - X_0)
b_2 = a_2 + (1/2)*(Y_1 - X_1)
b_3 = a_3 + (1/2)*(Y_1 - X_1)
However sympy does not manage to simplify the results and is still using mu_0 and mu_1 in the solution.
Is it possible to simplify those variables in the solution ?
For more details, the system i'm trying to solve is an optimization problem under constraints:
min_b || a - b ||^2 such that b_0 + b_1 + b_2 + b_3 = 0 and b_0 + b_1 = Y_0 and b_2 + b_3 = Y_1.
We assume that a_0 + a_1 + a_2 + a_3 = 0 and a_0 + a_1 = X_0 and a_2 + a_3 = X_1.
Therefore, the equations (1) are the assumptions on a and the equations (2) and (3) are the KKT equations.

You can eliminate variables from a system of linear or polynomial equations using a Groebner basis:
In [61]: G = groebner(list_eq, [*mu, lambda_, *b, *a, *X, *Y])
In [62]: for eq in G: pprint(eq)
X₁ - Y₁ + 2⋅λ + 2⋅μ₀
-X₁ + Y₁ + 2⋅λ + 2⋅μ₁
X₁ + Y₁ + 2⋅a₁ + 2⋅b₀
-X₁ + Y₁ - 2⋅a₁ + 2⋅b₁
-X₁ - Y₁ + 2⋅a₃ + 2⋅b₂
X₁ - Y₁ - 2⋅a₃ + 2⋅b₃
X₁ + a₀ + a₁
-X₁ + a₂ + a₃
X₀ + X₁
Y₀ + Y₁
Here the first two equations have mu and lambda but the others have these symbols eliminated. You can use G[2:] to get the equations that do not involve mu and lambda. The order of the symbols in a lex Groebner basis determines which symbols are eliminated first from the equations. You can solve specifically for b in terms of a, X and Y by picking out the equations involving b:
In [63]: solve(G[2:6], b)
Out[63]:
⎧ X₁ Y₁ X₁ Y₁ X₁ Y₁ X₁ Y₁ ⎫
⎨b₀: - ── - ── - a₁, b₁: ── - ── + a₁, b₂: ── + ── - a₃, b₃: - ── + ── + a₃⎬
⎩ 2 2 2 2 2 2 2 2 ⎭
This is not exactly the form you suggested but the form of solution for the problem is not unique because of the constraints among the variables it is expressed in. There are many equivalent ways to express b in terms of a, X and Y even after eliminating mu and lambda because a, X and Y are not independent (they are 8 symbols connected by 4 constraints).

Sometimes adding auxiliary equations with the pattern you desire and indicating what you don't want as a solution variable can help you get closer to what you desired:
[38] eqs = list_eq + [Y[0]-X[0]-var('z0'), Y[1]-X[1]-var('z1')]
[39] sol = Dict(solve(eqs, exclude=a, dict=True)[0]); sol

Related

How to represent the elements of the Galois filed GF(2^8) and perform arithmetic in NTL library

I am new to NTL library for its GF2X, GF2E, GF2EX, etc. Now, I want to perform multiplication on the Galois field GF(2^8). The problem is as following:
Rijndael (standardised as AES) uses the characteristic 2 finite field with 256 elements,
which can also be called the Galois field GF(2^8).
It employs the following reducing polynomial for multiplication:
x^8 + x^4 + x^3 + x^1 + 1.
For example, {53} • {CA} = {01} in Rijndael's field because
(x^6 + x^4 + x + 1)(x^7 + x^6 + x^3 + x)
= (x^13 + x^12 + x^9 + x^7) + (x^11 + x^10 + x^7 + x^5) + (x^8 + x^7 + x^4 + x^2) + (x^7 + x^6 + x^3 + x)
= x^13 + x^12 + x^9 + x^11 + x^10 + x^5 + x^8 + x^4 + x^2 + x^6 + x^3 + x
= x^13 + x^12 + x^11 + x^10 + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + x^2 + x
and
x^13 + x^12 + x^11 + x^10 + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + x^2 + x modulo x^8 + x^4 + x^3 + x^1 + 1
= (11111101111110 mod 100011011)
= {3F7E mod 11B} = {01}
= 1 (decimal)
My question is how to represent the reducing polynomial x^8 + x^4 + x^3 + x^1 + 1 and the polynomials x^6 + x^4 + x + 1 and x^7 + x^6 + x^3 + x in NTL. Then perform multiplication on these polynomials, and get the result {01}.
This is a good example for me to use this library.
Again, I don't know NTL, and I'm running Visual Studio 2015 on Windows 7. I've downloaded what I need, but have to build a library with all the supplied source files which will take a while to figure out. However, based on another answer, this should get you started. First, initialize the reducing polynomial for GF(256):
GF2X P; // apparently the length doesn't need to be set
SetCoeff(P, 0, 1);
SetCoeff(P, 1, 1);
SetCoeff(P, 3, 1);
SetCoeff(P, 4, 1);
SetCoeff(P, 8, 1);
GF2E::init(P);
Next, assign variables as polynomials:
GF2X A;
SetCoeff(A, 0, 1);
SetCoeff(A, 1, 1);
SetCoeff(A, 4, 1);
SetCoeff(A, 6, 1);
GF2X B;
SetCoeff(B, 1, 1);
SetCoeff(B, 3, 1);
SetCoeff(B, 6, 1);
SetCoeff(B, 7, 1);
GF2X C;
Looks like there is an override for multiply so this would work assuming that the multiply override is based on the GF(2^8) extension field GF2E::init(P).
C = A * B:
As commented after the question, NTL is more oriented to large fields. For GF(256) it would be faster to use bytes and lookup tables. For up to GF(2^64), xmm register intrinsics with carryless multiply (PCLMULQDQ) can be used to implement finite field math quickly without tables (some constants will be needed, the polynomial and it's multiplicative inverse). For fields greater than GF(2^64), extended precision math methods would be needed. For fields GF(p^n), where p != 2 and n > 1, unsigned integers can be used with lookup tables. Building the tables would involve some mapping between integers and GF(p) polynomial coefficients.

How to increase enough stack size in Pari/Gp for command to work

I am working with GP and minimum polynomials as follows running on ASUS x75:
(19:25) gp > elt=Mod(a*x^3+b*x^2+c*x+d,('x^5-1)/('x-1))
%122 = Mod(a*x^3 + b*x^2 + c*x + d, x^4 + x^3 + x^2 + x + 1)
(19:25) gp > (poly=minpoly(elt,x='x))
%123 = x^4 + (a + (b + (c - 4*d)))*x^3 + (a^2 + (-3*b + (2*c - 3*d))*a + (b^2 + (2*c - 3*d)*b + (c^2 - 3*d*c + 6*d^2)))*x^2 + (a^3 + (-2*b + (3*c - 2*d))*a^2 + (-2*b^2 + (c + 6*d)*b + (-2*c^2 - 4*d*c + 3*d^2))*a + (b^3 + (-2*c - 2*d)*b^2 + (3*c^2 - 4*d*c + 3*d^2)*b + (c^3 - 2*d*c^2 + 3*d^2*c - 4*d^3)))*x + (a^4 + (-b + (-c - d))*a^3 + (b^2 + (2*c + 2*d)*b + (c^2 - 3*d*c + d^2))*a^2 + (-b^3 + (-3*c + 2*d)*b^2 + (2*c^2 - d*c - 3*d^2)*b + (-c^3 + 2*d*c^2 + 2*d^2*c - d^3))*a + (b^4 + (-c - d)*b^3 + (c^2 + 2*d*c + d^2)*b^2 + (-c^3 - 3*d*c^2 + 2*d^2*c - d^3)*b + (c^4 - d*c^3 + d^2*c^2 - d^3*c + d^4)))
The first command came out successfully, while the second one below did finish successfully and gave an allocatemem() error. How is it possible to get the second command to work, without overheating the computer or program exhaustion? And the WHOLE output to command below this is needed. Thanks for the help.
(19:23) gp > elt=Mod(a*x^5+b*x^4+c*x^3+d*x^2+e*x+f,('x^7-1)/('x-1))
%120 = Mod(a*x^5 + b*x^4 + c*x^3 + d*x^2 + e*x + f, x^6 + x^5 + x^4 + x^3 + x^2 + x + 1)
(19:23) gp > (poly=minpoly(elt,x='x))
*** at top-level: poly=minpoly(elt,x='x)
*** ^-----------------
*** minpoly: the PARI stack overflows !
current stack size: 9000000 (8.583 Mbytes)
[hint] you can increase GP stack with allocatemem()
You can increase the PARI/GP's heap up to any limit you want at run-time following the example below (demonstrates how to set heap size to 120000000 bytes):
default(parisize, 120000000)
default(parisize, 10000000000) is more than 8 GB and in my case was
enough to make advanced calculations with matrices.

Catalan Numbers, Recursive function time complexity

The following function produces the nth number in catalan numbers. What is the exact time complexity function of this function or how can I find it myself?
int catalan(int n)
{
if (n==0 || n==1)
return 1;
int sum = 0;
for(int i=1;i<n;i++)
sum += catalan(i)*catalan(n-i);
return sum;
}
Note: I know this is the worst possible way to compute a catalan number.
To evaluate the complexity, let us focus on the number of recursive calls performed, let C(n).
A call for n implies exactly 2(n-1) recursive calls, each of them adding their own costs, 2(C(1)+C(2)+...C(n-1)).
A call for n+1 implies exactly 2n recursive calls, each of them adding their own costs, 2(C(1)+C(2)+...C(n-1)+C(n)).
By difference, C(n+1)-C(n) = 2+2C(n), which can be written C(n) = 2+3C(n-1).
C(1) = 0
C(2) = 2+2C(1) = 2+3C(0) = 2
C(3) = 4+2(C(1)+C(2)) = 2+3C(2) = 8
C(3) = 6+2(C(1)+C(2)+C(3)) = 2+3C(3) = 26
C(4) = 8+2(C(1)+C(2)+C(3)+C(4)) = 2+3C(4) = 80
...
C(n) = 2n-2+2(C(1)+C(2)+...C(n-1)) = 2+3C(n-1)
To solve this recurrence easily, notice that
C(n)+1 = 3(C(n-1)+1) = 9(C(n-2)+1) = ...3^(n-2)(C(2)+1) = 3^(n-1)
Hence, for n>1 the exact formula is
C(n) = 3^(n-1)-1
The number of calls to Catalan(1) (constant time), is also C(n), and the numbers of adds or multiplies are C(n)/2 each.
It is easy to reduce the complexity from O(3^n) to O(2^n) by noting that all terms in the loop (except the middle one) are computed twice - but that still doesn't make it an acceptable implementation :)
Assume
any step other than for-loop is k;
summation and multiple in for-loop is c and
catalan(r) is T(r)
In the for-loop of catalan(n), catalan(i) performs n-1 times where value of i from 1 to n-1 and catalan(n-i) performs n-1 times where value of n-i from n-1 to 1. In short, catalan(i) and catalan(n-i) equals to two times all catalan(x) where value of x from 1 to n-1.
T(n) = 2(T(1) + T(2) + T(3) + ... + T(n-2) + T(n-1)) + k + (n-1)c
Similarly,
T(n-1) = 2(T(1) + T(2) + T(3) + ... + T(n-2)) + k + (n-2)c
Reorder T(n) as 2(T(1) + T(2) + T(3) + ... + T(n-2)) + 2T(n-1) + k + (n-2)c + c
T(n) = 2(T(1) + T(2) + T(3) + ... + T(n-2)) + k + (n-2)c + 2T(n-1) + c
T(n) = T(n-1) + 2T(n-1) + c
T(n) = 3T(n-1) + c
T(n) = (3^2)T(n-2) + 3c + c
T(n) = (3^3)T(n-3) + (3^2)c + 3c + c
and so on...
T(n) = (3^(n-1))T(n-(n-1)) + c(3^0 + 3^1 + 3^2 + ... + 3^(n-2))
T(n) = (3^(n-1))T(1) + ((3^(n-1)-1)/2)c
So, the time complexity is O(3 ^ N)
My process is quite similar to #hk6279's one but I believe easier to understand because I start from the code itself. I start defining the recurrence relation as it is in code and then substitute it.
I also remove all the + k + c variables from #hk6279's approach because it adds noise to the equation and at the end all those variables will be ruled out.
Recurrence relation
T(n) => 1 -> n = 1
T(i) * T(n-i) -> n > 1; for i in 1..n-1
Visualize when n > 1
T(n) = [T(1) + T(2) + T(3) + .... + T(n-2) + T(n-1)] + [T(n-1) + T(n-2) + .... + T(3) + T(2) + T(1)]
T(n) = [T(1) + T(2) + T(3) + .... + T(n-2) + T(n-1)] + [T(1) + T(2) + T(3) + .... + T(n-2) + T(n-1)]
T(n) = 2 * [T(1) + T(2) + T(3) + .... + T(n-2) + T(n-1)]
What is T(n-1) ?
T(n-1) = 2 * [T(1) + T(2) + T(3) + .... + T(n-2)]
Replace with T(n-1)
T(n) = 2 * [T(1) + T(2) + T(3) + .... + T(n-2) + T(n-1)]
T(n) = 2 * [T(1) + T(2) + T(3) + .... + T(n-2)] + 2 * [T(n-1)]
T(n) = T(n-1) + 2 * [T(n-1)]
T(n) = 3 * T(n-1)
What is T(n-2) ?
T(n-2) = 2 * [T(1) + T(2) + T(3) + .... + T(n-3)]
Replace with T(n-2)
T(n) = 3 * [2 * [T(1) + T(2) + T(3) + .... + T(n-3) + T(n-2)]]
T(n) = 3 * [2 * [T(1) + T(2) + T(3) + .... + T(n-3)] + 2 * T(n-2)]]
T(n) = 3 * [T(n-2) + 2*T(n-2)]
T(n) = 3 * [3 * T(n-2)]
T(n) = 3^2 * T(n-2)
Replace with T(n-k)
T(n) = 3^k * T(n-k)
if n - k = 1 => k = n + 1
T(n) = 3^(n+1) * T(n-n+1)
T(n) = 3^(n+1) * T(1)
T(n) = 3^(n+1) * 1
Time complexiTy
O(3^n)

Complexity of an algorithm with two recursive calls

I have a strange algorithm than is being called recursively 2 times. It's
int alg(int n)
loop body = Θ(3n+1)
alg(n-1);
alg(n-2)
Somehow i need to find this algorithm's complexity. I've tried to find it with using characteristic polynomial of the above equation but the result system is too hard to solve so i was wondering if there was any other straight way..
Complexity: alg(n) = Θ(φ^n) where φ = Golden ratio = (1 + sqrt(5)) / 2
I can't formally prove it at first, but with a night's work, I find my missing part - The substitution method with subtracting a lower-order term. Sorry for my bad expression of provement (∵ poor English).
Let loop body = Θ(3n+1) ≦ tn
Assume (guess) that cφ^n ≦ alg(n) ≦ dφ^n - 2tn for an n (n ≧ 4)
Consider alg(n+1):
Θ(n) + alg(n) + alg(n-1) ≦ alg(n+1) ≦ Θ(n) + alg(n) + alg(n-1)
c * φ^n + c * φ^(n-1) ≦ alg(n+1) ≦ tn + dφ^n - 2tn + dφ^(n-1) - 2t(n-1)
c * φ^(n+1) ≦ alg(n+1) ≦ tn + d * φ^(n+1) - 4tn + 2
c * φ^(n+1) ≦ alg(n+1) ≦ d * φ^(n+1) - 3tn + 2
c * φ^(n+1) ≦ alg(n+1) ≦ d * φ^(n+1) - 2t(n+1) (∵ n ≧ 4)
So it is correct for n + 1. By mathematical induction, we can know that it's correct for all n.
So cφ^n ≦ alg(n) ≦ dφ^n - 2tn and then alg(n) = Θ(φ^n).
johnchen902 is correct:
alg(n)=Θ(φ^n) where φ = Golden ratio = (1 + sqrt(5)) / 2
but his argument is a bit too hand-waving, so let's make it strict. His original argument was incomplete, therefore I added mine, but now he has completed the argument.
loop body = Θ(3n+1)
Let us denote the cost of the loop body for the argument n with g(n). Then g(n) ∈ Θ(n) since Θ(n) = Θ(3n+1).
Further, let T(n) be the total cost of alg(n) for n >= 0. Then, for n >= 2 we have the recurrence
T(n) = T(n-1) + T(n-2) + g(n)
For n >= 3, we can insert the recurrence applied to T(n-1) into that,
T(n) = 2*T(n-2) + T(n-3) + g(n) + g(n-1)
and for n > 3, we can continue, applying the recurrence to T(n-2). For sufficiently large n, we therefore have
T(n) = 3*T(n-3) + 2*T(n-4) + g(n) + g(n-1) + 2*g(n-2)
= 5*T(n-4) + 3*T(n-5) + g(n) + g(n-1) + 2*g(n-2) + 3*g(n-3)
...
k-1
= F(k)*T(n+1-k) + F(k-1)*T(n-k) + ∑ F(j)*g(n+1-j)
j=1
n-1
= F(n)*T(1) + F(n-1)*T(0) + ∑ F(j)*g(n+1-j)
j=1
with the Fibonacci numbers F(n) [F(0) = 0, F(1) = F(2) = 1].
T(0) and T(1) are some constants, so the first part is obviously Θ(F(n)). It remains to investigate the sum.
Since g(n) ∈ Θ(n), we only need to investigate
n-1
A(n) = ∑ F(j)*(n+1-j)
j=1
Now,
n-1
A(n+1) - A(n) = ∑ F(j) + (((n+1)+1) - ((n+1)-1))*F((n+1)-1)
j=1
n-1
= ∑ F(j) + 2*F(n)
j=1
= F(n+1) - 1 + 2*F(n)
= F(n+2) + F(n) - 1
Summing that, starting with A(2) = 2 = F(5) + F(3) - 5, we obtain
A(n) = F(n+3) + F(n+1) - (n+3)
and therefore, with
c*n <= g(n) <= d*n
the estimate
F(n)*T(1) + F(n-1)*T(0) + c*A(n) <= T(n) <= F(n)*T(1) + F(n-1)*T(0) + d*A(n)
for n >= 2. Since F(n+1) <= A(n) < F(n+4), all terms depending on n in the left and right parts of the inequality are Θ(φ^n), q.e.d.
Assumptions:
1: n >= 0
2: Θ(3n+1) = 3n + 1
Complexity:
O(2 ^ n * (3n - 2));
Reasoning:
int alg(int n)
loop body = Θ(3n+1)// for every n you have O(3n+1)
alg(n-1);
alg(n-2)
Assuming the alg does not execute for n < 1, you have the following repetitions:
Step n:
3 * n + 1
alg(n - 1) => 3 * (n - 1) + 1
alg(n - 2) => 3 * (n - 2) + 1
Now you basically have a division. You have to imagine a number tree with N as parent and n-1 and n-2 as children.
n
n-1 n-2
n - 2 n - 3 n - 3 n - 4
n - 3 n - 4 n - 4 n - 5 n - 4 n - 5 n - 5 n - 6
n-4 n-5 | n-5 n-6 |n-5 n-6 |n-6 n-7 n-5 n-6 n-6 n-7 n-6 n-6| n-6 n-8
It's obvious that there is a repetition pattern here. For every pair (n - k, n - k - 1) in A = {k, with k from 0 to n) except the first two and the last two, (n - 1, n - 2) and (n-2, n-3) there is a 3k + 1 * (2 ^ (k - 1)) complexity.
I am looking at the number of repetitions of the pair (n - k, n - k - 1). So now for each k from 0 to n I have:
(3k + 1) * (2 ^ (k - 1)) iterations.
If you sum this up from 1 to n you should get the desired result. I will expand the expression:
(3k + 1) * (2 ^ (k - 1)) = 3k * 2 ^ (k - 1) + 2 ^ (k - 1)
Update
1 + 2 + 2^2 + 2^3 + ... + 2^n = 2 ^ (n + 1) - 1
In your case, this winds up being:
2^n - 1
Based on the summation formula and k = 0, n . Now the first one:
3k * 2 ^ (k - 1)
This is equal to 3 sum from k = 0, n of k * 2 ^ (k - 1).
That sum can be determined by switching to polinomial functions, integrating, contracting using the 1 + a ^ 2 + a ^ 3 + ... + a ^ n formula, and then differentiated again to obtain the result, which is (n - 1) * 2 ^ n + 1.
So you have:
2 ^ n - 1 + 3 * (n - 1) * 2 ^ n + 1
Which contracted is:
2 ^ n * (3n - 2);
The body of the function takes Θ(n) time.
The function is called twice recursively.
For the given function the complexity is,
T(n) = T(n-1) + T(n-2) + cn ----- 1
T(n-1) = T(n-2) + T(n-3) + c(n-1) ----- 2
1-2 -> T(n) = 2T(n-1) - T(n-3) + c ----- 3
3 --> T(n-1) = 2T(n-2) + T(n-4) + c ----- 4
3-4 -> T(n) = 3T(n-1) - 2T(n-2) - T(n-3) - T(n-4) ----- 5
Let g(n) = 3g(n-1)
There for, we can approximate T(n) = O(g(n))
g(n) is Θ(3n)
There for T(n) = O(3n)

How is make_heap in C++ implemented to have complexity of 3N?

I wonder what's the algorithm of make_heap in in C++ such that the complexity is 3*N? Only way I can think of to make a heap by inserting elements have complexity of O(N Log N). Thanks a lot!
You represent the heap as an array. The two elements below the i'th element are at positions 2i+1 and 2i+2. If the array has n elements then, starting from the end, take each element, and let it "fall" to the right place in the heap. This is O(n) to run.
Why? Well for n/2 of the elements there are no children. For n/4 there is a subtree of height 1. For n/8 there is a subtree of height 2. For n/16 a subtree of height 3. And so on. So we get the series n/22 + 2n/23 + 3n/24 + ... = (n/2)(1 * (1/2 + 1/4 + 1/8 + . ...) + (1/2) * (1/2 + 1/4 + 1/8 + . ...) + (1/4) * (1/2 + 1/4 + 1/8 + . ...) + ...) = (n/2) * (1 * 1 + (1/2) * 1 + (1/4) * 1 + ...) = (n/2) * 2 = n. Or, formatted maybe more readably to see the geometric series that are being summed:
n/2^2 + 2n/2^3 + 3n/2^4 + ...
= (n/2^2 + n/2^3 + n/2^4 + ...)
+ (n/2^3 + n/2^4 + ...)
+ (n/2^4 + ...)
+ ...
= n/2^2 (1 + 1/2 + 1/2^4 + ...)
+ n/2^3 (1 + 1/2 + 1/2^3 + ...)
+ n/2^4 (1 + 1/2 + 1/2^3 + ...)
+ ...
= n/2^2 * 2
+ n/2^3 * 2
+ n/2^4 * 2
+ ...
= n/2 + n/2^2 + n/2^3 + ...
= n(1/2 + 1/4 + 1/8 + ...)
= n
And the trick we used repeatedly is that we can sum the geometric series with
1 + 1/2 + 1/4 + 1/8 + ...
= (1 + 1/2 + 1/4 + 1/8 + ...) (1 - 1/2)/(1 - 1/2)
= (1 * (1 - 1/2)
+ 1/2 * (1 - 1/2)
+ 1/4 * (1 - 1/2)
+ 1/8 * (1 - 1/2)
+ ...) / (1 - 1/2)
= (1 - 1/2
+ 1/2 - 1/4
+ 1/4 - 1/8
+ 1/8 - 1/16
+ ...) / (1 - 1/2)
= 1 / (1 - 1/2)
= 1 / (1/2)
= 2
So the total number of "see if I need to fall one more, and if so which way do I fall? comparisons comes to n. But you get round-off from discretization, so you always come out to less than n sets of swaps to figure out. Each of which requires at most 3 comparisons. (Compare root to each child to see if it needs to fall, then the children to each other if the root was larger than both children.)