How to invoke a lemma in Dafny in a bubble sort example? - bubble-sort

How can I invoke a lemma as the justification for an equality? Consider the following example in Dafny, where I've written a lemma saying that the sum from 0 to n is n choose 2. Dafny has no problem verifying the lemma itself. I want to use it to show that the number of swaps in bubble sort is bounded above by the sum from 0 to n, which equals n choose 2. However, I currently get an error when I try to assert that the equality holds by the lemma. Why is this happening, and how can I establish an equality from a lemma?
method BubbleSort(a: array?<int>) returns (n: nat)
modifies a
requires a != null
ensures n <= (a.Length * (a.Length - 1))/2
{
var i := a.Length - 1;
n := 0;
while (i > 0)
invariant 0 <= i < a.Length
{
var j := 0;
while (j < i)
invariant j <= i
invariant n <= SumRange(i, a.Length)
{
if(a[j] > a[j+1])
{
a[j], a[j+1] := a[j+1], a[j];
n := n + 1;
}
j := j + 1;
}
i := i -1;
}
assert n <= SumRange(i, a.Length) == (a.Length * (a.Length - 1))/2 by {SumRangeNChoose2(a.Length)};
assert n <= (a.Length * (a.Length - 1))/2
}
function SumRange(lo: int, hi: int): int
decreases hi - lo
{
if lo == hi then hi
else if lo >= hi then 0
else SumRange(lo, hi - 1) + hi
}
lemma SumRangeNChoose2(n: nat)
ensures SumRange(0, n) == (n * (n - 1))/2
{}

You just have a syntax error: there should be no semicolon after the } on the second-to-last line of BubbleSort, and there should be a semicolon after the assertion on the following line.
After you fix the syntax errors, there are several deeper errors in the code about missing invariants, etc. But they can all be fixed using the annotations from my answer to your other question.
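For reference, with those fixes the last two lines of the method body read as follows (note that the lemma call inside the by block is a statement, so it needs its own semicolon):
assert n <= SumRange(i, a.Length) == (a.Length * (a.Length - 1))/2 by { SumRangeNChoose2(a.Length); }
assert n <= (a.Length * (a.Length - 1))/2;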

Related

Program written in Dafny, implementing the Merge Sorted Arrays in-Place algorithm

Here is the program that was given to me in Dafny:
method Main() {
var a, b := new int[3] [3,5,8], new int[2] [4,7];
print "Before merging the following two sorted arrays:\n";
print a[..];
print "\n";
print b[..];
ghost var AB := multiset(a[..]+b[..]);
assert Sorted(a[..]) && Sorted(b[..]);
MergeSortedArraysInPlace(a, b, AB);
assert multiset(a[..]+b[..]) == AB;
assert Sorted(a[..]+b[..]);
print "\nAfter merging:\n";
print a[..]; // [3,4,5]
print "\n";
print b[..]; // [7,8]
}
predicate Sorted(q: seq<int>)
{
forall i,j :: 0 <= i <= j < |q| ==> q[i] <= q[j]
}
method MergeSortedArraysInPlace(a: array<int>, b: array<int>, ghost AB: multiset<int>)
requires Sorted(a[..]) && Sorted(b[..])
requires multiset(a[..]+b[..]) == AB
requires a != b
ensures Sorted(a[..]+b[..])
ensures multiset(a[..]+b[..]) == AB
modifies a, b
Now I need to implement the MergeSortedArraysInPlace method iteratively, correctly, efficiently, and clearly.
Restriction: merge the arrays in place, using constant additional space only.
So the implementation that I wrote is the following:
method MergeSortedArraysInPlace(a: array<int>, b: array<int>, ghost AB: multiset<int>)
requires Sorted(a[..]) && Sorted(b[..])
requires multiset(a[..]+b[..]) == AB
requires a != b
ensures Sorted(a[..]+b[..])
ensures multiset(a[..]+b[..]) == AB
modifies a, b
{
var i := 0;
var j := 0;
while (i < a.Length && j < b.Length)
decreases a.Length - i, if i < a.Length then b.Length - j else 0 - 1
{
// if a[i] <= b[j], these two elements are already in order
if (a[i] <= b[j]) {
i := i + 1;
}
// if a[i] > b[j], first swap the two elements so that a[i]
// gets the smaller value and a[] stays sorted; the swapped-in
// b[j] may now be out of place in b[], so re-sort b[] by
// shifting smaller neighbors one position to the left until
// the displaced element reaches its place
else if (a[i] > b[j]) {
var t := a[i];
a[i] := b[j];
b[j] := t;
i := i +1;
if (j < b.Length - 1 && b[j + 1] < b[j]) {
var temp := b[j];
var tempj := j + 1;
while (tempj < b.Length && b[tempj] < temp)
decreases b.Length - tempj, if tempj < b.Length then temp - b[tempj] else 0 - 1
invariant 0 <= tempj < b.Length
{
b[tempj - 1] := b[tempj];
tempj := tempj+1;
if(tempj == b.Length){
break;
}
}
b[tempj - 1] := temp;
}
}
}
}
But for some reason I still get "A postcondition might not hold on this return path." on the following postconditions:
ensures Sorted(a[..]+b[..])
ensures multiset(a[..]+b[..]) == AB
I don't know what the problem could be; I would appreciate your help :)
If you need to sort each array in place, you can use an algorithm like Insertion Sort, which can sort an array in place with constant additional space.
Here is an implementation of Insertion Sort that sorts the given arrays a and b in place:
method SortInPlace(a: array<int>)
modifies a
{
var i := 1;
while (i < a.Length)
invariant 1 <= i // claiming i <= a.Length here would fail on entry when a.Length == 0
decreases a.Length - i
{
var j := i;
while (j > 0 && a[j - 1] > a[j])
invariant 0 <= j <= i // j starts at i, so the upper bound must be inclusive
decreases j
{
var temp := a[j];
a[j] := a[j - 1];
a[j - 1] := temp;
j := j - 1;
}
i := i + 1;
}
}
method Main() {
var a := new int[3] [3,5,8];
var b := new int[2] [4,7];
print "Before sorting the following two arrays:\n";
print a[..];
print "\n";
print b[..];
SortInPlace(a);
SortInPlace(b);
print "\nAfter sorting in place:\n";
print a[..];
print "\n";
print b[..];
}
In this implementation, the SortInPlace method takes an array a as input and sorts it in place using Insertion Sort. The Main method creates two arrays a and b, prints their contents, sorts them in place using SortInPlace, and then prints their contents again to verify that they are sorted.
Note that if you need to sort multiple arrays, you can simply call the SortInPlace method on each array.
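Note also that SortInPlace above has no ensures clause, so Dafny doesn't actually check that the array ends up sorted. If you want that verified, a minimal sketch along these lines should work, reusing the Sorted predicate from the question (the multiset-preservation postcondition would need further invariants and is omitted here):
method SortInPlaceChecked(a: array<int>)
  modifies a
  ensures Sorted(a[..])
{
  var i := 0;
  while i < a.Length
    invariant 0 <= i <= a.Length
    invariant Sorted(a[..i]) // the prefix a[..i] stays sorted
  {
    var j := i;
    while j > 0 && a[j - 1] > a[j]
      invariant 0 <= j <= i
      // every pair in a[..i+1] is ordered, except pairs whose larger index is j
      invariant forall p, q :: 0 <= p < q <= i && q != j ==> a[p] <= a[q]
    {
      a[j - 1], a[j] := a[j], a[j - 1]; // move a[j] down toward its place
      j := j - 1;
    }
    i := i + 1;
  }
}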

How to fix Hamming weight invariants

I am learning Dafny, attempting to write a specification for the Hamming weight problem, a.k.a. counting the 1 bits in a number. I believe I have gotten the specification correct, but it still doesn't verify. For speed of verification I limited it to 8-bit numbers.
problem definition: https://leetcode.com/problems/number-of-1-bits/
function method twoPow(x: bv16): bv16
requires 0 <= x <= 16
{
1 << x
}
function method oneMask(n: bv16): bv16
requires 0 <= n <= 16
ensures oneMask(n) == twoPow(n)-1
{
twoPow(n)-1
}
function countOneBits(n:bv8): bv8 {
if n == 0 then 0 else (n & 1) + countOneBits(n >> 1)
}
method hammingWeight(n: bv8) returns (count: bv8 )
ensures count == countOneBits(n)
{
count := 0;
var i := 0;
var n' := n;
assert oneMask(8) as bv8 == 255; //passes
while i < 8
invariant 0 <= i <= 8
invariant n' == n >> i
invariant count == countOneBits(n & oneMask(i) as bv8);
{
count := count + n' & 1;
n' := n' >> 1;
i := i + 1;
}
}
I have written the same code in JavaScript to test the behavior and examine the invariant values before and after the loop. I don't see any problems.
function twoPow(x) {
return 1 << x;
}
function oneMask(n) {
return twoPow(n)-1;
}
function countOneBits(n) {
return n === 0 ? 0 : (n & 1) + countOneBits(n >> 1)
}
function hammingWeight(n) {
if(n < 0 || n > 256) throw new Error("out of range")
console.log(`n: ${n} also ${n.toString(2)}`)
let count = 0;
let i = 0;
let nprime = n;
console.log("beforeloop",`i: ${i}`, `n' = ${nprime}`, `count: ${count}`, `oneMask: ${oneMask(i)}`, `cb: ${countOneBits(n & oneMask(i))}`)
console.log("invariants", i >= 0 && i <= 8, nprime == n >> i, count == countOneBits(n & oneMask(i)));
while (i < 8) {
console.log("");
console.log('before',`i: ${i}`, `n' = ${nprime}`, `count: ${count}`, `oneMask: ${oneMask(i)}`, `cb: ${countOneBits(n & oneMask(i))}`)
console.log("invariants", i >= 0 && i <= 8, nprime == n >> i, count == countOneBits(n & oneMask(i)));
count += nprime & 1;
nprime = nprime >> 1;
i++;
console.log('Afterloop',`i: ${i}`, `n' = ${nprime}`, `count: ${count}`, `oneMask: ${oneMask(i)}`, `cb: ${countOneBits(n & oneMask(i))}`)
console.log("invariants", i >= 0 && i <= 8, nprime == n >> i, count == countOneBits(n & oneMask(i)));
}
return count;
};
hammingWeight(128);
All invariants evaluate as true. I must be missing something. It says invariant count == countOneBits(n & oneMask(i) as bv8); might not be maintained by the loop. Running the JavaScript shows that they are all true. Is it due to the cast of oneMask to bv8?
edit:
I replaced the mask function with one that doesn't require casting, but that still did not resolve the problem.
function method oneMaskOr(n: bv8): bv8
requires 0 <= n <= 8
ensures oneMaskOr(n) as bv16 == oneMask(n as bv16)
{
if n == 0 then 0 else (1 << (n-1)) | oneMaskOr(n-1)
}
One interesting thing I found is that it shows me a counterexample where it has reached the end of the loop and the final bit of the input variable n is set, i.e. values 128 or greater. But when I add an assertion above the loop that the value equals the count at the end of the loop, it then shows me another value of n.
assert 1 == countOneBits(128 & oneMaskOr(8)); // counterexample -> 192
assert 2 == countOneBits(192 & oneMaskOr(8)); // counterexample -> 160
So it seems like it isn't evaluating the loop invariant after the end of the loop? I thought the whole point of invariants was that they hold after the loop ends.
Edit 2:
I figured it out: apparently adding an explicit decreases clause to the while loop fixed it. I don't get it, though; I thought Dafny could figure this out.
while i < 8
invariant 0 <= i <= 8
invariant n' == n >> i
invariant count == countOneBits(n & oneMask(i) as bv8);
decreases 8 - i
{
I see one line in the docs on loop termination saying:
If the decreases clause of a loop specifies *, then no termination check will be performed. Use of this feature is sound only with respect to partial correctness.
So if the decreases clause is missing, does it default to *?
After playing around, I did find a version which passes, though it required reworking countOneBits() so that its recursion follows the order of iteration:
function countOneBits(n: bv8, i: int, j: int): bv8
  requires 0 <= i <= j <= 8
  decreases 8 - i
{
  if i == j then 0
  else (n & 1) + countOneBits(n >> 1, i + 1, j)
}
method hammingWeight(n: bv8) returns (count: bv8)
  ensures count == countOneBits(n, 0, 8)
{
  count := 0;
  var i := 0;
  var n' := n;
  assert count == countOneBits(n, 0, i);
  while i < 8
    invariant 0 <= i <= 8
    invariant n' == n >> i
    invariant count == countOneBits(n, 0, i)
  {
    count := (n' & 1) + count;
    n' := n' >> 1;
    i := i + 1;
  }
}
The intuition here is that countOneBits(n,i,j) returns the number of 1 bits between i (inclusive) and j (exclusive). This then reflects what the loop is doing as we increase i.
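For completeness, a quick smoke test for this version (assuming both declarations above are in the same file):
method Main() {
  var c := hammingWeight(128);
  print c, "\n"; // expect 1, since 128 is 1000_0000 in binary
}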

How do I reduce the repeated use of the % operator for faster execution in C?

This is the code:
for (i = 1; i<=1000000 ; i++ ) {
for ( j = 1; j<= 1000000; j++ ) {
for ( k = 1; k<= 1000000; k++ ) {
if (i % j == k && j%k == 0)
count++;
}
}
}
Or is it better to avoid any % operation that executes up to a million times in a program?
edit - I am sorry; it was initialized by 0, but let's say i = 1.
Now, if I reduce the third loop as in #darshan's answer, the first and second loops can still each run up to N times, and % is still computed about n*n times (e.g. 2021 mod 2022, then 2021 mod 2023, and so on).
So my question is: since % is at least twice as heavy as + or -, is there any other logic that can be applied here as an alternative, one that gives the same answer as this logic does?
Thank you so much for the knowledgeable comments and help.
The question is:
A triple of integers (A, B, C) is considered special if it satisfies the following properties for a given integer N:
A mod B = C
B mod C = 0
1 ≤ A, B, C ≤ N
I'm curious whether there is a smarter solution that greatly reduces the time complexity.
A much more efficient version is the one below, though I think it can be optimized much further. First of all, the modulo (%) operator is quite expensive, so try to avoid it at large scale:
for(i = 0; i<=1000000 ; i++ )
for( j = 0; j<= 1000000; j++ )
{
a = i%j;
for( k = 0; k <= j; k++ )
if (a == k && j%k == 0)
count++;
}
We moved a = i%j into the second loop because it is independent of k, so there is no need to recalculate it every time k changes. Also, for the condition j%k == 0 to be true, k must be <= j, hence the tightened loop bound.
First of all, your code has undefined behavior due to division by zero: when k is zero then j%k is undefined, so I assume that all your loops should start with 1 and not 0.
Usually the % and the / operators are much slower to execute than any other operation. It is possible to get rid of most invocations of the % operators in your code by several simple steps.
First, look at the if line:
if (i % j == k && j%k == 0)
The i % j == k part places a very strict constraint on k, which plays into your hands. It means it is pointless to iterate over k at all, since only one value of k can pass this condition.
for (i = 1; i<=1000000 ; i++ ) {
for ( j = 1; j<= 1000000; j++ ) {
k = i % j;
// Constrain k to the range of the original loop.
if (k <= 1000000 && k > 0 && j%k == 0)
count++;
}
}
To get rid of "i % j" switch the loop. This change is possible since this code is affected only by which combinations of i,j are tested, not in the order in which they are introduced.
for ( j = 1; j<= 1000000; j++ ) {
for (i = 1; i<=1000000 ; i++ ) {
k = i % j;
// Constrain k to the range of the original loop.
if (k <= 1000000 && k > 0 && j%k == 0)
count++;
}
}
Here it is easy to observe how k behaves, and to use that to iterate over k directly, without iterating over i, thereby getting rid of i%j: k runs from 1 to j-1 and then wraps around, again and again. So all we have to do is iterate over k directly inside the loop over i. Note that i%j for j == 1 is always 0, and since k == 0 never passes the if condition, we can safely start with j = 2, skipping 1:
for ( j = 2; j<= 1000000; j++ ) {
for (i = 1, k=1; i<=1000000 ; i++, k++ ) {
if (k == j)
k = 0;
// Constrain k to the range of the original loop.
if (k <= 1000000 && k > 0 && j%k == 0)
count++;
}
}
It is still wasteful to compute j%k repeatedly for the same values of (j, k) (remember that k repeats several times in the inner loop). For example, for j=3 the values of i and k go {1,1}, {2,2}, {3,0}, {4,1}, {5,2}, {6,0}, ..., {n*3, 0}, {n*3+1, 1}, {n*3+2, 2}, ... (for any value of n in the range 0 < n <= (1000000-2)/3).
The values beyond n = floor((1000000-2)/3) == 333332 are tricky - let's have a look. For this value of n, i = 333332*3 = 999996 and k = 0, so the last full iteration of {i,k}: {n*3, 0}, {n*3+1, 1}, {n*3+2, 2} becomes {999996, 0}, {999997, 1}, {999998, 2}. You don't really need to iterate over all these values of n, since each of them does exactly the same thing. All you have to do is run one cycle once and multiply by the number of valid n values (which is 333332 + 1 = 333333 in this case - adding 1 to include n = 0).
Since that did not cover all elements, you need to continue with the remaining values: {999999, 0}, {1000000, 1}. Notice that unlike the other iterations there is no third value, since it would put i out of range.
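Summarizing in a formula (my notation, not from the original answer): writing d'(j) for the number of proper divisors of j, which is exactly what innerCount computes in the code below, each j contributes

\[
\text{count}(j) \;=\; d'(j)\cdot\Big\lfloor \tfrac{10^6}{j} \Big\rfloor \;+\; \text{tail}(j),
\]

where tail(j) counts the divisors hit by the final partial cycle.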
for (int j = 2; j<= 1000000; j++ ) {
if (j % 1000 == 0) std::cout << std::setprecision(2) << (double)j*100/1000000 << "% \r" << std::flush;
int innerCount = 0;
for (int k=1; k<j ; k++ ) {
if (j%k == 0)
innerCount++;
}
int innerLoopRepeats = 1000000/j;
count += innerCount * innerLoopRepeats;
// complete the remainder:
for (int k=1, i= j * innerLoopRepeats+1; i <= 1000000 ; k++, i++ ) {
if (j%k == 0)
count++;
}
}
This is still extremely slow, but at least it completes in less than a day.
It is possible to speed this up further by using an important property of divisibility.
Consider the first inner loop (it's almost the same for the second inner loop),
and notice that it does a lot of redundant work, and does it expensively.
Namely, if j%k == 0, then k divides j and there is a pairK such that pairK*k == j.
It is trivial to calculate the pair of k: pairK = j/k.
Obviously, for every k > sqrt(j) there is a pairK < sqrt(j). This implies that every k > sqrt(j) can be found simply
by scanning all k < sqrt(j). This feature lets you loop over only a square root of all interesting values of k.
Searching only sqrt(j) values gives a huge performance boost, and the whole program can finish in seconds.
Here is a view of the second inner loop:
// complete the remainder:
for (int k=1, i= j * innerLoopRepeats+1; i <= 1000000 && k*k <= j; k++, i++ ) {
if (j%k == 0)
{
count++;
int pairI = j * innerLoopRepeats + j / k;
if (pairI != i && pairI <= 1000000) {
count++;
}
}
}
The first inner loop has to undergo a similar transformation.
Just reorder the indexation and compute A from the constraints:
void findAllSpecial(int N, void (*f)(int A, int B, int C))
{
    // 1 <= A,B,C <= N
    for (int C = 1; C <= N; ++C) {
        // B mod C = 0; B must also exceed C, since a remainder of C requires C < B
        for (int B = 2 * C; B <= N; B += C) {
            // A mod B = C
            for (int A = C; A <= N; A += B) {
                f(A, B, C);
            }
        }
    }
}
No divisions and no wasted iterations - just for loops and additions.
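As a rough cost estimate (my sketch, not from the answer above): for fixed C, B runs over the multiples 2C, 3C, ..., i.e. B = mC, and the innermost loop executes about N/B times, so the total work is on the order of

\[
\sum_{C=1}^{N} \sum_{m=2}^{\lfloor N/C \rfloor} \frac{N}{mC}
\;\le\; N \sum_{C=1}^{N} \frac{\ln N}{C}
\;=\; O(N \log^2 N),
\]

a blink of an eye for N = 10^6, versus the 10^18 iterations of the naive triple loop.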
Below is the obvious optimization: the third loop over k is really not needed, as there is already a many-to-one mapping from (i, j) to k.
What I understand from the code is that you want to calculate the number of (i, j) pairs such that i%j is a factor of j. Is that correct, or am I missing something?

Is a recursive function in a ternary operator evaluated twice?

I came across an article that has the following statement:
maxSubArray(A, i) = maxSubArray(A, i - 1) > 0 ? maxSubArray(A, i - 1) : 0 + A[i];
My question is, would maxSubArray(A, i - 1) be evaluated (called) twice (if its value is greater than 0)? Does it increase the time complexity of the code? I think so, since we would end up calling the recursive function twice (if its value is greater than 0).
Edit: Here's the code:
public int maxSubArray(int[] A) {
int n = A.length;
int[] dp = new int[n];//dp[i] means the maximum subarray ending with A[i];
dp[0] = A[0];
int max = dp[0];
for(int i = 1; i < n; i++){
dp[i] = A[i] + (dp[i - 1] > 0 ? dp[i - 1] : 0);
max = Math.max(max, dp[i]);
}
return max;
}
and here is the related link. The above code is not directly related, since the one in my original question takes a top-down DP approach, while the one added later takes a bottom-up DP approach.
This:
maxSubArray(A, i) = maxSubArray(A, i - 1) > 0 ? maxSubArray(A, i - 1) : 0 + A[i];
is just pseudo-code notation for the relation between maxSubArray(A, i - 1) and maxSubArray(A, i). It tells us how to compute the result for i once we know the result for i-1. Read it like mathematical notation, much as
y = 5 * x
describes
int foo(int x) { return x*5; }
In the actual implementation the above recurrence relation is realized via:
dp[i] = A[i] + (dp[i - 1] > 0 ? dp[i - 1] : 0);
Here dp[i - 1] is merely accessing an element of an array. Accessing the same array element twice has no impact on complexity. Given that dp[i-1] is not modified in that line, a compiler might optimize it to access dp[i-1] only once.
In the presence of recursion, unnecessarily calling a function twice can have an impact on complexity. Consider this example (stolen from Jarod42):
// one recursive call per level: linear time
int f(int n) {
if (n == 0) return 1;
return 2 * f(n - 1);
}
// two recursive calls per level: exponential time
int f(int n) {
if (n == 0) return 1;
return f(n - 1) + f(n - 1);
}
Both yield the same result, but the first has linear complexity while the second is exponential.
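The corresponding recurrences make the difference explicit:

\[
T_1(n) = T_1(n-1) + O(1) \;\Rightarrow\; T_1(n) = O(n), \qquad
T_2(n) = 2\,T_2(n-1) + O(1) \;\Rightarrow\; T_2(n) = O(2^n).
\]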

Find the running time in Big O notation

1) for (i = 1; i < n; i++) { > n
2) SmallPos = i; > n-1
3) Smallest = Array[SmallPos]; > n-1
4) for (j = i+1; j <= n; j++) > n*(n+1 -i-1)??
5) if (Array[j] < Smallest) { > n*(n+1 -i-1 +1) ??
6) SmallPos = j; > n*(n+1 -i-1 +1) ??
7) Smallest = Array[SmallPos] > n*(n+1 -i-1 +1) ??
}
8) Array[SmallPos] = Array[i]; > n-1
9) Array[i] = Smallest; > n-1
}
I know the Big O notation is n^2 (my bad, it's not n^3).
I am not sure about lines 4-7; would anyone care to help out?
I'm not sure how to work out the count for the second loop, since j = i+1, so as i changes, so does j.
Also, for line 4 the answer is supposed to be n(n+1)/2 - 1, and I want to know why, as I can never get that.
I am not really solving for the Big O; I am trying to do the step counting that leads to Big O, since constants and variables are excluded in Big O notation.
I would say this is O(n^2) (although as Fred points out above, O(n^2) is a subset of O(n^3), so it's not wrong to say that it's O(n^3)).
Note that it's generally not necessary to compute the number of executions of every single line; as Big-O notation discards low-order terms, it's sufficient to focus only on the most-executed section (which will typically be inside the innermost loop).
So in your case, none of the loops are affected by the values in Array, so we can safely ignore all that. The innermost loop runs (n-1) + (n-2) + (n-3) + ... times; this is an arithmetic series, and so has a term in n^2.
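Concretely:

\[
\sum_{i=1}^{n-1} i \;=\; \frac{n(n-1)}{2} \;=\; \Theta(n^2).
\]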
Is this an algorithm given to you, or one you wrote?
I think your loop indexes are wrong.
for (i = 1; i < n; i++) {
should be either
for (i = 0; i < n; i++) {
or
for (i = 1; i <= n; i++) {
depending on whether your array indexes start at 0 or 1 (it's 0 in C and Java).
Assuming we correct it to:
for (i = 0; i < n; i++) {
SmallPos = i;
Smallest = Array[SmallPos];
for (j = i+1; j < n; j++)
if (Array[j] < Smallest) {
SmallPos = j;
Smallest = Array[SmallPos];
}
Array[SmallPos] = Array[i];
Array[i] = Smallest;
}
Then I think the complexity is n^2 - (3/2)n = O(n^2).
Here's how...
The most costly operation in the innermost loop (my lecturer called this the "basic operation") is key comparison at line 5. It is done once per loop.
So now, you create a summation:
Sum(i=0 to n-1) of Sum(j=i+1 to n-1) of 1.
Now expand the innermost (rightmost) Sum to get:
Sum(i=0 to n-1) of (n-1)-(i+1)+1
and then:
Sum(i=0 to n-1) of n-i-1
and then:
[Sum(i=0 to n-1) of n] - [Sum(i=0 to n-1) of i] - [Sum (i=0 to n-1) of 1]
and then:
n[Sum(i=0 to n-1) of 1] - [(n-1)(n)/2] - [(n-1)-0+1]
and then:
n[(n-1)-0+1] - [(n^2-n)/2] - [n]
and then:
n^2 - [(n^2/2) - n/2] - n
equals:
(1/2)n^2 - (1/2)n
is in:
O(n^2)
If you're asking why it's not O(n^3)...
Consider the worst case. if (Array[j] < Smallest) will be true the most times if Array is reverse sorted.
Then you have an inner loop that looks like this:
Array[j] < Smallest;
SmallPos = j;
Smallest = Array[SmallPos];
Now we've got a constant three operations for every inner for (j...) loop.
And O(3) = O(1).
So really, it's i and j that determine how much work we do. Nothing in the inner if block changes that.
As a rule of thumb, only the while and for loops need counting.
As to why for (j = i+1; j <= n; j++) gives n(n+1)/2: it's called an arithmetic series.
You're doing n-1 passes of the for (j...) loop when i==0, n-2 passes when i==1, n-3, etc, until 0.
So the summation is
n-1 + n-2 + n-3 + ... 3 + 2 + 1
now, you sum pairs from outside in, re-writing it as:
n-1+1 + n-2+2 + n-3+3 + ...
equals:
n + n + n + ...
and there are (n-1)/2 of these pairs, so you have:
n*(n-1)/2
Two for() loops: the outer loop runs from 1 to n, and the inner loop runs from i+1 to n. This makes it O(n^2).
If you 'draw this out', it'll be triangular, rather than rectangular, so O(n^2), while true, is hiding the fact that the constant factor term is smaller than if the inner loop also iterated from 1 to n.
It is O(n^2).
For each of the n iterations of the outer loop you have up to n iterations in the inner loop.