How is 'decreases' in JML defined? - jml

The statement after decreases has to get strictly smaller in each loop and always be non-zero. But does it have to reach 0? Does it have to get smaller by one?

As stated in the JML documentation, decreases (you can also write decreasing) means that an int or long expression with that specifier "must be no less than 0 when the loop is executing, and must decrease by at least one (1) each time around the loop."
So it may or may not reach 0, but can't get smaller than that. Also, it has to get smaller by at least, but not necessarily exactly one. Note the example in the documentation for a more precise explanation.

Related

When to use machine epsilon and when not to?

I'm reading a book about rendering 3d graphics and the author sometimes uses epsilon and sometimes doesn't.
Notice the if at the beginning using epsilon and the other ifs that don't.
What's the logic behind this? I can see he avoids any chance for division by zero but when not using epsilon in the function there's still a chance it will return a value that will make the outer code to divide by zero.
Book is Real-Time Rendering 3rd Edition, by the way.
The first statement, if(|f| > ϵ) is just checking to make sure f is significantly different from 0. It's important to do that in that specific spot in the code because the next two statements divide by f.
The other statements don't need to do that, so they don't need to use ϵ.
For example,
if(t1 > t2) swap(t1, t2);
is a self-contained statement that compares two numbers to each other and swaps them if the wrong one is greater. Since it's not comparing to see if a value is close to 0, there's no need to use ϵ.
If the value that is returned from this block of code can make the calling code divide by zero, that should be handled in the calling code.

Since I'm not supposed to have magic numbers in my code, should I #define ZERO 0

I want to optimize readability of my code. I don't want people reading my code to be confused when they see 0. What 0 means could be extremely ambiguous. For instance, if I had a statement like if (myVector.size() > 0), that might be confusing because, after all, what is zero supposed to mean in this context? I'm wondering if I should put
#define ZERO 0
at the top of my code.
You miss the point here. Magic is not about using a number - it's about being able to understand why this particular number is used. When you compare size() with 0, your intent is pretty obvious. Have you compared it with any other number - like 42, for example - you'd have to explain why this particular number has been chosen.
As for this particular change (0 => ZERO), it'll just introduce tautology in your code.
0 used in comparisons or initialization of counters, indexes, accumulators, arithmetic expressions... is perfectly understandable as such.
It makes sense to create an identifier in situations where the exact value is irrelevant and could be changed without doing any harm, and having a interpretation specific to the context, like FREE_CLUSTER or NO_ERROR.
No, don't do that. It's a bad choice and utterly unnecessary. In this specific case, your code is pretty obvious:
if (myVector.size() > 0)
No one can get confused with that. But if it was something like:
if (myVector.size() > 23*66)
Then that would be confusing, because someone reading your code would be left wondering what the hell 23*66 means. This is a magic number. In a context like this, a #define could help, accompanied by a comment, for example:
/* We have at most 23 files with 66 records each */
#define MAX_RECORDS_NO 23*66
And then you'd have:
if (myVector.size() > MAX_RECORDS_NO)
That is the kind of situation where you can and surely want to avoid magic numbers. Note that the #define is a description of what that number means, if I were to follow your logic, I would have chosen this instead:
#define TWENTY_THREE_TIMES_SIXTY_SIX 23*66
Now, that's not very helpful, is it? #define ZERO 0 totally misses the point: it adds nothing to someone reading your code, it is excessively verbose, and it's useless because it is not very likely that the value of 0 will change. What if all of a sudden you wanted to compare myVector.size() to another number? You'd have to replace the constant name because defining ZERO to something else other than 0 is asking for trouble. So, after all, you gained nothing.
I'm wondering if I should put
#define ZERO 0
at the top of my code.
As long you're asking just for the 0 magic number: NO, Don't do this!
You should simply use
if (myVector.size()) // ...
which reproduces behavior of your statement equivalently.
It is guaranteed in c and c++ language, that 0 evaluates to false and any other value evaluates to true for evaluation of conditional expressions.
You don't need to do so.
The original code is already straightforward, as you can see.
Also, just think about it. If you define zero as 0, are you going to change zero to some other value other than 0? The answer is no, of course.

Initialize a variable

Is it better to declare and initialize the variable or just declare it?
What's the best and the most efficient way?
For example, I have this code:
#include <stdio.h>
int main()
{
int number = 0;
printf("Enter with a number: ");
scanf("%d", &number);
if(number < 0)
number= -number;
printf("The modulo is: %d\n", number);
return 0;
}
If I don't initialize number, the code works fine, but I want to know, is it faster, better, more efficient? Is it good to initialize the variable?
scanf can fail, in which case nothing is written to number. So if you want your code to be correct you need to initialize it (or check the return value of scanf).
The speed of incorrect code is usually irrelevant, but for you example code if there is a difference in speed at all then I doubt you would ever be able to measure it. Setting an int to 0 is much faster than I/O.
Don't attribute speed to language; That attribute belongs to implementations of language. There are fast implementations and slow implementations. There are optimisations assosciated with fast implementations; A compiler that produces well-optimised machine code would optimise the initialisation away if it can deduce that it doesn't need the initialisation.
In this case, it actually does need the initialisation. Consider if scanf were to fail. When scanf fails, it's return value reflects this failure. It'll either return:
A value less than zero if there was a read error or EOF (which can be triggered in an implementation-defined way, typically CTRL+Z on Windows and CTRL+d on Linux),
A number less than the number of objects provided to scanf (since you've provided only one object, this failure return value would be 0) when a conversion failure occurs (for example, entering 'a' on stdin when you've told scanf to convert sequences of '0'..'9' into an integer),
The number of objects scanf managed to assign to. This is 1, in your case.
Since you aren't checking for any of these return values (particular #3), your compiler can't deduce that the initialisation is necessary and hence, can't optimise it away. When the variable is uninitialised, failure to check these return values results in undefined behaviour. A chicken might appear to be living, even when it is missing its head. It would be best to check the return value of scanf. That way, when your variable is uninitialised you can avoid using an uninitialised value, and when it isn't your compiler can optimise away the initialisations, presuming you handle erroneous return values by producing error messages rather than using the variable.
edit: On that topic of undefined behaviour, consider what happens in this code:
if(number < 0)
number= -number;
If number is -32768, and INT_MAX is 32767, then section 6.5, paragraph 5 of the C standard applies because -(-32768) isn't representable as an int.
Section 6.5, paragraph 5 says:
If an exceptional condition occurs during the evaluation of an
expression (that is, if the result is not mathematically defined or
not in the range of representable values for its type), the behavior
is undefined.
Suppose if you don't initialize a variable and your code is buggy.(e.g. you forgot to read number). Then uninitialized value of number is garbage and different run will output(or behave) different results.
But If you initialize all of your variables then it will produce constant result. An easy to trace error.
Yes, initialize steps will add extra steps in your code at low level. for example mov $0, 28(%esp) in your code at low level. But its one time task. doesn't kill your code efficiency.
So, always using initialization is a good practice!
With modern compilers, there isn't going to be any difference in efficiency. Coding style is the main consideration. In general, your code is more self-explanatory and less likely to have mistakes if you initialize all variables upon declaring them. In the case you gave, though, since the variable is effectively initialized by the scanf, I'd consider it better not to have a redundant initialization.
Before, you need to answer to this questions:
1) how many time is called this function? if you call 10.000.000 times, so, it's a good idea to have the best.
2) If I don't inizialize my variable, I'm sure that my code is safe and not throw any exception?
After, an int inizialization doesn't change so much in your code, but a string inizialization yes.
Be sure that you do all the controls, because if you have a not-inizialized variable your program is potentially buggy.
I can't tell you how many times I've seen simple errors because a programmer doesn't initialize a variable. Just two days ago there was another question on SO where the end result of the issue being faced was simply that the OP didn't initialize a variable and thus there were problems.
When you talk about "speed" and "efficiency" don't simply consider how much faster the code might compile or run (and in this case it's pretty much irrelevant anyway) but consider your debugging time when there's a simple mistake in the code do to the fact you didn't initialize a variable that very easily could have been.
Note also, my experience is when coding for larger corporations they will run your code through tools like coverity or klocwork which will ding you for uninitialized variables because they present a security risk.

Concurrent/Asynchronous access to shared data

I searched around a bit, but could not find anything useful. Could someone help me with this concurrency/synchronization problem?
Given five instances of the program below running asynchronously, with s being a shared data with an initial value of 0 and i a local variable, which values are obtainable by s?
for (i = 0; i < 5; i ++) {
s = s + 1;
}
2
1
6
I would like to know which values, and why exactly.
The non-answering answer is: Uaaaagh, don't do such a thing.
An answer more in the sense of your question is: In principle, any value is possible, because it is totally undefined. You have no strict guarantee that concurrent writes are atomic in any way and don't result in complete garbage.
In practice, less-than-machine-word sized writes are atomic everywhere (for what I know, at least), but they do not have a defined order. Also, you usually don't know in which order threads/processes are scheduled. So you will never see a "random garbage" value, but also you cannot know what it will be. It will be anything 5 or higher (up to 25).
Since no atomic increment is used, there is a race between reading the value, incrementing it, and writing it back. If the value is being written by another instance before the result is written back, the write (and thus increment) that finished earlier has no effect. If that does not happen, both increments are effective.
Nevertheless, each instance increments the value at least 5 times, so apart from the theoretical "total garbage" possibility, there is no way a value less than 5 could result in the end. Therefore (1) and (2) are not possible, but (3) is.

How does this C++ function use memoization?

#include <vector>
std::vector<long int> as;
long int a(size_t n){
if(n==1) return 1;
if(n==2) return -2;
if(as.size()<n+1)
as.resize(n+1);
if(as[n]<=0)
{
as[n]=-4*a(n-1)-4*a(n-2);
}
return mod(as[n], 65535);
}
The above code sample using memoization to calculate a recursive formula based on some input n. I know that this uses memoization, because I have written a purely recursive function that uses the same formula, but this one much, much faster for much larger values of n. I've never used vectors before, but I've done some research and I understand the concept of them. I understand that memoization is supposed to store each calculated value, so that instead of performing the same calculations over again, it can simply retrieve ones that have already been calculated.
My question is: how is this memoization, and how does it work? I can't seem to see in the code at which point it checks to see if a value for n already exists. Also, I don't understand the purpose of the if(as[n]<=0). This formula can yield positive and negative values, so I'm not sure what this check is looking for.
Thank you, I think I'm close to understanding how this works, it's actually a bit more simple than I was thinking it was.
I do not think the values in the sequence can ever be 0, so this should work for me, as I think n has to start at 1.
However, if zero was a viable number in my sequence, what is another way I could solve it? For example, what if five could never appear? Would I just need to fill my vector with fives?
Edit: Wow, I got a lot of other responses while checking code and typing this one. Thanks for the help everyone, I think I understand it now.
if (as[n] <= 0) is the check. If valid values can be negative like you say, then you need a different sentinel to check against. Can valid values ever be zero? If not, then just make the test if (as[n] == 0). This makes your code easier to write, because by default vectors of ints are filled with zeroes.
The code appears to be incorrectly checking is (as[n] <= 0), and recalculates the negative values of the function(which appear to be approximately every other value). This makes the work scale linearly with n instead of 2^n with the recursive solution, so it runs a lot faster.
Still, a better check would be to test if (as[n] == 0), which appears to run 3x faster on my system. Even if the function can return 0, a 0 value just means it will take slightly longer to compute (although if 0 is a frequent return value, you might want to consider a separate vector that flags whether the value has been computed or not instead of using a single vector to store the function's value and whether it has been computed)
If the formula can yield both positive and negative values then this function has a serious bug. The check if(as[n]<=0) is supposed to be checking if it had already cached this value of computation. But if the formula can be negative this function recalculates this cached value alot...
What it really probably wanted was a vector<pair<bool, unsigned> >, where the bool says if the value has been calculated or not.
The code, as posted, only memoizes about 40% of the time (precisely when the remembered value is positive). As Chris Jester-Young pointed out, a correct implementation would instead check if(as[n]==0). Alternatively, one can change the memoization code itself to read as[n]=mod(-4*a(n-1)-4*a(n-2),65535);
(Even the ==0 check would spend effort when the memoized value was 0. Luckily, in your case, this never happens!)
There's a bug in this code. It will continue to recalculate the values of as[n] for as[n] <= 0. It will memoize the values of a that turn out to be positive. It works a lot faster than code without the memoization because there are enough positive values of as[] so that the recursion is terminated quickly. You could improve this by using a value of greater than 65535 as a sentinal. The new values of the vector are initialized to zero when the vector expands.