I've been reading about the XorShift PRNG especially the paper here
A guy here states that
The number lies in the range [1, 2**64). Note that it will NEVER be 0.
Looking at the code that makes sense:
uint64_t x;
uint64_t next(void) {
x ^= x >> 12; // a
x ^= x << 25; // b
x ^= x >> 27; // c
return x * UINT64_C(2685821657736338717);
}
If x would be zero than every next number would be zero too. But wouldn't that make it less useful? The usual use-pattern would be something like min + rand() % (max - min) or converting the 64 bits to 32 bits if you only need an int. But if 0 is never returned than that might be a serious problem. Also the bits are not 0 or 1 with the same probability as obviously 0 is missing so zeroes or slightly less likely. I even can't find any mention of that on Wikipedia so am I missing something?
So what is a good/appropriate way to generate random, equally distributed numbers from XorShift64* in a given range?
Short answer: No it cannot return zero.
According the Numeric Recipes "it produces a full period of 2^64-1 [...] the missing value is zero".
The essence is that those shift values have been chosen carefully to make very long sequences (full possible one w/o zero) and hence one can be sure that every number is produced. Zero is indeed the fixpoint of this generator, hence it produces 2 sequences: Zero and the other containing every other number.
So IMO for a sufficiently small range max-min it is enough to make a function (next() - 1) % (max - min) + min or even omitting the subtraction altogether as zero will be returned by the modulo.
If one wants better quality equal distribution one should use the 'usual' method by using next() as a base generator with a range of [1, 2^64)
I am nearly sure that there is an x, for which the xorshift operation returns 0.
Proof:
First, we have these equations:
a = x ^ (x >> 12);
b = a ^ (a << 25);
c = b ^ (b >> 27);
Substituting them:
b = (x ^ x >> 12) ^ ((x ^ x >> 12) << 25);
c = b ^ (b >> 27) = ((x ^ x >> 12) ^ ((x ^ x >> 12) << 25)) ^ (((x ^ x >> 12) ^ ((x ^ x >> 12) << 25)) >> 27);
As you can see, although c is a complex equation, it is perfectly abelian.
It means, you can express the bits of c as fully boolean expressions of the bits of x.
Thus, you can simply construct an equation system for the bits b0, b1, b2, ... so:
(Note: the coefficients are only examples, I didn't calculate them, but so would it look):
c0 = x1 ^ !x32 ^ x47 ...
c1 = x23 ^ x45 ^ !x61 ...
...
c63 = !x13 ^ ...
From that point, you have 64 equations and 64 unknowns. You can simply solve it with Gauss-elimination, you will always have a single unique solution.
Except some rare cases, i.e. if the determinant of the coefficients of the equation system is zero, but it is very unlikely in the size of such a big matrix.
Even if it happens, it would mean that you have an information loss in every iteration, i.e. you can't get all of the 2^64 possible values of x, only some of them.
Now consider the much more probable possibility, that the coefficient matrix is non-zero. In this case, for all the possible 2^64 values of x, you have all possible 2^64 values of c, and these are all different.
Thus, you can get zero.
Extension: actually you get zero for zero... sorry, the proof is more useful to show that it is not so simple as it seems for the first spot. The important part is that you can express the bits of c as a boolean function of the bits of x.
There is another problem with this random number generator. And this is that even if you somehow modify the function to not have such problem (for example, by adding 1 in every iteration):
You still can't guarantee that it won't get into a short loop *for any possible values of x. What if there is a 5 length loop for value 345234523452345? Can you prove for all possible initial values? I can't.
Actually, having a really pseudorandom iteration function, your system will likely loop after 2^32 iterations. It has a nearly trivial combinatoric reason, but "unfortunately this margin is small to contain it" ;-)
So:
If a 2^32 loop length is for your PRNG okay, then use a proven iteration function collected from somewhere on the net.
If it isn't, upgrade the bit length to at least 2^128. It will result a roughly 2^64 loop length which is not so bad.
If you still want a 64-bit output, then use 128-bit numeric internally, but return (x>>64) ^ (x&(2^64-1)) (i.e. xor-ing the upper and lower half of the internal state x).
Is there a quick bit operation to implement msb_equal: a function to check if two numbers have the same most significant bit?
For example, 0b000100 and 0b000111 both have 4 as their most significant bit value, so they are most msb_equal. In contrast 0b001111 has 8 as the MSB value, and 0b010000 has 16 as it's MSB value, so the pair are not msb_equal.
Similarly, are there fast ways to compute <, and <=?
Examples:
msb_equal(0, 0) => true
msb_equal(2, 3) => true
msb_equal(0, 1) => false
msb_equal(1, 2) => false
msb_equal(3, 4) => false
msb_equal(128, 255) => true
A comment asks why 0 and 1 are not msb_equal. My view on this is that if I write out two numbers in binary, they are msb_equal when the most significant 1 bit in each is the same bit.
Writing out 2 & 3:
2 == b0010
3 == b0011
In this case, the top most 1 is the same in each number
Writing out 1 & 0:
1 == b0001
0 == b0000
Here, the top most 1s are not the same.
It could be said that as 0 has no top most set bit, msb_equal(0,0) is ambiguous. I'm defining it as true: I feel this is helpful and consistent.
Yes, there are fast bit based operations to compute MSB equality and inequalities.
Note on syntax
I'll provide implementations using c language syntax for bitwise and logical operators:
| – bitwise OR. || – logical OR.
& – bitwise AND. && – logical AND.
^ – bitwise XOR.
==
msb_equal(l, r) -> bool
{
return (l^r) <= (l&r)
}
<
This is taken from the Wikipedia page on the Z Order Curve (which is awesome):
msb_less_than(l, r) -> bool
{
(l < r) && (l < l^r)
}
<=
msb_less_than_equal(l, r) -> bool
{
(l < r) || (l^r <= l&r)
}
If you know which number is the smallest/biggest one, there is a very fast way to check whether the MSBs are equal. The following code is written in C:
bool msb_equal(unsigned small, unsigned big) {
assert(small <= big);
return (small ^ big) <= small;
}
This can be useful in cases like when you add numbers to a variable and you want to know when you reached a new power of 2.
Explanation
The trick here is that if the two numbers have the same most significant bit, it will disappear since 1 xor 1 is 0; that makes the xor result smaller than both numbers. If they have different most significant bits, the biggest number's MSB will remain because the smallest number has a 0 in that place and therefore the xor result will be bigger than the smallest number.
When both input numbers are 0, the xor result will be 0 and the function will return true. If you want 0 and 0 to count as having different MSBs then you can replace <= with <.
The problem is to tell if two 8-bit chars are gray codes(differ only in 1 bit) in C++?
I found an elegant C++ solution:
bool isGray(char a, char b) {
int m = a ^ b;
return m != 0 && (m & (m - 1) & 0xff) == 0;
}
I was confused that what does the "& 0xff" do?
& 0xff extracts the 8 lowest bits from the resulting value, ignoring any higher ones.
It's wrong. The mistaken idea is that char is 8 bit.
It's also pointless. The presumed problem is that m can have more bits than char (true) so that "unnecessary" bits are masked off.
But m is sign-extended. That means the sign bit is copied to the higher bits. Now, when we're comparing x==0 we're checking whether all bits are zero, and with x & 0xff we're comparing if the lower 8 bits are zero. If the 8th bit of x is copied to all higher positions (by sign extension), then the two conditions are the same regardless of whether the copied bit was 0 or 1.
i am trying to do some exercises, but i'm stuck at this point, where i can't understand what's happening and can't find anything related to this particular matter (Found other things about logical operators, but still not enough)
EDIT: Why the downvote, i was pretty explicit. There is no information regarding the type of X, but i assume is INT, the size is not described either, i thought i would discover that by doing the exercise.
a) At least one bit of x is '1';
b) At least one bit of x is '0';
c) At least one bit at the Least Significant Byte of x , is '1';
d) At least one bit at the Least Significant Byte of x , is '0';
I have the solutions, but would be great to understand them
a) !!x // What happens here? The '!' Usually is NOT in c
b) !!~x // Again, the '!' appears... The bitwise operand NOT is '~' and it makes the int one's complement, no further realization made unfortunately
c) !!(x & 0xFF) // I've read that this is a bit mask, i think they take in consideration 4 bytes in X, and this applies a mask at the least significant byte?
d) !!(~x & 0xFF) // Well, at this point i'm lost ...
I would love not having to skip classes at college, but i work full time in order to pay the fees :( .
You can add brackets around the separate operations and apply them in order. e.g.
!(!(~x))
i.e. !! is 2 NOT's
What happens to some value if you perform one NOT is:
If x == 0 then !x == 1, otherwise !x == 0
So, if you would perform another NOT, you invert the truth-value again. i.e.
If x == 0 then !!x == 0, otherwise !!x == 1
You could see it as getting your value between 0 and 1 in which 0 means: "no bit of x is '1'", and 1 means: "at least one bit of x is '1'".
Also, x & 0xFF takes the least significant byte of your variable. More thoroughly explained here:
What does least significant byte mean?
Assuming x is some unsigned int/short/long/... and you want conditions (if, while...):
a) You´ll have to know that just a value/variable as condition (without a==b or something)
is false if it is 0 and true if it is not 0. So, if x is not 0 (true), one ! will switch it to 0 and the other ! to something not-0-like again (not necessarily the old value, only not 0). If x was 0, the ! will finally result in 0 again (first not 0, then again 0).
The whole value of x is not 0 if at least 1 bit is 1...
What you´re doing is to transform either 0 to 0 or a value with 1-bits to some value with 1-bits. Not wrong, but... You can just write if(x) instead of if(!!x)
b) ~ switches every 0-bit to 1 and every 1 to 0. Now you can search again a 1 because you want a 0 in the original value. The same !!-thing again...
c and d:
&0xFF sets all bits except for the lowest 8 ones (lowest byte) to 0.
The result of A&B is a value where each bit is only 1 if the bits of A an B at the same position are both 1. 0xff (decimal 255) is the number which has exactly the lowest 8 bits set to 1...
I am trying to learn C programming, and I was studying some source codes and there are some things I didn't understand, especially regarding Bitwise Operators. I read some sites on this, and I kinda got an idea on what they do, but when I went back to look at this codes, I could not understand why and how where they used.
My first question is not related to bitwise operators but rather some ascii magic:
Can somebody explain to me how the following code works?
char a = 3;
int x = a - '0';
I understand this is done to convert a char into an int, however I don't understand the logic behind it. Why/How does it work?
Now, Regarding Bitwise operators, I feel really lost here.
What does this code do?
if (~pointer->intX & (1 << i)) { c++; n = i; }
I read somewhere that ~ inverts bits, but I fail to see what this statement is doing and why is it doing that.
Same with this line:
row.data = ~(1 << i);
Other question:
if (x != a)
{
ret |= ROW;
}
What exactly is the |= operator doing? From what I read, |= is OR but i don't quite understand what is this statement doing.
Is there any way of rewriting this code to make it easier to understands so that it doesn't use this bitwise operators? I find them very confusing to understand, so hopefully somebody will point me in the right direction on understanding how they work better!
I have a much better understanding of bitwise operators now and the whole code makes much more sense now.
One last thing: appartenly nobody responded if there would be a "cleaner" way for rewriting this code in a way that its easier to understand and maybe not at "bitlevel". Any ideas?
This will produce junk:
char a = 3;
int x = a - '0';
This is different - note the quotes:
char a = '3';
int x = a - '0';
The char datatype stores a number that identifiers a character. The characters for the digits 0 through 9 are all next to each other in the character code list, so if you subtract the code for '0' from the code for '9', you get the answer 9. So this will turn a digit character code into the integer value of the digit.
(~pointer->intX & (1 << i))
That will be interpreted by the if statement as true if it's non-zero. There are three different bitwise operators being used.
The ~ operator flips all the bits in the number, so if pointer->intX was 01101010, then ~pointer->intX will be 10010101. (Note that throughout, I'm illustrating the contents of a byte. If it was a 32-bit integer, I'd have to write 32 digits of 1s and 0s).
The & operator combines two numbers into one number, by dealing with each bit separately. The resulting bit is only 1 if both the input bits are 1. So if the left side is 00101001 and the right side is 00001011, the result will be 00001001.
Finally, << means left shift. If you start with 00000001 and left shift it by three places, you'll have 00001000. So the expression (1 << i) produces a value where bit i is switched on, and the others are all switch off.
Putting it all together, it tests if bit i is switched off (zero) in pointer->intX.
So you may be able to figure out what ~(1 << i) does. If i is 4, the thing in brackets will be 00010000, and so the whole thing will be 11101111.
ret |= ROW;
That one is equivalent to:
ret = ret | ROW;
The | operator is like & except that the resulting bit is 1 if either of the input bits is 1. So if ret is 00100000 and ROW is 00000010, the result will be 00100010.
ret |= ROW;
is equivalent to
ret = ret | ROW;
For char a = 3; int x = a - '0'; I think you meant char a = '3'; int x = a - '0';. It's easy enough to understand if you realize that in ASCII the numbers come in order, like '0', '1', '2', ... So if '0' is 48 and '1' is 49, then '1' - '0' is 1.
For bitwise operations, they are hard to grasp until you start looking at bits. When you view these operations on binary numbers then you can see exactly how they work...
010 & 111 = 010
010 | 111 = 111
010 ^ 111 = 101
~010 = 101
I think you probably have a typo, and meant:
char a = '3';
The reason this works is that all the numbers come in order, and '0' is the first. Obviously, '0' - '0' = 0. '1' - '0' = 1, since the character value for '1' is one greater than the character value for '0'. Etc.
1) A char is really just a 8-bit integer. '0' == 48, and all that that implies.
2) (~(pointer->intX) & (1 << i)) evalutates whether the 'i'th bit (from the right) in the intX member of whatever pointer points to is not set. The ~ inverts the bits, so all the 0s become 1s and vice versa, then the 1 << i puts a single 1 in the desired location, & combines the two values so that only the desired bit is kept, and the whole thing evalutes to true if that bit was 0 to begin with.
3) | is bitwise or. It takes each bit in both operands and performs a logical OR, producing a result where each bit is set if either operand had that bit set. 0b11000000 | 0b00000011 == 0b11000011. |= is an assignment operator, in the same way that a+=b means a=a+b, a|=b means a=a|b.
Not using bitwise operators CAN make things easier to read in some cases, but it will usually also make your code significantly slower without strong compiler optimization.
The subtraction trick you reference works because ASCII numbers are arranged in ascending order, starting with zero. So if ASCII '0' is a value of 48 (and it is), then '1' is a value of 49, '2' is 50, etc. Therefore ASCII('1') - ASCII('0') = 49 - 48 = 1.
As far as bitwise operators go, they allow you to perform bit-level operations on variables.
Let's break down your example:
(1 << i) -- this is left-shifting the constant 1 by i bits. So if i=0, the result is decimal 1. If i = 1, it shifts the bit one to the left, backfilling with zeros, yielding binary 0010, or decimal 2. If i = 2, you shift the bit two to the left, backfilling with zeros, yielding binary 0100 or decimal 4, etc.
~pointer->intX -- this is taking the value of the intX member of pointer and inverting its bits, setting all zeros to ones and vice versa.
& -- the ampersand operator does a bitwise AND comparison. The results of this will be 1 wherever both the left and right side of the expression are 1, and 0 otherwise.
So the test will succeed if pointer->intX has a 0 bit at the ith position from the right.
Also, |= means to do a bitwise OR comparison and assign the result to the left side of the expression. The result of a bitwise OR is 1 for every bit where the corresponding left or right side bit is 1,
Single quotes are used to indicate that a single char is used. '0' therefore is the char '0', which has the ASCII-Code 48.
3-'0'=3-48
'1<<i' shifts 1 i places to the left, therefore only the ith bit from the right is 1.
~pointer->intX negates the field intX, so the logical AND returns a true value (not 0) when intX has every bit except for the ith bit from the right isn't set.
char a = '3';
int x = a - '0';
you had a typo here (notice the 's around the 3), this assigns the ascii value of the character 3, to the char variable, then the next line takes '3' - '0' and assigns it to x, because of the way ascii values work, x will then be equal to 3 (integer value)
In the first comparison, I've never seen ~ being used on a pointer that way before, another typo maybe? If I were to read out the following code:
(~pointer->intX & (1 << i))
I would say "(the value intX dereferenced from pointer) AND (1 left shifted i times)"
1 << i is a quick way of multiplying 1 by a power of 2, ie if i is 3, then 1 << 3 == 8
In this case, I have no clue why you would invert the bits of the pointer..
In the 2nd comparison, x |= y is the same as x = x | y
I'm assuming you mean char a='3'; for the first line of code (otherwise you get a rather strange answer). The basic principal is that ASCII codes for digits are sequential, i.e. the code for '0'=48, the code for '1'=49, and so on. Subtracting '0' simply converts from the ASCII code to the actual digit, so e.g. '3' - '0' = 3, and so on. Note that this will only work if the character you're subtracting '0' from is an actual digit - otherwise the result will have little meaning.
a. Without context the "why" of this code is impossible to say. As for what it's doing, it appears that the if statement evaluates as true when bit i of pointer->intX is not set, i.e. that particular bit is a 0. I believe the & operator gets executed before the ~ operator, as the ~ operator has very low precedence. The code could make better use of parentheses to make the intended order of operations clearer. In this case, the order of operations might not matter though - I believe the result is the same either way.
b. This is simply creating a number with all bits EXCEPT bit i set to 1. A convenient way of creating a mask for bit i is to use the expression (1<<i).
The bitwise OR operation in this case is used to set the bits specified by the ROW constant to 1. If these bits are not set, it sets them; if they're already set it has no effect.
1) Can somebody explain to me how the following code works? char a = 3; int x = a - '0';
I undertand this is done to convert a char into an int, however I don't understand the logic behind it. Why/How does it work?
Sure. variable a is of type char, and by putting single quotes around 0 that is causing C to view it as a char as well. Finally, the whole statement is automagically typecast to its integer equivalent, because x is defined as an integer.
2) Now, Regarding Bitwise operators, I feel really lost here.
--- What does this code do? if (~pointer->intX & (1 << i)) { c++; n = i; } I read somewhere that ~ inverts bits, but I fail to see what this statement is doing and why is it doing that.
(~pointer->intX & (1 << i)) is saying:
negate intX, and AND it with a 1 shifted left by i bits
so, what you're getting, if intX = 1011, and i = 2, equates to
(0100 & 0100)
-negate 1011 = 0100
-(1 << 2) = 0100
0100 & 0100 = 1 :)
then, if the AND operation returns a 1 (which, in my example, it does)
{ c++; n = i; }
so, increment c by 1, and set variable n to be i
Same with this line: row.data = ~(1 << i);
Same principle here.
Shift a 1 to the left by i places, and negate.
So, if i = 2 again
(1 << 2) = 0100
~(0100) = 1011
**--- Other question:
if (x != a) { ret |= ROW; }
What exacly is the |= operator doing? From what I read, |= is OR but i don't quite understand what is this statement doing.**
if (x != a) (hopefully this is apparent to you....if variable x does not equal variable a)
ret |= ROW;
equates to
ret = ret | ROW;
which means, binary OR ret with ROW
For examples of exactly what AND and OR operations accomplish, you should have a decent understanding of binary logic.
Check wikipedia for truth tables...ie
Bitwise operations