Could somebody explain how this actually works for example the char input = 'a'.
I understand that << shift the bits over by four places (for more than one character). But why in the second part add 9? I know 0xf = 15.....Am I missing something obvious.
result = result << 4 | *str + 9 & 0xf;
Here is my understand so far:
char input = 'a' ascii value is 97. Add 9 is 106, 106 in binary is 01101010. 0xf = 15 (00001111), therefore 01101010 & 00001111 = 00001010, this gives the value of 10 and the result is then appended on to result.
Thanks in advance.
First, let's rewrite this with parenthesis to make the order of operations more clear:
result = (result << 4) | ((*str + 9) & 0xf);
If result is 0 on input, then we have:
result = (0 << 4) | ((*str + 9) & 0xf);
Which simplifies to:
result = (0) | ((*str + 9) & 0xf);
And again to:
result = (*str + 9) & 0xf;
Now let's look at the hex and binary representations of a - f:
a = 0x61 = 01100001
b = 0x62 = 01100010
c = 0x63 = 01100011
d = 0x64 = 01100100
e = 0x65 = 01100101
f = 0x66 = 01100110
After adding 9, the & 0xf operation clears out the top 4 bits, so we don't need to worry about those. So we're effectively just adding 9 to the lower 4 bits. In the case of a, the lower 4 bits are 1, so adding 9 gives you 10, and similarly for the others.
As chux mentioned in his comment, a more straightforward way of achieving this is as follows:
result = *str - 'a' + 10;
Related
I am calculating CRC on a large chunk of data every cycle in hardware (64B per cycle). In order to parallelize the CRC calculation, I want to calculate the CRC for small data chunks and then XOR them in parallel.
Approach:
We divide the data into small chunks (64B data divided into 8 chunks
of 8B each).
Then we calculate CRC's for all the chunks
individually (8 CRC's in parallel for 8B chunks).
Finally calculate
the CRC for padded data. This answer points out that the CRC
for padded data is obtained by multiplying the old CRC with x^n.
Hence, I am calculating the CRC for a small chunk of data, then multiply it with CRC of 0x1 shifted by 'i' times as shown below.
In short, I am trying to accomplish below:
For example: CRC-8 on this site:
Input Data=(0x05 0x07) CRC=0x54
Step-1: Data=0x5 CRC=0x1B
Step-2: Data=0x7 CRC=0x15
Step-3: Data=(0x1 0x0) CRC=0x15
Step-4: Multiply step-1 CRC and step-3 CRC with primitive polynomial 0x7. So, I calculate (0x1B).(0x15) = (0x1 0xC7) mod 0x7.
Step-5: Calculate CRC Data=(0x1 0xC7) CRC=0x4E (I assume this is same as (0x1 0xC7) mod 0x7)
Step-6: XOR the result to get the final CRC. 0x4E^0x15=0x5B
As we can see, the result in step-6 is not the correct result.
Can someone help me how to calculate the CRC for padded data? Or where am I going wrong in the above example?
Rather than calculate and then adjust multiple CRC's, bytes of data can be carryless multiplied to form a set of 16 bit "folded" products, which are then xor'ed and a single modulo operation performed on the xor'ed "folded" products. An optimized modulo operation uses two carryless multiples, so it's avoided until all folded products have been generated and xor'ed together. A carryless multiply uses XOR instead of ADD and a borrowless divide uses XOR instead of SUB. Intel has a pdf file about this using the XMM instruction PCLMULQDQ (carryless multiply), where 16 bytes are read at a time, split into two 8 byte groups, with each group folded into a 16 byte product, and the two 16 byte products are xor'ed to form a single 16 byte product. Using 8 XMM registers to hold folding products, 128 bytes at time are processed. (256 bytes at at time in the case of AVX512 and ZMM registers).
https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
Assume your hardware can implement a carryless multiply that takes two 8 bit operands and produces a 16 bit (technically 15 bit) product.
Let message = M = 31 32 33 34 35 36 37 38. In this case CRC(M) = C7
pre-calculated constants (all values shown in hex):
2^38%107 = DF cycles forwards 0x38 bits
2^30%107 = 29 cycles forwards 0x30 bits
2^28%107 = 62 cycles forwards 0x28 bits
2^20%107 = 16 cycles forwards 0x20 bits
2^18%107 = 6B cycles forwards 0x18 bits
2^10%107 = 15 cycles forwards 0x10 bits
2^08%107 = 07 cycles forwards 0x08 bits
2^00%107 = 01 cycles forwards 0x00 bits
16 bit folded (cycled forward) products (can be calculated in parallel):
31·DF = 16CF
32·29 = 07E2
33·62 = 0AC6
34·16 = 03F8
35·6B = 0A17
36·15 = 038E
37·07 = 0085
38·01 = 0038
----
V = 1137 the xor of the 8 folded products
CRC(V) = 113700 % 107 = C7
To avoid having to use borrowless divide for the modulo operation, CRC(V) can be computed using carryless multiply. For example
V = FFFE
CRC(V) = FFFE00 % 107 = 23.
Implementation, again all values in hex (hex 10 = decimal 16), ⊕ is XOR.
input:
V = FFFE
constants:
P = 107 polynomial
I = 2^10 / 107 = 107 "inverse" of polynomial
by coincidence, it's the same value
2^10 % 107 = 15 for folding right 16 bits
fold the upper 8 bits of FFFE00 16 bits to the right:
U = FF·15 ⊕ FE00 = 0CF3 ⊕ FE00 = F2F3 (check: F2F3%107 = 23 = CRC)
Q = ((U>>8)·I)>>8 = (F2·107)>>8 = ...
to avoid a 9 bit operand, split up 107 = 100 ⊕ 7
Q = ((F2·100) ⊕ (F2·07))>>8 = ((F2<<8) ⊕ (F2·07))>>8 = (F200 ⊕ 02DE)>>8 = F0DE>>8 = F0
X = Q·P = F0·107 = F0·100 ⊕ F0·07 = F0<<8 ⊕ F0·07 = F000 ⊕ 02D0 = F2D0
CRC = U ⊕ X = F2F3 ⊕ F2D0 = 23
Since the CRC is 8 bits, there's no need for the upper 8 bits in the last two steps, but it doesn't help that much for the overall calculation.
X = (Q·(P&FF))&FF = (F0·07)&FF = D0
CRC = (U&FF) ⊕ X = F3 ⊕ D0 = 23
Example program to generate 2^0x10 / 0x107 and powers of 2 % 0x107:
#include <stdio.h>
typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
#define poly 0x107
uint16_t geninv(void) /* generate 2^16 / 9 bit poly */
{
uint16_t q = 0x0000u; /* quotient */
uint16_t d = 0x0001u; /* initial dividend = 2^0 */
for(int i = 0; i < 16; i++){
d <<= 1;
q <<= 1;
if(d&0x0100){ /* if bit 8 set */
q |= 1; /* q |= 1 */
d ^= poly; /* d ^= poly */
}
}
return q; /* return inverse */
}
uint8_t powmodpoly(int n) /* generate 2^n % 9 bit poly */
{
uint16_t d = 0x0001u; /* initial dividend = 2^0 */
for(int i = 0; i < n; i++){
d <<= 1; /* shift dvnd left */
if(d&0x0100){ /* if bit 8 set */
d ^= poly; /* d ^= poly */
}
}
return (uint8_t)d; /* return remainder */
}
int main()
{
printf("%04x\n", geninv());
printf("%02x %02x %02x %02x %02x %02x %02x %02x %02x %02x\n",
powmodpoly(0x00), powmodpoly(0x08), powmodpoly(0x10), powmodpoly(0x18),
powmodpoly(0x20), powmodpoly(0x28), powmodpoly(0x30), powmodpoly(0x38),
powmodpoly(0x40), powmodpoly(0x48));
printf("%02x\n", powmodpoly(0x77)); /* 0xd9, cycles crc backwards 8 bits */
return 0;
}
Long hand example for 2^0x10 / 0x107.
100000111 quotient
-------------------
divisor 100000111 | 10000000000000000 dividend
100000111
---------
111000000
100000111
---------
110001110
100000111
---------
100010010
100000111
---------
10101 remainder
I don't know how many registers you can have in your hardware design, but assume there are five 16 bit registers used to hold folded values, and either two or eight 8 bit registers (depending on how parallel the folding is done). Then following the Intel paper, you fold values for all 64 bytes, 8 bytes at a time, and only need one modulo operation. Register size, fold# = 16 bits, reg# = 8 bits. Note that powers of 2 modulo poly are pre-calculated constants.
foldv = prior buffer's folding value, equivalent to folded msg[-2 -1]
reg0 = foldv>>8
reg1 = foldv&0xFF
foldv = reg0·((2^0x18)%poly) advance by 3 bytes
foldv ^= reg1·((2^0x10)%poly) advance by 2 bytes
fold0 = msg[0 1] ^ foldv handling 2 bytes at a time
fold1 = msg[2 3]
fold2 = msg[4 5]
fold3 = msg[6 7]
for(i = 8; i < 56; i += 8){
reg0 = fold0>>8
reg1 = fold0&ff
fold0 = reg0·((2^0x48)%poly) advance by 9 bytes
fold0 ^= reg1·((2^0x40)%poly) advance by 8 bytes
fold0 ^= msg[i+0 i+1]
reg2 = fold1>>8 if not parallel, reg0
reg3 = fold1&ff and reg1
fold1 = reg2·((2^0x48)%poly) advance by 9 bytes
fold1 ^= reg3·((2^0x40)%poly) advance by 8 bytes
fold1 ^= msg[i+2 i+3]
...
fold3 ^= msg[i+6 i+7]
}
reg0 = fold0>>8
reg1 = fold0&ff
fold0 = reg0·((2^0x38)%poly) advance by 7 bytes
fold0 ^= reg1·((2^0x30)%poly) advance by 6 bytes
reg2 = fold1>>8 if not parallel, reg0
reg3 = fold1&ff and reg1
fold1 = reg2·((2^0x28)%poly) advance by 5 bytes
fold1 ^= reg3·((2^0x20)%poly) advance by 4 bytes
fold2 ... advance by 3 2 bytes
fold3 ... advance by 1 0 bytes
foldv = fold0^fold1^fold2^fold3
Say the final buffer has 5 bytes:
foldv = prior folding value, equivalent to folded msg[-2 -1]
reg0 = foldv>>8
reg1 = foldv&0xFF
foldv = reg0·((2^0x30)%poly) advance by 6 bytes
foldv ^= reg1·((2^0x28)%poly) advance by 5 bytes
fold0 = msg[0 1] ^ foldv
reg0 = fold0>>8
reg1 = fold0&ff
fold0 = reg0·((2^0x20)%poly) advance by 4 bytes
fold0 ^= reg1·((2^0x18)%poly) advance by 3 bytes
fold1 = msg[2 3]
reg2 = fold1>>8
reg3 = fold1&ff
fold1 = reg0·((2^0x10)%poly) advance by 2 bytes
fold1 ^= reg1·((2^0x08)%poly) advance by 1 bytes
fold2 = msg[4] just one byte loaded
fold3 = 0
foldv = fold0^fold1^fold2^fold3
now use the method above to calculate CRC(foldv)
As shown in your diagram, you need to calculate the CRC of 0x05 0x00, (A,0), and the CRC of 0x00 0x07, (0,B), and then exclusive-or those together. Calculating on the site you linked, you get 0x41 and 0x15 respectively. Exclusive-or those together, and, voila, you get 0x54, the CRC of 0x05 0x07.
There is a shortcut for (0,B), since for this CRC, the CRC of a string of zeros is zero. You can calculate the CRC of just 0x07 and get the same result as for 0x00 0x07, which is 0x15.
See crcany for how to combine CRCs in general. crcany will generate C code to compute any specified CRC, including code to combine CRCs. It employs a technique that applies n zeros to a CRC in O(log(n)) time instead of O(n) time.
I've two integer of one byte, so each integer can be in the range 0-255.
Suppose
a = 247
b = 8
Now the binary representation of each is:
a = 11110111
b = 00001000
What I need are the following operations:
1) Concat those two binary sequence, so following the example, "a" concat "b" would result in:
concat = 1111011100001000 -> that is 63240
2) Consider only the first n most significative bits, suppose the first 10 bits, this would resoult in:
msb = 1111011100000000
3) Revert the bits order, this would result in:
reverse = 0001000011101111
For the question number 2 I know it's just a shift operation, for the 3 I've guessed to convert in hex, then swap and then convert to integer, but I guess also that there are more elegant way.
For the number 1, I've wrote this solution, but I'm looking for something more elegant and performant:
a = 247
b = 8
first = hex(a)[2:].zfill(2)
second = hex(b)[2:].zfill(2)
concat = int("0x"+first+second,0)
Thank you
Those operations are basic bit manipulations, with the exception of the last step:
247, left shift by 8 bits (= move to higher byte)
keep 8 as it is (= lower byte)
add those results
bit-and with a mask that zeroes out the last 6 bits
reverse the bit order (not a bit operation; more methods here)
In Python 2.7 code:
a = 247 << 8
print "a = {0: >5} - {1:0>16}".format(a, bin(a)[2:])
b = 8
print "b = {0: >5} - {1:0>16}".format(b, bin(b)[2:])
c = a + b
print "c = {0: >5} - {1:0>16}".format(c, bin(c)[2:])
d = c & ~63 # 63 = 111111, ~63 is the inverse
print "d = {0: >5} - {1:0>16}".format(d, bin(d)[2:])
e = int('{:08b}'.format(d)[::-1], 2)
print "e = {0: >5} - {1:0>16}".format(e, bin(e)[2:])
Output
a = 63232 - 1111011100000000
b = 8 - 0000000000001000
c = 63240 - 1111011100001000
d = 63232 - 1111011100000000
e = 239 - 0000000011101111
Given this implementation of atoi in C++
// A simple atoi() function
int myAtoi(char *str)
{
int res = 0; // Initialize result
// Iterate through all characters of input string and update result
for (int i = 0; str[i] != '\0'; ++i)
res = res*10 + str[i] - '0';
// return result.
return res;
}
// Driver program to test above function
int main()
{
char str[] = "89789";
int val = myAtoi(str);
printf ("%d ", val);
return 0;
}
How exactly does the line
res = res*10 + str[i] - '0';
Change a string of digits into int values? (I'm fairly rusty with C++ to be honest. )
The standard requires that the digits are consecutive in the character set. That means you can use:
str[i] - '0'
To translate the character's value into its equivalent numerical value.
The res * 10 part is to shuffle left the digits in the running total to make room for the new digit you're inserting.
For example, if you were to pass "123" to this function, res would be 1 after the first loop iteration, then 12, and finally 123.
Each step that line does two things:
Shifts all digit left by a place in decimal
Places the current digit at the ones place
The part str[i] - '0' takes the ASCII character of the corresponding digit which are sequentially "0123456789" and subtracts the code for '0' from the current character. This leaves a number in the range 0..9 as to which digit is in that place in the string.
So when looking at your example case the following would happen:
i = 0 → str[i] = '8': res = 0 * 10 + 8 = 8
i = 1 → str[i] = '9': res = 8 * 10 + 9 = 89
i = 2 → str[i] = '7': res = 89 * 10 + 7 = 897
i = 3 → str[i] = '8': res = 897 * 10 + 8 = 8978
i = 4 → str[i] = '9': res = 8978 * 10 + 9 = 89789
And there's your result.
The digits 0123456789are sequential in ASCII.
The char datatype (and literal chars like '0') are integral numbers. In this case, '0' is equivalent to 48. Subtracting this offset will give you the digit in numerical form.
Lets take an example:
str = "234";
to convert it into int, basic idea is to process each character of string like this:
res = 2*100 + 3*10 + 4
or
res = 0
step1: res = 0*10 + 2 = 0 + 2 = 2
step2: res = res*10 + 3 = 20 + 3 = 23
step3: res = res*10 + 4 = 230 + 4 = 234
now since each letter in "234" is actually a character, not int
and has ASCII value associated with it
ASCII of '2' = 50
ASCII of '3' = 51
ASCII of '4' = 52
ASCII of '0' = 48
refer: http://www.asciitable.com/
if i had done this:
res = 0;
res = res*10 + str[0] = 0 + 50 = 50
res = res*10 + str[1] = 500 + 51 = 551
res = res*10 + str[2] = 5510 + 52 = 5562
then i would have obtained 5562, which we don't want.
remember: on using characters in arithmetic expressions, their ASCII value is used up (automatic typecasting of char -> int). Hence we need to convert character '2'(50) to int 2, which we can accomplish like this:
'2' - '0' = 50 - 48 = 2
Lets solve it again with this correction:
res = 0
res = res*10 + (str[0] - '0') = 0 + (50 - 48) = 0 + 2 = 2
res = res*10 + (str[1] - '0') = 20 + (51 - 48) = 20 + 3 = 23
res = res*10 + (str[2] - '0') = 230 + (52 - 48) = 230 + 4 = 234
234 is the required answer
What do "Non-Power-Of-Two Textures" mean? I read this tutorial and I meet some binaries operations("<<", ">>", "^", "~"), but I don't understand what they are doing.
For example following code:
GLuint LTexture::powerOfTwo(GLuint num)
{
if (num != 0)
{
num--;
num |= (num >> 1); //Or first 2 bits
num |= (num >> 2); //Or next 2 bits
num |= (num >> 4); //Or next 4 bits
num |= (num >> 8); //Or next 8 bits
num |= (num >> 16); //Or next 16 bits
num++;
}
return num;
}
I very want to understand this operations. As well, I read this. Very short article. I want to see examples of using, but I not found. I did the test:
int a = 5;
a <<= 1; //a = 10
a = 5;
a <<= 2; //a = 20
a = 5;
a <<= 3; //a = 40
Okay, this like multiply on two, but
int a = 5;
a >>= 1; // a = 2 Whaat??
In C++, the <<= is the "left binary shift" assignment operator; the operand on the left is treated as a binary number, the bits are moved to the left, and zero bits are inserted on the right.
The >>= is the right binary shift; bits are moved to the right and "fall off" the right end, so it's like a division by 2 (for each bit) but with truncation. For negative signed integers, by the way, additional 1 bits are shifted in at the left end ("arithmetic right shift"), which may be surprising; for positive signed integers, or unsigned integers, 0 bits are shifted in at the left ("logical right shift").
"Powers of two" are the numbers created by successive doublings of 1: 2, 4, 8, 16, 32… Most graphics hardware prefers to work with texture maps which are powers of two in size.
As said in http://lazyfoo.net/tutorials/OpenGL/08_non_power_of_2_textures/index.php
powerOfTwo will take the argument and find nearest number that is power of two.
GLuint powerOfTwo( GLuint num );
/*
Pre Condition:
-None
Post Condition:
-Returns nearest power of two integer that is greater
Side Effects:
-None
*/
Let's test:
num=60 (decimal) and its binary is 111100
num--; .. 59 111011
num |= (num >> 1); //Or first 2 bits 011101 | 111011 = 111111
num |= (num >> 2); //Or next 2 bits 001111 | 111111 = 111111
num |= (num >> 4); //Or next 4 bits 000011 | 111111 = 111111
num |= (num >> 8); //Or next 8 bits 000000 | 111111 = 111111
num |= (num >> 16); //Or next 16 bits 000000 | 111111 = 111111
num++; ..63+1 = 64
output 64.
For num=5: num-1 =4 (binary 0100), after all num |= (num >> N) it will be 0111 or 7 decimal). Then num+1 is equal to 8.
As you should know the data in our computers is represented in the binary system, in which digits are either a 1 or a 0.
So for example number 10 decimal = 1010 binary. (1*2^3 + 0*2^2 + 1*2^1 + 0*2^0).
Let's go to the operations now.
Binary | OR means that wherever you have at least one 1 the output will be 1.
1010
| 0100
------
1110
~ NOT means negation i.e. all 0s become 1s and all 1s become 0s.
~ 1010
------
0101
^ XOR means you turn a pair of 1 and 0 into a 1. All other combinations leave a 0 as output.
1010
^ 0110
------
1100
Bit shift.
N >> x means we "slide" our number N, x bits to the right.
1010 >> 1 = 0101(0) // zero in the brackets is dropped,
since it goes out of the representation = 0101
1001 >> 1 = 0100(1) // (1) is dropped = 0100
<< behaves the same way, just the opposite direction.
1000 << 1 = 0001
Since in binary system numbers are represented as powers of 2, shifting a bit one or the other direction will result in multiplying or dividing by 2.
Let num = 36. First subtract 1, giving 35. In binary, this is 100011.
Right shift by 1 position gives 10001 (the rightmost digit disappears). Bitwise Or'ed with num gives:
100011
10001
-------
110011
Note that this ensures two 1's on the left.
Now right shift by 2 positions, giving 1100. Bitwise Or:
110011
1100
-------
111111
This ensures four 1's on the left.
And so on, until the value is completely filled with 1's from the leftmost.
Add 1 and you get 1000000, a power of 2.
This procedure always generates a power of two, and you can check that it is just above the initial value of num.
can someone explain to me why the following results in b = 13?
int a, b, c;
a = 1|2|4;
b = 8;
c = 2;
b |= a;
b&= ~c;
It is using binary manipultaors. (Assuming ints are 1 byte, and use Two's complement for storage, etc.)
a = 1|2|4 means a = 00000001 or 00000010 or 00000100, which is 00000111, or 7.
b = 8 means b = 00001000.
c = 2 means c = 00000010.
b |= a means b = b | a which means b = 00001000 or 00000111, which is 00001111, or 15.
~c means not c, which is 11111101.
b &= ~c means b = b & ~c, which means b = 00001111 and 11111101, which is 00001101, or 13.
http://www.cs.cf.ac.uk/Dave/C/node13.html
a = 1|2|4
= 0b001
| 0b010
| 0b100
= 0b111
= 7
b = 8 = 0b1000
c = 2 = 0b10
b|a = 0b1000
| 0b0111
= 0b1111 = 15
~c = 0b111...1101
(b|a) & ~c = 0b00..001111
& 0b11..111101
= 0b00..001101
= 13
lets go into binary mode:
a = 0111 (7 in decimal)
b = 1000 (8)
c = 0010 (2)
then we OR b with a to get b = 1111 (15)
c = 0010 and ~c = 1101
finally b is anded with negated c which gives us c = 1101 (13)
hint: Convert decimal to binary and give it a shot.. maybe... just maybe you'll figure it all out by yourself
a = 1 | 2 | 4;
Assigns the value 7 to a. This is because you are performing a bitwise OR operation on the constants 1, 2 and 4. Since the binary representation of each of these is 1, 10 and 100 respectively, you get 111 which is 7.
b |= a;
This ORs b and a and assigns the result to b. Since b's binary representation is now 111 and a's binary representation is 1000 (8), you end up with 1111 or 15.
b &= ~c;
The ~c in this expression means the bitwise negation of c. This essentially flips all 0's to 1's and vice versa in the binary representation of c. This means c switches from 10 to 111...11101.
After negating c, there is a bitwise AND operation between b and c. This means only bits that are 1 in both b and c remain 1, all others equal 0. Since b is now 1111 and c is all 1's except in the second lowest order bit, all of b's bits remain 1 except the 2 bit.
The result of flipping b's 2 bit is the same as if you simply subtracted 2 from its value. Since its current value is 15, and 15-2 = 13, the assignment results in b == 13.