CRC32 C++ implementation using bool array and manually XORing bit by bit

I have a problem understanding how CRC32 is supposed to work.
I've implemented the mechanism from the wiki and other sites: https://en.wikipedia.org/wiki/Cyclic_redundancy_check#Computation
where you XOR the elements bit by bit. For CRC32 I've used the polynomial from the wiki, which is quoted everywhere:
x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
with binary representation: 1 0000 0100 1100 0001 0001 1101 1011 0111
I was calculating the CRC32 of the input string "1234", just for testing.
This is the output of the program:
https://i.stack.imgur.com/tG4wk.png
As you can see, the XOR is calculated properly and the CRC32 comes out as "619119D1". When I calculate it with an online calculator or even the C++ Boost library, the answer is "9BE3E0A3".
What is wrong with plainly XORing the input string bit by bit? Should I append something at the end?
I don't want to use libraries or any other magic code to compute this, because I have to implement it this way for my study project.
I've also tried the polynomial without x^32, negating the bits at the end, and starting from 1s instead of 0s (where you have to append 32 zeros), and the answer is still different. I have no idea what to do now to fix this.
This is part of the code (slightly changed). I have a buffer of 3 parts * 32 bits; I load 4 chars from the file into the middle part and XOR from the beginning towards the middle; at the end I XOR the middle part into the last part -> the last part is the CRC32.
My pseudo schema:
1) Load 8 chars
2) | First part | Middle Part | CRC32 = 0 |
3) XOR
4) | 0 0 0 0 | XXXXXXX | 0 0 0 0 |
5) memcpy - middle part to first part
6) | XXXXXXX | XXXXXXX | 0 0 0 0 |
7) Load 4 chars
8) | XXXXXXX | loaded 4chars | 0 0 0 0 |
9) repeat from point 4 to the end of file
10) now we have: | 0 0 0 0 | XXXXXX | 0 0 0 0 |
11) last xor from middle part to end
12) Result: | 0 0 0 0 | 0 0 0 0 | CRC32 |
The screenshot of the output is probably more helpful.
I will use smart pointers etc. later ;)
bool xorBuffer(unsigned char *buffer) {
    bool *binaryTab = nullptr;
    try {
        // CRC-32
        // 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
        //  1  0  0  0  0  0  1  0  0  1  1  0  0  0  0  0  1  0  0  0  1  1  1  0  1  1  0  1  1  0  1  1  1
        const int dividerSizeBits = 33;
        const bool binaryDivider[dividerSizeBits] = { 1,0,0,0,0,0,1,0,0,1,1,0,0,0,0,0,1,0,0,0,1,1,1,0,1,1,0,1,1,0,1,1,1 };
        const int dividerLength = countLength(binaryDivider, dividerSizeBits);
        const int dividerOffset = dividerSizeBits - dividerLength; // when divider < 33 bits
        binaryTab = charTabToBits(buffer); // assign, do not redeclare (a shadowing declaration would leak and break the cleanup below)
        // keep dividing while the first part is non-zero
        while (!checkTabIfEmpty(binaryTab)) {
            // find the first set bit
            int start = 0;
            for (start = 0; start < 32; start++)
                if (binaryTab[start] == true)
                    break;
            for (int i = 0; i < dividerLength; i++)
                binaryTab[i + start] = binaryTab[i + start] ^ binaryDivider[i + dividerOffset];
        }
        // binaryTab -> charTab
        convertBinaryTabToCharTab(binaryTab, buffer);
    }
    catch (const exception &e) {
        delete[] binaryTab;
        return false;
    }
    delete[] binaryTab;
    return true;
}
std::string CRC::countCRC(std::string fileName){
// create variables
int bufferOnePartSize = 4;
int bufferSize = bufferOnePartSize * 3;
bool EOFFlag = false;
unsigned char *buffer = new unsigned char[bufferSize];
for (int i = 0; i < 3 * bufferOnePartSize; i++)
buffer[i] = 0;
// open file
ifstream fin;
fin.open(fileName.c_str(), ios_base::in | ios_base::binary);
int position = 0;
int count = 0;
// while -> EOF
if (fin.is_open()) {
// TODO check if file <= 4 -> another solution
char ch;
int multiply = 2;
bool skipNormalXor = false;
while (true) {
count = 0;
if (multiply == 2)
position = 0;
else
position = bufferOnePartSize;
// copy part form file to tab
while (count < bufferOnePartSize * multiply && fin.get(ch)) {
buffer[position] = (unsigned char)ch;
++count;
++position;
}
cout << endl;
// if EOF write zeros to end of tab
if (count == 0) {
cout << "TODO: end of file" << endl;
EOFFlag = true;
skipNormalXor = true;
}
else if (count != bufferOnePartSize * multiply) {
for (int i = count; i < bufferOnePartSize * multiply; i++) {
buffer[position] = 0;
position++;
}
EOFFlag = true;
}
if (!skipNormalXor) {
// -- first part
multiply = 1;
// xor the buffer
xorBuffer(buffer);
}
if (EOFFlag) { // xor to the end
xorBuffer(buffer + bufferOnePartSize);
break;
}
else {
// copy memory
for (int i = 0; i < bufferOnePartSize; i++)
buffer[i] = buffer[i + bufferOnePartSize];
}
}
cout << "\n End\n";
fin.close();
}
stringstream crcSum;
for (int i = 2 * bufferOnePartSize; i < bufferSize; i++) {
//buffer[i] = ~buffer[i];
crcSum << std::hex << (unsigned int)buffer[i];
}
cout << endl << "CRC: " << crcSum.str() << endl;
delete[] buffer;
return crcSum.str();
}

A CRC is not defined by just the polynomial. You need to define the bit ordering, the initial value of the CRC register, and the final exclusive-or of the CRC. For the standard CRC-32, which gives 0x9be3e0a3 for "1234", the bits are processed starting with the least significant bit, the initial value of the register is 0xffffffff, and you exclusive-or the final results with 0xffffffff.
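For reference, a minimal bit-by-bit sketch of those parameters (not the asker's buffer scheme, just the conventional byte-at-a-time formulation) could look like the following; 0xEDB88320 is the standard polynomial 0x04C11DB7 written bit-reversed, which is the form the LSB-first loop uses:
#include <cstdint>
#include <cstddef>
#include <cstdio>
// Bit-by-bit CRC-32 sketch: input bits are taken least significant first, the
// register starts at 0xFFFFFFFF, and the result is XORed with 0xFFFFFFFF.
uint32_t crc32(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;                  // initial register value
    for (size_t i = 0; i < len; ++i) {
        crc ^= data[i];                          // feed the next byte, LSB first
        for (int bit = 0; bit < 8; ++bit)        // one shift/XOR per input bit
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;                    // final exclusive-or
}
int main()
{
    const unsigned char msg[] = { '1', '2', '3', '4' };
    std::printf("%08X\n", crc32(msg, 4));        // prints 9BE3E0A3
}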


How to create a number with (f)16 repeating n times?

I need to create a number where (f)16 repeats n times. 0 < n <= 16.
I tried the following for example for n = 16
std::cout << "hi:" << std::hex << std::showbase << (1ULL << 64) - 1 << std::endl;
warning: shift count >= width of type [-Wshift-count-overflow]
std::cout << "hi:" << std::hex << std::showbase << (1ULL << 64) - 1 << std::endl;
^ ~~ 1 warning generated.
hi:0x200
How can I get all f digits without overflowing unsigned long long?
For n = 1 to 16, you could start with all Fs and then shift accordingly:
0xFFFFFFFFFFFFFFFFULL >> (4*(16-n));
(handle n=0 separately)
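A small sketch of that, with the n = 0 case handled separately because shifting a 64-bit value by 64 bits is undefined behaviour (repeated_f is just a made-up name):
#include <cstdint>
#include <cstdio>
uint64_t repeated_f(unsigned n)              // assumes 0 <= n <= 16
{
    return n == 0 ? 0 : 0xFFFFFFFFFFFFFFFFULL >> (4 * (16 - n));
}
int main()
{
    for (unsigned n = 0; n <= 16; ++n)
        std::printf("%u -> %llx\n", n, (unsigned long long)repeated_f(n));
}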
where (f)16 repeats n times.
If I understood that correctly, I believe that's trivial. Add one f. Shift the number to the left by 4 bits. Add another f. Shift to the left 4 bits. Add another f. Repeat n times.
#include <stdio.h>
unsigned long long gen(unsigned n) {
unsigned long long r = 0;
while (n--) {
r <<= 4;
r |= 0xf;
}
return r;
}
int main() {
for (int i = 0; i < 16; ++i) {
printf("%d -> %llx\n", i, gen(i));
}
}
outputs:
0 -> 0
1 -> f
2 -> ff
3 -> fff
4 -> ffff
5 -> fffff
6 -> ffffff
7 -> fffffff
8 -> ffffffff
9 -> fffffffff
10 -> ffffffffff
11 -> fffffffffff
12 -> ffffffffffff
13 -> fffffffffffff
14 -> ffffffffffffff
15 -> fffffffffffffff
Since shifting by 4*n bits is problematic if n is 16 and unsigned long long is 64 bits, you can solve the problem by shifting by a smaller amount. If n is known to be positive, we can partition it into two shifts:
(1ull << 4 << 4*(n-1)) - 1u
And, since 1ull << 4 is a constant, we can replace it:
(0x10ull << 4*(n-1)) - 1u
If n can be zero, then, to support any value from 0 to 16, we cannot use the shift alone. A solution is:
n ? (0x10ull << 4*(n-1)) - 1u : 0
If you're only interested in the hex format and the digit f, use the other answers.
The function below can generate the number for both hex and decimal formats and for any digit.
#include <cstdint>
#include <iostream>
uint64_t getNum(uint64_t digit, uint64_t times, uint64_t base)
{
if (base != 10 && base != 16) return 0;
if (digit >= base) return 0;
uint64_t res = 0;
uint64_t multiply = 1;
for(uint64_t i = 0; i < times; ++i)
{
res += digit * multiply;
multiply *= base;
}
return res;
}
int main() {
std::cout << getNum(3, 7, 10) << std::endl;
std::cout << std::hex << getNum(0xa, 14, 16) << std::dec << std::endl;
return 0;
}
Output:
3333333
aaaaaaaaaaaaaa
Note: the current code has no overflow detection.
You can write a separate function that looks, for example, the following way.
#include <stdio.h>
unsigned long long create_hex( size_t n )
{
unsigned long long x = 0;
n %= 2 * sizeof( unsigned long long );
while ( n-- )
{
x = x << 4 | 0xf;
}
return x;
}
int main( void )
{
for ( size_t i = 0; i <= 16; i++ )
{
printf( "%zu -> %llx\n", i, create_hex( i ) );
}
}
The program output is
0 -> 0
1 -> f
2 -> ff
3 -> fff
4 -> ffff
5 -> fffff
6 -> ffffff
7 -> fffffff
8 -> ffffffff
9 -> fffffffff
10 -> ffffffffff
11 -> fffffffffff
12 -> ffffffffffff
13 -> fffffffffffff
14 -> ffffffffffffff
15 -> fffffffffffffff
16 -> 0
As you were initially using two language tags, C and C++: to run this program as a C++ program, substitute the header <stdio.h> with <iostream> and use operator << instead of calling printf.

C++ Extracting a character from an image using bit-wise operations

This is my first time here asking a question, so bear with me! I have a steganography lab that I am nearly done with. I have completed a program that hides a message in the lower bits of an image, but the program to extract the message is where I am stuck. The image is in a file represented as a 2D matrix, in column-major order. So here is the code where I am stuck.
void image::reveal_message()
{
int bitcount = 0;
char c;
char *msg;
while(c != '\0' || bitcount < 1128)
{
for(int z = 0; z < cols; z++)
{
for(int k = 0; k < 8; k++)
{
int i = bitcount % rows ;
int j = bitcount / rows ;
int b = c & 1;
if(img[i][j] % 2 != 0 && b == 0)
{
c = c & (~1);
}
else if(img[i][j] % 2 == 0 && b == 1)
{
c = c | 1;
}
bitcount++;
c = c << 1;
}
reverse_bits(c);
cout << c << endl;
//strncat(msg, &c, 1);
}
}
int i = 0;
for(int i = 0; i < cols; i++)
{
if(!isprint(msg[i]))
{
cout << "There is no hidden message" << endl;
}
}
cout << "This is the hidden message" << endl;
cout << msg;
}
The code is able to loop through and grab all the right values for the bits. The bits are based on whether the number in the matrix is odd or even. Where I am having trouble is actually setting the bits of the char to the bits I extracted from the matrix. I am not the best at bit-wise operations, and we are also not supposed to use any library for this. The reverse_bits function works as well, so it seems to be just my shifting and bit-wise operations that are messed up. I also commented out the strncat() line because it was producing a lot of errors due to the fact that char c is incorrect. Also, the main error I keep receiving is a segmentation fault.
My understanding from your code is that you embedded your message as 1 bit per pixel, row by row. For example, if you have a 3x10 image, with pixels
01 02 03 04 05 06 07 08 09 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
the first character of your message resides in the pixels 01-08, the second from 09 to 16, etc. After your message, you embedded an extra null character, which you can use during extraction to know when to stop. With all that in mind, you're looking for something like this.
int bitcount = 0;
int i = 0;
int j = 0;
while(bitcount < 1128)
{
// this will serve as the ordinal value for the extracted char
int b = 0;
for(int k = 0; k < 8; k++)
{
b = (b << 1) | (img[i][j] & 1);
j++;
if(j == cols)
{
i++;
j = 0;
}
}
bitcount += 8;
// do whatever you want with this, print it, store it somewhere, etc
char c = (char)b;
if(c == '\0')
{
break;
}
}
Understanding how the bit shifting works: b starts with the value 0, or 00000000 if you would like to visualise it in binary. Each time, you shift it to the left by one to make room for the new extracted bit, which you OR in. No need to check whether it's 1 or 0; it'll just work.
So, imagine you've extracted 5 bits so far, b is 00010011 and the least significant bit of the current image pixel is 1. What will happen is this
b = (b << 1) | 1 // b = 00100110 | 1 = 00100111
And thus you have extracted the 6th bit.
Now, let's say you embedded the character "a" (01100001) in the first 8 pixels.
01 02 03 04 05 06 07 08 \\ pixels
0 1 1 0 0 0 0 1 \\ least significant bit of each pixel
When you extract the bits with the above, b will equal 97 and c will give you "a". However, if you embedded your bits in the reverse order, i.e.,
01 02 03 04 05 06 07 08 \\ pixels
1 0 0 0 0 1 1 0 \\ least significant bit of each pixel
you should change the extracting algorithm to the following so you won't have to reverse the bits later on
int b = 0;
for(int k = 7; k >= 0; k--)
{
b = b | ((img[i][j] & 1) << k);
// etc
}
You start with undefined data in your char c.
You read from it here int b = c & 1;.
That is clearly nonsense.
c = c <<1; // shift before, not after
// if odd clear:
if(img[i][j] % 2)
{
c = c & (~1);
}
else // if even set:
{
c = c | 1;
}
The above may not read the data correctly, but at least it is not nonsense.
The bitwise operations look otherwise fine.
char *msg; should be std::string, and use += instead of strncat.
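A minimal sketch of that suggestion; next_char() here is a hypothetical stand-in for the 8-bit extraction loop above, not part of the original code:
#include <string>
#include <iostream>
// Pretend source of decoded characters, ending with the '\0' terminator.
static char next_char()
{
    static const char hidden[] = "hi";
    static int i = 0;
    return hidden[i++];
}
int main()
{
    std::string msg;
    for (char c = next_char(); c != '\0'; c = next_char())
        msg += c;                  // std::string grows safely; no strncat, no raw buffer
    std::cout << "This is the hidden message" << std::endl << msg << std::endl;
}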

Most efficient way of checking for shared row, column, diagonal?

In C++ if I have a square array int board[8][8] that's filled like this:
0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0
0 0 0 1 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1
0 0 0 0 1 0 0 0
0 0 0 0 0 0 1 0
0 1 0 0 0 0 0 0
What's the shortest way to check if any of the 1's share a row, column, or diagonal with another 1?
edit: I said most efficient when I really meant shortest
8 x 8 board? This must be related to chess or something.
Here's a clever way to test if any piece is hit by the queen (i.e. almost identical to whether a 1 shares a row, column or diagonal with another 1).
bool CG_queen::move(File f_to, Rank r_to, File f_from, Rank r_from)
{
bool canMakeMove = false;
//Check to see if Queen is moving only by File or only by Rank.
//aka, only vertically or horizontally.
if ( f_from == f_to || r_from == r_to )
{
canMakeMove = true;
}
//Check to see if Queen only moves diagonally.
if ( abs(f_from - f_to) == abs(r_to - r_from) )
{
canMakeMove = true;
}
return canMakeMove;
}
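One hypothetical way to apply that test to the board from the question is to collect the coordinates of every 1 and check each pair with the same row/column/diagonal conditions (anyShared is a made-up name, not part of the answer above):
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>
bool anyShared(const int board[8][8])
{
    std::vector<std::pair<int, int>> ones;               // coordinates of every 1
    for (int r = 0; r < 8; ++r)
        for (int c = 0; c < 8; ++c)
            if (board[r][c])
                ones.push_back({r, c});
    for (std::size_t a = 0; a < ones.size(); ++a)
        for (std::size_t b = a + 1; b < ones.size(); ++b) {
            int dr = ones[a].first - ones[b].first;
            int dc = ones[a].second - ones[b].second;
            if (dr == 0 || dc == 0 || std::abs(dr) == std::abs(dc))
                return true;                             // shared row, column or diagonal
        }
    return false;
}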
You can use a bitmask for the rows, columns and diagonals to indicate if there is a 1 on any of them (wrapped in a function here so the early returns have something to return from):
bool hasSharedLine(int board[8][8])
{
    int rowMask = 0;
    int columnMask = 0;
    int diagonalMask0 = 0;
    int diagonalMask1 = 0;
    for(int i = 0; i < 8; i++)
    {
        for(int j = 0; j < 8; j++)
        {
            if(board[i][j])
            {
                // test row:
                if(rowMask & (1 << i))
                    return true;
                rowMask |= 1 << i; // mark row set
                // test column:
                if(columnMask & (1 << j))
                    return true;
                columnMask |= 1 << j; // mark column set
                // test first diagonal:
                if(diagonalMask0 & (1 << (i + j)))
                    return true;
                diagonalMask0 |= 1 << (i + j); // mark diagonal set
                // test second diagonal:
                if(diagonalMask1 & (1 << (8 + i - j)))
                    return true;
                diagonalMask1 |= 1 << (8 + i - j); // mark diagonal set
            }
        }
    }
    return false;
}
If there is an element set in a particular row, the bit for that row is tested in rowMask. If it is already set then return true, otherwise set it using a bitwise OR so other elements can be tested against it. Do likewise for columns and the diagonals.
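Hypothetical usage of the hasSharedLine() function above: two 1s placed on the same diagonal are reported as a conflict.
#include <iostream>
int main()
{
    int board[8][8] = {};
    board[1][2] = 1;
    board[4][5] = 1;               // same "\" diagonal as (1,2): i - j is equal
    std::cout << (hasSharedLine(board) ? "conflict" : "no conflict") << std::endl;
}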

Calculate Nth multiset combination (with repetition) based only on index

How can I calculate the Nth combination based only on its index?
There should be (n+k-1)!/(k!(n-1)!) combinations with repetitions.
with n=2, k=5 you get:
0|{0,0,0,0,0}
1|{0,0,0,0,1}
2|{0,0,0,1,1}
3|{0,0,1,1,1}
4|{0,1,1,1,1}
5|{1,1,1,1,1}
So black_magic_function(3) should produce {0,0,1,1,1}.
This will be going into a GPU shader, so I want each work-group/thread to be able to figure out its own subset of combinations without having to store the sequence globally.
with n=3, k=5 you get:
i=0, {0,0,0,0,0}
i=1, {0,0,0,0,1}
i=2, {0,0,0,0,2}
i=3, {0,0,0,1,1}
i=4, {0,0,0,1,2}
i=5, {0,0,0,2,2}
i=6, {0,0,1,1,1}
i=7, {0,0,1,1,2}
i=8, {0,0,1,2,2}
i=9, {0,0,2,2,2}
i=10, {0,1,1,1,1}
i=11, {0,1,1,1,2}
i=12, {0,1,1,2,2}
i=13, {0,1,2,2,2}
i=14, {0,2,2,2,2}
i=15, {1,1,1,1,1}
i=16, {1,1,1,1,2}
i=17, {1,1,1,2,2}
i=18, {1,1,2,2,2}
i=19, {1,2,2,2,2}
i=20, {2,2,2,2,2}
The algorithm for generating it can be seen as MBnext_multicombination at http://www.martinbroadhurst.com/combinatorial-algorithms.html
Update:
So I thought I'd replace the binomial coefficient in Pascal's triangle with (n+k-1)!/(k!(n-1)!) to see how it looks.
(* Mathematica code to display pascal and other triangle *)
t1 = Table[Binomial[n, k], {n, 0, 8}, {k, 0, n}];
t2 = Table[(n + k - 1)!/(k! (n - 1)!), {n, 0, 8}, {k, 0, n}];
(*display*)
{Row[#, "\t"]} & /@ t1 // Grid
{Row[#, "\t"]} & /@ t2 // Grid
T1:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
T2:
Indeterminate
1 1
1 2 3
1 3 6 10
1 4 10 20 35
1 5 15 35 70 126
1 6 21 56 126 252 462
1 7 28 84 210 462 924 1716
1 8 36 120 330 792 1716 3432 6435
Comparing with the n=3,k=5 console output at the start of this post: the third diagonal {3,6,10,15,21,28,36} gives the index of each roll-over point {0,0,0,1,1} -> {0,0,1,1,1} -> {0,1,1,1,1}, etc. The diagonal to the left of it seems to show how many values are contained in the previous block (diagonal[2][i] == diagonal[3][i] - diagonal[3][i-1]). And if you read the 5th row of the pyramid horizontally you get the maximum number of combinations for increasing values of N in (n+k-1)!/(k!(n-1)!) where K=5.
There is probably a way to use this information to determine the exact combo for an arbitrary index, without enumerating the whole set, but I'm not sure if I need to go that far. The original problem was just to decompose the full combo space into equal subsets that can be generated locally and worked on in parallel by the GPU. So the triangle above gives us the starting index of every block, from which the combo can be trivially derived and all its successive elements incrementally enumerated. It also gives us the block size, and how many total combinations we have. So now it becomes a packing problem of how to fit unevenly sized blocks into groups of equal workload across X threads.
See the example at:
https://en.wikipedia.org/wiki/Combinatorial_number_system#Finding_the_k-combination_for_a_given_number
Just replace the binomial coefficient with (n+k-1)!/(k!(n-1)!).
Assuming n=3,k=5, let's say we want to calculate the 19th combination (id=19).
id=0, {0,0,0,0,0}
id=1, {0,0,0,0,1}
id=2, {0,0,0,0,2}
...
id=16, {1,1,1,1,2}
id=17, {1,1,1,2,2}
id=18, {1,1,2,2,2}
id=19, {1,2,2,2,2}
id=20, {2,2,2,2,2}
The result we're looking for is {1,2,2,2,2}.
Examining our 'T2' triangle: n=3,k=5 points to 21, being the 5th number (top to bottom) of the third diagonal (left to right).
Indeterminate
1 1
1 2 3
1 3 6 10
1 4 10 20 35
1 5 15 35 70 126
1 6 21 56 126 252 462
1 7 28 84 210 462 924 1716
1 8 36 120 330 792 1716 3432 6435
We need to find the largest number in this row (horizontally, not diagonally) that does not exceed our id=19 value. So moving left from 21 we arrive at 6 (this operation is performed by the largest function below). Since 6 is the 2nd number in this row it corresponds to n==2 (or g[2,5] == 6 from the code below).
Now that we've found the 5th number in the combination, we move up a floor in the pyramid, so k-1=4. We also subtract the 6 we encountered below from id, so id=19-6=13. Repeating the entire process we find 5 (n==2 again) to be the largest number less than 13 in this row.
Next: 13-5=8, Largest is 4 in this row (n==2 yet again).
Next: 8-4=4, Largest is 3 in this row (n==2 one more time).
Next: 4-3=1, Largest is 1 in this row (n==1)
So collecting the indices at each stage we get {1,2,2,2,2}
The following Mathematica code does the job:
g[n_, k_] := (n + k - 1)!/(k! (n - 1)!)
largest[i_, nn_, kk_] := With[
{x = g[nn, kk]},
If[x > i, largest[i, nn-1, kk], {nn,x}]
]
id2combo[id_, n_, 0] := {}
id2combo[id_, n_, k_] := Module[
{val, offset},
{val, offset} = largest[id, n, k];
Append[id2combo[id-offset, n, k-1], val]
]
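For what it's worth, an untested C++ sketch of the same procedure (g, largest and id2combo mirror the Mathematica definitions above; it assumes n and k are small enough that (n+k-1)!/(k!(n-1)!) fits in 64 bits):
#include <cstdint>
#include <iostream>
#include <utility>
#include <vector>
// Multiset coefficient (n+k-1)!/(k!(n-1)!); returns 0 for n == 0, matching the
// 1/(-1)! == 0 convention the Mathematica code relies on.
static uint64_t g(uint64_t n, uint64_t k)
{
    if (n == 0) return 0;
    uint64_t r = 1;
    for (uint64_t i = 1; i <= k; ++i)
        r = r * (n - 1 + i) / i;                 // exact at every step
    return r;
}
// Largest value v with g(v, k) <= id; returns {v, g(v, k)}.
static std::pair<uint64_t, uint64_t> largest(uint64_t id, uint64_t n, uint64_t k)
{
    uint64_t x = g(n, k);
    return x > id ? largest(id, n - 1, k) : std::make_pair(n, x);
}
static std::vector<uint64_t> id2combo(uint64_t id, uint64_t n, uint64_t k)
{
    std::vector<uint64_t> combo;
    for (; k > 0; --k) {
        std::pair<uint64_t, uint64_t> vo = largest(id, n, k);
        combo.insert(combo.begin(), vo.first);   // same order as Append on the unwound recursion
        id -= vo.second;
    }
    return combo;
}
int main()
{
    for (uint64_t v : id2combo(19, 3, 5))
        std::cout << v << ' ';                   // prints: 1 2 2 2 2
    std::cout << '\n';
}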
Update:
The order in which the combinations were being generated by MBnext_multicombination wasn't matching id2combo, so I don't think they were lexicographic. The function below generates them in the same order as id2combo and matches the order of Mathematica's Sort[] function on a list of lists.
void next_combo(unsigned int *ar, unsigned int n, unsigned int k)
{
unsigned int i, lowest_i;
for (i=lowest_i=0; i < k; ++i)
lowest_i = (ar[i] < ar[lowest_i]) ? i : lowest_i;
++ar[lowest_i];
i = (ar[lowest_i] >= n)
? 0 // 0 -> all combinations have been exhausted, reset to first combination.
: lowest_i+1; // _ -> base incremented. digits to the right of it are now zero.
for (; i<k; ++i)
ar[i] = 0;
}
Here is a combinatorial number system implementation which handles combinations with and without repetition (i.e. multiset), and can optionally produce lexicographic ordering.
/// Combinatorial number system encoder/decoder
/// https://en.wikipedia.org/wiki/Combinatorial_number_system
struct CNS(
/// Type for packed representation
P,
/// Type for one position in unpacked representation
U,
/// Number of positions in unpacked representation
size_t N,
/// Cardinality (maximum value plus one) of one position in unpacked representation
U unpackedCard,
/// Produce lexicographic ordering?
bool lexicographic,
/// Are repetitions representable? (multiset support)
bool multiset,
)
{
static:
/// Cardinality (maximum value plus one) of the packed representation
static if (multiset)
enum P packedCard = multisetCoefficient(unpackedCard, N);
else
enum P packedCard = binomialCoefficient(unpackedCard, N);
alias Index = P;
private P summand(U value, Index i)
{
static if (lexicographic)
{
value = cast(U)(unpackedCard-1 - value);
i = cast(Index)(N-1 - i);
}
static if (multiset)
value += i;
return binomialCoefficient(value, i + 1);
}
P pack(U[N] values)
{
P packed = 0;
foreach (Index i, value; values)
{
static if (!multiset)
assert(i == 0 || value > values[i-1]);
else
assert(i == 0 || value >= values[i-1]);
packed += summand(value, i);
}
static if (lexicographic)
packed = packedCard-1 - packed;
return packed;
}
U[N] unpack(P packed)
{
static if (lexicographic)
packed = packedCard-1 - packed;
void unpackOne(Index i, ref U r)
{
bool checkValue(U value, U nextValue)
{
if (summand(nextValue, i) > packed)
{
r = value;
packed -= summand(value, i);
return true;
}
return false;
}
// TODO optimize: (rolling product / binary search / precomputed tables)
// TODO optimize: don't check below N-i
static if (lexicographic)
{
foreach_reverse (U value; 0 .. unpackedCard)
if (checkValue(value, cast(U)(value - 1)))
break;
}
else
{
foreach (U value; 0 .. unpackedCard)
if (checkValue(value, cast(U)(value + 1)))
break;
}
}
U[N] values;
static if (lexicographic)
foreach (Index i, ref r; values)
unpackOne(i, r);
else
foreach_reverse (Index i, ref r; values)
unpackOne(i, r);
return values;
}
}
Full code: https://gist.github.com/CyberShadow/67da819b78c5fd16d266a1a3b4154203
I have done some preliminary analysis of the problem. Before I talk about the inefficient solution I found, let me give you a link to a paper I wrote on how to translate the k-indexes (or combination) to the rank or lexicographic index of the combinations associated with the binomial coefficient:
http://tablizingthebinomialcoeff.wordpress.com/
I started out the same way in trying to solve this problem. I came up with the following code that uses one loop for each value of k in the formula (n+k-1)!/(k!(n-1)!) when k = 5. As written, this code will generate all combinations for the case of n choose 5:
private static void GetCombos(int nElements)
{
// This code shows how to generate all the k-indexes or combinations for any number of elements when k = 5.
int k1, k2, k3, k4, k5;
int n = nElements;
int i = 0;
for (k5 = 0; k5 < n; k5++)
{
for (k4 = k5; k4 < n; k4++)
{
for (k3 = k4; k3 < n; k3++)
{
for (k2 = k3; k2 < n; k2++)
{
for (k1 = k2; k1 < n; k1++)
{
Console.WriteLine("i = " + i.ToString() + ", " + k5.ToString() + " " + k4.ToString() +
" " + k3.ToString() + " " + k2.ToString() + " " + k1.ToString() + " ");
i++;
}
}
}
}
}
}
The output from this method is:
i = 0, 0 0 0 0 0
i = 1, 0 0 0 0 1
i = 2, 0 0 0 0 2
i = 3, 0 0 0 1 1
i = 4, 0 0 0 1 2
i = 5, 0 0 0 2 2
i = 6, 0 0 1 1 1
i = 7, 0 0 1 1 2
i = 8, 0 0 1 2 2
i = 9, 0 0 2 2 2
i = 10, 0 1 1 1 1
i = 11, 0 1 1 1 2
i = 12, 0 1 1 2 2
i = 13, 0 1 2 2 2
i = 14, 0 2 2 2 2
i = 15, 1 1 1 1 1
i = 16, 1 1 1 1 2
i = 17, 1 1 1 2 2
i = 18, 1 1 2 2 2
i = 19, 1 2 2 2 2
i = 20, 2 2 2 2 2
These are the same values as you gave in your edited question. I have also tried it with 4 choose 5, and it looks like it generates the correct combinations as well.
I wrote this in C#, but you should be able to use it with other languages like C/C++, Java, or Python without too many edits.
One idea for a somewhat inefficient solution is to modify GetCombos to accept k as an input as well. Since k is limited to 6, it would then be possible to put in a test for k. So the code to generate all possible combinations for an n choose k case would then look like this:
private static void GetCombos(int k, int nElements)
{
// This code shows how to generate all the k-indexes or combinations for any n choose k, where k <= 6.
//
int k1, k2, k3, k4, k5, k6;
int n = nElements;
int i = 0;
if (k == 6)
{
for (k6 = 0; k6 < n; k6++)
{
for (k5 = k6; k5 < n; k5++)
{
for (k4 = k5; k4 < n; k4++)
{
for (k3 = k4; k3 < n; k3++)
{
for (k2 = k3; k2 < n; k2++)
{
for (k1 = k2; k1 < n; k1++)
{
Console.WriteLine("i = " + i.ToString() + ", " + k6.ToString() + " " + k5.ToString() + " " + k4.ToString() +
" " + k3.ToString() + " " + k2.ToString() + " " + k1.ToString() + " ");
i++;
}
}
}
}
}
}
}
else if (k == 5)
{
for (k5 = 0; k5 < n; k5++)
{
for (k4 = k5; k4 < n; k4++)
{
for (k3 = k4; k3 < n; k3++)
{
for (k2 = k3; k2 < n; k2++)
{
for (k1 = k2; k1 < n; k1++)
{
Console.WriteLine("i = " + i.ToString() + ", " + k5.ToString() + " " + k4.ToString() +
" " + k3.ToString() + " " + k2.ToString() + " " + k1.ToString() + " ");
i++;
}
}
}
}
}
}
else if (k == 4)
{
// One less loop than k = 5.
}
else if (k == 3)
{
// One less loop than k = 4.
}
else if (k == 2)
{
// One less loop than k = 3.
}
else
{
// k = 1 - error?
}
}
So, we now have a method that will generate all the combinations of interest. But the problem is to obtain a specific combination from the lexicographic order or rank of where that combination lies within the set. This can be accomplished with a simple count, returning the proper combination when it hits the specified value. To accommodate this, an extra parameter that represents the rank needs to be added to the method. A new function to do this looks like this:
private static int[] GetComboOfRank(int k, int nElements, int Rank)
{
// Gets the combination for the rank using the formula (n+k-1)!/k!(n-1)! where k <= 6.
int k1, k2, k3, k4, k5, k6;
int n = nElements;
int i = 0;
int[] ReturnArray = new int[k];
if (k == 6)
{
for (k6 = 0; k6 < n; k6++)
{
for (k5 = k6; k5 < n; k5++)
{
for (k4 = k5; k4 < n; k4++)
{
for (k3 = k4; k3 < n; k3++)
{
for (k2 = k3; k2 < n; k2++)
{
for (k1 = k2; k1 < n; k1++)
{
if (i == Rank)
{
ReturnArray[0] = k1;
ReturnArray[1] = k2;
ReturnArray[2] = k3;
ReturnArray[3] = k4;
ReturnArray[4] = k5;
ReturnArray[5] = k6;
return ReturnArray;
}
i++;
}
}
}
}
}
}
}
else if (k == 5)
{
for (k5 = 0; k5 < n; k5++)
{
for (k4 = k5; k4 < n; k4++)
{
for (k3 = k4; k3 < n; k3++)
{
for (k2 = k3; k2 < n; k2++)
{
for (k1 = k2; k1 < n; k1++)
{
if (i == Rank)
{
ReturnArray[0] = k1;
ReturnArray[1] = k2;
ReturnArray[2] = k3;
ReturnArray[3] = k4;
ReturnArray[4] = k5;
return ReturnArray;
}
i++;
}
}
}
}
}
}
else if (k == 4)
{
// Same code as in the other cases, but with one less loop than k = 5.
}
else if (k == 3)
{
// Same code as in the other cases, but with one less loop than k = 4.
}
else if (k == 2)
{
// Same code as in the other cases, but with one less loop than k = 3.
}
else
{
// k = 1 - error?
}
// Should not ever get here. If we do - it is some sort of error.
throw new Exception("GetComboOfRank - did not find rank");
}
ReturnArray returns the combination associated with the rank. So, this code should work for you. However, it will be much slower than what could be achieved if a table lookup was done. The problem with 300 choose 6 is that:
300 choose 6 = 305! / (6!(299!)) = 305*304*303*302*301*300 / 6! = 1,064,089,721,800
That is probably way too much data to store in memory. So, if you could get n down to 20, through preprocessing then you would be looking at a total of:
20 choose 6 = 25! / (6!(19!)) = 25*24*23*22*21*20 / 6! = 177,100
20 choose 5 = 24! / (5!(19!)) = 24*23*22*21*20 / 5! = 42,504
20 choose 4 = 23! / (4!(19!)) = 23*22*21*20 / 4! = 8,855
20 choose 3 = 22! / (3!(19!)) = 22*21*20 / 3! = 1,540
20 choose 2 = 21! / (2!(19!)) = 21*20 / 2! = 210
=======
230,209
If one byte is used for each value of the combination, then the total number of bytes used to store a table (via a jagged array or perhaps 5 separate tables) in memory could be calculated as:
177,100 * 6 = 1,062,600
42,504 * 5 = 212,520
8,855 * 4 = 35,420
1,540 * 3 = 4,620
210 * 2 = 420
=========
1,315,580
It depends on the target machine and how much memory is available, but 1,315,580 bytes is not that much memory when many machines today have gigabytes of memory available.

What is unoptimized about this code? [closed]

I wrote a solution for a question on InterviewStreet; here is the problem description:
https://www.interviewstreet.com/challenges/dashboard/#problem/4e91289c38bfd
Here is the solution they have given:
https://gist.github.com/1285119
Here is the solution that I coded:
#include<iostream>
#include <string.h>
using namespace std;
#define LOOKUPTABLESIZE 10000000
int popCount[2*LOOKUPTABLESIZE];
int main()
{
int numberOfTests = 0;
cin >> numberOfTests;
for(int test = 0;test<numberOfTests;test++)
{
int startingNumber = 0;
int endingNumber = 0;
cin >> startingNumber >> endingNumber;
int numberOf1s = 0;
for(int number=startingNumber;number<=endingNumber;number++)
{
if(number >-LOOKUPTABLESIZE && number < LOOKUPTABLESIZE)
{
if(popCount[number+LOOKUPTABLESIZE] != 0)
{
numberOf1s += popCount[number+LOOKUPTABLESIZE];
}
else
{
popCount[number+LOOKUPTABLESIZE] =__builtin_popcount (number);
numberOf1s += popCount[number+LOOKUPTABLESIZE];
}
}
else
{
numberOf1s += __builtin_popcount (number);
}
}
cout << numberOf1s << endl;
}
}
Can you please point out what is wrong with my code? It only passes 3/10 of the tests. The time limit is 3 seconds.
What is unoptimized about this code?
The algorithm. You are looping
for(int number=startingNumber;number<=endingNumber;number++)
computing or looking up the number of 1-bits in each. That can take a while.
A good algorithm counts the number of 1-bits in all numbers 0 <= k < n in O(log n) time using a bit of math.
Here is an implementation counting 0s in decimal expansions, the modification to make it count 1-bits shouldn't be hard.
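As a hedged sketch of that per-position idea applied to 1-bits (this is not the linked code; ones_below is a made-up name), bit i repeats with period 2^(i+1), so full periods and the partial tail can be counted directly:
#include <cstdint>
// Total number of 1-bits over all integers in [0, n), assuming n < 2^63.
uint64_t ones_below(uint64_t n)
{
    uint64_t total = 0;
    for (int i = 0; i < 63 && (1ULL << i) < n; ++i) {
        uint64_t half   = 1ULL << i;         // length of the "bit i is 0" half of a period
        uint64_t period = half * 2;          // bit i repeats with this period
        total += (n / period) * half;        // each full period contributes 'half' ones
        uint64_t rest = n % period;
        if (rest > half)
            total += rest - half;            // partial period past the zero half
    }
    return total;
}
// e.g. ones_below(14) == 25: the numbers 0..13 contain 25 one-bits in total.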
When looking at such a question, you need to break it down into simple pieces.
For example, suppose that you know how many 1s there are in all numbers [0, N] (let's call this ones(N)), then we have:
size_t ones(size_t N) { /* magic ! */ }
size_t count(size_t A, size_t B) {
return ones(B) - (A ? ones(A - 1) : 0);
}
This approach has the advantage that ones is probably simpler to program than count, for example using recursion. As such, a first naive attempt would be:
// Naive
size_t naive_ones(size_t N) {
if (N == 0) { return 0; }
return __builtin_popcount(N) + naive_ones(N-1);
}
But this is likely to be too slow. Even when simply computing the value of count(A, B) we will be computing naive_ones(A-1) twice!
Fortunately, there is always memoization to assist here, and the transformation is quite trivial:
size_t memo_ones(size_t N) {
static std::deque<size_t> Memo(1, 0);
for (size_t i = Memo.size(); i <= N; ++i) {
Memo.push_back(Memo[i-1] + __builtin_popcount(i));
}
return Memo[N];
}
It's likely that this helps; however, the cost in terms of memory might be... crippling. Ugh. Imagine that for computing ones(1,000,000) we will occupy 8MB of memory on a 64-bit computer! A sparser memoization could help (for example, only memoizing every 8th or 16th count):
// count number of ones in (A, B]
static size_t unoptimized_count(size_t A, size_t B) {
size_t result = 0;
for (size_t i = A + 1; i <= B; ++i) {
result += __builtin_popcount(i);
}
return result;
}
// something like this... be wary it's not tested.
size_t memo16_ones(size_t N) {
static std::vector<size_t> Memo(1, 0);
size_t const n16 = N - (N % 16);
for (size_t i = Memo.size(); i*16 <= n16; ++i) {
Memo.push_back(Memo[i-1] + unoptimized_count(16*(i-1), 16*i));
}
return Memo[n16/16] + unoptimized_count(n16, N);
}
However, while it does reduce the memory cost, it does not solve the main speed issue: we must at least use __builtin_popcount B times! And for large values of B this is a killer.
The above solutions are mechanical; they did not require one ounce of thought. It turns out that interviews are not so much about writing code as they are about thinking.
Can we solve this problem more efficiently than dumbly enumerating all integers up to B?
Let's see what our brain (quite the amazing pattern machine) picks up when considering the first few entries:
N bin 1s ones(N)
0 0000 0 0
1 0001 1 1
2 0010 1 2
3 0011 2 4
4 0100 1 5
5 0101 2 7
6 0110 2 9
7 0111 3 12
8 1000 1 13
9 1001 2 15
10 1010 2 17
11 1011 3 20
12 1100 2 22
13 1101 3 25
14 1110 3 28
15 1111 4 32
Notice a pattern? I do ;) The range 8-15 is built exactly like 0-7 but with one more 1 per line => it's like a transposition. And it's quite logical too, isn't it?
Therefore, ones(15) - ones(7) = 8 + ones(7), ones(7) - ones(3) = 4 + ones(3) and ones(1) - ones(0) = 1 + ones(0).
Well, let's make this a formula:
Reminder: ones(N) = popcount(N) + ones(N-1) (almost) by definition
We now know that ones(2**n - 1) - ones(2**(n-1) - 1) = 2**(n-1) + ones(2**(n-1) - 1)
Let's isolate ones(2**n), it's easier to deal with; note that popcount(2**n) = 1:
regroup: ones(2**n - 1) = 2**(n-1) + 2*ones(2**(n-1) - 1)
use the definition: ones(2**n) - 1 = 2**(n-1) + 2*ones(2**(n-1)) - 2
simplify: ones(2**n) = 2**(n-1) - 1 + 2*ones(2**(n-1)), with ones(1) = 1.
Quick sanity check:
1 = 2**0 => 1 (bottom)
2 = 2**1 => 2 = 2**0 - 1 + 2 * ones(1)
4 = 2**2 => 5 = 2**1 - 1 + 2 * ones(2)
8 = 2**3 => 13 = 2**2 - 1 + 2 * ones(4)
16 = 2**4 => 33 = 2**3 - 1 + 2 * ones(8)
Looks like it works!
We are not quite done though. A and B might not necessarily be powers of 2, and if we have to count all the way from 2**n to 2**n + 2**(n-1) that's still O(N)!
On the other hand, if we manage to express a number in base 2, then we should be able to leverage our newly acquired formula. The main advantage is that there are only log2(N) bits in the representation.
Let's pick an example and understand how it works: 13 = 8 + 4 + 1
1 -> 0001
4 -> 0100
8 -> 1000
13 -> 1101
... however, the count is not merely the sum:
ones(13) != ones(8) + ones(4) + ones(1)
Let's express it in terms of the "transposition" strategy instead:
ones(13) - ones(8) = ones(5) + (13 - 8)
ones(5) - ones(4) = ones(1) + (5 - 4)
Okay, easy to do with a bit of recursion.
#include <cmath>
#include <iostream>
static double const Log2 = log(2);
// store ones(2**n) at P2Count[n]
static size_t P2Count[64] = {};
// Unfortunately, the conversion to double might lose some precision
// static size_t log2(size_t n) { return log(double(n - 1))/Log2 + 1; }
// __builtin_clz* returns the number of leading 0s
static size_t log2(size_t n) {
    // ceiling log2: smallest e with (1 << e) >= n, so that np2 below really is the "next" power of 2
    if (n <= 1) { return 0; }
    return sizeof(n) * 8 - __builtin_clzl(n - 1);
}
static size_t ones(size_t n) {
if (n == 0) { return 0; }
if (n == 1) { return 1; }
size_t const lg2 = log2(n);
size_t const np2 = 1ul << lg2; // "next" power of 2
if (np2 == n) { return P2Count[lg2]; }
size_t const pp2 = np2 / 2; // "previous" power of 2
return ones(pp2) + ones(n - pp2) + (n - pp2);
} // ones
// reminder: ones(2**n) = 2**(n-1) - 1 + 2*ones(2**(n-1))
void initP2Count() {
P2Count[0] = 1;
for (size_t i = 1; i != 64; ++i) {
P2Count[i] = (1ul << (i-1)) - 1 + 2 * P2Count[i-1];
}
} // initP2Count
size_t count(size_t const A, size_t const B) {
if (A == 0) { return ones(B); }
return ones(B) - ones(A - 1);
} // count
And a demonstration:
int main() {
// Init table
initP2Count();
std::cout << "0: " << P2Count[0] << ", 1: " << P2Count[1] << ", 2: " << P2Count[2] << ", 3: " << P2Count[3] << "\n";
for (size_t i = 0; i != 16; ++i) {
std::cout << i << ": " << ones(i) << "\n";
}
std::cout << "count(7, 14): " << count(7, 14) << "\n";
}
Victory!
Note: as Daniel Fisher noted, this fails to account for negative numbers (but assuming two's complement, it can be inferred from their positive counts).