How to read a binary file to calculate the frequencies for a Huffman tree? - c++

I have to calculate the frequencies for a Huffman tree from a "binary file" given as the sole argument. My doubt is that binary files are files which contain only "0" and "1".
The frequency is the number of repetitions of each symbol (e.g. for abbacdd the frequency of a=2, b=2, c=1, d=2).
And my structure must be like this:
struct Node
{
    unsigned char symbol;      /* the symbol (character) */
    int freq;                  /* its frequency */
    struct Node *left, *right; /* left and right children */
};
But I do not understand at all how I can get the symbols from the ".bin" file (which, I thought, consists of only "0" and "1").
When I try to look at the contents of the file I get:
hp#ubuntu:~/Desktop/Internship_Xav/Huf_pointer$ xxd -b out.bin
0000000: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000006: 00000000 00000000 00000000 00000000 00000000 00000000 ......
000000c: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000012: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000018: 00000000 00000000 00000000 00000000 00000000 00000000 ......
000001e: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000024: 00000000 00000000 00000000 00000000 00000000 00000000 ......
000002a: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000030: 00000000 00000000 00000000 00000000 00000000 00000000 ......
.........//Here also there is similar kind of data ................
00008ca: 00010011 00010011 00010011 00010011 00010011 00010011 ......
00008d0: 00010011 00010011 00010011 00010011 00010011 00010011 ......
00008d6: 00010011 00010011 00010011 00010011 00010011 00010011 .....
So I don't understand at all where the frequencies and the symbols are, how to store the symbols, or how to calculate the frequencies. Once I have the frequencies and symbols I will build the Huffman tree from them.

First, you need to create some sort of frequency table.
You could use a std::map.
You would do something like this:
#include <algorithm>
#include <fstream>
#include <map>
#include <string>

std::map <unsigned char, int> CreateFrequencyTable (const std::string &strFile)
{
    std::map <unsigned char, int> char_freqs ;          // character frequencies
    std::ifstream file (strFile, std::ios::binary) ;    // open as raw bytes
    int next = 0 ;
    while ((next = file.get ()) != EOF) {
        unsigned char uc = static_cast <unsigned char> (next) ;
        std::map <unsigned char, int>::iterator iter = char_freqs.find (uc) ;
        // This character is already in our map.
        if (iter != char_freqs.end ()) {
            iter->second += 1 ;
        }
        // This character is not in our map yet.
        else {
            char_freqs [uc] = 1 ;
        }
    }
    return char_freqs ;
}
Then you could use this function like this:
std::map <unsigned char, int> char_freqs = CreateFrequencyTable ("file") ;
You can obtain the element with the highest frequency like this:
std::map <unsigned char, int>::iterator iter = std::max_element (
    char_freqs.begin (),
    char_freqs.end (),
    [] (const std::pair <const unsigned char, int> &a,
        const std::pair <const unsigned char, int> &b)
    { return a.second < b.second ; }) ;   // compare by frequency, not by key
Then you would need to build your Huffman tree.
Remember that the characters are all leaf nodes, so you need a way to differentiate the leaf nodes from the non-leaf nodes.
Update
If reading a single character from the file is too slow, you could always load all of the contents into a vector like this:
// Make sure to #include <iterator> and #include <vector>
std::ifstream file ("test.txt", std::ios::binary) ;
std::vector <unsigned char> vecBuffer ((std::istreambuf_iterator <char> (file)),
                                       std::istreambuf_iterator <char> ()) ;
You would still need to create a frequency table.
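For example, the tally over that buffer could look like this (same map type as before):
std::map <unsigned char, int> char_freqs ;
for (unsigned char uc : vecBuffer)
    ++char_freqs [uc] ;   // operator[] starts a missing count at 0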

A symbol in a Huffman tree could be anything, but as you have to use an unsigned char per symbol, you should probably take a byte. So no, not only 0 or 1, but eight 0/1 digits together, like the 00010011 somewhere in your xxd output: xxd -b simply prints eight 0/1 characters per byte. It could just as well print a number between 0 and 255, or two hex digits out of 0123456789abcdef. There are lots of possibilities for how to show a byte on the screen, but that does not matter at all.
If you know how to read the content of a file in C/C++, just read unsigned char values until the file ends and count how often each value occurs. That's all. An unsigned char has 256 different values (0, 1, 2 ... 255), so you will need 256 integers (in an array, or in your Node structs...)
to count how often each value appears.
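A minimal sketch of that counting, assuming the file name arrives as the program's first command-line argument:
#include <cstdio>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    std::FILE *f = std::fopen(argv[1], "rb");   // open the file as raw bytes
    if (!f) return 1;

    long freq[256] = {0};                       // one counter per possible byte value
    int c;
    while ((c = std::fgetc(f)) != EOF)
        freq[(unsigned char)c]++;               // count this byte value
    std::fclose(f);

    for (int i = 0; i < 256; i++)
        if (freq[i])
            std::printf("byte %3d occurs %ld times\n", i, freq[i]);
    return 0;
}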

Cast from double to size_t yields wrong result?

The following code works.
My question is: should 2) not lead to a result very close to 1)?
Why is 2) cast to such a small value?
It may be worth noting that 2) is exactly half of 1):
std::cout << "1) " << std::pow(2, 8 * sizeof(size_t)) << std::endl;
std::cout << "2) " << static_cast<size_t>(std::pow(2, 8 * sizeof(size_t))) << std::endl;
The output is:
18446744073709551616
9223372036854775808
It is due to that part of the specification:
7.3.10 Floating-integral conversions [conv.fpint]
A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.
The value 18446744073709551616 (that is the truncated value) is larger than std::numeric_limits<size_t>::max() on your system, and due to that, the behavior of that cast is undefined.
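If you need a defined result, one option is to range-check the double before converting; a minimal sketch, assuming a 64-bit size_t:
#include <cmath>
#include <cstddef>
#include <iostream>
#include <limits>

int main() {
    double d = std::pow(2, 8 * sizeof(std::size_t));   // 2^64 as a double
    // 2^64 is exactly representable as a double, so we can compare against it:
    // the cast is only defined for truncated values strictly below it.
    const double limit = std::ldexp(1.0, std::numeric_limits<std::size_t>::digits);
    if (d >= 0.0 && d < limit)
        std::cout << static_cast<std::size_t>(d) << '\n';
    else
        std::cout << "does not fit into size_t\n";      // this branch is taken here
}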
If we want to calculate the number of different values a certain unsigned integral datatype
can represent, we can calculate
std::cout << "1) " << std::pow(2, 8 * sizeof(size_t)) << std::endl; // yields 18446744073709551616
This calculates 2 to the power of 64 and yields 18446744073709551616.
Since sizeof(size_t) is 8 bytes on a 64-bit machine, and a byte has 8 bits, the width of the size_t data type is 64 bits, hence 2^64.
This is no surprise, since size_t on a system usually has the width of the underlying machine word,
so that an address or an index into an array or vector can be handled in one register.
The above number is the count of all different values that can be represented by an
unsigned integral datatype of 64 bits, like size_t or unsigned long long,
including 0 as one possibility.
And since it does include 0, the highest representable value is exactly one less,
namely 18446744073709551615.
This number can also be retrieved by
std::cout << std::numeric_limits<size_t>::max() << std::endl; // yields 18446744073709551615
std::cout << std::numeric_limits<unsigned long long>::max() << std::endl; // yields the same
Now an unsigned datatype stores its values like
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 is 0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 is 1 or 2^0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000010 is 2 or 2^1
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000011 is 3 or 2^1+2^0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000100 is 4 or 2^2
...
11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 is 18446744073709551615
and if you want to add another 1, you would need a 65th bit on the left, which you don't have:
1 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 is 0 because
there is no more bit on the left.
Any amount higher than the highest representable value comes down to
amount modulo (the largest possible value + 1), i.e. amount % (max + 1),
which, as we can see, leads to zero in the above sample.
And since this comes so naturally, the standard defines that if you convert any
integral datatype, signed or unsigned, to another unsigned integral datatype, it is converted
modulo the largest possible value + 1. Beautiful.
But this easy rule holds a little surprise for us when we wish to convert a negative integral to an
unsigned integral, like -1 to unsigned long long, for example. You start with a 0 value and then
subtract 1. What happens is the opposite of the sequence in the above sample. Have a look:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 is 0 and now do -1
11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 is 18446744073709551615
So yes, converting -1 to size_t leads to std::numeric_limits<size_t>::max(). Quite unbelievable
at first but understandable after some thinking and playing around with it.
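A quick way to convince yourself:
#include <cstddef>
#include <iostream>
#include <limits>

int main() {
    std::size_t s = static_cast<std::size_t>(-1);   // well defined: wraps modulo 2^64
    std::cout << std::boolalpha
              << (s == std::numeric_limits<std::size_t>::max()) << '\n';   // prints true
}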
Now for our second line of code
std::cout << "2) " << static_cast<size_t>(std::pow(2, 8 * sizeof(size_t))) << std::endl;
we would naively expect 18446744073709551616, the same result as in line one, of course.
But since we now know about modulo (the largest + 1), and we know that the largest plus one
wraps to 0, we might also, again naively, accept 0 as an answer.
Why naively? Because std::pow returns a double and not an integral datatype.
The double datatype is also 64 bits, but internally its representation is entirely different:
0XXXXXXX XXXX0000 00000000 00000000 00000000 00000000 00000000 00000000
Only the 11 X bits represent the exponent, in 2^n form. That means only those 11 bits have to encode 64,
and the double will represent 2^64 * 1. So the representation of our big number is much more compact
in a double than in a size_t. If someone wanted to apply the modulo (largest plus 1) rule, some conversion
would first be needed to change the representation of 2^64 into a 64-bit pattern.
Some further reading about floating point representation can be found at
https://learn.microsoft.com/en-us/cpp/build/ieee-floating-point-representation?view=msvc-160
for example.
And the standard says that if you convert a floating-point value
to an integral type which cannot represent the truncated value, the result is UB, undefined behaviour.
See the C++17 Standard ISO/IEC14882:
7.10 Floating-integral conversions [conv.fpint]
A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates ;
that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented
in the destination type. ...
So a double can easily hold 2^64, and that's the reason why line 1 could print it out so easily. But it is 1
too much to be representable in size_t, so the result is UB.
So whatever the outcome of our line 2 is, it is simply irrelevant, because it is UB.
Ok, but if any random result will do, how come the UB outcome is exactly half?
Well, first of all, the outcome is from MSVC; Clang or other compilers may deliver some other result.
But let's look at the "half" outcome, since it is easy to follow.
Trying to add 1 to the largest value
11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 is 18446744073709551615
would, if only integrals were involved, lead to
1 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
but that's not possible since the bit does not exist, and we have a double, not an integral, so the
conversion is UB; it just so happens that the result is
10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 which is 9223372036854775808
so exactly half of the naively expected value, i.e. 2^63.

AVX intrinsics for tiled matrix multiplication [closed]

I was trying to use AVX512 intrinsics to vectorize my matrix multiplication loop (tiled). I used __m256d variables to store intermediate results and then store them into my result matrix. However, somehow this triggers memory corruption. I've got no hint why this is the case, as the non-AVX version works fine. Also, another weird thing is that the tile size somehow affects the result now.
The matrix structs are attached in the following code section. The function takes two matrix pointers, m1 and m2, and an integer tileSize. Thanks to @harold's feedback, I've now replaced the _mm256_load_pd for matrix m1 with a broadcast. However, the memory corruption problem still persists. I've also attached the output of the memory corruption below.
__m256d rResult, rm1, rm2, rmult;
for (int bi = 0; bi < result->row; bi += tileSize) {
    for (int bj = 0; bj < result->col; bj += tileSize) {
        for (int bk = 0; bk < m1->col; bk += tileSize) {
            for (int i = 0; i < tileSize; i++) {
                for (int j = 0; j < tileSize; j += 4) {
                    rResult = _mm256_setzero_pd();
                    for (int k = 0; k < tileSize; k++) {
                        // result->val[bi+i][bj+j] += m1.val[bi+i][bk+k] * m2.val[bk+k][bj+j];
                        rm1 = _mm256_broadcast_pd((__m128d const *) &m1->val[bi+i][bk+k]);
                        rm2 = _mm256_load_pd(&m2->val[bk+k][bj+j]);
                        rmult = _mm256_mul_pd(rm1, rm2);
                        rResult = _mm256_add_pd(rResult, rmult);
                        _mm256_store_pd(&result->val[bi+i][bj+j], rResult);
                    }
                }
            }
        }
    }
}
return result;
*** Error in `./matrix': free(): invalid next size (fast): 0x0000000001880910 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81609)[0x2b04a26d0609]
./matrix[0x4016cc]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b04a2671495]
./matrix[0x400e29]
======= Memory map: ========
00400000-0040c000 r-xp 00000000 00:2c 6981358608 /home/matrix
0060b000-0060c000 r--p 0000b000 00:2c 6981358608 /home/matrix
0060c000-0060d000 rw-p 0000c000 00:2c 6981358608 /home/matrix
01880000-018a1000 rw-p 00000000 00:00 0 [heap]
2b04a1f13000-2b04a1f35000 r-xp 00000000 00:16 12900 /usr/lib64/ld-2.17.so
2b04a1f35000-2b04a1f3a000 rw-p 00000000 00:00 0
2b04a1f4e000-2b04a1f52000 rw-p 00000000 00:00 0
2b04a2134000-2b04a2135000 r--p 00021000 00:16 12900 /usr/lib64/ld-2.17.so
2b04a2135000-2b04a2136000 rw-p 00022000 00:16 12900 /usr/lib64/ld-2.17.so
2b04a2136000-2b04a2137000 rw-p 00000000 00:00 0
2b04a2137000-2b04a2238000 r-xp 00000000 00:16 13188 /usr/lib64/libm-2.17.so
2b04a2238000-2b04a2437000 ---p 00101000 00:16 13188 /usr/lib64/libm-2.17.so
2b04a2437000-2b04a2438000 r--p 00100000 00:16 13188 /usr/lib64/libm-2.17.so
2b04a2438000-2b04a2439000 rw-p 00101000 00:16 13188 /usr/lib64/libm-2.17.so
2b04a2439000-2b04a244e000 r-xp 00000000 00:16 12867 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
2b04a244e000-2b04a264d000 ---p 00015000 00:16 12867 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
2b04a264d000-2b04a264e000 r--p 00014000 00:16 12867 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
2b04a264e000-2b04a264f000 rw-p 00015000 00:16 12867 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
2b04a264f000-2b04a2811000 r-xp 00000000 00:16 13172 /usr/lib64/libc-2.17.so
2b04a2811000-2b04a2a11000 ---p 001c2000 00:16 13172 /usr/lib64/libc-2.17.so
2b04a2a11000-2b04a2a15000 r--p 001c2000 00:16 13172 /usr/lib64/libc-2.17.so
2b04a2a15000-2b04a2a17000 rw-p 001c6000 00:16 13172 /usr/lib64/libc-2.17.so
2b04a2a17000-2b04a2a1c000 rw-p 00000000 00:00 0
2b04a2a1c000-2b04a2a1e000 r-xp 00000000 00:16 13184 /usr/lib64/libdl-2.17.so
2b04a2a1e000-2b04a2c1e000 ---p 00002000 00:16 13184 /usr/lib64/libdl-2.17.so
2b04a2c1e000-2b04a2c1f000 r--p 00002000 00:16 13184 /usr/lib64/libdl-2.17.so
2b04a2c1f000-2b04a2c20000 rw-p 00003000 00:16 13184 /usr/lib64/libdl-2.17.so
2b04a4000000-2b04a4021000 rw-p 00000000 00:00 0
2b04a4021000-2b04a8000000 ---p 00000000 00:00 0
7ffc8448e000-7ffc844b1000 rw-p 00000000 00:00 0 [stack]
7ffc845ed000-7ffc845ef000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted
That code loads a small row vector from m1 and a small row vector from m2 and multiplies them, which is not how matrix multiplication works; I assume it's a direct vectorization of the equivalent scalar loop. You can use a broadcast-load from m1; that way the product with the row vector from m2 results in a row vector of the result, which is convenient (the other way around, broadcasting from m2, you get a column vector of the result, which is tricky to store - unless of course you use a column-major matrix layout).
Never resetting rResult is also wrong, and takes extra care when using tiling, because the tiling means that individual results are put aside and then picked up again later. It's convenient to implement C += A*B because then you don't have to distinguish between the second time that a result is worked on (loading rResult back out of the result matrix) and the first time that a result is worked on (either zeroing the accumulator, or if you implement C += A*B, then it's also just loading it out of the result).
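For illustration, here is a minimal sketch of that C += A*B structure (Matrix and matmul_tiled are made-up names standing in for the question's structs; it assumes a per-row pointer table, dimensions that are multiples of tileSize, tileSize a multiple of 4, a zero-initialized result, and FMA support; unaligned loads/stores avoid any alignment assumption):
#include <immintrin.h>

// Hypothetical layout matching the question's usage; the real struct is not shown.
struct Matrix { int row, col; double **val; };

static void matmul_tiled(Matrix *result, const Matrix *m1, const Matrix *m2, int tileSize)
{
    for (int bi = 0; bi < result->row; bi += tileSize)
        for (int bj = 0; bj < result->col; bj += tileSize)
            for (int bk = 0; bk < m1->col; bk += tileSize)
                for (int i = 0; i < tileSize; i++)
                    for (int j = 0; j < tileSize; j += 4) {
                        // C += A*B: pick the partial result back up instead of zeroing it
                        __m256d acc = _mm256_loadu_pd(&result->val[bi + i][bj + j]);
                        for (int k = 0; k < tileSize; k++) {
                            __m256d a = _mm256_broadcast_sd(&m1->val[bi + i][bk + k]); // one element of m1, splatted
                            __m256d b = _mm256_loadu_pd(&m2->val[bk + k][bj + j]);     // four elements of a row of m2
                            acc = _mm256_fmadd_pd(a, b, acc);
                        }
                        _mm256_storeu_pd(&result->val[bi + i][bj + j], acc);           // one store per j-vector
                    }
}
The accumulator still needs to be split into several independent registers, as described below, before this gets anywhere near peak throughput.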
There are some performance bugs:
Using one accumulator. This limits the inner loop to run once every 4 cycles (Skylake) in the long term, because of the loop-carried dependency through the addition (or FMA). There should be 2 FMAs per cycle but that way there would be one FMA every 4 cycles, 1/8th speed.
Using a 2:1 load-to-FMA ratio (assuming the mul+add is contracted), it needs to be 1:1 or better to avoid getting bottlenecked by load throughput. A 2:1 ratio is limited to half speed.
The solution for both of them is multiplying a small column vector from m1 with a small row vector from m2 in the inner loop, summing into a small matrix of accumulators rather than just one of them. For example if you use a 3x16 region (3x4 vectors, with a vector length of 4 and the vectors corresponding to loads from m2, from m1 you would do broadcast-loads), then there are 12 accumulators, and therefore 12 independent dependency chains: enough to hide the high latency-throughput product of FMA (2 per cycle, but 4 cycles long on Skylake, so you need at least 8 independent chains, and at least 10 on Haswell). It also means there are 7 loads and 12 FMAs in the inner loop, even better than 1:1, it can even support Turbo frequencies without overclocking the cache.
I would also like to note that setting the tile size the same in every dimension is not necessarily the best. Maybe it is, but probably not; the dimensions do act a little differently.
A more advanced performance issue:
Not re-packing tiles. This means tiles will span more pages than necessary, which hurts the effectiveness of the TLB. You can easily get into a situation where your tiles fit in the cache, but are spread over too many pages to fit in the TLB. TLB thrashing is not good.
Using asymmetric tile sizes you can arrange for either m1 tiles or m2 tiles to be TLB-friendly, but not both at the same time.
If you care about performance, normally you want one contiguous chunk of memory, not an array of pointers to rows.
Anyway, you're probably reading off the end of a row if your tile size isn't a multiple of 4 doubles per vector. Or if your rows or cols aren't a multiple of the tile size, then you need to stop after the last full tile, and write cleanup code for the end.
e.g. bi < result->row - (tileSize-1) for the outer loops
If your tile size isn't a multiple of 4, then you'd also need i < tileSize-3. But hopefully you are doing power-of-2 loop tiling / cache blocking. But you'd want a size - 3 boundary for vector cleanup in a partial tile. Then probably scalar cleanup for the last few elements. (Or if you can use an unaligned final vector that ends at the end of a row, that can work, maybe with masked loads/stores. But trickier for matmul than for algorithms that just make a single pass.)

Bit expand byte array

I have a situation where I need to bit-expand a dynamically sized byte array by a factor of 3.
Example:
10101010 11001100
to
11100011 10001110 00111000 11111100 00001111 11000000
I've used the algorithm here to generate a lookup table.
https://stackoverflow.com/a/9044057/280980
static const uint32_t bitExpandTable[256] = {
0x000000, 0x000007, 0x000038, 0x00003f, 0x0001c0, 0x0001c7, 0x0001f8, 0x0001ff,
0x000e00, 0x000e07, 0x000e38, 0x000e3f, 0x000fc0, 0x000fc7, 0x000ff8, 0x000fff,
0x007000, 0x007007, 0x007038, 0x00703f, 0x0071c0, 0x0071c7, 0x0071f8, 0x0071ff,
0x007e00, 0x007e07, 0x007e38, 0x007e3f, 0x007fc0, 0x007fc7, 0x007ff8, 0x007fff,
0x038000, 0x038007, 0x038038, 0x03803f, 0x0381c0, 0x0381c7, 0x0381f8, 0x0381ff,
0x038e00, 0x038e07, 0x038e38, 0x038e3f, 0x038fc0, 0x038fc7, 0x038ff8, 0x038fff,
0x03f000, 0x03f007, 0x03f038, 0x03f03f, 0x03f1c0, 0x03f1c7, 0x03f1f8, 0x03f1ff,
0x03fe00, 0x03fe07, 0x03fe38, 0x03fe3f, 0x03ffc0, 0x03ffc7, 0x03fff8, 0x03ffff,
0x1c0000, 0x1c0007, 0x1c0038, 0x1c003f, 0x1c01c0, 0x1c01c7, 0x1c01f8, 0x1c01ff,
0x1c0e00, 0x1c0e07, 0x1c0e38, 0x1c0e3f, 0x1c0fc0, 0x1c0fc7, 0x1c0ff8, 0x1c0fff,
0x1c7000, 0x1c7007, 0x1c7038, 0x1c703f, 0x1c71c0, 0x1c71c7, 0x1c71f8, 0x1c71ff,
0x1c7e00, 0x1c7e07, 0x1c7e38, 0x1c7e3f, 0x1c7fc0, 0x1c7fc7, 0x1c7ff8, 0x1c7fff,
0x1f8000, 0x1f8007, 0x1f8038, 0x1f803f, 0x1f81c0, 0x1f81c7, 0x1f81f8, 0x1f81ff,
0x1f8e00, 0x1f8e07, 0x1f8e38, 0x1f8e3f, 0x1f8fc0, 0x1f8fc7, 0x1f8ff8, 0x1f8fff,
0x1ff000, 0x1ff007, 0x1ff038, 0x1ff03f, 0x1ff1c0, 0x1ff1c7, 0x1ff1f8, 0x1ff1ff,
0x1ffe00, 0x1ffe07, 0x1ffe38, 0x1ffe3f, 0x1fffc0, 0x1fffc7, 0x1ffff8, 0x1fffff,
0xe00000, 0xe00007, 0xe00038, 0xe0003f, 0xe001c0, 0xe001c7, 0xe001f8, 0xe001ff,
0xe00e00, 0xe00e07, 0xe00e38, 0xe00e3f, 0xe00fc0, 0xe00fc7, 0xe00ff8, 0xe00fff,
0xe07000, 0xe07007, 0xe07038, 0xe0703f, 0xe071c0, 0xe071c7, 0xe071f8, 0xe071ff,
0xe07e00, 0xe07e07, 0xe07e38, 0xe07e3f, 0xe07fc0, 0xe07fc7, 0xe07ff8, 0xe07fff,
0xe38000, 0xe38007, 0xe38038, 0xe3803f, 0xe381c0, 0xe381c7, 0xe381f8, 0xe381ff,
0xe38e00, 0xe38e07, 0xe38e38, 0xe38e3f, 0xe38fc0, 0xe38fc7, 0xe38ff8, 0xe38fff,
0xe3f000, 0xe3f007, 0xe3f038, 0xe3f03f, 0xe3f1c0, 0xe3f1c7, 0xe3f1f8, 0xe3f1ff,
0xe3fe00, 0xe3fe07, 0xe3fe38, 0xe3fe3f, 0xe3ffc0, 0xe3ffc7, 0xe3fff8, 0xe3ffff,
0xfc0000, 0xfc0007, 0xfc0038, 0xfc003f, 0xfc01c0, 0xfc01c7, 0xfc01f8, 0xfc01ff,
0xfc0e00, 0xfc0e07, 0xfc0e38, 0xfc0e3f, 0xfc0fc0, 0xfc0fc7, 0xfc0ff8, 0xfc0fff,
0xfc7000, 0xfc7007, 0xfc7038, 0xfc703f, 0xfc71c0, 0xfc71c7, 0xfc71f8, 0xfc71ff,
0xfc7e00, 0xfc7e07, 0xfc7e38, 0xfc7e3f, 0xfc7fc0, 0xfc7fc7, 0xfc7ff8, 0xfc7fff,
0xff8000, 0xff8007, 0xff8038, 0xff803f, 0xff81c0, 0xff81c7, 0xff81f8, 0xff81ff,
0xff8e00, 0xff8e07, 0xff8e38, 0xff8e3f, 0xff8fc0, 0xff8fc7, 0xff8ff8, 0xff8fff,
0xfff000, 0xfff007, 0xfff038, 0xfff03f, 0xfff1c0, 0xfff1c7, 0xfff1f8, 0xfff1ff,
0xfffe00, 0xfffe07, 0xfffe38, 0xfffe3f, 0xffffc0, 0xffffc7, 0xfffff8, 0xffffff,
};
I've tried looping through the byte array, using the LUT, and memcpy'ing the first 3 bytes of each looked-up value into the new array. However, my output never looks correct.
Does anyone have any suggestions on how to efficiently implement this? This will be running on an embedded ARM processor.
Edit
LUT test
uint8_t msg[] = { 0xaa, 0x02, 0x43, 0x5a, 0x8d, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xd0, 0x84, 0xc6, 0x2d, 0x00, 0xb9 };
uint8_t expanded_msg[63] = { 0 };
uint8_t tmp_val = 0;
uint32_t lut_val;

for (int i = 0; i < 21; i++)
{
    tmp_val = *(uint8_t*)(&msg + i);
    lut_val = bitExpandTable[tmp_val];
    memcpy(&expanded_msg[(i * 3)], &lut_val, 3);
}

print_binary(&msg, 21);
print_binary(&expanded_msg, sizeof(expanded_msg));
Output
[ 10101010 00000010 01000011 01011010 10001101 00000110 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 11010000 10000100 11000110 00101101 00000000 10111001 ]
[ 00111000 10001110 11100011 00000000 00000000 00000000 11000111 10001111 00000011 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000111 01110000 00011100 00000111 11110000 00000011 11111111 11111111 00011111 11000111 11110001 11100011 00000000 00000000 00000000 11111111 11111111 11111111 11000000 10000001 11100011 00000000 00000000 00000000 00000111 01110000 00011100 00111000 10000000 00011111 11111111 11111111 00011111 11000111 11110001 11100011 00000000 00000000 00000000 11111111 11111111 11111111 ]
There is a byte order issue with your table (or how you copy the data):
For the first input byte
10101010
your output is
a b c
00111000 10001110 11100011
but should be
c b a
11100011 10001110 00111000
So the 1st and 3rd byte need to be swapped (and so on)
Instead of changing the table to suit memcpy, I'd just do something like
int j = 0;
for (int i = 0; i < 21; i++) {
    lut_val = bitExpandTable[msg[i]];
    expanded_msg[j++] = (uint8_t) (lut_val >> 16);
    expanded_msg[j++] = (uint8_t) (lut_val >> 8);
    expanded_msg[j++] = (uint8_t) lut_val;
}
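If you want to sanity-check the table (or build it at run time instead of hard-coding it), a small routine along these lines should reproduce the same 24-bit values; expand_byte is just an illustrative helper, not part of the question's code:
#include <stdint.h>

/* Expand one byte: every input bit i becomes three identical output bits
   at positions 3*i .. 3*i+2, e.g. 0xAA (10101010) -> 0xE38E38. */
static uint32_t expand_byte(uint8_t b)
{
    uint32_t out = 0;
    for (int i = 0; i < 8; i++) {
        if (b & (1u << i))
            out |= 0x7u << (3 * i);   /* set a block of three bits */
    }
    return out;
}
For example, expand_byte(0xAA) gives 0xE38E38, matching bitExpandTable[0xAA], so it can be used either to verify the table in a loop or to fill it at start-up.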

c++: glibc invalid pointer error when src is 32bit compiled

I've written a program which compiles and runs fine on my 64-bit machine (running SUSE Linux). Now I need to call an external library, but I only have access to its 32-bit binary. My source code compiles and links with no errors over an ssh command line to a 32-bit machine, but now I get a memory error at runtime, before the library is called or any of the interesting stuff happens...
I have a simple class cWorld to initialize some other classes, it has a method cWorld::ReadData() which opens a text file and parses/reads lines from the file and stores values in various members of cWorld, and then closes the file. The file, input.txt, just holds some explanation text and initial condition values, separated by commas and semicolons. Nothing groundbreaking!
Debugging with gdb showed that the file opens and closes successfully and all the data is stored successfully; the SIGABRT is then thrown at the very end, when the ReadData() method exits.
Extracted the problem code from my program:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

class cWorld {
public:
    cWorld ();
    void CallReadData ();
private:
    int N_target, N_steps;
    double t0, tf, delt;
    std::vector<double> data;
    void ReadData ();
};

cWorld::cWorld () {
    N_target = 0;
    N_steps = 0;
    delt = 0.0;
    t0 = 0.0;
    tf = 0.0;
}

void cWorld::CallReadData() {
    ReadData();
}

void cWorld::ReadData() {
    std::string line;
    std::ifstream input("input_test.txt");
    if (input.is_open()) {
        // RETRIEVE INPUT OPTIMIZATION PARAMETERS
        input.ignore(1000, '>');            // ignore text until first '>' appears
        std::getline(input, line, ';');     // get int N_target
        std::stringstream(line) >> N_target;
        input.ignore(1000, '>');            // ignore text until first '>' appears
        std::getline(input, line, ',');     // get t0
        std::stringstream(line) >> t0;
        std::getline(input, line, ',');     // get delt
        std::stringstream(line) >> delt;
        std::cout << "delt = " << delt << std::endl;
        std::getline(input, line, ',');     // get tf
        std::stringstream(line) >> tf;
        N_steps = (int)( (tf - t0) / delt ) + 1;   // set an int cWorld::N_steps
        // RETRIEVE INPUT STATE PARAMETERS
        int index = 0;                      // initialize local iterator
        data.resize(12*N_target, 0.0);      // set data size
        std::cout << "data elements = " << data.size() << std::endl;
        while (!input.eof()) {
            // if there's '<' end loop
            if (input.peek() == '<') break;
            // if there's a semicolon, store following text in data...
            else if (input.peek() == ';') {
                input.ignore(1000, '>');
                std::getline(input, line, ',');
                std::stringstream(line) >> data[index];
                index++;
            }
            // else if there's a comma, store following text in data...
            else {
                std::getline(input, line, ',');
                std::stringstream(line) >> data[index];
                index++;
            }
        }
        input.close();
    }
    else std::cout << "Can't open file 'input.txt'.\n";
}

int main() {
    cWorld world_1;
    world_1.CallReadData();
    return 0;
}
input text file:
/****************************************************************/
/* */
/* p2pOpt.C INPUT FILE */
/* */
/****************************************************************/
System Parameters: number of paths to optimize
format: N_target; (int)
>3;
System Parameters: start time, step size, end time
format: t0,delt,tf,; (doubles)
>0.0,0.001,1,;
Target 1 Parameters: Initial Conditions
format: x,y,z,theta1,theta2,theta3,xdot,ydot,zdot,theta1dot,theta2dot,theta3dot,;(doubles)
>1.0,0.0,0.0,3.14159265359,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,;
>2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,;
>3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,;
<
Here's the debug output:
======= Memory map: ========
08048000-0804b000 r-xp 00000000 00:29 18254842 /home/ston_sa/core/motion_planning/algorithms_cpp/p2pOpt/test_3_32
0804b000-0804c000 r--p 00002000 00:29 18254842 /home/ston_sa/core/motion_planning/algorithms_cpp/p2pOpt/test_3_32
0804c000-0804d000 rw-p 00003000 00:29 18254842 /home/ston_sa/core/motion_planning/algorithms_cpp/p2pOpt/test_3_32
0804d000-0806e000 rw-p 00000000 00:00 0 [heap]
b7b00000-b7b21000 rw-p 00000000 00:00 0
b7b21000-b7c00000 ---p 00000000 00:00 0
b7cd8000-b7cdb000 rw-p 00000000 00:00 0
b7cdb000-b7e42000 r-xp 00000000 08:06 114523898 /lib/libc-2.11.3.so
b7e42000-b7e44000 r--p 00167000 08:06 114523898 /lib/libc-2.11.3.so
b7e44000-b7e45000 rw-p 00169000 08:06 114523898 /lib/libc-2.11.3.so
b7e45000-b7e48000 rw-p 00000000 00:00 0
b7e48000-b7e64000 r-xp 00000000 08:06 114544736 /lib/libgcc_s.so.1
b7e64000-b7e65000 r--p 0001b000 08:06 114544736 /lib/libgcc_s.so.1
b7e65000-b7e66000 rw-p 0001c000 08:06 114544736 /lib/libgcc_s.so.1
b7e66000-b7e8c000 r-xp 00000000 08:06 114353773 /lib/libm-2.11.3.so
b7e8c000-b7e8d000 r--p 00026000 08:06 114353773 /lib/libm-2.11.3.so
b7e8d000-b7e8e000 rw-p 00027000 08:06 114353773 /lib/libm-2.11.3.so
b7e8e000-b7f70000 r-xp 00000000 08:06 2169219 /usr/lib/libstdc++.so.6.0.16
b7f70000-b7f74000 r--p 000e2000 08:06 2169219 /usr/lib/libstdc++.so.6.0.16
b7f74000-b7f75000 rw-p 000e6000 08:06 2169219 /usr/lib/libstdc++.so.6.0.16
b7f75000-b7f7c000 rw-p 00000000 00:00 0
b7fdd000-b7fdf000 rw-p 00000000 00:00 0
b7fdf000-b7ffe000 r-xp 00000000 08:06 114544574 /lib/ld-2.11.3.so
b7ffe000-b7fff000 r--p 0001e000 08:06 114544574 /lib/ld-2.11.3.so
b7fff000-b8000000 rw-p 0001f000 08:06 114544574 /lib/ld-2.11.3.so
bffdf000-c0000000 rw-p 00000000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
Program received signal SIGABRT, Aborted.
0xffffe424 in __kernel_vsyscall ()
and backtrace:
#0 0xffffe424 in __kernel_vsyscall ()
#1 0xb7d05e20 in raise () from /lib/libc.so.6
#2 0xb7d07755 in abort () from /lib/libc.so.6
#3 0xb7d44d65 in __libc_message () from /lib/libc.so.6
#4 0xb7d4ac54 in malloc_printerr () from /lib/libc.so.6
#5 0xb7d4c563 in _int_free () from /lib/libc.so.6
#6 0xb7d4f69d in free () from /lib/libc.so.6
#7 0xb7f3fa0f in operator delete(void*) () from /usr/lib/libstdc++.so.6
#8 0xb7f26f6b in std::string::_Rep::_M_destroy(std::allocator<char> const&) () from /usr/lib/libstdc++.so.6
#9 0xb7f26fac in ?? () from /usr/lib/libstdc++.so.6
#10 0xb7f2701e in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() () from /usr/lib/libstdc++.so.6
#11 0x080495bf in cWorld::ReadData (this=0xbfffefe0) at test_3.cpp:91
#12 0x0804961b in cWorld::CallReadData (this=0xbfffefe0) at test_3.cpp:30
#13 0x08049646 in main () at test_3.cpp:100
at #11 test_3.cpp:91 is the closing bracket of the ReadData() method.
First note, you didn't include a sample input.txt to test against. Second note, what are some sample values the variables are initialized to?
So, given that tf=0.0, t0=0.0, and delt=1.0 and using an input.txt of:
>
1;
1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0
<
I get a vector of data with 11 entries, holding the first 11 values in the list, and no errors. Are you sure your input.txt is formatted as the code expects? Do you really want to delete the last item in the list?
Your first problem is that your loop does 37 reads but only resizes data to be 36 elements. You should restructure how you are parsing your input. Maybe use scanf() if nothing else.
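For instance, a rough sketch of a safer shape for the state-parameter loop (keeping the question's '>' / ',' / ';' / '<' markers; ReadValues is just an illustrative name): appending with push_back() means an unexpected extra token can never write past the end of the vector.
#include <istream>
#include <sstream>
#include <string>
#include <vector>

std::vector<double> ReadValues(std::istream &input, int n_target)
{
    std::vector<double> data;
    data.reserve(12 * n_target);              // expected size, but only a hint
    std::string line;
    while (input && input.peek() != '<') {    // stop at '<' or on EOF/error
        if (input.peek() == ';')
            input.ignore(1000, '>');          // skip the text up to the next '>'
        if (!std::getline(input, line, ','))  // one comma-separated token
            break;
        double value;
        if (std::stringstream(line) >> value) // only keep tokens that parse
            data.push_back(value);
    }
    return data;
}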

(C++) Passing a pointer to a dynamically allocated array to a function

I've had problems with variables overwriting each other in memory, so I decided I'd try to allocate one of my arrays dynamically.
In the simplified code below, I'm attempting to create an array of integers using dynamic allocation, then have a function edit the values within that array of integers. Once the function has finished executing, I'd like to have a nicely processed array for use in other functions.
From what I know, an array cannot be passed to a function, so I'm simply passing a pointer to the array to the function.
#include <iostream>
using namespace std;

void func(int *[]);

int main(){
    //dynamically allocate an array
    int *anArray[100];
    anArray[100] = new int [100];

    func(anArray);

    int i;
    for (i=0; i < 99; i++)
        cout << "element " << i << " is: " << anArray[i] << endl;

    delete [] anArray;
}

void func(int *array[]){
    //fill with 0-99
    int i;
    for (i=0; i < 99; i++){
        (*array)[i] = i;
        cout << "element " << i << " is: " << array[i] << endl;
    }
}
When I attempt to compile the code above, g++ gives me the following warning:
dynamicArray.cc: In function ‘int main()’:
dynamicArray.cc:21:12: warning: deleting array ‘int* anArray [100]’ [enabled by default]
When I run the compiled a.out executable anyway, it outputs nothing, leaving me with nothing but the message
Segmentation fault (core dumped)
in terminal.
What am I doing wrong? My code is not attempting to access or write to anything outside of the array I created. In fact, I'm not even attempting to read or write the last element of the array!
Something REALLY weird happens when I comment out the part that actually modifies the array, like so:
//(*array)[i] = i;
G++ compiles with the same warning, but when I execute a.out I get this instead:
element 0 is: 0x600df0
element 1 is: 0x400a3d
element 2 is: 0x7f5b00000001
element 3 is: 0x10000ffff
element 4 is: 0x7fffa591e320
element 5 is: 0x400a52
element 6 is: 0x1
element 7 is: 0x400abd
element 8 is: 0x7fffa591e448
element 0 is: 0x600df0
element 1 is: 0x400a3d
element 2 is: 0x7f5b00000001
element 3 is: 0x10000ffff
element 4 is: 0x7fffa591e320
element 5 is: 0x400a52
element 6 is: 0x1
element 7 is: 0x400abd
element 8 is: 0x7fffa591e448
*** glibc detected *** ./a.out: munmap_chunk(): invalid pointer: 0x00007fffa591e2f0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7f5b92ff4b96]
./a.out[0x400976]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7f5b92f9776d]
./a.out[0x400829]
======= Memory map: ========
00400000-00401000 r-xp 00000000 00:13 4070334 /home/solderblob/Documents/2013 Spring Semester/CSC 1254s2 C++ II/Assignment 1/a.out
00600000-00601000 r--p 00000000 00:13 4070334 /home/solderblob/Documents/2013 Spring Semester/CSC 1254s2 C++ II/Assignment 1/a.out
00601000-00602000 rw-p 00001000 00:13 4070334 /home/solderblob/Documents/2013 Spring Semester/CSC 1254s2 C++ II/Assignment 1/a.out
01eb5000-01ed6000 rw-p 00000000 00:00 0 [heap]
7f5b92a64000-7f5b92a79000 r-xp 00000000 08:16 11276088 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f5b92a79000-7f5b92c78000 ---p 00015000 08:16 11276088 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f5b92c78000-7f5b92c79000 r--p 00014000 08:16 11276088 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f5b92c79000-7f5b92c7a000 rw-p 00015000 08:16 11276088 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f5b92c7a000-7f5b92d75000 r-xp 00000000 08:16 11276283 /lib/x86_64-linux-gnu/libm-2.15.so
7f5b92d75000-7f5b92f74000 ---p 000fb000 08:16 11276283 /lib/x86_64-linux-gnu/libm-2.15.so
7f5b92f74000-7f5b92f75000 r--p 000fa000 08:16 11276283 /lib/x86_64-linux-gnu/libm-2.15.so
7f5b92f75000-7f5b92f76000 rw-p 000fb000 08:16 11276283 /lib/x86_64-linux-gnu/libm-2.15.so
7f5b92f76000-7f5b9312b000 r-xp 00000000 08:16 11276275 /lib/x86_64-linux-gnu/libc-2.15.so
7f5b9312b000-7f5b9332a000 ---p 001b5000 08:16 11276275 /lib/x86_64-linux-gnu/libc-2.15.so
7f5b9332a000-7f5b9332e000 r--p 001b4000 08:16 11276275 /lib/x86_64-linux-gnu/libc-2.15.so
7f5b9332e000-7f5b93330000 rw-p 001b8000 08:16 11276275 /lib/x86_64-linux-gnu/libc-2.15.so
7f5b93330000-7f5b93335000 rw-p 00000000 00:00 0
7f5b93335000-7f5b93417000 r-xp 00000000 08:16 31987823 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16
7f5b93417000-7f5b93616000 ---p 000e2000 08:16 31987823 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16
7f5b93616000-7f5b9361e000 r--p 000e1000 08:16 31987823 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16
7f5b9361e000-7f5b93620000 rw-p 000e9000 08:16 31987823 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16
7f5b93620000-7f5b93635000 rw-p 00000000 00:00 0
7f5b93635000-7f5b93657000 r-xp 00000000 08:16 11276289 /lib/x86_64-linux-gnu/ld-2.15.so
7f5b93834000-7f5b93839000 rw-p 00000000 00:00 0
7f5b93853000-7f5b93857000 rw-p 00000000 00:00 0
7f5b93857000-7f5b93858000 r--p 00022000 08:16 11276289 /lib/x86_64-linux-gnu/ld-2.15.so
7f5b93858000-7f5b9385a000 rw-p 00023000 08:16 11276289 /lib/x86_64-linux-gnu/ld-2.15.so
7fffa5900000-7fffa5921000 rw-p 00000000 00:00 0 [stack]
7fffa59ff000-7fffa5a00000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted (core dumped)
When writing:
int *anArray[100];
anArray[100] = new int [100];
In the first line, you are declaring an array of 100 pointers to int.
In the second line, you are dynamically allocating an array of ints and assigning its address to element 100 of the array of pointers, which is one past the end. The proper syntax would be:
int *anArray;
anArray = new int [100];
Array indexing starts from 0.
An array of length 100 has indexes from 0 to 99, so writing to anArray[100] is out of bounds and can give you a segmentation fault.
Maybe you want to do this:
anArray[99] = new int[100];
Or, if you just want to dynamically allocate an array of pointers to int, do the following:
int **anArray = new int*[100];
//dynamically allocate an array
int *anArray[100];
anArray[100] = new int [100];
anArray is an array of 100 pointers-to-int. To the 101st element (buffer overflow!) you assign a pointer to the first element of a dynamically allocated array of 100 ints. You want to fix that and merge the two lines into int* anArray = new int[100];.
int *anArray[100];
anArray[100] = new int [100];
First, you are declaring an array of 100 pointers.
Then you are performing an out-of-range access: the last element of anArray is anArray[99], but you assign the allocated memory to anArray[100], which does not exist. This is undefined behaviour and here causes the segmentation fault.
In the end, you are deleting an array that was never allocated with new[]: anArray is an automatic array of 100 pointers to int, not a dynamically allocated block. Remove that delete [] statement (or keep it once you switch to int* anArray = new int[100];).
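Putting the answers together, a minimal corrected version of the program could look like this (a single int* to one new[]-ed block, with the length passed alongside):
#include <iostream>

void func(int *array, int n)
{
    // fill with 0..n-1
    for (int i = 0; i < n; i++)
        array[i] = i;
}

int main()
{
    // dynamically allocate one array of 100 ints
    int *anArray = new int[100];

    func(anArray, 100);

    for (int i = 0; i < 100; i++)
        std::cout << "element " << i << " is: " << anArray[i] << std::endl;

    delete[] anArray;   // delete[] matches new[]
}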