C++ Only Store Specific Bytes of a double in a char* - c++

Off the bat I am unfortunately using an older version of c++ (I believe 98) so c++11 goodies are unavailable to me.
That aside, I was wondering- is it possible to only store specific bytes of a double in a char* buffer? For example, if I have a double that has a low value and therefore only uses 3 bytes of data can I then copy just 3 bytes of data into a char* buffer?
I know it is possible to copy full doubles into a char* buffer. Currently I am doing so and printing out the binary of the char* buffer afterwards using this code:
char* buffer = new char[8]; // A double is 8 bytes
memset(buffer, 0, 8); // Fill the buffer with 0's (sizeof(buffer) would be the size of the pointer, not of the array)
double value = 243;
memcpy(&buffer[0], &value, 8); // copy all 8 bytes (sizeof(value) is better here, I'm just typing '8' for readability)
for (int i = sizeof(value); i > 0; i--)
{
    std::bitset<8> x(buffer[i-1]); // 8 bits per byte
    std::cout << x << " ";
}
The output of the above code is as expected:
01000000 01101110 01100000 00000000
00000000 00000000 00000000 00000000
If I try and only copy the first 3 bytes into the char* buffer, however, it appears that I don't end up copying over anything at all. Here is the code I'm attempting to use:
char* buffer = new char[8]; // A double is 8 bytes
memset(buffer, 0, 8); // Fill the buffer with 0's (sizeof(buffer) would be the size of the pointer, not of the array)
double value = 243;
memcpy(&buffer[0], &value, 3); // Only copy over 3 bytes
for (int i = sizeof(value); i > 0; i--)
{
    std::bitset<8> x(buffer[i-1]); // 8 bits per byte
    std::cout << x << " ";
}
The output of the above code is an empty buffer:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Is there a way for me to only copy 3 bytes of this double over to a char* buffer that I am missing?
Thanks!

You are copying the wrong bytes. Your computer is little-endian, so the 3 bytes you want to copy are actually the last three bytes of the double in memory. If you change the copy line of your code to this
memcpy(&buffer[0], reinterpret_cast<const char*>(&value) + 5, 3); // Copy the last three bytes (pointer arithmetic on void* is not standard C++)
you get a correct result of
00000000 00000000 00000000 00000000 00000000 01000000 01101110 01100000
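If the goal is to keep the most significant bytes regardless of where they sit in memory, one option is to go through an integer and use shifts instead of a byte offset. This is only a sketch under stated assumptions: an 8-byte IEEE-754 double, an 8-byte unsigned long long (a common extension in C++98 compilers) with the same byte order as the double, and a hypothetical helper name copy_high_bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: writes the `count` most significant bytes of `value`
// into `out`, most significant byte first.
// Assumes an 8-byte double and an 8-byte unsigned long long sharing one byte order.
void copy_high_bytes(double value, unsigned char* out, std::size_t count)
{
    assert(sizeof(unsigned long long) == sizeof(double));
    assert(count <= sizeof(double));

    unsigned long long bits = 0;
    std::memcpy(&bits, &value, sizeof(bits)); // reinterpret the double as an integer

    for (std::size_t i = 0; i < count; ++i) {
        // Byte 0 of the output is bits 63..56, byte 1 is bits 55..48, and so on.
        out[i] = static_cast<unsigned char>((bits >> (8 * (7 - i))) & 0xFFu);
    }
}
```

Called with value = 243 and count = 3, this should write 0x40 0x6E 0x60, the same three non-zero bytes shown in the output above.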

Related

Cast from double to size_t yields wrong result?

The following code works.
My question is: should 2) not lead to a result very close to 1)?
Why is 2) cast to such a small value?
It may be worth noting that 2) is exactly half of 1):
std::cout << "1) " << std::pow(2, 8 * sizeof(size_t)) << std::endl;
std::cout << "2) " << static_cast<size_t>(std::pow(2, 8 * sizeof(size_t))) << std::endl;
The output is:
18446744073709551616
9223372036854775808
It is due to that part of the specification:
7.3.10 Floating-integral conversions [conv.fpint]
A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.
The value 18446744073709551616 (that's the truncated value) is larger than std::numeric_limits<size_t>::max() on your system, so the behavior of that cast is undefined.
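One way to stay clear of the undefined behaviour is to check the range before converting. The following is a minimal sketch; fits_in_size_t and to_size_t_or are hypothetical helper names, not standard library functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical helper: true if truncating d gives a value representable in std::size_t.
bool fits_in_size_t(double d)
{
    // 2^digits is the first value that is too large. A power of two is exact in a double.
    const double limit = std::ldexp(1.0, std::numeric_limits<std::size_t>::digits);
    // Values in (-1, 0) truncate to 0 and are fine. NaN fails both comparisons.
    return d > -1.0 && d < limit;
}

// Hypothetical helper: converts only when the result is defined, otherwise returns fallback.
std::size_t to_size_t_or(double d, std::size_t fallback)
{
    return fits_in_size_t(d) ? static_cast<std::size_t>(d) : fallback;
}
```

With a 64-bit size_t, fits_in_size_t(std::pow(2, 64)) is false, so the cast from the question would be skipped instead of executed.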
If we want to calculate the number of different values that a certain unsigned integral datatype
can represent, we can calculate
std::cout << "1) " << std::pow(2, 8 * sizeof(size_t)) << std::endl; // yields 18446744073709551616
This calculates 2 to the power of 64 and yields 18446744073709551616.
Since sizeof(size_t) is 8 bytes on a 64 bit machine,
and a byte has 8 bits, the width of the size_t data type is 64 bits, hence 2^64.
This is no surprise, since size_t usually has the same width as the platform's
native addresses, so that an address or an index of an array or vector fits
into a single machine word.
The above number represents the amount of all different integral values that can be
represented by an unsigned integral datatype of 64 bit like size_t or unsigned long long
including 0 as one possibility.
And since it does include 0, the highest value to be represented is exactly one less,
so 18446744073709551615.
This number can also be retrieved by
std::cout << std::numeric_limits<size_t>::max() << std::endl; // yields 18446744073709551615
std::cout << std::numeric_limits<unsigned long long>::max() << std::endl; // yields the same
Now an unsigned datatype stores its values like
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 is 0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 is 1 or 2^0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000010 is 2 or 2^1
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000011 is 3 or 2^1+2^0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000100 is 4 or 2^2
...
11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 is 18446744073709551615
and if you want to add another 1, you would need a 65th bit on the left, which you don't have:
1 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 is 0 because
there are no more bits on the left.
Any amount higher than the highest possible value you would wish to represent
will come down to amount modulo the largest possible value + 1. (amount % (max + 1))
which leads as we can see to zero in above sample.
And since this comes so naturally the standard defines that if you convert any
integral datatype signed or unsigned to another unsigned integral datatype it is to be converted
amount modulo the largest possible value + 1. Beautiful.
But this easy rule has a little surprise for us when we wish to convert a negative integral to an
unsigned integral, like -1 to unsigned long long for example. You have a 0 value first and then
you subtract 1. What happens is the opposite sequence of the above sample. Have a look:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 is 0 and now do -1
11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 is 18446744073709551615
So yes, converting -1 to size_t leads to std::numeric_limits<size_t>::max(). Quite unbelievable
at first but understandable after some thinking and playing around with it.
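The wrap-around described above can be checked directly. A minimal sketch, with hypothetical function names chosen only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Converting a negative integer to an unsigned type wraps modulo 2^N,
// so -1 becomes the largest representable value.
std::size_t minus_one_as_size_t()
{
    return static_cast<std::size_t>(-1);
}

// Unsigned arithmetic wraps the same way: the largest value plus one is 0.
std::size_t max_plus_one()
{
    std::size_t max = std::numeric_limits<std::size_t>::max();
    return max + 1;
}
```

Both results are well defined because only integral types are involved here.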
Now for our second line of code
std::cout << "2) " << static_cast<size_t>(std::pow(2, 8 * sizeof(size_t))) << std::endl;
we would expect naively 18446744073709551616, the same result as line one, of course.
But since we know now about modulo the largest + 1 and we know now that the largest plus one
gives 0 we would also, again naively, accept 0 as an answer.
Why naively? Because std::pow returns a double and not an integral datatype.
The double datatype is again 64 bit but internally its representation is entirely different.
0XXXXXXX XXXX0000 00000000 00000000 00000000 00000000 00000000 00000000
Only those 11 X bits hold the exponent n of 2^n (stored with a bias of 1023). That means only those 11 bits have to encode 64
and the double will represent 2^64 * 1. So the representation of our big number is much more compact
in double than in size_t. Anyone who wanted the modulo the largest plus 1 behaviour would first need an extra conversion
to turn the double representation of 2^64 into a plain 64 bit integer pattern.
Some further reading about floating point representation can be found at
https://learn.microsoft.com/en-us/cpp/build/ieee-floating-point-representation?view=msvc-160
for example.
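To see the exponent field described above, the bits of a double can be copied into an integer and masked. This is a sketch that assumes IEEE-754 binary64 (1 sign bit, 11 exponent bits with a bias of 1023, 52 fraction bits); unbiased_exponent and fraction_bits are hypothetical helper names.

```cpp
#include <cassert>
#include <cstring>
#include <stdint.h>

// Hypothetical helper: returns the exponent n of a normal double d = 1.f * 2^n.
// Assumes IEEE-754 binary64 layout.
int unbiased_exponent(double d)
{
    assert(sizeof(double) == sizeof(uint64_t));
    uint64_t bits = 0;
    std::memcpy(&bits, &d, sizeof(bits));
    int biased = static_cast<int>((bits >> 52) & 0x7FFu); // the 11 exponent bits
    return biased - 1023;                                  // remove the bias
}

// Hypothetical helper: returns the 52 fraction bits of d.
uint64_t fraction_bits(double d)
{
    uint64_t bits = 0;
    std::memcpy(&bits, &d, sizeof(bits));
    return bits & ((static_cast<uint64_t>(1) << 52) - 1);
}
```

For 2^64 the exponent comes out as 64 and all fraction bits are zero, which is why the double needs so few set bits to hold that value.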
And the standard says that if you convert a floating value
to an integral which cannot be represented by the target integral datatype the result is UB, undefined behaviour.
See the C++17 Standard ISO/IEC14882:
7.10 Floating-integral conversions [conv.fpint]
A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates ;
that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented
in the destination type. ...
So a double can easily hold 2^64, and that's the reason why line 1 could print it out so easily. But it is 1
too much to be represented in size_t, so the result is UB.
So whatever the outcome of our line 2 is, it is simply irrelevant because it is UB.
Ok, but if any random result will do, how come the UB outcome is exactly half?
Well, first of all, the outcome is from MSVC. Clang or another compiler may deliver a different result.
But let's look at the "half" outcome since it is easy.
Trying to add 1 to the largest
11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 is 18446744073709551615
would, if only integrals were involved, lead to
1 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
but that's not possible since the bit does not exist, and the value is not an integral but a double and
hence UB, so accidentally the result is
10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 which is 9223372036854775808
so exactly half of the naively expected or 2^63.

Issue of interpreting 64-bit to 32-bit integer in SIMD register

I am quite confused by the interpretation issue above. I set a 256-bit vector register with 4 * 64-bit integer value 2^32 using intrinsics like this:
__m256i vec_mask = _mm256_set1_epi64x(1 << 32);
then I would like to interpret it as 8 * 32-bit integers:
__m256i * tmp_mask = new __m256i;
_mm256_storeu_si256(tmp_mask, vec_mask); // store
for (int i = 0; i < 8; ++i)
printf("%d ", ((int *)(tmp_mask))[i]);
delete tmp_mask;
As for each 64-bit value 2^32, I think it is like this in SIMD register:
00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000000
255 (MSB) ----------------------------------------------------- 0 (LSB)
So each 64-bit value 2^32 is interpreted as <1, 0> in 2 * 32-bit format. The final output is expected to be <0, 1, 0, 1, 0, 1, 0, 1> from low to high, but the output is quite weird: <0, 0, 0, 0, 0, 0, 0, 0>.
Any idea where I made a mistake? Thanks.
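The following code outputs 0 1 0 1 0 1 0 1 as you expect. The difference is the constant: in 1 << 32 the literal 1 is a 32-bit int, so shifting it by 32 bits is undefined behaviour and never produces 2^32, whereas UINT64_C(1) << 32 performs the shift on a 64-bit value.
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <immintrin.h>
int main() {
    __m256i vec_mask = _mm256_set1_epi64x(UINT64_C(1) << 32);
    uint32_t tmp_mask[8];
    _mm256_storeu_si256((__m256i *)tmp_mask, vec_mask); // store
    for (int i = 0; i < 8; ++i)
        printf("%" PRIu32 " ", tmp_mask[i]); // PRIu32 matches uint32_t; %d expects a signed int
}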
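The 0 1 pattern itself follows from how one 64-bit lane splits into two 32-bit halves: on little-endian x86 the low half is stored first in memory. The sketch below shows that split with plain integer operations and no intrinsics; the function names are hypothetical.

```cpp
#include <cassert>
#include <stdint.h>

// Low 32 bits of a 64-bit value (the half that comes first in memory on little-endian x86).
uint32_t low_half(uint64_t v)
{
    return static_cast<uint32_t>(v & 0xFFFFFFFFu);
}

// High 32 bits of a 64-bit value.
uint32_t high_half(uint64_t v)
{
    return static_cast<uint32_t>(v >> 32);
}

// 2^32 computed with a 64-bit left operand, so the shift is well defined.
uint64_t two_to_the_32()
{
    return static_cast<uint64_t>(1) << 32;
}
```

For the value 2^32 the low half is 0 and the high half is 1, so each lane prints as 0 followed by 1.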

How to read a binary file to calculate frequency of Huffman tree?

I have to calculate the frequencies for a Huffman tree from a "binary file" given as the sole argument. I am not sure whether binary files are files that contain only "0" and "1".
As I understand it, the frequency is the number of times each letter occurs (e.g. in abbacdd the frequencies are a=2, b=2, c=1, d=2).
And my structure must be like this:
struct Node
{
unsigned char symbol; /* the symbol or alphabets */
int freq; /* related frequency */
struct Node *left,*right; /* Left and right leafs */
};
But I do not understand at all how I can get the symbols from the ".bin" file (which consists of only "0" and "1").
When I try to view the contents of the file I get:
hp@ubuntu:~/Desktop/Internship_Xav/Huf_pointer$ xxd -b out.bin
0000000: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000006: 00000000 00000000 00000000 00000000 00000000 00000000 ......
000000c: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000012: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000018: 00000000 00000000 00000000 00000000 00000000 00000000 ......
000001e: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000024: 00000000 00000000 00000000 00000000 00000000 00000000 ......
000002a: 00000000 00000000 00000000 00000000 00000000 00000000 ......
0000030: 00000000 00000000 00000000 00000000 00000000 00000000 ......
.........//Here also there is similar kind of data ................
00008ca: 00010011 00010011 00010011 00010011 00010011 00010011 ......
00008d0: 00010011 00010011 00010011 00010011 00010011 00010011 ......
00008d6: 00010011 00010011 00010011 00010011 00010011 00010011 .....
So I do not understand at all where the frequencies and the symbols are, how to store the symbols, or how to calculate the frequencies. Once I have the frequencies and symbols, I will create the Huffman tree from them.
First, you need to create some sort of frequency table.
You could use a std::map.
You would do something like this:
#include <algorithm>
#include <fstream>
#include <map>
#include <string>
std::map <unsigned char, int> CreateFrequencyTable (const std::string &strFile)
{
    std::map <unsigned char, int> char_freqs ; // character frequencies
    // Open in binary mode so that no byte is translated or treated specially.
    // c_str () keeps this working with pre-C++11 compilers.
    std::ifstream file (strFile.c_str (), std::ios::binary) ;
    char c ;
    while (file.get (c)) {
        // operator[] creates a missing entry with a count of 0,
        // so a single increment covers both the "seen" and "not yet seen" cases.
        ++char_freqs [static_cast <unsigned char> (c)] ;
    }
    return char_freqs ;
}
Then you could use this function like this:
std::map <unsigned char, int> char_freqs = CreateFrequencyTable ("file") ;
You can obtain the element with the highest frequency like this:
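// Compares two map entries by frequency (value_comp would compare the keys instead).
bool LessFrequent (const std::pair <const unsigned char, int> &a,
                   const std::pair <const unsigned char, int> &b)
{
    return a.second < b.second ;
}
std::map <unsigned char, int>::iterator iter = std::max_element (
    char_freqs.begin (),
    char_freqs.end (),
    LessFrequent) ;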
Then you would need to build your Huffman tree.
Remember that the characters are all leaf nodes, so you need a way to differentiate the leaf nodes from the non-leaf nodes.
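As a rough illustration of that last step, the sketch below builds a tree from a frequency table with std::priority_queue and tells leaves apart by their null child pointers. It reuses the Node layout from the question; make_node, is_leaf, HigherFreq, build_tree and free_tree are hypothetical names, and this is one possible approach rather than the required one.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <vector>

struct Node
{
    unsigned char symbol;  // meaningful only for leaves
    int freq;              // frequency of the symbol, or sum of both children
    Node *left, *right;    // both NULL for a leaf
};

Node* make_node(unsigned char symbol, int freq, Node* left, Node* right)
{
    Node* n = new Node;
    n->symbol = symbol;
    n->freq = freq;
    n->left = left;
    n->right = right;
    return n;
}

// A leaf is a node without children; only leaves carry a real symbol.
bool is_leaf(const Node* n)
{
    return n->left == NULL && n->right == NULL;
}

// Orders the priority queue so that the lowest frequency is on top.
struct HigherFreq
{
    bool operator()(const Node* a, const Node* b) const { return a->freq > b->freq; }
};

// Returns the root, or NULL for an empty table. The caller releases it with free_tree.
Node* build_tree(const std::map<unsigned char, int>& freqs)
{
    std::priority_queue<Node*, std::vector<Node*>, HigherFreq> queue;
    for (std::map<unsigned char, int>::const_iterator it = freqs.begin();
         it != freqs.end(); ++it) {
        queue.push(make_node(it->first, it->second, NULL, NULL));
    }
    if (queue.empty()) {
        return NULL;
    }
    // Repeatedly merge the two least frequent nodes under a new internal node.
    while (queue.size() > 1) {
        Node* a = queue.top(); queue.pop();
        Node* b = queue.top(); queue.pop();
        queue.push(make_node(0, a->freq + b->freq, a, b));
    }
    return queue.top();
}

void free_tree(Node* n)
{
    if (n == NULL) return;
    free_tree(n->left);
    free_tree(n->right);
    delete n;
}
```

The root's freq ends up as the total number of bytes counted. With a single distinct symbol the root is itself a leaf, which a real encoder would need to handle separately.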
Update
If reading a single character at a time from the file is too slow, you could always load all of the contents into a vector like this:
// Make sure to #include <fstream>, <iterator> and <vector>
std::ifstream file ("test.txt", std::ios::binary) ;
// istreambuf_iterator reads raw bytes; istream_iterator would skip whitespace bytes.
std::vector <unsigned char> vecBuffer (
    (std::istreambuf_iterator <char> (file)),
    std::istreambuf_iterator <char> ()) ;
You would still need to create a frequency table.
A symbol in a Huffman tree could be anything,
but as you have to use an unsigned char per symbol
you should probably take a byte.
So no, not only 0 or 1, but eight times 0 or 1 together.
Like 00010011 somewhere in your output of xxd.
xxd -b will just give you eight 0/1 per byte.
You could write a number between 0 and 255 as well,
or two times one character of 0123456789abcdef
There are lots of possibilities for how to show a byte on the screen,
but that does not matter at all.
If you know how to read the content of a file in C/C++,
just read unsigned char until the file ends
and count how often each value occurs. That's all.
As you're probably writing decimal numbers in your program code,
there are 256 different values (0,1,2...255).
So you will need 256 integers (in an array, or your Node struct...)
to count how often each value appears.
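The counting described above could look like the following sketch. It reads from any std::istream so that it works for a file opened with std::ios::binary as well as for an in-memory stream; count_byte_values is a hypothetical name.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: returns 256 counters, one per possible byte value.
std::vector<unsigned long> count_byte_values(std::istream& in)
{
    std::vector<unsigned long> counts(256, 0UL);
    char c;
    while (in.get(c)) {
        // Cast to unsigned char so that bytes above 127 index 128..255, not a negative slot.
        ++counts[static_cast<unsigned char>(c)];
    }
    return counts;
}
```

Feeding it the text abbacdd from the question gives a=2, b=2, c=1, d=2 and 0 for every other byte value.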

c++ bmp bitwise operator

for(unsigned int h=0; h<ImageBits.iHeight; h++)
{
    for(unsigned int w=0; w<ImageBits.iWidth; w++)
    {
        // So in this loop - if our data isn't aligned to 4 bytes, then its been padded
        // in the file so it aligns...so we check for this and skip over the padded 0's
        // Note here, that the data is read in as b,g,r and not rgb as you'd think!
        unsigned char r,g,b;
        fread(&b, 1, 1, fp);
        fread(&g, 1, 1, fp);
        fread(&r, 1, 1, fp);
        ImageBits.pARGB[ w + h*ImageBits.iWidth ] = (r<<16 | g<<8 | b);
    }// End of for loop w
    //If there are any padded bytes - we skip over them here
    if( iNumPaddedBytes != 0 )
    {
        unsigned char skip[4];
        fread(skip, 1, 4 - iNumPaddedBytes, fp);
    }// End of if reading padded bytes
}// End of for loop h
I do not understand this statement or how it stores the RGB value of the pixel:
ImageBits.pARGB[ w + h*ImageBits.iWidth ] = (r<<16 | g<<8 | b);
I read up on the << bitwise shift operator, but I still do not understand how it works. Can someone help me out here?
You need to combine the separate values for Red, Green and Blue into a single variable, so you shift Red 16 bits and Green 8 bits to the "left". The result holds 8 bits of Red (bits 16-23), then 8 bits of Green (bits 8-15), and finally the Blue value in the lowest 8 bits.
Consider the following:
Red -> 00001111
Green -> 11110000
Blue -> 10101010
Then RGB -> that has 24 bits capacity would look like this initially ->
-> 00000000 00000000 00000000
(there would actually be some random rubbish but it's easier to
demonstrate like this)
Shift the Red byte 16 places to the left, so we get 00001111 00000000 00000000.
Shift the Green byte 8 places to the left, so we have 00001111 11110000 00000000.
Don't shift the Blue byte, so we have 00001111 11110000 10101010.
You could achieve a similar result with unions. Here's an elaboration as to why we do it like this. The only way for you to access a variable is to have its address (usually bound to a variable name, or an alias).
That means that we have an address of the first byte only and also a guarantee that if it's a variable that is 3 bytes wide, the following two bytes that are next to our addressed byte belong to us. So we can literally "push the bits" to the left (shift them) so they "flow" into the remaining bytes of the variable. We could also pointer-arithmetic a pointer there or as I've mentioned already, use a union.
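The shift-and-OR steps from the worked example above can be written as a small function. A minimal sketch; pack_rgb is a hypothetical name, and the casts to a 32-bit unsigned type are an addition to keep every shift in an unsigned type of known width.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical helper: combines three 8-bit channels into one value laid out as 0x00RRGGBB.
uint32_t pack_rgb(unsigned char r, unsigned char g, unsigned char b)
{
    return (static_cast<uint32_t>(r) << 16)  // red lands in bits 16..23
         | (static_cast<uint32_t>(g) << 8)   // green lands in bits 8..15
         |  static_cast<uint32_t>(b);        // blue stays in bits 0..7
}
```

With Red = 00001111, Green = 11110000 and Blue = 10101010 this yields 0x0FF0AA, that is 00001111 11110000 10101010.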
Bit shifting moves the bits that make up the value along by the number you specify.
In this case it's done with colour values so that you can store multiple 1 byte components (such as RGBA which are in the range 0-255) in a single 4 byte structure such as an int
Take this byte:
00000011
which is equal to 3 in decimal. If we wanted to store the value 3 for the RGB and A channel, we would need to store this value in the int (the int being 32 bits)
R G B A
00000011 00000011 00000011 00000011
As you can see the bits are set in 4 groups of 8, and all equal the value 3, but how do you tell what the R value is when it's stored this way?
If you got rid of the G/B/A values, you'd be left with
00000011 00000000 00000000 00000000
Which still doesn't equal 3 (in fact it's a much larger number: 50331648, which is 3 shifted left by 24 bits)
In order to get this value into the last byte of the int, you need to shift the bits 24 places to the right. e.g.
50331648 >> 24
Then the bits would look like this:
00000000 00000000 00000000 00000011
And you would have your value '3'
Basically it's just a way of moving bits around in a storage structure so that you can better manipulate the values. Putting the RGBA values into a single value is usually called bit packing.
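Going the other way, a single channel can be read back by shifting it down to the lowest byte and masking off everything else. A minimal sketch, assuming the R G B A byte order used in this answer (R in the highest byte); channel is a hypothetical helper name.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical helper: returns the 8-bit channel that starts `shift` bits from the right.
// With the R G B A layout above: R uses shift 24, G 16, B 8 and A 0.
uint32_t channel(uint32_t packed, unsigned shift)
{
    return (packed >> shift) & 0xFFu; // the mask drops any bits left of the channel
}
```

For the packed value 0x03030303 every channel reads back as 3. The mask is what removes the neighbouring channels when the shift is smaller than 24.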
let's visualize this and break it into several steps, and you'll see how simple it is.
let's say we have the ARGB 32 bit variable, that can be viewed as
int rgb = {a: 00, r: 00, g: 00, b: 00} (this is not valid code, of course, and let's leave the A out of this for now).
the value in each of these colors is 8 bit of course.
now we want to place a new value, and we have three 8 bit variables for each color:
unsigned char r = 0xff, g=0xff, b=0xff.
what we're essentially doing is taking a 32 bit variable, and then doing this:
rgb |= r << 16 (shifting the red 16 bit left. everything to right of it will remain 0)
so now we have
rgb = [a: 00, r: ff, g: 00, b: 00]
and now we do:
rgb = rgb | (g << 8) (meaning taking the existing value and OR'ing it with green shifted to its place)
so we have [a: 00, r: ff, g: ff, b: 00]
and finally...
rgb = rgb | b (meaning taking the value and ORing it with the blue 8 bits. the rest remains unchanged)
leaving us with [a: 00, r: ff, g: ff, b: ff]
which represents a 32 bit (24 actually since the Alpha is irrelevant to this example) color.

Explanation of an RGB to BGR C++ macro

#define RGB2BGR(a_ulColor) (a_ulColor & 0xFF000000) | ((a_ulColor & 0xFF0000) >> 16) | (a_ulColor & 0x00FF00) | ((a_ulColor & 0x0000FF) << 16)
Can you please explain to me the meaning of this macro?
Colors are usually represented by a 32-bit integer. 32-bit integers can hold four 8-bit bytes. Three of them are used to hold red, green, and blue color information. The remaining byte is either left unused or used to hold transparency information.
Which byte represents which color is not standardized. Some APIs expect the bytes like this:
(MSB) ******** rrrrrrrr gggggggg bbbbbbbb (LSB)
Which is the "RGB" layout, perhaps the most common form. In the illustration above, the most significant 8 bits are the "don't care" bits, that is, the bits there are not used. The least significant 8 bits store the information for the blue color.
Some APIs expect the reverse for the 3 color bytes, like this:
(MSB) ******** bbbbbbbb gggggggg rrrrrrrr (LSB)
Which is the "BGR" layout.
The macro helps interconvert the two layouts using the bitwise operators. Let's take a look at its definition:
(a_ulColor & 0xFF000000) | ((a_ulColor & 0xFF0000) >> 16) |
(a_ulColor & 0x00FF00) | ((a_ulColor & 0x0000FF) << 16)
Let's say we have a color, Cornflower Blue, which has a value of 0x93CCEA. In the RGB layout, it has the following bit pattern:
a_ulColor = 00000000 10010011 11001100 11101010
The following expressions give you the following patterns:
1. a_ulColor & 0xFF000000 --> 00000000 00000000 00000000 00000000
2. a_ulColor & 0xFF0000 --> 00000000 10010011 00000000 00000000
3. a_ulColor & 0x00FF00 --> 00000000 00000000 11001100 00000000
4. a_ulColor & 0x0000FF --> 00000000 00000000 00000000 11101010
Notice that we're just extracting the individual bytes. Expression #1 extracts the most significant 8 bits, and expression #4 extracts the least significant 8 bits. We were able to do this via the AND bitwise operation.
Now, to convert RGB to BGR, we have to move some bits left or right, via bitshifts. Like this:
1. (a_ulColor & 0xFF000000) --> 00000000 00000000 00000000 00000000
2. (a_ulColor & 0xFF0000) >> 16 --> 00000000 00000000 00000000 10010011
3. (a_ulColor & 0x00FF00) --> 00000000 00000000 11001100 00000000
4. (a_ulColor & 0x0000FF) << 16 --> 00000000 11101010 00000000 00000000
The expression a >> 16 simply shifts the bits to the right by 16 bits. a << 16 shifts the bits to the left by 16 bits.
Then, when you OR them all together, you get this:
00000000 11101010 11001100 10010011
Compare the result to the original bit pattern:
00000000 11101010 11001100 10010011
00000000 10010011 11001100 11101010
You can see that the 2nd and 4th bytes are swapped. That's all the macro does.
It takes a four-byte integral value, AA BB CC DD, and returns the value AA DD CC BB. You can see that the first and third byte are retained unchanged, while the second byte is moved down two bytes (>> 16) and the fourth is moved up by two (<< 16).
It swaps the order of the byte-sized RGB elements from RGB to BGR (and vice-versa, to be fair).
a_ulColor is a 32 bit RGB representation (e.g. of a pixel or bitmap). The macro converts it to BGR layout. It effectively produces a new value by swapping the Red and Blue component values.
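The same swap can be written as a function, which sidesteps two pitfalls of the macro as defined: its body is not wrapped in parentheses and its argument is not parenthesized, so expressions such as RGB2BGR(a | b) may not parse the way they read. A minimal sketch; rgb_to_bgr is a hypothetical name.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical function equivalent of the macro: swaps the red and blue bytes.
uint32_t rgb_to_bgr(uint32_t color)
{
    return  (color & 0xFF000000u)          // top byte (unused or alpha) stays in place
         | ((color & 0x00FF0000u) >> 16)   // second byte moves down to the lowest byte
         |  (color & 0x0000FF00u)          // green stays in place
         | ((color & 0x000000FFu) << 16);  // lowest byte moves up to the second byte
}
```

Applied to 0x0093CCEA it returns 0x00EACC93, and applying it twice gives back the original value.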