I have a float variable like this:
float f = 0b 00000000 11110001 00000000 00000000
I want to copy byte 1 (counting from 0, so not byte 0) into a char variable.
I can't use << and >>.
How can I do that?
There is generally little point in messing with the binary representation of floating-point values. Anything you try will not be portable. However, these two generally work:
char c(reinterpret_cast<char*>(&f)[1]);
union {
float f;
char c[sizeof(float)];
} u = { f };
u.c[1];
char bla;
bla = *((char *) &f + 1);
Also remember endianness: on little-endian systems, the byte you actually want may be byte 2 (assuming you count your bytes from 0 to 3). In that case, replace the + 1 with + 2 in the code above.
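As a sketch of a variant that avoids pointer casts altogether, the bytes of the float can be copied into an unsigned char array with std::memcpy and then indexed. The helper name byte_of is made up for this illustration, and which index holds which part of the value still depends on the platform's endianness.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the answers above): returns the byte at
// position `index` (0-based, in memory order) of the representation of `f`.
unsigned char byte_of(float f, std::size_t index)
{
    unsigned char bytes[sizeof(float)];
    std::memcpy(bytes, &f, sizeof f);   // copy the raw bytes, no value conversion
    return bytes[index];                // caller must keep index < sizeof(float)
}
```

Calling byte_of(f, 1) reads the same byte as the snippets above. Using unsigned char avoids surprises with negative values when char is signed.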
int* to char* :
int* pNum = new int[1];
pNum[0] = 57;
char* pChar = reinterpret_cast< char* >(pNum);
Result : pChar[0] = '9'; //'9' ASCII 57
float* to char* :
float* pFloat = new float[1];
pFloat[0] = 57; //assign the same value as before
char* pChar = reinterpret_cast< char* >(pFloat);
Result : pChar[0] = 'a';
So why am I getting two different results?
Thanks for your help.
You get this because floating-point values don't use the same encoding as integer values (typically IEEE 754, with a sign bit, an exponent, and a mantissa).
Besides, I suppose you're running on a little-endian CPU; otherwise your first test would have yielded 0 (I mean '\0').
Both float and int are data types which are (usually) represented by four bytes:
b1 b2 b3 b4
However, those bytes are interpreted quite differently across the two types. If they weren't, there would hardly be any need for two types.
Now if you reinterpret the pointers to pointers-to-char, the result points only to the first byte, as this is the length of a char:
b1 b2 b3 b4
^^
your char* points to here
As said, this first byte has a very different meaning for the two data types, and this is why the representation as a char in general differs.
Application to your example:
The number 57 in float (IEEE754 Single precision 32-bit) is represented in bits as
01000010 01100100 00000000 00000000
In contrast, the representation in a 32-bit integer format is
00000000 00000000 00000000 00111001
Here the number is shown in "big-endian" format, where the most significant byte (the one that changes the value of the int the most) comes first. As mentioned by @Jean-FrançoisFabre, on your PC it seems to be the other way round, but never mind. For both conversions, I used this site.
Now your char* pointers point to the first of those 8-bit-blocks, respectively. And obviously they're different.
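To look at all the bytes rather than only the first one, something like the following sketch could be used. The helper name bytes_of is hypothetical; it copies the object representation into a std::array so that the int and the float can be compared byte by byte.

```cpp
#include <array>
#include <cstring>

// Hypothetical helper: copies the object representation of `value`
// into an array of bytes, in memory order.
template <typename T>
std::array<unsigned char, sizeof(T)> bytes_of(const T& value)
{
    std::array<unsigned char, sizeof(T)> out{};
    std::memcpy(out.data(), &value, sizeof(T));   // raw copy, no conversion
    return out;
}
```

Assuming a 32-bit int and IEEE 754 floats, bytes_of(57) holds one 0x39 byte and three zero bytes, while bytes_of(57.0f) holds 0x42, 0x64, 0x00 and 0x00, in an order that depends on endianness.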
I have the following char array:
char* a = new char[6]{0};
Which in binary is:
00000000 00000000 00000000 00000000 00000000 00000000
I also have an integer:
int i = 123984343;
Which in binary is:
00000111 01100011 11011001 11010111
I would like to insert this 4-byte-integer i into the char array a from position [1] to position [4] so that the original array a becomes:
00000000 00000111 01100011 11011001 11010111 00000000
What is the quickest and easiest method to do that?
You can solve the problem as asked with
memcpy( &a[1], &i, sizeof(i) );
but I bet dollars to doughnuts that this is not the best way of solving your problem.
for (size_t ix = 0; ix < 4; ix++)
{
a[1+ix] = (static_cast<unsigned int>(i) >> (8*ix)) & 0xff;
}
This is a safe way of serializing an int that fits into four bytes into a character array. It stores the least significant byte first, and neither this end nor the other end has to make non-portable assumptions.
I'm not convinced that even this is the best way of solving your actual problem (but it is hard to tell without more information).
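If the layout from the question is wanted (most significant byte at position [1]), the shift amounts can be reversed. The following is only a sketch of that idea; the name store_be32 is invented here, and it uses unsigned char so the byte values are never negative.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: writes `value` into dst[0..3], most significant
// byte first, independent of the host's endianness.
void store_be32(unsigned char* dst, std::uint32_t value)
{
    for (std::size_t ix = 0; ix < 4; ++ix) {
        // ix = 0 takes bits 31..24, ix = 3 takes bits 7..0
        dst[ix] = static_cast<unsigned char>((value >> (8 * (3 - ix))) & 0xffu);
    }
}
```

With a six-byte buffer, store_be32(buffer + 1, 123984343u) leaves the first and last bytes untouched and fills positions [1] to [4] with 0x07, 0x63, 0xD9 and 0xD7.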
Use the copy algorithm and a cast to char to access the underlying byte sequence:
#include <algorithm>
#include <cstdint>
std::uint32_t n = 123984343;
char * a = new char[6]{};
{
const char * p = reinterpret_cast<const char *>(&n);
std::copy(p, p + sizeof n, a + 1);
}
In this case I am assuming that you are guaranteeing me that the bytes of the integer are in fact what you claim. This is platform-dependent, and integers may in general be laid out differently. Perhaps an algebraic operation would be more appropriate. You still need to consider the number of bits in a char; the following code works if uint8_t is supported:
std::uint8_t * p = reinterpret_cast<std::uint8_t *>(a);
p[1] = n / 0x1000000;
p[2] = n / 0x0010000;
p[3] = n / 0x0000100;
p[4] = n / 0x0000001;
If int is 4 bytes, copy the int to the address of the second position in the char array.
int i = 123984343;
char* a = new char[6]{0};
memcpy(a+1, &i, sizeof(i));
Right now I'm watching this lecture:
https://www.youtube.com/watch?v=jTSvthW34GU
At around the 50th minute of the video, he says that this code will return a non-zero value:
float f = 7.0;
short s = *(short*)&f;
Correct me if I'm mistaken:
&f is a pointer to float.
We take &f and cast it to pointer to short.
Then we dereference that pointer (I don't know if that's a verb), so eventually the whole statement represents a value of 7.
If I print that it displays 0. Why?
Dereferencing through a cast pointer does not cause a conversion to take place the way casting a value does. No bits are changed. So, while
float f = 7.0;
short s = (short)f;
will result in s having the integer value 7,
short s = *(short *)&f;
will simply copy the first 16 bits (depending on platform) of the floating point representation of the value 7.0 into the short. On my system, using little-endian IEEE-754, those bits are all zero, so the value is zero.
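The difference between the two can be sketched side by side. The function names below are made up for this illustration, and std::memcpy stands in for the pointer cast so that the example stays within defined behavior.

```cpp
#include <cstdint>
#include <cstring>

// Value conversion: the float 7.0 becomes the integer 7.
short convert(float f)
{
    return static_cast<short>(f);
}

// Reinterpretation: copy the first two bytes of the float unchanged.
std::uint16_t first_two_bytes(float f)
{
    std::uint16_t raw = 0;
    std::memcpy(&raw, &f, sizeof raw);   // no bits are changed
    return raw;
}
```

Assuming IEEE 754 floats, first_two_bytes(7.0f) is 0 on a little-endian machine and 0x40E0 on a big-endian one, while convert(7.0f) is 7 everywhere.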
Floats are represented internally as 4-byte floating-point numbers (1 sign bit, 8 exponent bits, 23 mantissa bits), while shorts are 2-byte integer types (two's complement numbers). The code above will reinterpret the top two or bottom two bytes (depending on endianness) of the floating-point number as a short integer.
So in the case of 7.0, the floating point number looks like:
0_1000000 1_1100000 00000000 00000000
So on some machines it will take the bottom 2 bytes (all 0s), and on others it will take the top 2 bytes (non-zero).
For more, see:
Floating-point: http://en.wikipedia.org/wiki/Floating_point
Endianness: http://en.wikipedia.org/wiki/Endianness
Casting a pointer to a different type does not cause any conversion of the pointed-to value; you are just interpreting the pointed-to bytes through the "lens" of a different type.
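In general, reading an object through a pointer that has been cast to an unrelated type is undefined behavior (it violates the strict aliasing rule); pointers to char and unsigned char are the exception. In this case, what you observe happens to depend on your architecture.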
To get a picture of what is going on, we can write a general function that will display the bits of an object given a pointer to it:
#include <cstddef>
#include <iostream>

template <typename T>
void display_bits(T const * p)
{
    // Walk the object byte by byte, in memory order.
    char const * c = reinterpret_cast<char const *>(p);
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        unsigned char b = static_cast<unsigned char>(*(c++));
        // Print the 8 bits of this byte, most significant bit first.
        for (int j = 0; j < 8; ++j) {
            std::cout << ((b & 0x80) ? '1' : '0');
            b <<= 1;
        }
        std::cout << ' ';
    }
    std::cout << std::endl;
}
If we run the following code, this will give you a good idea of what is going on:
int main() {
float f = 7.0;
display_bits(&f);
display_bits(reinterpret_cast<short*>(&f));
return 0;
}
The output on my system is:
00000000 00000000 11100000 01000000
00000000 00000000
The result you get should now be pretty clear, but again it depends on the compiler and/or architecture. For example, using the same representation for float but on a big-endian machine, the result would be quite different because the bytes in the float would be reversed. In that case the short* would be pointing at the bytes 01000000 11100000.
I was searching for code to determine the endianness of the system, and this is what I found:
#include <stdio.h>

int main()
{
    unsigned int i = 1;
    char *c = (char *)&i;
    if (*c) {
        printf("Little Endian\n");
    } else {
        printf("Big Endian\n");
    }
}
Could someone tell me how this code works? More specifically, why is the ampersand needed in this cast:
char *c = (char *)&i;
What is stored in the pointer c: the value i contains, or the address at which i is stored? Also, why is a char used in this program?
When dereferencing a character pointer, only one byte is interpreted (assuming a char variable takes one byte). In little-endian mode, the least significant byte of an integer is stored first. So a 4-byte integer, say 3, is stored as
00000011 00000000 00000000 00000000
while for big-endian mode it is stored as:
00000000 00000000 00000000 00000011
So in the first case, the char* interprets the first byte and displays 3 but in the second case it displays 0.
Had you not cast it, as in:
char *c = (char *)&i;
the compiler would show a warning about an incompatible pointer type. Had c been an integer pointer, dereferencing it would yield the integer value 3 irrespective of the endianness, as all 4 bytes would be interpreted.
NB: You need to initialize the variable i to see the whole picture. Otherwise a garbage value is stored in the variable by default.
Warning!! OP, we discussed the difference between little-endian and big-endian, but it's more important to know the difference between little-endian and little-indian. I noticed that you used the latter. Well, the difference is that little-indian can cost you your dream job at Google or $3 million in venture capital if your interviewer is a Nikesh Arora, Sundar Pichai, Vinod Dham or Vinod Khosla :-)
Let's try to walk through this: (in comments)
#include <stdio.h>

int main(void){
    unsigned int i = 1;     // i is an int in memory that can be conceptualized as
                            // int[0x00 00 00 01]
    char *c = (char *)&i;   // We take the address of i and cast it to a char pointer.
                            // Dereferencing c reads only one byte (a char) of the
                            // int (typically 4 bytes): the byte at the lowest address.
    if(*c){                 // Which byte that is depends on endianness.
        puts("little!\n");  // On a little-endian machine, 0x01 is the byte read,
    } else {                // but on a big-endian machine, 0x00 is read.
        puts("big!\n");     // (char)[01] 00 00 00  vs  (char)[00] 00 00 01
    }
    return 0;
}
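The same check can also be wrapped in a small function. This is only a sketch under the assumption that the platform is either little- or big-endian; the name is_little_endian is invented here, and std::memcpy is used to read the first byte instead of a pointer cast.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the least significant byte of a
// multi-byte integer is stored at the lowest address.
bool is_little_endian()
{
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);   // read only the first byte in memory
    return first == 1;
}
```

Because the probe value is 1, the first byte in memory is 1 only when the least significant byte comes first.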
I'm not used to binary files, and I'm trying to get the hang of them. I managed to store some integers and unsigned chars, and read them back without too much pain. Now that I'm trying to save some booleans, I see that each of my bools takes exactly 1 octet in my file, which seems logical since a lone bool is stored in a char-sized piece of data (correct me if I'm wrong!).
But since I'm going to have 3 or 4 bools to serialize, I figure it is a waste to store them like this : 00000001 00000001 00000000, for instance, when I could have 00000110. I guess to obtain this I should use bitwise operation, but I'm not very good with them... so could somebody tell me:
How to store up to 8 bools in a single octet using bitwise manipulations?
How to give proper values to (up to 8 bools) from a single octet using bitwise manipulation?
(And, bonus question: can anybody recommend a simple bit manipulation tutorial for a non-mathematically-oriented mind like mine, if one exists? Everything I found I understood but could not put into practice...)
I'm using C++, but I guess most C-like languages will use the same kind of operations.
To store bools in a byte:
bool flag; // value to store
unsigned char b = 0; // all false
int position; // ranges from 0..7
b = b | (flag << position);
To read it back:
flag = (b & (1 << position));
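Note that OR-ing can only turn a bit on. As a sketch of a version that also handles overwriting true with false, the two operations could be wrapped as follows; the function names write_flag and read_flag are invented for this example.

```cpp
#include <cstddef>

// Hypothetical helper; `position` must be in the range 0..7.
// Sets the bit for true and clears it for false, so overwriting works
// in both directions.
unsigned char write_flag(unsigned char b, std::size_t position, bool flag)
{
    const unsigned char mask = static_cast<unsigned char>(1u << position);
    return flag ? static_cast<unsigned char>(b | mask)
                : static_cast<unsigned char>(b & ~mask);
}

// Hypothetical helper: reads the bool stored at `position` (0..7).
bool read_flag(unsigned char b, std::size_t position)
{
    return (b & (1u << position)) != 0;
}
```

Starting from b = 0, writing true at positions 1 and 2 gives 00000110, the value from the question.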
The easy way is to use std::bitset which allows you to use indexing to access individual bits (bools), then get the resulting value as an integer. It also allows the reverse.
#include <bitset>
#include <iostream>

int main() {
    std::bitset<8> s;
    s[1] = s[2] = true; // 0b_0000_0110
    std::cout << s.to_ulong() << '\n';
}
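For the reverse direction, a std::bitset can be constructed from the stored integer and then indexed. The helper names below are hypothetical and only sketch the round trip.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical helper: unpacks a byte and reports whether bit `n` is set.
bool bit_from_byte(unsigned char byte, std::size_t n)
{
    std::bitset<8> s(byte);   // construct the bitset from the stored integer
    return s[n];              // index the individual bit (n must be below 8)
}

// Hypothetical helper: packs a bitset back into a single byte.
unsigned char byte_from_bits(const std::bitset<8>& s)
{
    return static_cast<unsigned char>(s.to_ulong());
}
```

Reading a byte back from the file and passing it to bit_from_byte recovers each stored bool in turn.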
Without wrapping in fancy template/pre-processor machinery:
Set bit 3 in var: var |= (1 << 3)
Set bit n in var: var |= (1 << n)
Clear bit n in var: var &= ~(1 << n)
Test bit n in var: !!(var & (1 << n)) (the !! ensures the result is 0 or 1)
Try reading this in order.
http://www.cprogramming.com/tutorial/bitwise_operators.html
http://www-graphics.stanford.edu/~seander/bithacks.html#ConditionalSetOrClearBitsWithoutBranching
Some people will think that the 2nd link is way too hardcore, but once you master simple manipulation, it will come in handy.
Basic stuff first:
The only combination of bits that means false is 00000000; all the others mean true, e.g. 00001000, 01010101.
00000000 = 0(decimal), 00000001 = 2^0, 00000010 = 2^1, 00000100 = 2^2, …. ,10000000 = 2^7
There is a big difference between the operators (&&, ||) and (&, |). The first pair gives the result of the logical operation between the two numbers, for example:
00000000 && 00000000 = false,
01010101 && 10101010 = true
00001100 || 00000000 = true,
00000000 || 00000000 = false
The second pair makes a bitwise operation (the logic operation between each bit of the numbers):
00000000 & 00000000 = 00000000 = false
00001111 & 11110000 = 00000000 = false
01010101 & 10101001 = 00000001 = true
00001111 | 11110000 = 11111111 = true
00001100 | 00000011 = 00001111 = true
To work with this and play with the bits, you only need to know some basic tricks:
To set a bit to 1, apply | with an octet that has a 1 in that position and zeros in the rest.
For example: if we want the first bit of the octet A to be 1, we compute A|00000001
To set a bit to 0, apply & with an octet that has a 0 in that position and ones in the rest.
For example: if we want the last bit of the octet A to be 0, we compute A&01111111
To get the Boolean value held by a bit, apply & with an octet that has a 1 in that position and zeros in the rest.
For example: to see the value of the third bit of the octet A, we compute A&00000100. If A was XXXXX1XX we get 00000100 = true, and if A was XXXXX0XX we get 00000000 = false.
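The three tricks above translate almost literally into C++ when binary literals (available since C++14) are used for the masks. The function names are made up for this sketch.

```cpp
// Hypothetical helpers mirroring the three tricks on an octet A.

// Set the first (lowest) bit to 1: A | 00000001
unsigned char set_first_bit(unsigned char A)
{
    return static_cast<unsigned char>(A | 0b00000001);
}

// Set the last (highest) bit to 0: A & 01111111
unsigned char clear_last_bit(unsigned char A)
{
    return static_cast<unsigned char>(A & 0b01111111);
}

// Read the third bit as a Boolean: A & 00000100
bool third_bit(unsigned char A)
{
    return (A & 0b00000100) != 0;
}
```

Each function leaves every other bit of A unchanged, which is what makes these masks suitable for packing several bools into one octet.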
You can always serialize bitfields. Something like:
struct bools
{
bool a:1;
bool b:1;
bool c:1;
bool d:1;
};
typically has a sizeof of 1, although the exact layout of bit-fields is implementation-defined.
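Because the bit-field layout is up to the compiler, a file written by one build may not be readable by another. One possible way around this, sketched below with invented names (flags4, pack, unpack), is to keep the bit-field struct in memory but convert it to a byte explicitly before writing.

```cpp
// Same shape as the struct above, renamed for this sketch.
struct flags4
{
    bool a:1;
    bool b:1;
    bool c:1;
    bool d:1;
};

// Hypothetical helper: a goes to bit 0, b to bit 1, c to bit 2, d to bit 3.
unsigned char pack(const flags4& v)
{
    return static_cast<unsigned char>(
        (v.a << 0) | (v.b << 1) | (v.c << 2) | (v.d << 3));
}

// Hypothetical helper: inverse of pack.
flags4 unpack(unsigned char byte)
{
    flags4 v{};
    v.a = (byte & 0x01u) != 0;
    v.b = (byte & 0x02u) != 0;
    v.c = (byte & 0x04u) != 0;
    v.d = (byte & 0x08u) != 0;
    return v;
}
```

The bit positions chosen here are an arbitrary convention; what matters is that the writer and the reader agree on them.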