I have a short integer variable called s_int that holds value = 2
unsigned short s_int = 2;
I want to copy this number into the first and second positions of a char array.
Let's say we have char buffer[10];. We want the two bytes of s_int to be copied at buffer[0] and buffer[1].
How can I do it?
The usual way to do this would be with the bitwise operators to slice and dice it, a byte at a time:
buffer[0] = s_int & 0xff;
buffer[1] = (s_int >> 8) & 0xff;
though this should really be done into an unsigned char array, not a plain char one, since plain char is signed on most systems.
Storing larger integers can be done in a similar way, or with a loop.
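The loop mentioned above could look something like the sketch below. The name store_le is made up for illustration, and it assumes 8-bit bytes and an unsigned integer type; it writes the least significant byte first regardless of the host's byte order.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper (not from the original answer): writes `value` into
// `out`, least significant byte first, one byte per loop iteration.
// Assumes 8-bit bytes and that `out` has room for sizeof(T) bytes.
template <typename T>
void store_le(T value, unsigned char* out) {
    static_assert(std::is_unsigned<T>::value, "use an unsigned integer type");
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        out[i] = static_cast<unsigned char>(value & 0xff); // lowest 8 bits
        value = static_cast<T>(value >> 8);                 // move to the next byte
    }
}
```

With unsigned short s_int = 2 and an unsigned char buffer[10], calling store_le(s_int, buffer) leaves 2 in buffer[0] and 0 in buffer[1].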
*((short*)buffer) = s_int;
But caveat emptor: the resulting byte order will vary with endianness.
By using pointers and casts.
unsigned short s_int = 2;
unsigned char buffer[sizeof(unsigned short)];
// 1.
unsigned char * p_int = (unsigned char *)&s_int;
buffer[0] = p_int[0];
buffer[1] = p_int[1];
// 2.
memcpy(buffer, (unsigned char *)&s_int, sizeof(unsigned short));
// 3.
std::copy((unsigned char *)&s_int,
((unsigned char *)&s_int) + sizeof(unsigned short),
buffer);
// 4.
unsigned short * p_buffer = (unsigned short *)(buffer); // May have alignment issues
*p_buffer = s_int;
// 5.
union Not_To_Use
{
unsigned short s_int;
unsigned char buffer[2];
};
union Not_To_Use converter;
converter.s_int = s_int;
buffer[0] = converter.buffer[0];
buffer[1] = converter.buffer[1];
I would memcpy it, something like
memcpy(buffer, &s_int, 2);
The endianness is preserved, so if you cast buffer to unsigned short * you read back the same value as s_int. Other solutions must be endian-aware, or you may have to swap the LSB and MSB. And of course sizeof(short) must be 2.
If you don't want to do all that bitwise stuff, you could do the following:
char* where = (char*)malloc(10);
short int a = 25232;
where[0] = *((char*)(&a) + 0);
where[1] = *((char*)(&a) + 1);
I have a string of 256*4 bytes of data. These 256* 4 bytes need to be converted into 256 unsigned integers. The order in which they come is little endian, i.e. the first four bytes in the string are the little endian representation of the first integer, the next 4 bytes are the little endian representation of the next integer, and so on.
What is the best way to parse through this data and merge these bytes into unsigned integers? I know I have to use bitshift operators but I don't know in what way.
Hope this helps you
unsigned int arr[256];
char ch[256*4] = "your string";
for(int i = 0,k=0;i<256*4;i+=4,k++)
{
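// convert each byte to unsigned char first so values >= 0x80 are not sign-extended
arr[k] = (unsigned char)ch[i] | (unsigned char)ch[i+1]<<8 | (unsigned char)ch[i+2]<<16 | (unsigned int)(unsigned char)ch[i+3]<<24;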
}
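For reuse, the same shifting idea can be wrapped in a function. This is only a sketch: decode_le32 is a made-up name, and it assumes 8-bit bytes and that data holds at least 4 * count bytes. Going through unsigned char keeps bytes of 0x80 and above from being sign-extended.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decodes `count` consecutive 4-byte little-endian
// groups from `data` into 32-bit unsigned integers.
std::vector<std::uint32_t> decode_le32(const char* data, std::size_t count) {
    std::vector<std::uint32_t> out(count);
    for (std::size_t k = 0; k < count; ++k) {
        // View the bytes as unsigned so no sign extension can occur.
        const unsigned char* p =
            reinterpret_cast<const unsigned char*>(data) + 4 * k;
        out[k] = static_cast<std::uint32_t>(p[0])
               | static_cast<std::uint32_t>(p[1]) << 8
               | static_cast<std::uint32_t>(p[2]) << 16
               | static_cast<std::uint32_t>(p[3]) << 24;
    }
    return out;
}
```

Because the value is built arithmetically, the result does not depend on the byte order of the machine running the code.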
Alternatively, we can use C/C++ casting to interpret a char buffer as an array of unsigned int. This avoids the shifting, but the result then depends on the host's endianness (and the buffer must be suitably aligned for unsigned int).
#include <stdio.h>
int main()
{
char buf[256*4] = "abcd";
unsigned int *p_int = ( unsigned int * )buf;
unsigned short idx = 0;
unsigned int val = 0;
for( idx = 0; idx < 256; idx++ )
{
val = *p_int++;
printf( "idx = %d, val = %u \n", idx, val );
}
}
This would print out 256 values, the first one is
idx = 0, val = 1684234849
(and all remaining numbers = 0).
As a side note, "abcd" converts to 1684234849 because this was run on x86 (little-endian), where "abcd" is read as 0x64636261 ('a' is 0x61 and 'd' is 0x64; in little-endian, the least significant byte is at the lowest address). So 0x64636261 = 1684234849.
Note also, if using C++, reinterpret_cast should be used in this case:
const char *p_buf = "abcd";
const unsigned int *p_int = reinterpret_cast< const unsigned int * >( p_buf );
If your data is little-endian, just read 4 bytes at a time, shift each one into place and combine them into an int (the shifts give the same result on any host):
char bytes[4] = "....";
int i = (unsigned char)bytes[0] | ((unsigned char)bytes[1] << 8) | ((unsigned char)bytes[2] << 16) | ((unsigned int)(unsigned char)bytes[3] << 24);
If the data is big-endian instead, do the same but reverse the order while shifting, i.e. just change the indexes of bytes[] from 0-3 to 3-0.
But you don't even need to do that: if your PC is little-endian, just copy the whole char array to the int array.
#define LEN 256
char bytes[LEN*4] = "blahblahblah";
unsigned int uint[LEN];
memcpy(uint, bytes, sizeof bytes);
That said, the best way is to avoid copying at all and use the same array for both types
union
{
char bytes[LEN*4];
unsigned int uint[LEN];
} myArrays;
// copy data to myArrays.bytes[], do something with those bytes if necessary
// after populating myArrays.bytes[], get the ints by myArrays.uint[i]
I scan through the byte representation of an int variable and get somewhat unexpected result.
If I do
int a = 127;
cout << (unsigned int) *((char *)&a);
I get 127 as expected. If I do
int a = 256;
cout << (unsigned int) *((char *)&a + 1);
I get 1 as expected. But if I do
int a = 128;
cout << (unsigned int) *((char *)&a);
I have 4294967168 which is, well… quite fancy.
The question is: is there a way to get 128 when looking at first byte of an int variable which value is 128?
For the same reason that (unsigned int)(char)128 is 4294967168: char is signed by default on most commonly used systems. 128 cannot fit in a signed 8-bit quantity, so when you cast it to char, you get -128 (0x80 in hex).
Then, when you cast -128 to an unsigned int, you get 2^32 - 128, which is 4294967168.
If you want to get +128, then use an unsigned char instead of char.
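A small sketch of the difference, using made-up helper names and fixed-width types so the numbers do not depend on how wide int is. It takes the raw byte as a parameter rather than reading it out of an int, so endianness plays no part.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterprets the byte as a signed char, then widens.
// The sign bit is copied into the upper bits (sign extension).
std::uint32_t widen_as_signed(unsigned char raw) {
    signed char s;
    std::memcpy(&s, &raw, 1);             // same bit pattern, signed view
    return static_cast<std::uint32_t>(s); // -128 becomes 2^32 - 128
}

// Hypothetical helper: widens the byte as unsigned, so the upper bits are 0.
std::uint32_t widen_as_unsigned(unsigned char raw) {
    return static_cast<std::uint32_t>(raw);
}
```

For the byte 0x80, the signed route yields 4294967168 and the unsigned route yields 128; for bytes up to 0x7f the two agree.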
char is signed here. In your second example, (char *)&a + 1 points at the second byte of a; since 256 is 0x00000100, that byte is 1 on a little-endian machine, so you get 1 as an unsigned int.
In your third example, *((char *)&a) = (char)128 = -128, which is sign-extended to 0xFFFFFF80 when converted to a 32-bit unsigned int, i.e. 4294967168.
As the comments have pointed out, it looks like what's happening here is that you are running into an oddity of twos complement. In your last cast, since you are not using an unsigned char, the highest-order bit of the byte is being used to indicate positive or negative values. You then only have 7 bits out of the full 8 to represent your value, giving you a range of 0-127 for positive numbers (-128-127 overall).
If you exceed this range, then it wraps, and you get -128, which when cast back to an unsigned int will result in that abnormally large value.
int a = 128;
cout << (unsigned int) *((unsigned char *)&a);
Also all of your code is dependent on running on a little endian machine.
Here's how you should probably be doing these things:
int a = 127;
cout << (unsigned)(unsigned char)(0xFF & a);
int a = 256;
cout << (unsigned)(unsigned char)(0xFF & (a>>8));
int a = 128;
cout << (unsigned)(unsigned char)(0xFF & a);
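The three snippets above follow one pattern, which could be folded into a single function along these lines. The name nth_byte is invented here, and the sketch assumes a 32-bit value with n between 0 and 3.

```cpp
#include <cstdint>

// Hypothetical helper: returns byte n of `value`, where n = 0 is the least
// significant byte. Works by arithmetic, so it is independent of endianness.
// Precondition (assumed, not checked): n is in the range 0..3.
unsigned nth_byte(std::uint32_t value, unsigned n) {
    return (value >> (8 * n)) & 0xffu;
}
```

For example, nth_byte(127, 0) is 127, nth_byte(256, 1) is 1, and nth_byte(128, 0) is 128.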
I have
typedef unsigned int DWORD;
void write_str(string str, char** buf) {
DWORD len = str.size();
**buf = len;
*buf += sizeof(len);
memcpy(*buf, str.c_str(), len);
*buf += len;
}
With this code, only 1 byte is overwritten by **buf = len; (for example, when len is 7), whereas 4 bytes should be written, since sizeof(DWORD) = 4.
As buf is a char **, **buf is a char. It can hold only a single byte. Therefore, only a single byte is written to it.
Fix:
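DWORD *tmpptr = reinterpret_cast<DWORD *>(*buf); // explicit cast required; assumes *buf is suitably aligned for a DWORD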
*tmpptr = len;
C++ is automatically casting len to a char, since that is what *buf is.
You have the parameter
char** buf
Meaning that **buf is a char, which is very likely a single byte.
1 byte is overwritten since the destination type is char (the type of **buf is char). This is correct. But the expression *buf += sizeof(len) has no meaning in my opinion.
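One way to write all four length bytes without casting the pointer is memcpy, which also sidesteps alignment concerns. This is a sketch rather than the asker's final code: write_str_fixed is a made-up name, std::uint32_t stands in for DWORD, and the caller is assumed to have made the buffer large enough.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical variant of write_str: stores a 4-byte length prefix (in the
// host's byte order) followed by the characters, and advances *buf past both.
// Assumes the buffer has at least 4 + str.size() bytes available.
void write_str_fixed(const std::string& str, char** buf) {
    const std::uint32_t len = static_cast<std::uint32_t>(str.size());
    std::memcpy(*buf, &len, sizeof len);  // copies all 4 bytes of len
    *buf += sizeof len;                   // step over the length prefix
    std::memcpy(*buf, str.data(), len);   // copy the characters (no '\0')
    *buf += len;                          // step over the characters
}
```

Advancing *buf after each write lets the caller chain several calls to fill one buffer sequentially.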
I need to convert integer value into char array on bit layer. Let's say int has 4 bytes and I need to split it into 4 chunks of length 1 byte as char array.
Example:
int a = 22445;
// this is in binary 00000000 00000000 01010111 10101101
...
//and the result I expect
char b[4];
b[0] = 0; //first chunk
b[1] = 0; //second chunk
b[2] = 87; //third chunk - in binary 1010111
b[3] = 173; //fourth chunk - 10101101
I need this conversion to be really fast, if possible without any loops (some tricks with bit operations perhaps). The goal is thousands of such conversions per second.
I'm not sure if I recommend this, but you can #include <stdint.h> and <arpa/inet.h> (on POSIX systems) and write:
*(uint32_t *)b = htonl((uint32_t)a);
(The htonl is to ensure that the integer is in big-endian order before you store it.)
int a = 22445;
char *b = (char *)&a;
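char b2 = *(b+2); // = 87 on a big-endian host (on little-endian, 87 is at b[1])
char b3 = *(b+3); // = 173 on a big-endian host (on little-endian, 173 is at b[0])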
Depending on how you want negative numbers represented, you can simply convert to unsigned and then use masks and shifts:
unsigned char b[4];
unsigned ua = a;
b[0] = (ua >> 24) & 0xff;
b[1] = (ua >> 16) & 0xff;
b[2] = (ua >> 8) & 0xff;
b[3] = ua & 0xff;
(Due to the C rules for converting negative numbers to unsigned, this will produce the twos complement representation for negative numbers, which is almost certainly what you want).
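Packaged as a function, the masks and shifts above might look like the following sketch. The name to_big_endian is invented, and it assumes a 32-bit int (spelled std::int32_t here to make that explicit).

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: splits a 32-bit value into 4 bytes, most significant
// byte first. Converting to unsigned first gives the two's complement bit
// pattern for negative inputs.
std::array<unsigned char, 4> to_big_endian(std::int32_t a) {
    const std::uint32_t ua = static_cast<std::uint32_t>(a);
    return {{
        static_cast<unsigned char>((ua >> 24) & 0xff),
        static_cast<unsigned char>((ua >> 16) & 0xff),
        static_cast<unsigned char>((ua >> 8) & 0xff),
        static_cast<unsigned char>(ua & 0xff),
    }};
}
```

For 22445 this gives the bytes 0, 0, 87, 173 asked for in the question, and for -1 all four bytes are 255.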
To access the binary representation of any type, you can cast a pointer to a char-pointer:
T x; // anything at all!
// In C++
unsigned char const * const p = reinterpret_cast<unsigned char const *>(&x);
/* In C */
unsigned char const * const p = (unsigned char const *)(&x);
// Example usage:
for (std::size_t i = 0; i != sizeof(T); ++i)
std::printf("Byte %zu is 0x%02X.\n", i, p[i]);
That is, you can treat p as the pointer to the first element of an array unsigned char[sizeof(T)]. (In your case, T = int.)
I used unsigned char here so that you don't get any sign extension problems when printing the binary value (e.g. through printf in my example). If you want to write the data to a file, you'd use char instead.
You have already accepted an answer, but I will still give mine, which might suit you better (or the same...). This is what I tested with:
int a[3] = {22445, 13, 1208132};
for (int i = 0; i < 3; i++)
{
unsigned char * c = (unsigned char *)&a[i];
cout << (unsigned int)c[0] << endl;
cout << (unsigned int)c[1] << endl;
cout << (unsigned int)c[2] << endl;
cout << (unsigned int)c[3] << endl;
cout << "---" << endl;
}
...and it works for me. Now I know you requested a char array, but this is equivalent. You also requested that c[0] == 0, c[1] == 0, c[2] == 87, c[3] == 173 for the first case, here the order is reversed.
Basically, you use the SAME value, you only access it differently.
Why haven't I used htonl(), you might ask?
Well since performance is an issue, I think you're better off not using it because it seems like a waste of (precious?) cycles to call a function which ensures that bytes will be in some order, when they could have been in that order already on some systems, and when you could have modified your code to use a different order if that was not the case.
So instead, you could have checked the order before, and then used different loops (more code, but improved performance) based on what the result of the test was.
Also, if you don't know if your system uses a 2 or 4 byte int, you could check that before, and again use different loops based on the result.
Point is: you will have more code, but you will not waste cycles in a critical area, which is inside the loop.
If you still have performance issues, you could unroll the loop (duplicate code inside the loop, and reduce loop counts) as this will also save you a couple of cycles.
Note that using c[0], c[1] etc.. is equivalent to *(c), *(c+1) as far as C++ is concerned.
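The "check the order before" idea could be sketched as below. Both function names are made up, and the sketch assumes the host is either plain little-endian or plain big-endian (no mixed-endian layouts) with a 4-byte std::uint32_t.

```cpp
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical helper: true if the lowest-addressed byte of an integer
// holds its least significant bits.
bool host_is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Hypothetical helper: copies the native bytes of `value` into `out` and
// reverses them only when the host is little-endian, so out[0] always ends
// up holding the most significant byte.
void bytes_msb_first(std::uint32_t value, unsigned char out[4]) {
    std::memcpy(out, &value, 4);
    if (host_is_little_endian()) {
        std::swap(out[0], out[3]);
        std::swap(out[1], out[2]);
    }
}
```

In a hot loop you would call host_is_little_endian() once up front and pick the loop body accordingly, rather than testing on every iteration as this simple version does.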
typedef union{
byte intAsBytes[4];
int int32;
}U_INTtoBYTE;
I want to store a 4-byte int in a char array... such that the first 4 locations of the char array are the 4 bytes of the int.
Then, I want to pull the int back out of the array...
Also, bonus points if someone can give me code for doing this in a loop... IE writing like 8 ints into a 32 byte array.
int har = 0x01010101;
char a[4];
int har2;
// write har into char such that:
// a[0] == 0x01, a[1] == 0x01, a[2] == 0x01, a[3] == 0x01 etc.....
// then, pull the bytes out of the array such that:
// har2 == har
Thanks guys!
EDIT: Assume int are 4 bytes...
EDIT2: Please don't care about endianness... I will be worrying about endianness. I just want different ways to achieve the above in C/C++. Thanks
EDIT3: If you can't tell, I'm trying to write a serialization class on the low level... so I'm looking for different strategies to serialize some common data types.
Unless you care about byte order and such, memcpy will do the trick:
memcpy(a, &har, sizeof(har));
...
memcpy(&har2, a, sizeof(har2));
Of course, there's no guarantee that sizeof(int)==4 on any particular implementation (and there are real-world implementations for which this is in fact false).
Writing a loop should be trivial from here.
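For the bonus part of the question, such a loop might look like this sketch. The names pack_ints and unpack_ints are invented, and it uses sizeof(int) rather than a hard-coded 4, so the byte array must hold count * sizeof(int) bytes. Bytes are stored in the host's native order.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies `count` ints into `out`, one after another.
// `out` must have room for count * sizeof(int) bytes.
void pack_ints(const int* values, std::size_t count, char* out) {
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(out + i * sizeof(int), &values[i], sizeof(int));
    }
}

// Hypothetical helper: the reverse of pack_ints.
void unpack_ints(const char* in, std::size_t count, int* values) {
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(&values[i], in + i * sizeof(int), sizeof(int));
    }
}
```

Since the ints are contiguous, a single memcpy of count * sizeof(int) bytes would do the same job; the loop form is shown because it extends naturally to mixing different types in one buffer.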
Not the most efficient way, but it is endian-safe.
int har = 0x01010101;
char a[4];
a[0] = har & 0xff;
a[1] = (har>>8) & 0xff;
a[2] = (har>>16) & 0xff;
a[3] = (har>>24) & 0xff;
#include <stdio.h>
int main(void) {
char a[sizeof(int)];
*((int *) a) = 0x01010101;
printf("%d\n", *((int *) a));
return 0;
}
Keep in mind:
A pointer to an object or incomplete type may be converted to a pointer to a different
object or incomplete type. If the resulting pointer is not correctly aligned for the
pointed-to type, the behavior is undefined.
Note: In C++, reading a union member other than the one most recently written is undefined behavior (C permits this kind of type punning).
(assuming a platform where characters are 8bits and ints are 4 bytes)
A bit mask of 0xFF will mask off one character so
char arr[4];
int a = 5;
arr[3] = a & 0xff;
arr[2] = (a & 0xff00) >>8;
arr[1] = (a & 0xff0000) >>16;
arr[0] = (a & 0xff000000)>>24;
would make arr[0] hold the most significant byte and arr[3] hold the least.
Edit: Just so you understand the trick, & is bitwise 'and', whereas && is logical 'and'.
Thanks to the comments about the forgotten shift.
#include <stdio.h>
int main() {
typedef union foo {
int x;
char a[4];
} foo;
foo p;
p.x = 0x01010101;
printf("%x ", p.a[0]);
printf("%x ", p.a[1]);
printf("%x ", p.a[2]);
printf("%x ", p.a[3]);
return 0;
}
Bear in mind that the a[0] holds the LSB and a[3] holds the MSB, on a little endian machine.
Don't use unions, Pavel clarifies:
It's U.B., because C++ prohibits accessing any union member other than the last one that was written to. In particular, the compiler is free to optimize away the assignment to int member out completely with the code above, since its value is not subsequently used (it only sees the subsequent read for the char[4] member, and has no obligation to provide any meaningful value there). In practice, g++ in particular is known for pulling such tricks, so this isn't just theory. On the other hand, using static_cast<void*> followed by static_cast<char*> is guaranteed to work.
– Pavel Minaev
You can also use placement new for this:
void foo (int i) {
char * c = new (&i) char[sizeof(i)];
}
#include <stdint.h>
#include <stdlib.h>
int main(int argc, char* argv[]) {
/* 8 ints in a loop */
int i;
int* intPtr;
int intArr[8] = {1, 2, 3, 4, 5, 6, 7, 8};
char* charArr = malloc(32);
for (i = 0; i < 8; i++) {
intPtr = (int*) &(charArr[i * 4]);
/* Reading right to left: take the location in the char array, take its address, cast it as int*, and point intPtr at it */
*intPtr = intArr[i]; /* write int at location pointed to */
}
/* Read ints out */
for (i = 0; i < 8; i++) {
intPtr = (int*) &(charArr[i * 4]);
intArr[i] = *intPtr;
}
char* myArr = malloc(13);
int myInt;
uint8_t* p8; /* unsigned 8-bit integer */
uint16_t* p16; /* unsigned 16-bit integer */
uint32_t* p32; /* unsigned 32-bit integer */
/* Using sizes other than 4-byte ints, */
/* set all bits in myArr to 1 */
p8 = (uint8_t*) &(myArr[0]);
p16 = (uint16_t*) &(myArr[1]);
p32 = (uint32_t*) &(myArr[5]);
*p8 = 255;
*p16 = 65535;
*p32 = 4294967295;
/* Get the values back out */
p16 = (uint16_t*) &(myArr[1]);
uint16_t my16 = *p16;
/* Put the 16 bit int into a regular int */
myInt = (int) my16;
}
char a[10];
int i=9;
a[0] = boost::lexical_cast<char>(i); // stores the character '9' in the first element
I found this to be the best way to convert an int to a char and vice versa.
An alternative to boost::lexical_cast is sprintf.
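char temp[6]; // 4 letters + 1 digit + terminating '\0'
temp[0]='h';
temp[1]='e';
temp[2]='l';
temp[3]='l';
temp[4]='\0';
sprintf(temp+4,"%d",9); // overwrites the terminator with "9" and a new '\0'
cout<<temp;
The output would be: hell9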
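If the number can have more than one digit, a bounded variant is safer than writing into a hand-sized array. The sketch below uses std::snprintf with a scratch buffer; append_number is a made-up name, and the 16-byte scratch size assumes int is at most 32 bits.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: returns `prefix` followed by the decimal text of
// `number`. snprintf never writes past the end of `digits`.
std::string append_number(const std::string& prefix, int number) {
    char digits[16]; // enough for a 32-bit int: sign, 10 digits, '\0'
    std::snprintf(digits, sizeof digits, "%d", number);
    return prefix + digits;
}
```

Calling append_number("hell", 9) produces the same "hell9" as the sprintf version above.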
union value {
int i;
char bytes[sizeof(int)];
};
value v;
v.i = 2;
char* bytes = v.bytes;