Change byte in int - casting byte to an integer - c++

I'm streaming data from the server. The server sends various big-endian variables, but it also sends single bytes (representing numbers). One of my SocketClient.read overloads accepts (int length, char* array). I want to pass a pointer to an integer variable to this function so that it ends up holding a 0-255 value (an unsigned byte).
What have I tried:
unsigned int UNSIGNED_BYTE;
socket.read(1, &((char*)&UNSIGNED_BYTE)[0]); //I am changing 1st byte of a variable - C++ uses little endian
//I know that function reads 6, and that is what I find in 1st byte
std::cout<<(int)((char*)&UNSIGNED_BYTE)[0]<<")\n"; //6 - correct
std::cout<<UNSIGNED_BYTE<<")\n"; //3435973638 -What the hell?
According to the above, I am changing the wrong part of the int. But what else should I change?
My class declaration and implementation:
/*Declares:*/
bool read(int bytes, char *text);
/*Implements:*/
bool SocketClient::read(int bytes, char *text) {
    //boost::system::error_code error;
    char buffer = 0;
    int length = 0;
    while(bytes > 0) {
        try
        {
            size_t len = sock.receive(boost::asio::buffer(&buffer, 1)); //Read a byte into the buffer
        }
        catch(const boost::system::system_error& ex) {
            std::cout << "Socket exception: " << ex.code() << '\n';
            return false; //Happens when the peer disconnects, for example
        }
        if(byteEcho)
            std::cout << (int)(unsigned char)buffer << ' ';
        bytes--; //Decrease the amount left to read
        text[length] = buffer;
        length++;
    }
    return true;
}

So firstly:
unsigned int UNSIGNED_BYTE;
Probably isn't very helpfully named, since I very much doubt the architecture you're using defines an int as an 8-bit unsigned integer; it's likely to be 32 or 64 bits in size on most modern compilers/architectures. Additionally, you're not initializing it to zero, and later you write to only part of it, leaving the rest as garbage.
Secondly:
socket.read(1, &((char*)&UNSIGNED_BYTE)[0])
Is reading 8 bits into a (probably) 32-bit memory location, and which end the 8 bits go into is not down to C++ (as you say in your comments). It's actually down to your CPU, since endianness is a property of the CPU, not the language. Why not read the value into an actual char and then simply assign that to an int? The assignment will deal with the conversion for you and will make your code portable.
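A minimal sketch of that suggestion, reusing the socket.read(length, char*) call from the question (the surrounding code is an assumption based on the snippets above):
char byteBuffer = 0;                    // read the byte into an actual char...
if (socket.read(1, &byteBuffer)) {      // ...using the read() shown in the question
    // ...then let the assignment do the conversion; going through unsigned char
    // guarantees a 0-255 result and no byte of the int is ever touched directly,
    // so endianness never enters the picture.
    unsigned int value = static_cast<unsigned char>(byteBuffer);
    std::cout << value << '\n';         // prints 6 for the example in the question
}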

The problem was that I did not initialise the int. Though the 1st byte was changed, the other 3 bytes had random values.
This makes the solution very simple (and also makes my question likely to be closed as too localised):
unsigned int UNSIGNED_BYTE = 0;

Related

Without type casting, how can I fill the bit fields

#include <iostream>
#include <bitset>
#include <cstdint>
#include <cstring>
typedef struct
{
    int i;
    char a[4];
    uint8_t j:1;
    uint8_t k:1;
} abctest;
int main()
{
    abctest tryabc;
    memset(&tryabc, 0x00, sizeof(tryabc));
    std::bitset<1> b;
    b = false;
    std::cout << b << '\n';
    b = true;
    std::cout << sizeof(b) << '\n';
}
My doubt is this: I have a char array, which is basically a structure received in some module, and this structure also contains bit fields. I can use memcpy, but I cannot
type cast the buffer to the structure (e.g. if my char* arr is actually of type struct abc, I cannot do abc* temp = (abc*)arr).
All I can do is memcpy, so I want to know how I can fill the bit fields without type casting.
If you know the literal data type and its size in bytes, a variable can be used with bit-shifting to store and extract bits from the array. This is a lower-level technique that still exists in C++ but is more associated with the low-level programming style of C.
Another way is to use the division and modulo with powers of 2 to encode bits at exact locations. I'd suggest you look up how binary works first and then figure out that shifting to the right by 1 actually divides by 2.
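For instance, here is a small sketch of the shift-and-mask approach applied to the j and k bits of the struct above, done purely with memcpy, shifts and masks; the bit positions chosen (j in bit 0, k in bit 1) are an assumption of this example, not something the original struct's bit-field layout guarantees:
#include <cstdint>
#include <cstring>
#include <iostream>

int main()
{
    unsigned char buf[1] = {0};   // pretend this byte arrived in the received buffer

    // Pack: put j into bit 0 and k into bit 1 of a plain byte.
    uint8_t j = 1, k = 0;
    unsigned char packed = static_cast<unsigned char>((j & 1u) | ((k & 1u) << 1));
    std::memcpy(buf, &packed, 1);           // only memcpy touches the buffer

    // Unpack: extract the bits again with shifts and masks.
    unsigned char raw = 0;
    std::memcpy(&raw, buf, 1);
    uint8_t j2 = raw & 1u;
    uint8_t k2 = (raw >> 1) & 1u;

    std::cout << int(j2) << ' ' << int(k2) << '\n';   // prints: 1 0
}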
Cheers!
Why can't you typecast a char array into an abctest pointer? I tested it and all works well:
#include <cstdint>
#include <cstdio>
#include <cstring>
typedef struct
{
    int i;
    char a[4];
    uint8_t j:1;
    uint8_t k:1;
} abctest;
int main(int argc, char **argv)
{
    char buf[9];
    abctest *abc = (abctest*)buf;
    memset(buf, 0x00, sizeof(buf));
    printf("%d\n", abc->j);
}
However, while you definitely CAN typecast a char array into an abctest pointer, it doesn't mean you SHOULD do that. I think you should definitely learn about data serialization and unserialization. If you want to convert a complex data structure into a character array, typecasting is not the solution, as the data structure members may have different sizes or alignment constraints on a 64-bit machine than on a 32-bit machine. Furthermore, if you typecast a char array into a struct pointer, the alignment may be incorrect, which can cause problems on RISC processors.
You could serialize the data by e.g. writing i as a 32-bit integer in network byte order, a as 4 characters, and j and k as two bits in one character (the remaining 6 being unused). Then when you unserialize it, you read i from the 32-bit integer in network byte order, a from the 4 characters, and the remaining character gives j and k.
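A rough sketch of that scheme, assuming the abctest typedef from above; the exact layout (4 bytes for i in network order, 4 bytes for a, then one byte carrying j and k) is just the example encoding described here, not a fixed format:
#include <cstdint>
#include <cstring>

void serialize(const abctest& s, unsigned char out[9])
{
    uint32_t u = (uint32_t)s.i;                  // i as a 32-bit big-endian value
    out[0] = (u >> 24) & 0xFF;
    out[1] = (u >> 16) & 0xFF;
    out[2] = (u >> 8)  & 0xFF;
    out[3] =  u        & 0xFF;
    std::memcpy(out + 4, s.a, 4);                // a as 4 raw characters
    out[8] = (unsigned char)((s.j & 1u) | ((s.k & 1u) << 1));  // j and k in one byte
}

void unserialize(const unsigned char in[9], abctest& s)
{
    uint32_t u = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
               | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    s.i = (int)u;
    std::memcpy(s.a, in + 4, 4);
    s.j = in[8] & 1u;
    s.k = (in[8] >> 1) & 1u;
}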

Display value of Hexa stored in Array

I have an array of hex values called
const char receiptLogo[] = {0x01,0x80,0x00,0xB4};
When I tried to print the value with Rprintf("%x\r\n", receiptLogo[3]);
the value was displayed as "ffffffb4", and sometimes it was displayed as "b4".
The whole function is:
void PRINT_PrintLogo(const char Data[])
{
    unsigned int height = 0;
    const char receiptLogo[] = {0x01, 0x80, 0x00, 0xB4};
    height = (((unsigned short)receiptLogo[2]) << 8) | ((unsigned short)receiptLogo[3]);
    Rprintf("height=%d,%x,%x\r\n", height, Data[2], Data[3]);
}
The output of this function is
height=65460,0,ffffffb4
although at other times the output is height=180,0,b4.
Kindly advise on the reason behind this.
In printf(), the %x format specifier prints an int in hexadecimal. If you pass a char as a parameter and its most significant bit is set, it is sign-extended to fill the size of an int, which is where the leading ff bytes come from. The compiler may also optimize by packing variables, and sometimes your variable may sit differently with respect to byte boundaries (i.e. the char may end up in the least significant or most significant byte of a 32-bit integer); that kind of access can explain why the output differs between runs.
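A common fix is to go through unsigned char so the value is zero-extended instead of sign-extended (assuming Rprintf takes printf-style format strings, as it appears to):
// Cast to unsigned char before the default promotion to int:
Rprintf("%x\r\n", (unsigned char)receiptLogo[3]);   // always prints b4, never ffffffb4

// The same applies to the height calculation:
unsigned int height = ((unsigned char)receiptLogo[2] << 8)
                    |  (unsigned char)receiptLogo[3];   // 0x00B4 == 180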

Pointers Casting Endianness

#include "stdio.h"
typedef struct CustomStruct
{
short Element1[10];
}CustomStruct;
void F2(char* Y)
{
*Y=0x00;
Y++;
*Y=0x1F;
}
void F1(CustomStruct* X)
{
F2((char *)X);
printf("s = %x\n", (*X).Element1[0]);
}
int main(void)
{
CustomStruct s;
F1(&s);
return 0;
}
The above C code prints 0x1f00 when compiled and run on my PC.
But when I flash it to an embedded target (a microcontroller) and debug it, I find that
(*X).Element1[0] = 0x001f.
1. Why are the results different on the PC and on the embedded target?
2. What can I modify in this code so that it prints 0x001f in the PC case as well,
without changing the core of the code (by adding a compiler option or something, maybe)?
shorts are typically two bytes and 16 bits. When you say:
short s;
((char*)&s)[0] = 0x00;
((char*)&s)[1] = 0x1f;
This sets the first of those two bytes to 0x00 and the second of those two bytes to 0x1f. The thing is that C++ doesn't specify what setting the first or second byte does to the value of the overall short, so different platforms can do different things. In particular, some platforms say that setting the first byte affects the 'most significant' bits of the short's 16 bits and setting the second byte affects the 'least significant' bits. Other platforms say the opposite: setting the first byte affects the least significant bits and setting the second byte affects the most significant bits. These two platform behaviors are referred to as big-endian and little-endian respectively.
The solution to getting consistent behavior independent of these differences is to not access the bytes of the short this way. Instead you should simply manipulate the value of the short using methods that the language does define, such as with bitwise and arithmetic operators.
short s;
s = (0x1f << 8) | (0x00 << 0); // set the most significant bits to 0x1f and the least significant bits to 0x00.
The problem is that, for many reasons, I can only change the body of the function F2; I cannot change its prototype. Is there a way to find the sizeof of Y before it has been cast, or something?
You cannot determine the original type and size using only the char*. You have to know the correct type and size through some other means. If F2 is never called except with CustomStruct then you can simply cast the char* back to CustomStruct like this:
void F2(char* Y)
{
    CustomStruct *X = (CustomStruct*)Y;
    X->Element1[0] = 0x1F00;
}
But remember, such casts are not safe in general; you should only cast a pointer back to what it was originally cast from.
The portable way is to change the definition of F2:
void F2(short * p)
{
    *p = 0x1F;
}
void F1(CustomStruct* X)
{
    F2(&X->Element1[0]);
    printf("s = %x\n", (*X).Element1[0]);
}
When you reinterpret an object as an array of chars, you expose the implementation details of the representation, which is inherently non-portable and... implementation-dependent.
If you need to do I/O, i.e. interface with a fixed, specified, external wire format, use functions like htons and ntohs to convert and leave the platform specifics to your library.
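A sketch of that approach for the value in this question; htons/ntohs live in <arpa/inet.h> on POSIX systems (<winsock2.h> on Windows):
#include <arpa/inet.h>   /* htons / ntohs; use <winsock2.h> on Windows */
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned short host_value = 0x001F;

    /* Convert to the fixed wire format (network byte order) before sending. */
    unsigned short wire = htons(host_value);
    unsigned char packet[2];
    memcpy(packet, &wire, sizeof wire);

    /* On the receiving side, convert back to whatever the host uses. */
    unsigned short received;
    memcpy(&received, packet, sizeof received);
    received = ntohs(received);

    printf("%x\n", received);   /* 1f on both little- and big-endian hosts */
    return 0;
}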
It appears that the PC is little endian and the target is either big-endian, or has 16-bit char.
There isn't a great way to modify the C code on the PC, unless you replace your char * references with short * references, and perhaps use macros to abstract the differences between your microcontroller and your PC.
For example, you might make a macro PACK_BYTES(hi, lo) that packs two bytes into a short the same way, regardless of machine endian. Your example becomes:
#include "stdio.h"
#define PACK_BYTES(hi,lo) (((short)((hi) & 0xFF)) << 8 | (0xFF & (lo)))
typedef struct CustomStruct
{
short Element1[10];
}CustomStruct;
void F2(short* Y)
{
*Y = PACK_BYTES(0x00, 0x1F);
}
void F1(CustomStruct* X)
{
F2(&(X->Element1[0]));
printf("s = %x\n", (*X).Element1[0]);
}
int main(void)
{
CustomStruct s;
F1(&s);
return 0;
}

When to use unsigned char pointer

What is the use of unsigned char pointers? I have seen in many places that a pointer is type cast to a pointer to unsigned char. Why do we do so?
We receive a pointer to int and then type cast it to unsigned char*. But if we try to print an element of that array using cout, it does not print anything. Why? I do not understand. I am new to C++.
EDIT Sample Code Below
int Stash::add(void* element)
{
    if(next >= quantity)   // Enough space left?
        inflate(increment);
    // Copy element into storage, starting at next empty space:
    int startBytes = next * size;
    unsigned char* e = (unsigned char*)element;
    for(int i = 0; i < size; i++)
        storage[startBytes + i] = e[i];
    next++;
    return(next - 1); // Index number
}
You are actually looking for pointer arithmetic:
unsigned char* bytes = (unsigned char*)ptr;
for(int i = 0; i < size; i++)
// work with bytes[i]
In this example, bytes[i] is equal to *(bytes + i), and it is used to access the memory at the address bytes + (i * sizeof(*bytes)). In other words: if you have int* intPtr and you try to access intPtr[1], you are actually accessing the integer stored at bytes 4 to 7:
0 1 2 3
4 5 6 7 <--
The size of type your pointer points to affects where it points after it is incremented / decremented. So if you want to iterate your data byte by byte, you need to have a pointer to type of size 1 byte (that's why unsigned char*).
unsigned char is usually used for holding binary data where 0 is valid value and still part of your data. While working with "naked" unsigned char* you'll probably have to hold the length of your buffer.
char is usually used for holding characters representing a string, where 0 is the terminating character '\0'. If your buffer of characters is always terminated with '\0', you don't need to know its length, because the terminating character exactly specifies the end of your data.
Note that in both of these cases it's better to use some object that hides the internal representation of your data and takes care of memory management for you (see the RAII idiom). So it's a much better idea to use either std::vector<unsigned char> (for binary data) or std::string (for strings).
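For example, a small sketch using std::vector<unsigned char> for binary data, where embedded zero bytes are no problem and the length is tracked for you:
#include <iostream>
#include <vector>

int main()
{
    // The vector owns the buffer and knows its size; 0x00 is just another byte value.
    std::vector<unsigned char> data = {0x00, 0x1F, 0xFF, 0x00};

    for (unsigned char byte : data)
        std::cout << static_cast<int>(byte) << ' ';
    std::cout << '\n';   // prints: 0 31 255 0
}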
In C, unsigned char is the only type guaranteed to have no trapping values, and which guarantees copying will result in an exact bitwise image. (C++ extends this guarantee to char as well.) For this reason, it is traditionally used for "raw memory" (e.g. the semantics of memcpy are defined in terms of unsigned char).
In addition, unsigned integral types in general are used when bitwise operations (&, |, >> etc.) are going to be used. unsigned char is the smallest unsigned integral type, and may be used when manipulating arrays of small values on which bitwise operations are used. Occasionally, it's also used because one needs the modulo behavior in case of overflow, although this is more frequent with larger types (e.g. when calculating a hash value). Both of these reasons apply to unsigned types in general; unsigned char will normally only be used for them when there is a need to reduce memory use.
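As a short illustration of the "raw memory" use, here is a sketch that dumps an object's bytes through unsigned char* (the byte order shown in the comment assumes a little-endian machine):
#include <cstdio>

int main()
{
    unsigned int value = 0x01020304;

    // Examining an object's representation byte by byte is exactly the kind of
    // access unsigned char* is meant for.
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    for (unsigned i = 0; i < sizeof value; ++i)
        std::printf("%02x ", bytes[i]);   // "04 03 02 01 " on a little-endian CPU
    std::printf("\n");
}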
The unsigned char type is usually used as a representation of a single byte of binary data. Thus, an array of unsigned char is often used as a binary data buffer, where each element is a single byte.
The unsigned char* construct will be a pointer to the binary data buffer (or to its 1st element).
I am not 100% sure what the C++ standard says precisely about the size of unsigned char; in practice it is 8 bits, although the standard itself only guarantees that sizeof(unsigned char) == 1 and that a char has at least 8 bits.
After seeing your code
When you use something like void* input as a parameter of a function, you deliberately strip away information about the input's original type. This is a very strong suggestion that the input will be treated in a very general manner, i.e. as an arbitrary string of bytes. An int* input, on the other hand, would suggest it will be treated as a "string" of signed integers.
void* is mostly used in cases where the input gets encoded, or treated bit/byte-wise for whatever reason, since you cannot draw conclusions about its contents.
Then, in your function, you seem to want to treat the input as a string of bytes. But to operate on objects, e.g. performing operator= (assignment), the compiler needs to know what to do. Since you declare the input as void*, an assignment such as *input = something would make no sense, because *input is of type void. To make the compiler treat the input elements as the "smallest raw memory pieces", you cast it to the appropriate type, which is unsigned char.
cout probably did not work because of a wrong or unintended type conversion. A char* is treated as a null-terminated string, and it is easy to confuse the signed and unsigned versions in code. If you pass an unsigned char* to ostream::operator<< as a char*, it will treat and expect the byte input as normal ASCII characters, where 0 means the end of the string, not the integer value 0. When you want to print the contents of memory, it is best to cast the pointers explicitly.
Also note that to print the memory contents of a buffer you need to use a loop, since otherwise the printing function would not know when to stop.
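A sketch of such a loop, with the pointer cast explicitly so that operator<< prints numbers instead of treating the data as a null-terminated string:
#include <cstddef>
#include <iostream>

void dumpBytes(const void* data, std::size_t size)
{
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < size; ++i)
        std::cout << static_cast<int>(p[i]) << ' ';   // cast so each byte prints as a number
    std::cout << '\n';
}

int main()
{
    unsigned char buffer[] = {0, 6, 255, 42};   // a leading 0 byte no longer hides the output
    dumpBytes(buffer, sizeof buffer);           // prints: 0 6 255 42
}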
Unsigned char pointers are useful when you want to access the data byte by byte. For example, a function that copies data from one area to another could need this:
void memcpy (unsigned char* dest, unsigned char* source, unsigned count)
{
    for (unsigned i = 0; i < count; i++)
        dest[i] = source[i];
}
It also has to do with the fact that the byte is the smallest addressable unit of memory. If you want to read anything smaller than a byte from memory, you need to get the byte that contains that information, and then select the information using bit operations.
You could very well copy the data in the above function using an int pointer, but that would copy chunks of 4 bytes, which may not be the correct behavior in some situations.
As for why nothing appears on the screen when you try to use cout: the most likely explanation is that the data starts with a zero byte, which in C++ marks the end of a string of characters.

C++: how to cast 2 bytes in an array to an unsigned short

I have been working on a legacy C++ application and am definitely outside of my comfort-zone (a good thing). I was wondering if anyone out there would be so kind as to give me a few pointers (pun intended).
I need to cast 2 bytes in an unsigned char array to an unsigned short. The bytes are consecutive.
For an example of what I am trying to do:
I receive a string from a socket and place it in an unsigned char array. I can ignore the first byte, and then the next 2 bytes should be converted to an unsigned short. This will be on Windows only, so there are no big/little endian issues (that I am aware of).
Here is what I have now (not working obviously):
//packetBuffer is an unsigned char array containing the string "123456789" for testing
//I need to convert bytes 2 and 3 into the short, 2 being the most significant byte
//so I would expect to get 515 (2*256 + 3); instead, all the code I have tried gives me
//either errors or 2 (only converting one byte)
unsigned short myShort;
myShort = static_cast<unsigned short>(packetBuffer[1]);
Well, you are widening the char into a short value. What you want is to interpret two bytes as a short. static_cast cannot cast from unsigned char* to unsigned short*. You have to cast to void*, then to unsigned short*:
unsigned short *p = static_cast<unsigned short*>(static_cast<void*>(&packetBuffer[1]));
Now, you can dereference p and get the short value. But the problem with this approach is that you cast from unsigned char*, to void* and then to some different type. The Standard doesn't guarantee the address remains the same (and in addition, dereferencing that pointer would be undefined behavior). A better approach is to use bit-shifting, which will always work:
unsigned short p = (packetBuffer[1] << 8) | packetBuffer[2];
This is probably well below what you care about, but keep in mind that you could easily get an unaligned access doing this. x86 is forgiving and the abort that the unaligned access causes will be caught internally and will end up with a copy and return of the value so your app won't know any different (though it's significantly slower than an aligned access). If, however, this code will run on a non-x86 (you don't mention the target platform, so I'm assuming x86 desktop Windows), then doing this will cause a processor data abort and you'll have to manually copy the data to an aligned address before trying to cast it.
In short, if you're going to be doing this access a lot, you might look at adjusting the code so as not to have unaligned reads, and you'll see a performance benefit.
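One simple adjustment along those lines is to copy the two bytes into a properly aligned short instead of dereferencing an unaligned pointer (packetBuffer as in the question; note the result is still in host byte order):
#include <cstring>

unsigned short myShort;
std::memcpy(&myShort, &packetBuffer[1], sizeof myShort);   // alignment-safe copy
// If a specific byte order is required (e.g. byte 1 as the high byte),
// shift the bytes together explicitly instead.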
unsigned short myShort = *(unsigned short *)&packetBuffer[1];
The bit shift above needs a caution:
unsigned short p = (packetBuffer[1] << 8) | packetBuffer[2];
If packetBuffer[1] were actually shifted at 8-bit width, the << 8 would push every bit out and leave you with only packetBuffer[2]. In C and C++ the operand is promoted to int before the shift, so the expression works as written, but it is easy to lose that guarantee when the types change.
Even so, shifting is still preferable to pointers. To avoid relying on the promotion I spend a few extra lines of code; with any optimization at all it results in the same machine code:
unsigned short p;
p = packetBuffer[1]; p <<= 8; p |= packetBuffer[2];
Or, to be explicit about the intended width up front:
unsigned short p;
p = (((unsigned short)packetBuffer[1])<<8) | packetBuffer[2];
You have to be careful with pointers, the optimizer will bite you, as well as memory alignments and a long list of other problems. Yes, done right it is faster, done wrong the bug can linger for a long time and strike when least desired.
Say you were lazy and wanted to do some 16 bit math on an 8 bit array. (little endian)
unsigned short *s;
unsigned char b[10];

s = (unsigned short *)&b[0];
if(b[0] & 7)
{
    *s = *s + 8;
    *s &= ~7;
}
do_something_with(b);
*s = *s + 8;
do_something_with(b);
*s = *s + 8;
do_something_with(b);
There is no guarantee that a perfectly bug free compiler will create the code you expect. The byte array b sent to the do_something_with() function may never get modified by the *s operations. Nothing in the code above says that it should. If you don't optimize your code then you may never see this problem (until someone does optimize or changes compilers or compiler versions). If you use a debugger you may never see this problem (until it is too late).
The compiler doesn't see the connection between s and b, they are two completely separate items. The optimizer may choose not to write *s back to memory because it sees that *s has a number of operations so it can keep that value in a register and only save it to memory at the end (if ever).
There are three basic ways to fix the pointer problem above:
Declare s as volatile.
Use a union.
Use a function or functions whenever changing types.
You should not cast an unsigned char pointer into an unsigned short pointer (or, for that matter, cast a pointer to a smaller data type into a pointer to a larger one). This is because it is assumed that the address will be aligned correctly. A better approach is to shift the bytes into a real unsigned short object, or to memcpy into an unsigned short.
No doubt, you can adjust the compiler settings to get around this limitation, but this is a very subtle thing that will break in the future if the code gets passed around and reused.
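A sketch of both of those options as small helper functions (the names are illustrative):
#include <cstring>

// Shift the bytes into a real unsigned short, treating p[0] as the high byte.
// Works the same on every platform and has no alignment requirements.
unsigned short readBigEndian16(const unsigned char* p)
{
    return static_cast<unsigned short>((p[0] << 8) | p[1]);
}

// memcpy into a properly aligned object; the result is in host byte order.
unsigned short readNative16(const unsigned char* p)
{
    unsigned short value;
    std::memcpy(&value, p, sizeof value);
    return value;
}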
Maybe this is a very late solution, but I just want to share it with you. When you want to convert primitives or other types you can use a union. See below:
union CharToStruct {
    char charArray[2];
    unsigned short value;
};

short toShort(char* value) {
    CharToStruct cs;
    cs.charArray[0] = value[1]; // the most significant byte of the short is not the first byte of the char array (assumes a little-endian host)
    cs.charArray[1] = value[0];
    return cs.value;
}
When you create an array with the hex values below and call the toShort function, you will get a short value of 3.
char array[2];
array[0] = 0x00;
array[1] = 0x03;
short i = toShort(array);
cout << i << endl; // or printf("%d\n", i);
static_cast has a different syntax, plus you need to work with pointers; to reinterpret the bytes through a pointer you would write:
unsigned short *myShort = reinterpret_cast<unsigned short*>(&packetBuffer[1]);
Did nobody see the input was a string!
/* If it is a string as explicitly stated in the question.
*/
int byte1 = packetBuffer[1] - '0'; // convert 1st byte from char to number.
int byte2 = packetBuffer[2] - '0';
unsigned short result = (byte1 * 256) + byte2;
/* Alternatively if is an array of bytes.
*/
int byte1 = packetBuffer[1];
int byte2 = packetBuffer[2];
unsigned short result = (byte1 * 256) + byte2;
This also avoids the problems with alignment that most of the other solutions may have on certain platforms. Note: a short is at least two bytes. Most systems will give you a memory error if you try to dereference a short pointer that is not 2-byte aligned (or whatever sizeof(short) is on your system)!
char packetBuffer[] = {1, 2, 3};
unsigned short myShort = * reinterpret_cast<unsigned short*>(&packetBuffer[1]);
I (had to) do this all the time. Big endian is an obvious problem. What really will get you is incorrect data when the machine dislikes misaligned reads (and writes)!
You may want to write a test case and an assert to see if it reads properly. Then, when run on a big-endian machine, or more importantly on a machine that dislikes misaligned reads, an assert error will occur instead of a weird, hard-to-trace 'bug' ;)
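A rough sketch of such a check (the probe values and the expected little-endian result are just an illustration):
#include <cassert>

int main()
{
    unsigned char probe[4] = {0x00, 0x34, 0x12, 0x00};

    // The same kind of read the code above relies on: a reinterpret_cast at an
    // odd offset. On a typical little-endian x86 machine this yields 0x1234.
    unsigned short value = *reinterpret_cast<unsigned short*>(&probe[1]);

    // On a big-endian target this asserts (the value would be 0x3412), and on a
    // machine that faults on misaligned reads the read itself fails loudly,
    // so the assumption is caught early instead of surfacing as a subtle bug.
    assert(value == 0x1234);
    return 0;
}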
On windows you can use:
unsigned short i = MAKEWORD(lowbyte,hibyte);
I realize this is an old thread, and I can't say that I tried every suggestion made here. I'm just making myself comfortable with MFC, and I was looking for a way to convert a uint to two bytes, and back again at the other end of a socket.
There are a lot of bit-shifting examples you can find on the net, but none of them seemed to actually work. A lot of the examples seem overly complicated; I mean, we're just talking about grabbing 2 bytes out of a uint, sending them over the wire, and plugging them back into a uint at the other end, right?
This is the solution I finally came up with:
class ByteConverter
{
public:
    static void uIntToBytes(unsigned int theUint, char* bytes)
    {
        unsigned int tInt = theUint;
        void *uintConverter = &tInt;
        char *theBytes = (char*)uintConverter;
        bytes[0] = theBytes[0];
        bytes[1] = theBytes[1];
    }
    static unsigned int bytesToUint(char *bytes)
    {
        unsigned int theUint = 0;
        void *uintConverter = &theUint;
        char *theBytes = (char*)uintConverter;
        theBytes[0] = bytes[0];
        theBytes[1] = bytes[1];
        return theUint;
    }
};
Used like this:
unsigned int theUint;
char bytes[2];
CString msg;
ByteConverter::uIntToBytes(65000,bytes);
theUint = ByteConverter::bytesToUint(bytes);
msg.Format(_T("theUint = %d"), theUint);
AfxMessageBox(msg, MB_ICONINFORMATION | MB_OK);
Hope this helps someone out.