C++/C: Prepend length to char[] in bytes (binary/hex)

I'm looking to send UDP datagrams from a client, to a server, and back.
The server needs to add a header to each datagram (represented as a char[]) in byte form, which I've struggled to find examples of. I know how to send the length as actual text characters, but I want to send it in "effectively" binary form (e.g., if the length were 40 bytes I'd want to prepend 0x28, or its 2-byte unsigned equivalent, rather than the ASCII characters '0028' or similar, which would take 4 bytes instead of a potential 2).
As far as I can work out my best option is below:
unsigned int length = dataLength; //length of the data received
char test[512] = { (char)length };
Is this approach valid, or will it cause problems later?
Further, this gives me a hard limit of 255 if I'm not mistaken. How can I best represent the length as 2 bytes to extend my maximum length?
EDIT: I need the length of each datagram to be prepended because I will be building each datagram into a larger frame, and the recipient needs to be able to take the frame apart into its individual information elements, which I think means I need the length included so the recipient can work out where each element ends and the next begins.

You probably need something like this:
char somestring[] = "Hello World!";
char sendbuffer[1000];
int length = strlen(somestring);
sendbuffer[0] = length & 0xff; // put LSB of length
sendbuffer[1] = (length >> 8) & 0xff; // put MSB of length
strcpy(&sendbuffer[2], somestring); // copy the string right after the length
sendbuffer is the buffer that will be sent; I fixed it to a maximum length of 1000, allowing strings up to a length of 997 to be sent (1000 - 2 bytes for the length - 1 byte for the NUL terminator).
LSB means least significant byte and MSB means most significant byte. Here we put the LSB first and the MSB second; this convention is called little endian, and the other way round would be big endian. You need to be sure that the length is decoded correctly on the receiver side. If the receiver's architecture has a different endianness than the sender's, the length may be decoded wrong depending on the code. Google "endianness" for more details.
sendbuffer will look like this in memory:
0x0c 0x00 0x48 0x65 0x6c 0x6c ...
| 12 |  0 | 'H'| 'e'| 'l'| 'l'| ...
//... Decoding (assuming short is a 16 bit type on the receiver side)
// first method (won't work if endianness is different on the receiver side)
int decodedlength = *((unsigned short*)sendbuffer);
// second method (endianness safe)
int decodedlength2 = (unsigned char)sendbuffer[0] | (unsigned char)sendbuffer[1] << 8;
char decodedstring[1000];
strcpy(decodedstring, &sendbuffer[2]);
Possible optimisation:
If the majority of the strings you send are shorter than 255 bytes, you can optimize by prepending only one length byte most of the time instead of always two, but that's another story.
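To address the framing requirement from the question's edit, here is a minimal sketch (the helper names and the exact frame layout are assumptions, not part of the original answer) that packs several elements into one frame, each preceded by its 2-byte little-endian length, and splits the frame apart again on the receiving side:
#include <cstddef>
#include <string>
#include <vector>

// Append one element to the frame, prefixed by its 2-byte little-endian length.
void appendElement(std::vector<unsigned char>& frame, const void* data, std::size_t len)
{
    frame.push_back(static_cast<unsigned char>(len & 0xff));        // LSB of the length first
    frame.push_back(static_cast<unsigned char>((len >> 8) & 0xff)); // then the MSB
    const unsigned char* p = static_cast<const unsigned char*>(data);
    frame.insert(frame.end(), p, p + len);
}

// Walk the frame and split it back into its elements.
std::vector<std::string> splitFrame(const unsigned char* frame, std::size_t frameLen)
{
    std::vector<std::string> elements;
    std::size_t pos = 0;
    while (pos + 2 <= frameLen) {
        std::size_t len = frame[pos] | (frame[pos + 1] << 8);   // little-endian length
        pos += 2;
        if (pos + len > frameLen)
            break;                                              // truncated/malformed frame
        elements.push_back(std::string(reinterpret_cast<const char*>(frame + pos), len));
        pos += len;
    }
    return elements;
}
The length prefixes are what let the receiver find the element boundaries without relying on NUL terminators, so arbitrary binary payloads work too.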

Related

Is there a way to convert an array of bytes to a number in C++?

I have read 4 bytes as an array from a file on an SD card on an Arduino Mega. Now I want to convert this array into one number so that I can work with it as an integer (the bytes are the length of the next file section). Is there any built-in function for my problem, or must I write my own?
I read the file into the byte array with the file.read() function from SdFat:
byte array[4]; //creates the byte array
file.read(array,4); //reads 4 bytes from the file and stores it in the array
I hope you can understand my problem.
It depends on the endianness of the stored bytes.
If the endianness matches that of your target system (avr-gcc on the ATmega stores multi-byte values little-endian), you can just do
int32_t number = *(int32_t*)array;
to get a 32 bit integer.
If the endianness does not match, you have to shift the bytes together yourself; for a little-endian encoded number:
int32_t number = uint32_t(array[3]) << 24 | uint32_t(array[2]) << 16 | uint32_t(array[1]) << 8 | uint32_t(array[0]);
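If the byte order does match, a memcpy is a hedged alternative to the pointer cast above (not part of the original answer); it avoids alignment and aliasing concerns and behaves the same on other platforms:
#include <stdint.h>
#include <string.h>

int32_t number;
memcpy(&number, array, sizeof number);  // bytes are interpreted in the host's byte order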

How to create a packet for serial communication with specific requirements?

I want to communicate with a serial-device. The individual packets are sent with a continuously ascending byte sequence, i.e. the first byte of the packet is PacCmd and the last byte of a packet is CheckSum. The individual bytes are INTEL coded (Little Endian), i.e. the LSB is sent first. The INTEL coding is used everywhere where the packet's individual data elements consist of more than one byte (PacCmd, Data).
The datapack is described as follows:
DataPac ::= { PacCmd + DataLen + { DataByte } } + CheckSum
PacCmd - 2 Byte unsigned integer = unsigned short
DataLen - 1 Byte unsigned integer = unsigned char/uint8_t
DataByte - x Byte = Data to send/receive = mixed
CheckSum - 1 Byte unsigned integer = unsigned char/uint8_t
This is no problem. But the PacCmd is a problem. It consists of 3 Bits for control and a 13 bit integer value, the command. The description of PacCmd:
PacCmd ::= ( ReqBit | RspBit | MoreDataBit ) + Cmd
ReqBit - 1 Bit: If set, an acknowledgement is requested from the receiver
RspBit - 1 Bit: If set, this data packet is to be interpreted as an acknowledgement
MoreDataBit - 1 Bit: If set, an additional PacCmd field follows subsequent to the data belonging to this command
Cmd - 13 Bits Unsigned Integer: Identification via the content of the data field of this packet (only if the RspBit is not set)
My question is: How can I interpret the value of Cmd? How can I extract the 3 bits and the 13-bit integer? How can I set the 3 bits and the 13-bit integer? Any example code on how to extract the information? Do I just use the bitwise operator &? And how does the little-endian encoding affect this?
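A minimal sketch of the masking involved (the exact bit positions are an assumption, since the question does not say which bits carry the flags; check the device documentation):
#include <stdint.h>

// Assumed layout: the three flag bits are the top bits of the 16-bit PacCmd,
// and Cmd occupies the low 13 bits.
const uint16_t REQ_BIT       = 1u << 15;
const uint16_t RSP_BIT       = 1u << 14;
const uint16_t MORE_DATA_BIT = 1u << 13;
const uint16_t CMD_MASK      = 0x1FFF;   // low 13 bits

// Little endian on the wire: the LSB is sent first.
uint16_t readPacCmd(const uint8_t* p)
{
    return uint16_t(p[0]) | (uint16_t(p[1]) << 8);
}

void writePacCmd(uint8_t* p, uint16_t pacCmd)
{
    p[0] = pacCmd & 0xff;         // LSB first
    p[1] = (pacCmd >> 8) & 0xff;  // MSB second
}

// Extracting the fields:
bool reqBit(uint16_t pacCmd)      { return (pacCmd & REQ_BIT) != 0; }
bool rspBit(uint16_t pacCmd)      { return (pacCmd & RSP_BIT) != 0; }
bool moreDataBit(uint16_t pacCmd) { return (pacCmd & MORE_DATA_BIT) != 0; }
uint16_t cmd(uint16_t pacCmd)     { return pacCmd & CMD_MASK; }

// Building a PacCmd with the request bit set and command 0x123:
uint16_t pacCmd = REQ_BIT | (0x123 & CMD_MASK);
The little-endian rule only affects how the two PacCmd bytes travel on the wire; once they are assembled into a 16-bit value, the masks and shifts behave the same regardless of the host's endianness.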

Getting 5 bytes from the network: warning "left shift count >= width of type"

My machine is 64-bit. My code is as below:
unsigned long long periodpackcount = *(mBuffer+offset)<<32 | *(mBuffer+offset+1)<<24 | *(mBuffer+offset+2)<<16 | *(mBuffer+offset+3)<<8 | *(mBuffer+offset+4);
mBuffer is an unsigned char*. I want to get 5 bytes of data and convert them to host byte order.
How can I avoid this warning?
Sometimes it's best to break apart into a few lines in order to avoid issues. You have a 5 byte integer you want to read.
// Create the number to read into.
uint64_t number = 0;                  // uint64_t is in <cstdint>
char *ptr = (char *)&number;
// Copy from the buffer. Plus 3 for the leading zero bytes.
memcpy(ptr + 3, mBuffer + offset, 5); // memcpy is in <cstring>
// Reverse the byte order (this assumes a little-endian host).
std::reverse(ptr, ptr + 8);           // std::reverse is in <algorithm>; bit shifting also works
Probably not the best byte swap ever (bit shifting is faster). And my logic might be off for the offsetting, but something along those lines should work.
The other thing you will want to do is cast each byte before shifting, since otherwise you're leaving it up to the compiler: *(mBuffer + offset) is an unsigned char, which is promoted to int before the shift, and shifting an int by 32 is exactly what produces the warning. Cast it to a larger type first, e.g. static_cast<uint64_t>(*(mBuffer + offset)) << 32.
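Putting that together, a sketch of the warning-free version (this assumes, as the original expression does, that the 5 bytes arrive most significant byte first):
#include <cstdint>

uint64_t periodpackcount =
      (uint64_t(mBuffer[offset])     << 32)
    | (uint64_t(mBuffer[offset + 1]) << 24)
    | (uint64_t(mBuffer[offset + 2]) << 16)
    | (uint64_t(mBuffer[offset + 3]) <<  8)
    |  uint64_t(mBuffer[offset + 4]);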

Splitting a char array into a sequence of ints and floats

I'm writing a program in C++ to listen to a stream of TCP messages from another program that provides tracking data from a webcam. I have the socket connected and I'm getting all the information in, but I'm having difficulty splitting it up into the data I want.
Here's the format of the data coming in:
8 byte header:
4 character string,
integer
32 byte message:
integer,
float,
float,
float,
float,
float
This is all being put into a char array called buffer. I need to be able to parse the different bytes out into the primitives I need. I have tried making smaller sub-arrays, such as headerString, filled by looping through and copying the first 4 elements of the buffer array, and I do get the correct header ('CCV ') printed out. But when I try the same thing with the next four elements (to get the integer) and print it out, I get weird ASCII characters. I've tried converting the headerInt array to an integer with the atoi method from stdlib.h, but it always prints zero.
I've already done this in Python using the excellent unpack method; is there any alternative in C++?
Any help greatly appreciated,
Jordan
The buffer only contains the raw image of what you read over the network. You'll have to convert the bytes in the buffer to whatever format you want. The string is easy:
std::string s(buffer + sOffset, 4);
(Assuming, of course, that the internal character encoding is the same as in the file, probably an extension of ASCII.)
The others are more complicated, and depend on the format of the external data. From the description of the header, I gather that the integers are four bytes, but that still doesn't tell me anything about their representation. Depending on the case, either:
int getInt(unsigned char* buffer, int offset)
{
return (buffer[offset ] << 24)
| (buffer[offset + 1] << 16)
| (buffer[offset + 2] << 8)
| (buffer[offset + 3] );
}
or
int getInt(unsigned char* buffer, int offset)
{
return (buffer[offset + 3] << 24)
| (buffer[offset + 2] << 16)
| (buffer[offset + 1] << 8)
| (buffer[offset ] );
}
will probably do the trick. (Other four byte representations of integers are possible, but they are exceedingly rare. Similarly, the conversion of the unsigned results of the shifts and ors into an int is implementation defined, but in practice, the above will work almost everywhere.)
The only hint you give concerning the representation of the floats is in the message format: 32 bytes, minus a 4 byte integer, leaves 28 bytes for 5 floats; but 28 isn't evenly divisible by five, so I cannot even guess at the length of the floats (except that there must be some padding in there somewhere). Converting floating point can be more or less complicated if the external format isn't exactly like the internal format.
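If the floats do turn out to be plain 4-byte IEEE 754 values stored in the same byte order as the integers (an assumption the question doesn't confirm), a sketch of the conversion, mirroring the big-endian getInt above:
#include <cstdint>
#include <cstring>

float getFloat(unsigned char* buffer, int offset)
{
    // Reassemble the 4 bytes exactly as getInt does (big-endian shown here),
    // then copy the bit pattern into a float.
    uint32_t bits = (uint32_t(buffer[offset])     << 24)
                  | (uint32_t(buffer[offset + 1]) << 16)
                  | (uint32_t(buffer[offset + 2]) <<  8)
                  |  uint32_t(buffer[offset + 3]);
    float f;
    std::memcpy(&f, &bits, sizeof f);   // avoids the aliasing issues of a pointer cast
    return f;
}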
Something like this may work:
struct Header {
    char  string[4];
    int   integers[2];
    float floats[5];
};
Header* header = (Header*)buffer;
You should check that sizeof(Header) == 32.
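A slightly safer variant of the same idea (still assuming the Header layout above matches the wire format byte for byte): verify the size at compile time and memcpy into the struct instead of casting the buffer pointer, which avoids alignment and aliasing problems:
#include <cstring>

static_assert(sizeof(Header) == 32, "unexpected padding in Header");

Header header;                               // a real object, correctly aligned
std::memcpy(&header, buffer, sizeof header); // copy the raw bytes into it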

How do I store a byte into a 4 byte number without changing the bytes around it?

So if I have a 4-byte number (call it hex) and want to store a byte, say 0xDD, into hex at the nth byte position without changing the other bytes of hex's value, what's the easiest way of going about that? I'm guessing it's some combination of bitwise operations, but I'm still quite new to them and have found them quite confusing so far.
byte n = 0xDD;
uint i = 0x12345678;
i = (i & ~0x0000FF00) | ((uint)n << 8);
Edit: Forgot to mention, be careful if you're doing this with signed data types, so that things don't get inadvertently sign-extended.
Mehrdad's answer shows how to do it with bit manipulation. You could also use the old byte array trick (assuming C or some other language that allows this silliness):
byte n = 0xDD;
uint i = 0x12345678;
byte *b = (byte*)&i;
b[1] = n;
Of course, that's processor specific in that big-endian machines have the bytes reversed from little-endian. Also, this technique limits you to working on exact byte boundaries whereas the bit manipulation will let you modify any given 8 bits. That is, you might want to turn 0x12345678 into 0x12345DD8, which the technique I show won't do.
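For that more general, bit-level case, a small sketch (a hypothetical helper, not from either answer) that writes 8 bits at an arbitrary bit offset by building the mask at that position:
#include <stdint.h>

// Replace the 8 bits of 'value' starting at bit 'bitpos' (0 = least significant)
// with 'b', leaving all other bits untouched.
uint32_t putByteAtBit(uint32_t value, unsigned bitpos, uint8_t b)
{
    uint32_t mask = uint32_t(0xFF) << bitpos;           // the bits being replaced
    return (value & ~mask) | (uint32_t(b) << bitpos);   // clear them, then set the new ones
}

// Example: writing 0xDD four bits in turns 0x12345678 into 0x12345DD8.
uint32_t r = putByteAtBit(0x12345678u, 4, 0xDD);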