I have a DNS packet class which looks like this (I am pasting only part of it):
class DNSPacket {
public:
struct DNSHeader {
unsigned int ID :16;
unsigned int QR :1;
unsigned int OPCODE :4;
unsigned int AA :1;
unsigned int TC :1;
unsigned int RD :1;
unsigned int RA :1;
unsigned int Z :3;
unsigned int RCODE :4;
unsigned int QDCOUNT :16;
unsigned int ANCOUNT :16;
unsigned int NSCOUNT :16;
unsigned int ARCOUNT :16;
};
private:
DNSHeader header;
std::vector<DNSQuestion> questions;
std::vector<DNSAnswer> answers;
std::vector<DNSAnswer> nameservers; // TODO: DNSAnswer?
std::vector<DNSAnswer> add_records; // TODO: DNSAnswer?
};
What would be the right way to deserialize a char array into this object? The options I have are: overloading the >> operator, adding a separate deserializer class that reads the data byte by byte, or using reinterpret_cast().
I want to create a fast, modern implementation in C++11. Which way should I choose? Also, how should I deserialize the bitfields - should I stick with bitwise operations?
The way I would do it would be to treat word 2 (bits 16 to 31) as a single unsigned 16-bit integer (e.g. uint16_t) and simply get the bits by using bit-wise AND and SHIFT operations. All the other words can be read the same way, but used more or less as-is (after converting from network byte order to host byte order of course).
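To get word X as a uint16_t, copy its two bytes out of the array (dereferencing a reinterpret_cast'ed pointer risks misaligned access and breaks the aliasing rules):
uint16_t wordX;
std::memcpy(&wordX, &your_array[2 * X], sizeof wordX); // 2 * X: byte offset of 16-bit word X
Note that if you read the second word as a uint16_t, it needs the same network-to-host conversion before you mask and shift. The flag bits are in the order specified only when the word is viewed in network byte order, or when you read its two bytes individually.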
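To make the shift-and-mask idea concrete, here is a minimal sketch. The names ParsedDnsHeader, read_be16 and parse_dns_header are made up for illustration and are not part of the question's class. It assumes the buffer holds the 12-byte header in network byte order and reads it byte by byte, so no cast and no separate byte swap is needed.

```cpp
#include <cstddef>
#include <cstdint>

// Plain representation of the 12-byte DNS header (no bit-fields, so the
// in-memory layout does not have to match the wire format).
struct ParsedDnsHeader {
    std::uint16_t id;
    bool qr;
    std::uint8_t opcode;
    bool aa, tc, rd, ra;
    std::uint8_t z;
    std::uint8_t rcode;
    std::uint16_t qdcount, ancount, nscount, arcount;
};

// Reads one 16-bit word stored in network (big-endian) byte order.
inline std::uint16_t read_be16(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Fills `out` from the first 12 bytes of `buf`; returns false if `len` is too small.
inline bool parse_dns_header(const unsigned char* buf, std::size_t len,
                             ParsedDnsHeader& out) {
    if (len < 12) return false;
    out.id = read_be16(buf);
    const std::uint16_t flags = read_be16(buf + 2);  // word 2, bits 16 to 31
    out.qr     = ((flags >> 15) & 0x1) != 0;
    out.opcode = static_cast<std::uint8_t>((flags >> 11) & 0xF);
    out.aa     = ((flags >> 10) & 0x1) != 0;
    out.tc     = ((flags >> 9) & 0x1) != 0;
    out.rd     = ((flags >> 8) & 0x1) != 0;
    out.ra     = ((flags >> 7) & 0x1) != 0;
    out.z      = static_cast<std::uint8_t>((flags >> 4) & 0x7);
    out.rcode  = static_cast<std::uint8_t>(flags & 0xF);
    out.qdcount = read_be16(buf + 4);
    out.ancount = read_be16(buf + 6);
    out.nscount = read_be16(buf + 8);
    out.arcount = read_be16(buf + 10);
    return true;
}
```

A char array from the socket can be passed in after a reinterpret_cast to const unsigned char*, which is allowed for character types.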
I need to create a structure with an optional value:
typedef struct pkt_header{
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Protected_Payload_Length; // optional (present/not present)
unsigned short Version;
} PKT_HEADER;
How can I sometimes use pkt_header->Protected_Payload_Length and sometimes leave it out of the struct when the field is not present?
My first idea is to declare unsigned char * Protected_Payload_Length, pass NULL when I do not use the field, and otherwise use the unsigned char* to store my unsigned short value.
typedef struct pkt_header{
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned char * Protected_Payload_Length; // optional
unsigned short Version;
} PKT_HEADER;
I prepare my packet like this (and send this):
PKT_HEADER header;
header.Packet_Type = 0x0001;
header.Unprotected_Payload_Length = 0x0b00;
header.Protected_Payload_Length = NULL;
header.Version = 0x0000;
I receive response and do this :
PKT_HEADER * header= (PKT_HEADER*)recvbuf;
printf("Packet_Type : %04x\n", header->Packet_Type);
printf("Unprotected_Payload_Length : %04x\n", header->Unprotected_Payload_Length);
printf("Version : %04x\n", header->Version);
But in this case, if I understand correctly, unsigned char * Protected_Payload_Length is a pointer that is 4 bytes long, so header->Protected_Payload_Length occupies 4 bytes, but I need 0 bytes because the field is not present in this particular case.
Do I have to declare an appropriate structure for each data format, or is there some other way to play with the structures?
Thanks for your help.
Beware: structs can have padding, so members are not necessarily adjacent in memory. Moreover, reinterpreting something as a PKT_HEADER when that something is not a PKT_HEADER object is not allowed. Instead of casting:
PKT_HEADER * header= (PKT_HEADER*)recvbuf;
you probably should use memcpy. Having said this, now to your actual question...
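As a rough sketch of the memcpy approach: the struct name PktHeaderShort and the function read_header are made up here, and recvbuf is assumed to be a char buffer filled by the receive call. Note that this still depends on the struct having no padding and on the sender using the same byte order.

```cpp
#include <cstddef>
#include <cstring>

// Variant of the header without the optional field (hypothetical name).
struct PktHeaderShort {
    unsigned short Packet_Type;
    unsigned short Unprotected_Payload_Length;
    unsigned short Version;
};

// Copies the header bytes into a real PktHeaderShort object instead of
// pretending the buffer already is one. Returns false if the buffer is too short.
inline bool read_header(const char* recvbuf, std::size_t len, PktHeaderShort& out) {
    if (len < sizeof out) return false;
    std::memcpy(&out, recvbuf, sizeof out);
    return true;
}
```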
If you rely on members having a specific order in the struct, then inheritance is not an option. In memory the base object comes first, then the derived members, and you cannot mix them. For example
struct foo {
int x;
};
struct bar : foo {
int y;
int z;
};
Then a bar object will have in memory
| x | optional padding | y | optional padding | z | optional padding |
There is no simple way to get | y | x | z |.
If you want two different types the easiest is to define two different types:
struct PKT_HEADER_A {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Protected_Payload_Length; // present
unsigned short Version;
};
struct PKT_HEADER_B {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
//unsigned short Protected_Payload_Length; // not present
unsigned short Version;
};
Note that your way to typedef the struct is a C-ism. It is not necessary (and not recommended) in C++.
You should probably take a look at the packing done by nanopb or Protobuf, because it sounds like you have a packing problem. Data should be pieced together before sending, and the Packet_Type would encode which header to decode/encode with.
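Letting Packet_Type select the layout could look roughly like the sketch below. Everything in it is an assumption for illustration: the names DecodedHeader, type_has_protected_len and decode_header, the rule that type 0x0002 carries the optional field, and big-endian byte order on the wire. The real protocol specification decides all of these.

```cpp
#include <cstddef>
#include <cstdint>

struct DecodedHeader {
    std::uint16_t packet_type;
    std::uint16_t unprotected_len;
    bool          has_protected_len;
    std::uint16_t protected_len;  // meaningful only if has_protected_len is true
    std::uint16_t version;
};

// Hypothetical rule: only packet type 0x0002 carries the optional field.
inline bool type_has_protected_len(std::uint16_t packet_type) {
    return packet_type == 0x0002;
}

// Reads a big-endian 16-bit value at byte offset `off` (byte order is assumed).
inline std::uint16_t be16_at(const unsigned char* buf, std::size_t off) {
    return static_cast<std::uint16_t>((buf[off] << 8) | buf[off + 1]);
}

// Decodes field by field. Returns the number of bytes consumed, or 0 if the
// buffer is too short for the layout selected by Packet_Type.
inline std::size_t decode_header(const unsigned char* buf, std::size_t len,
                                 DecodedHeader& out) {
    if (len < 6) return 0;
    out.packet_type = be16_at(buf, 0);
    out.unprotected_len = be16_at(buf, 2);
    out.has_protected_len = type_has_protected_len(out.packet_type);
    out.protected_len = 0;
    std::size_t off = 4;
    if (out.has_protected_len) {
        if (len < 8) return 0;
        out.protected_len = be16_at(buf, off);
        off += 2;
    }
    out.version = be16_at(buf, off);
    return off + 2;
}
```

The return value tells the caller where the payload starts, which replaces sizeof on a struct whose size would not match the shorter form anyway.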
If you can't properly pack/unpack, an alternative is to create both
typedef struct {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Protected_Payload_Length;
unsigned short Version;
} PKT_HEADER_FULL;
typedef struct {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Version;
} PKT_HEADER_SHORT;
then create your packet header as a union
typedef union u{
PKT_HEADER_FULL full;
PKT_HEADER_SHORT concat;
}PKT_HEADER;
// or as this
typedef struct{
unsigned short Protected_Payload_Length;
unsigned short version;
}longform;
typedef struct{
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
union { // anonymous union: either the long form or just the version
longform l;
unsigned short version;
};
} PKT_HEADER;
Then the data coming in can be decoded either way (again, depending on Packet_Type), and the remaining space can be ignored. A caveat of this method is that you can't use sizeof(PKT_HEADER) for the wire length, because the struct size will always be the larger value.
I'm working with a serial device that returns a byte array.
In this array are values that are stored in unsigned shorts and unsigned chars.
I have the following structure:
typedef struct {
unsigned short RPM; //0
unsigned short Intakepress; //1
unsigned short PressureV; //2
unsigned short ThrottleV; //3
unsigned short Primaryinp; //4
unsigned short Fuelc; //5
unsigned char Leadingign; //6
unsigned char Trailingign; //7
unsigned char Fueltemp; //8
unsigned char Moilp; //9
unsigned char Boosttp; //10
unsigned char Boostwg; //11
unsigned char Watertemp; //12
unsigned char Intaketemp; //13
unsigned char Knock; //14
unsigned char BatteryV; //15
unsigned short Speed; //16
unsigned short Iscvduty; //17
unsigned char O2volt; //18
unsigned char na1; //19
unsigned short Secinjpulse; //20
unsigned char na2; //21
} fc_adv_info_t;
What's the best way to map the array to this structure? The order in the array received from the serial device matches the structure.
First of all, your description of the type of data in the structure using C-like syntax is ambiguous. It tells us nothing about the size of a short or char type, nor about the endianness of the data! A short int doesn't have to be 16 bits wide, neither is char always 8 bits! At the very least, you should use the fixed width integer types, or their Qt equivalents, and specify their endianness.
Also, typedef struct is a C-ism, unnecessary in C++. Drop the typedef.
Assuming a big endian packet, unsigned short to mean uint16_t and unsigned char to mean uint8_t, here is how you could do it:
struct FcAdvInfo { // this structure shouldn't be packed or anything like that!
quint16 RPM;
quint16 IntakePress;
...
quint8 LeadingIgn;
...
static FcAdvInfo parse(const QByteArray &); // static: called without an existing object
};
FcAdvInfo FcAdvInfo::parse(const QByteArray & src) {
FcAdvInfo p;
QDataStream ds(src);
ds.setByteOrder(QDataStream::BigEndian);
ds
>> p.RPM
>> p.IntakePress
...
>> p.LeadingIgn
...
;
return p;
}
Finally, if your struct comes from some C code, you must understand that it's not portable, and even on the same CPU, if you upgrade the compiler, the packing and the size of structure types can and will change! So don't do it. A C/C++ struct declaration implies nothing about how the data is arranged in memory, other than the chosen arrangement doesn't lead to undefined behavior, and must agree with other requirements of the standard (there are just a few). That's all, pretty much.
First, I would say that it is not safe to use the unsigned short type in structures that you are going to serialize/deserialize and exchange with other devices: unsigned short is usually 16 bits, but that is not guaranteed; it is platform dependent.
It is even worse if struct members are not aligned, so that the compiler inserts padding into the struct.
If the binary data received from the serial port is kept in a QByteArray, and the byte order and "unsigned short" sizes are OK, then you can map the data in the QByteArray onto the struct with the code below. Note that this is only correct if your struct is packed and has no padding gaps within it; use the struct packing technique for your compiler (see Structure padding and packing).
QByteArray bArr;
bArr.resize(sizeof(fc_adv_info_t));
// do something to fill bArr with received data
fc_adv_info_t* info=reinterpret_cast<fc_adv_info_t*>(bArr.data());
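One way to catch a padding mismatch at compile time is sketched below. SampleRecord and load_raw are made-up names, and the record is a shortened stand-in for fc_adv_info_t rather than the real layout. The copy uses memcpy instead of dereferencing the cast pointer, which also avoids alignment trouble.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical three-field record, much shorter than fc_adv_info_t.
struct SampleRecord {
    std::uint16_t rpm;
    std::uint8_t  knock;
    std::uint8_t  battery;
};

// Compile-time guard: the build fails if the compiler added padding, i.e. if
// the struct size differs from the 4 bytes the device is assumed to send.
static_assert(sizeof(SampleRecord) == 4, "SampleRecord contains padding");

// Copies exactly sizeof(T) bytes into `out`; returns false on a length mismatch.
template <typename T>
bool load_raw(const char* data, std::size_t len, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    if (len != sizeof(T)) return false;
    std::memcpy(&out, data, sizeof(T));
    return true;
}
```

With a QByteArray this would be called as load_raw(bArr.constData(), static_cast<std::size_t>(bArr.size()), info). For the struct in the question the fields add up to 31 bytes if short is 2 bytes, so an unpacked build would typically report 32 and a guard like this would fire.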
I've searched through many sites and cannot seem to find anything relevant.
I would like to take the individual bytes of each of the default data types, such as short, unsigned short, int, unsigned int, float and double, and store each byte (its binary content) at its own index of an unsigned char array. How can this be achieved?
For example:
int main() {
short sVal = 1;
unsigned short usVal = 2;
int iVal = 3;
unsigned int uiVal = 4;
float fVal = 5.0f;
double dVal = 6.0;
const unsigned int uiLengthOfShort = sizeof(short);
const unsigned int uiLengthOfUShort = sizeof(unsigned short);
const unsigned int uiLengthOfInt = sizeof(int);
const unsigned int uiLengthOfUInt = sizeof(unsigned int);
const unsigned int uiLengthOfFloat = sizeof(float);
const unsigned int uiLengthOfDouble = sizeof(double);
unsigned char ucShort[uiLengthOfShort];
unsigned char ucUShort[uiLengthOfUShort];
unsigned char ucInt[uiLengthOfInt];
unsigned char ucUInt[uiLengthOfUInt];
unsigned char ucFloat[uiLengthOfFloat];
unsigned char ucDouble[uiLengthOfDouble];
// Above I declared a variable val for each data type to work with
// Next I created a const unsigned int of each type's size.
// Then I created unsigned char[] using each data types size respectively
// Now I would like to take each individual byte of the above val's
// and store them into the indexed location of each unsigned char array.
// For Example: - I'll not use int here since the int is
// machine and OS dependent.
// I will use a data type that is common across almost all machines.
// Here I will use the short as my example
// We know that a short is 2-bytes or has 16 bits encoded
// I would like to take the 1st byte of this short:
// (the first 8 bit sequence) and to store it into the first index of my unsigned char[].
// Then I would like to take the 2nd byte of this short:
// (the second 8 bit sequence) and store it into the second index of my unsigned char[].
// How would this be achieved for any of the data types?
// A Short in memory is 2 bytes here is a bit representation of an
// arbitrary short in memory { 0101 1101, 0011 1010 }
// I would like ucShort[0] = sVal's { 0101 1101 } &
// ucShort[1] = sVal's { 0011 1010 }
// ucShort[0] = first byte of sVal (8-bit sequence)
// ucShort[1] = second byte of sVal (8-bit sequence)
// ... and so on for each data type.
return 0;
}
OK, so first, don't do that if you can avoid it. It's dangerous and can be extremely dependent on architecture.
The commenters above are correct: a union is the safest way to do it. You still have the endian problem, yes, but at least you don't have the stack alignment problem (I assume this is for network code, so stack alignment is another potential architecture problem).
This is what I've found to be the most straight-forward way to do this:
uint32_t example_int;
char array[4];
//No endian switch
array[0] = ((char*) &example_int)[0];
array[1] = ((char*) &example_int)[1];
array[2] = ((char*) &example_int)[2];
array[3] = ((char*) &example_int)[3];
//Endian switch
array[0] = ((char*) &example_int)[3];
array[1] = ((char*) &example_int)[2];
array[2] = ((char*) &example_int)[1];
array[3] = ((char*) &example_int)[0];
If you're trying to write cross-architecture code, you will need to deal with endian problems one way or another. My suggestion is to construct a short endian test and build functions to "pack" and "unpack" byte arrays based on the above method. It should be noted that to "unpack" a byte array, simply reverse the above assignment statements.
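A possible shape for the endian test and the pack/unpack pair is sketched below. The function names are made up, and the choice of big-endian as the byte order inside the array is an assumption; any fixed order works as long as both sides agree.

```cpp
#include <cstdint>
#include <cstring>

// Runtime endian test: looks at the first byte of a known 16-bit value.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Stores `value` into `out` most significant byte first, swapping only
// when the host is little-endian.
inline void pack_u32(std::uint32_t value, unsigned char out[4]) {
    unsigned char raw[4];
    std::memcpy(raw, &value, sizeof raw);
    const bool swap = host_is_little_endian();
    for (int i = 0; i < 4; ++i) out[i] = swap ? raw[3 - i] : raw[i];
}

// Reverse of pack_u32: rebuilds the host-order value from the byte array.
inline std::uint32_t unpack_u32(const unsigned char in[4]) {
    unsigned char raw[4];
    const bool swap = host_is_little_endian();
    for (int i = 0; i < 4; ++i) raw[i] = swap ? in[3 - i] : in[i];
    std::uint32_t value = 0;
    std::memcpy(&value, raw, sizeof value);
    return value;
}
```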
The simplest correct way is:
// static_assert(sizeof ucShort == sizeof sVal);
memcpy( &ucShort, &sVal, sizeof ucShort);
The stuff you write in comments is not correct; all types have machine-dependent size, other than character types.
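The memcpy approach generalizes to every type in the question with a small template. This is only a sketch: to_bytes and from_bytes are made-up names, and the bytes come out in whatever order the host stores them.

```cpp
#include <array>
#include <cstring>
#include <type_traits>

// Copies the object representation of `value` into a byte array of the same size.
template <typename T>
std::array<unsigned char, sizeof(T)> to_bytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    std::array<unsigned char, sizeof(T)> bytes;
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

// Rebuilds a T from bytes produced by to_bytes on the same platform.
// T cannot be deduced, so it is named explicitly: from_bytes<double>(b).
template <typename T>
T from_bytes(const std::array<unsigned char, sizeof(T)>& bytes) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    T value;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return value;
}
```

For the variables in the question this would be used as auto ucShort = to_bytes(sVal); and so on for the other types.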
With the help of Raw N, who pointed me to a website, I did a search on byte manipulation and found this thread - http://www.cplusplus.com/forum/articles/12/ - which presents a solution similar to what I am looking for; however, I would have to repeat this process for every default data type.
After doing some testing this is what I have come up with so far and this is dependent on machine architecture, but to do this on other machines the concept is the same.
typedef struct packed_2bytes {
unsigned char c0;
unsigned char c1;
} packed_2bytes;
typedef struct packed_4bytes {
unsigned char c0;
unsigned char c1;
unsigned char c2;
unsigned char c3;
} packed_4bytes;
typedef struct packed_8bytes {
unsigned char c0;
unsigned char c1;
unsigned char c2;
unsigned char c3;
unsigned char c4;
unsigned char c5;
unsigned char c6;
unsigned char c7;
} packed_8bytes;
typedef union {
short s;
packed_2bytes bytes;
} packed_short;
typedef union {
unsigned short us;
packed_2bytes bytes;
} packed_ushort;
typedef union { // 32bit machine, os, compiler only
int i;
packed_4bytes bytes;
} packed_int;
typedef union { // 32 bit machine, os, compiler only
unsigned int ui;
packed_4bytes bytes;
} packed_uint;
typedef union {
float f;
packed_4bytes bytes;
} packed_float;
typedef union {
double d;
packed_8bytes bytes;
} packed_double;
These are only the type declarations; no code using them is shown. I think they should record which endianness is in use, but the person using them has to know this ahead of time, just like the sizes of the default types on the target architecture. I am not sure whether signed int would be a problem because of one's complement, two's complement or sign-bit representations, but it may be something to consider.
Why is this piece of code needed?
typedef struct corr_id_{
unsigned int size:8;
unsigned int valueType:8;
unsigned int classId:8;
unsigned int reserved:8;
} CorrId;
I did some investigation and found that this way we limit the memory consumption to just what we need.
For example:
typedef struct corr_id_new{
unsigned int size;
unsigned int valueType;
unsigned int classId;
unsigned int reserved;
} CorrId_NEW;
typedef struct corr_id_{
unsigned int size:8;
unsigned int valueType:8;
unsigned int classId:8;
unsigned int reserved:8;
} CorrId;
int main(){
CorrId_NEW Obj1;
CorrId Obj2;
std::cout<<sizeof(Obj1)<<std::endl;
std::cout<<sizeof(Obj2)<<std::endl;
}
Output:-
16
4
I want to understand the real use case for such scenarios. Why can't we declare the struct like this:
typedef struct corr_id_new{
unsigned _int8 size;
unsigned _int8 valueType;
unsigned _int8 classId;
unsigned _int8 reserved;
} CorrId_NEW;
Does this have something to do with compiler optimizations? Or what are the benefits of declaring the structure that way?
I want to understand the real use case of such scenarios?
For example, the status register of some CPU may pack several one-bit flags and wider fields into a single 32-bit word.
To represent it with a structure, you could use bit-fields:
struct CSR
{
unsigned N: 1;
unsigned Z: 1;
unsigned C: 1;
unsigned V: 1;
unsigned : 20;
unsigned I: 1;
unsigned : 2;
unsigned M: 5;
};
You can see here that the fields are not multiples of 8 bits, so you can't use int8_t or something similar.
Let's look at a simple scenario:
struct student {
unsigned int age:8; // 8 bits hold values up to 255, enough for a student's age
unsigned int roll_no:16; // roll numbers up to 2^16 - 1, which is large enough
unsigned int classId:4; // class ID can be 4 bits long (0-15), as needed
unsigned int reserved:4; // reserved
};
In the above case everything fits in 32 bits.
If you used plain integers for the four fields instead, they would take 4*32 bits.
A 32-bit unsigned integer for age can store values from 0 to 2^32 - 1, but a person's age is at most 100, 140 or 150 or so (even for somebody still studying at that age), which needs at most 8 bits, so why waste the remaining 24 bits?
You are right, the last structure definition with unsigned _int8 is almost equivalent to the definition using :8. Almost, because byte order can make a difference here, so you might find that the memory layout is reversed in the two cases.
The main purpose of the :8 notation is to allow the use of fractional bytes, as in
struct foo {
uint32_t a:1;
uint32_t b:2;
uint32_t c:3;
uint32_t d:4;
uint32_t e:5;
uint32_t f:6;
uint32_t g:7;
uint32_t h:4;
};
To minimize padding, I strongly suggest learning the padding rules yourself; they are not hard to grasp. Once you know them, you can tell that your version with unsigned _int8 does not add any padding. Or, if you don't feel like learning those rules, just use __attribute__((__packed__)) on your struct, but that may introduce a severe performance penalty.
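If you prefer the compiler to confirm the layout rather than trusting the rules from memory, a static_assert does the job. The sketch below uses std::uint8_t in place of unsigned _int8, and the struct name CorrIdBytes is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Byte-sized version of the struct from the question.
struct CorrIdBytes {
    std::uint8_t size;
    std::uint8_t valueType;
    std::uint8_t classId;
    std::uint8_t reserved;
};

// The build fails if the compiler inserts padding between or after the members.
static_assert(sizeof(CorrIdBytes) == 4, "unexpected padding in CorrIdBytes");
static_assert(offsetof(CorrIdBytes, valueType) == 1, "valueType is not at byte 1");
static_assert(offsetof(CorrIdBytes, classId) == 2, "classId is not at byte 2");
static_assert(offsetof(CorrIdBytes, reserved) == 3, "reserved is not at byte 3");
```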
It's often used with pragma pack to create bitfields with labels, e.g.:
#pragma pack(1)
struct eg {
unsigned int one : 4;
unsigned int two : 8;
unsigned int three : 16;
};
Such a struct can be converted for whatever purpose to an int32_t, and vice versa. This might be useful when reading serialized data that follows a (language agnostic) protocol -- you extract an int and convert it to a struct eg to match the fields and field sizes defined in the protocol. You could also skip the conversion and just read an int sized chunk into such a struct, the point being that the bitfield sizes match the protocol field sizes. This is extremely common in network programming -- if you want to send a packet following the protocol, you just populate your struct, serialize, and transmit.
Note that pragma pack is not standard C but it is recognized by various common compilers. Without pragma pack, however, the compiler is free to place padding between fields, which makes the struct less useful for the purposes described above.
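Since the order of bit-fields inside a word is implementation-defined, a more portable alternative is to keep the value in a std::uint32_t and extract the fields with shifts and masks. The sketch below assumes, purely for illustration, that "one" sits in the lowest 4 bits, followed by "two" and then "three"; the names EgFields, pack_eg and unpack_eg are made up.

```cpp
#include <cstdint>

// Same three fields as struct eg, stored as ordinary integers.
struct EgFields {
    std::uint32_t one;    // 4 bits on the wire
    std::uint32_t two;    // 8 bits on the wire
    std::uint32_t three;  // 16 bits on the wire
};

// Assumed layout: bits 0-3 = one, bits 4-11 = two, bits 12-27 = three.
inline std::uint32_t pack_eg(const EgFields& f) {
    return (f.one & 0xFu) | ((f.two & 0xFFu) << 4) | ((f.three & 0xFFFFu) << 12);
}

inline EgFields unpack_eg(std::uint32_t word) {
    EgFields f;
    f.one   = word & 0xFu;
    f.two   = (word >> 4) & 0xFFu;
    f.three = (word >> 12) & 0xFFFFu;
    return f;
}
```

The result is the same on every compiler, at the cost of writing the layout out by hand.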
I am using the following struct for network communication. It creates lots of unnecessary bytes in between.
It gives a different size than the expected 8 bytes.
struct HttpPacket {
unsigned char x1;
union {
struct {
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
unsigned char bytes[7];
unsigned long num;
};
};
And the following gives a different size even though I am removing a field from the union:
struct HttpPacket {
unsigned char x1;
union {
struct {
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
unsigned long num;
};
};
Also, a clearer example:
struct {
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
It gives a size of 8 instead of 7.
If I add one more field, it still gives the same size:
struct {
unsigned char EXTRAADDEDFIELD;
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
Can someone please help resolve this issue?
UPDATE: I need the format to hold while transmitting the packet, so I want to skip this padding.
C makes few guarantees about the size of a struct. The compiler must keep the members in declaration order, but it may insert padding between and after them. Usually, as in this case, it makes the size word-aligned since that's fastest on most machines.
Ever heard of alignment and padding?
Basically, to ensure fast access, certain types have to be on certain bounds of memory addresses.
This is called alignment.
To achieve that, the compiler is allowed to insert extra bytes into your data structure.
This is called padding.
By default, structure fields are aligned on natural boundaries. For example, a 4-byte field will start on a 4-byte boundary. The compiler inserts pad bytes to achieve this. You can avoid the padding by using #pragma pack(1) or other similar compiler directives.
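The effect is easy to observe with sizeof and offsetof. The sketch below uses the inner struct from the question under two made-up names and assumes a compiler that understands #pragma pack(push, 1) and #pragma pack(pop) (GCC, Clang and MSVC do) and a 2-byte unsigned short.

```cpp
#include <cstddef>

// Default layout: the compiler may insert a pad byte before `host`.
struct NaturalPacket {
    unsigned char  len;
    unsigned short host;
    unsigned char  content[4];
};

// Packed layout: members are placed back to back.
#pragma pack(push, 1)
struct PackedPacket {
    unsigned char  len;
    unsigned short host;
    unsigned char  content[4];
};
#pragma pack(pop)

// Holds only where unsigned short is 2 bytes and the pragma is honoured.
static_assert(sizeof(PackedPacket) == 7, "packing was not applied");
static_assert(offsetof(PackedPacket, host) == 1, "host should follow len directly");
```

On a typical desktop compiler sizeof(NaturalPacket) is 8, which matches the size reported in the question.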
If you have a C99 compiler and can use the "new" fixed-width types: make an array of uint8_t and do the separation in members yourself.
uint8_t data[8];
x1 = data[0];
len = data[1];
host = data[2] * 256 + data[3]; /* big endian */
content[0] = data[4];
content[1] = data[5];
content[2] = data[6];
content[3] = data[7];
/* ... */
You can follow the same procedure in C89 if you can rely on CHAR_BIT being 8.
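The same idea works in the sending direction. Here is a sketch in C++ with both directions wrapped in functions; PacketFields, encode_packet and decode_packet are made-up names, and the big-endian order for host follows the snippet above.

```cpp
#include <cstdint>

// In-memory form of the packet; its layout and padding no longer matter.
struct PacketFields {
    std::uint8_t  x1;
    std::uint8_t  len;
    std::uint16_t host;
    std::uint8_t  content[4];
};

// Writes the 8 wire bytes, with `host` in big-endian order.
inline void encode_packet(const PacketFields& p, std::uint8_t data[8]) {
    data[0] = p.x1;
    data[1] = p.len;
    data[2] = static_cast<std::uint8_t>(p.host >> 8);
    data[3] = static_cast<std::uint8_t>(p.host & 0xFF);
    for (int i = 0; i < 4; ++i) data[4 + i] = p.content[i];
}

// Reads the 8 wire bytes back into the in-memory form.
inline PacketFields decode_packet(const std::uint8_t data[8]) {
    PacketFields p;
    p.x1 = data[0];
    p.len = data[1];
    p.host = static_cast<std::uint16_t>(data[2] * 256 + data[3]);
    for (int i = 0; i < 4; ++i) p.content[i] = data[4 + i];
    return p;
}
```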