(C/C++) Structures containing unions containing structures...? - c++

I am programming on Linux, which is new to me. I am working on a project to design a 'layer 7' network protocol, and we have packets that contain resources. The length of a resource depends on its type. I am fairly new to C/C++ and am not sure I understand unions all that well. The idea was to make a "generic resource" type so that, whatever the resource was, I could cast a void* to a pointer to this typedef'd structure and then read the data through whichever member I please, and the union would take care of the 'casting'. Anyway, here is what I came up with:
typedef struct _pktresource
{
unsigned char Type; // The type of the resource.
union {
struct { // This is used for variable length data.
unsigned short Size;
void *Data;
};
void *ResourceData; // Just a generic pointer to the data.
unsigned char Byte;
char SByte;
short Int16;
unsigned short UInt16;
int Int32;
unsigned int UInt32;
long long Int64;
unsigned long long UInt64;
float Float;
double Double;
unsigned int Time;
};
} pktresource, *ppktresource;
The principle behind this was simple. But when I do something like
pktresource.Size = XXXX
It starts out 4 bytes into the structure instead of 1 byte. Am I failing to grasp a major concept here? Because it feels like I am.
EDIT: Forgot to mention, when I reference
pktresource.Type
It starts at the beginning like it's supposed to.
EDIT: Correction was to add pragma statements for proper alignment. After fix, the code looks like:
#pragma pack(push)
#pragma pack(1)
typedef struct _pktresource
{
unsigned char Type; // The type of the resource.
union {
struct { // This is used for variable length data.
unsigned short Size;
unsigned char Data[];
};
unsigned char ResourceData[]; // Just a generic pointer to the data.
unsigned char Byte;
char SByte;
short Int16;
unsigned short UInt16;
int Int32;
unsigned int UInt32;
long long Int64;
unsigned long long UInt64;
float Float;
double Double;
unsigned int Time;
};
} pktresource, *ppktresource;
#pragma pack(pop)

Am I failing to grasp a major concept here?
You're missing knowledge of structure alignment. Basically, it forces certain fields to be aligned by > 1 byte boundaries depending on their size. You can use #pragma to override this behavior, but that can cause interoperability issues if the structure is used anywhere outside your application.
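To see the effect described above, here is a minimal sketch that prints where the union starts with and without packing. The struct names are made up for illustration and the union is cut down to two members; the exact default offset depends on your compiler and target, so treat the numbers as something to check on your own machine rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for the question's struct (hypothetical names). */
struct res_default {
    unsigned char type;
    union {
        unsigned short size;
        double         dbl;
    } u;
};

#pragma pack(push, 1)
struct res_packed {
    unsigned char type;
    union {
        unsigned short size;
        double         dbl;
    } u;
};
#pragma pack(pop)

/* With 1-byte packing the union must follow `type` directly. */
static_assert(offsetof(struct res_packed, u) == 1, "packed union should start at offset 1");

/* Print where the union starts, and the total size, for each layout. */
void print_offsets(void)
{
    printf("default: offset %zu, size %zu\n",
           offsetof(struct res_default, u), sizeof(struct res_default));
    printf("packed:  offset %zu, size %zu\n",
           offsetof(struct res_packed, u), sizeof(struct res_packed));
}
```

Calling print_offsets() on a typical desktop compiler should show the default layout starting the union several bytes in, while the packed one starts it at offset 1.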

I think the problem is alignment. By default, most compilers align each member to its natural boundary, and a union is aligned to the strictest requirement of any of its members, in this case 32 bits / 4 bytes. So, since you have that unsigned char Type field up front, the compiler is pushing the union, and with it the Size field, to the next 4-byte boundary.
Try
#pragma pack(1)
ahead of your structure definitions.
I don't know what compiler you are using, but that's good old-fashioned C code that's been regularly in use for network programming since before most of these rude kids on StackOverflow were born.

Related

C++ , create structure with an optional value

I need to create structure with an optional value :
typedef struct pkt_header{
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Protected_Payload_Length; // optional (present/not present)
unsigned short Version;
} PKT_HEADER;
How can I sometimes use pkt_header->Protected_Payload_Length and sometimes leave it out of the struct when the field is not present?
My first idea is to declare unsigned char * Protected_Payload_Length, pass NULL when I do not use the field, and otherwise use the unsigned char * to store my unsigned short value.
typedef struct pkt_header{
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned char * Protected_Payload_Length; // optional
unsigned short Version;
} PKT_HEADER;
I prepare my packet like this (and send this):
PKT_HEADER header;
header.Packet_Type = 0x0001;
header.Unprotected_Payload_Length = 0x0b00;
header.Protected_Payload_Length = NULL;
header.Version = 0x0000;
I receive response and do this :
PKT_HEADER * header= (PKT_HEADER*)recvbuf;
printf("Packet_Type : %04x\n", header->Packet_Type);
printf("Unprotected_Payload_Length : %04x\n", header->Unprotected_Payload_Length);
printf("Version : %04x\n", header->Version);
But in this case, if I understand correctly, unsigned char * Protected_Payload_Length is a pointer that is 4 bytes long, so header->Protected_Payload_Length occupies 4 bytes, but I need it to occupy 0 bytes because the field is not present in this case.
Do I have to declare an appropriate structure in the data format or is there some other way to play with the structures?
Thanks for your help.
Beware. Structs can have padding, members are not necessarily adjacent in memory. Moreover reinterpreting something as a PKT_HEADER when that something is not a PKT_HEADER object is not allowed. Instead of casting:
PKT_HEADER * header= (PKT_HEADER*)recvbuf;
you probably should use memcpy. Having said this, now to your actual question...
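A minimal sketch of the memcpy approach, assuming the bytes in the buffer already match the struct's in-memory layout (same padding and byte order). read_header is a hypothetical helper name, not something from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct pkt_header {
    unsigned short Packet_Type;
    unsigned short Unprotected_Payload_Length;
    unsigned short Protected_Payload_Length;
    unsigned short Version;
} PKT_HEADER;

/* Copy a header out of a receive buffer.
 * Returns 0 on success, -1 if the buffer is too short. */
int read_header(const unsigned char *recvbuf, size_t len, PKT_HEADER *out)
{
    assert(recvbuf != NULL && out != NULL);
    if (len < sizeof *out)
        return -1;
    /* memcpy has no alignment or aliasing requirements on recvbuf. */
    memcpy(out, recvbuf, sizeof *out);
    return 0;
}
```

Unlike the pointer cast, this works even if recvbuf is not suitably aligned for a PKT_HEADER.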
If you rely on members having a specific order in the struct, then inheritance is not an option. In memory the base object comes first, then the derived members, you cannot mix that. For example
struct foo {
int x;
};
struct bar : foo {
int y;
int z;
};
Then a bar object will have in memory
| x | optional padding | y | optional padding | z | optional padding |
There is no simple way to get | y | x | z |.
If you want two different types the easiest is to define two different types:
struct PKT_HEADER_A {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Protected_Payload_Length; // present
unsigned short Version;
};
struct PKT_HEADER_B {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
//unsigned short Protected_Payload_Length; // not present
unsigned short Version;
};
Note that your way to typedef the struct is a C-ism. It is not necessary (and not recommended) in C++.
You should probably take a look at how nanopb or Protobuf pack data, because it sounds like you have a packing problem. Data should be pieced together before sending, and the Packet_Type would encode which header to decode/encode with.
If you can't properly pack/unpack, an alternative is to create both
typedef struct {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Protected_Payload_Length;
unsigned short Version;
} PKT_HEADER_FULL;
typedef struct {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Version;
} PKT_HEADER_SHORT;
then create your packet header as a union of the two
typedef union u{
PKT_HEADER_FULL full;
PKT_HEADER_SHORT concat;
}PKT_HEADER;
// or as this
typedef struct{
unsigned short Protected_Payload_Length;
unsigned short version;
}longform;
typedef struct{
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
union{ // anonymous union: it must not have a tag, or it declares no member
longform l;
unsigned short version;
};
} PKT_HEADER;
Then the data coming in could be decoded either way (again, depending on Packet_Type), and the remaining space can be ignored. A caveat to this method is that you can't use sizeof(PKT_HEADER) to get the packet's length, because the size of the union is always that of its largest member.
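Another way to handle the optional field is to skip struct overlays entirely and decode field by field. This is only a sketch: the rule that type 0x0002 carries Protected_Payload_Length, the big-endian byte order, and the names parsed_header and parse_header are all assumptions made for illustration, since the question does not say how presence is signalled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the packet type that carries the optional field. */
#define PKT_TYPE_PROTECTED 0x0002u

typedef struct {
    uint16_t packet_type;
    uint16_t unprotected_len;
    bool     has_protected_len;
    uint16_t protected_len;      /* meaningful only if has_protected_len */
    uint16_t version;
} parsed_header;

/* Read a 16-bit big-endian value (assumed wire byte order). */
static uint16_t get_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns the number of bytes consumed, or 0 if the buffer is too short. */
size_t parse_header(const unsigned char *buf, size_t len, parsed_header *h)
{
    size_t pos = 0;

    assert(buf != NULL && h != NULL);
    if (len < 6)
        return 0;

    h->packet_type     = get_be16(buf + pos); pos += 2;
    h->unprotected_len = get_be16(buf + pos); pos += 2;

    h->has_protected_len = (h->packet_type == PKT_TYPE_PROTECTED);
    h->protected_len = 0;
    if (h->has_protected_len) {
        if (len < 8)
            return 0;
        h->protected_len = get_be16(buf + pos); pos += 2;
    }

    h->version = get_be16(buf + pos); pos += 2;
    return pos;
}
```

The return value tells the caller how long the header actually was, which sidesteps the sizeof caveat of the union approach.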

Limiting structures size by use of :

Why is this piece of code needed?
typedef struct corr_id_{
unsigned int size:8;
unsigned int valueType:8;
unsigned int classId:8;
unsigned int reserved:8;
} CorrId;
I did some investigation and found that this way we limit the memory consumption to just what we need.
For example:
typedef struct corr_id_new{
unsigned int size;
unsigned int valueType;
unsigned int classId;
unsigned int reserved;
} CorrId_NEW;
typedef struct corr_id_{
unsigned int size:8;
unsigned int valueType:8;
unsigned int classId:8;
unsigned int reserved:8;
} CorrId;
#include <iostream>
int main(){
CorrId_NEW Obj1;
CorrId Obj2;
std::cout<<sizeof(Obj1)<<std::endl;
std::cout<<sizeof(Obj2)<<std::endl;
}
Output:-
16
4
I want to understand the real use case for this. Why can't we declare the struct something like this:
typedef struct corr_id_new{
unsigned _int8 size;
unsigned _int8 valueType;
unsigned _int8 classId;
unsigned _int8 reserved;
} CorrId_NEW;
Does this have something to do with compiler optimizations? Or, what are the benefits of declaring the structure that way?
I want to understand the real use case of such scenarios?
For example, the status register of some CPU may be laid out as a series of single-bit flags and multi-bit fields.
To represent it as a structure, you could use bit-fields:
struct CSR
{
unsigned N: 1;
unsigned Z: 1;
unsigned C: 1;
unsigned V: 1;
unsigned : 20;
unsigned I: 1;
unsigned : 2;
unsigned M: 5;
};
You can see here that the field widths are not multiples of 8, so you can't use int8_t or something similar.
Let's look at a simple scenario:
typedef struct student{
unsigned int age:8; // 8 bits are enough for a student's age (up to 255 years)
unsigned int roll_no:16; // roll_no can go up to 2^16 - 1, which is large enough
unsigned int classId:4; // class ID is 4 bits (0-15), as needed
unsigned int reserved:4; // reserved
} student;
In the above case everything fits in 32 bits.
But if you used a plain unsigned int for each field, it would have taken 4*32 bits.
If we store age as a 32-bit integer, it can hold values from 0 to 2^32 - 1. But a person's age is at most 100, 140 or 150 (even for somebody still studying at that age), which needs at most 8 bits, so why waste the remaining 24 bits?
You are right, the last structure definition with unsigned _int8 is almost equivalent to the definition using :8. Almost, because byte order can make a difference here, so you might find that the memory layout is reversed in the two cases.
The main purpose of the :8 notation is to allow the use of fractional bytes, as in
struct foo {
uint32_t a:1;
uint32_t b:2;
uint32_t c:3;
uint32_t d:4;
uint32_t e:5;
uint32_t f:6;
uint32_t g:7;
uint32_t h:4;
};
To minimize padding, I strongly suggest to learn the padding rules yourself, they are not hard to grasp. If you do, you can know that your version with unsigned _int8 does not add any padding. Or, if you don't feel like learning those rules, just use __attribute__((__packed__)) on your struct, but that may introduce a severe performance penalty.
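As a quick check of the claim that the byte-sized version adds no padding, here is a small sketch comparing the two layouts. uint8_t stands in for the question's unsigned _int8, and the type names CorrIdBits and CorrIdBytes are made up for this example; the bit-field size is what common compilers produce, not something the C standard promises.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit-field version from the question. */
typedef struct {
    unsigned int size:8;
    unsigned int valueType:8;
    unsigned int classId:8;
    unsigned int reserved:8;
} CorrIdBits;

/* Byte-sized version; uint8_t stands in for unsigned _int8. */
typedef struct {
    uint8_t size;
    uint8_t valueType;
    uint8_t classId;
    uint8_t reserved;
} CorrIdBytes;

/* Four 1-byte members have alignment 1, so no padding is expected. */
static_assert(sizeof(CorrIdBytes) == 4, "unexpected padding in CorrIdBytes");

void print_sizes(void)
{
    printf("bit-fields: %zu, bytes: %zu\n",
           sizeof(CorrIdBits), sizeof(CorrIdBytes));
}
```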
It's often used with pragma pack to create bitfields with labels, e.g.:
#pragma pack(1)
struct eg {
unsigned int one : 4;
unsigned int two : 8;
unsigned int three : 16;
};
Such a struct can be converted to an int32_t and back (C does not allow a direct cast between a struct and an integer, so in practice this means memcpy or a union). This might be useful when reading serialized data that follows a (language agnostic) protocol: you extract an int and copy it into a struct eg to match the fields and field sizes defined in the protocol. You could also skip the conversion and just read an int sized chunk into such a struct, the point being that the bitfield sizes match the protocol field sizes. This is extremely common in network programming: if you want to send a packet following the protocol, you just populate your struct, serialize, and transmit.
Note that pragma pack is not standard C but it is recognized by various common compilers. Without pragma pack, however, the compiler is free to place padding between fields, reducing the use value for the purposes described above.
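Because the order in which a compiler allocates bit-fields is implementation-defined, a common alternative for protocol work is to do the same thing with shifts and masks. The sketch below assumes a made-up layout (one in bits 0-3, two in bits 4-11, three in bits 12-27); the names eg_fields, eg_pack and eg_unpack are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Unpacked view of the three protocol fields. */
struct eg_fields {
    unsigned one;    /* 4 bits  */
    unsigned two;    /* 8 bits  */
    unsigned three;  /* 16 bits */
};

/* Combine the fields into one 32-bit word using the assumed layout. */
uint32_t eg_pack(struct eg_fields f)
{
    assert(f.one <= 0xFu && f.two <= 0xFFu && f.three <= 0xFFFFu);
    return (uint32_t)f.one
         | ((uint32_t)f.two   << 4)
         | ((uint32_t)f.three << 12);
}

/* Split a 32-bit word back into its fields. */
struct eg_fields eg_unpack(uint32_t w)
{
    struct eg_fields f;
    f.one   = w & 0xFu;
    f.two   = (w >> 4) & 0xFFu;
    f.three = (w >> 12) & 0xFFFFu;
    return f;
}
```

The bit positions here are fixed by the code rather than by the compiler, so the same word comes out on every platform.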

Difference between different integer types

I was wondering what is the difference between uint32_t and uint32, and when I looked in the header files it had this:
types.h:
/** @brief 32-bit unsigned integer. */
typedef unsigned int uint32;
stdint.h:
typedef unsigned uint32_t;
This only leads to more questions:
What is the difference between
unsigned varName;
and
unsigned int varName;
?
I am using MinGW.
unsigned and unsigned int are synonymous, much like unsigned short [int] and unsigned long [int].
uint32_t is a type that's (optionally) defined by the C standard. uint32 is just a name you made up, although it happens to be defined as the same thing.
There is no difference.
unsigned int = uint32 = uint32_t = unsigned in your case and unsigned int = unsigned always
unsigned and unsigned int are synonymous for historical reasons; they both mean "unsigned integer of the most natural size for the CPU architecture/platform", which is often (but by no means always) 32 bits on modern platforms.
<stdint.h> is a standard header in C99 that is supposed to give type definitions for integers of particular sizes, with the uint32_t naming convention.
The <types.h> that you're looking at appears to be non-standard and presumably belongs to some framework your project is using. Its uint32 typedef is compatible with uint32_t. Whether you should use one or the other in your code is a question for your manager.
There is absolutely no difference between unsigned and unsigned int.
Whether that type is a good match for uint32_t is implementation-dependent though; an int could be "shorter" than 32 bits.
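If you want the compiler to confirm these points for your own platform, a small C11 sketch along these lines should do it. The macro and function names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* uint32_t, where it exists, is exactly 32 bits wide. */
static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t must be 32 bits");

/* unsigned int is only guaranteed to hold at least 16 bits. */
static_assert(UINT_MAX >= 65535u, "unsigned int holds at least 16 bits");

/* Evaluates to 1 if the expression has type unsigned int, else 0. */
#define IS_UNSIGNED_INT(x) _Generic((x), unsigned int: 1, default: 0)

/* `unsigned` and `unsigned int` are the same type, so this returns 1. */
int unsigned_is_unsigned_int(void)
{
    unsigned plain = 0;
    return IS_UNSIGNED_INT(plain);
}

/* Returns 1 only where uint32_t happens to be a typedef of unsigned int. */
int uint32_is_unsigned_int(void)
{
    uint32_t v = 0;
    return IS_UNSIGNED_INT(v);
}
```

The second function is the platform-dependent one: on a target with a 16-bit int it would return 0.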

why add fillers in a c++ struct?

What is the effect of fillers in a C++ struct? I often see them in C++ APIs. For example:
struct example
{
unsigned short a;
unsigned short b;
char c[3];
char filler1;
unsigned short e;
char filler2;
unsigned int g;
};
This struct is meant to be sent over the network.
struct example
{
unsigned short a; //2 bytes
unsigned short b;//2 bytes
//4 bytes consumed
char c[3];//3 bytes
char filler1;//1 bytes
//4 bytes consumed
unsigned short e;//2 bytes
char filler2;//1 bytes
//3 bytes consumed ,should be filler[2]
unsigned int g;//4 bytes
};
Because sometimes you don't actually control the format of the data you're using.
The format may be specified by something beyond your control. For example, it may be created in a system with different alignment requirements to yours.
Alternatively, the data may have real data in those filler areas that your code doesn't care about.
Those fillers are usually inserted to explicitly make sure some of the members of a structure are naturally aligned i.e. their offset inside a structure is a multiple of its size.
In the example below, assume char is 1 byte, short is 2 and int is 4.
struct example
{
unsigned short a;
unsigned short b;
char c[3];
char filler1;
unsigned short e; // starts at offset 8
char filler2[2];
unsigned int g; // starts at offset 12
};
If you don't specify any fillers, a compiler will usually add the necessary padding bytes to ensure a proper alignment of the structure members.
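A small sketch to check that claim: it compares the struct with explicit fillers against the same struct without them. The struct names are made up, and the expected offsets assume 2-byte short and 4-byte int with natural alignment, as stated above.

```c
#include <assert.h>
#include <stddef.h>

struct with_fillers {
    unsigned short a;
    unsigned short b;
    char c[3];
    char filler1;
    unsigned short e;
    char filler2[2];
    unsigned int g;
};

/* Same members, with the padding left to the compiler. */
struct no_fillers {
    unsigned short a;
    unsigned short b;
    char c[3];
    unsigned short e;
    unsigned int g;
};

/* Holds when short is 2 bytes and int is 4 bytes, naturally aligned. */
static_assert(offsetof(struct with_fillers, g) == 12, "g expected at offset 12");

/* Returns 1 if the compiler's own padding matches the explicit fillers. */
int same_layout(void)
{
    return offsetof(struct with_fillers, e) == offsetof(struct no_fillers, e)
        && offsetof(struct with_fillers, g) == offsetof(struct no_fillers, g)
        && sizeof(struct with_fillers) == sizeof(struct no_fillers);
}
```

On a typical desktop compiler same_layout() returns 1, which is the point: the fillers only make visible what the compiler would do anyway.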
Btw, these fields can also be used for reserved fields that might appear in the future.
updated:
Since it has been mentioned that a structure is a network packet, the fillers are required to get a structure that is compatible with the one being passed from another host.
However, inserting filler bytes in this case might not be enough (especially, if portability is required). If these structures are to be sent via a network as is (i.e. without manually packing into a separate buffer for sending), you have to inform a compiler that the structure should be packed.
With the Microsoft compiler this can be achieved using #pragma pack:
#pragma pack(1)
struct T {
char t;
int i;
short j;
double k;
};
In gcc you can use __attribute__((packed))
struct foo {
char c;
int x;
} __attribute__((packed));
However, many people prefer to manually pack/unpack structures into a raw byte array, because accessing misaligned data on some systems might not be [properly] supported.
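A minimal sketch of that manual approach for the struct foo above. The 5-byte wire format, the big-endian byte order, the use of int32_t in place of int, and the names foo_pack and foo_unpack are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct foo {
    char    c;
    int32_t x;   /* int32_t instead of int so the wire size is fixed */
};

enum { FOO_WIRE_SIZE = 5 };  /* 1 byte for c + 4 bytes for x, no padding */

/* Serialize into out; returns bytes written, or 0 if out is too small. */
size_t foo_pack(const struct foo *f, unsigned char *out, size_t cap)
{
    uint32_t ux;

    assert(f != NULL && out != NULL);
    if (cap < FOO_WIRE_SIZE)
        return 0;

    ux = (uint32_t)f->x;
    out[0] = (unsigned char)f->c;
    out[1] = (unsigned char)(ux >> 24);   /* most significant byte first */
    out[2] = (unsigned char)(ux >> 16);
    out[3] = (unsigned char)(ux >> 8);
    out[4] = (unsigned char)ux;
    return FOO_WIRE_SIZE;
}

/* Deserialize from in; returns 0 on success, -1 if the input is too short. */
int foo_unpack(const unsigned char *in, size_t len, struct foo *f)
{
    uint32_t ux;

    assert(in != NULL && f != NULL);
    if (len < FOO_WIRE_SIZE)
        return -1;

    f->c = (char)in[0];
    ux = ((uint32_t)in[1] << 24) | ((uint32_t)in[2] << 16)
       | ((uint32_t)in[3] << 8)  |  (uint32_t)in[4];
    f->x = (int32_t)ux;
    return 0;
}
```

Only single bytes are read and written, so this never performs a misaligned access and does not depend on the host's byte order.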
Depending on what code you're working with, they may be attempting to align the structure on word boundaries (32 bits in your case). This is a speed optimization. Doing this by hand has largely been rendered obsolete by decent optimizing compilers; however, if the compiler was instructed not to optimize this piece of code, or the compiler is very low-end (e.g. for an embedded system), it may be better to handle this yourself. It basically boils down to how much you trust the compiler.
The other reason is for writing binary files, where reserved bytes have been left in the file format specification.

C struct sizes inconsistence [duplicate]

I am using the following struct for network communication. It creates lots of unnecessary bytes in between.
It gives a different size than the expected 8 bytes.
struct HttpPacket {
unsigned char x1;
union {
struct {
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
unsigned char bytes[7];
unsigned long num;
};
};
And the following gives a different size even though I am removing a field from the union:
struct HttpPacket {
unsigned char x1;
union {
struct {
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
unsigned long num;
};
};
Also, A more clear example
struct {
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
And it gives a size of 8, instead of 7.
And if I add one more field, it still gives the same size:
struct {
unsigned char EXTRAADDEDFIELD;
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
Can someone please help resolve this issue?
UPDATE: I need the format to hold while transmitting the packet, so I want to skip this padding.
C makes few guarantees about the size of a struct. The compiler must keep the members in declaration order, but it is allowed to insert padding between them and at the end. Usually, as in this case, it pads so that each member is aligned, since that's fastest on most machines.
Ever heard of alignment and padding?
Basically, to ensure fast access, certain types have to be on certain bounds of memory addresses.
This is called alignment.
To achieve that, the compiler is allowed to insert bytes into your data structure to achieve that alignment.
This is called padding.
By default, structure fields are aligned on natural boundaries. For example, a 4-byte field will start on a 4-byte boundary. The compiler inserts pad bytes to achieve this. You can avoid the padding by using #pragma pack(1) or other similar compiler directives.
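Here is a small sketch that reproduces the sizes from the question and shows where the extra byte goes. The struct names are made up, and the numbers assume a 2-byte unsigned short with 2-byte alignment, which is what common desktop compilers use.

```c
#include <assert.h>
#include <stddef.h>

struct packet {
    unsigned char  len;
    unsigned short host;        /* needs 2-byte alignment: 1 pad byte before it */
    unsigned char  content[4];
};

struct packet_extra {
    unsigned char  extra;       /* takes the slot that was padding */
    unsigned char  len;
    unsigned short host;
    unsigned char  content[4];
};

#pragma pack(push, 1)
struct packet_packed {
    unsigned char  len;
    unsigned short host;
    unsigned char  content[4];
};
#pragma pack(pop)

/* With 1-byte packing the members sit back to back: 1 + 2 + 4 bytes. */
static_assert(sizeof(struct packet_packed) == 7, "packed struct should be 7 bytes");

/* Number of pad bytes the compiler put between len and host. */
size_t padding_before_host(void)
{
    return offsetof(struct packet, host)
         - (offsetof(struct packet, len) + sizeof(unsigned char));
}
```

This also explains why adding one more unsigned char did not change the size: the new member simply occupies the byte that was padding before.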
If you have a C99 compiler and can use the "new" fixed-width types: make an array of uint8_t and do the separation in members yourself.
uint8_t data[8];
x1 = data[0];
len = data[1];
host = data[2] * 256 + data[3]; /* big endian */
content[0] = data[4];
content[1] = data[5];
content[2] = data[6];
content[3] = data[7];
/* ... */
You can follow the same procedure in C89 if you can rely on CHAR_BIT being 8.