Why is this piece of code needed?
typedef struct corr_id_{
unsigned int size:8;
unsigned int valueType:8;
unsigned int classId:8;
unsigned int reserved:8;
} CorrId;
I investigated and found that declaring the fields this way limits memory consumption to just what we need.
For example:
#include <iostream>
typedef struct corr_id_new{
unsigned int size;
unsigned int valueType;
unsigned int classId;
unsigned int reserved;
} CorrId_NEW;
typedef struct corr_id_{
unsigned int size:8;
unsigned int valueType:8;
unsigned int classId:8;
unsigned int reserved:8;
} CorrId;
int main(){
CorrId_NEW Obj1;
CorrId Obj2;
std::cout<<sizeof(Obj1)<<std::endl;
std::cout<<sizeof(Obj2)<<std::endl;
}
Output:-
16
4
I want to understand the real use case for this. Why can't we declare the struct like this instead?
typedef struct corr_id_new{
unsigned _int8 size;
unsigned _int8 valueType;
unsigned _int8 classId;
unsigned _int8 reserved;
} CorrId_NEW;
Does this have something to do with compiler optimizations? Or what are the benefits of declaring the structure that way?
I want to understand the real use case of such scenarios?
For example, the layout of the status register of some CPU may look like this:
To represent it as a structure, you could use bit-fields:
struct CSR
{
unsigned N: 1;
unsigned Z: 1;
unsigned C: 1;
unsigned V: 1;
unsigned : 20;
unsigned I: 1;
unsigned : 2;
unsigned M: 5;
};
You can see here that the fields are not multiples of 8 bits, so you can't use int8_t or something similar.
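As a minimal sketch of how such a struct is used from code: the helper below fills in two of the fields. The name make_csr and the meaning given to the I and M fields are assumptions for illustration only, and the placement of the bits inside the word is implementation-defined, so the size check reflects common compilers rather than a guarantee.

```cpp
#include <cassert>

// Same field widths as the CSR struct above: 1+1+1+1+20+1+2+5 = 32 bits.
struct CSR {
    unsigned N : 1;
    unsigned Z : 1;
    unsigned C : 1;
    unsigned V : 1;
    unsigned   : 20;  // unnamed bit-field: reserved bits, not accessible by name
    unsigned I : 1;
    unsigned   : 2;
    unsigned M : 5;
};

// Holds on common compilers where unsigned is 32 bits wide; not guaranteed by the standard.
static_assert(sizeof(CSR) == 4, "CSR is expected to occupy one 32-bit word");

// Hypothetical helper: builds a CSR value with the given mode and interrupt flag.
inline CSR make_csr(unsigned mode, bool interrupt_flag) {
    assert(mode < 32u);  // M is 5 bits wide, so only 0..31 can be stored
    CSR r{};             // all named fields start at zero
    r.M = mode;
    r.I = interrupt_flag ? 1u : 0u;
    return r;
}
```

Each field is read and written like an ordinary member, e.g. make_csr(0x1F, true).M, while the compiler generates the shifting and masking.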
Let's look at a simple scenario:
struct student{
unsigned int age:8; // 8 bits are enough to store a student's age (up to 255 years)
unsigned int roll_no:16; // roll_no can go up to 2^16 - 1, which is large enough
unsigned int classId:4; // class ID can be 4 bits long (0-15), as needed
unsigned int reserved:4; // reserved
};
In the case above, all four fields fit in 32 bits.
If you used a plain unsigned int for each field, the struct would take 4*32 bits.
If we store age as a 32-bit integer, it can hold values from 0 to 2^32 - 1. But a person's age is at most 100, 140 or 150 years (even for somebody still studying at that age), which needs at most 8 bits, so why waste the remaining 24 bits?
You are right, the last structure definition with unsigned _int8 is almost equivalent to the definition using :8. Almost, because byte order can make a difference here, so you might find that the memory layout is reversed in the two cases.
The main purpose of the :8 notation is to allow the use of fractional bytes, as in
struct foo {
uint32_t a:1;
uint32_t b:2;
uint32_t c:3;
uint32_t d:4;
uint32_t e:5;
uint32_t f:6;
uint32_t g:7;
uint32_t h:4;
};
To minimize padding, I strongly suggest learning the padding rules yourself; they are not hard to grasp. Once you know them, you can see that your version with unsigned _int8 does not add any padding. Or, if you don't feel like learning those rules, just use __attribute__((__packed__)) on your struct, but that may introduce a severe performance penalty.
It's often used with pragma pack to create bitfields with labels, e.g.:
#pragma pack(1)
struct eg {
unsigned int one : 4;
unsigned int two : 8;
unsigned int three : 16;
};
Such a struct can be converted to and from an int32_t (for example by copying the bytes with memcpy). This might be useful when reading serialized data that follows a (language agnostic) protocol: you extract an int and convert it to a struct eg to match the fields and field sizes defined in the protocol. You could also skip the conversion and just read an int-sized chunk into such a struct, the point being that the bitfield sizes match the protocol field sizes. This is extremely common in network programming: if you want to send a packet following the protocol, you just populate your struct, serialize, and transmit.
Note that pragma pack is not standard C but it is recognized by various common compilers. Without pragma pack, however, the compiler is free to place padding between fields, reducing the use value for the purposes described above.
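Because the order of bit-fields inside the storage unit is also left to the compiler, a portable alternative is to extract the fields with shifts and masks. This is only a sketch: the names EgFields, decode_eg and encode_eg are made up, and the bit positions (one in the lowest 4 bits, then two, then three) are an assumption, since the text above does not fix a wire layout.

```cpp
#include <cassert>
#include <cstdint>

// Decoded form of the three protocol fields (widths 4, 8 and 16 bits).
struct EgFields {
    std::uint8_t  one;    // 4 bits used
    std::uint8_t  two;    // 8 bits
    std::uint16_t three;  // 16 bits
};

// ASSUMED layout: 'one' in bits 0-3, 'two' in bits 4-11, 'three' in bits 12-27.
inline EgFields decode_eg(std::uint32_t word) {
    EgFields f;
    f.one   = static_cast<std::uint8_t>(word & 0xFu);
    f.two   = static_cast<std::uint8_t>((word >> 4) & 0xFFu);
    f.three = static_cast<std::uint16_t>((word >> 12) & 0xFFFFu);
    return f;
}

// Inverse of decode_eg: packs the three fields back into one 32-bit word.
inline std::uint32_t encode_eg(const EgFields& f) {
    assert(f.one <= 0xFu);  // 'one' must fit in 4 bits
    return static_cast<std::uint32_t>(f.one)
         | (static_cast<std::uint32_t>(f.two) << 4)
         | (static_cast<std::uint32_t>(f.three) << 12);
}
```

The result is the same on every compiler, at the cost of writing the shifts by hand.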
I need to create a structure with an optional value:
typedef struct pkt_header{
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Protected_Payload_Length; // optional (present/not present)
unsigned short Version;
} PKT_HEADER;
How can I sometimes use pkt_header->Protected_Payload_Length and sometimes leave it out of the struct when the field is not present?
My first idea is to declare unsigned char * Protected_Payload_Length, pass NULL when I don't use the field, and otherwise use the unsigned char * to store my unsigned short value.
typedef struct pkt_header{
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned char * Protected_Payload_Length; // optional
unsigned short Version;
} PKT_HEADER;
I prepare my packet like this (and send this):
PKT_HEADER header;
header.Packet_Type = 0x0001;
header.Unprotected_Payload_Length = 0x0b00;
header.Protected_Payload_Length = NULL;
header.Version = 0x0000;
I receive response and do this :
PKT_HEADER * header= (PKT_HEADER*)recvbuf;
printf("Packet_Type : %04x\n", header->Packet_Type);
printf("Unprotected_Payload_Length : %04x\n", header->Unprotected_Payload_Length);
printf("Version : %04x\n", header->Version);
But in this case, if I understand correctly, unsigned char * Protected_Payload_Length holds a pointer that is 4 bytes long, so header->Protected_Payload_Length takes up 4 bytes, whereas I need 0 bytes because the field is not present in this particular case.
Do I have to declare a separate structure for each data format, or is there some other way to handle this with structures?
Thanks for your help.
Beware: structs can have padding, so members are not necessarily adjacent in memory. Moreover, reinterpreting something as a PKT_HEADER when that something is not a PKT_HEADER object is not allowed. Instead of casting:
PKT_HEADER * header= (PKT_HEADER*)recvbuf;
you probably should use memcpy. Having said this, now to your actual question...
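A minimal sketch of the memcpy approach, for the header form without Protected_Payload_Length. The names ParsedHeader, load_u16 and read_header are made up for illustration, and the sketch assumes the sender wrote the three 16-bit fields back to back in the receiver's byte order; a real protocol would also pin down the byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical parsed form of the three-field header.
struct ParsedHeader {
    std::uint16_t Packet_Type;
    std::uint16_t Unprotected_Payload_Length;
    std::uint16_t Version;
};

// Copies one 16-bit field out of the buffer without assuming any alignment.
inline std::uint16_t load_u16(const unsigned char* p) {
    std::uint16_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Fills 'out' field by field; returns false if the buffer is too short.
inline bool read_header(const unsigned char* buf, std::size_t len, ParsedHeader& out) {
    assert(buf != nullptr);
    if (len < 6) return false;  // 3 fields * 2 bytes on the wire
    out.Packet_Type                = load_u16(buf);
    out.Unprotected_Payload_Length = load_u16(buf + 2);
    out.Version                    = load_u16(buf + 4);
    return true;
}
```

Copying field by field means padding inside the struct no longer matters, and no pointer to recvbuf is ever treated as a PKT_HEADER.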
If you rely on members having a specific order in the struct, then inheritance is not an option. In memory the base object comes first, then the derived members, you cannot mix that. For example
struct foo {
int x;
};
struct bar : foo {
int y;
int z;
};
Then a bar object will have in memory
| x | optional padding | y | optional padding | z | optional padding |
There is no simple way to get | y | x | z |.
If you want two different types the easiest is to define two different types:
struct PKT_HEADER_A {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Protected_Payload_Length; // present
unsigned short Version;
};
struct PKT_HEADER_B {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
//unsigned short Protected_Payload_Length; // not present
unsigned short Version;
};
Note that your way to typedef the struct is a C-ism. It is not necessary (and not recommended) in C++.
You should probably take a look at how nanopb or Protocol Buffers (protobuf) do their packing, because it sounds like you have a packing problem. Data should be pieced together before sending, and the Packet_Type would encode which header to decode/encode with.
If you can't properly pack/unpack, an alternative is to create both
typedef struct {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Protected_Payload_Length;
unsigned short Version;
} PKT_HEADER_FULL;
typedef struct {
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
unsigned short Version;
} PKT_HEADER_SHORT;
then create your packet header
typedef union u{
PKT_HEADER_FULL full;
PKT_HEADER_SHORT concat;
}PKT_HEADER;
// or as this
typedef struct{
unsigned short Protected_Payload_Length;
unsigned short version;
}longform;
typedef struct{
unsigned short Packet_Type;
unsigned short Unprotected_Payload_Length;
union{ // anonymous union: its members are accessed directly on PKT_HEADER
longform l;
unsigned short version;
};
} PKT_HEADER;
Then the incoming data can be decoded either way (again, depending on Packet_Type), and the remaining space can be ignored. A caveat of this method is that you can't use sizeof(PKT_HEADER) for the short form, because the size of the type is always that of the larger variant.
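Here is a rough sketch of decoding depending on Packet_Type, which also sidesteps the sizeof caveat by returning the number of bytes actually consumed. Everything specific in it is an assumption: the names DecodedHeader, type_has_protected_length and decode_header are made up, type 0x0002 is arbitrarily chosen as the one carrying the optional field, and the 16-bit fields are assumed to be big-endian on the wire.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct DecodedHeader {
    std::uint16_t Packet_Type;
    std::uint16_t Unprotected_Payload_Length;
    bool          has_protected_length;
    std::uint16_t Protected_Payload_Length;  // meaningful only if has_protected_length
    std::uint16_t Version;
};

// ASSUMPTION: only type 0x0002 carries the optional field; the real protocol decides this.
inline bool type_has_protected_length(std::uint16_t type) { return type == 0x0002; }

// ASSUMPTION: 16-bit fields are big-endian on the wire.
inline std::uint16_t get_u16_be(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Returns the number of header bytes consumed (6 or 8), or 0 if the buffer is too short.
inline std::size_t decode_header(const unsigned char* buf, std::size_t len, DecodedHeader& out) {
    assert(buf != nullptr);
    if (len < 6) return 0;
    out.Packet_Type                = get_u16_be(buf);
    out.Unprotected_Payload_Length = get_u16_be(buf + 2);
    out.has_protected_length       = type_has_protected_length(out.Packet_Type);
    out.Protected_Payload_Length   = 0;
    std::size_t pos = 4;
    if (out.has_protected_length) {
        if (len < 8) return 0;
        out.Protected_Payload_Length = get_u16_be(buf + pos);
        pos += 2;
    }
    out.Version = get_u16_be(buf + pos);
    return pos + 2;
}
```

The in-memory struct always has room for the optional field; only the wire format varies in length.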
I'd like to define an enum to be smaller than one byte while maintaining type safety.
Defining an enum as:
enum MyEnum : unsigned char
{
i ,j, k, w
};
I can shrink it to one byte, however I'd like to make it use only 2 bits since I will at most have 4 values in it. Can this be done?
In my struct where I use the enum, the following does not work
struct MyStruct
{
MyEnum mEnum : 2; // This will be 4 bytes in size
};
Thanks!
Update:
The question comes from this scenario:
enum MyEnum : unsigned char
{
i ,j, k, w
};
struct MyStruct
{
union
{
signed int mXa:3;
unsigned int mXb:3;
};
union
{
signed int mYa:3;
unsigned int mYb:3;
};
MyEnum mEnum:2;
};
sizeof(MyStruct) reports 9 bytes. Ideally I'd like the struct to be 1 byte in size.
Update for implemented solution:
This struct is one byte and offers the same functionality and type safety:
enum MyEnum :unsigned char
{
i,j,k,w
};
struct MyStruct
{
union
{
struct { MyEnum mEnum:2; char mXa:3; char mXb:3;};
struct { MyEnum :2; unsigned char mYa:3; unsigned char mYb:3;}; // unnamed 2-bit field keeps the layout without declaring mEnum twice
};
};
Per the standard, sizeof of any type must be at least 1 byte, the smallest addressable unit of memory.
The bit-field feature you mention lets members of a structure occupy less than a byte, but the struct itself cannot, because:
It must be at least 1 byte in size too.
Alignment considerations might require it to be even bigger.
Additionally, you may not take the address of a bit-field member because, as said above, a byte is the smallest addressable unit of memory. (You can already see that from sizeof returning a number of bytes, not bits: if you expected fewer than CHAR_BIT bits, sizeof would not even be able to express it.)
On some compilers, bitfields can only share space if they use the same underlying type. And any unused bits are actually left unused; if the sum of bits in an unsigned int bitfield is 3 bits, it still takes 4 bytes total. Since both unions have unsigned int members, they're both 4 bytes, but since they are bitfields, they have an alignment of one. So the first union is 4 bytes, the second is 4 bytes, and then the MyEnum is 1 byte. Since all of those have an alignment of one, no padding is needed.
Unfortunately, unions don't really work with bitfields at all. Bitfields are for integer types only. The most I could shrink your data to without a serious redesign is 3 bytes: http://coliru.stacked-crooked.com/view?id=c6ad03c93d7893ca2095fabc7f72ca48-e54ee7a04e4b807da0930236d4cc94dc
enum MyEnum : unsigned char
{
i ,j, k, w
};
union MyUnion
{
signed char ma:3; //char to save memory
unsigned char mb:3;
};
struct MyStruct
{
MyUnion X;
MyUnion Y;
MyEnum mEnum;
}; //this structure is three bytes
In the complete redesign category, you have this: http://coliru.stacked-crooked.com/view?id=58269eef03981e5c219bf86167972906-e54ee7a04e4b807da0930236d4cc94dc
No. C++ defines "char" to be the smallest addressable unit of memory for the platform. You can't address 2 bits.
Bit packing 'Works for me'
#include <iostream>
enum MyEnum : unsigned char
{
i ,j, k, w
};
struct MyStruct
{
MyEnum mEnum : 2;
unsigned char val : 6;
};
int main()
{
std::cout << sizeof(MyStruct);
}
prints out 1. How / what are you measuring?
Edit: Live link
Are you doing something like having a pointer as the next thing in the struct? In that case, you'll have 30 bits of dead space, as pointers must be 4-byte aligned on most 32-bit systems.
Edit: With your updated example, it's the unions which are breaking you
enum MyEnum : unsigned char
{
i ,j, k, w
};
struct MyStruct
{
unsigned char mXb:3;
unsigned char mYb:3;
MyEnum mEnum:2;
};
This has size 1. I'm not sure how unions and bit packing work together though, so I can't help further there.
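If the unions were only there to view the same 3 bits as either signed or unsigned, one way around them is to store the unsigned form and convert on demand. This is just a sketch of that idea, not what the question's author ended up with; as_signed3 is a made-up helper name, and the size check reflects what common compilers do rather than something the standard promises.

```cpp
#include <cassert>

enum MyEnum : unsigned char { i, j, k, w };

struct MyStruct {
    unsigned char mXb : 3;
    unsigned char mYb : 3;
    MyEnum        mEnum : 2;
};

// Common compilers pack these into a single byte; the standard does not require it.
static_assert(sizeof(MyStruct) == 1, "expected the three bit-fields to share one byte");

// Reinterprets a 3-bit unsigned value (0..7) as a 3-bit two's-complement value (-4..3).
inline int as_signed3(unsigned v) {
    assert(v < 8u);
    return static_cast<int>(v ^ 4u) - 4;  // flip the sign bit, then remove its weight
}
```

For example, a stored value of 7 reads as 7 through mXb and as -1 through as_signed3(s.mXb).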
What is the effect of fillers in a C++ struct? I often see them in some C++ APIs. For example:
struct example
{
unsigned short a;
unsigned short b;
char c[3];
char filler1;
unsigned short e;
char filler2;
unsigned int g;
};
This struct is meant to be transported over the network.
struct example
{
unsigned short a; //2 bytes
unsigned short b;//2 bytes
//4 bytes consumed
char c[3];//3 bytes
char filler1;//1 bytes
//4 bytes consumed
unsigned short e;//2 bytes
char filler2;//1 bytes
//3 bytes consumed ,should be filler[2]
unsigned int g;//4 bytes
};
Because sometimes you don't actually control the format of the data you're using.
The format may be specified by something beyond your control. For example, it may be created in a system with different alignment requirements to yours.
Alternatively, the data may have real data in those filler areas that your code doesn't care about.
Those fillers are usually inserted to explicitly make sure some of the members of a structure are naturally aligned, i.e. the offset of each member inside the structure is a multiple of that member's size.
The example below assumes char is 1 byte, short is 2 and int is 4.
struct example
{
unsigned short a;
unsigned short b;
char c[3];
char filler1;
unsigned short e; // starts at offset 8
char filler2[2];
unsigned int g; // starts at offset 12
};
If you don't specify any fillers, a compiler will usually add the necessary padding bytes to ensure a proper alignment of the structure members.
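To make that concrete, here is a small sketch under the same assumptions (1-byte char, 2-byte short, 4-byte int, natural alignment). The names padding_before and ExampleNoFillers are made up for illustration; the static_asserts simply fail to compile on a platform where the assumptions do not hold.

```cpp
#include <cassert>
#include <cstddef>

// Number of padding bytes needed so that a member with the given alignment
// can start at or after 'offset'.
inline std::size_t padding_before(std::size_t offset, std::size_t alignment) {
    assert(alignment != 0);
    return (alignment - offset % alignment) % alignment;
}

// The struct from above with the explicit fillers removed: the compiler pads instead.
struct ExampleNoFillers {
    unsigned short a;     // offset 0
    unsigned short b;     // offset 2
    char           c[3];  // offsets 4..6, then 1 byte of padding
    unsigned short e;     // offset 8, then 2 bytes of padding
    unsigned int   g;     // offset 12
};

// These hold under the stated assumptions, not on every platform.
static_assert(offsetof(ExampleNoFillers, e) == 8,  "e lands where filler1 would put it");
static_assert(offsetof(ExampleNoFillers, g) == 12, "g lands where filler2[2] would put it");
static_assert(sizeof(ExampleNoFillers) == 16, "same total size as the version with fillers");
```

padding_before(7, 2) gives 1 and padding_before(10, 4) gives 2, which matches filler1 and filler2[2] in the struct above.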
Btw, these fields can also be used for reserved fields that might appear in the future.
updated:
Since it has been mentioned that a structure is a network packet, the fillers are required to get a structure that is compatible with the one being passed from another host.
However, inserting filler bytes in this case might not be enough (especially, if portability is required). If these structures are to be sent via a network as is (i.e. without manually packing into a separate buffer for sending), you have to inform a compiler that the structure should be packed.
With the Microsoft compiler this can be achieved using #pragma pack:
#pragma pack(1)
struct T {
char t;
int i;
short j;
double k;
};
In gcc you can use __attribute__((packed))
struct foo {
char c;
int x;
} __attribute__((packed));
However, many people prefer to manually pack/unpack structures into a raw byte array, because accessing misaligned data on some systems might not be [properly] supported.
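A minimal sketch of that manual approach, for a struct shaped like foo above. The names Foo, kFooWireSize, pack_foo and unpack_foo are made up, the int member is replaced by a fixed-width std::uint32_t to keep the wire size unambiguous, and big-endian byte order on the wire is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Foo {
    char          c;
    std::uint32_t x;
};

// Wire size is 1 + 4 bytes, regardless of any padding inside Foo.
constexpr std::size_t kFooWireSize = 5;

// Writes the members one byte at a time; ASSUMPTION: big-endian order on the wire.
inline void pack_foo(const Foo& f, unsigned char* out) {
    assert(out != nullptr);
    out[0] = static_cast<unsigned char>(f.c);
    out[1] = static_cast<unsigned char>(f.x >> 24);
    out[2] = static_cast<unsigned char>(f.x >> 16);
    out[3] = static_cast<unsigned char>(f.x >> 8);
    out[4] = static_cast<unsigned char>(f.x);
}

// Rebuilds the struct from the same 5 bytes.
inline Foo unpack_foo(const unsigned char* in) {
    assert(in != nullptr);
    Foo f;
    f.c = static_cast<char>(in[0]);
    f.x = (static_cast<std::uint32_t>(in[1]) << 24)
        | (static_cast<std::uint32_t>(in[2]) << 16)
        | (static_cast<std::uint32_t>(in[3]) << 8)
        |  static_cast<std::uint32_t>(in[4]);
    return f;
}
```

Only single bytes are ever read or written, so no misaligned access occurs and the result does not depend on the host's padding or byte order.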
Depending on the code you're working with, the fillers may be an attempt to align the structure on word boundaries (32 bits in your case). This is a speed optimization, although decent optimizing compilers have made doing it by hand obsolete. If the compiler was instructed not to optimize this piece of code, or the compiler is very low-end (e.g. for an embedded system), it may be better to handle this yourself. It basically boils down to how much you trust the compiler.
The other reason is for writing binary files, where reserved bytes have been left in the file format specification.
I am using the following struct for network communication, but it ends up with a lot of unnecessary bytes in between.
Its size differs from the expected 8 bytes.
struct HttpPacket {
unsigned char x1;
union {
struct {
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
unsigned char bytes[7];
unsigned long num;
};
};
And the following gives a different size even though I only removed a field from the union:
struct HttpPacket {
unsigned char x1;
union {
struct {
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
unsigned long num;
};
};
Also, a clearer example:
struct {
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
It gives a size of 8 instead of 7.
And if I add one more field, it still gives the same size:
struct {
unsigned char EXTRAADDEDFIELD;
unsigned char len;
unsigned short host;
unsigned char content[4];
} packet;
Can someone please help me resolve this issue?
UPDATE: I need the format to hold while transmitting the packet, so I want to avoid this padding.
C makes no guarantees on the size of a struct. The compiler must keep the members in declaration order, but it is allowed to insert padding between them and at the end. Usually, as in this case, it will round the size up to a word-aligned value since that's fastest on most machines.
Ever heard of alignment and padding?
Basically, to ensure fast access, certain types have to be on certain bounds of memory addresses.
This is called alignment.
To achieve that, the compiler is allowed to insert bytes into your data structure to achieve that alignment.
This is called padding.
By default, structure fields are aligned on natural boundaries. For example, a 4-byte field will start on a 4-byte boundary. The compiler inserts pad bytes to achieve this. You can avoid the padding by using #pragma pack(1) or other similar compiler directives.
If you have a C99 compiler and can use the "new" fixed-width types: make an array of uint8_t and do the separation in members yourself.
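uint8_t data[8]; /* raw bytes as received */
uint8_t x1, len, content[4];
uint16_t host;
/* ... fill data from the network ... */
x1 = data[0];
len = data[1];
host = data[2] * 256 + data[3]; /* big endian */
content[0] = data[4];
content[1] = data[5];
content[2] = data[6];
content[3] = data[7];
/* ... */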
You can follow the same procedure in C89 if you can rely on CHAR_BIT being 8.
I am programming on Linux, which is new to me. I am working on a project to design a 'layer 7' network protocol, and we have packets that contain resources. Depending on the type of resource, the length of that resource is different. I am fairly new to C/C++ and am not sure I understand unions all that well. The idea was to make a "generic resource" type so that, depending on which resource it was, I could cast a void* to a pointer to this typedef'd structure, access the data contained in it as any type I please, and have it take care of the 'casting'. Anyway, here is what I came up with:
typedef struct _pktresource
{
unsigned char Type; // The type of the resource.
union {
struct { // This is used for variable length data.
unsigned short Size;
void *Data;
};
void *ResourceData; // Just a generic pointer to the data.
unsigned char Byte;
char SByte;
short Int16;
unsigned short UInt16;
int Int32;
unsigned int UInt32;
long long Int64;
unsigned long long UInt64;
float Float;
double Double;
unsigned int Time;
};
} pktresource, *ppktresource;
The principle behind this was simple. But when I do something like
pktresource.Size = XXXX
It starts out 4 bytes into the structure instead of 1 byte. Am I failing to grasp a major concept here? Because it feels like I am.
EDIT: Forgot to mention, when I reference
pktresource.Type
It starts at the beginning like it's supposed to.
EDIT: Correction was to add pragma statements for proper alignment. After fix, the code looks like:
#pragma pack(push)
#pragma pack(1)
typedef struct _pktresource
{
unsigned char Type; // The type of the resource.
union {
struct { // This is used for variable length data.
unsigned short Size;
unsigned char Data[];
};
unsigned char ResourceData[]; // Just a generic pointer to the data.
unsigned char Byte;
char SByte;
short Int16;
unsigned short UInt16;
int Int32;
unsigned int UInt32;
long long Int64;
unsigned long long UInt64;
float Float;
double Double;
unsigned int Time;
};
} pktresource, *ppktresource;
#pragma pack(pop)
Am I failing to grasp a major concept here?
You're missing knowledge of structure alignment. Basically, the compiler places certain fields on boundaries larger than 1 byte, depending on their size. You can use #pragma pack to override this behavior, but that can cause interoperability issues if the structure is used anywhere outside your application.
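A small sketch of the effect, using a deliberately simplified two-member struct rather than the full pktresource. The names Unpacked, Packed and padding_saved are made up for illustration, and #pragma pack is a compiler extension (GCC, Clang and MSVC accept the push/pop form used here).

```cpp
#include <cassert>
#include <cstddef>

// Default layout: the compiler aligns 'value' to alignof(unsigned int).
struct Unpacked {
    unsigned char type;
    unsigned int  value;
};

// Same members with alignment forced down to 1 byte (compiler extension).
#pragma pack(push, 1)
struct Packed {
    unsigned char type;
    unsigned int  value;
};
#pragma pack(pop)

static_assert(offsetof(Unpacked, value) == alignof(unsigned int), "padding follows 'type'");
static_assert(offsetof(Packed, value) == 1, "no padding when packed to 1 byte");
static_assert(sizeof(Packed) == 1 + sizeof(unsigned int), "packed size is the sum of the members");

// Number of bytes the default layout spends on padding for this struct.
inline std::size_t padding_saved() {
    assert(sizeof(Unpacked) >= sizeof(Packed));
    return sizeof(Unpacked) - sizeof(Packed);
}
```

With a 4-byte int, value sits at offset 4 in Unpacked and at offset 1 in Packed, which is the same kind of shift the question describes for Size.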
I think the problem is alignment. By default most compilers align to the word size of the machine / OS, in this case 32 bits / 4 bytes. So, since you have that unsigned char Type field up front, the compiler is pushing the Size field to the next 4-byte boundary.
Try
#pragma pack(1)
ahead of your structure definitions.
I don't know what compiler you are using, but that's good old-fashioned C code that's been regularly in use for network programming since before most of these rude kids on StackOverflow were born.