What is the reason for naming unions? - c++

Why name a union if the compiler always treats the object as anonymous, regardless of whether or not the union is named?
My implementation looks like this:
typedef struct _DMessageHeader {
union _msgId {
unsigned char ucMsgId;
unsigned short usMsgId;
unsigned long ulMsgId;
unsigned long long ullMsgId;
} msgId;
} DMSG_HDR, *PDMSG_HDR;
I'd like to be able to access it like this, but the compiler throws an error:
PDMSG_DESC ptMsg->hdr.msgId = id_in;
It only allows me to directly access the union member like this:
PDMSG_DESC ptMsg->hdr.msgId.ucMsgId = id_in;
Any thoughts as to why this is, or how I may access the union by name?

It's a type thing. The compiler can't implicitly convert an integer to a union.
You can however overload the "=" operator to do it.
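As a minimal sketch of that suggestion: a C++ union may declare member functions, so it can carry its own assignment operator. The un-typedef'd names `DMessageHeader` and `MsgId`, the helper `demo_assign`, and the choice to store into the widest member are assumptions made for illustration, not part of the original code.

```cpp
#include <cassert>

// Same members as the question's header, with an assignment operator added.
struct DMessageHeader {
    union MsgId {
        unsigned char      ucMsgId;
        unsigned short     usMsgId;
        unsigned long      ulMsgId;
        unsigned long long ullMsgId;

        // Lets callers write `hdr.msgId = value;`.
        // Storing into the widest member is an assumption of this sketch.
        MsgId& operator=(unsigned long long id) {
            ullMsgId = id;
            return *this;
        }
    } msgId;
};

// Hypothetical helper that exercises the operator.
inline void demo_assign() {
    DMessageHeader hdr;
    hdr.msgId = 42;                      // resolves to MsgId::operator=
    assert(hdr.msgId.ullMsgId == 42);
}
```

Whether hiding the member choice behind an operator is a good idea depends on how the id is read back; the reader still has to know which member is active.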

I'm not sure why you would use a union in this case at all.
Please note that the size of the struct is 8 bytes (the size of unsigned long long) on my 64-bit machine.
#include <iostream>
using std::cout;
using std::endl;
typedef struct _DMessageHeader {
union _msgId {
unsigned char ucMsgId;
unsigned short usMsgId;
unsigned long ulMsgId;
unsigned long long ullMsgId;
} msgId;
} DMSG_HDR, *PDMSG_HDR;
int main( int argc , char ** argv, char ** env)
{
cout<<"sizeof DMessageHeader: "<<sizeof(DMSG_HDR)<<endl;
return 0;
}
If all you store in the union msgId is a single id of varying length (1 to 8 bytes, depending on your architecture) and you have no memory constraints, rewrite your struct as follows:
typedef struct _DMessageHeader {
unsigned long long msgId;
} DMSG_HDR, *PDMSG_HDR;
DMSG_HDR hdr;
hdr.msgId = id_in;
Also I suggest reading this thread for thorough discussion about using unions in C++.

There can be various reasons:
Original C compilers do not allow anonymous unions, so naming the union lets the structure be used by both C and C++ programs.
You may want to work with the whole union (moving, assigning, etc.), and naming it lets you define variables of that type.

Because you're not using an anonymous union in your example. You've given your union member of your struct a name, msgId, and it has members. You can't assign directly to the union itself, you have to assign to a member of the union.
An anonymous union would be as follows:
union {
int i;
char c;
};
i = 1;
or
struct s
{
int i1;
union {
int i2;
char c2;
};
};
s s1;
s1.i2 = 5;
The union in struct s has no name, and its members are accessed directly.
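The difference between the two forms can be sketched side by side. The type names `Named` and `Anonymous`, the member names, and the helper `demo_access` are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>

struct Named {
    union Id {                 // named union type...
        unsigned char  small;
        unsigned short wide;
    } id;                      // ...with a named member: access is obj.id.small
};

struct Anonymous {
    union {                    // anonymous union: members join the enclosing scope
        unsigned char  small;
        unsigned short wide;
    };
};

// Hypothetical helper showing both access styles.
inline void demo_access() {
    Named n;
    n.id.small = 5;            // must go through the member name `id`
    assert(n.id.small == 5);

    Named::Id copy = n.id;     // a named union type can be declared and copied whole
    assert(copy.small == 5);

    Anonymous a;
    a.small = 7;               // no intermediate name
    assert(a.small == 7);
}
```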
ETA:
Assuming your variable id_in is an unsigned char since you assign it to the unsigned char member in the example that works, why would you expect this to work?
PDMSG_DESC ptMsg->hdr.msgId = id_in;
ptMsg->hdr.msgId is not of type unsigned char nor is it an implicitly convertible type. ptMsg->hdr.msgId is of type _DMessageHeader::_msgId.
"A union is a special class type that can hold only one of its non-static data members at a time." (http://en.cppreference.com/w/cpp/language/union) It's a class type and you've defined no conversion operators or constructors. Of course it won't allow the assignment.

Related

Initializing structure object using std::fill

I have below a structure and need to initialize the object of it using std::fill.
typedef struct _test
{
char name[32];
char key[4];
int count;
}test;
As of now, I am using memset. I need to change it to std::fill. I have tried the code below, but std::fill gives a compiler error for the structure object.
test t;
char a[5];
std::fill(a, a + 5, 0);
std::fill(t, sizeof(t), 0);
Note: I don't want to initialize using this way. char a[5] = {0};
You don't need std::fill (or std::memset, on which you should read more here). Just value initialize the structure:
test t{};
This will in turn zero initialize all the fields. Beyond that, std::fill accepts a range, and t, sizeof(t) is not a range.
And as a final note, typedef struct _test is a needless C-ism. The structure tag in C++ is also a new type name. So what this does is pollute the enclosing namespace with a _test identifier. If you need C compatibility on the off-chance, the way to go is this
typedef struct test
{
char name[32];
char key[4];
int count;
} test;
This sort of typedef is explicitly valid in C++, despite both the struct declaration and the typedef declaring the same type name. It also allows both C and C++ code to refer to the type as either test or struct test.
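A small sketch of the two points made above: value initialization zeroes the whole structure, and std::fill works on an iterator range such as one of the array members. The helpers `make_zeroed`, `fill_name`, and `demo_fill` are hypothetical names used only here.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

struct test {
    char name[32];
    char key[4];
    int count;
};

// Value initialization zeroes every field.
inline test make_zeroed() {
    return test{};
}

// std::fill takes a range [first, last), here the whole name array.
inline void fill_name(test& t, char c) {
    std::fill(std::begin(t.name), std::end(t.name), c);
}

// Hypothetical helper that checks both behaviours.
inline void demo_fill() {
    test t = make_zeroed();
    assert(t.count == 0 && t.name[0] == 0 && t.key[3] == 0);
    fill_name(t, 'x');
    assert(t.name[0] == 'x' && t.name[31] == 'x');
    assert(t.count == 0);      // other fields are untouched
}
```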

Import structs as nested, anonymous structs in union using C++

Please consider the following "unchangeable" declarations:
typedef struct T_MESSAGE
{
unsigned int uiTimestamp;
unsigned char ucDataType;
unsigned int uiDataSize;
unsigned char aucData[1024];
} TT_MESSAGE;
typedef struct T_SENSORDATA_HEADER
{
unsigned char ucSensorType;
unsigned char ucMountingPoint;
} TT_SENSORDATA_HEADER;
In case the message contains Sensor Data, the data is stored within the aucData array, always beginning with the Sensor Data Header. I would like to create a union or struct, which allows me to directly access all members of such a message, without having to use another variable name.
I hope you understand what I want to do by looking at my previous attempts.
I tried it like this:
union SensorDataMessage
{
struct T_Message;
struct
{
unsigned : 32; // Skip uiTimestamp
unsigned : 8; // Skip ucDataType
unsigned : 32; // Skip uiDataSize
struct T_SENSORDATA_HEADER;
};
};
and this:
struct SensorDataOverlay
{
unsigned : 32; // Skip uiTimestamp
unsigned : 8; // Skip ucDataType
unsigned : 32; // Skip uiDataSize
struct T_SENSORDATA_HEADER;
};
union SensorDataMessage
{
struct T_Message;
struct SensorDataOverlay;
};
But none of that is working. In the end, I would like to be able to write something like this:
int Evaluate(SensorDataMessage msg)
{
unsigned char tmp = msg.ucDataType;
unsigned char tmp2 = msg.ucSensorType;
[...]
}
From here I learned that what I want to do should be possible, but only in Visual C:
A Microsoft C extension allows you to declare a structure variable
within another structure without giving it a name. These nested
structures are called anonymous structures. C++ does not allow
anonymous structures.
You can access the members of an anonymous structure as if they were
members in the containing structure.
However, this seems not to be entirely true, since anonymous structs can be used in Visual C++ as well, like it is suggested here.
I would highly appreciate any help.
Here's what I found that might help you out:
You have to set the C/C++ compiler option to Compile as C code (/TC) to get anonymous structure support.
The union keyword was missing from the declaration of Evaluate().
The unnamed bit-field declarations in SensorDataOverlay seemed to confuse the compiler, so I collected them into a single structure, CommonHeader, and placed one instance of it in SensorDataOverlay.
T_MESSAGE and SensorDataOverlay share the same first three fields, so it would make more sense, from a data-inheritance perspective, to replace them with CommonHeader. Since you pointed out at the beginning of the question that T_MESSAGE is unchangeable, I did not modify it in the following code.
The complete code is posted here; it runs, and I believe the memory layout meets your needs.
struct CommonHeader
{
unsigned int skipUiTimestamp;
unsigned char skipUcDataType;
unsigned int skipUiDataSize;
};
struct SensorDataOverlay
{
/* Use CommonHeader instead */
//unsigned : 32; // Skip uiTimestamp
//unsigned : 8; // Skip ucDataType
//unsigned : 32; // Skip uiDataSize
struct CommonHeader;
struct T_SENSORDATA_HEADER;
};
union SensorDataMessage
{
TT_MESSAGE;
struct SensorDataOverlay;
};
int Evaluate(union SensorDataMessage msg)
{
unsigned char tmp = msg.uiDataSize;
unsigned char tmp2 = msg.ucSensorType;
return 0;
}
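If relying on the Microsoft anonymous-structure extension is not an option, one portable alternative (a different technique from the answer above, named plainly as such) is to copy the sensor header out of aucData with memcpy. The helper names `sensor_header` and `demo_sensor_header` are hypothetical; the sketch assumes the header really does start at aucData[0], as the question states.

```cpp
#include <cassert>
#include <cstring>

typedef struct T_MESSAGE
{
    unsigned int  uiTimestamp;
    unsigned char ucDataType;
    unsigned int  uiDataSize;
    unsigned char aucData[1024];
} TT_MESSAGE;

typedef struct T_SENSORDATA_HEADER
{
    unsigned char ucSensorType;
    unsigned char ucMountingPoint;
} TT_SENSORDATA_HEADER;

// Hypothetical helper: copies the sensor header out of the payload bytes.
inline TT_SENSORDATA_HEADER sensor_header(const TT_MESSAGE& msg) {
    TT_SENSORDATA_HEADER hdr;
    std::memcpy(&hdr, msg.aucData, sizeof hdr);
    return hdr;
}

// Hypothetical usage check.
inline void demo_sensor_header() {
    TT_MESSAGE msg = {};
    msg.aucData[0] = 7;   // sensor type
    msg.aucData[1] = 3;   // mounting point
    TT_SENSORDATA_HEADER hdr = sensor_header(msg);
    assert(hdr.ucSensorType == 7);
    assert(hdr.ucMountingPoint == 3);
}
```

This costs one extra variable name at the call site, which the question wanted to avoid, but it does not depend on a compiler extension or on how the struct is padded before aucData.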

C++ understanding Unions and Structs

I've come to work on an ongoing project where some unions are defined as follows:
/* header.h */
typedef union my_union_t {
float data[4];
struct {
float varA;
float varB;
float varC;
float varD;
};
} my_union;
If I understand well, unions are for saving space, so sizeof(my_union_t) = MAX of the variables in it. What are the advantages of using the statement above instead of this one:
typedef struct my_struct {
float varA;
float varB;
float varC;
float varD;
};
Won't the space allocated for both of them be the same?
And how can I initialize varA,varB... from my_union?
Unions are often used when implementing a variant like object (a type field and a union of data types), or in implementing serialisation.
The way you are using a union is a recipe for disaster.
You are assuming that the struct in the union packs the floats with no gaps between them!
The standard guarantees that float data[4]; is contiguous, but not the structure elements. The only other thing you know is that the address of varA; is the same as the address of data[0].
Never use a union in this way.
As for your question: "And how can I initialize varA,varB... from my_union?". The answer is, access the structure members in the normal long-winded way not via the data[] array.
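A hedged sketch of both points: the packing assumption can at least be checked at compile time, and the values are set through the structure members. The inner struct is given a name here (`Vars`, with member `vars`) because an anonymous struct inside a union is a compiler extension rather than standard C++; those names and the helper `make_union` are assumptions of this sketch.

```cpp
#include <cassert>

struct Vars {
    float varA;
    float varB;
    float varC;
    float varD;
};

union my_union {
    float data[4];
    Vars  vars;    // named member; the original code uses an anonymous struct
};

// Fails to compile if the compiler inserted padding between the floats.
static_assert(sizeof(Vars) == sizeof(float[4]),
              "Vars has padding; the overlay assumption does not hold");

// Hypothetical helper: initializes the union through the struct members.
inline my_union make_union(float a, float b, float c, float d) {
    my_union u;
    u.vars = Vars{a, b, c, d};
    return u;
}

inline void demo_make_union() {
    my_union u = make_union(1.0f, 2.0f, 3.0f, 4.0f);
    assert(u.vars.varA == 1.0f && u.vars.varD == 4.0f);
}
```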
Unions are mostly not for saving space, but for implementing sum types (for that, you put the union in some struct or class that also has a discriminating field holding the run-time tag). Also, I suggest you use a recent C++ standard, at least C++11, since it has better support for unions (e.g. it more easily permits unions of objects and their construction or initialization).
The advantage of using your union is to be able to index the n-th floating point (with 0 <= n <= 3) as u.data[n]
To assign a union field in some variable declared my_union u; just code e.g. u.varB = 3.14; which in your case has the same effect as u.data[1] = 3.14;
A good example of a well-justified union is a mutable object which can hold either an int or a string (you could not use derived classes in that case):
#include <new>
#include <string>
#include <utility>
class IntOrString {
bool isint;
union {
int num; // active when isint is true
std::string str; // active when isint is false
};
public:
IntOrString(int n=0) : isint(true), num(n) {}
IntOrString(std::string s) : isint(false), str(std::move(s)) {}
IntOrString(const IntOrString& o) : isint(o.isint)
{ if (isint) num = o.num; else new (&str) std::string(o.str); } // construct str in place
IntOrString(IntOrString&& p) : isint(p.isint)
{ if (isint) num = p.num; else new (&str) std::string(std::move(p.str)); }
~IntOrString() { if (!isint) str.~basic_string(); } // destroy str only when it is active
void set(int n)
{ if (!isint) str.~basic_string(); isint = true; num = n; }
void set(std::string s)
{ if (isint) { new (&str) std::string(std::move(s)); isint = false; } else str = std::move(s); }
bool is_int() const { return isint; }
int as_int() const { return isint ? num : 0; }
std::string as_string() const { return isint ? std::string() : str; }
};
Notice the explicit calls of destructor of str field. Notice also that you can safely use IntOrString in a standard container (std::vector<IntOrString>)
See also std::optional in future versions of C++ (which conceptually is a tagged union with void)
BTW, in Ocaml, you simply code:
type intorstring = Integer of int | String of string;;
and you'll use pattern matching. If you wanted to make that mutable, you'll need to make a record or a reference of it.
You had better use unions in an idiomatic C++ way (see this for general advice).
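For comparison, and assuming a C++17 compiler, the same int-or-string sum type can be sketched with std::variant from the standard library. This is not what the answer above uses; the alias `IntOrStr` and the free functions are hypothetical names that mirror the class's interface.

```cpp
#include <cassert>
#include <string>
#include <variant>

// A tagged union of int and std::string; the tag is managed by the library.
using IntOrStr = std::variant<int, std::string>;

inline bool is_int(const IntOrStr& v) {
    return std::holds_alternative<int>(v);
}

// Returns 0 when the variant currently holds a string.
inline int as_int(const IntOrStr& v) {
    const int* p = std::get_if<int>(&v);
    return p ? *p : 0;
}

// Returns an empty string when the variant currently holds an int.
inline std::string as_string(const IntOrStr& v) {
    const std::string* p = std::get_if<std::string>(&v);
    return p ? *p : std::string();
}

inline void demo_variant() {
    IntOrStr v = 5;
    assert(is_int(v) && as_int(v) == 5);
    v = std::string("hi");     // switching alternatives destroys the old value
    assert(!is_int(v) && as_string(v) == "hi");
}
```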
I think the best way to understand unions is to give two common practical examples.
The first example is working with images. Imagine you have an RGB image that is arranged in a long buffer.
What most people would do is represent the buffer as a char* and then step through it three bytes at a time to get the R, G, and B values.
What you could do instead is make a little union and use that to loop over the image buffer:
union RGB
{
char raw[3];
struct
{
char R;
char G;
char B;
} colors;
};
// buffer is the char* image buffer described above
RGB* pixel = reinterpret_cast<RGB*>(&buffer[0]);
// pixel->colors.R == the red value of the first pixel
Another very useful application of unions is working with registers and bit-fields.
Let's say you have a 32-bit value that represents some HW register, or something similar.
Sometimes, to save space, the 32 bits are split into bit-fields, but you also want the whole representation of that register as a 32-bit type.
This saves the bit-shift calculations that a lot of programmers use for no reason at all.
union MySpecialRegister
{
uint32_t raw; // "register" is a reserved keyword, so the field needs another name
struct
{
unsigned int firstField : 5;
unsigned int somethingInTheMiddle : 21; // widths must add up to 32 bits
unsigned int lastField : 6;
} data;
};
// Now you can read the raw register into the raw field
// then you can read the fields using the inner data struct
The advantage is that with a union you can access the same memory in two different ways.
In your example the union contains four floats. You can access those floats as varA, varB... which might be more descriptive names or you can access the same variables as an array data[0], data[1]... which might be more useful in loops.
With a union you can also use the same memory for different kinds of data, you might find that useful for things like writing a function to tell you if you are on a big endian or little endian CPU.
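The endianness check mentioned above can be sketched as follows. Reading a union member other than the one last written is not well defined in standard C++, so this sketch copies the first byte out with memcpy instead of using a union; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the byte stored at the lowest address of a 32-bit value.
inline unsigned char first_byte(std::uint32_t value) {
    unsigned char b;
    std::memcpy(&b, &value, 1);
    return b;
}

// On a little-endian CPU the least significant byte comes first in memory.
inline bool is_little_endian() {
    return first_byte(1u) == 1;
}

inline void demo_endianness() {
    // The two probes must agree on any little- or big-endian machine.
    assert(is_little_endian() == (first_byte(0x01020304u) == 0x04));
}
```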
No, it is not for saving space. It is for ability to represent some binary data as various data types.
for example
#include <iostream>
#include <stdint.h>
union Foo{
int x;
struct
{
unsigned char b0, b1, b2, b3;
} y;
char z[sizeof(int)];
};
int main()
{
Foo bar;
bar.x = 100;
std::cout << std::hex; // to show number in hexadec repr;
for(size_t i = 0; i < sizeof(int); i++)
{
std::cout << "0x" << (int)bar.z[i] << " "; // int is just to show values as numbers, not a characters
}
return 0;
}
Output: 0x64 0x0 0x0 0x0. The same values are stored in bar.y, in the structure members rather than in an array. That is because my machine is little-endian. If it were big-endian, the output would be reversed: 0x0 0x0 0x0 0x64
You can achieve the same using reinterpret_cast:
#include <iostream>
#include <stdint.h>
int main()
{
int x = 100;
char * xBytes = reinterpret_cast<char*>(&x);
std::cout << std::hex; // to show number in hexadec repr;
for (size_t i = 0; i < sizeof(int); i++)
{
std::cout << "0x" << (int)xBytes[i] << " "; // (int) is just to show values as numbers, not a characters
}
return 0;
}
It's useful, for example, when you need to read a binary file that was written on a machine with a different endianness than yours. You can access the values as a byte array and swap those bytes as you wish.
It is also useful when you have to deal with bit-fields, but that's a whole different story :)
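The byte swapping described above might look like this for a 32-bit value. The function name `swap_bytes` is hypothetical, and memcpy is used to get at the bytes so the sketch stays within defined behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

// Reverses the byte order of a 32-bit value, e.g. to convert a value
// read from a file written on a machine with the opposite endianness.
inline std::uint32_t swap_bytes(std::uint32_t value) {
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);     // view the value as bytes
    std::reverse(bytes, bytes + sizeof value);    // reverse their order
    std::memcpy(&value, bytes, sizeof value);     // reassemble the integer
    return value;
}

inline void demo_swap() {
    assert(swap_bytes(0x11223344u) == 0x44332211u);
    assert(swap_bytes(swap_bytes(0xDEADBEEFu)) == 0xDEADBEEFu);
}
```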
First of all: avoid unions where the same memory is accessed through different types!
Unions do not save space at all. They only define multiple names for the same memory area! And you can only store one of the elements at a time in a union.
if you have
union X
{
int x;
char y[4];
};
you can store an int OR 4 chars, but not both! The general problem is that nobody knows which data is actually stored in a union. If you store an int and read the chars, the compiler will not check that, and there is no runtime check either. A common solution is to wrap the union in a struct with an additional data member, an enum, that records which type is currently stored.
struct Y
{
enum { IS_CHAR, IS_INT } tinfo;
union
{
int x;
char y[4];
};
};
But in C++ you should always use classes or structs that derive from a (possibly empty) parent class, like this:
class Base
{
};
class Int_Type: public Base
{
...
int x;
};
class Char_Type: public Base
{
...
char y[4];
};
So you can define pointers to Base which can actually hold an Int_Type or a Char_Type for you. With virtual functions you can access the members in an object-oriented way.
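A minimal sketch of that approach, with the elided class bodies filled in by assumption: the virtual function `describe`, the constructors, and the helper `demo_hierarchy` are hypothetical and only illustrate how a pointer to Base can stand in for either type.

```cpp
#include <cassert>
#include <memory>
#include <string>

class Base {
public:
    virtual ~Base() = default;
    // Hypothetical virtual interface; the original Base is left empty.
    virtual std::string describe() const = 0;
};

class Int_Type : public Base {
public:
    explicit Int_Type(int v) : x(v) {}
    std::string describe() const override { return "int:" + std::to_string(x); }
private:
    int x;
};

class Char_Type : public Base {
public:
    Char_Type(char a, char b, char c, char d) : y{a, b, c, d} {}
    std::string describe() const override { return "chars:" + std::string(y, 4); }
private:
    char y[4];
};

inline void demo_hierarchy() {
    std::unique_ptr<Base> p(new Int_Type(42));
    assert(p->describe() == "int:42");
    p.reset(new Char_Type('a', 'b', 'c', 'd'));   // same pointer, other type
    assert(p->describe() == "chars:abcd");
}
```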
As already mentioned in Basile's answer, a useful case can be accessing the same type via different names.
union X
{
struct
{
float a;
float b;
} data;
float arr[2];
};
which allows different ways of accessing the same data with the same type. Storing different types in the same memory should be avoided altogether!

Typecast char[] into a structure and retrieve the values of the structure?

#include <iostream> // std::cout
using namespace std;
struct mystruct
{
unsigned int a;
unsigned char b;
unsigned long long c;
};
int main ()
{
unsigned char str[1];
unsigned int a,b,c;
str[0]=1; // str[0]=??????
mystruct* obj = (mystruct *)(&(str[0]));
c=obj->c;
a=(unsigned int)obj->a;
b=(unsigned int)obj->b;
cout<<"a="<<a<<"\t b="<<b<<"\t c="<<c<<endl;
}
Is it possible do the above thing? If yes, then:
What should I fill in str[0] so that I get value of a=1,b=257,c=1?
currently I'm getting below output:
a=1 b=0 c=8388449
Unless you are coding for a microcontroller on a compiler with very well-defined semantics, you shouldn't be doing that. The reason is that the struct could have padding, the computer could be little- or big-endian, sizeof(int) is not the same on all computers, and char is not necessarily 8 bits either.
This is besides the fact that your str is too short anyway.
While this is undefined behavior in C, on microcontrollers these things are often well-defined and can be used. One example would be:
unsigned char str[sizeof(struct mystruct)];
struct mystruct* obj = (void *)str;
To know the conversion between the contents of str and obj, you would need to exactly know how your compiler pads the struct as well as the sizeof each member and the endian-ness of the computer.
Again, except in very specific situations, this kind of coding is plain wrong.
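Where a byte buffer really does have to be turned into a struct, one well-defined alternative to the pointer cast (not part of the answer above) is to copy the bytes with memcpy into a properly sized and aligned object. The helpers `from_bytes`, `to_bytes`, and `demo_round_trip` are hypothetical, and the byte layout still depends on the compiler's padding and the machine's endianness, as described above.

```cpp
#include <cassert>
#include <cstring>

struct mystruct
{
    unsigned int a;
    unsigned char b;
    unsigned long long c;
};

// Copies sizeof(mystruct) bytes from buf into a real mystruct object.
inline mystruct from_bytes(const unsigned char* buf) {
    mystruct out;
    std::memcpy(&out, buf, sizeof out);
    return out;
}

// Writes the object representation of in into buf (same machine only).
inline void to_bytes(const mystruct& in, unsigned char* buf) {
    std::memcpy(buf, &in, sizeof in);
}

inline void demo_round_trip() {
    unsigned char buf[sizeof(mystruct)];   // large enough, unlike str[1]
    mystruct in = {1u, 2, 3ULL};
    to_bytes(in, buf);
    mystruct out = from_bytes(buf);
    assert(out.a == 1u && out.b == 2 && out.c == 3ULL);
}
```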

Can I use an enum as a struct name?

I think it may be called literal?
enum packet_structures{
PacketOne,
PacketTwo,
PacketThree
};
struct PacketOne{
unsigned int packet_id;
};
struct PacketTwo{
unsigned int packet_id;
};
struct PacketThree{
unsigned int packet_id;
};
And let's say I have a general packet.
struct PacketGeneral{
packet_structures TypeOfPacket;
};
PacketGeneral newPacket;
newPacket.TypeOfPacket = PacketOne;
Can I literally use that enum's name to typecast a char* to a struct (i.e PacketOne)? Without having to typecast with (struct PacketOne), how can I just typecast that same struct but with just the enumeration newPacket.TypeOfPacket?
No you cannot. Enums are used for storing literals and not identifiers.
No, at least not in C.
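enum declares/defines constant identifiers: each enumerator is a value, not a type.
So, you cannot use an enumerator in place of a structure tag in a cast; the struct type has to be named explicitly.
You can check that by compiling a program containing these declarations/definitions and attempting such a cast.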
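What can be done instead is to map each enumerator to its struct type by hand, for example with a switch. In this sketch the enumerators carry a `k` prefix so they are not confused with the struct names; that renaming, the template `read_id`, and the function `packet_id_of` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>

// Enumerators renamed (hypothetically) to keep them distinct from the structs.
enum packet_structures { kPacketOne, kPacketTwo, kPacketThree };

struct PacketOne   { unsigned int packet_id; };
struct PacketTwo   { unsigned int packet_id; };
struct PacketThree { unsigned int packet_id; };

// Copies the raw bytes into a Packet object and returns its id.
template <typename Packet>
unsigned int read_id(const char* raw) {
    Packet p;
    std::memcpy(&p, raw, sizeof p);
    return p.packet_id;
}

// The enumerator is a run-time value, so the type is chosen explicitly per case.
inline unsigned int packet_id_of(packet_structures type, const char* raw) {
    switch (type) {
        case kPacketOne:   return read_id<PacketOne>(raw);
        case kPacketTwo:   return read_id<PacketTwo>(raw);
        case kPacketThree: return read_id<PacketThree>(raw);
    }
    return 0;   // unknown tag
}

inline void demo_dispatch() {
    PacketTwo p = {99};
    char raw[sizeof p];
    std::memcpy(raw, &p, sizeof p);
    assert(packet_id_of(kPacketTwo, raw) == 99);
}
```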