How can I change the order of a packed structure in C or C++?
struct myStruct {
uint32_t A;
uint16_t B1;
uint16_t B2;
} __attribute__((packed));
The address 0x0 of the structure (or the LSB) is A.
My app communicates with hardware and the structure in hardware is defined like this:
struct packed {
logic [31:0] A;
logic [15:0] B1;
logic [15:0] B2;
} myStruct;
But in SystemVerilog the "address 0x0" or more accurately the LSB of the structure is the LSB of B2 = B2[0].
The order is reversed.
To stay consistent and to avoid changing the hardware side, I'd like to invert the "endianness" of the whole C/C++ structure.
I could just reverse all the fields:
struct myStruct {
uint16_t B2;
uint16_t B1;
uint32_t A;
} __attribute__((packed));
but it's error-prone and not so convenient.
As for the data types, both SystemVerilog and Intel CPUs are little-endian, so that's not an issue.
How can I do it?
How can I change the byte orders of a struct?
You cannot change the order of bytes within members, and you cannot make the memory order of the members differ from the order of their declaration.
But you can change the declaration order of the members, which is what determines their memory order: the first member is always at the lowest address, the second comes after it, and so on.
If the correct order of the members can be determined from the Verilog source, then ideally the C struct definition should be generated with meta-programming to ensure a matching order.
it's error-prone
Relying on a particular memory order is indeed error-prone.
It is possible to rely only on the known memory order of the source data (presumably an array of bytes) without relying on the memory order of the members at all:
unsigned char* data = read_hardware();
myStruct s;
// Assemble each field from the little-endian byte stream.  The casts keep
// the arithmetic unsigned: data[7] << 24 on a promoted int would be
// undefined behaviour for values >= 0x80.
s.B2 = (uint16_t)(data[0] | data[1] << 8);
s.B1 = (uint16_t)(data[2] | data[3] << 8);
s.A  = (uint32_t)data[4]
     | (uint32_t)data[5] << 8
     | (uint32_t)data[6] << 16
     | (uint32_t)data[7] << 24;
This relies neither on the memory layout of the members nor on the endianness of the CPU; it relies only on the byte order of the source data (assumed to be little-endian here).
If possible, this function should also ideally be generated from the Verilog source.
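For the opposite direction (sending data back to the hardware), the same idea applies. Here is a minimal sketch, assuming the 8-byte wire format puts B2 at offset 0, B1 at offset 2 and A at offset 4, each little-endian; the function name is a placeholder, not part of the original code:
// Hypothetical helper: serialise myStruct into the assumed 8-byte wire
// format, independent of the host's struct layout and endianness.
void myStruct_to_bytes(const myStruct& s, unsigned char (&out)[8]) {
    out[0] = (unsigned char)(s.B2);
    out[1] = (unsigned char)(s.B2 >> 8);
    out[2] = (unsigned char)(s.B1);
    out[3] = (unsigned char)(s.B1 >> 8);
    out[4] = (unsigned char)(s.A);
    out[5] = (unsigned char)(s.A >> 8);
    out[6] = (unsigned char)(s.A >> 16);
    out[7] = (unsigned char)(s.A >> 24);
}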
How can I change the order of a packed structure in C or C++?
C specifies that the members of a struct are laid out in memory in the order in which they are declared, with the address of the first-declared, when converted to the appropriate pointer type, being equal to the address of the overall struct. At least for struct types expressible in C, such as yours, conforming C++ implementations will follow the same member-order rule. Those implementations that support packed structure layout as an extension are pretty consistent in what they mean by that: packed structure layouts will have no padding between members, and the overall size is the sum of the sizes of the members. And no other effects.
I am not aware of any implementation that provides an extension allowing members to be ordered differently than declaration order, and who would bother to implement that? The order of members is well defined. If you want a different order, then the solution is to change the declaration order of the members.
If Verilog indeed orders the members differently (to which I cannot speak), then I think you're just going to need to make peace with that. Implement it as you need to or as otherwise makes the most sense, document it on both sides, and move on. I'm inclined to think that the number of people who ever notice that the declaration order differs between the two languages will be very small. As long as the appropriate documentation is present, those who do notice won't be inclined to think there's an error.
I just looked at what AMD does in its open-source drivers to handle endianness.
First of all, they detect whether the system is big- or little-endian using CMake.
#if !defined (__GFX10_GB_REG_H__)
#define __GFX10_GB_REG_H__
/*
* gfx10_gb_reg.h
*
* Register Spec Release: 1.0
*
*/
//
// Make sure the necessary endian defines are there.
//
#if defined(LITTLEENDIAN_CPU)
#elif defined(BIGENDIAN_CPU)
#else
#error "BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined"
#endif
union GB_ADDR_CONFIG
{
    struct
    {
#if defined(LITTLEENDIAN_CPU)
        unsigned int NUM_PIPES            : 3;
        unsigned int PIPE_INTERLEAVE_SIZE : 3;
        unsigned int MAX_COMPRESSED_FRAGS : 2;
        unsigned int NUM_PKRS             : 3;
        unsigned int                      : 21;
#elif defined(BIGENDIAN_CPU)
        unsigned int                      : 21;
        unsigned int NUM_PKRS             : 3;
        unsigned int MAX_COMPRESSED_FRAGS : 2;
        unsigned int PIPE_INTERLEAVE_SIZE : 3;
        unsigned int NUM_PIPES            : 3;
#endif
    } bitfields, bits;
    unsigned int u32All;
    int i32All;
    float f32All;
};
#endif
Yes there is some code duplication as mentioned above. But I'm not aware of a universally better solution either.
Independent of the endian issue, I wouldn't recommend C++ bit fields for this kind of purpose, or for any purpose where you actually need explicit control of bit alignment. A long time ago, the decision to put performance over portability ruined this possibility. The alignment of bit fields (and of structs in general, for that matter) is not well defined in C++, making bit fields useless for many purposes. IMO it would be better to let programmers make such decisions for optimization, or to implement a strictly portable (non-machine-dependent) packed keyword. If this means the compiler has to emit code that combines multiple shift-and operations once in a while, so be it.
As far as I know, the only general solution for this kind of thing is to add a layer that implements bit fields explicitly using shift-and logic. Of course this will likely ruin performance because you really want the conditionals to be handled at compile time, which is ironic because performance is what motivated this situation in the first place.
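As a rough illustration of such a layer (not AMD's code), here is a minimal sketch that extracts a few fields from the raw 32-bit register value with explicit shifts and masks, so the result depends neither on the compiler's bit-field layout nor on CPU endianness; the bit positions are assumed to follow the LITTLEENDIAN_CPU layout shown above:
// Hypothetical accessors for GB_ADDR_CONFIG using shift-and-mask logic:
// NUM_PIPES in bits [2:0], PIPE_INTERLEAVE_SIZE in bits [5:3],
// MAX_COMPRESSED_FRAGS in bits [7:6].
static inline unsigned int gb_num_pipes(unsigned int reg)            { return  reg       & 0x7u; }
static inline unsigned int gb_pipe_interleave_size(unsigned int reg) { return (reg >> 3) & 0x7u; }
static inline unsigned int gb_max_compressed_frags(unsigned int reg) { return (reg >> 6) & 0x3u; }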
Related
I have a
typedef struct {
uint32_t Thread: HTHREAD_BUS_WIDTH;
uint32_t Member: 3;
uint32_t Proxy:3;
// Other members, fill out 32 bits
} MyStruct;
that I must transfer from one system to another as an item of
a buffer comprising 32-bit words.
What is the best way to serialize the struct, and on the other side,
to deserialize it? "best" means here safe casting, and no unneeded copying.
For one direction of casting, I have found (as member function)
int &ToInt() {
return *reinterpret_cast<int *>(this);}
Is there a similar valid cast the other way round, i.e. from integer to MyStruct, ideally as a member function?
How can I define which bit means which field? (It may even be the case that the deserialization happens in another program, in another language, on little/big-endian systems.)
How can I define which bit means which field?
You cannot. You have no control over the layout of bitfields.
"best" means here safe casting, and no unneeded copying.
There is no portable safe cast that could avoid copying.
A portable way to serialise bitfields is to manually shift into an integer, in the desired order. For example:
MyStruct value = something;
uint32_t out = 0;
out |= value.Thread;
out <<= 3;           // make room for the 3-bit Member field
out |= value.Member;
out <<= 3;           // make room for the 3-bit Proxy field
out |= value.Proxy;
In the shown example, the least significant bits contain the field Proxy while the other fields are adjacent in more significant bits.
Of course, in order to serialise this generated integer correctly, just like serialising any integer, you must take endianness into consideration. Serialisation of an integer can be portably implemented by repeatedly shifting the integer, and copying the bytes in order of significance into an array.
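A minimal sketch of that last step, assuming a little-endian wire format (the function name is illustrative):
// Hypothetical: write 'out' into 4 bytes, least significant byte first.
void serialise_u32(uint32_t out, unsigned char buf[4]) {
    for (int i = 0; i < 4; ++i) {
        buf[i] = (unsigned char)(out >> (8 * i));
    }
}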
If you need to read from another system which might have a different endianness, you cannot rely on a portable bitfield. A solution is to "expand" your structure so that each field is serialized as a 32-bit value in the "transport" buffer. A safe implementation could be something like:
typedef struct {
uint32_t Thread: HTHREAD_BUS_WIDTH;
uint32_t Member: 3;
uint32_t Proxy:3;
// Other members, fill out 32 bits
std::vector<uint32_t > to_buffer() const;
} MyStruct;
Implementation of to_buffer():
std::vector<uint32_t > MyStruct::to_buffer() const
{
std::vector<uint32_t> buffer;
buffer.push_back((uint32_t)Thread);
buffer.push_back((uint32_t)Member);
buffer.push_back((uint32_t)Proxy);
// push other members
return buffer;
}
then on the receiving side you can convert the buffer back into the struct.
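A possible sketch of that receiving side, mirroring to_buffer() above (the member function name from_buffer() is an assumption, not from the original code):
void MyStruct::from_buffer(const std::vector<uint32_t>& buffer)
{
    // Assign in the same order the fields were pushed by to_buffer();
    // each value is truncated to the width of its bitfield.
    Thread = buffer[0];
    Member = buffer[1];
    Proxy  = buffer[2];
    // read other members...
}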
If you do not want to expand the fields that do not use 32 bits, you can always implement your own packing function by shifting and masking bits, e.g.:
uint32_t member_and_proxy = (Member << 3) | Proxy; // and so on for the other members
It is much more error prone.
From my own experience, if communication bandwidth is not an issue, relying on "text-like" content is a better choice (no endianness issues and very easy to debug).
Related: Compile-time check to make sure that there is no padding anywhere in a struct
Let's consider the following task:
My C++ module as part of an embedded system receives 8 bytes of data, like: uint8_t data[8].
The value of the first byte determines the layout of the rest (20-30 different layouts). In order to access the data conveniently, I would create a different struct for each layout, put them all in a union, and read the data directly from the address of my input through a pointer, like this:
struct Interpretation_1 {
uint8_t multiplexer;
uint8_t timestamp;
uint32_t position;
uint16_t speed;
};
// and a lot of other struct like this (with bitfields, etc..., layout is not defined by me :( )
union DataInterpreter {
Interpretation_1 movement;
//Interpretation_2 temperatures;
//etc...
};
...
uint8_t exampleData[8] {1u, 10u, 20u,0u,0u,0u, 5u,0u};
DataInterpreter* interpreter = reinterpret_cast<DataInterpreter*>(&exampleData);
std::cout << "position: " << +interpreter->movement.position << "\n";
The problem I have is, the compiler can insert padding bytes to the interpretation structs and this kills my idea. I know I can use
with gcc: struct MyStruct{} __attribute__((__packed__));
with MSVC: I can use #pragma pack(push, 1) struct MyStruct{}; #pragma pack(pop)
with clang: ? (I could check it)
But is there any portable way to achieve this? I know C++11 has e.g. alignas for alignment control, but can I use it for this? I have to use C++11, but I would also be interested if there is a better solution in a later version of C++.
But is there any portable way to achieve this?
No, there is no (standard) way to "make" a type that would have padding to not have padding in C++. All objects are aligned at least as much as their type requires and if that alignment doesn't match with the previous sub objects, then there will be padding and that is unavoidable.
Furthermore, there is another problem: you're accessing through a reinterpreted pointer that doesn't point to an object of compatible type. The behaviour of the program is undefined.
We can conclude that classes are not generally useful for representing arbitrary binary data. The packed structures are non-standard, and they also aren't compatible across different systems with different representations for integers (byte endianness).
There is a way to check whether a type contains padding: compare the sum of the sizes of the sub-objects to the size of the complete object, and do this recursively for each member. If the sizes don't match, then there is padding. This is quite tricky, however, because C++ has minimal reflection capabilities, so you need to resort either to hard coding or to metaprogramming.
Given such check, you can make the compilation fail on systems where the assumption doesn't hold.
Another handy tool is std::has_unique_object_representations (since C++17) which will always be false for all types that have padding. But note that it will also be false for types that contain floats for example. Only types that return true can be meaningfully compared for equality with std::memcmp.
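For illustration, here is a minimal sketch of such a compile-time check for the Interpretation_1 example; std::has_unique_object_representations needs C++17, while the hard-coded sizeof comparison also works in C++11. Note that on a typical platform Interpretation_1 as declared does contain padding (two bytes before position plus tail padding), so these assertions would fire, which is exactly how the check makes the problem visible:
#include <cstdint>
#include <type_traits>

struct Interpretation_1 {
    uint8_t  multiplexer;
    uint8_t  timestamp;
    uint32_t position;
    uint16_t speed;
};

// Fails to compile if the type has padding bytes (C++17).
static_assert(std::has_unique_object_representations<Interpretation_1>::value,
              "Interpretation_1 has padding");

// Hard-coded alternative (C++11): the sum of the member sizes must equal
// the struct size, but it has to be kept in sync by hand.
static_assert(sizeof(Interpretation_1) ==
              sizeof(uint8_t) + sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint16_t),
              "Interpretation_1 has padding");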
Reading from unaligned memory is undefined behaviour in C++. In other words, the compiler is allowed to assume that every uint32_t is located at an alignof(uint32_t)-byte boundary and every uint16_t is located at an alignof(uint16_t)-byte boundary. This means that even if you somehow manage to pack your bytes portably, doing interpreter->movement.position will still trigger undefined behaviour.
(In practice, on most architectures, unaligned memory access will still work, albeit with a performance penalty.)
You could, however, write a wrapper, like how std::vector<bool>::operator[] works:
#include <cstdint>
#include <cstring>
#include <iostream>
#include <type_traits>
template <typename T>
struct unaligned_wrapper {
    static_assert(std::is_trivial<T>::value, "T must be trivial");
    std::aligned_storage_t<sizeof(T), 1> buf;
    operator T() const noexcept {
        T ret;
        memcpy(&ret, &buf, sizeof(T));
        return ret;
    }
    unaligned_wrapper& operator=(T t) noexcept {
        memcpy(&buf, &t, sizeof(T));
        return *this;
    }
};
struct Interpretation_1 {
unaligned_wrapper<uint8_t> multiplexer;
unaligned_wrapper<uint8_t> timestamp;
unaligned_wrapper<uint32_t> position;
unaligned_wrapper<uint16_t> speed;
};
// and a lot of other struct like this (with bitfields, etc..., layout is not defined by me :( )
union DataInterpreter {
Interpretation_1 movement;
//Interpretation_2 temperatures;
//etc...
};
int main(){
uint8_t exampleData[8] {1u, 10u, 20u,0u,0u,0u, 5u,0u};
DataInterpreter* interpreter = reinterpret_cast<DataInterpreter*>(&exampleData);
std::cout << "position: " << interpreter->movement.position << "\n";
}
This would ensure that every read or write to the unaligned integer is transformed to a bytewise memcpy, which does not have any alignment requirement. There might be a performance penalty for this on architectures with the ability to access unaligned memory quickly, but it would work on any conforming compiler.
I have an embedded device connected to a PC,
and some big struct S with many fields and arrays of a custom-defined type FixedPoint_t.
FixedPoint_t is a templated POD class with exactly one data member that varies in size from char to long depending on the template parameters. Anyway, it passes static_assert((std::is_pod<FixedPoint_t<0,8,8> >::value == true),"");
It would be good if this big struct had a compatible underlying memory representation on both the embedded system and the controlling PC. That would allow a significant simplification of the communication protocol, down to commands like "set word/byte at offset N to value V". Assume endianness is the same on both platforms.
I see 3 solutions here:
Use something like #pragma pack on both sides.
BUT I get this warning when I put __attribute__((packed)) on the struct S declaration:
warning: ignoring packed attribute because of unpacked non-POD field.
This is because FixedPoint_t is not declared as packed.
I don't want to declare it as packed because this type is widely used in the whole program and packing could lead to a performance drop.
Make a correct struct serialization. This is not acceptable because of code bloat, additional RAM usage... The protocol would also be more complicated because I need random access into the struct. For now I think this is not an option.
Control padding manually. I can add some fields and reorder others... just to achieve no padding on both platforms. This would satisfy me for the moment, but I need a good way to write a test that shows me whether padding is there or not.
I can compare the sum of sizeof() of each field to sizeof(struct).
I can compare offsetof() of each struct field on both platforms.
Both variants are ugly enough...
What do you recommend? I am especially interested in manual padding control and automatic padding detection in tests.
EDIT: Is it sufficient to compare sizeof(big struct) on the two platforms to detect layout compatibility (assuming endianness is equal)? I think the sizes should not match if the padding is different.
EDIT2:
//this struct should have padding on 32bit machine
//and has no padding on 8bit
typedef struct
{
uint8_t f8;
uint32_t f32;
uint8_t arr[5];
} serialize_me_t;
//count of members in struct
#define SERTABLE_LEN 3
//one table entry for each serialize_me_t data member
static const struct {
size_t width;
size_t offset;
// size_t cnt; //why we need cnt?
} ser_des_table[SERTABLE_LEN] =
{
{ sizeof(serialize_me_t::f8), offsetof(serialize_me_t, f8)},
{ sizeof(serialize_me_t::f32), offsetof(serialize_me_t, f32)},
{ sizeof(serialize_me_t::arr), offsetof(serialize_me_t, arr)},
};
void serialize(void* serialize_me_ptr, char* buf)
{
    const char* struct_ptr = (const char*)serialize_me_ptr;
    for (int i = 0; i < SERTABLE_LEN; i++)
    {
        // Copy each member from its own offset; do not accumulate offsets.
        memcpy(buf, struct_ptr + ser_des_table[i].offset, ser_des_table[i].width);
        buf += ser_des_table[i].width;
    }
}
I strongly recommend using option 2:
You are safe for future changes (new ABI, compiler, platform, etc.).
Code bloat can be kept to a minimum if well thought out. There is just one function needed per direction.
You can create the required tables/code (semi-)automatically (I use Python for that). This way both sides stay in sync.
You definitely should add a CRC to the data anyway. As you likely do not want to calculate it in the rx/tx interrupt, you'll have to provide an array anyway.
Using a struct directly will soon become a maintenance nightmare. Even worse if someone else has to track this code.
Protocols, etc. tend to be reused. If it is a platform with different endianness, the other approach goes bang.
To create the data structs and ser/des tables, you can use offsetof to get the offset of each member in the struct. If that table is made an include file, it can be used on both sides. You can even create the struct and table e.g. with a Python script. Adding that to the build process ensures it is always up to date and you avoid additional typing.
For instance (in C, just to get idea):
// protocol.inc
typedef struct {
    uint32_t i;
    uint16_t s[5];
    uint32_t j;
} ProtocolType;

static const struct {
    size_t width;
    size_t offset;
    size_t cnt;
} ser_des_table[] = {
    { sizeof(((ProtocolType *)0)->i),    offsetof(ProtocolType, i), 1 },
    { sizeof(((ProtocolType *)0)->s[0]), offsetof(ProtocolType, s), 5 },
    ...
};
If not created automatically, I'd use macros to generate the data. Possibly by including the file twice: one to generate the struct definition and one for the table. This is possible by redefining the macros in-between.
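A rough sketch of that include-twice trick, with entirely illustrative file, type and macro names (none of this is from the original code). The shared file lists the fields once; each inclusion expands the FIELD macro differently:
/* wire_fields.inc -- hypothetical shared field list */
FIELD(uint32_t, a)
FIELD(uint16_t, b)
FIELD(uint32_t, c)

/* consumer file */
#include <stddef.h>
#include <stdint.h>

/* First inclusion: generate the struct definition. */
#define FIELD(type, name) type name;
typedef struct {
#include "wire_fields.inc"
} WireMsg;
#undef FIELD

/* Second inclusion: generate the ser/des table from the same list. */
#define FIELD(type, name) { sizeof(type), offsetof(WireMsg, name) },
static const struct { size_t width; size_t offset; } wire_table[] = {
#include "wire_fields.inc"
};
#undef FIELD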
You should care about the representation of signed integers and floats (implementation defined, floats are likely IEEE754 as proposed by the standard).
As an alternative to the width field, you can use a "type" code (e.g. a char which maps to an implementation-defined type). This way you can add custom types with the same width but different encoding (e.g. uint32_t and an IEEE754 float). This will completely abstract the network protocol encoding from the physical machine (the best solution). Note that nothing hinders you from using common encodings which do not complicate the code a single bit (literally).
I have a 64bit integer that is used as a handle. The 64bits must be sliced into the following fields, to be accessed individually:
size : 30 bits
offset : 30 bits
invalid flag : 1 bit
immutable flag : 1 bit
type flag : 1 bit
mapped flag : 1 bit
The two ways I can think of to achieve this are:
1) Traditional bit operations (& | << >>), etc. But I find this a bit cryptic.
2) Use a bitfield struct:
#pragma pack(push, 1)
struct Handle {
uint32_t size : 30;
uint32_t offset : 30;
uint8_t invalid : 1;
uint8_t immutable : 1;
uint8_t type : 1;
uint8_t mapped : 1;
};
#pragma pack(pop)
Then accessing a field becomes very clear:
handle.invalid = 1;
But I understand bitfields are quite problematic and non-portable.
I'm looking for ways to implement this bit manipulation with the object of maximizing code clarity and readability. Which approach should I take?
Side notes:
The handle size must not exceed 64bits;
The order these fields are laid in memory is irrelevant, as long as each field size is respected;
The handles are not saved/loaded to file, so I don't have to worry about endianness.
I would go for the bitfields solution.
Bitfields are only "non-portable" if you want to store them in binary form and later read the bitfield using a different compiler or, more commonly, on a different machine architecture. This is mainly because field order is not defined by the standard.
Using bitfields within your application will be fine, and as long as you have no requirement for "binary portability" (storing your Handle in a file and reading it on a different system with code compiled by a different compiler or different processor type), it will work just fine.
Obviously, you need to do some checking, e.g. sizeof(Handle) == 8 should be asserted somewhere, to ensure that you get the size right and the compiler hasn't decided to put your two 30-bit values in separate 32-bit words. To improve the chances of success on multiple architectures, I'd probably define the type as:
struct Handle {
    uint64_t size      : 30;
    uint64_t offset    : 30;
    uint64_t invalid   : 1;
    uint64_t immutable : 1;
    uint64_t type      : 1;
    uint64_t mapped    : 1;
};
There is a rule that the compiler should not "split elements": if you define something as uint32_t and there are only two bits left in the current unit, the whole 30-bit field moves to the next 32-bit element. [It probably works in most compilers, but just in case, using the same 64-bit type throughout is a better choice.]
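A minimal sketch of that size check, so the build fails if the compiler lays the fields out in more than 64 bits:
static_assert(sizeof(Handle) == 8, "Handle must pack into 64 bits");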
I recommend bit operations. Of course you should hide all those operations inside a class. Provide member functions to perform set/get operations. Judicious use of constants inside the class will make most of the operations fairly transparent. For example:
bool Handle::isMutable() const {
    return bits & MUTABLE;
}

void Handle::setMutable(bool f) {
    if (f)
        bits |= MUTABLE;
    else
        bits &= ~MUTABLE;
}
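For the multi-bit fields, the same pattern works with a mask and a shift. A minimal sketch, assuming size occupies the low 30 bits and offset the next 30 bits of the same uint64_t member (the constant names and layout are illustrative, not from the original answer):
class Handle {
    uint64_t bits = 0;
    static constexpr uint64_t FIELD_MASK   = (1ull << 30) - 1; // 30-bit mask
    static constexpr unsigned OFFSET_SHIFT = 30;               // offset sits above size
public:
    uint32_t size() const { return (uint32_t)(bits & FIELD_MASK); }
    void setSize(uint32_t s) {
        bits = (bits & ~FIELD_MASK) | (s & FIELD_MASK);
    }
    uint32_t offset() const {
        return (uint32_t)((bits >> OFFSET_SHIFT) & FIELD_MASK);
    }
    void setOffset(uint32_t o) {
        bits = (bits & ~(FIELD_MASK << OFFSET_SHIFT))
             | ((uint64_t)(o & FIELD_MASK) << OFFSET_SHIFT);
    }
};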
I am trying to perform a less-than-32-bit read over the PCI bus to a VME bridge chip (Tundra Universe II), which will then go onto the VME bus and be picked up by the target.
The target VME application only accepts D32 (a data width read of 32bits) and will ignore anything else.
If I use a bit-field structure mapped over a VME window (mmap'd into main memory), I CAN read bit fields >24 bits, but anything less fails, i.e.:
struct works {
unsigned int a:24;
};
struct fails {
unsigned int a:1;
unsigned int b:1;
unsigned int c:1;
};
struct main {
    works work;
    fails fail;
};
volatile struct main *reg = function_that_creates_and_maps_the_vme_windows_returns_address();
This shows that the struct works is read as a 32-bit access, but a read via the fails struct, e.g. reg->fail.a, is getting factored down to an X-bit read (where X might be 16 or 8?).
So the questions are :
a) Where is this scaled down? Compiler? OS? or the Tundra chip?
b) What is the actual size of the read operation performed?
I basically want to rule out everything but the chip. Documentation on that is on the web, but if it can be proved that the data width requested over the PCI bus is 32 bits, then the problem can be blamed on the Tundra chip!
edit:-
Concrete example, code was:-
struct SVersion
{
unsigned title : 8;
unsigned pecversion : 8;
unsigned majorversion : 8;
unsigned minorversion : 8;
} Version;
So now I have changed it to this :-
union UPECVersion
{
struct SVersion
{
unsigned title : 8;
unsigned pecversion : 8;
unsigned majorversion : 8;
unsigned minorversion : 8;
} Version;
unsigned int dummy;
};
And the base main struct :-
typedef struct SEPUMap
{
...
...
UPECVersion PECVersion;
};
So I still have to change all my baseline code
// perform dummy 32bit read
pEpuMap->PECVersion.dummy;
// get the bits out
x = pEpuMap->PECVersion.Version.minorversion;
And how do I know that the second read won't actually do a real read again, as my original code did (instead of using the already-read bits via the union)?
Your compiler is adjusting the size of your struct to a multiple of its memory alignment setting. Almost all modern compilers do this. On some processors, variables and instructions have to begin on memory addresses that are multiples of some memory alignment value (often 32-bits or 64-bits, but the alignment depends on the processor architecture). Most modern processors don't require memory alignment anymore - but almost all of them see substantial performance benefit from it. So the compilers align your data for you for the performance boost.
However, in many cases (such as yours) this isn't the behavior you want. The size of your structure, for various reasons, can turn out to be extremely important. In those cases, there are various ways around the problem.
One option is to force the compiler to use different alignment settings. The options for doing this vary from compiler to compiler, so you'll have to check your documentation. It's usually a #pragma of some sort. On some compilers (the Microsoft compilers, for instance) it's possible to change the memory alignment for only a very small section of code. For example (in VC++):
#pragma pack(push) // save the current alignment
#pragma pack(1) // set the alignment to one byte
// Define variables that are alignment sensitive
#pragma pack(pop) // restore the alignment
Another option is to define your variables in other ways. Intrinsic types are not resized based on alignment, so instead of your 24-bit bitfield, another approach is to define your variable as an array of bytes.
Finally, you can just let the compilers make the structs whatever size they want and manually record the size that you need to read/write. As long as you're not concatenating structures together, this should work fine. Remember, however, that the compiler is giving you padded structs under the hood, so if you make a larger struct that includes, say, a works and a fails struct, there will be padded bits in between them that could cause you problems.
On most compilers, it's going to be darn near impossible to create a data type smaller than 8 bits. Most architectures just don't think that way. This shouldn't be a huge problem because most hardware devices that use datatypes of smaller than 8-bits end up arranging their packets in such a way that they still come in 8-bit multiples, so you can do the bit manipulations to extract or encode the values on the data stream as it leaves or comes in.
For all of the reasons listed above, a lot of code that works with hardware devices like this work with raw byte arrays and just encode the data within the arrays. Despite losing a lot of the conveniences of modern language constructs, it ends up just being easier.
I am wondering about the value of sizeof(struct fails). Is it 1? In this case, if you perform the read by dereferencing a pointer to a struct fails, it looks correct to issue a D8 read on the VME bus.
You can try to add a field unsigned int unused:29; to your struct fails.
The size of a struct is not equal to the sum of the size of its fields, including bit fields. Compilers are allowed, by the C and C++ language specifications, to insert padding between fields in a struct. Padding is often inserted for alignment purposes.
The common method in embedded systems programming is to read the data as an unsigned integer then use bit masking to retrieve the interesting bits. This is due to the above rule that I stated and the fact that there is no standard compiler parameter for "packing" fields in a structure.
I suggest creating an object (class or struct) for interfacing with the hardware. Let the object read the data, then extract the bits as bool members. This keeps the implementation as close to the hardware as possible; the remaining software should not care how the bits are implemented.
When defining bit field positions / named constants, I suggest this format:
#define VALUE (1 << BIT_POSITION)
// OR
const unsigned int VALUE = 1 << BIT_POSITION;
This format is more readable and has the compiler perform the arithmetic. The calculation takes place during compilation and has no impact during run-time.
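A minimal sketch of such an interfacing object, assuming a memory-mapped 32-bit status register; the class name, bit positions and member names below are purely illustrative:
class StatusRegister {
public:
    explicit StatusRegister(volatile unsigned int* reg) : reg_(reg) {}

    // One full-width read of the hardware register, then decode the bits.
    void read() {
        unsigned int raw = *reg_;   // single 32-bit access
        ready = (raw & READY_BIT) != 0;
        error = (raw & ERROR_BIT) != 0;
    }

    bool ready = false;
    bool error = false;

private:
    static const unsigned int READY_BIT = 1u << 0;
    static const unsigned int ERROR_BIT = 1u << 1;

    volatile unsigned int* reg_;
};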
As an example, the Linux kernel has inline functions that explicitly handle memory-mapped IO reads and writes. In newer kernels it's a big macro wrapper that boils down to an inline assembly movl instruction, but in older kernels it was defined like this:
#define readl(addr) (*(volatile unsigned int *) (addr))
#define writel(b,addr) ((*(volatile unsigned int *) (addr)) = (b))
Ian, if you want to be sure about the size of the things you're reading/writing, I'd suggest not using structs like this to do it. It's possible that the sizeof of the fails struct is just 1 byte; the compiler is free to decide what it should be based on optimizations etc. I'd suggest reading/writing explicitly using ints, or generally the types whose sizes you need to guarantee, and then doing something else such as converting to a union/struct where you don't have those limitations.
It is the compiler that decides what size read to issue. To force a 32 bit read, you could use a union:
union dev_word {
    struct dev_reg {
        unsigned int a:1;
        unsigned int b:1;
        unsigned int c:1;
    } fail;
    uint32_t dummy;
};
volatile union dev_word *vme_map_window();
If reading the union through a volatile-qualified pointer isn't enough to force a read of the whole union (I would think it would be - but that could be compiler-dependent), then you could use a function to provide the required indirection:
volatile union dev_word *real_reg; /* Initialised with vme_map_window() */
union dev_word * const *reg_func(void)
{
    static union dev_word local_copy;
    static union dev_word * const static_ptr = &local_copy;
    local_copy = *real_reg;
    return &static_ptr;
}
#define reg (*reg_func())
...then (for compatibility with the existing code) your accesses are done as:
reg->fail.a
The method described earlier of using the gcc flag -fstrict-volatile-bitfields and defining bitfield variables as volatile u32 works, but the total number of bits defined must be greater than 16.
For example:
typedef union {
    vu32 Word;
    struct {
        vu32 LATENCY :3;
        vu32 HLFCYA  :1;
        vu32 PRFTBE  :1;
        vu32 PRFTBS  :1;
    };
} tFlashACR;
.
tFLASH* const pFLASH = (tFLASH*)FLASH_BASE;
#define FLASH_LATENCY pFLASH->ACR.LATENCY
.
FLASH_LATENCY = Latency;
causes gcc to generate code
.
ldrb r1, [r3, #0]
.
which is a byte read. However, changing the typedef to
typedef union {
    vu32 Word;
    struct {
        vu32 LATENCY :3;
        vu32 HLFCYA  :1;
        vu32 PRFTBE  :1;
        vu32 PRFTBS  :1;
        vu32         :2;
        vu32 DUMMY1  :8;
        vu32 DUMMY2  :8;
    };
} tFlashACR;
changes the resultant code to
.
ldr r3, [r2, #0]
.
I believe the only solution is to:
1) edit/create my main struct with all 32-bit ints (unsigned longs)
2) keep my original bit-field structs
3) for each access I require,
3.1) read the struct member as a 32-bit word and cast it into the bit-field struct,
3.2) read the bit-field element I require (and for writes, set the bit-field and write the word back!)
(1) is a shame, because I then lose the intrinsic types that the members of the "main"/SEPUMap struct have.
End solution :-
Instead of :-
printf("FirmwareVersionMinor: 0x%x\n", pEpuMap->PECVersion);
This :-
SVersion ver = *(SVersion*)&pEpuMap->PECVersion;
printf("FirmwareVersionMinor: 0x%x\n", ver.minorversion);
The only problem I have is writing! (Writes are now read/modify/writes!)
// Read - Get current
_HVPSUControl temp = *(_HVPSUControl*)&pEpuMap->HVPSUControl;
// Modify - set to new value
temp.OperationalRequestPort = true;
// Write
volatile unsigned int *addr = reinterpret_cast<volatile unsigned int*>(&pEpuMap->HVPSUControl);
*addr = *reinterpret_cast<volatile unsigned int*>(&temp);
Just have to tidy that code up into a method!
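A possible shape for that tidy-up, as a rough sketch: the _HVPSUControl type and OperationalRequestPort member come from the snippet above, while the helper name and the idea of also doing the read through the 32-bit pointer (so both accesses are word-sized) are my own assumptions:
// Hypothetical read-modify-write helper: one 32-bit read, modify the
// bit-field copy, one 32-bit write back.
void WriteOperationalRequest(volatile _HVPSUControl* hw, bool request)
{
    volatile unsigned int *addr = reinterpret_cast<volatile unsigned int*>(hw);
    unsigned int word = *addr;                                     // 32-bit read
    _HVPSUControl temp = *reinterpret_cast<_HVPSUControl*>(&word); // reinterpret as bit-fields
    temp.OperationalRequestPort = request;                         // modify
    *addr = *reinterpret_cast<unsigned int*>(&temp);               // 32-bit write
}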
#define writel(addr, data) ( *(volatile unsigned long*)(&addr) = (*(volatile unsigned long*)(&data)) )
I had the same problem on ARM with the GCC compiler, where writes to memory were being done as bytes rather than as a 32-bit word.
The solution is to define the bit-fields using volatile uint32_t (or whatever size you need to write):
union {
    volatile uint32_t XY;
    struct {
        volatile uint32_t XY_A : 4;
        volatile uint32_t XY_B : 12;
    };
};
but when compiling you need to pass this parameter to gcc or g++:
-fstrict-volatile-bitfields
More details are in the GCC documentation.