force a bit field read to 32 bits - c++

I am trying to perform a less-than-32-bit read over the PCI bus to a VME bridge chip (Tundra Universe II), which will then go onto the VME bus and be picked up by the target.
The target VME application only accepts D32 (a data width read of 32bits) and will ignore anything else.
If I use a bit-field structure mapped over a VME window (mmap'd into main memory) I CAN read bit fields >24 bits, but anything less fails, i.e.:
struct works {
    unsigned int a:24;
};

struct fails {
    unsigned int a:1;
    unsigned int b:1;
    unsigned int c:1;
};

struct main {
    works work;
    fails fail;
};
volatile main *reg = function_that_creates_and_maps_the_vme_windows_returns_address();
This shows that the works struct is read as a 32-bit access, but a read through the fails struct, e.g. reg->fail.a, is getting factored down to an X-bit read (where X might be 16 or 8?).
So the questions are :
a) Where is this scaled down? Compiler? OS? or the Tundra chip?
b) What is the actual size of the read operation performed?
I basically want to rule out everything but the chip. Documentation on that is on the web, but if it can be proved that the data width requested over the PCI bus is 32 bits, then the problem can be blamed on the Tundra chip!
edit:-
Concrete example, code was:-
struct SVersion
{
    unsigned title : 8;
    unsigned pecversion : 8;
    unsigned majorversion : 8;
    unsigned minorversion : 8;
} Version;
So now I have changed it to this :-
union UPECVersion
{
    struct SVersion
    {
        unsigned title : 8;
        unsigned pecversion : 8;
        unsigned majorversion : 8;
        unsigned minorversion : 8;
    } Version;
    unsigned int dummy;
};
And the base main struct :-
typedef struct SEPUMap
{
    ...
    ...
    UPECVersion PECVersion;
};
So I still have to change all my baseline code
// perform dummy 32bit read
pEpuMap->PECVersion.dummy;
// get the bits out
x = pEpuMap->PECVersion.Version.minorversion;
And how do I know that the second read won't actually do a real read again, as my original code did, instead of using the already-read bits via the union?

Your compiler is adjusting the size of your struct to a multiple of its memory alignment setting. Almost all modern compilers do this. On some processors, variables and instructions have to begin on memory addresses that are multiples of some memory alignment value (often 32-bits or 64-bits, but the alignment depends on the processor architecture). Most modern processors don't require memory alignment anymore - but almost all of them see substantial performance benefit from it. So the compilers align your data for you for the performance boost.
However, in many cases (such as yours) this isn't the behavior you want. The size of your structure, for various reasons, can turn out to be extremely important. In those cases, there are various ways around the problem.
One option is to force the compiler to use different alignment settings. The options for doing this vary from compiler to compiler, so you'll have to check your documentation. It's usually a #pragma of some sort. On some compilers (the Microsoft compilers, for instance) it's possible to change the memory alignment for only a very small section of code. For example (in VC++):
#pragma pack(push) // save the current alignment
#pragma pack(1) // set the alignment to one byte
// Define variables that are alignment sensitive
#pragma pack(pop) // restore the alignment
Another option is to define your variables in other ways. Intrinsic types are not resized based on alignment, so instead of your 24-bit bitfield, another approach is to define your variable as an array of bytes.
Finally, you can just let the compilers make the structs whatever size they want and manually record the size that you need to read/write. As long as you're not concatenating structures together, this should work fine. Remember, however, that the compiler is giving you padded structs under the hood, so if you make a larger struct that includes, say, a works and a fails struct, there will be padded bits in between them that could cause you problems.
On most compilers, it's going to be darn near impossible to create a data type smaller than 8 bits. Most architectures just don't think that way. This shouldn't be a huge problem because most hardware devices that use datatypes of smaller than 8-bits end up arranging their packets in such a way that they still come in 8-bit multiples, so you can do the bit manipulations to extract or encode the values on the data stream as it leaves or comes in.
For all of the reasons listed above, a lot of code that works with hardware devices like this work with raw byte arrays and just encode the data within the arrays. Despite losing a lot of the conveniences of modern language constructs, it ends up just being easier.
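As an illustration of that style, here is a minimal sketch (the register pointer and the four 8-bit field positions are assumptions for the example, not taken from the question) that performs one explicit 32-bit read and then extracts the fields with shifts and masks:
#include <cstdint>

// One explicit 32-bit read, then shift/mask extraction of the fields.
void read_version(volatile uint32_t *reg)
{
    uint32_t raw = *reg;                         // a single 32-bit access
    uint8_t title        = (raw >> 0)  & 0xffu;  // bits 0-7 (layout assumed)
    uint8_t pecversion   = (raw >> 8)  & 0xffu;  // bits 8-15
    uint8_t majorversion = (raw >> 16) & 0xffu;  // bits 16-23
    uint8_t minorversion = (raw >> 24) & 0xffu;  // bits 24-31
    (void)title; (void)pecversion; (void)majorversion; (void)minorversion;
}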

I am wondering about the value of sizeof(struct fails). Is it 1? In this case, if you perform the read by dereferencing a pointer to a struct fails, it looks correct to issue a D8 read on the VME bus.
You can try to add a field unsigned int unused:29; to your struct fails.
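That is, something like this (the field name unused is just illustrative), which pads the bit-field storage unit out so that sizeof(struct fails) becomes 4 on a platform with 32-bit ints:
struct fails {
    unsigned int a:1;
    unsigned int b:1;
    unsigned int c:1;
    unsigned int unused:29;  // pad the storage unit to a full 32 bits
};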

The size of a struct is not equal to the sum of the sizes of its fields, including bit fields. Compilers are allowed, by the C and C++ language specifications, to insert padding between fields in a struct. Padding is often inserted for alignment purposes.
The common method in embedded systems programming is to read the data as an unsigned integer and then use bit masking to retrieve the interesting bits. This is due to the above rule and the fact that there is no standard compiler parameter for "packing" fields in a structure.
I suggest creating an object (class or struct) for interfacing with the hardware. Let the object read the data, then extract the bits as bool members. This puts the implementation as close to the hardware as possible. The remaining software should not care how the bits are implemented.
When defining bit field positions / named constants, I suggest this format:
#define VALUE (1 << BIT_POSITION)
// OR
const unsigned int VALUE = 1 << BIT_POSITION;
This format is more readable and has the compiler perform the arithmetic. The calculation takes place at compile time and has no run-time cost.
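A minimal sketch of such an interface object, assuming a 32-bit status register; the class name and bit positions are illustrative:
#include <cstdint>

class HardwareStatus
{
public:
    explicit HardwareStatus(volatile uint32_t *reg) { m_raw = *reg; } // one 32-bit read
    bool ready() const { return (m_raw & READY) != 0; }
    bool error() const { return (m_raw & ERROR_FLAG) != 0; }
private:
    static const uint32_t READY      = 1u << 0;  // bit positions are assumptions
    static const uint32_t ERROR_FLAG = 1u << 1;
    uint32_t m_raw;                              // snapshot taken once, at construction
};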

As an example, the Linux kernel has inline functions that explicitly handle memory-mapped I/O reads and writes. In newer kernels it's a big macro wrapper that boils down to an inline assembly movl instruction, but in older kernels it was defined like this:
#define readl(addr) (*(volatile unsigned int *) (addr))
#define writel(b,addr) ((*(volatile unsigned int *) (addr)) = (b))
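Usage is then a plain 32-bit access through the macro (the base address and register offset here are illustrative):
unsigned int status = readl(base_addr + 0x10);  // one 32-bit read
writel(status | 0x1, base_addr + 0x10);         // one 32-bit write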

Ian - if you want to be sure of the size of the things you're reading/writing, I'd suggest not using structs like this to do it. It's possible the sizeof the fails struct is just 1 byte - the compiler is free to decide what it should be based on optimizations etc. I'd suggest reading/writing explicitly using ints, or generally types whose sizes you can guarantee, and then converting to a union/struct where you don't have those limitations.

It is the compiler that decides what size read to issue. To force a 32 bit read, you could use a union:
union dev_word {
    struct dev_reg {
        unsigned int a:1;
        unsigned int b:1;
        unsigned int c:1;
    } fail;
    uint32_t dummy;
};
volatile union dev_word *vme_map_window();
If reading the union through a volatile-qualified pointer isn't enough to force a read of the whole union (I would think it would be - but that could be compiler-dependent), then you could use a function to provide the required indirection:
volatile union dev_word *real_reg; /* Initialised with vme_map_window() */

union dev_word * const *reg_func(void)
{
    static union dev_word local_copy;
    static union dev_word * const static_ptr = &local_copy;
    local_copy = *real_reg;
    return &static_ptr;
}

#define reg (*reg_func())
#define reg (*reg_func())
...then (for compatibility with the existing code) your accesses are done as:
reg->fail.a

The method described in another answer, using the gcc flag -fstrict-volatile-bitfields and defining the bitfield variables as volatile u32, works, but the total number of bits defined must be greater than 16.
For example:
typedef union {
    vu32 Word;
    struct {
        vu32 LATENCY :3;
        vu32 HLFCYA  :1;
        vu32 PRFTBE  :1;
        vu32 PRFTBS  :1;
    };
} tFlashACR;
.
tFLASH* const pFLASH = (tFLASH*)FLASH_BASE;
#define FLASH_LATENCY pFLASH->ACR.LATENCY
.
FLASH_LATENCY = Latency;
causes gcc to generate code
.
ldrb r1, [r3, #0]
.
which is a byte read. However, changing the typedef to
typedef union {
    vu32 Word;
    struct {
        vu32 LATENCY :3;
        vu32 HLFCYA  :1;
        vu32 PRFTBE  :1;
        vu32 PRFTBS  :1;
        vu32         :2;
        vu32 DUMMY1  :8;
        vu32 DUMMY2  :8;
    };
} tFlashACR;
changes the resultant code to
.
ldr r3, [r2, #0]
.

I believe the only solution is to:
1) edit/create my main struct as all 32-bit ints (unsigned longs)
2) keep my original bit-field structs
3) for each access I require:
3.1) read the struct member as a 32-bit word, and cast it into the bit-field struct,
3.2) read the bit-field element I require. (And for writes: set the bit-field, then write the word back!)
(1) is a shame, because then I lose the intrinsic types of each member of the "main/SEPUMap" struct.
End solution :-
Instead of :-
printf("FirmwareVersionMinor: 0x%x\n", pEpuMap->PECVersion);
This :-
SPECVersion ver = *(SPECVersion*)&pEpuMap->PECVersion;
printf("FirmwareVersionMinor: 0x%x\n", ver.minorversion);
Only problem I have is writing! (Writes are now read/modify/writes!)
// Read - Get current
_HVPSUControl temp = *(_HVPSUControl*)&pEpuMap->HVPSUControl;
// Modify - set to new value
temp.OperationalRequestPort = true;
// Write
volatile unsigned int *addr = reinterpret_cast<volatile unsigned int*>(&pEpuMap->HVPSUControl);
*addr = *reinterpret_cast<volatile unsigned int*>(&temp);
Just have to tidy that code up into a method!
#define writel(addr, data) ( *(volatile unsigned long*)(&addr) = (*(volatile unsigned long*)(&data)) )
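A sketch of what that tidy-up might look like: a pair of helper templates (the names and the memcpy approach are illustrative, not from the original code) that funnel every access through a single 32-bit read or write:
#include <cstring>

// Read the whole 32-bit register once, then reinterpret it as the bit-field struct.
template <typename Bits>
Bits read_reg(volatile unsigned int &reg)
{
    static_assert(sizeof(Bits) == sizeof(unsigned int), "register must be 32 bits");
    unsigned int raw = reg;               // one 32-bit read
    Bits out;
    std::memcpy(&out, &raw, sizeof out);
    return out;
}

// Write the whole struct back as one 32-bit word (the caller does the modify step).
template <typename Bits>
void write_reg(volatile unsigned int &reg, const Bits &value)
{
    static_assert(sizeof(Bits) == sizeof(unsigned int), "register must be 32 bits");
    unsigned int raw;
    std::memcpy(&raw, &value, sizeof raw);
    reg = raw;                            // one 32-bit write
}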

I had the same problem on ARM using the GCC compiler, where writes to memory went out as bytes rather than as a 32-bit word.
The solution is to define the bit-fields using volatile uint32_t (or whatever size you need to write):
union {
    volatile uint32_t XY;
    struct {
        volatile uint32_t XY_A : 4;
        volatile uint32_t XY_B : 12;
    };
};
but when compiling you need to pass gcc or g++ this parameter:
-fstrict-volatile-bitfields
There is more detail in the gcc documentation.
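For example, a compile line might look like this (the file name is illustrative):
g++ -O2 -fstrict-volatile-bitfields -c registers.cpp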

Related

How to declare a compact type larger than the CPU's machine word?

Sometimes I need to declare a type that works like a simple type but may be larger than a CPU register - on some machines it fits in a register, on others it doesn't.
For example, a UUID (128 bits) or a 128-bit datetime, on a 32-bit or 64-bit machine.
In some cases there are already multi-platform libraries for this, but in others there aren't.
I know it's recommended to stick to existing libraries, but in case that isn't possible, how should I do it?
Example:
typedef
    uint_16 /* redeclare as */ code16;
    uint_32 /* redeclare as */ code32;

// option 1
struct code64
{
    packed uint_32 A, B;
};

// option 2
struct code64
{
    aligned uint32_t A, B;
};

// option 3
typedef
    packed uint_16 code64[4];

void Example( )
{
    code64 A = Foo();
    code64 B = Bar();
    code64 Q = Zaz(A, B);
}
As the tags indicate, I want it to compile both in C and C++.
I already searched for this subject in other questions on Stack Overflow.
A struct works like a simple type for the purpose of passing it around, taking its address, sizeof, etc. To just store some unstructured data inside, make a struct that contains a single array field.
typedef struct {
    uint8_t a[16];
} code128;
Depending on what you want to store, you may want to put more structure in your struct. For example, a UUID is formally defined as having fields of a certain size, some of which are stored in platform endianness. (However not all software cares about endianness, and modern software often treats UUIDs as just a bunch of bytes.)
typedef struct {
    uint32_t time_low;
    uint16_t time_mid;
    uint16_t time_hi_and_version;
    uint8_t clock_seq_hi_and_res;
    uint8_t clock_seq_low;
    uint8_t node[6];
} uuid;
For a 128-bit time, what this could be depends on how the time is counted. It's common to represent time as a 64-bit number of seconds and a number of fractions such as nanoseconds or 2^64th of a second. For example, modern Unix systems have this type:
struct timeval {
    uint64_t tv_sec;
    uint64_t tv_usec;
};
You can't do arithmetic on these structures, but arithmetic is meaningless on UUIDs anyway, and arithmetic on time is usually not direct arithmetic, because time with subsecond precision is usually not represented as a single number.
If you do need arithmetic, some compilers offer a 128-bit integer type, but that's not portable (any C or C++ implementation must offer an integer type that's at least 64 bits, but it doesn't have to go beyond that). For example, GCC and Clang both offer double-width integer types, so they provide unsigned __int128 on machines where the CPU has 64-bit registers, but only up to uint64_t on 32-bit CPUs.
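A hedged sketch of feature-testing for that type; __SIZEOF_INT128__ is the macro GCC and Clang define when the 128-bit type is available:
#include <cstdint>

#if defined(__SIZEOF_INT128__)
typedef unsigned __int128 u128;

// 128-bit arithmetic works directly when the compiler provides the type.
u128 add128(u128 a, u128 b) { return a + b; }
#else
// Fall back to a two-limb representation on platforms without __int128.
struct u128 { uint64_t hi, lo; };
#endif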

Header with byte restrictions for a UDP socket

I am writing a header for a UDP socket which has byte-size restrictions:
| Packet ID (1 byte) | Packet Size (2 bytes) | Subpacket ID (1 Byte) | etc
I made a struct to store these attributes, like:
typedef struct WHEATHER_STRUCT
{
    unsigned char packetID[1];
    unsigned char packetSize[2];
    unsigned char subPacketID[1];
    unsigned char subPacketOffset[2];
    ...
} wheather_struct;
I initialize this struct using new and update the values. The question is about the 2-byte Packet Size attribute: which of the two forms I wrote below is the correct one?
*weather_struct->packetSize = '50';
or
*weather_struct->packetSize = 50;
If you can use C++11 and gcc (or clang) then I would do this:
typedef struct WHEATHER_STRUCT
{
    uint8_t packetID;
    uint16_t packetSize;
    uint8_t subPacketID;
    uint16_t subPacketOffset;
    // ...
} __attribute__((packed)) wheather_struct;
If you can't use C++11 then you can use unsigned char and unsigned short instead.
If you're using Visual C then you can do:
#pragma pack(push, 1)
typedef struct ...
#pragma pack(pop)
Beware also byte ordering issues, depending on what architectures you need to support. You can use htons() and ntohs() to overcome this problem.
Live demo at Wandbox
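For example, filling in the packed struct above before sending (the values are illustrative; htons comes from <arpa/inet.h>):
wheather_struct w{};
w.packetID = 1;
w.packetSize = htons(50);        // multi-byte fields go out in network byte order
w.subPacketID = 2;
w.subPacketOffset = htons(0);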
Packing and unpacking data from IP packets is a problem as old as the internet itself (indeed, older).
Different machine architectures have different layouts for representing integers, which can cause problems when communicating between machines.
For this reason, the IP stack standardises on encoding integers in 'network byte order' (which basically means most significant byte first).
Standard functions exist to convert values in network byte order to native types and vice versa. I urge you to consider using these as your code will then be more portable.
Furthermore, it makes sense to abstract data representations from the program's point of view. C++ compilers can perform the conversions very efficiently.
Example:
#include <arpa/inet.h>
#include <cstring>
#include <cstdint>

typedef struct WEATHER_STRUCT
{
    std::int8_t packetID;
    std::uint16_t packetSize;
    std::uint8_t subPacketID;
    std::uint16_t subPacketOffset;
} weather_struct;

const std::int8_t* populate(weather_struct& target, const std::int8_t* source)
{
    auto get16 = [&source]
    {
        std::uint16_t buf16;
        std::memcpy(&buf16, source, 2);
        source += 2;
        return ntohs(buf16);   // network to host byte order on the way in
    };

    target.packetID = *source++;
    target.packetSize = get16();
    target.subPacketID = *source++;
    target.subPacketOffset = get16();
    return source;
}

uint8_t* serialise(uint8_t* target, weather_struct const& source)
{
    auto write16 = [&target](std::uint16_t val)
    {
        val = htons(val);      // host to network byte order on the way out
        std::memcpy(target, &val, 2);
        target += 2;
    };

    *target++ = source.packetID;
    write16(source.packetSize);
    *target++ = source.subPacketID;
    write16(source.subPacketOffset);
    return target;
}
https://linux.die.net/man/3/htons
Here's a link to a C++17 version of the above:
https://godbolt.org/z/oRASjI
A further note on conversion costs:
Data arriving into or leaving your program is an event that happens once per payload. Suffering a conversion cost here incurs a negligible penalty.
Once the data has arrived in your program, or before it leaves, it may be manipulated many times by your code.
Some processor architectures suffer huge performance penalties during data access if data is not aligned on natural word boundaries. This is why attributes such as packed exist - the compiler is doing all it can to avoid misaligned data. Using a packed attribute is tantamount to deliberately telling the compiler to produce very suboptimal code.
For this reason, I would recommend not using packed structures (e.g. __attribute__((packed)) etc) for data that will be referred to by program logic.
Compared to RAM, networks are many orders of magnitude slower. A minuscule performance hit (literally nanoseconds) at the point of encoding or decoding a network packet is inconsequential compared to the cost of actually transmitting it.
Packing structures can cause horrible performance issues in program code and often leads to portability headaches.
Neither is correct; you need to treat the two bytes as a single 16-bit number. You probably also need to take into account the difference in endianness between the network stream and your processor architecture (depending on the protocol; most are big-endian).
The correct code would therefore be:
*((uint16_t*)weather_struct->packetSize) = htons(50);
It would be simpler if packetSize were uint16_t to start with:
weather_struct->packetSize = htons(50);

why add fillers in a c++ struct?

What is the effect of fillers in a C++ struct? I often see them in some C++ APIs. For example:
struct example
{
    unsigned short a;
    unsigned short b;
    char c[3];
    char filler1;
    unsigned short e;
    char filler2;
    unsigned int g;
};
This struct is meant to be transported over a network:
struct example
{
    unsigned short a;   // 2 bytes
    unsigned short b;   // 2 bytes
    // 4 bytes consumed
    char c[3];          // 3 bytes
    char filler1;       // 1 byte
    // 4 bytes consumed
    unsigned short e;   // 2 bytes
    char filler2;       // 1 byte
    // 3 bytes consumed, should be filler2[2]
    unsigned int g;     // 4 bytes
};
Because sometimes you don't actually control the format of the data you're using.
The format may be specified by something beyond your control. For example, it may be created in a system with different alignment requirements to yours.
Alternatively, the data may have real data in those filler areas that your code doesn't care about.
Those fillers are usually inserted to explicitly make sure some of the members of a structure are naturally aligned, i.e. their offset inside the structure is a multiple of their size.
In the example below, assume char is 1 byte, short is 2 bytes and int is 4 bytes.
struct example
{
    unsigned short a;
    unsigned short b;
    char c[3];
    char filler1;
    unsigned short e;   // starts at offset 8
    char filler2[2];
    unsigned int g;     // starts at offset 12
};
If you don't specify any fillers, a compiler will usually add the necessary padding bytes to ensure a proper alignment of the structure members.
Btw, these fields can also be used for reserved fields that might appear in the future.
updated:
Since it has been mentioned that a structure is a network packet, the fillers are required to get a structure that is compatible with the one being passed from another host.
However, inserting filler bytes in this case might not be enough (especially, if portability is required). If these structures are to be sent via a network as is (i.e. without manually packing into a separate buffer for sending), you have to inform a compiler that the structure should be packed.
In the Microsoft compiler this can be achieved using #pragma pack:
#pragma pack(1)
struct T {
    char t;
    int i;
    short j;
    double k;
};
In gcc you can use __attribute__((packed)):
struct foo {
    char c;
    int x;
} __attribute__((packed));
However, many people prefer to manually pack/unpack structures into a raw byte array, because accessing misaligned data on some systems might not be [properly] supported.
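To see the effect of packing, a small check (a sketch; the expected sizes assume a typical platform with 4-byte int and 4-byte alignment):
#include <cstdio>

struct foo_packed {
    char c;
    int x;
} __attribute__((packed));

struct foo_padded {
    char c;
    int x;
};

int main()
{
    // Packed: 1 + 4 = 5 bytes. Padded: typically 8, with 3 padding bytes after c.
    std::printf("packed: %zu, padded: %zu\n", sizeof(foo_packed), sizeof(foo_padded));
    return 0;
}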
Depending on what code you're working with, they may be attempting to align the structure on word boundaries (32-bit in your case). This is a speed optimization; however, doing things like this has largely been rendered obsolete by decent optimizing compilers. Still, if the compiler was instructed not to optimize this piece of code, or the compiler is very low-end (e.g. for an embedded system), it may be better to handle this yourself. It basically boils down to how much you trust the compiler.
The other reason is for writing binary files, where reserved bytes have been left in the file format specification.

C++ struct containing unsigned char and int bug

OK, I have a struct in my C++ program that is like this:
struct thestruct
{
    unsigned char var1;
    unsigned char var2;
    unsigned char var3[2];
    unsigned char var4;
    unsigned char var5[8];
    int var6;
    unsigned char var7[4];
};
When I use this struct, 3 random bytes get added before var6. If I delete var5, the extra bytes are still before var6, so I know they always appear before var6.
But if I remove var6 then the 3 extra bytes are gone.
If I only use a struct with an int in it, there are no extra bytes.
So there seems to be a conflict between the unsigned char and the int; how can I fix that?
The compiler is probably using its default alignment option, where members of size x are aligned on a memory boundary evenly divisible by x.
Depending on your compiler, you can affect this behaviour using a #pragma directive, for example:
#pragma pack(1)
will turn off the default alignment in Visual C++:
Specifies the value, in bytes, to be used for packing. The default value for n is 8. Valid values are 1, 2, 4, 8, and 16. The alignment of a member will be on a boundary that is either a multiple of n or a multiple of the size of the member, whichever is smaller.
Note that for low-level CPU performance reasons, it is usually best to try to align your data members so that they fall on an aligned boundary. Some CPU architectures require alignment, while others (such as Intel x86) tolerate misalignment with a decrease in performance (sometimes quite significantly).
Your data structure is being aligned so that your int falls on word boundaries, which for your target might be 32 or 64 bits.
You can reorganize your struct as follows so that this won't happen:
struct thestruct
{
    int var6;
    unsigned char var1;
    unsigned char var2;
    unsigned char var3[2];
    unsigned char var4;
    unsigned char var5[8];
    unsigned char var7[4];
};
Are you talking about padding bytes? That's not a bug. As allowed by the C++ standard, the compiler is adding padding to keep the members aligned. This is required for some architectures, and will greatly improve performance for others.
You're having a byte alignment problem. The compiler is adding padding to align the bytes. See this wikipedia article.
Read up on data structure alignment. Essentially, depending on the compiler and compile options, you'll get alignment onto different powers-of-2.
To avoid it, move multi-byte items (int or pointers) before single-byte (signed or unsigned char) items -- although it might still be there after your last item.
While rearranging the order you declare data members inside your struct is fine, it should be emphasized that overriding the default alignment by using #pragmas and such is a bad idea unless you know exactly what you're doing. Depending on your compiler and architecture, attempting to access unaligned data, particularly by storing the address in a pointer and later trying to dereference it, can easily give the dreaded Bus Error or other undefined behavior.

How can I get bitfields to arrange my bits in the right order?

To begin with, the application in question is always going to be on the same processor, and the compiler is always gcc, so I'm not concerned about bitfields not being portable.
gcc lays out bitfields such that the first listed field corresponds to the least significant bit of a byte. So with the following structure and a=0, b=1, c=1, d=1, you get a byte of value e0.
struct Bits {
    unsigned int a:5;
    unsigned int b:1;
    unsigned int c:1;
    unsigned int d:1;
} __attribute__((__packed__));
(Actually, this is C++, so I'm talking about g++.)
Now let's say I'd like a to be a six bit integer.
Now, I can see why this won't work, but I coded the following structure:
struct Bits2 {
    unsigned int a:6;
    unsigned int b:1;
    unsigned int c:1;
    unsigned int d:1;
} __attribute__((__packed__));
Setting b, c, and d to 1, and a to 0 results in the following two bytes:
c0 01
This isn't what I wanted. I was hoping to see this:
e0 00
Is there any way to specify a structure that has three bits in the most significant bits of the first byte and six bits spanning the five least significant bits of the first byte and the most significant bit of the second?
Please be aware that I have no control over where these bits are supposed to be laid out: it's a layout of bits that are defined by someone else's interface.
(Note that all of this is gcc-specific commentary - I'm well aware that the layout of bitfields is implementation-defined).
Not on a little-endian machine: The problem is that on a little-endian machine, the most significant bit of the second byte isn't considered "adjacent" to the least significant bits of the first byte.
You can, however, combine the bitfields with the ntohs() function:
union u_Bits2 {
    struct Bits2 {
        uint16_t _padding:7;
        uint16_t a:6;
        uint16_t b:1;
        uint16_t c:1;
        uint16_t d:1;
    } bits __attribute__((__packed__));
    uint16_t word;
};
union u_Bits2 flags;
flags.word = ntohs(flag_bytes_from_network);
However, I strongly recommend you avoid bitfields and instead use shifting and masks.
Usually you can't make strong assumptions about how the union will be packed; every compiler implementation may choose to pack it differently (to save space or to align bitfields inside bytes).
I would suggest you just work with masking and bitwise operators.
From this link:
The main use of bitfields is either to allow tight packing of data or to be able to specify the fields within some externally produced data files. C gives no guarantee of the ordering of fields within machine words, so if you do use them for the latter reason, your program will not only be non-portable, it will be compiler-dependent too. The Standard says that fields are packed into ‘storage units’, which are typically machine words. The packing order, and whether or not a bitfield may cross a storage unit boundary, are implementation defined. To force alignment to a storage unit boundary, a zero width field is used before the one that you want to have aligned.
C/C++ has no means of specifying the bit-by-bit memory layout of structs, so you will need to do manual bit shifting and masking on 8- or 16-bit (unsigned) integers (uint8_t, uint16_t from <stdint.h> or <cstdint>).
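For the layout asked about in the question, a minimal sketch of that manual approach (the function names are illustrative); with a=0 and b=c=d=1 it produces the desired e0 00:
#include <cstdint>

// Pack d, c, b into the top three bits of the first byte, and the 6-bit
// field a across the remaining five bits plus the MSB of the second byte.
void pack_bits2(uint8_t out[2], unsigned a, unsigned b, unsigned c, unsigned d)
{
    out[0] = static_cast<uint8_t>(((d & 1u) << 7) | ((c & 1u) << 6) |
                                  ((b & 1u) << 5) | ((a >> 1) & 0x1fu));
    out[1] = static_cast<uint8_t>((a & 1u) << 7);
}

void unpack_bits2(const uint8_t in[2], unsigned &a, unsigned &b, unsigned &c, unsigned &d)
{
    d = (in[0] >> 7) & 1u;
    c = (in[0] >> 6) & 1u;
    b = (in[0] >> 5) & 1u;
    a = ((in[0] & 0x1fu) << 1) | ((in[1] >> 7) & 1u);
}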
Of the dozen or so programming languages I know, only very few allow you to specify bit-by-bit memory layout for bit fields: Ada, Erlang, VHDL (and Verilog).
(Community wiki if you want to add more languages to that list.)