I am doing a Header for an UDP socket which have a restrictions using bytes.
| Packet ID (1 byte) | Packet Size (2 bytes) | Subpacket ID (1 Byte) | etc
I did an struct for store this kind of attributes like:
typedef struct WHEATHER_STRUCT
{
unsigned char packetID[1];
unsigned char packetSize[2];
unsigned char subPacketID[1];
unsigned char subPacketOffset[2];
...
} wheather_struct;
I initialized this struct using new and I updated the values. The question is about if I want to use only 2 bytes in Packet Size attribute. What of these two forms that I wrote below is the correct one?
*weather_struct->packetSize = '50';
or
*weather_struct->packetSize = 50;
If you can use C++11 and gcc (or clang) then I would do this:
typedef struct WHEATHER_STRUCT
{
uint8_t packetID;
uint16_t packetSize;
uint8_t subPacketID;
uint16_t subPacketOffset;
// ...
} __attribute__((packed)) wheather_struct;
If you can't use C++11 then you can use unsigned char and unsigned short instead.
If you're using Visual C then you can do:
#pragma pack (push, 1)
typedef struct ...
#pragma (pop)
Beware also byte ordering issues, depending on what architectures you need to support. You can use htons() and ntohs() to overcome this problem.
Live demo at Wandbox
Packing and unpacking data from IP packets is a problem as old as the internet itself (indeed, older).
Different machine architectures have different layouts for representing integers, which can cause problems when communicating between machines.
For this reason, the IP stack standardises on encoding integers in 'network byte order' (which basically means most significant byte first).
Standard functions exist to convert values in network byte order to native types and vice versa. I urge you to consider using these as your code will then be more portable.
Furthermore, it makes sense to abstract data representations from the program's point of view. c++ compilers can perform the conversions very efficiently.
Example:
#include <arpa/inet.h>
#include <cstring>
#include <cstdint>
typedef struct WEATHER_STRUCT
{
std::int8_t packetID;
std::uint16_t packetSize;
std::uint8_t subPacketID;
std::uint16_t subPacketOffset;
} weather_struct;
const std::int8_t* populate(weather_struct& target, const std::int8_t* source)
{
auto get16 = [&source]
{
std::uint16_t buf16;
std::memcpy(&buf16, source, 2);
source += 2;
return ntohs(buf16);
};
target.packetID = *source++;
target.packetSize = get16();
target.subPacketID = *source++;
target.subPacketOffset = get16();
return source;
}
uint8_t* serialise(uint8_t* target, weather_struct const& source)
{
auto write16 = [&target](std::uint16_t val)
{
val = ntohs(val);
std::memcpy(target, &val, 2);
target += 2;
};
*target++ = source.packetID;
write16(source.packetSize);
*target++ = source.subPacketID;
write16(source.subPacketOffset);
return target;
}
https://linux.die.net/man/3/htons
here's an link to a c++17 version of the above:
https://godbolt.org/z/oRASjI
A further note on conversion costs:
Data arriving into or leaving your program is an event that happens once per payload. Suffering a conversion cost here incurs a negligible penalty.
Once the data has arrived in your program, or before it leaves, it may be manipulated many times by your code.
Some processors architectures suffer huge performance penalties during data access if data is not aligned on natural word boundaries. This is why attributes such as packed exist - the compiler is doing all it can to avoid misaligned data. Using a packed attribute is tantamount to deliberately telling the compiler to produce very suboptimal code.
For this reason, I would recommend not using packed structures (e.g. __attribute__((packed)) etc) for data that will be referred to by program logic.
Compared to RAM, networks are many orders of magnitude slower. A minuscule performance hit (literally nanoseconds) at the point of encoding or decoding a network packet is inconsequential compared to the cost of actually transmitting it.
Packing structures can cause horrible performance issues in program code and often leads to portability headaches.
Neither is correct, you need to treat the two bytes as a single 16-bit number. You probably also need to take into account the different endianness of the network stream to your processor architecture (depending on the protocol, most are big endian).
The correct code would therefore be:
*((uint16_t*)weather_struct->packetSize) = htons(50);
It would be simpler if packetSize were uint16_t to start with:
weather_struct->packetSize = htons(50);
Related
Sometimes, I need to declare a type, that works similar to a simple type, that may have a larger size than the register of a cpu, in some machines, in others not.
For example, a UUID (128 bits) or a (128 bits) datetime, in a 32 bits or 64 bits machine.
In some cases, there already are multiplatform libraries, but, in another doesn't.
I know is recommended to stick to existing libraries, but, in case that doesn't, how should I do ?
Example:
typedef
uint_16 /* redeclare as */ code16;
uint_32 /* redeclare as */ code32;
// option 1
struct code64
{
packed uint_32 A, B;
};
// option 2
struct code64
{
aligned uint32_t A, B;
};
// option 3
typedef
packed uint_16 code64[4];
void Example( )
{
code64 A = Foo();
code64 B = Bar();
code64 Q = Zaz(A, B);
}
As the tags indicates, I want it to be compiled, both, in "C" and "C++".
I already search for this subject in other questions, on stackoverflow.
A struct works like a simple type for the purpose of passing it around, taking its address, sizeof, etc. To just store some unstructured data inside, make a struct that contains a single array field.
typedef struct {
uint8_t a[16];
} code128;
Depending on what you want to store, you may want to put more structure in your struct. For example, a UUID is formally defined as having fields of a certain size, some of which are stored in platform endianness. (However not all software cares about endianness, and modern software often treats UUIDs as just a bunch of bytes.)
typedef struct {
uint32_t time_low;
uint16_t time_mid;
uint16_t time_hi_and_version;
uint8_t clock_seq_hi_and_res;
uint8_t clock_seq_low;
uint8_t node[6];
For a 128-bit time, what this could be depends on how the time is counted. It's common to represent time as a 64-bit number of seconds and a number of fractions such as nanoseconds or 2^64th of a second. For example, modern Unix systems have this type:
typedef struct timeval {
uint64_t tv_sec;
uint64_t tv_usec;
};
You can't do arithmetic on these structures, but arithmetic is meaningless on UUID anyway, and arithmetic on time is usually not direct arithmetic because time with subsecond precision is usually not represented as a single number.
If you do need arithmetic, some platforms offer uint128_t, but that's not portable (any C or C++ implementation must offer an integer type that's at least 64 bits, but they don't have to go beyond that). For example GCC and Clang both offer double-width integer types, so they offer uint128_t on machines where the CPU has 64-bit registers, but only up to uint64_t on 32-bit CPUs.
I'm trying to cast a struct into a char vector.
I wanna send my struct casted in std::vector throw a UDP socket and cast it back on the other side. Here is my struct whith the PACK attribute.
#define PACK( __Declaration__ ) __pragma( pack(push, 1) ) __Declaration__ __pragma( pack(pop) )
PACK(struct Inputs
{
uint8_t structureHeader;
int16_t x;
int16_t y;
Key inputs[8];
});
Here is test code:
auto const ptr = reinterpret_cast<char*>(&in);
std::vector<char> buffer(ptr, ptr + sizeof in);
//send and receive via udp
Inputs* my_struct = reinterpret_cast<Inputs*>(&buffer[0]);
The issue is:
All works fine except my uint8_t or int8_t.
I don't know why but whenever and wherever I put a 1Bytes value in the struct,
when I cast it back the value is not readable (but the others are)
I tried to put only 16bits values and it works just fine even with the
maximum values so all bits are ok.
I think this is something with the alignment of the bytes in the memory but i can't figure out how to make it work.
Thank you.
I'm trying to cast a struct into a char vector.
You cannot cast an arbitrary object to a vector. You can cast your object to an array of char and then copy that array into a vector (which is actually what your code is doing).
auto const ptr = reinterpret_cast<char*>(&in);
std::vector<char> buffer(ptr, ptr + sizeof in);
That second line defines a new vector and initializes it by copying the bytes that represent your object into it. This is reasonable, but it's distinct from what you said you were trying to do.
I think this is something with the alignment of the bytes in the memory
This is good intuition. If you hadn't told the compiler to pack the struct, it would have inserted padding bytes to ensure each field starts at its natural alignment. The fact that the operation isn't reversible suggests that somehow the receiving end isn't packed exactly the same way. Are you sure the receiving program has exactly the same packing directive and struct layout?
On x86, you can get by with unaligned data, but you may pay a large performance cost whenever you access an unaligned member variable. With the packing set to one, and the first field being odd-sized, you've guaranteed that the next fields will be unaligned. I'd urge you to reconsider this. Design the struct so that all the fields fall at their natural alignment boundaries and that you don't need to adjust the packing. This may make your struct a little bigger, but it will avoid all the alignment and performance problems.
If you want to omit the padding bytes in your wire format, you'll have to copy the relevant fields byte by byte into the wire format and then copy them back out on the receiving end.
An aside regarding:
#define PACK( __Declaration__ ) __pragma( pack(push, 1) ) __Declaration__ __pragma( pack(pop) )
Identifiers that begin with underscore and a capital letter or with two underscores are reserved for "the implementation," so you probably shouldn't use __Declaration__ as the macro's parameter name. ("The implementation" refers to the compiler, the standard library, and any other runtime bits the compiler requires.)
1
vector class has dynamically allocated memory and uses pointers inside. So you can't send the vector (but you can send the underlying array)
2
SFML has a great class for doing this called sf::packet. It's free, open source, and cross-platform.
I was recently working on a personal cross platform socket library for use in other personal projects and I eventually quit it for SFML. There's just TOO much to test, I was spending all my time testing to make sure stuff worked and not getting any work done on the actual projects I wanted to do.
3
memcpy is your best friend. It is designed to be portable, and you can use that to your advantage.
You can use it to debug. memcpy the thing you want to see into a char array and check that it matches what you expect.
4
To save yourself from having to do tons of robustness testing, limit yourself to only chars, 32-bit integers, and 64-bit doubles. You're using different compilers? struct packing is compiler and architecture dependent. If you have to use a packed struct, you need to guarantee that the packing is working as expected on all platforms you will be using, and that all platforms have the same endianness. Obviously, that's what you're having trouble with and I'm sorry I can't help you more with that. I would I would recommend regular serializing and would definitely avoid struct packing if I was trying to make portable sockets.
If you can make those guarantees that I mentioned, sending is really easy on LINUX.
// POSIX
void send(int fd, Inputs& input)
{
int error = sendto(fd, &input, sizeof(input), ..., ..., ...);
...
}
winsock2 uses a char* instead of a void* :(
void send(int fd, Inputs& input)
{
char buf[sizeof(input)];
memcpy(buf, &input, sizeof(input));
int error = sendto(fd, buf, sizeof(input), ..., ..., ...);
...
}
Did you tried the most simple approach of:
unsigned char *pBuff = (unsigned char*)∈
for (unsigned int i = 0; i < sizeof(Inputs); i++) {
vecBuffer.push_back(*pBuff);
pBuff++;
}
This would work for both, pack and non pack, since you will iterate the sizeof.
I have embedded device connected to PC
and some big struct S with many fields and arrays of custom defined type FixedPoint_t.
FixedPoint_t is a templated POD class with exactly one data member that vary in size from char to long depending on template params. Anyway it passes static_assert((std::is_pod<FixedPoint_t<0,8,8> >::value == true),"");
It will be good if this big struct has compatible underlaying memory representation on both embedded system and controlling PC. This allows significant simplification of communication protocol to commands like "set word/byte with offset N to value V". Assume endianess is the same on both platforms.
I see 3 solutions here:
Use something like #pragma packed on both sides.
BUT i got warning when i put attribute((packed)) to struct S declaration
warning: ignoring packed attribute because of unpacked non-POD field.
This is because FixedPoint_t is not declared as packed.
I don't want declare it as packed because this type is widely used in whole program and packing can lead to performance drop.
Make correct struct serialization. This is not acceptable because of code bloat, additional RAM usege...Protocol will be more complicated because i need random access to the struct. Now I think this is not an option.
Control padding manually. I can add some field, reorder others...Just to acheive no padding on both platforms. This will satisfy me at the moment. But i need good way to write a test that shows me is padding is there or not.
I can compare sum of sizeof() each field to sizeof(struct).
I can compare offsetof() each struct field on both platforms.
Both variants are ugly enough...
What do you recommend? Especially i am interested in manual padding controling and automaic padding detection in tests.
EDIT: Is it sufficient to compare sizeof(big struct) on two platforms to detect layout compatibility(assume endianess is equal)?? I think size should not match if padding will be different.
EDIT2:
//this struct should have padding on 32bit machine
//and has no padding on 8bit
typedef struct
{
uint8_t f8;
uint32_t f32;
uint8_t arr[5];
} serialize_me_t;
//count of members in struct
#define SERTABLE_LEN 3
//one table entry for each serialize_me_t data member
static const struct {
size_t width;
size_t offset;
// size_t cnt; //why we need cnt?
} ser_des_table[SERTABLE_LEN] =
{
{ sizeof(serialize_me_t::f8), offsetof(serialize_me_t, f8)},
{ sizeof(serialize_me_t::f32), offsetof(serialize_me_t, f32)},
{ sizeof(serialize_me_t::arr), offsetof(serialize_me_t, arr)},
};
void serialize(void* serialize_me_ptr, char* buf)
{
const char* struct_ptr = (const char*)serialize_me_ptr;
for(int i=0; i<SERTABLE_LEN; I++)
{
struct_ptr += ser_des_table[i].offset;
memcpy(buf, struct_ptr, ser_des_table[i].width );
buf += ser_des_table[i].width;
}
}
I strongly recommend to use option 2:
You are save for future changes (new PCD/ABI, compiler, platform, etc.)
Code-bloat can be kept to a minimum if well thought. There is just one function needed per direction.
You can create the required tables/code (semi-)automatically (I use Python for such). This way both sides will stay in sync.
You definitively should add a CRC to the data anyway. As you likely do not want to calculate this in the rx/tx-interrupt, you'll have to provide an array anyway.
Using a struct directly will soon become a maintenance nightmare. Even worse if someone else has to track this code.
Protocols, etc. tend to be reused. If it is a platform with different endianess, the other approach goes bang.
To create the data-structs and ser/des tables, you can use offsetof to get the offset of each type in the struct. If that table is made an include-file, it can be used on both sides. You can even create the struct and table e.g. by a Python script. Adding that to the build-process ensures it is always up-to-date and you avoid additional typeing.
For instance (in C, just to get idea):
// protocol.inc
typedef struct {
uint32_t i;
uint 16_t s[5];
uint32_t j;
} ProtocolType;
static const struct {
size_t width;
size_t offset;
size_t cnt;
} ser_des_table[] = {
{ sizeof(ProtocolType.i), offsetof(ProtocolType.i), 1 },
{ sizeof(ProtocolType.s[0]), offsetof(ProtocolType.s), 5 },
...
};
If not created automatically, I'd use macros to generate the data. Possibly by including the file twice: one to generate the struct definition and one for the table. This is possible by redefining the macros in-between.
You should care about the representation of signed integers and floats (implementation defined, floats are likely IEEE754 as proposed by the standard).
As an alternative to the width field, you can use an "type" code (e.g. a char which maps to an implementation-defined type. This way you can add custom types with same width, but different encoding (e.g. uint32_t and IEEE754-float). This will completely abstract the network protocol encoding from the physical machine (the best solution). Note noting hinders you from using common encodings which do not complicate code a single bit (literally).
This is related to my question asked here today on SO. Is there a better way to build a packet to send over serial rather than doing this:
unsigned char buff[255];
buff[0] = 0x02
buff[1] = 0x01
buff[2] = 0x03
WriteFile(.., buff,3, &dwBytesWrite,..);
Note: I have about twenty commands to send, so if there was a better way to send these bytes to the serial device in a more concise manner rather than having to specify each byte, it would be great. Each byte is hexadecimal, with the last byte being the checksum. I should clarify that I know I will have to specify each byte to build the commands, but is there a better way than having to specify each array position?
You can initialize static buffers like so:
const unsigned char command[] = {0x13, 0x37, 0xf0, 0x0d};
You could even use these to initialize non-const buffers and then replace only changing bytes by index.
Not sure what you're asking. If you ask about the problem of setting the byte one by one and messing up the data, usually this is doen with a packed struct with members having meaningful names. Like:
#pragma push(pack)
#pragma pack(1)
struct FooHeader {
uint someField;
byte someFlag;
dword someStatus;
};
#pragma pack(pop)
FooHeader hdr;
hdr.someField = 2;
hdr.someFlag = 3;
hdr.someStatus = 4;
WriteFile(..., sizeof(hdr), &hdr);
Is there a better way to build a packet than assembling it byte by byte?
Yes, but it will require some thought and some careful engineering. Many of the other answers tell you other mechanisms by which you can put together a sequence of bytes in C++. But I suggest you design an abstraction that represents a part of a packet:
class PacketField {
void add_to_packet(Packet p);
};
Then you can define various subclasses:
Add a single byte to the packet
Add a 16-bit integer in big-endian order. Another for little-endian. Other widths besides 16.
Add a string to the packet; code the string by inserting the length and then the bytes.
You also can define a higher-order version:
PacketField sequence(PacketField first, PacketField second);
Returns a field that consists of the two arguments in sequence. If you like operator overloading you could overload this as + or <<.
Your underlying Packet abstraction will just be an extensible sequence of bytes (dynamic array) with some kind of write method.
If you wind up programming a lot of network protocols, you'll find this sort of design pays off big time.
Edit: The point of the PacketField class is composability and reuse:
By composing packet fields you can create more complex packet fields. For example, you could define "add a TCP header" as a function from PacketFields to PacketFields.
With luck you build up a library of PacketFields that are specific to your application or protocol family or whatever. Then you reuse the fields in the library.
You can create subclasses of PacketField that take extra parameters.
It's quite possibly that you can do something equally nice without having to have this extra level of indirection; I'm recommending it because I've seen it used effectively in other applications. You are decoupling the knowledge of how to build a packet (which can be applied to any packet, any time) from the act of actually building a particular packet. Separating concerns like this can help reuse.
Yes, there is a better method. Have your classes read from and write to a packed buffer. You could even implement this as an interface. Templates would help to.
An example of writing:
template <typename Member_Type>
void Store_Value_In_Buffer(const Member_Type&, member,
unsigned char *& p_buffer)
{
*((Member_Type *)(p_buffer)) = member;
p_buffer += sizeof(Member_Type);
return;
}
struct My_Class
{
unsigned int datum;
void store_to_buffer(unsigned char *& p_buffer)
{
Store_Value_In_Buffer(datum, buffer);
return;
}
};
//...
unsigned char buffer[256];
unsigned char * p_buffer(buffer);
MyClass object;
object.datum = 5;
object.store_to_buffer(p_buffer);
std::cout.write(p_buffer, 256);
Part of the interface is also to query the objects for the size that they would occupy in the buffer, say a method size_in_buffer. This is left as an exercise for the reader. :-)
There is a much better way, which is using structs to set the structures. This is usually how network packets are built on a low level.
For example, say you have packets which have an id, length, flag byte, and data, you'd do something like this:
struct packet_header {
int id;
byte length;
byte flags;
};
byte my_packet[] = new byte[100];
packet_header *header = &my_packet;
header->id = 20;
header->length = 10; // This can be set automatically by a function, maybe?
// etc.
header++; // Header now points to the data section.
Do note that you're going to have to make sure that the structures are "packed", i.e. when you write byte length, it really takes up a byte. Usually, you'd achieve this using something like #pragma pack or similar (you'll have to read about your compiler's pragma settings).
Also, note that you should probably use functions to do common operations. For example, create a function which gets as input the size, data to send, and other information, and fills out the packet header and data for you. This way, you can perform calculations about the actual size you want to write in the length field, you can calculate the CRC inside the function, etc.
Edit: This is a C-centric way of doing things, which is the style of a lot of networking code. A more C++-centric (object oriented) approach could also work, but I'm less familiar with them.
const char *c = "\x02\x02\x03";
I am trying to perform a less-than-32bit read over the PCI bus to a VME-bridge chip (Tundra Universe II), which will then go onto the VME bus and picked up by the target.
The target VME application only accepts D32 (a data width read of 32bits) and will ignore anything else.
If I use bit field structure mapped over a VME window (nmap'd into main memory) I CAN read bit fields >24 bits, but anything less fails. ie :-
struct works {
unsigned int a:24;
};
struct fails {
unsigned int a:1;
unsigned int b:1;
unsigned int c:1;
};
struct main {
works work;
fails fail;
}
volatile *reg = function_that_creates_and_maps_the_vme_windows_returns_address()
This shows that the struct works is read as a 32bit, but a read via fails struct of a for eg reg->fail.a is getting factored down to a X bit read. (where X might be 16 or 8?)
So the questions are :
a) Where is this scaled down? Compiler? OS? or the Tundra chip?
b) What is the actual size of the read operation performed?
I basiclly want to rule out everything but the chip. Documentation on that is on the web, but if it can be proved that the data width requested over the PCI bus is 32bits then the problem can be blamed on the Tundra chip!
edit:-
Concrete example, code was:-
struct SVersion
{
unsigned title : 8;
unsigned pecversion : 8;
unsigned majorversion : 8;
unsigned minorversion : 8;
} Version;
So now I have changed it to this :-
union UPECVersion
{
struct SVersion
{
unsigned title : 8;
unsigned pecversion : 8;
unsigned majorversion : 8;
unsigned minorversion : 8;
} Version;
unsigned int dummy;
};
And the base main struct :-
typedef struct SEPUMap
{
...
...
UPECVersion PECVersion;
};
So I still have to change all my baseline code
// perform dummy 32bit read
pEpuMap->PECVersion.dummy;
// get the bits out
x = pEpuMap->PECVersion.Version.minorversion;
And how do I know if the second read wont actually do a real read again, as my original code did? (Instead of using the already read bits via the union!)
Your compiler is adjusting the size of your struct to a multiple of its memory alignment setting. Almost all modern compilers do this. On some processors, variables and instructions have to begin on memory addresses that are multiples of some memory alignment value (often 32-bits or 64-bits, but the alignment depends on the processor architecture). Most modern processors don't require memory alignment anymore - but almost all of them see substantial performance benefit from it. So the compilers align your data for you for the performance boost.
However, in many cases (such as yours) this isn't the behavior you want. The size of your structure, for various reasons, can turn out to be extremely important. In those cases, there are various ways around the problem.
One option is to force the compiler to use different alignment settings. The options for doing this vary from compiler to compiler, so you'll have to check your documentation. It's usually a #pragma of some sort. On some compilers (the Microsoft compilers, for instance) it's possible to change the memory alignment for only a very small section of code. For example (in VC++):
#pragma pack(push) // save the current alignment
#pragma pack(1) // set the alignment to one byte
// Define variables that are alignment sensitive
#pragma pack(pop) // restore the alignment
Another option is to define your variables in other ways. Intrinsic types are not resized based on alignment, so instead of your 24-bit bitfield, another approach is to define your variable as an array of bytes.
Finally, you can just let the compilers make the structs whatever size they want and manually record the size that you need to read/write. As long as you're not concatenating structures together, this should work fine. Remember, however, that the compiler is giving you padded structs under the hood, so if you make a larger struct that includes, say, a works and a fails struct, there will be padded bits in between them that could cause you problems.
On most compilers, it's going to be darn near impossible to create a data type smaller than 8 bits. Most architectures just don't think that way. This shouldn't be a huge problem because most hardware devices that use datatypes of smaller than 8-bits end up arranging their packets in such a way that they still come in 8-bit multiples, so you can do the bit manipulations to extract or encode the values on the data stream as it leaves or comes in.
For all of the reasons listed above, a lot of code that works with hardware devices like this work with raw byte arrays and just encode the data within the arrays. Despite losing a lot of the conveniences of modern language constructs, it ends up just being easier.
I am wondering about the value of sizeof(struct fails). Is it 1? In this case, if you perform the read by dereferencing a pointer to a struct fails, it looks correct to issue a D8 read on the VME bus.
You can try to add a field unsigned int unused:29; to your struct fails.
The size of a struct is not equal to the sum of the size of its fields, including bit fields. Compilers are allowed, by the C and C++ language specifications, to insert padding between fields in a struct. Padding is often inserted for alignment purposes.
The common method in embedded systems programming is to read the data as an unsigned integer then use bit masking to retrieve the interesting bits. This is due to the above rule that I stated and the fact that there is no standard compiler parameter for "packing" fields in a structure.
I suggest creating an object ( class or struct) for interfacing with the hardware. Let the object read the data, then extract the bits as bool members. This puts the implementation as close to the hardware. The remaining software should not care how the bits are implemented.
When defining bit field positions / named constants, I suggest this format:
#define VALUE (1 << BIT POSITION)
// OR
const unsigned int VALUE = 1 << BIT POSITION;
This format is more readable and has the compiler perform the arithmetic. The calculation takes place during compilation and has no impact during run-time.
As an example, the Linux kernel has inline functions that explicitly handle memory-mapped IO reads and writes. In newer kernels it's a big macro wrapper that boils down to an inline assembly movl instruction, but it older kernels it was defined like this:
#define readl(addr) (*(volatile unsigned int *) (addr))
#define writel(b,addr) ((*(volatile unsigned int *) (addr)) = (b))
Ian - if you want to be sure as to the size of things you're reading/writing I'd suggest not using structs like this to do it - it's possible the sizeof of the fails struct is just 1 byte - the compiler is free to decide what it should be based on optimizations etc- I'd suggest reading/writing explicitly using int's or generally the things you need to assure the sizes of and then doing something else like converting to a union/struct where you don't have those limitations.
It is the compiler that decides what size read to issue. To force a 32 bit read, you could use a union:
union dev_word {
struct dev_reg {
unsigned int a:1;
unsigned int b:1;
unsigned int c:1;
} fail;
uint32_t dummy;
};
volatile union dev_word *vme_map_window();
If reading the union through a volatile-qualified pointer isn't enough to force a read of the whole union (I would think it would be - but that could be compiler-dependent), then you could use a function to provide the required indirection:
volatile union dev_word *real_reg; /* Initialised with vme_map_window() */
union dev_word * const *reg_func(void)
{
static union dev_word local_copy;
static union dev_word * const static_ptr = &local_copy;
local_copy = *real_reg;
return &static_ptr;
}
#define reg (*reg_func())
...then (for compatibility with the existing code) your accesses are done as:
reg->fail.a
The method described earlier of using the gcc flag -fstrict-volatile-bitfields and defining bitfield variables as volatile u32 works, but the total number of bits defined must be greater than 16.
For example:
typedef union{
vu32 Word;
struct{
vu32 LATENCY :3;
vu32 HLFCYA :1;
vu32 PRFTBE :1;
vu32 PRFTBS :1;
};
}tFlashACR;
.
tFLASH* const pFLASH = (tFLASH*)FLASH_BASE;
#define FLASH_LATENCY pFLASH->ACR.LATENCY
.
FLASH_LATENCY = Latency;
causes gcc to generate code
.
ldrb r1, [r3, #0]
.
which is a byte read. However, changing the typedef to
typedef union{
vu32 Word;
struct{
vu32 LATENCY :3;
vu32 HLFCYA :1;
vu32 PRFTBE :1;
vu32 PRFTBS :1;
vu32 :2;
vu32 DUMMY1 :8;
vu32 DUMMY2 :8;
};
}tFlashACR;
changes the resultant code to
.
ldr r3, [r2, #0]
.
I believe the only solution is to
1) edit/create my main struct as all 32bit ints (unsigned longs)
2) keep my original bit-field structs
3) each access I require,
3.1) I have to read the struct member as a 32bit word, and cast it into the bit-field struct,
3.2) read the bit-field element I require. (and for writes, set this bit-field, and write the word back!)
(1) Which is a same, because then I lose the intrinsic types that each member of the "main/SEPUMap" struct are.
End solution :-
Instead of :-
printf("FirmwareVersionMinor: 0x%x\n", pEpuMap->PECVersion);
This :-
SPECVersion ver = *(SPECVersion*)&pEpuMap->PECVersion;
printf("FirmwareVersionMinor: 0x%x\n", ver.minorversion);
Only problem I have is writting! (Writes are now Read/Modify/Writes!)
// Read - Get current
_HVPSUControl temp = *(_HVPSUControl*)&pEpuMap->HVPSUControl;
// Modify - set to new value
temp.OperationalRequestPort = true;
// Write
volatile unsigned int *addr = reinterpret_cast<volatile unsigned int*>(&pEpuMap->HVPSUControl);
*addr = *reinterpret_cast<volatile unsigned int*>(&temp);
Just have to tidy that code up into a method!
#define writel(addr, data) ( *(volatile unsigned long*)(&addr) = (*(volatile unsigned long*)(&data)) )
I had same problem on ARM using GCC compiler, where write into memory is only through bytes rather than 32bit word.
The solution is to define bit-fields using volatile uint32_t (or required size to write):
union {
volatile uint32_t XY;
struct {
volatile uint32_t XY_A : 4;
volatile uint32_t XY_B : 12;
};
};
but while compiling you need add to gcc or g++ this parameter:
-fstrict-volatile-bitfields
more in gcc documentation.