Embedded device with a Cortex-M3 MCU (STM32F1). It has embedded flash (64K).
The MCU firmware can reprogram flash sectors at runtime; this is done through the Flash Memory Controller (FMC) registers (so it's not as simple as a = b). The FMC takes a buffer pointer and burns the data to a given flash sector.
I want to use the last flash sector for device configuration parameters.
Parameters are stored in a packed struct with arrays and contain some custom classes.
Parameters can be changed at runtime (copy to RAM, change and burn back to flash using FMC).
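The copy-modify-burn cycle might be sketched like this; `Params`, `FlashWrite`, and the RAM stand-in for the flash sector are all hypothetical names used for illustration (a real target would call the FMC erase/program routine on the sector's fixed address):

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical parameter block; kept trivially copyable so memcpy is valid.
struct Params {
    uint32_t baud_rate;
    uint16_t device_id;
};

// A RAM image stands in for the last flash sector in this sketch.
static Params flash_sector = {9600u, 1u};

// Hypothetical stand-in for the FMC erase + program sequence.
static void FlashWrite(Params* sector, const Params* src) {
    std::memcpy(sector, src, sizeof(Params));
}

void UpdateBaudRate(uint32_t new_rate) {
    Params ram_copy;
    std::memcpy(&ram_copy, &flash_sector, sizeof(Params));  // copy to RAM
    ram_copy.baud_rate = new_rate;                          // change
    FlashWrite(&flash_sector, &ram_copy);                   // burn back
}

uint32_t DemoUpdate() {
    UpdateBaudRate(115200u);
    return flash_sector.baud_rate;
}
```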
So there are some questions about that:
The state (bitwise) of the parameters struct is changed by the FMC hardware.
The C++ compiler does not know whether it was changed or not.
Does this mean I should declare all struct members as volatile?
I think YES.
The struct should be statically initialized (default parameters) at compile time. The struct should be POD (TriviallyCopyable and standard-layout). Remember, there are some custom classes in there, so I keep in mind that these classes should be POD too.
BUT there are some problems:
cppreference.com
The only trivially copyable types are scalar types, trivially copyable
classes, and arrays of such types/classes (possibly const-qualified,
but not volatile-qualified).
That means I can't keep my class both POD and volatile?
So how would I solve the problem?
It is possible to use only scalar types in the parameters struct, but that may result in much less clean code around the config processing...
P.S.
It works even without volatile, but I am afraid that someday some smart LTO compiler will see a statically initialized struct that is never changed (by C++) and optimize away some accesses to the underlying memory addresses. That would mean freshly programmed parameters are not applied, because the old values were inlined by the compiler.
EDIT: It is possible to solve the problem without using volatile, and that seems to be more correct.
You need to define the config struct variable in a separate translation unit (.cpp file) and leave it uninitialized, to avoid value substitution during LTO. If you are not using LTO, everything will be OK, because optimizations are done one translation unit at a time, so a variable with static storage duration and external linkage defined in a dedicated translation unit should not be optimized out. Only LTO can throw it away or substitute its values without issuing memory fetches, especially when the variable is defined as const. I think it is OK to initialize the variable if you are not using LTO.
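A minimal sketch of that layout, with both files shown in one listing; `Config` and `g_config` are hypothetical names, and the definition is value-initialized here so the sketch links on a host (the point above is to keep the definition in its own .cpp so LTO cannot substitute the values):

```cpp
#include <cstdint>

// config.h -- declaration only; every user includes this.
struct Config {
    uint32_t magic;
    uint32_t baud;
};
extern const Config g_config;

// config_storage.cpp -- the single definition, kept in its own
// translation unit. It is value-initialized here so the host sketch
// links; the suggestion above is to let the linker place it in the
// flash sector instead of baking values in at compile time.
const Config g_config{};
```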
You have some choices depending on your compiler:
You can declare a pointer to the structure and initialize the pointer
to the region.
Tell the compiler where the variable should reside
Pointer to Flash
Declare a pointer to the structure.
Assign the pointer to the proper address in Flash.
Access the variables by dereferencing the pointer.
The pointer should be declared, and assigned, as a constant pointer to constant data.
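A sketch of the pointer approach, assuming a hypothetical `Settings` struct; a RAM image stands in for the flash sector so the example runs anywhere, where a real target would cast the sector's fixed address:

```cpp
#include <cstdint>

struct Settings {
    uint16_t mode;
    uint16_t flags;
};

// RAM image standing in for the flash sector; a real target would use
// the sector's fixed address instead, e.g.
//   reinterpret_cast<const Settings*>(0x0800FC00)  // address is made up
static const Settings flash_image = {3u, 0x00FFu};

// Constant pointer to constant data, as recommended above.
const Settings* const settings = &flash_image;

uint16_t CurrentMode() {
    return settings->mode;  // access by dereferencing the pointer
}
```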
Telling the compiler the address of the variable
Some compilers allow you to place a variable in a specific memory region. The first step is to create a region in the linker command file. The next step is to tell the compiler that the variable is in that region.
Again, the variable should be declared as "static const": "static" because there is only one instance, and "const" because Flash memory is read-only most of the time.
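With GCC or Clang, the placement step might look like this sketch; the section name `.config` is an assumption and would also have to be defined in the linker command file on a real target:

```cpp
#include <cstdint>

struct Settings {
    uint16_t mode;
    uint16_t flags;
};

// GCC/Clang extension; the ".config" section must also appear in the
// linker command file, mapped onto the flash sector (an assumption of
// this sketch).
__attribute__((section(".config")))
static const Settings settings = {1u, 0x8000u};

uint16_t ReadMode() {
    return settings.mode;
}
```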
Flash Memory: Volatile vs. Const
In most cases, the Flash memory, however it is programmed, is read-only. In fact, the only way you can read the data in Flash is to lock it, i.e. make it read-only. In general, it won't be changed without the program's involvement.
Most Flash memories are programmed by software. Normally, this is your program. If your program is going to reprogram the Flash, it knows the values have changed. This is akin to writing to RAM: the program changed the value, not the hardware. Thus the Flash is not volatile.
My experience is that Flash can be programmed by other means, usually when your program is not running. In that case, it is still not volatile, because your program is not running. The Flash is still read-only.
The Flash will be volatile if, and only if, another task or thread of execution programs it while your thread of execution is active. Even then, I would not consider this case volatile. It is a matter of synchronicity: if the flash is modified, some listeners should be notified.
Summary
The Flash memory is best treated as read-only memory. Variables residing in Flash are accessed via pointer for best portability, although some compilers and linkers allow you to declare variables at specific, hard-coded addresses. The variables should be declared as const static so that the compiler can emit code to access them directly, rather than copying them onto the stack. If the Flash is programmed by another task or thread of execution, this is a synchronicity issue, not one of volatility. In rare cases, the Flash is programmed by an external source while your program is executing.
Your program should provide checksums or other methods to determine if the content has changed, since the last time it was checked.
DO NOT HAVE COMPILER INITIALIZE VARIABLES FROM FLASH.
This is not really portable. A better method is to have your initialization code load the variable(s) from flash. Making the compiler load your variable from a different segment requires a lot of work with the internals of the compiler and linker; a lot more than initializing a pointer to the address in the Flash.
By reprogramming the flash, you are changing the underlying object's representation. The volatile qualifier is the appropriate solution here, to ensure that the changes in the data are not optimized away.
You would like a declaration to be: const volatile Settings settings;
The drawback is that volatile prevents static initialization of your object. This stops you from using the linker to put the initialized object in its appropriate memory address.
You would like the definition to be: const Settings settings = { ... };
Luckily, you can initialize a const object and access it as a const volatile.
// Header file
struct Settings { ... };
extern const volatile Settings& settings;
// Source file
static const Settings init_settings = { ... };
const volatile Settings& settings = init_settings;
The init_settings object is statically initialized, but all accesses through the settings reference are treated as volatile.
Please note, though, modifying an object defined as const is undefined behavior.
Related
I saw that constexpr variables are highly discussed on stack overflow. But there is one thing no one talks about:
Where are constexpr variables stored?
Everyone knows the memory layout of C and C++ programs:
stack
heap
static
text
Operating Systems (like Linux)
For operating systems, the executable code and all the static variables get copied from the hard disk into the allocated text, static, etc. areas of RAM. From there the program starts as a process.
Embedded Systems (like Atmel Controller)
For an embedded system, this is different. Here the executable code and the literals are permanently stored in flash memory; only the static variables get copied into RAM.
text or static area
I understand the benefits of constexpr compared to #defines, but for an embedded system programmer, there is always the performance question. On embedded systems, RAM is an expensive resource. So, I need to know if constexpr variables get stored in the text or in the static area. Or more precisely, are they permanently stored in the flash memory or do they get really created as variable into the RAM of an embedded system?
constexpr does not change the storage class of a variable. So the presence of constexpr does not change where the compiler can put the variable.
There are however some important characteristics of constexpr variables. Any variable could have these characteristics, but the constexpr qualifier specifically requires:
The variable cannot be changed (is implicitly const).
The variable is initialized with a compile-time constant expression.
Given these facts, the compiler can put the storage basically anywhere. If the type matches a register type and the value is small enough to be loaded via a single "load literal" opcode, the compiler could convert every use of the variable into simply loading the literal value into a register.
Of course, if you start doing things like getting the address of the variable, it may have to take up actual storage. But even that depends on how good the compiler is at inlining.
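A small illustration of that point, with hypothetical names: used only by value, `MASK` can fold into an immediate operand, while taking its address forces the compiler to materialize storage for it.

```cpp
#include <cstdint>

// Hypothetical constant for illustration.
constexpr uint32_t MASK = 0x0000FFFFu;

// Used only by value, MASK can be folded into an immediate operand.
uint32_t ApplyMask(uint32_t v) {
    return v & MASK;
}

// Taking the address forces the compiler to materialize an object,
// typically in .rodata (which lives in flash on many embedded toolchains).
const uint32_t* MaskAddress() {
    return &MASK;
}
```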
In embedded programming you often need to set pointers that point to a physical address. The address is non-relocatable and fixed. These are not set by the linker, as typically they represent registers or, in this case, calibration data located at a predetermined address in OTP memory. This data is set when the device is first tested in production by the chip manufacturer.
so the first attempt was:
static constexpr uint16_t *T30_CAL = reinterpret_cast<uint16_t *>(0x1FFFF7B8u);
But that leads to the following warning/error under GCC and is 'illegal' according to the standard (C++14):
..xyz/xxxx/calibration.cpp:23:40: error: reinterpret_cast from integer to pointer
Now I can fudge it with
constexpr uint32_t T30_ADDR = 0x1FFFF7B8u;
static constexpr inline uint16_t *T30_CAL(){
return reinterpret_cast<uint16_t *>(T30_ADDR);
}
which compiles without warnings but ......
I suppose GCC can optionally compile this to a function call instead of a compile-time constant, though it does inline it every time.
Is there a simpler and more standards compliant way of doing this ?
For embedded code these definitions are required all the time, so it would be nice if there was a simple way of doing this that does not require function definitions.
The answers to the previous questions generally resulted in an answer that says this is not allowed by the standard, and leave it at that.
That is not really what I want. I need a compliant way of using C++ to generate compile-time constant pointers to a fixed address. I want to do it without using macros, as they sprinkle my code with casts that cause problems with compliance checkers. It results in the need to get compliance exceptions in multiple places rather than one. Each exception is a process and takes time and effort.
constexpr guarantees, on embedded systems, that the constant is placed in the .text section (flash), whilst const does not: it may be placed in valuable RAM and initialised by the startup code. Typically embedded devices have much more flash than RAM. Also, the code to access variables in RAM is often much less efficient, as it typically involves at least two memory accesses on embedded targets such as ARM: one to load the variable's RAM address, and a second to load the actual constant pointer value from the variable's location. constexpr results in the constant pointer being coded directly into the instruction stream, or in a single constant load.
If this were just a single instance it would not be an issue, but you generally have many different peripherals, each controlled via their own register set, and then this becomes a problem.
A lot of the embedded code ends up reading and writing peripheral registers.
Use this instead:
static uint16_t * const T30_CAL = reinterpret_cast<uint16_t *>(0x1FFFF7B8u);
GCC will store T30_CAL in flash on an ARM target, not RAM. The point is that the 'const' must come after the '*' because it is T30_CAL that is const, not what T30_CAL points to.
As was already pointed out in the comments, reinterpret_cast is not allowed in a constant expression. This is because the compiler has to be able to evaluate a constexpr at compile time, but reinterpret_cast might need runtime instructions to do its job.
You already suggested using macros. This seems a fine way to me, because the compiler will definitely not produce any overhead. However, I would not suggest your second way of hiding the reinterpret_cast, because, as you said, a function is generated. This function will likely take more memory than an additional pointer would.
In any case, the most reasonable way seems to me, to just declare a const pointer. As soon as you use optimizations, the compiler will just insert the memory location into your executable instead of using a variable. (See https://godbolt.org/g/8KnUKg )
Quite likely this has been asked/answered before, but not sure how to phrase it best, a link to a previously answered question would be great.
If you define something like
char myChar = 'a';
I understand that this will take up one byte in memory (depending on implementation and assuming no unicode and so on, the actual number is unimportant).
But I would assume the compiler/computer would also need to keep a table of variable types, addresses (i.e. pointers), and possibly more. Otherwise it would have the memory reserved, but would not be able to do anything with it. So that's already at least a few more bytes of memory consumed per variable.
Is this a correct picture of what happens, or am I misunderstanding what happens when a program gets compiled/executed? And if the above is correct, is it more to do with compilation, or execution?
The compiler will keep track of the properties of a variable - its name, lifetime, type, scope, etc. This information will exist in memory only during compilation. Once the program has been compiled and the program is executed, however, all that is left is the object itself. There is no type information at run-time (except if you use RTTI, then there will be some, but only because you required it for your program to function - such as is required for dynamic_casting).
Everything that happens in the code that accesses the object has been compiled into a form that treats it exactly as a single byte (because it's a char). The address that the object is located at can only be known at run-time anyway. However, variables with automatic storage duration (like local variables), are typically located simply by some fixed offset from the current stack frame. That offset is hard-baked into the executable.
Whether a variable carries extra information depends on the type of the variable and your compiler options. If you use RTTI, extra information is stored. If you compile with debug information, there will also be extra overhead.
For native data types like your char example there is usually no overhead, unless you have structs, which can also contain padding bytes. If you define classes, there may be a virtual table associated with your class. And if you dynamically allocate memory, there will usually be some overhead alongside the allocated memory.
Sometimes a variable may not even exist, because the optimizer realizes that no storage is needed for it and can keep it in a register.
So in total, you cannot rely on counting your variables and summing up their sizes to calculate the amount of memory required, because there is not necessarily a 1:1 relation.
Some types can be determined at compile time, as in this code:
void foo(char c) {...}
it is obvious at compile time what the type of the variable c is.
In the case of inheritance you cannot know the real type of the variable at compile time, as in:
void draw(Drawable* drawable); // where drawable can be Circle, Line etc.
But the C++ compiler can help determine the type of the Drawable using dynamic_cast. In this case it uses the pointer to the virtual method table associated with the object to determine the real type.
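A minimal sketch of that mechanism; the polymorphic base gives each object a vtable pointer, which `dynamic_cast` consults at run time (all names are hypothetical):

```cpp
struct Drawable {
    virtual ~Drawable() = default;  // polymorphic: objects carry a vtable pointer
};
struct Circle : Drawable {};
struct Line   : Drawable {};

// dynamic_cast inspects the vtable pointer at run time to recover the type.
bool IsCircle(Drawable* d) {
    return dynamic_cast<Circle*>(d) != nullptr;
}

bool DemoCircle() { Circle c; return IsCircle(&c); }
bool DemoLine()   { Line l;   return IsCircle(&l); }
```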
While browsing open source code (from OpenCV), I came across the following type of code inside a method:
// copy class member to local variable for optimization
int foo = _foo; //where _foo is a class member
for (...) //a heavy loop that makes use of foo
From another question on SO I've concluded that the answer to whether or not this actually needs to be done or is done automatically by the compiler may be compiler/setting dependent.
My question is if it would make any difference if _foo were a static class member? Would there still be a point in this manual optimization, or is accessing a static class member no more 'expensive' than accessing a local variable?
P.S. - I'm asking out of curiosity, not to solve a specific problem.
Accessing a member means dereferencing the object in order to reach it.
As the member may change during execution (think threads), the compiler will read the value from memory each time it is accessed.
Using a local variable will allow the compiler to use a register for the value, as it can safely assume the value won't change from the outside. This way, the value is read only once from memory.
About your question concerning the static member: it's the same, as it can also be changed by another thread, for instance. The compiler will need to read that value from memory each time, too.
I think a local variable is more likely to participate in some optimization, precisely because it is local to the function: this fact can be used by the compiler, for example if it sees that nobody modifies the local variable, then the compiler may load it once, and use it in every iteration.
In the case of member data, the compiler may have to work harder to conclude that nobody modifies the member. Think about a multi-threaded application, and note that the memory model in C++11 is multi-threaded, which means some other thread might modify the member. So the compiler may not be able to conclude that nobody modifies it, and in consequence it has to emit code to load the member for every expression that uses it, possibly multiple times in a single iteration, in order to work with the updated value of the member.
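The pattern from the question can be sketched like this; `Processor` and the member name `_foo` follow the question, the rest of the names and values are hypothetical:

```cpp
struct Processor {
    int _foo = 7;  // class member, as in the question

    long SumScaled(const int* data, int n) {
        int foo = _foo;  // copy to a local: the compiler can keep it in a
                         // register without proving that nothing writes _foo
        long sum = 0;
        for (int i = 0; i < n; ++i)
            sum += static_cast<long>(data[i]) * foo;
        return sum;
    }
};

long Demo() {
    static const int data[3] = {1, 2, 3};
    Processor p;
    return p.SumScaled(data, 3);  // (1 + 2 + 3) * 7
}
```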
In this example _foo will be copied into a new local variable, so both cases are the same.
Static values are like any other variable; they are just stored in a different memory segment dedicated to static storage.
Reading a static class member is effectively like reading a global variable. They both have a fixed address. Reading a non-static one means first reading the this-pointer, adding an offset to the result and then reading that address. In other words, reading a non-static one requires more steps and memory accesses.
Sorry if this is simple, my C++ is rusty.
What is this doing? There is no assignment or function call as far as I can see. This code pattern is repeated many times in some code I inherited. If it matters it's embedded code.
*(volatile UINT16 *)&someVar->something;
edit: continuing from there, does the following additional code confirm Heath's suspicion? (taken exactly from the code, including the repetition, except the names have been changed to protect the innocent)
if (!WaitForNotBusy(50))
return ERROR_CODE_X;
*(volatile UINT16 *)& someVar->something;
if (!WaitForNotBusy(50))
return ERROR_CODE_X;
*(volatile UINT16 *)& someVar->something;
x = SomeData;
This is a fairly common idiom in embedded programming (though it should be encapsulated in a set of functions or macros) where a device register needs to be accessed. In many architectures, device registers are mapped to a memory address and are accessed like any other variable (though at a fixed address - either pointers can be used or the linker or a compiler extension can help with fixing the address). However, if the C compiler doesn't see a side effect to a variable access it can optimize it away - unless the variable (or the pointer used to access the variable) is marked as volatile.
So the expression;
*(volatile UINT16 *)&someVar->something;
will issue a 16-bit read at some offset (provided by the something structure element's offset) from the address stored in the someVar pointer. This read will occur and cannot be optimized away by the compiler due to the volatile keyword.
Note that some device registers perform some functionality even if they are simply read - even if the data read isn't otherwise used. This is quite common with status registers, where an error condition might be cleared after the read of the register that indicates the error state in a particular bit.
This is probably one of the more common reasons for the use of the volatile keyword.
So here's a long shot.
If that address points to a memory mapped region on a FPGA or other device, then the device might actually be doing something when you read that address.
I think the author's intent was to cause the compiler to emit memory barriers at these points. By evaluating the expression result of a volatile, the indication to the compiler is that this expression should not be optimized away, and should 'instantiate' the semantics of access to a volatile location (memory barriers, restrictions on optimizations) at each line where this idiom occurs.
This type of idiom could be "encapsulated" in a preprocessor macro (#define) in case another compiler has a different way of causing the same effect. For example, a compiler with the ability to directly encode read or write memory barriers might use the built-in mechanism rather than this idiom. Implementing this type of code inside a macro enables changing the method across your entire code base.
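One possible macro encapsulation, as a sketch; `REG_READ16` is a hypothetical name, and a plain variable stands in for the memory-mapped register so the example runs on a host:

```cpp
#include <cstdint>

// Hypothetical encapsulation of the idiom; a plain variable stands in
// for the memory-mapped device register in this sketch.
#define REG_READ16(addr) (*(volatile uint16_t *)(addr))

static volatile uint16_t fake_status = 0x0001u;

uint16_t PollStatus() {
    // The read itself is the side effect; volatile keeps the compiler
    // from eliding it even when the result is unused.
    return REG_READ16(&fake_status);
}
```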
EDIT: User sharth has a great point that if this code runs in an environment where the address of the pointer is a physical rather than virtual address (or a virtual address mapped to a specific physical address), then performing this read operation might cause some action at a peripheral device.
Generally this is bad code.
In C and C++, volatile guarantees very little and does not provide an implicit memory barrier. So this code is just quite wrong, unless it is written as
memory_barrier();
*(volatile UINT16 *)&someVar->something;
It is just bad code.
Explanation: volatile does not make a variable atomic!
Read this article: http://www.mjmwired.net/kernel/Documentation/volatile-considered-harmful.txt
This is why volatile should almost never be used in proper code.