How to set a constexpr pointer to a physical Address


In embedded programming you often need to set pointers that point to a physical address. The address is non-relocatable and fixed. These are not set by the linker, as they typically represent registers or, in this case, calibration data located at a predetermined address in OTP memory. This data is set when the device is first tested in production by the chip manufacturer.
So the first attempt was:
static constexpr uint16_t *T30_CAL = reinterpret_cast<uint16_t *>(0x1FFFF7B8u);
But that leads to the following error under GCC and is 'illegal' according to the standard (C++14).
..xyz/xxxx/calibration.cpp:23:40: error: reinterpret_cast from integer to pointer
Now I can fudge it with:
constexpr uint32_t T30_ADDR = 0x1FFFF7B8u;
static constexpr inline uint16_t *T30_CAL() {
    return reinterpret_cast<uint16_t *>(T30_ADDR);
}
which compiles without warnings but ......
I suppose GCC can optionally compile this to a function instead of a constexpr, though it does inline this every time.
Is there a simpler and more standards-compliant way of doing this?
For embedded code these definitions are required all the time, so it would be nice if there was a simple way of doing this that does not require function definitions.
The answers to previous questions generally say that this is not allowed by the standard and leave it at that.
That is not really what I want. I need a compliant way of using C++ to generate compile-time constant pointers to a fixed address. I want to do it without using macros, as that sprinkles my code with casts that cause problems with compliance checkers. It results in the need to get compliance exceptions in multiple places rather than one. Each exception is a process and takes time and effort.
Constexpr guarantees, on embedded systems, that the constant is placed in the .text section (flash), whilst const does not. It may be placed in valuable RAM and initialised by the .data startup code. Typically embedded devices have much more flash than RAM. Also, the code to access variables in RAM is often much less efficient, as it typically involves at least two memory accesses on embedded targets such as ARM: one to load the variable's RAM address and a second to load the actual constant pointer value from the variable's location. Constexpr results in the constant pointer being coded directly into the instruction stream or in a single constant load.
If this was just a single instance it would not be an issue, but you generally have many different peripherals, each controlled via their own register sets, and then this becomes a problem.
A lot of the embedded code ends up reading and writing peripheral registers.

Use this instead:
static uint16_t * const T30_CAL = reinterpret_cast<uint16_t *>(0x1FFFF7B8u);
GCC will store T30_CAL in flash on an ARM target, not RAM. The point is that the 'const' must come after the '*' because it is T30_CAL that is const, not what T30_CAL points to.
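A minimal sketch of that declaration, with compile-time checks showing which part is const. The address is the one from the question; the helper name t30_cal_address is made up for illustration, and the pointer is never dereferenced here because the address only means something on the original target.

```cpp
#include <cstdint>
#include <type_traits>

// Const pointer to (non-const) uint16_t at the address used in the question.
static std::uint16_t* const T30_CAL =
    reinterpret_cast<std::uint16_t*>(std::uintptr_t{0x1FFFF7B8u});

// The pointer object itself is const, so it cannot be reseated...
static_assert(std::is_const<decltype(T30_CAL)>::value,
              "T30_CAL itself is const");
// ...while the pointee type is still plain uint16_t.
static_assert(!std::is_const<std::remove_pointer<decltype(T30_CAL)>::type>::value,
              "the pointee is not const");

// Hypothetical helper that exposes the numeric address without dereferencing.
std::uintptr_t t30_cal_address() {
    return reinterpret_cast<std::uintptr_t>(T30_CAL);
}
```

Writing `const uint16_t *T30_CAL` instead would make the pointee const and leave the pointer itself modifiable, which is the opposite of what is wanted here.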

As was already pointed out in the comments, reinterpret_cast is not allowed in a constant expression. This is because the compiler has to be able to evaluate a constexpr at compile time, but the reinterpret_cast might use runtime instructions to do its job.
You already suggested using macros. That seems a fine way to me, because the compiler will definitely not produce any overhead. However, I would not suggest your second way of hiding the reinterpret_cast because, as you said, a function is generated. This function will likely take up far more memory than an additional pointer.
In any case, the most reasonable way seems to me, to just declare a const pointer. As soon as you use optimizations, the compiler will just insert the memory location into your executable instead of using a variable. (See https://godbolt.org/g/8KnUKg )
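If the cast has to live in exactly one place (the compliance concern raised in the question), one possible arrangement is to keep the address as a compile-time integer constant and confine the cast to a single small accessor. This is only a sketch: FixedAddress and T30Cal are hypothetical names, and the accessor is deliberately not constexpr.

```cpp
#include <cstdint>

// Hypothetical helper: the address is a compile-time template argument and
// the integer-to-pointer cast appears in exactly one place.
template <typename T, std::uintptr_t Address>
struct FixedAddress {
    static constexpr std::uintptr_t address = Address;

    // Not constexpr: reinterpret_cast is not allowed in a constant expression.
    static T* ptr() { return reinterpret_cast<T*>(Address); }
};

// Out-of-class definition, needed before C++17 if `address` is odr-used.
template <typename T, std::uintptr_t Address>
constexpr std::uintptr_t FixedAddress<T, Address>::address;

// Address from the question; the calibration data is treated as read-only.
using T30Cal = FixedAddress<const std::uint16_t, 0x1FFFF7B8u>;

static_assert(T30Cal::address == 0x1FFFF7B8u,
              "the address is usable in constant expressions");
```

On the target, `*T30Cal::ptr()` would read the calibration value; with optimization enabled the accessor is normally inlined, although that is up to the compiler.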

Related

Why address-of operator ('&') can be used with objects that are declared with the register storage class specifier in C++?

In C programming language we are not allowed to use address-of operator(&) with variables which are declared with register storage class specifier.
It gives error: address of register variable ‘var_name’ requested
But if we write a C++ program and perform the same task (i.e. use & with a register storage variable), it doesn't give us any error.
E.g.:
#include <iostream>
using namespace std;
int main()
{
    register int a;
    int * ptr;
    a = 5;
    ptr = &a;
    cout << ptr << endl;
    return 0;
}
Output :-
0x7ffcfed93624
Well this must be an extra feature of C++, but the question is on the difference between register class storage in C and C++.

The restriction on taking the address was deliberately removed in C++ - there was no benefit to it, and it made the language more complicated. (E.g. what would happen if you bound a reference to a register variable?)
The register keyword hasn't been much use for many years; compilers are very good at figuring out what to put in registers by themselves. Indeed, in C++ the keyword was deprecated in C++11 and removed as a storage class specifier in C++17.

The register storage class originally hinted to the compiler that the variable so qualified was to be used so frequently that keeping its value in memory would be a performance drawback. The vast majority of CPU architectures (maybe not SPARC? Not even certain there's a counterexample) cannot perform any operation between two variables without first loading one or both from memory into its registers. Loading variables from memory into registers and writing them back to memory once operated upon takes many times more CPU cycles than the operations themselves. Thus, if a variable is used frequently, one can achieve a performance gain by setting aside a register for it and not bothering with memory at all.
Doing so, however, has a variety of requirements. Many are different for every CPU architecture:
All processors have a fixed number of registers, but each processor model has a different number. In the 80s you might have had 4 that could reasonably be used for a register variable.
Most processors do not support the use of every register for every instruction. In the 80s it was not uncommon to have only one register that you could use for addition and subtraction, and you probably couldn't use that same register as a pointer.
Calling conventions dictated differing sets of registers that could be expected to be overwritten by subroutines i.e. function calls.
The size of a register differs between processors, so there are cases where a register variable will not fit in a register.
Because C is intended to be independent of platform, these restrictions could not be enforced by the standard. In other words, while it may be impossible to compile a procedure with 20 register variables for a system that only had 4 machine registers, the C program itself should not be "wrong", as there is no logical reason a machine cannot have 20 registers. Thus, the register storage class was always just a hint that the compiler could ignore if the specific target platform would not support it.
The inability to reference a register is different. A register is specifically not kept updated in memory and not kept current if changes are made to memory; that's the whole point of the storage class. Since they are not intended to have a guaranteed representation in memory, they cannot logically have an address in memory that will be meaningful to external code that may obtain the pointer. Registers have no address to their own CPU, and they almost never have an address accessible to any coprocessor. Therefore, any attempt to obtain a reference to a register is always a mistake. The C standard could comfortably enforce this rule.
As computing evolved, however, some trends developed that weakened the purpose of the register storage class itself:
Processors came with greater numbers of registers. Today you probably have at least 16, and they can probably all be used interchangeably for most purposes.
Multi-core processors and distributed code execution has become very common; only one core has access to any one register and they never share without involving memory anyway.
Algorithms for allocating registers to variables became very effective.
Indeed, compilers are now so good at allocating variables to registers that they will usually do a better job at optimization than any human. They certainly know which ones you are using most frequently without you telling them. It would be more complicated for the compiler (i.e. not for the standard or for the programmer) to produce these optimizations if they were required to honor your manual register hints. It became increasingly common for compilers to categorically ignore them. By the time C++ existed, it was obsolete. It is included in the standard for backward compatibility, to keep C++ as close as possible to a proper superset of C. The requirements of a compiler to honor the hint and thus the requirements to enforce the conditions under which the hint could be honored were weakened accordingly. Today, the storage class itself is deprecated.
Therefore, even though it is still the case today (and will be until computers don't even have registers) that you cannot logically have a reference to a CPU register, the expectation that the register storage class will be honored is so long gone that it is unreasonable for the standard to require compilers to require you to be logical in your use of it.

A referenced register would be the register itself. If the calling function passed ESI as a referenced parameter, then the called function would use ESI as the parameter. As pointed out by Alan Stokes, the issue is if another function also calls the same function, but this time with EDI as the same referenced parameter.
In order for this to work, two overloaded like instances of the called function would need to be created, one taking ESI as a parameter, one taking EDI as a parameter. I don't know if any actual C++ compiler actually implements such an optimization in general, but that is how this could be done.
One example of register by reference is the way std::swap() gets optimized (both parameters are references), which often ends up as inlined code. Sometimes no swap takes place: for example, std::swap(a, b), no swap takes place, instead the sense of a and b is swapped in the code that follows (references to what was a become references to b and vice versa).
Otherwise, a reference parameter will force the variable to be located in memory instead of a register.

Embedded C++11 code — do I need volatile?

Embedded device with Cortex M3 MCU(STM32F1). It has embedded flash(64K).
MCU firmware can reprogram flash sectors at runtime; this is done by Flash Memory Controller(FMC) registers (so it's not as easy as a=b). FMC gets buffer pointer and burns data to some flash sector.
I want to use the last flash sector for device configuration parameters.
Parameters are stored in a packed struct with arrays and contain some custom classes.
Parameters can be changed at runtime (copy to RAM, change and burn back to flash using FMC).
So there are some questions about that:
State (bitwise) of parameters struct is changed by FMC hardware.
C++ compiler does not know if it was changed or not.
Does this mean I should declare all struct members as volatile?
I think YES.
Struct should be statically initialized (default parameters) at compile time. Struct should be POD (TriviallyCopyable and has standard layout). Remember, there are some custom classes in there, so I keep in mind these classes should be POD too.
BUT there are some problems:
cppreference.com
The only trivially copyable types are scalar types, trivially copyable
classes, and arrays of such types/classes (possibly const-qualified,
but not volatile-qualified).
That means I can't keep my class both POD and volatile?
So how would I solve the problem?
It is possible to use only scalar types in parameters struct but it may result in much less clean code around config processing...
P.S.
It works even without volatile, but I am afraid that someday some smart LTO compiler will see a statically initialized struct that is never changed (by C++) and optimize out some accesses to the underlying memory addresses. That means freshly programmed parameters will not be applied, because the old values were inlined by the compiler.
EDIT: It is possible to solve the problem without using volatile, and that seems to be more correct.
You need to define the config struct variable in a separate translation unit (.cpp file) and not initialize it, to avoid value substitution during LTO. If you are not using LTO, all will be OK, because optimizations are done one translation unit at a time, so variables with static storage duration and external linkage defined in a dedicated translation unit should not be optimized out. Only LTO can throw them away or substitute values without issuing memory fetches, especially when the variable is defined as const. I think it is OK to initialize the variable if you are not using LTO.

You have some choices depending on your compiler:
You can declare a pointer to the structure and initialize the pointer to the region.
Tell the compiler where the variable should reside
Pointer to Flash
Declare a pointer, of the structure.
Assign the pointer to the proper address in Flash.
Access the variables by dereferencing the pointer.
The pointer should be declared, and assigned, as a constant pointer to constant data.
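A rough sketch of the pointer approach. To keep it runnable off-target, a const object stands in for the flash sector; on real hardware the pointer would instead be initialized from the sector's address with a reinterpret_cast. The Config layout and all names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical parameter block as it might be laid out in the flash sector.
struct Config {
    std::uint32_t baud_rate;
    std::uint16_t device_id;
    std::uint16_t flags;
};

// Stand-in for the flash sector so the sketch runs on a host machine.
static const Config fake_flash_sector = {115200u, 0x0042u, 0x0001u};

// Constant pointer to constant data. On target this would be something like
// reinterpret_cast<const Config*>(sector_address) instead.
static const Config* const config = &fake_flash_sector;

// Access goes through the pointer.
std::uint32_t configured_baud_rate() {
    return config->baud_rate;
}
```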
Telling compiler address of variable.
Some compilers allow you to place a variable in a specific memory region. The first step is to create a region in the linker command file. Next step is to tell the compiler that the variable is in that region.
Again, the variable should be declared as "static const". The "static" because there is only 1 instance. The "const" because Flash memory is read-only for most of the time.
Flash Memory: Volatile vs. Const
In most cases, the Flash memory, however programmed, is read-only. In fact, the only way you can read the data in Flash is to lock it, a.k.a. make it read-only. In general, it won't be changed without the consensus of the program.
Most Flash memories are programmed by the software. Normally, this is your program. If your program is going to reprogram the Flash, it knows the values have been changed. This is akin to writing to RAM. The program changed the value, not the hardware. Thus the Flash is not volatile.
My experience is that Flash can be programmed by another means, usually when your program is not running. In that case, it is still not volatile because your program is not running. The Flash is still read-only.
The Flash will be volatile if, and only if, another task or thread of execution programs the flash while your thread of execution is active. I still would not consider this case as volatile. This would be a case of synchronicity: if the flash is modified, then some listeners should be notified.
Summary
The Flash memory is best treated as read-only memory. Variables residing in Flash are accessed via pointer for best portability, although some compilers and linkers allow you to declare variables at specific, hard-coded addresses. The variables should be declared as const static so that the compiler can emit code to access the variables directly, vs. copying on the stack. If the Flash is programmed by another task or thread of execution, this is a synchronicity issue, not one of volatile. In rare cases, the Flash is programmed by an external source while your program is executed.
Your program should provide checksums or other methods to determine if the content has changed, since the last time it was checked.
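The answer does not name a checksum algorithm, so the following sketch assumes Fletcher-16 purely as an example; fletcher16 and content_changed are hypothetical helper names. The idea is to remember the checksum from the last check and compare it with a fresh one.

```cpp
#include <cstddef>
#include <cstdint>

// Fletcher-16 checksum over a byte range.
std::uint16_t fletcher16(const std::uint8_t* data, std::size_t len) {
    std::uint32_t sum1 = 0;
    std::uint32_t sum2 = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum1 = (sum1 + data[i]) % 255u;
        sum2 = (sum2 + sum1) % 255u;
    }
    return static_cast<std::uint16_t>((sum2 << 8) | sum1);
}

// True if the bytes no longer match the checksum remembered from last time.
bool content_changed(const std::uint8_t* data, std::size_t len,
                     std::uint16_t last_checksum) {
    return fletcher16(data, len) != last_checksum;
}
```

A stronger check such as a CRC may be preferable when the content can be corrupted rather than just rewritten.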
DO NOT HAVE COMPILER INITIALIZE VARIABLES FROM FLASH.
This is not really portable. A better method is to have your initialization code load the variable(s) from flash. Making the compiler load your variable from a different segment requires a lot of work with the internals of the compiler and linker; a lot more than initializing a pointer to the address in the Flash.

By reprogramming the flash, you are changing the underlying object's representation. The volatile qualifier is the appropriate solution for the situation to ensure the changes in data are not optimized away.
You would like a declaration to be: const volatile Settings settings;
The drawback is that volatile prevents static initialization of your object. This stops you from using the linker to put the initialized object in its appropriate memory address.
You would like the definition to be: const Settings settings = { ... };
Luckily, you can initialize a const object and access it as a const volatile.
// Header file
struct Settings { ... };
extern const volatile Settings& settings;
// Source file
static const Settings init_settings = { ... };
const volatile Settings& settings = init_settings;
The init_settings object is statically initialized, but all accesses through the settings reference are treated as volatile.
Please note, though, modifying an object defined as const is undefined behavior.
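A filled-in version of the same pattern as a sketch, with hypothetical fields and values in place of the elided ones. The static_asserts illustrate that the struct type itself stays trivially copyable and standard-layout, because volatile appears only on the access path and not on the members.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical settings layout; the original answer elides the members.
struct Settings {
    std::uint32_t baud_rate;
    std::uint8_t channel;
};

// No member is volatile, so the type keeps its POD-like properties.
static_assert(std::is_trivially_copyable<Settings>::value,
              "Settings is trivially copyable");
static_assert(std::is_standard_layout<Settings>::value,
              "Settings has standard layout");

// Statically initialized defaults (values are made up).
static const Settings init_settings = {9600u, 3u};

// Every access through this reference is a volatile access.
const volatile Settings& settings = init_settings;

std::uint32_t current_baud_rate() {
    return settings.baud_rate;  // volatile read of one member
}
```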

does a variable consume memory in addition to just its content (e.g. type, location)?

Quite likely this has been asked/answered before, but not sure how to phrase it best, a link to a previously answered question would be great.
If you define something like
char myChar = 'a';
I understand that this will take up one byte in memory (depending on implementation and assuming no unicode and so on, the actual number is unimportant).
But I would assume the compiler/computer would also need to keep a table of variable types, addresses (i.e. pointers), and possibly more. Otherwise it would have the memory reserved, but would not be able to do anything with it. So that's already at least a few more bytes of memory consumed per variable.
Is this a correct picture of what happens, or am I misunderstanding what happens when a program gets compiled/executed? And if the above is correct, is it more to do with compilation, or execution?

The compiler will keep track of the properties of a variable - its name, lifetime, type, scope, etc. This information will exist in memory only during compilation. Once the program has been compiled and the program is executed, however, all that is left is the object itself. There is no type information at run-time (except if you use RTTI, then there will be some, but only because you required it for your program to function - such as is required for dynamic_casting).
Everything that happens in the code that accesses the object has been compiled into a form that treats it exactly as a single byte (because it's a char). The address that the object is located at can only be known at run-time anyway. However, variables with automatic storage duration (like local variables), are typically located simply by some fixed offset from the current stack frame. That offset is hard-baked into the executable.

Whether a variable carries extra information depends on the type of the variable and your compiler options. If you use RTTI, extra information is stored. If you compile with debug information, there will also be extra overhead.
For native datatypes like your example of char there is usually no overhead, unless you have structs, which can also contain padding bytes. If you define classes, there may be a virtual table associated with your class. However, if you dynamically allocate memory, there will usually be some overhead along with your allocated memory.
Sometimes a variable may not even exist, because the optimizer realizes that no storage is needed for it and it can be kept in a register.
So in total, you cannot rely on counting your variables and summing their sizes to calculate the amount of memory required, because there is not necessarily a 1:1 relation.
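A small sketch of where the extra bytes can come from, using sizeof on hypothetical types. The exact numbers depend on the platform and ABI, so only relations that hold on common implementations are checked.

```cpp
#include <cstddef>

struct Plain { char c; };            // just the char
struct Padded { char c; int i; };    // usually padding between c and i
struct Polymorphic {                 // usually carries a hidden vtable pointer
    virtual ~Polymorphic() {}
    char c;
};

static_assert(sizeof(char) == 1, "guaranteed by the standard");
static_assert(sizeof(Padded) >= sizeof(char) + sizeof(int),
              "members plus any padding");

// Bytes in Padded that belong to no member (platform dependent).
std::size_t padding_in_padded() {
    return sizeof(Padded) - (sizeof(char) + sizeof(int));
}
```

None of this is per-variable bookkeeping: the overhead is part of the object's own layout, fixed at compile time.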

Some types can be determined at compile time, say in this code:
void foo(char c) {...}
it is obvious at compile time what the type of the variable c is.
With inheritance you cannot know the real type of the variable at compile time, like:
void draw(Drawable* drawable); // where drawable can be Circle, Line etc.
But the C++ compiler can help to determine the real type of the Drawable using dynamic_cast. In this case it uses the pointer to the virtual method table associated with the object to determine the real type.
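A minimal sketch of that run-time check, reusing the Drawable, Circle and Line names from the example above. The Kind enum and the kind_of function are made up for illustration.

```cpp
// Base class needs at least one virtual function for dynamic_cast to work.
struct Drawable { virtual ~Drawable() {} };
struct Circle : Drawable {};
struct Line : Drawable {};

enum class Kind { Circle, Line, Unknown };

// Determine the dynamic type behind a Drawable* at run time.
Kind kind_of(Drawable* drawable) {
    if (dynamic_cast<Circle*>(drawable)) return Kind::Circle;
    if (dynamic_cast<Line*>(drawable)) return Kind::Line;
    return Kind::Unknown;
}
```

In most designs a virtual function on Drawable is the more idiomatic way to get type-specific behaviour; dynamic_cast is shown here only because it exposes the run-time type information.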

Is finding pointers in C/C++ code statically equivalent to the Halting Problem?

I'm not too deeply rooted in the very formal side of static code analysis, hence this question.
A couple of years ago I read that distinguishing code from data using static code analysis is equivalent to the Halting Problem. (Citation needed, but I don't have it anymore. Stackoverflow has threads on this here or here.) At least for common computer architectures based on the Von Neumann architecture where code and data share the same memory this seemed to make sense.
Now I'm looking at the static analysis of C/C++ code and pointer analysis; the program does not execute. Somehow I have a feeling that tracking all creations and uses of pointer values statically is similar to the Halting Problem because I can not determine if a given value in memory is a pointer value, i.e. I can not track the value-flow of pointer values through memory. Alias analysis may narrow down the problem, but it seems to become less useful in the face of multi-threaded code.
(One might even consider tracking arbitrary values, not just pointers: constructing a complete value-flow for any given "interesting" value seems equivalent to the Halting Problem.)
As this is just a hunch, my question is: are there more formal findings on this that I can refer to? Am I mistaken?

You can always code up this:
extern bool some_program_halts();
extern int* invalid_pointer();
#include <iostream>
int main()
{
    using namespace std;
    if( some_program_halts() ) { cout << *invalid_pointer() << endl; }
}
Checking whether this program dereferences the invalid pointer is equivalent to finding out whether the call to some_program_halts(), uh, halts.

It's almost certainly equivalent, modulo the fact that C is not a turing-equivalent language (a given C implementation is a gigantic finite state machine rather than a turing machine, due to the Representation of Types). Pointers need not be kept in their original representations in objects whose effective type is pointer type; you can examine the representation and perform arbitrary operations on it, for example, encrypting pointers and decrypting them later. Determining whether an arbitrary computation is reversible, or whether two computations are inverses of one another, is (offhand) probably equivalent to determining halting.
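To make the "encrypted pointer" point concrete, here is a sketch in which a pointer is stored as an XOR-masked integer and recovered later. The function names and the key are hypothetical. While the value is masked, nothing in memory looks like a pointer, which is what makes static tracking hard.

```cpp
#include <cstdint>

// Hide a pointer by XOR-ing its integer representation with a key.
std::uintptr_t encode_pointer(int* p, std::uintptr_t key) {
    return reinterpret_cast<std::uintptr_t>(p) ^ key;
}

// Recover the original pointer by applying the same key again.
int* decode_pointer(std::uintptr_t masked, std::uintptr_t key) {
    return reinterpret_cast<int*>(masked ^ key);
}
```

An analysis that wants to know whether the decoded value is a valid pointer has to prove that the two computations are inverses, which is trivial for XOR but not for arbitrary code.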

If I understood you correctly: yes, checking whether a C or C++ program accesses an invalid pointer is equivalent to the halting problem (of a C or C++ program, in any case).
Suppose you had a tool that told you whether a program accessed an invalid pointer, and a program you wanted to check for halting. By adding extra information to each pointer you can make it checkable (at runtime) whether the pointer is valid or not; add such checks, with an infinite loop on failure. You now have a program with no invalid pointer accesses. By replacing all places the program can terminate with an invalid pointer access you get a program which has an invalid pointer access if and only if the original program terminates.

Static analysis is almost always an approximation, often provable by reduction to the halting problem with programs like the one in Alf's answer. However, the approximation can err on the side of either false positives or false negatives.
A "conservative" static check will only have false negatives. It will never accept a "bad" program, but it will inevitably reject some "sufficiently complicated" good programs.
A "liberal" static check will have false positives. Sometimes it accepts a bad program by mistake but (generally) it will also accept all good programs.
Some examples:
Java's type system is conservative: a variable with a type T will always contain an instance of type T (or a subtype of T or null) at runtime no matter what.
GCC's option to warn about uninitialized variables is liberal: it doesn't find all potential uses of an uninitialized variable. Here's an example of a false positive program.
In contrast, Java does a conservative uninitialized variable check for local variables. It refuses to compile the program if it sees any potential execution path using a potentially uninitialized variable.
Liberal checks are often used by compilers to emit warnings and by external static analysis tools. Things like type systems and compiler optimizations tend to rely on conservative checks to be correct.
Many tasks have several reasonable conservative and liberal algorithms of varying accuracy. Alias analysis is certainly one of these.
For more information, see any good compiler textbook, such as the dragon book.
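As a small illustration of the difference, consider the following hypothetical function. It never reads x uninitialized, because the read is guarded by the same condition as the assignment. A conservative path-insensitive check in the style of Java's definite assignment would still reject the equivalent program, while a liberal check may accept it silently.

```cpp
// Correct at run time: x is read only on the path where it was assigned.
int value_if_set(bool flag) {
    int x;              // deliberately left uninitialized
    if (flag) {
        x = 1;
    }
    if (flag) {
        return x;       // reached only when x has been assigned
    }
    return 0;
}
```

Whether a given C++ compiler warns here depends on the compiler, its version and the optimization level.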

slightly weird C++ code

Sorry if this is simple, my C++ is rusty.
What is this doing? There is no assignment or function call as far as I can see. This code pattern is repeated many times in some code I inherited. If it matters it's embedded code.
*(volatile UINT16 *)&someVar->something;
edit: continuing from there, does the following additional code confirm Heath's suspicions? (exactly from code, including the repetition, except the names have been changed to protect the innocent)
if (!WaitForNotBusy(50))
    return ERROR_CODE_X;
*(volatile UINT16 *)& someVar->something;
if (!WaitForNotBusy(50))
    return ERROR_CODE_X;
*(volatile UINT16 *)& someVar->something;
x = SomeData;

This is a fairly common idiom in embedded programming (though it should be encapsulated in a set of functions or macros) where a device register needs to be accessed. In many architectures, device registers are mapped to a memory address and are accessed like any other variable (though at a fixed address - either pointers can be used or the linker or a compiler extension can help with fixing the address). However, if the C compiler doesn't see a side effect to a variable access it can optimize it away - unless the variable (or the pointer used to access the variable) is marked as volatile.
So the expression:
*(volatile UINT16 *)&someVar->something;
will issue a 16-bit read at some offset (provided by the something structure element's offset) from the address stored in the someVar pointer. This read will occur and cannot be optimized away by the compiler due to the volatile keyword.
Note that some device registers perform some functionality even if they are simply read - even if the data read isn't otherwise used. This is quite common with status registers, where an error condition might be cleared after the read of the register that indicates the error state in a particular bit.
This is probably one of the more common reasons for the use of the volatile keyword.
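One way to encapsulate the idiom in functions, as suggested above, is sketched here with hypothetical helper names. An ordinary variable can stand in for the device register when trying this off-target.

```cpp
#include <cstdint>

// Read a 16-bit register; the volatile qualifier forces a real load.
inline std::uint16_t read_reg16(const volatile std::uint16_t* reg) {
    return *reg;
}

// Read a register purely for its side effect and discard the value,
// which is what the bare expression statement in the question does.
inline void touch_reg16(const volatile std::uint16_t* reg) {
    (void)*reg;
}
```

With these, the line from the question would read something like `touch_reg16(&someVar->something);`, which states the intent more clearly than a bare expression statement.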

So here's a long shot.
If that address points to a memory mapped region on a FPGA or other device, then the device might actually be doing something when you read that address.

I think the author's intent was to cause the compiler to emit memory barriers at these points. By evaluating the expression result of a volatile, the indication to the compiler is that this expression should not be optimized away, and should 'instantiate' the semantics of access to a volatile location (memory barriers, restrictions on optimizations) at each line where this idiom occurs.
This type of idiom could be "encapsulated" in a pre-processor macro (#define) in case another compiler has a different way to cause the same effect. For example, a compiler with the ability to directly encode read or write memory barriers might use the built-in mechanism rather than this idiom. Implementing this type of code inside a macro enables changing the method all over your code base.
EDIT: User sharth has a great point that if this code runs in an environment where the address of the pointer is a physical rather than virtual address (or a virtual address mapped to a specific physical address), then performing this read operation might cause some action at a peripheral device.

Generally this is bad code.
In C and C++ volatile means very little and does not provide an implicit memory barrier. So this code is just plain wrong unless it is written as
memory_barrier();
*(volatile UINT16 *)&someVar->something;
It is just bad code.
Explanation: volatile does not make a variable atomic!
Read this article: http://www.mjmwired.net/kernel/Documentation/volatile-considered-harmful.txt
This is why volatile should almost never be used in proper code.
