C++ reinterpret_cast - will this always work correctly? - c++

I have written MyString and MyStringConst classes. From time to time I need to pass a MyString as a MyStringConst, so I overloaded the cast operator. I have written this:
MyString::operator const MyStringConst &() const
{
return reinterpret_cast<const MyStringConst &>(*this);
}
MyString has this data
char * str;
int length;
volatile int hashCode;
int bufferSize;
MyStringConst has this data
const char * c_str;
int length;
volatile int hashCode;
Plus there are some methods, that in both strings can recalculate hashCode.
Is this code correctly written? I have tested it on MSVC 2013 and it works correctly, but I have no idea whether it can be used in production code that may be compiled with a different compiler.

The common initial sequence of the data members is different, and C++ makes no guarantee at all about the layout in this case, even if the types differ only by const qualification. Otherwise the guarantees for unions would effectively imply that the types need to have a common layout if they are standard-layout types (according to a note in 9.5 [class.union] paragraph 1).
In practice I would expect that the two types are laid out identically and that the reinterpret_cast works, but there is no guarantee by the standard. Based on your comment, MyStringConst merely holds a pointer to the string, so instead of converting to references I would just return a suitably constructed MyStringConst and avoid relying on undefined behavior:
MyString::operator MyStringConst() const {
return MyStringConst(str, length);
}
The MyString object still has to live as long as the result from the conversion but this is no different to the case using reinterpret_cast.
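A minimal sketch of that by-value conversion, assuming MyStringConst is just a non-owning view. The reduced member lists, the two-argument constructor and the data()/size() accessors are assumptions for illustration, not the asker's real classes:

```cpp
#include <cstring>

// Hypothetical, reduced versions of the two classes from the question.
class MyStringConst {
public:
    MyStringConst(const char* s, int len) : c_str(s), length(len) {}
    const char* data() const { return c_str; }
    int size() const { return length; }
private:
    const char* c_str;  // non-owning: points into the source MyString
    int length;
};

class MyString {
public:
    explicit MyString(const char* s)
        : length(static_cast<int>(std::strlen(s))),
          str(new char[length + 1]) {
        std::memcpy(str, s, length + 1);
    }
    ~MyString() { delete[] str; }

    // Conversion by value: no reinterpret_cast, no layout assumptions.
    operator MyStringConst() const { return MyStringConst(str, length); }

private:
    MyString(const MyString&);             // copying disabled in this sketch
    MyString& operator=(const MyString&);
    int length;  // declared before str so it is initialized first
    char* str;
};
```

With this, MyStringConst v = s; compiles for a MyString s, and v stays valid only while s is alive.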
BTW, the volatile on the hashCode is ill-advised: the only effect it will have is to slow down the program. I guess you are trying to use it to achieve synchronization between threads, but in C++ volatile doesn't help with that at all: you get a data race when the member is written in one thread while it is also accessed unsynchronized in another thread. You'd spell the member
std::atomic<int> hashCode;
instead.
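As a rough sketch of how a lazily computed hash could look with std::atomic: the class name CachedHash, the use of 0 as a "not computed yet" marker and the simple multiply-by-31 hash are all assumptions made for the example, not anything from the question.

```cpp
#include <atomic>
#include <string>

// Hypothetical class: caches a hash that several threads may request.
class CachedHash {
public:
    explicit CachedHash(const std::string& s) : text(s), hashCode(0) {}

    int hash() const {
        int h = hashCode.load(std::memory_order_relaxed);
        if (h == 0) {  // 0 means "not computed yet" in this sketch
            unsigned int acc = 0;
            for (unsigned char c : text) acc = acc * 31u + c;
            h = static_cast<int>(acc & 0x7fffffffu);
            if (h == 0) h = 1;  // keep 0 reserved as the marker
            hashCode.store(h, std::memory_order_relaxed);
        }
        return h;
    }

private:
    std::string text;
    mutable std::atomic<int> hashCode;
};
```

Two threads may both compute the hash, but they store the same value and the atomic member means there is no data race.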

Related

c++20 handling of string literals

We’re updating a project to c++20, and are running into errors where we pass string literals into functions which take char *. I know this has been changed to make code more safe, but we are interfacing with libraries which we cannot change.
I’d rather not disable the strict treatment of literals via compiler flags, so is there a good way to wrap these literals just in these particular cases?
I was thinking of an inline function, that was named something specific to the library, that internally would use const_cast. That way later if we want to change the code because the library gets updated, we know exactly where to look.
Any other ideas?
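The wrapper idea described above might look roughly like this. legacy_length is a made-up stand-in for the third-party function, and the legacy_compat namespace name is likewise only an illustration:

```cpp
#include <cstddef>
#include <cstring>

// Stand-in for a third-party function we cannot change (hypothetical).
// It takes char* but is assumed to only read the string.
inline std::size_t legacy_length(char* text) { return std::strlen(text); }

namespace legacy_compat {
// The single place where const is cast away for this library.
// Only valid while the library really never writes through the pointer.
inline std::size_t length(const char* text) {
    return legacy_length(const_cast<char*>(text));
}
}  // namespace legacy_compat
```

Call sites then use legacy_compat::length("string"), and a search for the namespace finds every place that depends on the cast.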
"Any other ideas?"
static char my_string[] = "string";
...
//elsewhere in the code
library_function(my_string);
The only difference between passing a string like that, and passing a string literal is the section of the assembly the data is stored in.
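A string literal is stored in a non-modifiable section (called .text here; many toolchains actually place literals in a separate read-only section such as .rodata).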
The non-const string will be stored in .data.
If you really, really care if you're passing a function a pointer to .text or a pointer to .data, and you really, really, trust the library to not modify the parameter now and for ever, then you can certainly cast away the const-ness of your string literals.
Ignoring the fact that documentation lags behind implementation, even if we could believe the documentation promise to not modify its inputs, if it doesn't enforce it through the interface, at any time, on purpose or on accident, that input could be modified.
The following user-defined literal creates a std::string for each literal and implicitly converts to char*.
#include <string>
struct stringlitwrapper
{
constexpr stringlitwrapper(const char* c) : s(c) {};
operator char*() { return s.data(); }
std::string s;
};
constexpr stringlitwrapper operator"" _w (const char* c, std::size_t n)
{
return stringlitwrapper(c);
}
void libfunction(char* param) {
// uses non-const char* as parameter
}
int main() {
libfunction("string literal"_w);
return 0;
}
For compilers that do not accept constexpr here (at the time of writing MSVC accepted it and Clang did not), remove both constexpr specifiers.
By internally storing the literal as non-const string, there is no undefined behaviour involved.
(The library function of course should not overwrite at all or at least not write over the end of the string.)
To prevent heap allocations, the std::string could be replaced by a char array with fixed (maximal) size.
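One possible shape for the array-based variant, sketched under a few assumptions: the names literal_buffer and writable are invented here, the array is sized exactly to the literal rather than to a fixed maximum, and first_upper is a hypothetical stand-in for a library function that writes to its argument.

```cpp
#include <cstddef>
#include <cstring>

// Holds a writable copy of a string literal in an in-object array.
template <std::size_t N>
struct literal_buffer {
    explicit literal_buffer(const char (&lit)[N]) { std::memcpy(buf, lit, N); }
    operator char*() { return buf; }
    char buf[N];  // N includes the terminating '\0'
};

// Helper so that N is deduced from the literal (works before C++17).
template <std::size_t N>
literal_buffer<N> writable(const char (&lit)[N]) {
    return literal_buffer<N>(lit);
}

// Stand-in for the library function (hypothetical): uppercases the first
// character in place to show that writing into the copy is harmless.
inline char first_upper(char* s) {
    if (s[0] >= 'a' && s[0] <= 'z') s[0] = static_cast<char>(s[0] - 'a' + 'A');
    return s[0];
}
```

A call such as first_upper(writable("string literal")) passes a pointer into the temporary buffer, which lives until the end of the full expression, so the library must not keep the pointer.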

Is it safe to "play" with parameter constness in extern "C" declarations?

Suppose I'm using some C library which has a function:
int foo(char* str);
and I know for a fact that foo() does not modify the memory pointed to by str. It's just poorly written and doesn't bother to declare str being constant.
Now, in my C++ code, I currently have:
extern "C" int foo(char* str);
and I use it like so:
foo(const_cast<char*>("Hello world"));
My question: Is it safe - in principle, from a language-lawyering perspective, and in practice - for me to write:
extern "C" int foo(const char* str);
and skip the const_cast'ing?
If it is not safe, please explain why.
Note: I am specifically interested in the case of C++98 code (yes, woe is me), so if you're assuming a later version of the language standard, please say so.
Is it safe for me to write [...] and skip the const_cast'ing?
No.
If it is not safe, please explain why.
-- From language side:
After reading the dcl.link I think exactly how the interoperability works between C and C++ is not exactly specified, with many "no diagnostic required" cases. The most important part is:
Two declarations for a function with C language linkage with the same function name (ignoring the namespace names that qualify it) that appear in different namespace scopes refer to the same function.
Because they refer to the same function, I believe a sane assumption would be that the declaration of an identifier with C language linkage on the C++ side has to be compatible with the declaration of that symbol on the C side. In C++ there is no concept of "compatible types"; in C++ two declarations have to be identical (after transformations), making the restriction actually more strict.
From C++ side, we read c++draft basic#link-11:
After all adjustments of types (during which typedefs are replaced by their definitions), the types specified by all declarations referring to a given variable or function shall be identical, [...]
Because the declaration int foo(const char *str) with C language linkage in a C++ translation unit is not identical to the declaration int foo(char *str) declared in C translation unit (thus it has C language linkage), the behavior is undefined (with famous "no diagnostic required").
From the C side (I think this is not even needed, since the C++ side is enough to make the program have undefined behavior), the most important part would be C99 6.7.5.3p15:
For two function types to be compatible, both shall specify compatible return types. Moreover, the parameter type lists, if both are present, shall agree in the number of parameters and in use of the ellipsis terminator; corresponding parameters shall have compatible types [...]
Because from C99 6.7.5.1p2:
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
and C99 6.7.3p9:
For two qualified types to be compatible, both shall have the identically qualified version of a compatible type [...]
So because char is not compatible with const char, thus const char * is not compatible with char *, thus int foo(const char *) is not compatible with int foo(char*). Calling such a function (C99 6.5.2.2p9) would be undefined behavior (you may see also C99 J.2)
-- From practical side:
I do not believe you will be able to find a compiler+architecture combination where one translation unit sees int foo(const char *) and the other translation unit defines a function int foo(char *) { /* some stuff */ } and it would "not work".
Theoretically, an insane implementation may use a different register to pass a const char* argument and a different one to pass a char* argument, which I hope would be well documented in that insane architecture ABI and compiler. If that's so, wrong registers will be used for parameters, it will "not work".
Still, using a simple wrapper costs nothing:
static inline int foo2(const char *var) {
return foo(const_cast<char*>(var)); // const_cast, since static_cast cannot remove const
}
I think the base answer is:
Yes, you can cast off const even if the referenced object is itself const such as a string literal in the example.
Undefined behaviour is only specified to arise in the event of an attempt to modify the const object not as a result of the cast.
Those rules and their reason to exist are 'old'. I'm sure they predate C++98.
Contrast it with volatile where any attempt to access a volatile object through a non-volatile reference is undefined behaviour. I can only read 'access' as read and/or write here.
I won't repeat the other suggestions but here is the most paranoid solution.
It's paranoid not because the C++ semantics aren't clear. They are clear. At least if you accept something being undefined behaviour is clear!
But you've described it as 'poorly written' and you want to put some sandbags round it!
The paranoid solution relies on the fact that if you are passing a constant object it will be constant for the whole execution (if the program doesn't risk UB).
So make a single copy of "hello world" lower in the call-stack or even initialised as a file scope object. You can declare it static in a function and it will (with minimal overhead) only be constructed once.
This recovers almost all of the benefits of a string literal. The lower down the call stack you put it (including file scope, i.e. global), the better.
I don't know how long the lifetime of the pointed-to object passed to foo() needs to be.
So it needs to be at least low enough in the chain to satisfy that condition.
NB: C++98 has std::string but it won't quite do here because you're still forbidden from modifying the result of c_str().
Here the semantics are defined.
#include <cstring>
#include <iostream>
class pseudo_const{
public:
pseudo_const(const char*const cstr): str(NULL){
const size_t sz=strlen(cstr)+1;
str=new char[sz];
memcpy(str,cstr,sz);
}
//Returns a pointer to a life-time permanent copy of
//the string passed to the constructor.
//Modifying the string through this value will be reflected in all
// subsequent calls.
char* get_constlike() const {
return str;
}
~pseudo_const(){
delete [] str;
}
private:
char* str;
};
const pseudo_const str("hello world");
int main() {
std::cout << str.get_constlike() << std::endl;
return 0;
}
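If foo() only needs the string for the duration of the call, a per-call copy is a simpler variation on the same idea and still compiles as C++98. In this sketch foo has a made-up body standing in for the C function, and foo_const is a hypothetical wrapper name:

```cpp
#include <cstring>
#include <vector>

// Stand-in for the C function from the question (hypothetical body).
inline int foo(char* str) { return static_cast<int>(std::strlen(str)); }

// Copies the text into writable storage, then calls foo on the copy.
inline int foo_const(const char* text) {
    std::vector<char> copy(text, text + std::strlen(text) + 1);
    return foo(&copy[0]);
}
```

The copy costs an allocation per call and the pointer dies when foo_const returns, so this does not fit if foo() stores the pointer.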

Can Aliasing Problems be Avoided with const Variables

My company uses a messaging server which gets a message into a const char* and then casts it to the message type.
I've become concerned about this after asking this question. I'm not aware of any bad behavior in the messaging server. Is it possible that const variables do not incur aliasing problems?
For example say that foo is defined in MessageServer in one of these ways:
As a parameter: void MessageServer(const char* foo)
Or as const variable at the top of MessageServer: const char* foo = PopMessage();
Now MessageServer is a huge function, but it never assigns anything to foo, however at 1 point in MessageServer's logic foo will be cast to the selected message type.
auto bar = reinterpret_cast<const MessageJ*>(foo);
bar will only be read from subsequently, but will be used extensively for object setup.
Is an aliasing problem possible here, or does the fact that foo is only initialized, and never modified save me?
EDIT:
Jarod42's answer finds no problem with casting from a const char* to a MessageJ*, but I'm not sure this makes sense.
We know this is illegal:
MessageX* foo = new MessageX;
const auto bar = reinterpret_cast<MessageJ*>(foo);
Are we saying this somehow makes it legal?
MessageX* foo = new MessageX;
const auto temp = reinterpret_cast<char*>(foo);
auto bar = reinterpret_cast<const MessageJ*>(temp);
My understanding of Jarod42's answer is that the cast to temp makes it legal.
EDIT:
I've gotten some comments with relation to serialization, alignment, network passing, and so on. That's not what this question is about.
This is a question about strict aliasing.
Strict aliasing is an assumption, made by the C (or C++) compiler, that dereferencing pointers to objects of different types will never refer to the same memory location (i.e. alias each other).
What I'm asking is: Will the initialization of a const object, by casting from a char*, ever be optimized below where that object is cast to another type of object, such that I am casting from uninitialized data?
First of all, casting pointers does not cause any aliasing violations (although it might cause alignment violations).
Aliasing refers to the process of reading or writing an object through a glvalue of different type than the object.
If an object has type T, and we read/write it via a X& and a Y& then the questions are:
Can X alias T?
Can Y alias T?
It does not directly matter whether X can alias Y or vice versa, as you seem to focus on in your question. But, the compiler can infer if X and Y are completely incompatible that there is no such type T that can be aliased by both X and Y, therefore it can assume that the two references refer to different objects.
So, to answer your question, it all hinges on what PopMessage does. If the code is something like:
const char *PopMessage()
{
static MessageJ foo = .....;
return reinterpret_cast<const char *>(&foo);
}
then it is fine to write:
const char *ptr = PopMessage();
auto bar = reinterpret_cast<const MessageJ*>(ptr);
auto baz = *bar; // OK, accessing a `MessageJ` via glvalue of type `MessageJ`
auto ch = ptr[4]; // OK, accessing a `MessageJ` via glvalue of type `char`
and so on. The const has nothing to do with it. In fact if you did not use const here (or you cast it away) then you could also write through bar and ptr with no problem.
On the other hand, if PopMessage was something like:
const char *PopMessage()
{
static char buf[200];
return buf;
}
then the line auto baz = *bar; would cause UB because char cannot be aliased by MessageJ. Note that you can use placement-new to change the dynamic type of an object (in that case, char buf[200] is said to have stopped existing, and the new object created by placement-new exists and its type is T).
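The placement-new remark can be illustrated with a small sketch. The two-field MessageJ, the storage array and make_message are assumptions for the example; the point is only that the MessageJ is created in the raw buffer before it is read:

```cpp
#include <new>

// Hypothetical message type; trivially destructible, so the sketch
// does not need to call a destructor.
struct MessageJ {
    int type;
    int id;
};

// Suitably sized and aligned raw storage.
alignas(MessageJ) unsigned char storage[sizeof(MessageJ)];

inline const MessageJ* make_message(int id) {
    // Placement-new starts the lifetime of a MessageJ inside storage.
    MessageJ* m = new (storage) MessageJ();
    m->type = 1;
    m->id = id;
    return m;  // use the pointer returned by new, not a cast of storage
}
```

Reading make_message(42)->id is then an access to a MessageJ through a MessageJ glvalue, which is the first entry in the list of allowed types.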
My company uses a messaging server which gets a message into a const char* and then casts it to the message type.
So long as you mean that it does a reinterpret_cast (or a C-style cast that devolves to a reinterpret_cast):
MessageJ *j = new MessageJ();
MessageServer(reinterpret_cast<char*>(j));
// or PushMessage(reinterpret_cast<char*>(j));
and later takes that same pointer and reinterpret_cast's it back to the actual underlying type, then that process is completely legitimate:
MessageServer(char *foo)
{
if (somehow figure out that foo is actually a MessageJ*)
{
MessageJ *bar = reinterpret_cast<MessageJ*>(foo);
// operate on bar
}
}
// or
MessageServer()
{
char *foo = PopMessage();
if (somehow figure out that foo is actually a MessageJ*)
{
MessageJ *bar = reinterpret_cast<MessageJ*>(foo);
// operate on bar
}
}
Note that I specifically dropped the const's from your examples as their presence or absence doesn't matter. The above is legitimate when the underlying object that foo points at actually is a MessageJ, otherwise it is undefined behavior. The reinterpret_cast'ing to char* and back again yields the original typed pointer. Indeed, you could reinterpret_cast to a pointer of any type and back again and get the original typed pointer. From this reference:
Only the following conversions can be done with reinterpret_cast ...
6) An lvalue expression of type T1 can be converted to reference to another type T2. The result is an lvalue or xvalue referring to the same object as the original lvalue, but with a different type. No temporary is created, no copy is made, no constructors or conversion functions are called. The resulting reference can only be accessed safely if allowed by the type aliasing rules (see below) ...
Type aliasing
When a pointer or reference to object of type T1 is reinterpret_cast (or C-style cast) to a pointer or reference to object of a different type T2, the cast always succeeds, but the resulting pointer or reference may only be accessed if both T1 and T2 are standard-layout types and one of the following is true:
T2 is the (possibly cv-qualified) dynamic type of the object ...
Effectively, reinterpret_cast'ing between pointers of different types simply instructs the compiler to reinterpret the pointer as pointing at a different type. More importantly for your example though, round-tripping back to the original type again and then operating on it is safe. That is because all you've done is instructed the compiler to reinterpret a pointer as pointing at a different type and then told the compiler again to reinterpret that same pointer as pointing back at the original, underlying type.
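A tiny sketch of that round trip, using a hypothetical two-field MessageJ and made-up helper names to_raw and from_raw:

```cpp
struct MessageJ {  // hypothetical message type
    int type;
    int id;
};

// Erase the type on the way in...
inline const char* to_raw(const MessageJ* m) {
    return reinterpret_cast<const char*>(m);
}

// ...and restore it on the way out. Only valid when raw really came
// from a MessageJ object.
inline const MessageJ* from_raw(const char* raw) {
    return reinterpret_cast<const MessageJ*>(raw);
}
```

For a MessageJ j, from_raw(to_raw(&j)) compares equal to &j, and reading members through it is an access via the object's own type.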
So, the round trip conversion of your pointers is legitimate, but what about potential aliasing problems?
Is an aliasing problem possible here, or does the fact that foo is only initialized, and never modified save me?
The strict aliasing rule allows compilers to assume that references (and pointers) to unrelated types do not refer to the same underlying memory. This assumption allows lots of optimizations because it decouples operations on unrelated reference types as being completely independent.
#include <iostream>
int foo(int *x, long *y)
{
// foo can assume that x and y do not alias the same memory because they have unrelated types
// so it is free to reorder the operations on *x and *y as it sees fit
// and it need not worry that modifying one could affect the other
*x = -1;
*y = 0;
return *x;
}
int main()
{
long a;
int b = foo(reinterpret_cast<int*>(&a), &a); // violates strict aliasing rule
// the above call has UB because it both writes and reads a through an unrelated pointer type
// on return b might be either 0 or -1; a could similarly be arbitrary
// technically, the program could do anything because it's UB
std::cout << b << ' ' << a << std::endl;
return 0;
}
In this example, thanks to the strict aliasing rule, the compiler can assume in foo that setting *y cannot affect the value of *x. So, it can decide to just return -1 as a constant, for example. Without the strict aliasing rule, the compiler would have to assume that altering *y might actually change the value of *x. Therefore, it would have to enforce the given order of operations and reload *x after setting *y. In this example it might seem reasonable enough to enforce such paranoia, but in less trivial code doing so will greatly constrain reordering and elimination of operations and force the compiler to reload values much more often.
Here are the results on my machine when I compile the above program differently (Apple LLVM v6.0 for x86_64-apple-darwin14.1.0):
$ g++ -Wall test58.cc
$ ./a.out
0 0
$ g++ -Wall -O3 test58.cc
$ ./a.out
-1 0
In your first example, foo is a const char * and bar is a const MessageJ * reinterpret_cast'ed from foo. You further stipulate that the object's underlying type actually is a MessageJ and that no reads are done through the const char *. Instead, it is only cast to the const MessageJ *, from which only reads are then done. Since you neither read nor write through the const char * alias, there can be no aliasing optimization problem with your accesses through your second alias in the first place. This is because there are no potentially conflicting operations performed on the underlying memory through your aliases of unrelated types. However, even if you did read through foo, there could still be no potential problem, as such accesses are allowed by the type aliasing rules (see below) and any ordering of reads through foo or bar would yield the same results because there are no writes occurring here.
Let us now drop the const qualifiers from your example and presume that MessageServer does do some write operations on bar and furthermore that the function also reads through foo for some reason (e.g. - prints a hex dump of memory). Normally, there might be an aliasing problem here as we have reads and writes happening through two pointers to the same memory through unrelated types. However, in this specific example, we are saved by the fact that foo is a char*, which gets special treatment by the compiler:
Type aliasing
When a pointer or reference to object of type T1 is reinterpret_cast (or C-style cast) to a pointer or reference to object of a different type T2, the cast always succeeds, but the resulting pointer or reference may only be accessed if both T1 and T2 are standard-layout types and one of the following is true: ...
T2 is char or unsigned char
The strict-aliasing optimizations that are allowed for operations through references (or pointers) of unrelated types are specifically disallowed when a char reference (or pointer) is in play. The compiler instead must be paranoid that operations through the char reference (or pointer) can affect and be affected by operations done through other references (or pointers). In the modified example where reads and writes operate on both foo and bar, you can still have defined behavior because foo is a char*. Therefore, the compiler is not allowed to optimize to reorder or eliminate operations on your two aliases in ways that conflict with the serial execution of the code as written. Similarly, it is forced to be paranoid about reloading values that may have been affected by operations through either alias.
The answer to your question is that, so long as your functions are properly round tripping pointers to a type through a char* back to its original type, then your function is safe, even if you were to interleave reads (and potentially writes, see caveat at end of EDIT) through the char* alias with reads+writes through the underlying type alias.
These two technical references (3.10.10) are useful for answering your question. These other references help give a better understanding of the technical information.
====
EDIT: In the comments below, zmb objects that while char* can legitimately alias a different type, that the converse is not true as several sources seem to say in varying forms: that the char* exception to the strict aliasing rule is an asymmetric, "one-way" rule.
Let us modify my above strict-aliasing code example and ask would this new version similarly result in undefined behavior?
#include <iostream>
char foo(char *x, long *y)
{
// can foo assume that x and y cannot alias the same memory?
*x = -1;
*y = 0;
return *x;
}
int main()
{
long a;
char b = foo(reinterpret_cast<char*>(&a), &a); // explicitly allowed!
// if this is defined behavior then what must the values of b and a be?
std::cout << (int) b << ' ' << a << std::endl;
return 0;
}
I argue that this is defined behavior and that both a and b must be zero after the call to foo. From the C++ standard (3.10.10):
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:^52
the dynamic type of the object ...
a char or unsigned char type ...
^52: The intent of this list is to specify those circumstances in which an object may or may not be aliased.
In the above program, I am accessing the stored value of an object through both its actual type and a char type, so it is defined behavior and the results have to comport with the serial execution of the code as written.
Now, there is no general way for the compiler to always statically know in foo that the pointer x actually aliases y or not (e.g. - imagine if foo was defined in a library). Maybe the program could detect such aliasing at run time by examining the values of the pointers themselves or consulting RTTI, but the overhead this would incur wouldn't be worth it. Instead, the better way to generally compile foo and allow for defined behavior when x and y do happen to alias one another is to always assume that they could (i.e. - disable strict alias optimizations when a char* is in play).
Here's what happens when I compile and run the above program:
$ g++ -Wall test59.cc
$ ./a.out
0 0
$ g++ -O3 -Wall test59.cc
$ ./a.out
0 0
This output is at odds with the earlier, similar strict-aliasing program's. This is not dispositive proof that I'm right about the standard, but the different results from the same compiler provide decent evidence that I may be right (or, at least, that one important compiler seems to understand the standard the same way).
Let's examine some of the seemingly conflicting sources:
The converse is not true. Casting a char* to a pointer of any type other than a char* and dereferencing it is usually in violation of the strict aliasing rule. In other words, casting from a pointer of one type to pointer of an unrelated type through a char* is undefined.
The bolded bit is why this quote doesn't apply to the problem addressed by my answer nor the example I just gave. In both my answer and the example, the aliased memory is being accessed both through a char* and the actual type of the object itself, which can be defined behavior.
Both C and C++ allow accessing any object type via char * (or specifically, an lvalue of type char). They do not allow accessing a char object via an arbitrary type. So yes, the rule is a "one way" rule.
Again, the bolded bit is why this statement doesn't apply to my answers. In this and similar counter-examples, an array of characters is being accessed through a pointer of an unrelated type. Even in C, this is UB because the character array might not be aligned according to the aliased type's requirements, for example. In C++, this is UB because such access does not meet any of the type aliasing rules as the underlying type of the object actually is char.
In my examples, we first have a valid pointer to a properly constructed type that is then aliased by a char* and then reads and writes through these two aliased pointers are interleaved, which can be defined behavior. So, there seems to be some confusion and conflation out there between the strict aliasing exception for char and not accessing an underlying object through an incompatible reference.
int value;
int *p = &value;
char *q = reinterpret_cast<char*>(&value);
Both p and q refer to the same address, they are aliasing the same memory. What the language does is provide a set of rules defining the behaviors that are guaranteed: write through p read through q fine, other way around not fine.
The standard and many examples clearly state that "write through q, then read through p (or value)" can be well defined behavior. What is not as abundantly clear, but what I'm arguing for here, is that "write through p (or value), then read through q" is always well defined. I claim even further, that "reads and writes through p (or value) can be arbitrarily interleaved with reads and writes to q" with well defined behavior.
Now there is one caveat to the previous statement and why I kept sprinkling the word "can" throughout the above text. If you have a type T reference and a char reference that alias the same memory, then arbitrarily interleaving reads+writes on the T reference with reads on the char reference is always well defined. For example, you might do this to repeatedly print out a hex dump of the underlying memory as you modify it multiple times through the T reference. The standard guarantees that strict aliasing optimizations will not be applied to these interleaved accesses, which otherwise might give you undefined behavior.
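The hex dump case could be sketched like this. hex_dump is a made-up helper and the code assumes 8-bit bytes; it reads the object representation through an unsigned char alias while the object itself keeps being modified through its own type:

```cpp
#include <cstddef>
#include <string>

// Returns the object representation of *obj as lowercase hex,
// in memory order (so the text depends on the host's endianness).
template <typename T>
std::string hex_dump(const T* obj) {
    static const char digits[] = "0123456789abcdef";
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(obj);
    std::string out;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        out += digits[bytes[i] >> 4];
        out += digits[bytes[i] & 0x0f];
    }
    return out;
}
```

Calling hex_dump(&x), then assigning to x, then calling hex_dump(&x) again must show the new bytes, because the compiler has to assume the char-typed reads may alias x.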
But what about writes through a char reference alias? Well, such writes may or may not be well defined. If a write through the char reference violates an invariant of the underlying T type, then you can get undefined behavior. If such a write improperly modified the value of a T member pointer, then you can get undefined behavior. If such a write modified a T member value to a trap value, then you can get undefined behavior. And so on. However, in other instances, writes through the char reference can be completely well defined. Rearranging the endianness of a uint32_t or uint64_t by reading+writing to them through an aliased char reference is always well defined, for example. So, whether such writes are completely well defined or not depends on the particulars of the writes themselves. Regardless, the standard guarantees that its strict aliasing optimizations will not reorder or eliminate such writes w.r.t. other operations on the aliased memory in a manner that itself could lead to undefined behavior.
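The endianness example mentioned above might be written as follows; reverse_bytes is a hypothetical name. Every bit pattern is a valid uint32_t value, so writing its bytes through the char alias cannot break an invariant:

```cpp
#include <cstddef>
#include <cstdint>

// Reverses the bytes of value in place through an unsigned char alias.
inline void reverse_bytes(std::uint32_t& value) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&value);
    for (std::size_t i = 0, j = sizeof(value) - 1; i < j; ++i, --j) {
        unsigned char tmp = bytes[i];
        bytes[i] = bytes[j];
        bytes[j] = tmp;
    }
}
```

Starting from 0x11223344, the value afterwards is 0x44332211 regardless of the host's byte order.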
So my understanding is that you are doing something like that:
#include <iostream>
enum MType { J,K };
struct MessageX { MType type; };
struct MessageJ {
MType type{ J };
int id{ 5 };
//some other members
};
const char* popMessage() {
return reinterpret_cast<char*>(new MessageJ());
}
void MessageServer(const char* foo) {
const MessageX* msgx = reinterpret_cast<const MessageX*>(foo);
switch (msgx->type) {
case J: {
const MessageJ* msgJ = reinterpret_cast<const MessageJ*>(foo);
std::cout << msgJ->id << std::endl;
}
}
}
int main() {
const char* foo = popMessage();
MessageServer(foo);
}
If that is correct, then the expression msgJ->id is ok (as would be any access to foo), as msgJ has the correct dynamic type. msgx->type on the other hand does incur UB, because msgx has an unrelated type. The fact that the pointer to MessageJ was cast to const char* in between is completely irrelevant.
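One way to read the type tag without going through an unrelated struct type is to copy it out of the bytes. This is only a sketch: peek_type is an invented helper, and it assumes every message type is standard-layout and starts with an MType member.

```cpp
#include <cstring>

enum MType { J, K };

struct MessageJ {  // hypothetical layout: the tag is the first member
    MType type;
    int id;
};

// Copies the tag out of the raw bytes instead of reading it through
// a pointer to an unrelated struct type.
inline MType peek_type(const char* raw) {
    MType t;
    std::memcpy(&t, raw, sizeof t);
    return t;
}
```

The switch can then dispatch on peek_type(foo) and cast to the concrete message type only inside the matching case.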
As was cited by others, here is the relevant part in the standard (the "glvalue" is the result of dereferencing the pointer):
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type similar (as defined in 4.4) to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char or unsigned char type.
As far as the discussion "cast to char*" vs "cast from char*" is concerned:
You might know that the standard doesn't talk about strict aliasing as such, it only provides the list above. Strict aliasing is one analysis technique based on that list for compilers to determine which pointers can potentially alias each other. As far as optimizations are concerned, it doesn't make a difference, if a pointer to a MessageJ object was cast to char* or vice versa. The compiler cannot (without further analysis) assume that a char* and MessageX* point to distinct objects and will not perform any optimizations (e.g. reordering) based on that.
Of course that doesn't change the fact that accessing a char array via a pointer to a different type would still be UB in C++ (I assume mostly due to alignment issues) and the compiler might perform other optimizations that could ruin your day.
EDIT:
What I'm asking is: Will the initialization of a const object, by
casting from a char*, ever be optimized below where that object is
cast to another type of object, such that I am casting from
uninitialized data?
No it will not. Aliasing analysis doesn't influence how the pointer itself is handled, but the access through that pointer. The compiler will NOT reorder the write access (store memory address in the pointer variable) with the read access (copy to other variable / load of address in order to access the memory location) to the same variable.
There is no aliasing problem as you use (const)char* type, see the last point of:
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type similar (as defined in 4.4) to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char or unsigned char type.
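To illustrate the last bullet, here is a minimal sketch of the direction that is allowed: looking at the bytes of an existing object through unsigned char. The Message struct and the bytes_of helper are hypothetical stand-ins, since the asker's MessageJ is not shown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical message type, standing in for the asker's MessageJ.
struct Message {
    int id;
    int payload;
};

// Reading an object's bytes through unsigned char* is covered by the
// last bullet of the list above, whatever the dynamic type of the object.
std::vector<unsigned char> bytes_of(const Message& m) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&m);
    return std::vector<unsigned char>(p, p + sizeof m);
}
```

The opposite direction (treating an arbitrary char array as a Message) is not covered by that bullet.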
The other answer answered the question well enough (it's a direct quotation from the C++ standard in https://isocpp.org/files/papers/N3690.pdf page 75), so I'll just point out other problems in what you're doing.
Note that your code may run into alignment problems. For example, if the alignment of MessageJ is 4 or 8 bytes (typical on 32-bit and 64-bit machines), strictly speaking, it is undefined behaviour to access an arbitrary character array pointer as a MessageJ pointer.
You will usually get away with it on x86/AMD64 architectures, as they tolerate most unaligned accesses (it is still undefined behaviour as far as the language is concerned). However, someday you may find that the code you're developing is ported to a mobile ARM architecture and the unaligned access would be a problem then.
It therefore seems you're doing something you shouldn't be doing. I would consider using serialization instead of accessing a character array as a MessageJ type. Alignment isn't the only potential problem: the data may also have a different representation on 32-bit and 64-bit architectures.
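One way to sidestep the alignment issue, as a rough sketch rather than a full serialization scheme: copy the bytes into a real object with std::memcpy instead of casting the buffer pointer. Message and read_message are hypothetical names, and this assumes the type is trivially copyable. It does nothing about the representation differences between architectures mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for MessageJ; the real fields are not shown in the question.
struct Message {
    int id;
    int payload;
};

// Copy the bytes into a properly aligned Message instead of casting the
// buffer pointer. memcpy places no alignment requirement on its source.
Message read_message(const char* buffer, std::size_t size) {
    assert(size >= sizeof(Message));  // caller must supply enough bytes
    Message m;
    std::memcpy(&m, buffer, sizeof m);
    return m;
}
```

Because the result is a separate object, it no longer matters where in the character array the message started.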

What is the difference between these declarations in C?

In C and C++ what do the following declarations do?
const int * i;
int * const i;
const volatile int ip;
const int *i;
Are any of the above declarations wrong?
If not what is the meaning and differences between them?
What are the useful uses of above declarations (I mean in which situation we have to use them in C/C++/embedded C)?
const int * i;
i is a pointer to constant integer. i can be changed to point to a different value, but the value being pointed to by i can not be changed.
int * const i;
i is a constant pointer to a non-constant integer. The value pointed to by i can be changed, but i cannot be changed to point to a different value.
const volatile int ip;
This one is kind of tricky. The fact that ip is const means that the compiler will not let you change the value of ip through that name. You could still cast the constness away, e.g. by taking its address and using the const_cast operator, but writing to an object that was defined const is undefined behavior, so don't do it. The volatile qualifier indicates that any time ip is accessed, it should always be reloaded from memory, i.e. it should NOT be cached in a register. This prevents the compiler from making certain optimizations. You want to use the volatile qualifier when a variable can change outside the compiler's view, e.g. with memory-mapped I/O or a variable modified by a signal handler. (In standard C++ volatile does not make access from several threads safe; use std::atomic or a mutex for that.) Using const and volatile on the same variable is rather unusual (but legal) -- you'll usually see one but not the other.
const int *i;
This is the same as the first declaration.
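A small sketch of what each pointer declaration allows. The function name pointer_demo and the variables are made up for illustration; the commented-out lines are the ones a compiler rejects.

```cpp
#include <type_traits>

// const int* and int const* name the same type; int* const is a different one.
static_assert(std::is_same<const int*, int const*>::value, "same type");
static_assert(!std::is_same<const int*, int* const>::value, "different types");

// Hypothetical helper showing what each declaration permits.
int pointer_demo() {
    int a = 1, b = 2;

    const int* p = &a;   // pointer to const int
    p = &b;              // OK: the pointer itself may be reseated
    // *p = 5;           // error: the pointee is read-only through p

    int* const q = &a;   // const pointer to int
    *q = 10;             // OK: the pointee may be modified
    // q = &b;           // error: the pointer itself is const

    return *p + *q;      // b + a, i.e. 2 + 10
}
```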
You read variable declarations in C/C++ right-to-left, so to speak.
const int *i; // pointer to a constant int (the integer value doesn't change)
int *const i; // constant pointer to an int (what i points to doesn't change)
const volatile int ip; // a constant integer that is re-read from memory on every access
They each have their own purposes. Any C++ textbook or half decent resource will have explanations of each.

when should a member function be both const and volatile together?

I was reading about volatile member function and came across an affirmation that member function can be both const and volatile together. I didn't get the real use of such a thing. Can anyone please share their experience on practical usage of having member function as const and volatile together.
I wrote small class to test the same:
class Temp
{
public:
    Temp(int x) : X(x)
    {
    }
    int getX() const volatile
    {
        return X;
    }
    int getBiggerX()
    {
        return X + 10;
    }
private:
    int X;
};
void test(const volatile Temp& aTemp)
{
    int x = aTemp.getX();
}
int main(int argc, char* argv[])
{
    const volatile Temp aTemp(10);
    test(aTemp);
    return 0;
}
The cv qualification distilled means:
I won't change the value, but there is something out there that can.
You are making a promise to yourself that you won't change the value (const qualification) and requesting the compiler to keep its slimy hands off of this object and turn off all optimization (volatile qualification). Unfortunately, compiler vendors are not very consistent in how they treat volatile. And volatile is a hint to the compiler after all.
A practical use case of this is a system clock. Supposing 0xDEADBEEF was your system specific address of a hardware clock register you'd write:
int const volatile *c = reinterpret_cast<int *>(0xDEADBEEF);
You can't modify that register value, but each time you read it, it is likely to have a different value.
You can also use this to model UARTs.
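Since a real register address can't be dereferenced on an ordinary machine, here is a sketch that simulates the idea with a plain volatile variable. All names (simulated_clock, clock_view, tick, read_clock) are invented for the example; on real hardware the storage would be the device register and tick() would be the hardware itself.

```cpp
// Simulated register: in real code this would be a hardware address,
// here it is an ordinary volatile variable so the sketch can run anywhere.
volatile int simulated_clock = 0;

// Read-only, volatile view of the same storage.
const volatile int& clock_view = simulated_clock;

// Stand-in for the hardware updating the register behind our back.
void tick() { simulated_clock = simulated_clock + 1; }

int read_clock() {
    // clock_view = 0;   // error: cannot assign through a const view
    return clock_view;   // a volatile read, so the load is not optimized away
}
```

The code holding clock_view can only read, yet two consecutive reads may well return different values.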
You asked for a practical example of volatile member functions. Well, I can't think of one, because the only situations I could imagine are so low-level that I would not consider using a member function in the first place, but just a plain struct with data members accessed by a volatile reference.
However, let's put a const volatile function into it just for the sake of answering the question. Assume you have a port with address 0x378 that contains 2 integers, 4 bytes each. Then you could write
struct ints {
int first;
int second;
int getfirst() const volatile {
return first;
}
int getsecond() const volatile {
return second;
}
// note that you could also overload on volatile-ness, just like
// with const-ness
};
// could also be mapped by the linker.
ints const volatile &p = *reinterpret_cast<ints*>(0x378L);
You are stating
I'm not changing them, but another thing outside this abstract semantics could change it. So always do a real load from its address.
Actually, volatile signals that the value of an object might not be the value last stored into it but is actually unknown and might have been changed in between by external (not observable by the compiler) conditions. So when you read from a volatile object, the compiler has to emulate the exact abstract semantics, and perform no optimizations:
a = 4;
a *= 2;
// can't be optimized to a = 8; if a is volatile because the abstract
// semantics described by the language contain two assignments and one load.
The following already determines what volatile does. Everything can be found in 1.9 of the Standard. The parameters it talks about are implementation defined things, like the sizeof of some type.
The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below. [...]
A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible execution sequences of the corresponding instance of the abstract machine with the same program and the same input. [...]
The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.
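The comment in the ints struct earlier in this answer mentions that you can overload on volatile-ness just like on const-ness. A minimal sketch of that, with a hypothetical Probe type whose overloads simply report which one was picked:

```cpp
// Hypothetical type: each overload returns a different number so the
// selected one can be observed.
struct Probe {
    int which()                { return 0; }
    int which() const          { return 1; }
    int which() volatile       { return 2; }
    int which() const volatile { return 3; }
};

// The cv-qualification of the object expression selects the overload.
int via_plain(Probe& p)                         { return p.which(); }
int via_const(const Probe& p)                   { return p.which(); }
int via_volatile(volatile Probe& p)             { return p.which(); }
int via_const_volatile(const volatile Probe& p) { return p.which(); }
```

If only the const volatile overload existed, all four calls would still compile, which is why a getter like getfirst() above can be called on any ints object.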
I've never needed anything being both const and volatile, but here's my guess:
Const: You, your code, is not allowed to change the value.
Volatile: The value may change over time without your program doing anything.
So some read-only data exposed by another process or by some hardware would be const and volatile. It could even be memory-mapped into your process and the page marked read-only, so you'd get an access violation if you tried to write to it if it wasn't const.
I think that the reason we have "const volatile" functions is the same as the reason we have "protected" inheritance: The grammar allows it, so we had better think up a meaning for it.
One situation I can think of that could require both const and volatile on a member function would be in an embedded system where the function was logically const but actually had to modify a data cache in a shared memory location (e.g. building a bitmap on demand and caching the bitmap in case the same bitmap was needed again soon). It certainly does not come up very often.
An object marked as const volatile cannot be modified by the code where it is declared; any attempt is rejected with a compile error because of the const qualifier. The volatile part of the qualifier means that the compiler cannot optimize away or reorder accesses to the object.
In an embedded system this is typically used to access hardware registers that can be read and are updated by the hardware, so it makes no sense to be able to write to the register via the code. An example might be the status register of a serial port. Various bits will indicate a status, such as whether a character is waiting to be read. Each read of this status register could result in a different value depending on what else has occurred in the serial port hardware. It makes no sense to write to the status register, but you need to make sure that each read of the register results in an actual read of the hardware.
Below is an illustration:
// We assume that the pointers declared below
// point to the correct
// hardware addresses
unsigned int const volatile *status_reg;
unsigned char const volatile *recv_reg;
#define CHAR_READ 0x01
int get_next_char()
{
    while ((*status_reg & CHAR_READ) == 0)
        ;   // busy-wait until the hardware reports a received character
    return *recv_reg;
}
Hope this helps.
Regards
Sandipan Karmakar.