Finding endian-ness programmatically at compile-time using C++11

I have read many questions on SO about this topic, but couldn't find a solution so far. One natural solution was mentioned here: Determining endianness at compile time.
However, related problems are mentioned in the comments and in that same answer.
With some modifications, I am able to compile a similar solution with g++ and clang++ (-std=c++11) without any warning.
static_assert(sizeof(char) == 1, "sizeof(char) != 1");
union U1
{
int i;
char c[sizeof(int)];
};
union U2
{
char c[sizeof(int)];
int i;
};
constexpr U1 u1 = {1};
constexpr U2 u2 = {{1}};
constexpr bool IsLittleEndian ()
{
return u1.i == u2.c[0]; // ignore different type comparison
}
static_assert(IsLittleEndian(), "The machine is BIG endian");
Can this be considered a deterministic method to decide the endian-ness or does it miss type-punning or something else?

Since C++20 you can use std::endian from the <bit> header:
#include <bit>
int main()
{
static_assert(std::endian::native==std::endian::big,
"Not a big endian platform!");
}
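For toolchains that predate C++20, a common fallback is to read the byte order from predefined compiler macros. The following is only a sketch under one loud assumption: it relies on the GCC/Clang macros __BYTE_ORDER__, __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__, which are compiler extensions and not part of the C++ standard. The names Endian and kNativeEndian are made up for illustration.

```cpp
// Compile-time byte order for C++11, based on compiler-specific macros.
// Assumption: GCC/Clang-style predefined macros are available.
enum class Endian { little, big, unknown };

#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    defined(__ORDER_BIG_ENDIAN__)
constexpr Endian kNativeEndian =
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__) ? Endian::little
    : (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)  ? Endian::big
                                                : Endian::unknown;
#else
// No known macro: report "unknown" instead of guessing.
constexpr Endian kNativeEndian = Endian::unknown;
#endif

constexpr bool IsLittleEndian() { return kNativeEndian == Endian::little; }
constexpr bool IsBigEndian() { return kNativeEndian == Endian::big; }

// The two predicates can never hold together; both are false when unknown.
static_assert(!(IsLittleEndian() && IsBigEndian()),
              "byte order cannot be both little and big");
```

On a compiler that does not define these macros the sketch reports Endian::unknown, so a static_assert on IsLittleEndian() would fail there rather than silently pass.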

Your attempt is no different from this obviously non-working one (where IsLittleEndian() is identical to true):
constexpr char c[sizeof(int)] = {1};
constexpr int i = {1};
constexpr bool IsLittleEndian ()
{
return i == c[0]; // ignore different type comparison
}
static_assert(IsLittleEndian(), "The machine is BIG endian");
I believe that C++11 doesn't provide a means to programmatically determine the endianness of the target platform at compile time. My argument is that the only valid way to perform that check at run time is to examine an int variable through an unsigned char pointer (since other ways of type punning inevitably involve undefined behavior):
const uint32_t i = 0xffff0000;
bool isLittleEndian() {
return 0 == *reinterpret_cast<const unsigned char*>(&i);
}
C++11 doesn't allow this function to be made constexpr, so the check cannot be performed at compile time.
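As a variation on the run-time check, here is a small sketch that copies the lowest-addressed byte out with std::memcpy instead of dereferencing a cast pointer. Both forms inspect the object representation through byte access; the helper names (first_byte_of, is_little_endian_runtime, is_big_endian_runtime) are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Returns the byte stored at the lowest address of a 32-bit value.
// std::memcpy reads the object representation without aliasing concerns.
inline unsigned char first_byte_of(std::uint32_t value) {
    unsigned char byte = 0;
    std::memcpy(&byte, &value, 1);
    return byte;
}

// Little endian: the least significant byte (0x04) comes first.
inline bool is_little_endian_runtime() {
    return first_byte_of(0x01020304u) == 0x04;
}

// Big endian: the most significant byte (0x01) comes first.
inline bool is_big_endian_runtime() {
    return first_byte_of(0x01020304u) == 0x01;
}
```

Neither function can be constexpr in C++11, which is exactly the limitation described above; on an exotic mixed-endian machine both would return false.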

Related

Does using compare_exchange with C++20 compare the value representations? (why doesn't this example agree)

From this link
Atomically compares the object representation (until C++20) or value representation (since C++20) of *this with that of expected, and if those are bitwise-equal, replaces the former with desired (performs read-modify-write operation). Otherwise, loads the actual value stored in *this into expected (performs load operation).
Hence, with C++20, the while loop in the following code should be infinite, but it is finite. Am I wrong, or what is happening?
#include <atomic>
#include <iostream>
struct S {
char a{};
int b{};
};
bool operator==(const S& lhs, const S& rhs) {
return lhs.a == rhs.a && lhs.b == rhs.b;
}
int main() {
S expected{ 'a', 2 };
std::atomic<S> atomicS{ S{'a', 2} };
reinterpret_cast<unsigned char*>(&(atomicS))[1] = 'e';
reinterpret_cast<unsigned char*>(&(expected))[1] = 'f';
while (atomicS.compare_exchange_strong(expected, S{ 'a',2 }));
std::cout << "\nfinished";
}
This change (to use value representation instead of object representation) was done as part of P0528R3 (the motivation and story can be found in P0528R0). As you can see under cppreference's compiler support, neither gcc nor clang implement this feature yet. MSVC does on 19.28, but that's not available on compiler explorer, so I cannot check that to verify at the moment.
So at the moment, you're effectively verifying the old behavior.
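To make the difference between the two representations concrete, here is a minimal sketch (no atomics involved) in which two objects compare equal by value while their bytes differ. It assumes that byte 1 of S is padding, which holds on mainstream ABIs where int is aligned to more than one byte; the function name is hypothetical, and the default member initializers from the question are dropped so that S stays an aggregate in C++11.

```cpp
#include <cstddef>
#include <cstring>

struct S {
    char a;
    int b;
};

inline bool operator==(const S& lhs, const S& rhs) {
    return lhs.a == rhs.a && lhs.b == rhs.b;
}

// Assumption: there is padding directly after S::a.
static_assert(offsetof(S, b) > 1, "this sketch needs padding after S::a");

// Returns true when the two objects are equal by value (value
// representation) but differ bytewise (object representation).
inline bool equal_value_different_bytes() {
    S x{'a', 2};
    S y{'a', 2};
    reinterpret_cast<unsigned char*>(&x)[1] = 'e';  // write into padding
    reinterpret_cast<unsigned char*>(&y)[1] = 'f';  // different padding byte
    return x == y && std::memcmp(&x, &y, sizeof(S)) != 0;
}
```

An implementation that still compares object representations sees the differing padding byte on the first compare_exchange_strong call, fails, and copies the stored bytes into expected, which is why the loop in the question terminates.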

Can you have constexpr rvalues? [duplicate]

Can you have constexpr rvalues, e.g. when initializing variables using the result of several constexpr functions?
i.e. can I guarantee that an rvalue is computed at compile time regardless of compiler settings?
constexpr int getvalue1()
{
return 42;
}
constexpr int getvalue2()
{
return 24;
}
int main()
{
// I want to initialize val with a value known at compile time
constexpr int ceval = getvalue1() + getvalue2();
int val = ceval;
// why can't I just do:
//
// int val = constexpr getvalue1() + constexpr getvalue2();
}
https://godbolt.org/z/KcK23k
Just use:
int val = getvalue1() + getvalue2();
The optimizer will take care of it. If you disable the optimizer, then yes, the compiler will issue calls to these functions, otherwise you wouldn't be able to set breakpoints and step into them.
Even if you use C++20's consteval specifier which requires the functions to produce a constant expression, the compiler will still issue calls if you disable the optimizer:
consteval int getvalue1()
{ return 42; }
consteval int getvalue2()
{ return 24; }
// ...
int val = getvalue1() + getvalue2();
So long story short: just use the optimizer. If you force the issue the way you did, through an intermediate constexpr variable, all you're doing is making debugging more difficult when you end up having to actually debug constexpr or consteval functions.
With C++11 functionality, you can’t guarantee this, although most compilers will do this for such a simple case if you don’t turn optimisations off.
C++20 adds the constinit keyword for just this purpose, but this only works for static or thread-local variables.
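If the goal is to force constant evaluation without naming an intermediate variable, one C++11 option is to route the expression through a template argument, because template arguments must be constant expressions. This is only a sketch: FORCE_CONSTEXPR and initial_value are hypothetical names, not standard facilities.

```cpp
#include <type_traits>

constexpr int getvalue1() { return 42; }
constexpr int getvalue2() { return 24; }

// A non-type template argument must be a constant expression, so the
// compiler has to evaluate expr itself, whatever the optimization level.
// FORCE_CONSTEXPR is a hypothetical helper macro.
#define FORCE_CONSTEXPR(expr) \
    (std::integral_constant<decltype(expr), (expr)>::value)

inline int initial_value() {
    // Equivalent in effect to the ceval/val pair from the question.
    int val = FORCE_CONSTEXPR(getvalue1() + getvalue2());
    return val;
}
```

The macro only works for expressions whose type is allowed as a non-type template parameter (integers, enumerations and so on), and it fails to compile if the expression is not a constant expression.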

Using CRC32 algorithm to hash string at compile-time

Basically I want in my code to be able to do this:
Engine.getById(WSID('some-id'));
Which should get transformed into
Engine.getById('1a61bc96');
just before being compiled into asm. So at compile-time.
This is my try
constexpr int WSID(const char* str) {
boost::crc_32_type result;
result.process_bytes(str,sizeof(str));
return result.checksum();
}
But I get this when trying to compile with MSVC 18 (CTP November 2013)
error C3249: illegal statement or sub-expression for 'constexpr' function
How can I get the WSID function to work, this way or any other, as long as it is evaluated at compile time?
Tried this: Compile time string hashing
warning C4592: 'crc32': 'constexpr' call evaluation failed; function will be called at run-time
EDIT:
I first heard about this technique in Game Engine Architecture by Jason Gregory. I contacted the author, who obligingly replied:
What we do is to pass our source code through a custom little pre-processor that searches for text of the form SID('xxxxxx') and converts whatever is between the single quotes into its hashed equivalent as a hex literal (0xNNNNNNNN). [...]
You could conceivably do it via a macro and/or some template metaprogramming, too, although as you say it's tricky to get the compiler to do this kind of work for you. It's not impossible, but writing a custom tool is easier and much more flexible. [...]
Note also that we chose single quotes for SID('xxxx') literals. This was done so that we'd get some reasonable syntax highlighting in our code editors, yet if something went wrong and some un-preprocessed code ever made it thru to the compiler, it would throw a syntax error because single quotes are normally reserved for single-character literals.
Note also that it's crucial to have your little pre-processing tool cache the strings in a database of some sort, so that the original strings can be looked up given the hash code. When you are debugging your code and you inspect a StringId variable, the debugger will normally show you the rather unintelligible hash code. But with a SID database, you can write a plug-in that converts these hash codes back to their string equivalents. That way, you'll see SID('foo') in your watch window, not 0x75AE3080 [...]. Also, the game should be able to load this same database, so that it can print strings instead of hex hash codes on the screen for debugging purposes [...].
But while preprocessing has some major advantages, it means that I have to set up some kind of output system for the modified files (they would be stored elsewhere, and then we need to tell MSVC about them). So it might complicate the build. Is there a way to preprocess files with Python, for instance, without headaches? But this is not the question, and I'm still interested in using a compile-time function (for the cache I could use an ID index).
Here is a solution that works entirely at compile time, but may also be used at runtime. It is a mix of constexpr, templates and macros. You may want to change some of the names or put them in a separate file since they are quite short.
Note that I reused code from this answer for the CRC table generation and I based myself off of code from this page for the implementation.
I have not tested it on MSVC since I don't currently have it installed in my Windows VM, but I believe it should work, or at least be made to work with trivial changes.
Here is the code, you may use the crc32 function directly, or the WSID function that more closely matches your question :
#include <cstring>
#include <cstdint>
#include <iostream>
// Generate CRC lookup table
template <unsigned c, int k = 8>
struct f : f<((c & 1) ? 0xedb88320 : 0) ^ (c >> 1), k - 1> {};
template <unsigned c> struct f<c, 0>{enum {value = c};};
#define A(x) B(x) B(x + 128)
#define B(x) C(x) C(x + 64)
#define C(x) D(x) D(x + 32)
#define D(x) E(x) E(x + 16)
#define E(x) F(x) F(x + 8)
#define F(x) G(x) G(x + 4)
#define G(x) H(x) H(x + 2)
#define H(x) I(x) I(x + 1)
#define I(x) f<x>::value ,
constexpr unsigned crc_table[] = { A(0) };
// Constexpr implementation and helpers
// Works on const char* throughout: casting the pointer to uint8_t* is not
// allowed in a constant expression, so each byte is converted instead.
constexpr uint32_t crc32_impl(const char* p, size_t len, uint32_t crc) {
return len ?
crc32_impl(p+1,len-1,(crc>>8)^crc_table[(crc&0xFF)^static_cast<uint8_t>(*p)])
: crc;
}
constexpr uint32_t crc32(const char* data, size_t length) {
return ~crc32_impl(data, length, ~0);
}
constexpr size_t strlen_c(const char* str) {
return *str ? 1+strlen_c(str+1) : 0;
}
constexpr int WSID(const char* str) {
return crc32(str, strlen_c(str));
}
// Example usage
using namespace std;
int main() {
cout << "The CRC32 is: " << hex << WSID("some-id") << endl;
}
The first part takes care of generating the table of constants, while crc32_impl is a standard CRC32 implementation converted to a recursive style that works with a C++11 constexpr.
Then crc32 and WSID are just simple wrappers for convenience.
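Note that calling a constexpr function with a literal does not by itself force compile-time evaluation; the call has to appear where a constant expression is required. The sketch below shows one such place, case labels. To stay short and self-contained it uses FNV-1a rather than CRC-32 (a deliberately swapped-in hash), and fnv1a and command_code are hypothetical names.

```cpp
#include <cstdint>

// FNV-1a (32-bit), written recursively so that it is valid C++11 constexpr.
// Note: this is FNV-1a, not CRC-32; it is used here only because it is short.
constexpr std::uint32_t fnv1a(const char* s,
                              std::uint32_t hash = 2166136261u) {
    return *s ? fnv1a(s + 1,
                      (hash ^ static_cast<unsigned char>(*s)) * 16777619u)
              : hash;
}

// Hypothetical dispatcher: case labels must be constant expressions, so
// each fnv1a("...") label is hashed by the compiler, while the switch
// operand is hashed at run time.
inline int command_code(const char* name) {
    switch (fnv1a(name)) {
        case fnv1a("start"): return 1;
        case fnv1a("stop"):  return 2;
        default:             return 0;
    }
}
```

The same pattern should apply to a constexpr CRC-32 such as the one above: putting the call in a case label, a static_assert, or the initializer of a constexpr variable guarantees that it is evaluated during compilation.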
If anyone is interested, I coded up a CRC-32 table generator function and code generator function using C++14 style constexpr functions. The result is, in my opinion, much more maintainable code than many other attempts I have seen on the internet and it stays far, far away from the preprocessor.
Now, it does use a custom std::array 'clone' called cexp::array, because G++ does not seem to have added the constexpr keyword to the non-const reference index access/write operator.
However, it is quite light-weight, and hopefully the keyword will be added to std::array in the near future. But for now, the very simple array implementation is as follows:
namespace cexp
{
// Small implementation of std::array, needed until constexpr
// is added to the function 'reference operator[](size_type)'
template <typename T, std::size_t N>
struct array {
T m_data[N];
using value_type = T;
using reference = value_type &;
using const_reference = const value_type &;
using size_type = std::size_t;
// This is NOT constexpr in std::array until C++17
constexpr reference operator[](size_type i) noexcept {
return m_data[i];
}
constexpr const_reference operator[](size_type i) const noexcept {
return m_data[i];
}
constexpr size_type size() const noexcept {
return N;
}
};
}
Now, we need to generate the CRC-32 table. I based the algorithm off some Hacker's Delight code, and it can probably be extended to support the many other CRC algorithms out there. But alas, I only required the standard implementation, so here it is:
// Generates CRC-32 table, algorithm based from this link:
// http://www.hackersdelight.org/hdcodetxt/crc.c.txt
constexpr auto gen_crc32_table() {
constexpr auto num_bytes = 256;
constexpr auto num_iterations = 8;
constexpr auto polynomial = 0xEDB88320;
auto crc32_table = cexp::array<uint32_t, num_bytes>{};
for (auto byte = 0u; byte < num_bytes; ++byte) {
auto crc = byte;
for (auto i = 0; i < num_iterations; ++i) {
auto mask = -(crc & 1);
crc = (crc >> 1) ^ (polynomial & mask);
}
crc32_table[byte] = crc;
}
return crc32_table;
}
Next, we store the table in a global and perform rudimentary static checking on it. This checking could most likely be improved, and it is not necessary to store it in a global.
// Stores CRC-32 table and softly validates it.
static constexpr auto crc32_table = gen_crc32_table();
static_assert(
crc32_table.size() == 256 &&
crc32_table[1] == 0x77073096 &&
crc32_table[255] == 0x2D02EF8D,
"gen_crc32_table generated unexpected result."
);
Now that the table is generated, it's time to generate the CRC-32 codes. I again based the algorithm off the Hacker's Delight link, and at the moment it only supports input from a c-string.
// Generates CRC-32 code from null-terminated, c-string,
// algorithm based from this link:
// http://www.hackersdelight.org/hdcodetxt/crc.c.txt
constexpr auto crc32(const char *in) {
auto crc = 0xFFFFFFFFu;
for (auto i = 0u; auto c = in[i]; ++i) {
crc = crc32_table[(crc ^ c) & 0xFF] ^ (crc >> 8);
}
return ~crc;
}
For the sake of completeness, I generate one CRC-32 code below, statically check that it has the expected value, and then print it to the output stream.
int main() {
constexpr auto crc_code = crc32("some-id");
static_assert(crc_code == 0x1A61BC96, "crc32 generated unexpected result.");
std::cout << std::hex << crc_code << std::endl;
}
Hopefully this helps anyone else that was looking to achieve compile time generation of CRC-32, or even in general.
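The lookup table is an optimization rather than a requirement. As a cross-check, here is a table-free sketch of the same CRC-32 (reflected polynomial 0xEDB88320) that processes one bit at a time. It needs C++14 or later because of the loops, and crc32_bitwise is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise (table-free) CRC-32 over a null-terminated string.
// Requires C++14 or later: loops inside a constexpr function.
constexpr std::uint32_t crc32_bitwise(const char* in) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; in[i] != '\0'; ++i) {
        crc ^= static_cast<unsigned char>(in[i]);
        for (int bit = 0; bit < 8; ++bit) {
            // mask is all ones when the low bit is set, otherwise zero
            const std::uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}

// "123456789" is the conventional CRC-32 check input.
static_assert(crc32_bitwise("123456789") == 0xCBF43926u,
              "unexpected CRC-32 check value");
```

Since both variants implement the same checksum, they should agree on every input; the table version simply trades 1 KiB of constants for fewer operations per byte.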
@tux3's answer is pretty slick! Hard to maintain, though, because you are basically writing your own implementation of CRC32 in preprocessor commands.
Another way to solve your question is to go back and understand the need for the requirement first. If I understand you right, the concern seems to be performance. In that case, there is a second point of time you can call your function without performance impact: at program load time. In that case, you would be accessing a global variable instead of passing a constant. Performance-wise, after initialization both should be identical (a const fetches 32 bits from your code, a global variable fetches 32 bits from a regular memory location).
You could do something like this:
// requires <cstring> and <boost/crc.hpp>
static int myWSID = 0;
// don't call this directly
static int WSID(const char* str) {
boost::crc_32_type result;
// hash the whole string; sizeof(str) would only be the size of the pointer
result.process_bytes(str, std::strlen(str));
return result.checksum();
}
// Put this early into your program into the
// initialization code.
...
myWSID = WSID("some-id");
Depending on your overall program, you may want to have an inline accessor to retrieve the value.
If a minor performance impact is acceptable, you could also write your function like this, basically using the singleton pattern.
// requires <cstring> and <boost/crc.hpp>
// don't call this directly
int WSID(const char* str) {
boost::crc_32_type result;
// hash the whole string; sizeof(str) would only be the size of the pointer
result.process_bytes(str, std::strlen(str));
return result.checksum();
}
// call this instead. Note the hard-coded ID string.
// Create one such function for each ID you need to
// have available.
static int myWSID() {
// Note: not thread safe!
static int computedId = 0;
if (computedId == 0)
computedId = WSID("some-id");
return computedId;
}
Of course, if the reason for asking for compile-time evaluation is something different (such as, not wanting some-id to appear in the compiled code), these techniques won't help.
The other option is to use Jason Gregory's suggestion of a custom preprocessor. It can be done fairly cleanly if you collect all the IDs into a separate file. This file doesn't need to have C syntax. I'd give it an extension such as .wsid. The custom preprocessor generates a .h file from it.
Here is how this could look:
idcollection.wsid (before custom preprocessor):
some_id1
some_id2
some_id3
Your preprocessor would generate the following idcollection.h:
#define WSID_some_id1 0xabcdef12
#define WSID_some_id2 0xbcdef123
#define WSID_some_id3 0xcdef1234
And in your code, you'd call
Engine.getById(WSID_some_id1);
A few notes about this:
This assumes that all the original IDs can be converted into valid identifiers. If they contain special characters, your preprocessor may need to do additional munging.
I notice a mismatch in your original question. Your function returns an int, but Engine.getById seems to take a string. My proposed code would always use int (easy to change if you want always string).
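The generator itself can be very small. Here is a minimal sketch of its core, assuming CRC-32 as the hash and IDs that are already valid identifier suffixes; crc32_of and make_wsid_header are hypothetical names, and a real tool would additionally read idcollection.wsid and write the result to idcollection.h.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Plain run-time CRC-32 (reflected polynomial 0xEDB88320, bitwise variant).
inline std::uint32_t crc32_of(const std::string& text) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : text) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// Builds the text of the generated header: one #define per ID.
// Assumes every ID is already a valid identifier suffix.
inline std::string make_wsid_header(const std::vector<std::string>& ids) {
    std::ostringstream out;
    for (const std::string& id : ids) {
        out << "#define WSID_" << id << " 0x" << std::hex << std::setw(8)
            << std::setfill('0') << crc32_of(id) << '\n';
    }
    return out.str();
}
```

Hooking such a tool into the build as a pre-build step keeps the generated header out of the hand-edited sources, and the same ID list can feed the hash-to-string database mentioned in the question.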

Why aren't fields from constant POD object constants themselves?

I want to specialize a template for a certain GUID, which is a 16 byte struct. The GUID object has internal linkage, so I can't use the address of the object itself, but I thought I could use the contents of the object, since the object was a constant. But this doesn't work, as illustrated by this example code:
struct S
{
int const i;
};
S const s = { 42 };
char arr[s.i];
Why isn't s.i a constant if s is? Any workaround?
The initialization of the struct s can happen at run time. However, the size of an array must be known at compile time. The language does not treat s.i as a constant expression (only const objects of integral or enumeration type initialized with a constant qualify), so the compiler just sees a variable being used where a constant is required. The issue isn't constness; it's when the size of the array is needed.
You may be misunderstanding what const means. It only means that after the variable is initialized, it is never changed. For example this is legal:
void func(int x){
const int i = x*5; //can't be known at compile-time, but still const
//int array[i]; //<-- this would be illegal even though i is const
}
int main(){
int i;
std::cin >> i;
func(i);
return 0;
}
To get around this limitation, in C++11 you can mark it as constexpr to indicate that the value can be determined at compile time. This seems to be what you want.
struct S
{
int const i;
};
int main(){
constexpr S const s = { 42 };
char arr[s.i];
return 0;
}
compile with:
$ c++ -std=c++11 -pedantic file.cpp
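The same idea carries over to the original goal of specializing a template on the contents of a GUID. The following is only a sketch: Guid is a simplified, hypothetical stand-in with two fields instead of the real 16-byte layout, and kWidgetGuid and InterfaceId are made-up names.

```cpp
// Hypothetical, simplified stand-in for a GUID (the real one has 16 bytes).
struct Guid {
    unsigned long data1;
    unsigned short data2;
};

// constexpr (rather than plain const) makes the members usable in
// constant expressions.
constexpr Guid kWidgetGuid = {0x12345678UL, 0x9ABC};

// Primary template: unknown GUIDs map to id 0.
template <unsigned long D1, unsigned short D2>
struct InterfaceId {
    static constexpr int value = 0;
};

// Specialization selected by the *contents* of kWidgetGuid.
template <>
struct InterfaceId<kWidgetGuid.data1, kWidgetGuid.data2> {
    static constexpr int value = 1;
};

static_assert(InterfaceId<kWidgetGuid.data1, kWidgetGuid.data2>::value == 1,
              "specialization should be picked");
```

Because only the field values are used as template arguments, the internal linkage of the GUID object no longer matters.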
In C99, what you're doing is legal inside a function: the size of an automatic array does not need to be known at compile time (variable-length arrays).
struct S
{
int const i;
};
int main(){
struct S const s = { 42 };
char arr[s.i];
return 0;
}
compile with:
$ cc -std=c99 -pedantic file.c
At least most of the time, const really means something much closer to "read-only" than to "constant". In C89/90, essentially all it means is "read-only". C++ adds some circumstances in which it can be a constant, but that still isn't the case even close to all the time (and, unfortunately, keeping track of exactly what it means when is non-trivial).
Fortunately, the "workaround" is to write your code the way you almost certainly should in any case:
std::vector<char> arr(s.i);
Bottom line: most use of a built-in array in C++ should be considered suspect. The fact that you can initialize a vector from a non-constant expression is only one of many advantages.

strict aliasing and alignment

I need a safe way to alias between arbitrary POD types, conforming to ISO-C++11 explicitly considering 3.10/10 and 3.11 of n3242 or later.
There are a lot of questions about strict aliasing here, most of them regarding C and not C++. I found a "solution" for C which uses unions, probably using this section
union type that includes one of the aforementioned types among its
elements or nonstatic data members
From that I built this.
#include <iostream>
template <typename T, typename U>
T& access_as(U* p)
{
union dummy_union
{
U dummy;
T destination;
};
dummy_union* u = (dummy_union*)p;
return u->destination;
}
struct test
{
short s;
int i;
};
int main()
{
int buf[2];
static_assert(sizeof(buf) >= sizeof(double), "");
static_assert(sizeof(buf) >= sizeof(test), "");
access_as<double>(buf) = 42.1337;
std::cout << access_as<double>(buf) << '\n';
access_as<test>(buf).s = 42;
access_as<test>(buf).i = 1234;
std::cout << access_as<test>(buf).s << '\n';
std::cout << access_as<test>(buf).i << '\n';
}
My question is, just to be sure, is this program legal according to the standard?*
It doesn't give any warnings whatsoever and works fine when compiling with MinGW/GCC 4.6.2 using:
g++ -std=c++0x -Wall -Wextra -O3 -fstrict-aliasing -o alias.exe alias.cpp
* Edit: And if not, how could one modify this to be legal?
This will never be legal, no matter what kind of contortions you perform with weird casts and unions and whatnot.
The fundamental fact is this: two objects of different type may never alias in memory, with a few special exceptions (see further down).
Example
Consider the following code:
void sum(double& out, float* in, int count) {
for(int i = 0; i < count; ++i) {
out += *in++;
}
}
Let's break that out into local register variables to model actual execution more closely:
void sum(double& out, float* in, int count) {
for(int i = 0; i < count; ++i) {
register double out_val = out; // (1)
register double in_val = *in; // (2)
register double tmp = out_val + in_val;
out = tmp; // (3)
in++;
}
}
Suppose that (1), (2) and (3) represent a memory read, read and write, respectively, which can be very expensive operations in such a tight inner loop. A reasonable optimization for this loop would be the following:
void sum(double& out, float* in, int count) {
register double tmp = out; // (1)
for(int i = 0; i < count; ++i) {
register double in_val = *in; // (2)
tmp = tmp + in_val;
in++;
}
out = tmp; // (3)
}
This optimization reduces the number of memory reads needed by half and the number of memory writes to 1. This can have a huge impact on the performance of the code and is a very important optimization for all optimizing C and C++ compilers.
Now, suppose that we don't have strict aliasing. Suppose that a write to an object of any type can affect any other object. Suppose that writing to a double can affect the value of a float somewhere. This makes the above optimization suspect, because it's possible the programmer has in fact intended for out and in to alias so that the sum function's result is more complicated and is affected by the process. Sounds stupid? Even so, the compiler cannot distinguish between "stupid" and "smart" code. The compiler can only distinguish between well-formed and ill-formed code. If we allow free aliasing, then the compiler must be conservative in its optimizations and must perform the extra store (3) in each iteration of the loop.
Hopefully you can see now why no such union or cast trick can possibly be legal. You cannot circumvent fundamental concepts like this by sleight of hand.
Exceptions to strict aliasing
The C and C++ standards make special provision for aliasing any type with char, and with any "related type" which among others includes derived and base types, and members, because being able to use the address of a class member independently is so important. You can find an exhaustive list of these provisions in this answer.
Furthermore, GCC makes special provision for reading from a different member of a union than what was last written to. Note that this kind of conversion-through-union does not in fact allow you to violate aliasing. Only one member of a union is allowed to be active at any one time, so for example, even with GCC the following would be undefined behavior:
union {
double d;
float f[2];
};
f[0] = 3.0f;
f[1] = 5.0f;
sum(d, f, 2); // UB: attempt to treat two members of
// a union as simultaneously active
Workarounds
The only standard way to reinterpret the bits of one object as the bits of an object of some other type is to use an equivalent of memcpy. This makes use of the special provision for aliasing with char objects, in effect allowing you to read and modify the underlying object representation at the byte level. For example, the following is legal, and does not violate strict aliasing rules:
int a[2];
double d;
static_assert(sizeof(a) == sizeof(d), "int[2] and double must have the same size");
memcpy(a, &d, sizeof(d));
This is semantically equivalent to the following code:
int a[2];
double d;
static_assert(sizeof(a) == sizeof(d), "int[2] and double must have the same size");
for(size_t i = 0; i < sizeof(a); ++i)
((char*)a)[i] = ((char*)&d)[i];
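The memcpy idiom can be wrapped in a small generic helper, in the spirit of C++20's std::bit_cast. This is a sketch in C++11 under the assumption that both types are trivially copyable and have the same size; pun_cast and float_bits are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// memcpy-based reinterpretation of the bytes of one object as another type.
template <typename To, typename From>
To pun_cast(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                      std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To to;  // requires To to be default-constructible
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Example: view the bytes of a float as a 32-bit unsigned integer.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this example assumes a 32-bit float");
    return pun_cast<std::uint32_t>(f);
}
```

Optimizing compilers typically turn the fixed-size memcpy into a plain register move, so the helper usually costs nothing compared to a cast.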
GCC makes a provision for reading from an inactive union member, implicitly making it active. From the GCC documentation:
The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above will work as expected. See Structures unions enumerations and bit-fields implementation. However, this code might not:
int f() {
union a_union t;
int* ip;
t.d = 3.0;
ip = &t.i;
return *ip;
}
Similarly, access by taking the address, casting the resulting pointer and dereferencing the result has undefined behavior, even if the cast uses a union type, e.g.:
int f() {
double d = 3.0;
return ((union a_union *) &d)->i;
}
Placement new
(Note: I'm going by memory here as I don't have access to the standard right now).
Once you placement-new an object into a storage buffer, the lifetime of the underlying storage objects ends implicitly. This is similar to what happens when you write to a member of a union:
union {
int i;
float f;
} u;
// No member of u is active. Neither i nor f refer to an lvalue of any type.
u.i = 5;
// The member u.i is now active, and there exists an lvalue (object)
// of type int with the value 5. No float object exists.
u.f = 5.0f;
// The member u.i is no longer active,
// as its lifetime has ended with the assignment.
// The member u.f is now active, and there exists an lvalue (object)
// of type float with the value 5.0f. No int object exists.
Now, let's look at something similar with placement-new:
#define MAX_(x, y) ((x) > (y) ? (x) : (y))
// new returns suitably aligned memory
char* buffer = new char[MAX_(sizeof(int), sizeof(float))];
// Currently, only char objects exist in the buffer.
new (buffer) int(5);
// An object of type int has been constructed in the memory pointed to by buffer,
// implicitly ending the lifetime of the underlying storage objects.
new (buffer) float(5.0f);
// An object of type float has been constructed in the memory pointed to by buffer,
// implicitly ending the lifetime of the int object that previously occupied the same memory.
This kind of implicit end-of-lifetime can only occur for types with trivial constructors and destructors, for obvious reasons.
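Here is a compact sketch of the same storage reuse with a stack buffer instead of new char[]. It uses the pointers returned by placement new for all accesses and aligns the buffer explicitly; reuse_storage is a hypothetical name.

```cpp
#include <new>

// Reuses one suitably aligned buffer first for an int, then for a float.
inline float reuse_storage() {
    alignas(int) alignas(float) unsigned char
        buffer[sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float)];

    int* ip = new (buffer) int(5);  // an int object now lives in buffer
    const int seen = *ip;           // access through the returned pointer

    // Constructing a float ends the lifetime of the int; ip must no longer
    // be dereferenced after this line.
    float* fp = new (buffer) float(seen + 0.5f);
    return *fp;
}
```

The point of going through ip and fp rather than casting buffer is that only the pointer returned by placement new is known to refer to the newly created object.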
Aside from the error when sizeof(T) > sizeof(U), another possible problem is that the union has an appropriate, and possibly stricter, alignment than U because of T.
If you don't instantiate this union, so that its memory block is properly aligned (and large enough!), and then fetch the member of destination type T, it will break silently in the worst case.
For example, an alignment error occurs if you do a C-style cast from U*, where U requires 4-byte alignment, to dummy_union*, where dummy_union requires 8-byte alignment because alignof(T) == 8. After that, you may read the union member of type T at an address aligned to 4 instead of 8 bytes.
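Whether a given address is acceptable for a type can be checked at run time. The sketch below shows such a check; is_aligned_for and misaligned_example are hypothetical names, and the example assumes alignof(double) is greater than 1, which is the case on common platforms.

```cpp
#include <cstddef>
#include <cstdint>

// True when p satisfies the alignment requirement of T.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// A buffer aligned for double is fine at offset 0 but not at offset 1.
inline bool misaligned_example() {
    alignas(double) unsigned char storage[2 * sizeof(double)] = {};
    return is_aligned_for<double>(storage) &&
           !is_aligned_for<double>(storage + 1);
}
```

A check like this only detects the alignment problem; it does nothing about the aliasing violation discussed in the other answers.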
Alias cast (alignment & size safe reinterpret_cast for PODs only):
This proposal does explicitly violate strict aliasing, but with static assertions:
///@brief Compile time checked reinterpret_cast where destAlign <= srcAlign && destSize <= srcSize
template<typename _TargetPtrType, typename _ArgType>
inline _TargetPtrType alias_cast(_ArgType* const ptr)
{
//assert argument alignment at runtime in debug builds
assert(uintptr_t(ptr) % alignof(_ArgType) == 0);
typedef typename std::tr1::remove_pointer<_TargetPtrType>::type target_type;
static_assert(std::tr1::is_pointer<_TargetPtrType>::value && std::tr1::is_pod<target_type>::value, "Target type must be a pointer to POD");
static_assert(std::tr1::is_pod<_ArgType>::value, "Argument must point to POD");
static_assert(std::tr1::is_const<_ArgType>::value ? std::tr1::is_const<target_type>::value : true, "const argument must be cast to const target type");
static_assert(alignof(_ArgType) % alignof(target_type) == 0, "Target alignment must be <= source alignment");
static_assert(sizeof(_ArgType) >= sizeof(target_type), "Target size must be <= source size");
//reinterpret cast doesn't remove a const qualifier either
return reinterpret_cast<_TargetPtrType>(ptr);
}
Usage with pointer type argument ( like standard cast operators such as reinterpret_cast ):
int* x = alias_cast<int*>(any_ptr);
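Another approach (circumvents alignment issues using a temporary union; note that reading the inactive member relies on union type-punning, which GCC documents as supported but the C++ standard does not guarantee):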
template<typename ReturnType, typename ArgType>
inline ReturnType alias_value(const ArgType& x)
{
//test argument alignment at runtime in debug builds
assert(uintptr_t(&x) % alignof(ArgType) == 0);
static_assert(!std::tr1::is_pointer<ReturnType>::value ? !std::tr1::is_const<ReturnType>::value : true, "Target type can't be a const value type");
static_assert(std::tr1::is_pod<ReturnType>::value, "Target type must be POD");
static_assert(std::tr1::is_pod<ArgType>::value, "Argument must be of POD type");
//assure, that we don't read garbage
static_assert(sizeof(ReturnType) <= sizeof(ArgType),"Target size must be <= argument size");
union dummy_union
{
ArgType x;
ReturnType r;
};
dummy_union dummy;
dummy.x = x;
return dummy.r;
}
Usage:
struct characters
{
char c[5];
};
//.....
characters chars;
chars.c[0] = 'a';
chars.c[1] = 'b';
chars.c[2] = 'c';
chars.c[3] = 'd';
chars.c[4] = '\0';
int r = alias_value<int>(chars);
The disadvantage of this is that the union may require more memory than is actually needed for ReturnType.
Wrapped memcpy (circumvents alignment and aliasing issues using memcpy):
template<typename ReturnType, typename ArgType>
inline ReturnType alias_value(const ArgType& x)
{
//assert argument alignment at runtime in debug builds
assert(uintptr_t(&x) % alignof(ArgType) == 0);
static_assert(!std::tr1::is_pointer<ReturnType>::value ? !std::tr1::is_const<ReturnType>::value : true, "Target type can't be a const value type");
static_assert(std::tr1::is_pod<ReturnType>::value, "Target type must be POD");
static_assert(std::tr1::is_pod<ArgType>::value, "Argument must be of POD type");
//assure, that we don't read garbage
static_assert(sizeof(ReturnType) <= sizeof(ArgType),"Target size must be <= argument size");
ReturnType r;
memcpy(&r,&x,sizeof(ReturnType));
return r;
}
For dynamic sized arrays of any POD type:
template<typename ReturnType, typename ElementType>
ReturnType alias_value(const ElementType* const array,const size_t size)
{
//assert argument alignment at runtime in debug builds
assert(uintptr_t(array) % alignof(ElementType) == 0);
static const size_t min_element_count = (sizeof(ReturnType) / sizeof(ElementType)) + (sizeof(ReturnType) % sizeof(ElementType) != 0 ? 1 : 0);
static_assert(!std::tr1::is_pointer<ReturnType>::value ? !std::tr1::is_const<ReturnType>::value : true, "Target type can't be a const value type");
static_assert(std::tr1::is_pod<ReturnType>::value, "Target type must be POD");
static_assert(std::tr1::is_pod<ElementType>::value, "Array elements must be of POD type");
//check for minimum element count in array
if(size < min_element_count)
throw std::invalid_argument("insufficient array size");
ReturnType r;
memcpy(&r,array,sizeof(ReturnType));
return r;
}
More efficient approaches may do explicit unaligned reads with intrinsics, like the ones from SSE, to extract primitives.
Examples:
struct sample_struct
{
char c[4];
int _aligner;
};
int test(void)
{
const sample_struct constPOD = {};
sample_struct pod = {};
const char* str = "abcd";
const int* constIntPtr = alias_cast<const int*>(&constPOD);
void* voidPtr = alias_value<void*>(pod);
int intValue = alias_value<int>(str,strlen(str));
return 0;
}
EDITS:
Assertions to assure conversion of PODs only, may be improved.
Removed superfluous template helpers, now using tr1 traits only
Static assertions for clarification and prohibition of const value (non-pointer) return type
Runtime assertions for debug builds
Added const qualifiers to some function arguments
Another type punning function using memcpy
Refactoring
Small example
I think that at the most fundamental level, this is impossible and violates strict aliasing. The only thing you've achieved is tricking the compiler into not noticing.
My question is, just to be sure, is this program legal according to the standard?
No. The alignment may be unnatural using the alias you have provided. The union you wrote just moves the point of the alias. It may appear to work, but that program may fail when CPU options, ABI, or compiler settings change.
And if not, how could one modify this to be legal?
Create natural temporary variables and treat your storage as a memory blob (moving in and out of the blob to/from temporaries), or use a union which represents all your types (remember, one active element at a time here).
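The "memory blob plus temporaries" suggestion could look roughly like this. It is a sketch only: Blob, store, load and round_trip are hypothetical names, TestPod mirrors the test struct from the question, and all types moved through the blob are assumed to be trivially copyable.

```cpp
#include <cstring>
#include <type_traits>

struct TestPod {  // mirrors the question's test struct
    short s;
    int i;
};

// Untyped storage, large enough for the biggest type moved through it.
struct Blob {
    unsigned char bytes[sizeof(double) > sizeof(TestPod) ? sizeof(double)
                                                         : sizeof(TestPod)];
};

// Copies a typed temporary into the blob.
template <typename T>
void store(Blob& blob, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivial");
    static_assert(sizeof(T) <= sizeof(Blob), "T must fit into the blob");
    std::memcpy(blob.bytes, &value, sizeof(T));
}

// Copies the blob's bytes back out into a typed temporary.
template <typename T>
T load(const Blob& blob) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivial");
    static_assert(sizeof(T) <= sizeof(Blob), "T must fit into the blob");
    T value;
    std::memcpy(&value, blob.bytes, sizeof(T));
    return value;
}

// Same sequence as main() in the question, without any aliasing.
inline bool round_trip() {
    Blob blob{};
    store(blob, 42.1337);
    const double d = load<double>(blob);
    store(blob, TestPod{42, 1234});
    const TestPod back = load<TestPod>(blob);
    return d == 42.1337 && back.s == 42 && back.i == 1234;
}
```

Every typed object here is a genuine local variable with natural alignment, and the blob is only ever touched as bytes, so neither the aliasing rule nor the alignment requirement is in play.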