custom optional breaks strict aliasing rules - c++

I wrote a custom optional class (since I am forced to use C++98 without STL).
It looks like this:
template <typename T>
struct optional {
char value[sizeof(T)];
bool has_value;
T& operator*() {
return *reinterpret_cast<T*>(value);
}
};
The compiler produces the warning dereferencing type-punned pointer will break strict aliasing rules.
What can I do to make this class without UB?
Maybe memcpy should be used, but I don't understand how.

What can I do to make this class without UB?
Use placement-new to create the object.
Call the destructor of the created object in the destructor of optional.
Do not reinterpret the address of the storage. Use the pointer returned from placement new. This unfortunately means that you need to store the pointer as a member. You can replace the bool since null pointer would signify empty state.
Take care of alignment. This can be quite tricky pre-C++11. You may need to rely on non-standard language features to achieve this. Given a pointer member which has quite strict alignment, and the fact that C++98 has no overaligned types, you might get away with ignoring alignment for most types.
It would be much easier to allocate the object dynamically. Slower, very likely. But simpler and standard conformant.
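The steps above can be sketched roughly as follows. This is a minimal C++98-style sketch, not a complete optional: the names optional_sketch, storage_t and Tracked are made up for illustration, and the union of "wide" fundamental types is only a best-effort alignment trick that assumes T needs no stricter alignment than long double, void* or long. It also assumes the header <new> is available, since placement new is declared there.

```cpp
#include <cassert>
#include <new>  // declares placement new

template <typename T>
class optional_sketch {
    // Raw storage. The extra members exist only to raise the alignment
    // (assumption: T is not more strictly aligned than these types).
    union storage_t {
        char bytes[sizeof(T)];
        long double align_ld;
        void* align_ptr;
        long align_l;
    };
    storage_t storage_;
    T* ptr_;  // pointer returned by placement new; null means "empty"

public:
    optional_sketch() : ptr_(0) {}
    optional_sketch(const T& v) : ptr_(new (storage_.bytes) T(v)) {}
    optional_sketch(const optional_sketch& o)
        : ptr_(o.ptr_ ? new (storage_.bytes) T(*o.ptr_) : 0) {}
    optional_sketch& operator=(const optional_sketch& o) {
        if (this != &o) {
            reset();
            if (o.ptr_) ptr_ = new (storage_.bytes) T(*o.ptr_);
        }
        return *this;
    }
    ~optional_sketch() { reset(); }

    // Destroys the contained object, if any.
    void reset() {
        if (ptr_) { ptr_->~T(); ptr_ = 0; }
    }
    bool has_value() const { return ptr_ != 0; }
    T& operator*() { return *ptr_; }              // caller must check has_value()
    const T& operator*() const { return *ptr_; }
};

// Hypothetical helper type that counts live instances, used only for the demo.
struct Tracked {
    static int live;
    int v;
    explicit Tracked(int x) : v(x) { ++live; }
    Tracked(const Tracked& o) : v(o.v) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;
```

The copy constructor and assignment operator are written out because the compiler-generated ones would copy ptr_ and leave it pointing into the other object's storage.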

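One way to avoid the warning is to use a discriminated union. Unfortunately, in C++98 this only works for types with trivial constructors and a trivial destructor, because a C++98 union cannot have a member with a non-trivial constructor, copy assignment operator, or destructor.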
I.e.:
...
union {
char empty;
T value;
} storage;
bool has_value;
...
// when object is set to value
new (&storage.value) T(/* arg */);
has_value = true;
...
T& operator*() {
// check for has_value should be here
return storage.value;
...
~optional() {
if (has_value) storage.value.~T();
...
This works because when the union is created, its first member (the empty char) is the one that is constructed. When you put a value into the optional, you activate the second member of the union (value), and from then on it is well-defined to access it.

c++ - how to properly overload operator to do implicit type cast from __generic_buffer<void*> to __generic_buffer<const void*>?

To put it simply, I have the following code:
template<typename p>
requires(std::is_pointer<p>::value)
struct __generic_buffer
{
p baseptr;
size_t size;
// member functions go here
};
using buffer = __generic_buffer<void*>;
using const_buffer = __generic_buffer<const void*>;
and I want to be able to write something like this:
uint8_t arr[64];
buffer b {arr, 64};
const_buffer cb = b;
So basically I want a buffer pointing to a modifiable memory area to be implicitly convertible to a const_buffer pointing to a constant memory area, but not the other way around, just as ordinary pointers behave. I'd also like this conversion to work with any other instantiation of __generic_buffer, e.g. __generic_buffer<int*> to __generic_buffer<const int*>.
First things first. Identifiers containing __ are reserved by the C++ standard for your compiler and the standard library, as are identifiers such as _T that start with an underscore followed by a capital letter. Using them in a program makes it ill-formed, no diagnostic required.
This kind of bad habit comes from copying the coding style of std headers and the like. They are allowed to do that, and in fact must do that, because if they called something bob, someone could legally #define bob alice and break the header file.
There are two ways to do what you want.
One of them involves writing a converting constructor, the other a converting operator.
I'll show the operator way.
template<class P>
requires(std::is_pointer<P>::value)
struct generic_buffer
{
P baseptr;
size_t size;
template<class U>
requires(std::is_same<std::remove_cv_t<std::remove_pointer_t<U>>, std::remove_pointer_t<P>>::value)
operator generic_buffer<U>() const {
return {baseptr, size};
}
// member functions go here
};
I might write a trait rather than do that requires inline, but basically we check if *U is the same type as *P after removing const and volatile from *U.
If so, we allow this conversion operator to exist.
Inside, we use implicit conversion to convert the pointer types.
Now, this isn't perfect, because we really want to check if removing either of const, volatile or both would make it match. You might want to write a type function that does that test.
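Such a type function could look roughly like the sketch below. The name adds_only_cv is hypothetical, and the trait is applied to the pointee types (for example void and const void), not to the pointer types themselves. It only handles single-level pointers; multi-level pointers have more involved qualification rules that are not covered here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical trait: true when From and To are the same type apart from
// cv-qualifiers, and To keeps every qualifier that From already has.
template <class From, class To>
struct adds_only_cv
    : std::integral_constant<
          bool,
          std::is_same<typename std::remove_cv<From>::type,
                       typename std::remove_cv<To>::type>::value &&
              (!std::is_const<From>::value || std::is_const<To>::value) &&
              (!std::is_volatile<From>::value || std::is_volatile<To>::value)> {};

// Adding qualifiers is allowed, dropping them or changing the type is not.
static_assert(adds_only_cv<void, const void>::value, "void -> const void");
static_assert(adds_only_cv<int, const volatile int>::value, "int -> const volatile int");
static_assert(adds_only_cv<const int, const volatile int>::value, "const int -> cv int");
static_assert(!adds_only_cv<const void, void>::value, "must not drop const");
static_assert(!adds_only_cv<volatile int, const int>::value, "must not drop volatile");
static_assert(!adds_only_cv<int, const long>::value, "different underlying type");
```

In the conversion operator, the condition would then be something like adds_only_cv<std::remove_pointer_t<P>, std::remove_pointer_t<U>>::value.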
The constructor way is just the above, but backwards. Declaring a constructor has some impact on the automatic constructors and the triviality of a type, while an operator like this doesn't, so to avoid having to digress into those issues I implemented it as an operator.
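For comparison, the constructor way might look like the following sketch. It uses std::enable_if and std::is_convertible instead of a requires clause so that it also compiles before C++20; that swap is my choice, not part of the answer. Note two differences: std::is_convertible on the pointer types is broader than the check above (it also admits int* to void*, or derived-to-base pointers), and because a user-declared constructor makes the type a non-aggregate, a two-argument constructor has to be added as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

template <class P>
struct generic_buffer {
    static_assert(std::is_pointer<P>::value, "P must be a pointer type");

    P baseptr;
    std::size_t size;

    // Needed because the type is no longer an aggregate.
    generic_buffer(P p, std::size_t n) : baseptr(p), size(n) {}

    // Converting constructor: only participates in overload resolution
    // when a U converts implicitly to a P (e.g. void* -> const void*).
    template <class U,
              class = typename std::enable_if<
                  !std::is_same<U, P>::value &&
                  std::is_convertible<U, P>::value>::type>
    generic_buffer(const generic_buffer<U>& other)
        : baseptr(other.baseptr), size(other.size) {}
};

typedef generic_buffer<void*> buffer;
typedef generic_buffer<const void*> const_buffer;

static_assert(std::is_convertible<buffer, const_buffer>::value,
              "mutable -> const is allowed");
static_assert(!std::is_convertible<const_buffer, buffer>::value,
              "const -> mutable is rejected");

inline bool converting_ctor_demo() {
    std::uint8_t arr[64];
    buffer b(arr, sizeof arr);
    const_buffer cb = b;  // implicit conversion through the constructor
    return cb.baseptr == arr && cb.size == 64;
}
```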

Odd usage of special pointer values

I am using a C++ implementation of an algorithm which makes odd usage of special pointer values, and I would like to know how safe and portable this is.
First, there is some structure containing a pointer field. It initializes an array of such structures by zeroing the array with memset(). Later on, the code relies on the pointer fields initialized that way to compare equal to NULL; wouldn't that fail on a machine whose internal representation of the NULL pointer is not all-bits-zero?
Subsequently, the code sets some pointers to, and later compares some pointers against, specific pointer values, namely ((type*) 1) and ((type*) 2). Clearly, these pointers are meant to be flags and are not supposed to be dereferenced. But can I be sure that some genuine valid pointer would not compare equal to one of these? Is there any better (safe, portable) way to do that (i.e. use specific pointer values that can be taken by pointer variables only through explicit assignment, in order to flag specific situations)?
Any comment is welcome.
To sum up the comments I received: both issues raised in the question are indeed expected to work on a "usual" setup, but come with no guarantee.
Now if I want absolute guarantees, it seems my best option is, for the NULL pointers, set them either manually or with a proper constructor, and for the special pointer values, to create manually sentinel pointer values.
For the latter, in a C++ class I guess the most elegant solution is to use static members
class The_class
{
static const type reserved;
static const type* const sentinel;
};
provided that they can be initialized somewhere:
const type The_class::reserved = foo; // 'foo' is a constant expression of type 'type'
const type* const The_class::sentinel = &The_class::reserved;
If type is templated, either the above initialization must be instantiated for each type intended, or one must resort to non-static (less elegant but still useful) "reserved" and "sentinel" members.
template <typename type>
class The_class
{
type reserved; // cannot be static anymore, nor const for complicated 'type' without adapted constructor
const type* const sentinel;
public:
The_class() : sentinel(&reserved) {}
};
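Putting both points together, a non-template variant could look like the sketch below. All names (Node, Slot, VISITED, DELETED, is_special) are invented for illustration. The idea is that the address of a real dummy object can never compare equal to the address of any other live object, and that a constructor assigning NULL does not depend on the bit pattern of the null pointer.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int payload;
};

struct Slot {
    Node* link;
    Slot() : link(NULL) {}  // explicit null pointer instead of memset()
};

// Two dummy objects whose addresses serve as flags. They are never
// dereferenced; they only reserve two unique, valid addresses.
static Node visited_dummy;
static Node deleted_dummy;
static Node* const VISITED = &visited_dummy;
static Node* const DELETED = &deleted_dummy;

inline bool is_special(const Node* p) {
    return p == VISITED || p == DELETED;
}

inline bool sentinel_demo() {
    Slot slots[4];  // each element is initialized by the Slot constructor
    Node real = { 42 };
    slots[1].link = &real;
    slots[2].link = VISITED;
    slots[3].link = DELETED;
    return slots[0].link == NULL &&
           !is_special(slots[0].link) &&
           !is_special(slots[1].link) &&
           is_special(slots[2].link) &&
           is_special(slots[3].link) &&
           VISITED != DELETED;
}
```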

Right cast for a void* pointer

I'm trying to implement a C++ class with a value field that can point to anything (a bit like in boost::any). Currently I do the following:
class MyClass {
void* value;
template<typename T>
MyClass(const T& v) {
value = (void*)(new T(v));
}
};
The problem is now to implement a getValue() operation that creates a copy of the inner value with the right type:
template<typename T>
T getValue() {
return *value;
}
Here it cannot work because I'm trying to dereference a void* pointer. I was wondering which cast (static_cast? dynamic_cast? other...) I should use such that *value is properly converted into a T object and an exception is thrown if value was not originally of this type?
Thanks
You cannot dereference a void*, it simply makes no sense. Why not make the class itself generic? Then you can have:
template<typename T>
class MyClass {
T* value;
MyClass(const T& v) {
value = new T(v);
}
T getValue() {
return *value;
}
};
Make sure to create a destructor which deallocates value and also to follow The Rule of Three. You could also make a version of getValue that returns a const T& (const reference to T) to avoid the copy if one is not required.
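A minimal sketch of what following the Rule of Three could mean for this class, including the const T& getter. The class name Holder is hypothetical, chosen to avoid confusion with the code above.

```cpp
#include <cassert>
#include <string>

template <typename T>
class Holder {
    T* value;

public:
    explicit Holder(const T& v) : value(new T(v)) {}

    // Copy constructor: deep copy, so two Holders never share one object.
    Holder(const Holder& other) : value(new T(*other.value)) {}

    // Copy assignment: allocate the copy first, so that an exception
    // thrown by new or by T's copy constructor leaves *this unchanged.
    Holder& operator=(const Holder& other) {
        if (this != &other) {
            T* fresh = new T(*other.value);
            delete value;
            value = fresh;
        }
        return *this;
    }

    ~Holder() { delete value; }

    // Returns a reference instead of a copy.
    const T& getValue() const { return *value; }
};

inline bool holder_demo() {
    Holder<std::string> a(std::string("first"));
    Holder<std::string> b(a);                     // copy constructor
    Holder<std::string> c(std::string("second"));
    a = c;                                        // copy assignment
    return a.getValue() == "second" && b.getValue() == "first" &&
           &a.getValue() != &c.getValue();
}
```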
which cast (static_cast? dynamic_cast? other...) I should use such that *value is properly converted into a T object
If you must do this conversion, then you should use static_cast, which in general is designed to (among other things) reverse any standard conversion. There's a standard conversion from any object pointer type to void*, and your getter reverses it, so use the cast designed for that:
return *static_cast<T*>(value);
You should also either remove the C-style cast from your constructor, or replace that with a static_cast too.
A reinterpret_cast would also work, but is "overkill". In general you should use the cast that is as restrictive as possible while still performing the conversion you need.
and an exception is thrown if value was not originally of this type
You are out of luck there - C++ cannot in general tell what the original type of the object was, once you've cast the pointer to void*. Your code relies on the caller to call getValue with the correct type. For example, consider what happens if the original type was char -- that's just one byte in C++, there is no room set aside for any type information that would allow the compiler to check the cast in getValue.
dynamic_cast does check types in some limited circumstances, but since your template is fully generic, those limited circumstances might not apply.
If you don't like this, you could change your class to store, in addition to the object pointer, a pointer to a type_info object (resulting from a use of the typeid operator). See the standard header <typeinfo>. You could then compare the type_info object for the type T in the constructor, with the type_info object for the type T in getValue, and throw if they don't match.
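A rough sketch of the type_info idea, under a few assumptions of mine: the class is called AnyValue, a mismatch is reported by throwing std::bad_cast, copying is disabled for brevity, and a function pointer is stored alongside the void* so the destructor can delete the object with its real type. boost::any is implemented differently; this is only meant to illustrate the comparison step.

```cpp
#include <cassert>
#include <typeinfo>

class AnyValue {
    void* value_;
    const std::type_info* type_;  // type recorded at construction
    void (*deleter_)(void*);      // knows the real type for delete

    template <typename T>
    static void destroy(void* p) { delete static_cast<T*>(p); }

    // Non-copyable in this sketch (declared, never defined).
    AnyValue(const AnyValue&);
    AnyValue& operator=(const AnyValue&);

public:
    template <typename T>
    explicit AnyValue(const T& v)
        : value_(new T(v)), type_(&typeid(T)), deleter_(&destroy<T>) {}

    ~AnyValue() { deleter_(value_); }

    // Returns a copy of the stored value, or throws std::bad_cast when T
    // is not the type the object was constructed with.
    template <typename T>
    T getValue() const {
        if (!(*type_ == typeid(T))) throw std::bad_cast();
        return *static_cast<T*>(value_);
    }
};

inline bool any_value_demo() {
    AnyValue a(42);
    bool threw = false;
    try {
        a.getValue<double>();  // wrong type requested
    } catch (const std::bad_cast&) {
        threw = true;
    }
    return a.getValue<int>() == 42 && threw;
}
```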
As you say, your class is intended to be a bit like boost::any, and getValue is like any_cast. You could consult the source and documentation of that class to see the tricks needed to do what you want. If there were a straightforward way to do it, then boost::any would be a straightforward class!
You can't. C++ doesn't provide that sort of mechanism, at least not directly, not for void*. A void* does not have any information that the computer would need to determine what it is, and attempting to "check" if it is a valid whatever-you-cast-it-to is impossible because there aren't particular flags for that.
There are options, though. The first is to use some kind of universal base class, similar to Java's Object, and derive all of your other classes from that. dynamic_cast will now work the way you want (returning NULL if the object is not a valid object of the class you casted it to).
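As a small illustration of the first option, with made-up class names (Object, IntBox, TextBox): the stored pointer has the type of a polymorphic base class rather than void*, which is what allows dynamic_cast to check the target type at run time.

```cpp
#include <cassert>
#include <string>

// Hypothetical universal base class; the virtual destructor makes it polymorphic.
struct Object {
    virtual ~Object() {}
};

struct IntBox : Object {
    int v;
    explicit IntBox(int x) : v(x) {}
};

struct TextBox : Object {
    std::string s;
    explicit TextBox(const std::string& t) : s(t) {}
};

inline bool dynamic_cast_demo() {
    IntBox box(5);
    Object* p = &box;                          // what the container would store
    IntBox* ok = dynamic_cast<IntBox*>(p);     // matches the dynamic type
    TextBox* bad = dynamic_cast<TextBox*>(p);  // mismatch yields a null pointer
    return ok != 0 && ok->v == 5 && bad == 0;
}
```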
Another is to simply keep track of what type of object it is yourself. That means augmenting the void* with another value that tells you what you need to cast it to.
But really, neither of these things strikes me as a good idea. I think there is almost definitely some other aspect of your design that should be changed rather than using these. Using templates, as @EdS. suggests, is a very good option, for example.

Does an encapsulated char array used as object storage break the strict aliasing rule?

Does the following class break the strict aliasing rule?
template<typename T>
class store {
char m_data[sizeof(T)];
bool m_init;
public:
store() : m_init(false) {}
store(const T &t) : m_init(true) {
new(m_data) T(t);
}
~store() {
if(m_init) {
get()->~T();
}
}
store &operator=(const store &s) {
if(m_init) {
get()->~T();
}
if(s.m_init) {
new(m_data) T(*s.get());
}
m_init = s.m_init;
return *this;
}
T *get() {
if (m_init) {
return reinterpret_cast<T *>(m_data);
} else {
return NULL;
}
}
};
My reading of the standard is that this is incorrect, but I am not sure, since char arrays are used as storage in the standard's own placement new examples. (My use case is an array of T objects plus some metadata about them, with control over object construction and destruction but without manually allocating memory.)
The standard contains this note:
[ Note: A typical implementation would define aligned_storage as:
template <std::size_t Len, std::size_t Alignment>
struct aligned_storage {
typedef struct {
alignas(Alignment) unsigned char __data[Len];
} type;
};
— end note ]
— Pointer modifications [meta.trans.ptr] 20.9.7.5/1
And aligned_storage is defined in part with:
The member typedef type shall be a POD type suitable for use as uninitialized storage for any object whose size is at most Len and whose alignment is a divisor of Align.
The only property covered by the standard that restricts the addresses at which an object can be constructed is alignment. An implementation might have some other restrictions, but I'm not familiar with any that do. So just make sure that correct alignment is enough on your implementation, and I think this should be okay. (In pre-C++11 compilers you can use compiler extensions for setting alignment, such as __attribute__((aligned(X))) or __declspec(align(X)).)
I believe that as long as you don't access the underlying storage directly the aliasing rules don't even come into the picture, because the aliasing rules cover when it is okay to access the value of an object through an object of a different type. Constructing an object and accessing only that object doesn't involve accessing the object's value through an object of any other type.
Earlier answer
The aliasing rules specifically allow char arrays to alias other objects.
If a program attempts to access the stored value of an object through
a glvalue of other than one of the following types the behavior is
undefined:
[...]
— a char or unsigned char type.
— Lvalues and rvalues [basic.lval] 3.10/10
You do need to make sure that the array is properly aligned for type T though.
alignas(T) char m_data[sizeof(T)];
The above is C++11 syntax for setting alignment, but if you're on a C++03 compiler then you'll need a compiler specific attribute to do the same thing. GCC has __attribute__((aligned(32))) and MSVC has __declspec(align(32))
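For reference, the class from the question with the alignas fix applied might look like the sketch below. The name aligned_store is made up, the copy constructor and the self-assignment check are additions of mine, and the reinterpret_cast in get() reflects the reading argued for in this answer. Since C++17 the standard provides std::launder for obtaining a pointer to an object created in such storage; it is not used here to stay within C++11.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <string>

template <typename T>
class aligned_store {
    alignas(T) unsigned char m_data[sizeof(T)];  // storage aligned for T
    bool m_init;

    void destroy() {
        if (m_init) {
            get()->~T();
            m_init = false;
        }
    }

public:
    aligned_store() : m_init(false) {}
    aligned_store(const T& t) : m_init(true) {
        ::new (static_cast<void*>(m_data)) T(t);
    }
    aligned_store(const aligned_store& s) : m_init(false) {
        if (s.m_init) {
            ::new (static_cast<void*>(m_data)) T(*s.get());
            m_init = true;
        }
    }
    aligned_store& operator=(const aligned_store& s) {
        if (this != &s) {
            destroy();
            if (s.m_init) {
                ::new (static_cast<void*>(m_data)) T(*s.get());
                m_init = true;
            }
        }
        return *this;
    }
    ~aligned_store() { destroy(); }

    // Only the T object is accessed, never the bytes of m_data themselves.
    T* get() { return m_init ? reinterpret_cast<T*>(m_data) : nullptr; }
    const T* get() const {
        return m_init ? reinterpret_cast<const T*>(m_data) : nullptr;
    }
};

inline bool aligned_store_demo() {
    aligned_store<std::string> a(std::string("hello"));
    aligned_store<std::string> b;
    b = a;
    bool aligned =
        reinterpret_cast<std::uintptr_t>(b.get()) % alignof(std::string) == 0;
    return aligned && *b.get() == "hello" &&
           aligned_store<double>().get() == nullptr;
}
```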
Kerrek SB brings up a good point that the aliasing rules state that it's okay to access the value of a T object via a char array, but that may not mean that accessing the value of a char array via a T object is okay. However, if the placement new expression is well defined then that creates a T object which I think it's okay to access as a T object by definition, and reading the original char array is accessing the value of the created T object, which is covered under the aliasing rules.
I think that implies that you could store a T object in, for example, an int array, and as long as you don't access the value of that T object through the original int array then you're not hitting undefined behavior.
What is allowed is to take a T object and interpret it as an array of chars. However, it is in general not allowed to take an arbitrary array of chars and treat it as a T, or even as the pointer to an area of memory containing a T. At the very least, your char array would need to be properly aligned.
One way around this might be to use a union:
union storage { char buf[sizeof(T)]; T dummy; };
Now you can construct a T inside storage.buf:
T * p = ::new (storage.buf) T();

Using (void*) as a type of an identifier

In my program, I have objects (of the same class) that must all have a unique identifier. For simplicity and performance, I chose to use the address of the object as identifier. And to keep the types simple, I use (void*) as a type for this identifier. In the end I have code like this:
class MyClass {
public:
typedef void* identity_t;
identity_t id() const { return (void*)this; }
};
This seems to be working fine, but gcc gives me a strict-aliasing warning. I understand the code would be bad if the id was used for data transfer. Luckily it is not, but the question remains: will aliasing optimisations have an impact on the code produced? And how to avoid the warning?
Note: I am reluctant to use (char*) as this would imply the user can use the data for copy, which it can not!
You could try using the type uintptr_t instead of void*. uintptr_t is an integer type that is defined as being big enough to hold any pointer value. And since it's not actually a pointer, the compiler won't flag any aliasing problems.
class MyClass {
public:
typedef uintptr_t identity_t;
identity_t id() const { return (identity_t)this; }
};
You are violating logical constness returning the object as mutable in a const method.
As Neil points out, no cast is needed.
class MyClass {
public:
typedef const void* identity_t;
identity_t id() const { return this; }
};
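A short usage sketch of the const void* variant, with a made-up class name Widget. One detail worth knowing if the ids are used as keys: comparing unrelated pointers with < is not guaranteed to give a meaningful order, but std::set uses std::less by default, which is specified to give a total order for pointers.

```cpp
#include <cassert>
#include <set>

class Widget {
public:
    typedef const void* identity_t;
    identity_t id() const { return this; }  // implicit conversion, no cast
};

inline bool identity_demo() {
    Widget a, b;
    std::set<Widget::identity_t> seen;  // ordered with std::less<const void*>
    seen.insert(a.id());
    seen.insert(b.id());
    seen.insert(a.id());                // same object again, not inserted twice
    return seen.size() == 2 && a.id() == &a && a.id() != b.id();
}
```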
Try using
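return static_cast<const void*>(this);
This should be perfectly safe: any object pointer can be converted to void* without risk of loss. Note that inside a const member function this points to a const object, so the target type has to be const void*; a static_cast to plain void* would not compile there.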
I originally suggested dynamic_cast<void*>(this);, but after reading up a bit I think it adds no advantage, and since it's RTTI-only it's not a good solution in general.
By the way, I would make the returned value const, since the identity of an object cannot change.
I can't see any aliasing issues. However, you are casting away constness (since the id function is const), which the compiler might be unhappy about. Perhaps it'd be better to use const void* as your ID type.
Alternatively, cast the address to an integer type such as size_t. Then it is no longer a pointer, and aliasing becomes a non-issue.
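A sketch of the integer variant, with a hypothetical class name Gadget. It uses std::uintptr_t rather than size_t, because size_t is not guaranteed to be wide enough for a pointer on every platform, whereas uintptr_t (an optional type, but present on mainstream implementations) is defined for exactly this purpose.

```cpp
#include <cassert>
#include <cstdint>

class Gadget {
public:
    typedef std::uintptr_t identity_t;
    // The result is a plain integer, so no pointer aliasing is involved.
    identity_t id() const { return reinterpret_cast<identity_t>(this); }
};

inline bool integer_id_demo() {
    Gadget a, b;
    // Distinct objects have distinct addresses; converting the integer
    // back to the original pointer type yields the original pointer.
    return a.id() != b.id() &&
           reinterpret_cast<const Gadget*>(a.id()) == &a;
}
```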
Why not use the type MyClass*?
Or type intptr_t?