Is there a term for a class/struct that is both trivial and standard-layout but also has no pointer members?
Basically I'd like to refer to "really" plain-old-data types. Data that I can grab from memory and store on disk, and read back into memory for later processing because it is nothing more than a collection of ints, characters, enums, etc.
Is there a way to test at compile time if a type is a "really" plain-old-data type?
related:
What are POD types in C++?
What are Aggregates and PODs and how/why are they special?
This can depend on the semantics of the structure. I could imagine a struct having int fields that are keys into some volatile temporary data store (or cache). You still shouldn't serialize those, but you need internal knowledge about that struct to be able to tell [1].
In general, C++ lacks features for generic serialization. Automatically rejecting types with pointer members is just the tip of the iceberg (even if it would be a fairly accurate heuristic), and it is also impossible to do generically: C++ still has no reflection, and thus no way to check "every member" for some condition.
The realistic approaches could be:
preprocessing the class sources before build to scan for pointers
declaring all structs that are to be serialized with some macros that track the types
implementing an ordinary template-based check against a set of known field names
All of those have their limitations, though, and together with my earlier reservations, I'm not sure how practical they'd be.
[1] This of course goes both ways; pointers could be used to store relative offsets, and thus be perfectly serializable.
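As a rough sketch of what the compiler can and cannot verify here: the standard type traits cover "trivial" and "standard-layout", but the "no pointer members" part has to come from the author of the type. The names `is_disk_safe`, `is_really_pod_v`, `Pixel` and `Node` below are hypothetical, and the opt-in trait is only one possible way to do the "declare which structs are serializable" approach listed above.

```cpp
#include <type_traits>

// Hypothetical opt-in trait: no type counts as safe to dump to disk
// unless its author explicitly says so.
template <typename T>
struct is_disk_safe : std::false_type {};

// Hypothetical helper: the compiler-checkable properties plus the opt-in.
// Trivially copyable + trivially default constructible is what "trivial" means.
template <typename T>
inline constexpr bool is_really_pod_v =
    std::is_trivially_copyable_v<T> &&
    std::is_trivially_default_constructible_v<T> &&
    std::is_standard_layout_v<T> &&
    is_disk_safe<T>::value;

struct Pixel { int x; int y; char tag; };
template <> struct is_disk_safe<Pixel> : std::true_type {};  // author opts in

struct Node { int value; Node* next; };  // deliberately not opted in

static_assert(is_really_pod_v<Pixel>);
static_assert(!is_really_pod_v<Node>);
// The standard traits alone cannot tell the two apart:
static_assert(std::is_trivially_copyable_v<Node> && std::is_standard_layout_v<Node>);
```

The last assertion is the point: `Node` is trivial and standard-layout despite holding a pointer, so the opt-in is carrying the semantic knowledge.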
Is there an adequate way in C++ to extract a hash from the data that std::any stores?
Well, or at least an object in the form of a list of bytes and its length
std::any is a type-safe mechanism for passing an object of known type from one location to another, through an intermediary that does not need to know what that type is. Computing a hash from it is not its goal. And indeed, it wouldn't be meaningfully possible without compromising any's functionality.
Hashing an object requires some knowledge of what that object is and is doing. Assuming that you can just look at the bytes of the object representation and thereby compute a meaningful hash from it is not going to end well. It might appear to work... for a while. But eventually, it's going to do the wrong thing.
You could create a type-erased type similar to any that requires the object to implement hashing. But std::any is not that type, because anyone who doesn't want to hash the types they put into any would be unable to store said object in any.
This is because any operation that any provides is an operation that all types that get stored into any must also provide. For example, any is copyable, therefore any cannot store move-only types. That is an annoyance for those who want to do so, and the more functionality you dump into any, the more restrictive the type's ability to store "any"thing becomes.
C++17 presents std::variant and std::any, both able to store different types of values under an object. For me, they are somehow similar (are they?).
Also, std::variant restricts which types it can hold. Why should we prefer std::variant over std::any, which is simpler to use?
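A minimal sketch of such a type-erased wrapper, assuming every stored type has a std::hash specialization. The class name `HashableAny` and its members are hypothetical; this is not a standard facility, and it is kept move-only and without value access for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type: erases the stored type but requires it to be hashable.
class HashableAny {
    struct Concept {
        virtual ~Concept() = default;
        virtual std::size_t hash() const = 0;
    };

    template <typename T>
    struct Model final : Concept {
        T value;
        explicit Model(T v) : value(std::move(v)) {}
        // Fails to compile if std::hash<T> is not usable: that is the
        // extra requirement this wrapper imposes on every stored type.
        std::size_t hash() const override { return std::hash<T>{}(value); }
    };

    std::unique_ptr<Concept> self_;

public:
    template <typename T>
    explicit HashableAny(T value)
        : self_(std::make_unique<Model<T>>(std::move(value))) {}

    std::size_t hash() const { return self_->hash(); }
};

// Usage: the hash is delegated to the stored type's own std::hash.
inline void hashable_any_demo() {
    HashableAny a(std::string("abc"));
    assert(a.hash() == std::hash<std::string>{}("abc"));
}
```

Anything without a std::hash specialization is rejected at compile time, which is exactly the restriction that std::any avoids by not offering hashing.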
The more things you check at compile time the fewer runtime bugs you have.
variant guarantees that it contains one of a list of types (plus valueless by exception). It provides a way for you to guarantee that code operating on it considers every case in the variant with std::visit; even every case for a pair of variants (or more).
any does not. With any the best you can do is "if the type isn't exactly what I ask for, some code won't run".
variant exists in automatic storage. any may use the free store; this means any has performance and noexcept(false) issues that variant does not.
Checking for which of N types is in it is O(N) for an any -- for variant it is O(1).
any is a dressed-up void*. variant is a dressed-up union.
any cannot store non-copyable or non-movable types. variant can.
The type of variant is documentation for the reader of your code.
Passing a variant<Msg1, Msg2, Msg3> through an API makes the operation obvious; passing an any there means understanding the API requires reliable documentation or reading the implementation source.
Anyone who has been frustrated by statically typeless languages will understand the dangers of any.
Now this doesn't mean any is bad; it just doesn't solve the same problems as variant. As a copyable object for type erasure purposes, it can be great. Runtime dynamic typing has its place; but that place is not "everywhere" but rather "where you cannot avoid it".
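The contrast between exhaustive visitation and "ask for an exact type" can be sketched as follows. The names `Value`, `Describe`, `describe` and `describe_any` are made up for illustration.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <variant>

using Value = std::variant<int, std::string>;

// One overload per alternative. If Value gains an alternative that no
// overload accepts, the std::visit call below stops compiling.
struct Describe {
    std::string operator()(int) const { return "int"; }
    std::string operator()(const std::string&) const { return "string"; }
};

std::string describe(const Value& v) {
    return std::visit(Describe{}, v);
}

// With std::any the candidate types are probed one by one at run time,
// and a type nobody thought of silently falls through.
std::string describe_any(const std::any& a) {
    if (std::any_cast<int>(&a)) return "int";
    if (std::any_cast<std::string>(&a)) return "string";
    return "unknown";
}

inline void describe_demo() {
    assert(describe(Value{42}) == "int");
    assert(describe_any(std::any(3.5)) == "unknown");
}
```

The `"unknown"` branch is the runtime equivalent of the compile error that the variant version would have produced.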
The difference is that the objects are stored within the memory allocated by std::variant:
cppreference.com - std::variant
As with unions, if a variant holds a value of some object type T, the object representation of T is allocated directly within the object representation of the variant itself. Variant is not allowed to allocate additional (dynamic) memory.
and for std::any this is not possible.
Because of that, a std::variant requires only one memory allocation, for the std::variant itself, and it can stay on the stack.
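One observable consequence, as a small sketch: because the alternative lives inside the variant object, the variant is at least as large as its largest alternative. The aliases `Big` and `V` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <variant>

using Big = std::array<char, 256>;
using V = std::variant<int, Big>;

// The Big object is nested within the variant's own storage,
// so the variant cannot be smaller than Big.
static_assert(sizeof(V) >= sizeof(Big),
              "variant stores its alternatives in place");

inline void variant_storage_demo() {
    V v = Big{};  // no dynamic allocation is performed by the variant
    assert(std::holds_alternative<Big>(v));
}
```

A std::any holding the same `Big` would typically have to allocate, since implementations usually keep only a small inline buffer.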
In addition to never using additional heap memory, variant has one other advantage:
You can std::visit a variant, but not any.
As far as I know each polymorphic class in C++ contains a string with a mangled type name. And RTTI is implemented by string comparison.
Is this true? Would it be more efficient to implement a centralized type storage instead?
With centralized type storage each object can just hold a pointer to type information. Dynamic casts can be implemented simply by pointer comparison.
The actual implementation is even more efficient than one pointer per object.
The Standard forbids adding any data to "standard layout" classes, so there's not even room for a pointer, let alone a string. For polymorphic classes, there will be extra metadata, but in real-world implementations, all data specific to the dynamic type of the object is stored together, and there's just one pointer needed to all of it.
As a result, because polymorphic objects already need a pointer to the virtual function dispatch table, there is zero incremental per-object cost to storing the type name. There is just an extra pointer stored in the v-table alongside the function pointers, so the cost is one pointer per polymorphic type no matter how many instances exist.
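From the language side, this per-type information is what typeid exposes. A small sketch, with hypothetical `Base` and `Derived` classes, showing that the dynamic type is compared through type_info objects rather than through anything the programmer stores per instance (it assumes RTTI has not been disabled by a compiler flag):

```cpp
#include <cassert>
#include <typeinfo>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

// typeid on a reference to a polymorphic class yields the type_info of
// the dynamic type; how it is located (e.g. via the v-table) is up to
// the implementation.
bool same_dynamic_type(const Base& a, const Base& b) {
    return typeid(a) == typeid(b);
}

inline void rtti_demo() {
    Derived d1, d2;
    Base b;
    assert(same_dynamic_type(d1, d2));
    assert(!same_dynamic_type(d1, b));
}
```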
Polymorphic classes contain what the compiler builder considered worthy putting in them, there is no rule or requirement to have any type information.
C++ is strongly typed by design, and the checking is to be done by the compiler. The compiled code is typically optimized for performance and/or size, and not to carry information that shouldn’t be needed.
Of course, some compilers offer this, but that is not the spirit of the language.
As far as I know, when two pointers (or references) do not type alias each other, it is legal for the compiler to make the assumption that they address different locations and to make certain optimizations thereof, e.g., reordering instructions. Therefore, having pointers to different types with the same value may be problematic. However, I think this issue only applies when the two pointers are passed to functions. Within the function body where the two pointers are created, the compiler should be able to determine whether they address the same location. Am I right?
As far as I know, when two pointers (or references) do not type alias
each other, it is legal for the compiler to make the assumption
that they address different locations and to make certain
optimizations thereof, e.g., reordering instructions.
Correct. GCC, for example, does perform optimizations of this form which can be disabled by passing the flag -fno-strict-aliasing.
However, I think this issue only applies when the two pointers are
passed to functions. Within the function body where the two pointers
are created, the compiler should be able to make sure the relationship
between them as to whether they address the same location. Am I right?
The standard doesn't distinguish between where those pointers came from. If your operation has undefined behavior, the program has undefined behavior, period. The compiler is in no way obliged to analyze the operands at compile time, but it may give you a warning.
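To make this concrete, here is a sketch of the usual well-defined alternative to reading an object through a pointer of an unrelated type. Copying the bytes with std::memcpy is not mentioned above; it is shown as an assumption about what most code wants, and the function name `float_bits` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Undefined behavior, even though both pointers are created in the same
// function body and obviously refer to the same storage:
//     float f = 1.0f;
//     std::uint32_t u = *reinterpret_cast<std::uint32_t*>(&f);

// Well-defined: copy the object representation instead of aliasing it.
std::uint32_t float_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "sketch assumes a 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

inline void float_bits_demo() {
    assert(float_bits(1.0f) != float_bits(2.0f));
}
```

Optimizing compilers commonly turn such a fixed-size memcpy into a plain register move, so the defined version usually costs nothing.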
Implementations which are designed and intended to be suitable for low-level programming should have no particular difficulty recognizing common patterns where storage of one type is reused or reinterpreted as another in situations not involving aliasing, provided that:
Within any particular function or loop, all pointers or lvalues used to access a particular piece of storage are derived from lvalues of a common type which identify the same object or elements of the same array, and
Between the creation of a derived-type pointer and the last use of it or any pointer derived from it, all operations involving the storage are performed only using the derived pointer or other pointers derived from it.
Most low-level programming scenarios requiring reuse or reinterpretation of storage fit these criteria, and handling code that fits these criteria will typically be rather straightforward in an implementation designed for low-level programming. If an implementation caches lvalues in registers and performs loop hoisting, for example, it could support the above semantics reasonably efficiently by flushing all cached values of type T whenever T or T* is used to form a pointer or lvalue of another type. Such an approach may not be optimal, but would degrade performance much less than having to block all type-based optimizations entirely.
Note that it is probably in many cases not worthwhile for even an implementation intended for low-level programming to try to handle all possible scenarios involving aliasing. Doing that would be much more expensive than handling the far more common scenarios that don't involve aliasing.
Implementations which are specialized for other purposes are, of course, not required to make any attempt whatsoever to support any exceptions to 6.5p7--not even those that are often treated as part of the Standard. Whether such an implementation should be able to support such constructs would depend upon the particular purposes for which it is designed.
When should I define a type as a struct or as a class?
I know that structs are value types while classes are reference types. So I wonder, for example, should I define a stack as a struct or a class?
Reason #1 to choose struct vs class: classes have inheritance, structs do not. If you need polymorphism, you must use classes.
Reason #2: structs are normally value types (though you can make them reference types if you work at it). Classes are always reference types. So, if you want a value type, choose a struct. If you want a reference type, it's easiest to go with a class.
Reason #3: If you have a type with a lot of data members, then you're probably going to want a reference type (to avoid expensive copying), in which case, you're probably going to choose a class.
Reason #4: If you want deterministic destruction of your type, then it's going to need to be a struct on the stack. Nothing on the GC heap has deterministic destruction, and the destructors/finalizers of stuff on the GC heap may never be run. If they're collected by the GC, then their finalizers will be run, but otherwise, they won't. So, if you want your type to automatically be destroyed when it leaves scope, you need to use a struct and put it on the stack.
As for your particular case, containers should normally be reference types (copying all of their elements every time that you pass one around would be insanely expensive), and a Stack is a container, so you're going to want to use a class unless you want to go to the trouble of making it a ref-counted struct, which is decidedly more work. It just has the advantage of guaranteeing that its destructor will run when it's not used anymore.
On a side note, if you create a container which is a class, you're probably going to want to make it final so that its various functions can be inlined (and won't be virtual if that class doesn't derive from anything other than Object and they're not functions that Object has), which can be important for something like a container where performance can definitely matter.
Read "D"iving Into the D Programming Language
In D you get structs and then you get classes. They share many amenities but have different charters: structs are value types, whereas classes are meant for dynamic polymorphism and are accessed solely by reference. That way confusions, slicing-related bugs, and comments à la // No! Do NOT inherit! do not exist. When you design a type, you decide upfront whether it'll be a monomorphic value or a polymorphic reference. C++ famously allows defining ambiguous-gender types, but their use is rare, error-prone, and objectionable enough to warrant simply avoiding them by design.
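For readers unfamiliar with the slicing problem the quote alludes to, here is a short C++ sketch of it (in C++ rather than D, because the bug is a C++ one; the `Animal` and `Dog` types and both functions are hypothetical):

```cpp
#include <cassert>
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string name() const { return "animal"; }
};

struct Dog : Animal {
    std::string name() const override { return "dog"; }
};

// Taking a polymorphic type by value copies only the Animal part:
// the Dog is "sliced" and its override is lost.
std::string by_value(Animal a) { return a.name(); }

// Taking it by reference preserves the dynamic type.
std::string by_reference(const Animal& a) { return a.name(); }

inline void slicing_demo() {
    Dog d;
    assert(by_value(d) == "animal");
    assert(by_reference(d) == "dog");
}
```

D's split between value-type structs and reference-only classes is meant to rule out the `by_value` case by construction.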
For your Stack type, you are probably best off defining an interface first and then implementations thereof (using class) so that you don't tie-in a particular implementation of your Stack type to its interface.