Last week a young student asked me if marshalling is the same as casting.
My answer was definitely no. Marshalling is serialization, the way to transform the
memory representation of an object into bytes in order to transmit it over a
network, whereas casting is related to type conversion / coercion.
Later on, rethinking the question, I thought that marshalling can be seen as a special case of casting. For example, if the memory representation is transformed into XML, then one can say that you are "casting" to the type defined by the corresponding XSD grammar of that XML file.
What do you think about this?
Casting doesn't modify the data itself. That is a major distinction. When you marshal something, you are transforming the data into something else.
A simple cast only changes how you are interpreting the object, not what the object is internally.
I agree that the distinction should be clear, otherwise people unfamiliar with the terms may be confused.
They're both a "type conversion", but they are different kinds of type conversion: casting is usually between related object types (e.g. a downcast from a superclass to a subclass), whereas marshalling might be, for example, from an object graph to a plain-text representation.
Marshalling is generally about a technology boundary (e.g. going across a network, or from one memory type to another as in the case of managed/unmanaged), whereas casting generally stays within the same technology boundary. Therefore I think they are definitely different things.
It would be exceptionally confusing to use the same term for both approaches: they behave differently, so they need separate definitions.
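To make the contrast concrete, here is a minimal sketch. The names Shape, Circle, as_circle and marshal are made up for illustration, and the text format is an arbitrary choice, not any standard wire format. The cast hands back the same object viewed through a different type; marshalling builds a new, self-contained representation.

```cpp
#include <sstream>
#include <string>

// Hypothetical types, for illustration only.
struct Shape {
    virtual ~Shape() = default;
};

struct Circle : Shape {
    double radius = 1.5;
};

// Casting: the object is untouched; only the static type we view it through changes.
inline Circle* as_circle(Shape* s) {
    return dynamic_cast<Circle*>(s);  // null if *s is not a Circle
}

// Marshalling: a new representation is produced that can leave the process.
inline std::string marshal(const Circle& c) {
    std::ostringstream out;
    out << "circle radius=" << c.radius;
    return out.str();
}
```

After the cast you still hold a pointer to the original object; after marshalling you hold a separate string that no longer depends on it.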
I recently learned that it is Undefined Behavior to reinterpret a POD as a different POD by reinterpret_casting its address. So I'm just wondering what a potential use-case of reinterpret_cast might be, if it can't be used for what its name suggests?
There are two situations in which I’ve used reinterpret_cast:
To cast to and from char* for the purpose of serialisation or when talking to a legacy API. In this case the cast from char* to an object pointer is still strictly speaking UB (even though done very frequently). And you don’t actually need reinterpret_cast here — you could use memcpy instead, but the cast might under specific circumstances avoid a copy (but in situations where reinterpreting the bytes is valid in the first place, memcpy usually doesn’t generate any redundant copies either, the compiler is smart enough for that).
To cast pointers from/to std::uintptr_t to serialise them across a legacy API or to perform some non-pointer arithmetic on them. This is definitely an odd beast and doesn’t happen frequently (even in low-level code) but consider the situation where one wants to exploit the fact that pointers on a given platform don’t use the most significant bits, and these bits can thus be used to store some bit flags. Garbage collector implementations occasionally do this. The lower bits of a pointer can sometimes also be used, if the programmer knows that the pointer will always be aligned e.g. at an 8 byte boundary (so the lowest three bits must be 0).
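Both situations can be sketched roughly as follows. The helper names (read_u32, tag_pointer, untag_pointer, tag_of) are invented for this example. The tagging part assumes a platform where an 8-byte-aligned pointer converts to an integer whose lowest three bits are zero; that is common, but it is implementation-defined rather than guaranteed by the standard.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Situation 1: read a value out of a byte buffer with memcpy
// instead of casting the char* to an object pointer.
inline std::uint32_t read_u32(const char* bytes) {
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

// Situation 2: keep a small tag in the low bits of an aligned pointer.
// Assumption: the pointer is 8-byte aligned, so its low three bits are 0.
constexpr std::uintptr_t kTagMask = 0x7;

inline std::uintptr_t tag_pointer(void* p, unsigned tag) {
    auto bits = reinterpret_cast<std::uintptr_t>(p);
    assert((bits & kTagMask) == 0 && tag <= kTagMask);
    return bits | tag;
}

inline void* untag_pointer(std::uintptr_t tagged) {
    return reinterpret_cast<void*>(tagged & ~kTagMask);
}

inline unsigned tag_of(std::uintptr_t tagged) {
    return static_cast<unsigned>(tagged & kTagMask);
}
```

The tagged value must be untagged before it is dereferenced; using it directly as a pointer would be an error.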
But to be honest I can’t remember the last concrete, legitimate situation where I’ve actually used reinterpret_cast. It’s definitely many years ago.
Conforming implementations of C and C++ are allowed to extend the semantics of C or C++ by behaving meaningfully even in cases where the Standards would not require them to do so. Implementations that do so may be more suitable for a wider range of tasks than implementations that do not. In many cases, it is useful to have consistent syntax to specify constructs which will be processed meaningfully and consistently by implementations that are designed to be suitable for low-level programming tasks, even if implementations which are not designed to be suitable for such purposes would process them nonsensically.
One very frequent use case is when you're working with C library functions that take an opaque void * that gets forwarded to a callback function. Using reinterpret_cast on both sides of the fence, so to speak, keeps everything proper.
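A rough sketch of that pattern, with a made-up stand-in for the C library (library_fire_event and callback_t are not a real API). For conversions to and from void* specifically, static_cast would work just as well; reinterpret_cast is used here only to mirror the description above.

```cpp
// Hypothetical user state that the callback needs access to.
struct Counter {
    int total = 0;
};

// Stand-in for a C-style API that forwards an opaque pointer to a callback.
using callback_t = void (*)(int event, void* user_data);

inline void library_fire_event(callback_t cb, void* user_data) {
    cb(7, user_data);  // the "library" knows nothing about Counter
}

// Receiving side: turn the opaque pointer back into the type we passed in.
inline void on_event(int event, void* user_data) {
    auto* counter = reinterpret_cast<Counter*>(user_data);
    counter->total += event;
}

// Sending side: hand our object to the library as void*.
inline void register_and_fire(Counter& c) {
    library_fire_event(&on_event, reinterpret_cast<void*>(&c));
}
```

This is only safe as long as both sides agree on the type: the pointer must be cast back to exactly the type it was created from.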
Is there an adequate way in C++ to extract a hash from the data that std::any stores?
Or, failing that, at least to get at the stored object as a sequence of bytes plus its length?
std::any is a type-safe mechanism for passing an object of known type from one location to another, through an intermediary that does not need to know what that type is. Computing a hash from it is not its goal. And indeed, it wouldn't be meaningfully possible without compromising any's functionality.
Hashing an object requires some knowledge of what that object is and is doing. Assuming that you can just look at the bytes of the object representation and thereby compute a meaningful hash from it is not going to end well. It might appear to work... for a while. But eventually, it's going to do the wrong thing.
You could create a type-erased type similar to any that requires the object to implement hashing. But std::any is not that type, because anyone who doesn't want to hash the types they put into any would be unable to store said object in any.
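As a rough idea of what such a type could look like, here is a minimal sketch. HashableAny is an invented name, not a standard or Boost facility, and it assumes every stored type has a std::hash specialization. It wraps a std::any and remembers, at construction time, how to hash the concrete type.

```cpp
#include <any>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical wrapper: like std::any, but only accepts types usable with std::hash.
class HashableAny {
public:
    template <typename T>
    explicit HashableAny(T value)
        : value_(std::move(value)),
          // Captureless lambda that still knows T; it converts to a plain function pointer.
          hasher_([](const std::any& a) -> std::size_t {
              return std::hash<T>{}(std::any_cast<const T&>(a));
          }) {}

    std::size_t hash() const { return hasher_(value_); }
    const std::any& value() const { return value_; }

private:
    std::any value_;
    std::size_t (*hasher_)(const std::any&);
};
```

Trying to store a type without a std::hash specialization fails to compile, which is exactly the restriction described above.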
This is because any operation that any provides is an operation that all types that get stored into any must also provide. For example, any is copyable, therefore any cannot store move-only types. That is an annoyance for those who want to do so, and the more functionality you dump into any, the more restrictive the type's ability to store "any"thing becomes.
Is there a term for a class/struct that is both trivial and standard-layout but also has no pointer members?
Basically I'd like to refer to "really" plain-old-data types. Data that I can grab from memory and store on disk, and read back into memory for later processing because it is nothing more than a collection of ints, characters, enums, etc.
Is there a way to test at compile time if a type is a "really" plain-old-data type?
related:
What are POD types in C++?
What are Aggregates and PODs and how/why are they special?
This can depend on the semantics of the structure. I could imagine a struct having int fields that are keys into some volatile temporary data store (or cache). You still shouldn't serialize those, but you need internal knowledge about that struct to be able to tell (see note 1).
In general, C++ lacks features for generic serialization. Checking only for pointers would be just the tip of the iceberg (even if it is a fairly accurate heuristic), and it is also impossible to do in a generic way. C++, up to and including C++23, has no reflection, and thus no way to check "every member" for some condition.
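For illustration, the part that can be checked at compile time looks roughly like this. The struct names and the combined trait are made up for the example. Note that the check passes for a struct with a pointer member, which is precisely the limitation.

```cpp
#include <string>
#include <type_traits>

// Hypothetical example types.
struct Plain {
    int id;
    char tag;
};

struct WithPointer {
    int id;
    const char* name;  // not safe to write to disk, but invisible to the traits
};

// Trivial and standard-layout: the closest standard traits to "POD".
template <typename T>
inline constexpr bool is_trivial_standard_layout_v =
    std::is_trivial_v<T> && std::is_standard_layout_v<T>;

static_assert(is_trivial_standard_layout_v<Plain>);
static_assert(is_trivial_standard_layout_v<WithPointer>);  // passes despite the pointer
static_assert(!is_trivial_standard_layout_v<std::string>);
```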
The realistic approaches could be:
preprocessing the class sources before build to scan for pointers
declaring all structs that are to be serialized with some macros that track the types
the regular template check could be implemented for a set of known names for fields
All of those have their limitations, though, and together with my earlier reservations, I'm not sure how practical they'd be.
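The macro-based option could be sketched along these lines. Everything here (is_disk_safe, DECLARE_DISK_SAFE, to_bytes, from_bytes, Sample) is hypothetical. The macro is only an opt-in promise by the author of the struct; the compiler still cannot verify that no member is a pointer or a key into volatile state.

```cpp
#include <array>
#include <cstring>
#include <type_traits>

// Opt-in trait: false unless the author explicitly declares a type disk-safe.
template <typename T>
struct is_disk_safe : std::false_type {};

#define DECLARE_DISK_SAFE(T)                                            \
    static_assert(std::is_trivial_v<T> && std::is_standard_layout_v<T>, \
                  #T " must be trivial and standard-layout");           \
    template <>                                                         \
    struct is_disk_safe<T> : std::true_type {}

struct Sample {
    int id;
    char grade;
};
DECLARE_DISK_SAFE(Sample);

// Copy the object representation into a byte array (only for opted-in types).
template <typename T>
std::array<unsigned char, sizeof(T)> to_bytes(const T& value) {
    static_assert(is_disk_safe<T>::value, "type was not declared disk-safe");
    std::array<unsigned char, sizeof(T)> out{};
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}

// Rebuild an object from bytes produced by to_bytes on the same platform.
template <typename T>
T from_bytes(const std::array<unsigned char, sizeof(T)>& bytes) {
    static_assert(is_disk_safe<T>::value, "type was not declared disk-safe");
    T value;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return value;
}
```

The bytes are only meaningful to a program built with the same layout, padding and endianness, so this is not a portable file format.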
1 This of course goes both ways; pointers could be used to store relative offsets, and thus be perfectly serializable.
In the WinAPI, WndProc has wParam and lParam, which are pointer-sized integer types (WPARAM and LPARAM). This means you generally have to typecast them into the correct type.
I've read that message systems in OOP should not need to cast things and that this is a bad practice. Therefore, in a language like C++, how would a basic message system work, where each message has 2 parameters, or even object pointers, depending on the message, but doing so without typecasting?
Thanks
For the general case I doubt you can do without some typecasting.
However, in a C++-level design the typecasting can mostly be centralized.
Look up visitor pattern.
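A bare-bones sketch of the idea, with invented message types (KeyPressed, MouseMoved) and an invented Describer visitor. Each message dispatches to the matching overload itself, so the receiving code never casts.

```cpp
#include <string>

struct KeyPressed;
struct MouseMoved;

// One visit overload per message type.
struct MessageVisitor {
    virtual ~MessageVisitor() = default;
    virtual void visit(const KeyPressed&) = 0;
    virtual void visit(const MouseMoved&) = 0;
};

struct Message {
    virtual ~Message() = default;
    virtual void accept(MessageVisitor& v) const = 0;
};

struct KeyPressed : Message {
    int key_code = 0;
    void accept(MessageVisitor& v) const override { v.visit(*this); }
};

struct MouseMoved : Message {
    int x = 0;
    int y = 0;
    void accept(MessageVisitor& v) const override { v.visit(*this); }
};

// Example handler: receives fully typed messages, no casts needed.
struct Describer : MessageVisitor {
    std::string text;
    void visit(const KeyPressed& m) override {
        text = "key " + std::to_string(m.key_code);
    }
    void visit(const MouseMoved& m) override {
        text = "mouse " + std::to_string(m.x) + "," + std::to_string(m.y);
    }
};
```

The cost is that adding a new message type means touching the visitor interface and every visitor.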
Cheers & hth.,
The problem with type-casting is that it isn't safe. Boost provides a number of ways to do typecasting in a safe way.
If the data that could be sent in your message system is well-defined, limited to a few possible choices, then one could employ a boost::variant object. Variants are kind of like type-safe unions that have built-in visitation support.
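Here is a small sketch of the variant approach. It uses std::variant and std::visit from C++17 as a stand-in for boost::variant (the two are similar but not identical), and the message types are invented for the example.

```cpp
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical message payloads.
struct Resize {
    int width;
    int height;
};
struct Close {};
struct SetTitle {
    std::string title;
};

// The closed set of things a message can carry.
using WindowMessage = std::variant<Resize, Close, SetTitle>;

// Visitation picks the branch for the stored type; no cast is written anywhere.
inline std::string describe(const WindowMessage& msg) {
    return std::visit(
        [](const auto& m) -> std::string {
            using T = std::decay_t<decltype(m)>;
            if constexpr (std::is_same_v<T, Resize>) {
                return "resize " + std::to_string(m.width) + "x" +
                       std::to_string(m.height);
            } else if constexpr (std::is_same_v<T, Close>) {
                return "close";
            } else {
                return "title " + m.title;
            }
        },
        msg);
}
```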
However, if the set of possible data is more or less arbitrary, then you won't be able to use a variant. You still want to preserve type-safety, so that the person receiving the message cannot cast it to any type other than the one it was originally stored as. In that case, boost::any is a good choice. Yes, you still have to use an any_cast, but that will fail if it is not of the proper type.
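A minimal sketch of that behaviour, using std::any from C++17 as a stand-in for boost::any (the interfaces are close but not identical). AnyMessage and read_text are made-up names.

```cpp
#include <any>
#include <string>

// Hypothetical message with an arbitrary payload.
struct AnyMessage {
    int id;
    std::any payload;
};

// Value form of any_cast: throws std::bad_any_cast if the payload is not a std::string.
inline std::string read_text(const AnyMessage& m) {
    return std::any_cast<std::string>(m.payload);
}

// Pointer form of any_cast: returns nullptr on a type mismatch instead of throwing.
inline const int* try_read_int(const AnyMessage& m) {
    return std::any_cast<int>(&m.payload);
}
```

Either way, a receiver that guesses the wrong type gets a well-defined failure rather than a misinterpreted object.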
My knowledge of C++ at this point is more academic than anything else. In all my reading thus far, the use of explicit conversion with named casts (const_cast, static_cast, reinterpret_cast, dynamic_cast) has come with a big warning label (and it's easy to see why) that implies that explicit conversion is symptomatic of bad design and should only be used as a last resort in desperate circumstances. So, I have to ask:
Is explicit conversion with named casts really just jury rigging code or is there a more graceful and positive application to this feature? Is there a good example of the latter?
There are cases when you just can't go without it. Like this one. The problem there is that you have multiple inheritance and need to convert the this pointer to void* while ensuring that the pointer that goes into void* will still point to the right subobject of the current object. Using an explicit cast is the only way to achieve that.
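The linked example is not reproduced here, but the general issue can be sketched with invented types (Left, Right, Both). With multiple inheritance, the Right subobject typically lives at a different address than the start of the complete object, so the pointer has to be adjusted to Right* before it is turned into void*.

```cpp
// Hypothetical hierarchy with two non-empty bases.
struct Left {
    int a = 1;
    virtual ~Left() = default;
};

struct Right {
    int b = 2;
    virtual ~Right() = default;
};

struct Both : Left, Right {};

// Adjust to the Right subobject first, then let it decay to void*.
inline void* as_right_cookie(Both* obj) {
    return static_cast<Right*>(obj);
}

// The receiver assumes the cookie points at a Right, so the cast back is valid.
inline Right* from_right_cookie(void* cookie) {
    return static_cast<Right*>(cookie);
}
```

Converting the Both* directly to void* and later casting that to Right* would skip the adjustment and leave the pointer aimed at the wrong subobject.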
There's an opinion that if you can't go without a cast you have bad design. I can't agree with this completely - different situations are possible, including one mentioned above, but perhaps if you need to use explicit casts too often you really have bad design.
There are situations when you can't really avoid explicit casts, especially when interacting with C libraries or badly designed C++ libraries (like the COM library sharptooth used as an example).
In general, the use of explicit casts IS a red flag. It does not necessarily mean bad code, but it does attract attention to a potentially dangerous use.
However, you should not throw the 4 casts in the same bag: static_cast and dynamic_cast are frequently used for down-casting (from Base to Derived) or for navigating between related types. Their occurrence in the code is pretty normal (indeed it's difficult to write a Visitor Pattern without either).
On the other hand, the use of const_cast and reinterpret_cast is much more dangerous.
using const_cast to try and modify a read-only object is undefined behavior (thanks to James McNellis for correction)
reinterpret_cast is normally only used to deal with raw memory (allocators)
They have their use, of course, but should not be encountered in normal code. For dealing with external or C APIs they might be necessary though.
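A typical "external API" case might look like the sketch below. legacy_length is a made-up stand-in for an old function that takes char* even though it only reads; the wrapper name is also invented. The cast is fine only because the callee never writes through the pointer.

```cpp
#include <cstddef>

// Hypothetical legacy function: not const-correct, but it only reads the string.
inline std::size_t legacy_length(char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

// Const-correct wrapper that keeps the const_cast in one place.
inline std::size_t safe_length(const char* s) {
    // Safe only under the assumption that legacy_length does not modify *s.
    return legacy_length(const_cast<char*>(s));
}
```

If the legacy function did write through the pointer and the original object were const (a string literal, for instance), the behaviour would be undefined.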
At least that's my opinion.
How bad a cast is typically depends on the type of cast. There are legitimate uses for all of these casts, but some smell worse than others.
const_cast is used to cast away constness (since adding it doesn't require a cast). Ideally, that should never be used. It makes it easy to invoke undefined behavior (trying to change an object originally designated const), and in any case breaks the const-correctness of the program. It is sometimes necessary when interfacing with APIs that are not themselves const-correct, which may for example ask for a char * when they're going to treat it as const char *, but since you shouldn't write APIs that way it's a sign that you're using a really old API or somebody screwed up.
reinterpret_cast is always going to be platform-dependent, and is therefore at best questionable in portable code. Moreover, unless you're doing low-level operations on the physical structure of objects, it doesn't preserve meaning. In C and C++, a type is supposed to be meaningful. An int is a number that means something; an int that is basically the concatenation of chars doesn't really mean anything.
dynamic_cast is normally used for downcasting; e.g. from Base * to Derived *, with the proviso that either it works or it returns 0. This subverts OO in much the same way as a switch statement on a type tag does: it moves the code that defines what a class is away from the class definition. This couples the class definitions with other code and increases the potential maintenance load.
static_cast is used for data conversions that are known to be generally correct, such as conversions to and from void *, known safe pointer casts within the class hierarchy, that sort of thing. About the worst you can say for it is that it subverts the type system to some extent. It's likely to be needed when interfacing with C libraries, or the C part of the standard library, as void * is often used in C functions.
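For the last two casts, a short sketch with invented types (Base, Derived, Other) and helper names: dynamic_cast checks at run time and yields a null pointer on a mismatch, while static_cast back from void* is unchecked and relies on the caller knowing the original type.

```cpp
// Hypothetical hierarchy.
struct Base {
    virtual ~Base() = default;
};

struct Derived : Base {
    int value = 42;
};

struct Other : Base {};

// dynamic_cast: checked downcast; the pointer form returns null on failure.
inline int value_or(Base* b, int fallback) {
    if (auto* d = dynamic_cast<Derived*>(b)) {
        return d->value;
    }
    return fallback;
}

// Conversion to void* is implicit; coming back needs a static_cast.
inline void* to_opaque(Derived* d) {
    return d;
}

// Unchecked: correct only if the pointer really came from a Derived*.
inline Derived* from_opaque(void* p) {
    return static_cast<Derived*>(p);
}
```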
In general, well-designed and well-written C++ code will avoid the use cases above, in some cases because the only use of the cast is to do potentially dangerous things, and in other cases because such code tends to avoid the need for such conversions. The C++ type system is generally seen as a good thing to maintain, and casts subvert it.
IMO, like most things, they're tools, with appropriate uses and inappropriate ones. Casting is probably an area where the tools frequently get used inappropriately, for example, to cast between an int and pointer type with a reinterpret_cast (which can break on platforms where the two are different sizes), or to const_cast away constness purely as a hack, and so on.
If you know what they're for and the intended uses, there's absolutely nothing wrong with using them for what they were designed for.
There is an irony to explicit casts. The developer whose poor C++ design skills lead him to write code requiring a lot of casting is the same developer who doesn't use the explicit casting mechanisms appropriately, or at all, and litters his code with C-style casts.
On the other hand, the developer who understands their purpose, when to use them and when not to, and what the alternatives are, is not writing code that requires much casting!
Check out the more fine-grained variations on these casts, such as polymorphic_cast, in the boost conversions library, to give you an idea of just how careful C++ programmers are when it comes to casting.
Casts are a sign that you're trying to put a round peg in a square hole. Sometimes that's part of the job. But if you have some control over both the hole and the peg, it would be better not to create this condition, and writing a cast should trigger you to ask yourself if there was something you could have done to make this a little smoother.