We often hear/read that one should avoid dynamic casting. I was wondering what would be 'good use' examples of it, according to you?
Edit:
Yes, I'm aware of that other thread: it is indeed when reading one of the first answers there that I asked my question!
This recent thread gives an example of where it comes in handy. There is a base Shape class and classes Circle and Rectangle derived from it. In testing for equality, it is obvious that a Circle cannot be equal to a Rectangle and it would be a disaster to try to compare them. While iterating through a collection of pointers to Shapes, dynamic_cast does double duty, telling you if the shapes are comparable and giving you the proper objects to do the comparison on.
Vector iterator not dereferencable
Here's something I do often, it's not pretty, but it's simple and useful.
I often work with template containers that implement an interface,
imagine something like
template<class T>
class MyVector : public ContainerInterface
...
Where ContainerInterface has basic useful stuff, but that's all. If I want a specific algorithm on vectors of integers without exposing my template implementation, it is useful to accept the interface objects and dynamic_cast it down to MyVector in the implementation. Example:
// function prototype (public API, in the header file)
void ProcessVector( ContainerInterface& vecIfce );
// function implementation (private, in the .cpp file)
void ProcessVector( ContainerInterface& vecIfce)
{
MyVector<int>& vecInt = dynamic_cast<MyVector<int> >(vecIfce);
// the cast throws bad_cast in case of error but you could use a
// more complex method to choose which low-level implementation
// to use, basically rolling by hand your own polymorphism.
// Process a vector of integers
...
}
I could add a Process() method to the ContainerInterface that would be polymorphically resolved, it would be a nicer OOP method, but I sometimes prefer to do it this way. When you have simple containers, a lot of algorithms and you want to keep your implementation hidden, dynamic_cast offers an easy and ugly solution.
You could also look at double-dispatch techniques.
HTH
My current toy project uses dynamic_cast twice; once to work around the lack of multiple dispatch in C++ (it's a visitor-style system that could use multiple dispatch instead of the dynamic_casts), and once to special-case a specific subtype.
Both of these are acceptable, in my view, though the former at least stems from a language deficit. I think this may be a common situation, in fact; most dynamic_casts (and a great many "design patterns" in general) are workarounds for specific language flaws rather than something that aim for.
It can be used for a bit of run-time type-safety when exposing handles to objects though a C interface. Have all the exposed classes inherit from a common base class. When accepting a handle to a function, first cast to the base class, then dynamic cast to the class you're expecting. If they passed in a non-sensical handle, you'll get an exception when the run-time can't find the rtti. If they passed in a valid handle of the wrong type, you get a NULL pointer and can throw your own exception. If they passed in the correct pointer, you're good to go.
This isn't fool-proof, but it is certainly better at catching mistaken calls to the libraries than a straight reinterpret cast from a handle, and waiting until some data gets mysteriously corrupted when you pass the wrong handle in.
Well it would really be nice with extension methods in C#.
For example let's say I have a list of objects and I want to get a list of all ids from them. I can step through them all and pull them out but I would like to segment out that code for reuse.
so something like
List<myObject> myObjectList = getMyObjects();
List<string> ids = myObjectList.PropertyList("id");
would be cool except on the extension method you won't know the type that is coming in.
So
public static List<string> PropertyList(this object objList, string propName) {
var genList = (objList.GetType())objList;
}
would be awesome.
It is very useful, however, most of the times it is too useful: if for getting the job done the easiest way is to do a dynamic_cast, it's more often than not a symptom of bad OO design, what in turn might lead to trouble in the future in unforeseen ways.
Related
There are times when knowing the actual subtype of a polymorphic object is beneficial - like ensuring no virtual calls happen in a loop where we invoke a virtual function repeatedly. Making std::variant extendable towards the end-users of a library is a huge pain, it's a lot nicer to have an interface instead.
Ideally, I could do something like this:
struct base {
virtual void foo() = 0;
};
struct derived : base {
virtual void foo() override {}
};
// Sample
base* b = new derived{};
b->invoke_with_subtype([](auto* subtype) {
static_assert(std::is_same_v<decltype(subtype), derived*>);
});
Of course, life isn't that ideal, I'd need to inject that function somehow. I was thinking some kind of CRTP trick, but I keep running into the same problem again and again, that I just can't erase that passed generic lambda function.
TL;DR: Is it possible to do dynamic dispatch on polymorphic types without a lot of boilerplate and knowing all the subtypes in advance?
No.
Solving the problem as described requires runtime/dynamic link time reification of template code. This is a feature that C++ lacks.
Modest restrictions remove that need, but all have tradeoffs. Most require a list of supported types somewhere, or a list of supported operations.
Fancy alternative object models permit splitting vtables from instances and "RLE"-ing types in buffers and similar, but that also doesn't do what you describe.
TL;DR remains "No" is the answer to your question, you cannot do that.
SO is best suited to concrete, practical problems. In my experience if you have an actual practical use case, 9/10 there is an implicit restriction in your practical use case that makes your problem solvable.
Instead, you produce an abstract solition that would be sufficient to solve your real problem. The issue is that your abstract solution isn't possible to solve.
Of course, 99/100 questions like this don't have an actual practical concrete problem they are solving.
If you have an actual practical concrete problem, I encourage you to post a question about it instead of this abstraction. My hobby is finding SO posts whose sensible answer is "no" and providing insensible "yes" answers.
You can use derived* test = dynamic_cast<derived*>(object) to test for a specific sub-type: if object really is derived then test will be a pointer to it but, if it's actually a base then test will be nullptr.
Microsoft's GDI+ defines many empty classes to be treated as handles internally. For example, (source GdiPlusGpStubs.h)
//Approach 1
class GpGraphics {};
class GpBrush {};
class GpTexture : public GpBrush {};
class GpSolidFill : public GpBrush {};
class GpLineGradient : public GpBrush {};
class GpPathGradient : public GpBrush {};
class GpHatch : public GpBrush {};
class GpPen {};
class GpCustomLineCap {};
There are other two ways to define handles. They're,
//Approach 2
class BOOK; //no need to define it!
typedef BOOK *PBOOK;
typedef PBOOK HBOOK; //handle to be used internally
//Approach 3
typedef void* PVOID;
typedef PVOID HBOOK; //handle to be used internally
I just want to know the advantages and disadvantages of each of these approaches.
One advantage with Microsoft's approach is that, they can define type-safe hierarchy of handles using empty classes, which (I think) is not possible with the other two approaches, though I wonder what advantages this hierarchy would bring to the implementation? Anyway, what else?
EDIT:
One advantage with the second approach (i.e using incomplete classes) is that we can prevent clients from dereferencing the handles (that means, this approach appears to support encapsulation strongly, I suppose). The code would not even compile if one attempts to dereference handles. What else?
The same advantage one has with third approach as well, that you cannot dereference the handles.
Approach #1 is some mid-way between C style and C++ interface. Instead of member functions you have to pass the handle as argument. The advantage of exposed polymorphism is that you can reduce the amount of functions in interface and the types are checked compile time. Usually most experts prefer pimpl idiom (sometimes called compilation firewall) to such interface. You can not use approach #1 to interface with C so better go full C++.
Approach #2 is C style encapsulation and information hiding. The pointer may be (and often is) a pointer to real thing, so it is not over-engineered. User of library may not dereference that pointer. Disadvantage is that it does not expose any polymorphism. Advantage is that you may use it when interfacing with modules written in C.
Approach #3 is over-abstracted C-style encapsulation. The pointer may be really not a pointer at all since user of library should not cast, deallocate or dereference it. Advantage is that it may so carry exception or error values, disadvantage is that most of it has to be checked run time.
I agree with DeadMG that language-neutral object-oriented interfaces are very easy and elegant to use from C++, but these also involve more run-time checks than compile time checks and are overkill when i don't need to interface with other languages. So i personally prefer Approach #2 if it needs to interface with C or Pimpl idiom when it is C++ only.
2 and 3 are slightly less typesafe as they allow to use handles instead of void*
void bluescreeen(HBOOK hb){
memset(hb,0,100000); // no compile errors
}
Approach 3 is not very good at all, as it allows the mixing and matching of handle types that don't actually make sense, any function that takes a HANDLE can take any HANDLE, even if it's compile-time determinable that that is the wrong type.
The downside of Approach 1 is that you have to do a bunch of casting on the other end to their actual types.
Approach 2 isn't that bad, except you can't do any kind of inheritance with it without having to externally query every time.
However, all of this is entirely moot ever since compilers discovered how to implement efficient virtual functions. The approach taken by DirectX and COM is the best- it's very flexible, powerful, and completely type-safe.
It even allows for some truly insane things, like you can inherit from DirectX interfaces and extend it that way. One of the best advantages of this is Direct2D and Direct3D11. They're not actually compatible (which is truly, horrendously stupid), but you can define a proxy type that inherits from ID3D10Device1 and forwards to the ID3D11Device and solve the problem like that. That kind of thing would never even think about being possible with any of the above approaches.
Oh, and last thing: You really, really shouldn't name your types in allcaps.
Over time I have come to appreciate the mindset of many small functions ,and I really do like it a lot, but I'm having a hard time losing my shyness to apply it to classes, especially ones with more than a handful of nonpublic member variables.
Every additional helper function clutters up the interface, since often the code is class specific and I can't just use some generic piece of code.
(To my limited knowledge, anyway, still a beginner, don't know every library out there, etc.)
So in extreme cases, I usually create a helper class which becomes the friend of the class that needs to be operated on, so it has access to all the nonpublic guts.
An alternative are free functions that need parameters, but even though premature optimization is evil, and I haven't actually profiled or disassembled it...
I still DREAD the mere thought of passing all the stuff I need sometimes, even just as reference, even though that should be a simple address per argument.
Is all this a matter of preference, or is there a widely used way of dealing with that kind of stuff?
I know that trying to force stuff into patterns is a kind of anti pattern, but I am concerned about code sharing and standards, and I want to get stuff at least fairly non painful for other people to read.
So, how do you guys deal with that?
Edit:
Some examples that motivated me to ask this question:
About the free functions:
DeadMG was confused about making free functions work...without arguments.
My issue with those functions is that unlike member functions, free functions only know about data, if you give it to them, unless global variables and the like are used.
Sometimes, however, I have a huge, complicated procedure I want to break down for readability and understandings sake, but there are so many different variables which get used all over the place that passing all the data to free functions, which are agnostic to every bit of member data, looks simply nightmarish.
Click for an example
That is a snippet of a function that converts data into a format that my mesh class accepts.
It would take all of those parameter to refactor this into a "finalizeMesh" function, for example.
At this point it's a part of a huge computer mesh data function, and bits of dimension info and sizes and scaling info is used all over the place, interwoven.
That's what I mean with "free functions need too many parameters sometimes".
I think it shows bad style, and not necessarily a symptom of being irrational per se, I hope :P.
I'll try to clear things up more along the way, if necessary.
Every additional helper function clutters up the interface
A private helper function doesn't.
I usually create a helper class which becomes the friend of the class that needs to be operated on
Don't do this unless it's absolutely unavoidable. You might want to break up your class's data into smaller nested classes (or plain old structs), then pass those around between methods.
I still DREAD the mere thought of passing all the stuff I need sometimes, even just as reference
That's not premature optimization, that's a perfectly acceptable way of preventing/reducing cognitive load. You don't want functions taking more than three parameters. If there are more then three, consider packaging your data in a struct or class.
I sometimes have the same problems as you have described: increasingly large classes that need too many helper functions to be accessed in a civilized manner.
When this occurs I try to seperate the class in multiple smaller classes if that is possible and convenient.
Scott Meyers states in Effective C++ that friend classes or functions is mostly not the best option, since the client code might do anything with the object.
Maybe you can try nested classes, that deal with the internals of your object. Another option are helper functions that use the public interface of your class and put the into a namespace related to your class.
Another way to keep your classes free of cruft is to use the pimpl idiom. Hide your private implementation behind a pointer to a class that actually implements whatever it is that you're doing, and then expose a limited subset of features to whoever is the consumer of your class.
// Your public API in foo.h (note: only foo.cpp should #include foo_impl.h)
class Foo {
public:
bool func(int i) { return impl_->func(i); }
private:
FooImpl* impl_;
};
There are many ways to implement this. The Boost pimpl template in the Vault is pretty good. Using smart pointers is another useful way of handling this, too.
http://www.boost.org/doc/libs/1_46_1/libs/smart_ptr/sp_techniques.html#pimpl
An alternative are free functions that
need parameters, but even though
premature optimization is evil, and I
haven't actually profiled or
disassembled it... I still DREAD the
mere thought of passing all the stuff
I need sometimes, even just as
reference, even though that should be
a simple address per argument.
So, let me get this entirely straight. You haven't profiled or disassembled. But somehow, you intend on ... making functions work ... without arguments? How, exactly, do you propose to program without using function arguments? Member functions are no more or less efficient than free functions.
More importantly, you come up with lots of logical reasons why you know you're wrong. I think the problem here is in your head, which possibly stems from you being completely irrational, and nothing that any answer from any of us can help you with.
Generic algorithms that take parameters are the basis of modern object orientated programming- that's the entire point of both templates and inheritance.
As discusses in The c++ Programming Language 3rd Edition in section 12.2.5, type fields tend to create code that is less versatile, error-prone, less intuitive, and less maintainable than the equivalent code that uses virtual functions and polymorphism.
As a short example, here is how a type field would be used:
void print(const Shape &s)
{
switch(s.type)
{
case Shape::TRIANGE:
cout << "Triangle" << endl;
case Shape::SQUARE:
cout << "Square" << endl;
default:
cout << "None" << endl;
}
}
Clearly, this is a nightmare as adding a new type of shape to this and a dozen similar functions would be error-prone and taxing.
Despite these shortcomings and those described in TC++PL, are there any examples where such an implementation (using a type field) is a better solution than utilizing the language features of virtual functions? Or should this practice be black listed as pure evil?
Realistic examples would be preferred over contrived ones, but I'd still be interested in contrived examples. Also, have you ever seen this in production code (even though virtual functions would have been easier)?
When you "know" you have a very specific, small, constant set of types, it can be easier to hardcode them like this. Of course, constants aren't and variables don't, so at some point you might have to rewrite the whole thing anyway.
This is, more or less, the technique used for discriminated unions in several of Alexandrescu's articles.
For example, if I was implementing a JSON library, I'd know each Value can only be an Object, Array, String, Integer, Boolean, or Null—the spec doesn't allow any others.
A type enum can be serialized via memcpy, a v-table can't. A similar feature is that corruption of a type enum value is easy to handle, corruption of the v-table pointer means instant badness. There's no portable way to even test a v-table pointer for validity, calling dynamic_cast or typeinfo to do RTTI checks on an invalid object is undefined behavior.
For example, one instance where I choose to use a type hierarchy with static dispatch controlled by a discriminator and not dynamic dispatch is when passing a pointer to a structure through a Windows message queue. This gives me some protection against other software that may have allocated broadcast messages from a range I'm using (it's supposed to be reserved for app-local messages, do not pass GO if you think that rule is actually respected).
The following guideline is from Clean Code by Robert C. Martin.
"My general rule for switch statements is that they can be tolerated if they appear only once, are used to create polymorphic objects, and are hidden behind an inheritance relationship so that the rest of the system can't see them".
The rationale is this: if you expose type fields to the rest of your code, you'll get multiple instances of the above switch statement. That is a clear violation of DRY. When you add a type, all these switches need to change (or, even worse, they become inconsistent without breaking your build).
My take is: It depends.
A parameterized Factory Method design pattern relies on this technique.
class Creator {
public:
virtual Product* Create(ProductId);
};
Product* Creator::Create (ProductId id) {
if (id == MINE) return new MyProduct;
if (id == YOURS) return new YourProduct;
// repeat for remaining products...
return 0;
}
So, is this bad. I don't think so as we do not have any other alternative at this stage. This is a place where it is absolutely necessary as it involves creation of an object. The type of the object is yet to be known.
The example in OP is however an example which sure needs refactoring. Here we are already dealing with an existing object/type (passed as argument to function).
As Herb Sutter mentions -
"Switch off: Avoid switching on the
type of an object to customize
behavior. Use templates and virtual
functions to let types (not their
calling code) decide their behavior."
Aren't there costs associated to virtual functions and polymorphism? Like maintaining a vtable per class, increase of each class object size by 4 byptes, runtime slowness (I have never measured it though) for resolving the virtual function appropriately. So, for simple situations, using a type field appears acceptable.
I think that if the type corresponds precisely to the implied classes then type is wrong. Where it gets complicated is where the type does not quite match or its not so cut and dried.
Taking your example what if type was Red, Green, Blue. Those are types of shapes. You could even have a color class as a mixin; but its probably too much.
I am thinking of using a type field to solve the problem of vector slicing. That is, I want a vector of hierarchical objects. For example I want my vector to be a vector of shapes, but I want to store circles, rectangles, triangles etc.
You can't do that in the most obvious simple way because of slicing. So the normal solution is to have a vector of pointers or smart pointers instead. But I think there are cases where using a type field will be a simpler solution, (avoids new/delete or alternative lifecycle techniques).
The best example I can think of (and the one I've run into before), is when your set of types is fixed and the set of functions you want to do (that depend on those types) is fluid. That way, when you add a new function, you modify a single place (adding a single switch) rather than adding a new base virtual function with the real implementation scattered all across the classes in your type hierarchy.
I don't know of any realistic examples. Contrived ones would depend on there being some good reason that virtual methods cannot be used.
A big reason why I use OOP is to create code that is easily reusable. For that purpose Java style interfaces are perfect. However, when dealing with C++ I really can't achieve any sort of functionality like interfaces... at least not with ease.
I know about pure virtual base classes, but what really ticks me off is that they force me into really awkward code with pointers. E.g. map<int, Node*> nodes; (where Node is the virtual base class).
This is sometimes ok, but sometimes pointers to base classes are just not a possible solution. E.g. if you want to return an object packaged as an interface you would have to return a base-class-casted pointer to the object.. but that object is on the stack and won't be there after the pointer is returned. Of course you could start using the heap extensively to avoid this but that's adding so much more work than there should be (avoiding memory leaks).
Is there any way to achieve interface-like functionality in C++ without have to awkwardly deal with pointers and the heap?? (Honestly for all that trouble and awkardness id rather just stick with C.)
You can use boost::shared_ptr<T> to avoid the raw pointers. As a side note, the reason why you don't see a pointer in the Java syntax has nothing to do with how C++ implements interfaces vs. how Java implements interfaces, but rather it is the result of the fact that all objects in Java are implicit pointers (the * is hidden).
Template MetaProgramming is a pretty cool thing. The basic idea? "Compile time polymorphism and implicit interfaces", Effective C++. Basically you can get the interfaces you want via templated classes. A VERY simple example:
template <class T>
bool foo( const T& _object )
{
if ( _object != _someStupidObject && _object > 0 )
return true;
return false;
}
So in the above code what can we say about the object T? Well it must be compatible with '_someStupidObject' OR it must be convertible to a type which is compatible. It must be comparable with an integral value, or again convertible to a type which is. So we have now defined an interface for the class T. The book "Effective C++" offers a much better and more detailed explanation. Hopefully the above code gives you some idea of the "interface" capability of templates. Also have a look at pretty much any of the boost libraries they are almost all chalk full of templatization.
Considering C++ doesn't require generic parameter constraints like C#, then if you can get away with it you can use boost::concept_check. Of course, this only works in limited situations, but if you can use it as your solution then you'll certainly have faster code with smaller objects (less vtable overhead).
Dynamic dispatch that uses vtables (for example, pure virtual bases) will make your objects grow in size as they implement more interfaces. Managed languages do not suffer from this problem (this is a .NET link, but Java is similar).
I think the answer to your question is no - there is no easier way. If you want pure interfaces (well, as pure as you can get in C++), you're going to have to put up with all the heap management (or try using a garbage collector. There are other questions on that topic, but my opinion on the subject is that if you want a garbage collector, use a language designed with one. Like Java).
One big way to ease your heap management pain somewhat is auto pointers. Boost has a nice automatic pointer that does a lot of heap management work for you. The std::auto_ptr works, but it's quite quirky in my opinion.
You might also evaluate whether you really need those pure interfaces or not. Sometimes you do, but sometimes (like some of the code I work with), the pure interfaces are only ever instantiated by one class, and thus just become extra work, with no benefit to the end product.
While auto_ptr has some weird rules of use that you must know*, it exists to make this kind of thing work easily.
auto_ptr<Base> getMeAThing() {
return new Derived();
}
void something() {
auto_ptr<Base> myThing = getMeAThing();
myThing->foo(); // Calls Derived::foo, if virtual
// The Derived object will be deleted on exit to this function.
}
*Never put auto_ptrs in containers, for one. Understand what they do on assignment is another.
This is actually one of the cases in which C++ shines. The fact that C++ provides templates and functions that are not bound to a class makes reuse much easier than in pure object oriented languages. The reality though is that you will have to adjust they manner in which you write your code in order to make use of these benefits. People that come from pure OO languages often have difficulty with this, but in C++ an objects interface includes not member functions. In fact it is considered to be good practice in C++ to use non-member functions to implement an objects interface whenever possible. Once you get the hang of using template nonmember functions to implement interfaces, well it is a somewhat life changing experience. \