I'd like to get deeper in C++. There are decisions made in STL that I'd like to understand and it's quite hard from just the code.
My idea is to implement some of the STL on my own to understand the pitfalls, and so improve both my understanding of C++ and my code. I'd also like some features the standard containers don't have, such as destruction notification for a resource-handling class. I created an extended version of my SharedPointer that holds a std::function as a deletion notifier.
And I found some trouble.
Take this code for example: SmartPointer.hpp
This is some code I came up with, and I have some questions about it.
Short:
Known problems
Derived classes won't work
Complains about incomplete type
Unknown problems
Long:
1.1. Derived classes won't work
Just having T as the type won't work after the type has been cast. The idea was to pass along OrigT as a second parameter so I always know what type ptr points to. I can then cast it back and call the correct destructor.
Considering
SharedPointer<Derived> member = base.Cast<Derived>();
This will create T = OrigT, and I assume the types will no longer match after the cast, so the assertion will fail. I can't think of a way to solve this.
if (!shared->HasReferences())
{
delete shared;
OriginalValuePointer origPtr = dynamic_cast<OriginalValuePointer>(ptr);
delete origPtr;
}
1.2. Complains about incomplete type
In my examples I get complaints about an incomplete type, but I can't figure out why. Currently I am considering making operator* and operator-> templates too, but that would be a shot in the dark. I have no clue why it complains, and I'd like to ask if you could point me to the problem here.
Same code as above in compiler complaint
2.2. I think stackoverflow is not the ideal place to ask for feedback but considering my two problems I'd like to ask anyway.
Does anyone have any sources to readable and ideally explained smart pointers? The ones I've found did not quite match my expectations. They were either too simple or did not contain explanation at the critical points.
I'd appreciate some direct feedback on the code. Apart from coding style, of course ;-). Is there anything you directly see where I made a mistake I'll regret? Is there anything that could be done better? (For example, .Cast as a member is IMHO a bad choice. For one, it is not directly a property of the pointer, and I think it might cause flaws I'm not aware of yet.)
I'm really grateful for your help and your opinion.
Stay healthy.
The standard library, and much C++ code that follows it, uses snake_case rather than CamelCase for class names.
This class isn't thread safe (you probably knew that, but it's worth calling out).
NumReferences returns the count by reference, which isn't useful and is slightly slower than returning it by value.
All methods defined inside the class are automatically inline, so you don't need that anywhere.
operator ValueType() is implicit, which is super dangerous. You can make it explicit, but I'd eliminate it entirely.
operator ValueType() needs to know the details of ValueType in order to be created. So if ValueType isn't fully defined yet, you'll get compiler errors about an incomplete type. Again, deleting the method eliminates this issue.
operator SharedPointer<U>() and operator bool() are also implicit. Prefer explicit.
Strongly consider adding asserts to all methods that use shared or ptr without first checking whether it is null.
Raw() is normally named get()
Now, on to OrigT and Release: std::shared_ptr does an interesting trick where the SharedData has inheritance:
struct SharedData {
std::atomic_uint count;
virtual ~SharedData() {}
};
template<class OrigT>
struct SharedDataImpl : SharedData {
OrigT* data;
// Runs when the block is deleted through a SharedData*.
~SharedDataImpl() override {delete data;}
};
Since all the shared_ptrs to the same data will point to the same SharedDataImpl, they don't have to know the most derived class. All they have to do is delete the SharedData member, and it'll automatically clean up the data correctly. This does require having a second data pointer: one in the SharedPointer itself and one in the SharedData, but usually this isn't an issue. (Or a virtual T* get() method)
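To make the trick concrete, here is a minimal sketch of a pointer built around such a control block. All names (ControlBlock, ControlBlockImpl, use_count, the test types) are illustrative assumptions rather than the asker's actual SmartPointer.hpp, the count is a plain unsigned instead of an atomic, and exception safety is ignored to keep it short. Base deliberately has no virtual destructor, to show that the block still destroys the object as the type it was created with.

```cpp
#include <cassert>
#include <utility>

// Hypothetical names throughout; a non-atomic count keeps the sketch short.
struct ControlBlock {
    unsigned count = 1;
    virtual ~ControlBlock() = default;
};

// Remembers the type the object was created with, so the right
// destructor runs no matter which SharedPointer<T> releases last.
template <class OrigT>
struct ControlBlockImpl : ControlBlock {
    OrigT* data;
    explicit ControlBlockImpl(OrigT* p) : data(p) {}
    ~ControlBlockImpl() override { delete data; }
};

template <class T>
class SharedPointer {
public:
    SharedPointer() = default;

    // OrigT is deduced once, here, and stored only in the control block.
    template <class OrigT>
    explicit SharedPointer(OrigT* p)
        : ptr_(p), block_(new ControlBlockImpl<OrigT>(p)) {}

    SharedPointer(const SharedPointer& other)
        : ptr_(other.ptr_), block_(other.block_) {
        if (block_) ++block_->count;
    }

    // Copy-and-swap: the by-value parameter releases the old state.
    SharedPointer& operator=(SharedPointer other) {
        std::swap(ptr_, other.ptr_);
        std::swap(block_, other.block_);
        return *this;
    }

    ~SharedPointer() {
        if (block_ && --block_->count == 0) delete block_;
    }

    // Shares ownership with this pointer but views the object as U.
    template <class U>
    SharedPointer<U> Cast() const {
        SharedPointer<U> result;
        result.ptr_ = static_cast<U*>(ptr_);
        result.block_ = block_;
        if (block_) ++block_->count;
        return result;
    }

    T* get() const { return ptr_; }
    unsigned use_count() const { return block_ ? block_->count : 0; }

private:
    template <class U> friend class SharedPointer;
    T* ptr_ = nullptr;
    ControlBlock* block_ = nullptr;
};

// Test types: Base deliberately has no virtual destructor.
struct Base { int id = 1; };
struct Derived : Base {
    static int destroyed;
    ~Derived() { ++destroyed; }
};
int Derived::destroyed = 0;
```

With this layout, a SharedPointer<Base> obtained through Cast can be the last owner and ~Derived still runs, because the deletion goes through ControlBlockImpl<Derived> rather than through the T of the last pointer.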
I'm facing design problems and could do with some external input. I am trying to avoid abstract base class casting (Since I've heard that's bad).
The issues are down to this structure:
class entity... (base with pure virtual functions)
class hostile : public entity... (base with pure virtual functions)
class friendly : public entity... (base with pure virtual functions)
// Then further derived classes using these last base classes...
Initially I thought I'd get away with:
enum class FactionType : unsigned int
{ ... };
std::unordered_map<FactionType, std::vector<std::unique_ptr<CEntity>>> m_entitys;
And... I did, but this causes me problems because I need to access "unique" functions from, say, hostile or friendly specifically.
I have disgracefully tried (worked but don't like it nor does it feel safe):
// For-Each Loop: const auto& friendly : m_entitys[FactionType::FRIENDLY]
CFriendly* castFriendly = static_cast<CFriendly*>(&*friendly);
I was hoping/trying to maintain the unordered_map design that uses FactionType as a key for the base abstract class type... Anyway, input is greatly appreciated.
If there are any syntactical errors, I apologise.
About casting I agree with #rufflewind. The casts mean different things and are useful at different times.
To coerce a region of memory at compile time (the decision of typing happens at compile time anyway) use static_cast. The amount of memory on the other end of the T* equal to sizeof(T) will be interpreted as a T, whether or not that is actually correct.
The decisions for dynamic_cast are made entirely at runtime, sometimes requiring RTTI (Run Time Type Information). It makes a decision and it will either return a null pointer or a valid pointer to a T if one can be made.
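A minimal sketch of the difference, using a made-up hierarchy (Entity, Hostile, Friendly and both helper functions are assumptions for illustration only):

```cpp
#include <cassert>

// Hypothetical hierarchy; the virtual destructor makes the types
// polymorphic, which dynamic_cast requires.
struct Entity { virtual ~Entity() = default; };
struct Hostile : Entity { int damage = 7; };
struct Friendly : Entity { int heal = 3; };

// Checked downcast: yields nullptr when e is not actually a Hostile.
inline Hostile* as_hostile(Entity* e) {
    return dynamic_cast<Hostile*>(e);
}

// Unchecked downcast: compiles for any Entity*, but using the result
// is undefined behavior unless e really points to a Hostile.
inline Hostile* as_hostile_unchecked(Entity* e) {
    return static_cast<Hostile*>(e);
}
```

Passing a Friendly to as_hostile gives a null pointer the caller can test; passing it to as_hostile_unchecked gives a pointer that looks valid but must not be used.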
The decision goes further than just the types of casts though. Using a data structure to look up types and methods (member functions) imposes the time constraints that would not otherwise exist when compared to the relatively fast and mandatory casts. There is a way to skip the data structures, but not the casting without major refactoring (with major refactoring you can do anything).
You can move the casts into the entity class, get them done right and just leave them encapsulated there.
class entity
{
// Previous code
public:
// This will be overridden in hostiles to return a valid
// pointer and nullptr or 0 in other types of entities
virtual hostile* cast_to_hostile() = 0;
virtual const hostile* cast_to_hostile() const = 0;
// This will be overridden in friendlies to return a valid
// pointer and nullptr or 0 in other types of entities
virtual friendly* cast_to_friendly() = 0;
virtual const friendly* cast_to_friendly() const = 0;
// The following helper methods are optional but
// can make it easier to write streamlined code in
// calling classes with a little trouble.
// Hostile and friendly can each implement this to return
// the appropriate enum member. This would be useful for making
// decisions about friendlies and hostiles.
virtual FactionType entity_type() const = 0;
// These two methods delegate storage of the knowledge
// of hostility or friendliness to the derived classes.
// These are implemented here as non-virtual functions
// because they shouldn't need to be overridden, but
// could be made virtual at the cost of a pointer
// indirection and sometimes, but not often a cache miss.
bool is_friendly() const
{
return entity_type() == FactionType_friendly;
}
bool is_hostile() const
{
return entity_type() == FactionType_hostile;
}
};
This strategy is good and bad for a variety of reasons.
Pros:
It is conceptually simple. This is easy to understand quickly if you understand polymorphism.
It is superficially similar to your existing code, making migration easier. There is a reason hostility and friendliness are encoded in your types; this preserves that reason.
You can use static_casts safely because all the casts exist in the class they are used in, and therefore won't normally get called unless valid.
You can return shared_ptr or other custom smart pointers instead of raw pointers. And you probably should.
This avoids a potentially costly refactor that completely avoids casting. Casting is there to be used as a tool.
Cons:
It is conceptually simple. This does not provide a strong set of vocabulary (methods, classes and patterns) for building a smart set of tools for building advanced type mechanics.
Whether or not something is hostile should likely be a data member, or be implemented as a series of methods controlling instance behavior.
Someone might think that the pointers this returns convey ownership and delete them.
Every caller must check pointers for validity prior to use. Or you can add methods to check, but then callers will need to call methods to check before the cast. Checks like these are surprising for users of the class and make it harder to use correctly.
It is polymorphism dense. This will perplex people who are uncomfortable with polymorphism. Even today there are many who are not comfortable with polymorphism.
A refactor that completely avoids casting is possible. Casting is dangerous and not a tool to use lightly.
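As a rough illustration of the encapsulated-cast approach described above, here is a cut-down sketch. It deviates from the answer's code in one labeled way: the base class supplies a default implementation returning nullptr instead of a pure virtual, so only hostile needs an override. threat() and total_threat() are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class hostile;

class entity {
public:
    virtual ~entity() = default;
    // Default: "not a hostile". Only hostile overrides this.
    virtual hostile* cast_to_hostile() { return nullptr; }
};

class hostile : public entity {
public:
    hostile* cast_to_hostile() override { return this; }
    int threat() const { return 5; }   // hypothetical hostile-only function
};

class friendly : public entity {};

// Sums a hostile-only property over a mixed collection.
// The calling code contains no cast expression at all.
inline int total_threat(const std::vector<std::unique_ptr<entity>>& all) {
    int sum = 0;
    for (const auto& e : all) {
        if (hostile* h = e->cast_to_hostile()) sum += h->threat();
    }
    return sum;
}
```

The null check inside the loop is exactly the per-caller validity check listed among the cons.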
My situation is as follows: There is some class MyList that will probably get a specific implementation later on. For now, behavior like std::vector is fine.
However, I really need an easy way to call some kind of asString() / toString() method on it, because I'll need it in test assertions, debug output and so on. The only options I see are:
Public inheritance. I'll never delete such a list through a base pointer, since there should never be any base pointers. If I do, there will be no pointer members, anyway. However, the rule of thumb still states: Don't inherit from STL containers.
Some kind of "global" (actually in a namespace, of course) method that takes an instance of MyList as argument and does the asString() magic for me. In that case, MyList could be a simple typedef for std::vector.
I like neither of those options too much. Is there something else I failed to think of? Or if not - which way should I prefer?
What is wrong with the second approach? It is by far the easiest and also pretty elegant.
Imagine the alternative of wrapping the vector. That would cause you a lot of extra work and error-prone glue code! I'd go with the function approach for sure!
Edit: by the way, I almost exclusively use free functions (sometimes static members) for conversions. Imagine you have a load of types that somehow need to be convertible to string. Having the toString() functions as free functions and not as members does not give you the headache you are having right now, since you can simply overload the function as much as you want and don't have to touch any existing classes (or classes that you don't even have source access to).
Then you can have a function like:
template<class T>
void printDebugInfo(const T & _obj)
{
std::cout<<toString(_obj)<<std::endl;
}
and you won't have the constraints you are experiencing.
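A small sketch of the overload idea, assuming MyList is a plain typedef for std::vector<int> for now. Point and debugInfo are hypothetical names added for illustration; debugInfo returns the text instead of printing it so the result is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<int> MyList;   // assumption: MyList is a plain typedef for now

struct Point { int x; int y; };    // hypothetical second type

// One free overload per type; no class needs to be modified.
inline std::string toString(const MyList& list) {
    std::ostringstream out;
    out << "[";
    for (std::size_t i = 0; i < list.size(); ++i) {
        if (i != 0) out << ", ";
        out << list[i];
    }
    out << "]";
    return out.str();
}

inline std::string toString(const Point& p) {
    std::ostringstream out;
    out << "(" << p.x << ", " << p.y << ")";
    return out.str();
}

// Generic caller: overload resolution picks the right toString for T.
template <class T>
std::string debugInfo(const T& obj) {
    return "debug: " + toString(obj);
}
```

Supporting another type later means adding one more overload next to these, leaving both MyList and debugInfo untouched.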
Actually, free functions on class types are a standard technique and are considered part of the interface of a type. Read this GotW by Herb Sutter, one of the people who have a voice in C++ standardization.
In general, prefer free functions over member functions. This increases encapsulation and re-usability and reduces class bloat and coupling. See this article by Scott Meyers for deeper information (highly regarded for his C++ books that you should definitely read if you want to improve your effective and clean use of C++).
Also note that you should never derive from STL containers. They are not designed as base classes and you might easily invoke undefined behaviour. But see Is there any real risk to deriving from the C++ STL containers? .
I think having a free
std::string toString( const MyList &l );
function is perfectly fine. If you are afraid of name clashes, you can consider a namespace as you said. This function is highly decoupled, and won't be able to tinker with private members of MyList objects (as is the case for a member or a friend function).
The only reason which would justify not making it a free function: you notice that you suddenly need to extend the public interface of MyList a lot just to be able to implement toString properly. In that case, I'd make it a friend function.
If you did something like:
template<typename T>
std::ostream& operator<< (std::ostream &strm, const MyList<T> &list)
{
if (list.empty())
return strm;
// const_iterator is a dependent name, so it needs typename.
typename MyList<T>::const_iterator iter = list.begin(),
end = list.end();
// Write the first value
strm << *iter++;
while (iter != end)
strm << "," << *iter++;
return strm;
}
Then you would essentially have a to-string for anything in the list, as long as the elements implement the streaming operator.
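Once a streaming operator exists, a generic string conversion can be layered on top with std::ostringstream. The sketch below assumes a hypothetical MyList<T> that simply wraps a std::vector<T> in a public member named items; the real MyList may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for MyList: a thin wrapper around std::vector.
template <class T>
struct MyList { std::vector<T> items; };

// Comma-separated output; an empty list writes nothing.
template <class T>
std::ostream& operator<<(std::ostream& strm, const MyList<T>& list) {
    for (std::size_t i = 0; i < list.items.size(); ++i) {
        if (i != 0) strm << ",";
        strm << list.items[i];
    }
    return strm;
}

// Works for any type that has a matching operator<<.
template <class T>
std::string toString(const T& value) {
    std::ostringstream out;
    out << value;
    return out.str();
}
```

The same toString then serves test assertions and debug output alike, for lists and for any other streamable type.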
Have you considered composition, as opposed to inheritance? i.e. Your MyList has a member variable of type std::vector.
You may complain that you will now need to replicate the API of std::vector in MyList. But you say that you might change the implementation later, so you'll need to do that anyway. You may as well do it straight away, to avoid having to change all the client code later on.
Inheritance is completely wrong in this case.
Global function approach is perfectly fine.
One of 'the ways' in C++ is to overload operator << and use stringstream, for example, to output your vector or something else.
I would go with global template function printOnStream. That way you can easily add support for other data types and using stream is more general than creating a string.
I would not use inheritance because there might be some tricky cases. Basic thing is as you mentioned - lack of virtual destructor. But also everything that expects std::vector won't work properly with your data type - for example you can run into slicing problem.
Why don't your debug and assert methods do this for you?
A big reason why I use OOP is to create code that is easily reusable. For that purpose Java style interfaces are perfect. However, when dealing with C++ I really can't achieve any sort of functionality like interfaces... at least not with ease.
I know about pure virtual base classes, but what really ticks me off is that they force me into really awkward code with pointers. E.g. map<int, Node*> nodes; (where Node is the virtual base class).
This is sometimes ok, but sometimes pointers to base classes are just not a possible solution. E.g. if you want to return an object packaged as an interface you would have to return a base-class-casted pointer to the object.. but that object is on the stack and won't be there after the pointer is returned. Of course you could start using the heap extensively to avoid this but that's adding so much more work than there should be (avoiding memory leaks).
Is there any way to achieve interface-like functionality in C++ without having to awkwardly deal with pointers and the heap? (Honestly, for all that trouble and awkwardness I'd rather just stick with C.)
You can use boost::shared_ptr<T> to avoid the raw pointers. As a side note, the reason why you don't see a pointer in the Java syntax has nothing to do with how C++ implements interfaces vs. how Java implements interfaces, but rather it is the result of the fact that all objects in Java are implicit pointers (the * is hidden).
Template MetaProgramming is a pretty cool thing. The basic idea? "Compile time polymorphism and implicit interfaces", Effective C++. Basically you can get the interfaces you want via templated classes. A VERY simple example:
template <class T>
bool foo( const T& _object )
{
if ( _object != _someStupidObject && _object > 0 )
return true;
return false;
}
So in the above code what can we say about the object T? Well, it must be compatible with '_someStupidObject' OR it must be convertible to a type which is compatible. It must be comparable with an integral value, or again convertible to a type which is. So we have now defined an interface for the class T. The book "Effective C++" offers a much better and more detailed explanation. Hopefully the above code gives you some idea of the "interface" capability of templates. Also have a look at pretty much any of the Boost libraries; they are almost all chock-full of templatization.
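A slightly fuller sketch of the same idea: two unrelated types satisfy an implicit interface just by providing the members the template uses. Circle, Square and describe are made-up names for illustration, and the area values are arbitrary.

```cpp
#include <cassert>
#include <string>

// Two unrelated types: no common base class, no virtual functions.
struct Circle {
    double area() const { return 3.0; }
    std::string name() const { return "circle"; }
};

struct Square {
    double area() const { return 4.0; }
    std::string name() const { return "square"; }
};

// The implicit interface is whatever the body uses:
// Shape must provide name() and area().
template <class Shape>
std::string describe(const Shape& s) {
    return s.name() + (s.area() > 3.5 ? " (large)" : " (small)");
}
```

Both objects can live on the stack and be passed by reference; no pointers or heap allocation are involved, and a type lacking name() or area() is rejected at compile time.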
Considering C++ doesn't require generic parameter constraints like C#, then if you can get away with it you can use boost::concept_check. Of course, this only works in limited situations, but if you can use it as your solution then you'll certainly have faster code with smaller objects (less vtable overhead).
Dynamic dispatch that uses vtables (for example, pure virtual bases) will make your objects grow in size as they implement more interfaces. Managed languages do not suffer from this problem (this is a .NET link, but Java is similar).
I think the answer to your question is no - there is no easier way. If you want pure interfaces (well, as pure as you can get in C++), you're going to have to put up with all the heap management (or try using a garbage collector. There are other questions on that topic, but my opinion on the subject is that if you want a garbage collector, use a language designed with one. Like Java).
One big way to ease your heap management pain somewhat is auto pointers. Boost has a nice automatic pointer that does a lot of heap management work for you. The std::auto_ptr works, but it's quite quirky in my opinion.
You might also evaluate whether you really need those pure interfaces or not. Sometimes you do, but sometimes (like some of the code I work with), the pure interfaces are only ever instantiated by one class, and thus just become extra work, with no benefit to the end product.
While auto_ptr has some weird rules of use that you must know*, it exists to make this kind of thing work easily.
auto_ptr<Base> getMeAThing() {
// auto_ptr's pointer constructor is explicit, so wrap the pointer explicitly.
return auto_ptr<Base>(new Derived());
}
void something() {
auto_ptr<Base> myThing = getMeAThing();
myThing->foo(); // Calls Derived::foo, if virtual
// The Derived object will be deleted on exit to this function.
}
*Never put auto_ptrs in containers, for one. Understand what they do on assignment is another.
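std::auto_ptr was deprecated in C++11 and removed in C++17. As a sketch of the same pattern with its replacement, std::unique_ptr (a swapped-in technique, not what the answer above uses), with Base and Derived filled in as hypothetical classes:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Base {
    virtual ~Base() = default;
    virtual std::string foo() const { return "Base"; }
};

struct Derived : Base {
    std::string foo() const override { return "Derived"; }
};

// Ownership of the heap object moves to the caller.
inline std::unique_ptr<Base> getMeAThing() {
    return std::unique_ptr<Base>(new Derived());
}
```

Unlike auto_ptr, a unique_ptr can be stored in standard containers, and transferring ownership requires an explicit std::move, which avoids the surprising assignment behavior mentioned in the footnote.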
This is actually one of the cases in which C++ shines. The fact that C++ provides templates and functions that are not bound to a class makes reuse much easier than in pure object-oriented languages. The reality, though, is that you will have to adjust the manner in which you write your code in order to make use of these benefits. People who come from pure OO languages often have difficulty with this, but in C++ an object's interface includes non-member functions. In fact, it is considered good practice in C++ to use non-member functions to implement an object's interface whenever possible. Once you get the hang of using template non-member functions to implement interfaces, well, it is a somewhat life-changing experience.
I suppose most of the persons on this site will agree that implementation can be outsourced in two ways:
private inheritance
composition
Inheritance is most often abused. Notably, public inheritance is often used when another form of inheritance could have been better, and in general one should use composition rather than private inheritance.
Of course the usual caveats apply, but I can't think of any time where I really needed inheritance for an implementation problem.
For the Boost Parameter library, however, you will notice that they have chosen inheritance over composition for the implementation of the named parameter idiom (for the constructor).
I can only think of the classical EBO (Empty Base Optimization) explanation, since there are no virtual methods at play here that I can see.
Does anyone know better, or can anyone redirect me to the discussion?
Thanks,
Matthieu.
EDIT: Oops! I posted the answer below because I misread your post. I thought you said the Boost library used composition over inheritance, not the other way around. Still, if it's useful for anyone... (See EDIT2 for what I think could be the answer to your question.)
I don't know the specific answer for the Boost Parameter Library. However, I can say that this is usually a better choice. The reason is because whenever you have the option to implement a relationship in more than one way, you should choose the weakest one (low coupling/high cohesion). Since inheritance is stronger than composition...
Notice that sometimes using private inheritance can make it harder to implement exception-safe code too. Take operator=, for example. Using composition you can create a temporary and do the assignment with commit/rollback logic (assuming a correct construction of the object). But if you use inheritance, you'll probably do something like Base::operator=(obj) inside the operator= of the derived class. If that Base::operator=(obj) call throws, you risk your guarantees.
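The commit/rollback assignment with composition could be sketched as follows. Widget and its members are hypothetical; the point is only that all work that may throw happens on a temporary before anything in *this changes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Widget {
public:
    Widget(std::string name, std::vector<int> values)
        : name_(std::move(name)), values_(std::move(values)) {}

    Widget& operator=(const Widget& other) {
        Widget temp(other);   // may throw; *this is untouched so far
        swap(temp);           // commit: only non-throwing swaps
        return *this;
    }                         // temp's destructor discards the old state

    void swap(Widget& other) noexcept {
        name_.swap(other.name_);
        values_.swap(other.values_);
    }

    const std::string& name() const { return name_; }
    std::size_t size() const { return values_.size(); }

private:
    // Composition: the parts are members, so they can be swapped as a unit.
    std::string name_;
    std::vector<int> values_;
};
```

If copying other throws, the assignment leaves the target exactly as it was, which is the strong exception guarantee.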
EDIT 2: Now, trying to answer what you really asked. This is what I could understand from the link you provided. Since I don't know all details of the library, please correct me if I'm wrong.
When you use composition for "implemented in terms of" you need one level of indirection for the delegation.
struct AImpl
{
//Dummy code, just for the example.
int get_int() const { return 10; }
};
struct A
{
AImpl * impl_;
int get_int() const { return impl_->get_int(); }
/* ... */
};
In the case of the parameter-enabled constructor, you need to create an implementation class but you should still be able to use the "wrapper" class in a transparent way. This means that in the example from the link you mentioned, it's desired that you can manipulate myclass just like you would manipulate myclass_impl. This can only be done via inheritance. (Notice that in the example the inheritance is public, since it's the default for struct.)
I assume myclass_impl is supposed to be the "real" class, the one with the data, behavior, etc. Then, if you had a method like get_int() in it and if you didn't use inheritance you would be forced to write a get_int() wrapper in myclass just like I did above.
This isn't a library I've ever used, so a glance through the documentation you linked to is the only thing I'm basing this answer on. It's entirely possible I'm about to be wrong, but...
They mention constructor delegation as a reason for using a common base class. You're right that composition could address that particular issue just as well. Putting it all in a single type, however, would not work. They want to boil multiple constructor signatures into a single user-written initialization function, and without constructor delegation that requires a second data type. My suspicion is that much of the library had already been written from the point of view of putting everything into the class itself. When they ran into the constructor delegation issue they compromised. Putting it into a base class was probably closer to what they were doing with the previous functionality, where they knew that both interface and implementation aspects of the functionality would be accessible to the class you're working with.
I'm not slamming the library in any way. I highly doubt I could put together a library like this one in any reasonable amount of time. I'm just reading between the lines. You know, speaking from ignorance but pretending I actually know something. :-)
We often hear/read that one should avoid dynamic casting. I was wondering what would be 'good use' examples of it, according to you?
Edit:
Yes, I'm aware of that other thread: it is indeed when reading one of the first answers there that I asked my question!
This recent thread gives an example of where it comes in handy. There is a base Shape class and classes Circle and Rectangle derived from it. In testing for equality, it is obvious that a Circle cannot be equal to a Rectangle and it would be a disaster to try to compare them. While iterating through a collection of pointers to Shapes, dynamic_cast does double duty, telling you if the shapes are comparable and giving you the proper objects to do the comparison on.
Vector iterator not dereferencable
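A minimal sketch of that equality check. The class layout and member names are assumptions, since the linked thread's code is not reproduced here.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() = default;
    virtual bool equals(const Shape& other) const = 0;
};

struct Circle : Shape {
    explicit Circle(double r) : radius(r) {}
    bool equals(const Shape& other) const override {
        // nullptr means "other is not a Circle", so they cannot be equal.
        const Circle* c = dynamic_cast<const Circle*>(&other);
        return c != nullptr && c->radius == radius;
    }
    double radius;
};

struct Rectangle : Shape {
    Rectangle(double w, double h) : width(w), height(h) {}
    bool equals(const Shape& other) const override {
        const Rectangle* r = dynamic_cast<const Rectangle*>(&other);
        return r != nullptr && r->width == width && r->height == height;
    }
    double width;
    double height;
};
```

The cast does the double duty described above: its null result answers "are these comparable?", and its non-null result is the correctly typed object to compare against.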
Here's something I do often, it's not pretty, but it's simple and useful.
I often work with template containers that implement an interface,
imagine something like
template<class T>
class MyVector : public ContainerInterface
...
Where ContainerInterface has basic useful stuff, but that's all. If I want a specific algorithm on vectors of integers without exposing my template implementation, it is useful to accept the interface objects and dynamic_cast it down to MyVector in the implementation. Example:
// function prototype (public API, in the header file)
void ProcessVector( ContainerInterface& vecIfce );
// function implementation (private, in the .cpp file)
void ProcessVector( ContainerInterface& vecIfce)
{
MyVector<int>& vecInt = dynamic_cast<MyVector<int>&>(vecIfce);
// the cast throws bad_cast in case of error but you could use a
// more complex method to choose which low-level implementation
// to use, basically rolling by hand your own polymorphism.
// Process a vector of integers
...
}
I could add a Process() method to the ContainerInterface that would be polymorphically resolved, it would be a nicer OOP method, but I sometimes prefer to do it this way. When you have simple containers, a lot of algorithms and you want to keep your implementation hidden, dynamic_cast offers an easy and ugly solution.
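For reference, a compilable sketch of the reference-cast variant. The empty ContainerInterface, the items member, and the choice to return a sum (or -1 on a type mismatch) are all assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast
#include <vector>

struct ContainerInterface {
    virtual ~ContainerInterface() = default;
};

template <class T>
struct MyVector : ContainerInterface {
    std::vector<T> items;
};

// Returns the sum for a MyVector<int>, or -1 for any other container.
inline int ProcessVector(ContainerInterface& vecIfce) {
    try {
        // A failed dynamic_cast to a reference type throws std::bad_cast,
        // because a reference cannot be null.
        MyVector<int>& vecInt = dynamic_cast<MyVector<int>&>(vecIfce);
        int sum = 0;
        for (int v : vecInt.items) sum += v;
        return sum;
    } catch (const std::bad_cast&) {
        return -1;
    }
}
```

Casting to a pointer instead would replace the try/catch with a null check, which is usually the cheaper choice when a mismatch is an expected case rather than an error.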
You could also look at double-dispatch techniques.
HTH
My current toy project uses dynamic_cast twice; once to work around the lack of multiple dispatch in C++ (it's a visitor-style system that could use multiple dispatch instead of the dynamic_casts), and once to special-case a specific subtype.
Both of these are acceptable, in my view, though the former at least stems from a language deficit. I think this may be a common situation, in fact; most dynamic_casts (and a great many "design patterns" in general) are workarounds for specific language flaws rather than something to aim for.
It can be used for a bit of run-time type safety when exposing handles to objects through a C interface. Have all the exposed classes inherit from a common base class. When a function accepts a handle, first cast to the base class, then dynamic_cast to the class you're expecting. If they passed in a nonsensical handle, you'll get an exception when the run-time can't find the RTTI. If they passed in a valid handle of the wrong type, you get a NULL pointer and can throw your own exception. If they passed in the correct pointer, you're good to go.
This isn't fool-proof, but it is certainly better at catching mistaken calls to the libraries than a straight reinterpret cast from a handle, and waiting until some data gets mysteriously corrupted when you pass the wrong handle in.
Well it would really be nice with extension methods in C#.
For example let's say I have a list of objects and I want to get a list of all ids from them. I can step through them all and pull them out but I would like to segment out that code for reuse.
so something like
List<myObject> myObjectList = getMyObjects();
List<string> ids = myObjectList.PropertyList("id");
would be cool except on the extension method you won't know the type that is coming in.
So
public static List<string> PropertyList(this object objList, string propName) {
var genList = (objList.GetType())objList;
}
would be awesome.
It is very useful; however, most of the time it is too useful: if the easiest way to get the job done is a dynamic_cast, it is more often than not a symptom of bad OO design, which in turn might lead to trouble in the future in unforeseen ways.