Virtual parent for containers and 'iterable' class - C++

I was wondering if there was a reason why there's no virtual parent inheritance for all std containers.
Since C++ is OOP, I get some of the reasoning for using external functions, such as those from the algorithm header.
But I wonder why string does not have a size() function (doing the same as length), for genericity?
Also, why is there no "iterable" class from which iterable containers would inherit (because of the different types of iterators?)
This one comes from my short experience with Python. In my opinion, Python's biggest strength is how easy it is to use iterables.
EDIT:
To clarify: I know C++ is multi-paradigm, but I would like to focus on containers. We could have kept the C style, using structs and free functions, so why define some operations as methods and others not? For encapsulation only?
Having both free-function and member-function styles is confusing and illogical, I think.
Why have both begin(vec) and vec.begin()?
And I am talking about iterables, not iterators. Yes, std::iterator was a class in the std defining an interface for iterators, and it is now deprecated (and I understand why). So iterators are objects; why not make an iterable object (just to pack the begin and end iterators together, if not to define a virtual parent for iterable containers)?
I hope I was clearer this time. I had already thought of most of the answers myself, but I would prefer an explanation of the current logic in the std.
Thank you for the time taken to read and answer.

I was wondering if there was a reason why there's no virtual parent inheritance for all std containers.
There is a cost attached to this (virtual dispatch is not free!), and most people (all people?) wouldn't need it.
In C++, you don't pay for what you don't use. (In theory.)
I wonder why string does not have a size() function (doing the same as length), for genericity?
It does.
why is there no "iterable" class from which iterable containers would inherit (because of the different types of iterators?)
Because we don't need one. You claim that C++ is "OOP", but that is not true: it is multi-paradigm, and most of the standard library (particularly the parts inherited from the antique STL) is template-based, not object-oriented.
This allows us to provide iterators that fit a set of "requirements", rather than those that fit some definition of "class". For example, pointers (which do not have class type!) are a valid kind of iterator.
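As a small illustration (the code below is mine, not from the answer), the same standard algorithm accepts both a container's class-type iterator and a plain pointer into an array, because both satisfy the iterator requirements:
#include <algorithm>
#include <vector>

int main()
{
    std::vector<int> v {1, 2, 3};
    int a[] = {1, 2, 3};

    // std::vector<int>::iterator is a class type...
    auto it = std::find(v.begin(), v.end(), 2);

    // ...while int* is not, yet both work with the same algorithm.
    int* p = std::find(a, a + 3, 2);

    return (it != v.end() && p != a + 3) ? 0 : 1;
}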
This one comes from my short experience with Python. In my opinion, Python's biggest strength is how easy it is to use iterables.
That's reasonable. But C++ is not Python.

I was wondering if there was a reason why there's no virtual parent inheritance for all std containers.
A design decision from the early stages of forming the C++ standard. There is not much reason to do it - sure, you would have a nice common interface, but this can also be achieved by keeping the standard consistent (all containers already have a common set of functions without an explicit interface).
A disadvantage of a common base is that all containers would suddenly need virtual functions, and virtual functions are not free (more expensive at runtime than normal functions).
Since C++ is OOP
It's not. C++ is a multi-paradigm language. For example, you can have functions outside of any class (unlike Java, which follows the OOP paradigm more strictly).
I get some of the reasoning for using external functions, such as those from the algorithm header.
A follow-up decision after the one above. Having global functions that accept generic pair(s) of iterators allows any container (even a user-defined one) to be used with these functions.
But I wonder why string does not have a size() function (doing the same as length), for genericity?
It does have such a function, and it always has. length() is another option, provided for readability - it's more natural English to ask about the length of a string than about the size of a string.
Also, why is there no "iterable" class from which iterable containers would inherit (because of the different types of iterators?) This one comes from my short experience with Python. In my opinion, Python's biggest strength is how easy it is to use iterables.
There is one very good reason for that - iterators don't have to be of class type. A pointer type satisfies all of the requirements of RandomAccessIterator, but it is not a class.
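To illustrate (my own example, not part of the original answer), this is also one reason the free begin()/end() functions exist alongside the member versions - a raw array has no .begin() member:
#include <iterator>
#include <vector>

int main()
{
    std::vector<int> vec {1, 2, 3};
    int arr[] = {1, 2, 3};

    auto v1 = vec.begin();      // member form, containers only
    auto v2 = std::begin(vec);  // free form, same iterator
    auto a  = std::begin(arr);  // also works for raw arrays: yields int*

    return (*v1 == *v2 && *a == 1) ? 0 : 1;
}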

C++ prefers compile-time polymorphism, encouraging you to use templates instead of virtual calls. Back in the good old days this was noticeably more efficient (and virtual calls are still not free).
There is, however, a certain level of standardization for containers: all of them support iterators, and the standardized C++20 ranges build on that.
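A small sketch of what that common iterator/ranges interface buys you (my own example, assuming a C++20 compiler): the same range pipeline works for any container that exposes begin()/end(), with no common base class required.
#include <iostream>
#include <list>
#include <ranges>
#include <vector>

int main()
{
    std::vector<int> v {1, 2, 3, 4};
    std::list<int>   l {1, 2, 3, 4};

    auto even = [](int x) { return x % 2 == 0; };

    // Identical code for both containers.
    for (int x : v | std::views::filter(even)) std::cout << x << ' ';
    for (int x : l | std::views::filter(even)) std::cout << x << ' ';
}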

Related

C++ 2a - polymorphic range

I am writing a C++ library and have had this amazing idea of using as much C++2a/C++20 as possible. Thus, I am using the standard library concepts and creating my own. However, the idea of a function returning a std::vector<X> seemed non-C++20 enough to me, so I declared in my concept a return type matching std::ranges::view<X>. I've then implemented some classes that fulfill this concept.
However, the problem appeared when I wanted to devise a polymorphic wrapper class. So, let's say the concept is C and I have three implementing classes C1, C2 and C3 (but allow for more). Now I want to create a class C_virtual and a template C_virtual_impl<C c> deriving from it, which will allow me to refer to all classes fulfilling C polymorphically. However, for that to work I need a polymorphic std::ranges::view wrapper, similar in spirit to C_virtual.
I have not seen any such class in the headers or in the C++ reference. Moreover, when I started implementing it myself, I quickly found myself unable to, due to some requirements on iterators - in particular default constructibility, swappability and the like.
Is there a nonobvious solution in the standard library or an idiom? If not, how do I deal with the problem? Possibly a change of design will work. I certainly do not want to return a std::vector<X> or to return a V<X> where V would be a type parameter of C. How do I do this?
Range views, and many other template techniques, are not meant to be used with inheritance-based polymorphism. This is much like how vector<BaseClass> is not especially useful.
If you need runtime polymorphism, then the tool you want is not inheritance (directly); it's type erasure. That is, you have some view wrapper which uses type erasure to forward the various view operations to the erased type. This would also need to be paired with type-erased iterators that wrap the iterators of the given view.
Now of course, this means that the characteristics of the view have to be defined by the type erased wrapper. The wrapper could implement the input_range concept, but it could never fulfill more than input_range itself. Even if you put a contiguous_range type in the wrapper, the wrapper will limit the interface to that of an input_range.
As such, it's best to just avoid this case and rely on static polymorphism via templates whenever possible.
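A very rough sketch of the type-erasure idea (all names here are mine and nothing below is a standard facility): the wrapper owns the erased view behind a virtual interface and exposes only single-pass, input-range-like access, which is exactly the limitation described above.
#include <iostream>
#include <memory>
#include <ranges>
#include <utility>

template <class T>
class any_input_sequence
{
    struct iface
    {
        virtual ~iface() = default;
        virtual bool next(T& out) = 0;   // write the next element, false when exhausted
    };

    template <std::ranges::input_range R>
    struct model : iface
    {
        R range;
        std::ranges::iterator_t<R> it;

        explicit model(R r) : range(std::move(r)), it(std::ranges::begin(range)) {}

        bool next(T& out) override
        {
            if (it == std::ranges::end(range))
                return false;
            out = *it;
            ++it;
            return true;
        }
    };

    std::unique_ptr<iface> impl_;

public:
    template <std::ranges::input_range R>
    explicit any_input_sequence(R r) : impl_(std::make_unique<model<R>>(std::move(r))) {}

    bool next(T& out) { return impl_->next(out); }
};

int main()
{
    // Any view of int - here a filtered iota - can be stored and consumed
    // through the same erased interface.
    any_input_sequence<int> seq(std::views::iota(1, 5)
                                | std::views::filter([](int x) { return x % 2 == 0; }));
    for (int x; seq.next(x); )
        std::cout << x << ' ';   // prints: 2 4
}
A production-quality version would expose real (type-erased) iterators instead, along the lines of range-v3's any_view, but it would still cap the interface at the erased iterator category.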

Why not implement a contains function in C++ containers?

My initial question was: why is the contains() function missing from C++ containers?
So I looked for an explanation, and I found something interesting about why some other functions are not implemented in all the containers (essentially because of performance issues and convenience).
I know that you can use the find function from the algorithm library, or you can just write your own function with iterators, but what I can't understand is why in set, for example, the contains function (where it's called find) is implemented, whereas in vector or queue it is not.
It's also pretty clear to me why container classes do not share a common interface like Collections do in Java (thanks to this answer), but in this case I can't find the reason for not implementing the contains() function in all the container classes (or at least in some, like vector).
Thank you
The reason why std::set implements its own find is that the "generic" std::find does a linear search, while std::set can do a lot better than that by searching its tree representation in O(log n).
std::vector and std::list, on the other hand, cannot be searched faster than in linear time, so they rely on the std::find implementation.
Note that you are still allowed to apply std::find to std::set to search it linearly; it just wouldn't be as efficient as using set's own find member function.
std::set<int> s {1, 2, 3, 4, 5, 6};
auto res = std::find(s.begin(), s.end(), 3);
std::cout << *res << std::endl; // prints 3
Because it's a bad design pattern. If you're using "find" on a linear container repeatedly, there's a problem with your code. The average and worst case time complexity is still O(n), which means you've made a bad choice.
For example, std::map and std::unordered_map have find member functions which allow O(log n) and O(1) lookups respectively. This is because those containers are built for efficient item lookup via these methods: that is how they should be used.
If you've weighed all the options, and decided a linear container is the best model for your data, but do need to find an element on rare occasions, std::find() allows you to do just that. You shouldn't rely on it. I view this as an antipattern in Python and Java, and a benefit in C++.
Just as a personal note, 3 years ago, when I was a beginner coder, I wrote if mydict in list: do_something() a lot. I thought this was a good idea, because Python makes list item membership idiomatic. I didn't know better. This led me to produce awful code, until I learned why linear searches are so inefficient compared to binary searches and hashmap lookups. A programming language or framework should enable good design patterns, and discourage bad ones. Enabling linear searches is a bad design pattern.
The other answers focus on the complexity of the generalized and per-container find methods, but they don't really address the OP's question. The actual reason for the lack of such helper methods comes from the way classes from different headers are used in C++. If every container without special lookup properties had a contains method performing a generic linear search, we would be stuck either with every container header also including <algorithm>, or with every container header reimplementing its own find algorithm (which is even worse).
The already large translation unit produced after the preprocessor copy-pastes every included header would grow even bigger, compilation would take even more time, and the rather cryptic compiler error messages might get even longer. The old principle of not making something a member function when a non-member function can do the job (and of including only what you are going to use) exists for a reason.
Note that there was a recent proposal for uniform call syntax that might allow utility methods to be mixed into classes. If it is ever adopted, it would probably be possible to write extension functions like this:
#include <algorithm>
#include <vector>

template <typename TContainer, typename TItem>
bool contains(TContainer const & container, TItem const & item)
{
    // Some implementation, possibly calling the container's member find
    // or ::std::find if the container has none; simplified here:
    return ::std::find(container.begin(), container.end(), item) != container.end();
}

::std::vector<int> registry;
registry.contains(3); // false (this call syntax is only valid if the proposal is adopted)
There is a good reason that containers do not share a "common (inherited) interface" (like in Java) and it is what makes the C++ generics so powerful. Why write code for every container when you can write it only once for all containers? This is one of the main principles the STL was built on.
If using containers relied on member functions from a common inherited interface you would have to write a find function for every container. That is wasteful and poor engineering. Good design says if you can write code in only one place you should because you only have to remove the bugs from that one place, and fixing a bug in one place fixes the bug everywhere.
So the philosophy behind the STL is to separate the algorithms from the containers so that you only have to write the algorithm once and it works for all containers. Once the algorithm is debugged, it is debugged for all containers.
A fly in that ointment is that some containers can make more efficient decisions due to their internal structure. For those containers a type specific function has been added which will take advantage of that efficiency.
But most functions should be separate from the containers. It is called decoupling and it reduces bugs while promoting code reuse, often much more so than polymorphism which is what libraries like Java containers use (a common inherited interface).
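As a small illustration of that separation (my own example, not from the answer), one generic helper written once in terms of iterators works for any standard container, while a container with better internal structure, like std::set, can still offer its own member find:
#include <algorithm>
#include <list>
#include <set>
#include <vector>

template <typename Container, typename T>
bool contains(const Container& c, const T& value)
{
    // Written once, debugged once, works for every container with iterators.
    return std::find(c.begin(), c.end(), value) != c.end();
}

int main()
{
    std::vector<int> v {1, 2, 3};
    std::list<int>   l {1, 2, 3};
    std::set<int>    s {1, 2, 3};   // s.find(2) would exploit the tree instead

    return (contains(v, 2) && contains(l, 2) && contains(s, 2)) ? 0 : 1;
}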

Tagged container - is mimicking the container's interface a good practice?

Assume I have a container type to which I would like to attach additional information. My approach would be to define a class holding the container and the info. Is it good practice to define methods for the new class which mimic the container's methods? E.g., instead of writing myContainerObject.internalVector[i] I would like to write myContainerObject[i]. One would have to redefine every method one wishes to use (size(), push_back() etc.). What are the drawbacks of such an approach? What alternatives exist (e.g., is inheriting from a container the better solution?).
You are using composition with forwarding functions, and it's the right thing to do with concrete classes like STL containers. One drawback is that you have to redefine every overload of every function you want to allow, only to forward arguments, as well as parrot-typedef nested types (like iterator).
An alternative is to inherit but never publicly, only with private inheritance (because STL containers' destructor is public and non-virtual), then use using-declarations inside the custom class to bring names of base-class functions and types into scope (needing only one using Base::name; for all overloads of a function).
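A minimal sketch of that private-inheritance alternative (class and member names here are mine, purely illustrative):
#include <string>
#include <vector>

class TaggedVector : private std::vector<int>   // never inherit publicly from std::vector
{
    using Base = std::vector<int>;
public:
    std::string tag;               // the attached information

    using Base::size;              // one using-declaration per name,
    using Base::push_back;         // covering all overloads of that name
    using Base::operator[];
    using Base::begin;
    using Base::end;

    using Base::value_type;        // nested types can be re-exported the same way
    using Base::iterator;
};

int main()
{
    TaggedVector tv;
    tv.tag = "scores";
    tv.push_back(42);
    return (tv[0] == 42 && tv.size() == 1) ? 0 : 1;
}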
Possible ways I can think of for dealing with a type which is a container:
Provide a decorator (this is what you are asking about) whose component will be the internal container. This can work when you have strict interfaces which you must comply with. It's neither good nor bad practice. Read about the decorator design pattern.
Use an iterator concept in your algorithms instead of a container. This is a generic approach, and it is how STL algorithms are implemented.
Use a container concept - similar to (2). Detect if the type is a container (SFINAE trick) and manipulate it.
Reimplement the container interface. You need a really strong reason to do it since it requires a significant amount of work/know-how. Some info: http://stdcxx.apache.org/doc/stdlibug/16-3.html
Generally, unless your use case is very, very trivial or you have some specific requirements, you should not expose the class's internal state (myContainerObject.internalVector[i]) to the outside world.
Best practice is to keep the internals private and implement the [] operator and other functions (size() and such).
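For example (my own sketch, with illustrative names), the composition approach keeps the vector private and forwards only the operations you want to support:
#include <cstddef>
#include <string>
#include <vector>

class TaggedContainer
{
public:
    std::string tag;   // the attached information

    int&        operator[](std::size_t i)       { return data_[i]; }
    const int&  operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const                    { return data_.size(); }
    void        push_back(int v)                { data_.push_back(v); }

private:
    std::vector<int> data_;   // never exposed directly
};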

Most efficient way to process all items in an unknown container?

I'm doing a computation in C++ and it has to be as fast as possible (it is executed 60 times per second with possibly large data). During the computation, a certain set of items has to be processed. However, in different cases, different implementations of the item storage are optimal, so I need to use an abstract class for that.
My question is, what is the most common and most efficient way to perform an action on each of the items in C++? (I don't need to change the structure of the container during that.) I have thought of two possible solutions:
Make iterators for the storage classes. (They're also mine, so I can add them.) This is common in Java, but doesn't seem very 'C' to me:
class Iterator {
public:
    bool more() const;
    Item * next();
};
Add a sort of abstract handler, which would be overridden in the computation part and would include the code to be called on each item:
class Handler {
public:
    virtual void process(Item &item) = 0;
};
(Only a function pointer wouldn't be enough because it has to also bring some other data.)
Something completely different?
The second option seems a bit better to me since the items could in fact be processed in a single loop without interruption, but it makes the code quite messy, as I would have to write quite a lot of derived classes. What would you suggest?
Thanks.
Edit: To be more exact, the storage data type isn't exactly just an ADT; it has means of finding only a specific subset of the elements in it based on some parameters, which I then need to process, so I can't prepare all of them in an array or something.
#include <algorithm>
Have a look at the existing containers provided by the C++ standard, and functions such as for_each.
For a comparison of C++ container iteration to interfaces in "modern" languages, see this answer of mine. The other answers have good examples of what the idiomatic C++ way looks like in practice.
Using templated functors, as the standard containers and algorithms do, will definitely give you a speed advantage over virtual dispatch (although sometimes the compiler can devirtualize calls, don't count on it).
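A small example of the kind of thing meant here (mine, not from the answer): std::for_each with a stateful functor, where the per-element call is resolved at compile time instead of through a vtable.
#include <algorithm>
#include <vector>

struct Accumulate
{
    long sum = 0;
    void operator()(int x) { sum += x; }
};

int main()
{
    std::vector<int> items {1, 2, 3, 4};
    // for_each returns the functor, so the accumulated state is available afterwards.
    Accumulate acc = std::for_each(items.begin(), items.end(), Accumulate{});
    return acc.sum == 10 ? 0 : 1;
}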
C++ has iterators already. It's not a particularly "Java" thing. (Note that their interface is different, though, and they're much more efficient than their Java equivalents)
As for the second approach, calling a virtual function for every element is going to hurt performance if you're worried about throughput.
If you can (pre-)sort your data so that all objects of the same type are stored consecutively, then you can select the function to call once, and then apply it to all elements of that type. Otherwise, you'll have to go through the indirection/type check of a virtual function or another mechanism to perform the appropriate action for every individual element.
What gave you the impression that iterators are not very C++-like? The standard library is full of them (see this), and includes a wide range of algorithms that can be used to effectively perform tasks on a wide range of standard container types.
If you use the STL containers you can save re-inventing the wheel and get easy access to a wide variety of pre-defined algorithms. This is almost always better than writing your own equivalent container with an ad-hoc iteration solution.
A function template perhaps:
template <typename C>
void process(C & c)
{
    typedef typename C::value_type type;
    for (type & x : c) { do_something_with(x); }
}
The iteration will use the container's iterators, which is generally as efficient as you can get.
You can specialize the template for specific containers.
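For instance (a sketch with assumed names, restating a simplified primary template for completeness), a full specialization can take advantage of one particular container:
#include <vector>

inline void do_something_with(int& x) { x *= 2; }    // stand-in for the real per-item work

template <typename C>
void process(C & c)                                  // generic version
{
    for (auto & x : c) { do_something_with(x); }
}

template <>
void process<std::vector<int>>(std::vector<int> & c) // vector-specific version
{
    // contiguous storage: c.data() could be used directly here
    for (int & x : c) { do_something_with(x); }
}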

Internal typedefs in C++ - good style or bad style?

Something I have found myself doing often lately is declaring typedefs relevant to a particular class inside that class, i.e.
class Lorem
{
public:
    typedef boost::shared_ptr<Lorem> ptr;
    typedef std::vector<Lorem::ptr>  vector;
    //
    // ...
    //
};
These types are then used elsewhere in the code:
Lorem::vector lorems;
Lorem::ptr lorem( new Lorem() );
lorems.push_back( lorem );
Reasons I like it:
It reduces the noise introduced by the class templates, std::vector<Lorem> becomes Lorem::vector, etc.
It serves as a statement of intent - in the example above, the Lorem class is intended to be reference counted via boost::shared_ptr and stored in a vector.
It allows the implementation to change - i.e. if Lorem needed to be changed to be intrusively reference counted (via boost::intrusive_ptr) at a later stage then this would have minimal impact to the code.
I think it looks 'prettier' and is arguably easier to read.
Reasons I don't like it:
There are sometimes issues with dependencies - if you want to embed, say, a Lorem::vector within another class but only need (or want) to forward declare Lorem (as opposed to introducing a dependency on its header file) then you end up having to use the explicit types (e.g. boost::shared_ptr<Lorem> rather than Lorem::ptr), which is a little inconsistent.
It may not be very common, and hence harder to understand?
I try to be objective with my coding style, so it would be good to get some other opinions on it so I can dissect my thinking a little bit.
I think it is excellent style, and I use it myself. It is always best to limit the scope of names as much as possible, and use of classes is the best way to do this in C++. For example, the C++ Standard library makes heavy use of typedefs within classes.
It serves as a statement of intent - in the example above, the Lorem class is intended to be reference counted via boost::shared_ptr and stored in a vector.
This is exactly what it does not do.
If I see 'Foo::Ptr' in the code, I have absolutely no idea whether it's a shared_ptr or a Foo* (the STL has ::pointer typedefs that are T*, remember) or whatever. Especially if it's a shared pointer, I don't provide a typedef at all, but keep the shared_ptr usage explicit in the code.
Actually, I hardly ever use typedefs outside Template Metaprogramming.
The STL does this type of thing all the time
The STL design, with concepts defined in terms of member functions and nested typedefs, is a historical cul-de-sac; modern template libraries use free functions and traits classes (cf. Boost.Graph), because these do not exclude built-in types from modelling the concept, and because they make it easier to adapt types that were not designed with the given template library's concepts in mind.
Don't use the STL as a reason to make the same mistakes.
Typedefs are what policy-based design and traits are built upon in C++, so the power of generic programming in C++ stems from typedefs themselves.
Typedefs are definitely good style. And all of your "reasons I like it" are good and correct.
About the problems you have with it: well, forward declaration is not a holy grail. You can simply design your code to avoid multi-level dependencies.
You can move the typedef outside the class, but Class::ptr is so much prettier than ClassPtr that I don't do this. For me it is like with namespaces - things stay connected within the scope.
Sometimes I did
Trait<Lorem>::ptr
Trait<Lorem>::collection
Trait<Lorem>::map
And it can be the default for all domain classes, with some specializations for certain ones.
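A sketch of what such a traits template could look like (my own guess at the shape, with a default plus one specialization):
#include <boost/intrusive_ptr.hpp>
#include <boost/shared_ptr.hpp>
#include <map>
#include <string>
#include <vector>

template <typename T>
struct Trait
{
    typedef boost::shared_ptr<T>       ptr;
    typedef std::vector<ptr>           collection;
    typedef std::map<std::string, ptr> map;
};

class Lorem;   // the traits can be specialized for one domain class...

template <>
struct Trait<Lorem>
{
    typedef boost::intrusive_ptr<Lorem> ptr;        // ...e.g. intrusive counting here
    typedef std::vector<ptr>            collection;
    typedef std::map<std::string, ptr>  map;
};
Client code then only ever names Trait<Lorem>::ptr, so switching Lorem to intrusive reference counting touches just the specialization.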
The STL does this type of thing all the time - the typedefs are part of the interface for many classes in the STL.
reference
iterator
size_type
value_type
etc...
are all typedefs that are part of the interface for various STL template classes.
Another vote for this being a good idea. I started doing this when writing a simulation that had to be efficient, both in time and space. All of the value types had a Ptr typedef that started out as a boost shared pointer. I then did some profiling and changed some of them to a boost intrusive pointer without having to change any of the code where these objects were used.
Note that this only works when you know where the classes are going to be used, and that all the uses have the same requirements. I wouldn't use this in library code, for example, because you can't know when writing the library the context in which it will be used.
Currently I'm working on code that makes intensive use of these kinds of typedefs. So far, that is fine.
But I noticed that quite often there are chains of typedefs: the definitions are split among several classes, and you never really know what type you are dealing with. My task is to sum up the size of some complex data structures hidden behind these typedefs - so I can't rely on existing interfaces. Combined with three to six levels of nested namespaces, it becomes confusing.
So before using them, there are some points to be considered:
Does anyone else need these typedefs? Is the class used a lot by other classes?
Do I shorten the usage or hide the class? (In the case of hiding, you could also think of interfaces.)
Are other people working with the code? How do they do it? Will they think it is easier or will they become confused?
When the typedef is used only within the class itself (i.e. is declared as private) I think it's a good idea. However, for exactly the reasons you give, I would not use it if the typedefs need to be known outside the class. In that case I recommend moving them outside the class.
I recommend moving those typedefs outside the class. This way, you remove the direct dependency on the shared pointer and vector classes, and you can include them only where needed. Unless you are using those types in your class implementation, I don't think they should be inner typedefs.
The reasons you like it still hold, since they come from the type aliasing through typedef itself, not from declaring the aliases inside your class.