Why does C++20 filter_view have a .size member function?

I found this on cppreference:
Following types are derived from std::ranges::view_interface and do
not declare their own size member function, but they cannot use the
default implementation, because their iterator and sentinel types
never satisfy sized_sentinel_for:
std::ranges::basic_istream_view
std::ranges::filter_view
std::ranges::join_view
std::ranges::split_view
std::ranges::take_while_view
This makes sense, since those views cannot compute their size in O(1). But I do not understand why there is not more than one base class for views, for example something like view_interface but without the size member.
I presume that one possible reason is that ranges are already an insanely complicated library, but maybe I am missing something else.
If you wonder why this is a problem: maybe it is not, but I think/feel it is confusing to users, since those views have a member function that never works.
Note: I know I can use distance; this is a question about design.

A class template is not a class. std::ranges::view_interface is not the base class of anything.
std::ranges::view_interface<std::ranges::filter_view<V, Pred>> does not have a size member. Other instantiations of std::ranges::view_interface do, but it's always been possible for one instantiation of a template to have members that another doesn't have.
The documentation for std::ranges::view_interface does mention that size is only present if (among other things) its sentinel and iterator type satisfy std::sized_sentinel_for.
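Roughly, the mechanism looks like the sketch below: a constrained member function that only exists for instantiations whose iterator/sentinel pair models sized_sentinel_for. This is a simplified, hypothetical view_interface-like base, not the standard's actual wording:

#include <iterator>
#include <ranges>

// Simplified sketch: size() carries a requires-clause, so only instantiations
// whose sentinel/iterator pair satisfies sized_sentinel_for get the member.
template <class Derived>
struct sketch_view_interface {
    constexpr auto size()
        requires std::sized_sentinel_for<std::ranges::sentinel_t<Derived>,
                                         std::ranges::iterator_t<Derived>>
    {
        auto& self = static_cast<Derived&>(*this);
        return std::ranges::end(self) - std::ranges::begin(self);
    }
};

For filter_view the constraint is never satisfiable, so that instantiation simply never has a size member to call.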
I much prefer having one name that computes the appropriate capabilities to a multitude of different names for subtly different capabilities. It is also more extensible: if additional members are defined in later standards, everything automatically gains the members it supports, without having to rename e.g. sized_forward_view_interface to foobar_sized_forward_view_interface.

Related

Class diagram for a member variable with type `std::vector<T>`

#include <vector>

class Foo {};

class Demo
{
public:
    std::vector<Foo> foo_vec;
};
How should I draw the relationship between them?
How do I show in the class diagram that foo_vec is a vector (I found this)? Some posts tell me to add the notation <:vector> on the line that represents the composition.
If you want to disclose the details of the implementation, you could go for something as developed as:
The key of this design is to disclose that vector is a class template, and that it is instantiated by binding the template parameter T to the class Foo. To put Foo in the picture, you could just show it at the instantiation level, but I preferred to show some kind of relation with T. I used a realization here, since in C++ semantics T implicitly defines an interface for the template parameter, and Foo is expected to implement this (implicit) interface. I also used composition, since C++ vectors hold their elements by value and own their lifecycle.
Unfortunately, this makes the design look more complex than it really is. Moreover, it doesn't show the full reality of std::vector, which has a second template parameter, Allocator. Last but not least, to be fully transparent, you should use a qualified composition with a std::vector::size_type qualifier (to document that indexed access is possible). You'd end up with a very complex diagram.
I therefore recommend considering a simplified approach. Since everybody knows what a std::vector is, you may use the more compact (and still legitimate) form of a direct binding, assuming that the template details are defined elsewhere:
This is very close to your implementation and stays very readable. However, you could simplify one step further: is vector a fundamental design choice, or is it just one practical way to implement a one-to-many composition? If the real intent is just that Demo may be composed of several Foo objects, you could go as simple as:
Here you would leave it to the implementer to choose the best way to implement the composition instead of carving the implementation choice into the marble of your model.

Does C++20 offer any new solutions to the problem of public member invisibility and source code bloat with inherited class templates?

Does C++20 offer any new solutions to the problem of public member invisibility and source code bloat/repetition with inherited class templates described in this question over 2 years ago?
The "Problem"
The alleged "problem" is that, in a template, an unqualified name used in a way which isn't dependent on the specialization is truly independent of the specialization and refers to the entity with that name found at that point. The alleged source code "bloat" is using this-> to explicitly make the name dependent or qualifying the name. This is still the situation in C++20.
Just to be clear, the set of entities is not known at the point we refer to them. In the linked question, the base class depends on the template parameter, and we have only seen the primary base class template. The base class template may be specialized later and may have completely different member functions than the ones we've seen. So, any "solution" requires that a name without any obvious contextual dependence on the specialization find entities not yet declared which may be surprising.
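A minimal illustration of the situation (the names here are made up for the example):

template <typename T>
struct Base {
    void helper() {}
};

template <typename T>
struct Derived : Base<T> {
    void f() {
        // helper();        // error: unqualified lookup does not search the dependent base
        this->helper();     // OK: this-> makes the name dependent, resolved at instantiation
        Base<T>::helper();  // also OK: qualified name (but suppresses virtual dispatch)
    }
};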
Why It's Impossible
Any naive changes in this direction are either pretty big or have severe downsides, or both.
You could postpone all name lookup until instantiation time, discarding two phase lookup. That invites ODR violations, which are silent UB, and that is a huge downside.
You could restrict specialization so that you cannot specialize the base class later such that a different entity is found. That is difficult to diagnose, so it would likely be a new rule introducing silent UB.
You could opt in with a using declaration: using X::* as you propose or using class X as someone else suggested in a different context. This has the benefit of explicitness. It moves the problem a level up: if X is not dependent, presumably it should be found now, but if it is dependent, what happens? We can't instantiate it prior to instantiating the template we're in now. Thus, we can't interpret any names we see until instantiation. It has similar downsides to discarding two phase lookup.
Any of these changes would add complexity to an already complex area and would also pose significant backwards compatibility hurdles. None of them is a clear win.
Note: In C++20 they did make the rules more uniform by allowing ADL to find function templates called with explicit template arguments: f<int>(1).
Why It's Not a Problem
I doubt there would be consensus that this really is a problem. The linked question makes a poor argument. The derived class adds member-function behavior to a certain base class, but a free function works better. These behaviors do not need member access, they are not required by the language to be members, they can be found as non-members with ADL, and by using a free function they apply even when the static type you have is the base type. So, using inheritance for this is unnecessary coupling, and is a worse option.
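For example (a sketch with made-up names), the behavior can live as a free function in the type's namespace and be found by ADL, with no inheritance involved:

namespace lib {
    struct Widget {
        int value = 0;
    };

    // Non-member, non-friend behavior: found by ADL because Widget lives in namespace lib.
    int doubled(const Widget& w) { return 2 * w.value; }
}

int main() {
    lib::Widget w{21};
    return doubled(w);  // found via argument-dependent lookup
}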
Searching for 100s of places to add this-> and adding these 6 characters 100s of times strikes me as Code Bloat and Repetition when I have to templatize a base class
"Searching": The compiler will tell you when a name can't be found, which is better than silent bad behavior, such as making ODR violations easier to hit.
"templatize a base": Templatizing the base class doesn't trigger this. Templatizing the derived class and making the base dependent does. Yes, when templatizing the derived class, specifying that a bare name used in a way that is independent of the template parameter is in fact dependent may seem like boilerplate to some, but others might argue being explicit is clearer.
"100s of times": Seems hyperbolic.
These code patterns are used all the time in real world. Just look at CRTP. (comment on linked question)
Again, this only applies if the derived class is templated. I would dispute the commonality, but these idioms do exist and have a place.
Most importantly, though, is that CRTP is not a goal. CRTP is a hack. It's a C++ idiom because C++ lacks better facilities. CRTP allows a class to opt into certain behaviors that would otherwise be bothersome to write. Relevant C++ proposals do exist, but by and large, they have focused on making extension easier or removing boilerplate, and not on making CRTP, the hack, easier.
These are some that come to mind:
C++20: Comparisons
A very common use of CRTP is for things that require a lot of extra boilerplate. C++ required you to define operator== and operator!=, for instance. By opting into a CRTP base class, one could define only the primitive operation and have the other one generated.
C++20 fixed the underlying problem with comparisons. The typical member-wise comparison can be defaulted, and comparisons can be re-written so that != can invoke ==.
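For instance (C++20; a small sketch):

#include <compare>

struct Point {
    int x;
    int y;
    // Member-wise comparison generated by the compiler. A defaulted <=> also
    // gives a defaulted ==; != is rewritten in terms of ==, and <, <=, >, >=
    // are rewritten in terms of <=>. No CRTP base required.
    friend auto operator<=>(const Point&, const Point&) = default;
};

static_assert(Point{1, 2} == Point{1, 2});
static_assert(Point{1, 2} != Point{1, 3});
static_assert(Point{1, 2} < Point{2, 0});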
The problem is solved at the root, removing CRTP, not enhancing it.
C++20: Iterators to Ranges
Another common use of CRTP in the same vein as above is iterators. Writing custom iterators requires only a few fundamental operations: advance and dereference for forward iterators. But then there's a lot of extra seemingly unnecessary ceremony: pre-increment, post-increment, const iterators, typedefs, etc.
C++20 took a large step forward by introducing range concepts and the range library. The result is that it should be much less necessary to write a custom iterator. Ranges become a capable concept on their own, and there's a good suite of range combinators.
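A small illustration of the shift: instead of hand-writing an iterator type, compose existing views (C++20):

#include <iostream>
#include <ranges>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3, 4, 5, 6};
    // No custom iterator needed: filter and transform are composed as views.
    auto squares_of_evens = v
        | std::views::filter([](int x) { return x % 2 == 0; })
        | std::views::transform([](int x) { return x * x; });
    for (int x : squares_of_evens)
        std::cout << x << ' ';  // prints: 4 16 36
}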
C++20: Concepts
C++ essentially has two systems for specifying an interface: virtual functions and concepts. They have trade-offs. Virtual functions are intrusive. But prior to C++20, concepts were implicit and emulated. One reason to use CRTP or inheritance generally would be to inject virtual functions. But in C++20, concepts are a language feature removing one big negative.
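A brief sketch of a concept standing in for an interface-injection base class (the names are invented for the example):

#include <concepts>
#include <string>

// Non-intrusive interface: any type with a suitable draw() models it,
// without inheriting from a common base.
template <typename T>
concept Drawable = requires(const T& t) {
    { t.draw() } -> std::convertible_to<std::string>;
};

struct Circle {
    std::string draw() const { return "circle"; }
};

template <Drawable T>
std::string render(const T& shape) { return shape.draw(); }

static_assert(Drawable<Circle>);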
Future C++: Metaclasses
One value of CRTP, in addition to boilerplate reduction, is satisfying an entire collection of type requirements at once. A comparable class defines all the comparison operators. A clonable base defines a virtual destructor and clone.
This is the topic of metaclasses, which is not yet in C++.
See Also
See also the work on customization points, which seems very interesting, and the debates on unified function call syntax, which it seems we'll never get.
Summary
There's a very good question hiding in here about how C++20 makes it easier to reduce boilerplate, remove hacks like CRTP, and write better and clearer code. C++20 takes several steps in this regard, but those steps make it easier to express intent rather than making any particular idiom easier.

Specializing std::optional

Will it be possible to specialize std::optional for user-defined types? If not, is it too late to propose this to the standard?
My use case for this is an integer-like class that represents a value within a range. For instance, you could have an integer that lies somewhere in the range [0, 10]. Many of my applications are sensitive to even a single byte of overhead, so I would be unable to use a non-specialized std::optional due to the extra bool. However, a specialization for std::optional would be trivial for an integer that has a range smaller than its underlying type. We could simply store the value 11 in my example. This should provide no space or time overhead over a non-optional value.
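To make the idea concrete, here is a rough, hand-rolled sketch of what such a specialization could do internally (hypothetical names, not std::optional itself): the out-of-range value 11 serves as the disengaged state, so the whole thing stays one byte.

#include <cassert>
#include <cstdint>

// Hypothetical range-restricted integer in [0, 10]; representations 11..255 are unused.
struct SmallInt {
    std::uint8_t value;
};

// A one-byte "optional" that stores 11 as the disengaged state instead of a separate bool.
class OptionalSmallInt {
    static constexpr std::uint8_t empty_sentinel = 11;
    std::uint8_t raw = empty_sentinel;
public:
    OptionalSmallInt() = default;
    explicit OptionalSmallInt(SmallInt v) : raw(v.value) { assert(v.value <= 10); }
    bool has_value() const { return raw != empty_sentinel; }
    SmallInt operator*() const { assert(has_value()); return SmallInt{raw}; }
};

static_assert(sizeof(OptionalSmallInt) == 1);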
Am I allowed to create this specialization in namespace std?
The general rule in 17.6.4.2.1 [namespace.std]/1 applies:
A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited.
So I would say it's allowed.
N.B. optional will not be part of the C++14 standard, it will be included in a separate Technical Specification on library fundamentals, so there is time to change the rule if my interpretation is wrong.
If you are after a library that efficiently packs the value and the "no-value" flag into one memory location, I recommend looking at compact_optional. It does exactly this.
It does not specialize boost::optional or std::experimental::optional but it can wrap them inside, giving you a uniform interface, with optimizations where possible and a fallback to 'classical' optional where needed.
I've asked about the same thing, regarding specializing optional<bool> and optional<tribool>, among other examples, to use only one byte. While the "legality" of doing such things was not under discussion, I do think that one should not, in principle, be allowed to specialize optional<T>, in contrast to e.g. hash (which is explicitly allowed).
I don't have the logs with me, but part of the rationale is that the interface treats access to the data as access to a pointer or reference, meaning that if you use a different data structure internally, some of the access invariants might change; not to mention that providing the interface with access to the data might require something like reinterpret_cast<some_reference_type>. Using a uint8_t to store an optional<bool>, for example, would impose several extra requirements on the interface of optional<bool> that differ from those of optional<T>. What should the return type of operator* be, for example?
Basically, I'm guessing the idea is to avoid the whole vector<bool> fiasco again.
In your example, it might not be too bad, as the access type is still your_integer_type& (or pointer). But in that case, simply designing your integer type to allow for a "zombie" or "undetermined" value instead of relying on optional<> to do the job for you, with its extra overhead and requirements, might be the safest choice.
Make it easy to opt-in to space savings
I have decided that this is a useful thing to do, but a full specialization is a little more work than necessary (for instance, getting operator= correct).
I have posted on the Boost mailing list a way to simplify the task of specializing, especially when you only want to specialize some instantiations of a class template.
http://boost.2283326.n4.nabble.com/optional-Specializing-optional-to-save-space-td4680362.html
My current interface involves a special tag type used to 'unlock' access to particular functions. I have creatively named this type optional_tag. Only optional can construct an optional_tag. For a type to opt-in to a space-efficient representation, it needs the following member functions:
T(optional_tag) constructs an uninitialized value
initialize(optional_tag, Args && ...) constructs an object when there may be one in existence already
uninitialize(optional_tag) destroys the contained object
is_initialized(optional_tag) checks whether the object is currently in an initialized state
By always requiring the optional_tag parameter, we do not limit any function signatures. This is why, for instance, we cannot use operator bool() as the test, because the type may want that operator for other reasons.
An advantage of this over some other possible methods of implementing it is that you can make it work with any type that can naturally support such a state. It does not add any requirements such as having a move constructor.
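A rough sketch of what the opt-in side could look like for a range-restricted integer, following the description above (hypothetical code, not the actual library source):

#include <cstdint>

// Hypothetical tag; in the real design only the optional implementation can construct it.
struct optional_tag {};

// A [0, 10] integer that opts in by reserving the bit pattern 11 as "uninitialized".
class bounded_int {
    static constexpr std::uint8_t uninitialized = 11;
    std::uint8_t raw;
public:
    explicit bounded_int(int value) : raw(static_cast<std::uint8_t>(value)) {}

    // The four opt-in members, each gated by the tag:
    explicit bounded_int(optional_tag) : raw(uninitialized) {}
    void initialize(optional_tag, int value) { raw = static_cast<std::uint8_t>(value); }
    void uninitialize(optional_tag) { raw = uninitialized; }
    bool is_initialized(optional_tag) const { return raw != uninitialized; }
};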
You can see a full code implementation of the idea at
https://bitbucket.org/davidstone/bounded_integer/src/8c5e7567f0d8b3a04cc98142060a020b58b2a00f/bounded_integer/detail/optional/optional.hpp?at=default&fileviewer=file-view-default
and for a class using the specialization:
https://bitbucket.org/davidstone/bounded_integer/src/8c5e7567f0d8b3a04cc98142060a020b58b2a00f/bounded_integer/detail/class.hpp?at=default&fileviewer=file-view-default
(lines 220 through 242)
An alternative approach
This is in contrast to my previous implementation, which required users to specialize a class template. You can see the old version here:
https://bitbucket.org/davidstone/bounded_integer/src/2defec41add2079ba023c2c6d118ed8a274423c8/bounded_integer/detail/optional/optional.hpp
and
https://bitbucket.org/davidstone/bounded_integer/src/2defec41add2079ba023c2c6d118ed8a274423c8/bounded_integer/detail/optional/specialization.hpp
The problem with this approach is that it is simply more work for the user. Rather than adding four member functions, the user must go into a new namespace and specialize a template.
In practice, all specializations would have an in_place_t constructor that forwards all arguments to the underlying type. The optional_tag approach, on the other hand, can just use the underlying type's constructors directly.
In the specialize optional_storage approach, the user also has the responsibility of adding proper reference-qualified overloads of a value function. In the optional_tag approach, we already have the value so we do not have to pull it out.
optional_storage also required standardizing, as part of the interface of optional, two helper classes, only one of which the user is supposed to specialize (and sometimes delegate its specialization to the other).
The difference between this and compact_optional
compact_optional is a way of saying "Treat this special sentinel value as the type being not present, almost like a NaN". It requires the user to know that the type they are working with has some special sentinel. An easily specializable optional is a way of saying "My type does not need extra space to store the not present state, but that state is not a normal value." It does not require anyone to know about the optimization to take advantage of it; everyone who uses the type gets it for free.
The future
My goal is to get this first into boost::optional, and then part of the std::optional proposal. Until then, you can always use bounded::optional, although it has a few other (intentional) interface differences.
I don't see how allowing or not allowing some particular bit pattern to represent the unengaged state falls under anything the standard covers.
If you were trying to convince a library vendor to do this, it would require an implementation, exhaustive tests to show you haven't inadvertently blown any of the requirements of optional (or accidentally invoked undefined behavior) and extensive benchmarking to show this makes a notable difference in real world (and not just contrived) situations.
Of course, you can do whatever you want to your own code.

For C++ templates, is there a way to find types that are "valid" inputs?

I have a library where template classes/functions often access explicit members of the input type, like this:
template <typename InputType>
bool IsSomethingTrue(InputType arg1) {
    // Uses a nested type of the template parameter.
    typename InputType::SubType1::SubType2 a{};
    // Do something with arg1 and a...
    return true;
}
Here, SubType1 and SubType2 are themselves generic types that were used to instantiate InputType. Is there a way to quickly find all the types in the library that are valid to pass in for InputType (likewise for SubType1 and SubType2)? So far I have just been searching the entire code base for classes containing the appropriate members, but the template input names are reused in a lot of places so it is very cumbersome.
From a coding perspective, what is the point of using a template like this when there is only a limited set of valid input types that are probably already defined? Why not just overload this function with explicit types rather than making them generic?
From a coding perspective, what is the point of using a template like this when there is only a limited set of valid input types that are probably already defined? Why not just overload this function with explicit types rather than making them generic?
First of all, because those overloads would have the exact same body, or very similar ones. If the body of the function is long enough, having multiple versions of it is a maintenance problem. When you need to change the algorithm, you now have to do it N times and hope you won't make mistakes. Most of the time, redundancy is bad.
Moreover, even though there may currently be just a few such types that satisfy the syntactic requirements of your function, there may be more in the future. Having a function template allows your algorithm to work with new types without the need to write a new overload every time such a type is introduced.
The advantage of using generic types is not on the template end: if you're willing to explicitly name them and edit the template code every time, it's the same.
What happens, however, when you introduce a subclass or variant of a type accepted by the template? No modification needed on the other end.
In other words, when you say that all types are known beforehand, you are excluding code modifications and extensions, which is half the point of using templates.
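If C++20 is available, one way to make the requirement explicit (and searchable) is to spell it out as a concept; this is a sketch reusing the names from the question:

// States the structural requirement the template actually relies on.
template <typename T>
concept HasNestedSubTypes = requires {
    typename T::SubType1;
    typename T::SubType1::SubType2;
};

template <HasNestedSubTypes InputType>
bool IsSomethingTrue(InputType arg1) {
    typename InputType::SubType1::SubType2 a{};
    // Do something with arg1 and a...
    return true;
}

// "Which types are valid?" can then be checked mechanically:
// static_assert(HasNestedSubTypes<SomeCandidateType>);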

Is it always evil to have a struct with methods?

I've just been browsing and spotted the following...
When should you use a class vs a struct in C++?
The consensus there is that, by convention, you should only use struct for POD, no methods, etc.
I've always felt that some types were naturally structs rather than classes, yet could still have a few helper functions as members. The struct should still be POD by most of the usual rules - in particular it must be safe to copy using memcpy. It must have all member data public. But it still makes sense to me to have helper functions as members. I wouldn't even necessarily object to a private method, though I don't recall ever doing this myself. And although it breaks the normal POD rules, I wouldn't object to a struct having constructors, provided they were just initialise-a-few-fields constructors (user-defined assignment operators or destructors would definitely be against the rules).
To me a struct is intuitively a collection of fields - a data structure node or whatever - whereas a class is an abstraction. The logical place to put the helper functions for your collection-of-fields may well be within the struct.
I even think I once read some advice along these lines, though I don't remember where.
Is this against accepted best practice?
EDIT - POD (Plain Old Data) is misrepresented by this question. In particular, a struct can be non-POD purely because a member is non-POD - e.g. an aggregate with a member of type std::string. That aggregate must not be copied with memcpy. In case of confusion, see here.
For what it's worth, all the standard STL functors are defined as structs, and their sole purpose is to have member functions; STL functors aren't supposed to have state.
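For example, a comparator in the same style is just a struct whose only member is its call operator (a small sketch):

#include <algorithm>
#include <cstdlib>
#include <vector>

// Stateless functor, standard-library style: a struct that exists solely for operator().
struct by_absolute_value {
    bool operator()(int a, int b) const { return std::abs(a) < std::abs(b); }
};

void sort_by_magnitude(std::vector<int>& v) {
    std::sort(v.begin(), v.end(), by_absolute_value{});
}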
EDIT: Personally, I use struct whenever a class has all public members. It matters little, so long as one is consistent.
As far as the language is concerned, it doesn't matter, except for default private vs. public access. The choice is subjective.
I'd personally say use struct for PODs, but remember that "POD" doesn't mean "no member functions". It means no virtual functions, constructors, destructor, or operator=.
Edit: I also use structs for simple public-access aggregates of data, even if their members aren't PODs.
I tend to use struct a lot.
For the "traditional" OOP classes (the ones that represent a specific "thing"), I tend to use class simply because it's a common convention.
But most of my classes aren't really OOP objects. They tend to be functors, and traits classes and all sorts of other more abstract code concepts, things I use to express myself, rather than modelling specific "things". And those I usually make struct. It saves me having to type the initial public:, and so it makes the class definition shorter and easier to get an overview of.
I also lean towards making larger, more complex classes class. Except this rarely affects anything, because I lean even more towards refactoring large, complex classes... And then I'm left with small simple ones, which might be made structs.
But in either case, I'd never consider it "evil". "Evil" is when you do something that actively obfuscates your code, or when you make something far more complex than necessary.
But every C++ programmer knows that struct and class mean virtually the same thing. You're not going to be confused for long because you see a struct Animal {...}; You know that it's a class designed to model an animal. So no, it's not "evil" no matter which way you do it.
When I have control over the style guides, everything is defined as a struct. Historically, I wanted all my objects to be one or the other, since I was taught it was undefined behavior to forward declare a struct as a class, and vice versa (I'm not sure if this was ever actually true; it's just what I was told). And really, struct has the more reasonable default state.
I still do the same because I've never been convinced of the value of using it as a form of documentation. If you need to convey that an object only has public members, or doesn't have any member functions, or whatever arbitrary line you chose to draw between the two, you have actual documentation available for that. Since you never actually use the struct or class keyword when using the type, you would need to go hunting for the definition anyhow if you want the information. And with everybody having their own opinion on what struct actually means, its usefulness as self-documentation is reduced. So I shoot for consistency: everything as one or the other. In this case, struct.
It's not a popular way of doing things, to say the least. Fortunately for everybody involved, I very rarely have control over the coding standards.
I use a struct whenever I need an aggregate of public data members (as opposed to class, which I use for types that come with a certain level of abstractions and have their data private).
Whether or not such a data aggregate has member functions doesn't matter to me. In fact, I rarely ever write a struct without also providing at least one constructor for it.
You said, "To me a struct is intuitively a collection of fields - a data structure node or whatever". What you are describing is Plain Old Data. The C++ standard does have an opinion of what POD means with respect to C++ language features. I suggest that, where possible, you adopt the meaning of POD used in C++0x.
I need to find a reference for this, but I thought that, as far as type definitions are concerned, the keywords "struct" and "class" are synonyms in C++0x, the only difference being that class defaults to private members, while struct defaults to public. This means that you'll never see compiler error messages like "Type X first seen using struct, now seen using class" once we're all using C++0x.
I think it is wrong to have strict rules on when to use class and when to use struct. If consistency is important to you, then go ahead and make a coding standard with whatever sort of rules you like. Just make sure everyone knows that your coding standard relates to how you want your code to look - it doesn't relate to the underlying language features at all.
Maybe to give an example where I break all your rules, you say "To me a struct is intuitively a collection of fields - a data structure node or whatever - whereas a class is an abstraction." When I want to write a class that implements some abstraction, I define the interface as a struct with pure virtual methods (egad!) - this is exposed in a header, as is a factory method to construct my concrete class. The concrete class is not exposed in a public header at all. Since the concrete class is a secret, it can be implemented however I like, and it makes no difference to external code.
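Sketched concretely (hypothetical names):

#include <memory>

// Public header: an abstract interface declared as a struct (members are public
// by default), plus a factory function.
struct Renderer {
    virtual void draw() = 0;
    virtual ~Renderer() = default;
};

std::unique_ptr<Renderer> make_renderer();

// Implementation file: the concrete class stays a hidden detail.
namespace {
    struct OpenGLRenderer final : Renderer {
        void draw() override { /* ... */ }
    };
}

std::unique_ptr<Renderer> make_renderer() {
    return std::make_unique<OpenGLRenderer>();
}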
I'd suggest that if you think that the keywords struct and class have any relation to the concept of POD, then you're not understanding C++ yet.