In C++17 std::optional is introduced, I was happy about this decision, until I looked at the ref. I know Optional/Maybe from Scala, Haskell and Java 8, where optional is a monad and follows monadic laws. This is not the case in the C++17 implementation. How am I supposed to use std::optional, whithout functions like map and flatMap/bind, whats the advantage using a std::optional vs for example returning -1, or a nullptr from a function if it fails to compute a result?
And more important for me, why wasn't std::optional designed to be a monad, is there a reason?
There is P0798r0 proposal with exactly this, and the associated implementation here on Github. The proposal also refers to general monadic interface proposal, and similarly usable std::expected. Implementations of those are also available.
How am I supposed to use std::optional, whithout functions like map and flatMap/bind
Maybe in Haskell is perfectly usable without fmap, it represents a value that may or may not be there. It also brings to the type system the distinction so you need to handle both cases.
whats the advantage using a std::optional vs for example returning -1, or a nullptr from a function if it fails to compute a result?
How do you know what the error condition is? Is it 0, -1, MAX_INT, nullptr or something else? If I have both a unsigned int and int return value and the int version previously returned -1 should you change them both to MAX_INT or make them return different values? std::optional avoids the problem.
And more important for me, why wasn't std::optional designed to be a monad, is there a reason?
Does C++ have monads at the moment? Until a different abstraction than the container one there isn't really a way to add that functionality.
You can define bind and return over std::optional, so in that sense it is still a Monad.
For instance, a possible bind
template<typename T1, typename T2>
std::optional<T2> bind(std::optional<T1> a, std::function< std::optional<T2>(T1)> f) {
if(a.has_value()) return f(a.value());
return std::optional<T2>{};
}
It is actually probably useful to define this.
As to why the standard library does not ship with this, or something like this, I think the answer is one of preferred style in the language.
Related
std::transform from the <algorithm> header applies to ranges, and it is what "enables" us to use ranges as the functors they are (in the sense of category theory(¹)). std::transform is iterator-based, yes, but std::ranges::views::transform is not, and its signature closely matches the signature of corresponding functions in functional languages (modulo the different order of the two arguments, but this is not a big deal).
When I saw this question (and in the process of answering to it), I learned that C++23 introduces std::optional<T>::transform, which makes std::optional a functor too.
All these news truly excite me, but I can't help thinking that functors are general, and that it'd be nice to have a uniform interface to transform any functor, as is the case in Haskell, for instance.
This makes me think that an object similar to std::ranges::views::transform (with a different name not alluding to ranges) could be made a customization point that the STL would customize not just for ranges, but also for std::optional and for any other functor in the STL, whereas the programmer could customize it for their user-defined classes.
Quite similarly, C++23 also introduces std::optional<T>::and_then, which is basically the monadic binding for std::optional. I'm not aware of any similar function that implements monadic binding for ranges, but C++20's some_range | std::ranges::views::transform(f) | std::ranges::views::join is essentially the monadic binding of some_range with f.
And this makes me think that there could be some generic interface, name it mbind, that one can opt in with any type. The STL would opt in for std::optional by implementing it in terms of std::optional<T>::and_then, for instance.
Is there any chance, or are there any plans that the language will one day support such a genericity?
I can certainly see some problems. Today std::ranges::views::transform(some_optional, some_func) is invalid, so some code might be relying on that via SFINAE. Making it suddenly work would break the code.
(¹) As regards the word functor, I refer to the definition that is given in category theory (see also this), not to the concept of "object of a class which has operator() defined"; the latter is not defined anywhere in the standard and is not even mentioned on cppreference, which instead uses the term FunctionObject to refer to
an object that can be used on the left of the function call operator
I'm not aware of any similar function that implements monadic binding for ranges, but C++20's some_range | std::ranges::views::transform(f) | std::ranges::views::join is essentially the monadic binding of some_range with f.
ranges::views::for_each is monadic bind for ranges (read), although it is just views::transform | views::join under the hood.
As for whether you'll get a generic interface for Functor and Monad. I doubt it unless such genericity will be useful to library writers writing templates. std::expiremental::future is monadic also (and I imagine Executors are too), so one could to write generic algorithms such as foldM over these three types. I think Erik Niebler has shown with range-v3 that it is possible to write a Functor/Monad library at the expense of hand coding every pipe operator i.e.
#include <fp_extremist.hpp>
template <typename M> requires Monad<M>
auto func(M m)
{
return m
| fp::extremist::fmap([](auto a) { return ...; })
| fp::extremist::mbind([](auto b) { return ...; })
;
}
What I think is actually possible, is that we'll get UFCS and a |> operator so we can get the benefits of invocable |> east syntax and the ability to pipe to algorithms. From Barry's blog:
It doesn’t work because while the range adapters are pipeable, the algorithms are not. ...
That is, x |> f still evaluates as f(x) as before… but x |> f(y) evaluates as f(x, y).
P.S. it's not hard to give the definition of a Functor in c++: <typename T> struct that provides transform.
P.S.S
Edit: I realised how to handle applicatives.
Applicative<int> a, b, c;
auto out = a
| zip_transform(b, c
, [](int a, int b, int c){ return a + b + c; })
;
zip_transform because it zips a, b, c into a Applicative<std::tuple<int, int, int>> and then transforms it (think optional, future, range). Although you could always work with Applicatives of partially applied functions, but that would involve lots of nested functions which is not the style of c++ and would disrupt the top to bottom reading order.
And this makes me think that there could be some generic interface, name it mbind, that one can opt in with any type. ...
Is there any chance, or are there any plans that the language will one
day support such a genericity?
There is P0650 which proposes a non-member monadic interface, customizable using traits. The paper shows customizations for expected and other types, implementable in C++17.
The accepted proposal for std::optional monadic operations P0798 §13.1 references P0650 while discussing alternatives to member-function syntax:
Unfortunately doing the kind of composition described above would be very verbose with the current proposal without some kind of Haskell-style do notation
std::optional<int> get_cute_cat(const image& img) {
return functor::map(
functor::map(
monad::bind(
monad::bind(crop_to_cat(img),
add_bow_tie),
make_eyes_sparkle),
make_smaller),
add_rainbow);
}
My proposal is not necessarily an alternative to [P0650]; compatibility between the two could be ensured and the generic proposal could use my proposal as part of its implementation.
It goes on to mention how other C++ features still in development, like unified call syntax, might provide a more concise syntax for generic monadic operations.
The accepted proposal for std::expected monadic operations P2505 doesn't reference P0650 directly, but discusses "Free vs member functions" as part of its design considerations, ultimately prioritizing consistency with the std::optional monadic interface.
The Boost.Hana documentation for tuple_c states:
Also note that the type of the objects returned by tuple_c and an
equivalent call to make<tuple_tag> may differ.
followed by the following snippet:
BOOST_HANA_CONSTANT_CHECK(
hana::to_tuple(hana::tuple_c<int, 0, 1, 2>)
==
hana::make_tuple(hana::int_c<0>, hana::int_c<1>, hana::int_c<2>)
);
However, the actual implementation for tuple_c simply has:
#ifdef BOOST_HANA_DOXYGEN_INVOKED
template <typename T, T ...v>
constexpr implementation_defined tuple_c{};
#else
template <typename T, T ...v>
constexpr hana::tuple<hana::integral_constant<T, v>...> tuple_c{};
#endif
and, indeed, the code snippet works just fine without the to_tuple wrapper:
BOOST_HANA_CONSTANT_CHECK(
hana::tuple_c<int, 0, 1, 2>
==
hana::make_tuple(hana::int_c<0>, hana::int_c<1>, hana::int_c<2>)
);
Question: why is the actual type of tuple_c implementation defined? Isn't the to_tuple wrapper superfluous?
The phrase "implementation defined" does not describe an implementation. It explicitly states that an implementation choice is left undocumented on purpose, for one reason or another. Of course it is implemented somehow. The users should not rely on any particular implementation but use only documented APIs.
Leaving an implementation choice undocumented is a sensible default unless there's a specific reason to document it. This is true even if there's only one obvious choice today, because tomorrow things may change.
Actually, the documentation has this covered in a FAQ:
Why leave some container's representation implementation-defined?
First, it gives much more wiggle room for the implementation to
perform compile-time and runtime optimizations by using clever
representations for specific containers. For example, a tuple
containing homogeneous objects of type T could be implemented as an
array of type T instead, which is more efficient at compile-time.
Secondly, and most importantly, it turns out that knowing the type of
a heterogeneous container is not as useful as you would think. Indeed,
in the context of heterogeneous programming, the type of the object
returned by a computation is usually part of the computation too. In
other words, there is no way to know the type of the object returned
by an algorithm without actually performing the algorithm.
Not speaking with authority, but I would say that wrapping the tuple_c with to_tuple is in fact superfluous. The documentation states that the result is functionally equivalent to make_tuple except that the type is not guaranteed to be the same.
One possible optimization would be returning something like this:
template <auto ...i>
struct tuple_c_t { };
To be sure I made a pull request to see if we can get the superfluous conversion removed from the example.
https://github.com/boostorg/hana/pull/394
UPDATE: It was confirmed by the author of Boost.Hana that the conversion is unnecessary and the example was updated to reflect that.
Obviously, std::optional is the best choice to return an optional value from a function if one uses C++17 or boost (see also GOTW #90)
std::optional<double> possiblyFailingCalculation()
But what and why would be the best alternative if one is stuck with an older version (and can't use boost)?
I see a few options:
STL smart pointers (C++11 only)
std::unique_ptr<double> possiblyFailingCalculation();
(+) virtually the same usage as optional
(−) confusing to have smart pointers to non-polymorphic types or built-in types
Pairing it up with a bool
std::pair<double,bool> possiblyFailingCalculation();
Old style
bool possiblyFailingCalculation(double& output);
(−) incompatible with new C++11 auto value = calculation() style
A DIY template: a basic template with the same functionality is easy enough to code, but are there any pitfalls to implement a robust std::optional<T> look-a-like template ?
Throw an exception
(−) Sometimes "impossible to calculate" is a valid return value.
std::optional, like its boost::optional parent, is a pretty basic class template. It's a bool, some storage, and a bunch of convenience member functions most of which are one line of code and an assert.
The DIY option is definitely preferred. (1) involves allocation and (2), (3) involve having to construct a T even if you want a null value - which doesn't matter at all for double but does matter for more expensive types. With (5), exceptions are not a replacement for optional.
You can always compare your implementation to Boost's. It's a small header-only library, after all.
instead of std::optional, use tl::optional from this link:
https://github.com/TartanLlama/optional
It has the same public interface as its std counterpart, only it compiles in C++98 also.
I used it in production code (C++11) and works great!
I'd also consider a sentinel value.
In the case of a double the NaN value (std::numeric_limits<double>::quiet_NaN()) is a possible candidate (only meaningful if std::numeric_limits<double>::has_quiet_NaN == true).
There are various opinions about this approach (e.g. take a look at NaN or false as double precision return value and Good sentinel value for double if prefer to use -ffast-math).
In specific domains there could be other meaningful sentinel values.
In any case (not only for double) I'd adopt/implement something like markable (https://github.com/akrzemi1/markable) to avoid magic values and indicate that the value may not be there and that its potential absence should be checked by the user.
For additional motivation and overview of this approach: Efficient optional values.
Why is std::pair<A,B> not the same as std::tuple<A,B>? It always felt strange to not be able to just substitute one with the other. They are somewhat convertible, but there are limitations.
I know that std::pair<A,B> is required to have the two data members A first and B second, so it can't be just a type alias of std::tuple<A,B>. But my intuition says that we could specialize std::tuple<A,B>, that is a tuple with exactly two elements, to equal the definition of what the standard requires a std::pair to be. And then alias this to std::pair.
I guess this wouldn't be possible as it is too straight-forward to not to be already thought of, yet it wasn't done in g++'s libstdc++ for example (I didn't look at the source code of other libraries). What would the problem of this definition be? Is it "just" that it would break the standard library's binary compatibility?
You've gotta be careful about things like SFINAE and overloading. For example, the code below is currently well-formed but you would make it illegal:
void f(std::pair<int, int>);
void f(std::tuple<int, int>);
Currently, I can disambiguate between pair and tuple through overload resolution, SFINAE, template specialization, etc. These tools would all become incapable of telling them apart if you make them the same thing. This would break existing code.
There might have been an opportunity to introduce it as part of C++11, but there certainly isn't now.
This is purely historical. std::pair exist since C++98 whereas tuple came after and was initially not part of the standard.
Backward compatibility is the biggest burden for C++ evolution, preventing some nice things to be done easily !
I've not tried this and don't have the bandwidth right now to do so. You could try making a specialisation of std::tuple, derived from a sd::pair. Someone please tell me this won't work or is particularly horrible idea. I suspect you'd run into trouble with accessors.
Why isn't std::initializer_list a core-language built-in?
It seems to me that it's quite an important feature of C++11 and yet it doesn't have its own reserved keyword (or something alike).
Instead, initializer_list it's just a template class from the standard library that has a special, implicit mapping from the new braced-init-list {...} syntax that's handled by the compiler.
At first thought, this solution is quite hacky.
Is this the way new additions to the C++ language will be now implemented: by implicit roles of some template classes and not by the core language?
Please consider these examples:
widget<int> w = {1,2,3}; //this is how we want to use a class
why was a new class chosen:
widget( std::initializer_list<T> init )
instead of using something similar to any of these ideas:
widget( T[] init, int length ) // (1)
widget( T... init ) // (2)
widget( std::vector<T> init ) // (3)
a classic array, you could probably add const here and there
three dots already exist in the language (var-args, now variadic templates), why not re-use the syntax (and make it feel built-in)
just an existing container, could add const and &
All of them are already a part of the language. I only wrote my 3 first ideas, I am sure that there are many other approaches.
There were already examples of "core" language features that returned types defined in the std namespace. typeid returns std::type_info and (stretching a point perhaps) sizeof returns std::size_t.
In the former case, you already need to include a standard header in order to use this so-called "core language" feature.
Now, for initializer lists it happens that no keyword is needed to generate the object, the syntax is context-sensitive curly braces. Aside from that it's the same as type_info. Personally I don't think the absence of a keyword makes it "more hacky". Slightly more surprising, perhaps, but remember that the objective was to allow the same braced-initializer syntax that was already allowed for aggregates.
So yes, you can probably expect more of this design principle in future:
if more occasions arise where it is possible to introduce new features without new keywords then the committee will take them.
if new features require complex types, then those types will be placed in std rather than as builtins.
Hence:
if a new feature requires a complex type and can be introduced without new keywords then you'll get what you have here, which is "core language" syntax with no new keywords and that uses library types from std.
What it comes down to, I think, is that there is no absolute division in C++ between the "core language" and the standard libraries. They're different chapters in the standard but each references the other, and it has always been so.
There is another approach in C++11, which is that lambdas introduce objects that have anonymous types generated by the compiler. Because they have no names they aren't in a namespace at all, certainly not in std. That's not a suitable approach for initializer lists, though, because you use the type name when you write the constructor that accepts one.
The C++ Standard Committee seems to prefer not to add new keywords, probably because that increases the risk of breaking existing code (legacy code could use that keyword as the name of a variable, a class, or whatever else).
Moreover, it seems to me that defining std::initializer_list as a templated container is quite an elegant choice: if it was a keyword, how would you access its underlying type? How would you iterate through it? You would need a bunch of new operators as well, and that would just force you to remember more names and more keywords to do the same things you can do with standard containers.
Treating an std::initializer_list as any other container gives you the opportunity of writing generic code that works with any of those things.
UPDATE:
Then why introduce a new type, instead of using some combination of existing? (from the comments)
To begin with, all others containers have methods for adding, removing, and emplacing elements, which are not desirable for a compiler-generated collection. The only exception is std::array<>, which wraps a fixed-size C-style array and would therefore remain the only reasonable candidate.
However, as Nicol Bolas correctly points out in the comments, another, fundamental difference between std::initializer_list and all other standard containers (including std::array<>) is that the latter ones have value semantics, while std::initializer_list has reference semantics. Copying an std::initializer_list, for instance, won't cause a copy of the elements it contains.
Moreover (once again, courtesy of Nicol Bolas), having a special container for brace-initialization lists allows overloading on the way the user is performing initialization.
This is nothing new. For example, for (i : some_container) relies on existence of specific methods or standalone functions in some_container class. C# even relies even more on its .NET libraries. Actually, I think, that this is quite an elegant solution, because you can make your classes compatible with some language structures without complicating language specification.
This is indeed nothing new and how many have pointed out, this practice was there in C++ and is there, say, in C#.
Andrei Alexandrescu has mentioned a good point about this though: You may think of it as a part of imaginary "core" namespace, then it'll make more sense.
So, it's actually something like: core::initializer_list, core::size_t, core::begin(), core::end() and so on. This is just an unfortunate coincidence that std namespace has some core language constructs inside it.
Not only can it work completely in the standard library. Inclusion into the standard library does not mean that the compiler can not play clever tricks.
While it may not be able to in all cases, it may very well say: this type is well known, or a simple type, lets ignore the initializer_list and just have a memory image of what the initialized value should be.
In other words int i {5}; can be equivalent to int i(5); or int i=5; or even intwrapper iw {5}; Where intwrapper is a simple wrapper class over an int with a trivial constructor taking an initializer_list
It's not part of the core language because it can be implemented entirely in the library, just line operator new and operator delete. What advantage would there be in making compilers more complicated to build it in?