Why do I need to include <compare> header to get <=> to compile? - c++

I know the technical answer is: because the standard says so.
But I am confused regarding the motivation:
I see nothing "library" about defaulting <=>: it may return some type that is technically defined in std, but it is a "fake library" type in the sense that the compiler must know about it anyway, since it must be able to default operator<=> with an auto return type (not to mention that error messages in good compilers mention <compare>, so there is clearly a language<=>library link here).
So I understand there is some library functionality that might require me to include <compare>, but I do not understand why defaulting <=> requires me to include that header, since the compiler already has to know about everything needed to generate the <=>.
Note: I know that most of the time some other standard header will include <compare>; this is a question about language/library design, not so much about one extra line that C++ forces me to write without a good reason.

it may return some type that is technically defined in std but it is a "fake library" type in a sense
Well, <=> returns types that are very much real, that are actually defined in <compare> and implemented there. In the same way that an initializer list is used to construct a std::initializer_list<T>, which is very much a real type that is actually defined in <initializer_list>. And std::type_info in <typeinfo>.
And those comparison types - std::strong_ordering, std::weak_ordering, and std::partial_ordering (and originally also std::strong_equality and std::weak_equality) - themselves have non-trivial conversion semantics and other operations defined on them, that we may want to change in the future. They'd be very special language types indeed, where the convertibility only goes in one direction but in a way that's very much unlike inheritance (there are only three values for the total ordering types, but four for the partial one...). It's really much easier to define these as real library types and then specify their interaction as real library code.
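For illustration (my example, not part of the quoted question), the one-way convertibility looks like this: a stronger category converts to a weaker one, but never the other way around.

    #include <compare>

    void categories() {
        std::strong_ordering  s = 1 <=> 2;       // integers compare with strong ordering
        std::weak_ordering    w = s;             // OK: strong_ordering -> weak_ordering
        std::partial_ordering p = w;             // OK: weak_ordering -> partial_ordering
        std::partial_ordering f = 1.0 <=> 2.0;   // floating point yields partial_ordering
        // std::strong_ordering bad = p;         // error: no conversion in this direction
    }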
that compiler must know about it since it must be able to default operator<=> with auto return type
Kind of, but not really. The compiler knows what the names of the types are, and how to produce values of them for the fundamental types, but it doesn't actually need to know anything more than that. The rule for the return type is basically hardcoded based on the types that the underlying members' <=>s return; the compiler doesn't need to know what those actual types look like to do that. And then you're just invoking functions that do... whatever.
The cost of your having to include a header is typing #include <compare> and having the compiler parse it. The cost of the compiler having to synthesize these types would be paid in every TU, whether or not it does any three-way comparisons. Plus, if and when we want to change these types, it's easier to change library types than language types anyway.
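To make the language/library split concrete, a minimal example (mine, not from the question):

    #include <compare>   // without this include, the defaulted operator<=> below is ill-formed

    struct Point {
        int x;
        int y;
        // The deduced return type is std::strong_ordering, because both members
        // are ints; that type is declared in <compare>, not built into the language.
        auto operator<=>(const Point&) const = default;
    };

    static_assert(Point{1, 2} < Point{1, 3});   // uses the synthesized comparison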

Related

How does the compiler define the classes in type_traits?

In C++11 and later, the <type_traits> header contains many classes for type checking, such as std::is_empty, std::is_polymorphic, std::is_trivially_constructible and many others.
While we use these classes just like normal classes, I cannot figure out any way one could possibly write the definitions of these classes. No amount of SFINAE (even with C++14/17 rules) or other techniques seems to be able to tell whether a class is polymorphic, empty, or satisfies other such properties. A class that is empty still occupies a positive amount of space, since its objects must have unique addresses.
How then, might compilers define such classes in C++? Or perhaps it is necessary for the compiler to be intrinsically aware of these class names and parse them specially?
Back in the olden days, when people were first fooling around with type traits, they wrote some really nasty template code in attempts to write portable code to detect certain properties. My take on this was that you had to put a drip-pan under your computer to catch the molten metal as the compiler overheated trying to compile this stuff. Steve Adamczyk, of Edison Design Group (provider of industrial-strength compiler frontends), had a more constructive take on the problem: instead of writing all this template code that takes enormous amounts of compiler time and often breaks them, ask me to provide a helper function.
When type traits were first formally introduced (in TR1, 2006), there were several traits that nobody knew how to implement portably. Since TR1 was supposed to be exclusively library additions, these couldn't count on compiler help, so their specifications allowed them to get an answer that was occasionally wrong, but they could be implemented in portable code.
Nowadays, those allowances have been removed; the library has to get the right answer. The compiler help for doing this isn't special knowledge of particular templates; it's a function call that tells you whether a particular class has a particular property. The compiler can recognize the name of the function, and provide an appropriate answer. This provides a lower-level toolkit that the traits templates can use, individually or in combination, to decide whether the class has the trait in question.
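For instance (a sketch, not actual library source), an implementation can define a trait in terms of such a compiler helper. __is_polymorphic is the intrinsic spelling recognized by GCC, Clang, and MSVC; the exact names are implementation details.

    #include <type_traits>

    // Sketch only: the real library definition and the intrinsic's name
    // are implementation details.
    template <class T>
    struct my_is_polymorphic
        : std::integral_constant<bool, __is_polymorphic(T)> {};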

Why can't I overload C++ conversion operators outside a class, as a non-member function?

This question has kind of been asked before, but I feel the asker was hasty to accept an answer when he never actually got a real one. Maybe there is no reason why, and this needs to be added to the standard later; you tell me.
What is the rationale to not allow overloading of C++ conversions operator with non-member functions
I'm looking for the specific reason this is not allowed as part of the design of the current standard. Basically, when you overload a cast operator to define an implicit conversion between two types, the overloaded definition has to be a member of the class you're converting from, not something outside a class. The obvious problem: if you have types that you really can't modify for some reason, but you want to implicitly convert between them for the sake of syntactic simplicity (despite the evils of implicit conversion), or because you have a bunch of other code, standard or custom, that relies on implicit conversion, you can't do that, because you can't add the appropriate implicit conversions to the classes. Instead you need workarounds like ordinary conversion functions wrapped around what would otherwise be the convenience of an implicit conversion.
Also, is it... really possible that there would be a computational overhead to adding these conversions outside a class? The way I see it, it would be easy for a compiler, when figuring out what functions are available, to associate external implicit conversion functions with the class they convert from, so that the code is executed as if it were part of that class as far as efficiency goes. The only downside would be the extra work of making the initial association, which should be almost nothing.
I will not take "because the standard says so" or "because implicit conversions are bad" as an answer. Somebody surely had a reason when they wrote the actual standard.
(I'm not a huge expert, I'm still learning the language.)
Edit, response:
Well, I imagine the situation would be like this: yes, you change the header file, but what you don't do is overwrite the existing one, because that would be terrible. You would create a new header file based on the old one to accommodate the changes. The assumption would be that the old code is already compiled into an object file, and changing the header just tells the compiler there's additional code somewhere else that you added. It wouldn't change what the old code does, because it's already compiled and doesn't depend on that (i.e. some vendor handed you object code and a header). If I could modify and recompile the code I would be using the conversion for, then you couldn't make me write the conversion function externally; I wouldn't do it, it's too confusing.

You wouldn't have to search every header randomly for the right definitions. If I were writing the code myself, I would make a custom header with a highly visible section for the stuff I added to the vendor-supplied header, and that header would be relatively obvious because it would be associated with the related types, while the other headers would keep their original names so you would know they weren't changed. And you would have a corresponding file containing only the conversion definitions, so my modifications would be self-contained, separated from the original object code, and relatively easy to find. That is apart from the actual struggle of figuring out in the code which conversion function applies.

I think you can find a variety of cases where that's easy enough to determine and natural enough to use that it makes sense to add on to an existing library like this for your own purposes. If I were using commercial code that I couldn't really modify, and I saw a situation where what I was doing with it could be improved by using a conversion function to integrate it with some of my own stuff, I could see myself wanting to do this. Granted, such things aren't obvious to a third person just reading a = b; they wouldn't know what was going on with my conversions just from that, but if you knew, and it read nicely, then it could work.
I appreciate the insight on how standards decisions tend to work, this is definitely a kind of fringe thing that you could ignore.
Besides non-explicit conversion operators, e.g. operator bool(), in the class you are converting from, you can also have non-explicit constructors taking a single argument in the class you are converting to, as a way of introducing a user-defined conversion. (Not mentioned in the question.)
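For reference, both member forms look like this (an illustrative example of mine, not from the original question):

    struct Celsius { double degrees; };

    struct Fahrenheit {
        double degrees;

        // converting constructor: lives in the type being converted *to*
        Fahrenheit(Celsius c) : degrees(c.degrees * 9.0 / 5.0 + 32.0) {}

        // conversion operator: lives in the type being converted *from*
        operator Celsius() const { return Celsius{(degrees - 32.0) * 5.0 / 9.0}; }
    };

    // What the question asks for, and what the language forbids:
    // operator Fahrenheit(Celsius);   // error: a conversion function must be a non-static member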
As to why you cannot introduce user-defined conversions between two types A and B without modifying their definitions... well this would create chaos.
If you can do this, then you can do it in a header file, and since introducing new user-defined conversions can change the meaning of code, it would mean that "old" code using only A and B could totally change what it does depending on whether your header happens to be included before it.
It's already hard enough to figure out exactly which user-defined conversion sequences are taking place when things go wrong, even with the restriction that the conversions have to be declared by one of the two types. If you potentially have to search every single unrelated header file in full to find these conversion function definitions, it dramatically worsens the maintenance problem, and there doesn't appear to be any benefit to allowing this. I mean, can you give a non-contrived example where this language feature would help you make the implementation of something much simpler or easier to read?
In general, I think programmers like the idea that to figure out what a line a = b; does, they just have to read the definition of the type of a and the type of b and go from there... it's potentially ugly and painful if you start allowing these "gotcha" conversions that are harder to know about.
I guess you could say the same thing with regard to operator << being used for streaming... but with user-defined conversions it's more serious, since it can potentially affect any line of code where an object of that type is passed as a parameter.
Also, I don't think you should necessarily expect to find a well-thought-out reason; not everything that is feasible for compilers to implement is permitted by the standard. The committee tends to be conservative and to seek consensus, so "no one really cared about feature X enough to fight for it" is probably as good an explanation as you will find for why feature X is not available.
Why is initialization of a constant dependent type in a template parameter list disallowed by the standard?
The answer to that question suggests a common reason for a feature not being available:
Legacy: the feature was left out in the first place, and now we've built so much without it that it's almost forgotten (see partial function template specialization).

Is std::vector<T> a `user-defined type`?

In 17.6.4.2.1/1 and 17.6.4.2.1/2 of the current draft standard restrictions are placed on specializations injected by users into namespace std.
The behavior of a C++ program is undefined if it adds declarations or definitions to namespace std or to a namespace within namespace std unless otherwise specified. A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited.
I cannot find where in the standard the phrase user-defined type is defined.
One option I have heard claimed is that any type for which std::is_fundamental is false is a user-defined type, in which case std::vector<int> would be a user-defined type.
An alternative answer would be that a user-defined type is a type that a user defines. As users do not define std::vector<int>, and std::vector<int> is not dependent on any type a user defines, std::vector<int> is not a user-defined type.
A practical problem this impacts is: can you inject a specialization of std::hash for std::tuple<Ts...> into namespace std? Being able to do so is somewhat convenient -- the alternative is to create another namespace where we recursively build our hash for std::tuple (and possibly other types in std that do not have hash support), and only if we fail to find a hash in that namespace do we fall back on std.
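As a concrete illustration of the kind of injection I mean (written only to show the shape of the problem, using C++17 conveniences for brevity; the hash-combining scheme is arbitrary):

    #include <cstddef>
    #include <functional>
    #include <tuple>

    namespace std {
        template <typename... Ts>
        struct hash<tuple<Ts...>> {
            size_t operator()(const tuple<Ts...>& t) const {
                size_t seed = 0;
                // arbitrary boost-style hash combining, for illustration only
                apply([&](const Ts&... elems) {
                    ((seed ^= hash<Ts>{}(elems) + 0x9e3779b9 + (seed << 6) + (seed >> 2)), ...);
                }, t);
                return seed;
            }
        };
    }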
However, if this is legal, then if and when the standard adds a hash specialization for std::tuple to namespace std, code that specialized it already would be broken, creating a reason not to add such specializations in the future.
While I am talking about std::vector<int> as a concrete example, I am trying to ask whether types defined in std are ever user-defined types. A secondary question is, even if not, does std::tuple<int> become a user-defined type when used by a user? (This gets slippery: what then happens if something inside std defines std::tuple<int>, and you partial-specialize hash for std::tuple<Ts...>?)
There is currently an open defect on this problem.
Prof. Stroustrup is very clear that any type that is not built-in is user-defined. See the second paragraph of section 9.1 in Programming Principles and Practice Using C++.
He even specifically calls out “standard library types” as an example of user-defined types. In other words, a user-defined type is any compound type.
Source
The article explicitly mentions that not everyone seems to agree, but this is IMHO mostly wishful thinking and not what the standard (and Prof. Stroustrup) are actually saying, only what some people want to read into it.
When Clause 17 says "user-defined" it means "a type not defined in the standard" so std::vector<int> is not user-defined, neither is std::string, so you cannot specialize std::vector<int> or std::vector<std::string>. On the other hand, struct MyClass is user-defined, because it's not a type defined in the standard, so you can specialize std::vector<MyClass>.
This is not the same meaning of "user-defined" used in clauses 1-16, and that difference is confusing and silly. There is a defect report for this, with some discussion recorded that basically says "yes, the library uses the wrong term, but we don't have a better one".
So the answer to your question is "it depends". If you're talking to a C++ compiler implementor or a core language expert, std::vector<int> is definitely a user-defined type, but if you're talking to a standard library implementor, it is not. More precisely, it's not user-defined for the purposes of 17.6.4.2.1.
One way to look at it is that the standard library is "user code" as far as the core language is concerned. But the standard library has a different idea of "users" and considers itself to be part of the implementation, and only things that aren't part of the library are "user-defined".
Edit: I have proposed changing the library Clauses to use a new term, "program-defined", which means something defined in your program (as opposed to UDTs defined in the standard, such as std::string).
As users do not define std::vector<int>, and std::vector<int> is not dependent on any type a user defines, std::vector<int> is not a user-defined type.
The logical counterargument is that users do define std::vector<int>. You see, std::vector is a class template, and as such has no direct representation in binary code.
In a sense it gets its binary representation through the instantiation of a type, so the very act of declaring a std::vector<int> object is what gives "soul" to the template (pardon the phrasing). In a program where no one uses a std::vector<int>, this data type does not exist.
On the other hand, following the same argument, std::vector<T> is not a user-defined type; it is not even a type, it does not exist. Only if we want to instantiate a type will it mandate how a structure is laid out, but until then we can only argue about it in terms of structure, design, properties and so on.
Note
The above argument (about templates being not code but... well, templates for code) may seem a bit superficial, but it draws its logic from Scott Meyers's foreword to A. Alexandrescu's book Modern C++ Design. The relevant quote there goes like this:
Eventually, Andrei turned his attention to the development of template-based implementations of popular language idioms and design patterns, especially the GoF[*] patterns. This led to a brief skirmish with the Patterns community, because one of their fundamental tenets is that patterns cannot be represented in code. Once it became clear that Andrei was automating the generation of pattern implementations rather than trying to encode patterns themselves, that objection was removed, and I was pleased to see Andrei and one of the GoF (John Vlissides) collaborate on two columns in the C++ Report focusing on Andrei's work.
The draft standard contrasts fundamental types with user-defined types in a couple of (non-normative) places.
The draft standard also uses the term "user-defined" in other contexts, referring to entities created by the programmer or defined in the standard library. Examples include user-defined constructor, user-defined operator and user-defined conversion.
These facts allow us, absent other evidence, to tentatively assume that the intent of the standard is that user-defined type should mean compound type, in line with historical usage. Only an explicit clarification in a future standard document can definitively resolve the issue.
Note that the historical usage is not clear on types like int* or struct foo* or void(*)(struct foo****). They are compound, but should they (or some of them) be considered user-defined?

Specializing std::optional

Will it be possible to specialize std::optional for user-defined types? If not, is it too late to propose this to the standard?
My use case for this is an integer-like class that represents a value within a range. For instance, you could have an integer that lies somewhere in the range [0, 10]. Many of my applications are sensitive to even a single byte of overhead, so I would be unable to use a non-specialized std::optional due to the extra bool. However, a specialization for std::optional would be trivial for an integer that has a range smaller than its underlying type. We could simply store the value 11 in my example. This should provide no space or time overhead over a non-optional value.
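For concreteness, a rough sketch of the kind of specialization I have in mind (ranged_int here is a placeholder name for such an integer-like class, and only a fragment of optional's interface is shown):

    #include <optional>

    template <int Lo, int Hi>
    class ranged_int { /* holds an int known to be in [Lo, Hi] */ };

    namespace std {
        template <int Lo, int Hi>
        class optional<ranged_int<Lo, Hi>> {
            int raw = Hi + 1;   // Hi + 1 is outside the valid range, so it can encode "no value"
        public:
            constexpr optional() = default;
            constexpr bool has_value() const { return raw != Hi + 1; }
            // ... the rest of optional's interface ...
        };
    }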
Am I allowed to create this specialization in namespace std?
The general rule in 17.6.4.2.1 [namespace.std]/1 applies:
A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited.
So I would say it's allowed.
N.B. optional will not be part of the C++14 standard; it will be included in a separate Technical Specification on library fundamentals, so there is time to change the rule if my interpretation is wrong.
If you are after a library that efficiently packs the value and the "no-value" flag into one memory location, I recommend looking at compact_optional. It does exactly this.
It does not specialize boost::optional or std::experimental::optional but it can wrap them inside, giving you a uniform interface, with optimizations where possible and a fallback to 'classical' optional where needed.
I've asked about the same thing, regarding specializing optional<bool> and optional<tribool> among other examples, to only use one byte. While the "legality" of doing such things was not under discussion, I do think that one should not, in theory, be allowed to specialize optional<T>, in contrast to e.g. hash (which is explicitly allowed).
I don't have the logs with me but part of the rationale is that the interface treats access to the data as access to a pointer or reference, meaning that if you use a different data structure in the internals, some of the invariants of access might change; not to mention providing the interface with access to the data might require something like reinterpret_cast<(some_reference_type)>. Using a uint8_t to store a optional-bool, for example, would impose several extra requirements on the interface of optional<bool> that are different to the ones of optional<T>. What should the return type of operator* be, for example?
Basically, I'm guessing the idea is to avoid the whole vector<bool> fiasco again.
In your example, it might not be too bad, as the access type is still your_integer_type& (or pointer). But in that case, simply designing your integer type to allow for a "zombie" or "undetermined" value instead of relying on optional<> to do the job for you, with its extra overhead and requirements, might be the safest choice.
Make it easy to opt in to space savings
I have decided that this is a useful thing to do, but a full specialization is a little more work than necessary (for instance, getting operator= correct).
I have posted on the Boost mailing list a way to simplify the task of specializing, especially when you only want to specialize some instantiations of a class template.
http://boost.2283326.n4.nabble.com/optional-Specializing-optional-to-save-space-td4680362.html
My current interface involves a special tag type used to 'unlock' access to particular functions. I have creatively named this type optional_tag. Only optional can construct an optional_tag. For a type to opt in to a space-efficient representation, it needs the following member functions:
T(optional_tag) constructs an uninitialized value
initialize(optional_tag, Args && ...) constructs an object when there may be one in existence already
uninitialize(optional_tag) destroys the contained object
is_initialized(optional_tag) checks whether the object is currently in an initialized state
By always requiring the optional_tag parameter, we do not limit any function signatures. This is why, for instance, we cannot use operator bool() as the test, because the type may want that operator for other reasons.
An advantage of this over some other possible methods of implementing it is that you can make it work with any type that can naturally support such a state. It does not add any requirements such as having a move constructor.
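To make that concrete, a small hypothetical example of a type opting in (the member function names match the list above; the exact signatures in the linked implementation may differ, and optional_tag is shown as a plain struct only to keep the sketch self-contained):

    struct optional_tag { };   // in the real library, only optional can construct this

    class percentage {   // a value in [0, 100]; 255 is never a valid value
    public:
        explicit percentage(unsigned char value) : m_value(value) {}

        explicit percentage(optional_tag) : m_value(255) {}                      // uninitialized state
        void initialize(optional_tag, unsigned char value) { m_value = value; }  // (re)construct the value
        void uninitialize(optional_tag) { m_value = 255; }                       // destroy the contained value
        bool is_initialized(optional_tag) const { return m_value != 255; }

    private:
        unsigned char m_value;
    };

    // optional<percentage> can now be a single byte instead of two.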
You can see a full code implementation of the idea at
https://bitbucket.org/davidstone/bounded_integer/src/8c5e7567f0d8b3a04cc98142060a020b58b2a00f/bounded_integer/detail/optional/optional.hpp?at=default&fileviewer=file-view-default
and for a class using the specialization:
https://bitbucket.org/davidstone/bounded_integer/src/8c5e7567f0d8b3a04cc98142060a020b58b2a00f/bounded_integer/detail/class.hpp?at=default&fileviewer=file-view-default
(lines 220 through 242)
An alternative approach
This is in contrast to my previous implementation, which required users to specialize a class template. You can see the old version here:
https://bitbucket.org/davidstone/bounded_integer/src/2defec41add2079ba023c2c6d118ed8a274423c8/bounded_integer/detail/optional/optional.hpp
and
https://bitbucket.org/davidstone/bounded_integer/src/2defec41add2079ba023c2c6d118ed8a274423c8/bounded_integer/detail/optional/specialization.hpp
The problem with this approach is that it is simply more work for the user. Rather than adding four member functions, the user must go into a new namespace and specialize a template.
In practice, all specializations would have an in_place_t constructor that forwards all arguments to the underlying type. The optional_tag approach, on the other hand, can just use the underlying type's constructors directly.
In the specialize optional_storage approach, the user also has the responsibility of adding proper reference-qualified overloads of a value function. In the optional_tag approach, we already have the value so we do not have to pull it out.
optional_storage also required standardizing as part of the interface of optional two helper classes, only one of which the user is supposed to specialize (and sometimes delegate their specialization to the other).
The difference between this and compact_optional
compact_optional is a way of saying "Treat this special sentinel value as the type being not present, almost like a NaN". It requires the user to know that the type they are working with has some special sentinel. An easily specializable optional is a way of saying "My type does not need extra space to store the not present state, but that state is not a normal value." It does not require anyone to know about the optimization to take advantage of it; everyone who uses the type gets it for free.
The future
My goal is to get this first into boost::optional, and then part of the std::optional proposal. Until then, you can always use bounded::optional, although it has a few other (intentional) interface differences.
I don't see how allowing or not allowing some particular bit pattern to represent the unengaged state falls under anything the standard covers.
If you were trying to convince a library vendor to do this, it would require an implementation, exhaustive tests to show you haven't inadvertently blown any of the requirements of optional (or accidentally invoked undefined behavior) and extensive benchmarking to show this makes a notable difference in real world (and not just contrived) situations.
Of course, you can do whatever you want to your own code.

How is is_standard_layout useful?

From what I understand, standard layout allows three things:
Empty base class optimization
Backwards compatibility with C with certain pointer casts
Use of offsetof
Now, included in the library is the is_standard_layout predicate metafunction, but I can't see much use for it in generic code, as the C-related features listed above rarely seem to need checking in generic code. The only thing I can think of is using it inside a static_assert, but that only makes code more robust and isn't required.
How is is_standard_layout useful? Are there any things which would be impossible without it, thus requiring it in the standard library?
General response
It is a way of validating assumptions. You wouldn't want to write code that assumes standard layout if that wasn't the case.
C++11 provides a bunch of utilities like this. They are particularly valuable for writing generic code (templates) where you would otherwise have to trust the client code to not make any mistakes.
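For example (my own illustration, not something required by the standard), code that relies on offsetof or C interop can assert the assumption up front instead of silently depending on it:

    #include <cstddef>
    #include <cstdint>
    #include <type_traits>

    struct Packet {
        std::uint32_t id;
        std::uint16_t length;
    };

    // offsetof is only guaranteed for standard-layout types, so validate the assumption.
    static_assert(std::is_standard_layout<Packet>::value,
                  "Packet must be standard-layout for offsetof / C interop to be valid");

    constexpr std::size_t length_offset = offsetof(Packet, length);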
Notes specific to is_standard_layout
It looks to me like the definition of is_pod would roughly be:
    // note: the real rule also applies recursively to all non-static data members
    template <typename T>
    struct is_pod : std::integral_constant<bool,
        std::is_standard_layout<T>::value && std::is_trivial<T>::value> {};
So, you need to know is_standard_layout in order to implement is_pod. Given that, we might as well expose is_standard_layout as a tool available to library developers. Also of note: if you have a use-case for is_pod, you might want to consider the possibility that is_standard_layout might actually be a better (more accurate) choice in that case, since POD is essentially a subset of standard layout.
I get the feeling that they added every conceivable variant of type evaluation, regardless of any obvious value, just in case someone might encounter a need sometime before the next standard comes out. I doubt if piling on these "extra" type properties adds a significant additional burden to compiler developers.
There is a nice discussion of standard layout here: Why is C++11's POD "standard layout" definition the way it is?
There is also a lot of good detail at cppreference.com: Non-static data members