Is std::vector<T> a `user-defined type`?

In 17.6.4.2.1/1 and 17.6.4.2.1/2 of the current draft standard, restrictions are placed on specializations injected by users into namespace std.
The behavior of a C++ program is undefined if it adds declarations or definitions to namespace std or to a namespace within namespace std unless otherwise specified. A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited.
I cannot find where in the standard the phrase user-defined type is defined.
One option I have heard claimed is that a type for which std::is_fundamental is false is a user-defined type, in which case std::vector<int> would be a user-defined type.
An alternative answer would be that a user-defined type is a type that a user defines. As users do not define std::vector<int>, and std::vector<int> is not dependent on any type a user defines, std::vector<int> is not a user-defined type.
A practical problem this impacts is "can you inject a specialization of std::hash for std::tuple<Ts...> into namespace std?" Being able to do so is somewhat convenient -- the alternative is to create another namespace where we recursively build our hash for std::tuple (and possibly other types in std that do not have hash support), and if and only if we fail to find a hash in that namespace do we fall back on std.
However, if this is legal, then if and when the standard adds a hash specialization for std::tuple to namespace std, code that specialized it already would be broken, creating a reason not to add such specializations in the future.
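For concreteness, here is a rough sketch (C++17, using an inline boost-style hash combine; the combining constant and scheme are purely illustrative) of the kind of specialization being asked about:

    #include <cstddef>
    #include <functional>
    #include <tuple>
    #include <utility>

    namespace std {
        template <typename... Ts>
        struct hash<std::tuple<Ts...>> {
            std::size_t operator()(const std::tuple<Ts...>& t) const {
                return hash_impl(t, std::index_sequence_for<Ts...>{});
            }
        private:
            template <std::size_t... Is>
            static std::size_t hash_impl(const std::tuple<Ts...>& t,
                                         std::index_sequence<Is...>) {
                std::size_t seed = 0;
                // fold each element's std::hash into the seed, boost-style
                ((seed ^= std::hash<std::tuple_element_t<Is, std::tuple<Ts...>>>{}(std::get<Is>(t))
                          + 0x9e3779b9 + (seed << 6) + (seed >> 2)), ...);
                return seed;
            }
        };
    }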
While I am talking about std::vector<int> as a concrete example, I am trying to ask if types defined in std are ever user-defined types. A secondary question is, even if not, maybe std::tuple<int> becomes a user-defined type when used by a user (this gets slippery: what then happens if something inside std defines std::tuple<int>, and you partial-specialize hash for std::tuple<Ts...>?).
There is currently an open defect on this problem.

Prof. Stroustrup is very clear that any type that is not built-in is user-defined. See the second paragraph of section 9.1 in Programming Principles and Practice Using C++.
He even specifically calls out “standard library types” as an example of user-defined types. In other words, a user-defined type is any compound type.
Source
The article explicitly mentions that not everyone seems to agree, but this is IMHO mostly wishful thinking and not what the standard (and Prof. Stroustrup) are actually saying, only what some people want to read into it.

When Clause 17 says "user-defined" it means "a type not defined in the standard" so std::vector<int> is not user-defined, neither is std::string, so you cannot specialize std::vector<int> or std::vector<std::string>. On the other hand, struct MyClass is user-defined, because it's not a type defined in the standard, so you can specialize std::vector<MyClass>.
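For contrast, here is a minimal sketch of the uncontroversial case, where the specialization depends on a type the program itself defines (MyClass is just a stand-in name):

    #include <cstddef>
    #include <functional>
    #include <string>

    struct MyClass {            // defined by the program, not by the standard library
        std::string name;
    };

    namespace std {
        template <>
        struct hash<MyClass> {  // the declaration depends on a program-defined type
            std::size_t operator()(const MyClass& c) const noexcept {
                return std::hash<std::string>{}(c.name);
            }
        };
    }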
This is not the same meaning of "user-defined" used in clauses 1-16, and that difference is confusing and silly. There is a defect report for this, with some discussion recorded that basically says "yes, the library uses the wrong term, but we don't have a better one".
So the answer to your question is "it depends". If you're talking to a C++ compiler implementor or a core language expert, std::vector<int> is definitely a user-defined type, but if you're talking to a standard library implementor, it is not. More precisely, it's not user-defined for the purposes of 17.6.4.2.1.
One way to look at it is that the standard library is "user code" as far as the core language is concerned. But the standard library has a different idea of "users" and considers itself to be part of the implementation, and only things that aren't part of the library are "user-defined".
Edit: I have proposed changing the library Clauses to use a new term, "program-defined", which means something defined in your program (as opposed to UDTs defined in the standard, such as std::string).

As users do not define std::vector<int>, and std::vector<int> is not dependent on any type a user defines, std::vector<int> is not a user-defined type.
The logical counter-argument is that users do define std::vector<int>. You see, std::vector is a class template, and as such has no direct representation in binary code.
In a sense it gets its binary representation through the instantiation of a type, so the very act of declaring a std::vector<int> object is what gives "soul" to the template (pardon the phrasing). In a program where no one uses a std::vector<int>, this data type does not exist.
On the other hand, following the same argument, std::vector<T> is not a user-defined type; it is not even a type, and it does not exist. Only when we choose to instantiate a type does it mandate how a structure will be laid out; until then we can only argue about it in terms of structure, design, properties and so on.
Note
The above argument (about templates being not code but ... well, templates for code) may seem a bit superficial, but it draws its logic from Scott Meyers' introduction to A. Alexandrescu's book Modern C++ Design. The relevant quote goes like this:
Eventually, Andrei turned his attention to the development of template-based implementations of popular language idioms and design patterns, especially the GoF[*] patterns. This led to a brief skirmish with the Patterns community, because one of their fundamental tenets is that patterns cannot be represented in code. Once it became clear that Andrei was automating the generation of pattern implementations rather than trying to encode patterns themselves, that objection was removed, and I was pleased to see Andrei and one of the GoF (John Vlissides) collaborate on two columns in the C++ Report focusing on Andrei's work.

The draft standard contrasts fundamental types with user-defined types in a couple of (non-normative) places.
The draft standard also uses the term "user-defined" in other contexts, referring to entities created by the programmer or defined in the standard library. Examples include user-defined constructor, user-defined operator and user-defined conversion.
These facts allow us, absent other evidence, to tentatively assume that the intent of the standard is that user-defined type should mean compound type, according to historical usage. Only an explicit clarification in a future standard document can definitively resolve the issue.
Note that the historical usage is not clear on types like int* or struct foo* or void(*)(struct foo****). They are compound, but should they (or some of them) be considered user-defined?
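A few quick checks of the "not fundamental, therefore compound" reading (C++11 static_asserts, purely illustrative):

    #include <type_traits>
    #include <vector>

    static_assert(!std::is_fundamental<std::vector<int>>::value, "vector<int> is not fundamental");
    static_assert( std::is_compound<std::vector<int>>::value,    "vector<int> is compound");
    static_assert( std::is_compound<int*>::value,                "pointers are compound too");
    static_assert( std::is_fundamental<int>::value,              "int is fundamental");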


Why do I need to include <compare> header to get <=> to compile?

I know the technical answer is: because the standard says so.
But I am confused regarding the motivation:
I see nothing "library" in the defaulting the <=>: it may return some type that is technically defined in std but it is a "fake library" type in a sense that compiler must know about it since it must be able to default operator <=> with auto return type (not to mention that error messages in good compilers specify <compare> so it is clear that there is a language<=>library link here).
So I understand there is some library functionality that might require me to include <compare> but I do not understand why defaulting <=> requires me to include that header since compiler anyway has to know about everything needed to make the <=>.
Note: I know most of the time some other standard headers will include the <compare>, this is a question about language/library design, not that much about one extra line that C++ forces me to write without a good reason.
it may return some type that is technically defined in std but it is a "fake library" type in a sense
Well, <=> returns types that are very much real, that are actually defined in <compare> and implemented there. In the same way that an initializer list is used to construct a std::initializer_list<T>, which is very much a real type that is actually defined in <initializer_list>. And std::type_info in <typeinfo>.
And those comparison types - std::strong_ordering, std::weak_ordering, and std::partial_ordering (and originally also std::strong_equality and std::weak_equality) - themselves have non-trivial conversion semantics and other operations defined on them, that we may want to change in the future. They'd be very special language types indeed, where the convertibility only goes in one direction but in a way that's very much unlike inheritance (there are only three values for the total ordering types, but four for the partial one...). It's really much easier to define these as real library types and then specify their interaction as real library code.
that compiler must know about it since it must be able to default operator<=> with auto return type
Kind of, but not really. The compiler knows what the names of the types are, and how to produce values of them for the fundamental types, but it doesn't actually need to know anything more than that. The rule for the return type is basically hardcoded based on the types that the underlying members' <=>s return; the compiler doesn't need to know what those actual types look like to do that. And then you're just invoking functions that do... whatever.
The cost to you of having to include a header is typing #include <compare> and having the compiler parse it. The cost of the compiler having to synthesize these types is a cost that would have to be paid for every TU, whether or not it does any three-way comparisons. Plus, if/when we want to change these types, it's easier to change library types than language types anyway.
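A minimal illustration of the situation being discussed (C++20):

    #include <compare>   // without this include, defaulting operator<=> below is ill-formed

    struct Point {
        int x;
        int y;
        auto operator<=>(const Point&) const = default;   // deduced return type: std::strong_ordering
    };

    static_assert((Point{1, 2} <=> Point{1, 3}) == std::strong_ordering::less);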

What are the similarities and differences between C++'s concepts and Rust's traits?

In Rust, the main tool for abstraction is traits. In C++, there are two tools for abstraction: abstract classes and templates. To get rid of some of the disadvantages of using templates (e.g. hard-to-read error messages), C++ introduced concepts, which are "named sets of requirements".
Both features seem to be fairly similar:
Defining a trait/concept is done by listing requirements.
Both can be used to bound/restrict generic/template type parameters.
Rust traits and C++ templates with concepts are both monomorphized (I know Rust traits can also be used with dynamic dispatch, but that's a different story).
But from what I understand, there are also notable differences. For example, C++'s concepts seem to define a set of expressions that have to be valid instead of listing function signatures. But there is a lot of different and confusing information out there (maybe because concepts only land in C++20?). That's why I'd like to know: what exactly are the differences and similarities between C++'s concepts and Rust's traits?
Are there features that are only offered by either concepts or traits? E.g. what about Rust's associated types and consts? Or bounding a type by multiple traits/concepts?
Disclaimer: I have not yet used concepts, all I know about them was gleaned from the various proposals and cppreference, so take this answer with a grain of salt.
Run-Time Polymorphism
Rust Traits are used both for Compile-Time Polymorphism and, sometimes, Run-Time Polymorphism; Concepts are only about Compile-Time Polymorphism.
Structural vs Nominal.
The greatest difference between Concepts and Traits is that Concepts use structural typing whereas Traits use nominal typing:
In C++ a type never explicitly satisfies a Concept; it may "accidentally" satisfy it if it happens to satisfy all requirements.
In Rust a specific syntactic construct impl Trait for Type is used to explicitly indicate that a type implements a Trait.
There are a number of consequences; in general Nominal Typing is better from a maintainability point of view -- adding a requirement to a Trait -- whereas Structural Typing is better at bridging 3rd party libraries -- a type from library A can satisfy a Concept from library B without them being aware of each other.
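A tiny illustration of the structural side in C++20 (the names here are made up):

    template <typename T>
    concept Drawable = requires(T t) {
        t.draw();                    // any type with a callable draw() satisfies this
    };

    struct Circle {
        void draw();                 // never mentions Drawable, yet satisfies it
    };

    static_assert(Drawable<Circle>);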
Constraints
Traits are mandatory:
No method can be called on a variable of a generic type without this type being required to implement a trait providing the method.
Concepts are entirely optional:
A method can be called on a variable of a generic type without this type being required to satisfy any Concept, or being constrained in any way.
A method can be called on a variable of a generic type satisfying a Concept (or several) without that method being specified by any Concept or Constraint.
Constraints (see note) can be entirely ad-hoc, and specify requirements without using a named Concept; and once again, they are entirely optional.
Note: a Constraint is introduced by a requires clause and specifies either ad-hoc requirements or requirements based on Concepts.
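For example, a sketch of an ad-hoc requires clause with no named Concept involved (C++20):

    #include <concepts>

    template <typename T>
        requires requires(T a, T b) { { a + b } -> std::convertible_to<T>; }
    T sum(T a, T b) {
        return a + b;   // the body may also call things no Concept ever named
    }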
Requirements
The set of expressible requirements is different:
Concepts/Constraints work by substitution, so they allow the whole breadth of the language; requirements include: nested types/constants/variables, methods, fields, the ability to be used as an argument of another function/method, the ability to be used as a generic argument of another type, and combinations thereof.
Traits, by contrast, only allow a small set of requirements: associated types/constants, and methods.
Overload Selection
Rust has no concept of ad-hoc overloading; overloading only occurs via Traits, and specialization is not possible yet.
C++ Constraints can be used to "order" overloads from least specific to most specific, so the compiler can automatically select the most specific overload for which requirements are satisfied.
Note: prior to this, either SFINAE or tag-dispatching would be used in C++ to achieve the selection; calisthenics were required to work with open-ended overload sets.
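A small sketch of constraint-based overload selection (C++20; the more constrained candidate wins):

    #include <concepts>
    #include <iostream>

    template <std::integral T>
    void describe(T) { std::cout << "integral\n"; }   // more constrained

    template <typename T>
    void describe(T) { std::cout << "anything\n"; }   // unconstrained fallback

    int main() {
        describe(42);     // prints "integral"
        describe(3.14);   // prints "anything"
    }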
Disjunction
The requirement mechanisms in Rust are purely additive (conjunctions, aka &&); in contrast, in C++ requires clauses can contain disjunctions (aka ||). How to use this feature is not quite clear to me yet.
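For instance (C++20), a requires clause with a disjunction, something a purely additive bound cannot express directly:

    #include <concepts>

    template <typename T>
        requires std::integral<T> || std::floating_point<T>
    T twice(T x) {
        return x + x;   // accepted for either family of types
    }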

Will specialization of function templates in std for program-defined types no longer be allowed in C++20?

Quote from cppreference.com:
Adding template specializations
It is allowed to add template specializations for any standard library class (since C++20) template to the namespace std only if the declaration depends on at least one program-defined type and the specialization satisfies all requirements for the original template, except where such specializations are prohibited.
Does it mean that, starting from C++20, adding specializations of function templates to the std namespace for user-defined types will no longer be allowed? If so, it implies that many pieces of existing code can break, doesn't it? (It seems to me to be kind of a "radical" change.) Moreover, it will inject undefined behavior into such code, which will not trigger compilation errors (warnings, hopefully, will).
As it stands now, it definitely looks that way. Previously [namespace.std] contained
A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited.
While the current draft states
Unless explicitly prohibited, a program may add a template specialization for any standard library class template to namespace std provided that (a) the added declaration depends on at least one program-defined type and (b) the specialization meets the standard library requirements for the original template.
emphasis mine
And it looks like the paper Thou Shalt Not Specialize std Function Templates! by Walter E. Brown is responsible for it. In it he details a number of reasons why this should be changed, such as:
Herb Sutter: “specializations don’t participate in overloading. [...] If you want to customize a function base template and want that customization to participate in overload resolution (or, to always be used in the case of exact match), make it a plain old function, not a specialization. And, if you do provide overloads, avoid also providing specializations.”
David Abrahams: “it’s wrong to use function template specialization [because] it interacts in bad ways with overloads. [...] For example, if you specialize the regular std::swap for std::vector<mytype>&, your specialization won’t get chosen over the standard’s vector specific swap, because specializations aren’t considered during overload resolution.”
Howard Hinnant: “this issue has been settled for a long time. ... Disregard Dave’s expert opinion/answer in this area at your own peril.”
Eric Niebler: “[because of] the decidedly wonky way C++ resolves function calls in templates..., [w]e make an unqualified call to swap in order to find an overload that might be defined in [...] associated namespaces [...], and we do using std::swap so that, on the off-chance that there is no such overload, we find the default version defined in the std namespace.”
High Integrity C++ Coding Standard: “Overload resolution does not take into account explicit specializations of function templates. Only after overload resolution has chosen a function template will any explicit specializations be considered.”
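The customization style these quotes recommend instead, sketched with made-up names: a plain swap overload in the type's own namespace, found via ADL, rather than a specialization of std::swap:

    #include <utility>
    #include <vector>

    namespace mylib {
        struct Widget { std::vector<int> data; };

        // a plain overload in the type's own namespace, found by ADL
        void swap(Widget& a, Widget& b) noexcept {
            a.data.swap(b.data);
        }
    }

    template <typename T>
    void reset_to_default(T& value) {
        T fresh{};
        using std::swap;     // fall back to std::swap if no better overload exists
        swap(value, fresh);  // unqualified call: ADL finds mylib::swap for Widget
    }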
Not really that radical. This change is based on this paper from Walter E. Brown. The paper goes into rationale rather deeply, but ultimately it boils down to this:
1) Specialization of function templates is rather poor as a customization point. Overloading and ADL are much better in that regard. There are other customization points discussed in the paper as well.
2) The standard library doesn't rely much on this poor customization point anyway.
3) The wording change that's put into place actually permits adding entire declarations to namespace std (not just specializations) where it's explicitly permitted. So now there are better customization points.
Given #1 and #2, it's rather unlikely existing code will break. Or at least, not enough for this to be a major problem. Code that used auto and register also "broke" in the past, but that minuscule amount of C++ code didn't stop progress.

Why isn't std::initializer_list a language built-in?

Why isn't std::initializer_list a core-language built-in?
It seems to me that it's quite an important feature of C++11 and yet it doesn't have its own reserved keyword (or something alike).
Instead, initializer_list is just a template class from the standard library that has a special, implicit mapping from the new braced-init-list {...} syntax that's handled by the compiler.
At first thought, this solution is quite hacky.
Is this the way new additions to the C++ language will be now implemented: by implicit roles of some template classes and not by the core language?
Please consider these examples:
widget<int> w = {1,2,3}; //this is how we want to use a class
why was a new class chosen:
widget( std::initializer_list<T> init )
instead of using something similar to any of these ideas:
widget( T[] init, int length ) // (1)
widget( T... init ) // (2)
widget( std::vector<T> init ) // (3)
(1) a classic array; you could probably add const here and there
(2) the three dots already exist in the language (var-args, now variadic templates), so why not re-use the syntax (and make it feel built-in)?
(3) just an existing container; could add const and &
All of them are already a part of the language. I only wrote my first 3 ideas; I am sure that there are many other approaches.
There were already examples of "core" language features that returned types defined in the std namespace. typeid returns std::type_info and (stretching a point perhaps) sizeof returns std::size_t.
In the former case, you already need to include a standard header in order to use this so-called "core language" feature.
Now, for initializer lists it happens that no keyword is needed to generate the object; the syntax is context-sensitive curly braces. Aside from that, it's the same as type_info. Personally, I don't think the absence of a keyword makes it "more hacky". Slightly more surprising, perhaps, but remember that the objective was to allow the same braced-initializer syntax that was already allowed for aggregates.
So yes, you can probably expect more of this design principle in future:
if more occasions arise where it is possible to introduce new features without new keywords then the committee will take them.
if new features require complex types, then those types will be placed in std rather than as builtins.
Hence:
if a new feature requires a complex type and can be introduced without new keywords then you'll get what you have here, which is "core language" syntax with no new keywords and that uses library types from std.
What it comes down to, I think, is that there is no absolute division in C++ between the "core language" and the standard libraries. They're different chapters in the standard but each references the other, and it has always been so.
There is another approach in C++11, which is that lambdas introduce objects that have anonymous types generated by the compiler. Because they have no names they aren't in a namespace at all, certainly not in std. That's not a suitable approach for initializer lists, though, because you use the type name when you write the constructor that accepts one.
The C++ Standard Committee seems to prefer not to add new keywords, probably because that increases the risk of breaking existing code (legacy code could use that keyword as the name of a variable, a class, or whatever else).
Moreover, it seems to me that defining std::initializer_list as a templated container is quite an elegant choice: if it were a keyword, how would you access its underlying type? How would you iterate through it? You would need a bunch of new operators as well, and that would just force you to remember more names and more keywords to do the same things you can do with standard containers.
Treating an std::initializer_list as any other container gives you the opportunity of writing generic code that works with any of those things.
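For instance, an initializer_list parameter works with the usual iterator-based algorithms just as any other container would (a trivial sketch):

    #include <initializer_list>
    #include <numeric>

    int sum(std::initializer_list<int> values) {
        return std::accumulate(values.begin(), values.end(), 0);
    }

    int total = sum({1, 2, 3, 4});   // the braced list becomes the initializer_list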
UPDATE:
Then why introduce a new type, instead of using some combination of existing? (from the comments)
To begin with, all other containers have methods for adding, removing, and emplacing elements, which are not desirable for a compiler-generated collection. The only exception is std::array<>, which wraps a fixed-size C-style array and would therefore remain the only reasonable candidate.
However, as Nicol Bolas correctly points out in the comments, another, fundamental difference between std::initializer_list and all other standard containers (including std::array<>) is that the latter ones have value semantics, while std::initializer_list has reference semantics. Copying an std::initializer_list, for instance, won't cause a copy of the elements it contains.
Moreover (once again, courtesy of Nicol Bolas), having a special container for brace-initialization lists allows overloading on the way the user is performing initialization.
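A short sketch of that last point, using an invented Buffer class: the constructor chosen depends on whether the caller uses parentheses or braces:

    #include <cstddef>
    #include <initializer_list>
    #include <vector>

    struct Buffer {
        std::vector<int> data;
        explicit Buffer(std::size_t n) : data(n) {}              // Buffer b(3); -> three zero-valued elements
        Buffer(std::initializer_list<int> init) : data(init) {}  // Buffer b{3}; -> one element with value 3
    };

std::vector itself behaves this way: std::vector<int>(3) has three elements, while std::vector<int>{3} has one.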
This is nothing new. For example, for (auto i : some_container) relies on the existence of specific methods or standalone functions in the some_container class. C# relies even more on its .NET libraries. Actually, I think that this is quite an elegant solution, because you can make your classes compatible with some language structures without complicating the language specification.
This is indeed nothing new and how many have pointed out, this practice was there in C++ and is there, say, in C#.
Andrei Alexandrescu has mentioned a good point about this though: You may think of it as a part of imaginary "core" namespace, then it'll make more sense.
So, it's actually something like: core::initializer_list, core::size_t, core::begin(), core::end() and so on. This is just an unfortunate coincidence that std namespace has some core language constructs inside it.
Not only can it work completely in the standard library. Inclusion into the standard library does not mean that the compiler can not play clever tricks.
While it may not be able to in all cases, it may very well say: this type is well known, or a simple type, lets ignore the initializer_list and just have a memory image of what the initialized value should be.
In other words, int i {5}; can be equivalent to int i(5); or int i = 5; or even intwrapper iw {5}; where intwrapper is a simple wrapper class over an int with a trivial constructor taking an initializer_list.
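A sketch of the intwrapper mentioned above (the name is taken from the answer; the class is purely illustrative):

    #include <initializer_list>

    struct intwrapper {
        int value;
        intwrapper(std::initializer_list<int> init) : value(*init.begin()) {}
    };

    intwrapper iw {5};   // a compiler is free to boil this down to a plain store of 5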
It's not part of the core language because it can be implemented entirely in the library, just like operator new and operator delete. What advantage would there be in making compilers more complicated to build it in?

Which standard C++ classes cannot be reimplemented in C++?

I was looking through the plans for C++0x and came upon std::initializer_list for implementing initializer lists in user classes. This class could not be implemented in C++
without using itself, or else using some "compiler magic". If it could, it wouldn't be needed since whatever technique you used to implement initializer_list could be used to implement initializer lists in your own class.
What other classes require some form of "compiler magic" to work? Which classes are in the Standard Library that could not be implemented by a third-party library?
Edit: Maybe instead of implemented, I should say instantiated. It's more the fact that this class is so directly linked with a language feature (you can't use initializer lists without initializer_list).
A comparison with C# might clear up what I'm wondering about: IEnumerable and IDisposable are actually hard-coded into language features. I had always assumed C++ was free of this, since Stroustrup tried to make everything implementable in libraries. So, are there any other classes / types that are inextricably bound to a language feature?
std::type_info is a simple class, although populating it requires the typeid operator: a compiler construct.
Likewise, exceptions are normal objects, but throwing exceptions requires compiler magic (where are the exceptions allocated?).
The question, to me, is "how close can we get to std::initializer_lists without compiler magic?"
Looking at Wikipedia, std::initializer_list<T> can be initialized by something that looks a lot like an array literal. Let's try giving our std::initializer_list<T> a conversion constructor that takes an array (i.e., a constructor that takes a single argument of type T[]):
namespace std {
    template<typename T> class initializer_list {
        const T* internal_array;   // an array argument decays to a pointer
    public:
        initializer_list(const T other_array[]) : internal_array(other_array) { }
        // ... other methods needed to actually access internal_array
    };
}
Likewise, a class that uses a std::initializer_list does so by declaring a constructor that takes a single std::initializer_list argument -- a.k.a. a conversion constructor:
struct my_class {
    // ...
    my_class(std::initializer_list<int>);
    // ...
};
So the line:
my_class m = {1, 2, 3};
Causes the compiler to think: "I need to call a constructor for my_class; my_class has a constructor that takes a std::initializer_list<int>; I have an int[] literal; I can convert an int[] to a std::initializer_list<int>; and I can pass that to the my_class constructor" (please read to the end of the answer before telling me that C++ doesn't allow two implicit user-defined conversions to be chained).
So how close is this? First, I'm missing a few features/restrictions of initializer lists. One thing I don't enforce is that initializer lists can only be constructed with array literals, while my initializer_list would also accept an already-created array:
int arry[] = {1, 2, 3};
my_class m = arry;
Additionally, I didn't bother messing with rvalue references.
Finally, this class only works as the new standard says it should if the compiler implicitly chains two user-defined conversions together. This is specifically prohibited under normal cases, so the example still needs compiler magic. But I would argue that (1) the class itself is a normal class, and (2) the magic involved (enforcing the "array literal" initialization syntax and allowing two user-defined conversions to be implicitly chained) is less than it seems at first glance.
The only other one I could think of was the type_info class returned by typeid. As far as I can tell, VC++ implements this by instantiating all the needed type_info classes statically at compile time, and then simply casting a pointer at runtime based on values in the vtable. These are things that could be done using C code, but not in a standard-conforming or portable way.
All classes in the standard library, by definition, must be implemented in C++. Some of them hide some obscure language/compiler constructs, but still are just wrappers around that complexity, not language features.
Anything that the runtime "hooks into" at defined points is likely not to be implementable as a portable library in the hypothetical language "C++, excluding that thing".
So for instance I think atexit() in <cstdlib> can't be implemented purely as a library, since there is no other way in C++ to ensure it is called at the right time in the termination sequence, that is before any global destructor.
Of course, you could argue that C features "don't count" for this question. In which case std::unexpected may be a better example, for exactly the same reason. If it didn't exist, there would be no way to implement it without tinkering with the exception code emitted by the compiler.
[Edit: I just noticed the questioner actually asked what classes can't be implemented, not what parts of the standard library can't be implemented. So actually these examples don't strictly answer the question.]
C++ allows compilers to define otherwise undefined behavior. This makes it possible to implement the Standard Library in non-standard C++. For instance, "onebyone" wonders about atexit(). The library writers can assume things about the compiler that makes their non-portable C++ work OK for their compiler.
MSalters points out printf/cout/stdout in a comment. You could implement any one of them in terms of one of the others (I think), but you can't implement the whole set of them together without OS calls or compiler magic, because:
These are all the ways of accessing the process's standard output stream. You have to stuff the bytes somewhere, and that's implementation-specific in the absence of these things. Unless I've forgotten another way of accessing it, but the point is you can't implement standard output other than through implementation-specific "magic".
They have "magic" behaviour in the runtime, which I think could not be perfectly imitated by a pure library. For example, you couldn't just use static initialization to construct cout, because the order of static initialization between compilation units is not defined, so there would be no guarantee that it would exist in time to be used by other static initializers. stdout is perhaps easier, since it's just fd 1, so any apparatus supporting it can be created by the calls it's passed into when they see it.
I think you're pretty safe on this score. C++ mostly serves as a thick layer of abstraction around C. Since C++ is also a superset of C itself, the core language primitives are almost always implemented sans-classes (in a C-style). In other words, you're not going to find many situations like Java's Object which is a class which has special meaning hard-coded into the compiler.
Again from C++0x, I think that threads would not be implementable as a portable library in the hypothetical language "C++0x, with all the standard libraries except threads".
[Edit: just to clarify, there seems to be some disagreement as to what it would mean to "implement threads". What I understand it to mean in the context of this question is:
1) Implement the C++0x threading specification (whatever that turns out to be). Note C++0x, which is what I and the questioner are both talking about. Not any other threading specification, such as POSIX.
2) without "compiler magic". This means not adding anything to the compiler to help your implementation work, and not relying on any non-standard implementation details (such as a particular stack layout, or a means of switching stacks, or non-portable system calls to set a timed interrupt) to produce a thread library that works only on a particular C++ implementation. In other words: pure, portable C++. You can use signals and setjmp/longjmp, since they are portable, but my impression is that's not enough.
3) Assume a C++0x compiler, except that it's missing all parts of the C++0x threading specification. If all it's missing is some data structure (that stores an exit value and a synchronisation primitive used by join() or equivalent), but the compiler magic to implement threads is present, then obviously that data structure could be added as a third-party portable component. But that's kind of a dull answer, when the question was about which C++0x standard library classes require compiler magic to support them. IMO.]