What are the new features in C++17? - c++

C++17 is now feature complete, so it is unlikely to see large changes. Hundreds of proposals were put forward for C++17.
Which of those features were added to C++ in C++17?
When using a C++ compiler that supports "C++1z", which of those features are going to be available when the compiler updates to C++17?

Language features:
Templates and Generic Code
Template argument deduction for class templates
Like how functions deduce template arguments, now constructors can deduce the template arguments of the class
http://wg21.link/p0433r2 http://wg21.link/p0620r0 http://wg21.link/p0512r0
template <auto>
Represents a value of any (non-type template argument) type.
Non-type template arguments fixes
template<template<class...>typename bob> struct foo {}
( Folding + ... + expressions ) and Revisions
auto x{8}; is an int
modernizing using with ... and lists
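A minimal sketch of three of the template features above, assuming a C++17 compiler. The names sum_all, constant and ctad_deduces_pair are made up for illustration and do not come from the papers.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Fold expression: sum_all(a, b, c) expands to (((0 + a) + b) + c).
template <typename... Ts>
constexpr auto sum_all(Ts... xs) {
    return (0 + ... + xs);
}

// template <auto>: the type of the non-type parameter is deduced from the argument.
template <auto V>
struct constant {
    static constexpr auto value = V;
};

// Class template argument deduction: std::pair<int, double> is deduced
// from the constructor arguments, no std::make_pair needed.
inline bool ctad_deduces_pair() {
    std::pair p{1, 2.5};
    return std::is_same_v<decltype(p), std::pair<int, double>>;
}
```

With an empty pack the fold above simply yields the initial 0, which is why the binary form (0 + ... + xs) is used rather than the unary one.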
Lambda
constexpr lambdas
Lambdas are implicitly constexpr if they qualify
Capturing *this in lambdas
[*this]{ std::cout << could << " be " << useful << '\n'; }
Attributes
[[fallthrough]], [[nodiscard]], [[maybe_unused]] attributes
[[attributes]] on namespaces and enum { erator[[s]] }
using in attributes to avoid having to repeat an attribute namespace.
Compilers are now required to ignore non-standard attributes they don't recognize.
The C++14 wording allowed compilers to reject unknown scoped attributes.
Syntax cleanup
Inline variables
Like inline functions
Compiler picks where the instance is instantiated
Deprecate static constexpr redeclaration, now implicitly inline.
namespace A::B
Simple static_assert(expression); with no string
No dynamic exception specifications except throw(), and throw() now means noexcept(true).
Cleaner multi-return and flow control
Structured bindings
Basically, first-class std::tie with auto
Example:
const auto [it, inserted] = map.insert( {"foo", bar} );
Creates variables it and inserted with deduced type from the pair that map::insert returns.
Works with tuple/pair-likes & std::arrays and relatively flat structs
Actually named structured bindings in standard
if (init; condition) and switch (init; condition)
if (const auto [it, inserted] = map.insert( {"foo", bar} ); inserted)
Extends the if(decl) to cases where decl isn't convertible-to-bool sensibly.
Generalizing range-based for loops
Appears to be mostly support for sentinels, or end iterators that are not the same type as begin iterators, which helps with null-terminated loops and the like.
if constexpr
Much requested feature to simplify almost-generic code.
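A small sketch of how structured bindings, if (init; condition) and if constexpr read in practice. The helper names insert_once and length_of are hypothetical and only serve the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>

// Returns true if the key was newly inserted, false if it already existed.
inline bool insert_once(std::map<std::string, int>& m, const std::string& key, int value) {
    // if (init; condition): 'it' and 'inserted' are scoped to the if/else.
    if (const auto [it, inserted] = m.insert({key, value}); inserted) {
        return true;
    } else {
        // 'it' points at the element that blocked the insertion.
        assert(it->first == key);
        return false;
    }
}

// if constexpr: only the branch matching T is instantiated, so calling
// .size() on an int is never compiled.
template <typename T>
std::size_t length_of(const T& x) {
    if constexpr (std::is_arithmetic_v<T>) {
        return 1;
    } else {
        return x.size();
    }
}
```

With a plain if instead of if constexpr, length_of(7) would fail to compile because both branches would be instantiated for int.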
Misc
Hexadecimal floating-point literals
Dynamic memory allocation for over-aligned data
Guaranteed copy elision
Finally!
Not in all cases, but distinguishes syntax where you are "just creating something" that was called elision, from "genuine elision".
Fixed order-of-evaluation for (some) expressions with some modifications
Not including function arguments, but function argument evaluation interleaving now banned
Makes a bunch of broken code work mostly, and makes .then on future work.
Direct list-initialization of enums
Forward progress guarantees (FPG) (also, FPGs for parallel algorithms)
I think this is saying "the implementation may not stall threads forever"?
u8'U', u8'T', u8'F', u8'8' character literals (string already existed)
"noexcept" in the type system
__has_include
Test if a header file include would be an error
makes migrating from experimental to std almost seamless
Arrays of pointer conversion fixes
inherited constructors fixes to some corner cases (see P0136R0 for examples of behavior changes)
aggregate initialization with inheritance.
std::launder, type punning, etc
Library additions:
Data types
std::variant<Ts...>
Almost-always non-empty last I checked?
Tagged union type
{awesome|useful}
std::optional
Maybe holds one of something
Ridiculously useful
std::any
Holds one of anything (that is copyable)
std::string_view
std::string like reference-to-character-array or substring
Never take a string const& again. Also can make parsing a bajillion times faster.
"hello world"sv
constexpr char_traits
std::byte off more than they could chew.
Neither an integer nor a character, just data
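A rough sketch of how three of these data types might be used together. The functions parse_digit, first_word and describe are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <type_traits>
#include <variant>

// std::optional: "maybe holds one of something". Empty result on failure.
inline std::optional<int> parse_digit(std::string_view s) {
    if (s.size() == 1 && s[0] >= '0' && s[0] <= '9') {
        return s[0] - '0';
    }
    return std::nullopt;
}

// std::string_view: a non-owning view; taking a substring copies no characters.
inline std::string_view first_word(std::string_view s) {
    const std::size_t space = s.find(' ');
    return s.substr(0, space);  // npos as the count means "to the end"
}

// std::variant: a tagged union; std::visit dispatches on the active alternative.
inline std::string describe(const std::variant<int, std::string>& v) {
    return std::visit(
        [](const auto& x) -> std::string {
            using T = std::decay_t<decltype(x)>;
            if constexpr (std::is_same_v<T, int>) {
                return "int";
            } else {
                return "string";
            }
        },
        v);
}
```

One caveat worth keeping in mind: a string_view does not own its characters, so the view returned by first_word is only valid while the original buffer is alive.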
Invoke stuff
std::invoke
Call any callable (function pointer, function, member pointer) with one syntax. From the standard INVOKE concept.
std::apply
Takes a function-like and a tuple, and unpacks the tuple into the call.
std::make_from_tuple, std::apply applied to object construction
is_invocable, is_invocable_r, invoke_result
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0077r2.html
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0604r0.html
Deprecates result_of
is_invocable<Foo, Args...> is "can you call Foo with Args...", and is_invocable_r<R, Foo, Args...> adds "and get something compatible with R".
invoke_result_t<Foo, Args...> is std::result_of_t<Foo(Args...)> but apparently less confusing?
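A short sketch of the invoke family, using the C++17 spellings of the traits. Counter, twice and the two example functions are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <tuple>
#include <type_traits>

struct Counter {
    int value = 0;
    int add(int n) { return value += n; }
};

inline int twice(int x) { return 2 * x; }

// std::invoke gives free functions, member functions and data members one call syntax.
inline int invoke_examples() {
    Counter c;
    int a = std::invoke(twice, 4);             // twice(4) == 8
    int b = std::invoke(&Counter::add, c, 3);  // c.add(3) == 3
    int d = std::invoke(&Counter::value, c);   // c.value == 3
    return a + b + d;                          // 14
}

// std::apply unpacks a tuple into the argument list of a callable.
inline int apply_example() {
    return std::apply([](int x, int y, int z) { return x * y + z; },
                      std::make_tuple(2, 3, 4));  // 2 * 3 + 4 == 10
}

// The traits take the callable type first, then the argument types.
static_assert(std::is_invocable_v<decltype(&twice), int>);
static_assert(std::is_invocable_r_v<double, decltype(&twice), int>);
static_assert(std::is_same_v<std::invoke_result_t<decltype(&twice), int>, int>);
```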
File System TS v1
[class.path]
[class.filesystem.error]
[class.file_status]
[class.directory_entry]
[class.directory_iterator] and [class.recursive_directory_iterator]
[fs.ops.funcs]
fstreams can be opened with paths, as well as with const path::value_type* strings.
New algorithms
for_each_n
reduce
transform_reduce
exclusive_scan
inclusive_scan
transform_exclusive_scan
transform_inclusive_scan
Added for threading purposes, exposed even if you aren't using them threaded
Threading
std::shared_mutex
Untimed, which can be more efficient if you don't need it.
atomic<T>::is_always_lock_free
scoped_lock<Mutexes...>
Saves some std::lock pain when locking more than one mutex at a time.
Parallelism TS v1
The linked paper from 2014, may be out of date
Parallel versions of std algorithms, and related machinery
hardware_*_interference_size
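To illustrate the scoped_lock entry, here is a minimal sketch assuming a hypothetical Account type with one mutex per object. It is not taken from any of the papers.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical type for the example: each account carries its own mutex.
struct Account {
    std::mutex m;
    int balance = 0;
};

// Locks both mutexes using a deadlock-avoidance algorithm (as std::lock does)
// and unlocks them when 'guard' goes out of scope.
// Assumes 'from' and 'to' are distinct accounts.
inline void transfer(Account& from, Account& to, int amount) {
    std::scoped_lock guard(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}
```

Before C++17 the same effect needed std::lock(from.m, to.m) followed by two lock_guard objects constructed with std::adopt_lock, which is the "pain" the list refers to.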
(parts of) Library Fundamentals TS v1 not covered above or below
[func.searchers] and [alg.search]
A searching algorithm and techniques
[pmr]
Polymorphic allocator, like std::function for allocators
And some standard memory resources to go with it.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0358r1.html
std::sample, sampling from a range?
Container Improvements
try_emplace and insert_or_assign
gives better guarantees in some cases where spurious move/copy would be bad
Splicing for map<>, unordered_map<>, set<>, and unordered_set<>
Move nodes between containers cheaply.
Merge whole containers cheaply.
non-const .data() for string.
non-member std::size, std::empty, std::data
like std::begin/end
Minimal incomplete type support in containers
Contiguous iterator "concept"
constexpr iterators
The emplace family of functions now returns a reference to the created object.
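A sketch of try_emplace, insert_or_assign and node splicing on std::map. The function name container_examples is made up; the asserts document the behaviour being demonstrated.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

inline void container_examples() {
    std::map<std::string, std::unique_ptr<int>> m;

    // try_emplace: if the key already exists, the arguments are not moved from.
    auto p = std::make_unique<int>(1);
    m.try_emplace("a", std::move(p));
    auto q = std::make_unique<int>(2);
    auto [it, inserted] = m.try_emplace("a", std::move(q));
    assert(!inserted && q != nullptr);  // q still owns its int
    assert(*it->second == 1);

    // insert_or_assign: inserts or overwrites, and reports which one happened.
    std::map<std::string, int> counts;
    counts.insert_or_assign("x", 1);
    auto [pos, was_inserted] = counts.insert_or_assign("x", 2);
    assert(!was_inserted && pos->second == 2);

    // Splicing: extract() detaches a node without copying the element;
    // the key can be changed before re-inserting it.
    auto node = counts.extract("x");
    node.key() = "y";
    counts.insert(std::move(node));
    assert(counts.count("x") == 0 && counts.at("y") == 2);

    // merge() moves over all nodes whose keys are not already present.
    std::map<std::string, int> other{{"y", 99}, {"z", 3}};
    counts.merge(other);
    assert(counts.size() == 2 && counts.at("y") == 2);
    assert(other.size() == 1 && other.count("y") == 1);  // the duplicate stays behind
}
```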
Smart pointer changes
unique_ptr<T[]> fixes and other unique_ptr tweaks.
weak_from_this and some fixes to shared_from_this
Other std datatype improvements:
{} construction of std::tuple and other improvements
TriviallyCopyable reference_wrapper, can be performance boost
Misc
C++17 library is based on C11 instead of C99
Reserved std[0-9]+ for future standard libraries
destroy(_at|_n), uninitialized_move(_n), uninitialized_value_construct(_n), uninitialized_default_construct(_n)
utility code already in most std implementations exposed
Special math functions
scientists may like them
std::clamp()
std::clamp( a, b, c ) == std::max( b, std::min( a, c ) ) roughly
gcd and lcm
std::uncaught_exceptions
Required if you want to only throw if safe from destructors
std::as_const
std::bool_constant
A whole bunch of _v template variables
std::void_t<T>
Surprisingly useful when writing templates
std::owner_less<void>
like std::less<void>, but for smart pointers, to sort based on ownership rather than on the stored pointer value
std::chrono polish
std::conjunction, std::disjunction, std::negation exposed
std::not_fn
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0358r1.html
Rules for noexcept within std
std::is_contiguous_layout, useful for efficient hashing
std::to_chars/std::from_chars, high performance, locale agnostic number conversion; finally a way to serialize/deserialize to human readable formats (JSON & co)
std::default_order, indirection over std::less. (breaks ABI of some compilers due to name mangling, removed.)
memory_order_consume, added language to prefer use of memory_order_acquire
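A brief sketch of a few of the Misc entries (to_chars/from_chars for integers, clamp, gcd and lcm). The wrapper names int_to_string, string_to_int and misc_examples are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <charconv>
#include <numeric>
#include <string>
#include <system_error>

// Locale-independent integer formatting into a caller-provided buffer.
inline std::string int_to_string(int value) {
    char buf[16];  // enough for a 32-bit int including the sign
    auto [ptr, ec] = std::to_chars(buf, buf + sizeof buf, value);
    assert(ec == std::errc{});
    return std::string(buf, ptr);
}

// Parses a whole string as a base-10 int; returns false on any error
// or if trailing characters remain.
inline bool string_to_int(const std::string& s, int& out) {
    const char* first = s.data();
    const char* last = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(first, last, out);
    return ec == std::errc{} && ptr == last;
}

inline void misc_examples() {
    assert(std::clamp(15, 0, 10) == 10);  // arguments are value, low, high
    assert(std::clamp(-3, 0, 10) == 0);
    assert(std::gcd(12, 18) == 6);
    assert(std::lcm(4, 6) == 12);
}
```

Neither to_chars nor from_chars allocates or throws; errors are reported through the returned std::errc value.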
Traits
swap
is_aggregate
has_unique_object_representations
Deprecated
Some C libraries,
<codecvt>
result_of, replaced with invoke_result
shared_ptr::unique, it isn't very threadsafe
Isocpp.org has an independent list of changes since C++14; it has been partly pillaged.
Naturally TS work continues in parallel, so there are some TS that are not-quite-ripe that will have to wait for the next iteration. The target for the next iteration is C++20 as previously planned, not C++19 as some rumors implied. C++1O has been avoided.
Initial list taken from this reddit post and this reddit post, with links added via googling or from the above isocpp.org page.
Additional entries pillaged from SD-6 feature-test list.
clang's feature list and library feature list are next to be pillaged. This doesn't seem to be reliable, as it is C++1z, not C++17.
these slides had some features missing elsewhere.
While "what was removed" was not asked, here is a short list of a few things ((mostly?) previously deprecated) that are removed in C++17 from C++:
Removed:
register, keyword reserved for future use
bool b; ++b;
trigraphs
if you still need them, they are now part of your source file encoding, not part of language
ios aliases
auto_ptr, old <functional> stuff, random_shuffle
allocators in std::function
There were rewordings. I am unsure if these have any impact on code, or if they are just cleanups in the standard:
Papers not yet integrated into above:
P0505R0 (constexpr chrono)
P0418R2 (atomic tweaks)
P0512R0 (template argument deduction tweaks)
P0490R0 (structured binding tweaks)
P0513R0 (changes to std::hash)
P0502R0 (parallel exceptions)
P0509R1 (updating restrictions on exception handling)
P0012R1 (make exception specifications be part of the type system)
P0510R0 (restrictions on variants)
P0504R0 (tags for optional/variant/any)
P0497R0 (shared ptr tweaks)
P0508R0 (structured bindings node handles)
P0521R0 (shared pointer use count and unique changes?)
Spec changes:
exception specs and throw expressions
Further reference:
papers grouped by year; not all accepted
https://isocpp.org/files/papers/p0636r0.html
Should be updated to "Modifications to existing features" here.

Related

Why does std::unique_lock use type tags to differentiate constructors?

In C++11, the std::unique_lock constructor is overloaded to accept the type tags defer_lock_t, try_to_lock_t, and adopt_lock_t:
unique_lock( mutex_type& m, std::defer_lock_t t );
unique_lock( mutex_type& m, std::try_to_lock_t t );
unique_lock( mutex_type& m, std::adopt_lock_t t );
These are empty classes (type tags) defined as follows:
struct defer_lock_t { };
struct try_to_lock_t { };
struct adopt_lock_t { };
This allows the user to disambiguate between the three constructors by passing one of the pre-defined instances of these classes:
constexpr std::defer_lock_t defer_lock {};
constexpr std::try_to_lock_t try_to_lock {};
constexpr std::adopt_lock_t adopt_lock {};
I am surprised that this is not implemented as an enum. As far as I can tell, using an enum would:
be simpler to implement
not change the syntax
allow the argument to be changed at runtime (albeit not very useful in this case).
(probably) could be inlined by the compiler with no performance hit
Why does the standard library use type tags, instead of an enum, to disambiguate these constructors? Perhaps more importantly, should I also prefer to use type tags in this situation when writing my own C++ code?
Tag dispatching
It is a technique known as tag dispatching. It allows the appropriate constructor to be called given the behaviour required by the client.
The reason for tags is that the types used for the tags are thus unrelated and will not conflict during overload resolution. Types (and not values as in the case of enums) are used to resolve overloaded functions. In addition, tags can be used to resolve calls that would otherwise have been ambiguous; in this case the tags would typically be based on some type trait(s).
Tag dispatching with templates means that only code that is required to be used given the construction is required to be implemented.
Tag dispatching allows for easier-to-read code (in my opinion at least) and simpler library code; the constructor won't have a switch statement and the invariants can be established in the initialiser list, based on these arguments, before executing the constructor body itself. Sure, your mileage may vary but this has been my general experience using tags.
Boost.org has a write up on the tag dispatching technique. It has a history of use that seems to go back at least as far as the SGI STL.
Why use it?
Why does the standard library use type tags, instead of an enum, to disambiguate these constructors?
Types would be more powerful and flexible when used during overload resolution and the possible implementation than enums; bear in mind the enums were originally unscoped and limited in how they could be used (by contrast to the tags).
Additional noteworthy reasons for tags:
Compile time decisions can be made over which constructor to use, and not runtime.
Disallows more "hacky" code where an integer is cast to the enum type with a value that is not catered for - design decisions would need to be made on how to handle this, and then code implemented to cater for any resultant exceptions or errors.
Keep in mind that the shared_lock and lock_guard also use these tags, but in the case of the lock_guard, only the adopt_lock is used. An enumeration would introduce more potential error conditions.
I think precedent and history also play a role here. Given the widespread use in the Standard Library and elsewhere, it is unlikely that the way situations such as the original example are implemented in the library will change.
Perhaps more importantly, should I also prefer to use type tags in this situation when writing my own C++ code?
This is essentially a design decision. Both can and should be used to target the problems they solve. I have used tags to "route" data and types to the correct function; particularly when the implementation would be incompatible at compile time and if there are any overload resolutions in play.
The Standard Library std::advance is often given as an example of how tag dispatching can be used to implement and optimise an algorithm based on traits (or characteristics) of the types used (in this case when the iterators are random access iterators).
It is a powerful technique when used appropriately and should not be ignored. If using enums, favour the newer scoped enums over the older unscoped ones.
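The std::advance case can be sketched roughly as follows. This is a simplified illustration of tag dispatching on iterator categories, not the actual library implementation; the namespace sketch and the names my_advance and advance_impl are made up, and only non-negative n is handled.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

namespace sketch {

// Chosen for random-access iterators: one O(1) jump.
template <typename It>
void advance_impl(It& it, int n, std::random_access_iterator_tag) {
    it += n;
}

// Chosen for every other input iterator: n single steps (n >= 0 assumed).
template <typename It>
void advance_impl(It& it, int n, std::input_iterator_tag) {
    while (n-- > 0) ++it;
}

// The tag object is created from the iterator's category; overload
// resolution then picks the implementation at compile time.
template <typename It>
void my_advance(It& it, int n) {
    advance_impl(it, n, typename std::iterator_traits<It>::iterator_category{});
}

}  // namespace sketch
```

Because the category tags form an inheritance hierarchy, a bidirectional or forward iterator converts to input_iterator_tag and lands in the second overload, while a random-access iterator is an exact match for the first.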
Using these tags enables you to take advantage of the type system of the language. This is closely related to template meta-programming. Simply speaking, using these tags allows the dispatch decision concerning which constructor to invoke to be made statically at compile time. This leaves room for compiler optimization, improves run-time efficiency, and makes template meta-programming with std::unique_lock easier. This is possible, because the tags are of different static types. With an enum, this cannot be done, for the value of an enum cannot be foreseen at compile time. Note that, using tags for differentiating purposes is a common template meta-programming technique. Just see those iterator tags used by the standard library.
The point is that if you want to add another function using enum, you should edit your enum, then rebuild all projects, which use your functions and enum. In addition there will be one function taking enum as argument and using switch or something. This will bring excess code into your application.
Otherwise if you use overloaded functions with tags, you can easily add another tag and add another overloaded function, without touching old ones. This is more back-compatible.
I suspect it was optimization. Notice that using a type (as is) the correct version is selected at compile time. As you point out using an enum is (potentially) selected in some conditional statement (maybe a switch) at run-time.
In many implementations locks are acquired and released at extremely high frequency and maybe designers thought with branch prediction and the implied memory synchronization events that might be a significant issue.
The flaw in my argument (which you also point out) is that the constructor is likely to be inline and it is likely that the condition would be optimized away anyway.
Notice that using 'dummy' parameters is the closest possible analogue to actually providing named constructors.
This method is called tag dispatching (I may be wrong). An enum type with different values is just one type at compile time, and enum values can't be used to overload a constructor. So with an enum there would be one constructor with a switch statement in it. Tag dispatching is the compile-time equivalent of a switch statement. Each tag type specifies what the constructor will do, i.e. how it will try to acquire the lock. You should use type tags when you want to make the decision at compile time, and an enum when you want to make it at run time.
Because, in std::unique_lock<Mutex>, you don't want to force Mutex to have a lock or try_lock method if it may never need to be called.
If it accepted an enum parameter, then both of those methods would need to be present.
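This last point can be illustrated with a simplified guard class. Both unlock_only_mutex and simple_guard are hypothetical types written for this sketch; the real std::unique_lock has more constructors and state.

```cpp
#include <cassert>
#include <mutex>  // std::adopt_lock_t, std::adopt_lock

// Hypothetical mutex-like type that can only be unlocked
// (e.g. it is handed to us already locked by some other API).
struct unlock_only_mutex {
    int unlock_calls = 0;
    void unlock() { ++unlock_calls; }
};

// Hypothetical, simplified guard in the style of std::lock_guard.
template <typename Mutex>
class simple_guard {
public:
    // Requires Mutex::lock(), but only if this constructor is actually used.
    explicit simple_guard(Mutex& m) : m_(m) { m_.lock(); }

    // Adopts an already-locked mutex; never touches Mutex::lock().
    simple_guard(Mutex& m, std::adopt_lock_t) : m_(m) {}

    ~simple_guard() { m_.unlock(); }

    simple_guard(const simple_guard&) = delete;
    simple_guard& operator=(const simple_guard&) = delete;

private:
    Mutex& m_;
};
```

Member functions of a class template are only instantiated when used, so simple_guard<unlock_only_mutex> compiles as long as only the adopt_lock constructor is called. A single constructor that switched on an enum would contain the m_.lock() call in one of its branches and would fail to compile for this type.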

Is a const string Still Preferable?

I asked a question on Iterators here: Prefer Iterators Over Pointers? I've come to understand some of the protection and debug capabilities they offer as a result.
However, I believe that begin and end now offer similar possibilities on C-style arrays.
If I want to create a const string that will only be iterated over in STL algorithms, is there still an advantage to using a const string, or should I prefer const char[] with begin and end?
So the answer depends on what version of c++ you're using
C++98
Because C++98 doesn't have std::begin or std::end the best move is just to accept you're going to have to pay the costs of construction and use std::string. If you have boost available you should still consider boost::string_ref for two reasons: its construction will always avoid allocation, and it is overall much simpler than std::string.
boost::string_ref works because it only stores a pointer to the string and the length. Thus the overhead is minimal in all cases.
c++11
Very similar to C++98 except the recommendation to use boost::string_ref becomes MUCH stronger because c++11 has constexpr which allows the compiler to bypass construction completely by constructing the object at compile time.
c++1z
Allegedly (it's not final) the Library Fundamentals TS will be bringing us std::string_view. boost::string_ref was a prototype for an earlier proposal of std::string_view and is designed to bring the functionality to all versions of C++ in some form.
On C++14 String Literals
C++14 introduced string literals with the syntax "foo"s. Unfortunately, this is just a convenience. Because operator""s is not constexpr, it cannot be evaluated at compile time and thus does not avoid the penalty that construction brings. So it can be used to make code nicer looking, but it doesn't provide any other benefit in this case.
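For comparison, a small sketch of the two allocation-free options discussed: a const char[] iterated with std::begin/std::end, and std::string_view, which did end up in C++17. The variable and function names are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string_view>

const char greeting_array[] = "hello";
constexpr std::string_view greeting_view{"hello"};  // built at compile time

// std::begin/std::end on the array cover all 6 chars, including the '\0'.
inline std::ptrdiff_t array_span() {
    return std::end(greeting_array) - std::begin(greeting_array);
}

// Counting a character works the same way on both.
inline std::ptrdiff_t count_l_in_array() {
    return std::count(std::begin(greeting_array), std::end(greeting_array), 'l');
}

inline std::ptrdiff_t count_l_in_view() {
    return std::count(greeting_view.begin(), greeting_view.end(), 'l');
}

static_assert(greeting_view.size() == 5);  // the view excludes the terminator
```

The terminator is the main practical difference: algorithms run over std::begin/std::end of a char array also visit the trailing '\0', which matters for things like std::reverse or std::sort.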

Are there any C++ language obstacles that prevent adopting D ranges?

This is a C++ / D cross-over question. The D programming language has ranges that -in contrast to C++ libraries such as Boost.Range- are not based on iterator pairs. The official C++ Ranges Study Group seems to have been bogged down in nailing a technical specification.
Question: does the current C++11 or the upcoming C++14 Standard have any obstacles that prevent adopting D ranges -as well as a suitably rangefied version of <algorithm>- wholesale?
I don't know D or its ranges well enough, but they seem lazy and composable as well as capable of providing a superset of the STL's algorithms. Given their claim to success for D, it would seem very nice to have as a library for C++. I wonder how essential D's unique features (e.g. string mixins, uniform function call syntax) were for implementing its ranges, and whether C++ could mimic that without too much effort (e.g. C++14 constexpr seems quite similar to D compile-time function evaluation)
Note: I am seeking technical answers, not opinions whether D ranges are the right design to have as a C++ library.
I don't think there is any inherent technical limitation in C++ which would make it impossible to define a system of D-style ranges and corresponding algorithms in C++. The biggest language level problem would be that C++ range-based for-loops require that begin() and end() can be used on the ranges but assuming we would go to the length of defining a library using D-style ranges, extending range-based for-loops to deal with them seems a marginal change.
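As a rough illustration of that point, here is a minimal sketch of what a D-style input range and a lazy adaptor could look like in C++. The types iota_range and filter_range and the function to_vector are hypothetical; only the primitive names empty, front and popFront are borrowed from D.

```cpp
#include <cassert>
#include <vector>

// Hypothetical D-style input range over the half-open interval [first, last).
struct iota_range {
    int first;
    int last;
    bool empty() const { return first >= last; }
    int front() const { return first; }
    void popFront() { ++first; }
};

// A lazy adaptor: wraps another range and skips elements failing the predicate.
template <typename Range, typename Pred>
class filter_range {
public:
    filter_range(Range r, Pred p) : r_(r), p_(p) { skip(); }
    bool empty() const { return r_.empty(); }
    auto front() const { return r_.front(); }
    void popFront() { r_.popFront(); skip(); }

private:
    void skip() { while (!r_.empty() && !p_(r_.front())) r_.popFront(); }
    Range r_;
    Pred p_;
};

// An algorithm written against the range primitives only, no iterators.
template <typename Range>
std::vector<int> to_vector(Range r) {
    std::vector<int> out;
    for (; !r.empty(); r.popFront()) out.push_back(r.front());
    return out;
}
```

Note that such a range cannot be used in a range-based for loop as written, since it has no begin() and end(), which is the language-level gap mentioned above.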
The main technical problem I have encountered when experimenting with algorithms on D-style ranges in C++ was that I couldn't make the algorithms as fast as my iterator (actually, cursor) based implementations. Of course, this could just be my algorithm implementations but I haven't seen anybody providing a reasonable set of D-style range based algorithms in C++ which I could profile against. Performance is important and the C++ standard library shall provide, at least, weakly efficient implementations of algorithms (a generic implementation of an algorithm is called weakly efficient if it is at least as fast when applied to a data structure as a custom implementation of the same algorithm using the same data structure using the same programming language). I wasn't able to create weakly efficient algorithms based on D-style ranges and my objective is actually strongly efficient algorithms (similar to weakly efficient but allowing any programming language and only assuming the same underlying hardware).
When experimenting with D-style range based algorithms I found the algorithms a lot harder to implement than iterator-based algorithms and found it necessary to deal with kludges to work around some of their limitations. Of course, not everything in the current way algorithms are specified in C++ is perfect either. A rough outline of how I want to change the algorithms and the abstractions they work with is on my STL 2.0 page. This page doesn't really deal much with ranges, however, as this is a related but somewhat different topic. I would rather envision iterator (well, really cursor) based ranges than D-style ranges but the question wasn't about that.
One technical problem all range abstractions in C++ do face is having to deal with temporary objects in a reasonable way. For example, consider this expression:
auto result = ranges::unique(ranges::sort(std::vector<int>{ read_integers() }));
Independent of whether ranges::sort() or ranges::unique() are lazy or not, the representation of the temporary range needs to be dealt with. Merely providing a view of the source range isn't an option for either of these algorithms because the temporary object will go away at the end of the expression. One possibility could be to move the range if it comes in as an r-value, requiring a different result for both ranges::sort() and ranges::unique() to distinguish the cases of the actual argument being either a temporary object or an object kept alive independently. D doesn't have this particular problem because it is garbage collected and the source range would, thus, be kept alive in either case.
The above example also shows one of the problems with possibly lazily evaluated algorithms: since any type, including types which can't be spelled out otherwise, can be deduced by auto variables or templated functions, there is nothing forcing the lazy evaluation at the end of an expression. Thus, the results from the expression templates can be obtained and the algorithm isn't really executed. That is, if an l-value is passed to an algorithm, it needs to be made sure that the expression is actually evaluated to obtain the actual effect. For example, any sort() algorithm mutating the entire sequence clearly does the mutation in-place (if you want a version that doesn't do it in-place just copy the container and apply the in-place version; if you only have a non-in-place version you can't avoid the extra sequence which may be an immediate problem, e.g., for gigantic sequences). Assuming it is lazy in some way the l-value access to the original sequence provides a peek into the current status which is almost certainly a bad thing. This may imply that lazy evaluation of mutating algorithms isn't such a great idea anyway.
In any case, there are some aspects of C++ which make it impossible to immediately adopt the D-style ranges although the same considerations also apply to other range abstractions. I'd think these considerations are, thus, somewhat out of scope for the question, too. Also, the obvious "solution" to the first of the problems (add garbage collection) is unlikely to happen. I don't know if there is a solution to the second problem in D. There may emerge a solution to the second problem (tentatively dubbed operator auto) but I'm not aware of a concrete proposal or what such a feature would actually look like.
BTW, the Ranges Study Group isn't really bogged down by any technical details. So far, we merely tried to find out what problems we are actually trying to solve and to scope out, to some extent, the solution space. Also, groups generally don't get any work done, at all! The actual work is always done by individuals, often by very few individuals. Since a major part of the work is actually designing a set of abstractions I would expect that the foundations of any results of the Ranges Study Group are laid by 1 to 3 individuals who have some vision of what is needed and what it should look like.
My C++11 knowledge is much more limited than I'd like it to be, so there may be newer features which improve things that I'm not aware of yet, but there are three areas that I can think of at the moment which are at least problematic: template constraints, static if, and type introspection.
In D, a range-based function will usually have a template constraint on it indicating which type of ranges it accepts (e.g. forward range vs random-access range). For instance, here's a simplified signature for std.algorithm.sort:
auto sort(alias less = "a < b", Range)(Range r)
    if(isRandomAccessRange!Range &&
       hasSlicing!Range &&
       hasLength!Range)
{...}
It checks that the type being passed in is a random-access range, that it can be sliced, and that it has a length property. Any type which does not satisfy those requirements will not compile with sort, and when the template constraint fails, it makes it clear to the programmer why their type won't work with sort (rather than just giving a nasty compiler error from in the middle of the templated function when it fails to compile with the given type).
Now, while that may just seem like a usability improvement over just giving a compilation error when sort fails to compile because the type doesn't have the right operations, it actually has a large impact on function overloading as well as type introspection. For instance, here are two of std.algorithm.find's overloads:
R find(alias pred = "a == b", R, E)(R haystack, E needle)
    if(isInputRange!R &&
       is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{...}
R1 find(alias pred = "a == b", R1, R2)(R1 haystack, R2 needle)
    if(isForwardRange!R1 && isForwardRange!R2 &&
       is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) &&
       !isRandomAccessRange!R1)
{...}
The first one accepts a needle which is only a single element, whereas the second accepts a needle which is a forward range. The two are able to have different parameter types based purely on the template constraints and can have drastically different code internally. Without something like template constraints, you can't have templated functions which are overloaded on attributes of their arguments (as opposed to being overloaded on the specific types themselves), which makes it much harder (if not impossible) to have different implementations based on the genre of range being used (e.g. input range vs forward range) or other attributes of the types being used. Some work has been done in this area in C++ with concepts and similar ideas, but AFAIK, C++ is still seriously lacking in the features necessary to overload templates (be they templated functions or templated types) based on the attributes of their argument types rather than specializing on specific argument types (as occurs with template specialization).
A related feature would be static if. It's the same as if, except that its condition is evaluated at compile time, and whether it's true or false will actually determine which branch is compiled in as opposed to which branch is run. It allows you to branch code based on conditions known at compile time. e.g.
static if(isDynamicArray!T)
{}
else
{}
or
static if(isRandomAccessRange!Range)
{}
else static if(isBidirectionalRange!Range)
{}
else static if(isForwardRange!Range)
{}
else static if(isInputRange!Range)
{}
else
static assert(0, Range.stringof ~ " is not a valid range!");
static if can to some extent obviate the need for template constraints, as you can essentially put the overloads for a templated function within a single function. e.g.
R find(alias pred = "a == b", R, E)(R haystack, E needle)
{
    static if(isInputRange!R &&
              is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
    {...}
    else static if(isForwardRange!R1 && isForwardRange!R2 &&
                   is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) &&
                   !isRandomAccessRange!R1)
    {...}
}
but that still results in nastier errors when compilation fails and actually makes it so that you can't overload the template (at least with D's implementation), because overloading is determined before the template is instantiated. So, you can use static if to specialize pieces of a template implementation, but it doesn't quite get you enough of what template constraints get you to not need template constraints (or something similar).
Rather, static if is excellent for doing stuff like specializing only a piece of your function's implementation or for making it so that a range type can properly inherit the attributes of the range type that it's wrapping. For instance, if you call std.algorithm.map on an array of integers, the resultant range can have slicing (because the source range does), whereas if you called map on a range which didn't have slicing (e.g. the ranges returned by std.algorithm.filter can't have slicing), then the resultant ranges won't have slicing. In order to do that, map uses static if to compile in opSlice only when the source range supports it. Currently, map's code that does this looks like
static if (hasSlicing!R)
{
    static if (is(typeof(_input[ulong.max .. ulong.max])))
        private alias opSlice_t = ulong;
    else
        private alias opSlice_t = uint;
    static if (hasLength!R)
    {
        auto opSlice(opSlice_t low, opSlice_t high)
        {
            return typeof(this)(_input[low .. high]);
        }
    }
    else static if (is(typeof(_input[opSlice_t.max .. $])))
    {
        struct DollarToken{}
        enum opDollar = DollarToken.init;
        auto opSlice(opSlice_t low, DollarToken)
        {
            return typeof(this)(_input[low .. $]);
        }
        auto opSlice(opSlice_t low, opSlice_t high)
        {
            return this[low .. $].take(high - low);
        }
    }
}
This is code in the type definition of map's return type, and whether that code is compiled in or not depends entirely on the results of the static ifs, none of which could be replaced with template specializations based on specific types without having to write a new specialized template for map for every new type that you use with it (which obviously isn't tenable). In order to compile in code based on attributes of types rather than with specific types, you really need something like static if (which C++ does not currently have).
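For comparison, C++17 (which postdates this answer) added if constexpr, which covers part of this ground. The following is only a sketch of compile-time branching on an iterator's category; the function name middle is made up. Unlike D's static if, if constexpr selects between statements inside a function body and cannot conditionally declare class members such as opSlice.

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>
#include <type_traits>
#include <vector>

// Returns the middle iterator of [first, last); the branch is chosen at compile time.
template <typename It>
It middle(It first, It last) {
    using category = typename std::iterator_traits<It>::iterator_category;
    if constexpr (std::is_base_of_v<std::random_access_iterator_tag, category>) {
        // Only instantiated for random-access iterators: last - first must compile.
        return first + (last - first) / 2;
    } else {
        // Fallback for weaker iterators: 'fast' moves two steps per step of 'slow'.
        It slow = first;
        for (It fast = first; fast != last && ++fast != last; ++fast) ++slow;
        return slow;
    }
}
```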
The third major item which C++ is lacking (and which I've more or less touched on throughout) is type introspection. The fact that you can do something like is(typeof(binaryFun!pred(haystack.front, needle)) : bool) or isForwardRange!Range is crucial. Without the ability to check whether a particular type has a particular set of attributes or that a particular piece of code compiles, you can't even write the conditions which template constraints and static if use. For instance, std.range.isInputRange looks something like this
template isInputRange(R)
{
    enum bool isInputRange = is(typeof(
    {
        R r = void;       // can define a range object
        if (r.empty) {}   // can test for empty
        r.popFront();     // can invoke popFront()
        auto h = r.front; // can get the front of the range
    }));
}
It checks that a particular piece of code compiles for the given type. If it does, then that type can be used as an input range. If it doesn't, then it can't. AFAIK, it's impossible to do anything even vaguely like this in C++. But to sanely implement ranges, you really need to be able to do stuff like have isInputRange or test whether a particular type compiles with sort - is(typeof(sort(myRange))). Without that, you can't specialize implementations based on what types of operations a particular range supports, you can't properly forward the attributes of a range when wrapping it (and range functions wrap their arguments in new ranges all the time), and you can't even properly protect your function against being compiled with types which won't work with it. And, of course, the results of static if and template constraints also affect the type introspection (as they affect what will and won't compile), so the three features are very much interconnected.
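As a point of comparison, expression SFINAE combined with C++17's `std::void_t` allows a rough and much more verbose analogue of this kind of check. The sketch below is only an approximation under stated assumptions: `is_input_range` is a hypothetical trait, and the member names `empty`, `front`, and `pop_front` stand in for D's range primitives.

```cpp
#include <list>
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: assume R is not an input range.
template <class R, class = void>
struct is_input_range : std::false_type {};

// Chosen only when every expression below compiles for R.
template <class R>
struct is_input_range<R, std::void_t<
    decltype(static_cast<bool>(std::declval<R&>().empty())),  // can test for empty
    decltype(std::declval<R&>().pop_front()),                 // can drop the front
    decltype(std::declval<R&>().front())                      // can read the front
>> : std::true_type {};

static_assert(is_input_range<std::list<int>>::value);     // has all three members
static_assert(!is_input_range<std::vector<int>>::value);  // no pop_front()
static_assert(!is_input_range<int>::value);               // no members at all
```

Unlike the D version, each requirement has to be spelled as a separate `decltype` expression rather than as a block of ordinary statements.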
Really, the main reasons that ranges don't work very well in C++ are the same reasons that metaprogramming in C++ is primitive in comparison to metaprogramming in D. AFAIK, there's no reason that these features (or similar ones) couldn't be added to C++ and fix the problem, but until C++ has metaprogramming capabilities similar to those of D, ranges in C++ are going to be seriously impaired.
Other features such as mixins and Uniform Function Call Syntax would also help, but they're nowhere near as fundamental. Mixins would help primarily with reducing code duplication, and UFCS helps primarily with making it so that generic code can just call all functions as if they were member functions so that if a type happens to define a particular function (e.g. find) then that would be used instead of the more general, free function version (and the code still works if no such member function is declared, because then the free function is used). UFCS is not fundamentally required, and you could even go the opposite direction and favor free functions for everything (like C++11 did with begin and end), though to do that well, it essentially requires that the free functions be able to test for the existence of the member function and then call the member function internally rather than using their own implementations. So, again you need type introspection along with static if and/or template constraints.
As much as I love ranges, at this point, I've pretty much given up on attempting to do anything with them in C++, because the features to make them sane just aren't there. But if other folks can figure out how to do it, all the more power to them. Regardless of ranges though, I'd love to see C++ gain features such as template constraints, static if, and type introspection, because without them, metaprogramming is way less pleasant, to the point that while I do it all the time in D, I almost never do it in C++.

Why isn't std::initializer_list a language built-in?

Why isn't std::initializer_list a core-language built-in?
It seems to me that it's quite an important feature of C++11 and yet it doesn't have its own reserved keyword (or something alike).
Instead, initializer_list is just a class template from the standard library that has a special, implicit mapping from the new braced-init-list {...} syntax that's handled by the compiler.
At first thought, this solution is quite hacky.
Is this the way new additions to the C++ language will be now implemented: by implicit roles of some template classes and not by the core language?
Please consider these examples:
widget<int> w = {1,2,3}; //this is how we want to use a class
why was a new class chosen:
widget( std::initializer_list<T> init )
instead of using something similar to any of these ideas:
widget( T[] init, int length ) // (1)
widget( T... init ) // (2)
widget( std::vector<T> init ) // (3)
(1) a classic array, you could probably add const here and there
(2) three dots already exist in the language (var-args, now variadic templates), why not re-use the syntax (and make it feel built-in)
(3) just an existing container, could add const and &
All of them are already a part of the language. I only wrote my 3 first ideas, I am sure that there are many other approaches.
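For reference, the constructor the question asks about can be sketched as follows. This is a minimal illustration, assuming a hypothetical `widget<T>` that simply copies the braced elements into a `std::vector`; the question does not say what widget actually stores.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical widget: stores whatever appears between the braces.
template <class T>
class widget {
public:
    // The compiler builds the initializer_list from the braced-init-list.
    widget(std::initializer_list<T> init) : items_(init.begin(), init.end()) {}

    std::size_t size() const { return items_.size(); }
    const T& operator[](std::size_t i) const { return items_[i]; }

private:
    std::vector<T> items_;
};

// Usage, exactly as written in the question.
inline void widget_demo() {
    widget<int> w = {1, 2, 3};
    assert(w.size() == 3);
    assert(w[0] == 1 && w[2] == 3);
}
```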
There were already examples of "core" language features that returned types defined in the std namespace. typeid returns std::type_info and (stretching a point perhaps) sizeof returns std::size_t.
In the former case, you already need to include a standard header in order to use this so-called "core language" feature.
Now, for initializer lists it happens that no keyword is needed to generate the object, the syntax is context-sensitive curly braces. Aside from that it's the same as type_info. Personally I don't think the absence of a keyword makes it "more hacky". Slightly more surprising, perhaps, but remember that the objective was to allow the same braced-initializer syntax that was already allowed for aggregates.
So yes, you can probably expect more of this design principle in future:
if more occasions arise where it is possible to introduce new features without new keywords then the committee will take them.
if new features require complex types, then those types will be placed in std rather than as builtins.
Hence:
if a new feature requires a complex type and can be introduced without new keywords then you'll get what you have here, which is "core language" syntax with no new keywords and that uses library types from std.
What it comes down to, I think, is that there is no absolute division in C++ between the "core language" and the standard libraries. They're different chapters in the standard but each references the other, and it has always been so.
There is another approach in C++11, which is that lambdas introduce objects that have anonymous types generated by the compiler. Because they have no names they aren't in a namespace at all, certainly not in std. That's not a suitable approach for initializer lists, though, because you use the type name when you write the constructor that accepts one.
The C++ Standard Committee seems to prefer not to add new keywords, probably because that increases the risk of breaking existing code (legacy code could use that keyword as the name of a variable, a class, or whatever else).
Moreover, it seems to me that defining std::initializer_list as a templated container is quite an elegant choice: if it was a keyword, how would you access its underlying type? How would you iterate through it? You would need a bunch of new operators as well, and that would just force you to remember more names and more keywords to do the same things you can do with standard containers.
Treating an std::initializer_list as any other container gives you the opportunity of writing generic code that works with any of those things.
UPDATE:
Then why introduce a new type, instead of using some combination of existing? (from the comments)
To begin with, all other containers have methods for adding, removing, and emplacing elements, which are not desirable for a compiler-generated collection. The only exception is std::array<>, which wraps a fixed-size C-style array and would therefore remain the only reasonable candidate.
However, as Nicol Bolas correctly points out in the comments, another, fundamental difference between std::initializer_list and all other standard containers (including std::array<>) is that the latter ones have value semantics, while std::initializer_list has reference semantics. Copying an std::initializer_list, for instance, won't cause a copy of the elements it contains.
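The difference between reference and value semantics can be observed directly. The following is a small sketch with hypothetical helper names: copying a `std::initializer_list` yields a second view of the same backing array, while copying a `std::array` duplicates the elements.

```cpp
#include <array>
#include <cassert>
#include <initializer_list>

// True when the copy views the same elements as the original.
inline bool list_copy_shares_elements(std::initializer_list<int> original) {
    std::initializer_list<int> copy = original;  // copies pointer and size only
    return copy.begin() == original.begin() && copy.size() == original.size();
}

// True when the copy owns its own, separate elements.
inline bool array_copy_is_distinct() {
    std::array<int, 3> original{{1, 2, 3}};
    std::array<int, 3> copy = original;  // copies all three ints
    return copy.data() != original.data() && copy == original;
}

inline void semantics_demo() {
    assert(list_copy_shares_elements({1, 2, 3}));
    assert(array_copy_is_distinct());
}
```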
Moreover (once again, courtesy of Nicol Bolas), having a special container for brace-initialization lists allows overloading on the way the user is performing initialization.
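`std::vector` is the usual illustration of overloading on the form of initialization: parentheses select the (count, value) constructor, while braces prefer the `std::initializer_list` constructor. A short sketch, with hypothetical function names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Parentheses: three elements, each equal to 7.
inline std::size_t size_with_parens() {
    std::vector<int> v(3, 7);
    return v.size();
}

// Braces: two elements, 3 and 7.
inline std::size_t size_with_braces() {
    std::vector<int> v{3, 7};
    return v.size();
}

inline void overload_demo() {
    assert(size_with_parens() == 3);
    assert(size_with_braces() == 2);
}
```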
This is nothing new. For example, for (auto i : some_container) relies on the existence of specific methods or standalone functions for the some_container class. C# relies even more on its .NET libraries. Actually, I think that this is quite an elegant solution, because you can make your classes compatible with some language structures without complicating the language specification.
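To make the point concrete, here is a sketch of a user-defined type that works with the range-based for loop only because it provides `begin()` and `end()`. The `countdown` type and `sum_countdown` function are hypothetical, invented for this illustration.

```cpp
#include <cassert>

// Hypothetical type: iterates start, start-1, ..., 1.
struct countdown {
    int start;

    struct iterator {
        int value;
        int operator*() const { return value; }
        iterator& operator++() { --value; return *this; }
        bool operator!=(const iterator& other) const { return value != other.value; }
    };

    // These two members are all the language construct looks for.
    iterator begin() const { return {start}; }
    iterator end() const { return {0}; }
};

inline int sum_countdown(const countdown& c) {
    int total = 0;
    for (int v : c) total += v;  // uses countdown::begin() and countdown::end()
    return total;
}

inline void countdown_demo() {
    assert(sum_countdown(countdown{3}) == 6);  // 3 + 2 + 1
    assert(sum_countdown(countdown{0}) == 0);  // empty range
}
```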
This is indeed nothing new and, as many have pointed out, this practice was already there in C++ and is there, say, in C#.
Andrei Alexandrescu has mentioned a good point about this though: You may think of it as a part of imaginary "core" namespace, then it'll make more sense.
So, it's actually something like: core::initializer_list, core::size_t, core::begin(), core::end() and so on. This is just an unfortunate coincidence that std namespace has some core language constructs inside it.
It can work completely in the standard library, and inclusion in the standard library does not mean that the compiler cannot play clever tricks.
While it may not be able to in all cases, it may very well say: this type is well known, or a simple type, let's ignore the initializer_list and just have a memory image of what the initialized value should be.
In other words, int i {5}; can be equivalent to int i(5); or int i=5; or even intwrapper iw {5}; where intwrapper is a simple wrapper class over an int with a trivial constructor taking an initializer_list.
It's not part of the core language because it can be implemented entirely in the library, just like operator new and operator delete. What advantage would there be in making compilers more complicated to build it in?

C++0x, Compiler hooks and hard coded languages features

I'm a little curious about some of the new features of C++0x. In particular range-based for loops and initializer lists. Both features require a user-defined class in order to function correctly.
I came across this post, and while the top answer was helpful, I don't know if it's entirely correct (I'm probably just completely misunderstanding; see the 3rd comment on the first answer). According to the current specifications for initializer lists, the header defines one type:
template<class E> class initializer_list {
public:
    initializer_list();
    size_t size() const;    // number of elements
    const E* begin() const; // first element
    const E* end() const;   // one past the last element
};
You can see this in the specifications, just Ctrl + F 'class initializer_list'.
In order for = {1,2,3} to be implicitly converted into the initializer_list class, the compiler HAS to have some knowledge of the relationship between {} and initializer_list. There is no constructor that receives anything, so the initializer_list, as far as I can tell, is a wrapper that gets bound to whatever the compiler is actually generating.
It's the same with the for( : ) loop, which also requires a user-defined type to work (though according to the specs, updated to not require any code for arrays and initializer lists. But initializer lists require <initializer_list>, so it's a user-defined code requirement by proxy).
Am I misunderstanding completely how this works here? I'm not wrong in thinking that these new features do in fact rely extremely heavily on user code. It feels as if the features are half-baked, and instead of building the entire feature into the compiler, it's being half-done by the compiler and half done in includes. What's the reason for this?
Edit: I typed 'rely heavily on compiler code', and not 'rely heavily on user code'. Which I think completely threw off my question. My confusion isn't about new features being built into the compiler, it's things that are built into the compiler that rely on user code.
I'm not wrong in thinking that these new features do in fact rely extremely heavily on compiler code
They do rely heavily on the compiler. Whether you need to include a header or not, the fact is that in both cases, the syntax would be a parsing error with today's compilers. The for (:) does not quite fit into today's standard, where the only allowed construct is for(;;).
It feels as if the features are half-baked, and instead of building the entire feature into the compiler, it's being half-done by the compiler and half done in includes. What's the reason for this?
The support must be implemented in the compiler, but you are required to include a system header for it to work. This can serve a couple of purposes. In the case of initializer lists, it brings the type (the interface to the compiler support) into scope for the user so that you have a way of using it (think of how va_args work in C). In the case of the range-based for (which is just syntactic sugar), you need to bring Range into scope so that the compiler can perform its magic. Note that the standard defines for ( for-range-declaration : expression ) statement as equivalent to ([6.5.4]/1 in the draft):
{
    auto && __range = ( expression );
    for ( auto __begin = std::Range<_RangeT>::begin(__range),
               __end = std::Range<_RangeT>::end(__range);
          __begin != __end;
          ++__begin ) {
        for-range-declaration = *__begin;
        statement
    }
}
If you want to use it only on arrays and STL containers, that could be implemented without the Range concept (not in the C++0x sense), but if you want to extend the syntax to user-defined classes (your own containers), the compiler can easily depend upon the existing Range template (with your own possible specialization). The mechanism of depending upon a template being defined is equivalent to requiring a static interface on the container.
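Note that the Range wording quoted above comes from a pre-release draft. Concepts were removed before C++11 was published, and the published standard instead finds `begin` and `end` as members of the range or through argument-dependent lookup. A hand-written sketch of roughly what the loop expands to for a container with member `begin()` and `end()` follows; the function and variable names are hypothetical, and real compilers use reserved names such as `__range`.

```cpp
#include <cassert>
#include <vector>

// The loop as the programmer writes it.
inline int sum_range_for(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Approximately what the compiler generates for the loop above.
inline int sum_desugared(const std::vector<int>& values) {
    int total = 0;
    {
        auto&& range = values;
        for (auto first = range.begin(), last = range.end(); first != last; ++first) {
            int v = *first;  // the for-range-declaration
            total += v;      // the loop body
        }
    }
    return total;
}

inline void desugar_demo() {
    std::vector<int> values{1, 2, 3};
    assert(sum_range_for(values) == 6);
    assert(sum_desugared(values) == sum_range_for(values));
}
```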
Most other languages have gone in the direction of requiring a regular interface (say Container,...) and using runtime polymorphism on that. If that were to be done in C++, the whole STL would have to go through a major refactoring, as STL containers do not share a common base or interface, and they are not prepared to be used polymorphically.
If anything, the current standard will not be underbaked by the time it goes out.
It's just syntax sugar. The compiler will expand the given syntactic constructs into equivalent C++ expressions that reference the standard types / symbol names directly.
This isn't the only strong coupling that modern C++ compilers have between their language and the "outside world". For example, extern "C" is a bit of a language hack to accommodate C's linking model. Language-oriented ways of declaring thread-local storage implicitly depend on lots of RTL hackery to work.
Or look at C. How do you access arguments passed via ...? You need to rely on the standard library; but that uses magic that has a very hard dependency on how exactly the C compiler lays out stack frames.
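The variadic-argument case looks like this in practice. It is a minimal sketch using the `<cstdarg>` macros; `sum_ints` is a hypothetical function, and the convention that the first argument gives the count is an assumption of this example, since the language offers no way to query how many arguments were passed.

```cpp
#include <cassert>
#include <cstdarg>

// Sums `count` int arguments passed after `count`.
inline int sum_ints(int count, ...) {
    va_list args;                     // library type, implementation-defined layout
    va_start(args, count);            // start reading after the last named parameter
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);   // the caller must pass ints here
    }
    va_end(args);
    return total;
}

inline void varargs_demo() {
    assert(sum_ints(3, 1, 2, 3) == 6);
    assert(sum_ints(0) == 0);
}
```

The `...` syntax is core language, yet nothing can be read from it without the library macros, which is the same split between compiler and header that the question objects to.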
UPDATE:
If anything, the approach C++ has taken here is more in the spirit of C++ than the alternative - which would be to add an intrinsic collection or range type, baked in to the language. Instead, it's being done via a vendor-defined range type. I really don't see it as much different to variadic arguments, which are similarly useless without the vendor-defined accessor macros.