Is a const string Still Preferable?

I asked a question on Iterators here: Prefer Iterators Over Pointers? I've come to understand some of the protection and debug capabilities they offer as a result.
However, I believe that begin and end now offer similar possibilities on C-style arrays.
If I want to create a const string that will only be iterated over in STL algorithms, is there still an advantage to using a const string, or should I prefer const char[] with begin and end?

The answer depends on which version of C++ you're using.
C++98
Because C++98 doesn't have std::begin or std::end, the best move is to accept that you will have to pay the cost of construction and use std::string. If you have Boost available, you should still consider boost::string_ref, for two reasons: its construction always avoids allocation, and it is overall much simpler than std::string.
boost::string_ref works because it only stores a pointer to the string and the length. Thus the overhead is minimal in all cases.
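The pointer-plus-length idea can be sketched as follows. This is a minimal illustration, not boost::string_ref itself: the names tiny_view and count_char are hypothetical, and constexpr requires C++11 or later.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a string_ref-style type: it stores only a
// pointer and a length and never owns or copies the characters.
class tiny_view {
public:
    // Deduce the length from a string literal or char array at compile time,
    // dropping the trailing '\0'.
    template <std::size_t N>
    constexpr tiny_view(const char (&s)[N]) : data_(s), size_(N - 1) {}

    constexpr const char* begin() const { return data_; }
    constexpr const char* end() const { return data_ + size_; }
    constexpr std::size_t size() const { return size_; }

private:
    const char* data_;
    std::size_t size_;
};

// Constructed at compile time: no allocation is involved.
constexpr tiny_view greeting("hello");
static_assert(greeting.size() == 5, "length excludes the terminator");

// The view works with STL algorithms like any other iterator range.
inline std::ptrdiff_t count_char(tiny_view v, char c) {
    return std::count(v.begin(), v.end(), c);
}
```

Because the object is just two words, passing it by value costs about the same as passing a pair of pointers.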
C++11
Very similar to C++98, except that the recommendation to use boost::string_ref becomes MUCH stronger: C++11 has constexpr, which allows the compiler to bypass run-time construction completely by constructing the object at compile time.
C++1z
Allegedly (it's not final) the Library Fundamentals TS will be bringing us std::string_view. boost::string_ref was a prototype for an earlier proposal of std::string_view and is designed to bring the functionality to all versions of C++ in some form.
On C++14 String Literals
C++14 introduced std::string literals with the syntax "foo"s. Unfortunately, this is just a convenience: because operator""s is not constexpr, it cannot be evaluated at compile time and thus does not avoid the cost of construction. It can make code look nicer, but it provides no other benefit in this case.
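For C++11 and later, the alternative raised in the question (a const char[] iterated with std::begin and std::end) might look like the sketch below. The names vowels, is_vowel and count_vowels are made up for illustration. One detail worth noting is that the array range includes the terminating '\0', so the sketch stops one element early.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// A C-style constant string; no std::string is ever constructed.
const char vowels[] = "aeiou";

// std::begin/std::end work on arrays, but the range includes the trailing
// '\0', so std::prev(std::end(...)) is used to stop before it.
inline bool is_vowel(char c) {
    return std::find(std::begin(vowels), std::prev(std::end(vowels)), c)
           != std::prev(std::end(vowels));
}

// Counts the vowels in any character range.
inline std::ptrdiff_t count_vowels(const char* first, const char* last) {
    return std::count_if(first, last, is_vowel);
}
```

A view type such as boost::string_ref or std::string_view hides the terminator adjustment, which is one practical argument for preferring it over a raw array.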

Related

Is boost::typeindex::ctti_type_index a standard compliant way for compile-time type ids for some cases?

I'm currently evaluating the possibility of changing several classes/structs of a project so that they are usable within a constexpr context at compile time. A current showstopper is the use of typeid() and std::type_index (both seem to be purely RTTI-based?), which cannot be used within a constexpr context.
So I came across Boost's boost::typeindex::ctti_type_index.
They say:
boost::typeindex::ctti_type_index class can be used as a drop-in
replacement for std::type_index.
So far so good. The only exceptional case I was able to find so far, that one should be aware of when using it, is
With RTTI off different classes with same names in anonymous namespace
may collapse. See 'RTTI emulation limitations'.
which is currently relevant at least for gcc, clang and Intel compilers and not really surprising. I could live with that restriction. So my first question is: besides the issue with anonymous namespaces, does Boost rely entirely on standard-compliant mechanisms to achieve that constexpr typeid generation? It's quite hard to analyze from scratch because of the many compiler-dependent switches. Has anybody already gained experience with it in several scenarios and can mention further drawbacks that I do not see a priori?
And my second question, quite directly related with the first one, is about the details: How does that implementation work at "core level", especially for the comparison context?
For the comparison, they use
BOOST_CXX14_CONSTEXPR inline bool ctti_type_index::equal(const ctti_type_index& rhs) const BOOST_NOEXCEPT {
    const char* const left = raw_name();
    const char* const right = rhs.raw_name();
    return /*left == right ||*/ !boost::typeindex::detail::constexpr_strcmp(left, right);
}
Why did they comment out the raw "string" pointer comparison? The raw name member (returned inline from raw_name()) itself is simply defined as
const char* data_;
So my guess is that, at least within a fully constexpr context and if initialized from a constexpr char*, the simple pointer comparison should be standard compliant (unique pointer addresses guaranteed for inline objects, i.e. for constexpr ones as well?). Is that already fully guaranteed by the standard (I focus on C++17 here; are there relevant changes in C++20?) and simply not used yet because of common compiler limitations? (BTW: I generally struggle with non-trivial, non-self-explanatory commented-out sections in code...) With their constexpr_strcmp, they apply a trivial but expensive character-wise comparison, which would have been my own approach too. It is easy to see that the simple pointer comparison would be preferable.
Update after rereading my own question: at least for the comparison case, I currently understand the mechanism of the enabled code but am interested in the commented-out approach.
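The enabled code path (comparing the characters rather than the addresses) can be sketched as below. The names sketch_strcmp and name_index are hypothetical and this is not Boost's actual implementation. It assumes C++14 relaxed constexpr, and it does not attempt to explain why the pointer shortcut was disabled.

```cpp
#include <cassert>

// Hypothetical helper, not Boost's implementation: a character-wise
// comparison that is usable in constant expressions (C++14 or later).
constexpr int sketch_strcmp(const char* a, const char* b) {
    while (*a != '\0' && *a == *b) {
        ++a;
        ++b;
    }
    return static_cast<unsigned char>(*a) - static_cast<unsigned char>(*b);
}

// Hypothetical type-index-like wrapper around a raw name pointer.
struct name_index {
    const char* data_;

    constexpr bool equal(const name_index& rhs) const {
        // Comparing contents gives the right answer even when two equal
        // names are stored at different addresses.
        return sketch_strcmp(data_, rhs.data_) == 0;
    }
};

static_assert(name_index{"int"}.equal(name_index{"int"}), "same name");
static_assert(!name_index{"int"}.equal(name_index{"long"}), "different name");
```

The cost is linear in the length of the shorter name, which is the expense the question refers to.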

Using std::string's std::hash specialization without constructing a string object

I have a codebase which makes extensive use of CUDA, which unfortunately only supports C++14 to date. However, I still want to use string_view, which is a C++17 feature. The implementation is relatively straight-forward, especially since I do not require the 'find' functionalities.
However, I do need hashing to work. The standard mandates that std::hash of a string_view must be equal to the hash of a string constructed from that string_view (and I intend to rely on this guarantee). Is there a standard-conforming way of getting the output of std::hash without temporarily constructing a string object, which may come with an unoptimizable heap allocation (the route that string-view-lite took)? I would rather not copy the algorithm from a concrete stdlib implementation, since that might break in the future or already fail to compile with older versions.
Alternatively, is there a way to let MSVC (EDIT: v14.16) use std::string_view in C++14-mode, which NVCC also recognizes? It would be great if Clang and GCC had a similar option as well, since the codebase might migrate away from MSVC one day.

In my opinion you are out of luck: even assuming you could mimic the internal structure and pass a tailored object that looks like a std::string, that would surely be more fragile than just copying the implementation.
I see two choices:
Copy the std::hash<std::string> specialization's implementation and add some asserts with hardcoded cases that warn you if anything changes or differs (a rather clumsy solution).
Provide your own hash function that replaces the std::string one and pass it as a template argument, so that you can enforce that both hashes are equal when working with STL collections.
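The second choice might look like the following C++14-compatible sketch. The names simple_view, fnv1a, text_hash and name_table are hypothetical, and FNV-1a is used only because it is short. The result does not match std::hash<std::string>; it only guarantees that a view and a string with the same contents hash equally under this one functor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical minimal view type standing in for a hand-written string_view.
struct simple_view {
    const char* data;
    std::size_t size;
};

// 64-bit FNV-1a over a byte range.
inline std::uint64_t fnv1a(const char* p, std::size_t n) {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (std::size_t i = 0; i < n; ++i) {
        h ^= static_cast<unsigned char>(p[i]);
        h *= 1099511628211ull;                  // FNV prime
    }
    return h;
}

// One functor for both types, so equal contents always hash equally.
struct text_hash {
    std::size_t operator()(const std::string& s) const {
        return static_cast<std::size_t>(fnv1a(s.data(), s.size()));
    }
    std::size_t operator()(simple_view v) const {
        return static_cast<std::size_t>(fnv1a(v.data, v.size));
    }
};

// Passed as the Hash template argument instead of std::hash<std::string>.
using name_table = std::unordered_map<std::string, int, text_hash>;
```

Looking up a simple_view key directly in such a map would still need a container that supports heterogeneous lookup, which std::unordered_map does not offer before C++20.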

What are the new features in C++17?

C++17 is now feature complete, so unlikely to experience large changes. Hundreds of proposals were put forward for C++17.
Which of those features were added to C++ in C++17?
When using a C++ compiler that supports "C++1z", which of those features are going to be available when the compiler updates to C++17?

Language features:
Templates and Generic Code
Template argument deduction for class templates
Like how functions deduce template arguments, now constructors can deduce the template arguments of the class
http://wg21.link/p0433r2 http://wg21.link/p0620r0 http://wg21.link/p0512r0
template <auto>
Represents a value of any (non-type template argument) type.
Non-type template arguments fixes
template<template<class...>typename bob> struct foo {}
( Folding + ... + expressions ) and Revisions
auto x{8}; is an int
modernizing using with ... and lists
Lambda
constexpr lambdas
Lambdas are implicitly constexpr if they qualify
Capturing *this in lambdas
[*this]{ std::cout << could << " be " << useful << '\n'; }
Attributes
[[fallthrough]], [[nodiscard]], [[maybe_unused]] attributes
[[attributes]] on namespaces and enum { erator[[s]] }
using in attributes to avoid having to repeat an attribute namespace.
Compilers are now required to ignore non-standard attributes they don't recognize.
The C++14 wording allowed compilers to reject unknown scoped attributes.
Syntax cleanup
Inline variables
Like inline functions
Compiler picks where the instance is instantiated
Deprecate static constexpr redeclaration, now implicitly inline.
namespace A::B
Simple static_assert(expression); with no string
no throw unless throw(), and throw() is noexcept(true).
Cleaner multi-return and flow control
Structured bindings
Basically, first-class std::tie with auto
Example:
const auto [it, inserted] = map.insert( {"foo", bar} );
Creates variables it and inserted with deduced type from the pair that map::insert returns.
Works with tuple/pair-likes & std::arrays and relatively flat structs
Actually named structured bindings in standard
if (init; condition) and switch (init; condition)
if (const auto [it, inserted] = map.insert( {"foo", bar} ); inserted)
Extends the if(decl) to cases where decl isn't convertible-to-bool sensibly.
Generalizing range-based for loops
Appears to be mostly support for sentinels, or end iterators that are not the same type as begin iterators, which helps with null-terminated loops and the like.
if constexpr
Much requested feature to simplify almost-generic code.
Misc
Hexadecimal floating-point literals
Dynamic memory allocation for over-aligned data
Guaranteed copy elision
Finally!
Not in all cases, but distinguishes syntax where you are "just creating something" that was called elision, from "genuine elision".
Fixed order-of-evaluation for (some) expressions with some modifications
Not including function arguments, but function argument evaluation interleaving now banned
Makes a bunch of broken code work mostly, and makes .then on future work.
Direct list-initialization of enums
Forward progress guarantees (FPG) (also, FPGs for parallel algorithms)
I think this is saying "the implementation may not stall threads forever"?
u8'U', u8'T', u8'F', u8'8' character literals (string already existed)
"noexcept" in the type system
__has_include
Test if a header file include would be an error
makes migrating from experimental to std almost seamless
Arrays of pointer conversion fixes
inherited constructors fixes to some corner cases (see P0136R0 for examples of behavior changes)
aggregate initialization with inheritance.
std::launder, type punning, etc
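A few of the language features listed above (fold expressions, if constexpr, and structured bindings combined with the if initializer) can be seen together in this short sketch. The function names sum, describe and insert_once are made up for illustration, and a C++17 compiler is assumed.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>

// Fold expression: sums any number of arguments (0 for an empty pack).
template <typename... Ts>
constexpr auto sum(Ts... values) {
    return (values + ... + 0);
}

// if constexpr: only the branch matching T is instantiated.
template <typename T>
std::string describe(const T& value) {
    if constexpr (std::is_arithmetic_v<T>) {
        return "number " + std::to_string(value);
    } else {
        return "text " + std::string(value);
    }
}

// if with initializer plus structured bindings on the pair that
// map::insert returns.
inline bool insert_once(std::map<std::string, int>& m,
                        const std::string& key, int v) {
    if (const auto [it, inserted] = m.insert({key, v}); inserted) {
        return true;             // key was new
    } else {
        return it->second == v;  // key existed; report whether the value matches
    }
}
```

Without if constexpr, describe would fail to compile for string arguments because std::to_string has no matching overload, even though that branch is never taken.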
Library additions:
Data types
std::variant<Ts...>
Almost-always non-empty last I checked?
Tagged union type
{awesome|useful}
std::optional
Maybe holds one of something
Ridiculously useful
std::any
Holds one of anything (that is copyable)
std::string_view
std::string like reference-to-character-array or substring
Never take a string const& again. Also can make parsing a bajillion times faster.
"hello world"sv
constexpr char_traits
std::byte off more than they could chew.
Neither an integer nor a character, just data
Invoke stuff
std::invoke
Call any callable (function pointer, function, member pointer) with one syntax. From the standard INVOKE concept.
std::apply
Takes a function-like and a tuple, and unpacks the tuple into the call.
std::make_from_tuple, std::apply applied to object construction
is_invocable, is_invocable_r, invoke_result
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0077r2.html
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0604r0.html
Deprecates result_of
is_invocable<Foo(Args...), R> is "can you call Foo with Args... and get something compatible with R", where R=void is default.
invoke_result<Foo, Args...> is std::result_of_t<Foo(Args...)> but apparently less confusing?
File System TS v1
[class.path]
[class.filesystem.error]
[class.file_status]
[class.directory_entry]
[class.directory_iterator] and [class.recursive_directory_iterator]
[fs.ops.funcs]
fstreams can be opened with paths, as well as with const path::value_type* strings.
New algorithms
for_each_n
reduce
transform_reduce
exclusive_scan
inclusive_scan
transform_exclusive_scan
transform_inclusive_scan
Added for threading purposes, exposed even if you aren't using them threaded
Threading
std::shared_mutex
Untimed, which can be more efficient if you don't need it.
atomic<T>::is_always_lock_free
scoped_lock<Mutexes...>
Saves some std::lock pain when locking more than one mutex at a time.
Parallelism TS v1
The linked paper from 2014, may be out of date
Parallel versions of std algorithms, and related machinery
hardware_*_interference_size
(parts of) Library Fundamentals TS v1 not covered above or below
[func.searchers] and [alg.search]
A searching algorithm and techniques
[pmr]
Polymorphic allocator, like std::function for allocators
And some standard memory resources to go with it.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0358r1.html
std::sample, sampling from a range?
Container Improvements
try_emplace and insert_or_assign
gives better guarantees in some cases where spurious move/copy would be bad
Splicing for map<>, unordered_map<>, set<>, and unordered_set<>
Move nodes between containers cheaply.
Merge whole containers cheaply.
non-const .data() for string.
non-member std::size, std::empty, std::data
like std::begin/end
Minimal incomplete type support in containers
Contiguous iterator "concept"
constexpr iterators
The emplace family of functions now returns a reference to the created object.
Smart pointer changes
unique_ptr<T[]> fixes and other unique_ptr tweaks.
weak_from_this and some fixes to shared_from_this
Other std datatype improvements:
{} construction of std::tuple and other improvements
TriviallyCopyable reference_wrapper, can be performance boost
Misc
C++17 library is based on C11 instead of C99
Reserved std[0-9]+ for future standard libraries
destroy(_at|_n), uninitialized_move(_n), uninitialized_value_construct(_n), uninitialized_default_construct(_n)
utility code already in most std implementations exposed
Special math functions
scientists may like them
std::clamp()
std::clamp( a, b, c ) == std::max( b, std::min( a, c ) ) roughly
gcd and lcm
std::uncaught_exceptions
Required if you want to only throw if safe from destructors
std::as_const
std::bool_constant
A whole bunch of _v template variables
std::void_t<T>
Surprisingly useful when writing templates
std::owner_less<void>
like std::less<void>, but for smart pointers to sort based on contents
std::chrono polish
std::conjunction, std::disjunction, std::negation exposed
std::not_fn
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0358r1.html
Rules for noexcept within std
std::is_contiguous_layout, useful for efficient hashing
std::to_chars/std::from_chars, high performance, locale agnostic number conversion; finally a way to serialize/deserialize to human readable formats (JSON & co)
std::default_order, indirection over std::less. (breaks ABI of some compilers due to name mangling, removed.)
memory_order_consume, added language to prefer use of memory_order_acquire
Traits
swap
is_aggregate
has_unique_object_representations
Deprecated
Some C libraries,
<codecvt>
result_of, replaced with invoke_result
shared_ptr::unique, it isn't very threadsafe
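Some of the library additions listed above (std::optional, std::string_view, std::clamp, and gcd/lcm) are sketched below. The helper names parse_digit and first_word are hypothetical, and a C++17 standard library is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <optional>
#include <string_view>

// std::optional: "maybe holds one of something".
inline std::optional<int> parse_digit(std::string_view s) {
    if (s.size() == 1 && s[0] >= '0' && s[0] <= '9') {
        return s[0] - '0';
    }
    return std::nullopt;
}

// std::string_view: accepts literals, std::string and substrings
// without copying the characters.
inline std::string_view first_word(std::string_view s) {
    return s.substr(0, s.find(' '));
}

// std::clamp, std::gcd and std::lcm are usable in constant expressions.
static_assert(std::clamp(15, 0, 10) == 10, "clamped to the upper bound");
static_assert(std::gcd(12, 18) == 6 && std::lcm(4, 6) == 12, "gcd and lcm");
```

Note that the view returned by first_word is only valid while the characters it refers to are alive.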
Isocpp.org has an independent list of changes since C++14; it has been partly pillaged.
Naturally TS work continues in parallel, so there are some TS that are not-quite-ripe that will have to wait for the next iteration. The target for the next iteration is C++20 as previously planned, not C++19 as some rumors implied. C++1O has been avoided.
Initial list taken from this reddit post and this reddit post, with links added via googling or from the above isocpp.org page.
Additional entries pillaged from SD-6 feature-test list.
clang's feature list and library feature list are next to be pillaged. This doesn't seem to be reliable, as it is C++1z, not C++17.
these slides had some features missing elsewhere.
While "what was removed" was not asked, here is a short list of a few things ((mostly?) previous deprecated) that are removed in C++17 from C++:
Removed:
register, keyword reserved for future use
bool b; ++b;
trigraphs
if you still need them, they are now part of your source file encoding, not part of language
ios aliases
auto_ptr, old <functional> stuff, random_shuffle
allocators in std::function
There were rewordings. I am unsure if these have any impact on code, or if they are just cleanups in the standard:
Papers not yet integrated into above:
P0505R0 (constexpr chrono)
P0418R2 (atomic tweaks)
P0512R0 (template argument deduction tweaks)
P0490R0 (structured binding tweaks)
P0513R0 (changes to std::hash)
P0502R0 (parallel exceptions)
P0509R1 (updating restrictions on exception handling)
P0012R1 (make exception specifications be part of the type system)
P0510R0 (restrictions on variants)
P0504R0 (tags for optional/variant/any)
P0497R0 (shared ptr tweaks)
P0508R0 (structured bindings node handles)
P0521R0 (shared pointer use count and unique changes?)
Spec changes:
exception specs and throw expressions
Further reference:
papers grouped by year; not all accepted
https://isocpp.org/files/papers/p0636r0.html
Should be updated to "Modifications to existing features" here.

Why make a specialized version of uninitialized_copy() for char* and wchar_t* but not other primitive types?

Here is the specialization version for char* :
inline char* uninitialized_copy(const char* first, const char* last, char* result)
{
    memmove(result, first, last - first);
    return result + (last - first);
}
It is said that memmove is the most efficient way to implement this function for char* and wchar_t*. But why can't int* and other basic types be implemented in this way?

When the standard was designed, it was believed that some specializations like this could be used to improve performance. As it turned out, the compilers got better and generate the same code anyway from the base template.
Many current compilers even have the memmove function built in and generate improved inline versions when they can take advantage of known alignment or when the size of the objects happens to be an even multiple of the register size.
Here is my favorite example of when the compiler realizes that an 8-byte string can be copied using a single register move.
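To illustrate how a single template can reach memmove for int and every other trivially copyable type, here is a sketch under the assumption of C++11 type traits. The name sketch_uninitialized_copy is hypothetical and real library implementations differ in detail. Note that the byte count must be scaled by sizeof(T), which the char version can skip because sizeof(char) is 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>

// Trivially copyable element types take the memmove path.
template <typename T>
typename std::enable_if<std::is_trivially_copyable<T>::value, T*>::type
sketch_uninitialized_copy(const T* first, const T* last, T* result) {
    const std::size_t count = static_cast<std::size_t>(last - first);
    if (count != 0) {
        std::memmove(result, first, count * sizeof(T));
    }
    return result + count;
}

// Everything else is copy-constructed element by element.
template <typename T>
typename std::enable_if<!std::is_trivially_copyable<T>::value, T*>::type
sketch_uninitialized_copy(const T* first, const T* last, T* result) {
    return std::uninitialized_copy(first, last, result);
}
```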

The C++ standard actually does not say that there should be specializations of uninitialized_copy for these types; it just mandates that there should be a template function called uninitialized_copy that should work on iterator ranges. However, the C++ standard does permit compiler and library authors to implement their own specializations of this function if they choose to do so.
There are many good reasons to specialize uninitialized_copy to work on individual characters. In some cases, compilers provide intrinsics for memmove and memcpy that are much faster than the code they would normally output due to normal optimizations. As a result, specializing this code for the char and wchar_t types would make sense, as these intrinsics would outperform a standard loop.
It is hard for me to say exactly why the library authors didn't specialize for other types. My guess is that they tested this out and didn't find much of a performance difference. Since anything the library authors do above providing the standard template version of uninitialized_copy is just going above and beyond what the spec requires, it could simply be because they were busy and needed to work on something else. You'd have to contact them directly to get a more definitive answer.
In short - the library authors aren't required to provide any specializations at all, and it's nice that they chose to put in this specialization. Without contacting them, it's hard to say definitively why they didn't choose to do this for other types.
Hope this helps!

Why isn't std::initializer_list a language built-in?

Why isn't std::initializer_list a core-language built-in?
It seems to me that it's quite an important feature of C++11 and yet it doesn't have its own reserved keyword (or something alike).
Instead, initializer_list is just a template class from the standard library that has a special, implicit mapping from the new braced-init-list {...} syntax that's handled by the compiler.
At first thought, this solution is quite hacky.
Is this the way new additions to the C++ language will be now implemented: by implicit roles of some template classes and not by the core language?
Please consider these examples:
widget<int> w = {1,2,3}; //this is how we want to use a class
why was a new class chosen:
widget( std::initializer_list<T> init )
instead of using something similar to any of these ideas:
widget( T[] init, int length ) // (1)
widget( T... init ) // (2)
widget( std::vector<T> init ) // (3)
(1) a classic array; you could probably add const here and there
(2) three dots already exist in the language (var-args, now variadic templates), so why not reuse the syntax (and make it feel built-in)
(3) just an existing container; could add const and &
All of them are already a part of the language. I only wrote my 3 first ideas, I am sure that there are many other approaches.

There were already examples of "core" language features that returned types defined in the std namespace. typeid returns std::type_info and (stretching a point perhaps) sizeof returns std::size_t.
In the former case, you already need to include a standard header in order to use this so-called "core language" feature.
Now, for initializer lists it happens that no keyword is needed to generate the object, the syntax is context-sensitive curly braces. Aside from that it's the same as type_info. Personally I don't think the absence of a keyword makes it "more hacky". Slightly more surprising, perhaps, but remember that the objective was to allow the same braced-initializer syntax that was already allowed for aggregates.
So yes, you can probably expect more of this design principle in future:
if more occasions arise where it is possible to introduce new features without new keywords then the committee will take them.
if new features require complex types, then those types will be placed in std rather than as builtins.
Hence:
if a new feature requires a complex type and can be introduced without new keywords then you'll get what you have here, which is "core language" syntax with no new keywords and that uses library types from std.
What it comes down to, I think, is that there is no absolute division in C++ between the "core language" and the standard libraries. They're different chapters in the standard but each references the other, and it has always been so.
There is another approach in C++11, which is that lambdas introduce objects that have anonymous types generated by the compiler. Because they have no names they aren't in a namespace at all, certainly not in std. That's not a suitable approach for initializer lists, though, because you use the type name when you write the constructor that accepts one.

The C++ Standard Committee seems to prefer not to add new keywords, probably because that increases the risk of breaking existing code (legacy code could use that keyword as the name of a variable, a class, or whatever else).
Moreover, it seems to me that defining std::initializer_list as a templated container is quite an elegant choice: if it was a keyword, how would you access its underlying type? How would you iterate through it? You would need a bunch of new operators as well, and that would just force you to remember more names and more keywords to do the same things you can do with standard containers.
Treating an std::initializer_list as any other container gives you the opportunity of writing generic code that works with any of those things.
UPDATE:
Then why introduce a new type, instead of using some combination of existing? (from the comments)
To begin with, all other containers have methods for adding, removing, and emplacing elements, which are not desirable for a compiler-generated collection. The only exception is std::array<>, which wraps a fixed-size C-style array and would therefore remain the only reasonable candidate.
However, as Nicol Bolas correctly points out in the comments, another, fundamental difference between std::initializer_list and all other standard containers (including std::array<>) is that the latter ones have value semantics, while std::initializer_list has reference semantics. Copying an std::initializer_list, for instance, won't cause a copy of the elements it contains.
Moreover (once again, courtesy of Nicol Bolas), having a special container for brace-initialization lists allows overloading on the way the user is performing initialization.
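The overloading point can be sketched with a hypothetical widget class that follows the constructor shape from the question. Braces select the std::initializer_list constructor, while parentheses select the count-and-value constructor, much as std::vector behaves.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical widget, for illustration only.
template <typename T>
class widget {
public:
    // Chosen for brace-initialization: widget<int> w = {1, 2, 3};
    widget(std::initializer_list<T> init) : items_(init) {}

    // Chosen for parentheses: widget<int> w(3, 7); means three copies of 7.
    widget(std::size_t count, const T& value) : items_(count, value) {}

    std::size_t size() const { return items_.size(); }
    const T& operator[](std::size_t i) const { return items_[i]; }

private:
    std::vector<T> items_;
};
```

With this class, widget<int> a = {3, 7}; holds the two elements 3 and 7, whereas widget<int> b(3, 7); holds three copies of 7.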

This is nothing new. For example, for (auto i : some_container) relies on the existence of specific methods or standalone functions for the some_container class. C# relies even more on its .NET libraries. Actually, I think this is quite an elegant solution, because you can make your classes compatible with some language constructs without complicating the language specification.

This is indeed nothing new and, as many have pointed out, this practice already existed in C++ and exists in, say, C#.
Andrei Alexandrescu has mentioned a good point about this though: You may think of it as a part of imaginary "core" namespace, then it'll make more sense.
So, it's actually something like: core::initializer_list, core::size_t, core::begin(), core::end() and so on. This is just an unfortunate coincidence that std namespace has some core language constructs inside it.

Not only can it work completely in the standard library; inclusion in the standard library also does not mean that the compiler cannot play clever tricks.
While it may not be able to in all cases, it may very well say: this type is well known, or a simple type, so let's ignore the initializer_list and just keep a memory image of what the initialized value should be.
In other words, int i {5}; can be equivalent to int i(5); or int i=5; or even intwrapper iw {5};, where intwrapper is a simple wrapper class over an int with a trivial constructor taking an initializer_list.

It's not part of the core language because it can be implemented entirely in the library, just like operator new and operator delete. What advantage would there be in making compilers more complicated to build it in?
