hash_value function in C++11 - c++

The Boost library provides a convenience function hash_value which basically just called:
return hash<T>()(key);
As far as I can see, C++11 included std::hash which is pretty similar to boost::hash, but did not include std::hash_value. This requires application code to create a hash object and call it's operator() instead of just calling a convenient function. Is there some reason that std::hash_value was not standardized?

The primary use of the std::hash<T> function is the object used to obtain a hash value from a key in the std::unordered_* group of containers. These will always contain and use a corresponding object, possibly, using the empty base optimization to avoid it taking any memory. In any case, whenever the std::hash<T> type is used, an object is actually around.
Although the function object can be used stand-alone, it is probably rare. Also, for other, similar existing function objects there are no corresponding convenience calling functions: although most of them are wrappers for operators, especially std::less<void*> could be interesting to call stand-alone as you can't use ptr1 < ptr2 (at least, it couldn't be used in C++03 if ptr1 and ptr2 were not part of the same array object). That is, there was no suitable precedence.
Finally, I would guess that the convenience function was simply not part of the proposal: if it isn't proposed and there isn't a really good case to have, nothing will be included into the C++ standard. From the looks of it n1456 seems to be, at least, one revision of the "hash table" proposal and it doesn't include a trace of std::hash_value<T>().

Related

Can Niebloids be passed where Callables is required?

Generally speaking, unless explicitly allowed, the behavior of a C++ program that tries to take the pointer of a standard library function is unspecified. Which means extra caution should be taken before passing them as Callable. Instead it is typically better to wrap them in a lambda.
More on the topic: Can I take the address of a function defined in standard library?
However, C++20 introduced Constrained algorithms, or ranged algorithms, based on the Range-v3 library; where function-like entities, such as std::ranges::sort and std::ranges::transform, are introduced as Niebloids.
While the original library has created a functor class for each functions in the algorithm library, and each niebloids, such as ranges::sort, is simply a named object of the corresponding functor class; the standard does not specify how they should be implemented.
So the question is if the behavior of passing a Niebloid as a Callable, such as std::invoke(std::ranges::sort, my_vec), specified/explicitly allowed?
All the spec says, in [algorithms.requirements] is:
The entities defined in the std​::​ranges namespace in this Clause are not found by argument-dependent name lookup ([basic.lookup.argdep]). When found by unqualified ([basic.lookup.unqual]) name lookup for the postfix-expression in a function call ([expr.call]), they inhibit argument-dependent name lookup.
The only way to implement that, today, is by making them objects. However, we don't specify any further behavior of those objects.
So this:
std::invoke(std::ranges::sort, my_vec)
will work, simply because that will simply evaluate as std::ranges::sort(my_vec) after taking a reference to it, and there's no way to really prevent that from working.
But other uses might not. For instance, std::views::transform(r, std::ranges::distance) is not specified to work, because we don't say whether std::ranges::distance is copyable or not - std::ranges::size is a customization point object, and thus copyable, but std::ranges::distance is just an algorithm.
The MSVC implementation tries to adhere aggressively to the limited specification, and its implementation of std::ranges::distance is not copyable. libstdc++, on the other hand, just makes them empty objects, so views::transform(ranges::distance) just works by way of being not actively rejected.
All of which to say is: once you get away from directly writing std::ranges::meow(r) (or otherwise writing meow(r) after a using or using namespace), you're kind of on your own.

Using std::string's std::hash specialization without constructing a string object

I have a codebase which makes extensive use of CUDA, which unfortunately only supports C++14 to date. However, I still want to use string_view, which is a C++17 feature. The implementation is relatively straight-forward, especially since I do not require the 'find' functionalities.
However, I do need hashing to work. The standard mandates that std::hash of a string_view must be the equal to the hash of a string constructed from the string_view (and I intend to rely on this guarantee). Is there a standard-conform way of getting the output from std::hash without having to temporarily construct a string object, which may come with an unoptimizable heap allocation (which is the route that string-view-lite went)? I would rather not rely on copying over the algorithm from the concrete stdlib implementation, since that might break in the future or already breaks compilation with older versions.
Alternatively, is there a way to let MSVC (EDIT: v14.16) use std::string_view in C++14-mode, which NVCC also recognizes? It would be great if Clang and GCC had a similar option as well, since the codebase might migrate away from MSVC one day.
You are out of luck in my opinion since, also by assuming that you could mimic the internal structure and pass a tailored object to fulfill and look like a std::string, that would surely be more fragile than just copying the implementation.
I see two choices, either:
you copy the std::hash<std::string> specialization implementation and put some asserts with hardcoded cases which may warn you if anything changes or is different (rather a clumsy solution).
you provide your own hash function which overrides std::string one and pass it as template argument so that you can enforce the constraint of them being equal when working with STL collections.

Why are C++11 string new functions (stod, stof) not member functions of the string class?

Why are those C++11 new functions of header <string> (stod, stof, stoull) not member functions of the string class ?
Isn't more C++ compliant to write mystring.stod(...) rather than stod(mystring,...)?
It is a surprise to many, but C++ is not an Object-Oriented language (unlike Java or C#).
C++ is a multi-paradigm language, and therefore tries to use the best tool for the job whenever possible. In this instance, a free-function is the right tool.
Guideline: Prefer non-member non-friend functions to member functions (from Efficient C++, Item 23)
Reason: a member function or friend function has access to the class internals whereas a non-member non-friend function does not; therefore using a non-member non-friend function increases encapsulation.
Exception: when a member function or friend function provides a significant advantage (such as performance), then it is worth considering despite the extra coupling. For example even though std::find works really well, associative containers such as std::set provide a member-function std::set::find which works in O(log N) instead of O(N).
The fundamental reason is that they don't belong there. They
don't really have anything to do with strings. Stop and think
about it. User defined types should follow the same rules as
built-in types, so every time you defined a new user type,
you'd have to add a function to std::string. This would
actually be possible in C++: if std::string had a member
function template to, without a generic implementation, you
could add a specialization for each type, and call
str.to<double>() or str.to<MyType>(). But is this really
what you want. It doesn't seem like a clean solution to me,
having everyone writing a new class having to add
a specialization to std::string. Putting these sort of things
in the string class bastardizes it, and is really the opposite
of what OO tries to achieve.
If you were to insist on pure OO, they would have to be
members of double, int, etc. (A constructor, really. This
is what Python does, for example.) C++ doesn't insist on pure
OO, and doesn't allow basic types like double and int to
have members or special constructors. So free functions are
both an acceptable solution, and the only clean solution
possible in the context of the language.
FWIW: conversions to/from textual representation is always
a delicate problem: if I do it in the target type, then I've
introduced a dependency on the various sources and sinks of text
in the target type---and these can vary in time. If I do it in
the source or sink type, I make them dependent on the the type
being converted, which is even worse. The C++ solution is to
define a protocol (in std::streambuf), where the user writes
a new free function (operator<< and operator>>) to handle
the conversions, and counts on operator overload resolution to
find the correct function. The advantage of the free function
solution is that the conversions are part of neither the data
type (which thus doesn't have to know of sources and sinks) nor
the source or sink type (which thus doesn't have to know about
user defined data types). It seems like the best solution to
me. And functions like stod are just convenience functions,
which make one particularly frequent use easier to write.
Actually they are some utility functions and they don't need to be inside the main class. Similar utility functions such as atoi, atof are defined (but for char*) inside stdlib.h and they too are standalone functions.

Specializing std::optional

Will it be possible to specialize std::optional for user-defined types? If not, is it too late to propose this to the standard?
My use case for this is an integer-like class that represents a value within a range. For instance, you could have an integer that lies somewhere in the range [0, 10]. Many of my applications are sensitive to even a single byte of overhead, so I would be unable to use a non-specialized std::optional due to the extra bool. However, a specialization for std::optional would be trivial for an integer that has a range smaller than its underlying type. We could simply store the value 11 in my example. This should provide no space or time overhead over a non-optional value.
Am I allowed to create this specialization in namespace std?
The general rule in 17.6.4.2.1 [namespace.std]/1 applies:
A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly
prohibited.
So I would say it's allowed.
N.B. optional will not be part of the C++14 standard, it will be included in a separate Technical Specification on library fundamentals, so there is time to change the rule if my interpretation is wrong.
If you are after a library that efficiently packs the value and the "no-value" flag into one memory location, I recommend looking at compact_optional. It does exactly this.
It does not specialize boost::optional or std::experimental::optional but it can wrap them inside, giving you a uniform interface, with optimizations where possible and a fallback to 'classical' optional where needed.
I've asked about the same thing, regarding specializing optional<bool> and optional<tribool> among other examples, to only use one byte. While the "legality" of doing such things was not under discussion, I do think that one should not, in theory, be allowed to specialize optional<T> in contrast to eg.: hash (which is explicitly allowed).
I don't have the logs with me but part of the rationale is that the interface treats access to the data as access to a pointer or reference, meaning that if you use a different data structure in the internals, some of the invariants of access might change; not to mention providing the interface with access to the data might require something like reinterpret_cast<(some_reference_type)>. Using a uint8_t to store a optional-bool, for example, would impose several extra requirements on the interface of optional<bool> that are different to the ones of optional<T>. What should the return type of operator* be, for example?
Basically, I'm guessing the idea is to avoid the whole vector<bool> fiasco again.
In your example, it might not be too bad, as the access type is still your_integer_type& (or pointer). But in that case, simply designing your integer type to allow for a "zombie" or "undetermined" value instead of relying on optional<> to do the job for you, with its extra overhead and requirements, might be the safest choice.
Make it easy to opt-in to space savings
I have decided that this is a useful thing to do, but a full specialization is a little more work than necessary (for instance, getting operator= correct).
I have posted on the Boost mailing list a way to simplify the task of specializing, especially when you only want to specialize some instantiations of a class template.
http://boost.2283326.n4.nabble.com/optional-Specializing-optional-to-save-space-td4680362.html
My current interface involves a special tag type used to 'unlock' access to particular functions. I have creatively named this type optional_tag. Only optional can construct an optional_tag. For a type to opt-in to a space-efficient representation, it needs the following member functions:
T(optional_tag) constructs an uninitialized value
initialize(optional_tag, Args && ...) constructs an object when there may be one in existence already
uninitialize(optional_tag) destroys the contained object
is_initialized(optional_tag) checks whether the object is currently in an initialized state
By always requiring the optional_tag parameter, we do not limit any function signatures. This is why, for instance, we cannot use operator bool() as the test, because the type may want that operator for other reasons.
An advantage of this over some other possible methods of implementing it is that you can make it work with any type that can naturally support such a state. It does not add any requirements such as having a move constructor.
You can see a full code implementation of the idea at
https://bitbucket.org/davidstone/bounded_integer/src/8c5e7567f0d8b3a04cc98142060a020b58b2a00f/bounded_integer/detail/optional/optional.hpp?at=default&fileviewer=file-view-default
and for a class using the specialization:
https://bitbucket.org/davidstone/bounded_integer/src/8c5e7567f0d8b3a04cc98142060a020b58b2a00f/bounded_integer/detail/class.hpp?at=default&fileviewer=file-view-default
(lines 220 through 242)
An alternative approach
This is in contrast to my previous implementation, which required users to specialize a class template. You can see the old version here:
https://bitbucket.org/davidstone/bounded_integer/src/2defec41add2079ba023c2c6d118ed8a274423c8/bounded_integer/detail/optional/optional.hpp
and
https://bitbucket.org/davidstone/bounded_integer/src/2defec41add2079ba023c2c6d118ed8a274423c8/bounded_integer/detail/optional/specialization.hpp
The problem with this approach is that it is simply more work for the user. Rather than adding four member functions, the user must go into a new namespace and specialize a template.
In practice, all specializations would have an in_place_t constructor that forwards all arguments to the underlying type. The optional_tag approach, on the other hand, can just use the underlying type's constructors directly.
In the specialize optional_storage approach, the user also has the responsibility of adding proper reference-qualified overloads of a value function. In the optional_tag approach, we already have the value so we do not have to pull it out.
optional_storage also required standardizing as part of the interface of optional two helper classes, only one of which the user is supposed to specialize (and sometimes delegate their specialization to the other).
The difference between this and compact_optional
compact_optional is a way of saying "Treat this special sentinel value as the type being not present, almost like a NaN". It requires the user to know that the type they are working with has some special sentinel. An easily specializable optional is a way of saying "My type does not need extra space to store the not present state, but that state is not a normal value." It does not require anyone to know about the optimization to take advantage of it; everyone who uses the type gets it for free.
The future
My goal is to get this first into boost::optional, and then part of the std::optional proposal. Until then, you can always use bounded::optional, although it has a few other (intentional) interface differences.
I don't see how allowing or not allowing some particular bit pattern to represent the unengaged state falls under anything the standard covers.
If you were trying to convince a library vendor to do this, it would require an implementation, exhaustive tests to show you haven't inadvertently blown any of the requirements of optional (or accidentally invoked undefined behavior) and extensive benchmarking to show this makes a notable difference in real world (and not just contrived) situations.
Of course, you can do whatever you want to your own code.

Why isn't std::initializer_list a language built-in?

Why isn't std::initializer_list a core-language built-in?
It seems to me that it's quite an important feature of C++11 and yet it doesn't have its own reserved keyword (or something alike).
Instead, initializer_list it's just a template class from the standard library that has a special, implicit mapping from the new braced-init-list {...} syntax that's handled by the compiler.
At first thought, this solution is quite hacky.
Is this the way new additions to the C++ language will be now implemented: by implicit roles of some template classes and not by the core language?
Please consider these examples:
widget<int> w = {1,2,3}; //this is how we want to use a class
why was a new class chosen:
widget( std::initializer_list<T> init )
instead of using something similar to any of these ideas:
widget( T[] init, int length ) // (1)
widget( T... init ) // (2)
widget( std::vector<T> init ) // (3)
a classic array, you could probably add const here and there
three dots already exist in the language (var-args, now variadic templates), why not re-use the syntax (and make it feel built-in)
just an existing container, could add const and &
All of them are already a part of the language. I only wrote my 3 first ideas, I am sure that there are many other approaches.
There were already examples of "core" language features that returned types defined in the std namespace. typeid returns std::type_info and (stretching a point perhaps) sizeof returns std::size_t.
In the former case, you already need to include a standard header in order to use this so-called "core language" feature.
Now, for initializer lists it happens that no keyword is needed to generate the object, the syntax is context-sensitive curly braces. Aside from that it's the same as type_info. Personally I don't think the absence of a keyword makes it "more hacky". Slightly more surprising, perhaps, but remember that the objective was to allow the same braced-initializer syntax that was already allowed for aggregates.
So yes, you can probably expect more of this design principle in future:
if more occasions arise where it is possible to introduce new features without new keywords then the committee will take them.
if new features require complex types, then those types will be placed in std rather than as builtins.
Hence:
if a new feature requires a complex type and can be introduced without new keywords then you'll get what you have here, which is "core language" syntax with no new keywords and that uses library types from std.
What it comes down to, I think, is that there is no absolute division in C++ between the "core language" and the standard libraries. They're different chapters in the standard but each references the other, and it has always been so.
There is another approach in C++11, which is that lambdas introduce objects that have anonymous types generated by the compiler. Because they have no names they aren't in a namespace at all, certainly not in std. That's not a suitable approach for initializer lists, though, because you use the type name when you write the constructor that accepts one.
The C++ Standard Committee seems to prefer not to add new keywords, probably because that increases the risk of breaking existing code (legacy code could use that keyword as the name of a variable, a class, or whatever else).
Moreover, it seems to me that defining std::initializer_list as a templated container is quite an elegant choice: if it was a keyword, how would you access its underlying type? How would you iterate through it? You would need a bunch of new operators as well, and that would just force you to remember more names and more keywords to do the same things you can do with standard containers.
Treating an std::initializer_list as any other container gives you the opportunity of writing generic code that works with any of those things.
UPDATE:
Then why introduce a new type, instead of using some combination of existing? (from the comments)
To begin with, all others containers have methods for adding, removing, and emplacing elements, which are not desirable for a compiler-generated collection. The only exception is std::array<>, which wraps a fixed-size C-style array and would therefore remain the only reasonable candidate.
However, as Nicol Bolas correctly points out in the comments, another, fundamental difference between std::initializer_list and all other standard containers (including std::array<>) is that the latter ones have value semantics, while std::initializer_list has reference semantics. Copying an std::initializer_list, for instance, won't cause a copy of the elements it contains.
Moreover (once again, courtesy of Nicol Bolas), having a special container for brace-initialization lists allows overloading on the way the user is performing initialization.
This is nothing new. For example, for (i : some_container) relies on existence of specific methods or standalone functions in some_container class. C# even relies even more on its .NET libraries. Actually, I think, that this is quite an elegant solution, because you can make your classes compatible with some language structures without complicating language specification.
This is indeed nothing new and how many have pointed out, this practice was there in C++ and is there, say, in C#.
Andrei Alexandrescu has mentioned a good point about this though: You may think of it as a part of imaginary "core" namespace, then it'll make more sense.
So, it's actually something like: core::initializer_list, core::size_t, core::begin(), core::end() and so on. This is just an unfortunate coincidence that std namespace has some core language constructs inside it.
Not only can it work completely in the standard library. Inclusion into the standard library does not mean that the compiler can not play clever tricks.
While it may not be able to in all cases, it may very well say: this type is well known, or a simple type, lets ignore the initializer_list and just have a memory image of what the initialized value should be.
In other words int i {5}; can be equivalent to int i(5); or int i=5; or even intwrapper iw {5}; Where intwrapper is a simple wrapper class over an int with a trivial constructor taking an initializer_list
It's not part of the core language because it can be implemented entirely in the library, just line operator new and operator delete. What advantage would there be in making compilers more complicated to build it in?