How does std::map::extract() allow changing the key? - c++

I am writing a template class similar to std::map. Currently I'm working on implementing a function equivalent to std::map::extract(). This should return a node handle with its own function node_type::key(), that returns a non-const reference to the key. This therefore allows changing the key associated with a mapped object and thus avoids moving the mapped object.
std::map exposes its value as std::pair<const Key, T>, but somehow allows changing the Key object through a node_type object. I don't understand how STL implementations deal with this. I am led to believe that they use const_cast, but every resource I have read heavily discourages const_cast for fear of undefined behaviour, especially when the value is changed afterwards.
This resource invokes "implementation magic".
How can I implement std::map::extract() without causing undefined behaviour?
Related Questions
Using std::map::extract to modify key
Rationale of restrictive rules for extract and re-insert with map
Is it possible to cast a pair<Key, Value> to a pair<const Key, Value>?
Type punning between `pair<Key, Value>` and `pair<const Key, Value>`

How can I implement std::map::extract() without causing undefined behaviour?
You can't. The standard specifies behaviour that is not implementable in portable C++.
An implementer (of C++) will be required to have implementation-defined additional guarantees, which need not be exposed to user code, to handle this requirement. This is colloquially known as "implementation magic".
What you could do is write a proposal, submit it to the committee, and get it voted into a future standard, so that there is a mechanism by which this can be accomplished. Implicit-lifetime types, which arrived in C++20, did a similar thing for the requirements on std::vector::data.
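For reference, here is a minimal sketch of the user-facing behaviour being discussed (C++17). It only shows how a caller uses extract() and node_type::key() on a standard unique-key map; it says nothing about how a standard library implements them. The helper name rekey is made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: re-keys one element of a unique-key map without
// copying or moving the mapped value.
// Returns false if old_key is absent or new_key is already present.
template <class Map>
bool rekey(Map& m, const typename Map::key_type& old_key,
           typename Map::key_type new_key) {
    // Check first, so a failed re-insert cannot drop the element.
    if (m.count(new_key) != 0) return false;

    auto node = m.extract(old_key);      // empty node handle if old_key is absent
    if (node.empty()) return false;

    node.key() = std::move(new_key);     // non-const access to the key
    auto result = m.insert(std::move(node));  // relinks the very same node
    assert(result.inserted);
    return true;
}
```

Because the node is relinked rather than reallocated, pointers and references to the mapped object stay valid across the key change.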

Related

Const-ness affecting concurrent algorithms?

Are there any examples of the absence or presence of const affecting the concurrency of any C++ Standard Library algorithms or containers? If there aren't, is there any reason using the const-ness in this way is not permitted?
To clarify, I'm assuming concurrent const access to objects is data-race free, as advocated in several places, including GotW 6.
By analogy, the noexcept-ness of move operations can affect the performance of std::vector's methods such as resize.
I skimmed through several of the C++17 concurrent algorithms, hoping to find an example, but I didn't find anything. Algorithms like transform don't require the unary_op or binary_op function objects to be const. I did find that for_each takes a MoveConstructible function object in the original, non-execution-policy version and takes a CopyConstructible function object in the C++17 execution-policy version. I thought this might be an example where the programmer is forced to select one or the other manually, based on whether the function object can be safely copied (a const operation), but at the moment I suspect this requirement is just there to support the return type (UnaryFunction vs. void).

How to check if variable is still valid or std::move was used on it?

I am aware that after using std::move the variable is still valid, but in an unspecified state.
Unfortunately, recently I have come across several bugs in our code base where a function was accessing the moved variable, and weird things were happening. These issues were extremely hard to track down.
Is there any compiler option (in clang) or any way to throw an error either during runtime or compilation?
Some things that may help:
Use a static analyzer. Xcode has it built-in.
https://clang-analyzer.llvm.org/
Use Address Sanitizer and Undefined Behaviour sanitizer
http://clang.llvm.org/docs/AddressSanitizer.html
https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
Code changes that can make such bugs easy to track down:
I'm assuming that if you're using std::move on something, it is usually (though not always) a heavy container.
If so, try to use std::unique_ptr<T> to create it. Calls to movers must explicitly use std::move, which is easy to spot. And other non-owning access functions can just work with .get(). You can also check for nullability and throw if it's nullptr at any point where you need to access it.
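A minimal sketch of that idea, assuming C++14 or later. The names checked, consume and use_after_move_is_caught are made up for illustration; the only library guarantee relied on is that a moved-from std::unique_ptr is null.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper: non-owning access that fails loudly when the
// pointer is empty, which is the state a unique_ptr is left in after a move.
template <class T>
T& checked(const std::unique_ptr<T>& p) {
    if (!p) throw std::logic_error("access to empty (possibly moved-from) unique_ptr");
    return *p;
}

// Hypothetical sink: takes ownership, so callers must write std::move.
inline std::size_t consume(std::unique_ptr<std::vector<int>> v) {
    assert(v != nullptr);
    return v->size();
}

// Returns true if the accidental use after move was detected.
inline bool use_after_move_is_caught() {
    auto data = std::make_unique<std::vector<int>>(3, 0);
    consume(std::move(data));            // ownership leaves 'data' here
    try {
        checked(data).push_back(1);      // bug: 'data' is now null
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

The bug now surfaces as an exception at the point of misuse instead of as "weird things" later on.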
I am aware that after using std::move the variable is still valid, but in an unspecified state.
This is not a universal truth. More generally, the object that was moved from is in whatever state in which the move constructor / assignment operator left it.
The standard library does give the guarantee that you describe, at minimum. But it is also possible to implement a move operation for your own class that doesn't abide by it and leaves the moved-from object in an invalid state. It is, however, a good design choice to implement move operations in the way you describe.
How to check if variable is still valid or std::move was used on it?
There is no way to do such a check in general within the language.
Is there any compiler option (in clang) or any way to throw an error either during runtime or compilation?
Note that using a variable after a move is not necessarily a bug at all, but can instead be an entirely correct thing to do. Some types specify exactly the state of the moved from object (std::unique_ptr for example) and others which have the validity guarantee can be used in ways that have no pre-conditions (such as calling container.size()).
As such, using a moved from object is only a problem if that violates a pre-condition, which would result in undefined behaviour. Clang and other compilers have runtime sanitisers that may be able to catch some undefined behaviour. There are also many warning options and static analysers that diagnose cases where bugs are likely.
Using them is a very good idea, but you should not rely solely on them because they won't be able to find all bugs. The programmer still needs to be careful when writing the program, and needs to compare it with the rules of the language. Following common idioms such as RAII, avoiding bare owning pointers (and other resource handles) goes a long way in avoiding typical bugs.
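To illustrate the point about moved-from objects that are perfectly fine to use, here is a small sketch (C++14 or later; the function name reuse_after_move is made up). It relies only on guarantees the standard library documents: a moved-from std::unique_ptr is null, and a moved-from std::string is valid, so operations without preconditions such as clear() may be called on it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

inline std::string reuse_after_move() {
    auto p = std::make_unique<int>(42);
    auto q = std::move(p);
    assert(p == nullptr);    // specified state: moved-from unique_ptr is null
    assert(*q == 42);

    std::string s = "hello";
    std::string t = std::move(s);
    // 's' is valid but its contents are unspecified, so do not read them.
    // Operations without preconditions are fine:
    s.clear();               // now 's' is in a known state again
    s += "again";
    return s + " " + t;
}
```

Reading s before the clear() would not be undefined behaviour either, but the result would be unspecified, which is usually a logic bug.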

Where can I find what std::launder really does? [duplicate]

This question already has answers here:
What is the purpose of std::launder?
I am trying to understand what std::launder does, and I hoped that by looking up an example implementation it would be clear.
Where can I find an example implementation of std::launder?
When I looked in libc++ I saw code like
template<typename _Tp>
[[nodiscard]] constexpr _Tp*
launder(_Tp* __p) noexcept
{ return __builtin_launder(__p); }
Which makes me think that this is another of those compiler-magic functions.
What is it that this function __builtin_launder can potentially do? Does it simply add a tag to suppress compiler warnings about aliasing?
Is it possible to understand std::launder in terms of __builtin_launder, or is it just more compiler magic (hooks)?
The purpose of std::launder is not to "suppress warnings" but to remove assumptions that the C++ compiler may have.
Aliasing warnings are trying to inform you that you are possibly doing things whose behaviour is not defined by the C++ standard.
The compiler can and does assume that your code only does things defined by the standard. For example, it can assume that the const value a pointer refers to, once constructed, will not be changed.
Compilers may use that assumption to skip refetching the value from memory (and store it in a register), or even calculate its value at compile time and do dead-code elimination based on it. It can assume this, because any program where it is false is doing undefined behaviour, so any program behaviour is accepted under the C++ standard.
std::launder was crafted to permit you to do things like take a pointer to a truly const value that was legally modified (by creating a new object in its storage, say) and use that pointer after the modification in a defined way (so that it refers to the new object), along with other specific and similar situations (don't assume it just "removes aliasing problems").
__builtin_launder is going to be a no-op function in one sense, but in another sense it changes what kind of assembly code can be generated around it. With it, certain assumptions about what value can be reached from its input cannot be made about its output, and some code that would be UB on the input pointer is not UB on the output pointer.
It is an expert tool. I, personally, wouldn't use it without doing a lot of standard delving and double checking that I wasn't using it wrong. It was added because there were certain operations someone proved there was no way to reasonably do in a standard compliant way, and it permits a library writer to do it efficiently now.
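As a rough illustration of the "new object in the same storage" case, here is a sketch assuming C++17. The struct Config and the function name are made up. Under the C++17 wording, a pointer to the old object does not automatically refer to a replacement object when the type has a const member; as far as I know C++20 relaxed that particular rule, but going through std::launder remains valid there too.

```cpp
#include <cassert>
#include <new>

// Hypothetical type with a const member, so the compiler may assume that
// 'version' never changes for the lifetime of a given object.
struct Config {
    const int version;
};

inline int read_after_replacement() {
    alignas(Config) unsigned char storage[sizeof(Config)];

    Config* first = new (storage) Config{1};
    assert(first->version == 1);

    // Reusing the storage ends the lifetime of the first object
    // (Config is trivially destructible, so no destructor call is needed).
    new (storage) Config{2};

    // Reading first->version directly is the problematic access under the
    // C++17 rules. std::launder yields a pointer to the object that now
    // lives at that address.
    return std::launder(first)->version;
}
```

The generated code for std::launder is typically nothing at all; what changes is that the compiler may no longer reuse the earlier read of version.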

Why doesn't std::priority_queue have a clear() member function

I was doing some hacking today and found out that std::priority_queue does not have a clear() member function. Are there any technical reasons as to why the standards committee may have left this out?
To be clear, I am aware that it is easy to work around this via assignment:
oldPQ = std::priority_queue<int>{};
This solution is less desirable because:
It requires you to repeat the type, which does not hold up well under maintenance. As @chris pointed out below, you can simplify this if you're using the default constructor, but if you have a custom comparator, this may not be possible.
std::priority_queue cannot be used in a templated function that expects a clear() member function.
It is generally undesirable that it does not meet the common interface provided by the other containers. In particular, everything from std::forward_list to std::unordered_map to std::string has clear(). The only other exceptions I note are std::array, for which the semantics would not make sense, and std::stack and std::queue, for which the semantics are more questionable when std::deque works without any extra effort.
One item that looks like an issue, but in practice needn't be:
Because the internal container used for std::priority_queue is templated and may not have a clear() member function of its own, this creates an interesting problem; in particular, it raises the question of backward compatibility. This is a non-issue because:
For internal containers that do not provide clear(), as long as nobody attempts to invoke std::priority_queue::clear(), the code will continue to compile.
It may still be possible with SFINAE to provide the new interface (the clear member) by calling clear() on the internal container when it's available and by repeatedly popping if it is not.
It is my opinion that this is a defect in the C++ standard. Assuming a technical discussion does not provide a strong case for why this method is omitted, I intend to pursue the creation of a standards proposal.
Edit:
Seems this is being handled in-committee (note the last post): https://groups.google.com/a/isocpp.org/forum/?fromgroups#!searchin/std-discussion/clear/std-discussion/_mYobAFBOrM/ty-2347w1T4J
http://wg21.cmeerw.net/lwg/issue2194
The specification of container adaptors is known to be overly pedantic: since the "abstract" spec of the corresponding data structure (from some book on abstract algorithms and data structures) does not include operation clear for canonical priority queues or stacks, it is not provided in the adaptor. This indeed often makes it quite inconvenient to use these adaptors in practice.
The good news, though, is that the inner container is declared inside the adaptor as a protected member named c. This is probably done specifically so that you can easily implement your own version of the adaptor: by inheriting from the standard adaptor and adding whatever member functions you want, including clear.
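A minimal sketch of that approach, assuming the underlying container has its own clear() (as std::vector and std::deque do). The class name clearable_priority_queue and the demo function are made up for illustration.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

template <class T, class Container = std::vector<T>,
          class Compare = std::less<typename Container::value_type>>
class clearable_priority_queue
    : public std::priority_queue<T, Container, Compare> {
    using base = std::priority_queue<T, Container, Compare>;

public:
    using base::base;  // keep the adaptor's constructors (custom comparator, etc.)

    // 'c' is the protected underlying container of the adaptor.
    // An empty container is trivially a valid heap, so no re-heapify is needed.
    void clear() { this->c.clear(); }
};

// Usage sketch.
inline void demo_clear() {
    clearable_priority_queue<int> pq;
    pq.push(3);
    pq.push(7);
    pq.push(1);
    assert(pq.top() == 7);

    pq.clear();
    assert(pq.empty());

    pq.push(5);          // the queue is fully usable after clear()
    assert(pq.top() == 5);
}
```

Unlike assigning a freshly constructed queue, this keeps the comparator object (and, for std::vector, the allocated capacity) in place.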
As for comparing these adaptors' interfaces with standard container interfaces... I don't think it is a valid comparison. These adaptors have never been intended to be compatible with containers in terms of interface. Quite the opposite, the purpose of these adaptors was largely to restrict the public interface of the data structure and force it into the narrow bounds of what is allowed by its canonical abstract definition.
For example, you are not allowed to iterate over the canonical stack. Stack, by definition, is not "iterable". The fact that stack adaptor disables iteration interface is a good thing. But the absence of clear certainly feels too pedantic, since it has a great practical value without looking like a big violation of the canonical interface.

Well definedness of C++ programs hiding pointers

According to Wikipedia:
C++11 defines conditions under which pointer values are "safely
derived" from other values. An implementation may specify that it
operates under "strict pointer safety," in which case pointers that
are not derived according to these rules can become invalid.
As I read it you can get the safety model used by an implementation, however that's fixed for the compiler (possibly variable with a command line switch).
Suppose I have code that hides pointers; such code definitely would not run with a naive bolt-on garbage collector. However, collectors such as my own and Boehm's provide hooks for finding pointers in certain objects.
I am in particular thinking about JudyArrays. These are digital tries which necessarily hide the keys. My question is basically whether using such data structures would render the behaviour of a program undefined in C++11.
I hope not (since Judy Arrays outperform everything else). Also, as it happens, I'm using them to implement a garbage collector. I am concerned, however, because "minimal requirements" don't generally work at all and were strongly opposed in the original debate on the C++ conformance model (by the UK and Australia). Parametric requirements are better. But the C++11 GC-related text seems to be a bit of both, so I'm confused!
It's implementation defined whether an implementation provides relaxed pointer safety (what you seem to want) or strict pointer safety (pointers remain valid only when safely derived). As you've implied, you can call get_pointer_safety to find out what the policy is, but the standard provides no way to specify/change the policy.
You may, however, be able to side-step this question. If you can make a call to declare_reachable (passing that pointer value) before you hide the pointer, it remains valid until a matching call to undeclare_reachable (and here "matching" means calls nest).
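A rough sketch of that pattern, under a loud assumption: it needs a C++11 to C++20 standard library, because these garbage-collection support functions were removed again in C++23. The XOR masking merely stands in for whatever pointer hiding your data structure does, and the function name is made up.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Assumes a pre-C++23 library: declare_reachable, undeclare_reachable and
// get_pointer_safety are not available in C++23 mode.
inline int hide_and_recover() {
    int* p = new int(42);

    // Tell the implementation the object must stay alive even if no
    // safely-derived pointer to it exists for a while.
    std::declare_reachable(p);

    // Hide the pointer: from here on no traceable pointer value remains.
    const std::uintptr_t mask = ~std::uintptr_t{0};
    std::uintptr_t hidden = reinterpret_cast<std::uintptr_t>(p) ^ mask;
    p = nullptr;

    // Recover it. Under strict pointer safety this value is not safely
    // derived, which is why the declare_reachable call above matters.
    int* q = reinterpret_cast<int*>(hidden ^ mask);
    q = std::undeclare_reachable(q);   // matching call; returns a safely-derived pointer

    int value = *q;
    delete q;
    return value;
}

// Query the policy of the current implementation.
inline std::pointer_safety current_policy() {
    return std::get_pointer_safety();
}
```

On the implementations I am aware of the declare/undeclare calls are no-ops and the policy is relaxed, so in practice this mostly documents intent.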