io_context concurrency hint (BOOST_ASIO_CONCURRENCY_HINT_UNSAFE_IO) - c++

While checking the documentation of boost::asio in version 1.66.0, I noticed that the io_context constructor provides a concurrency_hint parameter. After reading the documentation, I'm unsure if I can use BOOST_ASIO_CONCURRENCY_HINT_UNSAFE_IO.
I have the following situation:
I have a single io_context to do the IO. ioc.run() is executed from a single thread.
In this thread, some network IO using async calls is executed.
Other threads call boost::asio::dispatch(ioc, ...) to execute code on the IO thread.
I'm trying to figure out what concurrency hint value is safe to use in the situation as described above:
Using no concurrency hint is ok (eg. BOOST_ASIO_CONCURRENCY_HINT_SAFE), but slower than with hints.
Using 1 is ok.
Using BOOST_ASIO_CONCURRENCY_HINT_UNSAFE is not ok because it doesn't allow async calls.
What is unclear to me is BOOST_ASIO_CONCURRENCY_HINT_UNSAFE_IO. Documentation says:
This special concurrency hint disables locking in the reactor I/O. This hint has the following restrictions:
— Care must be taken to ensure that run functions on the io_context, and all operations on the context's associated I/O objects (such as sockets and timers), occur in only one thread at a time.
I wonder if it's safe to do a boost::asio::dispatch from another thread when using this concurrency hint.

Since boost::asio::dispatch¹ ends up calling io_context::dispatch I would conclude that it's not ok to use BOOST_ASIO_CONCURRENCY_HINT_UNSAFE_IO if you call it from another thread:
— Care must be taken to ensure that run functions on the io_context, and all operations on the context's associated I/O objects (such as sockets and timers), occur in only one thread at a time.
¹ same for post/defer

Related

Does boost asio strand run all handlers on the same thread?

The boost asio documentation talks about executors but I can't see if that implies the same thread.
The reason I'm curious about this is that the purpose of a strand seems to be to allow the developer not to have to worry about multithreading issues. If that is the case then I see two options for a strand, assuming more than one thread in the io_service/context:
1. Run all handlers on the same thread for the lifetime of the strand
2. Use different threads but
   2.1 Use some mechanism to make sure one handler runs after the other
   2.2 If run on a different thread make sure you use a memory fence so the second handler sees all updates from the previous handler
(I strongly suspect it is impossible to do 2.1 without doing 2.2)
The problem with 2. is that it hits performance because it requires fences, but I don't see anywhere that explicitly says a strand always uses the same thread.
No.
All that is guaranteed is that
all handlers are invoked on a thread that is invoking run[_one] or poll[_one] on the execution context. This is true for handlers on any executor
strand executors add the guarantee that handlers are only invoked sequentially (non-concurrently) and in the order they were posted (see e.g. https://www.boost.org/doc/libs/1_80_0/doc/html/boost_asio/reference/io_context__strand.html#boost_asio.reference.io_context__strand.order_of_handler_invocation)
So yes, depending on situation there may be overhead. However, in some situations there may be optimizations (e.g. dispatch from the local strand, or running on a context with a concurrency hint).

Is it safe to disable threads on boost::asio in a multi-threaded program?

I read in this SO answer that there are locks around several parts of asio's internals.
In addition I'm aware that asio is designed to allow multiple threads to service a single io_context.
However, if I only have a single thread servicing a single io_context, but I want to have more than 1 io_context in my application, is it safe to disable threads (per BOOST_ASIO_DISABLE_THREADS)?
That is: I have one io_context and one thread which has entered its io_context::run() loop, and it is servicing a number of sockets etc. All interaction with these sockets is done within the context of that thread.
I then also have another thread, and another io_context, and that thread services that io_context and its sockets etc.
Inter-thread communication is achieved using a custom thread-safe queue and an eventfd wrapped with an asio::posix::stream_descriptor which is written to by the initiating thread, and read from the receiving thread which then pops items off the thread-safe queue.
So at no point will there be user code which attempts to call asio functions from a thread which isn't associated with the io_context servicing its asio objects.
With the above use-case in mind, is it safe to disable threads in asio?
It'll depend. As far as I know it ought to be fine. See below for caveats/areas of attention.
Also, you might want to take a step back and think about the objectives. If you're trying to optimize areas containing async IO, there may be quick wins that don't require such drastic measures. That is not to say that there aren't situations where I imagine BOOST_ASIO_DISABLE_THREADS will help squeeze out just that little extra bit of performance.
Impact
What BOOST_ASIO_DISABLE_THREADS does is:
replace selected mutexes/events with null implementations
disable some internal thread support (boost::asio::detail::thread throws on construction)
remove atomics (atomic_count becomes non-atomic)
make globals behave as simple statics (applies to system_context/system_executor)
disable TLS support
System executor
It's worth noting that system_executor is the default fallback when querying for associated handler executors. The library implementation specifies that async initiations will override that default with the executor of any IO object involved (e.g. the one bound to your socket or timer).
However, you have to scrutinize your own use and that of third-party code to make sure you don't accidentally rely on fallback.
Update: turns out system_executor internally spawns a thread_group which uses detail::thread - correctly erroring out when used
IO Services
Asio is extensible. Some services may elect to run internal threads as an implementation detail.
docs:
The implementation of this library for a particular platform may make use of one or more internal threads to emulate asynchronicity. As far as possible, these threads must be invisible to the library user. [...]
I'd trust the library implementation to use detail::thread - causing a runtime error if that were to be the case.
However, again, when using third-party code/user services you'll have to make sure that they don't break your assumptions.
Also, specific operations will not work without the thread support, like:
#define BOOST_ASIO_DISABLE_THREADS
#include <boost/asio.hpp>
#include <iostream>
int main() {
    boost::asio::io_context ioc;
    boost::asio::ip::tcp::resolver r{ioc};
    std::cout << r.resolve("127.0.0.1", "80")->endpoint() << std::endl; // fine
    // throws "thread: not supported":
    r.async_resolve("127.0.0.1", "80", [](auto...) {});
}
Prints
127.0.0.1:80
terminate called after throwing an instance of 'boost::wrapexcept<boost::system::system_error>'
what(): thread: Operation not supported [system:95]
bash: line 7: 25771 Aborted (core dumped) ./a.out

Boost ASIO, SSL: How do strands help the implementation?

TLDR: Strands serialise resources shared across completion handlers: how does that prevent the ssl::stream implementation from concurrent access of the SSL context (used internally) for concurrent read/write requests (ssl::stream is not full duplex)? Remember, strands only serialise the completion handler invocation or the original queueing of the read/write requests. [Thanks to sehe for helping me express this better]
I've spent most of a day reading about ASIO, SSL and strands; mostly on stackoverflow (which has some VERY detailed and well expressed explanations, e.g. Why do I need strand per connection when using boost::asio?), and the Boost documentation; but one point remains unclear.
Obviously strands can serialise invocation of callbacks within the same strand, and so also serialise access to resources shared by those strands.
But it seems to me that the problem with boost::asio::ssl::stream isn't in the completion handler callbacks because it's not the callbacks that are operating concurrently on the SSL context, but the ssl::stream implementation that is.
I can't be confident that use of strands in calling async_read_some and async_write_some, or that use of strands for the completion handler, will prevent the io engine from operating on the SSL context at the same time in different threads.
Clearly strand use while calling async_read_some or async_write_some will mean that the read and write can't be queued at the same instant, but I don't see how that prevents the internal implementation from performing the read and write operations at the same time on different threads if the encapsulated tcp::socket becomes ready for read and write at the same time.
Comments at the end of the last answer to this question boost asio - SSL async_read and async_write from one thread claim that concurrent writes to ssl::stream could segfault rather than merely interleave, suggesting that the implementation is not taking the necessary locks to guard against concurrent access.
Unless the actual delayed socket write is bound to the thread/strand that queued it (which I can't see being true, or it would undermine the usefulness of worker threads), how can I be confident that it is possible to queue a read and a write on the same ssl::stream, or what that way could be?
Perhaps the async_write_some processes all of the data with the SSL context immediately, to produce encrypted data, and then becomes a plain socket write, and so then can't conflict with a read completion handler on the same strand, but it doesn't mean that it can't conflict with the internal implementations socket-read-and-decrypt before the completion handler gets queued on the strand. Never mind transparent SSL session re-negotiation that might happen...
I note from: Why do I need strand per connection when using boost::asio? "Composed operations are unique in that intermediate calls to the stream are invoked within the handler's strand, if one is present, instead of the strand in which the composed operation is initiated." but I'm not sure if what I am referring to are "intermediate calls to the stream". Does it mean: "any subsequent processing within that stream implementation"? I suspect not.
And finally, for why-oh-why, why doesn't the ssl::stream implementation use a futex or other lock that is cheap when there is no conflict? If the strand rules (implicit or explicit) were followed, then the cost would be almost non-existent, but it would provide safety otherwise. I ask because I've just transitioned from the propaganda of Sutter, Stroustrup and the rest, that C++ makes everything better and safer, to ssl::stream where it seems easy to follow certain spells but almost impossible to know if your code is actually safe.
The answer is that the boost ssl::stream implementation uses strands internally for SSL operations.
For example, the async_read_some() function creates an instance of openssl_operation and then calls strand_.post(boost::bind(&openssl_operation::start, op)).
[http://www.boost.org/doc/libs/1_57_0/boost/asio/ssl/old/detail/openssl_stream_service.hpp]
It seems reasonable to assume that all necessary internal ssl operations are performed on this internal strand, thus serialising access to the SSL context.
Q. but I'm not sure if what I am referring to are "intermediate calls to the stream". Does it mean: "any subsequent processing within that stream implementation"? I suspect not
The docs spell it out:
This operation is implemented in terms of zero or more calls to the stream's async_read_some function, and is known as a composed operation. The program must ensure that the stream performs no other read operations (such as async_read, the stream's async_read_some function, or any other composed operations that perform reads) until this operation completes. doc
And finally, for why-oh-why, why doesn't the ssl::stream implementation use a futex or other lock that is cheap when there is no conflict?
You can't hold a futex across async operations because any thread may execute completion handlers. So, you'd still need the strand here, making the futex redundant.
Comments at the end of the last answer to this question boost asio - SSL async_read and async_write from one thread claim that concurrent writes to ssl::stream could segfault rather than merely interleave, suggesting that the implementation is not taking the necessary locks to guard against concurrent access.
See previous entry. Don't forget about multiple service threads. Data races are Undefined Behaviour.
TL;DR
Long story short: async programming is different. It is different for good reasons. You will have to adapt your thinking to it though.
Strands help the implementation by abstracting sequential execution over the async scheduler.
This makes it so that you don't have to know what the scheduling is, how many service threads are running etc.

Boost ASIO - What is async

I've been doing a lot of reading, but I just cannot wrap my head around the difference between synchronous and asynchronous calls in Boost ASIO: what they are, how they work, and why to pick one over the other.
My model is a server which accepts connections and appends the new connection to a list. A different thread loops over the list and sends each registered connection data as it becomes available. Each write operation should be safe. It should have a timeout so that it cannot hang, it should not allocate arbitrarily large amounts of memory, or in general cause the main application to crash.
Confusion:
How does async_accept differ from regular accept? Is a new thread allocated for each connection accepted? From examples I've seen it looks like after a connection is accepted, a request handler is called. This request handler must tell the acceptor to prepare to accept again. Nothing about this seems asynchronous. If the request handler hangs then the acceptor blocks.
In the boost mailing list the OP was told to use async_write with a timer instead of regular write. In this configuration I don't see any asynchronous behaviour or why they would be recommended. From the Boost docs async_write seems more dangerous than write because the user must not call async_write again before the first one completes.
Asynchronous calls return immediately.
That's the important bit.
Now how do you control "the next thing" that happens when the asynchronous operation has completed? You got it, you supply the completion handler.
The strength of asynchrony is so you can have an IO operation (or similar) run "in the background" without necessarily incurring any thread switch or synchronization overhead. This way you can handle many asynchronous control flows at the same time, on a single thread.
Indeed asynchronous operations can be more complicated and require more thought (e.g. about lifetime of references used in the completion handler). However, when you need it, you need it.
Boost.Asio basic overview from the official site explains it well:
http://www.boost.org/doc/libs/1_61_0/doc/html/boost_asio/overview/core/basics.html
The io_service object is what handles the multiple operations.
Calls to io_service.run() should be made carefully (that could explain the "dangerous async_write")

io_service, why and how is it used?

Trying to learn asio, and I'm following the examples from the website.
Why is io_service needed and what does it do exactly? Why do I need to send it to almost every other functions while performing asynchronous operations, why can't it "create" itself after the first "binding".
Asio's io_service is the facilitator for asynchronous operations. Once an async operation completes, it uses one of the threads running the io_service to call you back. If no thread is running it, the completion handler stays queued until one does; Asio does not invoke handlers from internal threads of its own.
Think of it as a queue containing operations. It guarantees that those operations, when run, will only do so on the threads that called its run(), run_one(), poll() or poll_one() methods.
The reason you must pass it to everyone is basically that someone has to wait for async operations to be ready, and as stated in its own documentation io_service is ASIO's link to the Operating System's I/O service so it abstracts away the platform's own async notifiers, such as kqueue, /dev/poll, epoll, and the methods to operate on those, such as select().
Primarily I end up using io_service to demultiplex callbacks from several parts of the system, and make sure they operate on the same thread, eliminating the need for explicit locking, since the operations are serialized. It is a very powerful idiom for asynchronous applications.
You can take a look at the core documentation to get a better feeling of why io_service is needed and what it does.