boost::asio write() API stuck while writing data - c++

While sending data to a client (in multiple chunks), if the client stops reading after some packets, the server gets stuck in boost::asio::write(), which results in unwanted behavior in the product.
We thought of switching to async_write() with a timer over it, so that if this condition occurred we could fall back to the original good state. However, due to design faults we could not run the io_service after async_write() (because of the high concurrency), which meant we never received the callbacks needed to stop the timer.
So, is there any way (without running the io_service) to unblock the write() API?
Something like executing write() on a separate thread and terminating it through a timer. But then the question arises: is there any way to clear out the Boost buffers that already hold pending write data?
Any help would be appreciated.
Thanks.

Eventually we went with boost::asio::async_write(), but driven by io_service::poll(), poll() being non-blocking.
run() was not an option, as the system is highly concurrent and reads and writes had to share the same io_service.
The pseudocode looks something like this:
data_to_write = size of data;
set current_bytes_transferred = 0
set timeout_occurred to false
/*
current_bytes_transferred -> obtained from the async_write() callback
timeout_occurred          -> set by a separate timer
*/
while ((data_to_write != current_bytes_transferred) && (!timeout_occurred))
{
    // poll() is used instead of run() because the system
    // has high concurrency and the read and write operations
    // share the same io_service
    io_service.poll();
    if (data_to_write == current_bytes_transferred)
    {
        // SUCCESS write logic
    }
    else if (timeout_occurred)
    {
        // timeout logic
    }
}
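A minimal sketch of that approach, assuming a plain TCP socket and a boost::asio::deadline_timer; the helper name write_with_timeout and its parameters are ours, not from the original code:

#include <boost/asio.hpp>

// Illustrative helper: returns true if every byte was written before the deadline.
bool write_with_timeout(boost::asio::io_service& io_service,
                        boost::asio::ip::tcp::socket& socket,
                        const void* data, std::size_t data_to_write,
                        boost::posix_time::time_duration timeout)
{
    std::size_t current_bytes_transferred = 0;
    bool write_done = false, timer_done = false, timeout_occurred = false;

    boost::asio::deadline_timer timer(io_service);
    timer.expires_from_now(timeout);
    timer.async_wait([&](const boost::system::error_code& ec) {
        timer_done = true;
        if (!ec) timeout_occurred = true;          // timer fired, was not cancelled
    });

    boost::asio::async_write(socket,
        boost::asio::buffer(data, data_to_write),
        [&](const boost::system::error_code&, std::size_t bytes_transferred) {
            write_done = true;
            current_bytes_transferred = bytes_transferred;
            timer.cancel();                        // stop the timer on completion
        });

    // poll() runs whatever handlers are ready and returns immediately,
    // so this loop never blocks the way run() (or a plain write()) would.
    while (!write_done && !timeout_occurred)
        io_service.poll();

    if (!write_done)
        socket.cancel();                           // abandon the stuck write

    // Drain the remaining (possibly aborted) handlers while the locals
    // they capture by reference are still alive.
    while (!write_done || !timer_done)
        io_service.poll();

    return write_done && current_bytes_transferred == data_to_write;
}

Note that the poll() loops spin while waiting, which trades some CPU for never blocking the shared io_service.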

Related

c++ Non-Boost ASIO read with timeout

I have a C++17 project using the non-Boost version of ASIO because I need to connect, read and write to a TCP socket. The application has a read and a write thread that run periodically and share a mutex, so my reading thread has a time slot of 20 milliseconds in which it needs to read as much as it can and exit.
My problem is that I can't figure out how to get ASIO to read and then stop reading gracefully until another read is requested. There are no read-with-timeout functions, nor could I find any examples of such behaviour.
The closest thing I've found seems to mostly work, but not exactly, and I have no idea why. My current code is something like this:
ErrorCode Read(uint8_t* buf, unsigned int maxAmountOfBytesToRead, unsigned int& nRead)
{
    std::lock_guard<std::mutex> tcpSocketLock(m_TCPSocketMutex);
    asio::error_code asioError;
    unsigned int amountOfBytesInBuffer = 0;
    m_TCPConnectionSocket.async_read_some(asio::buffer(buf, maxAmountOfBytesToRead),
        [&](const asio::error_code& errorCode, unsigned int result_n)
        {
            asioError = errorCode;
            amountOfBytesInBuffer = result_n;
        });
    RunIOContextWithTimeOut(std::chrono::milliseconds(20));
    nRead = amountOfBytesInBuffer;
    // finish up and exit.
}
void RunIOContextWithTimeOut(std::chrono::steady_clock::duration timeout)
{
    // Restart the io_context, as it may have been left in the "stopped" state
    // by a previous operation.
    m_TCPioContext.restart();
    // Block until the asynchronous operation has completed, or timed out. If
    // the pending asynchronous operation is a composed operation, the deadline
    // applies to the entire operation, rather than individual operations on
    // the socket.
    m_TCPioContext.run_for(timeout);
    // If the asynchronous operation completed successfully then the io_context
    // would have been stopped due to running out of work. If it was not
    // stopped, then the io_context::run_for call must have timed out.
    if (!m_TCPioContext.stopped())
    {
        m_TCPioContext.stop();
        // Run the io_context again until the operation completes.
        m_TCPioContext.run();
    }
}
But when running this code, I notice that the incoming data is not exactly correct and that chunks of it are missing. Adding logs and debugging, I see that when run_for pops out because of a timeout, it never finishes the async read callback handler, which makes me suspect that when run_for doesn't finish on its own and is asked to stop, it abandons whatever data it has read and exits.
But I thought that was what the subsequent run() call was for: to make the thread go back in and finish the read before exiting. Apparently not? I don't understand how to make it just read and, when it's time to stop, copy over everything it has read and stop gracefully. All other examples have you closing sockets and cancelling everything, but I want to keep the socket open and the connection established, and just stop reading.
I can't let it read for as long as it wants because there is a write thread waiting for the read to finish so that it can execute. I would also prefer not to build a solution that uses an additional thread for continuous reading, because this solution will be scaled up, which would mean an additional 40 threads on a system with limited resources; we want to be as efficient as possible with our CPU resources.

How do I use select() and gRPC to create a server?

I need to use gRPC but in a single-threaded application (with additional socket channels). Naively, I'm thinking of using select() and depending on which file descriptor pops, calling gRPC to handle the message. My question is, can someone give me a rough (5-10 lines of code) outline skeleton on what I need to call after the select() pops?
Looking at Google's "hello world" example in the synchronous case implies a thread pool (which I can't use), and in the asynchronous case shows the main loop blocking -- which doesn't work for me because I need to handle other socket operations.
You can't do it, at this point (and probably ever).
One of the big weaknesses of event loops, including direct use of select()/poll() style APIs, is that they aren't composable in any natural way short of direct integration between the two.
We could theoretically add such functionality for Linux -- exporting an epoll_fd with a timerfd that becomes readable when it would be productive to call into a completion queue -- but doing so would impose substantial constraints and architectural overhead on the rest of the stack just to support this use case, and only on Linux. Everywhere else would require a background thread to manage that fd's readability.
This can be done using a gRPC async service along with grpc::Alarm to send any events that come from select or other polling APIs onto the gRPC completion queue. You can see an example using Epoll and gRPC together in this gist. The important functions are these two:
bool grpc_tick(grpc::ServerCompletionQueue& queue) {
    void* tag = nullptr;
    bool ok = false;
    auto next_status = queue.AsyncNext(&tag, &ok, std::chrono::system_clock::now());
    if (next_status == grpc::CompletionQueue::GOT_EVENT) {
        if (ok && tag) {
            static_cast<RequestProcessor*>(tag)->grpc_queue_tick();
        } else {
            std::cerr << "Not OK or bad tag: " << ok << "; " << tag << std::endl;
            return false;
        }
    }
    return next_status != grpc::CompletionQueue::SHUTDOWN;
}
bool tick_loops(int epoll, grpc::ServerCompletionQueue& queue) {
    // Pump epoll events over to gRPC's completion queue.
    epoll_event event{0};
    while (epoll_wait(epoll, &event, /*maxevents=*/1, /*timeout=*/0)) {
        grpc::Alarm alarm;
        alarm.Set(&queue, std::chrono::system_clock::now(), event.data.ptr);
        if (!grpc_tick(queue)) return false;
    }
    // Make sure gRPC gets at least 1 tick.
    return grpc_tick(queue);
}
Here you can see the tick_loops function repeatedly calls epoll_wait until no more events are returned. For each epoll event, a grpc::Alarm is constructed with the deadline set to right now. After that, the gRPC event loop is immediately pumped with grpc_tick.
Note that the grpc::Alarm instance MUST outlive its time on the completion queue. In a real-world application, the alarm should be somehow attached to the tag (event.data.ptr in this example) so it can be cleaned up in the completion callback.
The gRPC event loop is then pumped again to ensure that any non-epoll events are also processed.
Completion queues are thread safe, so you could also put the epoll pump on one thread and the gRPC pump on another. With this setup you would not need to set the polling timeouts for each to 0 as they are in this example. This would reduce CPU usage by limiting dry cycles of the event loop pumps.
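As a rough sketch of attaching the alarm to the tag as suggested above (the RequestProcessor members shown here are illustrative, not from the gist):

#include <grpcpp/alarm.h>
#include <grpcpp/grpcpp.h>
#include <chrono>
#include <memory>

// Illustrative tag type: it owns the grpc::Alarm that put it on the
// completion queue, so the alarm is guaranteed to outlive its time there.
struct RequestProcessor {
    std::unique_ptr<grpc::Alarm> alarm;

    void schedule(grpc::ServerCompletionQueue& queue) {
        alarm = std::make_unique<grpc::Alarm>();
        // A deadline of "now" makes the tag pop on the next AsyncNext() call.
        alarm->Set(&queue, std::chrono::system_clock::now(), this);
    }

    void grpc_queue_tick() {
        alarm.reset();  // completion callback: the alarm can be destroyed now
        // ... handle the epoll event that was forwarded to gRPC ...
    }
};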

Is it expected for poll() to take 40ms to return even though data will be available sooner?

I created a proxy server to handle CQL orders from website clients. The proxy listens for incoming connections and each connection is given a thread. The thread loops as long as the socket exists and dies on HUP. You may also stop the proxy, which will stop the threads by sending an event (See eventfd()) to each thread.
By itself, this already allows me to save a good 100ms because the proxy is local and connecting to a local service is much faster than a service on a remote computer... (even if the computer is local.)
However, I send orders and once in a while the proxy sees no incoming data (i.e. it calls read() on the socket, which is set up as non-blocking, and gets -1 in return with errno == EAGAIN). When that happens, I call poll() to wait for additional data, the HUP, or a hit on the eventfd meaning I have to quit (i.e. two fds: the socket and the eventfd).
Somehow, more often than not, when I hit the poll() function call, it adds an extra 40ms to the time it takes for a message to go round trip. Although one would think this only happens on larger messages, it happens when I receive an order, which is less than 100 bytes! So the size should not be the culprit. I also changed the code to make sure I send the entire order from the client to the proxy in one write() and to avoid the poll() if at all possible (i.e. I call read() first, and poll() only if nothing is available.)
Note that I have no timeout in this case because there is nothing to check other than the incoming orders and the eventfd. So I would imagine that the timeout won't be a problem.
The code base is really big, but the client/server comes down to something like this (the sizes in the original are fully dynamic):
// Client
...
connect(socket);
...
write(socket, order, sizeof(order));
read(socket, result, sizeof(result));
// repeat for other orders, as required by client...
// server
...
socket = accept(); // happens for each client
...
pthread_create(runner);
...
// server thread (runner)
...
for(;;)
{
    int r(0);
    for(;;)
    {
        r += read(socket, order, sizeof(order));
        if(r >= sizeof(order))
        {
            break;
        }
        // wait for more data if not enough was received yet
        poll(..."socket" + "eventfd"...); // <-- this will often take 40ms
        if(eventfd_happened)
        {
            // quit thread
            return;
        }
    }
    ...
    [work on order]
    ...
    write(socket, result, sizeof(result));
}
Note 1: I see the problem when I have a single client. So having multiple clients does not in itself cause the problem either.
Note 2: The client really uses BIO_connect(), BIO_read() and BIO_write() [from OpenSSL], but I doubt that would be a problem. I do not use any kind of encryption.
I don't see why you're using non-blocking I/O given you have a dedicated thread per socket. Just block in read(). Use SO_RCVTIMEO if you need an overall read timeout.
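For reference, a minimal sketch of installing such a receive timeout (the 5-second value and helper name are just for illustration):

#include <sys/socket.h>
#include <sys/time.h>

// Sketch: let read() block normally, but give up after 5 seconds.
bool set_read_timeout(int socket_fd)
{
    struct timeval tv;
    tv.tv_sec = 5;     // overall timeout for each blocking read()
    tv.tv_usec = 0;
    return setsockopt(socket_fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0;
}

With the timeout installed, a plain blocking read() returns -1 and sets errno to EAGAIN/EWOULDBLOCK if the deadline passes without data, so the per-connection thread no longer needs the non-blocking read()/poll() pair.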

how to simulate time delay in network

Let's say that we need to send the message "Hello World" using the UDP protocol between two PCs, A and B. Computer A will send the message to B with some time delay (constant or time-varying). Now, to simulate this scenario, my first attempt was to use the sleep function, but this solution freezes the entire application. Another solution is to use multiple threads: use sleep() in the thread that is responsible for getting the data, store the data in a global variable, and access this variable from another thread. In this solution, there might be difficulties in the synchronization between the threads. To overcome this problem, I would write the received data to a txt file and read it from another thread. My question is: what is the proper way to carry out this trivial experiment? I would appreciate it if the answer included some C++ pseudocode.
Edit:
My attempt to solve it is as follows, for the Master side (client),
Master masterObj
int main()
{
    masterObj.initialize();
    masterObj.connect();
    while( masterObj.isConnected() == true ){
        get currentTime and data; // currentTime here is sendTime
        datagram = currentTime + data;
        masterObj.send( datagram );
    }
}
For the Slave side (server), the pseudocode is
Slave slaveObj
int main()
{
    slaveObj.initialize();
    slaveObj.connect();
    slaveObj.slaveThreadInit();
    while( slaveObj.isConnected() == true ){
        slaveObj.getData();
    }
}
Slave::receive()
{
    get currentTime and call it receivedTime
    get datagram from Master;
    this->slaveThread( receivedTime + datagram );
}
Slave::slaveThread( info )
{
    sleep( 1 msec );
    info = receivedTime + datagram ;
    get time delay;
    time delay = sendTime - receivedTime;
    extract data from datagram;
    insert data and time delay in txt file ( call it txtSlaveData);
}
Slave::getData()
{
    read from txtSlaveData;
}
As you can see, I'm using an independent thread, and inside it I'm using sleep(). I'm not sure whether this approach is applicable.
A simple way to simulate sending UDP datagrams from one computer to another is to send the datagrams through the loopback interface to another - or the same - process on the same computer. That will function exactly like the real thing except for the delay.
You can simulate the delay either when sending or when receiving. Once you've implemented it one way, the other should be trivial. I think delaying the sending side is the more natural option. Here is an approach for the more general problem of simulating network delay; see the last paragraph for the trivial experiment of sending only one datagram.
In case you choose delaying on send, what you could do is, instead of sending, store the datagram in a queue, along with the time it should be sent (target = now + delay).
Then, in another thread, wait for a datagram to become available, then sleep for max(target - now, 0). After sleeping, send the datagram and move on to the next one. Wait if the queue is empty.
To simulate jitter, randomize the delay. To let jitter simulation send the datagrams in non-sequential order, use a priority queue, sorted by the target send-time.
Remember to synchronize the access to the queue.
For a single datagram, you can do something much simpler: start a new thread, sleep for the delay, send, and end the thread. No need for synchronization. Here's C++ code for that:
std::thread([=] {                        // capture 'delay' by value
    std::this_thread::sleep_for(delay);
    send("foo");
}).detach();
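A sketch of the queued variant described above, using a priority queue ordered by target send time; the class name and the send_now() hook are illustrative, and send_now() is where the real UDP write would go (no shutdown handling in this sketch):

#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>

using Clock = std::chrono::steady_clock;

struct DelayedDatagram {
    Clock::time_point target;   // when the datagram should actually be sent
    std::string payload;
    // Earliest target first (std::priority_queue is a max-heap, so invert).
    bool operator<(const DelayedDatagram& other) const { return target > other.target; }
};

class DelayedSender {
public:
    // Queue a datagram to be sent after 'delay' (randomize 'delay' to simulate jitter).
    void send_delayed(std::string payload, Clock::duration delay) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push({Clock::now() + delay, std::move(payload)});
        cv_.notify_one();
    }

    // Worker loop: run this on its own thread.
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            if (queue_.empty()) {
                cv_.wait(lock);               // wait for a datagram to arrive
                continue;
            }
            auto target = queue_.top().target;
            // Sleep until the earliest target, or wake early if a sooner one is queued.
            if (cv_.wait_until(lock, target) == std::cv_status::timeout) {
                std::string payload = queue_.top().payload;
                queue_.pop();
                lock.unlock();
                send_now(payload);            // illustrative: the real UDP sendto()
                lock.lock();
            }
        }
    }

private:
    void send_now(const std::string& payload);   // declaration only: wraps the actual socket write

    std::priority_queue<DelayedDatagram> queue_;
    std::mutex mutex_;
    std::condition_variable cv_;
};

Because the queue is ordered by target time, randomized delays can make datagrams leave in non-sequential order, which is exactly the jitter behaviour described above.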

Multi-threaded Server handling multiple clients in one thread

I wanted to create a multi-threaded socket server using C++11 and the standard Linux C libraries.
The easiest way of doing this would be to open a new thread for each incoming connection, but there must be another way, because Apache doesn't do this. As far as I know, Apache handles more than one connection per thread. How do you realise such a system?
I thought of having one thread always listening for new clients and assigning each new client to a thread. But if all threads are currently executing a select() with an infinite timeout, and none of the already assigned clients is doing anything, it could take a while before the new client becomes usable.
So the select() needs a timeout. Setting the timeout to 0.5 ms would be nice, but I guess the workload could rise too much, couldn't it?
Can someone of you tell me how you would realise such a system, handling more than one client for each thread?
PS: Hope my English is well enough for you to understand what I mean ;)
The standard method to multiplex multiple requests onto a single thread is to use the Reactor pattern. A central object (typically called a SelectServer, SocketServer, or IOService), monitors all the sockets from running requests and issues callbacks when the sockets are ready to continue reading or writing.
As others have stated, rolling your own is probably a bad idea. Handling timeouts, errors, and cross platform compatibility (e.g. epoll for linux, kqueue for bsd, iocp for windows) is tricky. Use boost::asio or libevent for production systems.
Here is a skeleton SelectServer (compiles but not tested) to give you an idea:
#include <sys/select.h>
#include <functional>
#include <map>
class SelectServer {
 public:
  enum ReadyType {
    READABLE = 0,
    WRITABLE = 1
  };

  void CallWhenReady(ReadyType type, int fd, std::function<void()> closure) {
    SocketHolder holder;
    holder.fd = fd;
    holder.type = type;
    holder.closure = closure;
    socket_map_[fd] = holder;
  }

  void Run() {
    fd_set read_fds;
    fd_set write_fds;
    while (1) {
      if (socket_map_.empty()) break;

      int max_fd = -1;
      FD_ZERO(&read_fds);
      FD_ZERO(&write_fds);
      for (const auto& pr : socket_map_) {
        if (pr.second.type == READABLE) {
          FD_SET(pr.second.fd, &read_fds);
        } else {
          FD_SET(pr.second.fd, &write_fds);
        }
        if (pr.second.fd > max_fd) max_fd = pr.second.fd;
      }

      int ret_val = select(max_fd + 1, &read_fds, &write_fds, 0, 0);
      if (ret_val <= 0) {
        // TODO: Handle error.
        break;
      } else {
        for (auto it = socket_map_.begin(); it != socket_map_.end(); ) {
          if (FD_ISSET(it->first, &read_fds) ||
              FD_ISSET(it->first, &write_fds)) {
            it->second.closure();
            socket_map_.erase(it++);
          } else {
            ++it;
          }
        }
      }
    }
  }

 private:
  struct SocketHolder {
    int fd;
    ReadyType type;
    std::function<void()> closure;
  };

  std::map<int, SocketHolder> socket_map_;
};
First off, have a look at using poll() instead of select(): it works better when you have a large number of file descriptors used from different threads.
To get threads that are currently waiting on I/O out of their wait, I'm aware of two methods:
You can send a suitable signal to the thread using pthread_kill(). The call to poll() fails and errno is set to EINTR.
Some systems allow a file descriptor to be obtained from a thread control device. poll()ing the corresponding file descriptor for input succeeds when the thread control device is signalled. See, e.g., Can we obtain a file descriptor for a semaphore or condition variable?.
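A minimal sketch of the first method; the signal number and helper names are just for illustration:

#include <signal.h>
#include <poll.h>
#include <cerrno>

// No-op handler: its only purpose is to make a blocking poll() fail with EINTR.
static void wakeup_handler(int) {}

void install_wakeup_handler()
{
    struct sigaction sa = {};
    sa.sa_handler = wakeup_handler;   // deliberately without SA_RESTART
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, nullptr);
}

// Runs in the I/O thread.
void io_loop(struct pollfd* fds, nfds_t nfds)
{
    for (;;) {
        int rc = poll(fds, nfds, -1);          // infinite timeout
        if (rc < 0 && errno == EINTR) {
            // Woken by pthread_kill(io_thread, SIGUSR1) from another thread:
            // re-check shutdown flags or newly added sockets here, then poll again.
            continue;
        }
        // ... handle the ready descriptors ...
    }
}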
This is not a trivial task.
In order to achieve that, you need to maintain a list of all opened sockets (the server socket and the sockets to the current clients). You then use the select() function, to which you can give a list of sockets (file descriptors). With the correct parameters, select() will wait until any event happens on one of those sockets.
You then must find the socket(s) which caused select() to exit and process the event(s). For the server socket, it can be a new client. For client sockets, it can be requests, termination notification, etc.
Regarding what you say in your question, I think you are not understanding the select() API very well. It is OK to have concurrent select() calls in different threads, as long as they are not waiting on the same sockets. Then if the clients are not doing anything, it doesn't prevent the server select() from working and accepting new clients.
You only need to give select() a timeout if you want to be able to do things even when clients are not doing anything. For example, you may have a timer to send periodic info to the clients. You then give select() a timeout corresponding to your first timer to expire, and process the expired timer when select() returns (along with any other concurrent events).
I suggest you have a long read of the select manpage.
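As a small sketch of that timeout calculation (the next_timer_due_in_seconds parameter is an illustrative stand-in for whatever timer bookkeeping the server keeps):

#include <sys/select.h>
#include <sys/time.h>
#include <algorithm>

// Sketch: wait on 'read_fds' until either a socket is ready or the next
// periodic timer is due.
int wait_for_sockets_or_timer(int max_fd, fd_set* read_fds,
                              double next_timer_due_in_seconds)
{
    double wait = std::max(next_timer_due_in_seconds, 0.0);

    struct timeval timeout;
    timeout.tv_sec  = static_cast<long>(wait);
    timeout.tv_usec = static_cast<long>((wait - timeout.tv_sec) * 1e6);

    // Returns 0 on timeout (process the expired timer), >0 when sockets
    // are ready, <0 on error.
    return select(max_fd + 1, read_fds, nullptr, nullptr, &timeout);
}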