Detect closed TCP connection during write with boost::asio immediately - c++

I have a TCP-server with multiple clients/sessions. Each session has its own thread for receiving data from the client, but there is only one thread ("writeThread") to respond to all clients.
Now here is the problem: if a client closes the connection while the "writeThread" is writing to that socket, it takes multiple seconds until the write operation notices that the connection was closed remotely. Sometimes it is not noticed at all; only when I manually send a signal for an installed signal handler does the application detect it and break out of the write operation.
The time is measured between Logger::trace("start write"); and Logger::trace("remote term, closed socket ");
Leaving aside that this may not be the best design: is there a way to detect the closed connection immediately, or do I really have to redesign?
bool myWrite(UINT8 *pu8_buffer, UINT32 u32_size)
{
    bool b_success = false;
    try
    {
        Logger::trace("start write");
        b_success = (u32_size == boost::asio::write(_x_socket, boost::asio::buffer(pu8_buffer, u32_size)));
    }
    catch (boost::system::system_error &er)
    {
        if (er.code() == boost::asio::error::eof ||
            er.code() == boost::asio::error::connection_reset)
        {
            boost::system::error_code x_er;
            _x_socket.close(x_er);
            if (!x_er)
            {
                Logger::trace("remote term, closed socket ");
            }
            else
            {
                Logger::err("remote term, closed socket failed");
            }
        }
    }
    catch (std::exception &ex)
    {
        Logger::err("write exception\n\t", ex.what());
    }
    catch (...)
    {
        Logger::err("write unknown exception", (uint32_t)this);
    }
    return b_success;
}

If you use asynchronous write operations, you can multiplex writes onto the same thread without one blocking the other. You can even do the same for reads.
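A minimal sketch of that idea, not the poster's code: the names Session, queueWrite and doWrite are made up for illustration, a recent Boost with movable sockets is assumed, and everything is assumed to run on the single io_service thread (calls from other threads would have to be posted to it). Each session keeps its own outgoing queue and chains async_write calls, so a slow or dead client never blocks writes to the others.

#include <boost/asio.hpp>
#include <cstdint>
#include <deque>
#include <memory>
#include <vector>

class Session : public std::enable_shared_from_this<Session>
{
public:
    explicit Session(boost::asio::ip::tcp::socket socket)
        : _x_socket(std::move(socket)) {}

    void queueWrite(std::vector<uint8_t> data)
    {
        bool idle = _write_queue.empty();
        _write_queue.push_back(std::move(data));
        if (idle)
            doWrite();   // start the chain only if no write is currently in flight
    }

private:
    void doWrite()
    {
        auto self = shared_from_this();
        boost::asio::async_write(_x_socket, boost::asio::buffer(_write_queue.front()),
            [this, self](boost::system::error_code ec, std::size_t /*bytes*/)
            {
                if (ec) { _x_socket.close(); return; }   // peer gone: drop the session
                _write_queue.pop_front();
                if (!_write_queue.empty())
                    doWrite();                           // keep draining the queue
            });
    }

    boost::asio::ip::tcp::socket _x_socket;
    std::deque<std::vector<uint8_t>> _write_queue;
};

A session created as std::make_shared<Session>(std::move(socket)) can then be fed with queueWrite() while io_service.run() drives all sessions from one thread.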

Just to close this question, I will summarize Richard Critten's comments.
The issue was that there was no graceful disconnect from the client connected to my server. If there is a proper disconnect, the write operation breaks immediately. To avoid long or even infinite blocking of the write operation, it is possible to configure a timeout for how long a write may take before an error is reported. This timeout can be set with the SO_SNDTIMEO socket option: http://man7.org/linux/man-pages/man7/socket.7.html
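A minimal sketch of setting that option, assuming the socket is only used for blocking (synchronous) boost::asio::write calls as above; SO_SNDTIMEO is not wrapped by boost::asio, so it goes through the native descriptor (native_handle() in recent Boost versions, native() in older ones):

#include <boost/asio.hpp>
#include <sys/socket.h>
#include <sys/time.h>

void setSendTimeout(boost::asio::ip::tcp::socket &socket, long seconds)
{
    struct timeval tv;
    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    // After the timeout expires, the blocked send() fails and
    // boost::asio::write() reports the failure as an error/exception.
    setsockopt(socket.native_handle(), SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}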

Related

Handling "reset by peer" scenario with boost::asio

I have a server method that waits for new incoming TCP connections; for each connection I create two detached threads that handle various tasks.
void MyClass::startServer(boost::asio::io_service& io_service, unsigned short port) {
    tcp::acceptor TCPAcceptor(io_service, tcp::endpoint(tcp::v4(), port));
    bool UARTToWiFiGatewayStarted = false;
    for (;;) {
        auto socket(std::shared_ptr<tcp::socket>(new tcp::socket(io_service)));
        /*!
         * Accept a new connected WiFi client.
         */
        TCPAcceptor.accept(*socket);
        socket->set_option(tcp::no_delay(true));
        MyClass::enableCommunicationSession();
        // start one worker thread.
        std::thread(WiFiToUARTWorkerSession, socket, this->LINport, this->LINbaud).detach();
        // only if this is the first connected client:
        if (false == UARTToWiFiGatewayStarted) {
            std::thread(UARTToWifiWorkerSession, socket, this->UARTport, this->UARTbaud).detach();
            UARTToWiFiGatewayStarted = true;
        }
    }
}
This works fine for starting the communication, but the problem appears when the client disconnects and connects again (or at least tries to connect again).
When the current client disconnects, I stop the communication (by stopping the internal infinite loops from both functions, then they'll return).
void Gateway::WiFiToUARTWorkerSession(std::shared_ptr<tcp::socket> socket, ...) {
    /*!
     * various code here...
     */
    try {
        while (true == MyClass::communicationSessionStatus) {
            /*!
             * Buffer used for storing the UART-incoming data.
             */
            unsigned char WiFiDataBuffer[max_incoming_wifi_data_length];
            boost::system::error_code error;
            /*!
             * Read the WiFi-available data.
             */
            size_t length = socket->read_some(boost::asio::buffer(WiFiDataBuffer), error);
            /*!
             * Handle possible read errors.
             */
            if (error == boost::asio::error::eof) {
                break; // Connection closed cleanly by peer.
            }
            else if (error) {
                // This stops the infinite loops in both worker functions;
                // when they stop, the functions return.
                MyClass::disableCommunicationSession();
                sleep(1);
                throw boost::system::system_error(error); // Some other error.
            }
            uart->write(WiFiDataBuffer, length);
        }
    }
    catch (std::exception &exception) {
        std::cerr << "[APP::exception] Exception in thread: " << exception.what() << std::endl;
    }
}
I expect that when I reconnect, the communication should work again (MyClass::startServer(...) will again create and detach two worker threads that do the same things).
The problem is that when I connect the second time I get:
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >'
what(): write: Broken pipe
From what I found about this error it seems that the server (this application) sends something via TCP to a client that was disconnected.
What am I doing wrong?
How can I solve this problem?
A read of length 0 with no error is also an indication of eof. The boost::asio::error::eof error code is normally more useful when you're checking the result of a composed operation.
When this error condition is missed, the code as presented will go on to write to a socket which has now been shut down. You have used the form of write which does not take a reference to an error_code; this form throws if there is an error. There will be an error, because the read has failed.
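A minimal sketch of both points, using made-up helper names (checkedRead and checkedWrite are not from the question): treat a zero-length read as end-of-file as well, and use the error_code overloads so a dead peer shows up as an error code instead of a thrown "Broken pipe". Whichever worker detects the closed peer should call disableCommunicationSession() so both loops stop before the next client connects.

#include <boost/asio.hpp>

using boost::asio::ip::tcp;

// Returns false when the peer has gone away or any other error occurred.
bool checkedRead(tcp::socket &socket, unsigned char *buf, std::size_t size, std::size_t &length)
{
    boost::system::error_code ec;
    length = socket.read_some(boost::asio::buffer(buf, size), ec);
    if (ec == boost::asio::error::eof || (!ec && length == 0))
        return false;                 // connection closed by peer
    return !ec;                       // false for any other error as well
}

bool checkedWrite(tcp::socket &socket, const unsigned char *buf, std::size_t size)
{
    boost::system::error_code ec;
    boost::asio::write(socket, boost::asio::buffer(buf, size), ec);
    return !ec;                       // "broken pipe" arrives here instead of throwing
}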

C++ - Sockets and multithreading

Socket A(local_address);

void enviar(sockaddr_in remote_address, std::atomic<bool>& quit) {
    std::string message_text;
    Message message;
    while (!quit) {
        std::getline(std::cin, message_text);
        if (message_text != "/quit") {
            memset(message.text, 0, 1024);
            message_text.copy(message.text, sizeof(message.text) - 1, 0);
            A.send_to(message, remote_address);
        }
        else {
            quit = true;
        }
    }
}

void recibir(sockaddr_in local_address, std::atomic<bool>& quit) {
    Message messager;
    while (!quit) {
        A.receive_from(messager, local_address);
    }
}

int main(void) {
    std::atomic<bool> quit(false);
    sockaddr_in remote_address = make_ip_address("127.0.0.1", 6000);
    std::thread hilorec(&recibir, local_address, std::ref(quit));
    std::thread hiloenv(&enviar, remote_address, std::ref(quit));
    hiloenv.join();
    hilorec.join();
}
Hi! I'm trying to make a simple chat with sockets. I want the program to finish when I write "/quit"; I'm trying to do this with an atomic bool variable called quit. The problem is that when I write "/quit", quit becomes true and the hiloenv thread finishes, but hilorec, which receives the messages, stays blocked in recvfrom() until a message arrives. How can I solve this?
Sorry for my English, and thanks!
Shut down the socket for input. That will cause recvfrom() to return zero as though the peer had closed the connection, which will cause that thread to exit.
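A minimal sketch, assuming the Socket class exposes its underlying descriptor (the fd() accessor mentioned here is hypothetical; it is not shown in the question):

#include <sys/socket.h>

void stopReceiver(int socket_fd)        // e.g. stopReceiver(A.fd()) when quit becomes true
{
    ::shutdown(socket_fd, SHUT_RD);     // wakes the blocked recvfrom() with a 0-byte result
}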
I would send some special (e.g. empty) message to the A socket from the main thread when quit is detected. In that case your while(!quit) ... loop will finish, and so will the thread.
If you want a single-threaded app, use the epoll or select APIs. If you want to stick with your current design, you can give your socket a receive timeout. See "How to set socket timeout in C when making multiple connections?" for details. Then, when you quit, the waiting thread will come out of recv or send after the timeout, the thread will join, and your application can quit gracefully.
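A minimal sketch of that timeout approach, again assuming access to the raw descriptor wrapped by Socket A: with SO_RCVTIMEO set, a blocked recvfrom() returns -1 with errno EAGAIN/EWOULDBLOCK once the timeout elapses, so the while(!quit) loop gets a chance to re-check the flag.

#include <sys/socket.h>
#include <sys/time.h>

void setReceiveTimeout(int socket_fd, int seconds)
{
    struct timeval tv;
    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    setsockopt(socket_fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}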
Thanks for the answers. I managed to fix it; if anyone is interested, here is how:
std::thread hilorec(&recibir,local_address);
std::thread hiloenv(&enviar,remote_address);
while(!quit){}
pthread_cancel(hilorec.native_handle());
pthread_cancel(hiloenv.native_handle());
hilorec.join();
hiloenv.join();

valgrind/helgrind gets killed on stress test

I'm making a web server on Linux in C++ with pthreads. I tested it with valgrind for leaks and memory problems - all fixed. I tested it with helgrind for threading problems - all fixed. Now I'm trying a stress test, and I get a problem when the program is run under helgrind:
valgrind --tool=helgrind ./chats
It just dies at random places with the text "Killed", as it would if I killed it with kill -9. The only report I sometimes get from helgrind is that the program exits while still holding some locks, which is normal when it gets killed.
When checking for leaks:
valgrind --leak-check=full ./chats
it's more stable, but I managed to make it die once with a few hundred concurrent connections.
I tried running the program alone and couldn't make it crash at all. I tried up to 250 concurrent connections. Each thread delays by 100 ms to make it easier to have multiple connections at the same time. No crash.
In all cases neither threads nor connections get above 10, and I see it crash even with 2 connections, but never with only one connection at a time (which, including the main thread and one helper thread, is 3 threads in total).
Is it possible that the problem only happens when run under helgrind, or does helgrind just make it more likely to show?
What could be the reason that a program gets killed (by the kernel?): allocating too much memory, too many file descriptors?
I tested a bit more and I found out that it only dies when the client times out and closes the connection. So here is the code which detects that the client closed the socket:
void *TcpClient::run(){
    int ret;
    struct timeval tv;
    char *buff = (char *)malloc(10001);
    int br;
    colorPrintf(TC_GREEN, "new client starting: %d\n", sockFd);
    while(isRunning()){
        tv.tv_sec = 0;
        tv.tv_usec = 500*1000;
        FD_SET(sockFd, &readFds);
        ret = select(sockFd+1, &readFds, NULL, NULL, &tv);
        if(ret < 0){
            // select error
            continue;
        }else if(ret == 0){
            // no data to read
            continue;
        }
        br = read(sockFd, buff, 10000);
        buff[br] = 0;
        if (br == 0){
            // client disconnected;
            setRunning(false);
            break;
        }
        if (reader != NULL){
            reader->tcpRead(this, std::string(buff, br));
        }else{
            readBuffer.append(buff, br);
        }
        //printf("received: %s\n", buff);
    }
    free(buff);
    sendFeedback((void *)1);
    colorPrintf(TC_RED, "closing client socket: %d\n", sockFd);
    ::close(sockFd);
    sockFd = -1;
    return NULL;
}

// this method writes to the socket
bool TcpClient::write(std::string data){
    int bw;
    int dataLen = data.length();
    bw = ::write(sockFd, data.data(), dataLen);
    if (bw != dataLen){
        return false; // I don't close the socket in this case, maybe I should
    }
    return true;
}
P.S. Threads are:
main thread: connections are accepted here.
one helper thread which listens for signals and sends signals. It blocks signal reception for the app and manually polls the signal queue; the reason is that it's hard to handle signals when using threads. I found this technique here on Stack Overflow and it seems to work fine in other projects (a minimal sketch of the technique follows this list).
client connection threads
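A minimal sketch of that signal-handling technique (my own reconstruction, not the poster's code): block the interesting signals by setting the mask before any thread is created, so every thread inherits it, then let one helper thread poll the pending-signal queue with sigtimedwait().

#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>

void *signalThread(void *)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    sigaddset(&set, SIGTERM);

    struct timespec timeout = {0, 500 * 1000 * 1000};   // poll every 500 ms
    for (;;) {
        int sig = sigtimedwait(&set, NULL, &timeout);
        if (sig > 0)
            printf("got signal %d\n", sig);             // trigger shutdown handling here
    }
    return NULL;
}

int main()
{
    // Block the signals before creating any other thread, so every thread
    // inherits the mask and only sigtimedwait() ever sees them.
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    sigaddset(&set, SIGTERM);
    pthread_sigmask(SIG_BLOCK, &set, NULL);

    pthread_t helper;
    pthread_create(&helper, NULL, signalThread, NULL);
    pthread_join(helper, NULL);
    return 0;
}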
The full code is pretty big, but I can post chunks if someone is interested.
Update:
I managed to trigger the problem with only one connection. It's all happening in client thread. This is what I do:
I read/parse the headers. I put a delay before writing so the client can time out (which causes the problem).
Here the client times out and leaves (probably closing its socket).
I write back headers
I write back the html code.
Here is how I write back:
bw = ::write(sockFd, data.data(), dataLen);
// bw == dataLen == 108 when writing the headers
// then the second write, for the HTML, kills the program; there is a log message before and after write()
bw = ::write(sockFd, data.data(), dataLen); // doesn't get past this point the second time
Update 2: Got it :)
gdb says:
Program received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x41401940 (LWP 10554)]
0x0000003ac2e0d89b in write () from /lib64/libpthread.so.0
Question 1: What should I do to avoid receiving this signal?
Question 2: How do I know that the remote side disconnected while writing? On read, select() reports that there is data but the amount read is 0. How about write?
Well, I just had to handle the SIGPIPE signal; write() then returns -1, so I close the socket and quit the thread gracefully. Works like a charm.
I guess the easiest way is to set signal handler of SIGPIPE to SIG_IGN:
signal(SIGPIPE, SIG_IGN);
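A minimal sketch combining both answers, assuming a plain POSIX descriptor like sockFd above: ignore SIGPIPE process-wide (or pass MSG_NOSIGNAL to send() on Linux, which suppresses it per call), then treat a -1 return with errno EPIPE or ECONNRESET as "peer disconnected while writing".

#include <cerrno>
#include <csignal>
#include <sys/socket.h>
#include <unistd.h>

void ignoreSigpipe()
{
    signal(SIGPIPE, SIG_IGN);               // do this once at startup
}

bool writeAll(int sockFd, const char *data, size_t len)
{
    // MSG_NOSIGNAL (Linux) suppresses SIGPIPE for this call even without ignoreSigpipe().
    ssize_t bw = ::send(sockFd, data, len, MSG_NOSIGNAL);
    if (bw < 0 && (errno == EPIPE || errno == ECONNRESET)) {
        ::close(sockFd);                    // remote side disconnected while writing
        return false;
    }
    return bw == (ssize_t)len;
}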
Note that the first write was successful and didn't kill the program. If you have a similar problem, check whether you are writing once or multiple times. If you are not familiar with gdb, this is how to do it:
gdb ./your-program
> run
and gdb will tell you all about signals and segfaults.

TCP client in Boost asio

I'm building a TCP client using the Boost::asio libs. My program has a write() thread that sends a command to the server:
write(*_socket,boost::asio::buffer("sspi l1\n\n",sizeof("sspi l1\n\n")));
Then a read thread is started that reads from the socket all the time, as there can be messages broadcast from the server caused by any other client:
void TCP_IP_Connection::readTCP()
{
    size_t l = 0;
    this->len = 0;
    boost::system::error_code error;
    try
    {
        // loop reading all values from router
        while(1)
        {
            // wait for reply??
            l = _socket->read_some(boost::asio::buffer(this->reply, sizeof(this->reply)), error);
            if(error)
                throw boost::system::system_error(error);
            if(l > 0)
            {
                this->dataProcess(l);
            }
            else
                boost::this_thread::sleep(boost::posix_time::milliseconds(5000));
            _io.run();
            if(error == boost::asio::error::eof) // connection closed by router
                std::cout << "connection closed by router";
            _io.reset();
        }
    }
    catch (std::exception& e)
    {
        std::cerr << e.what() << std::endl;
    }
}
This thread runs all the time in a while(1) loop and is supposed to sleep when no data was received. It reads all the data and calls the data-parsing function. After that, the write thread is used to send another command, with the read thread still running. But instead of the required response the server sends back:
? ""
ERROR: Unknown command
I tried using Wireshark; I can see the command being sent properly. What mistake am I making here?
sizeof("sspi l1\n\n") returns 10, but I can only count 9 characters in that string.
Try this instead:
const std::string cmd("sspi l1\n\n");
write(*_socket,boost::asio::buffer(cmd, cmd.length()));
Or when you have it as a string it is enough to do
const std::string cmd("sspi l1\n\n");
write(*_socket,boost::asio::buffer(cmd));
The second argument specifies a maximum length of the string to use. But since it is a constant string, the second argument is not strictly necessary.
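A quick standalone illustration of that off-by-one (my own example, not part of the answer): sizeof counts the terminating '\0', strlen does not.

#include <cstdio>
#include <cstring>

int main()
{
    const char raw[] = "sspi l1\n\n";
    std::printf("sizeof: %zu, strlen: %zu\n", sizeof(raw), std::strlen(raw));   // prints "sizeof: 10, strlen: 9"
    return 0;
}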

boost asio - change of deficient code

I have this piece of code as part of a SOCKS5 proxy server implementation. This is the part where, once the server has established communication sockets with the proxy client (in the code: socket_) and the destination server (in the code: clientSock_), it takes data sent on one socket and forwards it to the other socket.
Note that this exchange already happens in a thread spawned by the server for the proxy client.
std::size_t readable = 0;
boost::asio::socket_base::bytes_readable command1(true);
boost::asio::socket_base::bytes_readable command2(true);
try
{
    while (1)
    {
        socket_->io_control(command1);
        clientSock_->io_control(command2);
        if ((readable = command1.get()) > 0)
        {
            transf = ba::read(*socket_, ba::buffer(data_, readable));
            ba::write(*clientSock_, ba::buffer(data_, transf));
            boost::this_thread::sleep(boost::posix_time::milliseconds(500));
        }
        if ((readable = command2.get()) > 0)
        {
            transf = ba::read(*clientSock_, ba::buffer(data_, readable));
            ba::write(*socket_, ba::buffer(data_, transf));
            boost::this_thread::sleep(boost::posix_time::milliseconds(500));
        }
    }
}
catch (std::exception& ex)
{
    std::cerr << "Exception in thread while exchanging: " << ex.what() << "\n";
    return;
}
The problem here is that the loop causes very high CPU usage. Also, I am not sure whether the right way to find out that one of the parties has closed its socket is to catch the boost socket exception and then end the data exchange.
The problem can be solved by using asynchronous read/write functions. Basically, use async_read_some() or async_write(), or other async functions in these categories. Also, for async processing to work, io_service.run() must be called after at least one async operation has been started; it dispatches the completion handlers for the async processing.
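A minimal sketch of one relay direction under those assumptions (the Relay class and its method names are made up; socket_ and clientSock_ are the two sockets from the question): each side reads whatever is available with async_read_some() and forwards it with async_write(), so there is no polling, no sleep and no busy loop. A mirrored pair of handlers (not shown) relays the opposite direction, and a closed peer surfaces as an error code instead of an exception.

#include <boost/asio.hpp>
#include <memory>

namespace ba = boost::asio;
using ba::ip::tcp;

class Relay : public std::enable_shared_from_this<Relay>
{
public:
    Relay(std::shared_ptr<tcp::socket> client, std::shared_ptr<tcp::socket> server)
        : socket_(client), clientSock_(server) {}

    void start() { readFromClient(); }

private:
    void readFromClient()
    {
        auto self = shared_from_this();
        socket_->async_read_some(ba::buffer(data_),
            [this, self](boost::system::error_code ec, std::size_t n)
            {
                if (ec) { closeBoth(); return; }              // peer closed or error
                ba::async_write(*clientSock_, ba::buffer(data_, n),
                    [this, self](boost::system::error_code ec2, std::size_t)
                    {
                        if (ec2) { closeBoth(); return; }
                        readFromClient();                     // wait for the next chunk
                    });
            });
    }

    void closeBoth()
    {
        boost::system::error_code ignored;
        socket_->close(ignored);
        clientSock_->close(ignored);
    }

    std::shared_ptr<tcp::socket> socket_;
    std::shared_ptr<tcp::socket> clientSock_;
    char data_[4096];
};

The object would be created with std::make_shared<Relay>(socket_, clientSock_), start() called, and io_service.run() (for example in the thread that currently contains the while(1) loop) then drives both directions until either side closes.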