I am trying to send a large amount of data (around 50 KB or more) over a TCP socket using the following call in C++:
boost::asio::async_write(sock, boost::asio::buffer(sbuff, slen),
boost::bind((&send_handler), placeholders::error));
Where sbuff is a pointer to the data to be transmitted, and slen is the length of the data.
Sometimes the operation succeeds, and sometimes it fails with an "Operation cancelled" error.
Here is the receiver code, which waits for a specific amount of data to be received.
boost::asio::async_read(_sock,
boost::asio::buffer(rbuf, rlen),
boost::bind(&session::handle_read_payload,
this,
placeholders::bytes_transferred,
placeholders::error));
void session::handle_read_payload(buffer<uint8> &buff, size_t rbytes, const boost::system::error_code &e)
Where rlen is the number of bytes to wait for, and rbuf is a pointer to where I store the received bytes.
I checked the flow of the TCP packets between the two machines using Wireshark and I found that suddenly the receiver sends back a packet with FIN flag set to the sender, which terminates the connection.
So can anyone tell me what might be the root of the problem? Is there any problem with my code?
Does it matter if I call _acceptor.listen() before async_accept? When I tested without _acceptor.listen(), it worked perfectly. What would be the difference?
From the discussion in the comments to the question, it sounds very much like there is a disagreement between the sender and the receiver about the size of the message being sent.
The receiver receives what it thinks is a complete message then closes the socket while the sender still thinks there is more data that the receiver has not accepted.
To diagnose the problem, I suggest that you display slen on the sender side, and display rlen on the receiver side before issuing the respective read/write requests (by display I mean write to a log or to std::cerr or whatever other approach works for your application.) If the two numbers are not equal you know where to look for the underlying cause of the problem. If they are equal -- then more investigation will be needed.
/* Basic setup has been done, */
/* such as establishing the connection and receiving. */
namespace bar = boost::asio::placeholders;
void doWrite(char* buffer, size_t size) {
    socket.async_write_some(boost::asio::buffer(buffer, size), boost::bind(&Handler, this, bar::error, bar::bytes_transferred));
}
/*handler*/
void handler(/*parameters*/)
{
}
While my server is continuously transferring data, the client sometimes crashes (on purpose).
errorCode.message() reports boost::asio::error::bad_descriptor and the whole program crashes.
I copied the program from the Boost chat server example.
If the server is transmitting, say, 1024 bytes and the client closes in the middle of that write, the whole program crashes.
In more technical wording:
How do I handle a half-open socket in the middle of a transfer?
Be sure to do all operations either catching system_error or receiving (and checking) the error_code.
As you say, you already receive the boost::asio::error::bad_descriptor code, so I expect the program termination results from a subsequent action on the same socket that throws.
Look for the overloads that take a boost::system::error_code& parameter (e.g. http://www.boost.org/doc/libs/1_66_0/doc/html/boost_asio/reference/basic_stream_socket/close.html).
I have a UDP client that is sending messages to a server, at a specified rate. The rate needs to be constant, so I decided to try to do my receiving of replies in a separate thread to avoid blocking or delaying on recvfrom(). Is it at all possible to 'wait' for a full message before receiving? What would be the best strategy to go about doing this?
while (true)
{
    //std::this_thread::sleep_for(std::chrono::milliseconds(5000));
    if (recvfrom(threadSock, ReceiveBuf, BufLength, 0, 0, 0) == SOCKET_ERROR)
    {
        printf("Thread Receive failed with error %ld\n", GetLastError());
        break;
    }
    else
    {
        printf("Reply received: %s\n\n", ReceiveBuf);
    }
    memset(ReceiveBuf, '\0', BufLength);
}
Above is my receiving code. Currently, only the first 8 characters of a reply are being read into the buffer (the buffer is 512 bytes).
How can I wait for a full message (bearing in mind the message lengths are variable).
Is this even possible? Perhaps there is a better approach.
Thanks in advance.
EDIT: I should clarify the prints are for testing only. They won't be in the final result, as printing from a thread gives weird inline prints.
According to MSDN:
The recvfrom function receives a datagram and stores the source address.
For message-oriented sockets, data is extracted from the first enqueued message, up to the size of the buffer specified. If the datagram or message is larger than the buffer specified, the buffer is filled with the first part of the datagram, and recvfrom generates the error WSAEMSGSIZE. For unreliable protocols (for example, UDP) the excess data is lost. For UDP if the packet received contains no data (empty), the return value from the recvfrom function is zero.
Thus, you can't receive only a part of the incoming message; the receive returns only when the OS can process and return an enqueued datagram.
In the interest of completeness, and the small chance anyone suffering from similar confusion finds this, solution follows:
Yes, it was a silly question, I should've realised recvfrom waits for a full datagram. The problem was with my server.
It was an issue of the server not sending the full data. I'm not sure as to the exact cause, but to fix it I converted the char* my reply was being stored to (and printing correctly) to a char[], which, when sent, worked fine.
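The server code is not shown, so the following is only a guess at the cause. If the length passed to the send call was computed with sizeof, then a char* yields the size of the pointer itself (8 bytes on a typical 64-bit build, which would match the 8 characters seen), while a char[] yields the size of the whole array. A minimal sketch of the difference, where kReply and the function names are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical reply text; the real server code is not shown in the post.
static const char kReply[] = "This reply is longer than eight bytes";

// sizeof applied to a pointer gives the size of the pointer itself,
// not the length of the string it points to.
std::size_t length_via_pointer_sizeof() {
    const char* reply = kReply;
    return sizeof(reply);
}

// sizeof applied to an array gives the whole array, including the '\0'.
std::size_t length_via_array_sizeof() {
    return sizeof(kReply);
}

// strlen works for both forms and excludes the terminating '\0'.
std::size_t length_via_strlen() {
    const char* reply = kReply;
    return std::strlen(reply);
}
```

If that is what happened, passing strlen(reply) (or a length tracked alongside the buffer) to the send call would work with either a char* or a char[].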
I have been reading some socket guides such as Beej's guide to network programming. It is quite clear now that there is no guarantee on how many bytes are received in a single recv() call. Therefore a mechanism is needed, e.g. sending two bytes stating the message length first and then the message. So the receiver receives the first two bytes and then receives in a loop until the whole message has been received. All good and dandy!?
I was asked by a colleague about messages going out of sync. E.g. what if, somehow, I receive two bytes in one recv() call that are actually in the middle of the message itself, and they appear to be an integer of some value? Does that mean that the rest of the data sent will be out of sync? And what about receiving the header partially, i.e. one byte at a time?
Maybe this is overthinking, but I can't find this mentioned anywhere and I just want to be sure that I would handle this if it could be a possible threat to the integrity of the communication.
Thanks.
It is not overthinking. TCP presents a stream so you should treat it this way. A lot of problems concerning TCP are due to network issues and will probably not happen during development.
Start a message with a (4 byte) magic that you can look for, followed by a (4 byte) length in an agreed byte order (normally big endian). When receiving, read the header one byte at a time, so you can handle it however the bytes arrive. Based on that you can accept messages in a lasting TCP connection.
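The magic-plus-length header described above could be parsed with a small state machine that accepts one byte at a time, so it does not matter how recv() splits the stream. This is only a sketch under assumptions: the magic value kMagic and the class name FrameParser are invented for illustration, and a real parser would also cap the accepted length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical magic value ("MSG!"); any constant agreed on by both peers works.
constexpr std::uint32_t kMagic = 0x4D534721u;

// Incremental parser for frames laid out as:
//   4-byte magic | 4-byte big-endian payload length | payload
class FrameParser {
public:
    // Feed one received byte. Returns true when a full frame is ready in payload().
    bool feed(std::uint8_t byte) {
        if (state_ == State::Magic) {
            // Slide a 4-byte window over the stream until it equals the magic.
            window_ = (window_ << 8) | byte;
            if (++seen_ >= 4 && window_ == kMagic) {
                state_ = State::Length;
                seen_ = 0;
                length_ = 0;
            }
            return false;
        }
        if (state_ == State::Length) {
            length_ = (length_ << 8) | byte;  // big endian: most significant byte first
            if (++seen_ == 4) {
                payload_.clear();
                seen_ = 0;
                if (length_ == 0) { reset(); return true; }  // empty frame
                state_ = State::Payload;
            }
            return false;
        }
        payload_.push_back(byte);
        if (payload_.size() == length_) { reset(); return true; }
        return false;
    }

    // Payload of the most recently completed frame.
    const std::vector<std::uint8_t>& payload() const { return payload_; }

private:
    enum class State { Magic, Length, Payload };
    void reset() { state_ = State::Magic; window_ = 0; seen_ = 0; }

    State state_ = State::Magic;
    std::uint32_t window_ = 0;
    std::uint32_t length_ = 0;
    std::size_t seen_ = 0;
    std::vector<std::uint8_t> payload_;
};
```

Each byte returned by recv() is passed to feed(); stray bytes in front of the magic are skipped, which is how the parser finds its place again after a coding error on either side.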
Mind you that when starting a new connection per message, you know the starting point. However, it doesn't hurt sending a magic either, if only to filter out some invalid messages.
A checksum is not necessary because TCP shows a reliable stream of bytes which was already checked by the receiving part of TCP, and syncing will only be needed if there was a coding issue with sending/receiving.
On the other hand, UDP sends packets, so you know what to expect, but then the delivery and order is not guaranteed.
Your colleague is mistaken. TCP data cannot arrive out of order. However you should investigate the MSG_WAITALL flag to recv() to overcome the possibility of the two length bytes arriving separately, and to eliminate the need for a loop when receiving the message body.
It is your responsibility to keep your client and server in sync. However, TCP has no out-of-order delivery: whatever you get from recv() comes right after what you received before, and nothing earlier in the stream has been skipped.
So the question is how to synchronize the sender and the receiver. It is easy: as stefaanv said, the sender and receiver know their starting point, so you can define a protocol for your network communication. For example, a protocol could be defined this way:
4 bytes of header, containing the message type and the payload length
The rest of the message is the payload, whose length is given in the header
With this protocol, you send the 4-byte header first and then the actual payload.
Because TCP guarantees in-order, reliable delivery, you can make two recv() calls for each message: one recv() call with a length of 4 bytes to get the next payload size, and another recv() call with the size specified in the header. Both recv() calls need to be blocking so that you stay synchronized.
An example would be like this:
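#define MAX_BUF_SIZE 1024 // something you know
char buf[MAX_BUF_SIZE];
// Peek first to check whether the whole 4-byte header has arrived.
int recvLen = recv(fd, buf, 4, MSG_PEEK);
if (recvLen == 4) {
    recvLen = recv(fd, buf, 4, 0); // actually consume the header
    if (recvLen != 4) {
        // fatal error
    }
    int payloadLen = extractPayloadLenFromHeader(buf);
    if (payloadLen < 0 || payloadLen > MAX_BUF_SIZE) {
        // fatal error: the payload would not fit in buf
    }
    // Peek again to check whether the whole payload has arrived.
    recvLen = recv(fd, buf, payloadLen, MSG_PEEK);
    if (recvLen == payloadLen) {
        recvLen = recv(fd, buf, payloadLen, 0); // actual recv
        if (recvLen != payloadLen) {
            // fatal error
        }
        // do something with received payload
    }
}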
As you can see, I first call recv() with the MSG_PEEK flag to check whether 4 bytes are really available, and only then receive the actual header. The same is done for the payload.
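An alternative to peeking is the loop the question itself describes: keep calling recv() until the requested number of bytes has arrived. The sketch below is not from the answer. To keep it self-contained it takes the read operation as a callback (ReadSome, a made-up name standing in for recv()), and it assumes a simplified header that holds only a 4-byte big-endian payload length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-in for recv(): reads up to `max` bytes into `dst` and returns the count,
// 0 when the peer closed the connection, or a negative value on error.
using ReadSome = std::function<long(std::uint8_t* dst, std::size_t max)>;

// Keep reading until exactly `n` bytes have arrived or the stream fails.
bool read_exact(const ReadSome& read_some, std::uint8_t* dst, std::size_t n) {
    std::size_t got = 0;
    while (got < n) {
        long r = read_some(dst + got, n - got);
        if (r <= 0) return false;  // closed or error before n bytes arrived
        got += static_cast<std::size_t>(r);
    }
    return true;
}

// Read one frame: 4-byte big-endian length followed by that many payload bytes.
bool read_frame(const ReadSome& read_some, std::vector<std::uint8_t>& payload,
                std::size_t max_payload) {
    std::uint8_t header[4];
    if (!read_exact(read_some, header, sizeof(header))) return false;
    std::uint32_t len = (std::uint32_t(header[0]) << 24) | (std::uint32_t(header[1]) << 16) |
                        (std::uint32_t(header[2]) << 8) | std::uint32_t(header[3]);
    if (len > max_payload) return false;  // refuse oversized or corrupt lengths
    payload.resize(len);
    return len == 0 || read_exact(read_some, payload.data(), len);
}
```

With a real socket, the callback would wrap a blocking recv(fd, dst, max, 0) call. Because the loop tolerates any split of the stream, a header that arrives one byte at a time is handled the same way as one that arrives whole.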
When you use the simple ZeroMQ REQ/REP pattern you depend on a fixed send()->recv() / recv()->send() sequence.
As this article describes you get into trouble when a participant disconnects in the middle of a request because then you can't just start over with receiving the next request from another connection but the state machine would force you to send a request to the disconnected one.
Has there emerged a more elegant way to solve this since the mentioned article has been written?
Is reconnecting the only way to solve this (apart from not using REQ/REP but use another pattern)
As the accepted answer seems so terribly sad to me, I did some research and found that everything we need was actually in the documentation.
The .setsockopt() call with the correct parameter can help you reset your socket's state machine without brutally destroying it and rebuilding another one on top of the previous one's dead body.
(yeah I like the image).
ZMQ_REQ_CORRELATE: match replies with requests
The default behaviour of REQ sockets is to rely on the ordering of messages to match requests and responses and that is usually sufficient. When this option is set to 1, the REQ socket will prefix outgoing messages with an extra frame containing a request id. That means the full message is (request id, 0, user frames…). The REQ socket will discard all incoming messages that don't begin with these two frames.
Option value type int
Option value unit 0, 1
Default value 0
Applicable socket types ZMQ_REQ
ZMQ_REQ_RELAXED: relax strict alternation between request and reply
By default, a REQ socket does not allow initiating a new request with zmq_send(3) until the reply to the previous one has been received. When set to 1, sending another message is allowed and has the effect of disconnecting the underlying connection to the peer from which the reply was expected, triggering a reconnection attempt on transports that support it. The request-reply state machine is reset and a new request is sent to the next available peer.
If set to 1, also enable ZMQ_REQ_CORRELATE to ensure correct matching of requests and replies. Otherwise a late reply to an aborted request can be reported as the reply to the superseding request.
Option value type int
Option value unit 0, 1
Default value 0
Applicable socket types ZMQ_REQ
A complete documentation is here
The good news is that, as of ZMQ 3.0 and later (the modern era), you can set a timeout on a socket. As others have noted elsewhere, you must do this after you have created the socket, but before you connect it:
zmq_req_socket.setsockopt( zmq.RCVTIMEO, 500 ) # milliseconds
Then, when you actually try to receive the reply (after you have sent a message to the REP socket), you can catch the error that will be asserted if the timeout is exceeded:
try:
    send( message, 0 )
    send_failed = False
except zmq.Again:
    logging.warning( "Image send failed." )
    send_failed = True
However! When this happens, as observed elsewhere, your socket will be in a funny state, because it will still be expecting the response. At this point, I cannot find anything that works reliably other than just restarting the socket. Note that if you disconnect() the socket and then re connect() it, it will still be in this bad state. Thus you need to do something like this:
def reset_my_socket():
    global zmq_req_socket  # rebind the module-level socket, not a local
    zmq_req_socket.close()
    zmq_req_socket = zmq_context.socket( zmq.REQ )
    zmq_req_socket.setsockopt( zmq.RCVTIMEO, 500 ) # milliseconds
    zmq_req_socket.connect( zmq_endpoint )
You will also notice that because I close()d the socket, the receive timeout option was "lost", so it is important to set that on the new socket.
I hope this helps. And I hope that this does not turn out to be the best answer to this question. :)
There is one solution to this and that is adding timeouts to all calls. Since ZeroMQ by itself does not really provide simple timeout functionality I recommend using a subclass of the ZeroMQ socket that adds a timeout parameter to all important calls.
So, instead of calling s.recv() you would call s.recv(timeout=5.0), and if a response does not come back within that 5-second window it will return None and stop blocking. I made a futile attempt at this when I ran into this problem.
I'm actually looking into this at the moment, because I am retro fitting a legacy system.
I am coming across code constantly that "needs" to know about the state of the connection. However the thing is I want to move to the message passing paradigm that the library promotes.
I found the following function : zmq_socket_monitor
What it does is monitor the socket passed to it and generate events that are then passed to an "inproc" endpoint - at that point you can add handling code to actually do something.
There is also an example (actually test code) here : github
I have not got any specific code to give at the moment (maybe at the end of the week) but my intention is to respond to the connect and disconnects such that I can actually perform any resetting of logic required.
Hope this helps, and despite quoting 4.2 docs, I am using 4.0.4 which seems to have the functionality as well.
Note I notice you talk about python above, but the question is tagged C++ so that's where my answer is coming from...
Update: I'm updating this answer with this excellent resource here: https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/ Socket programming is complicated so do checkout the references in this post.
None of the answers here seem accurate or useful. The OP is not looking for information on BSD socket programming. He is trying to figure out how to robustly handle accept()ed client-socket failures in ZMQ on the REP socket to prevent the server from hanging or crashing.
As already noted, this problem is complicated by the fact that ZMQ tries to pretend that the server's listen()ing socket is the same as an accept()ed socket (and nowhere in the documentation is it described how to set basic timeouts on such sockets).
My answer:
After doing a lot of digging through the code, the only relevant socket options passed along to accept()ed socks seem to be keep alive options from the parent listen()er. So the solution is to set the following options on the listen socket before calling send or recv:
void zmq_setup(zmq::context_t** context, zmq::socket_t** socket, const char* endpoint)
{
    // Free old references. delete runs the destructors, which close
    // the socket and then terminate the context.
    if(*socket != NULL)
    {
        delete *socket;
        *socket = NULL;
    }
    if(*context != NULL)
    {
        // Shutdown all previous server client-sockets.
        delete *context;
        *context = NULL;
    }
    *context = new zmq::context_t(1);
    *socket = new zmq::socket_t(**context, ZMQ_REP);
    // Enable TCP keep alive.
    int is_tcp_keep_alive = 1;
    (**socket).setsockopt(ZMQ_TCP_KEEPALIVE, &is_tcp_keep_alive, sizeof(is_tcp_keep_alive));
    // Only send 2 probes to check if client is still alive.
    int tcp_probe_no = 2;
    (**socket).setsockopt(ZMQ_TCP_KEEPALIVE_CNT, &tcp_probe_no, sizeof(tcp_probe_no));
    // How long does a con need to be "idle" for in seconds.
    int tcp_idle_timeout = 1;
    (**socket).setsockopt(ZMQ_TCP_KEEPALIVE_IDLE, &tcp_idle_timeout, sizeof(tcp_idle_timeout));
    // Time in seconds between individual keep alive probes.
    int tcp_probe_interval = 1;
    (**socket).setsockopt(ZMQ_TCP_KEEPALIVE_INTVL, &tcp_probe_interval, sizeof(tcp_probe_interval));
    // Discard pending messages in buf on close.
    int is_linger = 0;
    (**socket).setsockopt(ZMQ_LINGER, &is_linger, sizeof(is_linger));
    // TCP user timeout on unacknowledged send buffer
    int is_user_timeout = 2;
    (**socket).setsockopt(ZMQ_TCP_MAXRT, &is_user_timeout, sizeof(is_user_timeout));
    // Start internal enclave event server.
    printf("Host: Starting enclave event server\n");
    (**socket).bind(endpoint);
}
What this does is tell the operating system to aggressively check the client socket for timeouts and reap them for cleanup when a client doesn't return a heart beat in time. The result is that the OS will send a SIGPIPE back to your program and socket errors will bubble up to send / recv - fixing a hung server. You then need to do two more things:
1. Handle SIGPIPE errors so the program doesn't crash
#include <signal.h>
#include <zmq.hpp>
// zmq_setup def here [...]
int main(int argc, char** argv)
{
    // Ignore SIGPIPE signals.
    signal(SIGPIPE, SIG_IGN);
    // ... rest of your code after
    // (Could potentially also restart the server
    // sock on N SIGPIPEs if you're paranoid.)
    // Start server socket.
    const char* endpoint = "tcp://127.0.0.1:47357";
    // Must start out as NULL: zmq_setup frees any non-NULL pointer it is given.
    zmq::context_t* context = NULL;
    zmq::socket_t* socket = NULL;
    zmq_setup(&context, &socket, endpoint);
    // Message buffers.
    zmq::message_t request;
    zmq::message_t reply;
    // ... rest of your socket code here
}
2. Check for -1 returned by send or recv and catch ZMQ errors.
// E.g. skip broken accepted sockets (pseudo-code.)
while (1)
{
    try
    {
        if ((*socket).recv(&request) == -1)
            throw -1;
    }
    catch (...)
    {
        // Prevent any endless error loops killing CPU.
        sleep(1);
        // Reset ZMQ state machine.
        try
        {
            zmq::message_t blank_reply = zmq::message_t();
            (*socket).send (blank_reply);
        }
        catch (...)
        {
            // Ignore errors from the forced reply.
        }
        continue;
    }
    // ... handle the request and send the reply here
}
Notice the weird code that tries to send a reply on a socket failure? In ZMQ, a REP server "socket" is an endpoint to another program making a REQ socket to that server. The result is that if you do a recv on a REP socket with a hung client, the server socket becomes stuck in a broken receive loop where it will wait forever to receive a valid reply.
To force an update on the state machine, you try to send a reply. ZMQ detects that the socket is broken, and removes it from its queue. The server socket becomes "unstuck", and the next recv call returns a new client from the queue.
To enable timeouts on an async client (in Python 3), the code would look something like this:
import asyncio
import zmq
import zmq.asyncio
@asyncio.coroutine
def req(endpoint):
    ms = 2000 # In milliseconds.
    sock = ctx.socket(zmq.REQ)
    sock.setsockopt(zmq.SNDTIMEO, ms)
    sock.setsockopt(zmq.RCVTIMEO, ms)
    sock.setsockopt(zmq.LINGER, ms) # Discard pending buffered socket messages on close().
    sock.setsockopt(zmq.CONNECT_TIMEOUT, ms)
    # Connect the socket.
    # Connections don't strictly happen here.
    # ZMQ waits until the socket is used (which is confusing, I know.)
    sock.connect(endpoint)
    # Send some bytes.
    yield from sock.send(b"some bytes")
    # Recv bytes and convert to unicode.
    msg = yield from sock.recv()
    msg = msg.decode(u"utf-8")
Now you have some failure scenarios when something goes wrong.
By the way -- if anyone's curious -- the default value for TCP idle timeout in Linux seems to be 7200 seconds or 2 hours. So you would be waiting a long time for a hung server to do anything!
Sources:
https://github.com/zeromq/libzmq/blob/84dc40dd90fdc59b91cb011a14c1abb79b01b726/src/tcp_listener.cpp#L82 TCP keep alive options preserved for client sock
http://www.tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/ How does keep alive work
https://github.com/zeromq/libzmq/blob/master/builds/zos/README.md Handling sig pipe errors
https://github.com/zeromq/libzmq/issues/2586 for information on closing sockets
https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
https://github.com/zeromq/libzmq/issues/976
Disclaimer:
I've tested this code and it seems to be working, but ZMQ does complicate testing this a fair bit because the client re-connects on failure. If anyone wants to use this solution in production, I recommend writing some basic unit tests first.
The server code could also be improved a lot with threading or polling to be able to handle multiple clients at once. As it stands, a malicious client can temporarily take up resources from the server (3 second timeout) which isn't ideal.
I'd like to make a chat program using Winsock in C/C++. (I am a total newbie.)
The first question is about how to check whether the client has received packets from the server.
For instance, a server sends "aaaa" to a client.
If the client doesn't receive the packet "aaaa", the server should re-send it (I think). However, I don't know how to check for that.
Here are my thoughts below.
First case.
Server --- "aaaa" ---> Client.
The server waits, with a timeout, for a confirmation message from the client.
Client --- "I received it" ---> Server.
Server won't re-send the packet.
The other case.
Server --- "aaaa" ---> Client.
Server is waiting for client msg until time out
Server --- "aaaa" ---> Client again.
But these are probably inappropriate.
Look at the second case. The server waits for a message from the client for a while.
If it times out, the server re-sends the packet.
In this case, the client might receive the packet twice.
The second question is how to send a packet of unlimited size.
A book says a packet should have a type, a size, and a message.
Following that, I can only send messages of a certain size.
But I want to send messages of 1 MB or more (unlimited).
How do I do that?
Does anyone have a good link, or can anyone explain the correct logic to me as simply as possible?
Thanks.
Use TCP. Think "messages" at the application level, not packets.
TCP already handles network-level packet data, error checking & resending lost packets. It presents this to the application as a "stream" of bytes, but without necessarily guaranteed delivery (since either end can be forcibly disconnected).
So at the application level, you need to handle Message Receipts & buffering -- with a re-connecting client able to request previous messages, which they hadn't (yet) correctly received.
Here are some data structures:
class or struct Message {
    int type; // const MESSAGE.
    int messageNumber; // sequentially incrementing.
    int size; // 4 bytes, probably signed; allows up to 2GB data.
    byte[] data;
}
class or struct Receipt {
    int type; // const RECEIPT.
    int messageNumber; // last #, successfully received.
}
You may also want a Connect/ Hello and perhaps a Disconnect/ Goodbye handshake.
class Connect {
    int type; // const CONNECT.
    int lastReceivedMsgNo; // last #, successfully received.
    // plus, who they are?
    short nameLen;
    char[] name;
}
etc.
If you can be really simple & don't need to buffer/ re-send messages to re-connecting clients, it's even simpler.
You could also adopt a "uniform message structure" which had TYPE and SIZE (4-byte int) as the first two fields of every message or handshake. This might help standardize your routines for handling these, at the expense of some redundancy (eg in 'name' field-sizes).
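To make the Message and Receipt structures above concrete, here is a rough C++ sketch of how they might be put on the wire and how the receiver could use messageNumber to ignore a message that the server re-sent after a timeout. The type codes, the big-endian layout, and the names encode_message and Receiver are assumptions made for illustration, not part of the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::uint32_t kTypeMessage = 1;  // hypothetical type codes
constexpr std::uint32_t kTypeReceipt = 2;

// Append a 32-bit value in big-endian (network) byte order.
void put_u32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(v >> shift));
}

// Read a big-endian 32-bit value; the caller guarantees offset + 4 <= in.size().
std::uint32_t get_u32(const std::vector<std::uint8_t>& in, std::size_t offset) {
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < 4; ++i) v = (v << 8) | in[offset + i];
    return v;
}

// Wire layout: type | messageNumber | size | data
std::vector<std::uint8_t> encode_message(std::uint32_t number, const std::string& data) {
    std::vector<std::uint8_t> out;
    put_u32(out, kTypeMessage);
    put_u32(out, number);
    put_u32(out, static_cast<std::uint32_t>(data.size()));
    out.insert(out.end(), data.begin(), data.end());
    return out;
}

// Receiver side: remembers the last message number delivered, so a re-sent
// message is not delivered twice.
class Receiver {
public:
    // Returns true if the message was new and has been delivered.
    bool on_message(const std::vector<std::uint8_t>& wire) {
        if (wire.size() < 12 || get_u32(wire, 0) != kTypeMessage) return false;
        const std::uint32_t number = get_u32(wire, 4);
        const std::uint32_t size = get_u32(wire, 8);
        if (wire.size() - 12 != size) return false;      // truncated or malformed
        if (number != last_received_ + 1) return false;  // duplicate or gap
        delivered_.emplace_back(wire.begin() + 12, wire.end());
        last_received_ = number;
        return true;
    }

    // Wire layout: type | messageNumber (last number successfully received)
    std::vector<std::uint8_t> receipt() const {
        std::vector<std::uint8_t> out;
        put_u32(out, kTypeReceipt);
        put_u32(out, last_received_);
        return out;
    }

    std::uint32_t last_received() const { return last_received_; }
    const std::vector<std::string>& delivered() const { return delivered_; }

private:
    std::uint32_t last_received_ = 0;
    std::vector<std::string> delivered_;
};
```

In this sketch the server would keep each message buffered until a receipt with that number (or a higher one) comes back, and a re-connecting client would send its last received number so the server knows where to resume.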
For the first part, have a look at TCP.
It provides ordered and reliable delivery. You can also customize this behaviour by implementing a similar scheme yourself on top of UDP.
Broadly, what it does is:
Server:
1. Numbers each packet and sends it.
2. Waits for the acknowledgement of a specific packet number, and then re-transmits the lost packets.
Client:
1. Receives a packet and maintains a buffer (sliding window).
2. Keeps collecting packets in the buffer until the buffer overflows or a wrongly sequenced packet arrives. As soon as that happens, the packets with the right sequence are 'delivered', and the sequence number of the last correct packet is sent with the acknowledgement.
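The client side of that scheme can be sketched roughly as follows. This is a deliberately simplified illustration rather than how TCP is actually implemented: the class name InOrderReceiver is made up, sequence numbers are assumed to start at 1, and there is no limit on the buffer size.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Buffers packets that arrive early, delivers them in sequence order,
// and reports a cumulative acknowledgement after each packet.
class InOrderReceiver {
public:
    // Accept packet `seq`; returns the highest sequence number delivered
    // in order so far (0 if nothing has been delivered yet).
    std::uint32_t on_packet(std::uint32_t seq, const std::string& data) {
        if (seq >= next_) pending_.emplace(seq, data);  // ignore old duplicates
        // Deliver every packet that is now contiguous with what was delivered.
        for (auto it = pending_.find(next_); it != pending_.end();
             it = pending_.find(next_)) {
            delivered_.push_back(it->second);
            pending_.erase(it);
            ++next_;
        }
        return next_ - 1;
    }

    const std::vector<std::string>& delivered() const { return delivered_; }

private:
    std::uint32_t next_ = 1;  // next sequence number expected
    std::map<std::uint32_t, std::string> pending_;
    std::vector<std::string> delivered_;
};
```

The sender would re-transmit any packet whose number is above the last acknowledgement once its timer expires. Because the receiver drops sequence numbers it has already delivered, receiving the same packet twice is harmless.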
For second part:
I would use HTTP for it, with some modifications. For example, you should have a unique indicator that tells the client the transmission is now complete.