C/C++: Write and Read Sockets

I'm sending and receiving info with a unix socket, but I do not completely understand how it works. Basically, I send a message like this:
int wr_bytes = write(sock, msg.c_str(), msg.length());
And receive message like this:
int rd_bytes = read(msgsock, buf, SOCKET_BUFFER_SIZE);
This code works perfectly with thousands of bytes, what I don't understand is, how does the read function knows when the other part is done sending the message? I tried to read the read documentation and, on my understanding read will return once it reaches EOF or the SOCKET_BUFFER_SIZE, is that correct?
So I'm guessing that when I give my string to the write function, it adds an EOF at the end of my content so the read function knows when to stop.
I'm asking this question because, I did not add any code that checks whether the other part finished sending the message, however, I'm receiving big messages (thousands of bytes) without any problem, why is that happening, why am I not getting only parts of the message?
Here is the full function I'm using to send a message to a unix socket server:
string sendSocketMessage(string msg) {
int sock;
struct sockaddr_un server;
char buf[1024];
sock = socket(AF_UNIX, SOCK_STREAM, 0);
if (sock < 0) {
throw runtime_error("opening stream socket");
}
server.sun_family = AF_UNIX;
strcpy(server.sun_path, "socket");
if (connect(sock, (struct sockaddr *) &server, sizeof(struct sockaddr_un)) < 0) {
close(sock);
throw runtime_error("connecting stream socket");
}
if (write(sock, msg.c_str(), msg.length()) < 0){
close(sock); // close before throwing; a statement after throw is never reached
throw runtime_error("writing on stream socket");
}
bzero(buf, sizeof(buf));
int rval = read(sock, buf, 1024);
return string( reinterpret_cast< char const* >(buf), rval );
}
And here is my server function (a little bit more complicated, the type vSocketHandler represents a function that I call to handle requests):
void UnixSocketServer::listenRequests(vSocketHandler requestHandler){
int sock, msgsock, rval;
struct sockaddr_un server;
char buf[SOCKET_BUFFER_SIZE];
sock = socket(AF_UNIX, SOCK_STREAM, 0);
if (sock < 0) {
throw runtime_error("opening stream socket");
}
server.sun_family = AF_UNIX;
strcpy(server.sun_path, SOCKET_FILE_PATH);
if (bind(sock, (struct sockaddr *) &server, sizeof(struct sockaddr_un))) {
throw runtime_error("binding stream socket");
}
listen(sock, SOCKET_MAX_CONNECTIONS);
while(true) {
msgsock = accept(sock, 0, 0);
if (msgsock == -1){
throw runtime_error("accept socket");
} else {
bzero(buf, sizeof(buf));
if((rval = read(msgsock, buf, SOCKET_BUFFER_SIZE)) < 0)
throw runtime_error("reading stream message");
else if (rval == 0){
//do nothing, client closed socket
break;
} else {
string msg = requestHandler(string( reinterpret_cast< char const* >(buf), rval ));
if(write(msgsock, msg.c_str(), msg.length()) < 0)
throw runtime_error("sending stream message");
}
close(msgsock);
}
}
close(sock);
unlink(SOCKET_FILE_PATH);
}

what I don't understand is, how does the read function knows when the other part is done sending the message?
For a stream-type socket, such as you're using, it doesn't. For a datagram-type socket, communication is broken into distinct chunks, but if a message spans multiple datagrams then the answer is again "it doesn't". This is indeed one of the key things to understand about the read() and write() (and send() and recv()) functions in general, and about sockets more specifically.
For the rest of this answer I'll focus on stream oriented sockets, since that's what you're using. I'll also suppose that the socket is not in non-blocking mode. If you intend for your data transmitted over such a socket to be broken into distinct messages, then it is up to you to implement an application-level protocol by which the other end can recognize message boundaries.
I tried to read the read documentation and, on my understanding read will return once it reaches EOF or the SOCKET_BUFFER_SIZE, is that correct?
Not exactly. read() will return if it reaches the end of the file, which happens when the peer closes its socket (or at least shuts down the write side of it) so that it is certain that no more data will be sent. read() will also return in the event of any of a variety of error conditions. And read() may return under other unspecified circumstances, provided that it has transferred at least one byte. In practice, this last case is generally invoked if the socket buffer fills, but it may also be invoked under other circumstances, such as when the buffer empties.
So I'm guessing that when I give my string to the write function, it adds an EOF at the end of my content so the read function knows when to stop.
No, it does no such thing. On success, the write() function sends some or all of the bytes you asked it to send, and nothing else. Note that it is not guaranteed even to send all the requested bytes; its return value tells you how many of them it actually did send. If that's fewer than "all", then ordinarily you should simply perform another write() to transfer the rest. You may need to do this multiple times to send the whole message. In any event, only the bytes you specify are sent.
I'm asking this question because, I did not add any code that checks whether the other part finished sending the message, however, I'm receiving big messages (thousands of bytes) without any problem, why is that happening, why am I not getting only parts of the message?
More or less because you're getting lucky, but the fact that you're using UNIX-domain sockets (as opposed to network sockets) helps. Your data are transferred very efficiently from sending process to receiving process through the kernel, and it is not particularly surprising that large write()s are received by single read()s. You cannot safely rely on that always happening, however.
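One common way to stop relying on luck is a length prefix. The sketch below assumes a protocol of my own choosing, not one from the question: the sender first writes a 4-byte big-endian length, then the payload. The names readExact, recvFramed and kMaxMessage are hypothetical.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unistd.h>

// Reads exactly len bytes, looping over short reads.
// Returns false on error or if the peer closes before len bytes arrive.
bool readExact(int fd, char* out, std::size_t len) {
    while (len > 0) {
        ssize_t n = read(fd, out, len);
        if (n == 0) return false;              // end of file: peer closed
        if (n < 0) {
            if (errno == EINTR) continue;      // interrupted: retry
            return false;
        }
        out += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Arbitrary upper bound picked for this sketch so that a corrupt header
// cannot trigger a huge allocation.
const std::uint32_t kMaxMessage = 16u * 1024u * 1024u;

// Receives one message framed as [4-byte big-endian length][payload].
bool recvFramed(int fd, std::string& out) {
    std::uint32_t netLen = 0;
    if (!readExact(fd, reinterpret_cast<char*>(&netLen), sizeof netLen))
        return false;
    std::uint32_t len = ntohl(netLen);
    if (len > kMaxMessage) return false;
    out.assign(len, '\0');
    return len == 0 || readExact(fd, &out[0], len);
}
```

The sender has to write the matching header (htonl of the payload size) before the payload, using a loop that handles short writes.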

Related

TCP C send data when not receiving data

I'm trying to send data to the connected client, even when the client did not send me a message first.
This is my current code:
while (true) {
// open a new socket to transmit data per connection
int sock;
if ((sock = accept(listen_sock, (sockaddr *) &client_address, &client_address_len)) < 0) {
logger.log(TYPE::ERROR, "server::could not open a socket to accept data");
exit(0);
}
int n = 0, total_received_bytes = 0, max_len = 4096;
std::vector<char> buffer(max_len);
logger.log(TYPE::SUCCESS,
"server::client connected with ip address: " + std::string(inet_ntoa(client_address.sin_addr)));
// keep running as long as the client keeps the connection open
while (true) {
n = recv(sock, &buffer[0], buffer.size(), 0);
if (n > 0) {
total_received_bytes += n;
std::string str(buffer.begin(), buffer.end());
KV key_value = kv_from(vector_from(str));
messaging.set_command(key_value);
}
std::string message = "hmc::" + messaging.get_value("hmc") + "---" + "sonar::" + messaging.get_value("sonar") + "\n";
send(sock, message.c_str(), message.length(), 0);
}
logger.log(TYPE::INFO, "server::connection closed");
close(sock);
}
I thought by moving the n = recv(sock, &buffer[0], buffer.size(), 0); outside the while condition that it would send the data indefinitely, but that is not what happened.
Thanks in advance.
Solution
Adding MSG_DONTWAIT to the recv function enabled non-blocking operations which I was looking for.
First I will explain why it does not work, then I will propose some solutions. Basically, you will find the answer in the Linux man-pages on man7.org, and for recv specifically here.
When the function "recv" is called, it will not return until data is available and can be read. This behavior is called "blocking": the current thread of execution is blocked until data has been read.
So, calling the function
n = recv(sock, &buffer[0], buffer.size(), 0);
as you did, causes the trouble. You also need to check the return value: 0 means the connection was closed, and -1 means an error occurred, in which case you must check errno for further information.
You can switch the socket to non-blocking mode for its whole lifetime with the function fcntl and the O_NONBLOCK flag. You can also pass the flag MSG_DONTWAIT as the 4th parameter (flags) to make a single call non-blocking.
In both cases, if no data is available, the function returns -1 and you need to check errno for EAGAIN or EWOULDBLOCK.
A return value of 0 indicates that the connection has been closed.
But from an architectural point of view, I would not recommend this approach. You could use multiple threads for receiving and sending data or, on Linux, one of select, poll or similar functions. There is even a common design pattern for this, called "Reactor". There are also related patterns such as "Acceptor/Connector" and "Proactor"/"ACT". If you plan to write a more robust application, you may want to consider those.
You will find an implementation of Acceptor, Connector, Reactor, Proactor, ACT here
Hope this helps

Is this function doing something wrong with the sockets?

I have been using the following function to receive XML files for a while, but it has been going wrong for some time now and I think the problem is on the customer's network. I'm not sure, it's just a guess.
It sometimes happens when they try to send me XML files bigger than 13KB - the received buffer contains trash like this:
...
<Identifiers>
<Identifier>
<PID>E3744</PID>
</Identifier>
<Identifier IDType="SHC">
<PID>10021020</PID>
</Identifier>
<Identifier><*X| Å Å Ÿòc PV“R¢ E ·Â÷# #€ˆ
þõ
øæ=Ì×KåÅôdËÞ¦P s÷j
<PID>1002102-0</PID>
</Identifier>
<Identifier>
<PID>1002102</PID>
</Identifier>
</Identifiers>
...
Here is the function:
bool ReceiveBuffer(HWND hDlg, const SOCKET& socket, string& sBuffer)
{
WSAAsyncSelect(socket, hDlg, WM_WINSOCK, FD_CLOSE);
int iBufSize = 10000000; //10MB
int iBufVarSize = sizeof(iBufSize);
if (setsockopt(socket, SOL_SOCKET, SO_RCVBUF, (char*)&iBufSize, iBufVarSize) == SOCKET_ERROR)
if (getsockopt(socket, SOL_SOCKET, SO_RCVBUF, (char*)&iBufSize, &iBufVarSize) == SOCKET_ERROR)
WriteLog("Unable to GET buffer receiving size");
char* buf = (char*)MALLOCZ(iBufSize);
if (!buf)
{
WriteLog("Unable to allocate memory");
return false;
}
int iCharsRead = 0;
do
{
memset(buf, 0, iBufSize);
iCharsRead = recv(socket, buf, iBufSize, 0);
if (iCharsRead > 0)
sBuffer.append(buf, iCharsRead);
}
while (iCharsRead > 0);
FREE(buf);
buf = NULL;
return true;
}
ReceiveBuffer() should not be calling WSAAsyncSelect() or setting SO_RCVBUF. That is the responsibility of whatever code initially creates the SOCKET.
But more importantly, WSAAsyncSelect() puts the socket into non-blocking mode, per the documentation:
The WSAAsyncSelect function automatically sets socket s to nonblocking mode, regardless of the value of lEvent.
However, your reading loop is not accounting for possible WSAEWOULDBLOCK errors from recv() so it can call recv() again to keep reading.
ReceiveBuffer() is also assuming that if setsockopt() succeeds then the actual buffer size is really the requested size, which is not guaranteed. So you need to call getsockopt() regardless of whether setsockopt() succeeds or fails, per the documentation:
SO_RCVBUF and SO_SNDBUF
When a Windows Sockets implementation supports the SO_RCVBUF and SO_SNDBUF options, an application can request different buffer sizes (larger or smaller). The call to setsockopt can succeed even when the implementation did not provide the whole amount requested. An application must call getsockopt with the same option to check the buffer size actually provided.
But really, setting SO_RCVBUF on every call to ReceiveBuffer() is not necessary in the first place. recv() returns whatever data is currently available at that moment, up to the requested buffer size. It is very unlikely that it will return anywhere close to 10MB of data on any given read. So you are just wasting a lot of memory for no real benefit. It is one thing to set the socket's internal buffer to 10MB if you are on a fast network. It is another thing to allocate a memory buffer of 10MB to receive data from each recv() call. You should use a much smaller memory buffer. 1K is a common size to use.
But beyond that, regardless of the buffer size you use, ReceiveBuffer() is reading arbitrary bytes in an endless loop until the socket is disconnected or errors (and not accounting for non-blocking errors). When the socket does eventually disconnect/error, ReceiveBuffer() is returning true instead of false, so the caller has no idea that something went wrong, or that sBuffer may be incomplete.
Also, in case the caller calls ReceiveBuffer() multiple times with the same variable for the sBuffer parameter, you should call sBuffer.clear() before starting the reading loop to make sure you are not appending new data to the end of stale data.
Now, all of the above is just technical issues with your code logic. But there is also a semantic element as well. XML has a finite length to it, but your current code has no way of knowing what that length actually is. It is the sender's responsibility to tell the receiver when the XML has stopped being sent. That could be by sending the XML's length before sending the XML itself, so the receiver knows how many bytes to expect. Or that could be by sending a unique delimiter, like a null terminator, at the end of the XML, so the receiver can stop reading when it sees the delimiter. Or that could be by gracefully closing the connection at the end of the XML (which is a bad idea, because then the receiver can't differentiate between end-of-data and data loss). But it has to do something.
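As an illustration of the delimiter option, here is a small sketch of the receiver-side bookkeeping. It assumes the sender appends a null terminator after each XML document; the function name extractMessages is hypothetical, and the code is plain C++ with no socket calls, so it applies equally to Winsock.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Moves every complete, delimiter-terminated message out of pending and
// leaves any incomplete tail in place for the next recv() to extend.
std::vector<std::string> extractMessages(std::string& pending, char delim = '\0') {
    std::vector<std::string> out;
    std::size_t start = 0;
    for (;;) {
        std::size_t end = pending.find(delim, start);
        if (end == std::string::npos) break;        // no complete message left
        out.emplace_back(pending, start, end - start);
        start = end + 1;                            // skip the delimiter itself
    }
    pending.erase(0, start);                        // keep only the partial tail
    return out;
}
```

The reading loop would append each recv() chunk to pending and then call extractMessages(pending). A message is only handed on once its terminator has arrived, so a dropped connection leaves an obviously incomplete tail instead of a silently truncated XML document.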
Now, with all of that said, try something more like this instead (I'm assuming a graceful disconnect is the end-of-data indicator, since that is what your original code is doing - you need to seriously consider a different protocol design!):
bool ReceiveBuffer(SOCKET socket, string& sBuffer)
{
sBuffer.clear();
/*
int iBufSize = 1024 * 1024 * 10; //10MB
setsockopt(socket, SOL_SOCKET, SO_RCVBUF, (char*)&iBufSize, sizeof(iBufSize));
int iBufVarSize = sizeof(iBufSize); // getsockopt() takes a pointer to the length
if (getsockopt(socket, SOL_SOCKET, SO_RCVBUF, (char*)&iBufSize, &iBufVarSize) == SOCKET_ERROR)
WriteLog("Unable to GET buffer receiving size");
*/
char* buf = (char*) malloc(1024);
if (!buf)
{
WriteLog("Unable to allocate memory");
return false;
}
int iCharsRead;
bool bRet = true;
do
{
iCharsRead = recv(socket, buf, 1024, 0);
if (iCharsRead > 0)
{
sBuffer.append(buf, iCharsRead);
}
else if (iCharsRead == 0)
{
// socket disconnected gracefully
break;
}
else
{
if (WSAGetLastError() != WSAEWOULDBLOCK)
{
// socket error!
WriteLog("Unable to read from socket");
bRet = false;
break;
}
// socket is non-blocking and there is no data available
// at this moment. Call recv() again...
// optional: call select() to wait for new data to arrive
// before calling recv() again. For instance, this will
// allow you to fail the function if no new data arrived
// within a timeout period...
//
/*
fd_set fd;
FD_ZERO(&fd);
FD_SET(socket, &fd);
timeval tv;
tv.tv_sec = 30;
tv.tv_usec = 0;
int ret = select(0, &fd, NULL, NULL, &tv);
if (ret <= 0)
{
if (ret == 0)
{
// timeout!
WriteLog("Timeout waiting for data from socket");
}
else
{
// socket error!
WriteLog("Unable to wait for data from socket");
}
bRet = false;
break;
}
*/
}
}
while (true);
free(buf);
return bRet;
}

c++ Socket receive takes a long time

I am writing the client side of the Socket. When there is something to read my code works fine but when there is nothing to read, the recv never returns. Help please.
Code:
m_socket = socket(AF_INET, SOCK_STREAM, 0);
struct sockaddr_in dest;
if ( m_socket )
{
memset(&dest, 0, sizeof(dest)); /* zero the struct */
dest.sin_family = AF_INET;
dest.sin_addr.s_addr = inet_addr(address); /* set destination IP number */
dest.sin_port = htons(port);
if (connect(m_socket, (struct sockaddr *)&dest, sizeof(struct sockaddr)) == SOCKET_ERROR)
{
return false;
}
else
{
std::vector<char> inStartup1(2);
int recvReturn = recv(Socket, &inStartup1.at(0), inStartup1.size(), 0);
}
recv is a blocking call. This should help you:
The recv() call is normally used only on a connected socket. It returns the length of the message on successful completion. If a message is too long to fit in the supplied buffer, excess bytes may be discarded DEPENDING on the type of socket the message is received from.
If no messages are available at the socket, the receive calls wait for a message to arrive, unless the socket is nonblocking, in which case the value -1 is returned and the external variable errno is set to EAGAIN or EWOULDBLOCK. The receive calls normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested.
Taking this one step further, on a server this is how you would correctly handle a connection (socket or serial port does not matter):
make the socket/port non-blocking: this is the first important step; it means that recv() will read what is available (if anything) and return the number of read bytes or -1 in case of an error.
use select(), with a timeout, to find out when data becomes available. So now you wait for a certain amount of time for data to become available and then read it.
The next problem to handle is making sure you read the full message. Since there is no guarantee that the whole message will be available when you call recv(), you need to save whatever is available and go back to select() and wait for the next data to become available.
Put everything in a while(cond) construct to make sure you read all the data.
The condition in the while is the only thing left to figure out - you either know the length of the expected message or you use some delimiters to mark the end of the message.
Hope this helps!

Unix socket programming in C++, recv returning 0, but still receiving data, but sometimes receives more than what is sent

I am new to C++ and socket programming. I studied with Beej's guide, so my code is almost the same as in the guide, but I am struggling with some really strange bugs.
First, my server's recv() returns 0. According to document, the client should gracefully close the connection for recv() to return 0. Not really in my case. It returns 0, at the same time, I still receive the data from the client. So, the way Beej's do to receive, does not work for me. Can someone explain how this can be possible?
char buf[MAXDATASIZE];
numbytes = recv(new_fd, buf, MAXDATASIZE-1, 0);
buf[numbytes] = '\0';
the last line here, because numbytes is 0, it sweeps out all message I received. So I had to comment that out. Now, my code looks like this
char buf[MAXDATASIZE];
numbytes = recv(new_fd, buf, MAXDATASIZE-1, 0);
//buf[numbytes] = '\0';
printf("received: %s\n", buf);
It now works with receiving some messages sent by client. However, I did some string manipulation (appending) in the client side, and then sent the message. Now, I send string length of 29 in the client side, but the server receives 41 bytes with strange characters.
What I sent: received: Login#1 Mary 123456 451912345
received: Login#1 Mary 123456 451912345ÿ>É„ÿy#ÿ>Ád
Here is how I receive in the server:
while(1) { // main accept() loop
new_fd = accept(sockfd, (struct sockaddr *)&their_addr, &sin_size);
if (new_fd == -1) {
perror("accept");
continue;
}
char buf[MAXDATASIZE];
int numbytes;
if (numbytes = recv(new_fd, buf, MAXDATASIZE-1, 0) == -1)
perror("recv");
//buf[numbytes] = '\0'; // this had to be commented out
printf("received: %s\n", buf); // prints out with weird characters
string msgRcved = buf;
close(new_fd);
}
This is how I send from client:
// string loginCredential is loaded with "1 Mary 123456 451912345" at this point
loginCredentials.insert(0, "Login#");
const char* msgToSend = loginCredentials.c_str();
int numbytesSent;
if (numbytesSent = send(sockfd, msgToSend, strlen(msgToSend), 0) == -1)
perror("send");
I'd like to know how my recv receives data while it returns 0 at the first place. And, I'd like to know what I am doing wrong to recv data from client/send data to server.
You have a precedence problem.
This:
if (numbytes = recv(new_fd, buf, MAXDATASIZE-1, 0) == -1)
is equivalent to
if (numbytes = (recv(new_fd, buf, MAXDATASIZE-1, 0) == -1))
and
recv(new_fd, buf, MAXDATASIZE-1, 0) == -1
is 0 whenever recv succeeds.
The same problem is present on the sending end.
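A tiny self-contained demonstration of the effect, with a stand-in function replacing recv() so it runs without a socket. The names fakeRecv, buggyCount and fixedCount are made up for this sketch.

```cpp
// Stand-in for recv() in this demo: pretends that 29 bytes were received.
int fakeRecv() { return 29; }

// Parsed as numbytes = (fakeRecv() == -1): the comparison result is stored.
int buggyCount() {
    int numbytes;
    numbytes = fakeRecv() == -1;
    return numbytes;             // 0, because the call did not return -1
}

// Extra parentheses make the assignment happen before the comparison.
int fixedCount() {
    int numbytes;
    bool failed = (numbytes = fakeRecv()) == -1;
    (void)failed;                // a real caller would handle the error here
    return numbytes;             // 29, the actual byte count
}
```

This is why the server saw "0 bytes" while the buffer still held the data that had just been received.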
There's no reason to write such an awkward and error-prone condition.
This is safer:
int numbytes = recv(new_fd, buf, MAXDATASIZE-1, 0);
if (numbytes == -1)
perror("recv");
You have to test 'numbytes' for zero, separately, and if you get it close the socket and exit the read loop, because the peer has closed the connection. Otherwise, and assuming you have also tested for -1, you have to only process 'numbytes' bytes of the buffer. Not all of them. Otherwise you're liable to reprocess bytes you already processed. In this case that might mean restoring the line that null-terminated the buffer, or it might mean this:
printf("%.*s", numbytes, buf);
You are printing whatever garbage was in that stack-allocated buffer, not what the client sent. When recv(2) returns zero, nothing has been placed into the supplied buffer, so this is probably from some previous iteration of the loop.
Notes:
A connected TCP socket is a bi-directional stream of bytes. This means you might send several of your "messages" and receive them in one chunk on the other side, or the other way around. Read from the socket in a loop until you have enough data to process, i.e. use explicit message separators, or prepend the length of the message that follows. This is your application-level protocol.
Don't mix C and C++ string handling like this. std::string has a size() method, use it instead of calling strlen() on the result of c_str().
Allocating any sizable buffers on the stack, especially ones receiving input from the network is a bad idea.
Printing, or otherwise passing further, unverified network input is a gross security violation leading to all sorts of problems.
Edit 0:
@molbdnilo's answer is the right one. I did not spot the precedence problem in the conditionals. My notes still apply though.

Checking for errors before recv() called

I've got the following problem:
I made a server which is able to handle multiple connections by using select(). But select() also returns a client (index of FD_SET) if the socket just got an error like "client disconnect" or whatever.
Is it possible to check a socket without calling recv()? Because to receive I need to get a buffer out of my "BufferPool".
Sample code:
int ret = recv(client, buffer_pool->get(), BUFFER_SIZE, 0);
if(ret == -1) ... // something went wrong
Well then I have to release the buffer again, and it was pretty much a waste of one buffer in my pool. (for a short time)
So isn't it possible to check the socket without calling recv()
I am not sure about Windows, but using getsockopt() works like a charm on POSIX-compliant systems. Before you use it, though, make sure that getting your buffer from the pool really is more expensive than making an extra system call. Here is a code snippet:
int my_get_socket_error(int fd)
{
int err_code;
socklen_t len = sizeof(err_code);
if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err_code, &len) != 0)
err_code = errno;
else
errno = err_code;
return err_code;
}
UPDATE:
According to this document, it seems like Windows supports it too.
No, there is no way to avoid the recv() call. If select() reports that the socket is readable, then you have to read from the socket to determine its new state. If the client disconnected gracefully, recv() will return 0, not -1. If you do not want to waste a pooled buffer, then you will have to read into a temporary local buffer first, and then if recv() returns any data, you can retrieve a pooled buffer and copy the read data into it.
Calling recv and similar functions does not talk directly to the network device.
When you send or receive data, all you do is ask the OS for available data, or hand it data to queue for sending. The OS then does the rest of the work after your code has already moved on.
That is why you only learn about errors on the next call to a socket function that "contacts" the OS networking layers.
It is normal to get errors at that point, and you have to deal with them.
To avoid blocking sockets and wasted buffers, look into asynchronous socket techniques or ready-made libraries that provide them; that way you don't need to allocate anything until the socket triggers a receive callback, which is where you do the actual receiving.
Also, it is not a good technique to receive a large amount of data in one go, because TCP is a stream-based layer and you will run into messages that arrive merged together or split apart. It is recommended to put a small header (a few bytes) on your packets and receive that first; after the header, you read the rest of the message based on the length the header provides. This is just one possible example.
After some minutes of work and with your help, I now just receive 1 byte before receiving the full amount:
SOCKET client = ...;
char temp = 0x00;
int len = recv(client, &temp, 1, 0);
if(len <= 0) // 0: client disconnected, negative: socket error
{
// .. client error handling
return;
}
char* buffer = m_memory_pool->Get();
len = recv(client, buffer + 1, m_memory_pool->buffer_size() - 1, 0);
buffer[0] = temp;
// data handling
I also tried to set a timeout for recv(), but it seems that it does not work under Windows. This is my code:
...
long timeout_ms = 10;
struct timeval interval = {timeout_ms / 1000, (timeout_ms % 1000) * 1000};
if (interval.tv_sec < 0 || (interval.tv_sec == 0 && interval.tv_usec <= 0))
{
interval.tv_sec = 0;
interval.tv_usec = 10000;
}
setsockopt(s_sktIx, SOL_SOCKET, SO_RCVTIMEO, (char *)&interval, sizeof(struct timeval));
...