Send buffer empty of Socket in Linux?

Send buffer empty of Socket in Linux? - c++

Is there a way to check if the send buffer of an TCP Connection is completely empty?
I haven't found anything until now and just want to make sure a connection is not closed by my server while there are still data being transmitted to a certain client.
I'm using poll to check if I'm able to send data on a non-blocking socket. But by that I'm not able to find out if EVERYTHING has been sent in buffer, am I?

In Linux, you can query a socket's send queue with ioctl(sd, SIOCOUTQ, &bytes). See man ioctl for details.
The information is not completely reliable in the sense that it is possible that the data has been received by the remote host, since the buffer cannot be emptied until an ACK is received. You probably should not use it to add another level of flow-control on top of TCP.
If the remote host actually closes the connection (or half-closes it), then the socket become unwriteable, regardless of how much data might have been in the buffer. You can detect this condition by writing 0 bytes to the socket.
The more difficult (and often more likely) condition is the remote host becoming unreachable, because of network issues or because it crashes. In that case, data will pile up in the send buffer, but that can also happen because the remote host's receive buffer is full (perhaps because the process reading the buffer doesn't have enough resources to process its input). In the case of network routing issues, you might get a router notification (an ICMP error), which should make the socket unwritable; unfortunately, there are many network errors which just result in black holes.

Related

C++ UDP Receving

I have a problem I have been trying to iron out all day. The situation is as follows:
I have a server list - let's say 10 different servers.
I want to send a Proposal broadcast message using sendto command to all 10 servers.
I then want to listen and wait for the 10 servers to respond with an ACK + some message.
After some time, timeout using the data from the servers who had responded. (time will be variable based on the amount of requests)
I would like to make use of UDP so that it is connection independent, but also concerned that if I shoot out all messages at once , I might miss a message since I am not blocking on the revfrom line until all the messages are sent.
I could just wait after each send, but that seems inefficient from a broadcast perspective.
I could also setup a listen thread first, and then run the sendto's on a seperate thread, but then the listener (which is the whole program) is on another thread outside of main.
So my question is two fold: which of these approaches (if any) seem like the best fit given what I am trying to do? Secondly, is there any queue on the socket. Like Lets say its not 10, but 1000 servers - if a message comes in while it is not ready to receive, will this message be dropped?
I am open to suggestions on other ways to implement.
Thanks in advance!

Most personal computers these days are located behind firewalls that will block any incoming UDP packets --- indeed, most personal computers these days are also behind a NAT translation layer and don't even have their own Internet-routable IP address. I'd worry about that before I'd worry about missing the occasional incoming UDP message due to timing issues.
That said, in the case where your client is running on the open Internet (or is behind a firewall that is configured to allow UDP packets in), the timing issue isn't really a problem, because the networking stack allocates an incoming-data buffer for the every socket as part of the socket() call. Once you have successfully called bind() on the socket, any UDP packets arriving at that socket's port will be placed in to the socket's incoming-data buffer, ready to be handed over to your code the next time it calls recvfrom(). Importantly, this buffering will occur whether your thread is currently inside a recvfrom() call, or not.
It is possible for the incoming-data-buffer to fill up (it has a finite size, usually around 64KB); at which point any additional incoming UDP packets will be dropped. The usual way to avoid that is to make sure you call recvfrom() as soon as possible, or if that is not sufficient, you can use setsockopt() to tell the networking stack to make the socket's incoming data-buffer larger.
Meanwhile, your calls to sendto() will likely finish quickly, since sendto() returns as soon as the data in your array is copied into the socket's outgoing-data-buffer. In particular, sendto() does not wait for the bytes to go across the network, or (usually) even for the bytes to get to your network card. At worst, it might block until there is enough room in the outgoing-data-buffer to place the data there; and the outgoing-data-buffer is always draining at the line-speed of your network device.

How create raw socket in Linux without buffering receive packets? Is it possible?

I use Linux, and I have created an application that uses raw sockets. When I open it and recv(...), I get packets, which went earlier, and I guess were buffered in kernel, or network card driver. But I don't need them. I need only packets, which went after I opened the socket.
Of course, I can drop these packets, but I don't know how many packets I need drop, because each time quantity of packets is different.
How can I create this socket? Is it possible?

Depends on how you've negotiated the host/port to communicate on, and do you have control over whatever is sending these packets?
You could:
1) Immediately after opening the socket, do a recv() loop (with flags=MSG_DONTWAIT) and ignore every packet assuming that it was stale, ending the loop as soon as recv() returns <=0 bytes (it should set errno to EWOULDBLOCK to indicate that there was nothing left to read too, otherwise the cause could be another socket-related issue)
2) Negotiate a new port each time
3) Add a custom header to your packets (e.g. first N bits) to indicate e.g. sequence number, or a special "new connection" code, or a timestamp. This usage really depends on what you're doing on both ends of this raw socket.

C++ socket programming Max size of TCP/IP socket Buffer?

I am using C++ TCP/IP sockets. According to my requirements my client has to connect to a server and read the messages sent by it (that's something really new, isn't it) but... in my application I have to wait for some time (typically 1 - 2 hrs) before I actually start reading messages (through recv() or read()) and the server still keeps on sending messages.
I want to know whether there is a limit on the capacity of the buffer which keeps those messages in case they are not read and whose physical memory is used to buffer those messages? Sender's or receiver's?

TCP data is buffered at both sender and receiver. The size of the receiver's socket receive buffer determines how much data can be in flight without acknowledgement, and the size of the sender's send buffer determines how much data can be sent before the sender blocks or gets EAGAIN/EWOULDBLOCK, depending on blocking/non-blocking mode. You can set these socket buffers as large as you like up to 2^32-1 bytes, but if you set the client receive buffer higher than 2^16-1 you must do so before connecting the socket, so that TCP window scaling can be negotiated in the connect handshake, so that the upper 16 bits can come into play. [The server receive buffer isn't relevant here, but if you set it >= 64k you need to set it on the listening socket, from where it will be inherited by accepted sockets, again so the handshake can negotiate window scaling.]
However I agree entirely with Martin James that this is a silly requirement. It wastes a thread, a thread stack, a socket, a large socket send buffer, an FD, and all the other associated resources at the server for two hours, and possibly affects other threads and therefore other clients. It also falsely gives the server the impression that two hours' worth of data has been received, when it has really only been transmitted to the receive buffer, which may lead to unknown complications in recovery situations: for example, the server may be unable to reconstruct the data sent so far ahead. You would be better off not connecting until you are ready to start receiving the data, or else reading and spooling the data to yourself at the client for processing later.

Receiving data from already closed socket?

Suppose I have a server application - the connection is over TCP, using UNIX sockets.
The connection is asynchronous - in other words, clients' and servers' sockets are non-blocking.
Suppose the following situation: in some conditions, the server may decide to send some data to a connected client and immediately close the connection: using shutdown with SHUT_RDWR.
So, my question is - is it guaranteed, that when the client call recv, it will receive the (sent by the server) data?
Or, to receive the data, recv must be called before the server's shutdown? If so, what should I do (or, to be more precise, how should I do this), to make sure, that the data is received by the client?

You can control this behavior with "setsockopt(SO_LINGER)":
man setsockopt
SO_LINGER
Waits to complete the close function if data is present. When this option is enabled and there is unsent data present when the close
function is called, the calling application is blocked during the
close function until the data is transmitted or the connection has
timed out. The close function returns without blocking the caller.
This option has meaning only for stream sockets.
See also:
man read
Beej's Guide to Network Programming

There's no guarantee you will receive any data, let alone this data, but the data pending when the socket is closed is subject to the same guarantees as all the other data: if it arrives it will arrive in order and undamaged and subject to TCP's best efforts.
NB 'Asynchronous' and 'non-blocking' are two different things, not two terms for the same thing.

Once you have successfully written the data to the socket, it is in the kernel's buffer, where it will stay until it has been sent and acknowledged. Shutdown doesn't cause the buffered data to get lost. Closing the socket doesn't cause the buffered data to get lost. Not even the death of the sending process would cause the buffered data to get lost.
You can observe the size of the buffer with netstat. The SendQ column is how much data the kernel still wants to transmit.
After the client has acknowledged everything, the port disappears from the server. This may happen before the client has read the data, in which case it will be in RecvQ on the client. Basically you have nothing to worry about. After a successful write to a TCP socket, every component is trying as hard as it can to make sure that your data gets to the destination unharmed regardless of what happens to the sending socket and/or process.
Well, maybe one thing to worry about: If the client tries to send anything after the server has done its shutdown, it could get a SIGPIPE and die before it has read all the available data from the socket.

Is acknowledgment response necessary when using send()/recv() of Winsock?

Using Winsock, C++, I send and receive the data with send()/recv(), TCP connection. I want to be sure that the data has been delivered to the other party, and wonder if it is recommended to send back some acknowledgment message after (if) receiving data with recv.
Here are two possibilities, and please advice which way to go:
If send returns the size of passed buffer, assume that the data has been delivered at least to recv function on the other side of wire. When I say "at least", I mean even if the recv fails there (e.g. due to insufficient buffer, etc.), I don't care, I just want to be sure I've done my server part of work properly - I've sent the data completely (i.e. the data reached the other machine).
Use additional acknowledgment: after receiving the data with recv, send back some ID of received packet (part of header of each data sent) signaling the successful receive operation of that packet. If I don't receive such "acknowledgment message" after some interval, return failure code from the sender function.
The second answer looks more safe, but I don't want to complicate the transfer protocol if it is redundant. Also please note that I'm talking about the TCP connection (which is more safe by itself than UDP).
Is there any other mechanisms (maybe some other APIs? maybe WSARecv()/WSASend() work differently?) of ensuring that the data was delivered to the recv function on the other side?
If you recommend the second way, could you please give me some code snippet that allows me to use recv with timeout to receive the acknowledgment? recv is a blocking operation so it will hang forever if the previous send attempt failed (the other party was not notified). Is there any simple way of using recv with timeout (without creating separate thread every time which would probably be the overkill for each and every send operation).
Also the amount of data I pass to send function might be quite big (several megabytes), so how to choose the timeout for "acknowledgment message"? Maybe I should "split" large buffers and use several send calls? I think it will get quite complicated, please advice!
EDIT: OK, you people are suggesting that TCP/IP stack will handle it (i.e. no manual acknowledgment required), but this is what I found on MSDN page: "The successful completion of a send function does not indicate that the data was successfully delivered and received to the recipient. This function only indicates the data was successfully sent." So even if the TCP mechanism has the ability to ensure data delivery, I can't get that status (success or not) via send() function, or any other Winsock function I know. Do you know any way of getting the status from the TCP layer? Again - return value of send() function seems to be not enough!
========================================================
EDIT 2: OK, I think we agree that even though TCP protocol considers the error handling when something goes wrong, the send() function of Winsock is not capable of reporting the errors (simply because it returns before actual transmitting of data starts by the network driver). So here is a million dollar question: Does the send() function of Winsock at least ensure that no other packets will be delivered to the other party until the current packet will be? In other words, if the sending fails for some network failure (but not reported by send() call), and then the network failure will be fixed before next call of send() function with next chunk of data, will it be ensured that the previous packet (which failed but not reported by send()) will be delivered before the next packet? In other words, is there a chance that the one particular send() function will fail "silently", so that subsequent send() calls will succeed but the first packet will be lost? AGAIN - I'm not talking at the TCP level, I'm talking at the Winsock API level!

Why don't you trust your TCP/IP stack to guarantee delivery. After all, that is the whole point of using TCP instead of UDP.

The existing answers here are mostly correct: if you use TCP you really don't need to worry about reliable delivery of your packets to your peer.
But this is a dangerous view for some systems where data integrity must be taken to the next level: the common criteria auditing requirement FAU_STG.4.1 requires the ability to prevent auditable events if the audit log might suffer a loss of audit entries. (For example, the Linux auditd(8) audit logging daemon can be configured to place the computer in single-user-mode or halt the system completely when there is no more space left for audit logs.) Audit logs from remote systems should probably be maintained until it is known that they have been successfully written to centralized log servers.
Financial transactions would probably be best handled with a more reliable protocol than simple TCP as well -- crediting or debiting accounts would be best handled with a multi-staged protocol to ensure availability of funds, perform the transaction, then report the result of the transaction to the origination point.
TCP allows nearly a gigabyte of in-flight data between two peers (under extreme conditions); depending upon the requirements of your application, you might need to maintain that data at the sending side until you receive positive confirmation from your peer that the data has been properly handled.
Thankfully, most applications aren't this critical; losing a megabyte of data here or there down a socket that reports a closed connection at some point "in the future" really isn't horrible -- we just re-try our HTTP request, or re-attempt the SFTP connection.
Update
A socket will only accept enough data to fill its available window. The window size is negotiated between the two peers during the session handshake. So your calls to send() will begin blocking when the socket's window fills. (The OS might keep letting you add data to its internal buffers too, but at some point the writes will block.) If the peer breaks the connection with a RST or ICMP Unreachable message, a future call to send() will return an error value for Connection Reset or Broken Pipe.
Update 2
I'm not talking at the TCP level, I'm talking at the Winsock API level
This might be the source of confusion. send() has no choice but to adhere to the TCP behavior when used with TCP.
TCP guarantees in-order reliable delivery of a stream of bytes, to the extent that packets can be delivered. (See #Hans's comment about a pony and careless people kicking power cords.) The peer program will see bytes in the correct order they were sent. (Well, okay, TCP also has out-of-band urgent packet delivery, but I haven't actually seen any applications that use it. Using OOB packets, you can get some data out-of-line. Forget I mentioned it.)
If the remote program receives a byte sent on a TCP stream, it reliably received all preceding bytes as well. (Well, there are entire classes of replay attacks that splice together legitimate and fake packets for the remote peer, but those are increasingly difficult on systems with randomized initial sequence numbers. If this is within your threat model, you should be using TLS on top of TCP to provide cryptographically strong tamper evident information. But TLS can't provide better per-packet delivery notification.)

If you use UDP and you care about the data actually being received by the other side you NEED to use ACK, but if you don't need the speed of UDP you should use TCP, as it does the ACKing for you.

I think you are over complicating this, trust your TCP/IP software stack and the reliable delivery it offers. TCP sockets operate on streams of data, not packets. Also one call to send does not guarantee one call to recv.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js