Windows lowest level socket programming? - c++

I'm having a little bit of confusion regarding how low-level winsock is? I am wanting to write a VERY basic client-server program on windows. I don't really wish to use a bloated TCP or even UDP, just something extremely basic and low latency. Would winsock be ideal for this? Or is winsock the same as the windows network functions, just all packaged up (and possibly slower)? Would I be better just using PInvoke on the native windows networking functions?

Winsock, TCP, UDP, and any well received networking library built on top of these are all going to be comparable performance wise.
Use whichever one is easiest to get your work done.

First of all, it's possible to write completely new protocol w/o implementing network driver. For this you have raw sockets. On desktop Windows they are very limited (find "limitations").
It's possible, but not recommended. Don't reinvent the wheel and choose between UDP and TCP until you're completely sure you need something more sophisticated (but not simpler).
To send data over network (as opposite to direct cable link between two computers) you need IP protocol. To dispatch your data to right application you need transport protocol (UDP, TCP and others). UDP is almost the simplest possible one because that's was its main design goal. UDP provides additional addressing (port number in addition to IP address to deliver your data to right socket), packet boundaries ("length" field) and optional checksum. That's all and that's the minimum feature list. Take it and implement everything you need over UDP.
Next, if you need to be sure that your packets are delivered instead of silently dropped somewhere on the way (reliability), delivered in right order, if you'd like to know that somebody still listen to you on opposite end (connectivity) and other things implemented by best specialists, with hardware adapted to this implementation, tested by millions during decades, with lots of documentation, available on almost all possible platforms - use TCP.

Related

How can I recv TCP socket data in one package without dividing

Since I create a TCP socket,it is fine when sending small amount data.no fragment. all data came in one package. but when data becomes bigger and bigger. TCP package has been divided into pieces.. it`s really annoying. Is there any option to set on socket, and the socket will automatically put pieces into one package for me ?
It's a byte stream. All the bytes will arrive correctly and in the right order, but not necessarily when you want them. If you need to send anything more complex than one byte, you need another protocol on top of TCP. That's why there are all those other TCP/IP protocols like HTTP, SMTP etc.
No there is not. There are even situations where you might receive 1 byte.
Consider using higher level messaging libraries like ZMQ. It handles all the message packing and unpacking for you.
TCP provides you reliable bi-directional byte stream. It takes care of sequencing, transport-layer packetization, retransmission, and flow-control. Decades of research went into optimizing its performance. Pretty nifty. The small price you pay for all this convenience is that you have to write and read the stream in a loop, watching for a complete application protocol message you can process when receiving, and flushing yet unbuffered bytes when sending.
Welcome to socket programming!
I'll chime in here and say that there's pretty much nothing you can do to solve you issue without adding extra dependencies on libraries which handle application protocols for you. There are some lower level message packing libraries (google's protocol buffers, among others) which may help.
It's probably the most beneficial to get used to reading and writing TCP data in a loop. It's proven and very portable.. even if you pay a small price in actually writing the streaming codecs yourself.
Try it a few times. It's a useful experience which you can re-use, and it's really not as difficult and annoying once you get the hang of it (like anything else, really).
Furthermore, it's fairly easy to unit-test (rather than dealing with esoteric libraries and uncommon protocols with badly/sparsely documented options)..
You can optimize sockets reads to return larger chunks, on platforms that support it, by setting low watermark using setsockopt() and SO_RECVLOWAT. But you will still have to handle the possibility of getting bytes less than the watermark.
I think you want SOCK_SEQPACKET (or possibly SOCK_RDM). See socket(2).

Reducing network latency in communication of high volume intranet applications

We have a set of server applications which receive measurement data from equipment/tools. Message transfer time is currently our main bottleneck, so we are interested in reducing it to improve the process. The communication between the tools and server applications is via TCP/IP sockets made using C++ on Redhat Linux.
Is it possible to reduce the message transfer time using hardware, by changing the TCP/IP configuration settings or by tweaking tcp kernel functions? (we can sacrifice security for speed, since communication is on a secure intranet)
Depending on the workload, disabling Nagle's Algorithm on the socket connection can help a lot.
When working with high volumes of small messages, i found this made a huge difference.
From memory, I believe the socket option for C++ was called TCP_NODELAY
As #Jerry Coffin proposed, you can switch to UDP. UDP is unreliable protocol, this means you can lose your packets, or they can arrive in wrong order, or be duplicated. So you need to handle these cases on application level. Since you can lose some data (as you stated in your comment) no need for retransmission (the most complicated part of any reliable protocol). You just need to drop outdated packets. Use simple sequence numbering and you're done.
Yes, you can use RTP (it has sequence numbering) but you don't need it. RTP looks like an overkill for your simple case. It has many other features and is used mainly for multimedia streaming.
[EDIT] and similar question here
On the hardware side try Intel Server NICs and make sure the TCP offload Engine (ToE) is enabled.
There is also an important decision to make between latency and goodput, if you want better latency at expense of goodput consider reducing the interrupt coalescing period. Consult the Intel documentation for further details as they offer quite a number of configurable parameters.
If you can, the obvious step to reduce latency would be to switch from TCP to UDP.
Yes.
Google "TCP frame size" for details.

Efficient Packet types/transfer protocol

Using boost::asio in C++, I'm trying to determine the best way to encrypt packets in my program. I thought of defining packets all by myself by type number, each with different fixed packet sizes. The system reads the header (type, and quantity of entries for lists of data) and creates the appropriate structure to receive the data, then it reacts according to the data received.
However, when I look at this method, I wonder if there would be a simpler way to accomplish this without sacrificing efficiency.
These packets are to be sent between different applications trough TCP. Ideally, I'm aiming at both applications using as least bandwidth and CPU as possible while also being as simple to modify as possible. Any suggestions?
TCP uses streams of data, not packets. I highly suggest thinking of your data transmission as a stream of data instead of sequence of packets. This will make it easier to abstract into your code. Take a look at Boost.Serialization or Google Protocol Buffers.
Boost.Asio has SSL encryption capabilities, so it's trivial to encrypt the stream of data. It also has an example using serialization.
Have you considered google protobufs? While it doesn't actually do the encryption (you'll have to do that yourself), it does provide a way of encoding the structured data allowing you to send it over the wire efficiently. Additionally, there are many language bindings for it (C++, Java, and Python off the top of my head).

Sockets: large message performance

I have a doubt regarding the use of Berkeley Sockets under Ubuntu. In terms of performance and reliability which option is best? To send a high amount of messages but with short length or to send a low amount of messages but this ones with a large size? I do not know which is the main design rule I should follow here.
Thank you all!
In terms of reliability, unless you have very specific requirements it isn't worth worrying about much. If you are talking about TCP, it is going to do a better job than you managing things until you come across some edge case that really requires you to fiddle with some knobs, in which case a more specific question would be in order. In terms of packet size, with TCP unless you circumvent Nagel's algorithm, you don't really have the control you might think.
With UDP, arguably the best thing to do is use path MTU discovery, which TCP does for you automatically, but as a general rule you are fine just using something in 500 byte range. If you start to get too fancy you will find yourself reinventing parts of TCP.
With TCP, one option is to use the TCP_CORK socket option. See the getsockopt man page. Set TCP_CORK on the socket, write a batch of small messages, then remove the TCP_CORK option and they will be transmitted in a minimum number of network packets. This can increase throughput at the cost of increased latency.
Each network message has a 40-byte length header, but large messages are harder to route and easier to lose. If you are talking about UDP, so the best message size is Ethernet block, which is 1496 bytes long, if yiu are using TCP, leave it up to network layer to handle how much to send.
Performance you can find out yourself with iperf. Run a few experiments and you will see it yourself. As for reliability as far as I understand if you use TCP the TCP connection guarantee that data will be delivered of course if the connection is not broken.
"In terms of performance and reliability which option is best"
On a lossy layer, performance and reliability are almost a straight trade-off against each other, and greater experts than us have put years of work into finding sweet spots, and techniques which beat the straight trade-off and improve both at once.
You have two basic options:
1) Use stream sockets (TCP). Any "Messages" that your application knows about are defined at the application layer rather than at the sockets. For example, you might consider an HTTP request to be a message, and the response to be another in the opposite direction. You should see your job as being to keep the output buffer as full as possible, and the input buffer as empty as possible, at all times. Reliability has pretty much nothing to do with message length, and for a fixed size of data performance is mostly determined by the number of request-response round-trips performed rather than the number of individual writes on the socket. Obviously if you're sending one byte at a time with TCP_NODELAY then you'd lose performance, but that's pretty extreme.
2) Use datagrams (UDP). "Messages" are socket-layer entities. Performance is potentially better than TCP, but you have to invent your own system for reliability, and potentially this will hammer performance by requiring data to be re-sent. TCP has the same issue, but greater minds, etc. Datagram length can interact very awkwardly with both performance and reliability, hence the MTU discovery mentioned by Duck. If you send a large packet and it's fragmented, then if any fragment goes astray then your message won't arrive. There's a size N, where if you send N-size datagrams they won't fragment, but if you send N+1-size datagrams they will. Hence that +1 doubles the number of failed messages. You don't know N until you know the network route (and maybe not even then). So it's basically impossible to say at compile time what sizes will give good performance: even if you measure it, it'll be different for different users. If you want to optimise, there's no alternative to knowing your stuff.
UDP is also more work than TCP if you need reliability, which is built in to TCP. Potentially UDP has big payoffs, but should probably be thought of as the assembler programming of sockets.
There's also (3): use a protocol for improved UDP reliability, such as RUDP. This isn't part of Berkeley-style sockets APIs, so you'll need a library to help.

Communication between processes

I'm looking for some data to help me decide which would be the better/faster for communication between two independent processes on Linux:
TCP
Named Pipes
Which is worse: the system overhead for the pipes or the tcp stack overhead?
Updated exact requirements:
only local IPC needed
will mostly be a lot of short messages
no cross-platform needed, only Linux
In the past I've used local domain sockets for that sort of thing. My library determined whether the other process was local to the system or remote and used TCP/IP for remote communication and local domain sockets for local communication. The nice thing about this technique is that local/remote connections are transparent to the rest of the application.
Local domain sockets use the same mechanism as pipes for communication and don't have the TCP/IP stack overhead.
I don't really think you should worry about the overhead (which will be ridiculously low). Did you make sure using profiling tools that the bottleneck of your application is likely to be TCP overhead?
Anyways as Carl Smotricz said, I would go with sockets because it will be really trivial to separate the applications in the future.
I discussed this in an answer to a previous post. I had to compare socket, pipe, and shared memory communications. Pipes were definitely faster than sockets (maybe by a factor of 2 if I recall correctly ... I can check those numbers when I return to work). But those measurements were just for the pure communication. If the communication is a very small part of the overall work, then the difference will be negligible between the two types of communication.
Edit
Here are some numbers from the test I did a few years ago. Your mileage may vary (particularly if I made stupid programming errors). In this specific test, a "client" and "server" on the same machine echoed 100 bytes of data back and forth. It made 10,000 requests. In the document I wrote up, I did not indicate the specs of the machine, so it is only the relative speeds that may be of any value. But for the curious, the times given here are the average cost per request:
TCP/IP: .067 ms
Pipe with I/O Completion Ports: .042 ms
Pipe with Overlapped I/O: .033 ms
Shared Memory with Named Semaphore: .011 ms
There will be more overhead using TCP - that will involve breaking the data up into packets, calculating checksums and handling acknowledgement, none of which is necessary when communicating between two processes on the same machine. Using a pipe will just copy the data into and out of a buffer.
I don't know if this suites you, but a very common way of IPC (interprocess communication) under linux is by using the shared memory. It's actually ultra fast (I didn't profiled this, but this is just shared data on RAM with strong processing around it).
The main problem around this approuch is the semaphore, you must build a little system around it so you must make sure a process is not writing at the same time the other one is trying to read.
A very simple starter tutorial is at here
This is not as portable as using sockets, but the concept would be the same, so if you're migrating this to Windows, you will just have to change the shared memory create/attach layer.
Two things to consider:
Connection setup cost
Continuous Communication cost
On TCP:
(1) more costly - 3way handshake overhead required for (potentially) unreliable channel.
(2) more costly - IP level overhead (checksum etc.), TCP overhead (sequence number, acknowledgement, checksum etc.) pretty much all of which aren't necessary on the same machine because the channel is supposed to be reliable and not introduce network related impairments (e.g. packet reordering).
But I would still go with TCP provided it makes sense (i.e. depends on the situation) because of its ubiquity (read: easy cross-platform support) and the overhead shouldn't be a problem in most cases (read: profile, don't do premature optimization).
Updated: if cross-platform support isn't required and the accent is on performance, then go with named/domain pipes as I am pretty sure the platform developers will have optimize-out the unnecessary functionality deemed required for handling network level impairments.
unix domain socket is a very goog compromise. Not the overhead of tcp, but more evolutive than the pipe solution. A point you did not consider is that socket are bidirectionnal, while named pipes are unidirectionnal.
I think the pipes will be a little lighter, but I'm just guessing.
But since pipes are a local thing, there's probably a lot less complicated code involved.
Other people might tell you to try and measure both to find out. It's hard to go wrong with this answer, but you may not be willing to invest the time. That would leave you hoping my guess is correct ;)