Non blocking sockets with overlapped I/O - c++

I'm trying to work myself through the socket jungle and came across non blocking sockets with overlapped I/O. I have three books at home which only mention this concept but don't really explain it or give me any examples.
So what I'm looking for is an article where this get's explained with an example or just an example I can work myself through. It would be good if this would be for windows but I guess I should be able to transfer it from unix as well.
I don't mind a book as source but I would be glad to avoid another 50$. So far I only found the very basic concept and basic comparisons with other socket models. It's not that I don't understand their concept but I would like to see them in action and maybe get a good explanation of how they work. ( I don't mind long articles at all :) )

The phrase 'non-blocking sockets' usually refers to use of the FIONBIO socket option, which makes a call to read() return immediately even if there is no data ready to read. (It returns with an EWOULDBLOCK error.)
Overlapped I/O is something specific to the Windows API (and not available on UNIX for example). The FIONBIO socket option is not used, neither are the traditional Berkeley socket API function calls (read() etc).
(For a POSIX equivalent on Linux, refer to 'man aio' or type man aio into your favourite search engine.)
Now that you understand that 'non-blocking sockets' and 'overlapped I/O' are two different approaches (and not to be mixed), finding helpful articles on each should be much easier.
The MSDN documentation on overlapped I/O is actually very good. If you are doing Windows programming then you really need to be using overlapped I/O for anything where performance and scalability matters. Here's a good starting point:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365603%28v=vs.85%29.aspx
(This MSDN article is about pipes, but it is exactly the same using sockets.)
Also, be aware that WaitForMultipleObjects() doesn't scale as well as using I/O completion ports, but get comfortable with the former first.

Related

How to use non-blocking sockets with multiple threads?

I have read that working with more than 64 sockets in a thread is dangerous(?). But -at least for me- Non-blocking sockets are used for avoiding complicated thread things. Since there is only one listener socket, how am i supposed to split sockets into threads and use them with select() ? Should i create fd_sets for each thread or what ? And how am i supposed to assign a client to a thread, since I can only pass values in the beginning with CreateThread() ?
No no no, you got a few things wrong there.
First, the ideal way to handle many sockets is to have a thread pool which will do the work in front of the sockets (clients).
Another thread, or two (actually in the amount of CPUs as far as I know), do the connection accepting.
Now, when a an event occurs, such as a new connection, it is being dispatched to the thread pool to be processed.
Second, it depends on the actual implementation and environment.
For example, in Windows there's something called IOCP.
If you ask me - do not bother with the lower implementation but instead use a framework such as BOOST::ASIO or ACE.
I personally like ASIO. The best thing about those frameworks is that they are usually cross-platform (nix, Windows etc').
So, my answer is a bit broad but I think it's to the best that you take these facts into consideration before diving into code/manuals/implementation.
Good luck!
Well, what you have read is wrong. Many powerful single-threaded applications have been written with non-blocking sockets and high-performance I/O demultiplexers like epoll(4) and kqueue(2). Their advantage is that you setup your wait events upfront, so the kernel does not have to copy ton of file descriptors and [re-]setup lots of stuff on each poll.
Then there are advantages to threading if your primary goal is throughput, and not latency.
Check out this great overview of available techniques: The C10K problem.
The "ideal way to handle many sockets" is not always - as Poni seems to believe - to "have a thread pool."
What does "ideal" pertain to? Is it ease of programming? Best performance?
Since he recommends not bothering "with the lower implementation" and "use a framework such as BOOST::ASIO or ACE" I guess he means ease of programming.
Had he had a performance angle on Windows he would have recommended "something called IOCPs." IOCPs are "IO Control Ports" which will allow implementation of super-fast IO-applications using just a handful of threads (one per available core is recommended). IOCP applications run circles around any thread-pool equivalent which he would have known if he'd ever written code using them. IOCPs are not used alongside thread pools but instead of them.
There is no IOCP equivalent in Linux.
Using a framework on Windows may result in a faster "time to market" product but the performance will be far from what it might have been had a pure IOCP implementation been chosen.
The performance difference is such that OS-specific code implementations should be considered. If a generic solution is chosen anyway, at least performance would "not have been given away accidentally."

Getting to know the basics of Asynchronous programming on *nix

For some time now I have been googling a lot to get to know about the various ways to acheive asynchronous programming/behavior on nix machines and ( as known earlier to me ) got confirmed on the fact that there is still no TRULY async pattern (concurrency using single thread) for Linux as available for Windows(IOCP).
Below are the few alternatives present for linux:
select/poll/epoll :: Cannot be done using single thread as epoll is still blocking call. Also the monitored file descriptors must be opened in non-blocking mode.
libaio:: What I have come to know about is that its implementation sucks and its still notification based instead of being completion based as with windows I/O completion ports.
Boost ASIO :: It uses epoll under linux and thus not a true async pattern as it spawns thread which are completely abstracted from user code to acheive the proactor design pattern
libevent :: Any reason to go for it if I prefer ASIO?
Now Here comes the questions :)
What would be the best design pattern for writing fast scalable network server using epoll (ofcourse, will have to use threads here :( )
I had read somewhere that "only sockets can be opened in non-blocking mode" hence epoll supports only sockets and hence cannot be used for disk I/O.
How true is the above statement and why async programming cannot be done on disk I/O using epoll ?
Boost ASIO uses one big lock around epoll call. I didnt actually understand what can be its implications and how to overcome it using asio itself. Similar question
How can I modify ASIO pattern to work with disk files? Is there any recommended design pattern ?
Hope somebody will able to answer all the questions with nice explanations also. Any link to source where the implementation details of epoll and AIO design patterns are exaplained is also appreciated.
Boost ASIO :: It uses epoll under linux and thus not a true async
pattern as it spawns thread which are completely abstracted from user
code to acheive the proactor design pattern
This is not correct. The Asio library uses epoll() by default on most recent Linux kernel versions. however, threads invoking io_service::run() will invoke callback handlers as needed. There is only one place in the Asio library that a thread is used to emulate an asynchronous interface, it is well described in the documentation:
An additional thread per io_service is used to emulate asynchronous
host resolution. This thread is created on the first call to either
ip::tcp::resolver::async_resolve() or
ip::udp::resolver::async_resolve().
This does not make the library "not a true async pattern" as you claim, in fact its name would disagree with you by definition.
1) What would be the best design pattern for writing fast scalable network server using epoll (of course, will have to use threads here :(
)
I suggest using Boost Asio, it uses the proactor design pattern.
3) Boost ASIO uses one big lock around epoll call. I didnt actually
understand what can be its implications and how to overcome it using
asio itself
The epoll reactor uses a mutex to dispatch handlers, though in practice this is not a big concern for most applications. There are application specific ways to mitigate this behavior, such as an io_service per CPU to exploit data locality. See my answer to a similar question on this topic. It is also discussed on the Asio mailing list frequently.
4) How can I modify ASIO pattern to work with disk files? Is there any
recommended design pattern?
The Asio library does not natively support file I/O as you noted. There have been several attempts to add it to the library, I'd suggest discussing on the mailing list.
First of all:
got confirmed on the fact that there is still no TRULY async pattern (concurrency using single thread) for Linux as available for Windows(IOCP).
You probably has a small misconception, asynchronous can be build on top of "polling" api.
More then that "reactor" (epoll-like) API is more powerful then "proactor" API (IOCP) as
the second can be implemented in terms of the first one (but not the other way around).
Also some operations that are "truly" asynchronous for example like disk I/O, some some other tools can be with combination of signals and Linux specific signalfd can provide full coverage of some other cases.
Bottom line. epoll is truly asynchronous I/O

Most suitable asynchronous socket model for an instant messenger client?

I'm working on an instant messenger client in C++ (Win32) and I'm experimenting with different asynchronous socket models. So far I've been using WSAAsyncSelect for receiving notifications via my main window. However, I've been experiencing some unexpected results with Winsock spawning additionally 5-6 threads (in addition to the initial thread created when calling WSAAsyncSelect) for one single socket.
I have plans to revamp the client to support additional protocols via DLL:s, and I'm afraid that my current solution won't be suitable based on my experiences with WSAAsyncSelect in addition to me being negative towards mixing network with UI code (in the message loop).
I'm looking for advice on what a suitable asynchronous socket model could be for a multi-protocol IM client which needs to be able to handle roughly 10-20+ connections (depending on amount of protocols and protocol design etc.), while not using an excessive amount of threads -- I am very interested in performance and keeping the resource usage down.
I've been looking on IO Completion Ports, but from what I've gathered, it seems overkill. I'd very much appreciate some input on what a suitable socket solution could be!
Thanks in advance! :-)
There are four basic ways to handle multiple concurrent sockets.
Multiplexing, that is using select() to poll the sockets.
AsyncSelect which is basically what you're doing with WSAAsyncSelect.
Worker Threads, creating a single thread for each connection.
IO Completion Ports, or IOCP. dp mentions them above, but basically they are an OS specific way to handle asynchronous I/O, which has very good performance, but it is a little more confusing.
Which you choose often depends on where you plan to go. If you plan to port the application to other platforms, you may want to choose #1 or #3, since select is not terribly different from other models used on other OS's, and most other OS's also have the concept of threads (though they may operate differently). IOCP is typically windows specific (although Linux now has some async I/O functions as well).
If your app is Windows only, then you basically want to choose the best model for what you're doing. This would likely be either #3 or #4. #4 is the most efficient, as it calls back into your application (similar, but with better peformance and fewer issues to WSAsyncSelect).
The big thing you have to deal with when using threads (either IOCP or WorkerThreads) is marshaling the data back to a thread that can update the UI, since you can't call UI functions on worker threads. Ultimately, this will involve some messaging back and forth in most cases.
If you were developing this in Managed code, i'd tell you to look at Jeffrey Richter's AysncEnumerator, but you've chose C++ which has it's pros and cons. Lots of people have written various network libraries for C++, maybe you should spend some time researching some of them.
consider to use the ASIO library you can find in boost (www.boost.org).
Just use synchronous models. Modern operating systems handle multiple threads quite well. Async IO is really needed in rare situations, mostly on servers.
In some ways IO Completion Ports (IOCP) are overkill but to be honest I find the model for asynchronous sockets easier to use than the alternatives (select, non-blocking sockets, Overlapped IO, etc.).
The IOCP API could be clearer but once you get past it it's actually easier to use I think. Back when, the biggest obstacle was platform support (it needed an NT based OS -- i.e., Windows 9x did not support IOCP). With that restriction long gone, I'd consider it.
If you do decide to use IOCP (which, IMHO, is the best option if you're writing for Windows) then I've got some free code available which takes away a lot of the work that you need to do.
Latest version of the code and links to the original articles are available from here.
And my views on how my framework compares to Boost::ASIO can be found here: http://www.lenholgate.com/blog/2008/09/how-does-the-socket-server-framework-compare-to-boostasio.html.

alternatives to winsock2 with example server source in c++

i'm using this example implementation found at http://tangentsoft.net/wskfaq/examples/basics/select-server.html
This is doing most of what I need, handles connections without blocking and does all work in its thread (not creating a new thread for each connection as some examples do), but i'm worried since i've been told winsock will only support max 64 client connectios :S
Is this 64 connections true?
What other choices do I have? It would be cool to have a c++ example for a similar implementation.
Thanks
Alternative library:
You should consider using boost asio. It is a cross platform networking library which simplifies many of the tasks you may have to do.
You can find the example source code you seek here.
About the 64 limit:
There is no hard 64 connection limit that you will experience with a good design. Basically if you use some kind of threading model you will not experience this limitation.
Here's some information on the limit you heard about:
4.9 - What are the "64 sockets" limitations?
There are two 64-socket limitations:
The Win32 event mechanism (e.g.
WaitForMultipleObjects()) can only
wait on 64 event objects at a time.
Winsock 2 provides the
WSAEventSelect() function which lets
you use Win32's event mechanism to
wait for events on sockets. Because it
uses Win32's event mechanism, you can
only wait for events on 64 sockets at
a time. If you want to wait on more
than 64 Winsock event objects at a
time, you need to use multiple
threads, each waiting on no more than
64 of the sockets.
The select() function is also limited
in certain situations to waiting on 64
sockets at a time. The FD_SETSIZE
constant defined in winsock.h
determines the size of the fd_set
structures you pass to select(). It's
defined by default to 64. You can
define this constant to a higher value
before you #include winsock.h, and
this will override the default value.
Unfortunately, at least one
non-Microsoft Winsock stack and some
Layered Service Providers assume the
default of 64; they will ignore
sockets beyond the 64th in larger
fd_sets.
You can write a test program to try
this on the systems you plan on
supporting, to see if they are not
limited. If they are, you can get
around this with threads, just as you
would with event objects.
Source
#Brian:
if ((gConnections.size() + 1) > 64) {
// For the background on this check, see
// www.tangentsoft.net/wskfaq/advanced.html#64sockets
// The +1 is to account for the listener socket.
cout << "WARNING: More than 63 client "
"connections accepted. This will not "
"work reliably on some Winsock "
"stacks!" << endl;
}
To the OP:
Why would you not want to use winsock2?
You could try to look at building your own server using IOCP, although making this cross-platform is a little tricky. You could look at Boost::asio like Brian suggested.
Before you decide that you need 'alternatives to winsock2" please read this: Network Programming for Microsoft Windows.
In summary, you DON'T need an 'alternative to Winsock2' you need to understand how to use the programming models supplied to full effect on the platform that you're targeting. Then, if you really need cross platform sockets code that uses async I/O then look at ASIO, but, if you don't really need cross platform code then consider something that actually focuses on the problems that you might have on the platform that you do need to focus on - i.e. something windows specific. Go back to the book mentioned above and take a look at the various options you have.
The most performant and scalable option is to use IO Completion Ports. I have some free code available from here that makes it pretty easy to write a server that scales and performs well on a windows (NT) based platform; the linked page also links to some articles that I've written about this. A comparison of my framework to ASIO can be found here: http://www.lenholgate.com/blog/2008/09/how-does-the-socket-server-framework-compare-to-boostasio.html.

What's the deal with boost.asio and file i/o?

I've noticed that boost.asio has a lot of examples involving sockets, serial ports, and all sorts of non-file examples. Google hasn't really turned up a lot for me that mentions if asio is a good or valid approach for doing asynchronous file i/o.
I've got gobs of data i'd like to write to disk asynchronously. This can be done with native overlapped io in Windows (my platform), but I'd prefer to have a platform independent solution.
I'm curious if
boost.asio has any kind of file support
boost.asio file support is mature enough for everyday file i/o
Will file support ever be added? Whats the outlook for this?
Has boost.asio any kind of file support?
Starting with (I think) Boost 1.36 (which contains Asio 1.2.0) you can use [boost::asio::]windows::stream_handle or windows::random_access_handle to wrap a HANDLE and perform asynchronous read and write methods on it that use the OVERLAPPED structure internally.
User Lazin also mentions boost::asio::windows::random_access_handle that can be used for async operations (e.g. named pipes, but also files).
Is boost.asio file support mature enough for everyday file i/o?
As Boost.Asio in itself is widely used by now, and the implementation uses overlapped IO internally, I would say yes.
Will file support ever be added? Whats the outlook for this?
As there's no roadmap found on the Asio website, I would say that there will be no new additions to Boost.Asio for this feature. Although there's always the chance of contributors adding code and classes to Boost.Asio. Maybe you can even contribute the missing parts yourself! :-)
boost::asio file i/o on Linux
On Linux, asio uses the epoll mechanism to detect if a socket/file descriptor is ready for reading/writing. If you attempt to use vanilla asio on a regular file on Linux you'll get an "operation not permitted" exception because epoll does not support regular files on Linux.
The workaround is to configure asio to use the select mechanism on Linux. You can do this by defining BOOST_ASIO_DISABLE_EPOLL. The trade-off here being select tends to be slower than epoll if you're working with a large number of open sockets. Open a file regularly using open() and then pass the file descriptor to a boost::asio::posix::stream_descriptor.
boost::asio file i/o on Windows
On Windows you can use boost::asio::windows::object_handle to wrap a Handle that was created from a file operation. See example.
boost::asio::windows::random_access_handle is the easiest way to do this, if you need something advanced, for example asynchronous LockFileEx or something else, you might extend asio, add your own asynchronous events. example
io_uring has changed everything.
asio now support async file read/write.
See the releases notes:
asio 1.21.0 releases notes
ASIO supports overlapped I/O on Windows where support is good. On Unixes this idea has stagnated due to:
Files are often located on the same physical device, accessing them sequentially is preferable.
File requests often complete very rapidly because they are physically closeby.
Files are often critical to complete the basic operation of a program (e.g. reading in its configuration file must be done before initializing further)
The one common exception is serving files directly to sockets. This is such a common special-case that Linux has a kernel function that handles this for you. Again, negating the reason to use asynchronous file I/O.
In Short: ASIO appears to reflect the underlying OS design philosophy, overlapped I/O being ignored by most Unix developers, so it is not supported on that platform.
Asio 1.21 appears to have added built-in filesystem support.
For instance, asio::stream_file now exists with all the async methods you'd expect.
Linux has an asio Library that is no harder to use than Windows APIs for this job (I've used it). Both sets of operating systems implement the same conceptual architecture. They differ in details that are relevant to writing a good library, but not to the point that you cannot have a common interface for both OS platforms (I've used one).
Basically, all flavors of Async File I/O follow the "Fry Cook" architecture. Here's what I mean in the context of a Read op: I (processing thread) go up to a fast food counter (OS) and ask for a cheeseburger (some data). It gives me a copy of my order ticket (some data structure) and issues a ticket in the back to the cook (the Kernel & file system) to cook my burger. I then go sit down or read my phone (do other work). Later, somebody announces that my burger is ready (a signal to the processing thread) and I collect my food (the read buffer).