I'm fully aware of the major differences between poll() and select():
select() only supports a fixed amount of file descriptors
select() is supposedly supported on more systems
poll() allows slightly more fine-grained control of event types
poll() implementations may differ in certain details
However, they both accomplish the same task in roughly the same way. So:
Shall we use poll() or select()?
EDIT: I might add that I'm not interested in epoll() since portability is of concern to me. Furthermore, libev(ent) is not an option either, since I'm asking this question because I'm writing my own replacement library for libev(ent).
All remotely modern systems have poll, and it's a greatly superior interface to select/pselect in almost all ways:
poll allows more fine-grained detection of status than select.
poll does not have limits on the max file descriptor you can use (and more importantly, does not have critical vulnerabilities when you fail to check for file descriptors past the FD_SETSIZE limit).
The only disadvantages I can think of to using poll are that:
unlike pselect, poll cannot atomically unmask/mask signals, so you can't use it for waiting for a set of events that includes both file descriptor activity and signals unless you resort to the self-pipe trick.
poll only has millisecond resolution for the wait timeout, rather than microsecond (select) or nanosecond (pselect).
Certainly portability of poll is not a consideration anymore. Any system old enough to lack poll is full of so many vulnerabilities it should not be connected to a network.
In summary, unless you have very special needs (tiny timeout intervals, nasty signal interactions, scaling to millions of persistent connections, etc.) I would simply use poll and be done with it. As others have mentioned, libevent is also an option, but it's not clean/safe code (its use of select actually invokes dangerous UB trying to workaround the limitations of select!) and I find code that uses libevent is generally a lot more unnecessarily complicated than code that simply uses poll directly.
If you are writing for GNU/Linux, you should look at epoll(7).
But for most cross platform support, you could look into using libevent.
http://libevent.org/
Actually, it is hard to recommend a single poll/select implementation without knowing the specifics of what you are trying to do.
I would actually recommend boost::asio, then you can try both implementations and test to see what suits your setup best.
I would use libev or libevent. These libraries are cross-platform and abstract away the details of the underlying implementation (e.g. poll, select.)
Depending on your exact needs, I would recommend either poll or ::boost::asio. I find libevent to be kind of cumbersome and it has all kinds of stuff in it that's oriented towards C and/or towards higher-level protocol handling in it.
I would not recommend select. I have seen implementations of select invisibly fail in weird and bizarre ways because the descriptor limit was exceeded. And the best you can do is to make it fail in an obvious way. Maybe this is very unlikely with your application, but I wouldn't chance it.
And nowadays poll is available almost everywhere select is. About the only place it isn't is Windows. But, IMHO, if you want cross-platform portability to that platform you are better off using a nice wrapper like ::boost::asio that nicely wraps the most efficient OS technique.
Apple's poll() has trouble with TTYs, IME. Where portability is a concern select() might therefore be a better choice.
Related
I'm working on a MMO game server project and I have a problem. That's select() method's limit. I want to handle more than 1024 socket I/O with a single thread. I want to make this with single thread because I've tried to make a multi-thread handling system. That system creates 3 thread (for example in 4 cores processor; 1 is main, 3 is select() handlers) that handles select() method but there is an other problem again, now our limit is gone to 3072 (1024 * 3) and that isn't a solution! After that idea, I want to make a non-blocking socket system, with this system I've called 2 different select method in 1 single thread like this; "select() select()". They returns in order and I can handle them in order. But there is an other problem I think. If I want to implement a thread like "while(true){ select() select()}" and select() methods (non-blocking) retuns, I'll overload CPU like a empty "while(true)" block. If I want to make a select() timeout, I can't handle bottom select() in realtime. Now I can't make a algorithm for that. Can anybody help me about this?
NOTE: I don't want to use poll-epoll-wsapoll etc. (poll cannot handle microseconds, it isn't fast as select!) and libevent like 3rd party libraries (I want to make my own!)
FINALLY SOLUTION (I think): I don't need to handle nanoseconds for a I/O operation because there is no sense to handle it. Poll is a good way to handle more than 1024 socket I/O. I'll research something for understanding MMO systems. And the last one is I'll make some tests and I'll try somethings before I ask a question :) Thanks!
EDIT: I'm new in this Q&A platform. Can you tell me what's wrong with my question after giving a negative point? :)
Using select is fundamentally wrong with this many (thousands) of connections. While select is usually faster when you have only a very small number of sockets (maybe tens,) it scales horribly to several thousand and more. Everywhere that I know of, select slows down linearly with the number of connections (it's even worse than that, but I wouldn't go into the details.)
Even poll doesn't do much better than select at scaling to thousands of connections. It doesn't have select's (low) limit on the number of file descriptors you can poll, but it still scales linearly with the number of connections.
What you really should use are platform-specific facilities like epoll and kqueue. They scale extremely better (usually O(1),) but obviously they aren't portable.
I seriously suggest that you consider something like libev that is a portable, highly-tested and a thin wrapper around platform-specific facilities and services.
This is because platform-specific methods (e.g. select, poll, epoll, kqueue, I/O completion ports, event ports, etc.) are different form each other and none of them is available on more than one or two platforms, or their limits and the details of their behaviors differ slightly. These facilities might even change from one version of an OS to the next (e.g. epoll on Linux 2.6.9, IIRC.)
Even if you are not concerned with portability or future-proofing your code, such a library can provide you with more functionality and a nicer interface.
Two more libraries you can try are libevent (a little larger and slower, but more features) and libuv (if you need Windows portability.)
Given the requirements you have set, your problem has no solution.
The normal way to overcome select()'s limit of FD_SETSIZZE (1024) file descriptors is to use poll() (or even better alternatives epoll and kqueue) but you've rejected that option.
Otherwise, you could always overcome the problem by calling select() multiple times in parallel in different threads with different sets of file descriptors... but you've rejected that option too.
I don't believe there can really be any other solution!
Perhaps you should explain why both the poll() et al option and the thread option are not suitable. Your requirements seem like artificial limitations without justification.
I have written a chat server using C on Linux. I have tested the same and it works fine with respect to performance. The only thing which lags is that I am using select system call for handling of sockets descriptors. Since select has the limit of 1024 so at max my chat server can handle only 1024 users concurrently.
I know that the other option which I can use is poll, but not so sure about it and its performance as compared to select.
Please suggest me the most effective way by which I can resolve this situation.
poll() can be used as an almost drop-in replacement for select(), and will allow you to exceed 1024 file descriptors (you can make make the array passed to poll() as large as you want).
It will have similar performance characteristics to select(), since both require the kernel and userspace application to scan the entire array - but if select() is working OK for you, then poll() should too. (There is actually a slight performance improvement in poll() - the .events field, specifying the events you are interested in for each file descriptor, is not changed by poll(), so you don't have to rebuild the array before every call like you do with the file descriptor sets passed to select()).
If you later find yourself having performance problems caused by scanning the poll file descriptor array, you can consider switching to the epoll interface, which is more complicated but also scales better with very large numbers of file descriptors.
Your question is known as the C10K problem (how to deal with more than 10 thousands simultaneous connections). You'll find lot of resources on the web, e.g. this one.
And you should consider select as an obsolete system call. Even with only dozens of file descriptors, you should at least prefer poll
Notice that Qt and Gtk provide you with an event loop machinery, often using poll (and QtCore or Glib can be used outside of graphical interfaces). There is also libev and libevent. I suggest using one of them.
Linux has no 1024 limit on select(). But:
select() performance is very poor
FreeBSD does :)
Your can use poll(). But its performance suffers when number of active connections increases.
Using epoll() is preferable on Linux however I would suggest to use libevent
libevent is fast, clean and portable way to implement heavy loaded servers and for linux it has epoll under the hood.
I have read that working with more than 64 sockets in a thread is dangerous(?). But -at least for me- Non-blocking sockets are used for avoiding complicated thread things. Since there is only one listener socket, how am i supposed to split sockets into threads and use them with select() ? Should i create fd_sets for each thread or what ? And how am i supposed to assign a client to a thread, since I can only pass values in the beginning with CreateThread() ?
No no no, you got a few things wrong there.
First, the ideal way to handle many sockets is to have a thread pool which will do the work in front of the sockets (clients).
Another thread, or two (actually in the amount of CPUs as far as I know), do the connection accepting.
Now, when a an event occurs, such as a new connection, it is being dispatched to the thread pool to be processed.
Second, it depends on the actual implementation and environment.
For example, in Windows there's something called IOCP.
If you ask me - do not bother with the lower implementation but instead use a framework such as BOOST::ASIO or ACE.
I personally like ASIO. The best thing about those frameworks is that they are usually cross-platform (nix, Windows etc').
So, my answer is a bit broad but I think it's to the best that you take these facts into consideration before diving into code/manuals/implementation.
Good luck!
Well, what you have read is wrong. Many powerful single-threaded applications have been written with non-blocking sockets and high-performance I/O demultiplexers like epoll(4) and kqueue(2). Their advantage is that you setup your wait events upfront, so the kernel does not have to copy ton of file descriptors and [re-]setup lots of stuff on each poll.
Then there are advantages to threading if your primary goal is throughput, and not latency.
Check out this great overview of available techniques: The C10K problem.
The "ideal way to handle many sockets" is not always - as Poni seems to believe - to "have a thread pool."
What does "ideal" pertain to? Is it ease of programming? Best performance?
Since he recommends not bothering "with the lower implementation" and "use a framework such as BOOST::ASIO or ACE" I guess he means ease of programming.
Had he had a performance angle on Windows he would have recommended "something called IOCPs." IOCPs are "IO Control Ports" which will allow implementation of super-fast IO-applications using just a handful of threads (one per available core is recommended). IOCP applications run circles around any thread-pool equivalent which he would have known if he'd ever written code using them. IOCPs are not used alongside thread pools but instead of them.
There is no IOCP equivalent in Linux.
Using a framework on Windows may result in a faster "time to market" product but the performance will be far from what it might have been had a pure IOCP implementation been chosen.
The performance difference is such that OS-specific code implementations should be considered. If a generic solution is chosen anyway, at least performance would "not have been given away accidentally."
For some time now I have been googling a lot to get to know about the various ways to acheive asynchronous programming/behavior on nix machines and ( as known earlier to me ) got confirmed on the fact that there is still no TRULY async pattern (concurrency using single thread) for Linux as available for Windows(IOCP).
Below are the few alternatives present for linux:
select/poll/epoll :: Cannot be done using single thread as epoll is still blocking call. Also the monitored file descriptors must be opened in non-blocking mode.
libaio:: What I have come to know about is that its implementation sucks and its still notification based instead of being completion based as with windows I/O completion ports.
Boost ASIO :: It uses epoll under linux and thus not a true async pattern as it spawns thread which are completely abstracted from user code to acheive the proactor design pattern
libevent :: Any reason to go for it if I prefer ASIO?
Now Here comes the questions :)
What would be the best design pattern for writing fast scalable network server using epoll (ofcourse, will have to use threads here :( )
I had read somewhere that "only sockets can be opened in non-blocking mode" hence epoll supports only sockets and hence cannot be used for disk I/O.
How true is the above statement and why async programming cannot be done on disk I/O using epoll ?
Boost ASIO uses one big lock around epoll call. I didnt actually understand what can be its implications and how to overcome it using asio itself. Similar question
How can I modify ASIO pattern to work with disk files? Is there any recommended design pattern ?
Hope somebody will able to answer all the questions with nice explanations also. Any link to source where the implementation details of epoll and AIO design patterns are exaplained is also appreciated.
Boost ASIO :: It uses epoll under linux and thus not a true async
pattern as it spawns thread which are completely abstracted from user
code to acheive the proactor design pattern
This is not correct. The Asio library uses epoll() by default on most recent Linux kernel versions. however, threads invoking io_service::run() will invoke callback handlers as needed. There is only one place in the Asio library that a thread is used to emulate an asynchronous interface, it is well described in the documentation:
An additional thread per io_service is used to emulate asynchronous
host resolution. This thread is created on the first call to either
ip::tcp::resolver::async_resolve() or
ip::udp::resolver::async_resolve().
This does not make the library "not a true async pattern" as you claim, in fact its name would disagree with you by definition.
1) What would be the best design pattern for writing fast scalable network server using epoll (of course, will have to use threads here :(
)
I suggest using Boost Asio, it uses the proactor design pattern.
3) Boost ASIO uses one big lock around epoll call. I didnt actually
understand what can be its implications and how to overcome it using
asio itself
The epoll reactor uses a mutex to dispatch handlers, though in practice this is not a big concern for most applications. There are application specific ways to mitigate this behavior, such as an io_service per CPU to exploit data locality. See my answer to a similar question on this topic. It is also discussed on the Asio mailing list frequently.
4) How can I modify ASIO pattern to work with disk files? Is there any
recommended design pattern?
The Asio library does not natively support file I/O as you noted. There have been several attempts to add it to the library, I'd suggest discussing on the mailing list.
First of all:
got confirmed on the fact that there is still no TRULY async pattern (concurrency using single thread) for Linux as available for Windows(IOCP).
You probably has a small misconception, asynchronous can be build on top of "polling" api.
More then that "reactor" (epoll-like) API is more powerful then "proactor" API (IOCP) as
the second can be implemented in terms of the first one (but not the other way around).
Also some operations that are "truly" asynchronous for example like disk I/O, some some other tools can be with combination of signals and Linux specific signalfd can provide full coverage of some other cases.
Bottom line. epoll is truly asynchronous I/O
I'm working on an instant messenger client in C++ (Win32) and I'm experimenting with different asynchronous socket models. So far I've been using WSAAsyncSelect for receiving notifications via my main window. However, I've been experiencing some unexpected results with Winsock spawning additionally 5-6 threads (in addition to the initial thread created when calling WSAAsyncSelect) for one single socket.
I have plans to revamp the client to support additional protocols via DLL:s, and I'm afraid that my current solution won't be suitable based on my experiences with WSAAsyncSelect in addition to me being negative towards mixing network with UI code (in the message loop).
I'm looking for advice on what a suitable asynchronous socket model could be for a multi-protocol IM client which needs to be able to handle roughly 10-20+ connections (depending on amount of protocols and protocol design etc.), while not using an excessive amount of threads -- I am very interested in performance and keeping the resource usage down.
I've been looking on IO Completion Ports, but from what I've gathered, it seems overkill. I'd very much appreciate some input on what a suitable socket solution could be!
Thanks in advance! :-)
There are four basic ways to handle multiple concurrent sockets.
Multiplexing, that is using select() to poll the sockets.
AsyncSelect which is basically what you're doing with WSAAsyncSelect.
Worker Threads, creating a single thread for each connection.
IO Completion Ports, or IOCP. dp mentions them above, but basically they are an OS specific way to handle asynchronous I/O, which has very good performance, but it is a little more confusing.
Which you choose often depends on where you plan to go. If you plan to port the application to other platforms, you may want to choose #1 or #3, since select is not terribly different from other models used on other OS's, and most other OS's also have the concept of threads (though they may operate differently). IOCP is typically windows specific (although Linux now has some async I/O functions as well).
If your app is Windows only, then you basically want to choose the best model for what you're doing. This would likely be either #3 or #4. #4 is the most efficient, as it calls back into your application (similar, but with better peformance and fewer issues to WSAsyncSelect).
The big thing you have to deal with when using threads (either IOCP or WorkerThreads) is marshaling the data back to a thread that can update the UI, since you can't call UI functions on worker threads. Ultimately, this will involve some messaging back and forth in most cases.
If you were developing this in Managed code, i'd tell you to look at Jeffrey Richter's AysncEnumerator, but you've chose C++ which has it's pros and cons. Lots of people have written various network libraries for C++, maybe you should spend some time researching some of them.
consider to use the ASIO library you can find in boost (www.boost.org).
Just use synchronous models. Modern operating systems handle multiple threads quite well. Async IO is really needed in rare situations, mostly on servers.
In some ways IO Completion Ports (IOCP) are overkill but to be honest I find the model for asynchronous sockets easier to use than the alternatives (select, non-blocking sockets, Overlapped IO, etc.).
The IOCP API could be clearer but once you get past it it's actually easier to use I think. Back when, the biggest obstacle was platform support (it needed an NT based OS -- i.e., Windows 9x did not support IOCP). With that restriction long gone, I'd consider it.
If you do decide to use IOCP (which, IMHO, is the best option if you're writing for Windows) then I've got some free code available which takes away a lot of the work that you need to do.
Latest version of the code and links to the original articles are available from here.
And my views on how my framework compares to Boost::ASIO can be found here: http://www.lenholgate.com/blog/2008/09/how-does-the-socket-server-framework-compare-to-boostasio.html.