versatile pthread based multithread utility library - c++

I don't want to reinvent the wheel, and what I'm looking for most likely already exists in the FOSS world.
I'm looking for a pthread-based utility library that implements commonly used primitives for communication between threads.
My main need is some kind of blocking queue for fixed size messages and the ability to wait for data to arrive on multiple queues at the same time (what you usually do using poll and select with file-handles).
Does something like this exist?
Programming language is C++ but I'm fine with a C library. OS is Linux but anything posix will do.
EDIT
I'm not looking for a thin wrapper around pthreads (like boost::thread or so). I already have this up and running. I'm looking for higher-level primitives. Basically, what java.util.concurrent offers for the Java guys.

Your requirements are already baked into POSIX Message Queues.
Instead of using select() you can do it in reverse. Rather than waiting in a select() you can use mq_notify() to tell you when there is something to read. MQs give you the option of having a signal delivered or having them spawn a new thread to read the queue.
If you are really intent on using select(), Linux makes this painless since the mqd_t type is actually a file descriptor. You can simply use the mqd_t returned from mq_open() like any other FD in select().
Note that use of a mqd_t in select() is not portable. In theory you should be able to do something similar on other systems but I have never tested it. Since POSIX MQs have a path to an entry to the filesystem you should be able to do a straight open() on the path and use the returned file descriptor in the select(), mapping it to the mqd_t used in mq_open() to determine which queue to read. Again, I have never tried it.
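To make the Linux-specific variant concrete, here is a minimal sketch that waits on two POSIX message queues at once by passing their descriptors to poll(). It assumes Linux (where mqd_t is a file descriptor); the queue names, the Message struct, and the helper names open_queue, wait_any and demo_two_queues are made up for illustration. On older glibc versions you may need to link with -lrt.

```cpp
#include <cassert>
#include <fcntl.h>
#include <mqueue.h>
#include <poll.h>
#include <string>
#include <unistd.h>

// Hypothetical fixed-size message type.
struct Message {
    int id;
    char text[12];
};

// Opens (or creates) a non-blocking queue that holds fixed-size Messages.
mqd_t open_queue(const std::string& name) {
    mq_attr attr{};
    attr.mq_maxmsg = 4;                  // stays below the usual default limit of 10
    attr.mq_msgsize = sizeof(Message);
    return mq_open(name.c_str(), O_CREAT | O_RDWR | O_NONBLOCK, 0600, &attr);
}

// Waits until one of the two queues has data. Returns 0 or 1 (the queue that
// delivered a message into 'out'), or -1 on timeout or error.
// Linux-only: relies on mqd_t being usable as a file descriptor.
int wait_any(mqd_t q0, mqd_t q1, int timeout_ms, Message& out) {
    pollfd fds[2] = {{q0, POLLIN, 0}, {q1, POLLIN, 0}};
    if (poll(fds, 2, timeout_ms) <= 0) return -1;
    for (int i = 0; i < 2; ++i) {
        if (!(fds[i].revents & POLLIN)) continue;
        ssize_t n = mq_receive(fds[i].fd, reinterpret_cast<char*>(&out),
                               sizeof(out), nullptr);
        if (n == static_cast<ssize_t>(sizeof(Message))) return i;
    }
    return -1;
}

// Sends one message on the second queue and waits on both.
// Returns the received id, or a negative value on failure.
int demo_two_queues() {
    const std::string suffix = std::to_string(getpid());
    const std::string name_a = "/demo_a_" + suffix;
    const std::string name_b = "/demo_b_" + suffix;
    mqd_t qa = open_queue(name_a);
    mqd_t qb = open_queue(name_b);
    if (qa == static_cast<mqd_t>(-1) || qb == static_cast<mqd_t>(-1)) return -2;

    Message msg{42, "hello"};
    mq_send(qb, reinterpret_cast<const char*>(&msg), sizeof(msg), 0);

    Message got{};
    int which = wait_any(qa, qb, 1000, got);

    mq_close(qa);
    mq_close(qb);
    mq_unlink(name_a.c_str());
    mq_unlink(name_b.c_str());
    return which == 1 ? got.id : -1;
}
```

In a real program the producer would be another thread or process calling mq_send(), and the consumer would loop over wait_any().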

There's always boost::thread.

You could try OpenMP, though I'm not sure whether it's based on the pthread API or not.

For what programming language / environment?
Some options:
C: c-pthread-queue, APR queue
Python: queue module

Related

User events in unix c++

I am trying to port my multi-threaded Windows application to Unix. In my application, we have user-created events that signal a thread to perform a specific task. I found that a condition variable or a semaphore can be used to signal threads. My requirement is to create events dynamically on request, but that does not seem feasible with condition variables or semaphores. Please help me use event-like signalling concepts on Unix.
POSIX threads let you communicate between threads with mutexes and condition variables. See man pthreads, man pthread_cond_wait and others.
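As a rough illustration of the mutex plus condition variable approach, here is a sketch of a Windows-style manual-reset event. It uses std::mutex and std::condition_variable, which sit on top of pthreads on Linux; the class name Event and the helper demo_event are invented for this example. Because each event is an ordinary object, events can be created dynamically on request.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Sketch of a manual-reset event: stays signalled until reset() is called.
class Event {
public:
    void set() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cv_.notify_all();  // wake every waiter
    }

    void reset() {
        std::lock_guard<std::mutex> lock(mutex_);
        signaled_ = false;
    }

    // Blocks until the event is signalled.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return signaled_; });
    }

    // Returns true if the event was signalled before the timeout expired.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return signaled_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// A worker thread waits for 'start', then reports back through 'done'.
bool demo_event() {
    Event start, done;
    std::thread worker([&] {
        start.wait();
        done.set();
    });
    bool early = done.wait_for(std::chrono::milliseconds(20));  // not signalled yet
    start.set();
    bool finished = done.wait_for(std::chrono::seconds(2));
    worker.join();
    return !early && finished;
}
```

The predicate passed to wait() guards against spurious wakeups, which is the usual reason a bare condition variable does not behave like a Windows event on its own.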
Another way to implement events does not depend on threads and is built on file descriptors.
See man epoll, man poll, man select
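For the descriptor-based route, one possible sketch uses the Linux-specific eventfd() call so that any number of dynamically created events can be waited on together with poll(), roughly like WaitForMultipleObjects on Windows. eventfd is not mentioned above and is my assumption here; a pipe would work the same way on other POSIX systems. FdEvent, wait_any and demo_fd_events are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <poll.h>
#include <sys/eventfd.h>
#include <unistd.h>
#include <vector>

// Sketch of an event backed by a file descriptor (Linux eventfd).
class FdEvent {
public:
    FdEvent() : fd_(eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC)) {}
    ~FdEvent() { if (fd_ >= 0) close(fd_); }
    FdEvent(const FdEvent&) = delete;
    FdEvent& operator=(const FdEvent&) = delete;

    // Makes the descriptor readable, which wakes any poll() on it.
    void set() {
        std::uint64_t one = 1;
        ssize_t n = write(fd_, &one, sizeof one);
        (void)n;
    }

    // Reads the counter back to zero so the descriptor is no longer readable.
    void reset() {
        std::uint64_t value = 0;
        ssize_t n = read(fd_, &value, sizeof value);
        (void)n;
    }

    int fd() const { return fd_; }

private:
    int fd_;
};

// Returns the index of the first signalled event, or -1 on timeout or error.
int wait_any(const std::vector<std::unique_ptr<FdEvent>>& events, int timeout_ms) {
    std::vector<pollfd> fds;
    for (const auto& e : events) fds.push_back(pollfd{e->fd(), POLLIN, 0});
    if (poll(fds.data(), fds.size(), timeout_ms) <= 0) return -1;
    for (std::size_t i = 0; i < fds.size(); ++i)
        if (fds[i].revents & POLLIN) return static_cast<int>(i);
    return -1;
}

// Creates three events on request, signals the last one, and waits on all.
int demo_fd_events() {
    std::vector<std::unique_ptr<FdEvent>> events;
    for (int i = 0; i < 3; ++i)
        events.push_back(std::unique_ptr<FdEvent>(new FdEvent));
    if (wait_any(events, 10) != -1) return -2;  // nothing signalled yet
    events[2]->set();
    int first = wait_any(events, 1000);
    events[2]->reset();
    int after_reset = wait_any(events, 10);
    return after_reset == -1 ? first : -2;
}
```

The same descriptors can be registered with select or epoll instead of poll, so they also fit into an existing event loop.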
Implementing an event system manually in C may take some time, so there are libraries that implement one, such as libev, libuv, libevent, and libevent2.
If you want more C++ than C, you can use Boost (such as boost::signals; I don't know whether there are others) or Qt, which has a full signal/slot mechanism and event system built in, but it brings quite heavy dependencies.
If you want a fully implemented event queue and many enterprise features, you can look at any AMQP framework such as RabbitMQ.
It's hard to tell you what you need without specific problems.

Getting to know the basics of Asynchronous programming on *nix

For some time now I have been googling a lot to get to know the various ways to achieve asynchronous programming/behavior on *nix machines and (as known earlier to me) got confirmed on the fact that there is still no TRULY async pattern (concurrency using single thread) for Linux as available for Windows(IOCP).
Below are the few alternatives present for linux:
select/poll/epoll :: Cannot be done using a single thread, as epoll is still a blocking call. Also, the monitored file descriptors must be opened in non-blocking mode.
libaio :: What I have come to know is that its implementation sucks and it's still notification based instead of being completion based as with Windows I/O completion ports.
Boost ASIO :: It uses epoll under linux and thus not a true async pattern as it spawns thread which are completely abstracted from user code to achieve the proactor design pattern
libevent :: Any reason to go for it if I prefer ASIO?
Now here come the questions :)
1) What would be the best design pattern for writing a fast, scalable network server using epoll (of course, I will have to use threads here :( )?
2) I had read somewhere that "only sockets can be opened in non-blocking mode", hence epoll supports only sockets and cannot be used for disk I/O.
How true is the above statement, and why can't async programming be done on disk I/O using epoll?
3) Boost ASIO uses one big lock around the epoll call. I didn't actually understand what its implications are and how to overcome it using asio itself. Similar question
4) How can I modify the ASIO pattern to work with disk files? Is there any recommended design pattern?
Hope somebody will be able to answer all the questions with nice explanations. Any link to a source where the implementation details of epoll and AIO design patterns are explained is also appreciated.
Boost ASIO :: It uses epoll under linux and thus not a true async
pattern as it spawns thread which are completely abstracted from user
code to achieve the proactor design pattern
This is not correct. The Asio library uses epoll() by default on most recent Linux kernel versions. However, threads invoking io_service::run() will invoke callback handlers as needed. There is only one place in the Asio library where a thread is used to emulate an asynchronous interface, and it is well described in the documentation:
An additional thread per io_service is used to emulate asynchronous
host resolution. This thread is created on the first call to either
ip::tcp::resolver::async_resolve() or
ip::udp::resolver::async_resolve().
This does not make the library "not a true async pattern" as you claim; in fact, its name would disagree with you by definition.
1) What would be the best design pattern for writing fast scalable network server using epoll (of course, will have to use threads here :( )
I suggest using Boost Asio, it uses the proactor design pattern.
3) Boost ASIO uses one big lock around epoll call. I didn't actually
understand what can be its implications and how to overcome it using
asio itself
The epoll reactor uses a mutex to dispatch handlers, though in practice this is not a big concern for most applications. There are application specific ways to mitigate this behavior, such as an io_service per CPU to exploit data locality. See my answer to a similar question on this topic. It is also discussed on the Asio mailing list frequently.
4) How can I modify ASIO pattern to work with disk files? Is there any
recommended design pattern?
The Asio library does not natively support file I/O, as you noted. There have been several attempts to add it to the library; I'd suggest discussing it on the mailing list.
First of all:
got confirmed on the fact that there is still no TRULY async pattern (concurrency using single thread) for Linux as available for Windows(IOCP).
You probably have a small misconception: asynchronous I/O can be built on top of a "polling" API.
More than that, a "reactor" (epoll-like) API is more powerful than a "proactor" API (IOCP), as
the second can be implemented in terms of the first one (but not the other way around).
Also, some operations are "truly" asynchronous, for example disk I/O, and some other tools, in combination with signals and the Linux-specific signalfd, can provide full coverage of some other cases.
Bottom line: epoll is truly asynchronous I/O.
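To illustrate the point that a proactor-style (completion) interface can be layered on a reactor-style (readiness) call, here is a deliberately tiny sketch. It uses poll() rather than epoll to keep it short; MiniProactor, async_read and demo_proactor are names I made up, and this is not how Asio is implemented internally, only the general idea.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <functional>
#include <poll.h>
#include <string>
#include <unistd.h>
#include <utility>
#include <vector>

// Sketch: completion handlers (proactor) built on readiness polling (reactor).
class MiniProactor {
public:
    using Handler = std::function<void(const std::string& data)>;

    // Registers a read; the handler runs once the data has actually been read.
    void async_read(int fd, std::size_t max_bytes, Handler on_complete) {
        pending_.push_back(Op{fd, max_bytes, std::move(on_complete)});
    }

    // Single-threaded loop: wait for readiness, perform the read, then
    // deliver the completed result to the handler.
    void run() {
        while (!pending_.empty()) {
            std::vector<pollfd> fds;
            for (const Op& op : pending_) fds.push_back(pollfd{op.fd, POLLIN, 0});
            if (poll(fds.data(), fds.size(), -1) < 0) {
                if (errno == EINTR) continue;
                return;
            }
            std::vector<Op> ready, waiting;
            for (std::size_t i = 0; i < fds.size(); ++i)
                (fds[i].revents ? ready : waiting).push_back(std::move(pending_[i]));
            pending_ = std::move(waiting);
            for (Op& op : ready) {
                std::string buf(op.max_bytes, '\0');
                ssize_t n = read(op.fd, &buf[0], buf.size());
                buf.resize(n > 0 ? static_cast<std::size_t>(n) : 0);
                op.on_complete(buf);  // completion notification
            }
        }
    }

private:
    struct Op {
        int fd;
        std::size_t max_bytes;
        Handler on_complete;
    };
    std::vector<Op> pending_;
};

// Reads "ping" from a pipe through the completion-style interface.
std::string demo_proactor() {
    int fds[2];
    if (pipe(fds) != 0) return "";
    std::string got;
    MiniProactor io;
    io.async_read(fds[0], 16, [&](const std::string& data) { got = data; });
    ssize_t written = write(fds[1], "ping", 4);
    (void)written;
    io.run();
    close(fds[0]);
    close(fds[1]);
    return got;
}
```

The caller only ever sees "your read finished, here is the data", even though the loop underneath is purely readiness based.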

Framework for a server application (preferably, using BOOST C++)

I am thinking of writing a server application - along the lines of mySQL or Apache.
The main requirements are:
Clients will communicate with the server via TCP/IP (sockets)
The server will spawn a new child process to handle requests (ala Apache)
Ideally, I would like to use the BOOST libraries rather than attempt to reinvent my own. There must be code somewhere that does most of what I am trying to do, so that I can use it (or at least part of it) as my starting point. Can anyone point me to a useful link?
In the (hopefully unlikely) event that there is no code I can use as a starting point, can someone point out the most appropriate BOOST libraries to use, and give a general guideline on how to proceed?
My main worry is how to know when one of the children has crashed. AFAIK, there are two ways of doing this:
Using heartbeats between the parent and children (this quickly becomes messy, and introduces more things that could go wrong)
Somehow wrap the spawning of the process with a timeout parameter - but this is a dumb approach, because if a child is carrying out time intensive work, the parent may incorrectly think that the child has died
What is the best practice for making the parent aware that a child has died?
[Edit]
BTW, I am developing/running/deploying on Linux
On what platform (Windows/Linux/both)? Processes on Windows are considered more heavy-weight than on Linux, so you may indeed consider threads.
Also, I think it is better (like Apache does) not to spawn a process for each request but to have a process pool, so you save the cost of creating a process, especially on Windows.
If you are on Linux, could waitpid() be useful for you? You can use it in non-blocking mode to check periodically whether one of the child processes has terminated.
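A minimal sketch of that polling approach, assuming a POSIX system: the parent calls waitpid() with WNOHANG at a fixed interval and carries on with other work in between. check_child and demo_waitpid are hypothetical helper names, and the 10 ms interval is arbitrary.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Non-blocking check: returns 0 while the child is still running,
// 1 once it has terminated (status is filled in), -1 on error.
int check_child(pid_t pid, int& status) {
    pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == 0) return 0;
    return r == pid ? 1 : -1;
}

// Forks a child that exits with the given code and polls for its termination.
// Returns the child's exit code, or -1 on error, timeout, or abnormal exit.
int demo_waitpid(int child_exit_code) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        usleep(50 * 1000);       // the child pretends to work for a while
        _exit(child_exit_code);
    }
    int status = 0;
    for (int i = 0; i < 200; ++i) {
        int r = check_child(pid, status);
        if (r < 0) return -1;
        if (r == 1) return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        usleep(10 * 1000);       // the parent is free to do other work here
    }
    return -1;
}
```

WIFSIGNALED(status) and WTERMSIG(status) can be checked in the same place to tell a crash apart from a normal exit.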
I can say for sure that Pion is your only stable option.
I have never used it but I intend to, and the API looks very clean.
As for the Boost libraries you would need:
Boost.Asio
Boost.Thread
Boost.Spirit (or something similar to parse the HTTP protocol)
Boost.Interprocess
What about using threads (which are supported by Boost) rather than forking the process? This would allow you to make queries about the state of a child and, imho, threads are simpler to handle than forking.
Generally, Boost.Asio is a good place to start.
But there are several points to be aware of:
Boost.Asio is a very good library, but it is not very fork-aware, so don't try to share an Asio
event loop between several forked processes; this will not work (i.e. if a boost::asio::io_service was created before the fork, don't use it in more than one process afterwards).
Also, it does not allow you to release the file handle from a boost::asio::XX::socket,
so the only way is to call dup and then pass the duplicate to the child process.
But to be honest, I don't think you'll find any network event loop library that is
fork-aware (maybe with the exception of CppCMS's booster.aio, which I wrote
to be fork-aware myself).
Waiting for children is quite simple: you can define a signal handler with sigaction
for the SIGCHLD signal, which is sent when a child crashes or exits.
So all you need to do is handle this signal and call waitpid in the main loop when such
a signal is received.
With Asio you can use the "self-pipe" trick to wake the loop from sleep from within the signal handler.
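The SIGCHLD plus self-pipe idea can be sketched without Asio. In this example a plain poll() call stands in for the event loop, and the child is killed with SIGKILL as a stand-in for a crash; on_sigchld, g_wake_pipe and demo_sigchld are invented names.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <fcntl.h>
#include <poll.h>
#include <sys/wait.h>
#include <unistd.h>

static int g_wake_pipe[2] = {-1, -1};

// Signal handler: only async-signal-safe work, i.e. write one byte to the pipe.
extern "C" void on_sigchld(int) {
    int saved_errno = errno;
    char byte = 1;
    ssize_t ignored = write(g_wake_pipe[1], &byte, 1);
    (void)ignored;
    errno = saved_errno;
}

// Returns the signal number that killed the child, 0 if it exited normally,
// or -1 on error or timeout.
int demo_sigchld() {
    if (pipe(g_wake_pipe) != 0) return -1;
    fcntl(g_wake_pipe[0], F_SETFL, O_NONBLOCK);
    fcntl(g_wake_pipe[1], F_SETFL, O_NONBLOCK);

    struct sigaction sa {};
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NOCLDSTOP;
    if (sigaction(SIGCHLD, &sa, nullptr) != 0) return -1;

    pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {
        raise(SIGKILL);  // stand-in for a crashing child
        _exit(0);
    }

    int result = -1;
    while (result == -1) {
        // The "event loop": sleeps until the handler writes to the pipe.
        pollfd pfd{g_wake_pipe[0], POLLIN, 0};
        int rc = poll(&pfd, 1, 5000);
        if (rc < 0 && errno == EINTR) continue;  // interrupted by the signal itself
        if (rc <= 0) break;                      // timeout or error
        char drain[16];
        while (read(g_wake_pipe[0], drain, sizeof drain) > 0) {}
        // Reap every child that has terminated so far.
        int status = 0;
        pid_t pid;
        while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
            if (pid == child)
                result = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        }
    }

    signal(SIGCHLD, SIG_DFL);
    close(g_wake_pipe[0]);
    close(g_wake_pipe[1]);
    return result;
}
```

With a real event loop, the read end of the pipe would simply be registered as one more descriptor, and the waitpid() loop would run in its read handler.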
First, take a look at CPPCMS. It might already fit your needs.
Now, as pointed by others, boost::asio is a good starting point but is really the basics of the task.
Maybe you'll be more interested in the work being done on server code based on boost::asio: cpp-netlib (which is intended to be submitted to Boost once done). See also the author's blog.
I've made an FOSS library for creating C++ applications in a modular way. It's hosted at
https://github.com/chilabot/chila
here's my blog: http://chilatools.blogspot.com/view/sidebar
It's specially suited for generic server creation (that was my motivation for constructing it), but I think it can be used for any kind of application.
The part that has to be deployed with the final binary is LGPL, so it can be used with commercial applications.

Threadsafe logging inside C++ Shared library

I have implemented a multithreaded shared library in C++ (for Linux and Windows). I would like to add a logging mechanism inside the library itself. The caller of the library is not aware of that. The log file would be the same, so I am wondering how I could design thread-safe logging if multiple processes are using my library and trying to open and write to the same log file. Any suggestions?
You can try using log4cpp library.
Use file locking. I believe fcntl is POSIX compliant so should work on Windows too. Does your code use Posix calls?
With fcntl, you should be able to lock a specific range of bytes. So if you seek to end and try to lock out the amount of bytes you are about to write, it should be pretty fast. To obtain the lock, you can probably spin, relinquishing the CPU for a small amount of time, if you don't obtain the lock.
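Here is a simplified sketch of the fcntl idea for the multi-process case. Note the simplifications: it takes a blocking advisory lock (F_SETLKW) over the whole file instead of spinning on a byte range, and relies on O_APPEND so each write lands at the current end of the file. It is POSIX-only, so a Windows build would need a different locking call. append_log_line, read_all and demo_locked_log are hypothetical names, and the /tmp path is just for the demo.

```cpp
#include <cassert>
#include <fcntl.h>
#include <fstream>
#include <sstream>
#include <string>
#include <unistd.h>

// Appends one line to the log while holding an advisory write lock.
// Other processes that use the same function block until the lock is free.
bool append_log_line(const char* path, const std::string& line) {
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) return false;

    struct flock lock {};
    lock.l_type = F_WRLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;  // 0 means "to end of file", i.e. the whole file
    if (fcntl(fd, F_SETLKW, &lock) != 0) {
        close(fd);
        return false;
    }

    const std::string out = line + "\n";
    bool ok = write(fd, out.data(), out.size()) == static_cast<ssize_t>(out.size());

    lock.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &lock);
    close(fd);
    return ok;
}

// Helper for the demo: reads the whole file back.
std::string read_all(const char* path) {
    std::ifstream in(path);
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}

// Writes two lines to a scratch file and returns its contents.
std::string demo_locked_log() {
    const std::string path = "/tmp/locked_log_" + std::to_string(getpid()) + ".log";
    unlink(path.c_str());
    if (!append_log_line(path.c_str(), "first")) return "";
    if (!append_log_line(path.c_str(), "second")) return "";
    std::string contents = read_all(path.c_str());
    unlink(path.c_str());
    return contents;
}
```

One caveat: fcntl record locks belong to the process, not to a thread, so threads inside the same process still need a mutex around the call.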
Your library shares the log file with the client application? If so, there is absolutely no way to do thread-safe logging. The client could just create a thread and log while calling you.
Otherwise, you have two options:
1. Use a mutex. Simplest solution.
2. Have a logging thread with a lock-free (you'll probably be able to get away with a mutex) FIFO queue of messages that is created/destroyed when your library is created/destroyed. I hope you have an init/deinit function...
The difference between 1 and 2 is that for 1, you hold a mutex for the full I/O operation. In the second, you only hold a mutex for as long as it takes to push a message at the back of the queue, which can be a constant-time operation if you do it right.
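The logging-thread option might look roughly like the sketch below. It uses a mutex-protected std::deque rather than a lock-free queue, as the answer allows, and writes to any std::ostream. AsyncLogger and demo_logger are made-up names; in the library, the constructor and destructor would correspond to the init and deinit functions.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <utility>

// Sketch: callers only push onto a queue; one background thread does the I/O.
class AsyncLogger {
public:
    explicit AsyncLogger(std::ostream& sink)
        : sink_(sink), worker_([this] { run(); }) {}

    // Drains whatever is still queued, then stops the worker.
    ~AsyncLogger() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // The mutex is held only for the push, never for the I/O itself.
    void log(std::string message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(message));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
            if (queue_.empty()) return;  // only reached when stop_ is set
            std::deque<std::string> batch;
            batch.swap(queue_);          // take everything in one go
            lock.unlock();               // write without holding the mutex
            for (const std::string& line : batch) sink_ << line << '\n';
            sink_.flush();
            lock.lock();
        }
    }

    std::ostream& sink_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<std::string> queue_;
    bool stop_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};

// Logs two messages and returns what reached the sink.
std::string demo_logger() {
    std::ostringstream out;
    {
        AsyncLogger logger(out);
        logger.log("first");
        logger.log("second");
    }  // the destructor flushes the queue and joins the worker
    return out.str();
}
```

This only serializes threads within one process; several processes sharing one log file still need something like file locking on top.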

What's the deal with boost.asio and file i/o?

I've noticed that boost.asio has a lot of examples involving sockets, serial ports, and all sorts of non-file examples. Google hasn't really turned up a lot for me that mentions if asio is a good or valid approach for doing asynchronous file i/o.
I've got gobs of data i'd like to write to disk asynchronously. This can be done with native overlapped io in Windows (my platform), but I'd prefer to have a platform independent solution.
I'm curious if
boost.asio has any kind of file support
boost.asio file support is mature enough for everyday file i/o
Will file support ever be added? What's the outlook for this?
Has boost.asio any kind of file support?
Starting with (I think) Boost 1.36 (which contains Asio 1.2.0) you can use [boost::asio::]windows::stream_handle or windows::random_access_handle to wrap a HANDLE and perform asynchronous read and write methods on it that use the OVERLAPPED structure internally.
User Lazin also mentions boost::asio::windows::random_access_handle that can be used for async operations (e.g. named pipes, but also files).
Is boost.asio file support mature enough for everyday file i/o?
As Boost.Asio in itself is widely used by now, and the implementation uses overlapped IO internally, I would say yes.
Will file support ever be added? What's the outlook for this?
As there's no roadmap found on the Asio website, I would say that there will be no new additions to Boost.Asio for this feature. Although there's always the chance of contributors adding code and classes to Boost.Asio. Maybe you can even contribute the missing parts yourself! :-)
boost::asio file i/o on Linux
On Linux, asio uses the epoll mechanism to detect if a socket/file descriptor is ready for reading/writing. If you attempt to use vanilla asio on a regular file on Linux you'll get an "operation not permitted" exception because epoll does not support regular files on Linux.
The workaround is to configure asio to use the select mechanism on Linux. You can do this by defining BOOST_ASIO_DISABLE_EPOLL. The trade-off here being select tends to be slower than epoll if you're working with a large number of open sockets. Open a file regularly using open() and then pass the file descriptor to a boost::asio::posix::stream_descriptor.
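The underlying epoll behaviour is easy to check without Asio. The sketch below tries to register a regular temporary file with epoll and returns the resulting errno; on Linux this is EPERM, which is where the "operation not permitted" error comes from. epoll_errno_for_regular_file is a hypothetical helper name, and the /tmp location is an assumption.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <sys/epoll.h>
#include <unistd.h>

// Tries to add a regular file to an epoll set.
// Returns 0 if epoll accepted it, the errno value if it refused, -1 on setup failure.
int epoll_errno_for_regular_file() {
    char path[] = "/tmp/epoll_demo_XXXXXX";
    int fd = mkstemp(path);  // creates and opens a regular file
    if (fd < 0) return -1;

    int ep = epoll_create1(0);
    if (ep < 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    int rc = epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);
    int err = (rc == 0) ? 0 : errno;  // capture errno before any other call

    close(ep);
    close(fd);
    unlink(path);
    return err;
}
```

Pipes, sockets and other pollable descriptors are accepted by the same epoll_ctl() call.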
boost::asio file i/o on Windows
On Windows you can use boost::asio::windows::object_handle to wrap a Handle that was created from a file operation. See example.
boost::asio::windows::random_access_handle is the easiest way to do this. If you need something more advanced, for example an asynchronous LockFileEx or something else, you might extend asio and add your own asynchronous events. example
io_uring has changed everything.
Asio now supports async file read/write.
See the release notes:
Asio 1.21.0 release notes
ASIO supports overlapped I/O on Windows where support is good. On Unixes this idea has stagnated due to:
Files are often located on the same physical device, accessing them sequentially is preferable.
File requests often complete very rapidly because they are physically close by.
Files are often critical to complete the basic operation of a program (e.g. reading in its configuration file must be done before initializing further)
The one common exception is serving files directly to sockets. This is such a common special-case that Linux has a kernel function that handles this for you. Again, negating the reason to use asynchronous file I/O.
In Short: ASIO appears to reflect the underlying OS design philosophy, overlapped I/O being ignored by most Unix developers, so it is not supported on that platform.
Asio 1.21 appears to have added built-in filesystem support.
For instance, asio::stream_file now exists with all the async methods you'd expect.
Linux has an asio Library that is no harder to use than Windows APIs for this job (I've used it). Both sets of operating systems implement the same conceptual architecture. They differ in details that are relevant to writing a good library, but not to the point that you cannot have a common interface for both OS platforms (I've used one).
Basically, all flavors of Async File I/O follow the "Fry Cook" architecture. Here's what I mean in the context of a Read op: I (processing thread) go up to a fast food counter (OS) and ask for a cheeseburger (some data). It gives me a copy of my order ticket (some data structure) and issues a ticket in the back to the cook (the Kernel & file system) to cook my burger. I then go sit down or read my phone (do other work). Later, somebody announces that my burger is ready (a signal to the processing thread) and I collect my food (the read buffer).
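As one concrete instance of the "Fry Cook" flow, here is a sketch using POSIX AIO (aio_read, aio_suspend, aio_return), where the aiocb struct plays the role of the order ticket. POSIX AIO is my choice for illustration and is not necessarily the library the answer has in mind; fry_cook_read is a made-up name, and older glibc versions need -lrt at link time.

```cpp
#include <aio.h>
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>
#include <unistd.h>

// Writes 'payload' to a scratch file, then reads it back asynchronously.
// Returns the data read, or an empty string on failure.
std::string fry_cook_read(const std::string& payload) {
    char path[] = "/tmp/aio_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return "";
    std::string result;
    char buffer[64] = {};

    if (write(fd, payload.data(), payload.size()) ==
        static_cast<ssize_t>(payload.size())) {
        // Place the order: the aiocb is the "order ticket".
        aiocb ticket{};
        ticket.aio_fildes = fd;
        ticket.aio_buf = buffer;
        ticket.aio_nbytes = sizeof(buffer);
        ticket.aio_offset = 0;

        if (aio_read(&ticket) == 0) {
            // The caller could do unrelated work here while the read is in flight.

            // Wait for the "order ready" announcement.
            const aiocb* tickets[1] = {&ticket};
            while (aio_error(&ticket) == EINPROGRESS)
                aio_suspend(tickets, 1, nullptr);

            // Collect the food: the number of bytes that landed in the buffer.
            ssize_t n = aio_return(&ticket);
            if (n > 0) result.assign(buffer, static_cast<std::size_t>(n));
        }
    }

    close(fd);
    unlink(path);
    return result;
}
```

Instead of blocking in aio_suspend(), a program can also ask for a signal or a callback thread through the aio_sigevent field of the ticket, which is closer to the "somebody announces my burger is ready" step.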