Multithreaded application concept - c++

I have a small architecture doubt about organizing code in separate functional units (most probably threads?). Application being developed is supposed to be doing the following tasks:
Display some images on a screen (i.e. slideshow)
Read the data from external device through the USB port
Match received data against the corresponding image (stimulus)
Do some data analysis
Plot the results of data analysis
My thoughts were to organize the application into the following modules:
GUI thread (+ image slideshow)
USB thread buffering the received data
Thread for analyzing/plotting data (main GUI thread should not be blocked while plotting the data which might consume some more time)
So, what do you generally think about this concept? Is there anything else you think that might be a better fit in this particular scenario?

You can probably get away with combining 1 & 2, since the slide-show feature is essentially gui oriented anyway.
For #3, you may be able to make do with some kind of asynchronous I/O methodology, so that you don't need to dedicate a polling thread. Not sure if you can do this with USB, but you can certainly get async I/O with serial and network interfaces, so it's worth looking into.
It's probably a good idea to move heavy-weight tasks like 4 & 5 to their own thread. If you aren't doing the analysis and plotting concurrently, maybe one thread for them both. However, you should really consider how much cpu time these activities will need. If the worst-case analyze-and-plot takes much less than half a second, you might even just perform these actions with a call from the gui. Conversely, if there are cases where this will take longer than that, a separate thread is favorable b/c your users won't like a laggy gui.
Just bear in mind that the dark side of threads lies in the inevitable challenge of coordinating them.

Because of the way the Windows API works, especially with regard to user input and window ownership, you can really only do UI on a single thread. If you try to use multiple threads, they just end up locking each other out and only one thread runs at a time. There are some specialized exceptions, but you have to be a real master of the API to pull them off.
So.
GUI thread, owns the Window, and handles all user input.
USB listening thread, you would know better than I whether this makes sense
Thread(s) for analyzing/plotting data: once again, I can't speak to this, but I'm skeptical that they will really both be running at the same time. It seems more likely that it would be analyze, then plot, so one thread.
Thread for rendering frames for a slideshow.
I'm not sure how plotting isn't the same thing as the slideshow, but I do think you can have a background thread for drawing the slideshow as long as it doesn't display the images.
You can render (i.e. draw to a bitmap or DirectX surface) in a background thread, you just can't show it in a window. But you could hand completed bitmaps off to the GUI thread and have it do the actual displaying of the bitmap. This is essentially how a lot of video playback code works.

A lot of this depends on how much is involved in performing 3 (Do some data analysis.) and 4 (Plot analyzed data.)
My instincts would be:
Definitely have a separate thread for reading the data off the USB. Assuming for a moment that 3 is dependent on reading the data, then I would do 3 in the same thread as reading the data. This will simplify your signaling to the GUI when the data is ready. This also assumes the processing is quick, and won't block the USB port (How is that being read? IO completion ports?). If the processing takes time then you need a separate thread.
Likewise, if the image slideshow processing takes a long time, it should be done in a separate thread. If it can be quickly recalculated, say in a paint function, I would keep it as part of the main GUI.
There is some overhead from context switching between threads, and each added thread brings extra signaling complexity. So I would only add a thread to avoid blocking the GUI or the USB port. It may be possible to do all of this in just two threads.
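As a rough illustration of the two-thread arrangement, here is a minimal sketch of a blocking queue that a reader thread fills and another thread drains. The names `Sample`, `SampleQueue`, `fake_usb_reader` and `collect_samples` are made up for this example, and the "USB" data is simulated. A real GUI thread would normally be notified (for instance with a posted message) rather than block in `pop`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical sample type: one reading tagged with the stimulus shown at the time.
struct Sample {
    int stimulus_id;
    std::uint16_t value;
};

// Minimal blocking queue: the reader thread pushes, the consumer pops.
class SampleQueue {
public:
    void push(const Sample& s) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(s);
        }
        ready_.notify_one();
    }

    // Marks the end of the stream so a waiting consumer can stop.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until a sample is available; returns false once closed and drained.
    bool pop(Sample& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Sample> queue_;
    bool closed_ = false;
};

// Stand-in for the USB thread: produces `count` fake samples, then closes the queue.
void fake_usb_reader(SampleQueue& out, int count) {
    for (int i = 0; i < count; ++i) {
        out.push(Sample{i % 3, static_cast<std::uint16_t>(i)});
    }
    out.close();
}

// Runs the reader on its own thread and drains the queue on the calling thread.
std::vector<Sample> collect_samples(int count) {
    SampleQueue queue;
    std::thread reader(fake_usb_reader, std::ref(queue), count);
    std::vector<Sample> received;
    Sample s{};
    while (queue.pop(s)) {
        received.push_back(s);
    }
    reader.join();
    return received;
}
```

The lock is held only while a sample is moved in or out of the queue, so the reader is never stalled by whatever the consumer does with the data afterwards.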

4 and 5 are definitely good ideas. That being said, avoid using low-level threads unless you absolutely must.
I'd check out Boost and Boost::Thread. Not only does it make your code more portable, but I haven't worked with an easier library for threading.

If you are using Builder 2009, you should look at TThread. It has some stuff to simplify thread coding.

I can't help thinking that you may be going a bit overboard here. A USB port can't really deliver data terribly quickly. Its theoretical bandwidth is only 480 Mbits/second, and realistically, it's a pretty rare USB device that can get very close to that.
Unless the analysis you've mentioned is quite a bit more complex than you've implied, my guess is that a single thread is probably entirely adequate. I'd think hard about using overlapped I/O to read the data, and MsgWaitForMultipleObjects for the main message loop.
It seems to me that the main place you stand a good chance of gaining a lot is in plotting the data after it's processed. It might be worth considering something like OpenGL or DirectX Graphics to do the drawing. Especially if you're producing quite a bit of output, this can give a really substantial speed improvement. In an ideal situation, multiple threads might multiply your speed by the number of available cores -- typically 2 or 4 on today's machines. Drawing the output is likely to be the slowest part of the job, and hardware acceleration can easily speed that up by a considerably larger factor -- 10x is at the low end of what you can typically expect, and 100x is fairly common.

Related

Should GTK+ interface run in a separate thread?

I'm taking my first steps in GTK+ (C++ and gtkmm more specifically) and I have a rather conceptual doubt about how best to structure my program. Right now I just want my GUI to show what is happening in my C++ program by printing several values, and since my main thread is halted while the GUI window is running, I've come across solutions that separate the processing/computing operations and the graphical interface into separate threads. Is this commonly accepted as the best way to do it, not at all, or not even relevant?

Unless you have a good reason, you are generally better off not creating new threads. Synchronization is hard to get right.
GUI programming is event driven (click on a button and something happens). So you will probably need to tie your background processing into the GUI event system.
In the event that your background processing takes a long time, you will need to break it into a number of fast chunks. At the end of each chunk, you can update a progress bar and schedule the next chunk.
This will mean you will need to probably use some state machine patterns.
Also make sure that any IO is non-blocking.
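The chunking idea above can be sketched without any GUI toolkit. The `EventLoop` below is a toy stand-in for the toolkit's main loop (with gtkmm you would hook into its idle or timeout mechanism instead), and `ChunkedSum` and `schedule` are hypothetical names for this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Toy event loop: runs queued callbacks one at a time on a single thread.
class EventLoop {
public:
    void post(std::function<void()> task) { tasks_.push_back(std::move(task)); }

    void run() {
        while (!tasks_.empty()) {
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop_front();
            task();  // may post further tasks
        }
    }

private:
    std::deque<std::function<void()>> tasks_;
};

// A long computation kept as explicit state so it can pause between chunks.
class ChunkedSum {
public:
    ChunkedSum(std::vector<int> data, std::size_t chunk_size)
        : data_(std::move(data)), chunk_size_(chunk_size) {}

    // Processes one chunk; returns true while more work remains.
    bool step() {
        std::size_t end = std::min(next_ + chunk_size_, data_.size());
        for (; next_ < end; ++next_) total_ += data_[next_];
        ++chunks_done_;
        return next_ < data_.size();
    }

    long long total() const { return total_; }
    int chunks_done() const { return chunks_done_; }

private:
    std::vector<int> data_;
    std::size_t chunk_size_;
    std::size_t next_ = 0;
    long long total_ = 0;
    int chunks_done_ = 0;
};

// Schedules one chunk, then re-posts itself until the work is finished.
// Other events queued in the meantime get their turn between chunks.
void schedule(EventLoop& loop, ChunkedSum& job) {
    EventLoop* l = &loop;
    ChunkedSum* j = &job;
    loop.post([l, j] {
        if (j->step()) schedule(*l, *j);
    });
}
```

Because each chunk returns to the loop, a progress bar update or a button click queued behind it is handled before the next chunk starts.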

Here's an example of lengthy operation split in smaller chunks using the main loop without additional threads. Lazy Loading using the main loop.

Yes, absolutely! (in response to your title)
The GUI must be run in a separate thread. If you have ever come across those extremely annoying interfaces that lock up while an operation is in progress [1], you'd know why it's very important to keep the GUI running regardless of what operation is happening.
It's a user experience thing.
[1] I don't mean the ones that disable some buttons during an operation (that's normal), but the ones where everything seems frozen.

This is the reverse: the main thread should be the Gtk one, and the long processing/computing tasks should be done in threads.
The documentation gives a clear example:
https://pygobject.readthedocs.io/en/latest/guide/threading.html

Thread per connection vs Reactor pattern (with a thread pool)?

I want to write a simple multiplayer game as part of my C++ learning project.
So I thought, since I am at it, I would like to do it properly, as opposed to just getting-it-done.
If I understood correctly: Apache uses a Thread-per-connection architecture, while nginx uses an event-loop and then dedicates a worker [x] for the incoming connection. I guess nginx is wiser, since it supports a higher concurrency level. Right?
I have also come across this clever analogy, but I am not sure if it could be applied to my situation. The analogy also seems to be very idealist. I have rarely seen my computer run at 100% CPU (even with a umptillion Chrome tabs open, Photoshop and what-not running simultaneously)
Also, I have come across a SO post (somehow it vanished from my history) where a user asked how many threads they should use, and one of the answers was that it's perfectly acceptable to have around 700, even up to 10,000 threads. This question was related to JVM, though.
So, let's estimate a fictional user-base of around 5,000 users. Which approach would be the "most concurrent" one?
1. A reactor pattern running everything in a single thread.
2. A reactor pattern with a thread-pool (approximately how big do you suggest the thread pool should be?).
3. Creating a thread per connection and then destroying the thread when the connection closes.
I admit option 2 sounds like the best solution to me, but I am very green in all of this, so I might be a bit naive and missing some obvious flaw. Also, it sounds like it could be fairly difficult to implement.
PS: I am considering using POCO C++ Libraries. Suggesting any alternative libraries (like boost) is fine with me. However, many say POCO's library is very clean and easy to understand. So, I would preferably use that one, so I can learn about the hows of what I'm using.

Reactive Applications certainly scale better, when they are written correctly. This means
Never blocking in a reactive thread:
Any blocking will seriously degrade the performance of your server. You typically use a small number of reactive threads, so blocking can also quickly cause deadlock.
No mutexes, since these can block, so no shared mutable state. If you require shared state you will have to wrap it with an actor or similar so only one thread has access to the state.
All work in the reactive threads should be cpu bound
All IO has to be asynchronous or be performed in a different thread pool and the results feed back into the reactor.
This means using either futures or callbacks to process replies, this style of code can quickly become unmaintainable if you are not used to it and disciplined.
All work in the reactive threads should be small
To maintain responsiveness of the server all tasks in the reactor must be small (bounded by time)
On an 8 core machine you cannot allow 8 long tasks to arrive at the same time, because no other work will start until they are complete
If a task could take a long time it must be broken up (cooperative multitasking)
Tasks in reactive applications are scheduled by the application not the operating system, that is why they can be faster and use less memory. When you write a Reactive application you are saying that you know the problem domain so well that you can organise and schedule this type of work better than the operating system can schedule threads doing the same work in a blocking fashion.
I am a big fan of reactive architectures but they come with costs. I am not sure I would write my first c++ application as reactive, I normally try to learn one thing at a time.
If you decide to use a reactive architecture use a good framework that will help you design and structure your code or you will end up with spaghetti. Things to look for are:
What is the unit of work?
How easy is it to add new work? can it only come in from an external event (eg network request)
How easy is it to break work up into smaller chunks?
How easy is it to process the results of this work?
How easy is it to move blocking code to another thread pool and still process the results?
I cannot recommend a C++ library for this, I now do my server development in Scala and Akka which provide all of this with an excellent composable futures library to keep the code clean.
Best of luck learning C++ and with whichever choice you make.

Option 2 will most efficiently occupy your hardware. Here is the classic article, ten years old but still good.
http://www.kegel.com/c10k.html
The best library combination these days for structuring an application with concurrency and asynchronous waiting is Boost Thread plus Boost ASIO. You could also try a C++11 std thread library, and std mutex (but Boost ASIO is better than mutexes in a lot of cases, just always callback to the same thread and you don't need protected regions). Stay away from std future, cause it's broken:
http://bartoszmilewski.com/2009/03/03/broken-promises-c0x-futures/
The optimal number of threads in the thread pool is one thread per CPU core. 8 cores -> 8 threads. Plus maybe a few extra, if you think it's possible that your threadpool threads might call blocking operations sometimes.
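To make the sizing rule concrete, here is a minimal fixed-size pool built only on the standard library, sized from `std::thread::hardware_concurrency()`. It is a sketch, not what Boost ASIO does internally; `ThreadPool`, `default_pool_size` and `run_tasks` are names invented for this example.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// One thread per core; hardware_concurrency() may return 0 when unknown.
unsigned default_pool_size() {
    unsigned cores = std::thread::hardware_concurrency();
    return cores == 0 ? 1 : cores;
}

class ThreadPool {
public:
    explicit ThreadPool(unsigned thread_count) {
        if (thread_count == 0) thread_count = 1;
        for (unsigned i = 0; i < thread_count; ++i) {
            workers_.emplace_back([this] { worker_loop(); });
        }
    }

    // Finishes all queued tasks, then joins the workers.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& worker : workers_) worker.join();
    }

    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        wake_.notify_one();
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Submits `task_count` trivial tasks and returns how many actually ran.
int run_tasks(int task_count) {
    std::atomic<int> done{0};
    {
        ThreadPool pool(default_pool_size());
        for (int i = 0; i < task_count; ++i) {
            pool.submit([&done] { done.fetch_add(1); });
        }
    }  // pool destructor drains the queue and joins
    return done.load();
}
```

If the tasks may occasionally block, the "few extra" threads mentioned above would be added by passing a larger count to the constructor.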

FWIW, Poco supports option 2 (ParallelReactor) since version 1.5.1

I think that option 2 is the best one. As for tuning of the pool size, I think the pool should be adaptive. It should be able to spawn more threads (with some high hard limit) and remove excessive threads in times of low activity.

as the analogy you linked to (and its comments) suggest, this is somewhat application dependent. now what you are building here is a game server. let's analyze that.
game servers (generally) do a lot of I/O and relatively few calculations, so they are far from 100% CPU applications.
on the other hand they also usually change values in some database (a "game world" model). all players create reads and writes to this database. which is exactly the intersection problem in the analogy.
so while you may gain some from handling the I/O in separate threads, you will also lose from having separate threads accessing the same database and waiting for its locks.
so either option 1 or 2 are acceptable in your situation. for scalability reasons I would not recommend option 3.

C++ Server - To Thread or not to Thread?

I'm working on a game server, written in C++, and I'm trying to decide how many threads to use and what tasks to thread. The basic server skeleton consists of keyboard I/O and output to a console, accepting incoming connects, sending outgoing connects, and doing the game "stuff".
What I'd like to know is which things should be given a separate thread. Should each connect have its own thread? I know this is variable, it depends on the project or so, but I would like it to support a pretty decent number of players (somewhere in the hundreds if possible).

The standard answer should always be: Try it the simplest way first, and only look for ways to improve performance if the simple way isn't good enough. However, re-architecting a large C++ program can be a painful experience, so some guesses about performance in advance may be appropriate.
Theoretically, hundreds of threads are probably OK on modern machines. The NPTL implementation for Linux was tested with tens of thousands of threads, as I recall. If that's the easiest way for you to implement, it may be the right answer.
However, high-performance web servers and similar typically use event-driven models instead. Consider a library like libevent. I'm sure there are C++ libraries for the same purpose.
I personally believe that languages without first-class continuations, or at least coroutines, are poor choices for this kind of work, but the C language family is how we get work done today, so off we go. :-)

A good solution could be to use a Thread pool.
The idea is to let the main thread distribute all connections evenly across a fixed number of threads.
With a good design, you can easily set the number of threads at runtime.
You can find more information here.
Creating more threads than you have CPU cores is not productive, and adding too many threads decreases performance because of the time taken to switch between threads.
For example, when compiling a large project (it's not exactly the same thing, but the reasoning holds in both cases), it's often recommended to use no more threads than the number of CPU cores + 1.

A very common technique is to have the game server run on one thread and monitor several connections (i.e. sockets) by calling select on the set of sockets. When data is available, grab the data and enqueue it in a producer/consumer type model for the game engine to pick up.
This is by no means the be-all-end-all implementation, but it should be enough to get you started. Sounds like a cool project. Good luck!
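A minimal sketch of that select-and-enqueue loop, assuming a POSIX system. Pipes stand in for client sockets so the example needs no network setup; `Message`, `poll_once` and `demo_two_clients` are hypothetical names, and a real server would also handle EOF, errors and partial messages.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// One chunk of received data, tagged with the descriptor it came from.
struct Message {
    int fd;
    std::string payload;
};

// Waits (up to timeout_ms) until some descriptor is readable, reads what is
// available and enqueues it. Returns how many descriptors delivered data,
// or -1 if select failed.
int poll_once(const std::vector<int>& fds, std::queue<Message>& inbox, int timeout_ms) {
    fd_set read_set;
    FD_ZERO(&read_set);
    int max_fd = -1;
    for (int fd : fds) {
        FD_SET(fd, &read_set);
        if (fd > max_fd) max_fd = fd;
    }
    struct timeval timeout;
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;
    if (select(max_fd + 1, &read_set, nullptr, nullptr, &timeout) < 0) return -1;

    int delivered = 0;
    for (int fd : fds) {
        if (!FD_ISSET(fd, &read_set)) continue;
        char buffer[256];
        ssize_t n = read(fd, buffer, sizeof buffer);
        if (n > 0) {
            inbox.push(Message{fd, std::string(buffer, static_cast<std::size_t>(n))});
            ++delivered;
        }
    }
    return delivered;
}

// Simulates two clients that have each sent one command, then drains the inbox
// the way a game engine (the consumer) would.
std::vector<std::string> demo_two_clients() {
    std::vector<std::string> payloads;
    int a[2];
    int b[2];
    if (pipe(a) != 0) return payloads;
    if (pipe(b) != 0) {
        close(a[0]);
        close(a[1]);
        return payloads;
    }
    bool wrote = write(a[1], "move", 4) == 4 && write(b[1], "chat", 4) == 4;
    std::queue<Message> inbox;
    if (wrote) {
        std::vector<int> fds;
        fds.push_back(a[0]);
        fds.push_back(b[0]);
        poll_once(fds, inbox, 1000);
    }
    while (!inbox.empty()) {
        payloads.push_back(inbox.front().payload);
        inbox.pop();
    }
    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return payloads;
}
```

Note that `select` is limited to descriptors below `FD_SETSIZE`, which is one reason larger servers move to `poll`, `epoll` or a library such as libevent.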

If you set up the connections and use them in a manner that causes the thread to block waiting on IO, then you should be able to service all of the connections and the keyboard on one thread. You may not want to put the console output on that same thread, as I've seen cases (on Windows at least) where the speed of writing to the console is actually a bottleneck (i.e. if the console window is minimized the process runs considerably faster).
If the work of your game engine parallelizes well then you probably want to use as many threads as there are CPUs less one (for the OS and the other two threads). If you expect the client to run on the same machine, the server will want to detect that and scale back the number of threads it uses.

VNC viewer implementation

Our team is implementing a VNC viewer (=VNC client) on Windows. The protocol (called RFB) is stateful, meaning that the viewer has to read 1 byte, see what it is, then read either 3 or 10 bytes more, parse them, and so on.
We've decided to use asynchronous sockets and a single (UI) thread. Consequently, there are 2 ways to go:
1) state machine -- if we get a block on socket reading, just remember the current state and quit. Later on, a socket notification will arrive and the interrupted logic will resume from the proper stage;
2) inner message loop -- once we determine that reading from the socket would block, we enter an inner message loop and spin there until all the necessary data is finally received.
UI is not thus frozen in case of a block.
As experience showed, the second approach is bad, as any message can come while we're in the inner message loop. I cannot tell the full story here, but it simply is not reliable enough. Crashes and kludges.
The first option seems to be quite acceptable, but it is not easy to program in such a style. One has to remember the state of an algorithm and values of all the local variables required for further processing.
It is quite possible to use multiple threads, but we thought that the problems in that case would be even harder: synchronization of frame-buffer access, multi-threading issues, etc. Moreover, even in this variant it seems necessary to use asynchronous sockets as well.
So, which way is the best in your opinion?
The problem is quite a general one. This is the problem of organizing asynchronous communication through stateful protocols.
Edit 1: We use C++ and MFC as UI framework.
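The state-machine option (1) can be sketched as an incremental parser that is fed whatever bytes the socket notification delivers and keeps its position between calls. The message layout here is a made-up stand-in for RFB (type 0 carries a 3-byte body, any other type a 10-byte body), and `ParsedMessage` and `IncrementalParser` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A fully received message: one type byte plus its body.
struct ParsedMessage {
    std::uint8_t type;
    std::vector<std::uint8_t> body;
};

class IncrementalParser {
public:
    // Consumes `size` bytes; complete messages are appended to `out`.
    // Safe to call with any split of the stream, even one byte at a time.
    void feed(const std::uint8_t* data, std::size_t size, std::vector<ParsedMessage>& out) {
        for (std::size_t i = 0; i < size; ++i) {
            std::uint8_t byte = data[i];
            switch (state_) {
            case State::ExpectType:
                current_.type = byte;
                current_.body.clear();
                remaining_ = body_length(byte);
                state_ = State::ReadBody;
                break;
            case State::ReadBody:
                current_.body.push_back(byte);
                if (--remaining_ == 0) {
                    out.push_back(current_);
                    state_ = State::ExpectType;
                }
                break;
            }
        }
    }

    // True when the parser stopped in the middle of a message.
    bool mid_message() const { return state_ != State::ExpectType; }

private:
    enum class State { ExpectType, ReadBody };

    // Assumed layout for this sketch only, not the real RFB message table.
    static std::size_t body_length(std::uint8_t type) { return type == 0 ? 3 : 10; }

    State state_ = State::ExpectType;
    ParsedMessage current_{};
    std::size_t remaining_ = 0;
};
```

Everything that would otherwise live in local variables (the current message and the number of bytes still expected) is held in the object, which is the bookkeeping cost the question describes.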

I've done a few parallel computing projects and it seems that MPI (Message Passing Interface) might be helpful to your VNC project. You're probably not so interested in the parallel computing power provided by MPI, but you may want to use the simplified socket-like interface for asynchronous communication over a network.
http://www.open-mpi.org/
You can find other implementations of MPI and tons of use examples from google.

Don't bother with CSocket, you'll move to CAsyncSocket in the end because of the extra control you get (interrupting, shutting down etc.). I'd also recommend using a separate thread to manage the communication, it adds complexity but keeping the UI responsive should be a top priority.

I think you will find that your design will be simplified greatly by using a separate thread to handle a blocking socket.
The main reason for this is you don't need to spin and wait. The UI remains responsive while the network thread(s) block when it has nothing to do and comes back when it has stuff to do. You are effectively offloading a large portion of your overhead to the OS.
Remember, RFB does not require a whole lot of state info to work. Because client to server messages are short; there is nothing requiring you to receive a frame buffer before you send your next pointer input.
My point is that messages in RFB can be intermixed; the server will work on your schedule.
Now, Windows provides easy-to-use synchronization APIs that, while not always the most efficient, are more than enough for your purposes and will ease getting a proof of concept up and going.
Take a look at Windows Synchronization and specifically Critical Sections
Just my 2cents, I've implemented both a vnc server and client on windows, these were my impressions.

Realtime Display of Data

I am designing an application to collect my vehicle's data and display it. I'm trying to figure out what the best architecture for my software would be. I plan on using Qt for my GUI (QPainter) and I have custom hardware that collects the data from sensors. I was thinking that the hardware I/O would reside, in its own thread, in the application that renders the graphics, but now I am thinking it might be better to put all the hardware I/O communication in a separate process and communicate between the two processes with some IPC protocol (not sure which one).
What do you recommend? This would also be my first time writing a multi-process application.

I have written such things hundreds of times. By far, the best solution is to split the dedicated hardware into two threads or tasks:
one which does whatever realtime operations are needed
another which responds to data queries and commands from the UI
These two threads cooperate with each other to maintain a consistent, semaphore-protected shared variable space. The second thread does all its parsing and whatnot before locking the shared space, makes a copy of whatever it needs, and unlocks. The goal is to limit the locking interval to as short a time as possible. Oftentimes, it is practical to arrange all the shared variables into a single structure, and use a bulk memcpy(), even if only a few members are of interest. The simpler this interaction, the better.
The UI contains
screens which, when visible and active, cause periodic queries to the data module
Other architectures are possible, but whenever I've seen them, they have devolved into huge steaming masses of patches to work around synchronization and timing issues.
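The short-lock, copy-everything pattern described above might look like the following. This sketch swaps in `std::mutex` and plain struct assignment for the semaphore and `memcpy()` mentioned in the answer, and the field names in `VehicleData` are invented for illustration.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// All shared values live in one structure so they can be copied in bulk.
struct VehicleData {
    int rpm;
    int speed_kph;
    unsigned sequence;  // incremented on every update
};

class SharedVehicleData {
public:
    // Called by the acquisition side with a fully prepared record.
    void publish(const VehicleData& fresh) {
        std::lock_guard<std::mutex> lock(mutex_);
        data_ = fresh;  // the only work done while locked
    }

    // Called by the UI side; returns a private copy to work on at leisure.
    VehicleData snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_;
    }

private:
    mutable std::mutex mutex_;
    VehicleData data_{};
};

// Stand-in for the real-time thread: builds each record outside the lock.
void acquisition_loop(SharedVehicleData& shared, unsigned iterations) {
    for (unsigned i = 1; i <= iterations; ++i) {
        VehicleData fresh;
        fresh.rpm = 800 + static_cast<int>(i);
        fresh.speed_kph = static_cast<int>(i % 200);
        fresh.sequence = i;
        shared.publish(fresh);
    }
}

// Polls snapshots while the producer runs and checks that every copy is
// internally consistent (rpm always matches the sequence it was written with).
bool run_demo(unsigned iterations) {
    SharedVehicleData shared;
    std::thread producer(acquisition_loop, std::ref(shared), iterations);
    bool consistent = true;
    for (int i = 0; i < 1000; ++i) {
        VehicleData copy = shared.snapshot();
        if (copy.sequence != 0 && copy.rpm != 800 + static_cast<int>(copy.sequence)) {
            consistent = false;
        }
    }
    producer.join();
    VehicleData last = shared.snapshot();
    return consistent && last.sequence == iterations;
}
```

A UI screen would call `snapshot()` from its periodic timer and then draw from the copy, so painting never holds the lock.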
