My Linux C++ application periodically reads sensor data. Readout is done by a simple file I/O operation (the OS writes to a file, and the application reads from this file).
Some information about my platform:
I have a single-core processor with hyper-threading
sensor data update frequency is 1 second
application GUI runs in main thread and shouldn't be blocked
I considered two approaches for sensor data read out:
timer running in main application thread
separate thread with infinite loop which does sensor data readout and then sleeps
Which approach makes more sense, and are there any other alternatives? What are the costs of both solutions (e.g. blocking of the main thread in the first, or context switching in the second approach)?
I don't know anything about your application or the hardware, but here are a few things to consider:
If you use a thread, you will have to create a communication channel of some sort to tell the main thread that data has been updated. Usually this would be a pipe(), as signals are inherently unreliable and condition locks don't work with I/O multiplexing (i.e. select()/poll()).
Can you get the entire set of data without blocking? If so, then just reading it in the main thread is probably easier. However, if your read can block you'll probably need some more "keep track of my read state to incorporate it into my central select()", whereas a thread can just block until more data is available.
Thus, neither solution is automatically "easier" to do.
I wouldn't worry about "context switching" for a read that only occurs once per second; that's irrelevant.
What else does the main thread have to do? Is it OK if it blocks? If so, then you don't need to run the timer, etc. in a separate thread.
If the main thread can't block waiting for the periodic timer, then a separate thread must be created. The communication of data between the threads can be via an object that is accessible to both threads and protected by a mutex (look up pthread_mutex_t), which is quite simple to do.
As for which solution would be better and what the costs are, it depends on what else the main thread is doing. But for something this simple, either way should be about the same, and the context switching shouldn't affect anything. What should affect performance the most is how expensive the reads themselves are.
I believe that the cost of a context switch once a second is not an issue even for a single-core CPU without hyper-threading, especially taking into account that the application runs in user space and thus is not really time-critical. Polling your sensor in the main thread complicates the logic of the application. So I would recommend starting a thread for that purpose.
A sleep loop will skew the timing because each iteration is going to take longer than 1 second. Timers don't have that problem, and they are made for this scenario. So choose a timer.
Performance-wise there is no difference because you are only triggering once a second.
If the Linux driver is reading a sensor data and writing it to a device file every second, you shouldn't duplicate the timer logic in your application. It may happen that after 1 second sleep your application will still read the same data as 1 second ago. A better approach would be to have a thread that would call a blocking read on a device file. When new sensor data is available, blocking read returns, the thread can process the data and call read again.
Related
We have developed a system which integrates with camera and micro controller.
GUI shows the image from the camera and serial count from micro controller, we used a serial thread to poll the data from micontroller and emitted the signal to GUI to display it and also we have used a separate thread to capture image and pass it to main thread.
The problem with the application is that when the system is in an idle state, the GUI freezes and we have to restart the application to get it working again (idle meaning the user is not pressing any buttons, while counts and images keep coming in continuously).
The most important thing to notice here is that the GUI freeze issue is not consistent. There are several systems installed; in some places the freeze (not responding) issue occurs once every 2-3 weeks, and in others once every 2 days. Waiting for the application to become responsive again doesn't help.
My main question is: what is the main cause of the GUI freezing, and are there any checks to implement on the serial thread and image-capturing thread to avoid unnecessary data emissions?
Sounds like you are experiencing a concurrency violation that doesn't necessarily result in crashing until things have been running long enough to finally hit the magic combination of events occurring at just the right time to bork things up.
You have three threads in your app: A GUI thread, a serial thread, and a camera thread. The serial and camera threads collect data from devices and then pass them on to the GUI thread for display. I presume the serial and camera threads don't share any data with each other, so there is no risk of problems there.
How are you passing data up from the serial and camera threads to the GUI thread? This is where you are probably having an issue.
Most complex data structures and Qt classes are not thread safe, meaning they must never be read and written concurrently from two or more threads.
Here are some strategies for passing data between threads safely:
An integer is atomic at the CPU's instruction set level, so you can safely read and write an integer (or any data type equal to or smaller than an integer, such as a bool, a single char, or a pointer) from multiple threads without having any inconsistent state occurring. You must declare such variables with C++'s std::atomic<> template to ensure that the compiler will not perform optimizations that break atomicity.
Anything bigger/more complex than an integer runs the risk of one thread having written half of its data to memory while another thread concurrently reads out that half-written data, resulting in very unexpected results, quite often crashing your application or getting something stuck in an infinite loop.
Signals and slots in Qt are thread safe. One way to pass complex data between threads is to emit the data in one thread and have a slot in another thread receive that data. Qt takes care of any concurrency issues under the hood for you in this case. The only gotcha here is if the consumer thread(s) of the data can not absorb the data fast enough, Qt's event queue will get stuffed up with too much data and eventually your app will crash because new GUI events (such as mouse clicks, repaint events, etc) can no longer get through the clogged up event queue.
You can use a QMutex to ensure that only one thread is reading or writing a complex data structure at one time. QMutex lets you block/halt execution in one or more threads while a single thread "holds" the mutex and allows it to execute, doing work on the data without any risk of other threads touching that data. When that one thread is done, it "releases" the mutex, which then allows one of the other threads to "hold" the mutex, resuming its execution so it can do work with the data.
As far as testing goes, typically the chances of your app crashing from a concurrency violation goes up the higher your data flow rates are. If you can artificially increase the rate of camera frames and "serial counts" being passed up to the GUI thread, you will be able to reproduce your application crash faster. Once you successfully solve your concurrency issues, you should be able to flood the system with data and never get it to crash.
I'm working with a user-mode driver for small-scale USB devices. My USB reading loop should be very responsive, and the operations it performs should be very small (not necessarily atomic), like an interrupt service routine in a kernel-mode driver. At one point in processing I need to create a thread and pass some parameters to that thread inside that reading loop.
So I need to know the exact upper limit of that operation, e.g. that it will not take more than 200 ms, or something like that.
Next alternative is to do the thread initialization at the device initialization time ( probing time ) and then sleep that thread waiting till I signal it from the reading thread. But in this scenario the thread is always running and it would be costly.
What is the best option? My platform is Linux, where thread creation is said to be a very quick operation. I need to decide what is best: keep the thread alive at all times, or create the thread when necessary.
Modern machines have hundreds, sometimes thousands of threads instantiated and in "ready" state at all times. "Ready" does not mean "Actually Running".
So, there is no problem with starting one more thread at device initialization and keeping it in "Ready" state most of the time, and giving it some work to do every once in a rare while.
The trick to getting this to work smoothly is to make sure that the thread is block-waiting for an event to occur. A thread that is block-waiting on a signal consumes zero, or near-zero, CPU.
Starting a new thread each time you need to do something can be quite costly. A new thread usually needs to allocate memory, and this can be a time consuming operation, especially in a system that is running low on memory, where memory allocation can cause swapping.
Just create the thread once and make it block on some semaphore or mutex until you signal it. This way it won't be "always running" and it won't "be costly". It also means you don't need to handle cases like "What if the thread didn't start when I needed some processing?" or "What if the system was busy and thread startup was slow?"
Just a minor thing: if the thread doesn't do much, I would initialize it with a smaller stack size.
I have a multi-threaded application that is using pthreads. I have a mutex lock and condition variables. There are two threads: one thread produces data for the second thread, a worker, which tries to process the produced data in a real-time fashion such that one chunk is processed as close to the elapsing of a fixed time period as possible.
This works pretty well, however, occasionally when the producer thread releases the condition upon which the worker is waiting, a delay of up to almost a whole second is seen before the worker thread gets control and executes again.
I know this because right before the producer releases the condition upon which the worker is waiting, it does a chunk of processing for the worker if it is time to process another chunk; then immediately upon receiving the condition in the worker thread, it also does a chunk of processing if it is time to process another chunk.
In this latter case, I am seeing that I am late processing the chunk many times. I'd like to eliminate this lost efficiency and do what I can to keep the chunks ticking away as close as possible to the desired frequency.
Is there anything I can do to reduce the delay between the release condition from the producer and the detection that that condition is released such that the worker resumes processing? For example, would it help for the producer to call something to force itself to be context switched out?
Bottom line is the worker has to wait each time it asks the producer to create work for itself so that the producer can muck with the worker's data structures before telling the worker it is ready to run in parallel again. This period of exclusive access by the producer is meant to be short, but during this period, I am also checking for real-time work to be done by the producer on behalf of the worker while the producer has exclusive access. Somehow my hand off back to running in parallel again results in significant delay occasionally that I would like to avoid. Please suggest how this might be best accomplished.
I could suggest the following pattern. Generally the same technique could be used, e.g. when prebuffering frames in some real-time renderers or something like that.
First, it's obvious that the approach you describe in your message would only be effective if both of your threads are loaded equally (or almost equally) all the time; if not, multi-threading would bring little actual benefit in your situation.
Now, let's think about a thread pattern that would be optimal for your problem. Assume we have a yielding thread and a processing thread. The first prepares chunks of data to process; the second does the processing and stores the result somewhere (where exactly is not important).
The effective way to make these threads work together is the proper yielding mechanism. Your yielding thread should simply add data to some shared buffer and shouldn't actually care about what would happen with that data. And, well, your buffer could be implemented as a simple FIFO queue. This means that your yielding thread should prepare data to process and make a PUSH call to your queue:
X = PREPARE_DATA()
BUFFER.LOCK()
BUFFER.PUSH(X)
BUFFER.UNLOCK()
Now, the processing thread. Its behaviour could be described this way (you should probably add some artificial delay like SLEEP(X) between calls to EMPTY):
IF !EMPTY(BUFFER) PROCESS(BUFFER.TOP)
The important point here is what your processing thread should do with processed data. The obvious approach is to make a POP call after the data is processed, but you may want to come up with some better idea. Anyway, in my variant this would look like:
// After data is processed
BUFFER.LOCK()
BUFFER.POP()
BUFFER.UNLOCK()
Note that locking operations in yielding and processing threads shouldn't actually impact your performance because they are only called once per chunk of data.
Now, the interesting part. As I wrote at the beginning, this approach is only effective if the threads are loaded roughly equally in terms of CPU/resource usage. There is a way to make this threading solution effective even if that condition does not hold constantly and depends on other runtime conditions.
This means creating another thread, called the controller thread. This thread would compare the time each thread uses to process one chunk of data and balance the thread priorities accordingly. Actually, we don't have to "compare the time"; the controller thread could simply work like this:
IF BUFFER.SIZE() > T
DECREASE_PRIORITY(YIELDING_THREAD)
INCREASE_PRIORITY(PROCESSING_THREAD)
Of course, you could implement some better heuristics here but the approach with controller thread should be clear.
My program is consuming far more CPU time than I'd like (2 displays shoots it up to 80-90%). I'm using QTimers, and some of them are as short as 2 ms. At any given time, I can have 12+ timers going per display: 2 ms, 2 ms, 2 ms, 250 ms, with the rest ranging between 200 ms and 500 ms. Would it be better if I used threads for some or all of these (especially the short ones)? Would it make much of a difference?
The main time issue is going to come from the high-priority timers. First off, make sure you really need these every 2 ms; secondly, to overcome some of the overhead in the QTimer class, you could group your three 2 ms timeouts into one, and every time it fires just execute the three sections of code sequentially. I don't think threading will solve the issue, though.
The 2 ms seems suspect to me. People have been reading and writing serial ports at 19200 baud for years (for example, on 486 hardware) without overloading the CPU. Maybe your approach is wrong.
What API are you using to access the port? It sounds like you are polling it; if the API supports blocking reads and writes, that would be a much better approach.
The simplest approach would then be to put the read and write in their own threads and use blocking reads in a loop; then your read thread will only be busy when there is data to read and you are processing it. Your application should know when it needs to write, so the write thread should wait on a condition variable; when data is available, the condition is triggered, waking up the write thread.
There are probably simpler single-threaded approaches as well; I am sure the first applications to read and write serial ports (e.g. XMODEM) were not multi-threaded. I do not know those approaches myself, but they should be documented in the API you are using.
I am reading data from multiple serial ports. At present I am using a custom signal handler (by setting sa_handler) to compare and wake threads based on file descriptor information. I was searching for a way to have individual threads with unique signal handlers, and in this regard I found that the select system call is what should be used.
Now I have following questions:
If I am using a thread (Qt) then where do I put the select system call to monitor the serial port?
Is the select system call thread safe?
Is it CPU intensive because there are many things happening in my app including GUI update?
Please do not mind, if you find these questions ridiculous. I have never used such a mechanism for serial communication.
The POSIX specification (select) is the place to look for the select definition. I personally recommend poll - it has a better interface and can handle any number of descriptors, rather than a system-defined limit.
If I understand correctly you're waking threads based on the state of certain descriptors. A better way would be to have each thread have its own descriptor and call select itself. You see, select does not modify the system state, and as long as you use thread-local variables it'll be safe. However, you will definitely want to ensure you do not close a descriptor that a thread depends on.
Using select/poll with a timeout leaves the "waiting" up to the kernel side, which means the thread is usually put to sleep. While the thread is sleeping it is not using any CPU time. A while/for loop on a select call without a timeout on the other hand will give you a higher CPU usage as you're constantly spinning in the loop.
Hope this helps.
EDIT: Also, select/poll can have unpredictable results when working with the same descriptor in multiple threads. The simple reason for this is that the first thread might be woken up because the descriptor is ready for reading, but the second thread has to wait for the next "available for reading" wakeup.
As long as you're not selecting on the same descriptor in multiple threads you should not have a problem.
It is a system call -- it should be thread safe, I think.
I have not done this before, but I would be rather surprised if it were not. How CPU-intensive select() is depends, in my opinion, largely on the number of file handles you are waiting for. select() is mostly used to wait for a number (>1) of file handles to become ready.
It should also be mentioned that select() should not be used to poll the file handles, for performance reasons. Normal usage is: you have finished your work and some time may elapse before the next thing happens, so you suspend your process with select() and let another process run. select() normally suspends the active process. How this works together with threads, I am not sure; I would think that the whole process (and all threads) would be suspended, but this might be documented. On Linux it could also depend on whether you use system threads or user threads: the kernel does not know about user threads and would hence suspend the whole process.