Save data periodically during execution - c++

I have a program that runs continuously, and I need to save data every minute.
The program processes data, and every minute I want to save the value of a variable and run some statistics to track how that variable changes.
I thought I could do it with a signal: SIGALRM and alarm(60). My subquestion is, can I use a class method as the handler for SIGALRM?
Is there any other way to run a method that saves data and does some calculations every minute?
The program is written in C++ and runs on Linux on a single-core processor.

Your solution using alarm will work, since both open and write are async-signal-safe. Be aware, though, that mixing alarm and sleep is not portable (sleep may itself be implemented with SIGALRM), so don't use both in the same program.
A different solution, especially in case you already use an epoll, would be to have a timerfd trigger the epoll. That will avoid possible undefined interactions.
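A minimal sketch of the timerfd approach, assuming Linux. The function name `run_periodic` and its parameters are made up for illustration; in a real program the timer descriptor would be added to the epoll set you already have rather than a fresh one.

```cpp
#include <sys/epoll.h>
#include <sys/timerfd.h>
#include <unistd.h>
#include <cerrno>
#include <cstdint>
#include <functional>

// Calls `on_tick` each time the timer fires until at least `max_ticks`
// expirations have been seen. Returns the number of expirations seen,
// or -1 on a setup error. `period_ms` would be 60000 for "every minute".
long run_periodic(long period_ms, long max_ticks,
                  const std::function<void()>& on_tick) {
    if (period_ms <= 0) return -1;  // a zero it_value would disarm the timer

    int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
    if (tfd == -1) return -1;

    itimerspec spec{};
    spec.it_value.tv_sec = period_ms / 1000;
    spec.it_value.tv_nsec = (period_ms % 1000) * 1000000L;
    spec.it_interval = spec.it_value;  // periodic: re-arms automatically
    int epfd = -1;
    if (timerfd_settime(tfd, 0, &spec, nullptr) == -1 ||
        (epfd = epoll_create1(0)) == -1) {
        close(tfd);
        return -1;
    }

    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = tfd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, tfd, &ev) == -1) {
        close(epfd);
        close(tfd);
        return -1;
    }

    long seen = 0;
    while (seen < max_ticks) {
        epoll_event out{};
        int n = epoll_wait(epfd, &out, 1, -1);
        if (n == -1) {
            if (errno == EINTR) continue;  // interrupted by a signal, retry
            break;
        }
        if (n == 1 && out.data.fd == tfd) {
            // read() reports how many periods elapsed since the last read
            std::uint64_t expirations = 0;
            if (read(tfd, &expirations, sizeof expirations) !=
                static_cast<ssize_t>(sizeof expirations)) break;
            seen += static_cast<long>(expirations);
            on_tick();  // runs in normal context, not in a signal handler
        }
    }
    close(epfd);
    close(tfd);
    return seen;
}
```

Because the callback runs from the event loop and not from a signal handler, it is free to take locks, allocate, and use stdio.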
As for the actual saving, consider forking. I learned this technique from redis (someone else may have invented it, but that's where I saw it), and I think it's very neat. The forked process can take as long as it needs to write as much data as you want to disk. It sees a snapshot of memory as of the moment of the fork, while the parent keeps running and modifying its data. Thanks to copy-on-write paging in the kernel, neither process sees the other's changes, so the parent cannot corrupt the snapshot, the parent never blocks on the disk, and you don't need anything like asynchronous I/O.
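A rough sketch of the fork-and-save idea, assuming a single-threaded parent on a POSIX system. `snapshot_to_file` and `wait_for_snapshot` are hypothetical names, and the text format is only a placeholder for whatever the program really stores.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <vector>

// Forks a child that writes `data` to `path`. The child sees the data as it
// was at the moment of fork(); later changes made by the parent are invisible
// to it. Returns the child's pid, or -1 if fork() failed.
pid_t snapshot_to_file(const std::vector<double>& data, const char* path) {
    pid_t pid = fork();
    if (pid == 0) {  // child
        int status = 1;
        if (std::FILE* f = std::fopen(path, "w")) {
            for (double v : data) std::fprintf(f, "%g\n", v);
            status = (std::fclose(f) == 0) ? 0 : 1;
        }
        _exit(status);  // skip atexit handlers and inherited stdio buffers
    }
    return pid;  // parent (or -1 on failure)
}

// Reaps the child; returns true if it exited with status 0.
bool wait_for_snapshot(pid_t pid) {
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

In a long-running program the parent would not block in `waitpid`; it could reap the child later, for example by calling `waitpid` with `WNOHANG` before starting the next snapshot.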

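You cannot register a non-static class method (or a boost::bind object) directly as a signal handler, because signal and sigaction take a plain function pointer. The usual workaround is a free or static function that forwards the call to your object, with boost bind or similar used on the object side.
Apart from that, I wouldn't recommend using signals for this: they are not that reliable and could, for example, make one of your syscalls return prematurely.
I would spawn a thread (assuming single-core doesn't mean no threads) that waits 60 seconds, takes the locks, does the calculations, writes the output, and releases the locks.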
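One possible shape for such a thread, using only the C++11 standard library. The class name `PeriodicTask` is invented for this sketch, and the task passed in is assumed to take whatever lock protects the shared data.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Runs `task` on its own thread once per `period` until destroyed.
class PeriodicTask {
public:
    PeriodicTask(std::chrono::milliseconds period, std::function<void()> task)
        : period_(period), task_(std::move(task)), worker_([this] { run(); }) {}

    ~PeriodicTask() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();  // wake the worker so it doesn't sleep out the period
        worker_.join();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait_for returns true once stop_ is set and false on a plain timeout;
        // the predicate also filters out spurious wakeups.
        while (!cv_.wait_for(lock, period_, [this] { return stop_; })) {
            lock.unlock();
            task_();  // the task locks the shared data itself
            lock.lock();
        }
    }

    std::chrono::milliseconds period_;
    std::function<void()> task_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stop_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};
```

Waiting on a condition variable instead of calling a plain sleep lets the destructor stop the thread immediately rather than after up to a full minute.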
As already suggested, if your program is event-driven you could use a timerfd to generate the events.

Saving data from a signal handler is a very bad idea. Even if open and write are async-signal-safe, your data could very well be in an inconsistent state due to a signal interrupting a function that was modifying it.
A much better approach would be to add to all functions which modify the data:
if (current_time > last_save_time + 60) save();
This also avoids useless saves when the data has not been modified. If you don't want the overhead of a system call to read the current time on every operation, you could instead install a timer signal handler that updates current_time, as long as you declare it volatile sig_atomic_t.
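To make the check-on-modify idea concrete, here is a small sketch. The `Counter` class, its `add` method, and the in-memory `saved_` vector (standing in for the real write to disk) are all assumptions for illustration, not part of the original program.

```cpp
#include <chrono>
#include <vector>

// Every mutating method ends with maybe_save(), so a save can only happen
// while the data is in a consistent state.
class Counter {
public:
    using Clock = std::chrono::steady_clock;

    explicit Counter(Clock::duration save_interval)
        : interval_(save_interval), last_save_(Clock::now()) {}

    void add(long delta) {
        value_ += delta;
        maybe_save();
    }

    const std::vector<long>& saved() const { return saved_; }

private:
    void maybe_save() {
        const Clock::time_point now = Clock::now();
        if (now - last_save_ >= interval_) {
            saved_.push_back(value_);  // stand-in for writing to disk
            last_save_ = now;
        }
    }

    Clock::duration interval_;
    Clock::time_point last_save_;
    long value_ = 0;
    std::vector<long> saved_;
};
```

With `Counter c(std::chrono::seconds(60))`, a call to `c.add(...)` saves at most once a minute and never saves while the value is idle.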
Another good approach would be to use threads instead of signals. Then you should use a mutex (or better, rwlock) to synchronize access to the data.

Related

Fast synchronised cout for multithreading

Recently I ran into a rather common problem with using cout in a multithreaded application, but with a little twist. I have several callback functions which get called by external hardware via a driver. The main objective of the callback functions is to receive some data, store it in a queue, and signal a processing task as soon as a certain number of datasets has been collected. The callback function needs to run as fast as possible in order to respond to the hardware in soft real time.
My problem is this: from time to time my queue gets full, and I have to handle this case by printing a warning to the console (hard requirement). As I work with several threads, I've created a wrapper function which uses a mutex to synchronise cout. Unfortunately, in some cases waiting for access to cout takes so long that my callback function doesn't finish fast enough to respond to the hardware before a timeout. My solution was to use an atomic variable for each possible error to count the number of occurrences, plus a further task that checks these variables periodically and prints the messages afterwards, but I'm pretty sure this is not the best way to solve my performance problems.
Are there any general approaches for this type of problem?
Any recommendations how I could improve or simplify my solution?
Thank you in advance
Don't write output in the hotpath.
Instead, queue up the stuff you want to log (preferably raw data rather than a fully formatted string). Have another out-of-band thread running which picks this stuff up and logs it.

Do Asynchronous Loggers really help in performance?

We know that synchronous logging writes the log message to the file and then continues with program execution. Asynchronous loggers queue the log messages and write them from a separate thread. I'm starting to use Log4CPlus in my project, and a couple of things came to mind.
I can't create more log objects, because that would open more file handles and we don't need that. (I know we should use feature-based logging objects, for example UploadLogObj, DownloadLogObj, WebReqLogObj, AuthLogObj, etc.) I also suspect that every additional log object may add another logging thread.
Still, for argument's sake, if I use a single log object and push log messages from multiple threads, I suppose there must be some mutex lock protecting writes to the message queue. My question: won't this mutex lock slow down the process and create a performance issue?
I'm just wondering how asynchronous loggers work. I could look into the code, that's one way, but I hope the answers will be enlightening to a lot of people.
Yes, the mutex will slow down the process a bit, but if you are logging from multiple threads to the same destination you will need some form of synchronization anyway, since you don't want lines from different threads to be mixed up.
In the end it's a matter of deciding where to synchronize, not if. With asynchronous logging this happens when the object to be logged is pushed to the queue of the logging thread. In the synchronous case probably at the time the line is written (though it depends on the implementation).
In the first case the time spent inside the mutex will be much shorter and more predictable, since no disk flushes happen while the mutex is held. This means that you may see less performance degradation and better scaling than in the second case (plus you save the time you would have spent writing the actual data, because the other thread takes care of it).
If you don't have a lot of threads competing for the mutex anyway, it won't be a problem. I had the chance to write and use an asynchronous logger for a real-time system some time ago, and we hit disk-bandwidth issues long before synchronization issues.
One downside of asynchronous logging is more memory related: since you need to pass the data to be logged around you need to be careful and avoid unneeded allocations/deallocations.
Locking a mutex takes something like 40-60 nanoseconds on modern hardware (if the mutex is not held by another thread). This is nothing compared to an I/O operation, which in theory can spend a few seconds writing to a slow HDD or network drive.
Lock-free is a different thing: in that case you don't even have mutexes. However, there is a price for it: you'll have to write more complicated code.

any decent way to set a timeout on a thread?

We have a decoding function that runs in its own thread to carry out its job.
The time of execution is usually well below a defined timeout value, but on some occasions it may take much longer to complete. Thus the need to have a timeout in order to make sure this function will not cause extra delays to the rest of the program.
This is currently being developed on Windows OS but I'm also looking at a portable solution to Linux.
The implementation so far has multiple checks within the decoding function to see whether it still has time to continue or should abort processing. This is definitely not great practice, and I'm looking at improving it.
I'm aware that boost provides such facility, but we do not use boost in this project.
Here is an excellent article by Herb Sutter on the subject. The conclusion would be: your current approach is OK. Just have your decoding threads periodically check whether they have run out of time. The important thing is to strike a balance in how frequently you check.
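A small illustration of the periodic check, with a made-up `decode` function that just sums bytes in place of real decoding. The `check_every` parameter is the knob for the balance mentioned above: reading the clock costs something, so it is consulted only once per `check_every` items.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

enum class DecodeResult { Done, TimedOut };

// Hypothetical decoder: sums the input bytes into `checksum`, giving up once
// `deadline` has passed. `check_every` must be greater than zero.
DecodeResult decode(const std::vector<unsigned char>& input,
                    std::chrono::steady_clock::time_point deadline,
                    std::size_t check_every,
                    unsigned long& checksum) {
    checksum = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        // Only look at the clock on every check_every-th item.
        if (i % check_every == 0 &&
            std::chrono::steady_clock::now() >= deadline) {
            return DecodeResult::TimedOut;
        }
        checksum += input[i];  // stand-in for one unit of decoding work
    }
    return DecodeResult::Done;
}
```

A caller would compute the deadline once, for example `std::chrono::steady_clock::now() + std::chrono::milliseconds(50)`, and pass it down. `steady_clock` is used so that wall-clock adjustments cannot shorten or lengthen the timeout. The same code works on Windows and Linux.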
One way is to set a flag on timeout to instruct the thread instance to not report any completion, not continue and to delete/terminate itself ASAP. Reduce its priority to the lowest possible and forget about it. Create another thread object immediately, overwriting the old instance value, and use the new thread instance for subsequent decoding.
The lowest-priority orphaned thread will eventually die off itself when it finally gets around to checking its suicide-flag.

Multithreading: a blocking wait with timeout

I'm using TinyThread++ to get clean and simple platform independent control over threading features in my project. I just came upon a situation where I'd like to have responsive synchronized message passing without pegging the CPU, while allowing a thread to continue to do a bit of work on the side while it is idle. Sure, I could simply spawn a third thread to do this "other work" but all I'm missing is a condition variable wait(int ms) type function rather than the wait() that already works great. The idea is that I'd like for it to block only for up to ms milliseconds, so it will be able to time out and perform some actions periodically (during which the thread will not be actively waiting on the condition variable). The idea is that even though it's nice to have the thread sitting there waiting to pounce on any incoming messages, if I give it some task to do on the side which takes only 50 microseconds to execute, and I only need to run that once every second, it definitely shouldn't push me to make yet another thread (and message queue and other resources) to get it done.
Does any of this make sense? I'm looking for suggestions on how I might go about implementing this. I'm hoping that adding a couple of lines to the TinyThread code can provide me with this functionality.
Well, the source code for the wait function isn't very complicated, so making the required modifications looks simple enough:
The Linux implementation relies on the pthread_cond_wait function, which can trivially be changed to the pthread_cond_timedwait function. Do read the documentation carefully in case I forgot about any details.
On the Windows side of things, it's a little more complicated, and I'm no expert on multithreading on Windows. That being said, if there's a timed version of the _wait function (I'm pretty sure there is), changing that should work just fine. Again, read over the documentation carefully before making any modifications.
Now, before you go off and make these modifications: I don't think what you're trying to do is a good idea. The main advantage of using threads is to conceptually separate different tasks. Trying to do multiple things in a single thread is a bit like trying to do multiple things in a single function: it complicates the design and makes things harder to debug. So unless the overhead of creating a new thread is provably too great, or unless the resulting code remains simple and easy to understand, I'd split it up into multiple threads.
Finally, I get the feeling that you might not be aware that condition variables can return spuriously (returns without anybody having done any signalling or returns when the condition is still false). So just in case, I'd suggest reviewing the usage examples and making sure you understand why those loops are there.
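For the POSIX side, a timed wait with the predicate loop might look roughly like this. It is written against plain pthreads, not against TinyThread++ internals; `Mailbox`, `take_message`, and `post_message` are names invented for the sketch.

```cpp
#include <pthread.h>
#include <cerrno>
#include <ctime>

struct Mailbox {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int pending;

    Mailbox() : pending(0) {
        pthread_mutex_init(&mutex, nullptr);
        pthread_cond_init(&cond, nullptr);
    }
    ~Mailbox() {
        pthread_cond_destroy(&cond);
        pthread_mutex_destroy(&mutex);
    }
    Mailbox(const Mailbox&) = delete;
    Mailbox& operator=(const Mailbox&) = delete;
};

// Blocks until a message is pending or `timeout_ms` has elapsed.
// Returns true and consumes one message, or false on timeout.
bool take_message(Mailbox& box, long timeout_ms) {
    // pthread_cond_timedwait takes an absolute deadline, by default measured
    // against CLOCK_REALTIME.
    timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&box.mutex);
    int rc = 0;
    // Loop: a wakeup may be spurious, so re-check the condition each time.
    // Any non-zero rc (ETIMEDOUT or an error) ends the wait.
    while (box.pending == 0 && rc == 0)
        rc = pthread_cond_timedwait(&box.cond, &box.mutex, &deadline);
    const bool got = box.pending > 0;
    if (got) --box.pending;
    pthread_mutex_unlock(&box.mutex);
    return got;
}

void post_message(Mailbox& box) {
    pthread_mutex_lock(&box.mutex);
    ++box.pending;
    pthread_mutex_unlock(&box.mutex);
    pthread_cond_signal(&box.cond);
}
```

A thread that wants to do its side task once a second could call `take_message(box, 1000)` in a loop and run the side task whenever it returns false. Because the deadline is absolute, a spurious wakeup does not restart the timeout.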

Is there a way to abort an SQLite call?

I'm using SQLite3 in a Windows application. I have the source code (so-called SQLite amalgamation).
Sometimes I have to execute heavy queries. That is, I call sqlite3_step on a prepared statement, and it takes a lot of time to complete (due to the heavy I/O load).
I wonder if there's a possibility to abort such a call. I would also be glad if there was an ability to do some background processing in the middle of the call within the same thread (since most of the time is spent in waiting for the I/O to complete).
I thought about modifying the SQLite code myself. In the simplest scenario I could check some condition (like an abort event handle for instance) before every invocation of either ReadFile/WriteFile, and return an error code appropriately. And in order to allow the background processing the file should be opened in the overlapped mode (this enables asynchronous ReadFile/WriteFile).
Is there a chance that interruption of WriteFile may in some circumstances leave the database in the inconsistent state, even with the journal enabled? I guess not, since the whole idea of the journal file is to be prepared for any error of any kind. But I'd like to hear more opinions about this.
Also, has anyone tried something similar?
Thanks in advance.
EDIT:
Thanks to ereOn. I wasn't aware of the existence of sqlite3_interrupt. This probably answers my question.
Now, for all of you who wonder how (and why) one expects to do some background processing during the I/O within the same thread.
Unfortunately not many people are familiar with so-called "Overlapped I/O".
http://en.wikipedia.org/wiki/Overlapped_I/O
Using it one issues an I/O operation asynchronously, and the calling thread is not blocked. Then one receives the I/O completion status using one of the completion mechanisms: waitable event, new routine queued into the APC, or the completion port.
Using this technique one doesn't have to create extra threads. Actually, the only real justification for creating threads is when your bottleneck is computation time (i.e. CPU load) and the machine has several CPUs (or cores).
And creating a thread just to let it be blocked by the OS most of the time doesn't make sense. It wastes OS resources for no reason and complicates the program (need for synchronization, etc.).
Unfortunately not all libraries/APIs allow an asynchronous mode of operation, which makes creating extra threads a necessary evil.
EDIT2:
I've already found the solution, thanks to ereOn.
For all those who nevertheless insist that it's not worth doing things "in background" while "waiting" for the I/O to complete using overlapped I/O: I disagree, and I think there's no point in arguing about this. At least it is not related to the subject.
I'm a Windows programmer (as you may have noticed), and I have very extensive experience in all kinds of multitasking. Plus I'm also a driver writer, so I also know how things work "behind the scenes".
I know that it's a "common practice" to create several threads to do several things "in parallel". But this doesn't mean that this is a good practice. Please allow me not to follow the "common practice".
I don't understand why you want the interruption to come from the same thread, and I don't even understand how that would be possible: if the current thread is blocked, waiting for some I/O, you can't execute any other code. (Yeah, that's what "blocked" means.)
Perhaps if you give us more hints about why you want this, we might help further.
Usually, I use sqlite3_interrupt() to cancel calls. But this, obviously, requires that the call is made from another thread.
By default, SQLite is threadsafe. It sounds to me like the easiest thing to do would be to start the SQLite command on a background thread, and let SQLite do the necessary locking to make that work.
From your perspective then, the sqlite call looks like an asynchronous bit of I/O, and you can continue normal processing on this thread, such as e.g. using a loop including interruptible sleep and a bit of occasional background processing (e.g. to update a liveness indicator). When the SQLite statement completes, the background thread should set a state variable to indicate this, wake the main thread (if necessary), and terminate.
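The pattern described above could be sketched with `std::async` and a future, which together play the role of the background thread, the state variable, and the wake-up. `slow_query` is a hypothetical stand-in for the loop around sqlite3_step; no SQLite calls are made here.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for a long-running query; returns a fake row count.
int slow_query(std::chrono::milliseconds duration) {
    std::this_thread::sleep_for(duration);
    return 42;
}

// Runs the query on a background thread. The calling thread wakes up every
// `poll` interval to run `idle_work` until the result is ready.
template <typename IdleWork>
int run_with_idle_work(std::chrono::milliseconds query_time,
                       std::chrono::milliseconds poll,
                       IdleWork idle_work) {
    std::future<int> result =
        std::async(std::launch::async, slow_query, query_time);
    // wait_for is the interruptible sleep: it returns early once the
    // background thread has produced its result.
    while (result.wait_for(poll) != std::future_status::ready)
        idle_work();  // e.g. update a liveness indicator
    return result.get();
}
```

In the real program, the idle-work step is also a natural place to decide that the query has taken too long and to call sqlite3_interrupt() on the connection from this thread.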