I have created a class which reads a file, does some operations on the contents, and saves a new file with a timestamp. However, I need the code to check once every minute whether the file is present and, if it is, process it. It needs to work cross-platform.
I am a novice in C++ and need to know what approach to follow for this. Do I need to create a process or something? I am completely blank.
class inputHandler
{
public:
void readInput();
void performTask();
void saveFile();
};
Since the code implementation is too large, I am just posting the structure. I am ready to spend time on this, so I need a sample or tutorial which can guide me to achieve this.
This is not addressed by the C++ standard. Thus, you'll have to implement code for each supported system, or use a library.
As far as I understand, the most general solution is to create a thread which loops every minute, checking file timestamps. Naturally, depending on your code, you could do it another way, avoiding threads altogether. Using a notification system such as inotify (Linux-only) could be much better. Also, you could use alarm() on POSIX-compatible systems, being signalled whenever a minute has passed.
Anyway, if you go with the thread solution, in POSIX-compatible systems, check out pthread_create() and stat(). In Windows, check out CreateThread() and GetFileTime(). To have a one-minute delay, sleep(60) (which takes seconds) or Sleep(60000) (which takes milliseconds) respectively should do the trick.
Just to clarify, "to create a process" is systems-programming jargon meaning roughly "to launch a new program" (or "thread", sometimes). In that sense, if you follow the above you'll be creating a new thread.
The simple part is checking if a file exists: when you open an std::ifstream it will be in a good state only if the file exists and could be opened:
std::ifstream in(filename);
if (in) {
// the file exists and can be processed here
}
The more interesting part is to do something at regular intervals. The basic idea is to set up a timer in some form. Depending on whether anything else needs to be done you may need a separate thread: if the program just waits until the file exists and doesn't do anything in the meantime, you can just sleep and there is no need to spawn another thread. Otherwise, you probably want to spawn a thread which is just sleeping.
Assuming you need to use a separate thread, you probably want to be able to interrupt its wait, e.g., to exit cleanly when another thread asks it to. Thus, I would use a condition variable with a timed wait, i.e., something like this:
std::mutex guard;
std::condition_variable condition;
bool done(false);
std::unique_lock<std::mutex> lock(guard);
while (!done) {
condition.wait_for(lock, std::chrono::minutes(n));
if (!done) {
do_whatever_needs_to_be_done_once_every_n_minutes();
}
}
The code above uses C++ 2011 facilities. If you can't use the corresponding classes, you can use suitable alternatives, e.g., the Boost classes.
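Putting the two parts together, here is a minimal sketch of a background poller built on the same timed wait. The class name FilePoller, its constructor parameters and the callback are made up for illustration; they are not from the question's code or from any library. The sketch assumes "file is present" simply means the file can be opened for reading.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <fstream>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper: every `interval`, checks whether `path` can be opened
// and, if so, calls `on_found`. Stops promptly when stop() is called.
class FilePoller {
public:
    FilePoller(std::string path, std::chrono::milliseconds interval,
               std::function<void(const std::string&)> on_found)
        : path_(std::move(path)), interval_(interval),
          on_found_(std::move(on_found)) {
        assert(on_found_ && "a callback is required");
        worker_ = std::thread([this] { run(); });
    }

    ~FilePoller() { stop(); }

    // Wakes the worker immediately instead of waiting out the interval.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(guard_);
            done_ = true;
        }
        condition_.notify_all();
        if (worker_.joinable()) worker_.join();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(guard_);
        // wait_for returns true once done_ is set and false on a timeout.
        while (!condition_.wait_for(lock, interval_, [this] { return done_; })) {
            lock.unlock();  // do not hold the mutex while processing
            std::ifstream in(path_);
            if (in) on_found_(path_);
            lock.lock();
        }
    }

    std::string path_;
    std::chrono::milliseconds interval_;
    std::function<void(const std::string&)> on_found_;
    std::mutex guard_;
    std::condition_variable condition_;
    bool done_ = false;
    std::thread worker_;  // declared last so it starts after the other members exist
};
```

For the question's case you would construct it with std::chrono::minutes(1) and a callback that calls readInput(), performTask() and saveFile() in turn. The callback probably also needs to move or rename the input file afterwards, otherwise the same file is processed again on the next tick.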
Recently I started learning C++11. I only studied C/C++ for a brief period of time when I was in college. I come from another ecosystem (web development), so as you can imagine I'm relatively new to C++.
At the moment I'm studying threads and how I could accomplish logging from multiple threads with a single writer (file handle). So I wrote the following code based on tutorials and reading various articles.
My First question and request would be to point out any bad practices / mistakes that I have overlooked (although the code works with VC 2015).
Secondly and this is what is my main concern is that I'm not closing the file handle, and I'm not sure If that causes any issues. If it does when and how would be the most appropriate way to close it?
Lastly and correct me if I'm wrong I don't want to "pause" a thread while another thread is writing. I'm writing line by line each time. Is there any case that the output messes up at some point?
Thank you very much for your time. Below is the source (currently, for learning purposes, everything is inside main.cpp).
#include <iostream>
#include <fstream>
#include <memory> // std::shared_ptr, std::make_shared
#include <thread>
#include <string>
static const int THREADS_NUM = 8;
class Logger
{
public:
Logger(const std::string &path) : filePath(path)
{
this->logFile.open(this->filePath);
}
void write(const std::string &data)
{
this->logFile << data;
}
private:
std::ofstream logFile;
std::string filePath;
};
void spawnThread(int tid, std::shared_ptr<Logger> &logger)
{
std::cout << "Thread " + std::to_string(tid) + " started" << std::endl;
logger->write("Thread " + std::to_string(tid) + " was here!\n");
};
int main()
{
std::cout << "Master started" << std::endl;
std::thread threadPool[THREADS_NUM];
auto logger = std::make_shared<Logger>("test.log");
for (int i = 0; i < THREADS_NUM; ++i)
{
threadPool[i] = std::thread(spawnThread, i, logger);
threadPool[i].join();
}
return 0;
}
PS1: In this scenario there will always be only 1 file handle open for threads to log data.
PS2: The file handle ideally should close right before the program exits... Should it be done in Logger destructor?
UPDATE
The current output with 1000 threads is the following:
Thread 0 was here!
Thread 1 was here!
Thread 2 was here!
Thread 3 was here!
.
.
.
.
Thread 995 was here!
Thread 996 was here!
Thread 997 was here!
Thread 998 was here!
Thread 999 was here!
I don't see any garbage so far...
My First question and request would be to point out any bad practices / mistakes that I have overlooked (although the code works with VC 2015).
Subjective, but the code looks fine to me, although you are not synchronizing the threads (a std::mutex in the logger would do the trick).
Also note that this:
std::thread threadPool[THREADS_NUM];
auto logger = std::make_shared<Logger>("test.log");
for (int i = 0; i < THREADS_NUM; ++i)
{
threadPool[i] = std::thread(spawnThread, i, logger);
threadPool[i].join();
}
is pointless. You create a thread, join it and then create a new one. I think this is what you are looking for:
std::vector<std::thread> threadPool;
auto logger = std::make_shared<Logger>("test.log");
// create all threads
for (int i = 0; i < THREADS_NUM; ++i)
threadPool.emplace_back(spawnThread, i, logger);
// after all are created join them
for (auto& th: threadPool)
th.join();
Now you create all threads and then wait for all of them. Not one by one.
Secondly and this is what is my main concern is that I'm not closing the file handle, and I'm not sure If that causes any issues. If it does when and how would be the most appropriate way to close it?
And when do you want to close it? After each write? That would be redundant work for the OS with no real benefit. The file is supposed to stay open for the program's entire lifetime, so there is no reason to close it manually at all. On a graceful exit the std::ofstream destructor runs and closes the file. On a non-graceful exit the OS will close all remaining handles anyway.
Flushing a file's buffer (possibly after each write?) would be helpful though.
Lastly and correct me if I'm wrong I don't want to "pause" a thread while another thread is writing. I'm writing line by line each time. Is there any case that the output messes up at some point?
Yes, of course. You are not synchronizing writes to the file, the output might be garbage. You can actually easily check it yourself: spawn 10000 threads and run the code. It's very likely you will get a corrupted file.
There are many different synchronization mechanisms. But all of them are either lock-free or lock-based (or possibly a mix). Anyway a simple std::mutex (basic lock-based synchronization) in the logger class should be fine.
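As a minimal sketch of that suggestion, here is the question's Logger with a std::mutex added. The member name writeMutex and the helper logFromThreads are my own names, added only to show the create-all-then-join pattern next to the lock.

```cpp
#include <cassert>
#include <fstream>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

class Logger {
public:
    explicit Logger(const std::string& path) : logFile(path) {
        if (!logFile) throw std::runtime_error("cannot open " + path);
    }

    // Only one thread at a time can be inside write(), so lines never interleave.
    void write(const std::string& data) {
        std::lock_guard<std::mutex> lock(writeMutex);
        logFile << data;
    }
    // No explicit close: the std::ofstream destructor flushes and closes the file.

private:
    std::mutex writeMutex;
    std::ofstream logFile;
};

// Hypothetical driver: starts all threads first, then joins them all.
void logFromThreads(Logger& logger, int threadCount) {
    assert(threadCount >= 0);
    std::vector<std::thread> pool;
    for (int i = 0; i < threadCount; ++i)
        pool.emplace_back([&logger, i] {
            logger.write("Thread " + std::to_string(i) + " was here!\n");
        });
    for (std::thread& t : pool) t.join();
}
```

The logger is passed by plain reference here because it outlives every thread that uses it. The order of the lines in the file is still arbitrary; the mutex only guarantees that each line is written whole.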
The first massive mistake is saying "it works with MSVC, I see no garbage", even more so as it only works because your test code is broken (well, it's not broken, but it's not concurrent, so of course it works fine).
But even if the code were concurrent, saying "I don't see anything wrong" is a terrible mistake. Multithreaded code is never correct just because you see nothing wrong; it is incorrect unless proven correct.
The goal of not blocking ("pausing") one thread while another is writing is unachievable if you want correctness, at least if they concurrently write to the same descriptor. You must synchronize properly (call it any way you like, and use any method you like), or the behavior will be incorrect. Or worse, it will look correct for as long as you look at it, and it will behave wrongly six months later when your most important customer uses it for a multi-million dollar project.
Under some operating systems, you can "cheat" and get away without synchronization as these offer syscalls that have atomicity guarantees (e.g. writev). That is however not what you may think, it is indeed heavyweight synchronization, only just you don't see it.
A better (more efficient) strategy than using a mutex or atomic writes might be to have a single consumer thread which writes to disk, and to push log tasks onto a concurrent queue from as many producer threads as you like. This gives minimum latency for the threads that you don't want to block, and puts the blocking where you don't care about it. Plus, you can coalesce several small writes into one.
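A rough sketch of that single-consumer idea, using a std::queue guarded by a mutex and a condition variable rather than a lock-free queue (that substitution is my simplification). The class name AsyncLogger and all member names are invented for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

class AsyncLogger {
public:
    explicit AsyncLogger(const std::string& path) : out_(path) {
        writer_ = std::thread([this] { drain(); });
    }

    // Lets the writer finish whatever is still queued, then joins it.
    ~AsyncLogger() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closing_ = true;
        }
        ready_.notify_one();
        writer_.join();
    }

    // Producers hold the lock only long enough to enqueue one string.
    void write(std::string line) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closing_ && "write() called during shutdown");
            pending_.push(std::move(line));
        }
        ready_.notify_one();
    }

private:
    void drain() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            ready_.wait(lock, [this] { return closing_ || !pending_.empty(); });
            if (pending_.empty()) return;  // closing and nothing left to write
            std::queue<std::string> batch;
            batch.swap(pending_);          // take everything queued so far
            lock.unlock();                 // disk I/O happens outside the lock
            while (!batch.empty()) {
                out_ << batch.front();
                batch.pop();
            }
            out_.flush();                  // one flush per batch coalesces small writes
            lock.lock();
        }
    }

    std::ofstream out_;
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> pending_;
    bool closing_ = false;
    std::thread writer_;  // declared last so it starts after the other members exist
};
```

Producers still take a lock, but only for the time it takes to push a string; the slow part, the file write, happens on the writer thread alone.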
Closing or not closing a file seems like a non-issue. After all, when the program exits, files are closed anyway. Well yes, except, there are three layers of caching (four actually if you count the physical disk's caches), two of them within your application and one within the operating system.
When data has made it at least into the OS buffers, all is good unless power fails unexpectedly. Not so for the other two levels of cache!
If your process dies unexpectedly, its memory will be released, which includes anything cached within iostream and anything cached within the CRT. So if you need any amount of reliability, you will either have to flush regularly (which is expensive), or use a different strategy. File mapping may be such a strategy, because whatever you copy into the mapping is automatically (by definition) within the operating system's buffers, and unless power fails or the computer explodes, it will be written to disk.
That being said, there exist dozens of free and readily available logging libraries (such as e.g. spdlog) which do the job very well. There's really not much of a reason to reinvent this particular wheel.
Hello and welcome to the community!
A few comments on the code, and a few general tips on top of that.
Don't use native arrays if you do not absolutely have to.
Eliminating the native std::thread[] array and replacing it with a std::array would allow you to use a range-based for loop, which is the preferred way of iterating over things in C++. A std::vector would also work, since you have to generate the threads anyway (which you can do with std::generate_n in combination with std::back_inserter).
Don't use smart pointers if you do not have specific memory management requirements, in this case a reference to a stack allocated logger would be fine (the logger would probably live for the duration of the program anyway, hence no need for explicit memory management). In C++ you try to use the stack as much as possible, dynamic memory allocation is slow in many ways and shared pointers introduce overhead (unique pointers are zero cost abstractions).
The join in the for loop is probably not what you want, it will wait for the previously spawned thread and spawn another one after it is finished. If you want parallelism you need another for loop for the joins, but the preferred way would be to use std::for_each(begin(pool), end(pool), [](auto& thread) { thread.join(); }) or something similar.
Use the C++ Core Guidelines and a recent C++ standard (C++17 is the current one). C++11 is old, and you probably want to learn the modern style instead of learning how to write legacy code. http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines
C++ is not java, use the stack as much as possible - this is one of the biggest advantages to using C++. Make sure you understand how the stack, constructors and destructors work by heart.
The first question is subjective, so someone else may want to give advice, but I don't see anything awful.
Nothing in C++ standard library is thread-safe except for some rare cases. A good answer on using ofstream in a multithreaded environment is given here.
Not closing a file is indeed an issue. You have to get familiar with RAII as it is one of the first things to learn. The answer by Detonar is a good piece of advice.
I am sorry if this was asked before, but I didn't find anything related to this. This is for my own understanding; it's not homework.
I want to execute a function only for some amount of time. How do I do that? For example,
void func();
int main()
{
// ...
func();
// ...
}
void func()
{
// ...
}
Here, my main function calls another function. I want that function to execute for only one minute. In that function, I will be getting some data from the user. So, if the user doesn't enter the data, I don't want to be stuck in that function forever. Irrespective of whether the function has completed by that time or not, I want to come back to the main function and execute the next operation.
Is there any way to do it ? I am on windows 7 and I am using VS-2013.
Under windows, the options are limited.
The simplest option would be for func() to explicitly and periodically check how long it has been executing (e.g. store its start time, periodically check the amount of time elapsed since that start time) and return if it has gone on longer than you wish.
It is possible (C++11 or later) to execute the function within another thread, and for main() to signal that thread when the required time period has elapsed. That is best done cooperatively. For example, main() sets a flag, the thread function checks that flag and exits when required to. Such a flag is usually best protected by a critical section or mutex.
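A minimal sketch of the cooperative approach. It uses a std::atomic<bool> as the flag instead of a mutex-protected bool, which is my substitution; the function names and the 1 ms "unit of work" are placeholders.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical worker: does small units of work until asked to stop.
// Returns how many units it completed.
long work_until_stopped(const std::atomic<bool>& stop) {
    long units = 0;
    while (!stop.load()) {
        // Stand-in for one short, non-blocking step of the real work.
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        ++units;
    }
    return units;
}

// Runs the worker on its own thread and asks it to stop after `limit`.
long run_with_time_limit(std::chrono::milliseconds limit) {
    assert(limit.count() >= 0);
    std::atomic<bool> stop{false};
    long result = 0;
    std::thread worker([&] { result = work_until_stopped(stop); });
    std::this_thread::sleep_for(limit);  // main() waits out the time limit
    stop.store(true);                    // cooperative request, not a forced kill
    worker.join();                       // result is safe to read after the join
    return result;
}
```

Note that this only works if each step of the worker returns quickly. A thread blocked in a plain read from std::cin never gets to look at the flag, so for user input you would also need a non-blocking or timed way of reading.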
An extremely unsafe way under Windows is for main() to forcibly terminate the thread. That is unsafe, as it can leave the program (and, in worst cases, the operating system itself) in an unreliable state (e.g. if the terminated thread is in the process of allocating memory, executing certain kernel functions, or manipulating global state of a shared DLL).
If you want better/safer options, you will need a real-time operating system with strict memory and timing partitioning. To date, I have yet to encounter any substantiated documentation about any variant of Windows and unix (not even real time variants) with those characteristics. There are a couple of unix-like systems (e.g. LynxOS) with variants that have such properties.
I think a part of your requirement can be met using multithreading and a loop with a stopwatch.
Create a new thread.
Start a stopwatch.
Start a loop with one minute as the condition for the loop.
During each iteration check if the user has entered the input and process.
when one minute is over, the loop quits.
I'm not sure about the feasibility of this idea; I'm just sharing it. I don't know much about C++, but in Node.js your requirement can be achieved using 'events'. Maybe something similar exists in C++ too.
I'm making a text-based RPG, and I'd really like to emulate time.
I could just make some time pass between each time the player types something, but I'd like it to be better than that if possible. I was wondering if multithreading would be a good way to do this.
I was thinking maybe just have a second, really simple thread in the background that just has a loop, looping every 1000ms. For every pass though its loop the world time would increase by 1 sec and the player would regenerate a bit of health and mana.
Is this something that multithreading could do, or is there some stuff I don't know about that would make this not work? (I'd prefer not to spend a bunch of time struggling to learn this if it's not going to help me with this project.)
Yes, multithreading could certainly do this, but be wary: threading is usually more complicated than the alternative (which would be the main thread polling various update events as part of its main loop, which should be running at least once every 100ms or so anyway).
In your case, if the clock thread follows pretty strict rules, you'll probably be "ok."
The clock thread is the only thread allowed to set/modify the time variables.
The main/ui thread is only allowed to read the time.
You must still use a system time function, since the thread sleep functions cannot be trusted for accuracy (depending on system activity, the thread's update loop may not run until some milliseconds after you requested it run).
If you implement it like that, then you won't even need to familiarize yourself with mutexes in order to get the thread up and running safely, and your time will be accurate.
But! Here's some food for thought: what if you want to bind in-game triggers at specific times of the day? For example, a message that would be posted to the user "The sun has set" or similar. The code needed to do that will need to be running on the main thread anyway (unless you want to implement cross-thread message communication queues!), and will probably look an awful lot like basic periodic-check-and-update-clock code. So at that point you would be better off just keeping a simple unified thread model anyway.
I usually use a class named Simulation to step forward time. I don't have it in C++ but I've done threading in Java that is stepping time forward and activating events according to schedule (or a random event at a planned time). You can take this and translate to C++ or use to see how an object-oriented implementation is.
package adventure;
public class Simulation extends Thread {
private PriorityQueue prioQueue;
Simulation() {
prioQueue = new PriorityQueue();
start();
}
public void wakeMeAfter(Wakeable SleepingObject, double time) {
prioQueue.enqueue(SleepingObject, System.currentTimeMillis() + time);
}
public void run() {
while (true) {
try {
sleep(5);
if (prioQueue.getFirstTime() <= System.currentTimeMillis()) {
((Wakeable) prioQueue.getFirst()).wakeup();
prioQueue.dequeue();
}
} catch (InterruptedException e) {
}
}
}
}
To use it, you just instantiate it and add your objects:
Simulation sim = new Simulation();
// Load images to be used as appearance-parameter for persons
Image studAppearance = loadPicture("Person.gif");
// --- Add new persons here ---
new WalkingPerson(sim, this, "Peter", studAppearance);
I'm going to assume that your program currently spends the majority of its time waiting for user input, which blocks your main thread irregularly and for relatively long periods of time, preventing you from having short time-dependent updates. I'm also assuming that you want to avoid complicated solutions (threading).
If you want to access the time in the main thread, accessing it without a separate thread is relatively easy (look at the example).
If you don't need to do anything in the background while waiting for user input, couldn't you write a function to calculate the new value, based on the amount of time that has passed while waiting? You can have some variable LastSystemTimeObserved that gets updated every time you need to use one of your time-dependent variables, by calling some function that calculates the variable's changed value based on how much time has passed since it was last called, instead of recalculating values every second.
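Here is a small sketch of that "catch up when asked" idea. The struct Regenerating and its fields are made-up names; lastObserved plays the role of LastSystemTimeObserved, and the regeneration rule (linear, capped at a maximum) is just an assumption for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Hypothetical player stat (health, mana, ...) that regenerates over time.
struct Regenerating {
    double value;
    double maximum;
    double perSecond;
    std::chrono::steady_clock::time_point lastObserved;

    // Brings `value` up to date for all the time that passed since the
    // previous call, however long the player took to type something.
    double update(std::chrono::steady_clock::time_point now) {
        assert(now >= lastObserved);
        const std::chrono::duration<double> elapsed = now - lastObserved;
        value = std::min(maximum, value + perSecond * elapsed.count());
        lastObserved = now;
        return value;
    }
};
```

In the game loop you would call health.update(std::chrono::steady_clock::now()) right after each line of input is read and before the value is displayed or used. Passing the time in as a parameter also makes the rule easy to test without actually waiting.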
If you do make a separate thread, be sure that you properly protect any variables that are accessed by both threads.
Let's say that I have two libraries (A and B), and each has one function that listen on sockets. These functions use select() and they return some event immediately if the data has arrived, otherwise they wait for some time (timeout) and then return NULL:
A_event_t* A_wait_for_event(int timeout);
B_event_t* B_wait_for_event(int timeout);
Now, I use them in my program:
int main (int argc, char *argv[]) {
// Init A
// Init B
// .. do some other initialization
A_event_t *evA;
B_event_t *evB;
for(;;) {
evA = A_wait_for_event(50);
evB = B_wait_for_event(50);
// do some work based on events
}
}
Each library has its own sockets (e.g. udp socket) and it is not accessible from outside.
PROBLEM: This is not very efficient. If, for example, there are a lot of events waiting to be delivered by B_wait_for_event, these always have to wait until A_wait_for_event times out, which effectively limits the throughput of library B and my program.
Normally, one could use threads to separate the processing, BUT what if processing some event requires calling a function of the other library, and vice versa? Example:
if (evA != 0 && evA == A_EVENT_1) {
B_do_something();
}
if (evB != 0 && evB == B_EVENT_C) {
A_do_something();
}
So, even if I could create two threads and separate the functionality of the libraries, these threads would have to exchange events between them (probably through a pipe). This would still limit performance, because one thread would be blocked in its X_wait_for_event() function and could not immediately receive data from the other thread.
How to solve this?
This solution may not be available depending on the libraries you're using, but the best solution is not to call functions in individual libraries that wait for events. Each library should support hooking into an external event loop. Then your application uses a single loop which contains a poll() or select() call that waits on all of the events that all of the libraries you use want to wait for.
glib's event loop is good for this because many libraries already know how to hook into it. But if you don't use something as elaborate as glib, the normal approach is this:
Loop forever:
Start with an infinite timer and an empty set of file descriptors
For each library you use:
Call a setup function in the library which is allowed to add file descriptors to your set and/or shorten (but not lengthen) the timeout.
Run poll()
For each library you use:
Call a dispatch function in the library that responds to any events that might have occurred when the poll() returned.
Yes, it's still possible for an earlier library to starve a later library, but it works in practice.
If the libraries you use don't support this kind of setup & dispatch interface, add it as a feature and contribute the code upstream!
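The setup/poll/dispatch loop described above could look roughly like this on a POSIX system. The EventSource interface, the FdReader class and run_once are all invented for illustration; no real library is being described, and the 100 ms cap is an arbitrary example value.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>
#include <string>
#include <vector>

// Hypothetical interface a library would expose to join an external loop.
struct EventSource {
    virtual ~EventSource() = default;
    // May append descriptors and shorten (never lengthen) timeoutMs; -1 means infinite.
    virtual void setup(std::vector<pollfd>& fds, int& timeoutMs) = 0;
    // Looks at revents for its own descriptors and handles whatever is ready.
    virtual void dispatch(const std::vector<pollfd>& fds) = 0;
};

// Example source that collects bytes from one descriptor (a pipe or socket).
class FdReader : public EventSource {
public:
    explicit FdReader(int fd) : fd_(fd) { assert(fd >= 0); }

    void setup(std::vector<pollfd>& fds, int& timeoutMs) override {
        pollfd p{};
        p.fd = fd_;
        p.events = POLLIN;
        fds.push_back(p);
        // This source wants control back at least every 100 ms.
        if (timeoutMs < 0 || timeoutMs > 100) timeoutMs = 100;
    }

    void dispatch(const std::vector<pollfd>& fds) override {
        for (const pollfd& p : fds) {
            if (p.fd == fd_ && (p.revents & POLLIN)) {
                char buffer[256];
                const ssize_t n = read(fd_, buffer, sizeof buffer);
                if (n > 0) received.append(buffer, static_cast<std::string::size_type>(n));
            }
        }
    }

    std::string received;

private:
    int fd_;
};

// One pass of the loop: setup, a single poll() for everyone, then dispatch.
int run_once(const std::vector<EventSource*>& sources) {
    std::vector<pollfd> fds;
    int timeoutMs = -1;  // start with an infinite timer
    for (EventSource* s : sources) s->setup(fds, timeoutMs);
    const int ready = poll(fds.data(), static_cast<nfds_t>(fds.size()), timeoutMs);
    if (ready >= 0)      // also dispatch on timeout so sources can run timers
        for (EventSource* s : sources) s->dispatch(fds);
    return ready;
}
```

The point of the design is that there is exactly one blocking call, and it wakes up as soon as any library's descriptor is ready, so events for B no longer wait for A's timeout.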
(I'm moving this to an answer since it's getting too long for a comment)
If you are in a situation where you're not allowed to call A_do_something in one thread while another thread is executing A_wait_for_event (and similarly for B), then I'm pretty sure you can't do anything efficient, and have to settle between various evils.
The most obvious improvement is to immediately take action upon getting an event, rather than trying to read from both: i.e. order your loop
Wait for an A event
Maybe do something in B
Wait for a B event
Maybe do something in A
Other mitigations you could do are
Try to predict whether an A event or a B event is more likely to come next, and wait on that first. (e.g. if they come in streaks, then after getting and handling an A event, you should go back to waiting for another A event)
Fiddle with the timeout values to strike a balance between spin loops and too much blocking. (maybe even adjust dynamically)
EDIT: You might check the APIs for your library; they might already offer a way to deal with the problem. For example, they might allow you to register callbacks for events, and get notifications of events through the callback, rather than polling wait_for_event.
Another thing is if you can create new file descriptors for the library to listen on. e.g. If you create a new pipe and hand one end to library A, then if thread #1 is waiting for an A event, thread #2 can write to the pipe to make an event happen, thus forcing #1 out of wait_for_event. With the ability to kick threads out of the wait_for_event functions at will, all sorts of new options become available.
A possible solution is to use two threads that call the wait_for_event functions, plus a boost::condition_variable in the "main" thread, which "does something". A similar, though not identical, solution is here
I have a situation where 2 different processes(mine C++, other done by other people in JAVA) are a writer and a reader from some shared data file. So I was trying to avoid race condition by writing a class like this(EDIT:this code is broken, it was just an example)
class ReadStatus
{
bool canRead;
public:
ReadStatus()
{
if (filesystem::exists(noReadFileName))
{
canRead = false;
return;
}
ofstream noWriteFile;
noWriteFile.open (noWriteFileName.c_str());
if ( ! noWriteFile.is_open())
{
canRead = false;
return;
}
boost::this_thread::sleep(boost::posix_time::seconds(1));
if (filesystem::exists(noReadFileName))
{
filesystem::remove(noWriteFileName);
canRead= false;
return;
}
canRead= true;
}
~ReadStatus()
{
if (filesystem::exists(noWriteFileName))
filesystem::remove(noWriteFileName);
}
inline bool OKToRead()
{
return canRead;
}
};
usage:
ReadStatus readStatus; //RAII FTW
if ( ! readStatus.OKToRead())
return;
This is for one program ofc, other will have analogous class.
Idea is:
1. check if other program created his "I'm owner file", if it has break else go to 2.
2. create my "I'm the owner" file, check again if other program created his own, if it has delete my file and break else go to 3.
3. do my reading, then delete mine "I'm the owner file".
Please note that rare occurrences when they both don't read or write are OK, but the problem is that I still see a small chance of race conditions: theoretically the other program can check for the existence of my lock file, see that there isn't one, then I create mine, the other program creates his own, but before the FS creates his file I check again, and it isn't there, and then disaster occurs. This is why I added the one-second delay, but as a CS nerd I find it unnerving to have code like that running.
Ofc I don't expect anybody here to write me a solution, but I would be happy if someone does know a link to a reliable code that I can use.
P.S. It has to be files, cuz I'm not writing entire project and that is how it is arranged to be done.
P.P.S.: access to data file isn't reader,writer,reader,writer.... it can be reader,reader,writer,writer,writer,reader,writer....
P.P.S: other process is not written in C++ :(, so boost is out of the question.
On Unices the traditional way of doing pure filesystem-based locking is to use dedicated lock directories, created with mkdir() and removed with rmdir(), each of which is a single atomic system call. You avoid races by never explicitly testing for the existence of the lock. Instead you always just try to take the lock. So:
lock:
while mkdir(lockfile) fails
sleep
unlock:
rmdir(lockfile)
I believe this even works over NFS (which usually sucks for this sort of thing).
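In C++ on a POSIX system the lock/unlock pseudocode above could be sketched like this. The function names, the retry count and the pause are my own choices; the 0700 mode is just a reasonable default.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <string>
#include <thread>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// One attempt to take the lock. mkdir() either creates the directory or
// fails (EEXIST if someone else holds it), with no check-then-create gap.
bool try_lock_dir(const std::string& lockPath) {
    assert(!lockPath.empty());
    return mkdir(lockPath.c_str(), 0700) == 0;
}

// Retries until the lock is taken or `attempts` tries have failed.
bool lock_dir(const std::string& lockPath, int attempts,
              std::chrono::milliseconds pause) {
    for (int i = 0; i < attempts; ++i) {
        if (try_lock_dir(lockPath)) return true;
        if (errno != EEXIST) return false;  // a real error, e.g. permissions
        std::this_thread::sleep_for(pause);
    }
    return false;
}

// Releases the lock by removing the directory.
bool unlock_dir(const std::string& lockPath) {
    return rmdir(lockPath.c_str()) == 0;
}
```

Both programs must use the same lock path and the same rule: do the file access only after mkdir succeeded, and remove the directory afterwards. One drawback compared with real file locks is that a crashed process leaves the directory behind, so the lock stays taken until someone cleans it up.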
However, you probably also want to look into proper file locking, which is loads better; I use F_SETLK/F_UNLCK fcntl locks for this on Linux (note that these are different from flock locks, despite the name of the structure). This allows you to properly block until the lock is released. These locks also get automatically released if the app dies, which is usually a good thing. Plus, these will let you lock your shared file directly without having to have a separate lockfile. This, too, works on NFS.
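A minimal sketch of whole-file fcntl locking on a POSIX system. The two helper names are invented; F_SETLKW is the blocking variant of F_SETLK, which I use here so the caller waits until the lock is free.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Blocks until the whole file behind `fd` is locked.
// `type` is F_RDLCK for a shared (reader) lock or F_WRLCK for an exclusive one.
bool lock_whole_file(int fd, short type) {
    assert(type == F_RDLCK || type == F_WRLCK);
    struct flock fl {};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;  // a length of 0 means "through the end of the file"
    return fcntl(fd, F_SETLKW, &fl) == 0;
}

// Releases any lock this process holds on the file.
bool unlock_whole_file(int fd) {
    struct flock fl {};
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    return fcntl(fd, F_SETLK, &fl) == 0;
}
```

These locks are advisory and belong to the process, so they only help if the other program takes the same kind of lock on the same file before touching it. Whether the Java side's locking maps onto fcntl locks on your platform is something you would have to verify.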
Windows has very similar file locking functions, and it also has easy to use global named semaphores that are very convenient for synchronisation between processes.
As far as I've seen it, you can't reliably use files as locks for multiple processes. The problem is, while you create the file in one thread, you might get an interrupt and the OS switches to another process because I/O is taking so long. The same holds true for deletion of the lock file.
If you can, take a look at Boost.Interprocess, under the synchronization mechanisms part.
While I'm generally against making API calls which can throw from a constructor/destructor (see the docs on boost::filesystem::remove), or making throwing calls without a catch block in general, that's not really what you were asking about.
You could check out the Overlapped IO library if this is for windows. Otherwise have you considered using shared memory between the processes instead?
Edit: Just saw the other process was Java. You may still be able to create a named mutex that can be shared between processes and use that to create locks around the file IO bits so they have to take turns writing. Sorry, I don't know Java, so I have no idea if that's more feasible than shared memory.