Append to and read one file at the same time - C++

I have one file that is updated every second: I append some lines to the end of it, and another thread reads it continuously. So I have two pointers to this file for this work. Is that possible?
(I use two while(1) loops, one in each of two functions, for updating and reading.)
Thanks.

Here's a good example of reading a single file with multiple threads: Multiple threads reading a single file
You could start from there.
As @MatsPetersson said, you have to be really sure of what you're doing in each thread. If you don't want to read incomplete data, you need to make sure the other thread is not writing to the file at that moment. There are several ways of doing this; for example, you can use a mutex, a signal, or a bool in a shared memory segment.
I think in your case, even if it's not explicit, you need to read only when no other thread is writing. For this I would recommend a mutex. Here's the doc: Mutex function documentation.
So we have readThread and writeThread. Here's pseudo-code for how you could treat your problem:
main(){
    putTheMutexTo(1);
}
readThread(){
    consumeMutex(1);
    openTheFile();
    readTheFile();
    closeTheFile();
    loadMutex(1);
}
writeThread(){
    consumeMutex(1);
    openTheFile();
    writeTheFile();
    closeTheFile();
    loadMutex(1);
}
But if you don't really know how a mutex works, don't start coding right away; read some documentation first, because this is a bit complex to understand when you start.
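For C++11 and later, the pseudo-code above could be sketched roughly as follows. This is only an illustration under assumptions: the names fileMutex, appendLine and readLines are made up for the example, and both threads are assumed to live in the same process (a std::mutex does nothing across processes).

```cpp
#include <fstream>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical shared state: one mutex guards every access to the file.
std::mutex fileMutex;

// Append one line. The stream is opened and closed while the lock is held,
// so a reader in this process never sees a half-written line.
void appendLine(const std::string& path, const std::string& line) {
    std::lock_guard<std::mutex> lock(fileMutex);
    std::ofstream out(path, std::ios::app);
    out << line << '\n';
}   // out is flushed and closed first, then the mutex is released

// Read every line currently in the file.
std::vector<std::string> readLines(const std::string& path) {
    std::lock_guard<std::mutex> lock(fileMutex);
    std::ifstream in(path);
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); ) {
        lines.push_back(line);
    }
    return lines;
}
```

The writer thread would call appendLine in its loop and the reader thread would call readLines in its own loop. Each call holds the mutex only for the duration of one open/access/close cycle.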

Related

Trying to understand read write lock and the need for two mutex instead of one

I have been reading this wiki article about read/write locks, and it says that implementing one requires two mutexes. Why can't we do it with one mutex? I am wondering if my understanding is correct:
we need to make sure only one write is happening at any moment
we need to make sure no one is reading while it's being written to
many reads can happen concurrently, and that's completely fine
Write:
// wait until no one is reading concurrently
if (wait(write_lock))
{
    acquire(write_lock)
    // Do the write operation
    release(write_lock)
}
Read:
// wait until no one is currently writing
if (wait(write_lock))
{
    // Do the read operation atomically
}
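For comparison, the three requirements listed above are what C++17's std::shared_mutex offers out of the box. The sketch below uses that library type; it is not the two-mutex construction from the wiki article, and the SharedText class is a hypothetical example.

```cpp
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical shared value protected by a reader/writer lock.
class SharedText {
public:
    void write(const std::string& value) {
        // Exclusive lock: no other writer and no reader can hold the mutex.
        std::unique_lock<std::shared_mutex> lock(mutex_);
        text_ = value;
    }

    std::string read() const {
        // Shared lock: any number of readers may hold it at the same time.
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return text_;
    }

private:
    mutable std::shared_mutex mutex_;
    std::string text_;
};
```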

Can you have multiple "cursors" for the same ifstream ? Would that be thread-safe?

I have multiple threads, and I want each of them to process a part of my file. Can I have a single ifstream object for that and make them concurrently read different parts? The parts are non-overlapping, so the same line will not be processed by two threads. If yes, how do I get multiple cursors?
A single std::ifstream is associated with exactly one cursor (there's a seekg and tellg method associated with the std::ifstream directly).
If you want the same std::ifstream object to be shared across multiple threads, you'll need some sort of synchronization mechanism between the threads, which might defeat the purpose (in each thread, you'll have to lock, seek, read and unlock each time).
To solve your problem, you can open one std::ifstream to the same file per thread. In each thread, you'd seek to whatever position you want to start reading from. This would only require you to be able to "easily" compute the seek position for each thread though (Note: this is a pretty strong requirement).
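A minimal sketch of the one-stream-per-thread approach, assuming the file is not modified while it is being read and that splitting by byte ranges is acceptable. The helper names readChunk and readInParallel are made up for this example, and parts must be greater than zero. Aligning the ranges to line boundaries is the harder part and is not handled here.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

// Read up to `length` bytes starting at `offset`, using a stream that is
// private to the caller and therefore has its own cursor.
std::string readChunk(const std::string& path, std::streamoff offset, std::size_t length) {
    std::ifstream in(path, std::ios::binary);
    in.seekg(offset);
    std::string chunk(length, '\0');
    in.read(&chunk[0], static_cast<std::streamsize>(length));
    chunk.resize(static_cast<std::size_t>(in.gcount()));  // fewer bytes if EOF was reached
    return chunk;
}

// Split [0, fileSize) into `parts` byte ranges and read them concurrently.
std::vector<std::string> readInParallel(const std::string& path,
                                        std::size_t fileSize,
                                        std::size_t parts) {
    std::vector<std::string> chunks(parts);
    std::vector<std::thread> workers;
    const std::size_t step = (fileSize + parts - 1) / parts;  // ceiling division
    for (std::size_t i = 0; i < parts; ++i) {
        // Each worker writes only to its own slot, so no locking is needed.
        workers.emplace_back([&, i] {
            chunks[i] = readChunk(path, static_cast<std::streamoff>(i * step), step);
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return chunks;
}
```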
C++ file streams are not guaranteed to be thread safe (see e.g. this answer).
The typical solution is anyway to open separate streams on the same file, each instance comes with their own "cursor". However, you need to ensure shared access, and concurrency becomes platform specific.
For ifstream (i.e. only reading from the file), the concurrency issues are usually tame. Even if someone else modifies the file, both streams might see different content, but you do have some kind of eventual consistency.
Reads and writes are usually not atomic, i.e. you might read only part of a write. Writes might not even execute in the order they are issued (see write combining).
Looking at the FILE struct, it seems there is a pointer inside FILE, char* curp, pointing to the current active position, which may mean that each FILE object has its own position in the file.
That is C, though, and I don't know how ifstream works or whether it uses a FILE object or is built like one. It might not help you at all, but I thought this bit of information was interesting to share and might help someone.

Where to put the mutex in a logging class?

First off, it's been a while since I've used any sort of mutex or semaphore, so go easy on me.
I have implemented a generic logging class that right now only receives a message from other classes and prepends that message with date/time and the level of debug, and then prints the message to stdout.
I would like to implement some sort of queue or buffer that will hold many messages that are sent to the logging class and then write them to a file.
The problem that I'm running into is I can't decide how/where to protect the queue.
Below is some pseudo-code of what I've come up with so far:
logMessage(char *msg, int debugLevel){
    formattedMsg = formatMsg(msg, debugLevel) //formats the msg to include date/time & debugLevel
    lockMutex()
    queue.add(formattedMsg)
    unlockMutex()
}
writeToFile(){
    if (isMessageAvailable()) { //would check to see if there is a message in the queue
        lockMutex()
        file << queue.getFirst() //would append file with the first available msg from the queue
        unlockMutex()
    }
}
My questions are:
Do I really need to use the mutex in both places?
Is a mutex really what I'm looking for?
I'm thinking I may need a thread for the writing to the file part - does that sound like a good idea?
FYI I'm looking for a way to do this without using Boost or any 3rd party library.
EDIT: The intended platform is Linux.
EDIT 2: Moved formatMsg to before the mutex lock (thank you @Paul Rubel)
As for whether you really need the mutex: think about what could happen if you didn't lock things. Unless your queue is thread-safe, you probably need to protect both insertion and removal.
Imagine the execution context changing while you are removing the first element. The add could find the queue in an inconsistent state, and then who knows what could happen.
Regarding creating the message: unless formatMsg makes use of shared resources, you can probably move it out of the locked section, which can increase your parallelism.
Extracting the writing to file into its own thread sounds like a reasonable choice, that way the logging threads will not have to make the calls themselves.
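A rough sketch of that arrangement using only the C++11 standard library (no Boost). The AsyncLogger class and its member names are hypothetical; messages are assumed to arrive already formatted, and a condition variable is added so the writer thread sleeps while the queue is empty.

```cpp
#include <condition_variable>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

class AsyncLogger {
public:
    explicit AsyncLogger(const std::string& path)
        : file_(path, std::ios::app), worker_(&AsyncLogger::run, this) {}

    ~AsyncLogger() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        worker_.join();  // the worker drains the queue before it exits
    }

    // Called from any thread; the lock is held only while touching the queue.
    void log(std::string formattedMsg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(formattedMsg));
        }
        ready_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            ready_.wait(lock, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty()) {
                return;  // done_ is set and nothing is left to write
            }
            std::string msg = std::move(queue_.front());
            queue_.pop();
            lock.unlock();  // do the slow file I/O without holding the lock
            file_ << msg << '\n';
            lock.lock();
        }
    }

    std::ofstream file_;
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> queue_;
    bool done_ = false;
    std::thread worker_;  // declared last so all other members exist before it starts
};
```

Both the insertion and the removal take the same mutex, and the emptiness check happens under the lock as well, so the check and the removal cannot be separated by another thread.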
Correct me if I'm wrong: you have multiple callers on multiple threads all trying to access the same resource concurrently.
Maybe you could just have one mutex wrapping the entirety of your logging functionality.
Watch out for race conditions.
Edit
Readers take a look at the comments to this answer for some valuable discussion
You can define a global variable that holds the number of elements present in the queue or buffer. That means you need to increment or decrement this variable whenever you add data to or remove data from the queue or buffer. In your logging framework, every access to this variable then has to happen while holding the mutex.

Check whether a file exists once every n minutes - C++

I have created a class which reads a file, performs some operations on its contents, and saves a new file with a timestamp. However, I have a requirement that the code should check every minute whether the file is present and, if so, process it. It needs to work cross-platform.
I am a novice in C++ and need to know what approach to follow for this. Do I need to create a process or something? I am completely blank.
class inputHandler
{
public:
    void readInput();
    void performTask();
    void saveFile();
};
Since the code implementation is too large, I am posting just the structure. I am ready to spend time on this, so I need a sample or tutorial which can guide me to achieve this.
This is not addressed by the C++ standard. Thus, you'll have to implement code for each supported system, or use a library.
As far as I understand, the most general solution is to create a thread which loops every minute, checking file timestamps. Naturally, depending on your code, you could do it another way, avoiding threads altogether. Using a notification system such as inotify (on Linux) could be much better. Also, you could use alarm() on POSIX-compatible systems to be signalled whenever a minute has passed.
Anyway, if you go with the thread solution, on POSIX-compatible systems check out pthread_create() and stat(). On Windows, check out CreateThread() and GetFileTime(). For a one-minute delay, sleep(60) or Sleep(60000) respectively should do the trick: POSIX sleep() takes seconds, while the Windows Sleep() takes milliseconds.
Just to clarify, "to create a process" is system's programming jargon meaning roughly "to launch a new program" (or "thread", sometimes). In that sense, if you follow the above you'll be creating a new thread.
The simple part is checking if a file exists: when you open an std::ifstream, it will be in a good state only if the file exists and could be opened for reading:
std::ifstream in(filename);
if (in) {
    // the file exists and can be processed here
}
The more interesting part is to do something at regular intervals. The basic idea is to set up a timer in some form. Depending on whether anything else needs to be done, you may need a separate thread: if the program just waits until the file exists and doesn't do anything in the meantime, you can just sleep and there is no need to spawn another thread. Otherwise, you probably want to spawn a thread which is just sleeping.
Assuming you need to use a separate thread, you probably want to be able to interrupt its waiting, e.g., to exit in a clean way when another thread signals a condition. Thus, I would use a condition variable with a timed wait, i.e., something like this:
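#include <chrono>
#include <condition_variable>
#include <mutex>

std::mutex guard;
std::condition_variable condition;
bool done(false); // another thread sets this under guard and then notifies

std::unique_lock<std::mutex> lock(guard);
while (!done) {
    // The predicate protects against spurious wakeups: wait_for returns
    // false only if the full interval elapsed without done being set.
    if (!condition.wait_for(lock, std::chrono::minutes(n), [&] { return done; })) {
        do_whatever_needs_to_be_done_once_every_n_minutes();
    }
}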
The code above uses C++11 facilities. If you can't use the corresponding classes, you can use suitable alternatives, e.g., the Boost classes.

proper way to use lock file(s) as locks between multiple processes

I have a situation where two different processes (mine in C++, the other written by other people in Java) are a writer and a reader of some shared data file. So I was trying to avoid a race condition by writing a class like this (EDIT: this code is broken, it was just an example):
class ReadStatus
{
    bool canRead;
public:
    ReadStatus()
    {
        if (filesystem::exists(noReadFileName))
        {
            canRead = false;
            return;
        }
        ofstream noWriteFile;
        noWriteFile.open (noWriteFileName.c_str());
        if ( ! noWriteFile.is_open())
        {
            canRead = false;
            return;
        }
        boost::this_thread::sleep(boost::posix_time::seconds(1));
        if (filesystem::exists(noReadFileName))
        {
            filesystem::remove(noWriteFileName);
            canRead= false;
            return;
        }
        canRead= true;
    }
    ~ReadStatus()
    {
        if (filesystem::exists(noWriteFileName))
            filesystem::remove(noWriteFileName);
    }
    inline bool OKToRead()
    {
        return canRead;
    }
};
Usage:
ReadStatus readStatus; //RAII FTW
if ( ! readStatus.OKToRead())
    return;
This is for one program ofc; the other will have an analogous class.
The idea is:
1. Check if the other program has created its "I'm the owner" file. If it has, break; else go to 2.
2. Create my "I'm the owner" file, then check again whether the other program has created its own. If it has, delete my file and break; else go to 3.
3. Do my reading, then delete my "I'm the owner" file.
Please note that rare occurrences where neither of them reads or writes are OK. The problem is that I still see a small chance of a race condition: theoretically, the other program can check for the existence of my lock file and see that there isn't one; then I create mine and the other program creates its own, but before the FS creates its file I check again and it isn't there, and then disaster occurs. This is why I added the one-second delay, but as a CS nerd I find it unnerving to have code like that running.
Ofc I don't expect anybody here to write me a solution, but I would be happy if someone knows a link to reliable code that I can use.
P.S. It has to be files, because I'm not writing the entire project and that is how it is arranged to be done.
P.P.S. Access to the data file isn't reader, writer, reader, writer, ...; it can be reader, reader, writer, writer, writer, reader, writer, ...
P.P.P.S. The other process is not written in C++ :(, so Boost is out of the question.
On Unices the traditional way of doing pure filesystem-based locking is to use dedicated lock directories, created with mkdir() and removed with rmdir(); each is a single atomic system call. You avoid races by never explicitly testing for the existence of the lock; instead you always try to take the lock. So:
lock:
    while mkdir(lockfile) fails
        sleep
unlock:
    rmdir(lockfile)
I believe this even works over NFS (which usually sucks for this sort of thing).
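On the C++ side, the lock/unlock loop above could be wrapped in a small RAII class along these lines. This is a sketch for POSIX systems only; the DirLock name, the 50 ms retry delay and the 0700 mode are arbitrary choices made for the example. The other process would have to follow the same protocol with its own directory-creation call.

```cpp
#include <cerrno>
#include <chrono>
#include <string>
#include <system_error>
#include <thread>
#include <utility>

#include <sys/stat.h>  // mkdir (POSIX)
#include <unistd.h>    // rmdir (POSIX)

class DirLock {
public:
    // Blocks until the lock directory could be created by this process.
    explicit DirLock(std::string path) : path_(std::move(path)) {
        while (::mkdir(path_.c_str(), 0700) != 0) {
            if (errno != EEXIST) {
                // Something other than "already locked" went wrong.
                throw std::system_error(errno, std::generic_category(), "mkdir");
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
        }
    }

    // Releases the lock by removing the directory.
    ~DirLock() { ::rmdir(path_.c_str()); }

    DirLock(const DirLock&) = delete;
    DirLock& operator=(const DirLock&) = delete;

private:
    std::string path_;
};
```

Unlike the kernel-level locks discussed next, a lock directory stays behind if the process dies while holding it, so some form of stale-lock cleanup would be needed in practice.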
However, you probably also want to look into proper file locking, which is loads better; I use fcntl locks (F_SETLK, with F_UNLCK to release) for this on Linux (note that these are different from flock() locks, despite the name of the structure). With the F_SETLKW variant you can properly block until the lock is released. These locks also get released automatically if the app dies, which is usually a good thing. Plus, they let you lock your shared file directly without needing a separate lockfile. This, too, works on NFS.
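A minimal sketch of taking and releasing such a lock on a whole file. The helper name setWholeFileLock is made up for the example; it uses the blocking F_SETLKW command. The descriptor must be open for writing to take F_WRLCK and for reading to take F_RDLCK.

```cpp
#include <fcntl.h>   // fcntl, struct flock, F_SETLKW, F_RDLCK, F_WRLCK, F_UNLCK
#include <unistd.h>  // SEEK_SET

// Apply `type` (F_RDLCK, F_WRLCK or F_UNLCK) to the whole file behind fd.
// For the two lock types the call blocks until the lock can be granted.
// Returns true on success.
bool setWholeFileLock(int fd, int type) {
    struct flock fl{};
    fl.l_type = static_cast<short>(type);
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;  // a length of 0 means "up to the end of the file"
    return ::fcntl(fd, F_SETLKW, &fl) == 0;
}
```

A writer would call setWholeFileLock(fd, F_WRLCK), write, then call setWholeFileLock(fd, F_UNLCK); a reader would do the same with F_RDLCK. These locks belong to the process, so they do not coordinate threads inside one process, and closing any descriptor for the file releases the process's locks on it.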
Windows has very similar file locking functions, and it also has easy to use global named semaphores that are very convenient for synchronisation between processes.
As far as I've seen it, you can't reliably use files as locks for multiple processes. The problem is, while you create the file in one thread, you might get an interrupt and the OS switches to another process because I/O is taking so long. The same holds true for deletion of the lock file.
If you can, take a look at Boost.Interprocess, under the synchronization mechanisms part.
While I'm generally against making API calls which can throw from a constructor/destructor (see the docs on boost::filesystem::remove), or making throwing calls without a catch block in general, that's not really what you were asking about.
You could check out the Overlapped IO library if this is for windows. Otherwise have you considered using shared memory between the processes instead?
Edit: Just saw that the other process is Java. You may still be able to create a named mutex that can be shared between the processes and use that to create locks around the file I/O bits so they have to take turns. Sorry, I don't know Java, so I have no idea if that's more feasible than shared memory.