Problem: Multiple processes want to update a file simultaneously. I do not want to use file locking, because in a highly loaded environment a process may block for a while, which I don't want. I want something where all processes send data to a queue or some other shared place, and one master process keeps taking data from there and writing it to the file, so that no process gets blocked.
One possibility is socket programming. All the processes send data to a single port, and the master keeps listening on that port and stores the data in the file. But what if the master goes down for a few seconds? If that happens, I could write to some file based on a timestamp and sync later. But I am putting this on hold and looking for some other solution. (No data loss.)
Another possibility may be taking a lock on the particular segment of the file that the process wants to write to. Basically, each process will write a line. I am not sure how well this will work on a highly loaded system.
Please suggest a solution for this problem.
Have a 0mq instance handle the writes (as you initially proposed for the socket) and have the workers connect to it and add their writes to the queue (example in many languages).
Each process can write to its own file (pid.temp) and periodically rename it (pid-0.data, pid-1.data, ...) so that the master process can pick up all these files.
You may not need to construct something like this. If you do not want processes to block, just use the LOCK_NB flag of Perl's flock. Periodically try to flock. If it does not succeed, continue processing and store the values in an array. Once the file is locked, write the data from the array to it.
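The same non-blocking idea can be sketched in C++ on a POSIX system, where flock() accepts the same LOCK_NB flag. This is only a sketch under assumptions: try_flush is a hypothetical helper name, and the pending lines are kept in a std::vector in place of the Perl array.

```cpp
#include <fcntl.h>     // open
#include <sys/file.h>  // flock, LOCK_EX, LOCK_NB, LOCK_UN
#include <unistd.h>    // write, close
#include <string>
#include <vector>

// Hypothetical helper: tries to append every pending line to `path`
// without ever blocking on the lock.
// Returns true if the lines were written (pending is then cleared),
// false if the lock was busy or an I/O call failed (pending is kept).
bool try_flush(const std::string& path, std::vector<std::string>& pending) {
    int fd = open(path.c_str(), O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) return false;

    // LOCK_NB makes flock fail immediately instead of waiting.
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);
        return false;
    }

    // Join the lines so the data goes out in a single write call.
    std::string buffer;
    for (const std::string& line : pending) {
        buffer += line;
        buffer += '\n';
    }
    bool ok = write(fd, buffer.data(), buffer.size()) ==
              static_cast<ssize_t>(buffer.size());
    if (ok) pending.clear();

    flock(fd, LOCK_UN);
    close(fd);
    return ok;
}
```

A worker would push each new line onto its vector and call try_flush from time to time; when the call returns false the worker simply carries on and retries later.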
I have to write some data to a file named after the current hour on my server, for example 2015061117.txt. Multiple processes write data to the file simultaneously. How should I design my server to implement this? Do I need to use a synchronization API such as pthread_mutex_lock?
If you want multiple processes, or even multiple threads, to write to the same file simultaneously, then you need to synchronize them so that only one process or thread writes at a time.
My suggestion is to use a separate process or thread that handles all the logging, and have the other processes/threads send "messages" to the logging process/thread, which then writes the messages in the order it receives them. This is similar to the syslog system in Linux.
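For the hourly file name mentioned in the question, the logging process or thread could derive the name from the clock each time it writes. A minimal sketch, assuming a POSIX system (for localtime_r) and local server time; hourly_log_name is a hypothetical helper name.

```cpp
#include <ctime>
#include <string>

// Hypothetical helper: builds a name such as "2015061117.txt"
// (year, month, day, hour) from a time value, using local time.
std::string hourly_log_name(std::time_t t) {
    std::tm parts{};
    localtime_r(&t, &parts);  // POSIX, thread-safe variant of localtime
    char buf[32];
    std::strftime(buf, sizeof buf, "%Y%m%d%H.txt", &parts);
    return buf;
}
```

The logger can compare hourly_log_name(std::time(nullptr)) with the name of the file it currently has open and reopen when the two differ.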
When using blocking sockets, all I had to do to send a file was to open the file and loop through it and send it in chunks.
But I find sending a file using overlapped sockets to be more challenging. I can think of the following approach to do it:
1) I open the file and send the first chunk, and I keep track of the file handle and file position (I store these data somewhere in memory).
2) Now when I get a completion packet indicating that some data has been sent, I retrieve the file handle and file position and send the next chunk.
3) I repeat step 2 until I reach the last chunk in the file, and then I close the file.
Is this approach correct?
Note: I don't want to use TransmitFile().
Edit: I have updated my question.
If you don't want to use TransmitFile() then you can use overlapped file I/O using IOCP where the completion of a file read is used to trigger a socket write and the completion of a socket write is used to trigger a file read. You then decide how much data you want in transit and issue that many file reads and wait for EOF...
Easiest way: look up 'TransmitFile' on MSDN. This functionality is so common (e.g. serving up web pages) that there is a specific API for it.
I'm working on a multithreaded application written in C++. It uses some temporary files to pass data between my threads. One thread writes the data to be processed into files in a directory. Another thread scans the directory for work files, reads and processes them further, then deletes them. I have to use these files because, if my app gets killed, I have to retain the data that has not been processed yet.
But I hate using multiple files. I just want to use a single file, with one thread continuously writing to the file and another thread reading the data and deleting what has been read.
It is like a vessel that is filled from the top, while at the bottom I can take data out and delete it from the vessel. How can I do this efficiently in C++? First of all, is there a way?
As was suggested in the comments to your question, using a database like SQLite may be a very good solution.
However, if you insist on using a file, then this is of course possible.
I did it myself once - created a persistent queue on disk using a file.
Here are the guidelines on how to achieve this:
The file should contain a header which points to the next unprocessed record (entry) and to the next available place to write to.
If the records have variable length then each record should contain a header which states the record length.
You may want to add to each record a flag that indicates whether the record was processed.
File locking can be used to ensure no one reads from the portion of the file that is being written to.
Use low-level I/O: don't use buffered streams of any kind, use direct write semantics.
And here are the schemes for reading and writing (probably with some small logical bugs, but you should be able to take it from there):
READER
Lock the file header, read it, and unlock it
Go to the position of the next unprocessed record (as read from the header)
Read the record header and the record
Write the record header back with the processed flag turned on
If you are not at the end of the file, lock the header and write the new location of the next unprocessed record; otherwise write some marker to indicate there are no more records to process
Make sure that the next record to write points to the correct place
You may also want the reader to compact the file for you once in a while:
Lock the entire file
Copy all unprocessed records to the beginning of the file (you may want to keep some logic so as not to overwrite your unprocessed records, for example compacting only if the processed space is larger than the unprocessed space)
Update the header
Unlock the file
WRITER
Lock the header of the file, see where the next record is to be written, then unlock it
Lock the region of the file that starts at the write position and spans the length of the record
Write the record and unlock
Lock the header; if the unprocessed-record mark indicates there are no records to process, let it point to the new record; then unlock the header
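The header and record layout described above might look roughly like the sketch below. It rests on several assumptions that are not part of the original answer: the struct fields, the function names queue_open, queue_push and queue_pop, and POSIX pread/pwrite as the low-level I/O. All locking is deliberately left out, so as written it is only safe for a single reader and writer that do not run at the same time; in a real version the header accesses would be wrapped in the lock and unlock steps listed above. It also simplifies the "no more records" marker to the case where both header offsets are equal.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstdint>
#include <string>

// Assumed on-disk layout (illustrative only).
struct FileHeader {
    std::uint64_t next_unprocessed;  // offset of the next record to read
    std::uint64_t next_write;        // offset where the next record will go
};
struct RecordHeader {
    std::uint32_t length;     // payload size in bytes
    std::uint32_t processed;  // 0 = pending, 1 = already consumed
};

// Small wrappers so every call checks for short reads and writes.
static bool read_at(int fd, void* buf, std::size_t n, std::uint64_t off) {
    return pread(fd, buf, n, static_cast<off_t>(off)) == static_cast<ssize_t>(n);
}
static bool write_at(int fd, const void* buf, std::size_t n, std::uint64_t off) {
    return pwrite(fd, buf, n, static_cast<off_t>(off)) == static_cast<ssize_t>(n);
}

// Opens the queue file, writing a fresh header if the file is new or empty.
int queue_open(const char* path) {
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return -1;
    FileHeader h{};
    if (!read_at(fd, &h, sizeof h, 0)) {
        h.next_unprocessed = h.next_write = sizeof h;
        if (!write_at(fd, &h, sizeof h, 0)) { close(fd); return -1; }
    }
    return fd;
}

// WRITER: store one record, then publish it by advancing next_write.
bool queue_push(int fd, const std::string& payload) {
    FileHeader h;
    if (!read_at(fd, &h, sizeof h, 0)) return false;
    RecordHeader r{static_cast<std::uint32_t>(payload.size()), 0};
    if (!write_at(fd, &r, sizeof r, h.next_write)) return false;
    if (!write_at(fd, payload.data(), payload.size(), h.next_write + sizeof r))
        return false;
    h.next_write += sizeof r + payload.size();
    return write_at(fd, &h, sizeof h, 0);
}

// READER: fetch the next pending record, flag it, and advance the header.
bool queue_pop(int fd, std::string& out) {
    FileHeader h;
    RecordHeader r;
    if (!read_at(fd, &h, sizeof h, 0)) return false;
    if (h.next_unprocessed >= h.next_write) return false;  // nothing pending
    if (!read_at(fd, &r, sizeof r, h.next_unprocessed)) return false;
    out.resize(r.length);
    if (!read_at(fd, &out[0], r.length, h.next_unprocessed + sizeof r))
        return false;
    r.processed = 1;
    if (!write_at(fd, &r, sizeof r, h.next_unprocessed)) return false;
    h.next_unprocessed += sizeof r + r.length;
    return write_at(fd, &h, sizeof h, 0);
}
```

Because the offsets live in the file itself, records that were pushed but not yet popped are still found after the program is killed and restarted. Compaction is not shown.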
Hope this sets you on the write track
The Win32 API function CreateFileMapping() enables processes to share data; multiple processes can use memory-mapped files that the system paging file stores.
A few good links:
http://msdn.microsoft.com/en-us/library/aa366551(VS.85).aspx
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366551(v=vs.85).aspx
http://www.codeproject.com/Articles/34073/Inter-Process-Communication-IPC-Introduction-and-S
http://www.codeproject.com/Articles/19531/A-Wrapped-Class-of-Share-Memory
http://www.bogotobogo.com/cplusplus/multithreaded2C.php
You can write the data line by line and give each line a delimiter that indicates whether that record has been processed or not.
I have this tool in which a single log-like file is written to by several processes.
What I want to achieve is to have the file truncated when it is first opened, and then have all writes done at the end by the several processes that have it open.
All writes are systematically flushed and mutex-protected so that I don't get jumbled output.
First, a process creates the file, then starts a sequence of other processes, one at a time, that then open the file and write to it (the master sometimes chimes in with additional content; the slave process may or may not be open and writing something).
I'd like, as much as possible, not to use more IPC than what already exists (all I'm doing now is writing to a popen-created pipe). I have no access to external libraries other than the CRT and Win32 API, and I would like not to start writing serialization code.
Here is some code that shows where I've gone:
#include <fstream>
// Open the file. Truncate it if we're the 'master', append to it if we're a 'slave'.
std::ofstream blah(filename, isClient ? (std::ios::out | std::ios::app) : std::ios::out);
// do stuff...
// write stuff
myMutex.acquire();
blah << "stuff to write" << std::flush;
myMutex.release();
Well, this does not work: although the output of the slave process is ordered as expected, what the master writes is either bunched together or at the wrong place, when it exists at all.
I have two questions: is the flag combination given to the ofstream's constructor the right one ? Am I going the right way anyway ?
If you'll be writing a lot of data to the log from multiple threads, you'll need to rethink the design, since all threads will block trying to acquire the mutex, and in general you don't want your threads kept from doing work just so they can log. In that case, you'd want your worker threads to write log entries to a queue (which just requires moving stuff around in memory), and have a dedicated thread pull entries off the queue and write them to the output. That way your worker threads are blocked for as short a time as possible.
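A minimal sketch of that queue-plus-writer-thread design is shown here, using only the C++11 standard library. The class name AsyncLogger and its interface are assumptions made for illustration. Note that it covers threads inside one process only; separate processes would still need some form of IPC to reach the logger.

```cpp
#include <condition_variable>
#include <fstream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical logger: workers enqueue lines, one thread writes them out.
class AsyncLogger {
public:
    explicit AsyncLogger(const std::string& path)
        : out_(path, std::ios::app), worker_([this] { run(); }) {}

    // Drains whatever is still queued, then stops the writer thread.
    ~AsyncLogger() {
        {
            std::lock_guard<std::mutex> lk(m_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Called by worker threads: only holds the mutex for a queue push.
    void log(std::string line) {
        {
            std::lock_guard<std::mutex> lk(m_);
            q_.push(std::move(line));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lk(m_);
        for (;;) {
            cv_.wait(lk, [this] { return done_ || !q_.empty(); });
            if (q_.empty()) return;  // only reached once done_ is set
            std::string line = std::move(q_.front());
            q_.pop();
            lk.unlock();  // do the slow file write without the lock
            out_ << line << '\n' << std::flush;
            lk.lock();
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
    bool done_ = false;
    std::ofstream out_;
    std::thread worker_;  // declared last so it starts after the other members
};
```

Worker threads share one AsyncLogger and call log("..."); the file write happens on the logger's own thread, so callers wait only for the short queue push.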
You can do even better than this by using async I/O, but that gets a bit more tricky.
As suggested by reinier, the problem was not in the way I use the files but in the way the programs behave.
The fstreams do just fine.
What I missed out is the synchronization between the master and the slave (the former was assuming a particular operation was synchronous where it was not).
edit: Oh well, there still was a problem with the open flags. The process that opened the file with ios::out did not move the file pointer as needed (erasing text other processes were writing), and using seekp() completely screwed the output when writing to cout as another part of the code uses cerr.
My final solution is to keep the mutex and the flush, and, for the master process, open the file in ios::out mode (to create or truncate the file), close it and reopen it using ios::app.
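That open, close, and reopen sequence could be wrapped as in the sketch below. The helper name open_log and the is_master parameter are assumptions; the mutex and the flush on each write stay the caller's responsibility, as described above.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: the master first creates or truncates the file,
// then every process (master included) writes through an append-mode stream.
std::ofstream open_log(const std::string& filename, bool is_master) {
    if (is_master) {
        // ios::out on an ofstream creates the file or empties an existing one.
        std::ofstream truncate(filename, std::ios::out);
        // The stream is closed when it goes out of scope here.
    }
    // ios::app makes each write go to the current end of the file.
    return std::ofstream(filename, std::ios::out | std::ios::app);
}
```

The master would call open_log(filename, true) once before starting any slave, and each slave would call open_log(filename, false).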
I made a little log system that has its own process and handles the writing; the idea is quite simple. The processes that use the log just send entries to a pending queue, which the log process then tries to write to a file. It's like batch processing in any real-time rendering app. This way you get rid of too many open/close file operations. If I can, I'll add the sample code.
How do you create that mutex?
For this to work this needs to be a named mutex so that both processes actually lock on the same thing.
You can check that your mutex is actually working correctly with a small piece of code that locks it in one process while another process tries to acquire it.
I suggest blocking such that the text is completely written to the file before releasing the mutex. I've had instances where the text from one task is interrupted by text from a higher priority thread; doesn't look very pretty.
Also, put the format into Comma Separated format, or some format that can be easily loaded into a spreadsheet. Include thread ID and timestamp. The interlacing of the text lines shows how the threads are interacting. The ID parameter allows you to sort by thread. Timestamps can be used to show sequential access as well as duration. Writing in a spreadsheet friendly format will allow you to analyze the log file with an external tool without writing any conversion utilities. This has helped me greatly.
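One way to produce such a spreadsheet-friendly line is sketched below. The column order (timestamp in milliseconds, thread ID, quoted message) and the function name csv_line are assumptions, and the textual form of std::thread::id depends on the implementation.

```cpp
#include <chrono>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical format: timestamp_ms,thread_id,"message"
std::string csv_line(std::chrono::milliseconds timestamp,
                     std::thread::id thread_id,
                     const std::string& message) {
    std::ostringstream os;
    os << timestamp.count() << ',' << thread_id << ",\"";
    for (char c : message) {
        if (c == '"') os << '"';  // CSV escapes a quote by doubling it
        os << c;
    }
    os << '"';
    return os.str();
}
```

A caller might pass the time elapsed since program start as the timestamp and std::this_thread::get_id() as the ID, then write the returned line while holding the mutex.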
One option is to use ACE::logging. It has an efficient implementation of concurrent logging.
We're trying to read data from 2 USB mice connected to a Linux box (this data is used for odometry/localization on a robot). So we need to continuously read from each mouse how much it moved. The problem is that when a mouse is not moving, it doesn't send any data, so the file stream from which we get the data blocks execution and therefore the program can't do the odometry calculations (which involve time measurement for speed).
Is there a way to set a timeout on the input stream (we're using ifstream in C++ and reading from /dev/input/mouse), so that we're able to know when the mouse doesn't move, instead of waiting for an event to be received? Or do we need to mess with threads (arggh...)? Any other suggestions are welcome!
Thanks in advance!
A common way to read from multiple file descriptors in linux is to use select(). I suggest starting with the manpage. The basic system flow is as follows:
1) Initialize devices
2) Obtain list of device file descriptors
3) Set up the timeout
4) Call select with file descriptors and timeout as parameters - it will block until there is data on one of the file descriptors or the time out is reached
5) Determine why select returned and act accordingly (i.e. call read() on the file descriptor that has data). You may need to internally buffer the result of read until an entire data gram is obtained.
6) loop back to 4.
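Steps 3 to 5 can be sketched as a small helper around select(). The function name wait_readable is an assumption, and the descriptors are expected to come from open() on the device nodes, not from ifstream, since select() works on raw file descriptors.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <algorithm>
#include <vector>

// Hypothetical helper: waits up to timeout_ms for any descriptor in `fds`
// to become readable. Returns the readable descriptors; an empty result
// means the timeout expired (or select reported an error).
std::vector<int> wait_readable(const std::vector<int>& fds, int timeout_ms) {
    fd_set read_set;
    FD_ZERO(&read_set);
    int max_fd = -1;
    for (int fd : fds) {
        FD_SET(fd, &read_set);
        max_fd = std::max(max_fd, fd);
    }

    // select may modify the timeout, so it is rebuilt on every call.
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    std::vector<int> ready;
    if (select(max_fd + 1, &read_set, nullptr, nullptr, &tv) > 0) {
        for (int fd : fds) {
            if (FD_ISSET(fd, &read_set)) ready.push_back(fd);
        }
    }
    return ready;
}
```

The main loop would call wait_readable with both mouse descriptors, call read() on each descriptor returned, and treat an empty result as "no movement during this interval".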
This can become your program's main loop. If you already have a different main loop, you can run the above without looping, but you will need to ensure that the function is called frequently enough that you do not lose data on the serial ports. You should also ensure that your update rate (i.e. 1/timeout) is fast enough for your primary task.
Select can operate on any file descriptor, such as network sockets and anything else that exposes an interface through a file descriptor.
What you're looking for would be an asynchronous way to read from ifstream, like socket communication. The only thing that could help would be the readsome function, perhaps it returns if no data is available, but I doubt this helps.
Using threads would be the best way to handle this.
Take a look at the boost Asio library. This might help you deal with the threading suggested by schnaeder.
No, there is no such method. You'll have to wait for an event, or create a custom Timer class and wait for a timeout to repoll, or use threads.