I have a large text file.
Each time my program runs, it needs to read in the first line, remove it, and put that data back into the bottom of the file.
Is there a way to accomplish this task without having to read in every part of the file?
It would be great to follow this example of pseudo code:
1. Open file stream for reading/writing
2. data = first line of file
3. remove first line from file <-- can I do this?
4. Close file stream
5. Open file stream for appending
6. write data to file
7. Close file stream
The reason I'm trying to avoid reading everything in is because the program runs at a specific time each day. I don't want the delay to be longer each time the file gets bigger.
All the solutions I've found require that the program process the whole file. If C++ filestreams are not able to accomplish this, I'm up for whatever alternative is quick and efficient for my C++ program to execute.
thanks.
The unfortunate truth is that no filesystem on a modern OS is designed to do this. The only way to remove something from the beginning of a file is to copy everything except the first part to a new file. There's simply no way to do precisely what you want to do.
But hopefully you can do a bit of redesign. Maybe each entry could be a record in a database -- then the reordering can be done very efficiently. Or perhaps the file could contain fixed-size records, and you could use a second file of indexes to specify record order, so that rearranging the file is just a matter of updating the indices.
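As a rough sketch of the fixed-size-record idea, specialised to "move the first record to the end": if every record has the same length, a tiny sidecar file can store which record is currently the logical first one, and rotating is just bumping that number. The function name `take_first_record`, the sidecar layout (a single decimal number), and both file names are assumptions made up for this example, not anything from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical layout: `data_path` holds records of exactly `record_size`
// bytes each; `head_path` holds the index of the logical first record as
// decimal text. Returns the current first record and makes it the last one
// by advancing the index. The data file itself is never rewritten.
std::string take_first_record(const std::string& data_path,
                              const std::string& head_path,
                              std::size_t record_size) {
    assert(record_size > 0);

    // Load the current head index; a missing or empty sidecar means 0.
    std::uint64_t head = 0;
    {
        std::ifstream head_in(head_path);
        if (head_in) head_in >> head;
    }

    // Open at the end so tellg() reports the file size.
    std::ifstream data(data_path, std::ios::binary | std::ios::ate);
    if (!data) throw std::runtime_error("cannot open " + data_path);
    const std::streamoff size = data.tellg();
    const std::uint64_t count = static_cast<std::uint64_t>(size) / record_size;
    if (count == 0) throw std::runtime_error("no complete record in " + data_path);
    head %= count;

    // Seek straight to the record; nothing before or after it is read.
    std::string record(record_size, '\0');
    data.seekg(static_cast<std::streamoff>(head * record_size));
    data.read(&record[0], static_cast<std::streamsize>(record_size));
    if (!data) throw std::runtime_error("short read from " + data_path);

    // The next record becomes the logical first; the one just read is now last.
    std::ofstream head_out(head_path, std::ios::trunc);
    head_out << (head + 1) % count;
    head_out.flush();
    if (!head_out) throw std::runtime_error("cannot update " + head_path);
    return record;
}
```

The cost per run is one seek, one record read, and one small write, regardless of how large the data file grows. The trade-off is that anything else reading the file has to know about the sidecar to see the records in logical order.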
Related
I've been Googling this for hours...reading and reading and reading, and yet nothing I come across seems to answer this simple question: In C or C++ programming: I have a file, it contains "hello world". I want to delete "world" (like pressing Backspace in a text editor), then save the file. How do I do this?
I know that files are streams (excellent info on that here!), which don't seem to have a way to delete items from a file per se, and I've studied all of the file-related functions in stdio.h: http://www.cplusplus.com/reference/cstdio/fopen/.
It seems to me that files and streams therefore are NOT like arrays: I can't just delete a byte from a file! Rather (I guess?) I have to create an entire new file and copy the whole original file into the new file withOUT the parts I want to delete? Is that the case?
The only other option I can think of is to seek to the position before "world", then write binary zeros to the end of the file, thereby overwriting "world". The problem with this, however, is a text editor will now no longer properly display this file, as it has non-printable characters in it--and the file size hasn't shrunk--it still contains these bytes--it's just that they hold zeros now instead of ASCII text, so this doesn't seem to be right either.
Related
Resizing a file in C++
You want std::filesystem::resize_file()
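For the "hello world" case, a minimal sketch (C++17) might look like the following. The helper name `remove_suffix` is made up for this example; the only library call doing the real work is `std::filesystem::resize_file`, which can shrink a file from the end but cannot remove bytes from the front or middle.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>

// Hypothetical helper: if the file ends with `suffix`, cut it off in place
// and return true; otherwise leave the file alone and return false.
bool remove_suffix(const std::filesystem::path& p, const std::string& suffix) {
    assert(!suffix.empty());
    const std::uintmax_t size = std::filesystem::file_size(p);
    if (suffix.size() > size) return false;

    // Read only the last suffix.size() bytes to check what is there.
    std::string tail(suffix.size(), '\0');
    {
        std::ifstream in(p, std::ios::binary);
        in.seekg(static_cast<std::streamoff>(size - suffix.size()));
        in.read(&tail[0], static_cast<std::streamsize>(tail.size()));
        if (!in || tail != suffix) return false;
    }

    // Shrink the file; the bytes before the suffix are untouched.
    std::filesystem::resize_file(p, size - suffix.size());
    return true;
}
```

Calling `remove_suffix("hello.txt", "world")` on a file containing `hello world` should leave `hello ` (six bytes, including the trailing space).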
Assume your original file is "data.txt". As part of your code, open a new temp file, say "data.txt.tmp", and copy into it the contents you want to keep from the original file. Once the data has been written, replace the original file with the new one.
You can use a memory map from the source file and copy the data blocks you want to another memory map over target file. That's the easy and fast way (see http://man7.org/linux/man-pages/man2/mmap.2.html)
I am trying to write C++ code that inserts some additional text into the middle of a file without overwriting what is already there. I have tried every possible combination of flags, but none of them work. Can anybody give me a working example?
For example :-
if input is :-
Hello!
Hey are you there ?
Is anybody home ?
Then the output should be :-
Hello!
Hey are you there ?
Where are you ?
Is anybody home ?
The "Where are you ?" text is inserted in the middle. I'm using C++ file handling.
I think that files work a lot like arrays in that you can't just do an easy insert. For instance, if you are implementing a vector or arraylist and want to insert a value in the middle, you must shift all values after that. To insert in the middle, I think you will need to shift all the contents below. I would maybe read everything into memory first or use a temp file.
This is not a limitation of C++ but of the underlying filesystem (on most modern file-systems).
A file is logically one contiguous sequence of bytes; you cannot insert into the middle of it, only overwrite existing bytes or append at the end.
You have two options:
Option 1:
1. Read the file into memory.
2. Manipulate the file in memory.
3. Overwrite the old file.
Option 2:
1. Open the file for reading and a temporary file for writing.
2. Copy from input file to output file until you get to the point you want to add text.
3. Write the modifications and finish the copy.
4. Replace the file with the tempfile.
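The temp-file approach could be sketched roughly like this (C++17). The function name `insert_line_after` and the `.tmp` suffix are assumptions for the example, and it works line by line in text mode, so it always ends the output with a newline even if the original did not.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: insert `text` as a new line after line `line_no`
// (1-based) by streaming the file through a temporary copy.
void insert_line_after(const std::filesystem::path& file,
                       std::size_t line_no,
                       const std::string& text) {
    assert(line_no > 0);
    std::filesystem::path tmp = file;
    tmp += ".tmp";  // assumed temp name, placed next to the original

    bool inserted = false;
    {
        std::ifstream in(file);
        std::ofstream out(tmp, std::ios::trunc);
        if (!in || !out) throw std::runtime_error("cannot open input or temp file");

        std::string line;
        std::size_t n = 0;
        while (std::getline(in, line)) {
            out << line << '\n';          // copy the existing line
            if (++n == line_no) {         // reached the insertion point
                out << text << '\n';
                inserted = true;
            }
        }
        out.flush();
        if (!out) throw std::runtime_error("write to temp file failed");
    }  // both streams are closed here, before the rename

    if (!inserted) {
        std::filesystem::remove(tmp);
        throw std::out_of_range("file has fewer lines than line_no");
    }
    std::filesystem::rename(tmp, file);  // replace the original with the copy
}
```

With the three-line input from the question, `insert_line_after("input.txt", 2, "Where are you ?")` should give the four-line output shown there. Only one line is held in memory at a time, at the cost of rewriting the whole file.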
I have two programs that will be reading / writing files to the same directory at the same time (but not to the same exact files at the same time). I have the writing portion done, but I am struggling to get a half way decent and working implementation of the reading directory portion.
The files within the directory follow the following naming scheme:
Image-[INDEX]-[KEY/DEL]--[TIMESTAMP]
[INDEX] increments up from 000000, [KEY/DEL] alternates based on whether the image is a key or a delta frame and [TIMESTAMP] is the Unix / Linux epoch time at file creation.
Right now, the reading program reads in the directory (using the dirent.h library) one file at a time every time it needs to find an image within the directory. When the directory gets extremely large, I would imagine that this operation / method will quickly become extremely resource intensive, and eventually fail. So, I am trying to find an alternative method. I was thinking of reading in the entire directory at initialization, and saving the file information in an array to access / use later in the program. Then, when a file is requested that is not in the array, the program would go and update the array of files by reading in the directory, but this time starting from the point it left off at the end of the initialization.
Is this possible? To start reading in the file names within a directory at a known point (the last file "read in") in the directory? Or do I have to start all the way from the beginning each time?
Or is there a better way of doing this?
Thanks.
As Andrew said, I would confirm that this is actually a problem before trying to solve it.
If you can discount the possibility of files being created out of sequence, that is, no file you wish to process before another file will ever be created after that file, then you can use this method.
First, read the entire directory listing into an array or vector. Then, when iterating files, just iterate the vector. Finally, if you get a file not found or reach the end of the vector, refresh it just in case more have been created.
You will no doubt want to encapsulate this logic into some sort of context object, which remembers the last file read. You could also optimise by sorting the vector.
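Something along these lines is what I have in mind for the context object. It is only a sketch: the class name `DirectoryCache` is made up, it uses C++17 `std::filesystem` rather than dirent.h, and on a miss it simply re-reads the whole directory, since I am not assuming any portable way to resume a listing from a known point.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <string>
#include <utility>
#include <vector>

// Hypothetical context object: caches the file names of one directory and
// only goes back to the filesystem when a requested name is not cached.
class DirectoryCache {
public:
    explicit DirectoryCache(std::filesystem::path dir) : dir_(std::move(dir)) {
        assert(std::filesystem::is_directory(dir_));
        refresh();
    }

    // True if `name` exists in the directory. A cache miss triggers exactly
    // one full re-read before giving up.
    bool contains(const std::string& name) {
        if (cached(name)) return true;
        refresh();
        return cached(name);
    }

    // Re-read the directory listing and keep it sorted for binary search.
    void refresh() {
        names_.clear();
        for (const auto& entry : std::filesystem::directory_iterator(dir_))
            names_.push_back(entry.path().filename().string());
        std::sort(names_.begin(), names_.end());
    }

private:
    bool cached(const std::string& name) const {
        return std::binary_search(names_.begin(), names_.end(), name);
    }

    std::filesystem::path dir_;
    std::vector<std::string> names_;
};
```

Because the names start with a zero-padded index, the sorted vector is also in creation order, so remembering the position of the last file read is just a matter of keeping an index into it.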
I'm trying to create a reference program which I think will use an excel spreadsheet to hold information for reading only. I want the user to be able to select a topic from an option list and have the information in the appropriate cell be fed back to them. The program is being written in C++. My question is, how do I access specific cells from a spreadsheet from my program? I've researched it a little and I've seen that I want to save my file as a csv and use fscanf to read the contents, but I'm at a loss as to how I would do this part. I googled it and found this thread:
http://www.daniweb.com/software-development/cpp/threads/204808/parsing-a-csv-file-separated-by-semicolons
but I think it reads in all of the data from the CSV? From what I can tell anyways. And I only want to pull specific elements. Is that possible?
If you only want specific elements, you would still have to parse all contents of the file until you reach those elements. You don't have to store values you don't need, but you do need to parse them to advance in the file.
Are you invoking the program from Excel? If you are, a little VBA goes a long way. You could always only export the cells of interest ready for your C++ program to read in.
Otherwise, other answers are correct. However, you don't need to load the entire file into memory at once. You can use std::fstream to open the file and read in each line of the file, parsing in the required information for each line.
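A minimal sketch of reading line by line and keeping only one cell could look like this. The function name `csv_cell` is made up, and it assumes a simple CSV with comma separators and no quoted fields containing commas or line breaks; real Excel exports may need a proper CSV parser.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: return the field at (row, col), both 0-based, or
// std::nullopt if the row or column does not exist. Lines before `row` are
// read and discarded; nothing is stored except the current line.
std::optional<std::string> csv_cell(std::istream& in, std::size_t row, std::size_t col) {
    assert(in.good());
    std::string line;
    for (std::size_t r = 0; std::getline(in, line); ++r) {
        if (r != row) continue;                       // skip rows we don't need
        if (!line.empty() && line.back() == '\r')     // tolerate Windows line ends
            line.pop_back();
        std::istringstream fields(line);
        std::string field;
        for (std::size_t c = 0; std::getline(fields, field, ','); ++c)
            if (c == col) return field;
        return std::nullopt;                          // row found, column missing
    }
    return std::nullopt;                              // ran out of rows
}
```

Passing a `std::ifstream` opened on the exported CSV works the same way as any other `std::istream`. Reading stops as soon as the requested row has been handled, so rows after it are never touched.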
I was writing a program in C++ and wonder if anyone can help me with the situation explained here.
Suppose, I have a log file of about size 30MB, I have copied last 2MB of file to a buffer within the program.
I delete the file (or clear the contents) and then write back my 2MB to the file.
Everything works fine till here. But, the concern is I read the file (the last 2MB) and clear the file (the 30MB file) and then write back the last 2MB.
Too much time will be needed in a scenario where I am copying the last 300MB of a 1GB file.
Does anyone have an idea of making this process simpler?
When dealing with a large log file, the following reasons should be considered:
Disk space: Log files are uncompressed plain text and consume large amounts of space. Typical compression reduces the file size by 10:1. However, a file cannot be compressed while it is in use (locked), so a log file must be rotated out of use.
System resources: Opening and closing a file regularly consumes a lot of system resources and reduces the performance of the server.
File size: Small files are easier to back up and restore in case of a failure.
I just do not want to copy, clear and re-write the last specific lines to a file. Just a simpler process.... :-)
EDIT: Not making any inhouse process to support log rotation.
logrotate is the tool.
I would suggest a slightly different approach.
Create a new temporary file
Copy the required data from the original file to the temporary file
Close both files
Delete the original file
Rename the temp file to the same name as the original file
To improve the performance of the copy, you can copy the data in chunks and play around with the chunk size to find the optimal value.
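A rough sketch of those steps with a chunked copy (C++17) is below. The function name `keep_tail`, the `.tmp` suffix, and the 64 KiB default chunk size are all assumptions for illustration. It relies on `std::filesystem::rename` replacing an existing destination, so the separate delete step is folded into the rename.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <vector>

// Hypothetical helper: keep only the last `keep_bytes` bytes of `file` by
// copying them to a temp file in chunks and renaming it over the original.
void keep_tail(const std::filesystem::path& file,
               std::uintmax_t keep_bytes,
               std::size_t chunk_size = 64 * 1024) {
    assert(chunk_size > 0);
    const std::uintmax_t size = std::filesystem::file_size(file);
    if (keep_bytes >= size) return;  // nothing to drop

    std::filesystem::path tmp = file;
    tmp += ".tmp";  // assumed temp name, placed next to the original
    {
        std::ifstream in(file, std::ios::binary);
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!in || !out) throw std::runtime_error("cannot open input or temp file");

        // Skip straight to the part worth keeping.
        in.seekg(static_cast<std::streamoff>(size - keep_bytes));

        std::vector<char> buf(chunk_size);
        while (in) {
            in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
            out.write(buf.data(), in.gcount());  // gcount() handles the short last chunk
        }
        out.flush();
        if (!out) throw std::runtime_error("write to temp file failed");
    }  // both files are closed here

    std::filesystem::rename(tmp, file);  // temp file takes the original's name
}
```

Only `chunk_size` bytes are held in memory at once, so the 300MB-out-of-1GB case no longer needs a 300MB buffer, although the kept part is still copied once.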
If this is your file before:
-----------------++++
Where - is what you don't want and + is what you do want, the most portable way of getting:
++++
...is just as you said. Read in the section you want (+), delete/clear the file (as with fopen(..., "wb") or something similar), and write out the bit you want (+).
Anything more complicated requires OS-specific help, and isn't portable. Unfortunately, I don't believe any major OS out there has support for what you want. There might be support for "truncate after position X" (a sort of head), but not the tail like operation you're requesting.
Such an operation would be difficult to implement, as varying block sizes on filesystems (if the filesystem has a block size) would cause trouble. At best, you'd be limited to cutting on block-size boundaries, but this would be hairy. This is such a rare case that this is probably why such a procedure is not directly supported.
A better approach might be not to let the file grow that big but rather use rotating log files with a set maximum size per log file and a maximum number of old files being kept.
If you can control the writing process, what you probably want to do here is to write to the file like a circular buffer. That way you can keep the last X bytes of data without having to do what you're suggesting at all.
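To make the circular-buffer idea a bit more concrete, here is a sketch of one possible on-disk layout (C++17). Everything about it is an assumption for illustration: the class name `RingFile`, the 21-byte text header holding the next write offset, and the fixed-capacity data region after it. Until the buffer has wrapped once, the unused part of the region reads back as zero bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical fixed-size log file: [21-byte header][capacity bytes of data].
// The header stores the next write offset, so the position survives restarts.
class RingFile {
public:
    RingFile(const std::filesystem::path& path, std::uint64_t capacity)
        : capacity_(capacity) {
        assert(capacity > 0);
        namespace fs = std::filesystem;
        // Create (or reset) the file if it is missing or has the wrong size.
        if (!fs::exists(path) || fs::file_size(path) != kHeader + capacity_) {
            std::ofstream create(path, std::ios::binary | std::ios::trunc);
            create << header(0) << std::string(capacity_, '\0');
        }
        file_.open(path, std::ios::in | std::ios::out | std::ios::binary);
        if (!file_) throw std::runtime_error("cannot open ring file");
        std::string h(kHeader, '\0');
        file_.read(&h[0], static_cast<std::streamsize>(kHeader));
        pos_ = std::stoull(h) % capacity_;
    }

    // Write `data` at the current position, wrapping to the start of the
    // data region when the end is reached. Old bytes are overwritten.
    void append(const std::string& data) {
        std::size_t done = 0;
        while (done < data.size()) {
            const std::uint64_t room = capacity_ - pos_;
            const std::size_t n = static_cast<std::size_t>(
                std::min<std::uint64_t>(data.size() - done, room));
            file_.seekp(static_cast<std::streamoff>(kHeader + pos_));
            file_.write(data.data() + done, static_cast<std::streamsize>(n));
            pos_ = (pos_ + n) % capacity_;
            done += n;
        }
        file_.seekp(0);
        file_ << header(pos_);  // persist the new write offset
        file_.flush();
        if (!file_) throw std::runtime_error("write to ring file failed");
    }

    // Return the data region in oldest-to-newest order.
    std::string contents() {
        std::string region(capacity_, '\0');
        file_.seekg(static_cast<std::streamoff>(kHeader));
        file_.read(&region[0], static_cast<std::streamsize>(region.size()));
        if (!file_) throw std::runtime_error("read from ring file failed");
        return region.substr(pos_) + region.substr(0, pos_);
    }

private:
    static constexpr std::uint64_t kHeader = 21;  // 20 digits + '\n'

    static std::string header(std::uint64_t pos) {
        const std::string digits = std::to_string(pos);
        return std::string(kHeader - 1 - digits.size(), '0') + digits + "\n";
    }

    std::uint64_t capacity_;
    std::uint64_t pos_ = 0;
    std::fstream file_;
};
```

The file never grows past the header plus the capacity, so there is nothing to trim later. The downside is that the file is no longer a plain log that a text editor shows in order; a reader has to go through something like `contents()`.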
Even if you can't control the writing process, if you can at least control what file it writes to, then maybe you could get it to write to a named pipe. You could attach your own program at the end of this named pipe that writes to a circular buffer as discussed.