How to overwrite only part of a file in C++

I want to make modifications to the middle of a text file using c++, without altering the rest of the file. How can I do that?

Use std::fstream.
The simpler std::ofstream would not work. It would truncate your file (unless you use option std::ios_base::app, which is not what you want anyway).
std::fstream s(my_file_path); // use option std::ios_base::binary if necessary
s.seekp(position_of_data_to_overwrite, std::ios_base::beg);
s.write(my_data, size_of_data_to_overwrite);

If the replacement string is the same length, you can make the change in place. If the replacement string is shorter, you may be able to pad it with zero-width spaces or similar to make it the same number of bytes, and make the change in place. If the replacement string is longer, there just isn't enough room unless you first move all remaining data.
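The same-length case can be sketched as a small helper. This is a minimal sketch, not a definitive implementation: the function name `overwrite_at` and its parameters are made up for illustration, and it assumes the file already exists and that the replacement fits inside the existing data.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: overwrites replacement.size() bytes starting at byte
// `offset`, leaving the rest of the file untouched.
// Opening with in | out never truncates, but it fails if the file is missing.
bool overwrite_at(const std::string& path, std::streamoff offset,
                  const std::string& replacement)
{
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file)
        return false;

    file.seekp(offset, std::ios::beg);
    file.write(replacement.data(),
               static_cast<std::streamsize>(replacement.size()));
    file.flush();
    return static_cast<bool>(file);
}
```

If `offset + replacement.size()` runs past the end of the file, the file simply grows; nothing after the written range is ever shifted.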

Generally, open the file for reading in text mode and open a second text file for writing. Read the original line by line up to the place you want to change, writing each line to the second file as you go. At the place to change, write the new data to the second file instead. Then continue the read/write of the original file to its end.

Related

How do i write to a specific line of a text file?

myfile<<hashdugumu[key].numara;
I have this piece of code. For example, I would like to write to the eighth line. How do I do that in C++? Thanks in advance.
If the line you want to write is exactly the same length as the existing one (in bytes, not in characters; remember that some encodings, such as UTF-8, are variable-length), then it's very easy: just skip over the first seven lines and then write the line.
There is a caveat with this though: with a combined input/output file stream you cannot switch directly from reading to writing. After reading the first seven lines, a write is not guaranteed to land where the reading stopped unless you seek first. To solve this you need to get the read position, and set the write position to the same value.
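A minimal sketch of that read-then-seek-then-write sequence follows. The function name is hypothetical, and it assumes the file is opened in binary mode with plain `\n` line endings (with CRLF files the `\r` counts as part of the line) and that the replacement has exactly as many bytes as the existing line.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: overwrites line `line_no` (1-based) in place.
// Refuses to write unless `text` has the same byte length as the old line.
bool overwrite_line_same_length(const std::string& path, int line_no,
                                const std::string& text)
{
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file)
        return false;

    // Skip the lines before the target.
    std::string line;
    for (int i = 1; i < line_no; ++i)
        if (!std::getline(file, line))
            return false;

    // Remember where the target line starts, then read it to check its size.
    const std::fstream::pos_type start = file.tellg();
    if (!std::getline(file, line) || line.size() != text.size())
        return false;

    file.clear();        // the last line may have set eofbit
    file.seekp(start);   // explicit seek before switching to writing
    file.write(text.data(), static_cast<std::streamsize>(text.size()));
    return static_cast<bool>(file.flush());
}
```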
As an alternative, or if the data you want to write is not the same size as the existing data, then you have to use a temporary "buffer", be it another file or an actual in-memory buffer.
If the file is not big you can use an in-memory buffer, for example using a std::vector for the lines. Read each line into the vector, and then modify the lines (elements in the vector) that you want to modify. Finally reopen the file for writing, truncating it, and then just write each "line" to the file.
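The in-memory approach might look roughly like this. It is a sketch under stated assumptions: the function name is invented for illustration, the whole file is assumed to fit in memory, and every line is written back with a trailing newline even if the original last line had none.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: replaces line `line_no` (1-based) with `text`,
// which may be shorter or longer than the original line.
bool replace_line_in_memory(const std::string& path, std::size_t line_no,
                            const std::string& text)
{
    std::vector<std::string> lines;
    {
        std::ifstream in(path);
        if (!in)
            return false;
        std::string line;
        while (std::getline(in, line))
            lines.push_back(line);
    }   // input stream is closed here

    if (line_no == 0 || line_no > lines.size())
        return false;
    lines[line_no - 1] = text;

    // Reopen for writing; this truncates the original file.
    std::ofstream out(path, std::ios::trunc);
    if (!out)
        return false;
    for (const std::string& l : lines)
        out << l << '\n';
    return static_cast<bool>(out.flush());
}
```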
There is a slight problem with the above though when it comes to the rewriting of the data: if the file is truncated and then there's an error when you write to the file, you can lose data. This can be solved by using a temporary file.
Using a temporary file it's easier to not bother with the in-memory buffer, and instead read from the original file and write directly to the temporary file. Knowing when you should write something else is done by keeping track of the current line number, which is easy if you read one line at a time. In your example you read the first seven lines from the original file and write them to the temporary file; after the seventh line you write your special eighth line while skipping the original eighth line from the original file, and then just continue reading/writing the remaining lines. When done, close the files and then rename the temporary file as the original file.
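The temporary-file procedure can be sketched as follows. The function name and the `.tmp` suffix are assumptions made for illustration, and the final step relies on `std::rename` replacing an existing target, which holds on POSIX systems; on Windows the original would have to be removed first.

```cpp
#include <cstddef>
#include <cstdio>    // std::remove, std::rename
#include <fstream>
#include <string>

// Hypothetical helper: copies `path` to a temporary file, substituting
// `text` for line `line_no` (1-based), then renames the copy over the original.
bool replace_line_via_temp(const std::string& path, std::size_t line_no,
                           const std::string& text)
{
    const std::string tmp_path = path + ".tmp";   // assumed temporary name
    bool replaced = false;
    bool write_ok = false;
    {
        std::ifstream in(path);
        if (!in)
            return false;
        std::ofstream out(tmp_path, std::ios::trunc);
        if (!out)
            return false;

        std::string line;
        for (std::size_t n = 1; std::getline(in, line); ++n) {
            if (n == line_no) {
                out << text << '\n';   // new line instead of the old one
                replaced = true;
            } else {
                out << line << '\n';
            }
        }
        write_ok = static_cast<bool>(out.flush());
    }   // both files are closed here

    if (!replaced || !write_ok) {
        std::remove(tmp_path.c_str());   // leave the original untouched
        return false;
    }
    return std::rename(tmp_path.c_str(), path.c_str()) == 0;
}
```

Because the original is only replaced after the copy has been written successfully, a failed write leaves the original file intact.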

how to delete the last line in a text file with 100M lines without having to rewrite the whole file?

Suppose I have a really large text file, say 100 million lines or 1 GB, and I want to delete the last line. Is there any way to do this without having to rewrite 99,999,999 lines to a new file and delete the old one? Suppose the file is so large that the rewrite option is prohibitively expensive. What would you do to delete the last line then? Thank you.
You can open the file, read from the end backwards until you find the first line delimiter (normally LF or CR/LF, depending on platform), calculate the file offset at that point, and truncate the file to that file offset.
You should use a truncation function, but neither FILE* nor iostream supports one.
However, there are usually OS-specific functions at the lower level to truncate a file.
If Unix, you may use ftruncate, but you'll need to find the offset where you want to truncate first (does each line have a fixed size?).
Be careful that, if you have opened a FILE* for finding the offset, you need to be sure to synchronize it with the lower level. You may simply fclose the file, then reopen it with open for the ftruncate of the file at the decided offset.
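Putting the two steps together (find the offset, then truncate), a sketch for Unix-like systems could look like this. The function names are made up for illustration, it assumes `\n` line endings, and it uses the POSIX `truncate()` call, which takes a path rather than a file descriptor, instead of `ftruncate`. The stream used to find the offset is closed before truncating, in line with the synchronisation advice above.

```cpp
#include <fstream>
#include <string>
#include <unistd.h>   // POSIX truncate()

// Hypothetical helper: returns the byte offset where the last line starts,
// or -1 if the file is empty or cannot be read.
long long last_line_offset(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return -1;
    in.seekg(0, std::ios::end);
    long long pos = static_cast<long long>(in.tellg());
    if (pos <= 0)
        return -1;

    // Ignore the newline that terminates the last line, if there is one.
    char c = 0;
    in.seekg(pos - 1, std::ios::beg);
    in.get(c);
    if (c == '\n')
        --pos;

    // Walk backwards until the newline that ends the previous line.
    while (pos > 0) {
        in.seekg(pos - 1, std::ios::beg);
        in.get(c);
        if (c == '\n')
            break;
        --pos;
    }
    return pos;
}

// Hypothetical helper: removes the last line without rewriting the file.
bool delete_last_line(const std::string& path)
{
    const long long offset = last_line_offset(path);
    if (offset < 0)
        return false;
    return ::truncate(path.c_str(), static_cast<off_t>(offset)) == 0;
}
```

Only the tail of the file is read, so the cost depends on the length of the last line, not on the size of the file. Reading backwards one byte at a time keeps the sketch short; reading in larger chunks would be faster. Where C++17 is available, `std::filesystem::resize_file` is a portable alternative to `truncate()`.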
Similar questions: https://stackoverflow.com/a/873653/2741329 and https://stackoverflow.com/a/15154682/2741329

getline() text with UNIX formatting characters

I am writing a C++ program which reads lines of text from a .txt file. Unfortunately the text file is generated by a twenty-something year old UNIX program and it contains a lot of bizarre formatting characters.
The first few lines of the file are plain, English text and these are read with no problems. However, whenever a line contains one or more of these strange characters mixed in with the text, that entire line is read as characters and the data is lost.
The really confusing part is that if I manually delete the first couple of lines so that the very first character in the file is one of these unusual characters, then everything in the file is read perfectly. The unusual characters obviously just display as little ascii squiggles -arrows, smiley faces etc, which is fine. It seems as though a decision is being made automatically, without my knowledge or consent, based on the first line read.
Based on some googling, I suspected that the issue might be with the locale, but according to the visual studio debugger, the locale property of the ifstream object is "C" in both scenarios.
The code which reads the data is as follows:
//Function to open file at location specified by inFilePath, load and process data
int OpenFile(const char* inFilePath)
{
    string line;
    ifstream codeFile;
    //open text file
    codeFile.open(inFilePath,ios::in);
    //read file line by line
    while ( codeFile.good() )
    {
        getline(codeFile,line);
        //check non-zero length
        if (line != "")
            ProcessLine(&line[0]);
    }
    //close line
    codeFile.close();
    return 1;
}
If anyone has any suggestions as to what might be going on or how to fix it, they would be very welcome.
From reading about your issues it sounds like you are reading in binary data, which will cause getline() to throw out content or simply skip over the line.
You have a couple of choices:
1. If you simply need lines from the data file you can first sanitise them by removing all non-printable characters (that is the "official" name for those weird ASCII characters). On UNIX a tool such as strings would help you with that process.
You can of course also do this programmatically in your code by simply reading in X amount of data, storing it in a string, and then removing those characters that fall outside of the standard ASCII character range. This will most likely cause you to lose any Unicode that may be stored in the file.
2. You change your program to understand the format and basically write a parser that allows you to parse the document in a more sane way.
If you can, I would suggest trying solution number 1, simply to see if the results are sane and can still be used. You mention that this is medical data; do you by any chance know what file format this is? If you are trying to find out and have access to a UNIX/Linux machine you can use the utility file and maybe it can give you a clue (worst case it will tell you it is simply data).
If possible, try getting a "clean" file that you can post the hex dump of so that we can try to provide better help than what we are currently providing. By clean I mean that there is no personally identifying information in the file.
For number 2, open the file in binary mode. You mentioned using Windows, where std::fstream objects handle binary and non-binary files differently; on UNIX systems this is not the case (on most systems, I'm sure I'll get a comment regarding the one system that doesn't match this description).
codeFile.open(inFilePath,ios::in);
would become
codeFile.open(inFilePath, ios::in | ios::binary);
Instead of getline() you will want to become intimately familiar with .read() which will allow unformatted operations on the ifstream.
Reading will be like this:
// This code has not been tested!
char input[1024];
codeFile.read(input, 1024);
int actual_read = codeFile.gcount();
// Here you can process input, up to a maximum of actual_read characters.
//ProcessLine() // We didn't necessarily read a line!
ProcessData(input, actual_read);
The other thing, as mentioned, is that you can change the locale for the current stream and change the separator it considers a new line. Maybe this will fix your issue without requiring the unformatted operations:
imbue the stream with a new locale that only knows about the newline. This method may or may not let your getline() work without issues.
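The programmatic clean-up described under the first choice can be sketched like this. The function names are hypothetical, the file is read in binary mode as suggested above, and "printable" is taken to mean what `std::isprint` accepts in the default "C" locale, plus newline and tab. As noted, this drops any non-ASCII text as well.

```cpp
#include <algorithm>
#include <cctype>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: removes every byte that is not printable ASCII,
// keeping '\n' and '\t' so the line structure survives.
std::string strip_non_printable(std::string data)
{
    data.erase(std::remove_if(data.begin(), data.end(),
                              [](unsigned char ch) {
                                  return !(std::isprint(ch) ||
                                           ch == '\n' || ch == '\t');
                              }),
               data.end());
    return data;
}

// Hypothetical helper: reads a whole file in binary mode and sanitises it.
// Returns an empty string if the file cannot be opened.
std::string read_sanitised(const std::string& path)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::string raw((std::istreambuf_iterator<char>(in)),
                    std::istreambuf_iterator<char>());
    return strip_non_printable(raw);
}
```

The cleaned string can then be split into lines, for example with a `std::istringstream` and `getline()`. Carriage returns are removed too, so CRLF line endings come out as plain `\n`.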

QTextStream Manipulation

I'm opening a file and getting a QTextStream of it. I am then reading the stream line by line using readLine(). When the line matches a certain string, I need to replace it with another string. I need the behaviour to be that the line is completely replaced (ie, if the line was "longword" and I replace it with "word", the line should contain "word" and "word" only).
At the moment I am using seek() and then the << operator to put my string in at the given location, but the remnants of the last string remain, so I am left with something like "wordword". How can I prevent this from happening and ensure the entire previous line is fully replaced with my new one?
To my knowledge, you cannot simply remove a chunk of a text file in-place. If the replacement string was identical in size, you might be able to replace those exact bytes, and if it were shorter you might be able to hack around the problem by filling the empty space with nulls.
If you didn't want to do that, you would have to create a new file, read each line from the old file, make any required changes to that line in memory, then write that line out to the new file. Once this is complete, you could then replace the original file with the new file.
If it were possible to add/remove chunks to/from the file, you would most likely be left with a considerably fragmented file on the HDD. If you needed to insert more characters, extra fragments would have to be created as the new data simply couldn't fit in the amount of space occupied by the old data, and removing data would leave holes in the file.

Seeking to a line in a file in g++

Is there a way that I can seek to a certain line in a file to read or write data?
Let's say I want to write some data starting on the 10th line in a text file. There might be some data already in the first few lines, or the file could even be empty. Is there a way I can seek directly to the line I want without having to worry about what's already in the file?
Only if the lines are all the same length (seek to 9 * bytes_per_line). Otherwise, you'll just have to scan your way to the appropriate spot in the file.
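For the fixed-length case, the seek is a single multiplication. A minimal sketch, assuming every line occupies exactly `bytes_per_line` bytes including its trailing `\n`; the function name and parameters are made up for illustration:

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: reads line `line_no` (1-based) from a file whose
// lines all have the same byte length. Returns false if the line is missing.
bool read_fixed_line(const std::string& path, std::size_t line_no,
                     std::size_t bytes_per_line, std::string& out)
{
    if (line_no == 0 || bytes_per_line == 0)
        return false;
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return false;

    // Line 10 starts at byte 9 * bytes_per_line.
    const std::streamoff offset =
        static_cast<std::streamoff>((line_no - 1) * bytes_per_line);
    in.seekg(offset, std::ios::beg);
    return static_cast<bool>(std::getline(in, out));
}
```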
Also be wary of writing into the middle of a file. It may not do what you expect (insert new lines). It will simply overwrite whatever content is already there, and won't respect existing line boundaries.
You can seek to a position in a file, but that position must be a character offset from the start, end or current position - see for example fseek(). There is no way of seeking to a particular line, unless all the lines are exactly the same length.
No, you have to process the data to find the line delimiters (unless you have fixed length lines). Have a look at getline(), ftell() and fseek(). http://www.pixelbeat.org/programming/readline/cpp.cpp
The easiest way is to read the file into memory, for instance storing each line in a vector of strings, then modify or add whatever you want, and write each line back out to a new file (supposing the file fits in memory).