c++ overwriting file data? - c++

I am trying to write a program that replaces certain data within a file. The relevant parts of the file look like this:
1 Information 15e+10
2 Information 2e+16
3 Information 6e+2
And so on.
The files in question can be very large (multiple gigabytes), so to my understanding, buffering the whole file in memory and rewriting it is impossible or at least unreasonable. That is fine; I just want to replace the values (e.g. the 15e+10).
This all works fine with simple ios::in|ios::out and tellp() if I am replacing the value with one of the same length (15e+10->12e+12), or even a shorter one, as I can simply add an extra space which can be ignored down the line (e.g. 15e+10->4e+10 ). But I run into a problem if the replacement is longer than the value already in the file (e.g. 6e+2->16e+10): it will write over the newline character or start writing over the information in the next line.
I have searched the forums and everyone says you can either overwrite in the file, append to the end of the file, or buffer and recreate the whole file. Is there any way I can overwrite the value correctly without having to recreate the file?
If not, how can I have two files open (one input, one output) to do this if the files in question are too large for memory?
Note: I would also like to avoid using boost:: as I need to be able to run this on a system without the boost library.

Open a stream to read from the input (IN) file and a second stream (OUT) to write to a new output (tmp) file.
Read from IN and write to OUT. When you get a value from IN that you want to replace write the replacement to OUT instead of the value you got from IN.
When parsing is complete replace the first file with the second (tmp) file.
Would this work for you?
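The steps above can be sketched as follows. This is only a minimal illustration, assuming each line has the form "id label value" as in the question; the function name replaceValue and the ".tmp" suffix are made up for the example. It reads one line at a time, so memory use does not depend on the file size.

```cpp
#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>

// Copies `path` to a temp file line by line, swapping in `newValue` on the
// line whose first field equals `id`, then replaces the original file.
// Returns false if a file could not be opened, written, or renamed.
bool replaceValue(const std::string& path, const std::string& id,
                  const std::string& newValue) {
    const std::string tmpPath = path + ".tmp";  // hypothetical temp file name
    {
        std::ifstream in(path);
        std::ofstream out(tmpPath, std::ios::trunc);
        if (!in || !out) return false;

        std::string line;
        while (std::getline(in, line)) {
            std::istringstream fields(line);
            std::string lineId, label, value;
            if (fields >> lineId >> label >> value && lineId == id) {
                out << lineId << ' ' << label << ' ' << newValue << '\n';
            } else {
                out << line << '\n';  // copy every other line unchanged
            }
        }
        out.flush();
        if (!out) return false;
    }  // both streams are closed here, before the files are swapped

    // Remove first because std::rename may fail on some platforms
    // when the target already exists.
    std::remove(path.c_str());
    return std::rename(tmpPath.c_str(), path.c_str()) == 0;
}
```

Calling replaceValue("data.txt", "3", "16e+10") would turn "3 Information 6e+2" into "3 Information 16e+10". Note that the sketch always ends the last line with a newline and needs free disk space for a second copy of the file.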

Use lseek()/fseek() to jump to a given position in a file.

You can use seekp to go to the location and overwrite the data there with <<
Example:
example.txt ( |?| = 1 byte of data )
|A|B|C|\n|1|2|3|D|E|F|\n|4|5|6|
//Somewhere in the code
std::fstream file;
file.open("example.txt", std::ios::in | std::ios::out);
//Somehow find the character distance and store it in "distance"
file.seekp(distance); //If distance = 0, it will go to "A", like rewind() but easier for me
If the distance is 4, the next character to be overwritten is 1:
file << "987";
And the file will be
|A|B|C|\n|9|8|7|D|E|F|\n|4|5|6|
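BUT the only problem here is when you need to increase/decrease the size:
Increase:
The new data will overwrite the characters that follow it, so you need a temporary string to hold the rest of the data first (or handle it in smaller chunks if the data is too large), like
|A|B|C|\n|9|8|7|D|E|F|\n|4|5|6|
std::string tempstring;
file.seekg(distance);
//Read everything up to the end of the file (needs <iterator>); file >> tempstring would stop at the first whitespace
tempstring.assign(std::istreambuf_iterator<char>(file), std::istreambuf_iterator<char>());
file.seekp(distance);
file << content << tempstring; //content is the data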
Decrease:
The easiest solution is to write null characters (\0) over the excess space (or spaces, if whatever reads the file later skips whitespace), like
|A|B|C|\n|1|\0|\0|D|E|F|\n|4|5|6|
The only side effect is that the file size stays the same as before
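As a rough sketch of the same-size and "decrease" cases, the helper below overwrites a fixed-width slot and pads the remainder with spaces, as the question already does. The name overwriteField and its parameters are assumptions for illustration; the caller is expected to know the byte offset and width of the old value.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Overwrites `width` bytes starting at byte `offset` with `value`,
// padded with spaces. Refuses to write if `value` is wider than the slot,
// because that would clobber the bytes that follow.
bool overwriteField(const std::string& path, std::streamoff offset,
                    std::size_t width, const std::string& value) {
    if (value.size() > width) return false;

    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file) return false;

    std::string padded = value;
    padded.resize(width, ' ');  // pad to exactly the old width

    file.seekp(offset);
    file.write(padded.data(), static_cast<std::streamsize>(padded.size()));
    return static_cast<bool>(file);
}
```

With the example file above, overwriteField("example.txt", 4, 3, "9") would leave |9| | | in place of |1|2|3|, and a four-character value would be rejected instead of overwriting D.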

Related

Reading a line of a text file from a specific position in C++

I would like to read a text file in C++ in following manner:
Ignore the entire first line as it is simply meant as an introduction.
Only read the following lines from a specific position.
That starting position for reading is a fixed one and remains the same for every line; however, the numbers after that may be of variable length. I need to save all of these numbers from line 2 to line n into an Array.
At the moment I can read a regular 2D Array with getline.
How can I work around these things?
An example for a line I want to read could be:
Person1: 25 988.3 0.0023 7
To set the file to a position, use std::ifstream::seekg().
To set the file to the beginning of a line, you must read and count the line endings. Many text files have variable length text lines.
How can I work around these things?
You can't, unless you can ensure that all of the data lines after the first line are all the same length.
If you can't ensure that, then all you can do is read through all of the preceding lines.
An alternative I have employed in the past is to generate an 'index' of line start positions in a secondary file in binary format (so that I CAN jump directly to the right place in that file), and use that to jump to the right place in the text file. Of course that means that you need to regenerate that index file every time you replace/amend the data file.
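A minimal sketch of such an index, kept in memory here rather than in a secondary binary file to keep the example short. The names buildLineIndex and readLineAt are made up for illustration. The file is opened in binary mode so that the reported offsets are plain byte positions; with Windows line endings each returned line would still end in '\r'.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Scans the file once and records the byte offset where each line starts.
std::vector<std::streamoff> buildLineIndex(const std::string& path) {
    std::vector<std::streamoff> starts;
    std::ifstream in(path, std::ios::binary);
    std::string line;
    while (true) {
        std::streamoff pos = in.tellg();  // offset before reading the line
        if (!std::getline(in, line)) break;
        starts.push_back(pos);
    }
    return starts;
}

// Jumps straight to line `n` (0-based) using the index and returns it.
// Returns an empty string if `n` is out of range or the file cannot be read.
std::string readLineAt(const std::string& path,
                       const std::vector<std::streamoff>& index,
                       std::size_t n) {
    std::string line;
    if (n >= index.size()) return line;
    std::ifstream in(path, std::ios::binary);
    in.seekg(index[n]);
    std::getline(in, line);
    return line;
}
```

As the answer notes, the index has to be rebuilt whenever the data file changes.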

How do I write to a specific line of a text file?

myfile<<hashdugumu[key].numara;
I have this piece of code. For example, I would like to write to the eighth line. How do I do that in C++? Thanks in advance.
If the line you want to write is exactly the same length as the existing one (in bytes, not in characters; remember that some encodings, such as UTF-8, are variable length), then it's very easy: just skip over the first seven lines and then write the line.
There is a caveat with this though: on a combined input/output file stream you cannot simply read and then start writing. A std::fstream keeps a single file position shared by reads and writes, but after reading you must reposition the stream before you write, otherwise the write may not land where you expect. To solve this, get the read position with tellg() once you have skipped the lines, and set the write position to the same value with seekp().
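The read-then-reposition-then-write sequence might look like this. It is a sketch only: overwriteLine is a hypothetical helper, lines are counted from 1, and it deliberately refuses to write when the new text has a different byte length than the existing line.

```cpp
#include <fstream>
#include <string>

// Overwrites line `lineNo` (1-based) in place, but only when `text`
// has exactly the same length in bytes as the existing line.
bool overwriteLine(const std::string& path, int lineNo,
                   const std::string& text) {
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file) return false;

    std::string line;
    for (int i = 1; i < lineNo; ++i) {
        if (!std::getline(file, line)) return false;  // file is too short
    }

    std::streampos start = file.tellg();  // where the target line begins
    if (!std::getline(file, line) || line.size() != text.size()) return false;

    file.clear();       // in case reading the line reached end of file
    file.seekp(start);  // reposition before switching from reading to writing
    file.write(text.data(), static_cast<std::streamsize>(text.size()));
    return static_cast<bool>(file);
}
```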
As an alternative, or if the data you want to write is not the same size as the existing data, then you have to use a temporary "buffer", be it another file or an actual in-memory buffer.
If the file is not big you can use an in-memory buffer, for example using a std::vector for the lines. Read each line into the vector, and then modify the lines (elements in the vector) that you want to modify. Finally reopen the file for writing, truncating it, and then just write each "line" to the file.
There is a slight problem with the above though when it comes to rewriting the data: if the file is truncated and an error then occurs while writing, you can lose data. This can be solved by using a temporary file.
With a temporary file it's easier not to bother with the in-memory buffer, and instead read from the original file and write directly to the temporary file. You know when to write something else by keeping track of the current line number, which is easy if you read one line at a time. In your example you read the first seven lines from the original file and write them to the temporary file; after the seventh line you write your special eighth line while skipping the eighth line of the original file, and then just continue reading/writing the remaining lines. When done, close the files and rename the temporary file to the original file's name.
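A possible shape for the temporary-file version, tracking the current line number while copying. The function name replaceLine and the ".tmp" suffix are assumptions made for this sketch, and error handling is kept to a minimum.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Copies `path` to a temp file, writing `text` instead of line `lineNo`
// (1-based), then renames the temp file over the original.
// Leaves the original untouched if the line does not exist.
bool replaceLine(const std::string& path, int lineNo,
                 const std::string& text) {
    const std::string tmpPath = path + ".tmp";  // hypothetical temp file name
    bool replaced = false;
    {
        std::ifstream in(path);
        std::ofstream out(tmpPath, std::ios::trunc);
        if (!in || !out) return false;

        std::string line;
        for (int current = 1; std::getline(in, line); ++current) {
            if (current == lineNo) {
                out << text << '\n';  // the new line replaces the old one
                replaced = true;
            } else {
                out << line << '\n';
            }
        }
        out.flush();
        if (!out) replaced = false;
    }  // streams are closed before the files are swapped

    if (!replaced) {
        std::remove(tmpPath.c_str());
        return false;
    }
    std::remove(path.c_str());  // std::rename may not overwrite everywhere
    return std::rename(tmpPath.c_str(), path.c_str()) == 0;
}
```

Unlike the in-place approach, the new line can be any length here.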

index a text file (lines with different size) in c++

I have to extract information from a text file.
In the text file there is a list of strings.
This is an example of a string: AAA101;2015-01-01 00:00:00;0.784
The value after the last ; is a non-integer value which changes from line to line, so every line has a different length in characters.
I want to map all of these lines into a structured vector so that I can access a specific line any time I need without scanning the whole file again.
I did some research and found some threads about a command which lets me reach a specific line of a text file, but I read that it only works if every line has the same length in characters.
I was thinking about converting all the lines in the file to a suitable format so that I can map the file as I want, but I hope there is a better and quicker way.
You can try TStringList*. It creates a list of AnsiStrings. Then each AnsiString can be accessed via ->operator [](numberOfTheLine).
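If TStringList is not available, a similar result can be had with the standard library alone. The sketch below assumes every line has exactly three ';'-separated fields as in the example; the struct Record, its field names, and loadRecords are invented for illustration. Malformed lines are skipped.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// One parsed line, e.g. "AAA101;2015-01-01 00:00:00;0.784".
struct Record {
    std::string id;
    std::string timestamp;
    double value = 0.0;
};

// Reads the whole file once and returns one Record per well-formed line.
std::vector<Record> loadRecords(const std::string& path) {
    std::vector<Record> records;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        Record r;
        if (std::getline(fields, r.id, ';') &&
            std::getline(fields, r.timestamp, ';') &&
            (fields >> r.value)) {
            records.push_back(r);
        }
    }
    return records;
}
```

After loading, records[n] gives direct access to line n without touching the file again. This holds the entire file in memory, so it only suits files that fit there.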

QTextStream Manipulation

I'm opening a file and getting a QTextStream of it. I am then reading the stream line by line using readLine(). When the line matches a certain string, I need to replace it with another string. I need the behaviour to be that the line is completely replaced (ie, if the line was "longword" and I replace it with "word", the line should contain "word" and "word" only).
At the moment I am using seek() and then the << operator to put my string in at the given location, but the remnants of the last string remain, so I am left with something like "wordword". How can I prevent this from happening and ensure the entire previous line is fully replaced with my new one?
To my knowledge, you cannot simply remove a chunk of a text file in-place. If the replacement string was identical in size, you might be able to replace those exact bytes, and if it were shorter you might be able to hack around the problem by filling the empty space with nulls.
If you didn't want to do that, you would have to create a new file, read each line from the old file, make any required changes to that line in memory, then write that line out to the new file. Once this is complete, you could then replace the original file with the new file.
If it were possible to add/remove chunks to/from the file, you would most likely be left with a considerably fragmented file on the HDD. If you needed to insert more characters, extra fragments would have to be created as the new data simply couldn't fit in the amount of space occupied by the old data, and removing data would leave holes in the file.

How do I insert data into a pre-allocated CSV?

Text file (or CSV) is:
Data:,,,,,\n
(but with 100 ","s)
In C or C++ I would like to open the file and then fill in values between the ",".
i.e.- Data:,1,2,3,4,\n
I'm guessing that I need some sort of search to find the next comma, insert data, find the next comma insert, etc.
I was looking at memchr() for a buffer and was wondering if there is something similar for a text file?
If you could point me in the right direction, I would appreciate it.
(I don't mind reading a book to find something out like this either, I just don't know what book would have this information?)
Thank You.
You can't actually do that in C... if you open in read/write mode you'll overwrite characters, not insert them.
http://c-faq.com/stdio/fupdate.html
You need to open the file, read the line into memory, write the new line to a temp file.
After you're done inserting all the lines, copy the temp file over the original file. I don't think there's any other way to do it.
(This is for the C++ case)
Just parse the data into a linked list of objects that hold the data, modify the data, and overwrite the file.
You first need to split your data into lines(\n creates a new linked-list Element):
Data:,,,,,\n
Data2:,,,,,\n
will get the strings (pseudolist):
["Data:,,,,,", "Data2:,,,,,"]
So now you need to define your Object for each Line like:
class LineStruct {
public:
string head;
LinkedList<string> data;
};
and fill it.
Then you edit the data-structure and after that you write it back to disk.
If you have
Data:,,,,,\n
then there is no space between the commas to fill, so you have to write out brand new lines.
However if you had
Data: , , , , , \n
then you could overwrite just those parts represented by ' '
In C you would seek to that part of the file, write, and then seek to the next position. Sorry, no code off the top of my head.
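In C++ the same seek-and-write idea could be sketched as below, assuming the first line of the file has padding after each comma as in the "Data: , , , , , " example. fillSlots is a hypothetical helper; it writes a value only if it fits in the existing gap, so the file size never changes.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Fills the padded gap after each comma on the first line, in place.
// values[i] goes after the i-th comma; values that do not fit are skipped.
// Returns the number of values actually written.
std::size_t fillSlots(const std::string& path,
                      const std::vector<std::string>& values) {
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    std::string line;
    if (!file || !std::getline(file, line)) return 0;
    file.clear();  // in case the read reached end of file

    std::size_t written = 0;
    std::size_t comma = line.find(',');
    for (std::size_t i = 0;
         i < values.size() && comma != std::string::npos; ++i) {
        std::size_t next = line.find(',', comma + 1);
        std::size_t end = (next == std::string::npos) ? line.size() : next;
        std::size_t width = end - comma - 1;  // bytes available in this gap

        if (values[i].size() <= width) {
            std::string padded = values[i];
            padded.resize(width, ' ');  // keep the gap's original width
            file.seekp(static_cast<std::streamoff>(comma + 1));
            file.write(padded.data(),
                       static_cast<std::streamsize>(padded.size()));
            ++written;
        }
        comma = next;
    }
    return written;
}
```

With the padded line above and the values 1, 2, 3, 4, the first line would become "Data: ,1,2,3,4, ".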
This is where I would look:
std::getline for reading lines into std::strings
std::string::find_first_of for finding the commas
std::stringstream for building the new output-line
As suggested by wmil's answer, you will have to either use a temporary file, or hold all the new lines in memory until all lines are processed, and then overwrite the original file.
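Putting those pieces together for a single line could look roughly like this. fillLine is a made-up name; it only builds the new line in memory, and writing it out to a temporary file or back over the original is left to the caller.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns a copy of `line` with values[i] inserted right after the i-th comma.
// Extra values are ignored if the line has fewer commas than values.
std::string fillLine(const std::string& line,
                     const std::vector<std::string>& values) {
    std::ostringstream out;
    std::size_t start = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        std::size_t comma = line.find_first_of(',', start);
        if (comma == std::string::npos) break;
        // Copy everything up to and including the comma, then the value.
        out << line.substr(start, comma - start + 1) << values[i];
        start = comma + 1;
    }
    out << line.substr(start);  // whatever is left after the last insertion
    return out.str();
}
```

For example, fillLine("Data:,,,,,", {"1", "2", "3", "4"}) yields "Data:,1,2,3,4,", matching the target line in the question.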