Error copying and pasting data from a file to another - c++

I am writing code to merge multiple text files into a single output file.
There can be up to 22 input text files, each containing 1400 lines.
Each line has 8 binary digits followed by the newline character \n.
I am outputting a single file that has all 22 text files merged.
The problem is with my output file: after 1400 lines, content from the previous file still appears to be written to the output file (although the previous file was only 1400 lines long). This extra content also has an additional blank line between each row when opened in Microsoft Office or Sublime, but it is interpreted as a single line when opened in Notepad or Excel (a single cell in Excel).
Following is the picture of expected behaviour of the output file,
Here is a picture of abnormal behaviour. This starts when the first file finishes.
I know this data is from the first file still because the second file starts from 00000000
And here is the start of the second file,
And this abnormal behavior repeats every single time the files are switching.
My implementation to achieve this is as follows:
repeat:
if(user_input == 'y')
{
    fstream data_out ("data.txt",fstream::out);
    for(int i = 0; i<files_found; i++)
    {
        fstream data_in ((file_names[i].c_str()),fstream::in);
        if(data_in.is_open())
        {
            data_in.seekg(0,data_in.end);
            long size = data_in.tellg();
            data_in.seekg(0,data_in.beg);
            char * buffer = new char[size];
            cout << size;
            data_in.read(buffer,size);
            data_out.write(buffer,size);
            delete[] buffer;
        }else
        {
            cout << "Unexpected error";
            return 1;
        }
        data_in.close();
    }
    data_out.close();
}else if(user_input == 'n')
{
    return 1;
}else
{
    cout << "Input not recognised. Type y for Yes, and n for No";
    cin >> user_input;
    goto repeat;
}
Further information:
I have checked the size variable and it is as I expect, 14000.
8 bits, and a \ with n = 10 characters per line,
1400 rows x 10 = 14000.
Assuming reader of code to be experienced.

Sorry to bump this question, but I really like questions that are marked as answered. Joachim Pileborg's comment seems to have worked for you:
Also, instead of seeking and checking sizes and allocating memory, why
not just do e.g. data_out << data_in.rdbuf();? This will copy the
whole input file to the output. – Joachim Pileborg Jul 29 at 17:26
A reference http://www.cplusplus.com/reference/ios/ios/rdbuf/ and an example:
#include <fstream>
#include <string>
#include <vector>
int main(int argc, char** argv)
{
    typedef std::vector<std::string> Filenames;
    Filenames vecFilenames;
    // Populate the list of file names
    vecFilenames.push_back("Text1.txt");
    vecFilenames.push_back("Text2.txt");
    vecFilenames.push_back("Text3.txt");
    // Merge the files into Output.txt
    std::ofstream fpOutput("Output.txt");
    for (Filenames::iterator it = vecFilenames.begin();
         it != vecFilenames.end(); ++it)
    {
        std::ifstream fpInput(it->c_str());
        fpOutput << fpInput.rdbuf();
        fpInput.close();
    }
    fpOutput.close();
    return 0;
}
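A plausible cause of the stray data in the question's code, assuming it runs on Windows and the input files use \r\n line endings: tellg() at the end of the file reports the on-disk size (10 bytes x 1400 = 14000), but read() on a text-mode stream translates each \r\n into \n and so stores fewer characters than size. write(buffer, size) then also emits the tail of the buffer that was never filled. The sketch below shows a read/write loop that avoids this by opening the input in binary mode and writing only gcount() characters. The function name appendFile and the 4096-byte block size are illustrative choices, not part of the original posts.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Appends the raw bytes of the file at `inputPath` to `out`.
// Returns false if the input file cannot be opened.
bool appendFile(const std::string& inputPath, std::ofstream& out)
{
    // Binary mode: no \r\n translation, so byte counts match the file size.
    std::ifstream in(inputPath.c_str(), std::ios::in | std::ios::binary);
    if (!in.is_open())
        return false;

    std::vector<char> buffer(4096);
    // The last read() may store fewer characters than requested;
    // gcount() reports how many were actually stored.
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()))
           || in.gcount() > 0)
    {
        out.write(buffer.data(), in.gcount());
    }
    return true;
}
```

For the copy to be byte-exact, the output stream should be opened with std::ios::binary as well; data_out << data_in.rdbuf() remains the shorter option.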

Related

Edit file line by line number - C++

I'm trying to edit a .dat file. I want to read a line by its line number, convert the content to an int, edit it, and write it back.
For example, I want to edit line number 23: it says "45" and I need to make it "46". How do I do that?
ofstream f2;
theBook b;
f2.open("/Users/vahidgr/Documents/Files/UUT/ComputerProjects/LibraryCpp/LibraryFiles/Books.dat", ios::app);
ifstream file("/Users/vahidgr/Documents/Files/UUT/ComputerProjects/LibraryCpp/LibraryFiles/Books.dat");
cout<<"In this section you can add books."<<endl;
cout<<"Enter ID: "; cin>>b.id;
cout<<"Enter Name: "; cin>>b.name;
string sID = to_string(b.id);
string bookName = b.name;
string line;
int lineNumber = 0;
while(getline(file, line)) {
++lineNumber ;
if(line.find(bookName) != string::npos && line.find(sID) != string::npos) {
int countLineNumber = lineNumber + 4;
registered = true;
f2.close();
break;
}
}
Inside the file:
10000, book {
author
1990
20
20
}
If your file is small (such as under 1GB), you can just read the entire file into memory line-by-line as a std::vector<std::string> (Hint: use std::getline). Then, edit the required line, and overwrite the file with an updated one.
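A minimal sketch of that in-memory approach, assuming the target line holds nothing but an integer. The function name bumpLine and its parameters are hypothetical, introduced only for illustration.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Adds `delta` to the integer stored on line `lineNumber` (1-based) of `path`.
// Returns false if the file cannot be opened or the line does not exist.
bool bumpLine(const std::string& path, std::size_t lineNumber, int delta)
{
    std::ifstream in(path.c_str());
    if (!in.is_open())
        return false;

    // Read the whole file into memory, one string per line.
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))
        lines.push_back(line);
    in.close();

    if (lineNumber == 0 || lineNumber > lines.size())
        return false;

    // Assumption: the line starts with a number. std::stoi throws otherwise.
    int value = std::stoi(lines[lineNumber - 1]);
    lines[lineNumber - 1] = std::to_string(value + delta);

    // Overwrite the file with the updated lines.
    std::ofstream out(path.c_str(), std::ios::trunc);
    for (std::size_t i = 0; i < lines.size(); ++i)
        out << lines[i] << '\n';
    return true;
}
```

Calling bumpLine(path, 23, 1) would turn a line 23 of "45" into "46". Note that any leading whitespace on the edited line is dropped when it is rewritten.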
Iterate byte by byte through the file and count line breaks (\n, or \r\n on Windows).
After 22 breaks, write the bytes "46". Because "46" is the same length as "45", it overwrites the existing bytes in place.
If your modifications are the exact size of the original text, you can write back to the same file. Otherwise, you will need to write your modifications to a new file.
Since your file is variable length text, separated by newlines, we'll have to skip lines until we get to the desired line:
const unsigned int desired_line = 23;
std::ifstream original_file(/*...*/);
std::ofstream modified_file(/*...*/);
// Skip lines
std::string text_line;
for (unsigned int i = 0; i < desired_line - 1; ++i)
{
    std::getline(original_file, text_line);
    modified_file << text_line << std::endl;
}
// Next, read the text, modify it, and write it to the modified file
//... (left as an exercise for the OP, since this was not explicit in the post)
// Write remaining text lines to modified file
while (std::getline(original_file, text_line))
{
    modified_file << text_line << std::endl;
}
Remember to write your modified text to the modified file before copying the remaining text.
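One way the whole copy-with-modification could look, as a sketch rather than the answerer's own code: a single loop copies every line and substitutes the replacement text at the desired line. The function name copyWithReplacedLine and its parameters are hypothetical.

```cpp
#include <fstream>
#include <string>

// Copies `inputPath` to `outputPath`, writing `replacement` in place of
// line `desiredLine` (1-based). Returns true if that line was found.
bool copyWithReplacedLine(const std::string& inputPath,
                          const std::string& outputPath,
                          unsigned int desiredLine,
                          const std::string& replacement)
{
    std::ifstream original(inputPath.c_str());
    std::ofstream modified(outputPath.c_str());
    if (!original.is_open() || !modified.is_open())
        return false;

    bool replaced = false;
    unsigned int current = 0;
    std::string textLine;
    while (std::getline(original, textLine))
    {
        ++current;
        if (current == desiredLine)
        {
            // The modified text goes out before the remaining lines.
            modified << replacement << '\n';
            replaced = true;
        }
        else
        {
            modified << textLine << '\n';
        }
    }
    return replaced;
}
```

After the copy succeeds, the new file can replace the original, for example with std::rename from <cstdio>.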
Edit 1: By record / object
This looks like an X-Y problem.
A preferred method is to read in the objects, modify the object, then write the objects to a new file.

C++ copying parts of a file to a new file

I am trying to create a new file with data from two different existing files. I need to copy the first existing file in its entirety, which I have done successfully. For the second existing file I need to copy just the last two columns and append them to the end of each row copied from the first file.
Ex:
Info from first file already copied into my new file:
20424297 1092 CSCI 13500 B 3
20424297 1092 CSCI 13600 A- 3.7
Now I need to copy the last two columns of each line in this file and then append them to the appropriate row in the file above:
17 250 3.00 RNL
17 381 3.00 RLA
i.e. I need "3.00" and "RNL" appended to the end of the first row, "3.00" and "RLA" appended to the end of the second row, etc.
This is what I have so far:
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <cstdlib>
using namespace std;
int main() {
//Creates new file and StudentData.tsv
ofstream myFile;
ifstream studentData;
ifstream hunterCourseData;
//StudentData.tsv is opened and checked to make sure it didn't fail
studentData.open("StudentData.tsv");
if(studentData.fail()){
cout << "Student data file failed to open" << endl;
exit(1);
}
//My new file is opened and checked to make sure it didn't fail
myFile.open("file.txt");
if(myFile.fail()){
cout << "MyFile file failed to open" << endl;
exit(1);
}
//HunterCourse file is opened and checked to make sure it didn't fail
hunterCourseData.open("HunterCourse.tsv");
if(hunterCourseData.fail()){
cout << "Hunter data file failed to open" << endl;
exit(1);
}
// Copies data from StudentData.tsv to myFile
char next = '\0';
int n = 1;
while(! studentData.eof()){
myFile << next;
if(next == '\n'){
n++;
myFile << n << ' ';
}
studentData.get(next);
}
return 0;
}
I am going bananas trying to figure this out. I'm sure it's a simple fix but I can't find anything online that works. I've looked into using ostream and a while loop to assign each row into a variable but I can't get that to work.
Another approach that has crossed my mind is just to remove all integers from the second file because I only need the last two columns and neither of those columns include integers.
If you take a look at the seekg method of a file stream, you'll note that the second overload lets you specify the location the offset is measured from (such as ios_base::end, which measures the offset from the end of the file). With this you can effectively read backwards from the end of a file.
Consider the following
int Pos=0;
while(hunterCourseData.peek()!= '\n')
{
    Pos--;
    hunterCourseData.seekg(Pos, ios_base::end);
}
//this line will execute when you have found the first newline-character from the end of the file.
Much better code is available at this Very Similar question
Another possibility is simply to find out how many lines are in the file beforehand (slower, but workable). In this case one would loop through the file calling getline and incrementing a count variable, reset to the start, then repeat until reaching count - 2. I wouldn't use this technique myself, though.
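For the task in the question itself (appending the last two columns of each row), a line-by-line approach may be simpler than seeking. The sketch below is not taken from the answers above: it assumes both files have the same number of rows and whitespace-separated columns, and the names lastTwoColumns and mergeColumns are hypothetical.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Returns the last two whitespace-separated fields of `row`, joined by a
// space, or an empty string if the row has fewer than two fields.
std::string lastTwoColumns(const std::string& row)
{
    std::istringstream fields(row);
    std::vector<std::string> tokens;
    std::string token;
    while (fields >> token)
        tokens.push_back(token);
    if (tokens.size() < 2)
        return "";
    return tokens[tokens.size() - 2] + " " + tokens[tokens.size() - 1];
}

// Writes each row of `first` followed by the last two columns of the
// corresponding row of `second`. Stops as soon as either input runs out.
void mergeColumns(std::istream& first, std::istream& second, std::ostream& out)
{
    std::string rowA, rowB;
    while (std::getline(first, rowA) && std::getline(second, rowB))
        out << rowA << ' ' << lastTwoColumns(rowB) << '\n';
}
```

With the question's streams this would be called as mergeColumns(studentData, hunterCourseData, myFile). The separator written here is a single space; a tab could be used instead if the output should stay a .tsv file.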

Using seekg() in text mode

While trying to read in a simple ANSI-encoded text file in text mode (Windows), I came across some strange behaviour with seekg() and tellg(). Whenever I called tellg(), saved its value (as pos_type), and later sought back to it, I would always wind up further ahead in the stream than where I left off.
Eventually I did a sanity check; even if I just do this...
int main()
{
    std::ifstream dataFile("myfile.txt",
                           std::ifstream::in);
    if (dataFile.is_open() && !dataFile.fail())
    {
        while (dataFile.good())
        {
            std::string line;
            dataFile.seekg(dataFile.tellg());
            std::getline(dataFile, line);
        }
    }
}
...then eventually, further into the file, lines are half cut-off. Why exactly is this happening?
This issue is caused by libstdc++ using the difference between the current remaining buffer with lseek64 to determine the current offset.
The buffer is set using the return value of read, which for a text mode file on windows returns the number of bytes that have been put into the buffer after endline conversion (i.e. the 2 byte \r\n endline is converted to \n, windows also seems to append a spurious newline to the end of the file).
lseek64 however (which with mingw results in a call to _lseeki64) returns the current absolute file position, and once the two values are subtracted you end up with an offset that is off by 1 for each remaining newline in the text file (+1 for the extra newline).
The following code should display the issue, you can even use a file with a single character and no newlines due to the extra newline inserted by windows.
#include <iostream>
#include <fstream>
int main()
{
    std::ifstream f("myfile.txt");
    for (char c; f.get(c);)
        std::cout << f.tellg() << ' ';
}
For a file containing the single character a, I get the following output:
2 3
Clearly off by 1 for the first call to tellg. After the second call the file position is correct as the end has been reached after taking the extra newline into account.
Aside from opening the file in binary mode, you can circumvent the issue by disabling buffering
#include <iostream>
#include <fstream>
int main()
{
    std::ifstream f;
    f.rdbuf()->pubsetbuf(nullptr, 0);
    f.open("myfile.txt");
    for (char c; f.get(c);)
        std::cout << f.tellg() << ' ';
}
but this is far from ideal.
Hopefully mingw / mingw-w64 or gcc can fix this, but first we'll need to determine who would be responsible for fixing it. I suppose the base issue is with Microsoft's implementation of lseek, which should return appropriate values according to how the file has been opened.
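The binary-mode alternative mentioned above can be sketched as follows: a position saved with tellg() on a binary stream refers to an exact byte offset, so seeking back to it re-reads the same line. The function name secondLineRoundTrips is hypothetical and exists only to make the check testable.

```cpp
#include <fstream>
#include <string>

// Reads the first line, remembers the position after it, reads the second
// line, seeks back to the remembered position and reads again.
// Returns true if both reads of the second line give the same text.
bool secondLineRoundTrips(const std::string& path)
{
    // Binary mode: no newline translation, so tellg() matches the file offset.
    std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
    if (!file.is_open())
        return false;

    std::string first, second, again;
    if (!std::getline(file, first))
        return false;
    std::ifstream::pos_type mark = file.tellg();
    if (!std::getline(file, second))
        return false;

    file.clear();     // drop eofbit in case the second line ended the file
    file.seekg(mark); // return to the start of the second line
    if (!std::getline(file, again))
        return false;
    return second == again;
}
```

One consequence of binary mode is that lines read from a file with \r\n endings keep a trailing '\r', which the caller has to strip.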
Thanks for this, though it's a very old post. I was stuck on this problem for more than a week. Here are some code examples from my site (the menu versions 1 and 2). Version 1 uses the solution presented here, in case anyone wants to see it.
:)
void customerOrder::deleteOrder(char* argv[]){
std::fstream newinFile,newoutFile;
newinFile.rdbuf()->pubsetbuf(nullptr, 0);
newinFile.open(argv[1],std::ios_base::in);
if(!(newinFile.is_open())){
throw "Could not open file to read customer order. ";
}
newoutFile.open("outfile.txt",std::ios_base::out);
if(!(newoutFile.is_open())){
throw "Could not open file to write customer order. ";
}
newoutFile.seekp(0,std::ios::beg);
std::string line;
int skiplinesCount = 2;
if(beginOffset != 0){
//write file from zero to beginoffset and from endoffset to eof If to delete is non-zero
//or write file from zero to beginoffset if to delete is non-zero and last record
newinFile.seekg (0,std::ios::beg);
// if primarykey < largestkey , it's a middle record
customerOrder order;
long tempOffset(0);
int largestKey = order.largestKey(argv);
if(primaryKey < largestKey) {
//stops right before "current..." next record.
while(tempOffset < beginOffset){
std::getline(newinFile,line);
newoutFile << line << std::endl;
tempOffset = newinFile.tellg();
}
newinFile.seekg(endOffset);
//skip two lines between records.
for(int i=0; i<skiplinesCount;++i) {
std::getline(newinFile,line);
}
while( std::getline(newinFile,line) ) {
newoutFile << line << std::endl;
}
} else if (primaryKey == largestKey){
//its the last record.
//write from zero to beginoffset.
while((tempOffset < beginOffset) && (std::getline(newinFile,line)) ) {
newoutFile << line << std::endl;
tempOffset = newinFile.tellg();
}
} else {
throw "Error in delete key";
}
} else {
//its the first record.
//write file from endoffset to eof
//works with endOffset - 4 (but why??)
newinFile.seekg (endOffset);
//skip two lines between records.
for(int i=0; i<skiplinesCount;++i) {
std::getline(newinFile,line);
}
while(std::getline(newinFile,line)) {
newoutFile << line << std::endl;
}
}
newoutFile.close();
newinFile.close();
}
beginOffset is a specific point in the file (the beginning of each record), and endOffset is the end of the record, calculated with tellg in another function (findFoodOrder). I did not include that function because it is lengthy, but you can find it on my site (under the "menu version 1" link):
http://www.buildincode.com

Logic for reading rows and columns from a text file (textparser) C++

I'm really stuck on a problem with reading rows and columns from a text file. We're using text files that our prof gave us. I have the functionality running so that when the user inputs "numrows (file)", the number of rows in that file prints out.
However, every time I enter the text files, it's giving me 19 for both. The first text file only has 4 rows and the other one has 7. I know my logic is wrong, but I have no idea how to fix it.
Here's what I have for the numrows function:
int numrows(string line) {
ifstream ifs;
int i;
int row = 0;
int array [10] = {0};
while (ifs.good()) {
while (getline(ifs, line)) {
istringstream stream(line);
row = 0;
while(stream >>i) {
array[row] = i;
row++;
}
}
}
}
and here's the numcols:
int numcols(string line) {
int col = 0;
int i;
int arrayA[10] = {0};
ifstream ifs;
while (ifs.good()) {
istringstream streamA(line);
col = 0;
while (streamA >>i){
arrayA[col] = i;
col++;
}
}
}
edit: #chris yes, I wasn't sure what value to return as well. Here's my main:
int main() {
string fname, line;
ifstream ifs;
cout << "---- Enter a file name : ";
while (getline(cin, fname)) { // Ctrl-Z/D to quit!
// tries to open the file whose name is in string fname
ifs.open(fname.c_str());
if(fname.substr(0,8)=="numrows ") {
line.clear();
for (int i = 8; i<fname.length(); i++) {
line = line+fname[i];
}
cout << numrows (line) << endl;
ifs.close();
}
}
return 0;
}
This problem can be more easily solved by opening the text file as an ifstream, and then using its get() member function to process your input.
You can compare each character against '\n', the end-of-line character, and keep a pair of counters, one for columns on a line, the other for lines.
If you have variable length columns, you might want to store the values of (numColumns in a line) in a std::vector<int>, using myVector.push_back(numColumns) or similar.
Both links are to the cplusplus.com/reference section, which can provide a large amount of information about C++ and the STL.
Edited-in overview of possible workflow
You want one program, which will take a filename, and an 'operation', in this case "numrows" or "numcols". As such, your first steps are to find out the filename, and operation.
Your current implementation of this (in your question, after editing) won't work. Using cin should however be fine. Place this earlier in your main(), before opening a file.
Use substr like you have, or alternatively, search for a space character. Assume that the input after this is your filename, and the input in the first section is your operation. Store these values.
After this, try to open your file. If the file opens successfully, continue. If it won't open, then complain to the user for a bad input, and go back to the beginning, and ask again.
Once you have your file successfully open, check which type of calculation you want to run. Counting the number of rows is fairly easy: you can go through the file one character at a time and count the characters that are equal to '\n', the line-end character. Some files end lines with a carriage return ('\r') or a carriage return followed by a line feed instead. These are both a) unlikely to be what you have and b) easily looked up!
A number of columns is more complicated, because your rows might not all have the same number of columns. If your input is 1 25 21 abs 3k, do you want the value to be 5? If so, you can count the number of space characters on the line and add one. If instead, you want a value of 14 (each character and each space), then just count the characters based on the number of times you call get() before reaching a '\n' character. The use of a vector as explained below to store these values might be of interest.
Having calculated these two values (or value and set of values), you can output based on the value of your 'operation' variable. For example,
if (storedOperationName == "numcols") {
    cout<< "The number of values in each column is " << numColsVal << endl;
}
If you have a vector of column values, you could output all of them, using
for (int pos = 0; pos < numColsVal.size(); pos++) {
    cout<< numColsVal[pos] << " ";
}
Following all of this, you can return a value of 0 from your main(), or you can just end the program (C++ treats reaching the end of main without a return statement as a return of 0), or you can ask for another filename, and repeat until some other method is used to end the program.
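The workflow above could be sketched roughly as follows. This version swaps the character-by-character get() loop for std::getline plus a string stream, which is an alternative technique rather than the one described above, and it counts whitespace-separated values per row. The names splitCommand and columnsPerRow are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits a command such as "numrows data.txt" into operation and file name.
// Returns false if no space separates the two parts.
bool splitCommand(const std::string& input,
                  std::string& operation, std::string& fileName)
{
    std::string::size_type space = input.find(' ');
    if (space == std::string::npos || space + 1 >= input.size())
        return false;
    operation = input.substr(0, space);
    fileName = input.substr(space + 1);
    return true;
}

// Returns, for every line of `in`, the number of whitespace-separated values.
// The number of rows is the size of the returned vector.
std::vector<int> columnsPerRow(std::istream& in)
{
    std::vector<int> counts;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::string value;
        int columns = 0;
        while (fields >> value)
            ++columns;
        counts.push_back(columns);
    }
    return counts;
}
```

In main(), one would call splitCommand on the line read from cin, open an ifstream on fileName, pass it to columnsPerRow, and then print either the vector's size ("numrows") or its elements ("numcols").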
Further details
The stream's get() member function, called with no arguments, returns the next character of an ifstream (as an int), using the example code format
std::ifstream myFileStream;
myFileStream.open("myFileName.txt");
int nextCharacter = myFileStream.get(); // You should, before this, implement a loop.
// A possible loop condition might include something like `while (myFileStream.good())`
// See the linked page on get()
if (nextCharacter == '\n')
{
    // You have a line break here
}
You could use this type of structure, along with a pair of counters as described earlier, to count the number of characters on a line, and the number of lines before the EOF (end of file).
If you want to store the number of characters on a line, for each line, you could use
std::vector<int> charPerLine;
int numberOfCharactersOnThisLine = 0;
while (...)
{
    // Other parts of the loop here, including a numberOfCharactersOnThisLine++; statement
    if (endOfLineCondition)
    {
        charPerLine.push_back(numberOfCharactersOnThisLine); // This stores the value in the vector
        numberOfCharactersOnThisLine = 0; // Start counting afresh for the next line
    }
}
You should #include <vector> and either write std:: before each name, or put a using namespace std; statement near the top. People will advise against using namespaces like this, but it can be convenient (which is also a good reason to avoid it, sort of!)
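Put together, the get()-based counting could look like the sketch below. The function name charactersPerLine is hypothetical, and the std::string overload is only a convenience for trying it out without a file.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns the number of characters on each line of `in`, not counting
// the '\n' that ends the line.
std::vector<int> charactersPerLine(std::istream& in)
{
    std::vector<int> charPerLine;
    int numberOfCharactersOnThisLine = 0;
    char c;
    while (in.get(c)) // stops when no further character can be read
    {
        if (c == '\n')
        {
            charPerLine.push_back(numberOfCharactersOnThisLine);
            numberOfCharactersOnThisLine = 0;
        }
        else
        {
            ++numberOfCharactersOnThisLine;
        }
    }
    // A final line without a trailing newline still counts.
    if (numberOfCharactersOnThisLine > 0)
        charPerLine.push_back(numberOfCharactersOnThisLine);
    return charPerLine;
}

// Convenience overload that treats `text` as the file contents.
std::vector<int> charactersPerLine(const std::string& text)
{
    std::istringstream in(text);
    return charactersPerLine(in);
}
```

The size of the returned vector is the number of lines, so the same pass answers both the row count and the per-line character counts.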

does fstream move to the next position after read in a binary integer (c++)

I am trying to read in a binary file and write it in chunks to multiple output files. Say there are 25 4-byte numbers in total and the chunk size is set to 20: the program will generate two output files, one with 20 integers and the other with 5. However, if I have an input file with 40 integers, my program generates three files. The first two files both have 20 numbers, but the third file has one number, which is the last one from the input file and is already included in the second output file. How do I force the read position to move forward every time a number is read?
ifstream fin("in.txt", ios::in | ios::binary);
if(fin.is_open())
{
    while(!fin.eof()){
        //set file name for each output file
        fname[0] = 0;
        strcpy(fname, "binary_chunk");
        index[0] = 0;
        sprintf(index, "%d", fcount);
        strcat(fname, index);
        // open output file to write
        fout.open(fname);
        for(i = 0; i < chunk; i++)
        {
            fin.read((char *)(&num), INT_SIZE);
            fout << num << "\n";
            if(fin.eof())
            {
                fout.close();
                fin.close();
                return;
            }
        }
        fcount ++;
        fout.close();
    }
    fout.close();
}
The problem is most likely your use of while (!fin.eof()). The eofbit flag is not set until after you have tried to read from beyond the end of the file. This means that the loop will run one extra time without you noticing.
Instead, remember that all input operations return the stream object, and that stream objects can be used as boolean conditions. That means you can write:
while (fin.read(...))
This is safe from the problems with looping while !fin.eof().
And to answer your question in the title: Yes, the file position is moved when you successfully read anything. If you read X bytes, the read-position will be moved forward X bytes as well.
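Applied to the chunking problem, the read-as-loop-condition pattern might look like the sketch below. It assumes 4-byte integers stored in the machine's native byte order. The function name splitIntoChunks and its parameters are hypothetical, and an output file is opened only once a number is actually available for it.

```cpp
#include <cstdint>
#include <fstream>
#include <sstream>
#include <string>

// Reads 4-byte integers from `inputPath` and writes them as text, `chunk`
// numbers per file, to files named <prefix>0, <prefix>1, ...
// Returns the number of output files created.
int splitIntoChunks(const std::string& inputPath,
                    const std::string& prefix, int chunk)
{
    std::ifstream fin(inputPath.c_str(), std::ios::in | std::ios::binary);
    if (!fin.is_open() || chunk <= 0)
        return 0;

    std::ofstream fout;
    std::int32_t num = 0;
    int written = 0; // numbers written to the current output file
    int fcount = 0;  // output files created so far

    // The body runs only after a complete 4-byte read has succeeded,
    // so no number is ever written twice.
    while (fin.read(reinterpret_cast<char*>(&num), sizeof num))
    {
        if (written == 0)
        {
            std::ostringstream name;
            name << prefix << fcount;
            fout.open(name.str().c_str());
            ++fcount;
        }
        fout << num << "\n";
        if (++written == chunk)
        {
            fout.close();
            written = 0;
        }
    }
    if (fout.is_open())
        fout.close();
    return fcount;
}
```

With 40 integers and a chunk size of 20 this creates two files, and with 25 integers it creates one file of 20 numbers and one of 5.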