C++ copying parts of a file to a new file - c++

I am trying to create a new file with data from two different existing files. I need to copy the first existing file in its entirety, which I have done successfully. For the second existing file I need to copy just the last two columns and append them to the first file's data at the end of each row.
Ex:
Info from first file already copied into my new file:
20424297 1092 CSCI 13500 B 3
20424297 1092 CSCI 13600 A- 3.7
Now I need to copy the last two columns of each line in this file and then append them to the appropriate row in the file above:
17 250 3.00 RNL
17 381 3.00 RLA
i.e. I need "3.00" and "RNL" appended to the end of the first row, "3.00" and "RLA" appended to the end of the second row, etc.
This is what I have so far:
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <cstdlib>
using namespace std;
int main() {
//Creates new file and StudentData.tsv
ofstream myFile;
ifstream studentData;
ifstream hunterCourseData;
//StudentData.tsv is opened and checked to make sure it didn't fail
studentData.open("StudentData.tsv");
if(studentData.fail()){
cout << "Student data file failed to open" << endl;
exit(1);
}
//My new file is opened and checked to make sure it didn't fail
myFile.open("file.txt");
if(myFile.fail()){
cout << "MyFile file failed to open" << endl;
exit(1);
}
//HunterCourse file is opened and checked to make sure it didn't fail
hunterCourseData.open("HunterCourse.tsv");
if(hunterCourseData.fail()){
cout << "Hunter data file failed to open" << endl;
exit(1);
}
// Copies data from StudentData.tsv to myFile
char next = '\0';
int n = 1;
while(! studentData.eof()){
myFile << next;
if(next == '\n'){
n++;
myFile << n << ' ';
}
studentData.get(next);
}
return 0;
}
I am going bananas trying to figure this out. I'm sure it's a simple fix but I can't find anything online that works. I've looked into using ostream and a while loop to assign each row into a variable but I can't get that to work.
Another approach that has crossed my mind is just to remove all integers from the second file because I only need the last two columns and neither of those columns include integers.

If you take a look at the seekg method of a file stream, you'll note that the second overload lets you specify the location the offset is measured from (such as ios_base::end, which measures the offset from the end of the file). With this you can effectively read backwards from the end of a file.
Consider the following
int Pos=0;
while(hunterCourseData.peek()!= '\n')
{
Pos--;
hunterCourseData.seekg(Pos, ios_base::end);
}
//this line will execute when you have found the first newline-character from the end of the file.
Much better code is available at this Very Similar question
Another possibility is simply to count how many lines are in the file beforehand (slower, but workable). In this case one would loop through the file calling getline and incrementing a counter, reset to the start, then repeat until reaching count - 2. I wouldn't use this technique myself, though.
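Since the question only needs the last two columns of each row, a different approach from the seekg one above is to read both files line by line and split each line of the second file into fields. The following is a minimal sketch under some assumptions: appendLastTwoColumns is a hypothetical helper name, the columns are separated by whitespace (spaces or tabs), the added columns are joined with a single space, and the two files have matching rows in the same order.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): copies each line of
// `first` to `out` and appends the last two whitespace-separated columns
// of the corresponding line of `second`.
void appendLastTwoColumns(std::istream& first, std::istream& second,
                          std::ostream& out) {
    std::string rowA, rowB;
    // Stop as soon as either input runs out of lines.
    while (std::getline(first, rowA) && std::getline(second, rowB)) {
        // Split the second file's row into its columns.
        std::istringstream fields(rowB);
        std::vector<std::string> columns;
        for (std::string field; fields >> field; ) {
            columns.push_back(field);
        }
        out << rowA;
        // Assumption: rows with fewer than two columns add nothing.
        if (columns.size() >= 2) {
            out << ' ' << columns[columns.size() - 2]
                << ' ' << columns.back();
        }
        out << '\n';
    }
}
```

With the streams from the question this could be called as appendLastTwoColumns(studentData, hunterCourseData, myFile), replacing the character-by-character copy loop.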

Related

Filling a cstring using <cstring> with text from a textfile using File I/O C++

I began learning strings yesterday and wanted to manipulate it around by filling it with a text from a text file. However, upon filling it the cstring array only prints out the last word of the text file. I am a complete beginner, so I hope you can keep this beginner friendly. The lines I want to print from the file are:
"Hello World from UAE" - First line
"I like to program" - Second line
I did look around and eventually found a way, which is to use std::skipary or something like that, but it did not print the text the way I had envisioned: it prints letter by letter and skips each line in doing so.
Here is my code:
#include <fstream>
#include <iostream>
#include <cstring>
#include <cctype>
using namespace std;
int main() {
ifstream myfile;
myfile.open("output.txt");
int vowels = 0, spaces = 0, upper = 0, lower = 0;
//check for error
if (myfile.fail()) {
cout << "Error opening file: ";
exit(1);
}
char statement[100];
while (!myfile.eof()) {
myfile >> statement;
}
for (int i = 0; i < 30; ++i) {
cout << statement << " ";
}
I'm not exactly sure what you are trying to do with output.txt's contents, but a clean way to read through a file's contents using C++ strings goes like this:
if (std::ifstream in("output.txt"); in.good()) {
for (std::string line; std::getline(in, line); ) {
// do something with line
std::cout << line << '\n';
}
}
You wouldn't want to use char[] for that, in fact raw char arrays are hardly ever useful in modern C++.
Also - As you can see, it's much more concise to check if the stream is good than checking for std::ifstream::fail() and std::ifstream::eof(). Be optimistic! :)
Whenever you encounter output issues - either wrong or no output, the best practise is to add print (cout) statements wherever data change is occurring.
So I first modified your code as follows:
while (!myfile.eof()) {
myfile >> statement;
std::cout<<statement;
}
With this change, the output I got was: all the text is printed, but the last line gets printed twice.
So we know that the data is being read correctly and stored in statement.
This raises two questions. One is your question; the other is why the last line is printed twice.
To answer your question exactly: in every loop iteration, myfile >> statement reads the next whitespace-delimited word into statement and overwrites the existing value. So only the value you read last is still stored when the loop ends.
Once you fix that, you might come across the second question. It's very common and I came across it myself long ago, so I'm going to answer that as well.
Let's say your file has 3 lines:
line1
line2
line3
Initially the file position is at the beginning, exactly where line1 starts. After a few iterations it has read line3. We know that is the last line because we wrote the file, but the loop doesn't know that: for all it knows, there could be a million more lines. eof() only becomes true after a read has actually run into the end of the file, so the loop body is entered one more time, the read fails, and the final value is printed twice.

ifstream does not read first line

I am using the code with ifstream that I used ~1 year ago, but now it does not work correctly. Here, I have the following file (so, just a line of integers):
2 4 2 3
I read it while constructing a graph from this file:
graph g = graph("file.txt");
where graph constructor starts with:
#include <iostream>
#include <fstream>
#include <sstream>
using namespace std;
graph::graph(const char *file_name) {
ifstream infile(file_name);
string line;
getline(infile, line);
cout << line << endl; // first output
istringstream iss;
iss.str(line);
iss >> R >> C >> P >> K;
iss.clear();
cout << R << " " << C << " " << P << " " << K; // second output
}
The second output (marked in code), instead of giving me 2 4 2 3, returns random(?) values -1003857504 32689 0 0. If I add the first output to check the contents of line after getline, it is just an empty string "".
All the files (main.cpp where a graph is instantiated, 'graph.cpp' where the graph is implemented and 'file.txt') are located in the same folder.
As I mentioned, this is my old code that worked before, so probably I do not see some obvious mistake which broke it. Thanks for any help.
These two locations:
where your program's original source code is located
where your program's input data is located
are completely unrelated.
Since "file.txt" is a relative path, your program looks for input data in the current working directory during execution. Sometimes that is the same as where the executable is. Sometimes it is not. (Only you can tell what it is, since it depends on how you execute your program.) There is never a connection to the location of the original source file, except possibly by chance.
When the two do not match, you get this problem, because you perform no I/O error checking in your program.
If you check whether infile is open, I bet you'll find that it is not.
This is particularly evident since the program stopped working after a period of time without any changes to its logic; chances are, the only thing that could have changed is the location of various elements of your solution.
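The check suggested above could look something like the sketch below. It assumes C++17 (for std::filesystem), and openOrThrow is a hypothetical helper name; the error message includes the current working directory so it is obvious where the relative path was looked up.

```cpp
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: opens `name` for reading, or throws an exception
// whose message includes the current working directory.
std::ifstream openOrThrow(const std::string& name) {
    std::ifstream in(name);
    if (!in.is_open()) {
        throw std::runtime_error(
            "cannot open '" + name + "' (working directory: " +
            std::filesystem::current_path().string() + ")");
    }
    return in;  // std::ifstream is movable, so returning by value is fine
}
```

In the graph constructor this would replace the plain ifstream infile(file_name); line, for example ifstream infile = openOrThrow(file_name);.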

Using seekg() in text mode

While trying to read in a simple ANSI-encoded text file in text mode (Windows), I came across some strange behaviour with seekg() and tellg(): any time I called tellg(), saved its value (as pos_type), and then tried to seek to it later, I would always wind up further ahead in the stream than where I left off.
Eventually I did a sanity check; even if I just do this...
int main()
{
std::ifstream dataFile("myfile.txt",
std::ifstream::in);
if (dataFile.is_open() && !dataFile.fail())
{
while (dataFile.good())
{
std::string line;
dataFile.seekg(dataFile.tellg());
std::getline(dataFile, line);
}
}
}
...then eventually, further into the file, lines are half cut-off. Why exactly is this happening?
This issue is caused by libstdc++ computing the current offset by subtracting the amount of data remaining in its buffer from the position reported by lseek64.
The buffer is filled using the return value of read, which for a text mode file on Windows is the number of bytes that have been put into the buffer after end-of-line conversion (i.e. the 2-byte \r\n line ending is converted to \n; Windows also seems to append a spurious newline to the end of the file).
lseek64, however (which with MinGW results in a call to _lseeki64), returns the current absolute file position, and once the two values are subtracted you end up with an offset that is off by 1 for each remaining newline in the text file (+1 for the extra newline).
The following code should display the issue. You can even use a file with a single character and no newlines, due to the extra newline inserted by Windows.
#include <iostream>
#include <fstream>
int main()
{
std::ifstream f("myfile.txt");
for (char c; f.get(c);)
std::cout << f.tellg() << ' ';
}
For a file containing the single character a, I get the following output:
2 3
Clearly off by 1 for the first call to tellg. After the second call the file position is correct as the end has been reached after taking the extra newline into account.
Aside from opening the file in binary mode, you can circumvent the issue by disabling buffering:
#include <iostream>
#include <fstream>
int main()
{
std::ifstream f;
f.rdbuf()->pubsetbuf(nullptr, 0);
f.open("myfile.txt");
for (char c; f.get(c);)
std::cout << f.tellg() << ' ';
}
but this is far from ideal.
Hopefully mingw / mingw-w64 or gcc can fix this, but first we'll need to determine who would be responsible for fixing it. I suppose the base issue is with Microsoft's implementation of lseek, which should return appropriate values according to how the file has been opened.
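The binary-mode alternative mentioned above could be sketched as follows. roundTripsInBinaryMode is a hypothetical helper name used only for illustration: it saves a position with tellg, reads a line, seeks back to the saved position and checks that the same line is read again. Note that in binary mode no newline translation happens, so lines from a \r\n file keep their trailing \r.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns true if seeking back to a position saved
// with tellg() yields the same line again when `path` is opened in
// binary mode. Returns false if the file cannot be opened or is too short.
bool roundTripsInBinaryMode(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) {
        return false;
    }
    std::string skipped, first, again;
    std::getline(in, skipped);  // move past the first line
    const std::ifstream::pos_type mark = in.tellg();  // a plain byte offset here
    if (!std::getline(in, first)) {
        return false;
    }
    in.clear();      // drop eofbit in case the read reached the end
    in.seekg(mark);  // go back to the saved position
    if (!std::getline(in, again)) {
        return false;
    }
    return first == again;
}
```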
Thanks for this, though it's a very old post. I was stuck on this problem for more than a week. Here are some code examples from my site (the menu versions 1 and 2). Version 1 uses the solution presented here, in case anyone wants to see it.
:)
void customerOrder::deleteOrder(char* argv[]){
std::fstream newinFile,newoutFile;
newinFile.rdbuf()->pubsetbuf(nullptr, 0);
newinFile.open(argv[1],std::ios_base::in);
if(!(newinFile.is_open())){
throw "Could not open file to read customer order. ";
}
newoutFile.open("outfile.txt",std::ios_base::out);
if(!(newoutFile.is_open())){
throw "Could not open file to write customer order. ";
}
newoutFile.seekp(0,std::ios::beg);
std::string line;
int skiplinesCount = 2;
if(beginOffset != 0){
//write file from zero to beginoffset and from endoffset to eof If to delete is non-zero
//or write file from zero to beginoffset if to delete is non-zero and last record
newinFile.seekg (0,std::ios::beg);
// if primarykey < largestkey , it's a middle record
customerOrder order;
long tempOffset(0);
int largestKey = order.largestKey(argv);
if(primaryKey < largestKey) {
//stops right before "current..." next record.
while(tempOffset < beginOffset){
std::getline(newinFile,line);
newoutFile << line << std::endl;
tempOffset = newinFile.tellg();
}
newinFile.seekg(endOffset);
//skip two lines between records.
for(int i=0; i<skiplinesCount;++i) {
std::getline(newinFile,line);
}
while( std::getline(newinFile,line) ) {
newoutFile << line << std::endl;
}
} else if (primaryKey == largestKey){
//its the last record.
//write from zero to beginoffset.
while((tempOffset < beginOffset) && (std::getline(newinFile,line)) ) {
newoutFile << line << std::endl;
tempOffset = newinFile.tellg();
}
} else {
throw "Error in delete key";
}
} else {
//its the first record.
//write file from endoffset to eof
//works with endOffset - 4 (but why??)
newinFile.seekg (endOffset);
//skip two lines between records.
for(int i=0; i<skiplinesCount;++i) {
std::getline(newinFile,line);
}
while(std::getline(newinFile,line)) {
newoutFile << line << std::endl;
}
}
newoutFile.close();
newinFile.close();
}
beginOffset is a specific point in the file (the beginning of each record), and endOffset is the end of the record, calculated in another function with tellg (findFoodOrder). I did not add that function here as the post would become very lengthy, but you can find it on my site (under the "menu version 1" link):
http://www.buildincode.com

Error copying and pasting data from a file to another

I am writing a code to merge multiple text files and output a single file.
There can be up to 22 input text files which contain 1400 lines each.
Each line has 8 bits of binary and the new line characters \n.
I am outputting a single file that has all 22 text files merged.
The problem is with my output file: after 1400 lines it appears that content from the previous file is still being placed into the output file (although the previous file was only 1400 lines long). This extra content also has an additional blank line between rows if opened in Microsoft Office or Sublime, but it is interpreted as a single line if opened in Notepad or Excel (a single cell in Excel).
Following is the picture of expected behaviour of the output file,
Here is a picture of abnormal behaviour. This starts when the first file finishes.
I know this data is from the first file still because the second file starts from 00000000
And here is the start of the second file,
And this abnormal behavior repeats every single time the files are switching.
My implementation to achieve this is as follows:
repeat:
if(user_input == 'y')
{
fstream data_out ("data.txt",fstream::out);
for(int i = 0; i<files_found; i++)
{
fstream data_in ((file_names[i].c_str()),fstream::in);
if(data_in.is_open())
{
data_in.seekg(0,data_in.end);
long size = data_in.tellg();
data_in.seekg(0,data_in.beg);
char * buffer = new char[size];
cout << size;
data_in.read(buffer,size);
data_out.write(buffer,size);
delete[] buffer;
}else
{
cout << "Unexpected error";
return 1;
}
data_in.close();
}
data_out.close();
}else if(user_input == 'n')
{
return 1;
}else
{
cout << "Input not recognised. Type y for Yes, and n for No";
cin >> user_input;
goto repeat;
}
Further information:
I have checked the size variable and it is as I expect, 14000.
8 bits, and a \ with n = 10 characters per line,
1400 rows x 10 = 14000.
Assuming reader of code to be experienced.
Sorry to bump this question, but I really like questions that are marked as answered. Joachim Pileborg's answer seems to have worked for you:
Also, instead of seeking and checking sizes and allocating memory, why
not just do e.g. data_out << data_in.rdbuf();? This will copy the
whole input file to the output. – Joachim Pileborg Jul 29 at 17:26
A reference http://www.cplusplus.com/reference/ios/ios/rdbuf/ and an example:
#include <fstream>
#include <string>
#include <vector>
int main(int argc, char** argv)
{
typedef std::vector<std::string> Filenames;
Filenames vecFilenames;
// Populate the list of file names
vecFilenames.push_back("Text1.txt");
vecFilenames.push_back("Text2.txt");
vecFilenames.push_back("Text3.txt");
// Merge the files into Output.txt
std::ofstream fpOutput("Output.txt");
for (Filenames::iterator it = vecFilenames.begin();
it != vecFilenames.end(); ++it)
{
std::ifstream fpInput(it->c_str());
fpOutput << fpInput.rdbuf();
fpInput.close();
}
fpOutput.close();
return 0;
}

getline() returns empty line in Eclipse but working properly in Dev C++

Here is my code:
#include <iostream>
#include <stdlib.h>
#include <fstream>
using namespace std;
int main() {
string line;
ifstream inputFile;
inputFile.open("input.txt");
do {
getline(inputFile, line);
cout << line << endl;
} while (line != "0");
return 0;
}
input.txt content:
5 9 2 9 3
8 2 8 2 1
0
In Eclipse, it goes into an infinite loop. I'm using MinGW 5.1.6 + Eclipse CDT.
I tried many things but I couldn't find the problem.
Since you are on Windows, try:
} while (line != "0\r");
The last line is stored as "0\r\n". The \n is used as the line delimiter by getline so the actual line read will be "0\r"
or
you can convert the dos format file to UNIX format using command
dos2unix input.txt
Now your original program should work. The command will change the \r\n at the end of the line to \n
Also you should always do error checking after you try to open a file, something like:
inputFile.open("input.txt");
if(! inputFile.is_open()) {
cerr<< "Error opening file";
exit(1);
}
It will create an infinite loop if no line contains exactly 0. For example 0\n is not the same thing as 0. My guess is that that is your problem.
EDIT: To elaborate, getline should be discarding the newline. Perhaps the newline encoding of your file is wrong (i.e. Windows vs. Unix).
Your main problem is working directory.
Because you are specifying a file using a relative path it searches for the file from the current working directory. The working directory can be specified by your dev environment. (Note: The working directory is not necessarily the same directory where the executable lives (this is a common assumption among beginners but only holds in very special circumstances)).
Though you have a special end-of-input marker "0", you should also check that getline() is not failing, as it could error out for other reasons (including badly formatted input). As such, it is usually best to check the condition of the file as you read it.
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main()
{
string line;
ifstream inputFile;
inputFile.open("input.txt");
while((getline(inputFile, line)) && (line != "0"))
{
// loop only entered if getline() worked and line !="0"
// In the original an infinite loop is entered when bad input results in EOF being hit.
cout << line << endl;
}
if (inputFile)
{
cout << line << endl; // If you really really really want to print the "0"
// Personally I think doing anything with the termination
// sequence is a mistake but added here to satisfy comments.
}
return 0;
}