Formatting files - C++

I have hit a brick wall trying to format one of my files. I have a file that I have formatted to look like this:
0 1 2 3 4 5
0.047224 0.184679 -0.039316 -0.008939 -0.042705 -0.014458
-0.032791 -0.039254 0.075326 -0.000667 -0.002406 -0.010696
-0.020048 -0.008680 -0.000918 0.302428 -0.127547 -0.049475
...
6 7 8 9 10 11
[numbers as above]
12 13 14 15 16 17
[numbers as above]
...
Each block of numbers has exactly the same number of lines. What I am trying to do is basically move every block (including the headers) to the right of the first block so in the end my output file would look like this:
0 1 2 3 4 5 6 7 8 9 10 11 ...
0.047224 0.184679 -0.039316 -0.008939 -0.042705 -0.014458 [numbers] ...
-0.032791 -0.039254 0.075326 -0.000667 -0.002406 -0.010696 [numbers] ...
-0.020048 -0.008680 -0.000918 0.302428 -0.127547 -0.049475 [numbers] ...
...
So in the end I should basically get an n x n matrix (considering only the numbers). I already have a Python/bash hybrid script that can format this file exactly like this, BUT I've switched the running of my code from Linux to Windows and hence cannot use the bash part of the script anymore (since my code has to be compliant with all versions of Windows). To be honest, I have no idea how to do it, so any help would be appreciated!
Here's what I tried until now (it's completely wrong I know but maybe I can build on it...):
void finalFormatFile()
{
ifstream finalFormat;
ofstream finalFile;
string fileLine = "";
stringstream newLine;
finalFormat.open("xxx.txt");
finalFile.open("yyy.txt");
int countLines = 0;
while (!finalFormat.eof())
{
countLines++;
if (countLines % (nAtoms*3) == 0)
{
getline(finalFormat, fileLine);
newLine << fileLine;
finalFile << newLine.str() << endl;
}
else getline(finalFormat, fileLine);
}
finalFormat.close();
finalFile.close();
}

For such a task, I would do it the simple way. As we already know the number of lines per block and we know the pattern, I would simply keep a vector of strings (one entry per line of the final file) and update it while parsing the input file. Once that's done, I would iterate through the strings and print them into the final file. Here is code that does it:
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
int main(int argc, char * argv[])
{
int n = 6; // n = 3 * nAtoms
std::ifstream in("test.txt");
std::ofstream out("test_out.txt");
std::vector<std::string> lines(n + 1);
std::string line("");
int cnt = 0;
// Reading the input file
while(getline(in, line))
{
lines[cnt] = lines[cnt] + " " + line;
cnt = (cnt + 1) % (n + 1);
}
// Writing the output file
for(unsigned int i = 0; i < lines.size(); i ++)
{
out << lines[i] << std::endl;
}
in.close();
out.close();
return 0;
}
Note that, depending on the structure of your input/output files, you might want to adjust the line lines[cnt] = lines[cnt] + " " + line in order to separate the columns with the right delimiter. As written, every output line also starts with a space.
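The same accumulation idea can also be written as a function over generic streams, which makes it easy to try on in-memory text before pointing it at real files. This is only a sketch: the name joinBlocks and the parameter rowsPerBlock (the header line plus the n data lines, so n + 1 in the answer above) are made up for illustration, and the columns are assumed to be separated by a single space.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical helper: places consecutive blocks of rowsPerBlock lines side by side.
// rowsPerBlock counts the header line too.
void joinBlocks(std::istream& in, std::ostream& out, std::size_t rowsPerBlock)
{
    assert(rowsPerBlock > 0);
    std::vector<std::string> rows(rowsPerBlock);
    std::string line;
    std::size_t row = 0;
    while (std::getline(in, line))
    {
        if (!rows[row].empty())
            rows[row] += ' ';   // separator only between blocks, not at line start
        rows[row] += line;
        row = (row + 1) % rowsPerBlock;
    }
    for (const std::string& r : rows)
        out << r << '\n';
}
```

With files, the call would look like joinBlocks(in, out, n + 1), where in is a std::ifstream and out a std::ofstream.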

Related

Separate integers in a file by tabs into an array

Honestly, it's a file of 200k integers and I have no idea how to check whether the way I am doing it is correct, so I'd like some help! There are 10 integers per line and 20k lines total.
Here's my code:
void readFile(int searchValues[VALUES], string fileName, string fileExtension, string filePath){
string fileLine;
ifstream myFile (fileName + fileExtension);
int currentIndex = 0;
if (myFile.is_open()){
while (getline(myFile,fileLine)){
for (int i = 0; i < 10; i++){
searchValues[currentIndex] = stoi(fileLine.substr(0, fileLine.find('\t')));
fileLine = fileLine.substr(fileLine.find('\t') + 1, fileLine.length()-1);
currentIndex++;
}
}
}
}
Here is a working example for 3 integers ... do the same for 10.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
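if (argc < 2) { fprintf(stderr, "usage: %s FILE\n", argv[0]); return 1; }
FILE *fl=fopen(argv[1],"rt");
if (!fl) { perror(argv[1]); return 1; }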
int a,b,c,line=0;
while (3 == fscanf(fl,"%d\t%d\t%d",&a,&b,&c))
{
printf(
"line %d contains ints: %d %d %d\n",
line++,
a,
b,
c);
}
fclose(fl);
}
I'm not sure what VALUES is used for. Also, if you're going to put all 200k values into a single array, why read only 10 values? I assume you're going to put each group of 10 values in a separate array, for example? Anyway, here is code that reads VALUES integers and stores them in an array. Also note that the parameter filePath is not used in your code.
input.txt:
0 1 2 3 4 5 6 7 8 9
#include <iostream>
#include <string>
#include <fstream>
const int VALUES = 10;
void readFile(int searchValues[VALUES], std::string fileName, std::string fileExtension, std::string filePath){
if (!filePath.empty() && filePath.back() != '/')
filePath.push_back('/');
std::string fullPath = filePath + fileName + '.' + fileExtension;
std::ifstream input(fullPath);
if (input.is_open())
{
int i = 0;
while (i < VALUES && input >> searchValues[i]) i++;
}
}
int main()
{
int arr[VALUES];
readFile(arr, "input", "txt", "");
for (int i : arr)
std::cout << i << ' ';
}
Output:
0 1 2 3 4 5 6 7 8 9
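Since the question also asks how to check that the read went well, one option is to let the stream do the parsing and then compare the number of values against what the file should contain. The sketch below assumes the values go into a std::vector instead of a fixed array; readInts and expected are hypothetical names.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <vector>

// Hypothetical helper: reads whitespace-separated integers (tabs, spaces and
// newlines all count as separators). Returns true only if reading stopped at
// the end of the input, not at a malformed token, with exactly `expected` values.
bool readInts(std::istream& in, std::size_t expected, std::vector<int>& values)
{
    values.clear();
    int v;
    while (in >> v)
        values.push_back(v);
    // The loop ends at end of input or at the first token that is not an int;
    // eofbit is set in the first case but not when a bad token is in the way.
    return in.eof() && values.size() == expected;
}
```

For the file in the question, the call would be readInts(myFile, 200000, values) on an opened std::ifstream.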

C++ Stop inputting text into 2D array once it reaches the end of the line

I've searched and re-read but I can't figure this one out. I am simply trying to read a text file into an [i][j] string array. It does that fine, but I need it to stop filling the row once it gets to the end of the line, then start putting the second line in the second row, stop at the end of that line, and so on.
My file contains 4 separate lines that read.
This is line one.
This is line two.
This is line three.
This is line four.
And my code reads it and mostly does what i need it to. But it puts everything into the array until it runs out of room then continues to the next row. It doesn't stop once it reaches the end of line one. Here is my code.
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
int main()
{
string arrayTextIn[25][25];
ifstream textIn;
textIn.open("sampleText.txt");
for(int i=0; i<25; i++)
for(int j=0; j<25; j++)
{
textIn >> arrayTextIn[i][j];
if(arrayTextIn[i][j] == "\n") //This is where I dont know how to proceed.
break;
}
for(int i=0; i<25; i++)
for(int j=0; j<25; j++)
cout << i << " " << j << " "<< arrayTextIn[i][j] << endl;
return 0;
}
This is the output but what i want is each line to start at a new [ith] row. Thanks for the help.
0 0 This
0 1 is
0 2 line
0 3 one.
0 4 This
0 5 is
0 6 line
0 7 two.
0 8 This
0 9 is
0 10 line
0 11 three.
0 12 This
0 13 is
0 14 line
0 15 four.
0 16
1 0
1 1
1 2
1 3
1 4
This is a two-step process.
The first step is to read the input file one line at a time:
ifstream textIn;
textIn.open("sampleText.txt");
for(int i=0; i<25; i++)
{
    std::string line;
    if (!std::getline(textIn, line)) // stop when no more lines can be read
        break;
    // Magic goes here.
}
So, what we've accomplished so far is read each line of input, up to the maximum of 25.
The second step is to take each line of input and divide it into whitespace-delimited words (this needs #include <sstream>). This part goes where the magic goes, above:
std::istringstream iline(line);
for(int j=0; j<25; j++)
{
    std::string word;
    if (!(iline >> word)) // stop when the line has no more words
        break;
    arrayTextIn[i][j]=word;
}
You start by constructing an istringstream, which works exactly like ifstream, except that the input stream comes from a string.
Then, it's pretty much what you had originally, except now the scope is small enough to be easily handled with a single loop.
In conclusion: the way to approach a task of any moderate complexity is to divide it into two or more smaller tasks. Here, you take this relatively complicated task and turn it into two smaller, easier-to-implement tasks: first, reading each line of text, and second, dividing a given line into its individual words.
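Put together, the two steps might look like the sketch below. It is an illustration rather than a drop-in replacement: it returns nested vectors instead of filling the fixed 25x25 array, and the name readWords and its maxRows/maxCols parameters are assumptions standing in for the limits from the question.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits the input into lines, and each line into whitespace-delimited words.
// maxRows/maxCols play the role of the 25x25 limits in the question.
std::vector<std::vector<std::string>> readWords(std::istream& in,
                                                std::size_t maxRows,
                                                std::size_t maxCols)
{
    std::vector<std::vector<std::string>> rows;
    std::string line;
    // Step 1: one line at a time; the loop ends when no further line can be read.
    while (rows.size() < maxRows && std::getline(in, line))
    {
        std::istringstream iline(line);
        std::vector<std::string> words;
        std::string word;
        // Step 2: one word at a time; testing the extraction itself keeps the last word.
        while (words.size() < maxCols && iline >> word)
            words.push_back(word);
        rows.push_back(words);
    }
    return rows;
}
```

Each inner vector then holds exactly the words of one input line, so rows[1][0] is the first word of the second line.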
If you really want to maintain your design (2D array), then this may be a fast solution:
for (int i = 0; i < 25; i++) {
    std::string line;
    if (!std::getline(textIn, line)) {
        // no further line could be read (end of file), so stop
        break;
    }
    int j = 0;
    // Walk through the line, cutting it at each space.
    while (!line.empty() && j < 25) {
        size_t finder = line.find(' ');
        // get a single word; npos means it runs to the end of the line
        auto token = line.substr(0, finder);
        if (!token.empty()) {
            // skip the empty tokens produced by consecutive spaces
            arrayTextIn[i][j] = token;
            ++j;
        }
        if (finder == std::string::npos)
            break; // that was the last word
        // erase that part from the main line
        line.erase(0, finder + 1);
    }
}

Read integer data from a file

I am just getting started on C++ and am working on codeval questions, so if anyones done that, they'll recognize this problem as it's the first on the list. I need to open a file that has 3 columns of space separated integer values. Here is mine, under fizbuz.txt. I need to get the integer values from the file and store them for later use elsewhere in the program.
1 2 10
3 5 15
4 5 20
2 8 12
2 4 10
3 6 18
2 3 11
8 9 10
2 5 8
4 9 25
Now I can open the file just fine, and I've used getline() to read the files just fine using my below code. However, I don't want them in string format, I'd like them as integers. So I looked around and everyone basically says the same notation (file>>int1>>int2...). I've written some code exactly how I've seen it in a few examples, and it does not behave at all like they're telling me it should.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
string filename = "fizbuz.txt";
string line;
int d1,d2,len;
int i =0;
int res1[10], res2[10], length[10];
ifstream read (filename.c_str());
if (read.is_open())
{
// while(read>>d1>>d2>>len);
// {
// res1[i] = d1;
// res2[i] = d2;
// length[i] = len;
// i++;
// }
while (!read.eof())
{
read>>d1>>d2>>len;
res1[i] = d1;
res2[i] = d2;
length[i] = len;
}
read.close();
}
else
{
cout << "unable to open file\n";
}
for (int j = 0; j < 10;j++)
{
cout<< res1[j] << " " << res2[j] << " " << length[j] << '\n';
}
}
Both while loops produce the same output at the bottom. The last line of fizbuz.txt ends up in the first elements of res1, res2 and length, and the remaining elements of all three are pseudorandom values, presumably from whatever was using that memory before. Example output below:
4 9 25
32767 32531 32767
-1407116911 4195256 -1405052128
32531 0 32531
0 0 1
0 1 0
-1405052128 807 -1404914400
32531 1 32531
-1405054976 1 -1404915256
32531 0 32531
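The first version should work, except that you need to remove the ; at the end of the while line. With it, the loop body is empty, and the braces that follow run only once, after the whole file has been read:
while (read >> d1 >> d2 >> len);
                               ^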
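Try this (your second loop never increments i; testing the read itself, rather than eof(), also avoids storing a failed final read or writing past the end of the arrays):
while (i < 10 && read>>d1>>d2>>len)
{
    res1[i] = d1;
    res2[i] = d2;
    length[i] = len;
    i++;
}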
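For comparison, here is a small sketch of the read loop with the extraction as the loop condition and no stray semicolon. It assumes the three values per line are kept together in a struct and stored in a vector rather than in three fixed arrays; Row and readRows are hypothetical names.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <vector>

struct Row { int d1, d2, len; };   // hypothetical type, one entry per input line

// Reads "d1 d2 len" triples until the input ends or a value fails to parse.
std::vector<Row> readRows(std::istream& in)
{
    std::vector<Row> rows;
    Row r;
    // No semicolon after the condition: the braces below are the loop body,
    // and they run once per successfully read triple.
    while (in >> r.d1 >> r.d2 >> r.len)
    {
        rows.push_back(r);
    }
    return rows;
}
```

Because the vector grows as needed, there are no uninitialized elements to print afterwards; rows.size() says how many lines were actually read.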

Merging files with mergesort algorithm in c++

I wrote a program to do an external mergesort on a file of 100,000 doubles. I couldn't quickly find any external storage libraries for C++ because googling it just leads to a bunch of pages about the extern keyword, so I decided to just write my own, and I think that's where the problem is.
The program actually works, except for a couple of details. The output file will have all of the doubles in sorted order, but at the end of the file are 30 lines of
-9.2559631349317831e+061
which is not in the input file. I also have 21 more values in the output file than in the input file, not counting the 30 lines of the single number I just mentioned.
How the program runs is it reads the 100,000 doubles ~4000 lines at a time and sorts them, then stores them in to 26 text files, then those 26 files are merged into 13 files, and those 13 into 7, etc... until there is only one file.
I'm sorry if the code is really ugly, I figured out all of the external storage stuff on my own by pencil, paper, trial, and error. The program is not going to be used for anything. I haven't cleaned it up yet. The driver doesn't do much other than call these methods.
//reads an ifstream file and stores the data in a deque. returns a bool indicating if the file has not reached EOF
bool readFile(ifstream &file, deque<DEQUE_TYPE> &data){
double d;
for(int i = 0; i < DEQUE_SIZE && file.good(); i++){
file >> d;
data.push_back(d);
}
return file.good();
}
//opens a file with the specified filename and prints the contents of the deque to it. if append is true, the data will be appended to the file, else it will be overwritten
void printFile(string fileName, deque<DEQUE_TYPE> &data, bool append){
ofstream outputFile;
if(append)
outputFile.open(fileName, ios::app);
else
outputFile.open(fileName);
outputFile.precision(23);
while(data.size() > 0){
outputFile << data.front() << endl;
data.pop_front();
}
}
//merges the sortfiles until there is one file left
void mergeFiles(){
ifstream inFile1, inFile2;
ofstream outFile;
string fileName1, fileName2;
int i, k, max;
deque<DEQUE_TYPE> data1;
deque<DEQUE_TYPE> data2;
bool fileGood1, fileGood2;
i = 0;
k = 0;
max = 25;
while(max > 1){
fileName1 = ""; fileName1 += "sortfile_"; fileName1 += to_string(i); fileName1 += ".txt";
fileName2 = ""; fileName2 += "sortfile_"; fileName2 += to_string(i+1); fileName2 += ".txt";
try{
inFile1.open(fileName1);
inFile2.open(fileName2);
} catch(int e){
cout << "Could not open the open the files!\nError " << e;
}
fileGood1 = true;
fileGood2 = true;
while(fileGood1 || fileGood2){
fileGood1 = readFile(inFile1, data1);
fileGood2 = readFile(inFile2, data2);
data1 = merge(data1, data2);
printFile("temp", data1, true);
data1.clear();
}
inFile1.close();
inFile2.close();
remove(fileName1.c_str());
remove(fileName2.c_str());
fileName1 = ""; fileName1 += "sortfile_"; fileName1 += to_string(k); fileName1 += ".txt";
rename("temp", fileName1.c_str());
i = i + 2;
k++;
if(i >= max){
max = max / 2 + max % 2;
i = 0;
k = 0;
}
}
}
//merge function
deque<double> merge(deque<double> &left, deque<double> &right){
deque<double> result;
while(left.size() > 0 || right.size() > 0){
if (left.size() > 0 && right.size() > 0){
if (left.front() <= right.front()){
result.push_back(left.front());
left.pop_front();
}
else{
result.push_back(right.front());
right.pop_front();
}
}
else if(left.size() > 0){
result.push_back(left.front());
left.pop_front();
}
else if(right.size() > 0){
result.push_back(right.front());
right.pop_front();
}
}
return result;
}
I sorted a file of 26 numbers (0 - 25), as ThePosey suggested, and here are the results:
-9.2559631349317831e+061 (47 lines of this)
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
25
25
25
25
25
So I'm pretty sure the last number of the file is being duplicated, but I'm still not sure what causes the 47 occurrences of the random large number. I checked, and the last number of the 100,000-number file appears in the output file only twice, not 22 times, so I think I have 11 separate last numbers being duplicated.
I don't know if this is the whole problem or not, but you have a classic error in your input loop. file.good() doesn't guarantee that the next read will succeed, it only tells you that the previous one did. Try restructuring it like this:
for(int i = 0; i < DEQUE_SIZE && (file >> d); i++){
data.push_back(d);
}
The expression file >> d returns a reference to file, which is converted to bool as !file.fail() when you use it as a condition, so the loop body runs only if the read actually succeeded.
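The difference between the two loop shapes can be seen on a tiny in-memory input. The sketch below is an illustration only: the function names and the limit parameter (standing in for DEQUE_SIZE) are made up, and d is initialized so that the demonstration itself never pushes an indeterminate value.

```cpp
#include <deque>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file

// Loop shape from the question: good() is checked before each read.
void readCheckingGood(std::istream& file, std::deque<double>& data, int limit)
{
    double d = 0.0;
    for (int i = 0; i < limit && file.good(); i++) {
        file >> d;            // may fail, e.g. when only a newline is left
        data.push_back(d);    // pushed whether or not the read succeeded
    }
}

// Loop shape from the answer: the read itself is the condition.
void readCheckingExtraction(std::istream& file, std::deque<double>& data, int limit)
{
    double d = 0.0;
    for (int i = 0; i < limit && (file >> d); i++) {
        data.push_back(d);    // reached only after a successful read
    }
}
```

On the input "1 2 3\n", the first version stores four values and the second stores three: after reading 3 the stream is still good because only the newline remains, so the first loop goes around once more and pushes a value that was never read.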
Is there a reason why you can't use a few megs of memory to read the entire list into RAM and sort it all at once? It would simplify your program a lot. If you are trying to do this as a challenge, I would start by shrinking the problem to, say, 1 file of 100 doubles, split into 4 reads of 25 doubles, and then it should be very easy to trace through and see where the additional lines are coming from.
Assuming your files are in text format, you can use std::merge (from <algorithm>) to do an external merge just as well as an internal one, by using std::istream_iterators (from <iterator>).
std::ifstream in1("temp1.txt");
std::ifstream in2("temp2.txt");
std::ofstream out("output.txt");
std::merge(std::istream_iterator<double>(in1),
           std::istream_iterator<double>(),
           std::istream_iterator<double>(in2),
           std::istream_iterator<double>(),
           std::ostream_iterator<double>(out, "\n"));
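The same call can be wrapped in a function over generic streams so it can be tried on small in-memory inputs first. This is a sketch under the assumption that both inputs are already sorted in ascending order; mergeSorted is a hypothetical name.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>   // std::istringstream, handy for trying this without files

// Merges two ascending sequences of doubles, holding only one value
// per input in memory at a time.
void mergeSorted(std::istream& in1, std::istream& in2, std::ostream& out)
{
    std::merge(std::istream_iterator<double>(in1), std::istream_iterator<double>(),
               std::istream_iterator<double>(in2), std::istream_iterator<double>(),
               std::ostream_iterator<double>(out, "\n"));
}
```

Note that the output stream's default precision is 6 significant digits, so for real data it is probably worth calling out.precision(17) before merging, otherwise values may be rounded each time they pass through a temporary file.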

c++ process file blank line at the end of file

When I use C++ to process a file, I find there is always a blank line at the end of the file. Someone said that vim appends a '\n' at the end of the file, but when I use gedit I have the same problem. Can anyone tell me the reason?
#include<iostream>
#include<fstream>

using namespace std;
const int K = 10;
int main(){
    string arr[K];
    ifstream infile("test1");
    int L = 0;
    while(!infile.eof()){
        getline(infile, arr[(L++)%K]);
    }
    //line
    int start,count;
    if (L < K){
        start = 0;
        count = L;
    }
    else{
        start = L % K;
        count = K;
    }
    cout << count << endl;
    for (int i = 0; i < count; ++i)
        cout << arr[(start + i) % K] << endl;
    infile.close();
    return 1;
}
The test1 file contains just:
abcd
but the program output is:
2
abcd
(followed by a blank line)
while(!infile.eof())
infile.eof() is only true after you have tried to read beyond the end of the file. So the loop tries to read one more line than there is and gets an empty line on that attempt.
It's a matter of order: you're reading and assigning, and only afterwards checking.
You should change your code a little so that it reads, checks, and then assigns:
std::string str;
while (getline(infile, str)) {
arr[(L++)%K] = str;
}
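The extra iteration is easy to reproduce on an in-memory stream. The sketch below just counts loop iterations for both loop shapes; the function names are made up for illustration.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Counts loop iterations when eof() is tested before reading.
int countWithEofCheck(std::istream& in)
{
    int n = 0;
    std::string line;
    while (!in.eof()) {
        std::getline(in, line);   // the final call reads nothing, yet is counted
        ++n;
    }
    return n;
}

// Counts lines when the read itself is the loop condition.
int countWithGetline(std::istream& in)
{
    int n = 0;
    std::string line;
    while (std::getline(in, line))
        ++n;
    return n;
}
```

For the input "abcd\n", the first function returns 2 and the second returns 1, which matches the count of 2 printed by the program in the question. The first loop should also be used with care for another reason: if the stream fails without reaching end of file (for example because the file could not be opened), eof() never becomes true and the loop does not terminate.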
http://www.parashift.com/c++-faq-lite/istream-and-eof.html
How to determine whether it is EOF when using getline() in c++