Read specific information in a text file - c++

As my title specifies,
I need to read information within a text file (C++).
I saw many examples involving text files organized as lists of numbers or strings, but in my case I need to extract the information within a file (example.txt) organized as follows:
// This is the begin of the text file:
Here_the_coordinates_are_going_to_be_listed
Start
x y z
0 0 0
1 0 0
1 1 0
0 1 0
End
And I would ideally read and store in "std::vector" the information contained between "Start" and "End" such that matrix is a N x 3 vector:
matrix[i][j] = 0 0 0
1 0 0
1 1 0
0 1 0
I had a look at the tutorials, and all I've got so far is:
std::array<std::array<int, 5>, 7> matrix;
std::ifstream file("../test/matrix.txt");
for (unsigned int i = 0; i < 7; i++)
{
    for (unsigned int j = 0; j < 5; j++) {
        file >> matrix[i][j];
    }
}
which allows me to read a file where only numbers are written.
Thank you very much,
dARIO

See http://www.cplusplus.com/doc/tutorial/files/ for how to read and write files. You can then go through the file, look for the first relevant character (or the last irrelevant one, like the z in this case), loop through all the relevant characters, and store them in a two-dimensional array (which could be dynamic if you do not know the length of your list).
EDIT:
Here is an example of reading a text file, taken from the link above (which I updated, sorry about that). The idea is to open the file and loop through every line. (I think you can also use getchar to read single characters instead, which is probably better for you, but I am not sure exactly how that works, so you will have to experiment with it a bit.) Here each line is just saved in a string and printed with cout, but you could manipulate the string further to find your data. I hope that helps! Feel free to ask again if I wasn't clear enough.
// reading a text file
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main ()
{
string line;
ifstream myfile ("example.txt");
if (myfile.is_open())
{
while ( getline (myfile,line) )
{
cout << line << '\n';
}
myfile.close();
}
else cout << "Unable to open file";
return 0;
}
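Building on the getline loop above, here is a minimal sketch of how the rows between "Start" and "End" could be collected into a std::vector. It is only one possible approach, not something taken from the tutorial. The function name readCoordinates is hypothetical, and the sketch assumes that every data row holds exactly three integers and that any other line inside the block (such as the "x y z" header) can be skipped.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects the numeric rows found between a line
// reading "Start" and a line reading "End". Lines that do not begin with
// three integers (such as the "x y z" header) are skipped.
std::vector<std::vector<int>> readCoordinates(std::istream& in)
{
    std::vector<std::vector<int>> matrix;
    std::string line;
    bool inside = false; // true once the "Start" marker has been seen

    while (std::getline(in, line)) {
        // Tolerate Windows line endings when the file is read elsewhere
        if (!line.empty() && line.back() == '\r') line.pop_back();

        if (!inside) {
            if (line == "Start") inside = true;
            continue;
        }
        if (line == "End") break;

        std::istringstream row(line);
        int x, y, z;
        if (row >> x >> y >> z) {
            matrix.push_back({x, y, z});
        }
    }
    return matrix;
}
```

To use it with the file from the question, open std::ifstream file("example.txt") and call readCoordinates(file); matrix.size() then gives N and matrix[i][j] the j-th coordinate of row i.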

Related

Logic for reading rows and columns from a text file (textparser) C++

I'm really stuck on a problem with reading rows and columns from a text file. We're using text files that our prof gave us. I have the functionality running so that when the user inputs "numrows (file)", the number of rows in that file prints out.
However, every time I enter the text files, it gives me 19 for both. The first text file only has 4 rows and the other one has 7. I know my logic is wrong, but I have no idea how to fix it.
Here's what I have for the numrows function:
int numrows(string line) {
ifstream ifs;
int i;
int row = 0;
int array [10] = {0};
while (ifs.good()) {
while (getline(ifs, line)) {
istringstream stream(line);
row = 0;
while(stream >>i) {
array[row] = i;
row++;
}
}
}
}
and here's the numcols:
int numcols(string line) {
int col = 0;
int i;
int arrayA[10] = {0};
ifstream ifs;
while (ifs.good()) {
istringstream streamA(line);
col = 0;
while (streamA >>i){
arrayA[col] = i;
col++;
}
}
}
edit: @chris yes, I wasn't sure what value to return either. Here's my main:
int main() {
string fname, line;
ifstream ifs;
cout << "---- Enter a file name : ";
while (getline(cin, fname)) { // Ctrl-Z/D to quit!
// tries to open the file whose name is in string fname
ifs.open(fname.c_str());
if(fname.substr(0,8)=="numrows ") {
line.clear();
for (int i = 8; i<fname.length(); i++) {
line = line+fname[i];
}
cout << numrows (line) << endl;
ifs.close();
}
}
return 0;
}
This problem can be solved more easily by opening the text file as an ifstream and then using std::istream::get to process your input.
You can compare each character against '\n', the end-of-line character, and keep a pair of counters: one for columns on a line, the other for lines.
If you have variable-length columns, you might want to store the values of (numColumns in a line) in a std::vector<int>, using myVector.push_back(numColumns) or similar.
Both links are to the cplusplus.com/reference section, which can provide a large amount of information about C++ and the STL.
Edited-in overview of possible workflow
You want one program, which will take a filename, and an 'operation', in this case "numrows" or "numcols". As such, your first steps are to find out the filename, and operation.
Your current implementation of this (in your question, after editing) won't work. Using cin should however be fine. Place this earlier in your main(), before opening a file.
Use substr like you have, or alternatively, search for a space character. Assume that the input after this is your filename, and the input in the first section is your operation. Store these values.
After this, try to open your file. If the file opens successfully, continue. If it won't open, then complain to the user for a bad input, and go back to the beginning, and ask again.
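As a rough sketch of the splitting step described above: the input is cut at the first space, with the operation in front and the filename behind it. The name splitCommand is hypothetical, and the sketch assumes the filename itself does not start with a space.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits "numrows data.txt" into the operation
// ("numrows") and the filename ("data.txt"). If there is no space, the
// whole input is treated as the operation and the filename is empty.
std::pair<std::string, std::string> splitCommand(const std::string& input)
{
    const std::string::size_type space = input.find(' ');
    if (space == std::string::npos) {
        return std::make_pair(input, std::string());
    }
    return std::make_pair(input.substr(0, space), input.substr(space + 1));
}
```

The returned pair can be stored before the file is opened, so a failed open can report the filename back to the user and ask again.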
Once you have your file successfully open, check which type of calculation you want to run. Counting the number of rows is fairly easy: you can go through the file one character at a time and count the characters that are equal to '\n', the line-end character. Some files might use other line endings, such as a carriage return ('\r') before the line feed; these are both a) unlikely to be what you have and b) easily looked up!
A number of columns is more complicated, because your rows might not all have the same number of columns. If your input is 1 25 21 abs 3k, do you want the value to be 5? If so, you can count the number of space characters on the line and add one. If instead, you want a value of 14 (each character and each space), then just count the characters based on the number of times you call get() before reaching a '\n' character. The use of a vector as explained below to store these values might be of interest.
Having calculated these two values (or value and set of values), you can output based on the value of your 'operation' variable. For example,
if (storedOperationName == "numcols") {
cout<< "The number of values in each column is " << numColsVal << endl;
}
If you have a vector of column values, you could output all of them, using
for (int pos = 0; pos < numColsVal.size(); pos++) {
cout<< numColsVal[pos] << " ";
}
Following all of this, you can return a value of 0 from your main(), or you can just end the program (C++ treats reaching the end of main without a return statement as a return of 0), or you can ask for another filename and repeat until some other method is used to end the program.
Further details
get() with no arguments (a member function of the stream, std::istream::get) returns the next character of an ifstream, for example:
std::ifstream myFileStream;
myFileStream.open("myFileName.txt");
int nextCharacter; // get() returns an int so that end-of-file can be reported
// Loop until get() reports end of file (EOF is defined in <cstdio>)
// See the reference page for std::istream::get()
while ((nextCharacter = myFileStream.get()) != EOF)
{
    if (nextCharacter == '\n')
    {
        // You have a line break here
    }
}
You could use this type of structure, along with a pair of counters as described earlier, to count the number of characters on a line, and the number of lines before the EOF (end of file).
If you want to store the number of characters on a line, for each line, you could use
std::vector<int> charPerLine;
int numberOfCharactersOnThisLine = 0;
while (...)
{
    // Other parts of the loop here, including a numberOfCharactersOnThisLine++; statement
    if (endOfLineCondition)
    {
        charPerLine.push_back(numberOfCharactersOnThisLine); // This stores the value in the vector
        numberOfCharactersOnThisLine = 0; // Reset the counter for the next line
    }
}
You should #include <vector> and either specify std:: before each name, or use a using namespace std; statement near the top. People will advise against using namespaces like this, but it can be convenient (which is also a good reason to avoid it, sort of!)
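Putting the pieces together, here is a minimal sketch of the two counters. It uses the get(char&) overload rather than the no-argument form, purely to keep the loop short. The names charsPerLine and valuesOnLine are hypothetical, and the second helper assumes that values on a line are separated by whitespace.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the number of characters on each line.
// The size of the returned vector is the number of rows.
std::vector<int> charsPerLine(std::istream& in)
{
    std::vector<int> counts;
    int current = 0;
    char c;
    while (in.get(c)) {
        if (c == '\n') {
            counts.push_back(current); // end of line: store and reset
            current = 0;
        } else {
            ++current;
        }
    }
    // A last line without a trailing '\n' still counts as a row
    if (current > 0) counts.push_back(current);
    return counts;
}

// Hypothetical helper: counts whitespace-separated values on one line.
int valuesOnLine(const std::string& line)
{
    std::istringstream stream(line);
    std::string token;
    int count = 0;
    while (stream >> token) ++count;
    return count;
}
```

With an open std::ifstream, charsPerLine(file).size() would answer "numrows". For "numcols" on a single line, valuesOnLine gives the count of values, while the entry in the vector gives the count of characters.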

Reading a truth table in from plain text, translating it to a map<int,list<int>> in C++

I'm writing a file parser for standard C++ (no third-parties like Boost, unfortunately)...
I'm dealing with a situation where I have a plain-text file formatted like this:
1 ..header line 1, unimportant
2 ..header line 2, unimportant
3 ..header line 3, unimportant
4 1 0 0 0 0 0 0 1
5 2 0 1 0 2 1 0 0
...skipping ahead
14 11 1 0 0 0 0 1 1
15 12 0 0 1 0 0 1 2
16 13 2 0 0 0 1 0 0
...etc
(Note: The first column, 1 - 16, shows line numbers. The "skipping ahead" lines are meant to show that the gap of 8 spaces at the start of each line gets shorter as the numbers in the second column, 1 - 13, get longer.)
This text file denotes a truth table whereby items must be grouped by the columns, and each group will be composed of corresponding numbers from the first column. For instance, by the end of parsing this example, a map of type <int, list<int>> should look like (assuming there are no truths between lines 6 and 13):
[1: {11, 13}]
[2: {5, 15}]
[3: {12}]
[4: {5}]
[5: {5,16}]
[6: {14,15}]
[7: {4,14,15}]
In general, the number of columns in the text file can change, meaning the number of groups will change, so this must be accounted for. The number of rows is also variable, but the row numbers will always start at 1, and the columns will not be numbered (but we can do that ourselves).
Now, were I to do this in Java I'd have a working solution rather quickly. However, I've never done work in C++ and am having trouble figuring out how to perform the operations properly, between its different structures and syntax. Despite scouring and finding lots of good guides, my lack of C++ foundation makes it hard to understand even the syntax differences that, I speculate, must be very basic.
Still, I've designed a procedure, and it should work according to the following pseudocode:
//Begin Parse
//Create filereader "strmFileIn"
//To get past the first three lines, which will always be needless header info
string dummyLine;
for (i = 1; i <= 3; i++)
getline(strmFileIn, strDummyLine);
//Read first line to get count of how many groups are present
//(Copied from internet: gets the first line and puts the cursor back at its start)
int startPos = strmFileIn.tellg();
string strFirstLine;
getline(strmFileIn, strFirstLine);
strmFileIn.seekg(startPos, std::ios_base::beg);
//Tokenize strFirstLine into Array<int> tempArray
int numGroups = tempArray.size() - 1 //accounting for the row-header column, 1 - 13
//Create map (going to use java syntax, sorry)
Map<int,list<int>> myMap = new Map<int,list<int>>;
//Populate map with ints and empty lists (java again, sorry)
for (int i = 1; i <= numGroups; i++)
myMap.put(i, new List<int>);
//Iterate over lines in the file and appropriately populate the map's lists
while (fileIn != eof)
{
string fileInLine;
getline(strmFileIn, fileInLine);
//Tokenize fileInLine into Array<int> tempFileInArray
int intElemID = tempFileInArray[0];
//Remove element [0] from tempFileInArray (will be the row number, 1 - 13
//Iterate over remaining items in tempFileInArray, affect myMap where necessary
for (int i = 1; int i <= groupNum; i++)
if (tempFileInArray[i] != 0) //is not a strict truth-table, as any nonzero will be a truth
myMap.get[i].add(intElemID);
}
//Remove any entries in myMap with empty lists
//Kill strmFileIn for memory's sake
//End Parse
As you can see, my code is a broken mix of pseudocode and comparable Java I've already figured out. I just don't know how to turn this into C++; even with similar data structures, the syntax is a little daunting to someone with no experience. Is anyone here willing to help me out with it?
I really appreciate any insight.
Your code seems overly complicated, so let's do this one step at a time. Additionally, neither your code nor your file format shows how many bool columns should exist on each row, so I've ignored that part for this answer.
But first, a tip: In C++, the containers you care about 99.99% of the time are std::unordered_map, std::vector, and in very rare cases, std::map, boost::stable_vector and std::deque. In your case, you have rows with sequential indices, and the data for each row appears to be better stored as a vector of booleans. However, we'll do it your way, except with std::vector in place of std::list and std::unordered_map in place of std::map.
The major data structures are mostly obvious:
std::unordered_map<int,std::vector<int>> myMap;
std::ifstream strmFileIn("input_file.txt");
Next your code reads in the first line, then ignores it entirely. I have no idea why, so I'll skip over that. Then, we parse out the lines one by one:
std::string full_current_line;
//for as long as we can read more lines, read them in
while(std::getline(strmFileIn, full_current_line))
{
//make the line into a stream so that we can parse data out
std::stringstream cur_line_stream(full_current_line);
//read in the line identifier
int identifier = 0;
cur_line_stream >> identifier;
//if that failed, abort.
if (!cur_line_stream)
{
//invalid identifier!
std::cerr << "identifier is invalid!\n"; //report
strmFileIn.setstate(std::ios::failbit); //failed to parse the data
break; //do not continue this loop
}
After that, we parse out the data for each row, which is surprisingly simple:
int column = 0;
int is_true = 0;
//for each number remaining in the row...
while(cur_line_stream >> is_true)
{
//hooray we read a column!
++column;
if (is_true ==0)
{
//if it's zero, skip it
}
else if (is_true == 1)
{
//get the data for this column, and add this row's identifier
//myMap[column] will create a new empty entry if it didn't exist yet
//NOTE: This syntax only creates when used with map and unordered_map.
// This syntax does NOT create for vector and deque.
//once we have the vector, we push_back the new identifier into it.
myMap[column].push_back(identifier);
}
else
{
//invalid data!
std::cerr << is_true << " is invalid! found on row " << identifier << '\n';
cur_line_stream.setstate(std::ios::failbit); //failed to parse the data
strmFileIn.setstate(std::ios::failbit); //failed to parse the data
break; //do not continue this loop
}
}
}
If you know that groupNum contained the number of bools, you could replace that second while with something more like you already have:
for (int i = 1; i <= groupNum; i++)
{
cur_line_stream >> is_true;
//if that failed, abort
if (!cur_line_stream)
{
//invalid data!
std::cerr << "data could not be read on row " << identifier << '\n';
cur_line_stream.setstate(std::ios::failbit); //failed to parse the data
strmFileIn.setstate(std::ios::failbit); //failed to parse the data
break; //do not continue this loop
}
else if (is_true == 0)
{
//if it's zero, skip it
}
etc etc etc
Work the other way. Code only in C++ (not in Java, and don't think in Java), but start by parsing a small chunk of your syntax. First, code the lexer. Test it. Then code the parser, probably a recursive descent parser, and test it on short simple subelements of your language. Perhaps you'll need some small look-ahead (an easy task: use a std::list<Token>). Keep going up.
Start by formalizing, with pencil and paper, your input language. Could you for instance write a simple BNF grammar for it? (Your question does not explain what the input is; it just gives an example.)
In C++ parlance: to parse a map<int,list<int>> you certainly need to be able to parse int and list<int>. So write first the parsers for these.
As commented by Mooing Duck, your input language (which you did not define, just gave an example) seems simple enough to avoid most of this. But still, the idea is the same, think directly in C++ and start by reading a simple subpart of the input. Test your code. When that works, increase the part that is accepted. Repeat all this.
Here's a very simple solution that uses nothing but C++ and standard libraries. It just reads line by line and pulls each element out of the line with stream extraction using operator>>.
#include <iostream>
#include <fstream>
#include <sstream>
#include <map>
#include <list>
#include <string>
int main(int argc, char* argv[])
{
// Parse command line
if( argc != 2 )
return 1;
std::ifstream fin(argv[1]); // input-only stream, so read-only files can be opened too
if( !fin.good() )
{
std::cerr << "Error opening file for reading: " << argv[1] << std::endl;
return 1;
}
// Skip first three lines
std::string line;
for( int i=0; i<3; ++i )
{
std::getline(fin, line);
}
// Read each line
std::map<int, std::list<int> > hits;
while( std::getline(fin, line) )
{
// Extract each element from the line
std::stringstream sstr(line);
// Read line number from first column
int linenum = 0;
sstr >> linenum;
// Interpret remaining columns as truth values
int truth = 0; // int, not bool: extracting a value such as 2 into a bool sets failbit and ends the loop early
int col=1;
while( sstr >> truth )
{
// Store position in map if true
if( truth )
{
hits[col].push_back(linenum);
}
col++;
}
}
// Print results
std::map<int, std::list<int> >::const_iterator col_iter;
for( col_iter = hits.begin(); col_iter != hits.end(); ++col_iter )
{
std::cout << "[" << col_iter->first << ": {";
std::list<int>::const_iterator line_iter;
for( line_iter = col_iter->second.begin(); line_iter != col_iter->second.end(); ++line_iter )
{
std::cout << *line_iter << " ";
}
std::cout << "} ]" << std::endl;
}
return 0;
}

Error copying and pasting data from a file to another

I am writing a code to merge multiple text files and output a single file.
There can be up to 22 input text files which contain 1400 lines each.
Each line has 8 bits of binary and the new line characters \n.
I am outputting a single file that has all 22 text files merged.
The problem is with my output file: after 1400 lines, it appears that content from the previous file is still being placed into the output file (although the previous file was only 1400 lines long). This extra content also has an additional blank line between each row if opened in Microsoft Office or Sublime; however, it is interpreted as a single line if opened in Notepad or Excel (a single cell in Excel).
Following is the picture of expected behaviour of the output file,
Here is a picture of abnormal behaviour. This starts when the first file finishes.
I know this data is from the first file still because the second file starts from 00000000
And here is the start of the second file,
And this abnormal behavior repeats every single time the files are switching.
My implementation to achieve this is as follows:
repeat:
if(user_input == 'y')
{
fstream data_out ("data.txt",fstream::out);
for(int i = 0; i<files_found; i++)
{
fstream data_in ((file_names[i].c_str()),fstream::in);
if(data_in.is_open())
{
data_in.seekg(0,data_in.end);
long size = data_in.tellg();
data_in.seekg(0,data_in.beg);
char * buffer = new char[size];
cout << size;
data_in.read(buffer,size);
data_out.write(buffer,size);
delete[] buffer;
}else
{
cout << "Unexpected error";
return 1;
}
data_in.close();
}
data_out.close();
}else if(user_input == 'n')
{
return 1;
}else
{
cout << "Input not recognised. Type y for Yes, and n for No";
cin >> user_input;
goto repeat;
}
Further information:
I have checked the size variable and it is as I expect, 14000.
8 bits, and a \ with n = 10 characters per line,
1400 rows x 10 = 14000.
Assuming reader of code to be experienced.
Sorry to bump this question, but I really like questions that are marked as answered. Joachim Pileborg's comment seems to have worked for you:
Also, instead of seeking and checking sizes and allocating memory, why
not just do e.g. data_out << data_in.rdbuf();? This will copy the
whole input file to the output. – Joachim Pileborg Jul 29 at 17:26
A reference http://www.cplusplus.com/reference/ios/ios/rdbuf/ and an example:
#include <fstream>
#include <string>
#include <vector>
int main(int argc, char** argv)
{
typedef std::vector<std::string> Filenames;
Filenames vecFilenames;
// Populate the list of file names
vecFilenames.push_back("Text1.txt");
vecFilenames.push_back("Text2.txt");
vecFilenames.push_back("Text3.txt");
// Merge the files into Output.txt
std::ofstream fpOutput("Output.txt");
for (Filenames::iterator it = vecFilenames.begin();
it != vecFilenames.end(); ++it)
{
std::ifstream fpInput(it->c_str());
fpOutput << fpInput.rdbuf();
fpInput.close();
}
fpOutput.close();
return 0;
}

Need to write specific lines of a text into a new text

I have numerical text data files ranging between 1 MB and 150 MB in size. I need to write out the lines of numbers related to heights; for example, with heights=4 the new text must include lines 1, 5, 9, 13, 17, 21, and so on.
I have been trying to find a way to do this for a while now, and tried using a list instead of a vector, which ended up with compilation errors.
I have cleaned up the code as advised. It now writes all the lines to the sample2 text file, all done here. Thank you all.
I am open to a change of method as long as it delivers what I need. Thank you for your time and help.
The following is what I have so far:
#include <iostream>
#include <fstream>
#include <string>
#include <list>
#include <vector>
using namespace std;
int h,n,m;
int c=1;
int main () {
cout<< "Enter Number Of Heights: ";
cin>>h;
ifstream myfile_in ("C:\\sample.txt");
ofstream myfile_out ("C:\\sample2.txt");
string line;
std::string str;
vector <string> v;
if (myfile_in.is_open()) {
myfile_in >> noskipws;
int i=0;
int j=0;
while (std::getline(myfile_in, line)) {
v.push_back( line );
++n;
if (n-1==i) {
myfile_out<<v[i]<<endl;
i=i+h;
++j;
}
}
cout<<"Number of lines in text file: "<<n<<endl;
}
else cout << "Unable to open file(s) ";
cout<< "Reaching here, Writing one line"<<endl;
system("PAUSE");
return 0;
}
You need to use seekg to set the position back to the beginning of the file once you have read it. You have read it once to count the lines, which I don't think you actually need, as this count is never used, at least in this piece of code.
And what is the point of the inner while? On each loop, you have
int i=1;
myfile_out<<v[i]; //Not writing to text
i=i+h;
So on each loop, i gets 1, so you output the element with index 1 all the time. Which is not the first element, as indices start from 0. So, once you put seekg or remove the first while, your program will start to crash.
So, make i start from 0. And get it out of the two while loops, right at the beginning of the if-statement.
Ah, the second while is also unnecessary. Leave just the first one.
EDIT:
Add
myfile_in.clear();
before seekg to clear the flags.
Also, your algorithm is wrong. You'll get seg fault, if h > 1, because you'll get out of range (of the vector). I'd advise to do it like this: read the file in the while, that counts the lines. And store each line in the vector. This way you'll be able to remove the second reading, seekg, clear, etc. Also, as you already store the content of the file into a vector, you'll NOT lose anything. Then just use for loop with step h.
Again edit, regarding your edit: no, it has nothing to do with any flags. The if, where you compare i==j is outside the while. Add it inside. Also, increment j outside the if. Or just remove j and use n-1 instead. Like
if ( n-1 == i )
Several things.
First you read the file completely, just to count the number of lines,
then you read it a second time to process it, building up an in memory
image in v. Why not just read it in the first time, and do everything
else on the in memory image? (v.size() will then give you the number
of lines, so you don't have to count them.)
And you never actually use the count anyway.
Second, once you've reached the end of file the first time, the
failbit is set; all further operations are no-ops, until it is reset.
If you have to read the file twice (say because you do away with v
completely), then you have to do myfile_in.clear() after the first
loop, but before seeking to the beginning.
You only test for is_open after having read the file once. This test
should be immediately after the open.
You also set noskipws, although you don't do any formatted input
which would be affected by it.
The final while is highly suspect. Because you haven't done the
clear, you probably never enter the loop, but if you did, you'd very
quickly start accessing out of bounds: after reading n lines, the size
of v will be n, but you read it with index i, which will be n * h.
Finally, you should explicitly close the output file and check for
errors after the close, just in case.
It's not clear to me what you're trying to do. If all you want to do is
insert h empty lines between each existing line, something like:
std::string separ( h + 1, '\n' );
std::string line;
while ( std::getline( myfile_in, line ) ) {
myfile_out << line << separ;
}
should do the trick. No need to store the complete input in memory.
(For that matter, you don't even have to write a program for this.
Something as simple as sed 's:$:\n\n\n\n:' < infile > outfile would do
the trick.)
EDIT:
Reading other responses, I gather that I may have misunderstood the
problem, and that he only wants to output every h-th line. If this is
the case:
std::string line;
while ( std::getline( myfile_in, line ) ) {
myfile_out << line << '\n';
for ( int count = h - 1; count > 0; -- count ) {
std::getline( myfile_in, line );
// or myfile_in.ignore( INT_MAX, '\n' );
}
}
But again, other tools seem more appropriate. (I'd follow thiton's
suggestion and use AWK.) Why write a program in a language you don't
know well when tools are already available to do the job?
If there is no absolutely compelling reason to do this in C++, you are using the wrong programming language for this. In awk, your whole program is:
{ if ( FNR % 4 == 1 ) print; }
Or, giving the whole command line e.g. in sh to filter lines 1,5,9,13,...:
awk '{ if ( FNR % 4 == 1 ) print; }' a.txt > b.txt

Textfiles C++ Editing the very first line

Example of Textfile:
5 <- I need to edit this number.
0
1
0
6
(Sample Code Not Whole Program)
#include <fstream>
#include <iostream>
using namespace std;
int main() {
int i;
cin>>i;
std::fstream file("example.txt", std::ios::in | std::ios::out | std::ios::app);
file.seekp(0);
file << i;
return 0;
}
With this code the number is added here:
(example.txt)
5
0
1
0
67 <<
Please note that numbers keep being added at the bottom of the file, so the edit must always target the first line, whatever number it holds, not that specific 5.
Please Help
Thanks
You have opened the file in a mode that requests that all new data is appended to the end of the file (std::ios::app). Don't specify that flag if you don't want to always append.
Note that you will encounter problems if the new line you're writing is not exactly the same length as the existing line. In the case where it's a different length, you will have to copy and rewrite the entire remainder of the file.
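When the new number may have a different length, one simple way to do the copy-and-rewrite is to build the new content in memory first. The sketch below is just one option; the name withFirstLineReplaced is hypothetical, and it assumes the file is small enough to hold in a string.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the content of `in` with its first line
// replaced by `newFirstLine`. The remaining lines are copied unchanged.
std::string withFirstLineReplaced(std::istream& in, const std::string& newFirstLine)
{
    std::string oldFirstLine;
    std::getline(in, oldFirstLine); // read and discard the existing first line

    std::ostringstream result;
    result << newFirstLine << '\n';
    result << in.rdbuf();           // copy the rest of the input as-is
    return result.str();
}
```

A caller would read example.txt through a std::ifstream, close it, and then write the returned string back with a std::ofstream opened on the same name (which truncates the old content).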