std::getline doesn't skip empty lines when reading from ifstream - c++

Consider this code:
vector<string> parse(char* _config) {
    ifstream my_file(_config);
    vector<string> my_lines;
    string nextLine;
    while (std::getline(my_file, nextLine)) {
        if (nextLine[0] == '#' || nextLine.empty() || nextLine == "") continue;
        my_lines.push_back(nextLine);
    }
    return my_lines;
}
and this config file:
#Verbal OFF
0
#Highest numeric value
100
#Deck
67D 44D 54D 63D AS 69H 100D 41H 100C 39H 10H 85H 7D 42S 6C 67H 61D 33D 28H 93S QH 5D 91C 40S 50C 74S 8C 98C 96C 71D 82S 75S 23D 40C 29S QC 84C 16C 80D 13H 35S
#Players
P1 1
P2 2
My goal is to parse the config file to a vector of strings, parsed line by line, ignoring empty lines and the '#' character.
When running this code in Visual Studio, the output is correct. But when running on Linux with g++, I still get some empty lines.

Your input file most likely has lines ending with CR LF, i.e. it is a Windows/DOS text file. On Linux, lines are expected to end with LF only, so std::getline() returns each line with a trailing CR character, and a line that looks empty is actually a line containing a single CR.
Before the existing code that checks the contents of nextLine, check whether the line is non-empty and ends with a CR character, and if so remove it. Then continue with your existing if statement.
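A minimal sketch of that fix, assuming C++11. The names stripTrailingCR and parseLines are made up for this illustration, and the loop reads from any std::istream rather than opening the config file itself:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: drop one trailing '\r' left behind by a CR LF line ending.
void stripTrailingCR(std::string& line) {
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();
    }
}

// Same loop as in the question, but the CR is removed before the
// emptiness and '#' checks run.
std::vector<std::string> parseLines(std::istream& in) {
    std::vector<std::string> lines;
    std::string nextLine;
    while (std::getline(in, nextLine)) {
        stripTrailingCR(nextLine);
        if (nextLine.empty() || nextLine[0] == '#') continue;
        lines.push_back(nextLine);
    }
    return lines;
}
```

With an ifstream opened on the config file, parseLines(my_file) should then behave the same for LF and CR LF files.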

Related

printing structs in C++

I'm trying to print 3 attributes from a struct. Why won't they all print? They will print 2 at a time, but not all three together.
name has a Carriage Return character at the end. So it's printing
2HP Potion\r10
\r moves the cursor to the beginning of the line, without moving to the next line, so 10 overwrites 2H.
I suspect this is because you read the name from a file that was written on Windows, which uses \r\n as its line break sequence in text files. You should either fix the file using dos2unix, or change the code that reads the file to remove \r characters.
You can remove the \r at the end with:
if (!name.empty() && name.back() == '\r') {
    name.pop_back();
}

ofstream not translating "\r\n" to new line character

I have written C++ code for changing file formats. Part of the functionality is to add a configured line-end sequence. For one of the file conversions, the required line ending is "\r\n", i.e. CR+LF.
My code basically reads the configured value from the DB and appends it to the end of each record. Something along the lines of
//read DB and store line end char in a string lets say lineEnd.
//code snippet for file writing
string record = "this is a record";
ofstream outFileStream;
string outputFileName = "myfile.txt";
outFileStream.open (outputFileName.c_str());
outFileStream<<record;
outFileStream<<lineEnd; // here line end contains "\r\n"
But this prints record followed by \r\n as it is, no translation to CR+NL takes place.
this is a record\r\n
While the following works (prints CR+LF in output file)
outFileStream<<record;
outFileStream<<"\r\n";
this is a record
But I cannot hard-code it. I am facing similar issues with "\n" too.
Any suggestions on how to do it?
The translation of \r into the ASCII character CR and of \n into the ASCII character LF is done by the compiler when parsing your source code, and in literals only. That is, the string literal "A\n" will be a 3-character array with values 65 10 0.
The output streams do not interpret escape sequences in any way. If you ask an output stream to write the characters \ and r after each other, it will do so (write characters with ASCII value 92 and 114). If you ask it to write the character CR (ASCII code 13), it will do so.
The reason std::cout << "\r"; writes the CR character is that the string literal already contains the character 13. So if your database includes the string \r\n (4 characters: \, r, \, n, ASCII 92 114 92 110), that is also the string you will get on output. If it contained the string with ASCII 13 10, that's what you'd get.
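To make the difference concrete, here is a small sketch. The function names are made up for this illustration; escapedText stands in for what a database field holding backslash, r, backslash, n would give you at run time:

```cpp
#include <string>

// "\r\n" in source code: the compiler stores two characters, CR (13) and LF (10).
std::string controlChars() { return "\r\n"; }

// "\\r\\n" in source code: four characters (92 114 92 110), i.e. the
// literal text backslash, 'r', backslash, 'n'.
std::string escapedText() { return "\\r\\n"; }
```

Writing controlChars() to a stream produces a line break; writing escapedText() produces the four visible characters.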
Of course, if it's impractical for you to store 13 10 in the database, nothing prevents you from storing 92 114 92 110 (the string "\r\n") in there, and translating it at runtime. Something like this:
void translate(std::string &str, const std::string &from, const std::string &to)
{
    std::size_t at = 0;
    for (;;) {
        at = str.find(from, at);
        if (at == str.npos)
            break;
        str.replace(at, from.size(), to);
        at += to.size();  // continue after the replacement so it is never rescanned
    }
}
std::string lineEnd = getFromDatabase();
translate(lineEnd, "\\r", "\r");
translate(lineEnd, "\\n", "\n");

reading and parsing a file, assigning each piece of the parsed string to its own variable

int Student::loadStudents() {
    Student newStudent;
    string comma;
    string line;
    ifstream myfile("student.dat");
    string name,email="";
    string status="";
    int id;
    if (myfile.is_open()){
        while ( getline (myfile,line) ) {
            //parse line
            string myText(line);
            istringstream iss(myText);
            if(!(iss>>id)) id=0;

            std::ignore(1,',');
            std::getline(iss,name,',');
            std::getline(iss,status,',');
            std::getline(iss,email,',');
            cout<<name<<endl;
            Student newStudent(id,name,status,email);
            Student::studentList.insert(std::pair<int,Student>(id,newStudent));
Above is the method I am defining. When the cout is executed the output is:
John Doe
Matt Smith
Before I added in the second getline(iss,name,',') the cout did nothing.
Can anyone explain why it works with the line repeated and why the same code won't work for status and email?
example line from file:
1,john doe,freshman,jd#email.com
EDIT:
I used std::ignore(1,',') before the first getline(iss,name,',') and received the error 'ignore' is undeclared in this namespace 'std'.
Can anyone explain why it works with the line repeated and why the same code won't work for status and email?
Because your first operation on iss is iss>>id.
Presumably your input file is of the form id,name,status,email. That first operation reads up to but not including the first comma. That first comma is still in the input stream. This means your first std::getline(iss,name,',') reads all the stuff remaining before that first comma and that first comma. All the stuff remaining before that first comma -- that's an empty string.
It's best not to mix parsing concepts. Split the line along the commas, then parse each of those split elements.
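One way to sketch the "split first, then parse" idea, assuming the id,name,status,email layout from the question. The names splitFields, StudentRecord and parseStudent are hypothetical; StudentRecord only stands in for the question's Student class:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one line on a delimiter.
std::vector<std::string> splitFields(const std::string& line, char delim = ',') {
    std::vector<std::string> fields;
    std::istringstream iss(line);
    std::string field;
    while (std::getline(iss, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}

// Hypothetical record type standing in for the question's Student class.
struct StudentRecord {
    int id;
    std::string name, status, email;
};

// Returns false when the line does not have exactly four fields or the
// first field does not begin with an integer.
bool parseStudent(const std::string& line, StudentRecord& out) {
    std::vector<std::string> f = splitFields(line);
    if (f.size() != 4) return false;
    std::istringstream idStream(f[0]);
    if (!(idStream >> out.id)) return false;
    out.name = f[1];
    out.status = f[2];
    out.email = f[3];
    return true;
}
```

Because every field is cut out by the same mechanism, no comma is left sitting in the stream between two different kinds of read.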
Edit
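Another way to deal with this issue: call iss.ignore() instead of that first call to std::getline. Note that ignore is a member function of the stream (std::istream::ignore), not a free function in namespace std, which is why std::ignore(1,',') does not compile. The next character to be read should be a comma, so just ignore it. This is okay if you can assume a properly formatted input file. It is not okay if you have to deal with the vagaries of input files created by humans.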
Another issue: Suppose someone's name is "John Doe, PhD" or the email address is "John Doe, PhD "?
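The "ignore the comma" alternative can be sketched with the stream's own ignore member function (std::istream::ignore). The function name readIdAndName is made up for this illustration, and the sketch assumes well-formed input where a comma directly follows the id:

```cpp
#include <sstream>
#include <string>

// Read the id with >>, discard the comma that >> leaves behind,
// then read the name up to the next comma.
bool readIdAndName(const std::string& line, int& id, std::string& name) {
    std::istringstream iss(line);
    if (!(iss >> id)) return false;
    iss.ignore(1, ',');  // skip at most one character
    return static_cast<bool>(std::getline(iss, name, ','));
}
```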
Edit 2
Just to clarify, suppose the line contains "1234,John Doe,freshman,jdoe#college_name.edu".
Input pointer prior to iss>>id:
1234,John Doe,freshman,jdoe#college_name.edu
^
The call to iss>>id sets id to 1234 and advances the input pointer to the first non-numeric character -- the first comma.
Input pointer after iss>>id (prior to first call to std::getline):
1234,John Doe,freshman,jdoe#college_name.edu
____^
The first std::getline(iss,name,',') sees the input pointer is at a comma. It sets name to the empty string and advances the input pointer to just after the comma.
Input pointer after first call to std::getline (prior to second call to std::getline):
1234,John Doe,freshman,jdoe#college_name.edu
_____^
The second std::getline(iss,name,',') reads up to the second comma. It sets name to "John Doe" and advances the input pointer to just after the second comma.
Input pointer after second call to std::getline (prior to third call to std::getline):
1234,John Doe,freshman,jdoe#college_name.edu
______________^

Parsing of file with Key value in C/C++

Need some help in parsing the file
Device# Device Name Serial No. Active Policy Disk# P.B.T.L ALB
Paths
--------------------------------------------------------------------------------------- -------------------------------------
1 AB OPEN-V-CM 50 0BC1F1621 1 SQST Disk 2 3.1.4.0 N/A
2 AB OPEN-V-CM 50 0BC1F1605 1 SQST Disk 3 3.1.4.1 N/A
3 AB OPEN-V*2 50 0BC1F11D4 1 SQST Disk 4 3.1.4.2 N/A
4 AB OPEN-V-CM 50 0BC1F005A 1 SQST Disk 5 3.1.4.3 N/A
The above information is in a devices.txt file, and I want to extract the device number corresponding to the disk number I input.
The disk number I input is just an integer (and not "Disk 2" as shown in the file).
Open the file and skip first 3 lines.
Start reading line by line from 4th line onward. You can get the device number easily as it is the first column.
To get the disk number, scan each line for space characters. Each time you pass a run of spaces you have moved past one column, so treat repeated spaces as a single separator and continue until you reach the disk number. Columns whose data itself contains spaces (the device name, for example) must be handled separately.
Load the disk number and device number into, say, a map, and later use your input to look up the device number in this map.
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

int main(int argc, char* argv[])
{
    int wantedDisknum = 4;
    int finalDeviceNum = -1;
    ifstream fin("test.txt");
    if (!fin.is_open())
        return -1;
    string line;
    // Loop on getline itself; testing eof() first would process a failed read.
    while (getline(fin, line))
    {
        stringstream ss(line);
        int deviceNum;
        // Header and separator lines do not start with a number; skip them.
        if (!(ss >> deviceNum))
            continue;
        string unused;
        int diskNum;
        // Skip the seven whitespace-separated tokens between the device number and the disk number.
        if (!(ss >> unused >> unused >> unused >> unused >> unused >> unused >> unused >> diskNum))
            continue;
        if (diskNum == wantedDisknum)
        {
            finalDeviceNum = deviceNum;
            break;
        }
    }
    cout << finalDeviceNum << endl;
    return 0;
}
On UNIX, you can easily achieve this using awk or another scripting language.
cat Device.txt | awk '{if ( $1 == 2 ) print}'
In C++, you have to extract the specific column using strtok and compare it with the wanted value; if it matches, print that line.
Assuming there is no "Disk" in any of the following columns:
1) Skip lines until you encounter '-' as the first character of a line, then skip that line too.
2) read a line
2.a) skip characters of the current line until isdigit(line[i]) function returns true, then read current character and characters following it into a temporary buffer until isdigit(line[i]) returns false. This is the device id.
2.b) Skip characters of the current line until you find a 'D'
2.b.i) match 'i', 's', 'k' characters, if any of them fails, go to 2.b
2.c) skip characters of the current line until isdigit(line[i]) function returns true, then read current character and characters following it into another buffer until isdigit(line[i]) returns false. This is the disk id.
3) print out both buffers
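Steps 2.a to 2.c above could be sketched for a single line as follows. The function names readDigits and extractIds are made up for this illustration, and std::string::find is used as a shortcut for the character-by-character match of 'D', 'i', 's', 'k'; the sketch relies on the same assumption that "Disk" appears in no other column:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Reads a run of digits starting at the first digit found at or after pos.
// Advances pos past the run; returns an empty string if no digit is found.
std::string readDigits(const std::string& line, std::size_t& pos) {
    while (pos < line.size() && !std::isdigit(static_cast<unsigned char>(line[pos]))) ++pos;
    std::size_t start = pos;
    while (pos < line.size() && std::isdigit(static_cast<unsigned char>(line[pos]))) ++pos;
    return line.substr(start, pos - start);
}

// Returns false if the line has no digits at all or no "Disk <n>" column,
// which is the case for the header and separator lines.
bool extractIds(const std::string& line, std::string& deviceId, std::string& diskId) {
    std::size_t pos = 0;
    deviceId = readDigits(line, pos);   // step 2.a
    if (deviceId.empty()) return false;
    pos = line.find("Disk", pos);       // steps 2.b and 2.b.i
    if (pos == std::string::npos) return false;
    pos += 4;                           // move past "Disk"
    diskId = readDigits(line, pos);     // step 2.c
    return !diskId.empty();
}
```

Calling extractIds on every line of the file and printing both strings when it returns true covers step 3.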
I don't have my Regular Expression cheat sheet handy, but I'm pretty sure it would be straightforward to run each line in the file through a regex that:
1) looks for an integer in the line
2) skips whitespace followed by text three times
3) matches characters one space and characters
Boost, Qt, and most other common C++ class libraries have a Regex parser for just this kind of thing.
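Since C++11 the standard library also ships std::regex, so a rough sketch needs no extra library. The pattern below is an assumption about the file layout (device number first, the word "Disk" followed by the disk number somewhere later), not the exact pattern outlined above, and the function name matchDeviceAndDisk is made up for this illustration:

```cpp
#include <regex>
#include <string>

// Captures the leading device number and the number that follows "Disk".
// Returns false for lines that do not fit that shape (header, separator).
bool matchDeviceAndDisk(const std::string& line, int& deviceNum, int& diskNum) {
    static const std::regex pattern(R"(^\s*(\d+)\s.*\bDisk\s+(\d+)\b)");
    std::smatch m;
    if (!std::regex_search(line, m, pattern)) return false;
    deviceNum = std::stoi(m[1].str());
    diskNum = std::stoi(m[2].str());
    return true;
}
```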

How to Parse A file to block in C++

I am looking for an elegant way to parse a file into blocks and create a new file for each block, for example:
original file:
line 1
line 2
line 3
line 4
line 5
line 6
line 7
result:
first file:
line 1
second file:
line 2
line 3
third file:
line 4
line 5
line 6
fourth file:
line 7
thanks
It looks like you could use this algorithm:
Count the number of spaces at the beginning of each line, if it's less than or equal to the number of spaces in the preceding non-empty line, open a new file.
What have you tried so far?
You could use scoped_ptrs to change the output file when the input line does not begin with whitespace:
std::ifstream in("/your/input/file");
boost::scoped_ptr<std::ofstream> out(NULL);
int out_serial = 0;
std::string line;
while(std::getline(in, line))
{
    // test: first character non blank
    if(! line.empty() && (line.at(0) != ' ' && line.at(0) != '\t'))
    {
        std::ostringstream new_output_file;
        new_output_file << "/your/output/file/prefix_" << out_serial++;
        // c_str() keeps this compiling before C++11, where ofstream has no std::string constructor
        out.reset(new std::ofstream(new_output_file.str().c_str()));
    }
    if(out.get() != NULL) (*out) << line << std::endl;
}
If your code is not properly indented, or your blocks are delimited by braces rather than by leading spaces, you can use a stack (std::stack from the STL). Push on each opening brace and pop on each closing brace. Open a new file each time the stack becomes empty.
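A rough sketch of the brace-stack idea, assuming C++11. To keep it short it collects each block as a string instead of opening a file per block, and it does not handle braces inside string literals or comments. The function name splitByBraces is made up for this illustration:

```cpp
#include <istream>
#include <sstream>
#include <stack>
#include <string>
#include <vector>

// Groups lines into blocks; a block ends on the line where the brace
// stack becomes empty again.
std::vector<std::string> splitByBraces(std::istream& in) {
    std::vector<std::string> blocks;
    std::stack<char> open;
    std::string line, current;
    while (std::getline(in, line)) {
        current += line + '\n';
        bool closedBlock = false;
        for (char c : line) {
            if (c == '{') {
                open.push(c);
            } else if (c == '}' && !open.empty()) {
                open.pop();
                if (open.empty()) closedBlock = true;
            }
        }
        if (closedBlock && open.empty()) {
            blocks.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) blocks.push_back(current);  // trailing lines after the last closed block
    return blocks;
}
```

Each returned string can then be written to its own output file, for example with a serial number in the file name as in the scoped_ptr answer.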