C++ cin.getline ignore empty lines - c++

I have a program that is given a file of a shape and reads it line by line to be able to put the shape into a 2D array. It's of an unknown size, so I have to count the rows as we go. Everything works fine, except I'm having trouble getting it to stop when there are empty lines trailing the input.
The piece of code in question is below:
while(cin.eof() != true){
getline(cin, input);
shape = shape + input;
rows++;
}
For example this will count 3 rows:
===
===
===
and this counts 4:
===
===
===
(empty line)
I need my program to ignore the empty lines, regardless of how many there are.
I've tried quite a few different things such as
if (!input.empty()){
shape = shape + input;
rows++;
}
or
if (input != " " && input[0] != '\0' && input[0] != '\n'){
shape = shape + input;
rows++;
}
These work if there is only one empty line, but if I had multiple empty lines it will only not count the very last one.
Shape and Input are both strings.

You have made a good choice to read a line at a time with getline(), but you are controlling your read loop incorrectly. See Why !.eof() inside a loop condition is always wrong.
Instead always control the continuation of your read loop based on the stream state resulting from the read function itself. In your case, you ignore the state after getline() and assume you have valid input -- which you won't when you read EOF. Why?
When you read the last line in your file, you will have read input, but the eofbit will not yet be set as you haven't reached the end-of-file yet. You loop checking cin.eof() != true (which it isn't yet) and then call getline (cin, input) BAM! Nothing was read and eofbit is now set, yet you blindly assign shape = shape + input; even though your read with getline() failed.
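To see the off-by-one in isolation, here is a minimal sketch that runs both loop styles over an in-memory stream instead of cin. The function names count_rows_eof_loop and count_rows_getline_loop are made up for this illustration; only the loop conditions come from the question and from the advice above.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Loop controlled by eof(), as in the question: the result of the read
// itself is never checked.
std::size_t count_rows_eof_loop(std::istream& in) {
    std::string input;
    std::size_t rows = 0;
    while (!in.eof()) {
        std::getline(in, input);  // the final call fails, but is counted anyway
        ++rows;
    }
    return rows;
}

// Loop controlled by the stream state that getline() leaves behind.
std::size_t count_rows_getline_loop(std::istream& in) {
    std::string input;
    std::size_t rows = 0;
    while (std::getline(in, input)) {  // false as soon as a read fails
        ++rows;
    }
    return rows;
}
```

Given a std::istringstream holding three lines that each end in '\n', the first function reports 4 rows and the second reports 3.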
Your second issue is how do you skip empty lines? Simple. If input.size() == 0 the line was empty. To "skip" empty lines, just continue and read the next. To "quit reading" when the first empty line is reached, replace continue with break.
A short example incorporating the changes above would be:
#include <iostream>
#include <string>
int main (void) {
std::string input{}, shape{};
std::size_t rows = 0;
while (getline(std::cin, input)) { /* control loop with getline */
if (input.size() == 0) /* if .size() == 0, empty line */
continue; /* get next */
shape += input; /* add input to shape */
rows++; /* increment rows */
}
std::cout << rows << " rows\n" << shape << '\n';
}
Also see: Why is “using namespace std;” considered bad practice? and avoid developing habits that will be harder to break later.
Example Use/Output
$ cat << eof | ./bin/inputshape
> ===
> ===
> ===
> eof
3 rows
=========
With a blank line at the end
$ cat << eof | ./bin/inputshape
> ===
> ===
> ===
>
> eof
3 rows
=========
Or with multiple blank lines:
$ cat << eof | ./bin/inputshape
> ===
> ===
> ===
>
>
> eof
3 rows
=========
(note: the eof used in input above is simply the heredoc sigil marking the end of input and has no independent significance related to the stream state eofbit or .eof(). It could just as well have been bananas, but EOF or eof are generally/traditionally used. Also, if you are not using bash or another shell supporting a heredoc, just redirect a file to the program, e.g. ./bin/inputshape < yourfile)
Look things over and let me know if you have further questions.
Edit Based On No Use of continue or break
If you can't use continue or break, then just turn the conditional around and only add to shape if input.size() != 0. For example:
while (getline(std::cin, input)) { /* control loop with getline */
if (input.size() != 0) { /* if .size() != 0, good line */
shape += input; /* add input to shape */
rows++; /* increment rows */
}
}
Exact same thing, just written a bit differently. Let me know if that works for you.
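One case the size() test does not cover is a line that only looks empty: a few spaces, or a lone '\r' left behind when a file with Windows line endings is read on another platform. Whether your input can contain such lines is an assumption on my part. If it can, a small helper along these lines (is_blank is a made-up name) could stand in for the size() check:

```cpp
#include <string>

// Hypothetical helper: true if the line holds nothing but spaces, tabs,
// or a carriage return left over from a "\r\n" line ending.
bool is_blank(const std::string& line) {
    return line.find_first_not_of(" \t\r") == std::string::npos;
}
```

In the loop above you would then write if (!is_blank(input)) in place of if (input.size() != 0).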

Related

seekg() seemingly skipping characters past intended position C++

I've been having an issue with parsing through a file and the use of seekg(). Whenever a certain character is reached in a file, I want to loop until a condition is met. The loop works fine for the first iteration but, when it loops back, the file seemingly skips a character and causes the loop to not behave as expected.
Specifically, the loop works fine if it is all contained in one line in the file, but fails when there is at least one newline within the loop in the file.
I should mention I am working on this on Windows, and I feel like the issue arises from how Windows ends lines with \r\n.
Using seekg(-2, std::ios::cur) after looping back fixes the issue when the beginning loop condition is immediately followed by a newline, but does not work for a loop contained in the same line.
The code is structured by having an Interpreter class hold the file pointer and relevant variables, such as the current line and column. This class also has a functional map defined like so:
// Define function type for command map
typedef void (Interpreter::*function)(void);
// Map for all the commands
std::map<char, function> command_map = {
{'+', increment_cell},
{'-', decrement_cell},
{'>', increment_ptr},
{'<', decrement_ptr},
{'.', output},
{',', input},
{'[', begin_loop},
{']', end_loop},
{' ', next_col},
{'\n', next_line}
};
It iterates through each character, deciding if it has functionality or not in the following function:
// Iterating through the file
void Interpreter::run() {
char current_char;
if(!this->file.eof() && this->file.good()) {
while(this->file.get(current_char)) {
// Make sure character is functional command (ie not a comment)
if(this->command_map.find(current_char) != this->command_map.end()) {
// Print the current command if in debug mode
if(this->debug_mode && current_char != ' ' && current_char != '\n') {
std::cout << this->filename << ":" << this->line << ":"
<< this->column << ": " << current_char << std::endl;
}
// Execute the command
(this->*(command_map[current_char]))();
}
// If it is not a functional command, it is a comment. The rest of the line is ignored
else{
std::string temp_line = "";
std::getline(file, temp_line);
this->line++;
this->column = 0;
}
this->temp_pos = file.tellg();
this->column++;
}
}
else {
std::cout << "Unable to find file " << this->filename << "." << std::endl;
exit(1);
}
file.close();
}
The beginning of the loop (signaled by a '[' char) sets the beginning loop position to this->temp_pos:
void Interpreter::begin_loop() {
this->loop_begin_pointer = this->temp_pos;
this->loop_begin_line = this->line;
this->loop_begin_col = this->column;
this->run();
}
When the end of the loop (signaled by a ']' char) is reached, if the condition for ending the loop is not met, the file cursor position is set back to the beginning of the loop:
void Interpreter::end_loop() {
// If the cell's value is 0, we can end the loop
if(this->char_array[this->char_ptr] == 0) {
this->loop_begin_pointer = -1;
}
// Otherwise, go back to the beginning of the loop
if(this->loop_begin_pointer > -1){
this->file.seekg(this->loop_begin_pointer, std::ios::beg);
this->line = this->loop_begin_line;
this->column = this->loop_begin_col;
}
}
I was able to put in debugging information and can show stack traces for further clarity on the issue.
Stack trace with one line loop ( ++[->+<] ):
+ + [ - > + < ] [ - > + < ] done.
This works as intended.
Loop with multiple lines:
++[
-
>
+<]
Stack trace:
+ + [ - > + < ] > + < ] <- when it looped back, it "skipped" '[' and '-' characters.
This loops forever since the end condition is never met (ie the value of the first cell is never 0 since it never gets decremented).
Oddly enough, the following works:
++[
-
>+<]
It follows the same stack trace as the first example. This working and the last example not working is what has made this problem hard for me to solve.
Please let me know if more information is needed about how the program is supposed to work or its outputs. Sorry for the lengthy post, I just want to be as clear as possible.
Edit 1:
The class has the file object as std::ifstream file;.
In the constructor, it is opened with
this->file.open(filename), where filename is passed in as an argument.
For a file stream, seekg is ultimately defined in terms of fseek from the C standard library. The C standard has this to say:
7.21.9.2/4 For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET.
So for a file opened in text mode, you can't do any arithmetic on offsets. You can rewind to the beginning, position at the end, or return to the position you were at previously and captured with tellg (which ultimately calls ftell). Anything else would exhibit undefined behavior.
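As a rough sketch of the pattern the standard does allow, the function below bookmarks a position with tellg(), reads ahead, and then seeks back to exactly that bookmark without adding or subtracting anything. The name read_then_reread is made up, and the sketch is meant to be tried on a std::istringstream, so it only illustrates the calling pattern and does not reproduce the Windows text-mode behaviour from the question.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Reads up to `count` characters, returns to where the read began, and
// reads them a second time. The two copies are joined with '|'.
std::string read_then_reread(std::istream& in, std::size_t count) {
    const std::istream::pos_type mark = in.tellg();  // bookmark, never modified

    std::string first;
    char c;
    for (std::size_t i = 0; i < count && in.get(c); ++i) {
        first.push_back(c);
    }

    in.clear();      // drop eofbit/failbit in case the read ran into the end
    in.seekg(mark);  // go back to the bookmark itself, no offset arithmetic

    std::string second;
    for (std::size_t i = 0; i < count && in.get(c); ++i) {
        second.push_back(c);
    }
    return first + "|" + second;
}
```

The important part is that the value passed to seekg() is the unmodified result of an earlier tellg() on the same stream.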

c++ appending text into a string until see a specific character

I have more than one input files like this:
>1aab_
GKGDPKKPRGKMSSYAFFVQTSREEHKKKHPDASVNFSEFSKKCSERWKT
MSAKEKGKFEDMAKADKARYEREMKTYIPPKGE
>1j46_A
MQDRVKRPMNAFIVWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEAE
KWPFFQEAQKLQAMHREKYPNYKYRPRRKAKMLPK
>1k99_A
MKKLKKHPDFPKKPLTPYFRFFMEKRAKYAKLHPEMSNLDLTKILSKKYK
ELPEKKKMKYIQDFQREKQEFERNLARFREDHPDLIQNAKK
>2lef_A
MHIKKPLNAFMLYMKEMRANVVAESTLKESAAINQILGRRWHALSREEQA
KYYELARKERQLHMQLYPGWSARDNYGKKKKRKREK
Here, what I have to do:
vector <string> names;
vector <string> seqs;
names.resize(total); //"total" is already known.
seqs.resize(total);
counter=0;char input;
while ((input = myInput.get()) != EOF)
{
if(input=='>')
names[counter]= take all line (>1aab_, >1j46_A, so...)
else
untill the see next '>' append the character into sequence[counter]
counter++;
}
Finally it will be like this:
names[0]=">1aab_"
sequence[0]="GKGDPKKPRGKMSSYAFFVQTSREEHKKKHPDASVNFSEFSKKCSERWKTMSAKEKGKFEDMAKADKARYEREMKTYIPPKGE"
and so on..
I have been thinking about this for 2 hours and couldn't figure it out. Can anyone help with that? Thanks in advance.
There are a few ways to solve it; I'll present some examples but I'm not testing/compiling this code, so there may be minor bugs - the logic is the important bit.
Since your pseudocode appears to be processing the input character by character, I've taken that as a requirement.
The way you seem to be thinking about it would be implemented with essentially a pair of loops - one for reading the name, the other for reading the sequence - which are enclosed in an outer loop, in order to process all records.
This would look something like the following:
// first character in file should be a '>', indicating the start
// of a record.
input = myInput.get();
if (input != '>')
{
std::cerr << "Malformed input file!" << std::endl;
return /*...*/;
}
do
{
// record name continues up until the newline
while ((input = myInput.get()) != EOF)
{
if (input == '\n' || input == '\r')
break;
names[counter].push_back(input);
}
// read sequence until we hit a '>' or EOF
while ((input = myInput.get()) != EOF)
{
if (input == '>')
{
// advance to next record number
counter++;
break;
}
sequence[counter].push_back(input);
}
} while (input != EOF && counter < total);
You'll also notice I moved the check for the initial '>' to before the loop, just as a way of ingesting (and discarding) the character, as well as a basic sanity check of the input. This is because we really use this character to mark the end of the sequence (rather than the "start of a record") - when we enter the loop, we assume we're already reading the record name.
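One detail the snippet does not show is the declaration of input. The question declares it as char, but get() returns an int so that EOF can be told apart from every real character value. The sketch below (count_chars is a made-up name) shows the comparison with an int variable, which is what I would assume for the loops here:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Counts every character in the stream, including bytes such as 0xFF that
// a plain char variable could confuse with EOF.
std::size_t count_chars(std::istream& in) {
    std::size_t n = 0;
    int ch;  // int, not char: must be able to hold EOF as a distinct value
    while ((ch = in.get()) != std::char_traits<char>::eof()) {
        ++n;
    }
    return n;
}
```

With char input, a 0xFF byte can compare equal to EOF where char is signed, and EOF can never compare equal where char is unsigned.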
Another way to approach it is to use a state machine. Essentially, this utilises additional variables to track the state the parser is in.
For this particular case, you only have two states: either you're reading a record name, or the sequence. So, we can just use a single boolean to track which state we're in.
Armed with the state variable, we can then make decisions about what to do with the character we just read based upon the state we're in. At the simplest level here, if we're in "read the record name" state, we add the character to the names variable, otherwise we add it to the sequence variable.
// state flag to indicate if we're currently reading a name line,
// i.e. a line starting with ">"
// This should be set true by the first record we encounter, so
// we'll set it false (to indicate we're reading a sequence) in
// order to allow us to detect bad input files.
bool reading_name = false;
// indicate we're on the first record, so we can avoid incrementing
// the record counter
bool first_record = true;
// process input character-by-character until end of file
while ((input = myInput.get()) != EOF)
{
// check for start of new record
if (input == '>')
{
// for robustness, verify we're not already reading a name,
// as this probably indicates invalid input
if (reading_name)
{
std::cerr << "Input is malformed?!" << std::endl;
break;
}
// switch to reading name state
reading_name = true;
// advance to next record, but only if it isn't the first record
if (first_record)
{
// disable the first_record flag, and explicitly set the
// record counter to 0.
first_record = false;
counter = 0;
}
else if (++counter >= total)
{
std::cerr << "Error: too many records!" << std::endl;
break;
}
}
// first character in file should start a new record
else if (first_record)
{
std::cerr << "Missing record start character at beginning of input!" << std::endl;
break;
}
// make sure we are processing a valid record number
else if (counter >= total)
{
std::cerr << "Invalid record number!" << std::endl;
break;
}
// continue reading the name
else if (reading_name)
{
// check if we've reached the end of the line; you
// may also want/need to check for \r if your input
// files may have Windows-style line endings
if (input == '\n')
{
// switch to reading sequence state
reading_name = false;
}
else
{
// add character to current name
names[counter].push_back(input);
}
}
// continue reading the sequence
else
{
// you might need to handle line ending characters here,
// maybe just skipping them?
// add character to current sequence
sequence[counter].push_back(input);
}
}
This adds a fair amount of complexity, which is of questionable value for this particular exercise, but does make adding additional states easier in future. It also has the benefit of only a single place in the code where I/O is done, which reduces the chances of errors (not checking for EOF, overflow array bounds, etc.).
In this case, we're actually using the '>' character as an indicator that a new record is starting, so we add a bit of extra logic to make sure that all works properly with the record counter. You could also just use a signed integer for your counter variable and start it at -1, so it will increment to 0 at the start of the first record, but using signed variables to index into arrays isn't a good idea.
There are more ways to approach this problem, but hopefully this gives you somewhere to start on your own solution.
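As one such alternative, and dropping the character-by-character requirement I assumed above, here is a line-based sketch. The function name read_records is made up, and the vectors grow as records are found instead of being resized to total up front.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: fills names/seqs from '>'-delimited records.
// Returns false if sequence data appears before the first '>' line.
bool read_records(std::istream& in,
                  std::vector<std::string>& names,
                  std::vector<std::string>& seqs) {
    std::string line;
    while (std::getline(in, line)) {
        // tolerate Windows-style line endings
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;        // skip blank lines
        if (line[0] == '>') {              // start of a new record
            names.push_back(line);
            seqs.emplace_back();
        } else if (seqs.empty()) {         // data before any '>' line
            return false;
        } else {
            seqs.back() += line;           // join the wrapped sequence lines
        }
    }
    return true;
}
```

Because getline() strips the newline, the wrapped sequence lines concatenate into one string without any extra handling of line endings.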

How to discard from streams? .ignore() doesn't work for this purpose, any other methods?

I don't fully understand streams. The idea is to read a file through an ifstream and then work with it: extract data from the stream into a string, and discard from the stream the part that is now in the string. Is that possible? If not, how should this be handled?
The following method reads from a file that the ifstream opens correctly. It is a text file containing information about "Lost" episodes (an episode guide), and the method works fine for one object of the Episode class. Every time I instantiate an Episode, I want to take the information about one episode out of the stream, discard it from the stream, and process it in a string. The end of an episode is marked by "****", after which the next episode starts. When I create another Episode object, it should get the information from that "****" up to the next "****", and so on.
void Episode::read(ifstream& in) {
string contents((istreambuf_iterator<char>(in)), istreambuf_iterator<char>());
size_t episodeEndPos = contents.find("****");
if ( episodeEndPos == -1) {
in.ignore(numeric_limits<char>::max());
in.clear(), in.sync();
fullContent = contents;
}
else { // empty stream for next episode
in.ignore(episodeEndPos + 4);
fullContent = contents.substr(0, episodeEndPos);
}
// fill attributes
setNrHelper();
setTitelHelper();
setFlashbackHelper();
setDescriptionHelper();
}
I tried inFile >> words (to read the words out of the stream). Another option I was considering is .ignore() (to skip a number of characters in the stream), but that doesn't work as intended. Sorry for my bad English; hopefully it's clear what I want to do.
If your goal is for each call of Read() to read the next episode and advance in the file, then the trick is to use tellg() and seekg() to bookmark the position and update it:
void Episode::Read(ifstream& in) {
streampos pos = in.tellg(); // backup current position
string fullContent;
string contents((istreambuf_iterator<char>(in)), istreambuf_iterator<char>());
size_t episodeEndPos = contents.find("****");
if (episodeEndPos == string::npos) { // no separator found
in.ignore(numeric_limits<streamsize>::max());
in.clear(), in.sync();
fullContent = contents;
}
else { // empty stream for next episode
fullContent = contents.substr(0, episodeEndPos);
in.seekg(pos + streamoff(episodeEndPos + 4)); // position file at next episode
}
}
This way you can call your function several times, with each call reading the next episode.
However, please note that your approach is not optimised. When you construct your contents string from a stream iterator, you load the full rest of the file in the memory, starting at the current position in the stream. So here you keep on reading and reading again big subparts of the file.
Edit: streamlined version adapted to your format
You just need to read the line, check if it's not a separator line and concatenate...
void Episode::Read(ifstream& in) {
string line;
string fullContent;
while (getline(in, line) && line !="****") {
fullContent += line + "\n";
}
cout << "DATENSATZ: " << fullContent << endl; // just to verify content
// fill attributes
//...
}
The code you have reads the entire stream in one go just to use part of the text to initialize an object. With a gigantic file, that is almost certainly a bad idea. The easier approach is to read only until the end marker is found. In an ideal world, the end marker is easy to find. Based on the comments, it seems to be on a line of its own, which makes this quite easy:
void Episode::read(std::istream& in) {
std::string text;
for (std::string line; std::getline(in, line) && line != "****"; ) {
text += line + "\n";
}
fullContent = text;
}
If the separator isn't on a line of its own, you could use code like this instead:
void Episode::read(std::istream& in) {
std::string text;
for (std::istreambuf_iterator<char> it(in), end; it != end; ++it) {
text.push_back(*it);
if (*it == '*' && 4u <= text.size() && text.substr(text.size() - 4) == "****") {
++it; // consume the final '*' so the next read starts after the separator
break;
}
}
if (4u <= text.size() && text.substr(text.size() - 4u) == "****") {
text.resize(text.size() - 4u);
}
fullContent = text;
}
Both of these approaches simply read the file from start to end and consume the characters to be extracted in the process, stopping as soon as one record has been read.
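To illustrate how consuming only one record per call lets the same stream be handed to the reader again, here is a sketch written as a free function instead of a member of Episode. The name read_record and its bool return value are made up for the illustration, and it assumes the separator sits on a line of its own.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical free-function variant of Episode::read. Reads one record
// into `record` and returns false once the stream has nothing left.
bool read_record(std::istream& in, std::string& record) {
    record.clear();
    bool got_line = false;
    for (std::string line; std::getline(in, line); ) {
        if (line == "****") {
            return true;         // separator consumed, record complete
        }
        record += line + "\n";
        got_line = true;
    }
    return got_line;             // last record may lack a trailing separator
}
```

Calling it in a loop such as while (read_record(in, text)) hands back one episode per iteration, because each call leaves the stream positioned just after the separator it found.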

How to check for '\n' character using getline() for c-strings

I am trying to increment the variable int lines every time getline encounters a '\n' character in an input file. However, my variable isn't being incremented after a new line, and I'm assuming I'm not correctly checking the buffer that I'm loading the characters of the line into. Here is my code with much of it simplified:
int lines = 0;
while(input.getline(buffer, 100))
{
if(buffer[0] == '\n')
lines++;
}
File format(I want it to increment once it encounters the '\n' between the two lines of data):
20012 CSCI 109 04 90 1 25
-- ID days_constraint start_contraint
Thanks guys.
if(buffer[0] == '\n')
That only looks at the first character, and it can never match: getline() extracts the '\n' delimiter from the stream but does not store it in buffer, so even for a blank line buffer[0] is '\0'. Checking the last stored character instead, as in
if(buffer[strlen(buffer) - 1] == '\n')
fails for the same reason (and indexes out of bounds when the line is empty).
As #DavidSchwartz points out, you don't need the check at all.
Assuming you used std::basic_istream::getline, '\n' was the delimiter, so every successful getline corresponds to one line. Just increment on each successful call.
int lines = 0;
while(input.getline(buffer, 100))
{
lines++;
}
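// other error-handling logic
if (input.fail() && !input.eof())
{
// the loop stopped before end of file (e.g. a line did not fit in the
// buffer), so the line count is not valid
}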
if (input.bad())
{
//......
}
This assumes the file is valid and every line is shorter than 100 characters; otherwise you should add error-handling logic.
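Put together as a function, the counting loop and the check afterwards might look like the sketch below. The name count_lines and the convention of returning -1 on an incomplete read are my own choices for the illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Returns the number of lines read, or -1 if reading stopped before the end
// of the stream (for example because a line did not fit into the buffer).
long count_lines(std::istream& input) {
    char buffer[100];
    long lines = 0;
    while (input.getline(buffer, sizeof buffer)) {
        ++lines;  // one successful getline() is one line
    }
    // A normal end of input sets eofbit; any other stop is an error.
    return input.eof() ? lines : -1;
}
```

A last line without a trailing '\n' is still counted, since getline() succeeds as long as it extracted at least one character.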

copying after a line has been found from a file from that position till the end of that file in c++

I have a file which holds protein coordinates as well as other information preceding it. My aim is to look for a certain line called "$PARAMETERS" and then copy from there every line succeeding it till the end of the file.
How can I get that done? This is the small piece of code I wrote as part of the larger program (which someone else wrote years ago; I took it over to upgrade it for my research):
ifstream InFile;
InFile.open (DC_InFile.c_str(), ios::in);
while ( not InFile.eof() )
{
Line = NextLine (&InFile);
if (Line.find ("#") == 0) continue; // skip lines starting with # (comments)
if (Line.length() == 0) continue; // skip empty lines
size_t pos = Line.find("$PARAMETERS");
Line.copy(Line.begin("$PARAMETERS")+pos, Line.end("$END"));
&Line.copy >> x_1 >> y_2 >> z_3;
}
Bearing in mind that I defined Line as string
I guess you need to read data between $PARAMETERS and $END, not from $PARAMETERS until end of file. If so, you can use the following code:
string str;
while (getline(InFile, str))
{
if (str.find("#") == 0)
continue;
if (str.length() == 0)
continue;
if (str.find("$PARAMETERS") == 0)
{
double x_1, y_2, z_3; // you want to read numbers, i guess
while (getline(InFile, str))
{
if (str.find("$END") == 0)
break;
stringstream stream(str);
if (stream >> x_1 >> y_2 >> z_3)
{
// Do whatever you want with x_1, y_2 and z_3
}
}
}
}
This will handle multiple sections of data; not sure if you really want this behavior.
For example:
# comment
$PARAMETERS
1 2 3
4 5 6
$END
#unrelated data
100 200 300
$PARAMETERS
7 8 9
10 11 12
$END
I'm not sure what you want on the first line of the copied file but assuming you get that straight and you haven't read beyond the current line, you can copy the tail of the file you are reading like this:
out << InFile.rdbuf();
Here out is the std::ostream you want to send the data to.
Note, that you should not use InFile.eof() to determine whether there is more data! Instead, you should read what you want to read and then check that the read was successful. You need to check after reading because the stream cannot know what you are trying to read before you have done so.
Following up on Dietmar's answer: it sounds to me like you should be using std::getline until you find a line which matches your pattern. If you want that line as part of your output, then output it, then use Dietmar's solution to copy the rest of the file. Something like:
while ( std::getline( in, line ) && ! isStartLine( line ) ) {
}
if ( in ) { // Since you might not have found the line
out << line << '\n'; // If you want the matching line
// You can also edit it here.
out << in.rdbuf();
}
And don't put all sorts of complicated parsing information, with continue and break, in the loop. The results are both unreadable and unmaintainable. Factor it out into a simple function, as above: you'll also have a better chance of getting it right. (In your case, should you match "$PARAMETERS # xxx", or not?) In a separate function, it's much easier to get it right.
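For completeness, here is one way the pieces could fit together. The predicate below is only a guess at what isStartLine might do: it treats a line as the start line when its first whitespace-separated token is "$PARAMETERS", so "$PARAMETERS # xxx" matches. The names is_start_line and copy_from_marker are made up.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical predicate: the first token on the line is "$PARAMETERS".
bool is_start_line(const std::string& line) {
    std::istringstream tokens(line);
    std::string first;
    return (tokens >> first) && first == "$PARAMETERS";
}

// Copies the start line and everything after it from `in` to `out`.
// Returns false if no start line was found.
bool copy_from_marker(std::istream& in, std::ostream& out) {
    std::string line;
    while (std::getline(in, line) && !is_start_line(line)) {
        // skip everything before the start line
    }
    if (!in) {
        return false;            // reached the end without a match
    }
    out << line << '\n';         // keep the matching line itself
    // Inserting an empty rdbuf() sets failbit on `out`, so look ahead first.
    if (in.peek() != std::char_traits<char>::eof()) {
        out << in.rdbuf();       // copy the rest in one go
    }
    return true;
}
```

Changing what counts as a start line then only means editing is_start_line; the copying loop stays as it is.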