Ifstream stops reading file after a few lines - c++

I am using an ifstream into a stringstream for reading a file but it stops after a couple lines...
string read(string filename)
{
ifstream inFile;
inFile.open(filename);
stringstream strStream;
strStream << inFile.rdbuf();
inFile.close();
string str = strStream.str();
return str;
}
This code stops after 'zh¬'
I suspect control characters from the ASCII table are involved: the first character after the point where it stops has the value 26.
But I wouldn't think that matters.

Your ifstream is being opened in text mode. Try opening the file in binary mode:
std::ifstream inFile(filename, std::ios::binary);
A text stream is an ordered sequence of characters composed into lines (zero or more characters plus a terminating '\n'). Whether the last line requires a terminating '\n' is implementation-defined. Characters may have to be added, altered, or deleted on input and output to conform to the conventions for representing text in the OS (in particular, C streams on Windows OS convert \n to \r\n on output, and convert \r\n to \n on input)
Data read in from a text stream is guaranteed to compare equal to the data that were earlier written out to that stream only if all of the following is true:
the data consist only of printing characters and the control characters \t and \n (in particular, on Windows OS, the character '\x1A' terminates input)
no \n is immediately preceded by a space character (space characters that are written out immediately before a \n may disappear when read)
the last character is \n
A binary stream is an ordered sequence of characters that can transparently record internal data. Data read in from a binary stream always equals to the data that were earlier written out to that stream. Implementations are only allowed to append a number of null characters to the end of the stream. A wide binary stream doesn't need to end in the initial shift state.
https://en.cppreference.com/w/cpp/io/c#Binary_and_text_modes
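As a rough illustration of the suggestion above, here is a minimal sketch of the read function with the stream opened in binary mode. The name read_file and the choice to return an empty string when the file cannot be opened are assumptions made for this example, not part of the original code.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the full contents of `filename`, byte for byte.
// Returns an empty string if the file cannot be opened (an assumption made
// for this sketch; real code may prefer to report the error).
std::string read_file(const std::string& filename)
{
    // std::ios::binary disables text-mode translation, so a 0x1A byte
    // (Ctrl-Z, decimal 26) is not treated as end of input on Windows.
    std::ifstream inFile(filename, std::ios::binary);
    if (!inFile) {
        return std::string();
    }
    std::ostringstream contents;
    contents << inFile.rdbuf();  // copy everything the stream buffer holds
    return contents.str();
}
```

Calling read_file("data.bin") (file name hypothetical) should then return every byte, including any character with the value 26 and whatever follows it.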

Related

cin is not accepting input with space in them in C++?

#include <iostream>
using namespace std;
int main(){
string doi, name, address, phone;
cout<<"Please provide these credentials:\n";
cout<<"1.Date of issue\n";
cin>>doi;
cout<<"2.Name\n";
cin>>name;
cout<<"3.Address\n";
cin>>address;
cout<<"4.Phone Number\n\n";
cin>>phone;
return 0;
}
When I give a name containing spaces as input, e.g. "John Doe", instead of storing the whole value in name, the program splits it at the space and stores "John" in name and "Doe" in address.
If you have spaces in the string you want to read, you could use std::getline like this:
std::getline(std::cin, name);
instead of the line:
std::cin >> name;
Note that the getline will read all characters up to a newline character.
Also, please avoid using namespace std;.
You should use getline() in place of cin when you need to input strings with spaces.
In your case the syntax will be
string name;
getline(cin,name);
for more info on getline https://www.geeksforgeeks.org/getline-string-c/
There's a lot of muddled terminology in the comments and the answers. std::cin is an object; it doesn't do anything on its own.
Functions that read from input streams fit into one of two categories: they do formatted input or unformatted input. Formatted input functions translate the text that they get from the input stream (here, std::cin) into the data type that they're trying to read:
int i;
std::cin >> i; // operator>> reads text and translates it into an integer value
Formatted input functions begin by skipping whitespace, then they read characters and translate them; when the function encounters a character that isn't valid for the type that they're reading, or when they see whitespace, they stop. So in the example above, if you typed " 32 ", the stream extractor would skip the leading space, read the 3 and the 2, see the following space, and stop reading. The value stored into i would be 32.
std::string data;
std::cin >> data;
Here, if you type "Hello, world", the stream extractor (operator>>) will read up to the space, and store "Hello," in data.
If you want to read whitespace as well as non-whitespace you need an unformatted input function:
std::string data;
std::getline(std::cin, data);
Here, the call to getline reads text from std::cin up to the first newline character or to the end of the input stream. So if you typed " 32 " for this code, data would hold the text " 32 ". If you typed "Hello, world", data would hold the text "Hello, world".
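To make the difference concrete, here is a small sketch that runs both kinds of input function over the same text. The helper names read_word and read_line are made up for this example, and the functions take any std::istream so they can be tried with a std::istringstream instead of std::cin.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Formatted input: operator>> skips leading whitespace and stops at the
// next whitespace character, so only the first word is returned.
std::string read_word(std::istream& in)
{
    std::string word;
    in >> word;
    return word;
}

// Unformatted input: std::getline keeps spaces and stops at the newline,
// which it extracts and discards.
std::string read_line(std::istream& in)
{
    std::string line;
    std::getline(in, line);
    return line;
}
```

Given the input "John Doe" followed by a newline, read_word yields "John" while read_line yields "John Doe".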
And note that if you mix formatted input functions with unformatted input functions you have to be careful about leftover whitespace:
int i;
std::string data;
std::cin >> i;
std::getline(std::cin, data);
If you typed "32 Hello, world" on a single line, i would get the 32, and data would get " Hello, world".
On the other hand, if you type two lines of input, the first with "32" and the second with "Hello, world", you'll get 32 in i, but data will be empty. That's because the stream extractor read the 3 and the 2, then saw a newline character, and stopped, leaving the newline in the input buffer. std::getline then read that newline character and it, too, stopped. But it read the entire line: it swallowed the newline character.
So when your code switches from formatted input to unformatted you have to deal with whitespace characters remaining in the input buffer. If you want to read them, fine; but if you don't, you need to remove them:
int i;
std::string data;
std::cin >> i;
std::getline(std::cin, data); // read the rest of the line
std::getline(std::cin, data); // read the next line of text
A better approach is to do that cleanup with something like std::cin.ignore(42, '\n');. std::istream::ignore is an unformatted input function; in this call it reads up to 42 characters, looking for a newline character. It stops reading when it has read 42 characters, sees a newline character, or hits the end of the input. That's better than using std::getline(std::cin, data) because it doesn't store the text into data, which could require a bunch of resizing if there's a lot of text in the remainder of the line. The more usual form for that call is to pass std::numeric_limits<std::streamsize>::max() (declared in <limits>) as the size argument; that's a special case, and it puts no limit on the number of characters to be read. So std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); will read characters until it finds a newline or hits the end of the input.
int i;
std::string data;
std::cin >> i;
std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); // discard the rest of the line
std::getline(std::cin, data); // read the next line of text
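The same pattern can be wrapped in a function and tried without typing anything at a console. This is only a sketch: the name read_number_then_line is made up, and it passes std::numeric_limits<std::streamsize>::max(), the value that std::istream::ignore treats as "no limit".

```cpp
#include <ios>
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: reads an int, discards whatever is left on that
// line (including the newline), then reads the following line in full.
std::pair<int, std::string> read_number_then_line(std::istream& in)
{
    int number = 0;
    std::string text;
    in >> number;  // formatted input: leaves the newline in the buffer
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    std::getline(in, text);  // unformatted input: reads the next line
    return std::make_pair(number, text);
}
```

With the two input lines "32" and "Hello, world", the returned pair holds 32 and "Hello, world"; without the ignore call the string would come back empty.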

How to read string from file with line break using ifstream c++?

I used this code to read lines from file, but I noticed, that it didn't read line breaks:
ifstream fs8(sourceFile);
string line;
while (getline(fs8, line))
{
// here I convert from UTF-8 to UTF-16, but I also need to convert the "\n" character
}
How can I read a line together with its line break?
std::getline() reads data up to a delimiter, which is not stored. By default, that delimiter is '\n'. So you would have to either:
a) Pick a different delimiter -- but then you would no longer read "lines".
b) Add the newline to the data read (line += '\n').
I'd go for b), if you really need that newline converted. (I don't quite see why that would be necessary, but who am I to judge. ;-) )
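A possible shape for option b), as a sketch only: the function name is made up, and the eof() check is an addition of mine so that no newline is invented for a last line that did not end with one.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads every line from `in` and puts back the '\n'
// that std::getline extracts and discards.
std::string read_lines_keeping_newlines(std::istream& in)
{
    std::string text;
    std::string line;
    while (std::getline(in, line)) {
        text += line;
        // eofbit is set when getline ran into the end of the input before
        // finding a delimiter, i.e. the last line had no trailing newline.
        if (!in.eof()) {
            text += '\n';
        }
    }
    return text;
}
```

In the loop from the question, the same idea amounts to appending '\n' to line before converting it.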

Split string by delimiter by using vectors - how to split by newline?

I have function like this (I found it somewhere, it works with \t separator).
vector<string> delimited_str_to_vector(string& str, string delimiter)
{
vector<string> retVect;
size_t pos = 0;
while(str.substr(pos).find(delimiter) != string::npos)
{
retVect.push_back(str.substr(pos, str.substr(pos).find(delimiter)));
pos += str.substr(pos).find(delimiter) + delimiter.size();
}
retVect.push_back(str.substr(pos));
return retVect;
}
I have problem with splitting string by "\r\n" delimiter. What am I doing wrong?
string data = get_file_contents("csvfile.txt");
vector<string> csvRows = delimited_str_to_vector(data, "\r\n");
I'm sure, that my file uses CRLF for new line.
You can use getline to read the file line by line, which:
Extracts characters from is and stores them into str until the delimitation character delim is found (or the newline character, '\n' ...) If the delimiter is found, it is extracted and discarded, i.e. it is not stored and the next input operation will begin after it.
Perhaps you are already reading the file through a function that removes line endings.
If you open your file in text mode, i.e., you don't mention std::ios_base::binary (or one of its alternate spellings), it is likely that the system-specific end-of-line sequence is replaced by a \n character. That is, even if your source file used \r\n, you may not see this character sequence when reading the file. Add the binary flag when opening the file if you really want to process these sequences.
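For completeness, here is a sketch of a splitter that searches from a start position with std::string::find instead of building substrings on every step. The name split and the handling of an empty delimiter are assumptions for this example. It only finds "\r\n" if that sequence is still present in the string, which is why the file would need to be read in binary mode.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `str` at every occurrence of `delimiter`.
// An empty delimiter returns the whole string as a single element
// (a choice made for this sketch).
std::vector<std::string> split(const std::string& str, const std::string& delimiter)
{
    std::vector<std::string> parts;
    if (delimiter.empty()) {
        parts.push_back(str);
        return parts;
    }
    std::size_t start = 0;
    std::size_t found;
    // find(delimiter, start) searches from `start`, so no temporary copies
    while ((found = str.find(delimiter, start)) != std::string::npos) {
        parts.push_back(str.substr(start, found - start));
        start = found + delimiter.size();
    }
    parts.push_back(str.substr(start));  // text after the last delimiter
    return parts;
}
```

If the data was read in text mode on Windows, splitting on "\n" instead of "\r\n" should give the expected rows.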

Reading a text file in c++

string numbers;
string fileName = "text.txt";
ifstream inputFile;
inputFile.open(fileName.c_str(),ios_base::in);
inputFile >> numbers;
inputFile.close();
cout << numbers;
And my text.txt file is:
1 2 3 4 5
basically a set of integers separated by tabs.
The problem is the program only reads the first integer in the text.txt file and ignores the rest for some reason. If I remove the tabs between the integers it works fine, but with tabs between them, it won't work. What causes this? As far as I know it should ignore any white space characters or am I mistaken? If so is there a better way to get each of these numbers from the text file?
When reading formatted strings the input operator starts by ignoring leading whitespace. Then it reads non-whitespace characters up to the first whitespace character and stops. The non-whitespace characters get stored in the std::string. If there are only whitespace characters before the stream reaches end of file (or some error for that matter), reading fails. Thus, your program reads one "word" (in this case a number) and stops reading.
Unfortunately, you only said what you are doing and what the problems are with your approach (where your problem description failed to cover the case where reading the input fails in the first place). Here are a few things you might want to try:
If you want to read multiple words, you can do so, e.g., by reading all words:
std::vector<std::string> words;
std::copy(std::istream_iterator<std::string>(inputFile),
std::istream_iterator<std::string>(),
std::back_inserter(words));
This will read all words from inputFile and store them as a sequence of std::strings in the vector words. Since your file contains numbers you might want to replace std::string by int to read numbers in a readily accessible form.
If you want to read a line rather than a word you can use std::getline() instead:
if (std::getline(inputFile, line)) { ... }
If you want to read multiple lines, you'd put this operation into a loop: There is, unfortunately, no ready-made approach to read a sequence of lines as there is for words.
If you want to read the entire file, not just the first line, into a string, you can also use std::getline() but you'd need to know about one character value which doesn't occur in your file, e.g., the null value:
if (std::getline(inputFile, text, char())) { ... }
This approach considers a "line" a sequence of characters up to a null character. You can use any other character value as well. If you can't be sure about the character values, you can read an entire file using std::string's constructor taking iterators:
std::string text((std::istreambuf_iterator<char>(inputFile)),
std::istreambuf_iterator<char>());
Note that the extra pair of parentheses around the first parameter is, unfortunately, necessary (if you are using C++ 2011 you can avoid them by using braces instead of parentheses).
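Since the file in the question holds integers, the iterator approach with int in place of std::string might look like the sketch below. The function name read_numbers is made up, and it accepts any std::istream so it works with an opened std::ifstream as well as a std::istringstream.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical helper: extracts whitespace-separated integers from `in`
// until extraction fails (end of input or a token that is not a number).
std::vector<int> read_numbers(std::istream& in)
{
    return std::vector<int>(std::istream_iterator<int>(in),
                            std::istream_iterator<int>());
}
```

For a stream holding the tab-separated text 1 2 3 4 5 this yields a vector with the five values; tabs, spaces and newlines are all skipped as whitespace.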
Use getline to do the reading.
string numbers;
if (inputFile.is_open())//checking if open
{
getline (inputFile,numbers); //fetches entire line into string numbers
inputFile.close();
}
Your program behaves exactly as you describe: inputFile >> numbers; extracts only the first whitespace-delimited token from the input file. If you remove the tabs, inputFile >> extracts the single token 12345, not the five numbers [1,2,3,4,5].
A better method (it reads one character at a time, so it only handles single-digit numbers):
vector< int > numbers;
string fileName = "text.txt";
ifstream inputFile;
inputFile.open(fileName.c_str(),ios_base::in);
char c;
while (inputFile.get(c)) // loop while a character can be extracted from the file
{
if ( c >= '0' and c <= '9' ) // skip tabs, spaces, newlines and anything else that is not a digit
{
numbers.push_back( c - '0' ); // convert the digit character to its numeric value
}
}
inputFile.close();

Portable char newline

Is there a way to write a cross-platform parser that reads chars until a newline character is found? I'm using '\0' in Linux, but I'm not sure that this can be done on Windows too.
std::string line;
// fill the line
QTextStream ss(&line);
for(;;)
{
ss >> c;
if(c == '"' || c=='\0' ) // here I want to continue parsing until a new-line character or a ending double quote is found
break;
}
If you are working with the C++ text streams (std::istream and std::ostream, unless the ios_base::binary flag has been set when opening a file stream), then C++ treats input and output of \n in a platform-independent manner.
That means that reading a file which contains \r\n on Windows will treat this as if it were \n, and likewise outputting \n will output the platform-specific newline sequence.
If you need to read consecutive lines, the easiest way is to use getline:
std::string line;
while (getline(std::cin, line)) {
// process line
}
\0 is never treated as a newline character.
The newline character in C is '\n' not '\0'. It will be converted to whatever the current platform uses.
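One case the text-mode translation does not cover is a file with \r\n line endings that is read on a platform that does not translate them (or that is opened in binary mode): std::getline then leaves a '\r' at the end of each line. A minimal sketch of a line reader that tolerates both conventions, with the name read_lines made up for this example:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads all lines from `in`, dropping a trailing
// '\r' so that "\r\n" and "\n" line endings give the same result.
std::vector<std::string> read_lines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line[line.size() - 1] == '\r') {
            line.erase(line.size() - 1);  // remove the leftover carriage return
        }
        lines.push_back(line);
    }
    return lines;
}
```

The comparison is always against '\n' (handled by std::getline) and '\r'; '\0' plays no part in line handling.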