input text from a file to a char variable - c++

I have a variable
ifstream inFile("stuff.txt");
in the stuff.txt file there's a full sentance:
The quick brown fox jumps over the lazy dog
How do I put it into a char array with all the spaces?
char text[250];
inFile >> text;
gets me to the first space, nothing else.

Use ifstream::getline() to read into a char array or use std::getline() to read into a std::string. When you use inFile >> text whitespace will act as a delimiter, which is why it stops at the first space.
Using std::getline() to read into a std::string removes the requirement for specifying the maximum number of characters to read. This removes the potential for buffer overruns and avoids the added complexity of coping with partially read lines (if a line was > 250 in this case for example).
Always check the result of IO operations immediately:
std::ifstream inFile("stuff.txt");
if (inFile.is_open())
{
std::string line;
if (std::getline(inFile, line))
{
// Use 'line'.
}
}

while (!inFile.eof()) {
inFile >> text;
}

Related

C++: Getline stops reading at first whitespace

Basically my issue is that I'm trying to read in data from a .txt file that's full of numbers and comments and store each line into a string vector, but my getline function stops reading at the first whitespace character so a comment like (* comment *) gets broken up into
str[0] = "(*";
str[1] = "comment";
str[2] = "*)";
This is what my codeblock for the getline function looks like:
int main() {
string line;
string fileName;
cout << "Enter the name of the file to be read: ";
cin >> fileName;
ifstream inFile{fileName};
istream_iterator<string> infile_begin {inFile};
istream_iterator<string> eof{};
vector<string> data {infile_begin, eof};
while (getline(inFile, line))
{
data.push_back(line);
}
And this is what the .txt file looks like:
101481
10974
1013
(* comment *) 0
28292
35040
35372
0000
7155
7284
96110
26175
I can't figure out why it's not reading the whole line.
This is for the very simple reason that your code is not using std::getline to read the input file.
If you look at your code very carefully, you will see that before you even get to that point, your code constructs an istream_iterator<string> on the file, and by passing it, and the ending istream_iterator<string> value to the vector's constructor, this effectively swallows the entire file, one whitespace-delimited word at a time, into the vector.
And by the time things get around to the getline loop, the entire file has already been read, and the loop does absolutely nothing. Your getline isn't really doing anything, with the current state of affairs.
Get rid of that stuff that involves istream_iterators, completely, and simply let getline do the job it was intended for.

C++ stringstreams skipping a character

I have a file which the first line reads ">FileName.txt". My goal is to read this line, and save "FileName.txt" to a variable called name. So I have:
ifstream file;
/* File opening stuff */
string line, name;
getline(file,line);
stringstream converter(line);
converter >> name;
This accomplishes saving ">FileName.txt" to the variable name, but I need to remove the '>' character. I am not sure if I should do that after this point, or if there is a way to skip over it entirely using the stringstream.
You can skip over it with the stream fairly easily:
char ch;
converter >> ch; // skip initial >
converter >> name; // now read the name
You can skip over it with the function ignore, add follow statement after stringstream converter(line); only need one line.
converter.ignore(line.length(), '>');

How do you know when an input stream has reached the last word in a line?

I'm reading input from an ifstream in C++. The input comes as a bunch of words separated by tabs, so I'm reading in something like "word1 word2 word3" as stream >> w1 >> w2 >> w3; I need to know when I've reached the final word in the line, so how would I go about that? The number of words is variable, but it should be always even. Also, will the last word contain the \n, or will the \n be the last word?
The simplest (and usual) solution is to read lines using
std::getline, and then parse the line using
std::istringstream:
std::string line;
while ( std::getline( std::cin, line ) ) {
std::istringstream s( line );
std::vector<std::string> words;
std::string word;
while ( s >> word ) {
words.push_back( word );
}
// ...
}
Read from ifstream and push it to vector using algorithm's std::copy like following:
std::ifstream stream("input.txt");
std::vector<std::string> vec;
//replace stream with std::cin for reading from console
std::copy(std::istream_iterator<std::string>(stream),
std::istream_iterator<std::string>(),
std::back_inserter(vec));
This needs a EOF for termination. Ctrl+Z or Ctrl+D depeding on windows or linux.
As suggested, you can use C++11's initializer list as follows:
std::vector<std::string> vec{std::istream_iterator<std::string>{stream},
std::istream_iterator<std::string>{}};
I would advise to mmap() the file. After that, you have the entire file in your virtual address space and can examine it at leisure like a big array of characters. Especially for such operations, where you have to go back a few steps, this is the most appropriate approach. As an added bonus, it's also the fastest...

C++ fstream: how to know size of string when reading?

...as someone may remember, I'm still stuck on C++ strings. Ok, I can write a string to a file using a fstream as follows
outStream.write((char *) s.c_str(), s.size());
When I want to read that string, I can do
inStream.read((char *) s.c_str(), s.size());
Everything works as expected. The problem is: if I change the length of my string after writing it to a file and before reading it again, printing that string won't bring me back my original string but a shorter/longer one. So: if I have to store many strings on a file, how can I know their size when reading it back?
Thanks a lot!
You shouldn’t be using the unformatted I/O functions (read() and write()) if you just want to write ordinary human-readable string data. Generally you only use those functions when you need to read and write compact binary data, which for a beginner is probably unnecessary. You can write ordinary lines of text instead:
std::string text = "This is some test data.";
{
std::ofstream file("data.txt");
file << text << '\n';
}
Then read them back with getline():
{
std::ifstream file("data.txt");
std::string line;
std::getline(file, line);
// line == text
}
You can also use the regular formatting operator >> to read, but when applied to string, it reads tokens (nonwhitespace characters separated by whitespace), not whole lines:
{
std::ifstream file("data.txt");
std::vector<std::string> words;
std::string word;
while (file >> word) {
words.push_back(word);
}
// words == {"This", "is", "some", "test", "data."}
}
All of the formatted I/O functions automatically handle memory management for you, so there is no need to worry about the length of your strings.
Although your writing solution is more or less acceptable, your reading solution is fundamentally flawed: it uses the internal storage of your old string as a character buffer for your new string, which is very, very bad (to put it mildly).
You should switch to a formatted way of reading and writing the streams, like this:
Writing:
outStream << s;
Reading:
inStream >> s;
This way you would not need to bother determining the lengths of your strings at all.
This code is different in that it stops at whitespace characters; you can use getline if you want to stop only at \n characters.
You can write the strings and write an additional 0 (null terminator) to the file. Then it will be easy to separate strings later. Also, you might want to read and write lines
outfile << string1 << endl;
getline(infile, string2, '\n');
If you want to use unformatted I/O your only real options are to either use a fixed size or to prepend the size somehow so you know how many characters to read. Otherwise, when using formatted I/O it somewhat depends on what your strings contain: if they can contain all viable characters, you would need to implement some sort of quoting mechanism. In simple cases, where strings consist e.g. of space-free sequence, you can just use formatted I/O and be sure to write a space after each string. If your strings don't contain some character useful as a quote, it is relatively easy to process quotes:
std::istream& quote(std::istream& out) {
char c;
if (in >> c && c != '"') {
in.setstate(std::ios_base::failbit;
}
}
out << '"' << string << "'";
std::getline(in >> std::ws >> quote, string, '"');
Obviously, you might want to bundle this functionality a class.

Tokenization of a text file with frequency and line occurrence. Using C++

once again I ask for help. I haven't coded anything for sometime!
Now I have a text file filled with random gibberish. I already have a basic idea on how I will count the number of occurrences per word.
What really stumps me is how I will determine what line the word is in. Gut instinct tells me to look for the newline character at the end of each line. However I have to do this while going through the text file the first time right? Since if I do it afterwords it will do no good.
I already am getting the words via the following code:
vector<string> words;
string currentWord;
while(!inputFile.eof())
{
inputFile >> currentWord;
words.push_back(currentWord);
}
This is for a text file with no set structure. Using the above code gives me a nice little(big) vector of words, but it doesn't give me the line they occur in.
Would I have to get the entire line, then process it into words to make this possible?
Use a std::map<std::string, int> to count the word occurrences -- the int is the number of times it exists.
If you need like by line input, use std::getline(std::istream&, std::string&), like this:
std::vector<std::string> lines;
std::ifstream file(...) //Fill in accordingly.
std::string currentLine;
while(std::getline(file, currentLine))
lines.push_back(currentLine);
You can split a line apart by putting it into an std::istringstream first and then using operator>>. (Alternately, you could cobble up some sort of splitter using std::find and other algorithmic primitaves)
EDIT: This is the same thing as in #dash-tom-bang's answer, but modified to be correct with respect to error handing:
vector<string> words;
int currentLine = 1; // or 0, however you wish to count...
string line;
while (getline(inputFile, line))
{
istringstream inputString(line);
string word;
while (inputString >> word)
words.push_back(pair(word, currentLine));
}
Short and sweet.
vector< map< string, size_t > > line_word_counts;
string line, word;
while ( getline( cin, line ) ) {
line_word_counts.push_back();
map< string, size_t > &word_counts = line_word_counts.back();
istringstream line_is( line );
while ( is >> word ) ++ word_counts[ word ];
}
cout << "'Hello' appears on line 5 " << line_word_counts[5-1]["Hello"]
<< " times\n";
You're going to have to abandon reading into strings, because operator >>(istream&, string&) discards white space and the contents of the white space (== '\n' or != '\n', that is the question...) is what will give you line numbers.
This is where OOP can save the day. You need to write a class to act as a "front end" for reading from the file. Its job will be to buffer data from the file, and return words one at a time to the caller.
Internally, the class needs to read data from the file a block (say, 4096 bytes) at a time. Then a string GetWord() (yes, returning by value here is good) method will:
First, read any white space characters, taking care to increment the object's lineNumber member every time it hits a \n.
Then read non-whitespace characters, putting them into the string object you'll be returning.
If it runs out of stuff to read, read the next block and continue.
If the you hit the end of file, the string you have is the whole word (which may be empty) and should be returned.
If the function returns an empty string, that tells the caller that the end of file has been reached. (Files usually end with whitespace characters, so reading whitespace characters cannot imply that there will be a word later on.)
Then you can call this method at the same place in your code as your cin >> line and the rest of the code doesn't need to know the details of your block buffering.
An alternative approach is to read things a line at a time, but all the read functions that would work for you require you to create a fixed-size buffer to read into beforehand, and if the line is longer than that buffer, you have to deal with it somehow. It could get more complicated than the class I described.