c++ strings and file input - c++

OK, it's been a while since I've done any file input or string manipulation, but what I'm attempting to do is as follows:
while(infile >> word) {
for(int i = 0; i < word.length(); i++) {
if(word[i] == '\n') {
cout << "Found a new line" << endl;
lineNumber++;
}
if(!isalpha(word[i])) {
word.erase(i);
}
if(islower(word[i]))
word[i] = toupper(word[i]);
}
}
Now I assume this is not working because >> skips the newline character? If so, what's a better way to do this?

I'll guess that word is a std::string. When using >>, the first white-space character terminates the 'word', and the next invocation skips leading white-space, so no white-space will ever occur in word.
You don't say what you're actually trying to do, but for line-based input you should consider using the free function std::getline and then splitting each line into words as a separate step.
E.g.
std::string line;
while( std::getline( std::cin, line ) )
{
// parse line
}
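A minimal sketch of that two-step approach, assuming the goal is what the question's loop attempts (count lines, drop non-letters, upper-case the rest). The names normalizeWord, ParsedInput and parseByLine are made up for illustration and are not from the question.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Keep only the letters of `raw`, converted to upper case.
std::string normalizeWord(const std::string& raw) {
    std::string out;
    for (unsigned char ch : raw) {
        if (std::isalpha(ch)) {
            out.push_back(static_cast<char>(std::toupper(ch)));
        }
    }
    return out;
}

// Hypothetical result type: the number of lines read plus the cleaned words.
struct ParsedInput {
    int lineCount = 0;
    std::vector<std::string> words;
};

ParsedInput parseByLine(std::istream& in) {
    ParsedInput result;
    std::string line;
    while (std::getline(in, line)) {      // step 1: one line at a time
        ++result.lineCount;
        std::istringstream lineStream(line);
        std::string word;
        while (lineStream >> word) {      // step 2: split the line into words
            std::string cleaned = normalizeWord(word);
            if (!cleaned.empty()) {       // a token such as "42" has no letters
                result.words.push_back(cleaned);
            }
        }
    }
    return result;
}
```

Building a new string avoids erasing characters from word while indexing into it, which is easy to get wrong.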

There is the getline function.

How about using getline()?
string line;
while(getline(infile, line))
{
//Parse each line into individual words and do whatever you're going to do with them.
}

Related

Read a file .txt C++

How should I read lines containing spaces from a file.txt and store them in my vector?
Each line consists of many words, but my loop doesn't see that; it reads the words one by one and prints them like this:
For example, I have these lines in a file:
Hello, my friends,how are you?
Hello,James, we are fine.
And in my console, I see:
Hello,
my
friends
....
fine
This is my loop:
while(rRecord>>str)
{
lines.push_back(str);
}
And my function that prints my words:
void printRecord(int& numStr,struct winsize w,std::vector<std::string>& lines)
{
for (int i = numStr; i < numStr + w.winsize::ws_row-1; i++)
{
if (i>=lines.size())
break;
else
std::cout << lines[i] << std::endl;
}
numStr += w.winsize::ws_row;
}
To read line-by-line, use std::getline, like this:
std::string line;
while (std::getline(inFile, line)) {
lineVector.push_back(std::move(line));
}
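The std::move means that when the vector creates the new element, it can "steal" the internal buffer from line, which saves an extra allocation and copy. Afterwards line is left in a valid but unspecified state (in practice usually empty), which is fine here because std::getline clears it before reading the next line.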
Be aware that mixing getline with >> is usually not a good idea, because >> will leave any trailing whitespace, including a newline, in the stream, meaning you get unexpected results the next time you try to getline.

Reading in only letters from a text file

I am trying to read in from a text file a poem that contains commas, spaces, periods, and newline character. I am trying to use getline to read in each separate word. I do not want to read in any of the commas, spaces, periods, or newline character. As I read in each word I am capitalizing each letter then calling my insert function to insert each word into a binary search tree as a separate node. I do not know the best way to separate each word. I have been able to separate each word by spaces but the commas, periods, and newline characters keep being read in.
Here is my text file:
Roses are red,
Violets are blue,
Data Structures is the best,
You and I both know it is true.
The code I am using is this:
string inputFile;
cout << "What is the name of the text file?";
cin >> inputFile;
ifstream fin;
fin.open(inputFile);
//Input once
string input;
getline(fin, input, ' ');
for (int i = 0; i < input.length(); i++)
{
input[i] = toupper(input[i]);
}
//check for duplicates
if (tree.Find(input, tree.Current, tree.Parent) == true)
{
tree.Insert(input);
countNodes++;
countHeight = tree.Height(tree.Root);
}
Basically I am using the getline(fin,input, ' ') to read in my input.
I was able to figure out a solution. I read an entire line of text into the variable line, then scanned it character by character, keeping only the letters and storing them in word. Then I called my insert function to insert the node into my tree.
const int MAXWORDSIZE = 50;
const int MAXLINESIZE = 1000;
char word[MAXWORDSIZE], line[MAXLINESIZE];
int lineIdx, wordIdx, lineLength;
//get a line
fin.getline(line, MAXLINESIZE - 1);
lineLength = strlen(line);
while (fin)
{
for (int lineIdx = 0; lineIdx < lineLength;)
{
//skip over non-alphas, and check for end of line null terminator
while (!isalpha(line[lineIdx]) && line[lineIdx] != '\0')
++lineIdx;
//make sure not at the end of the line
if (line[lineIdx] != '\0')
{
//copy alphas to word c-string
wordIdx = 0;
while (isalpha(line[lineIdx]))
{
word[wordIdx] = toupper(line[lineIdx]);
wordIdx++;
lineIdx++;
}
//make it a c-string with the null terminator
word[wordIdx] = '\0';
//THIS IS WHERE YOU WOULD INSERT INTO THE BST OR INCREMENT FREQUENCY COUNTER IN THE NODE
if (tree.Find(word) == false)
{
tree.Insert(word);
totalNodes++;
//output word
//cout << word << endl;
}
else
{
tree.Counter();
}
}
}
//get the next line
fin.getline(line, MAXLINESIZE - 1);
lineLength = strlen(line);
}
This is a good time for a technique I've posted a few times before: define a ctype facet that treats everything but letters as white space (searching for imbue will show several examples).
From there, it's a matter of std::transform with istream_iterators on the input side, a std::set for the output, and a lambda to capitalize the first letter.
You can make a custom getline function for multiple delimiters:
std::istream &getline(std::istream &is, std::string &str, std::string const& delims)
{
str.clear();
// the 3rd parameter type and the condition part on the right side of &&
// should be all that differs from std::getline
for(char c; is.get(c) && delims.find(c) == std::string::npos; )
str.push_back(c);
return is;
}
And use it:
getline(fin, input, " \n,.");
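One thing to watch with a delimiter-based read like this: two adjacent delimiters (such as ',' followed by '\n') yield an empty string in between. If the text is already in memory, a sketch along these lines skips the empty pieces; splitOnAny is a hypothetical helper name, not a standard function.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits `text` on any character in `delims`, dropping empty tokens.
std::vector<std::string> splitOnAny(const std::string& text, const std::string& delims) {
    std::vector<std::string> tokens;
    std::size_t start = text.find_first_not_of(delims);      // first token character
    while (start != std::string::npos) {
        std::size_t end = text.find_first_of(delims, start); // one past the token
        tokens.push_back(text.substr(start, end - start));
        start = text.find_first_not_of(delims, end);         // skip the delimiter run
    }
    return tokens;
}
```

For example, splitOnAny(line, " \n,.") could be called on each line returned by std::getline.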
You can use std::regex to select your tokens.
Depending on the size of your file, you can read it either line by line or entirely into a std::string.
To read the whole file you can use:
std::ifstream t("file.txt");
std::string sin((std::istreambuf_iterator<char>(t)),
std::istreambuf_iterator<char>());
and this will match each run of characters that contains no whitespace or commas.
std::regex word_regex("[^,\\s]+");
auto what =
std::sregex_iterator(sin.begin(), sin.end(), word_regex);
auto wend = std::sregex_iterator();
std::vector<std::string> v;
for (; what != wend; ++what) {
std::smatch match = *what;
v.push_back(match.str());
}
To split on a comma, space, or newline you could try the regex (,| \n| )[[:alpha:]].+ instead. I have not tested it, though, so you will need to check it yourself.
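An alternative sketch that sidesteps the delimiter question by matching the words themselves. The pattern [A-Za-z]+ is a swapped-in choice (ASCII letters only), not the regex proposed in the answer, and extractWords is a made-up name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns every maximal run of ASCII letters found in `text`.
std::vector<std::string> extractWords(const std::string& text) {
    static const std::regex wordPattern("[A-Za-z]+");
    std::vector<std::string> words;
    for (std::sregex_iterator it(text.begin(), text.end(), wordPattern), end;
         it != end; ++it) {
        words.push_back(it->str());
    }
    return words;
}
```

Matching what you want to keep tends to be simpler than listing every separator that might appear in the file.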

How can I ignore the "end of line" or "new line" character when reading text files word by word?

Objective:
I am reading a text file word by word, and am saving each word as an element in an array. I am then printing out this array, word by word. I know this could be done more efficiently, but this is for an assignment and I have to use an array.
I'm doing more with the array, such as counting repeated elements, removing certain elements, etc. I also have successfully converted the files to be entirely lowercase and without punctuation.
Current Situation:
I have a text file that looks like this:
beginning of file
more lines with some bizzare spacing
some lines next to each other
while
others are farther apart
eof
Here is some of my code, with itemsInArray initialized to 0 and an array of words referred to as wordArray[ (appropriate length for my file) ]:
ifstream infile;
infile.open(fileExample);
while (!infile.eof()) {
string temp;
getline(infile,temp,' '); // Successfully reads words separated by a single space
if ((temp != "") && (temp != '\n') && (temp != " ") && (temp != "\n") && (temp != "\0") {
wordArray[itemsInArray] = temp;
itemsInArray++;
}
The Problem:
My code is saving the end-of-line character as an item in my array. In my if statement, I've listed all of the ways I have tried to exclude the end-of-line character, but I've had no luck.
How can I prevent the end-of-line character from being saved as an item in my array?
I've tried a few other methods I found in threads similar to this one, including something with a const char* that I couldn't make work, as well as iterating through and deleting the newline characters. I've been working on this for hours, I don't want to repost the same issue, and I have tried many, many methods.
The standard >> operator overloaded for std::string already uses whitespace (spaces, tabs and newlines) as the word boundary, so your program can be simplified a lot.
#include <iostream>
#include <string>
#include <vector>
int
main()
{
std::vector<std::string> words {};
{
std::string tmp {};
while (std::cin >> tmp)
words.push_back(tmp);
}
for (const auto& word : words)
std::cout << "'" << word << "'" << std::endl;
}
For the input you are showing, this will output:
'beginning'
'of'
'file'
'more'
'lines'
'with'
'some'
'bizzare'
'spacing'
'some'
'lines'
'next'
'to'
'each'
'other'
'while'
'others'
'are'
'farther'
'apart'
'eof'
Isn't this what you want?
The stream's extraction operator should take care of that for you:
std::ifstream ifs("file.txt");
std::string word;
// Test the extraction itself so the last word is not lost when the
// file does not end with whitespace.
while (ifs >> word)
{
std::cout << word << "\n";
}
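#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main()
{
string n; // a std::string, not an uninitialized char*
int count=0;
ofstream output("user.txt");
output<<"aa bb cc";
output.close();
ifstream input("user.txt");
// >> skips spaces and newlines; the loop ends when no word is left
while(input>>n)
{
if(count>0)
cout<<" ";
count++;
cout<<n;
}
cout<<"\ncount="<<count;
}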

Getline keeps on getting newline character. How can I avoid this?

Basically, I first take an integer as input, and then the test cases follow. Each test case is a string. I am supposed to print the string back if it starts with "HI A", compared case-insensitively. I wrote the code below to accomplish this. My problem is that when I press Enter after each input, getline takes the newline character as new input. I have tried to tackle this by using an extra getline after each input, but the issue is still there. The program gets stuck in the loop even though I have put in a break condition. What am I doing wrong?
#include <iostream>
#include <string>
using namespace std;
int main(){
int N;
cin >>N;
string nl;
getline(cin,nl);
for (int i=0;i<N;i++){
string s;
getline(cin,s);
//cout <<"string"<<s<<endl;
int flag=0;
if ((s.at(0)=='h'||s.at(0)=='H')&&(s.at(1)=='i'||s.at(1)=='I')&&(s.at(2)==' ')&&(s.at(3)=='a'||s.at(3)=='A')) flag=1;
if (flag==1) cout << s;
//cout << "not " <<s;
string ne;
cout << "i="<< i<<endl;
if (i==N-1) {break;}
getline(cin,ne);
}
}
Here is sample input:
5
Hi Alex how are you doing
hI dave how are you doing
Good by Alex
hidden agenda
Alex greeted Martha by saying Hi Martha
Output should be:
Hi Alex how are you doing
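The ignore() function does the trick. Called with no arguments, it discards a single character, which is enough when the newline immediately follows the number.
A maximum character count and a delimiter can be specified as well, for example to discard everything up to and including the next newline character.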
http://www.cplusplus.com/reference/istream/istream/ignore/
In your case it goes like this.
cin >> N;
cin.ignore();
Your cin >> N stops at the first non-numeric character, which is the newline. Then you have a getline to read past it; that's good.
Each additional getline after that reads the entire line, including the newline at the end. By putting in a second getline you're skipping half your input.
So, your real problem isn't that getline eats newlines, but that your second getline(cin, ne) is eating a line...
And that is because you mistakenly think that you need two getline operations to read one line, or something like that. Mixing line-based and item-based input does handle newlines in confusing ways, so you do need something to skip the newline left behind by cin >> N;, but once you have got rid of that, you only need ONE getline to read up to and including the newline at the end of a line.
I am writing this answer with the hopes that it may help someone else out there that wants a very simple solution to this problem.
In my case the problem was due to some files having different line endings, e.g. "\r\n" vs. "\n". Everything worked fine on Windows, but then it failed on Linux.
The answer was actually simple. I wrote a function, trim, that is called after each line is read in. It takes the line and copies it character by character into a new string, skipping any '\r' or '\n' characters.
Here is an example:
string trim(string line)
{
string newString;
for (char ch : line)
{
if (ch == '\n' || ch == '\r')
continue;
newString += ch;
}
return newString;
}
//some function reading a file
while (getline(fin, line)) {
line = trim(line);
//... do something with the line
line = "";
}
Note that getline extracts the '\n' from the stream but does not store it in the string, so there is nothing to strip from each line. The only stray newline is the one left behind by cin >> N, which one extra getline (or ignore) consumes; after that, one getline per line is enough. For example, for your problem, you can use this code:
int N;
cin >> N;
string line;
getline(cin, line); // skip the rest of the first line, after N.
for (int i = 0; i < N; i++) {
getline(cin, line);
string first4 = line.substr(0, 4);
// convert to upper case; needs <algorithm> and <cctype>.
std::transform(first4.begin(), first4.end(), first4.begin(), [](unsigned char c) { return static_cast<char>(std::toupper(c)); }); // see http://en.cppreference.com/w/cpp/algorithm/transform
if (first4 == "HI A") {
cout << line << '\n'; // getline dropped the '\n', so write one back
}
}
cin.ignore() worked for me.
void House::provideRoomName()
{
int noOfRooms;
cout<<"Enter the number of Rooms::";
cin>>noOfRooms;
cout<<endl;
cout<<"Enter name of the Rooms::"<<endl;
cin.ignore();
for(int i=1; i<=noOfRooms; i++)
{
std::string l_roomName;
cout<<"Room"<<"["<<i<<"] Name::";
std::getline(std::cin, l_roomName);
}
}
std::string line;
std::cin>>std::ws; // discard new line not processed by cin
std::getline(std::cin,line);
From the Notes section of https://en.cppreference.com/w/cpp/string/basic_string/getline:
When consuming whitespace-delimited input (e.g. int n; std::cin >> n;) any whitespace that follows, including a newline character, will be left on the input stream. Then when switching to line-oriented input, the first line retrieved with getline will be just that whitespace. In the likely case that this is unwanted behaviour, possible solutions include:
An explicit extraneous initial call to getline
Removing consecutive whitespace with std::cin >> std::ws
Ignoring all leftover characters on the line of input with cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

Modify cin to also return the newlines

I know about getline() but it would be nice if cin could return \n when encountered.
Any way for achieving this (or similar)?
edit (example):
string s;
while(cin>>s){
if(s == "\n")
cout<<"newline! ";
else
cout<<s<<" ";
}
input file txt:
hola, em dic pere
caram, jo també .
the end result should be like:
hola, em dic pere newline! caram, jo també .
If you are reading individual lines, you know that there is a newline after each read line. Well, except for the last line in the file which doesn't have to be delimited by a newline character for the read to be successful but you can detect if there is newline by checking eof(): if std::getline() was successful but eof() is set, the last line didn't contain a newline. Obviously, this requires the use of the std::string version of std::getline():
for (std::string line; std::getline(in, line); )
{
std::cout << line << (in.eof()? "": "\n");
}
This should write the stream to std::cout as it was read.
The question asked for the data to be output but with newlines converted to say "newline!". You can achieve this with:
for (std::string line; std::getline(in, line); )
{
std::cout << line << (in.eof()? "": "newline! ");
}
If you don't care about the stream being split into line but actually just want to get the entire file (including all newlines), you can just read the stream into a std::string:
std::string file((std::istreambuf_iterator<char>(in)),
std::istreambuf_iterator<char>());
Note, however, that this exact approach is probably fairly slow (although I know that it can be made fast). If you know that the file doesn't contain a certain character, you can also use std::getline() to read the entire file into a std::string:
std::getline(in, file, 0);
The above code assumes that your file doesn't contain any null characters.
A modification of @Dietmar's answer should do the trick:
for (std::string line; std::getline(in, line); )
{
std::istringstream iss(line);
for (std::string word; iss >> word; ) { std::cout << word << " "; }
if (!in.eof()) { std::cout << "newline! "; } // a newline was read unless the line ended at EOF
}
Just for the record, I ended up using this (I wanted to post it 11h ago)
string s0, s1;
while(getline(cin,s0)){
istringstream is(s0);
while(is>>s1){
cout<<s1<<" ";
}
cout<<"newline! ";
}