It's my first time here and I am a beginner at C++. I would like to know how I can split text into sentences at punctuation marks while reading from a text file.
i.e.
hey how are you? The Java is great. Awesome C++ is awesome!
The result would be this in my vector (assuming I have put endl to display each content of the vector):
hey how are you?
The Java is great.
Awesome C++ is awesome!
Here's my code so far:
vector<string> sentenceStorer(string documentOfSentences)
{
ifstream ifs(documentOfSentences.c_str());
string word;
vector<string> sentence;
while ( ifs >> word )
{
char point = word[word.length()-1];
if (point == '.' || point == '?' || point == '!')
{
sentence.push_back(word);
}
}
return sentence;
}
void displayVector (vector<string>& displayV)
{
for(vector<string>::const_iterator i = displayV.begin(); i != displayV.end(); ++i )
{
cout << *i <<endl;
}
}
int main()
{
vector <string> readstop = sentenceStorer("input.txt");
displayVector(readstop);
return 0;
}
Here is my result:
you?
great.
awesome!
Can you explain why I couldn't get the previous word and fix that?
I will give you a clue. The or clause is not the problem: ifs >> word reads one whitespace-delimited word at a time, and your code only pushes a word into the vector when that word ends with . or ? or !. Every word before it is read and then thrown away, so only the last word of each sentence is stored. To keep the whole sentence, append each word to a temporary string, and when you reach a word that ends in punctuation, push the temporary string into the vector and clear it.
Alternatively, you could read the whole line and parse it char by char. As far as I know, std::string has no built-in member function that splits a string by a delimiter.
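A minimal sketch of one way to keep the earlier words, assuming sentences always end in a word whose last character is '.', '?' or '!'. The name splitSentences is hypothetical, and taking a std::istream& instead of a file name is a choice made here so the function works with both files and in-memory strings:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): gathers whitespace-separated
// words into sentences. A sentence ends at a word whose last character is
// '.', '?' or '!'.
std::vector<std::string> splitSentences(std::istream& in)
{
    std::vector<std::string> sentences;
    std::string current;  // words collected for the sentence in progress
    std::string word;
    while (in >> word) {
        if (!current.empty())
            current += ' ';
        current += word;
        const char last = word[word.size() - 1];  // >> never yields an empty word
        if (last == '.' || last == '?' || last == '!') {
            sentences.push_back(current);
            current.clear();
        }
    }
    if (!current.empty())  // trailing text without closing punctuation
        sentences.push_back(current);
    return sentences;
}
```

To use it with the file from the question, open std::ifstream ifs("input.txt") and call splitSentences(ifs). Abbreviations such as "e.g." would be treated as sentence ends by this sketch.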
I'm beginner learning my first programming language - C++ - from Bjarne Stroustrup's book "Programming Principles and Practice Using C++". Chapter 4 talks about vectors. Everything explained before I would get easily and code would always work properly, until now.
The code I write doesn't function at all. The following code was made for little exercise, where the input is read in and prints the words out, bleeping out the disliked ones.
#include "std_lib_facilities.h"
int main() {
vector<string>text;
string disliked = "cat";
for (string word; cin >> word;) {
text.push_back(word);
}
for (int a = 0; a < text.size(); ++a) {
if (text[a] != disliked) {
cout << text[a] << endl;
}
else {
cout << "BLEEP\n";
}
}
keep_window_open();
}
My first idea was to create another vector, vector<string>disliked ={"cat", ...} , for disliked words, but then if (text[x] != disliked) didn't seem like a way of comparing elements from each vector (at least it was warning me about operator and operands mismatch). Is there a way for that?
But back to the code: with some modifications and without any disliked word in the input, the program would run sometimes. Still I can't manage to meet the main purpose. And perhaps, the actual mistake is in the input termination. Ctrl+Z doesn't work for me but just inputs a character. And somehow Ctrl+C happened to work properly (if there were no disliked words).
So here come the actual questions:
Is the code correct? (since I can't check it myself while I might have been terminating the input improperly entire time)
How can I terminate input any other way, considering that Ctrl+Z only adds a character to the input?
Is there a way of making this program work by comparing the "input" vector with the "disliked" vector?
Is the code correct? (since I can't check it myself while I might have been terminating the input improperly entire time)
Seems to work for me.
How can I terminate input any other way, considering that Ctrl+Z only adds a character to the input?
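I use Ctrl-D to mark end of input (that is the Linux/macOS convention). In a Windows console it is Ctrl-Z at the start of a new line, followed by Enter.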
Is there a way of making this program work by comparing the "input" vector with the "disliked" vector?
Usually when you compare objects (with == and !=) they have the same type, or the compiler can convert one type to the type of the other (but that's for the pedantic; for beginners it's best to think of a comparison as comparing objects of the same type).
vector<string> text;
string disliked = "cat";
// STUFF
if (text[x] != disliked) // disliked is a string
// text[x] is a string (because we are accessing the `x` element).
If we change disliked to a vector:
vector<string> text;
vector<string> disliked = {"cat"};
// STUFF
if (text[x] != disliked) // disliked is a vector<string>
// text[x] is a string
Since the types do not match they are hard to compare. So you need to loop over all the elements in disliked to see if you can find the word.
bool found = false;
for(std::size_t loop = 0; loop < disliked.size(); ++loop) {
if (text[x] == disliked[loop]) { // Types are the same here.
found = true;
break;
}
}
if (!found) {
There are techniques to make the above more compact. If you have just started this may be a bit early, but for completeness I will add it here:
bool found = std::find(std::begin(disliked), std::end(disliked), text[x]) != std::end(disliked);
if (!found) {
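Putting the pieces together, a minimal sketch of the whole bleeping step with a vector of disliked words might look like the following. The helper names isDisliked and bleep are made up for illustration, and returning a new vector rather than printing directly is just a choice made here:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `word` appears anywhere in `disliked`.
bool isDisliked(const std::vector<std::string>& disliked, const std::string& word)
{
    return std::find(disliked.begin(), disliked.end(), word) != disliked.end();
}

// Hypothetical helper: copies `text`, replacing each disliked word with "BLEEP".
std::vector<std::string> bleep(const std::vector<std::string>& text,
                               const std::vector<std::string>& disliked)
{
    std::vector<std::string> result;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (isDisliked(disliked, text[i]))
            result.push_back("BLEEP");
        else
            result.push_back(text[i]);
    }
    return result;
}
```

Printing the result is then a plain loop over the returned vector with cout.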
I guess you have two options here:
1. Get the input from a text file.
In this case you have to place your data in a text file, in your project directory. For example, in the code posted below, "text.txt" is where the input should be stored (your words).
Minor remarks:
I'm not sure what "std_lib_facilities.h" contains so I added some of the standard headers to make the code compile for me.
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
int main()
{
std::vector<std::string> texts;
std::string dislikedWord = "cat";
std::ifstream fin("text.txt");
for (std::string word; fin >> word;)
texts.push_back(word);
unsigned textsCount = texts.size();
for (unsigned textIndex = 0; textIndex < textsCount; ++textIndex)
if (texts[textIndex] != dislikedWord)
std::cout << texts[textIndex] << '\n';
else
std::cout << "BLEEP\n";
return 0;
}
2. Keep reading words until a condition is finally met.
In this case (if you don't want to read from a text file) you should insert a condition that makes the program stop taking input, so you can proceed further. The condition can be something like a maximum number of words to read or some 'special' word. In the example below, I chose to stop reading after ten words.
Minor remarks: Maybe the chapter from where you got the exercise tells you what condition to insert. I doubt that you have to use CTRL+C or any other key combinations to solve the exercise.
#include <iostream>
#include <vector>
#include <string>
int main()
{
const unsigned totalWords = 10;
std::vector<std::string> texts;
std::string dislikedWord = "cat";
for (std::string word; std::cin >> word;) {
texts.push_back(word);
if (texts.size() == totalWords) // could've put it in the for condition
break; // terminates the for-loop
}
unsigned textsCount = texts.size();
for (unsigned textIndex = 0; textIndex < textsCount; ++textIndex)
if (texts[textIndex] != dislikedWord)
std::cout << texts[textIndex] << '\n';
else
std::cout << "BLEEP\n";
return 0;
}
This code is for reversing words in a string. The problem is that it only reverses the first word in the string. When I ran a trace I found that it stops after reaching the statement if(s[indexCount] == '\0') break;
Why does the code hit the null character as soon as the first word is reversed, even though other characters are present after the first word?
#include <iostream>
using namespace std;
int main()
{
string s;
char tchar;
int indexCount=0,charCount=0,wordIndex;
cin>>s;
while(1){
if(s[indexCount]==' ' && charCount==0) continue;
if(s[indexCount]==' ' || s[indexCount]=='\0' ){
wordIndex=indexCount-charCount;
charCount=indexCount-1;
while(charCount!=wordIndex && charCount>wordIndex){
tchar=s[wordIndex];
s[wordIndex]=s[charCount];
s[charCount]=tchar;
charCount--;
wordIndex++;
}
if(s[indexCount] == '\0') break;
indexCount++; charCount=0;
}
else{
charCount++;
indexCount++;
}
}
cout<<"\nReveresed words in the string : \n\t"<<s<<endl;
return 0;
}
Also I'm using while(1). Does it make this a bad code?
The problem indeed lies with the method of input. cin >> string_variable will consider whitespace to be a delimiter. That is why only the first word is being entered. Replace cin >> s; with getline(cin, s); and it will work correctly.
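Once the whole line is in the string, the per-word reversal itself can also be written without '\0' checks. This is only a sketch, assuming words are separated by single or repeated ' ' characters; reverseEachWord is a hypothetical name, and it relies on std::string::find and std::reverse rather than the manual swapping in the question:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: reverses the letters of every space-separated word
// in place, leaving the order of the words unchanged.
void reverseEachWord(std::string& s)
{
    std::string::size_type start = 0;  // first character of the current word
    while (start < s.size()) {
        std::string::size_type end = s.find(' ', start);
        if (end == std::string::npos)
            end = s.size();  // last word runs to the end of the string
        std::reverse(s.begin() + start, s.begin() + end);
        start = end + 1;  // continue after the space
    }
}
```

A typical call would be std::getline(std::cin, s); followed by reverseEachWord(s);.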
First of all I want to point out that
cin >> stringObject;
will never read a space character! So entering My name is geeksoul will cause the above code to read only My and leave everything else in the buffer!
To read space character you should use getline function like this
std::getline(std::cin, stringObject);
Second, the standard doesn't say that '\0' is a special character inside a std::string. A compliant implementation should store '\0' like any other character, so it does not mark the end of the text; use size() or length() to find the end. The exception is when a const char* is passed to a member function of a string, which is assumed to be null-terminated.
If you really want to check your string with null terminating character then you should consider using stringObject.c_str() which converts your C++ style string to old school C style string!
Quick tip.
If you reverse all characters in the whole string, and then reverse the characters of each word (each run between consecutive spaces), the words end up in reverse order with every word spelled normally, and the code is much simpler. Something like this (not compiled here, but it should convey the basic idea):
#include <algorithm>
#include <string>
void reverseWords(std::string& aString) {
std::reverse(aString.begin(), aString.end());
size_t wordStart = 0; // index of the first character of the current word
for (size_t index = 0; index != aString.size(); ++index) {
if (aString[index] == ' ') {
std::reverse(aString.begin() + wordStart, aString.begin() + index);
wordStart = index + 1; // next word starts after the space
}
}
// The last word is not followed by a space, so reverse it separately.
std::reverse(aString.begin() + wordStart, aString.end());
}
I would like some advice/help with splitting up a paragraph from a separate text file into individual strings. The code I have so far just counts the total number of words in that paragraph, but I would like to split it so that each line is one sentence, then count how many words are in that sentence/line, then put that into its own array so I can do other things with that specific sentence/line. Here is what I have code-wise:
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
int main()
{
std::ifstream inFile;
inFile.open("Rhymes.txt", std::ios::in);
if (inFile.is_open())
{
string word;
unsigned long wordCount = 0;
while (!inFile.eof())
{
inFile >> word;
if (word.length() > 0)
{
wordCount++;
}
}
cout << "The file had " << wordCount << " word(s) in it." << endl;
}
system("PAUSE");
return 0;
}
The separate text file is called "Rhymes.txt" and that contains:
Today you are You, that is truer than true. There is no one alive who is Youer than You.
The more that you read, the more things you will know. The more that you learn, the more places you'll go.
How did it get so late so soon? Its night before its afternoon.
Today was good. Today was fun. Tomorrow is another one.
And will you succeed? Yes indeed, yes indeed! Ninety-eight and three-quarters percent guaranteed!
Think left and think right and think low and think high. Oh, the things you can think up if only you try!
Unless someone like you cares a whole awful lot, nothing is going to get better. It's not.
I'm sorry to say so but, sadly it's true that bang-ups and hang-ups can happen to you.
So the first line would be its own sentence and when the code is executed it would say:
The line has 19 words in it
I am a bit confused as to how I would go about doing this. I have seen examples of splitting sentences into words, but I couldn't find anything I could really understand that does what I am asking for.
Under the assumption that each white space is exactly one blank character, and there is no plenking/klemping, you can count via std::count. Reading in the lines can be done via std::getline.
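#include <algorithm> // std::count
#include <iostream>
#include <sstream>
#include <string>
#include <utility> // std::move
#include <vector>
int main()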
{
// Simulating the file:
std::istringstream inFile(
R"(Today you are You, that is truer than true. There is no one alive who is Youer than You.
The more that you read, the more things you will know. The more that you learn, the more places you'll go.
How did it get so late so soon? Its night before its afternoon.
Today was good. Today was fun. Tomorrow is another one.
And will you succeed? Yes indeed, yes indeed! Ninety-eight and three-quarters percent guaranteed!
Think left and think right and think low and think high. Oh, the things you can think up if only you try!
Unless someone like you cares a whole awful lot, nothing is going to get better. It's not.
I'm sorry to say so but, sadly it's true that bang-ups and hang-ups can happen to you.)");
std::vector<std::string> lines; // This vector will contain all lines.
for (std::string str; std::getline(inFile, str, '\n');)
{
std::cout << "The line has "<< std::count(str.begin(), str.end(), ' ')+1 <<" words in it\n";
lines.push_back(std::move(str)); // Avoid the copy.
}
for (auto const& s : lines)
std::cout << s << '\n';
}
If you need the number of words in each line later on, store both the line and its word count as a std::pair<std::string, std::size_t>: declare lines as std::vector<std::pair<std::string, std::size_t>> (the final loop then prints s.first) and alter the loop body to this:
std::size_t count = std::count(str.begin(), str.end(), ' ') + 1;
std::cout << "The line has "<<count<<" words in it\n";
lines.emplace_back(std::move(str), count);
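The std::count approach depends on the single-blank assumption stated above. If that assumption might not hold, one alternative sketch counts the tokens that stream extraction produces instead of counting blanks. The function name countWords is hypothetical:

```cpp
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: counts whitespace-separated words in one line.
// Repeated, leading and trailing whitespace does not affect the result.
std::size_t countWords(const std::string& line)
{
    std::istringstream in(line);
    return static_cast<std::size_t>(
        std::distance(std::istream_iterator<std::string>(in),
                      std::istream_iterator<std::string>()));
}
```

Inside the getline loop, countWords(str) could then replace the std::count expression.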
I'd write something like:
vector<string> read_line()
{ string line, w;
vector<string> words;
getline(cin, line);
stringstream ss(line);
while(ss >> w)
words.push_back(w);
return words;
}
The returned vector contains the information you need: count of words and the words themselves (with punctuation which you can remove easily).
vector<string> words = read_line();
cout << "This line has " << words.size() << " words in it" << endl;
To read all lines you do:
while(1)
{ vector<string> words = read_line();
if(words.size() == 0) break;
// process line
}
I used this function but it is wrong.
for (int i=0; i<sen.length(); i++) {
if (sen.find (' ') != string::npos) {
string new = sen.substr(0,i);
}
cout << "Substrings:" << new << endl;
}
Thank you! Any kind of help is appreciated!
new is a keyword in C++, so first step is to not use that as a variable name.
After that, you need to put your output statement in the "if" block, so that it can actually be allowed to access the substring. Scoping is critical in C++.
First: this cannot compile because new is a language keyword.
Then you have a loop running through every character in the string so you shouldn't need to use std::string::find. I would use std::string::find, but then the loop condition should be different.
This doesn't use substr and find, so if this is homework and you have to use that then this won't be a good answer... but I do believe it's the better way to do what you're asking in C++. It's untested but should work fine.
//Create stringstream and insert your whole sentence into it.
std::stringstream ss;
ss << sen;
//Read out words one by one into a string - stringstream will tokenize them
//by the ASCII space character for you.
std::string myWord;
while (ss >> myWord)
std::cout << myWord << std::endl; //You can save it however you like here.
If it is homework you should tag it as such so people stick to the assignment and know how much to help and/or not help you so they don't give it away :)
No need to iterate over the string, find already does this. It starts to search from the beginning by default, so once we found a space, we need to start the next search from this found space:
std::vector<std::string> words;
//find first space
size_t start = 0, end = sen.find(' ');
//as long as there are spaces
while(end != std::string::npos)
{
//get word
words.push_back(sen.substr(start, end-start));
//search next space (of course only after already found space)
start = end + 1;
end = sen.find(' ', start);
}
//last word
words.push_back(sen.substr(start));
Of course this doesn't handle duplicate spaces, starting or trailing spaces and other special cases. You would actually be better off using a stringstream:
#include <sstream>
#include <algorithm>
#include <iterator>
std::istringstream stream(sen);
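// The extra parentheses around the first argument avoid the "most vexing parse",
// which would otherwise make this line a function declaration.
std::vector<std::string> words((std::istream_iterator<std::string>(stream)),
std::istream_iterator<std::string>());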
You can then just put these out however you like or just do it directly in the loops without using a vector:
for(std::vector<std::string>::const_iterator iter=
words.begin(); iter!=words.end(); ++iter)
std::cout << "found word: " << *iter << '\n';
I created a program to read from a text file and remove special characters. I can't seem to write the if statement any better. I searched online for the right code, but everything I found uses advanced statements. The book I am learning from covers strings and opening and closing files in its last (14th) chapter. I tried creating an array of special chars, but it did not work. Please help me!
int main()
{
string paragraph = "";
string curChar = "";
string fileName = "";
int subscript=0;
int numWords=0;
ifstream inFile; //declaring the file variables in the implement
ofstream outFile;
cout << "Please enter the input file name(C:\\owner\\Desktop\\para.txt): " << endl;
cin >> fileName;
inFile.open(fileName, ios::in); //opening the user entered file
//if statement for not finding the file
if(inFile.fail())
{
cout<<"error opening the file.";
}
else
{
getline(inFile,paragraph);
cout<<paragraph<<endl<<endl;
}
numWords=paragraph.length();
while (subscript < numWords)
{
curChar = paragraph.substr(subscript, 1);
if(curChar==","||curChar=="."||curChar==")"
||curChar=="("||curChar==";"||curChar==":"||curChar=="-"
||curChar=="\""||curChar=="&"||curChar=="?"||
curChar=="%"||curChar=="$"||curChar=="!"||curChar==" ["||curChar=="]"||
curChar=="{"||curChar=="}"||curChar=="_"||curChar==" <"||curChar==">"
||curChar=="/"||curChar=="#"||curChar=="*"||curChar=="_"||curChar=="+"
||curChar=="=")
{
paragraph.erase(subscript, 1);
numWords-=1;
}
else
subscript+=1;
}
cout<<paragraph<<endl;
inFile.close();
You might want to look into the strchr function which searches a string for a given character:
#include <string.h>
char *strchr (const char *s, int c);
The strchr function locates the first occurrence of c (converted to a char) in the
string pointed to by s. The terminating null character is considered to be part of the
string.
The strchr function returns a pointer to the located character, or a null pointer if the
character does not occur in the string.
Something like:
if (strchr (",.();:-\"&?%$![]{}_<>/#*_+=", curChar) != NULL) ...
You'll have to declare curChar as a char rather than a string and use:
curChar = paragraph[subscript];
rather than:
curChar = paragraph.substr(subscript, 1);
but they're relatively minor changes and, since your stated goal was I want to change the if statement into [something] more meaningful and simple, I think you'll find that's a very good way to achieve it.
In the <cctype> header we have functions like isalnum(c), which returns true iff c is an alphanumeric character, isdigit(c), etc. I think the condition you are looking for is
if(isgraph(c) && !isalnum(c))
But c must be a char, not an std::string (well, technically speaking c must be int, but the conversion is implicit:) hth
P.S. This isn't the best idea, but if you want to keep sticking with std::string for curChar, c will be this char c = curChar[0]
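For the default "C" locale, isgraph(c) && !isalnum(c) is the same test as ispunct(c). A minimal sketch that uses it to strip every punctuation character in one pass is shown below. The names isPunct and removePunctuation are hypothetical, and the erase/remove_if combination is a different technique from the substr loop in the question. The cast to unsigned char is there because passing a negative char value to the <cctype> functions is undefined:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical predicate: wraps std::ispunct with the unsigned char cast
// that the <cctype> functions require for negative char values.
static bool isPunct(char c)
{
    return std::ispunct(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: removes every punctuation character from `text`.
// std::remove_if moves the kept characters to the front; erase drops the rest.
void removePunctuation(std::string& text)
{
    text.erase(std::remove_if(text.begin(), text.end(), isPunct), text.end());
}
```

Note that this also removes characters the question's list does not mention, such as the apostrophe, so it is only a fit if "special character" means all punctuation.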
Since you are learning C++, I will introduce you to the C++ iterator way of erasing.
// erase() returns an iterator to the element after the erased one,
// so only advance with ++it when nothing was erased.
for (string::iterator it = paragraph.begin(); it != paragraph.end(); )
if (*it == ',' || *it == '.' || ....... )
it = paragraph.erase(it);
else
++it;
First, try using an iterator. This won't give you the best performance, but the concept will help you work with other C++ containers.
if(curChar==","||curChar=="."||curChar==")" ......
Second, the single quote ' and the double quote " are different. You use ' for a char and " for a string.