stopword removal in C++ code [closed] - c++

Can anyone help me get the stopwords removed? I could not manage it; they still appear in the output after I run this:
#include <iostream>
#include <cmath>
#include <fstream>
#include <cstdlib>
using namespace std;
int main()
{
char filename[50]; //open file
ifstream example;
cin.getline(filename , 50);
example.open(filename);
if(!example.is_open())
{
exit(EXIT_FAILURE);
}
char word[50];
example>>word;
while (example.good()&&word!="a"&& word!="an"&&word!="be"&& word!="at"&& word!="the")
{
cout <<word<<" "; // remove stopwords
example>>word;
}
system("PAUSE");
return 0;
}

You cannot compare C-strings with the == or != operators. The easiest solution to your problem is to use std::string:
string word;
example >> word;
while (example.good() && word != "a" && word != "an" && word != "be" && word != "at" && word != "the")
{
cout << word << " "; // remove stopwords
example >> word;
}
On the other hand, this will actually not remove all, as you call it, stopwords. It will just print all words until the first “stopword” is read, and then the whole loop will stop.

The problem is that you're using C-style strings, which are fiddly to use correctly. The simplest option is to use the C++ strings library:
#include <string>
std::string word;
and the rest of your program should work as expected. This will also prevent the hideous stack-corruption bug that your program will experience if an input word is too long.
If you really want to muck around with character arrays for educational purposes, then you'll need to use the C strings library to compare them:
#include <cstring>
if (std::strcmp(word, "a") != 0 && ...)
Your code compares the address of the array containing the input word with the address of a string literal; these will never be equal.
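If you do stay with character arrays, the comparisons could be collected in a small helper. This is a sketch only; is_stopword is a hypothetical name, and the list is the one from the question.

```cpp
#include <cstring>

// Returns true if `word` is one of the five stopwords from the question.
// is_stopword is a hypothetical helper, not part of any library.
bool is_stopword(const char* word)
{
    static const char* const stopwords[] = {"a", "an", "be", "at", "the"};
    for (const char* s : stopwords)
    {
        if (std::strcmp(word, s) == 0)   // 0 means the characters are identical
            return true;
    }
    return false;
}
```

When reading into the array, example >> std::setw(sizeof word) >> word (from <iomanip>) limits how many characters are stored, which avoids the overflow mentioned above.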

When removing stopwords, do not remove only a handful of them; use a full stopword list.
In addition, consider applying the Porter stemming algorithm in your code.
Stemming matters for string similarity if you want to compare or check the filtered text.
Yes, the Porter stemmer code is in C, but filtering only a few words (as in your question) is not an adequate stopword-removal procedure. The C code gives you an impression of what is involved if you really want to stem in addition to removing stopwords. This depends on the purpose.
I did both in 2008 to filter many text fragments, and both were relevant.
hth

A competent compiler with warnings turned on will point out your problem for you. Here's what mine said:
warning: result of comparison against a string literal is unspecified (use strncmp instead)
[-Wstring-compare]
while (example.good()&&word!="a"&& word!="an"&&word!="be"&& word!="at"&& word!="the")
^ ~~~

Related

How to implement a semicolon ends the input in C++? [closed]

How can I implement a command-line interface in which a command must end with a semicolon? Pressing Enter after the semicolon executes the command; otherwise, pressing Enter just wraps to a new line. If I haven't described it clearly, you can refer to the mysql command line.
How can I implement the above in C++? For example:
If the user inputs foo;bar then str = "foo". There may be spaces around the ;.
In C++ I/O, I only know this:
#include <iostream>
#include <string>
using namespace std;
int main() {
string str;
cin >> str;
}
I don't know how to implement any other kind of input function.
The simplest approach would be to use std::getline (as advised in the comments), but with a custom delimiter (';' in your case), like this:
string command;
while (getline(cin, command, ';')) {
// process the command there
}
However, this approach has several drawbacks and is pretty limited:
it reads until any ';' is hit. If you're going to process commands complicated enough to support string literals, you will need more elaborate parsing to treat echo "Hello; sample text"; exit; as two commands rather than three;
when you hit Enter, getline will wait for more input until it sees a semicolon, but it will not print any 'user-friendly' prompt like > to let the user know that they need to supply more input or that they forgot the semicolon at the end of the command.
If you can live without these features, getline is good enough. Otherwise you'll need to parse your input lines yourself.
I guess this is what you want to do:
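#include <iostream>
#include <string>
int main() {
    std::string command;
    bool flag = true;
    do
    {
        std::string str;
        if (!std::getline(std::cin, str))
            break; // stop at end of input instead of looping forever
        for (std::size_t i = 0; i < str.length(); i++)
        {
            if (str[i] == ';')
            {
                // keep everything up to and including the first ';'
                str = str.substr(0, i + 1);
                flag = false;
            }
        }
        command += str;
    } while (flag);
}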

Is there alternative str.find in c++?

I have a FIFO queue (first in, first out) of strings. Each string is a sentence. I need to find a word in the sentences and print each matching sentence to the console. The problem is that when I use str.find("word"), it also matches sentences that contain "words".
I tried adding whitespace and symbols like ".,?!" to the search string, e.g. str.find("word "), but that is not a real solution.
if (head != nullptr)
do {
if (head->zdanie_kol.find("promotion") != string::npos ||
head->zdanie_kol.find("discount") != string::npos ||
head->zdanie_kol.find("sale") != string::npos ||
head->zdanie_kol.find("offer") != string::npos)
cout << head->zdanie_kol << endl;
} while (head != nullptr);
For example, I have two sentences; one should match and the other should not.
Correct:
We have a special OFFER for you an email database which allows to contact eBay members both sellers and shoppers.
Not Correct:
Do not lose your chance sign up and find super PROMOTIONS we prepared for you!
The three simplest solutions I can think of for this are:
Once you get the result simply check the next character. If it's a whitespace or '\0', you found your match. Make sure to check the character before too so you don't match sword when looking for word. Also make sure you're not reading beyond the string memory.
Tokenize the string first. This will break the sentence into words and you can then check word by word to see if it matches. You can do this with strtok().
Use a regular expression (e.g. regex_search()) as mentioned in the comments; note that regex_match() only succeeds when the whole string matches the pattern. Depending on the engine you choose, the syntax may differ, but most of them have something like "\\bsale\\b", which matches only at word boundaries.
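The first of these options can be sketched as follows. The function name contains_word is hypothetical, a "word character" is assumed to mean a letter or digit, and matching is case-sensitive.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns true if `word` occurs in `text` as a whole word, i.e. it is not
// directly preceded or followed by a letter or digit.
bool contains_word(const std::string& text, const std::string& word)
{
    if (word.empty())
        return false;

    auto is_word_char = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0;
    };

    std::size_t pos = text.find(word);
    while (pos != std::string::npos)
    {
        const std::size_t end = pos + word.size();
        // Check the neighbours only when they are inside the string.
        const bool left_ok  = (pos == 0) || !is_word_char(text[pos - 1]);
        const bool right_ok = (end == text.size()) || !is_word_char(text[end]);
        if (left_ok && right_ok)
            return true;
        pos = text.find(word, pos + 1);   // partial hit, keep looking
    }
    return false;
}
```

Because the sample sentences contain OFFER and PROMOTIONS in upper case, the sentence would need to be lowercased first (or the comparison made case-insensitive) for this to match them.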
Here is a solution, using std::unordered_set and std::istringstream:
#include <unordered_set>
#include <string>
#include <sstream>
//...
std::unordered_set<std::string> filter_word = {"promotion", "discount", "sale", "offer"};
//...
std::istringstream strm(head->zdanie_kol);
std::string word;
while (strm >> word)
{
if (filter_word.count(word))
{
std::cout << head->zdanie_kol << std::endl;
break;
}
}
//...
If you had many more words to check instead of only 4 words, this solution seems easier to use since all you have to do is add those words to the unordered_set.
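One caveat with tokenizing on whitespace is that tokens such as "OFFER" or "sale!" will not equal the lowercase entries in the set. A possible refinement, sketched here with the hypothetical helpers normalize and mentions_filter_word, is to lowercase each token and strip surrounding punctuation before the lookup:

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_set>

// Lowercases a token and strips leading/trailing characters that are not
// letters or digits, so "\"Sale!\"" becomes "sale".
std::string normalize(const std::string& token)
{
    std::size_t first = 0;
    std::size_t last = token.size();
    while (first < last && !std::isalnum(static_cast<unsigned char>(token[first])))
        ++first;
    while (last > first && !std::isalnum(static_cast<unsigned char>(token[last - 1])))
        --last;

    std::string result = token.substr(first, last - first);
    for (char& c : result)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return result;
}

// Returns true if any whitespace-separated token of `sentence`, once
// normalized, is contained in `filter_word`.
bool mentions_filter_word(const std::string& sentence,
                          const std::unordered_set<std::string>& filter_word)
{
    std::istringstream strm(sentence);
    std::string word;
    while (strm >> word)
    {
        if (filter_word.count(normalize(word)) != 0)
            return true;
    }
    return false;
}
```

With this, the loop above reduces to a single call such as mentions_filter_word(head->zdanie_kol, filter_word).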

Replace all letters in a string with an underscore and space [closed]

As the title says, I am attempting to replace every letter in a string with an underscore followed by a space. For example: "hello" would be replaced with
"_ _ _ _ _". I can replace letters with just a space or just an underscore, but I am having trouble with replacing both. Any help is appreciated!
A regex based solution:
#include <iostream>
#include <iterator>
#include <string>
#include <regex>
int main() {
std::string s = "hello";
std::regex_replace(std::ostream_iterator<char>(std::cout), s.begin(), s.end(), std::regex("."), "_ ");
return 0;
}
This is not the most efficient solution, but it should be easy to understand:
#include <string>
#include <iostream>
int main()
{
std::string theString = "hello";
std::string theResult = "";
for (int count = 0; count < theString.length(); count++)
{
theResult += "_ ";
}
std::cout << theResult;
return 0;
}
I think this is what you're looking for:
#include <iostream>
#include <string>
using namespace std;
int main()
{
string input;
string output;
int size_input;
cin>> input;
size_input= input.size();
for(int i=0; i<size_input; i++)
{
output+= "_";
if(i!= size_input-1)
output+= " ";
}
cout<< output<<endl;
}
To mock you a bit more. You can't!
I mean, if you have the string "hello", "replacing" it with "_ _ _ _ _ " is not just replacing but also extending.
If you are writing a hangman game, don't replace it; keep the source word as is, with its 5 characters.
Also keep a vector<bool> visibility; initialized to five false values.
The "_ " belongs in the display method for the word: for each letter, display either "_" or the letter from the string, according to visibility[index], and advance the position of the next letter far enough to leave adequate space. Over the course of the game this will lead to a display like "h e _ _ o" (notice that spaces also appear between the letters).
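A display routine along those lines might look like this. It is a sketch for a text console, where the "space" between letters is a literal space character; the function name masked is my own.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds the display form of `word`: revealed letters are shown as they are,
// hidden ones as '_', with a single space between positions.
// Positions missing from `visibility` are treated as hidden.
std::string masked(const std::string& word, const std::vector<bool>& visibility)
{
    std::string shown;
    for (std::size_t i = 0; i < word.size(); ++i)
    {
        if (i != 0)
            shown += ' ';
        const bool visible = i < visibility.size() && visibility[i];
        shown += visible ? word[i] : '_';
    }
    return shown;
}
```

The source word is never modified; guessing a letter only flips entries in visibility, and the display string is regenerated each turn.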
If you need this for something different, my advice may be wrong...
But generally, unless you really want to destroy the original string in memory, don't overwrite it; generate a new one. Then the advice from @NathanOliver is valid. If you do want to destroy the original string in memory, then the question is where it is stored (char *string = "hello"; is compiled into a read-only const segment, and you don't want to write there) and how that memory was allocated, which decides whether you can even enlarge it enough to hold twice as many characters (the "_ " pairs).
edit:
After checking OP profile, I see he's very likely really working on hangman game, so my guess was correct, and I do believe my answer will help him lot more in the long run.
About that "mock". I wasn't the one downvoting his question. But he was already well in negatives - that and my sarcastic nature lead me into such opener.
Anyway, I would love to hear from OP himself, if he does find it offensive, or he felt the weird sense of awkward humour of it, and understood from rest of the answer, that I tried to care and help.
So I can do a proper facepalm, if he's hurt. It's not fair to have fun only from oversensitive commenters :/, I can't be satisfied by that.

Modifying specific characters in text input (C++) [closed]

I receive text with special characters (such as á), so I have to manually search for each one and replace it with its code (in this case "&aacute;").
I would like to have code to search and replace such instances automatically after user input. Since I'm a noob, I'll show you the code I have so far - however meager it may be.
// Text fixer
#include <iostream>
#include <fstream>
#include <string>
int main(){
string input;
cout << "Input text";
cin >> input;
// this is where I'm at a loss. How should I manipulate the variable?
cout << input;
return 0;
}
Thank you!
An easy method is to use an array of substitution strings:
std::string replacement_text[???];
The idea is that you use the incoming character as the index into the array and extract the replacement text.
For example:
replacement_text[' '] = "&nbsp;";
// ...
std::string new_string = replacement_text[input_character];
Another method is to use switch and case to convert the character.
Alternative techniques are a lookup table and std::map.
The lookup table could be an array of mapping structures:
struct Entry
{
char key;
std::string replacement_text;
};
Search the table using the key field to match the incoming character. Use the replacement_text to get the replacement text.
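The std::map variant could be sketched like this. The function name substitute and the example table entries are assumptions for illustration. Note that the lookup works on single char values, so a character such as á that is stored as several bytes (as in UTF-8) would need a table keyed by strings instead.

```cpp
#include <map>
#include <string>

// Returns a copy of `input` in which every character that has an entry in
// `table` is replaced by the mapped text; other characters are copied as-is.
std::string substitute(const std::string& input,
                       const std::map<char, std::string>& table)
{
    std::string output;
    for (char c : input)
    {
        const auto it = table.find(c);
        if (it != table.end())
            output += it->second;   // replacement text found
        else
            output += c;            // no entry, keep the character
    }
    return output;
}
```

For example, a table of {{'<', "&lt;"}, {'&', "&amp;"}} turns "a<b & c" into "a&lt;b &amp; c". To process a whole line of user input rather than a single word, read it with std::getline(std::cin, input) instead of cin >> input.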

how to insert a word and use it to make comparison in if condition in c++

I want to take a word that the user enters and compare it in an if condition, printing some text if the comparison is true.
Here is my code:
#include <iostream>
using namespace std;
int main()
{
char u[5];
cout<<" p " <<" c "<<" U "<<endl;
cout<<" pepsi=5"<<" coca=3"<<" 7-UP=2"<<endl;
cout<<"CHOOSE your drink"<<endl;
cin>>u;
if (u=="pepsi")
cout<<"your choice is pepsi and ur bill is 5 ";
}
First, in the future I would suggest being more specific about what your problem is and what you don't understand. Just saying "I want to do X and here is my code" gives us very little to work with, and we are basically guessing at what you are having problems with.
Now on to what I believe you are having problems with (I am assuming, since you didn't tell us what is going wrong).
In this case you are using a character array with a length of 5. When you use character arrays, you need to make sure that every reasonable input the variable might store will actually fit into the array.
Let's look at pepsi. You might think it would fit, but in fact it doesn't, because you are forgetting about the null character that is added at the end. This is what it looks like:
u[0] = 'p'
u[1] = 'e'
u[2] = 'p'
u[3] = 's'
u[4] = 'i'
u[5] = '\0'
So as you can see, there are actually 6 characters in this word, which will cause an overflow. I am assuming this is your problem.
Now how do we fix this? As others have said in the comments, if you are using C++ it is probably better to use std::string for this problem, since it hides most of the problems you have to deal with when using C-style strings (what you are using now). It also makes == compare the text itself, whereas with a character array u == "pepsi" compares addresses. Once you feel more comfortable with the language, you can come back and revisit C-style strings.
With std::string it would look something like this. Remember that case matters when comparing strings (i.e. "string" is not the same as "String").
std::string choice;
std::cin >> choice;
if (choice == "pepsi")
{
std::cout << "You selected pepsi!" << std::endl;
}
Hope that helps a little and fixes your problems.