My code with regular expressions for file doesn't run properly - c++

#include <iostream>
#include <fstream>
#include <string>
#include <regex>
using namespace std;
int main()
{
regex r1("(.*\\blecture\\b.*)");
regex r2("(.* practice.*)");
regex r3("(.* laboratory practice.*)");
smatch base_match;
int lecture = 0;
int prakt = 0;
int lab = 0;
string name = "schedule.txt";
ifstream fin;
fin.open(name);
if (!fin.is_open()) {
cout << "didn't open ";
}
else {
string str;
while (!fin.eof()) {
str = "";
getline(fin, str);
cout << str << endl;
if (regex_match(str, base_match, r1)) {
lecture++;
}
if (regex_match(str, base_match, r2)) {
prakt++;
}
if (regex_match(str, base_match, r3)) {
lab++;
}
}
}
cout << "The number of lectures: " << lecture << "\n";
cout << "The number of practices: " << prakt << "\n";
cout << "The number of laboratory work: " << lab << "\n";
fin.close();
}
In this program, I need to count the number of lectures, practice and laboratory work per week using regular expressions. I have got a text file, which you can see on the screen. But for lectures and practice, it doesn't work right.

You need the number of times each regex matches. C++ has std::sregex_iterator for performing multiple regex matches over a string.
That means you can do the following:
for (auto it = std::sregex_iterator{str.cbegin(), str.cend(), r1}; it != std::sregex_iterator{}; it++) {
lecture++;
}
If you want to get really fancy you can even do it in one go:
auto it = std::sregex_iterator{str.cbegin(), str.cend(), r1};
lecture += std::distance(it, std::sregex_iterator{});
Alternatively, you can call std::regex_search several times, starting from the end offset of the previous match (or 0 for the first).
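A minimal sketch of that repeated std::regex_search approach, in case it helps. The helper name count_matches is made up for illustration; the match_prev_avail flag tells later searches that a character exists before the resume point, so \b keeps behaving as it would on the whole string.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: counts non-overlapping matches of `re` in `line`
// by calling std::regex_search repeatedly, resuming after each match.
std::size_t count_matches(const std::string& line, const std::regex& re) {
    std::size_t count = 0;
    std::smatch m;
    auto begin = line.cbegin();
    auto flags = std::regex_constants::match_default;
    while (std::regex_search(begin, line.cend(), m, re, flags)) {
        ++count;
        begin = m[0].second;  // resume right after this match
        if (m.length(0) == 0) {
            // An empty match would otherwise repeat forever: step one
            // character forward, or stop if we are already at the end.
            if (begin == line.cend()) break;
            ++begin;
        }
        // From now on there is always a valid character before `begin`.
        flags |= std::regex_constants::match_prev_avail;
    }
    return count;
}
```

Calling count_matches(str, r1) once per line and adding the result to lecture would then replace the regex_match test, provided the regex does not swallow the whole line.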
EDIT: as remarked in the comments, this assumes that your regexes are suitable for incremental matching. Yours eat the whole string (presumably because regex_match must match the entire input, whereas regex_search and regex_iterator find matches anywhere in it), so you need to at least change your regular expression definitions to the following:
regex r1("\\blecture\\b");
regex r2(" practice");
regex r3(" laboratory practice");
... and of course every match for r3 is also a match for r2, but I leave that for you.
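One possible way to deal with that overlap, as a rough sketch: count both patterns and subtract the laboratory matches from the practice matches. The Counts struct and count_line function are names invented here, and the sample line in the usage note is a guess at the file format, since the schedule was only shown as an image.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical result type for one line of the schedule.
struct Counts {
    std::size_t lectures = 0;
    std::size_t practices = 0;
    std::size_t labs = 0;
};

// Hypothetical helper: counts each kind of class on a single line.
Counts count_line(const std::string& line) {
    static const std::regex lecture("\\blecture\\b");
    static const std::regex practice(" practice");
    static const std::regex lab(" laboratory practice");

    // Number of matches = distance from the first match to the end iterator.
    auto count = [&line](const std::regex& re) {
        return static_cast<std::size_t>(std::distance(
            std::sregex_iterator(line.cbegin(), line.cend(), re),
            std::sregex_iterator()));
    };

    Counts c;
    c.lectures = count(lecture);
    c.labs = count(lab);
    // Every " laboratory practice" also matches " practice", so remove those.
    c.practices = count(practice) - c.labs;
    return c;
}
```

For an assumed line such as "Mon: math lecture, physics practice, chemistry laboratory practice" this gives one lecture, one practice and one laboratory practice.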

How to extend the regular expression that works for one pair to many pairs

I have written code to find all matches of patterns like 23<=34 or 123<>2000 in a given file. More generally, a(sign)b where sign ∈ { =<, =, >, <=, <>, >= } and a,b ∈ N.
Now, I got stuck on extending this code to identify many joint patterns like a(sign)b(sign)c(sign)...(sign)d. This reminds me of finding the gcd of many numbers using only the gcd for two numbers. But I don't really know how to proceed. Assume that we already have the function solved() that works for only one pair (the code below if needed).
Summary of the problem:
Creating a function that more or less looks like this:
void solve(string text) {
// using solved(string text) [but optional]
}
Maybe there are other ways, and I would be more than happy to see them!
Below is the code I have:
#include <iostream>
#include <cstdlib>
#include <fstream>
#include <regex>
using namespace std;
#define REGEX_SIGN "(=<|=|>|<=|<>|>=)"
#define REGEX_DIGIT "[0-9]"
#define REGEX_NUMBER "[^0]\\d*"
void check(string text) {
regex sign(REGEX_SIGN);
regex digit(REGEX_DIGIT);
regex number(REGEX_NUMBER);
regex relation(REGEX_NUMBER REGEX_SIGN REGEX_NUMBER);
string word = "";
for (int i = 0; i < text.length(); i++) {
if (text[i] == ' ') {
regex_match(word, relation) ? cout << "✔️ " : cout << "☒ ";
cout << word << " ";
cout << endl;
word = "";
} else {
word += text[i];
}
}
}
int main() {
string line, text;
ifstream fin;
fin.open("name.txt");
if (fin.good()) {
while (getline(fin, line)) {
text += line + " ";
}
check(text);
}
return 0;
}
It seems you are looking for
REGEX_NUMBER "(" REGEX_SIGN REGEX_NUMBER ")+"
Regex demo with \d+((<|=|>|<=|<>|>=)\d+)+
C++ Demo.
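Since the linked demos are not reproduced here, this is a small sketch of how the repeated group could be used with std::regex_match. The function name is_relation_chain is hypothetical, the operator set is the one listed in the question, and [1-9]\d* is an assumption for "natural number without a leading zero" (the question's [^0]\d* would also accept a letter as the first character).

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `word` is a chain such as 1<=2<>3>=4,
// i.e. a number followed by one or more (sign, number) pairs.
bool is_relation_chain(const std::string& word) {
    // Signs taken from the question: =<, =, >, <=, <>, >=
    static const std::regex chain(
        R"([1-9]\d*(?:(?:=<|<=|<>|>=|=|>)[1-9]\d*)+)");
    return std::regex_match(word, chain);
}
```

In check(), regex_match(word, relation) could then be swapped for is_relation_chain(word); a single pair like 23<=34 is still accepted because the group may repeat just once.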

Error in output in c++ program

The purpose of the program is to read a phrase from a file into a vector and convert the phrase into Pig Latin. When the translated phrase is output in Pig Latin, an additional "ay" is added after the phrase (which is not supposed to happen). Can anyone spot why this is happening? It is important that I fix this because it affects the total letters and total characters of the Pig Latin phrase that I need to output. Also, I'm not asking anyone to write any code for me, but I would appreciate any tips on how to make my code less redundant. A portion of my grade for programs is efficiency, which I usually lose points on.
Here's the code:
#include <iostream>
#include <fstream>
#include <vector>
#include <algorithm>
#include <sstream>
#include <cctype>
#include <stdio.h>
#include <ctype.h>
using namespace std;
int main()
{
ifstream in;
string word, fileName;
vector <string> phrase;
int length = 0, index = 0;
int totalWords = -1, totalLetters = -3, totalChars;
cout << "PIG LATIN PROGRAM" << endl;
cout << "Which file are you accessing? : ";
cin >> fileName;
fileName += ".txt";
in.open(fileName);
if (in.fail()) cout << "\nFile not found!" << endl;
while(getline(in, word)) phrase.push_back(word);
cout << "Original Phrase: " << phrase[0] << endl;
istringstream iss(phrase[0]);
cout << "Pig Latin phrase: ";
do {
string OGword;
string PLword;
for (int i=0; i < phrase.size(); i++){
iss >> OGword;
totalWords++;
}
if (OGword[0]=='a' || OGword[0]=='A' || OGword[0]=='e' || OGword[0]=='E' || OGword[0]=='i' || OGword[0]=='I' || OGword[0]=='o' || OGword[0]=='O' || OGword[0]=='u' || OGword[0]=='U'){
cout << OGword << "way" << " ";
totalLetters += (OGword.size() + 3);
}
else {
PLword = OGword.substr(index);
length = PLword.length();
PLword.insert(length, "ay");
PLword.insert(length, 1, OGword[index]);
PLword.erase(0, 1);
if (isupper(OGword[0])){
transform(PLword.begin(), PLword.end(), PLword.begin(), ::tolower);
(toupper(PLword[1]));
char upper;
upper = toupper(PLword[0]);
PLword.erase(0, 1);
cout << upper;
}
cout << PLword << " ";
totalLetters += PLword.size();
}
} while (iss);
totalChars = totalLetters + 1;
cout << "\n\nTotal words: " << totalWords << endl;
cout << "Total Letters: " << totalLetters << endl;
cout << "Total Characters: "<< totalChars << endl;
}
Problem
The core loop of the program looks like this (in pseudocode):
istringstream iss; // Contains line of text.
do {
string OGword;
get_OGword_and_count_totalWords(iss, OGword);
print_pig_latin_of_word(OGword);
} while (iss);
The loop runs as long as iss has not experienced an error. And in particular, iss does not experience an error until an extraction operation fails. So things happen in the loop like this:
OGword contains the last legitimate word on the line.
Print the last word.
The while clause is tested. iss is still good at this point because no error has occurred, even if iss is at the end of string.
Attempt to extract a word into OGword. This fails, and leaves OGword empty ("").
Print the Pig Latin version of "", which is "ay".
The while clause is tested. iss is in an error state, and the loop ends.
Fix
One possible fix out of many is to test iss for an error immediately after extracting a word.
std::istringstream iss; // Contains line of text
std::string OGword;
while (iss >> OGword) {
increment_word_total();
print_pig_latin_of_word(OGword);
}
In this version, the operation iss >> OGword returns iss, which is converted to bool. If there was an error during the immediately preceding extraction, the loop ends without printing anything.
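To make the pattern concrete, here is a small self-contained sketch of the same loop. The function extract_words is a made-up name; it just collects the words instead of printing Pig Latin, so the stream test can be seen in isolation.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a line into whitespace-separated words,
// testing the stream immediately after each extraction.
std::vector<std::string> extract_words(const std::string& line) {
    std::istringstream iss(line);
    std::vector<std::string> words;
    std::string word;
    while (iss >> word) {  // becomes false as soon as an extraction fails
        words.push_back(word);
    }
    return words;
}
```

Because the failed extraction ends the loop before the body runs, no empty word is ever processed, which is what removes the stray "ay".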
Other Advice
I think the best way to improve readability is to break the code up into smaller functions. For instance, take the if / else block that formats and prints the Pig Latin, and actually put it in a function:
int print_pig_latin_of_word_and_count_letters(std::string_view word);
Then, the code in that function can be further subdivided:
bool starts_with_vowel(std::string_view word);
int print_vowel_word_and_count_letters(std::string_view word);
int print_consonant_word_and_count_letters(std::string_view word);
int print_pig_latin_of_word_and_count_letters(std::string_view word) {
if (starts_with_vowel(word)) {
return print_vowel_word_and_count_letters(word);
} else {
return print_consonant_word_and_count_letters(word);
}
}
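As an illustration of how small those pieces can get, here is one possible sketch (C++17 for std::string_view). It is not the asker's exact logic: to_pig_latin is a hypothetical function that returns the translated word instead of printing it, and it deliberately ignores the capitalisation handling from the original program.

```cpp
#include <cctype>
#include <string>
#include <string_view>

// True if the first letter is a, e, i, o or u (either case).
bool starts_with_vowel(std::string_view word) {
    if (word.empty()) return false;
    const char c = static_cast<char>(
        std::tolower(static_cast<unsigned char>(word.front())));
    return std::string_view("aeiou").find(c) != std::string_view::npos;
}

// Hypothetical helper: returns the Pig Latin form of a single word.
std::string to_pig_latin(std::string_view word) {
    if (word.empty()) return {};
    if (starts_with_vowel(word)) return std::string(word) + "way";
    std::string result(word.substr(1));  // everything after the first letter
    result += word.front();              // move the first letter to the end
    result += "ay";
    return result;
}
```

Returning a string rather than printing inside the function also makes the letter count trivial: it is just the size of the returned string.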
Odds and Ends
I would drop using namespace std and write all of the std library names as std::string, etc. This makes it clear which things are from the standard library.
The program has interesting behavior on input files that contain more than one line. There is a for loop that loops over phrase.size(), which is the number of input lines. This causes words to be skipped and totalWords to be incorrect.
This statement doesn't do anything, because the result of toupper is ignored:
(toupper(PLword[1]));

parse a string with regexp

What is the best way if you want to read a input like this:
(1,13) { (22,446) (200,66) (77,103) }
(779,22) {  } // this is also possible, but always (X,X) in the beginning
I would like to use regular expressions for doing it. But there is little info on the usage of regexp when parsing a string that contains more than just numbers. Currently I'm trying something similar with sscanf (from the C library):
string data;
getline(in, data); // format: (X,X) { (Y,Y)* }
stringstream ss(data);
string point, tmp;
ss >> point; // (X,X)
// (X,X) the reason for three is that they could be more than one digit.
sscanf(point.c_str(), "(%3d,%3d)", &midx, &midy);
int x, y;
while(ss >> tmp) // { (Y,Y) ... (Y,Y) }
{
if(tmp.size() == 5)
{
sscanf(tmp.c_str(), "(%3d,%3d)", &x, &y);
cout << "X: " << x << " Y: " << y << endl;
}
}
The problem is that this does not work: as soon as there is more than one digit, sscanf does not read the numbers. So is this the best way to go, or is there a better solution with regexp? I don't want to use Boost or something like that, as this is part of a school assignment.
Maybe the following piece of code matches your requirements:
#include <iostream>
#include <string>
#include <regex>
int main()
{
std::smatch m;
std::string str("(1,13) { (22,446) (200,66) (77,103) }");
std::string regexstring = "(\\(\\s*\\d+\\s*,\\s*\\d+\\s*\\))\\s*(\\{)(\\s*\\(\\s*\\d+\\s*,\\s*\\d+\\s*\\)\\s*)*\\s*(\\})";
if (std::regex_match(str, m, std::regex(regexstring))) {
std::cout << "string literal matched" << std::endl;
std::cout << "matches:" << std::endl;
for (std::smatch::iterator it = m.begin(); it != m.end(); ++it) {
std::cout << *it << std::endl;
}
}
return 0;
}
Output:
Assuming you're using C++11, you could use something like: std::regex pattern(R"(\((\d+),(\d+)\)\s*\{((?:\s*\(\d+,\d+\))*)\s*\})") (Disclaimer: This hasn't been tested), and then match it against the whole line rather than against each whitespace-separated token:
std::smatch match;
if (std::regex_match(data, match, pattern)) {
// match[0] contains the whole matched line
// match[1] contains the first number as a string
// match[2] contains the second number as a string
// match[3] contains the list of points (empty for "{ }")
}
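For what it's worth, a fuller sketch of the regex route could look like the following. The names ParsedLine and parse_line are invented for this example, and the patterns assume the format from the question, (X,X) { (Y,Y)* }, with no whitespace inside the parentheses. One regex validates the line and captures the list, a second one walks over the individual points.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for one parsed input line.
struct ParsedLine {
    std::pair<int, int> mid;                  // the leading (X,X)
    std::vector<std::pair<int, int>> points;  // the (Y,Y) entries, may be empty
};

// Hypothetical helper: returns false if the line is not of the form
// (X,X) { (Y,Y)* }. std::stoi throws if a number does not fit in an int.
bool parse_line(const std::string& data, ParsedLine& out) {
    static const std::regex line_re(
        R"(\s*\((\d+),(\d+)\)\s*\{((?:\s*\(\d+,\d+\))*)\s*\}\s*)");
    static const std::regex point_re(R"(\((\d+),(\d+)\))");

    std::smatch m;
    if (!std::regex_match(data, m, line_re)) return false;

    out.mid = {std::stoi(m[1].str()), std::stoi(m[2].str())};
    out.points.clear();

    // m[3] holds everything between the braces; iterate over each point.
    const std::string list = m[3].str();
    for (std::sregex_iterator it(list.begin(), list.end(), point_re), end;
         it != end; ++it) {
        out.points.emplace_back(std::stoi((*it)[1].str()),
                                std::stoi((*it)[2].str()));
    }
    return true;
}
```

The split into two regexes is a design choice: a repeated capture group only remembers its last repetition, so the individual points have to be extracted in a second pass.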

How do I find a complete word (not part of it) in a string in C++

In a C++ code, I'm trying to search for a word in a sentence but it keeps doing partial search. I want it to search only for the complete word not parts of it too, any help?
size_t kk;
string word="spo";
string sentence="seven spoons";
kk=sentence.find(word);
if (kk !=string::npos)
cout << "something" << endl;
It sounds like what you want is handled by the concept of word boundaries or word characters in regular expressions.
Here's a program that will return only a complete match. That is, it will only return a word if that word completely matches the exact word you're searching for. If some word in sentence has your target word as a strict substring then it will not be returned.
#include <regex>
#include <string>
#include <iostream>
int main() {
std::string word = "spo"; // spo is a word?
std::string sentence = "seven spoons";
std::regex r("\\b" + word + "\\b"); // the pattern \b matches a word boundary
std::smatch m;
if (std::regex_search(sentence, m, r)) { // this won't find anything because 'spoons' is not the word you're searching for
std::cout << "match 1: " << m.str() << '\n';
}
sentence = "what does the word 'spo' mean?";
if (std::regex_search(sentence, m, r)) { // this does find the word 'spo'
std::cout << "match 2: " << m.str() << '\n';
}
}
Or alternatively maybe you mean you want to find any word that matches a partial word you're searching for. regex can do that as well:
std::string partial_word = "spo";
std::regex r("\\w*" + partial_word + "\\w*"); // the pattern \w matches a word character
This produces:
match 1: spoons
match 2: spo
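One caveat when building the pattern from a variable: if the word contains regex metacharacters (a dot, a plus sign, parentheses), it has to be escaped first. A possible sketch is below; escape_regex and contains_whole_word are names I made up, and the character class is meant to cover the ECMAScript metacharacters used by std::regex by default.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: puts a backslash in front of every regex metacharacter.
std::string escape_regex(const std::string& text) {
    static const std::regex special(R"([.^$|()\[\]{}*+?\\])");
    // "$&" stands for the matched character, so this yields "\" + character.
    return std::regex_replace(text, special, R"(\$&)");
}

// Hypothetical helper: whole-word search using \b around the escaped word.
bool contains_whole_word(const std::string& sentence, const std::string& word) {
    const std::regex re("\\b" + escape_regex(word) + "\\b");
    return std::regex_search(sentence, re);
}
```

Note that \b only detects a boundary next to a letter, digit or underscore, so a word that starts or ends with punctuation (for example "c++") would need a different boundary test.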
There are a bunch of options here:
a) Search for [space]WORD[space] instead of just WORD
string word="spo";
string sentence="seven spoons";
kk=sentence.find(" "+word+" ");
Note that this won't work if your words are separated by newline characters or other whitespace, or if the word is at the very start or end of the sentence.
b) Split the string into words, store them in a vector, and check if the desired word is somewhere in the vector, by using std::find.
stringstream parser(sentence);
istream_iterator<string> start(parser);
istream_iterator<string> end;
vector<string> words(start, end);
if(find(words.begin(), words.end(), word)!=words.end()) cout<<"found!";
If you're going to search for words often, this may be the best choice, since you can store the vector somewhere for future reference, so you don't have to split the string again. Also, if you want this to work, be sure to #include <algorithm>, #include <iterator>, #include <sstream> and #include <vector>.
c) Search for the word and check that the characters on both sides of the match are whitespace (or the ends of the string):
string word="spo";
string sentence="seven spoons";
kk=sentence.find(word);
if(kk!=string::npos){
if((kk==0 || isspace(sentence[kk-1])) && (kk+word.length()==sentence.length() || isspace(sentence[kk+word.length()])))
cout << "found!";
}
Something like this :
std::size_t kk;
std::string word="spoo";
std::string sentence="seven spoons tables";
std::stringstream ss(sentence) ;
std::istream_iterator<std::string> f ;
auto it =std::find_if( std::istream_iterator<std::string> (ss),
f,
[=](const std::string& str){
return str == word;
}
);
if(it != f )
std::cout << "Success" <<std::endl;
I think the best way is to split your string using whitespace and punctuation characters as delimiters, then use std::find on the result.
#include <boost/algorithm/string.hpp>
#include <vector>
#include <string>
#include <algorithm>
int main()
{
std::string word="spo";
std::string sentence="seven spoons";
std::vector<std::string> words;
boost::split(words, sentence, boost::is_any_of("\n\t .,!?\"()"));
auto match = std::find(begin(words), end(words), word);
if (match != end(words))
{
// Found it!
}
else
{
// Not there.
}
}
string word="spo";
string sentence="seven spoons";
string::size_type nIndex = sentence.find( word, 0 );
if( nIndex != string::npos )
{
if ((nIndex + word.length()) == sentence.length())
{
cout << "Found" << endl;
}
else
{
string::size_type nSpace = sentence.find( " ", nIndex );
if (nSpace == (nIndex + word.length()))
{
cout << "Found" << endl;
}
}
}
else
{
cout << "No Match" << endl;
}
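This seems to have worked.
#include <cctype>
#include <string>
/* find word in sentence and return the index of the first whole-word occurrence, or std::string::npos if there is none */
std::size_t find_whole(const std::string& sentence, const std::string& word) {
if (word.empty()) return std::string::npos;
for (std::size_t pos = sentence.find(word); pos != std::string::npos; pos = sentence.find(word, pos + 1)) {
std::size_t end = pos + word.size();
// the match counts only if it is not glued to a letter or digit on either side
bool left_ok = pos == 0 || !std::isalnum(static_cast<unsigned char>(sentence[pos - 1]));
bool right_ok = end == sentence.size() || !std::isalnum(static_cast<unsigned char>(sentence[end]));
if (left_ok && right_ok) return pos;
}
return std::string::npos;
}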

how do I use ' string::find ' to find a word in file using C++

I'm creating a program that will open a file and search for a desired word within the text.
I created the following word bank...
Lawyer
Smith Janes
Doctor
Michael Zane
Teacher
Maria Omaha
#include <iostream>
#include <string>
#include <fstream>
#include <stdlib.h>
#include <string>
#include <sstream>
using namespace std;
int main ()
{
// Declarations
string reply;
string inputFileName;
ifstream inputFile;
char character;
cout << "Input file name: ";
getline(cin, inputFileName);
// Open the input file.
inputFile.open(inputFileName.c_str());
// Check the file opened successfully.
if ( ! inputFile.is_open())
{
cout << "Unable to open input file." << endl;
cout << "Press enter to continue...";
getline(cin, reply);
return 1;
}
Now that I have saved the whole file into a string, how can I search inside that string
for a specific word?
I'm learning C++ from this website: http://www.cprogramming.com/tutorial/lesson10.html
I think you use string::find, but I couldn't find much reference on how to search besides this website:
http://www.cplusplus.com/reference/string/string/find/
This section will display the whole file.
string original;
getline(inputFile, original, '\0');
cout << original << endl;
cout << "\nEnd of file reached\n" << endl;
// Close the input file stream
inputFile.close();
cout << "Press enter to continue...";
return 0;
}
This is how I think the program should act...
Please enter a word: Smith Janes
Smith Janes Lawyer
another example....
Please enter a word: Doctor
Michael Zane Doctor
find returns the position (zero based offset) in the string where the word is found. If the word is not found it returns npos.
#include <string>
#include <iostream>
int main()
{
std::string haystack("some string with words in it");
std::string::size_type pos = haystack.find("words");
if(pos != std::string::npos)
{
std::cout << "found \"words\" at position " << pos << std::endl;
}
else
{
std::cout << "\"words\" not found" << std::endl;
}
}
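If more than one occurrence matters, find also accepts a starting offset as its second argument, so it can be called in a loop. A minimal sketch, with find_all being a name chosen just for this example:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the positions of all non-overlapping
// occurrences of `needle` in `haystack`.
std::vector<std::size_t> find_all(const std::string& haystack,
                                  const std::string& needle) {
    std::vector<std::size_t> positions;
    if (needle.empty()) return positions;  // avoid an endless loop
    for (std::size_t pos = haystack.find(needle);
         pos != std::string::npos;
         pos = haystack.find(needle, pos + needle.size())) {
        positions.push_back(pos);
    }
    return positions;
}
```

The loop ends as soon as find returns npos, which is the same check as in the single-search version.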
#include <string>
#include <iostream>
#include <cstdlib>
int main() {
std::string haystack = "Lawyer\nSmith Janes\nDoctor\nMichael Zane\nTeacher\nMaria Omaha\n";
std::string needle = "Janes";
auto res = haystack.find(needle);
if (std::string::npos == res) {
std::cout << "Not found\n";
std::exit(EXIT_FAILURE);
}
std::cout << res << '\n';
}
res is an index into the string at the point where "Janes" is (Should be 13).
The functionality you appear to be asking for is more complex than just finding some content in a string. The output you show has a user input either a name or a profession and the output is the related profession or name.
It's simple to write a program that shows the line the 'needle' is on, or that always shows the previous line, or always shows the next line. But what you're asking for is to show one or the other depending on what was searched for.
One simple way we could implement this is to find if the needle is on an even or odd line and base what we show on that.
First we get the line number.
auto line_num = std::count(std::begin(haystack), std::begin(haystack) + res, '\n');
Based on the content you showed, professions are on even lines and names are on odd lines. We can easily get the line numbers we want:
auto profession_line_num = line_num/2*2;
auto name_line_num = line_num/2*2 + 1;
Next, we can split the text up into lines since we need to work with whole lines and get lines by index. The method I show below makes a copy of the text and is inefficient, but it's easy.
Here's a split function:
std::vector<std::string> split(std::string const &s, std::string const &delims) {
std::vector<std::string> res;
std::string::size_type i = 0;
auto found = s.find_first_of(delims, i);
while (std::string::npos != found) {
res.emplace_back(s, i, found-i);
i = found+1;
found = s.find_first_of(delims, i);
}
res.emplace_back(s, i);
return res;
}
And we use the split function like so:
auto lines = split(haystack, "\n");
Now, we can show the lines we want.
std::cout << lines[name_line_num] << ' ' << lines[profession_line_num] << '\n';
Which once you put the program together prints:
Smith Janes Lawyer
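Put together, the steps might look roughly like this. It is only a sketch: lookup is a hypothetical function name, it assumes the file really alternates profession and name lines as in the question, and it splits the text with std::getline instead of the split function, purely to keep the example short.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns "<name> <profession>" for the pair of lines
// that contains `needle`, or an empty string if nothing is found.
std::string lookup(const std::string& haystack, const std::string& needle) {
    const auto res = haystack.find(needle);
    if (res == std::string::npos) return {};

    // Line number of the match = number of newlines before it.
    const auto line_num = static_cast<std::size_t>(std::count(
        haystack.begin(),
        haystack.begin() + static_cast<std::ptrdiff_t>(res), '\n'));

    // Split the text into lines (copies the text; simple but not efficient).
    std::vector<std::string> lines;
    std::istringstream in(haystack);
    for (std::string line; std::getline(in, line);) lines.push_back(line);

    // Professions are on even lines, names on the following odd lines.
    const std::size_t profession_line_num = line_num / 2 * 2;
    const std::size_t name_line_num = profession_line_num + 1;
    if (name_line_num >= lines.size()) return {};  // incomplete pair
    return lines[name_line_num] + ' ' + lines[profession_line_num];
}
```

With the haystack from above, searching for "Janes" or "Lawyer" gives "Smith Janes Lawyer", and searching for "Doctor" gives "Michael Zane Doctor".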
I think this has all the information you need.
http://www.cplusplus.com/reference/string/string/find/