Why can't I specify regex_constants in assign? - c++

I am using std::regex with VS2015/VS2017 on Windows. I am trying to set up a general routine that performs one of the following 8 kinds of search, as requested by the user:
1. Begins with
2. Does not begin with
3. Ends with
4. Does not end with
5. Contains string
6. Does not contain string
7. Contains any of the characters in supplied string
8. Does not contain any of the characters in supplied string
The routine is passed the type above, the associated string, the string to search, a boolean variable to state if the search is case sensitive/insensitive and another boolean variable on whether to restrict the search for whole words only (types 1-6 only). I modify the supplied find string if "whole words only" is specified.
e.g.
void Search(std::wstring sRegex, std::wstring wsString, bool bCaseSensitive, bool bWholeWordsOnly);
However, I can't get basic case insensitivity to work as I expected.
This works:
std::wregex reg;
std::wsmatch m;
if (bCaseSensitive) {
    reg.assign(sRegex.c_str(), std::regex::ECMAScript);
} else {
    reg.assign(sRegex.c_str(), std::regex::ECMAScript | std::regex::icase);
}
std::regex_search(wsString, m, reg) .....
but this doesn't (icase ignored and worse):
std::wregex reg;
std::wsmatch m;
reg.assign(sRegex.c_str(), std::regex::ECMAScript |
           (bCaseSensitive ? 0 : std::regex::icase));
std::regex_search(wsString, m, reg) .....
Can anyone explain this please?
PS. I haven't yet coded more than type 1 yet as I want to fix this first before continuing. Any suggestions for the regex string for the other types with & without "whole word only" (where appropriate) would be gratefully received.
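A likely explanation for the second snippet, offered as a sketch rather than a confirmed diagnosis: in bCaseSensitive ? 0 : std::regex::icase the literal 0 is an int, so the whole flag expression ends up as an integer instead of a std::regex::flag_type. An integer does not convert implicitly to the flag type, so overload resolution can fall back to assign(ptr, count), which treats the value as a character count. Keeping the flags in a typed variable avoids this. MakeRegex and BeginsWith are hypothetical helper names, not part of the original code.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: builds the flags in a variable of the regex's own
// flag type, so a plain int never reaches assign() or the constructor.
std::wregex MakeRegex(const std::wstring& pattern, bool caseSensitive) {
    std::wregex::flag_type flags = std::regex_constants::ECMAScript;
    if (!caseSensitive) {
        flags = flags | std::regex_constants::icase;
    }
    return std::wregex(pattern, flags);
}

// Hypothetical helper for search type 1 ("begins with"). Assumes `prefix`
// contains no regex metacharacters.
bool BeginsWith(const std::wstring& text, const std::wstring& prefix,
                bool caseSensitive) {
    const std::wregex re = MakeRegex(L"^" + prefix, caseSensitive);
    return std::regex_search(text, re);
}
```

The one-line form should also work if both branches of the conditional have the flag type, for example bCaseSensitive ? std::regex::flag_type{} : std::regex::icase.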

Related

What is returned in std::smatch and how are you supposed to use it?

string "I am 5 years old"
regex "(?!am )\d"
If you go to http://regexr.com/ and apply the regex to the string, you'll get 5.
I would like to get this result with std::regex, but I do not understand how to use match results and probably regex has to be changed as well.
std::regex expression("(?!am )\\d");
std::smatch match;
std::string what("I am 5 years old.");
if (regex_search(what, match, expression))
{
//???
}
The std::smatch is an instantiation of the match_results class template for matches on string objects (with string::const_iterator as its iterator type). The members of this class are those described for match_results, but using string::const_iterator as its BidirectionalIterator template parameter.
std::match_results supports an operator[]:
If n > 0 and n < size(), returns a reference to the std::sub_match representing the part of the target sequence that was matched by the nth captured marked subexpression.
If n == 0, returns a reference to the std::sub_match representing the part of the target sequence matched by the entire matched regular expression.
If n >= size(), returns a reference to a std::sub_match representing an unmatched sub-expression (an empty subrange of the target sequence).
In your case, regex_search finds the first match only and then match[0] holds the entire match text, match[1] would contain the text captured with the first capturing group (the first parenthesized pattern part), etc. In this case though, your regex does not contain capturing groups.
You need to use a capturing group here, since std::regex does not support lookbehind. You used a lookahead, which checks the text that immediately follows the current location, so the regex you have is not doing what you think it is.
So, use the following code:
#include <regex>
#include <string>
#include <iostream>
using namespace std;
int main() {
    std::regex expression(R"(am\s+(\d+))");
    std::smatch match;
    std::string what("I am 5 years old.");
    if (regex_search(what, match, expression))
    {
        cout << match.str(1) << endl;
    }
    return 0;
}
Here, the pattern is am\s+(\d+). It matches am, one or more whitespace characters, and then captures one or more digits with (\d+). Inside the code, match.str(1) gives access to the value captured by a capturing group. As there is only one (...) in the pattern, and so one capturing group, its ID is 1. So, str(1) returns the text captured into this group.
The raw string literal (R"(...)") allows using a single backslash for regex escapes (like \d, \s, etc.).
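To see how the indices of operator[] described above line up with the pattern, the following sketch collects every sub-match of the first match into a vector. SubMatches is a hypothetical helper name introduced only for illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns match[0], match[1], ... as strings for the
// first match of `re` in `text`, or an empty vector when nothing matches.
std::vector<std::string> SubMatches(const std::string& text,
                                    const std::regex& re) {
    std::vector<std::string> out;
    std::smatch match;
    if (std::regex_search(text, match, re)) {
        // size() is 1 (the whole match) plus the number of capturing groups.
        for (std::size_t i = 0; i < match.size(); ++i) {
            out.push_back(match[i].str());
        }
    }
    return out;
}
```

With the pattern am\s+(\d+) and the text "I am 5 years old.", element 0 would be the whole match and element 1 the captured digits.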

How can search number in string?

Hello. I want to know how I can find a number in my string.
This is my C++ code:
string firSen;
cout<<"write the sentence : "<<endl;
getline(cin,firSen);
int a=firSen.find("pizza");
int b=firSen.find("hamburger");
int aa=firSen.find(100);
int bb=firSen.find(30);
I want to enter
I want to eat pizza 100g, hamburger 30g!!
and find the positions of 100 and 30.
I know how to find the positions of pizza and hamburger (that code works),
but I don't know how to find the numbers. (I think int aa=firSen.find(100); int bb=firSen.find(30); is wrong.)
Could you help me?
The std::string::find() function takes a std::string, a const char*, or a single char as the search key. An int such as 100 converts to char, so firSen.find(100) compiles but searches for the character with code 100 ('d' in ASCII) rather than the text "100".
If you want to search for a number, you'll have to convert it to a std::string or use a string literal:
std::string::size_type aa=firSen.find("100");
or
int num = 100;
std::string::size_type aa=firSen.find(std::to_string(num));
See the std::to_string() function reference.
Judging from your input sample, though, you don't know the numeric values beforehand, so looking up something like
std::string::size_type aa=firSen.find("100");
is of little use.
What you actually need is a simple parser that reads the numeric value following certain keywords that take a numeric attribute (like the weight in your sample).
The simplest way might be to find your keywords, like "hamburger" or "pizza", then move on from the found position to the next digit ('0'-'9') and extract the number from there.
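The keyword-then-digit approach just described could be sketched as follows. NumberAfter is a hypothetical name, and the sketch assumes the number is non-negative and fits in an int.

```cpp
#include <string>

// Hypothetical helper: returns the first number that follows `keyword`
// in `text`, or -1 if the keyword or a following digit is not found.
int NumberAfter(const std::string& text, const std::string& keyword) {
    const std::string::size_type keyPos = text.find(keyword);
    if (keyPos == std::string::npos) return -1;

    // Look for the first digit after the end of the keyword.
    const std::string::size_type digitPos =
        text.find_first_of("0123456789", keyPos + keyword.size());
    if (digitPos == std::string::npos) return -1;

    // std::stoi parses the leading digits and stops at the first non-digit.
    return std::stoi(text.substr(digitPos));
}
```

Note that this takes the next digit anywhere after the keyword, so a keyword with no number of its own would pick up a later item's number.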
Using std::regex, as proposed in @deeiip's answer, might be a concise solution for your problem.
I'd use this in your situation (if I were searching for any number, not a specific one):
std::regex rgx("[0-9]+");
std::smatch res;
std::string s = firSen; // work on a copy so the input stays intact
while (std::regex_search(s, res, rgx)) {
    std::cout << res[0] << std::endl;
    s = res.suffix().str(); // continue after the current match
}
This is standard C++11 code using <regex>. It searches for every occurrence of a number, which is what [0-9]+ means, and keeps searching the rest of the string for this pattern.
This solution should only be used when you don't know which number to expect; otherwise it will be much more expensive than the other solutions mentioned.
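Since the question asks for the positions of the numbers as well as their values, here is a sketch of the same search using std::sregex_iterator, which reports offsets into the original string without repeatedly copying the suffix. FindNumbers is a hypothetical name, and the sketch assumes every number fits in an int.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (position, value) for every run of digits.
std::vector<std::pair<std::size_t, int>> FindNumbers(const std::string& text) {
    const std::regex digits("[0-9]+");
    std::vector<std::pair<std::size_t, int>> found;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), digits);
         it != end; ++it) {
        // position(0) is the offset of the whole match within `text`.
        found.emplace_back(static_cast<std::size_t>(it->position(0)),
                           std::stoi(it->str(0)));
    }
    return found;
}
```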

How to detect ESC in string using Boost regex

I need to determine if a file is PCL encoded. So I am looking at the first line to see if it begins with an ESC character. If you know a better way feel free to suggest. Here is my code:
bool pclFlag = false;
if (containStr(jobLine, "^\\e")) {
    pclFlag=true;
}
bool containStr(const string& s, const string& re)
{
    static const boost::regex e(re);
    return regex_match(s, e);
}
pclFlag does not get set to true.
You've declared boost::regex e to be static, which means it will only get initialized the very first time your function is called. If your search here is not the first call, it will be searching for whatever string was passed in the first call.
regex_match must match the entire string. Try adding ".*" (dot star) to the end of your regex.
Important
Note that the result is true only if the expression matches the whole of the input sequence. If you want to search for an expression somewhere within the sequence then use regex_search. If you want to match a prefix of the character string then use regex_search with the flag match_continuous set.
http://www.boost.org/doc/libs/1_51_0/libs/regex/doc/html/boost_regex/ref/regex_match.html
@JoachimPileborg is right... if (jobLine[0] == 0x1B) {} is much easier.
Boost.Regex seems like overkill if all you want to do is see if a string starts with a certain character.
bool pclFlag = jobLine.length() > 0 && jobLine[0] == '\033';
You could also use Boost string algorithms:
#include <boost/algorithm/string.hpp>
bool pclFlag = boost::starts_with(jobLine, "\033");
If you're looking to see if a string contains an escape anywhere in the string:
bool pclFlag = jobLine.find('\033') != std::string::npos;
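Pulling the answers above together, here is a minimal sketch that uses std::regex instead of Boost.Regex (that substitution is an assumption made to keep the example self-contained). It drops the static regex, uses regex_search rather than regex_match, and shows the plain character check next to it. containsPattern and startsWithEsc are hypothetical names.

```cpp
#include <regex>
#include <string>

// Hypothetical rewrite of containStr: the regex is rebuilt on every call,
// so a later call with a different pattern is not ignored, and
// regex_search lets the pattern match only part of the line.
bool containsPattern(const std::string& s, const std::string& re) {
    const std::regex e(re);
    return std::regex_search(s, e);
}

// Plain check without any regex: true if the line begins with ESC (0x1B).
bool startsWithEsc(const std::string& line) {
    return !line.empty() && line[0] == '\x1B';
}
```

A call such as containsPattern(jobLine, "^\\x1B") anchors the search at the start of the line, using the ECMAScript hex escape for ESC.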

Regular Expression for removing suffix

What is the regular expression for removing the suffix of file names? For example, if I have a file name in a string such as "vnb.txt", what is the regular expression to remove ".txt"?
Thanks.
Do you really need a regular expression to do this? Why not just look for the last period in the string, and trim the string up to that point? Frankly, there's a lot of overhead for a regular expression, and I don't think you need it in this case.
As suggested by tstenner, you can try one of the following, depending on what kinds of strings you're using:
std::strrchr
std::string::find_last_of
First example:
const char* str = "Directory/file.txt"; // string literals are const in C++
size_t index = 0;
const char* pStr = strrchr(str,'.');
if(nullptr != pStr)
{
    index = pStr - str; // offset of the last '.'
}
Second example:
std::string::size_type index = std::string("Directory/file.txt").find_last_of('.');
If you are using Qt already, you could use QFileInfo, and use the baseName() function to get just the name (if one exists), or the suffix() function to get the extension (if one exists).
If you're looking for a solution that will give you anything except for the suffix, you should use string::find_last_of.
Your code could look like this:
const std::string removesuffix(const std::string& s) {
    size_t suffixbegin = s.find_last_of('.');
    //This will handle cases like "directory.foo/bar"
    size_t dir = s.find_last_of('/');
    if(dir != std::string::npos && dir > suffixbegin) return s;
    if(suffixbegin == std::string::npos) return s;
    else return s.substr(0,suffixbegin);
}
If you're looking for a regular expression, use \.[^.]+$.
You have to escape the first ., otherwise it will match any character, and put a $ at the end, so it will only match at the end of a string.
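As a sketch of how that pattern could be applied with the standard <regex> header (the question does not say which regex library is in use, so this is an assumption), std::regex_replace can substitute the matched suffix with an empty string. StripSuffixRegex is a hypothetical name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes a trailing ".ext" using the pattern above.
// It does not special-case dots in directory names, so "directory.foo/bar"
// is cut at the dot.
std::string StripSuffixRegex(const std::string& name) {
    const std::regex suffix(R"(\.[^.]+$)");
    return std::regex_replace(name, suffix, "");
}
```

Only the last extension is removed, so a name with two extensions keeps the first one.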
Different operating systems may allow different characters in filenames, so the simplest regex might be (.+)\.txt$. Use the first capture group to get the filename without the extension.

string contains valid characters

I am writing a method whose signature is
bool isValidString(std::string value)
Inside this method I want to check that all the characters in value belong to a set of characters held in a constant string:
const std::string ValidCharacters("abcd")
To perform this check, I take one character at a time from value and search for it in ValidCharacters; if the search fails, the string is invalid. Is there an alternative in the standard library for doing this check?
Use find_first_not_of():
bool isValidString(const std::string& s) {
    return std::string::npos == s.find_first_not_of("abcd");
}
You can also use regular expressions to pattern match.
The header regexp.h needs to be included:
http://www.digitalmars.com/rtl/regexp.html
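As a sketch of the regular-expression route using the standard <regex> header rather than the regexp.h library linked above (that substitution is an assumption of this example), a character class can express the allowed set. isValidStringRegex is a hypothetical name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `value` consists only of the characters
// a, b, c and d. regex_match requires the whole string to match, and the
// empty string counts as valid, like the find_first_not_of() version.
bool isValidStringRegex(const std::string& value) {
    static const std::regex onlyValid("[abcd]*");
    return std::regex_match(value, onlyValid);
}
```

For a fixed character set like this, find_first_not_of() is likely the cheaper choice; the regex form is mainly useful when the rule is more than a flat set of characters.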