I'm using C++ in Xcode. I'd like to match non-alphabetic characters using regex_match, but I'm having difficulty:
#include <iostream>
#include <regex>
using namespace std;
int main(int argc, const char * argv[])
{
cout << "BY-WORD: " << regex_match("BY-WORD", regex("[^a-zA-Z]")) << endl;
cout << "BYEWORD: " << regex_match("BYEWORD", regex("[^a-zA-Z]")) << endl;
return 0;
}
which returns:
BY-WORD: 0
BYEWORD: 0
I want "BY-WORD" to be matched (because of the hyphen), but regex_match returns a 0 for both tests.
I'm confused.
regex_match tries to match the whole input string against the regular expression you provide. Since your expression would only match a single character, it will always come back false on those inputs.
You probably want regex_search instead.
regex_match() returns whether the entire target sequence matches the regular expression. To find the non-alphabetic characters within the target sequence, you need regex_search():
#include <regex>
#include <iostream>
#include <string>
int main()
{
std::regex rx("[^a-zA-Z]");
std::smatch res;
std::string str("BY-WORD");
while (std::regex_search (str,res,rx)) {
std::cout <<res[0] << std::endl;
str = res.suffix().str();
}
}
I am playing around with C++ regular expression:
#include <iostream>
#include <regex>
using namespace std;
int main()
{
std::regex e("[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]");
std::string str = "¡";//u00A1, trying to match this character
bool match = std::regex_match(str, e);
cout << match << endl;
return 0;
}
The program compiles, but it terminates at run time with an exception:
terminate called after throwing an instance of 'std::regex_error'
what(): Invalid range in bracket expression.
I tested on regex101.com with:
[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]
and it gives me a match.
Also, is regex in C++ using the ECMAScript flavor? Thanks.
The problem is that you are using a char-based regex with a pattern whose code points need 16 bits. Try switching to wchar_t as your character type.
#include <iostream>
#include <regex>
using namespace std;
int main()
{
std::wregex e(L"[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF]");
std::wstring str = L"¡";//u00A1, trying to match this character
bool match = std::regex_match(str, e);
cout << match << endl;
return 0;
}
I am having a problem with Boost regex in C++. I want to match a string like
"Hello %world% regex %cpp%" and the expected output is world, cpp.
Can somebody suggest a regex for this?
Thanks
Anil
I personally prefer "\\%([^\\%]*)\\%" (or as a raw string R"r(\%([^\%]*)\%)r")
It doesn't rely on non-greedy quantifiers.
The pattern is essentially:
one percent character: \\%
any number of non-percent characters, captured as group 1: ([^\\%]*)
one percent character: \\%
I know this is tagged boost but here's a solution with std::regex
#include <string>
#include <regex>
#include <iostream>
int main()
{
using namespace std;
string source = "Hello %world%";
regex match_percent_enclosed (R"_(\%([^\%]*)\%)_");
smatch between_percent;
bool found_match = regex_search(source,between_percent,match_percent_enclosed);
if(found_match && between_percent.size()>1)
cout << "found: \"" << between_percent[1].str() << "\"." << endl;
else
cout << "no match found." << endl;
}
This may give you an idea:
%(.+?)%
Result:
Match 1
1. world
Match 2
1. cpp
You can use this regex: \%(.*?)\% (the non-greedy .*? captures the smallest possible group).
Online regex: https://regex101.com/r/dSCE2a/2
And for the code with boost
#include <iostream>
#include <cstdlib>
#include <boost/regex.hpp>
using namespace std;
int main()
{
boost::cmatch mat;
boost::regex reg( "\\%(.*?)\\%" );
char szStr[] = "Hello %world% regex %cpp%";
const char *where = szStr;
while (regex_search(where, mat, reg))
{
cout << mat[1] << endl; // 0 for whole match, 1 for sub
where = mat[0].second; // continue just past the end of this match
}
}
#include <iostream>
#include <stdio.h>
#include <string.h>
#include <regex>
using namespace std;
int main(int argc, char* argv[]) {
string test = "<html><div><script>var link = "http://example.com/?key=dynamic_key";</script></div></html>";
regex re("http://example.com/(*)");
smatch match;
if (regex_search(test, match, re)) {
cout<<"OK"<<endl;
}
return 0;
}
The command I use to compile it:
root# g++ test.cpp -o test -std=gnu++11
This program is not working. How do I extract the link from the HTML code using a regex? Please help me.
Your string construction is incorrect, see the " escaping:
string test = "<html><div><script>var link = \"http://example.com/?key=dynamic_key\";</script></div></html>";
And I would use this regex:
http:\/\/example.com[^"]*
which selects only this:
http://example.com/?key=dynamic_key
I see two problems with your code.
The first is you are trying to put quotes " inside quotes without escaping them.
You need to do: "escape your \"quotes\" properly" (note the \").
Also your regex was not quite right, * needs to follow a matchable character (like [^"] meaning not a quote):
#include <iostream>
#include <stdio.h>
#include <string.h>
#include <regex>
using namespace std;
int main(int argc, char* argv[]) {
//string test = "<html><div><script>var link = "http://example.com/?key=dynamic_key";</script></div></html>";
string test = "<html><div><script>var link = \"http://example.com/?key=dynamic_key\";</script></div></html>";
//regex re("http://example.com/(*)");
regex re("http://example.com/([^\"]*)"); // NOTE the escape \"
smatch match;
if (regex_search(test, match, re)) {
cout<<"OK"<<endl;
cout << match.str(1) << '\n'; // first capture group
}
return 0;
}
Output:
OK
?key=dynamic_key
I think there are two errors here:
The test string is incorrectly delimited. Try using raw string literals.
The regex isn't quite right either (I assume you want to match the full link).
One further warning: regex and HTML don't always work well together.
Sample code listing
#include <iostream>
#include <stdio.h>
#include <string.h>
#include <regex>
using namespace std;
int main(int argc, char* argv[]) {
string test = R"(<html><div><script>var link = "http://example.com/?key=dynamic_key";</script></div></html>)";
regex re( R"(http://example\.com/[^"]*)" );
smatch match;
if (regex_search(test, match, re)) {
cout << "OK" << endl;
for (auto i : match) {
cout << i << endl;
}
}
return 0;
}
And the output here is:
OK
http://example.com/?key=dynamic_key
I googled around but still cannot find the error.
Why does the following code print false? I expected true.
#include <iostream>
#include <regex>
using namespace std;
int main()
{
std::string in("15\n");
std::regex r("[1-9]+[0-9]*\\n",
std::regex_constants::extended);
std::cout << std::boolalpha;
std::cout << std::regex_match(in, r) << std::endl;
}
Using regex_search is not an option.
There is an extra backslash before the "\n" in your regex. The code prints true with just that backslash removed.
#include <iostream>
#include <regex>
using namespace std;
int main()
{
std::string in("15\n");
std::regex r("[1-9]+[0-9]*\n",
std::regex_constants::extended);
std::cout << std::boolalpha;
std::cout << std::regex_match(in, r) << std::endl;
}
Edit: @rici explains why this is an issue in a comment:
Posix-standard extended regular expressions (selected with std::regex_constants::extended) do not recognize C-escape sequences such as \n. See Posix base definitions 9.4.2: "The interpretation of an ordinary character preceded by a <backslash> ( '\' ) is undefined."
Below is my code:
#include <iostream>
#include <stdlib.h>
#include <boost/regex.hpp>
#include <string>
using namespace std;
using namespace boost;
int main() {
std::string s = "Hello my name is bob";
boost::regex re("name");
boost::cmatch matches;
try{
// if (boost::regex_match(s.begin(), s.end(), re))
if (boost::regex_match(s.c_str(), matches, re)){
cout << matches.size();
// matches[0] contains the original string. matches[n]
// contains a sub_match object for each matching
// subexpression
for (int i = 1; i < matches.size(); i++){
// sub_match::first and sub_match::second are iterators that
// refer to the first and one past the last chars of the
// matching subexpression
string match(matches[i].first, matches[i].second);
cout << "\tmatches[" << i << "] = " << match << endl;
}
}
else{
cout << "No Matches(" << matches.size() << ")\n";
}
}
catch (boost::regex_error& e){
cout << "Error: " << e.what() << "\n";
}
}
It always outputs "No Matches".
I'm sure that the regex should work.
I used this example:
http://onlamp.com/pub/a/onlamp/2006/04/06/boostregex.html?page=3
From the Boost.Regex documentation:
Important
Note that the result is true only if the expression matches the whole of the input sequence. If you want to search for an expression somewhere within the sequence then use regex_search. If you want to match a prefix of the character string then use regex_search with the flag match_continuous set.
Try boost::regex re("(.*)name(.*)"); if you want to use the expression with regex_match.