How do I get the regex matched value using Boost.Regex? - c++

I'm trying to extract the domain from a URL. Here is an example program:
#include <iostream>
#include <string>
#include <boost/regex.hpp>
int main () {
std::string url = "http://mydomain.com/randompage.php";
boost::regex exp("^https?://([^/]*?)/");
std::cout << regex_search(url,exp);
}
How do I print the matched value?

You need to use the overload of regex_search that takes a match_results object. In your case:
#include <iostream>
#include <string>
#include <boost/regex.hpp>
int main () {
std::string url = "http://mydomain.com/randompage.php";
boost::regex exp("^https?://([^/]*?)/");
boost::smatch match;
if (boost::regex_search(url, match, exp))
{
std::cout << std::string(match[1].first, match[1].second);
}
}
Edit: Corrected begin, end ==> first, second

Related

pattern search in text strings in c++

I just want to look for a pattern in a string. For example, for the string "abaxavabaabcabbc" the app should print "abc" and "abbc". So the pattern is "abc", but the number of "b"s can vary.
pattern => "abc" => the number of "b"s is variable.
The program should be in C++.
Using regex_search instead of the iterator:
#include <regex>
#include <string>
#include <iostream>
int main() {
std::regex const pattern("ab+c");
for (std::string const text :
{
"abaxavabaabcabbc",
}) //
{
std::smatch match;
for (auto it = text.cbegin(), e = text.cend();
std::regex_search(it, e, match, pattern); it = match[0].second) {
std::cout << "Match: " << match.str() << "\n";
}
}
}
Prints
Match: abc
Match: abbc
There is really only one answer to this question: use std::regex. Regular expressions are made for exactly this purpose.
C++ also supports regular expressions, through the <regex> header.
The regex "ab+c" matches every substring that starts with an "a", continues with one or more "b", and ends with a "c".
See the following very short program:
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <regex>
const std::regex re{ R"(ab+c)" };
using Iter = std::sregex_token_iterator;
int main() {
const std::string test{ "abaxavabaabcabbc" };
std::copy(Iter(test.begin(), test.end(), re), Iter(), std::ostream_iterator<std::string>(std::cout, "\n"));
}
This program iterates over all matches of the pattern and copies them to std::cout, one per line.

C++ regex - get a link from html code

#include <iostream>
#include <stdio.h>
#include <string.h>
#include <regex>
using namespace std;
int main(int argc, char* argv[]) {
string test = "<html><div><script>var link = "http://example.com/?key=dynamic_key";</script></div></html>";
regex re("http://example.com/(*)");
smatch match;
if (regex_search(test, match, re)) {
cout<<"OK"<<endl;
}
return 0;
}
I compile it with this command:
root# g++ test.cpp -o test -std=gnu++11
This program does not work. How do I get the link out of the HTML code using a regex? Please help me.
Your string literal is incorrect; the inner " characters must be escaped as \":
string test = "<html><div><script>var link = \"http://example.com/?key=dynamic_key\";</script></div></html>";
And I would use this regex:
http:\/\/example.com[^"]*
which selects only this:
http://example.com/?key=dynamic_key
I see two problems with your code.
The first is you are trying to put quotes " inside quotes without escaping them.
You need to do: "escape your \"quotes\" properly" (note the \"):
Also your regex was not quite right, * needs to follow a matchable character (like [^"] meaning not a quote):
#include <iostream>
#include <string>
#include <regex>
using namespace std;
int main(int argc, char* argv[]) {
//string test = "<html><div><script>var link = "http://example.com/?key=dynamic_key";</script></div></html>";
string test = "<html><div><script>var link = \"http://example.com/?key=dynamic_key\";</script></div></html>";
//regex re("http://example.com/(*)");
regex re("http://example.com/([^\"]*)"); // NOTE the escape \"
smatch match;
if (regex_search(test, match, re)) {
cout<<"OK"<<endl;
cout << match.str(1) << '\n'; // first capture group
}
return 0;
}
Output:
OK
?key=dynamic_key
I think there are two errors here:
The test string is incorrectly delimited. Try using raw string literals.
The regex isn't quite right either (I assume you want to match the full link).
One more warning: regex and HTML don't always work well together.
Sample code listing
#include <iostream>
#include <string>
#include <regex>
using namespace std;
int main(int argc, char* argv[]) {
string test = R"(<html><div><script>var link = "http://example.com/?key=dynamic_key";</script></div></html>)";
regex re( R"(http://example\.com/[^"]*)" );
smatch match;
if (regex_search(test, match, re)) {
cout << "OK" << endl;
for (auto i : match) {
cout << i << endl;
}
}
return 0;
}
The output is:
OK
http://example.com/?key=dynamic_key

Regex search & replace group in C++?

The best I can come up with is:
#include <boost/algorithm/string/replace.hpp>
#include <boost/regex.hpp>
#include <iostream>
using namespace std;
int main() {
string dog = "scooby-doo";
boost::regex pattern("(\\w+)-doo");
boost::smatch groups;
if (boost::regex_match(dog, groups, pattern))
boost::replace_all(dog, string(groups[1]), "scrappy");
cout << dog << endl;
}
with output:
scrappy-doo
Is there a simpler way of doing this that doesn't involve two distinct searches? Maybe with the new C++11 features (although I'm not sure gcc supports them at the moment)?
std::regex_replace should do the trick. The provided example is pretty close to your problem, even to the point of showing how to shove the answer straight into cout if you want. Pasted here for posterity:
#include <iostream>
#include <iterator>
#include <regex>
#include <string>
int main()
{
std::string text = "Quick brown fox";
std::regex vowel_re("a|e|i|o|u");
// write the results to an output iterator
std::regex_replace(std::ostreambuf_iterator<char>(std::cout),
text.begin(), text.end(), vowel_re, "*");
// construct a string holding the results
std::cout << '\n' << std::regex_replace(text, vowel_re, "[$&]") << '\n';
}

How to search for a regular expression in c++?

I'm working in C++ and need to search for a given regular expression in a given string. Please point me in the right direction. I tried to use the boost::regex library.
This is the regular expression to search for: "get*"
I have to search for it in the following strings:
1. "com::sun::star:getMethodName"
2. "com:sun:star::SetStatus"
3. "com::sun::star::getMessage"
In this case I should get true for the first string, false for the second, and true again for the third.
Thanks in advance.
Use this pattern with boost::regex_search:
boost::regex re("get.+");
For example:
#include <iostream>
#include <string>
#include <boost/regex.hpp>
#include <vector>
#include <algorithm>
int main()
{
std::vector<std::string> vec =
{
"com::sun::star:getMethodName",
"com:sun:star::SetStatus",
"com::sun::star::getMessage"
};
boost::regex re("get.+");
std::for_each(vec.begin(), vec.end(), [&re](const std::string& s)
{
boost::smatch match;
if (boost::regex_search(s, match, re))
{
std::cout << "Matched" << std::endl;
std::cout << match << std::endl;
}
});
}
http://liveworkspace.org/code/7d47ad340c497f7107f0890b62ffa609

If-Then-Else Conditionals in Regular Expressions and using capturing group

I have some difficulties in understanding if-then-else conditionals in regular expressions.
After reading If-Then-Else Conditionals in Regular Expressions I decided to write a simple test. I use C++, Boost 1.38 Regex and MS VC 8.0.
I have written this program:
#include <iostream>
#include <string>
#include <boost/regex.hpp>
int main()
{
std::string str_to_modify = "123";
//std::string str_to_modify = "ttt";
boost::regex regex_to_search ("(\\d\\d\\d)");
std::string regex_format ("(?($1)$1|000)");
std::string modified_str =
boost::regex_replace(
str_to_modify,
regex_to_search,
regex_format,
boost::match_default | boost::format_all | boost::format_no_copy );
std::cout << modified_str << std::endl;
return 0;
}
I expected to get "123" if str_to_modify is "123" and "000" if str_to_modify is "ttt". However, I get ?123123|000 in the first case and nothing in the second.
Could you please tell me what is wrong with my test?
A second example that still doesn't work:
#include <iostream>
#include <string>
#include <boost/regex.hpp>
int main()
{
//std::string str_to_modify = "123";
std::string str_to_modify = "ttt";
boost::regex regex_to_search ("(\\d\\d\\d)");
std::string regex_format ("(?1foo:bar");
std::string modified_str =
boost::regex_replace(str_to_modify, regex_to_search, regex_format,
boost::match_default | boost::format_all | boost::format_no_copy );
std::cout << modified_str << std::endl;
return 0;
}
I think the format string should be (?1$1:000) as described in the Boost.Regex docs.
Edit: I don't think regex_replace can do what you want. Why don't you try the following instead? regex_match will tell you whether the match succeeded (or you can use match[i].matched to check whether the i-th tagged sub-expression matched). You can format the match using the match.format member function.
#include <iostream>
#include <string>
#include <boost/regex.hpp>
int main()
{
boost::regex regex_to_search ("(\\d\\d\\d)");
std::string str_to_modify;
while (std::getline(std::cin, str_to_modify))
{
boost::smatch match;
if (boost::regex_match(str_to_modify, match, regex_to_search))
std::cout << match.format("foo:$1") << std::endl;
else
std::cout << "error" << std::endl;
}
}