wsregex::compile crashes (memory leak) when handling regex string? - c++

I would like to understand why my program crashes when I try to use Boost's wsregex::compile with the following string:
(?P<path>\b[a-z]:\\(?:[^\\/:*?"<>|\r\n]+\\)*[^\\/:*?"<>|\r\n]*)?
(:)?
(?P<ip>(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)
(;(?P<port>\d*))?
(:(?P<port>\b\d+\b):(?P<password>[\w]*))?
(:(?P<password>\b\d+\b))?
In RegexBuddy everything appears to be fine. I used RegexBuddy's JGSoft flavor option.
I am validating the following:
c:\My Documents\Test\test.csv:1.12.12.13:111:admin
c:\My Documents\Test\test.csv:1.12.12.13:111
c:\My Documents\Test\test.csv:1.12.12.13;111
1.12.12.13:111
1.12.12.13;111
Can you help me? Thanks a lot.

This is neither a memory leak nor a crash as far as I can tell. Xpressive is throwing an exception because this is an invalid pattern. The following program:
#include <iostream>
#include <boost/xpressive/xpressive_dynamic.hpp>
namespace xpr = boost::xpressive;
int main()
{
    const char pattern[] =
        "(?P<path>\\b[a-z]:\\\\(?:[^\\\\/:*?\"<>|\\r\\n]+\\\\)*[^\\\\/:*?\"<>|\\r\\n]*)?"
        "(:)?"
        "(?P<ip>(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b)"
        "(;(?P<port>\\d*))?"
        "(:(?P<port>\\b\\d+\\b):(?P<password>[\\w]*))?"
        "(:(?P<password>\\b\\d+\\b))?";
    try
    {
        xpr::sregex rx = xpr::sregex::compile(pattern);
    }
    catch(xpr::regex_error const & e)
    {
        std::cout << e.what() << std::endl;
    }
}
Outputs:
named mark already exists
Indeed, it does. This pattern uses "port" and "password" twice as the name of a capturing group. Xpressive doesn't like that. Just pick unique names for your captures and you should be fine.


NAPI: How to match a JS regex from a C++ thread?

I am modifying a Node native extension that is spawning native threads to do some processing. My issue is that I'd like to have the Javascript code provide a filter for the processing to exclude some data.
At this point, I'm passing a JS RegExp string from JS to C++, creating a std::regex instance from it, and passing it around the different structures down to the native thread logic.
My issue now is that despite std::regex using what seems to be the same syntax as ECMAScript regular expressions, the behavior is not the same :(
My original plan was to rely on V8's RegExp engine somehow but trigger the C++ bits directly instead of going from C++ to JS and back. But I wasn't able to find how to do this.
As an example, see the following programs, which use the same regex but yield different results:
#include <stdio.h>
#include <regex>
#include <string>
int main() {
    std::regex re("^(?:(?:(?!(?:\\/|^)\\.).)*?\\/c)$");
    std::smatch match;
    std::string input("a.b/c");
    // regex_match returns a bool: true only if the whole input matches
    bool result = std::regex_match(input, match, re);
    if (result) {
        printf("ok");
    } else {
        printf("nok");
    }
    return 0;
}
The equivalent JS code:
const re = new RegExp("^(?:(?:(?!(?:\\/|^)\\.).)*?\\/c)$");
const match = re.exec("a.b/c");
if (match) {
    console.log("ok");
} else {
    console.log("nok");
}
My question then is: What can I do to get the same results I would in JS but in C++? Is it possible to run V8's RegExp from a pure C++ context?

How to check a specified string is a valid URL or not using C++ code

Is there any way to check whether a specified string is a valid URL? The solution must be in C++ and it should work without an internet connection.
example strings are
good.morning
foo.goo.koo
https://hhhh
hdajdklbcbdhd
8881424.www.hfbn55.co.in/sdfsnhjk
://dgdh24.vom
dfgdfgdf(2001)/.com/sdgsgh
\adiihsdfghnhg.co.inskdhhj
aser//www.gtyuh.co.uk/kdsfgdfgfrgj
Choose a suitable regular expression, such as /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/.
Use std::regex, or Boost.Regex if you don't have C++11:
// A raw string literal passes the backslashes to the regex engine unchanged
if (std::regex_match("http://subject", std::regex(R"(^(https?://)?([\da-z.-]+)\.([a-z.]{2,6})([/\w .-]*)*/?$)"))) {
    // ...
}
You could use a regular expression. Since C++11, regular expressions are built into the standard library (<regex>).
If you cannot use C++11 for some reason, you could use the Boost library.
Either way, you could check the pattern of a URL with:
#include <iostream>
#include <regex> // requires C++11
#include <string>
// ...
// Regex pattern; a raw string literal keeps the backslashes intact
std::string pattern = R"(https?://(www\.)?[-a-zA-Z0-9#:%._+~=]{2,256}\.[a-z]{2,4}\b([-a-zA-Z0-9#:%_+.~?&/=]*))";
// Construct the regex object
std::regex url_regex(pattern);
// An example URL string
std::string my_url = "http://www.google.com/img.png";
// Check for a match
if (std::regex_match(my_url, url_regex)) {
    std::cout << "This is a well-formed url\n";
} else {
    std::cout << "Ill-formed url\n";
}
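To try such a check against the strings from the question, here is a minimal sketch. The helper name looks_like_url is made up for illustration, and the pattern is a lightly simplified variant of the one above that requires an http or https scheme. It is only a syntactic check, not a full URL validator, and it needs no network access.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the answers above): returns true when `s`
// looks like an http(s) URL. Purely syntactic; nothing is resolved or fetched.
bool looks_like_url(const std::string& s)
{
    // Raw string literal: backslashes reach the regex engine unchanged.
    static const std::regex url_regex(
        R"(https?://(www\.)?[-a-zA-Z0-9#:%._+~=]{2,256}\.[a-z]{2,4}\b([-a-zA-Z0-9#:%_+.~?&/=]*))");
    // regex_match succeeds only if the whole string matches the pattern.
    return std::regex_match(s, url_regex);
}
```

With this pattern, strings without a scheme (such as good.morning) are rejected. Whether they should count as valid depends on what you mean by "valid URL", so adjust the pattern to your requirements.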

std::regex_error exception thrown at runtime

Given that this code works:
regex r1{ "fish"s };
smatch m1;
if (regex_search("I love fish and chips"s, m1, r1))
    cout << m1[0] << endl;
I believe that VS2015 supports regular expressions. However, initialization of this regular expression object:
regex r{ R"(\d{2,3}(-\d\d) { 2 })" };
throws a std::regex_error exception. What's wrong with the initialization?
So, yeah, as mentioned in the comments:
(\d{2,3}(-\d\d) { 2 })
should be
(\d{2,3}(-\d\d){2})
otherwise the quantifier follows the space instead of the (-\d\d) group, and the spaces inside the braces likely make { 2 } an invalid quantifier, which would explain the exception.
You have a typo in your regex. Change this:
regex r{ R"(\d{2,3}(-\d\d) { 2 })" };
To:
regex r{ R"(\d{2,3}(-\d\d){2})" };
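If you want to see the difference without crashing the program, a small sketch like the following can help. The helper names compiles and matches_fixed_pattern are made up for illustration. The first catches std::regex_error instead of letting it propagate, and the second uses the corrected pattern.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: reports whether `pattern` compiles as a std::regex,
// storing the exception text in `error` when it does not.
bool compiles(const std::string& pattern, std::string& error)
{
    try {
        std::regex re(pattern);  // throws std::regex_error on a malformed pattern
        return true;
    } catch (const std::regex_error& e) {
        error = e.what();        // implementation-defined message
        return false;
    }
}

// Hypothetical helper: whole-string match against the corrected pattern,
// i.e. two or three digits followed by two "-dd" groups.
bool matches_fixed_pattern(const std::string& s)
{
    static const std::regex re(R"(\d{2,3}(-\d\d){2})");
    return std::regex_match(s, re);
}
```

Calling compiles with the original pattern (the one containing { 2 }) should return false and fill in the error text, while the corrected pattern compiles and matches input such as 123-45-67.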

Emacs imenu and speedbar+semantic fails because of indentation in c++ mode

My problem is that imenu or speedbar/semantic fails because of indentation. For this simple file, it is ok:
#include <iostream>
void bar() {
std::cout << "bar" << std::endl;
}
But if I put the function bar in a namespace and indent its code, two things go wrong:
With speedbar (having (require 'semantic/sb) in init.el), the file tags do not appear in the speedbar frame, and I get "File mode specification error: (void-function c-subword-mode)" in the minibuffer.
With M-x imenu, I get "No items suitable for an index found in this buffer" in the minibuffer.
Example code that fails:
#include <iostream>
namespace foo {
    void bar() {
        std::cout << "bar" << std::endl;
    }
}
It is not the namespace that makes it fail, but the indentation. The following fails too:
#include <iostream>
    void bar() {
        std::cout << "bar" << std::endl;
    }
Any idea why and how to have it to work?
Thanks!!
EDIT: OK, the solution is indeed speedbar+semantic. It actually works (I had something wrong in my init.el...)
Maybe, the example regexp from imenu.el is used together with imenu-example--create-c-index:
(defvar imenu-example--function-name-regexp-c
  (concat
   "^[a-zA-Z0-9]+[ \t]?"             ; type specs; there can be no
   "\\([a-zA-Z0-9_*]+[ \t]+\\)?"     ; more than 3 tokens, right?
   "\\([a-zA-Z0-9_*]+[ \t]+\\)?"
   "\\([*&]+[ \t]*\\)?"              ; pointer
   "\\([a-zA-Z0-9_*]+\\)[ \t]*("     ; name
   ))
The caret ^ at the beginning means beginning of line. If you insert [[:blank:]]* right after it, function definitions with leading whitespace are indexed as well.
I do not know whether stuff like
else if(...) {
...
}
gives false positives in this case. (You have to try.)
Actually, if I had sufficient time I would try to use semantic or ctags for the indexing. That would be much more robust.
Note, I did not try this. I just had a look at imenu.el. (Currently, I do not have much spare time. Sorry.)
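Since this thread is about C++ code, here is a rough way to experiment with the idea outside Emacs. It is only a sketch: the pattern is a transcription of the imenu example into std::regex (ECMAScript) syntax, [ \t]* stands in for [[:blank:]]*, and the names kTail, matches_original and matches_relaxed are made up. Emacs may behave differently, for example because the index function can apply extra filtering.

```cpp
#include <regex>
#include <string>

// The imenu example pattern without its leading "^", transcribed for std::regex
// (Emacs writes groups as \( \) and a literal paren as a plain "(").
const char kTail[] =
    R"([a-zA-Z0-9]+[ \t]?)"          // type specs
    R"(([a-zA-Z0-9_*]+[ \t]+)?)"
    R"(([a-zA-Z0-9_*]+[ \t]+)?)"
    R"(([*&]+[ \t]*)?)"              // pointer
    R"(([a-zA-Z0-9_*]+)[ \t]*\()";   // name, then the opening paren

// Anchored at column 0, like the original pattern.
bool matches_original(const std::string& line)
{
    static const std::regex re(std::string("^") + kTail);
    return std::regex_search(line, re);
}

// Same pattern, but leading blanks are allowed after the anchor.
bool matches_relaxed(const std::string& line)
{
    static const std::regex re(std::string("^[ \\t]*") + kTail);
    return std::regex_search(line, re);
}
```

In this transcription the relaxed pattern accepts an indented "void bar() {", but it also accepts an indented "else if(x) {", so the false positives mentioned above look like a real concern.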

How to redirect std::cout to a UITextView?

I'm adding C++ code to an iOS application, and I would like to use a UITextView as a way to display what's going through std::cout. I don't want to modify the C++ code too much.
So far, I have defined a string stream named stdcout, in the scope of the C++ code I'm interested in capturing the output, and I'm updating the UITextView after the C++ block returns. This is a bit intrusive, as I need to do some manual text replacing, and it's error prone.
Is there a better way to do this?
You can look at rdbuf().
If you care about performance/flexibility, you could write a custom stream buffer and implement the overflow members so that you get "automatic" "live" updating.
Here's a simple example relaying to a stringstream:
#include <sstream>
#include <iostream>
int main()
{
    std::ostringstream oss;
    auto saved = std::cout.rdbuf(oss.rdbuf());
    std::cout << "hello world" << std::endl;
    std::cout.rdbuf(saved);
    return oss.str().length();
}
This program exits with exit code 12 in my Cygwin shell:
./test.exe; echo $?
12
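For the "live" updating variant, a custom stream buffer could look roughly like the sketch below. All names here (TextSink, CallbackBuf, CoutRedirect) are made up for illustration. The callback stands in for whatever appends text to the UITextView; on iOS that call would have to be dispatched to the main thread, which is not shown.

```cpp
#include <functional>
#include <iostream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical sink type: any callable that receives a chunk of output text.
using TextSink = std::function<void(const std::string&)>;

// Stream buffer that collects characters and forwards them to the sink
// at every newline and whenever the stream is flushed.
class CallbackBuf : public std::streambuf {
public:
    explicit CallbackBuf(TextSink sink) : sink_(std::move(sink)) {}

protected:
    // No put area is set up, so every written character ends up here.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            const char c = traits_type::to_char_type(ch);
            pending_.push_back(c);
            if (c == '\n') flush_pending();
        }
        return traits_type::not_eof(ch);
    }

    // Called on std::flush and std::endl.
    int sync() override {
        flush_pending();
        return 0;
    }

private:
    void flush_pending() {
        if (!pending_.empty()) {
            sink_(pending_);
            pending_.clear();
        }
    }

    TextSink sink_;
    std::string pending_;
};

// RAII guard: installs `buf` on std::cout and restores the old buffer on exit.
class CoutRedirect {
public:
    explicit CoutRedirect(std::streambuf* buf) : saved_(std::cout.rdbuf(buf)) {}
    ~CoutRedirect() { std::cout.rdbuf(saved_); }
    CoutRedirect(const CoutRedirect&) = delete;
    CoutRedirect& operator=(const CoutRedirect&) = delete;

private:
    std::streambuf* saved_;
};
```

Create the CallbackBuf first and a CoutRedirect after it, so the buffer outlives the redirection. While the guard is alive, the existing C++ code keeps writing to std::cout unchanged.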