I am modifying a Node native extension that spawns native threads to do some processing. I'd like the JavaScript code to provide a filter that excludes some data from that processing.
At this point, I'm passing a JS RegExp source string from JS to C++, creating a std::regex instance from it, and passing it through the different structures down to the native thread logic.
My issue now is that although std::regex seems to use the same syntax as ECMAScript regular expressions, the behavior is not the same :(
My original plan was to rely on V8's RegExp engine somehow, but to trigger the C++ bits directly instead of going from C++ to JS and back. I wasn't able to find out how to do this.
As an example, see the following programs, which use the same regex but yield different results:
#include <stdio.h>
#include <regex>
#include <string>
int main() {
    std::regex re("^(?:(?:(?!(?:\\/|^)\\.).)*?\\/c)$");
    std::smatch match;
    std::string input("a.b/c");
    // std::regex_match returns bool, not int
    bool result = std::regex_match(input, match, re);
    if (result) {
        printf("ok");
    } else {
        printf("nok");
    }
    return 0;
}
The equivalent JS code:
const re = new RegExp("^(?:(?:(?!(?:\\/|^)\\.).)*?\\/c)$");
const match = re.exec("a.b/c");
if (match) {
    console.log("ok");
} else {
    console.log("nok");
}
My question then is: What can I do to get the same results I would in JS but in C++? Is it possible to run V8's RegExp from a pure C++ context?
Is there any way to check whether a given string is a valid URL? The solution must be in C++ and must work without internet access.
Example strings are:
good.morning
foo.goo.koo
https://hhhh
hdajdklbcbdhd
8881424.www.hfbn55.co.in/sdfsnhjk
://dgdh24.vom
dfgdfgdf(2001)/.com/sdgsgh
\adiihsdfghnhg.co.inskdhhj
aser//www.gtyuh.co.uk/kdsfgdfgfrgj
Choose a suitable regular expression like /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/.
Use std::regex, or Boost.Regex if you don't have C++11. A raw string literal keeps the backslashes intact:
if (std::regex_match("http://subject", std::regex(R"(^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$)"))) {
// ...
}
You could use a regex (see: what a regex is).
With C++11, regexes are built into the standard library (see: regex in C++11).
If you cannot use C++11 for some reason, you could use the Boost library.
In any case, you could check the pattern of a URL with:
#include <iostream>
#include <regex> // requires C++11
#include <string>
// ...
// Regex pattern. A raw string literal keeps the backslashes intact
// (in an ordinary literal, "\b" would be a backspace character).
std::string pattern = R"(https?:\/\/(www\.)?[-a-zA-Z0-9#:%._\+~#=]{2,256}\.[a-z]{2,4}\b([-a-zA-Z0-9#:%_\+.~#?&//=]*))";
// Construct regex object
std::regex url_regex(pattern);
// A URL string, for example
std::string my_url = "http://www.google.com/img.png";
// Check for match
if (std::regex_match(my_url, url_regex)) {
    std::cout << "This is a well-formed url\n";
} else {
    std::cout << "Ill-formed url\n";
}
I would like to understand why my program crashes when I call Boost's wsregex::compile with the following string:
(?P<path>\b[a-z]:\\(?:[^\\/:*?"<>|\r\n]+\\)*[^\\/:*?"<>|\r\n]*)?
(:)?
(?P<ip>(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)
(;(?P<port>\d*))?
(:(?P<port>\b\d+\b):(?P<password>[\w]*))?
(:(?P<password>\b\d+\b))?
In RegexBuddy everything appears to be fine. I used the JGSoft flavor option.
I am validating the following:
c:\My Documents\Test\test.csv:1.12.12.13:111:admin
c:\My Documents\Test\test.csv:1.12.12.13:111
c:\My Documents\Test\test.csv:1.12.12.13;111
1.12.12.13:111
1.12.12.13;111
Can you help me? Thanks a lot.
This is neither a memory leak nor a crash as far as I can tell. Xpressive is throwing an exception because this is an invalid pattern. The following program:
#include <iostream>
#include <boost/xpressive/xpressive_dynamic.hpp>
namespace xpr = boost::xpressive;
int main()
{
    const char pattern[] =
        "(?P<path>\\b[a-z]:\\\\(?:[^\\\\/:*?\"<>|\\r\\n]+\\\\)*[^\\\\/:*?\"<>|\\r\\n]*)?"
        "(:)?"
        "(?P<ip>(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b)"
        "(;(?P<port>\\d*))?"
        "(:(?P<port>\\b\\d+\\b):(?P<password>[\\w]*))?"
        "(:(?P<password>\\b\\d+\\b))?";
    try
    {
        xpr::sregex rx = xpr::sregex::compile(pattern);
    }
    catch(xpr::regex_error const & e)
    {
        std::cout << e.what() << std::endl;
    }
}
Outputs:
named mark already exists
Indeed, it does. This pattern uses "port" and "password" twice as the name of a capturing group. Xpressive doesn't like that. Just pick unique names for your captures and you should be fine.
I am using QRegExp to check whether a QString contains some pattern. There is no compile error, but no match is found at runtime where one should be. I tested the regexp in a Python shell and it matches there. I checked in the Qt docs that the syntax is the same for the regexp I am using. Here is a code sample:
bool Thing::isConstraint(const QString &cstr_)
{
QRegExp lB1("^(\d+\.?\d*|\d*\.\d+)<=PARAM(\d+)$");
QRegExp lB2("^PARAM(\d+)>=(\d+\.?\d*|\d*\.\d+)$");
QRegExp lB3("^PARAM(\d+)>(\d+\.?\d*|\d*\.\d+)$");
QRegExp lB4("^(\d+\.?\d*|\d*\.\d+)<PARAM(\d+)$");
QRegExp uB5("^(\d+\.?\d*|\d*\.\d+)>=PARAM(\d+)$");
QRegExp uB6("^(\d+\.?\d*|\d*\.\d+)>PARAM(\d+)$");
QRegExp uB7("^PARAM(\d+)<=(\d+\.?\d*|\d*\.\d+)$");
QRegExp uB8("^PARAM(\d+)<(\d+\.?\d*|\d*\.\d+)$");
QRegExp luB9("^(\d+\.?\d*|\d*\.\d+)>=PARAM(\d+)>=(\d+\.?\d*|\d*\.\d+)$");
QRegExp luB10("^(\d+\.?\d*|\d*\.\d+)>PARAM(\d+)>=(\d+\.?\d*|\d*\.\d+)$");
QRegExp luB11("^(\d+\.?\d*|\d*\.\d+)>=PARAM(\d+)>(\d+\.?\d*|\d*\.\d+)$");
QRegExp luB12("^(\d+\.?\d*|\d*\.\d+)>PARAM(\d+)>(\\d+\.?\d*|\d*\.\d+)$");
QRegExp luB13("^(\d+\.?\d*|\d*\.\d+)<=PARAM(\d+)<=(\d+\.?\d*|\d*\.\d+)$");
QRegExp luB14("^(\d+\.?\d*|\d*\.\d+)<=PARAM(\d+)<(\d+\.?\d*|\d*\.\d+)$");
QRegExp luB15("^(\d+\.?\d*|\d*\.\d+)<PARAM(\d+)<=(\d+\.?\d*|\d*\.\d+)$");
QRegExp luB16("^(\d+\.?\d*|\d*\.\d+)<PARAM(\d+)<(\d+\.?\d*|\d*\.\d+)$");
int pos_=0;
if((pos_ = lB1.indexIn(cstr_)) != -1)
{
m_func->setLowerBound((lB1.cap(2)).toInt(),(lB1.cap(1)).toDouble());
return true;
}
else if((pos_ = lB2.indexIn(cstr_)) != -1)
{
m_func->setLowerBound((lB2.cap(1)).toInt(),(lB2.cap(2)).toDouble());
return true;
}
/*
...
*/
return false;
}
This method is called in this other method:
void Thing::setConstraints(QStringList &constraints_)
{
if(!m_func)
return;
for(int j=0;j<constraints_.size();j++)
{
if(isConstraint(constraints_.at(j)))
{
constraints_.removeAt(j);
}
}
m_func->setConstraints(constraints_);
}
In the VS2010 Watch window, the error for lB1.indexIn(cstr_) is: "Error: argument list does not match a function".
Second, I would like the isConstraint() method to begin by stripping whitespace from the string:
QRegExp wsp ("\s+");
cstr_.replace(wsp,"");
How can I do this without const_cast?
Thanks and regards.
Edit: I needed to double the backslashes in C++, unlike in Python. Thanks!
I think you asked two questions, so I'll try to answer them:
1) Your regular expressions are most likely not passing because you need to escape your backslashes so that C++ doesn't mess up your strings. For example:
QRegExp lB1("^(\\d+\\.?\\d*|\\d*\\.\\d+)<=PARAM(\\d+)$");
2) To avoid using const_cast you can either change your function signature to this:
bool Thing::isConstraint(QString cstr_)
or make a copy of the cstr_ object and operate on the copy instead.
As a side note, you may want to take a look at the QRegExp::exactMatch() function which obviates the need to use ^ and $ at the beginning and end of all of your expressions, and also has a bool return value which would make your if statements a little cleaner.
So we have code like:
#include "cpptk.h"
#include <stdio.h>
using namespace Tk;
void hello() {
    puts("Hello C++/Tk!");
}
int main(int, char *argv[])
{
    // Inner quotes must be escaped inside a C++ string literal
    static const char* str = "button .a -text \"Say Hello pure TCL\"\n"
                             "pack .a\n";
    init(argv[0]);
    button(".b") -text("Say Hello") -command(hello);
    pack(".b") -padx(20) -pady(6);
    runEventLoop();
}
Imagine str is complex Tcl code. We want to feed it to C++/Tk as a string, and have it executed in the same Tcl VM in which our C++/Tk program and its GUI run. The result of this code would then be two buttons inside a window.
How can we do this?
Have you got access to the Tcl_Interp* handle used inside C++/Tk? If so (and assuming here you've got it in a variable called interp) use:
int resultCode = Tcl_Eval(interp, str);
Next, check the resultCode to see if it is TCL_OK or TCL_ERROR (other values are possible, but uncommon in normal scripts). That tells you the interpretation of the “result”, which you get like this:
const char *result = Tcl_GetString(Tcl_GetObjResult(interp));
If the result code says it's an error, result is now an error message. If it was OK, the result is the output of the script (NB: not what was written to standard output, though). It's up to you what to do with that.
[EDIT]: I looked this up in more detail. It's nastier than it appears, because C++/Tk hides Tcl quite deep inside itself. As far as I can see, you do this (untested!):
#include "cpptk.h" // might need "base/cpptkbase.h" instead
#include <string>
// This next part is in a function or method...
std::string script("the script to evaluate goes here");
std::string result = Tk::details::Expr(script,true);
How can I implement something like ls "filename_[0-5][3-4]?" as a class? I would like to store the result in a vector.
Currently I am using system() to call ls, but this is not portable to MS Windows.
thanks,
Arman.
The following program lists files in the current directory whose name matches the regular expression filename_[0-5][34]:
#include <boost/filesystem.hpp>
#include <boost/regex.hpp> // also functional,iostream,iterator,string
namespace bfs = boost::filesystem;
struct match : public std::unary_function<bfs::directory_entry,bool> {
    bool operator()(const bfs::directory_entry& d) const {
        const std::string pat("filename_[0-5][34]");
        std::string fn(d.filename());
        return boost::regex_match(fn.begin(), fn.end(), boost::regex(pat));
    }
};
int main(int argc, char* argv[])
{
    transform_if(bfs::directory_iterator("."), bfs::directory_iterator(),
                 std::ostream_iterator<std::string>(std::cout, "\n"),
                 match(),
                 mem_fun_ref(&bfs::directory_entry::filename));
    return 0;
}
I omitted the definition of transform_if() for brevity. It isn't a standard function but it should be straightforward to implement.
You can use boost::filesystem which has a portable way to retrieve files in a directory.
Then you can check the files against a regular expression with boost::regex for instance to only keep the ones that match your pattern.
The Boost solution is portable, but not optimal on Windows. The reason is that it calls FindFirstFile("*.*") and thus returns everything. Given the globbing pattern, it would be more efficient to call FindFirstFile("filename_?*.*"). You'd still have to filter the results (using e.g. boost::regex), but this would exclude many files that can't possibly match.
Also, with either method, don't forget to filter out directories before doing the regex matching. Checking whether an entry is a directory is quite cheap.