Qt5 QRegExp: why doesn't my pattern work?

When I open a text file and run my pattern against it, I don't get any matched string. I then tested the pattern .* and still got nothing. I'm sure the text file is being read, and grep accepts the pattern. Thank you.
QList<Nmap_result> ans;
QFile file(path);
if(!file.open(QFile::ReadOnly|QFile::Text))
{
exit(1);
}
QString text = file.readAll();
QRegExp reg(QRegExp::escape(".*"));
reg.indexIn(text);
qDebug()<<reg.capturedTexts().join("|")<<endl<<reg.captureCount()<<endl;
Sorry, I should not use escape. But when I change it like this:
QString text = file.readAll();
qDebug()<<text<<endl;
QRegExp reg("[0-9]");
//reg.indexIn(text); //first bind expr test
reg.exactMatch(text); //second bind expr test
qDebug()<<reg.capturedTexts().join("|!!!!!|")<<endl<<reg.captureCount()<<endl;
When I use
reg.indexIn(text);
to apply the regexp to the string, it returns a number, but when I use the next expression
reg.exactMatch(text);
I get nothing.

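Why do you call the QRegExp::escape method?
Try this instead:
QRegExp reg(".*");
QRegExp::escape turns your pattern into \.\* (written "\\.\\*" as a C++ string literal), which matches a literal dot immediately followed by a literal star. That is not what you intend here, which is to match zero or more characters (.*).
As for the follow-up: exactMatch() returns true only if the whole string matches the pattern, so [0-9] succeeds only for a text that is exactly one digit, whereas indexIn() searches for the first match anywhere in the text.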
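The effect of escaping can be reproduced without Qt. The following is a minimal sketch using std::regex from the standard library rather than QRegExp; escapeRegex and firstMatch are hypothetical helper names, and escapeRegex only approximates what QRegExp::escape does.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for QRegExp::escape(): puts a backslash in front of
// every regex metacharacter so that it is matched literally.
std::string escapeRegex(const std::string &pattern)
{
    static const std::string meta = R"(\^$.|?*+()[]{})";
    std::string out;
    for (char c : pattern) {
        if (meta.find(c) != std::string::npos)
            out += '\\';
        out += c;
    }
    return out;
}

// Returns the first match of `pattern` in `text`, or "" if there is none.
std::string firstMatch(const std::string &text, const std::string &pattern)
{
    std::smatch m;
    return std::regex_search(text, m, std::regex(pattern)) ? m.str(0) : std::string();
}
```

With these helpers, firstMatch(text, ".*") returns the whole first line of text, while firstMatch(text, escapeRegex(".*")) returns something only when the text contains the two literal characters ".*".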

Related

Extract string matching a specific format

Given a QString, I want to extract a substring from the main string input.
e.g. I have a QString reading something like:
\\\\?\\Volume{db41aa6a-c0b8-11e9-bc8a-806e6f6e6963}\\
I need to extract the substring (if one exists) that matches the regex (\w){8}([-](\w){4}){3}[-](\w){12}, i.e. the format shown below:
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
and it should return
db41aa6a-c0b8-11e9-bc8a-806e6f6e6963
if found, else an empty QString.
Currently, I can achieve this by doing something like:
string.replace("{", "").replace("}", "").replace("\\", "").replace("?", "").replace("Volume", "");
But this is tedious and inefficient, and tailored to a specific request.
Is there a generalized function that enables me to extract a substring using a regex format or other?
Update
To clarify, after @Emma's answer: I want something like QString::extract("(\w){8}([-](\w){4}){3}[-](\w){12}") that returns db41aa6a-c0b8-11e9-bc8a-806e6f6e6963.
Here are several ways to extract part of a string as presented in the question. I don't know how much of the string format is fixed vs. variable, so possibly not all of these examples will be practical. Some of the examples below use the QStringRef class, which can be more efficient but requires the original string (the one being referenced) to remain available while any references are active (see the warning in the docs).
const QString str("\\\\?\\Volume{db41aa6a-c0b8-11e9-bc8a-806e6f6e6963}\\");
// Treat str as a list delimited by "{" and "}" chars.
const QString sectResult = str.section('{', 1, 1).section('}', 0, 0); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
const QString sectRxResult = str.section(QRegExp("\\{|\\}"), 1, 1); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
// Example using QStringRef, though this could also be just QString::split() which returns QString copies.
const QVector<QStringRef> splitRef = str.splitRef(QRegExp("\\{|\\}"));
const QStringRef splitRefResult = splitRef.value(1); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
// Use regular expressions to find/extract matching string
const QRegularExpression rx("\\w{8}(?:-(\\w){4}){3}-\\w{12}"); // match a UUID string
const QRegularExpressionMatch match = rx.match(str);
const QString rxResultStr = match.captured(0); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
const QStringRef rxResultRef = match.capturedRef(0); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
const QRegularExpression rx2(".+\\{([^{\\}]+)\\}.+"); // capture anything inside { } brackets
const QRegularExpressionMatch match2 = rx2.match(str);
const QString rx2ResultStr = match2.captured(1); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
// Make a copy for replace so that our references to the original string remain valid.
const QString replaceResult = QString(str).replace(rx2, "\\1"); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
qDebug() << sectResult << sectRxResult << splitRefResult << rxResultStr
<< rxResultRef << rx2ResultStr << replaceResult;
Maybe,
Volume{(\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b)}
or just,
\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b
for a full match might be a bit closer.
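For the generalized "extract" helper the question asks about, here is a minimal sketch. It uses std::regex instead of the Qt classes so that it builds without Qt; extractGroup and kUuidPattern are hypothetical names, and the pattern is the full-match expression from just above.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// The full-match UUID expression from the answer above.
const char *const kUuidPattern =
    R"(\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b)";

// Returns capture group `group` of the first match of `pattern` in `input`
// (group 0 is the whole match), or an empty string when nothing matches.
std::string extractGroup(const std::string &input, const std::string &pattern,
                         std::size_t group = 0)
{
    std::smatch m;
    const std::regex re(pattern);
    if (!std::regex_search(input, m, re) || group >= m.size())
        return std::string();
    return m.str(group);
}
```

Calling extractGroup(path, kUuidPattern) on the volume path returns the UUID, and an input without a UUID yields an empty string, which is the behavior the question describes.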
If you wish to simplify, update, or explore the expression, it is explained in the top-right panel of regex101.com. You can also watch the matching steps in its debugger, which shows how a regex engine might consume some sample input strings step by step while performing the match.
Source
Searching for UUIDs in text with regex

Non matching number Regex in Qt

I am working on a number parser for a Qt calculator. I have created a few regexes to match the different kinds of numbers:
Rational : ^[+-]?\d+\/[+-]?\d+$
Integer : ^[+-]?\d+\.?0*$
Real : ^[+-]?\d*\.0*[1-9][0-9]*$
Complex : ^[+-]?[0-9]*(\.[0-9]*|\/[+-]?[0-9]+)?\$[+-]?[0-9]*(\.[0-9]*|\/[+-]?[0-9]+)?$
I try to use them on a QString with the following functions:
parser.h :
bool isRationnal(const QString s)
{
QRegExp ratioExp ("^[+-]?\d+\/[+-]?\d+$");
return ratioExp.exactMatch(s);
}
bool isInteger(const QString s)
{
QRegExp regExp ("^[+-]?\d+\.?0*$");
return regExp.exactMatch(s);
}
bool isReal(const QString s)
{
QRegExp regExp ("^[+-]?\d*\.0*[1-9][0-9]*$");
return regExp.exactMatch(s);
}
bool isComplex(const QString s)
{
QRegExp regExp ("^[+-]?[0-9]*(\.[0-9]*|\/[+-]?[0-9]+)?\$[+-]?[0-9]*(\.[0-9]*|\/[+-]?[0-9]+)?$");
return regExp.exactMatch(s);
}
bool isNumber(const QString s){
bool ok=false;
s.toInt(&ok);
return ok;
}
main.c :
int main()
{
QString test ="6/90";
if(isReal(test))
printf("real\n");
if(isInteger(test))
printf("Integer\n");
if(isRationnal(test))
printf("Rationnal\n");
if(isNumber(test))
printf("Number\n");
else printf("et merde\n");
QRegExp ratioExp ("^[+-]?\d+\/[+-]?\d+$");
QRegExp IntExp ("^[+-]?\d+\.?0*$");
QRegExp RealExp ("^[+-]?\d*\.0*[1-9][0-9]*$");
QRegExp CplxExp ("^[+-]?[0-9]*(\.[0-9]*|\/[+-]?[0-9]+)?\$[+-]?[0-9]*(\.[0-9]*|\/[+-]?[0-9]+)?$");
if(ratioExp.isValid())
printf("ratio valide\n");
if(IntExp.isValid())
printf("int valide\n");
if(RealExp.isValid())
printf("real valide\n");
if(CplxExp.isValid())
printf("cplx valide\n");
return 0;
}
I tried to run this code with several QStrings that are numbers, but it usually fails. In particular, it doesn't match a number that has only one character, such as test="4".
Do you know why these boolean functions are failing? Maybe I am using the wrong Qt function, but after trying several, this one looks like what I am looking for.
Feel free to give constructive criticism,
Thank you very much,
Théophile
Your expressions are fine, but replace \d with \\d (and likewise double every other backslash, such as those in \. and \/). I tested your code in my program.
I also really recommend this site, which can debug your expressions online.
Details from the official Qt documentation:
Note: The C++ compiler transforms backslashes in strings. To include a
\ in a regexp, enter it twice, i.e. \\. To match the backslash
character itself, enter it four times, i.e. \\\\.
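The doubling rule comes from the C++ compiler, not from Qt, so it can be shown with std::regex as well. The sketch below is a stand-in for the QRegExp versions in the question (the function names are adapted from it) and shows two ways to write the same kind of pattern: an ordinary literal with doubled backslashes, and a C++11 raw string literal, which needs no doubling.

```cpp
#include <regex>
#include <string>

// Ordinary literal: each regex backslash is written twice.
bool isInteger(const std::string &s)
{
    static const std::regex re("^[+-]?\\d+\\.?0*$");
    return std::regex_match(s, re);
}

// Raw string literal (C++11): the pattern is written exactly as the regex
// engine should see it, with single backslashes.
bool isRational(const std::string &s)
{
    static const std::regex re(R"(^[+-]?\d+/[+-]?\d+$)");
    return std::regex_match(s, re);
}
```

With the backslashes in place, a one-character input such as "4" is recognized as an integer and "6/90" as a rational. Raw string literals are a compiler feature, so they should work the same way when the pattern is passed to a Qt regex class.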

How to QRegExp "[propertyID="anything"] "?

I am parsing a file which contains following packets:
[propertyID="123000"] {
fillColor : #f3f1ed;
minSize : 5;
lineWidth : 3;
}
To scan just this [propertyID="123000"] fragment I have this QRegExp:
QRegExp("^\b\[propertyID=\"c+\"\]\b");
but it does not work. Here is example code that parses the file above:
QRegExp propertyIDExp= QRegExp("\\[propertyID=\".*\"]");
propertyIDExp.setMinimal(true);
QFile inputFile(fileName);
if (inputFile.open(QIODevice::ReadOnly))
{
QTextStream in(&inputFile);
while (!in.atEnd())
{
QString line = in.readLine();
// the if below does not catch the line when it is, for instance,
// [propertyID="123000"] {
if( line.contains(propertyIDExp) )
{
//.. further processing
}
}
inputFile.close();
}
QRegExp("\\[propertyID=\".+?\"\\]")
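You can use ., which matches any character (in most regex engines, any character except a newline). Also use +? to make the match non-greedy; otherwise it will only stop at the last " on the same line. Note that QRegExp does not support lazy quantifiers such as +?; with QRegExp, call setMinimal(true) instead, or use this pattern with QRegularExpression.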
Use the following expression:
QRegExp("\\[propertyID=\"\\d+\"]");
See regex demo
In Qt regex, you need to escape regex special characters with double backslashes, and to match digits, you can use the shorthand class \d. Also, \b word boundary prevented your regex from matching since it cannot match between the string start and [ and between ] and a space (or use \B instead).
To match anything in between quotes, use a negated character class:
QRegExp("\\[propertyID=\"[^\"]*\"]");
See another demo
As an alternative, you can use lazy dot matching with the help of .* and QRegExp::setMinimal():
QRegExp rx("\\[propertyID=\".*\"]");
rx.setMinimal(true);
In Qt, . matches any character including a newline, so please be careful with this option.
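To see why the negated character class is the safer choice, here is a small sketch that contrasts it with a greedy dot. It uses std::regex rather than QRegExp so that it runs without Qt, and propertyId and greedyPropertyId are hypothetical helper names.

```cpp
#include <regex>
#include <string>

// Negated character class: the value stops at the first closing quote.
std::string propertyId(const std::string &line)
{
    static const std::regex re(R"re(\[propertyID="([^"]*)"\])re");
    std::smatch m;
    return std::regex_search(line, m, re) ? m.str(1) : std::string();
}

// Greedy dot, shown only for contrast: the value runs to the last quote
// that is still followed by ']'.
std::string greedyPropertyId(const std::string &line)
{
    static const std::regex re(R"re(\[propertyID="(.*)"\])re");
    std::smatch m;
    return std::regex_search(line, m, re) ? m.str(1) : std::string();
}
```

On a line that holds a single packet header both functions agree. They differ only when two headers share a line, where the greedy version swallows everything between the first and the last quote.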

Qt Using QRegularExpression multiline option

I'm writing a program that uses QRegularExpression with MultilineOption. I wrote this code, but matching stops on the first line. Why? What am I doing wrong?
QString recv = "AUTH-<username>-<password>\nINFO-ID:45\nREG-<username>-<password>-<name>-<status>\nSEND-ID:195-DATE:12:30 2/02/2015 <esempio>\nUPDATEN-<newname>\nUPDATES-<newstatus>\n";
QRegularExpression exp = QRegularExpression("(SEND)-ID:(\\d{1,4})-DATE:(\\d{1,2}):(\\d) (\\d{1,2})\/(\\d)\/(\\d{2,4}) <(.+)>\\n|(AUTH)-<(.+)>-<(.+)>\\n|(INFO)-ID:(\\d{1,4})\\n|(REG)-<(.+)>-<(.+)>-<(.+)>-<(.+)>\\n|(UPDATEN)-<(.+)>\\n|(UPDATES)-<(.+)>\\n", QRegularExpression::MultilineOption);
qDebug() << exp.pattern();
QRegularExpressionMatch match = exp.match(recv);
qDebug() << match.lastCapturedIndex();
for (int i = 0; i <= match.lastCapturedIndex(); ++i) {
qDebug() << match.captured(i);
}
Can someone help me?
The answer is that you should use the globalMatch() method rather than match(): match() returns only the first match in the subject string.
See QRegularExpression documentation on that:
Attempts to perform a global match of the regular expression against
the given subject string, starting at the position offset inside the
subject, using a match of type matchType and honoring the given
matchOptions. The returned QRegularExpressionMatchIterator is
positioned before the first match result (if any).
Also, you can remove the QRegularExpression::MultilineOption option, as it has no effect here: it only changes how ^ and $ match, and the pattern uses neither.
Sample code:
QRegularExpressionMatchIterator i = exp.globalMatch(recv);
while (i.hasNext()) {
QRegularExpressionMatch match = i.next();
// ...
}
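The same "iterate over every match" idea exists in the standard library as std::sregex_iterator. The sketch below uses it as a stand-in for QRegularExpressionMatchIterator so that it runs without Qt; commandNames is a hypothetical helper, and its pattern is a simplified version of the one in the question that only captures the leading keyword of each line.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects the command keyword of every line-shaped match in `recv`.
std::vector<std::string> commandNames(const std::string &recv)
{
    // The "\n" inside the class is a real newline character, so each match
    // ends at the end of its line.
    static const std::regex re("(AUTH|INFO|REG|SEND|UPDATEN|UPDATES)-[^\n]*");
    std::vector<std::string> names;
    for (std::sregex_iterator it(recv.begin(), recv.end(), re), end; it != end; ++it)
        names.push_back(it->str(1));
    return names;
}
```

A single search would stop after the first line, which is the symptom described in the question; walking the iterator visits every line in turn.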
I actually googled this question because I had a similar issue, but I can't completely agree with the answer above. I think most questions about multi-line matching with the new QRegularExpression can be answered as follows:
Use the QRegularExpression::DotMatchesEverythingOption option, which allows the dot (.) to match newline characters. This is extremely useful when porting from QRegExp.
Your pattern is an alternation (an "or" expression); as soon as the first alternative matches, the job is done.
I think it will work if you split the string and loop over the resulting array, comparing each entry with this expression.
If the data always has the same structure, you can use something like this:
"(AUTH)-<([^>]+?)>-<([^>]+?)>\\nINFO-ID:(\\d+)\\n(REG)-<([^>]+?)>-<([^>]+?)>-<([^>]+?)>-<([^>]+?)>\\n(SEND)-ID:(\\d+)-DATE:(\\d+):(\\d+) (\\d+)/(\\d+)/(\\d+) <([^>]+?)>\\n(UPDATEN)-<([^>]+?)>\\n(UPDATES)-<([^>]+?)>"
21 Matches

Regular expression for highlighting words in quotes in Qt5

I use the QSyntaxHighlighter class, with a QRegExp to highlight words in quotes:
void Highlighter::highlightBlock(const QString &text)
{
QRegExp expr("\"(.*?)\"");
int index = expr.indexIn(text);
while(index >=0)
{
int length = expr.matchedLength();
setFormat(index, length, Qt::red);
index = expr.indexIn(text, index+length);
}
}
It doesn't work. This does work:
"\".*\""
but it highlights more than it should. Which regular expression is correct?
Just highlight everything between quotes:
QRegExp("\"([^\"]*)\"");
Highlight single words (run in a loop with an offset to match words):
QRegExp("\"(\\w)*\"");
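A likely reason the original pattern fails is that QRegExp has no per-quantifier lazy syntax such as .*?, which is why the negated class is the usual workaround. Here is a minimal sketch of the highlighting loop built on that pattern. It uses std::regex instead of QRegExp so that it runs without Qt, and quotedSpans is a hypothetical helper that returns the (start, length) pairs a highlighter would pass to setFormat().

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Returns (start index, length) for every "..." span in `text`, quotes
// included, in the order they appear.
std::vector<std::pair<int, int>> quotedSpans(const std::string &text)
{
    static const std::regex re("\"[^\"]*\"");
    std::vector<std::pair<int, int>> spans;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
        spans.emplace_back(static_cast<int>(it->position(0)),
                           static_cast<int>(it->length(0)));
    return spans;
}
```

Because [^"]* can never cross a quote, each span ends at the nearest closing quote, so two quoted words on one line are reported separately instead of as one long match.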
How to match words in quotes:
('|")[^\1]*?\1
Example:
http://regex101.com/r/iF5aA1