QString splitting multiple delimiters - regex

I'm having trouble splitting a QString properly. Unless I'm mistaken, I need a regex to split on multiple delimiters, and I can't figure out the right expression as I'm quite new to them.
The string is text input from a file:
f 523/845/1 524/846/2 562/847/3 564/848/4
I need each number separately to put into an array.
My code so far:
QStringList x;
QString line = in.readLine();
while (!line.isNull()) {
QRegExp sep("\\s*/*");
x = line.split(sep);
Any pointers?
Cheers

Change your regular expression so that it matches either a run of whitespace or a single slash:
QRegExp sep("(\\s+|/)");
x will then contain "f" followed by each number as a separate entry.
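The same idea can be sketched without Qt, using std::regex from the standard library. This is only an illustration under that assumption: the function name splitFaceLine is made up for the example, and QString::split may treat empty parts differently. The point is that a separator pattern should never be able to match the empty string, which the original "\\s*/*" can.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Split `line` on runs of whitespace or on a single '/'.
// (splitFaceLine is a hypothetical helper name, not a Qt or std API.)
std::vector<std::string> splitFaceLine(const std::string& line)
{
    static const std::regex sep(R"(\s+|/)");
    // Submatch index -1 selects the text between separator matches.
    std::sregex_token_iterator it(line.begin(), line.end(), sep, -1);
    const std::sregex_token_iterator end;
    std::vector<std::string> tokens;
    for (; it != end; ++it) {
        if (it->length() > 0)   // drop empty tokens, e.g. after a leading separator
            tokens.push_back(it->str());
    }
    return tokens;
}
```

For the sample line this yields "f" followed by the twelve numbers as separate strings.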

I find it quite useful to try out regexes interactively. There are plenty of online tools for this nowadays, for example: http://gskinner.com/RegExr/
You can paste your search text there and play with the regex to see what is matched when.

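You could use the C strtok function, which splits a C string (not a QString directly) on any of a set of delimiter characters. Convert the QString to a QByteArray first.
It would look like this:
#include <cstring>   // std::strtok
#include <QByteArray>
#include <QDebug>
#include <QString>

QString a = "f 523/845/1 524/846/2 562/847/3 564/848/4";
QByteArray ba = a.toLocal8Bit();           // strtok modifies its input, so work on a copy
char *myString = ba.data();
char *p = std::strtok(myString, " /");     // delimiters: space and '/'
while (p) {
    qDebug() << "p : " << p;
    p = std::strtok(nullptr, " /");        // nullptr continues with the same string
}
You can list as many delimiter characters as you need. For further information, see the cplusplus.com page for this function: http://www.cplusplus.com/reference/cstring/strtok/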
Regards!.

Related

Extract string matching a specific format

Given a QString, I want to extract a substring from the main string input.
e.g. I have a QString reading something like:
\\\\?\\Volume{db41aa6a-c0b8-11e9-bc8a-806e6f6e6963}\\
I need to extract the string (if a string with the format exists) using a template/format matching a regex format (\w){8}([-](\w){4}){3}[-](\w){12} as shown below:
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
and it should return
db41aa6a-c0b8-11e9-bc8a-806e6f6e6963
if found, else an empty QString.
Currently, I can achieve this by doing something like:
string.replace("{", "").replace("}", "").replace("\\", "").replace("?", "").replace("Volume", "");
But this is tedious and inefficient, and tailored to a specific request.
Is there a generalized function that enables me to extract a substring using a regex format or other?
Update
To clarify, after @Emma's answer: I want something like QString::extract("(\w){8}([-](\w){4}){3}[-](\w){12}"), which returns db41aa6a-c0b8-11e9-bc8a-806e6f6e6963.
Here are several ways to extract part of a string as presented in the question. I don't know how much of the string format is fixed versus variable, so not all of these examples may be practical. Some of the examples use the QStringRef class, which can be more efficient but requires the original string (the one being referenced) to remain available while any references are active (see the warning in the docs).
const QString str("\\\\?\\Volume{db41aa6a-c0b8-11e9-bc8a-806e6f6e6963}\\");
// Treat str as a list delimited by "{" and "}" chars.
const QString sectResult = str.section('{', 1, 1).section('}', 0, 0); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
const QString sectRxResult = str.section(QRegExp("\\{|\\}"), 1, 1); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
// Example using QStringRef, though this could also be just QString::split() which returns QString copies.
const QVector<QStringRef> splitRef = str.splitRef(QRegExp("\\{|\\}"));
const QStringRef splitRefResult = splitRef.value(1); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
// Use regular expressions to find/extract matching string
const QRegularExpression rx("\\w{8}(?:-(\\w){4}){3}-\\w{12}"); // match a UUID string
const QRegularExpressionMatch match = rx.match(str);
const QString rxResultStr = match.captured(0); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
const QStringRef rxResultRef = match.capturedRef(0); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
const QRegularExpression rx2(".+\\{([^{\\}]+)\\}.+"); // capture anything inside { } brackets
const QRegularExpressionMatch match2 = rx2.match(str);
const QString rx2ResultStr = match2.captured(1); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
// Make a copy for replace so that our references to the original string remain valid.
const QString replaceResult = QString(str).replace(rx2, "\\1"); // = "db41aa6a-c0b8-11e9-bc8a-806e6f6e6963"
qDebug() << sectResult << sectRxResult << splitRefResult << rxResultStr
<< rxResultRef << rx2ResultStr << replaceResult;
Maybe,
Volume{(\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b)}
or just,
\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b
for a full match might be a bit closer.
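A search for this kind of pattern can be wrapped in a small helper that returns the match, or an empty string when nothing matches, which is roughly what the question asks for. The sketch below uses std::regex rather than Qt so that it stands alone. The name extractUuid is invented for the example, and accepting upper-case hex digits is an assumption the expressions above do not make.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the first UUID-shaped substring of `input`, or "" if none is found.
// (extractUuid is a hypothetical helper name.)
std::string extractUuid(const std::string& input)
{
    // 8-4-4-4-12 hexadecimal digits separated by '-'.
    static const std::regex uuid(
        "[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}");
    std::smatch m;
    return std::regex_search(input, m, uuid) ? m.str(0) : std::string();
}
```

Because the pattern describes only the UUID itself, the helper does not depend on the surrounding "Volume{...}" text.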
If you wish to simplify, update, or explore the expression, it is explained in the top-right panel of regex101.com. The debugger there shows, step by step, how a regex engine consumes sample input strings and performs the match.
Source
Searching for UUIDs in text with regex

In Qt; what is the best method to capitalise the first letter of every word in a QString?

I am thinking of regular expressions, but that is not exactly readable. There are also functions like s.toUpper() to consider, and probably other things as well.
So what is the best method for capitalising the first letter of words in a QString?
Using this example as a reference, you can do something like this:
QString toCamelCase(const QString& s)
{
QStringList parts = s.split(' ', QString::SkipEmptyParts);
for (int i = 0; i < parts.size(); ++i)
parts[i].replace(0, 1, parts[i][0].toUpper());
return parts.join(" ");
}
Exactly the same, but written differently:
QString toCamelCase(const QString& s)
{
    QStringList cased;
    foreach (QString word, s.split(" ", QString::SkipEmptyParts))
        cased << word.at(0).toUpper() + word.mid(1);
    return cased.join(" ");
}
This uses more memory, because it builds a second list, but it avoids index-based access (no operator[]).
There is an alternative way of doing this which iterates using references to the words and modifies the first character using a QChar reference instead:
QString capitalise_each_word(const QString& sentence)
{
QStringList words = sentence.split(" ", Qt::SkipEmptyParts);
for (QString& word : words)
word.front() = word.front().toUpper();
return words.join(" ");
}
Note that Qt::SkipEmptyParts is required here (as in the other answers to this question) since the first character of each word is assumed to exist when capitalising. This assumption will not hold with Qt::KeepEmptyParts (the default).
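For comparison, here is a sketch of the same split, capitalise, join approach using only the standard library. It assumes ASCII or single-byte text, since std::toupper works per byte and is not Unicode-aware the way QChar::toUpper is. The function name capitaliseEachWord is made up for the example.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Upper-cases the first character of every whitespace-separated word and
// joins the words with single spaces. (Hypothetical helper, ASCII only.)
std::string capitaliseEachWord(const std::string& sentence)
{
    std::istringstream in(sentence);
    std::string word, result;
    while (in >> word) {   // operator>> skips runs of whitespace, so words are never empty
        word[0] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(word[0])));
        if (!result.empty())
            result += ' ';
        result += word;
    }
    return result;
}
```

Extracting with operator>> plays the role of Qt::SkipEmptyParts here: a word is never empty, so accessing its first character is always safe.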
Incredible C++/Qt... All you want is to clear bit 0x20 on a few (ASCII) chars...

How to organize or extract info from a QByteArray

I have a program that receives a full block in a single QByteArray. This block is "divided" by carriage returns followed by line feeds (\r\n). Somewhere in the middle of all this is a date, specifically on the third line (between the second and the third \r\n).
Every time I try to extract this date from the QByteArray I end up with random junk. How can I be more precise with the QByteArray?
What is the best way of extracting this date without altering my QByteArray? Take into consideration that I don't know the date in advance, and it may even be in the wrong format.
Just for understanding purposes, here is an example of my ByteArray:
RandomName=name\r\nRandomID=ID\r\nRandomDate=date\r\nRandomTime=time\r\nRandomWhatever=whatever(...)
EDIT:
Sorry for bad english.
Let's say I have the following text sent to me:
ProgName = Marcus
ProgID = 180
ProgDate = 15.01.16
ProgTime = 13:39
(More info)......
However, none of this information is useful to me... except the Date. Everything was stored in a single QByteArray (Let's call it 'ba'). So this is my ba:
ProgName(space)=(space)Marcus\r\nProgID(space)=(space)180\r\nProgDate(space)=(space)15.01.16\r\nProgTime(space)=(space)13:39\r\n (keeps going)
My problem is: Storing "15.01.16" (the "ProgDate") in a QString without altering or destroying ba.
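There are a variety of ways; try one of the following solutions.
1) Using split(). QByteArray::replace() modifies the array in place, so work on a copy to leave the original untouched:
const QByteArray normalized = QByteArray(yourByteArray).replace("\r\n", "\n"); // copy, then normalize line endings
foreach (const QByteArray &subByte, normalized.split('\n')) {
    qDebug() << subByte;
    foreach (const QByteArray &val, subByte.split('=')) {
        qDebug() << val.trimmed(); // trimmed() drops the spaces around '='
    }
}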
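2) Using QRegularExpression/QRegularExpressionMatchIterator to collect every (key, value) pair:
QRegularExpression re("(\\w+)\\s*=\\s*([^\\r\\n]+)"); // key, optional spaces around '=', value up to the end of the line
QRegularExpressionMatchIterator i = re.globalMatch(QString::fromLatin1(yourByteArray)); // matching a converted copy leaves the array untouched
while (i.hasNext()) {
    QRegularExpressionMatch match = i.next();
    qDebug() << match.captured(0) << match.captured(1) << match.captured(2);
}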
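3) Using QRegularExpression/QRegularExpressionMatch to find a single key:
QRegularExpression re("(RandomDate)\\s*=\\s*([^\\r\\n]+)"); // substitute the real key, e.g. ProgDate
QRegularExpressionMatch match = re.match(QString::fromLatin1(yourByteArray));
if (match.hasMatch())
    qDebug() << match.captured(0) << match.captured(1) << match.captured(2); // captured(2) is the date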
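The lookup can also be done by hand, without a regex and without touching the original buffer. The sketch below is written against std::string instead of QByteArray so that it is self-contained. The names trim and valueForKey are invented for the example, and it assumes lines of the form "key = value" separated by "\r\n", as in the question.

```cpp
#include <cassert>
#include <string>

// Removes leading and trailing spaces and tabs.
static std::string trim(const std::string& s)
{
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Returns the value stored under `key`, or "" if the key is absent.
// `block` is taken by const reference, so it is never modified.
std::string valueForKey(const std::string& block, const std::string& key)
{
    std::size_t pos = 0;
    while (pos < block.size()) {
        std::size_t eol = block.find("\r\n", pos);
        if (eol == std::string::npos) eol = block.size();   // last line without "\r\n"
        const std::string line = block.substr(pos, eol - pos);
        const std::size_t eq = line.find('=');
        if (eq != std::string::npos && trim(line.substr(0, eq)) == key)
            return trim(line.substr(eq + 1));
        pos = eol + 2;                                      // skip past "\r\n"
    }
    return "";
}
```

Looking the date up by key rather than by line number keeps the code working if the sender ever reorders the fields. Validating the date format is left to the caller.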

Qt Using QRegularExpression multiline option

I'm writing a program that uses QRegularExpression with MultilineOption. I wrote this code, but matching stops on the first line. Why? What am I doing wrong?
QString recv = "AUTH-<username>-<password>\nINFO-ID:45\nREG-<username>-<password>-<name>-<status>\nSEND-ID:195-DATE:12:30 2/02/2015 <esempio>\nUPDATEN-<newname>\nUPDATES-<newstatus>\n";
QRegularExpression exp = QRegularExpression("(SEND)-ID:(\\d{1,4})-DATE:(\\d{1,2}):(\\d) (\\d{1,2})\/(\\d)\/(\\d{2,4}) <(.+)>\\n|(AUTH)-<(.+)>-<(.+)>\\n|(INFO)-ID:(\\d{1,4})\\n|(REG)-<(.+)>-<(.+)>-<(.+)>-<(.+)>\\n|(UPDATEN)-<(.+)>\\n|(UPDATES)-<(.+)>\\n", QRegularExpression::MultilineOption);
qDebug() << exp.pattern();
QRegularExpressionMatch match = exp.match(recv);
qDebug() << match.lastCapturedIndex();
for (int i = 0; i <= match.lastCapturedIndex(); ++i) {
qDebug() << match.captured(i);
}
Can someone help me?
You should use the globalMatch() method rather than match(): match() returns only the first match, while globalMatch() returns an iterator over all of them.
See QRegularExpression documentation on that:
Attempts to perform a global match of the regular expression against
the given subject string, starting at the position offset inside the
subject, using a match of type matchType and honoring the given
matchOptions. The returned QRegularExpressionMatchIterator is
positioned before the first match result (if any).
Also, you can remove the QRegularExpression::MultilineOption option: it only changes how ^ and $ behave, and the pattern uses neither.
Sample code:
QRegularExpressionMatchIterator i = exp.globalMatch(recv);
while (i.hasNext()) {
QRegularExpressionMatch match = i.next();
// ...
}
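The difference between a single match and a global match can be illustrated without Qt. In the sketch below, std::sregex_iterator plays the role of QRegularExpressionMatchIterator. The pattern is deliberately simplified to capture only the command name at the start of each line, and allCommandNames is a made-up helper name.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Collects the command name of every "NAME-...\n" line in `recv`.
// (allCommandNames is a hypothetical helper name.)
std::vector<std::string> allCommandNames(const std::string& recv)
{
    // The pattern contains literal newline characters:
    // upper-case name, '-', rest of the line, newline.
    static const std::regex line("([A-Z]+)-[^\n]*\n");
    std::vector<std::string> names;
    // Each increment resumes the search where the previous match ended.
    for (std::sregex_iterator it(recv.begin(), recv.end(), line), end; it != end; ++it)
        names.push_back((*it)[1].str());
    return names;
}
```

A single std::regex_search call on the same input would stop after the first line, which mirrors what match() does in the question.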
I actually found this question while googling a similar issue, but I don't completely agree with the answer above. I think most questions about multi-line matching with the new QRegularExpression can be answered as follows:
Use the QRegularExpression::DotMatchesEverythingOption option, which allows the dot (.) to match newline characters. This is extremely useful when porting from QRegExp.
You have an "or" expression: once the first alternative matches, the job is done.
Splitting the string and looping over the resulting array, comparing each entry with this expression, should work, I think.
If the data always has the same structure, you can use something like this:
"(AUTH)-<([^>]+?)>-<([^>]+?)>\\nINFO-ID:(\\d+)\\n(REG)-<([^>]+?)>-<([^>]+?)>-<([^>]+?)>-<([^>]+?)>\\n(SEND)-ID:(\\d+)-DATE:(\\d+):(\\d+) (\\d+)/(\\d+)/(\\d+) <([^>]+?)>\\n(UPDATEN)-<([^>]+?)>\\n(UPDATES)-<([^>]+?)>"
This pattern contains 21 capture groups.

Comparing regex in qt

I have a regex which I hope means any file with extension listed:
((\\.cpp$)|(\\.cxx$)|(\\.c$)|(\\.hpp$)|(\\.h$))
How to compare it in Qt against selected file?
Your actual RegEx itself doesn't have double backslashes (just when you fit it into a string literal). And you'll need some kind of wildcard if you want to use it to match full filenames. There's a semantic issue of whether you want a file called just ".cpp" to match or not. What about case sensitivity?
I'll assume for the moment that you want at least one other character in the beginning and use .+:
.+((\.cpp$)|(\.cxx$)|(\.c$)|(\.hpp$)|(\.h$))
So this should work:
QRegExp rx (".+((\\.cpp$)|(\\.cxx$)|(\\.c$)|(\\.hpp$)|(\\.h$))");
bool isMatch = rx.exactMatch(filename);
But with the expressive power of a whole C++ compiler at your beck and call, it can be a bit stifling to use regular expressions. You might have an easier time adapting code if you write it more like:
bool isMatch = false;
QStringList fileExtensionList;
fileExtensionList << "CPP" << "CXX" << "C" << "HPP" << "H";
QStringList splitFilenameList = filename.split(".");
if (splitFilenameList.size() > 1) {
    // Compare the last dot-separated part case-insensitively.
    QString fileExtension = splitFilenameList[splitFilenameList.size() - 1];
    isMatch = fileExtensionList.contains(fileExtension.toUpper());
}
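A Qt-free sketch of the same extension check is shown below. It follows the regex version in requiring at least one character before the dot, so a file named just ".cpp" does not match. It assumes a bare file name without directory components, and the name hasSourceExtension is invented for the example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cctype>
#include <string>

// True if `filename` ends in one of the listed extensions, ignoring case.
// (hasSourceExtension is a hypothetical helper; expects no directory part.)
bool hasSourceExtension(const std::string& filename)
{
    static const std::array<const char*, 5> extensions = {"cpp", "cxx", "c", "hpp", "h"};
    const std::size_t dot = filename.rfind('.');
    if (dot == std::string::npos || dot == 0)   // no dot, or nothing before it
        return false;
    std::string ext = filename.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); });
    return std::any_of(extensions.begin(), extensions.end(),
                       [&ext](const char* e) { return ext == e; });
}
```

Keeping the extensions in a list makes it easy to add or remove one without touching a pattern.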