Regular expression to match a string up to a newline character - regex

How can I use a regular expression to match the passphrase in the text below, i.e. the text between the Passphrase= string and the following \n characters (expected match: testpasssword)? The password can contain any characters.
My partial solution: Passphrase.*(?=\\nName) => Passphrase=testpasssword
[wifi_d0b5c2bc1d37_7078706c617967726f756e64_managed_psk]\nPassphrase=testpasssword\nName=pxplayground\nSSID=9079706c697967726f759e69\nFrequency=2462\nFavorite=true\nAutoConnect=true\nModified=2018-06-18T09:06:26.425176Z\nIPv4.method=dhcp\nIPv4.DHCP.LastAddress=0.0.0.0\nIPv6.method=auto\nIPv6.privacy=disabled\n

With QRegularExpression that supports PCRE regex syntax, you may use
QString str = "your_string";
QRegularExpression rx(R"(Passphrase=\K.+?(?=\\n))");
qDebug() << rx.match(str).captured(0);
The R"(Passphrase=\K.+?(?=\\n))" is a raw string literal defining the regex pattern Passphrase=\K.+?(?=\\n). It matches Passphrase=, drops the text matched so far with the match reset operator \K, and then matches 1 or more chars, as few as possible, up to the first \ char followed by the letter n.
You may use a capturing group approach that looks simpler though:
QRegularExpression rx(R"(Passphrase=(.+?)\\n)");
qDebug() << rx.match(str).captured(1); // Here, grab Group 1 value!

The only things you were missing are the lazy quantifier, which tells your regex to match only as much as necessary, and a positive lookbehind. The first is a simple question mark after the plus; the second is a group starting with ?<= that wraps the phrase you want to require but not include in the match. Check the code example to see it in action.
(?<=Passphrase=).+?(?=\\n)
const regex = /(?<=Passphrase=).+?(?=\\n)/gm;
const str = `[wifi_d0b5c2bc1d37_7078706c617967726f756e64_managed_psk]\\nPassphrase=testpasssword\\nName=pxplayground\\nSSID=9079706c697967726f759e69\\nFrequency=2462\\nFavorite=true\\nAutoConnect=true\\nModified=2018-06-18T09:06:26.425176Z\\nIPv4.method=dhcp\\nIPv4.DHCP.LastAddress=0.0.0.0\\nIPv6.method=auto\\nIPv6.privacy=disabled\\n
`;
let m;
while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
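The capturing-group idea from the first answer can also be sketched in JavaScript. This is only an illustration: extractPassphrase is a hypothetical helper name, and the sketch assumes the separator may be either the two literal characters \ and n (as in the sample text) or a real line break. A passphrase that itself contains a backslash followed by n would be cut short.

```javascript
// Hypothetical helper, not part of the original answers.
function extractPassphrase(text) {
  // Group 1 is the passphrase. It ends at a literal backslash + n,
  // at a real line break, or at the end of the input, whichever comes first.
  const m = /Passphrase=(.+?)(?:\\n|\r?\n|$)/.exec(text);
  return m ? m[1] : null;
}

// "\\n" in a JavaScript string literal is a backslash followed by n.
const sample = "[wifi_managed_psk]\\nPassphrase=testpasssword\\nName=pxplayground\\n";
console.log(extractPassphrase(sample)); // testpasssword
```

No lookbehind or \K is needed here because the wanted text is read from group 1 instead of the whole match.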

Related

How to detect the group of characters in a string using RegEx?

I have strings as follows :
M0122PD12XS1,
M1213234NW,
M1213234EFA1.
I need to read the last two or three letters in each string. At most one digit follows these letters at the end, regardless of any digits that come before them.
The expected results are:
M0122PD12XS1 => XS
M1213234NW => NW
M1213234EFA1 => EFA
I used the following regex, but it only reads the last two/three characters when no digit follows them.
Regex string : ".{0,0}\D*$".
Any help is appreciated.
This might be what you need:
.*[0-9]([a-zA-Z]+)
I think it should be
([A-Z]{2,})(?:[,.\s\d]+)?$
If no punctuation is required at the line ends, just
([A-Z]{2,})(?:[\d]+)?$
Here [A-Z]{2,} matches two or more letters, and (?:[\d]+)? matches optional digits at the end of the string.
We could use the punctuation or whitespace to the right of the desired two or three letters as a right boundary:
[0-9]([A-Z]{2,3})([0-9])?[,.\s]
and use the digit that precedes the letters as a left boundary.
const regex = /[0-9]([A-Z]{2,3})([0-9])?[,.\s]/gm;
const str = `M0122PD12XS1,
M1213234NW,
M1213234EFA1.
M0122PD12XS1
M1213234NW
M1213234EFA1
`;
let m;
while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
[A-Z]{2,3}((?=\d\b)|(?=\r)|(?=\.)|(?=\b))
Finds 2 or 3 Alpha Characters [A-Z]{2,3}
Up to, but not including (?=) either
Single Digit, followed by a word boundary (?=\d\b)
Return Character (?=\r)
Period Character (?=\.)
Word Boundary (?=\b)
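As a rough sketch of the same idea anchored at the end of the string, the function below captures the last run of uppercase letters, allowing one optional digit and one optional trailing comma or period after it. The name trailingLetters is hypothetical, and the sketch assumes each string is processed on its own rather than as one multi-line block.

```javascript
// Hypothetical helper, not part of the original answers.
function trailingLetters(code) {
  // Letters, then at most one digit, then optional "," or ".", then the end.
  const m = /([A-Z]+)\d?[,.]?\s*$/.exec(code);
  return m ? m[1] : null;
}

console.log(trailingLetters("M0122PD12XS1,")); // XS
console.log(trailingLetters("M1213234NW,"));   // NW
console.log(trailingLetters("M1213234EFA1.")); // EFA
```

PD in the first string is skipped because it is followed by two digits, so the pattern cannot reach the end of the string from there.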

Regular expression with two unique requirements

I would like a regular expression that matches the following string:
"( one , two,three ,four, '')"
and extracts the following:
"one"
"two"
"three"
""
There could be any number of elements. The regular expression:
"[a-zA-Z]+|(?<=')\\s*(?=')"
works, but the library I am using is not compatible with look-around assertions.
Do I have any options?
This expression would likely capture what we want to extract here:
(\s+)?([A-Za-z]+)(\s+)?|'(.+)?'
It should not need additional boundaries, and the desired outputs are in these two groups:
([A-Za-z]+)
(.+)
Test
const regex = /(\s+)?([A-Za-z]+)(\s+)?|'(.+)?'/gm;
const str = `"( one , two,three ,four, '')"`;
let m;
while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
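A slightly simpler variant without any look-around might look like the sketch below. It uses one capture group for words and one for the content between single quotes, then picks whichever group took part in each match. The name extractElements is hypothetical, and the sketch assumes elements are either plain ASCII words or single-quoted strings.

```javascript
// Hypothetical helper, not part of the original answers.
function extractElements(input) {
  // Group 1: a bare word. Group 2: the (possibly empty) text inside '...'.
  const re = /([A-Za-z]+)|'([^']*)'/g;
  const out = [];
  for (const m of input.matchAll(re)) {
    // A group that did not take part in the match is undefined.
    out.push(m[1] !== undefined ? m[1] : m[2]);
  }
  return out;
}

console.log(extractElements("( one , two,three ,four, '')"));
// [ 'one', 'two', 'three', 'four', '' ]
```

Because whitespace is simply not part of either group, no trimming step is needed, and the empty quoted element comes back as an empty string.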

Match nested capture groups with quantifiers using QRegularExpression

I'm trying to get with a QRegularExpression all attributes of an xml tag in the different captured groups. I use a regex matching the tags and I manage to get the capture groups containing the attribute value but with a quantifier, I get only the last one.
I use this regex :
<[a-z]+(?: [a-z]+=("[^"]*"))*>
And I would like to get "a" and "b" with this text :
<p a="a" b="b">
Here is the code:
const QString text { "<p a=\"a\" b=\"b\">" };
const QRegularExpression pattern { "<[a-z]+(?: [a-z]+=(\"[^\"]*\"))*>" };
QRegularExpressionMatchIterator it = pattern.globalMatch(text);
while (it.hasNext())
{
    const QRegularExpressionMatch match = it.next();
    qDebug() << "Match with" << match.lastCapturedIndex() + 1 << "captured groups";
    for (int i { 0 }; i <= match.lastCapturedIndex(); ++i)
        qDebug() << match.captured(i);
}
And the output :
Match with 2 captured groups
"<p a=\"a\" b=\"b\">"
"\"b\""
Is it possible to get multiple capture groups with the quantifier *, or do I have to iterate using QRegularExpressionMatchIterator with a specific regex on the string literals?
This expression might help you simply capture those attributes; it is not bounded on the left or right. Note that the character class is written as [A-Za-z], because [A-z] would also match the punctuation characters that sit between Z and a in ASCII:
([A-Za-z]+)(=\x22)([A-Za-z]+)(\x22)
If you would like to add boundaries to it, you can extend it to something similar to:
(?:^<p )?([A-Za-z]+)(=\x22)([A-Za-z]+)(\x22)
Test for RegEx
const regex = /(?:^<p )?([A-Za-z]+)(=\x22)([A-Za-z]+)(\x22)/gm;
const str = `<p attributeA="foo" attributeB="bar" attributeC="baz" attributeD="qux"></p>`;
let m;
while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
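A capture group inside a repeated group keeps only its last iteration, which is why the question sees only "b". One way around this is a two-step approach: first match the whole tag, then iterate over the attribute part with a second pattern. The sketch below shows that idea in JavaScript rather than Qt; attributeValues is a hypothetical name, and the sketch returns the values without their surrounding quotes.

```javascript
// Hypothetical helper, not part of the original answers.
function attributeValues(tag) {
  // Step 1: validate the tag and capture the whole attribute section.
  const tagMatch = /^<[a-z]+((?: [a-z]+="[^"]*")*)>$/.exec(tag);
  if (!tagMatch) return [];
  // Step 2: a global pattern yields one match per attribute.
  const attrRe = /[a-z]+="([^"]*)"/g;
  return Array.from(tagMatch[1].matchAll(attrRe), (a) => a[1]);
}

console.log(attributeValues('<p a="a" b="b">')); // [ 'a', 'b' ]
```

The same structure should carry over to QRegularExpression by running globalMatch with the attribute pattern on the captured attribute section.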

Regex Match only not keywords

I have a string like <dlskjfldjf>Text creation List<string> checking edit<br/>. I need help with a regex that matches only <dlskjfldjf> and not List<string>.
Keyword could be any generic type like
IList<T>
List<T>
etc
I tried <([a-zA-Z]+)>, which matches <dlskjfldjf>, and the pattern below, which matches List<string>, but I am not sure how to combine the two:
((List|ilist|IList|IEnumerable|IQuerable)(<)([A-Za-z.,\[\]\s]*)(>))|(<T>)
Use a Zero-width negative lookbehind assertion (?<! expression ):
string pattern = @"(?<!(List|ilist|IList|IEnumerable|IQuerable))<([a-zA-Z]+)>";
In a language that supports Negative Lookbehind a pattern like this could work:
(?<!(List|ilist|IList|IEnumerable|IQuerable))<([a-zA-Z]+)>
In JavaScript you may need to use two patterns to achieve the same result, test once for the angle bracket pattern and then test again to ensure you don't have the type information preceding it.
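The two-pass idea could be sketched roughly as follows: the first pattern finds every angle-bracket candidate, and the second checks whether the text just before it ends with one of the generic type names. The name nonGenericTags is hypothetical, and the keyword list is copied from the question as written.

```javascript
// Hypothetical helper, not part of the original answers.
function nonGenericTags(text) {
  const tagRe = /<[a-zA-Z]+>/g;
  // Applied to the text preceding a candidate: does it end with a type name?
  const endsWithGeneric = /\b(?:List|ilist|IList|IEnumerable|IQuerable)$/;
  const out = [];
  for (const m of text.matchAll(tagRe)) {
    const before = text.slice(0, m.index);
    if (!endsWithGeneric.test(before)) out.push(m[0]);
  }
  return out;
}

const input = "<dlskjfldjf>Text creation List<string> checking edit<br/> or IList<string> or <aAbB>";
console.log(nonGenericTags(input)); // [ '<dlskjfldjf>', '<aAbB>' ]
```

<br/> is never a candidate because the first pattern allows only letters between the angle brackets.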
What you could do is match what you don't want to keep, and capture in a group what you do want to keep:
\b(?:List|ilist|IList|IEnumerable|IQuerable)<[^<>]*>|(<[a-zA-Z]+>)
That will match:
\b Word boundary to prevent any of the listed words in the alternation being part of a larger word
(?: Non capturing group
List|ilist|IList|IEnumerable|IQuerable Alternation which will match any of the listed words
) Close non capturing group
<[^<>]*> Match <, then any char other than < or > 0+ times, then match >
| Or
( Capture group (What you want to keep)
<[a-zA-Z]+> Match <, then 1+ times a lower or uppercase char, then >
) Close capture group
For example:
const regex = /\b(?:List|ilist|IList|IEnumerable|IQuerable)<[^<>]*>|(<[a-zA-Z]+>)/g;
const str = `<dlskjfldjf>Text creation List<string> checking edit<br/> or IList<string> or <aAbB>`;
let m;
let res = [];
while ((m = regex.exec(str)) !== null) {
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    if (m[1] !== undefined) res.push(m[1]);
}
console.log(res);

Capturing empty instance in regex

Please pardon me if my question sounds basic. I have a text string with four values:
Field A|Field B|Field C|Field D
In the input, one or more of these four values can be left blank, e.g.:
Field A||Field C|Field D
Or
Field A||Field C||
I need to write a regex that can capture the values appropriately and assign it to specific buckets. Can someone please help?
The details differ slightly depending on the language you are using.
The implementation below is based on JavaScript. Essentially the pattern you're after is something like /(.*?)\|(.*?)\|(.*?)\|(.*)/
Here . matches any character, and *? makes the match non-greedy, so each group captures only up to the next | pipe and may be empty.
Since we know there will be 4 groups and the last one is not followed by a | pipe, (.*) is adequate for the last group: it simply captures the rest of the string.
Try this:
const regex = /(.*?)\|(.*?)\|(.*?)\|(.*)/gm;
const str = `Field A||Field C|Field D`;
var m;
while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
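Since the separator is a single fixed character, splitting on it may be simpler than a regex. This is a different technique from the answer above, shown only as a sketch: toBuckets and the bucket names a to d are hypothetical, and any field beyond the fourth is ignored.

```javascript
// Hypothetical helper, not part of the original answers.
function toBuckets(line) {
  // split keeps empty strings for blank fields; defaults cover short input.
  const [a = "", b = "", c = "", d = ""] = line.split("|");
  return { a, b, c, d };
}

console.log(toBuckets("Field A||Field C|Field D"));
// { a: 'Field A', b: '', c: 'Field C', d: 'Field D' }
```

Blank fields come back as empty strings, so each bucket always holds a string.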