QString convert camel case to space separated words - regex

I am trying to convert a camel cased QString into lowercased words separated by spaces. I currently have:
QString camelCase = "thisIsACamelCaseWord";
QString unCamelCase = camelCase.replace(QRegularExpression("([A-Z])"), " $1").toLower();
Which seems to work here,
"this Is A Camel Case Word"
but it is returning with:
"this $1s $1 $1amel $1ase $1ord"

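Since QRegularExpression uses PCRE, the back-reference syntax in the replacement string is '\0', '\1' and so on, as explained in the documentation. Inside a C++ string literal the backslash itself has to be escaped, so the replacement string is written " \\1" instead of " $1".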
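For comparison, here is a minimal sketch of the same substitution in Python, whose `re` module also accepts `\1` in the replacement string. It only illustrates the back-reference idea and is not Qt code; the function name `un_camel_case` is made up for this example.

```python
import re

def un_camel_case(word: str) -> str:
    """Insert a space before every capital letter, then lowercase the result."""
    # r" \1" is a space followed by a back-reference to capture group 1,
    # i.e. the capital letter that was just matched.
    return re.sub(r"([A-Z])", r" \1", word).lower()

print(un_camel_case("thisIsACamelCaseWord"))  # this is a camel case word
```

The raw string prefix `r` plays the same role as doubling the backslash in the C++ literal: it keeps the backslash from being consumed before the regex engine sees it.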

Related

How to extract Json object strings separated by specific pattern?

So the question is quite straightforward. After a couple of hours browsing threads about regex I still can't come up with one that handles a string like the one in the code section.
Here are some of the regular expressions I tried (backslashes left unescaped for readability):
/\d+({.*?})(?:(|\d+|$|))/;
/\d+({.+})(?:(|\d+|$|))/;
/\d+({.*?})(?:(|\d+|\B|))/;
/\d+({.+})(?:(|\d+|\B|))/;
/\d+({.*?})(?:(|\d+|))/;
/\d+({.+})\d+/;
/\d+({.*?})\d+/;
This one is the closest I got to what I expect:
/\d+({.*?})\d+|\d+({.*?})/
QString haystack = "5:4{"type":"someType","data":{"subJson":123}}"\
"9406:22{"type":"SomeOtherType","data":{"subJson":648,"data":{"subSubJson":25}}}"\
"125:10{"last":79}"; // The inner quotes are escaped in the real code; they are left bare here for readability
QRegularExpression re = QRegularExpression("\\d+({.*?})\\d+|\\d+({.*?})");
QRegularExpressionMatchIterator i = re.globalMatch(haystack);
QStringList matches;
while (i.hasNext()) {
QRegularExpressionMatch match = i.next();
QString result = match.captured(1); // Group match
matches << result;
}
qDebug() << matches;
What I expect:
"{"type":"someType","data":{"subJson":123}}"
"{"type":"SomeOtherType","data":{"subJson":648,"data":{"subSubJson":25}}}"
"{"last":79}"
What I actually get:
"{"type":"someType","data":{"subJson":123}}"
"{"type":"SomeOtherType","data":{"subJson":648,"data":{"subSubJson":25}}}"
"" //The last one wasn't matched
But with the full match I get this:
"4{"type":"someType","data":{"subJson":123}}9406"
"22{"type":"SomeOtherType","data":{"subJson":648,"data":{"subSubJson":25}}}125"
"10{"last":79}"
The solution was:
/\d+({.*?})(?:\d+|$)/
First, '\d+' matches the leading digits. Then the capturing group '({.*?})' lazily matches everything between curly braces. Finally the non-capturing group '(?:\d+|$)' stops the previous group at either a run of digits ('\d+') or the end of the string ('$').
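The behaviour of that pattern can be checked with a small Python sketch. It uses the sample string from the question; the braces are escaped for clarity, and the variable names are only illustrative.

```python
import re

# Sample input from the question: "<number>:<number>{json}" records back to back.
haystack = (
    '5:4{"type":"someType","data":{"subJson":123}}'
    '9406:22{"type":"SomeOtherType","data":{"subJson":648,"data":{"subSubJson":25}}}'
    '125:10{"last":79}'
)

# \d+          digits directly before the opening brace
# (\{.*?\})    lazily capture from "{" up to a "}" ...
# (?:\d+|$)    ... that is followed by digits or by the end of the string
pattern = re.compile(r"\d+(\{.*?\})(?:\d+|$)")

matches = [m.group(1) for m in pattern.finditer(haystack)]
for item in matches:
    print(item)
```

One caveat of this approach: a '}' that is directly followed by a digit inside the JSON itself (for example inside a string value) would end the match early, so it only suits input shaped like the sample.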

python 3 regex string matching ignore whitespace and string.punctuation

I am new to regex and would like to know how to pattern match two strings. The use case would be something like finding a certain phrase in some text. I'm using python 3.7 if that makes a difference.
phrase = "some phrase" #the phrase I'm searching for
Possible matches:
text = "some##$#phrase"
^^^^ #non-alphanumeric can be treated like a single space
text = "some phrase"
text = "!!!some!!! phrase!!!"
These are not matches:
text = "some phrases"
^ #the 's' on the end makes it false
text = "ssome phrase"
text = "some other phrase"
I have tried using something like:
re.search(r'\b'+phrase+'\b', text)
I would very much appreciate an explanation of why the regex works if you provide a valid solution.
You should use something like this:
re.search(r'\bsome\W+phrase\b', text)
'\W' means non-word character
'+' means one or more times
In case you have a given phrase in a variable, you could try this before:
some_phrase = some_phrase.replace(r' ', r'\W+')
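Putting the two pieces together, a minimal sketch could look like the following. The helper name `contains_phrase` is made up for this example, and `re.escape` is added as a precaution in case the phrase itself contains regex metacharacters.

```python
import re

def contains_phrase(phrase: str, text: str) -> bool:
    """Return True if the words of `phrase` occur in order in `text`,
    separated only by non-word characters and bounded by word boundaries."""
    # Escape each word so characters like "." or "+" are matched literally.
    words = [re.escape(word) for word in phrase.split()]
    # \W+ between words: one or more non-word characters.
    # \b at both ends: the phrase must not be glued to other word characters.
    pattern = r"\b" + r"\W+".join(words) + r"\b"
    return re.search(pattern, text) is not None

for text in ["some##$#phrase", "some phrase", "!!!some!!! phrase!!!",
             "some phrases", "ssome phrase", "some other phrase"]:
    print(repr(text), contains_phrase("some phrase", text))
```

Note that `\W` does not match the underscore, because `_` counts as a word character, so "some_phrase" would not be reported as a match.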

Double-escaping regex from inside a Groovy expression

Note: I had to simplify my actual use case to spare SO a lot of backstory. So if your first reaction to this question is: why would you ever do this, trust me, I just need to.
I'm trying to write a Groovy expression that replaces double-quotes (""") that appear in a string with single-quotes ("'").
// BEFORE: Replace my "double-quotes" with 'single' quotes.
String toReplace = "Replace my \"double-quotes\" with 'single' quotes.";
// Wrong: compiler error
String replacerExpression = "toReplace.replace(""", "'");";
Binding binding = new Binding();
binding.setVariable("toReplace", toReplace);
GroovyShell shell = new GroovyShell(binding);
// AFTER: Replace my 'double-quotes' with 'single' quotes.
String replacedString = (String)shell.evaluate(replacerExpression);
The problem is, I'm getting a compile error on the line where I assign replacerExpression:
Syntax error on token ""toReplace.replace("", { expected
I think it's because I need to escape the string that contains the double-quote character (""") but since it's a string-inside-a-string, I'm not sure how to properly escape it here. Any ideas?
You need to escape the quote within quotes in this line:
String replacerExpression = "toReplace.replace(""", "'");";
The string will be evaluated twice: once as a string literal, and once as a script. This means you have to escape it with a backslash, and escape the backslash too. Also, with the embedded quotes, it'll be much more readable if you use triple quotes.
Try this (in groovy):
String replacerExpression = """toReplace.replace("\\"", "'");""";
In Java, you're stuck with using backslashes to escape all the quotes and the embedded backslash:
String replacerExpression = "toReplace.replace(\"\\\"\", \"\'\");";
Triple-quotes work well, but one can also use single-quoted string to specify a double-quote, and a double-quoted string for a single-quote.
Consider this:
String toReplace = "Replace my \"double-quotes\" with 'single' quotes."
// key line:
String replacerExpression = """toReplace.replace('"', "'");"""
Binding binding = new Binding(); binding.setVariable("toReplace", toReplace)
GroovyShell shell = new GroovyShell(binding)
String replacedString = (String)shell.evaluate(replacerExpression)
That is, after the string literal evaluation, this is evaluated in the Groovy shell:
toReplace.replace('"', "'");
If that is too hard on the eyes, replace the "key line" above with another style (using slashy strings):
String ESC_DOUBLE_QUOTE = /'"'/
String ESC_SINGLE_QUOTE = /"'"/
String replacerExpression = """toReplace.replace(${ESC_DOUBLE_QUOTE}, ${ESC_SINGLE_QUOTE});"""
Please try to use regular expressions to solve this kind of problem, instead of wrestling with quote escaping.
I have put up a solution using groovy console. Please see if that helps.

Find substr between delimiter characters in Qt with RegEx

I need to obtain a substring in a string in Qt, but with a few details:
the substring I need is delimited by [ and ]
the substring might have some unpredictable characters like /, ^, -. This substring basically describes a unit of measurement.
Also, besides obtaining the substring itself, I need to have a test to check if such a substring exists in the string or not.
I don't know anything about RegEx and I'm new to Qt as well. Most of the examples I found here don't relate to Qt and/or don't explicitly account for what I need.
QRegExp exp("\\[([^\\]]+)\\]");
QString s1 = "5 [sm^2]";
qDebug() << exp.indexIn(s1);
qDebug() << exp.capturedTexts();
Output:
2
("[sm^2]", "sm^2")
If none of the string's parts match the regexp, indexIn will indicate that by returning -1. Otherwise the result will be >= 0, and the capturedTexts()[1] will contain the text that was enclosed in brackets.
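The same pattern can be tried outside Qt. Below is a minimal Python sketch of the "test and extract" logic; it is an analogue for experimenting with the regex, not Qt code, and the function name `extract_unit` is made up for this example.

```python
import re
from typing import Optional

def extract_unit(text: str) -> Optional[str]:
    """Return the text enclosed in the first [...] pair, or None if there is none."""
    # \[        a literal opening bracket
    # ([^\]]+)  capture one or more characters that are not a closing bracket
    # \]        a literal closing bracket
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None

print(extract_unit("5 [sm^2]"))  # sm^2
print(extract_unit("5 sm^2"))    # None
```

Returning None here plays the role of indexIn returning -1: it signals that no bracketed substring exists.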

How to unpunctuate, lowercase, de-space and hyphenate a string with regex?

If I have a string like this
Newsflash: The Big(!) Brown Dog's Brother (T.J.) Ate The Small Blue Egg
how would I convert that into the following using regex:
newsflash-the-big-brown-dogs-brother-tj-ate-the-small-blue-egg
In other words, punctuation is discarded and spaces are replaced with hyphens.
It sounds like you want to create a "URL slug" -- a URL-friendly version of an article's title, for example. That means you'll want to make sure you remove all possible non-URL-friendly characters, not just a few. You might do it this way (in order):
1. Remove all non-letter, non-number, non-space characters by replacing the regex [^A-Za-z0-9 ] with the empty string "".
2. Replace all spaces with a dash by replacing the regex \s+ with the string "-".
3. Lower-case the string:
   Java: s = s.toLowerCase();
   JavaScript: s = s.toLowerCase();
   C#: s = s.ToLower();
   Perl: $s = lc($s);
   Python: s = s.lower()
   PHP: $s = strtolower($s);
   Ruby: s = s.downcase
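The three steps above can be sketched in Python as follows. The function name `slugify` is made up for this example, and the `strip()` call is an extra precaution (not part of the steps above) so that leading or trailing spaces do not turn into stray hyphens.

```python
import re

def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated slug."""
    # Step 1: drop everything that is not an ASCII letter, a digit, or a space.
    cleaned = re.sub(r"[^A-Za-z0-9 ]", "", title)
    # Step 2: collapse each run of whitespace into a single hyphen.
    hyphenated = re.sub(r"\s+", "-", cleaned.strip())
    # Step 3: lowercase the result.
    return hyphenated.lower()

title = "Newsflash: The Big(!) Brown Dog's Brother (T.J.) Ate The Small Blue Egg"
print(slugify(title))
# newsflash-the-big-brown-dogs-brother-tj-ate-the-small-blue-egg
```

Because step 1 keeps only ASCII letters and digits, accented or non-Latin characters are removed entirely; titles that rely on them would need a different character class.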
Replace the regex [\s-]+ with "-", then replace [^\w-] with "".
Then, call ToLowerCase or equivalent.
In Javascript:
var s = "Newsflash: The Big(!) Brown Dog's Brother (T.J.) Ate The Small Blue Egg";
alert(s.replace(/[\s-]+/g, '-').replace(/[^\w-]/g, '').toLowerCase());
Replace /\W+/ with '-', that will replace all non-word characters with a dash.
Then, collapse dashes by replacing /-+/ with '-'.
Then, lowercase the string; pure regex solutions cannot do that. You didn't say which language you are using, so I cannot give you an example, but your language might have String.toLowerCase() or a tr/// call (tr/A-Z/a-z/, for example, in Perl).