Regex in Flutter to find double-quoted words that may contain escaped single quotes

I am using this Regex in my Flutter App to find words enclosed by single-quotes that end with a .tr:
r"'[^'\\]*(?:\\.[^'\\]*)*'\s*\.tr\b"
Now I need another expression that is almost the same but looks for words enclosed by double-quotes, ending with .tr, that might contain escaped single-quotes.
I tried simply changing the single quotes to double quotes in the first expression, but Flutter is giving me errors. I need to escape some characters, but I cannot make it work. Any idea?
An edge case it should match is:
"Hello, I\'m Chris".tr

You may use this regex for double quoted text that can have any escaped character followed by .tr and word boundary:
r""""[^"\\]*(?:\\.[^"\\]*)*"\s*\.tr\b"""

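Wrap the raw string in single quotes so that the double quotes inside it do not end the string (the \ before each " is harmless in the regex but not required). Try this: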
RegExp regExp = new RegExp(r'\"[^\"\\]*(?:\\.[^\"\\]*)*\"\s*\.tr\b');
print("${regExp.hasMatch('"Hello, I\'m Chris".tr')}"); // result = true
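The pattern itself is not Dart-specific. As a quick sanity check, here is a minimal sketch in Python (not Dart) that runs the double-quote version against the edge case from the question. The names DOUBLE_QUOTED_TR and has_tr_string are made up for illustration.

```python
import re

# Double-quoted text that may contain any backslash-escaped character,
# followed by optional whitespace and ".tr" as a whole word.
DOUBLE_QUOTED_TR = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"\s*\.tr\b')

def has_tr_string(source: str) -> bool:
    """Return True if source contains a double-quoted string followed by .tr."""
    return DOUBLE_QUOTED_TR.search(source) is not None

# Raw strings keep the backslash, so the input really contains \' here.
print(has_tr_string(r'"Hello, I\'m Chris".tr'))
# ".trim" must not count, because \b requires .tr to end at a word boundary.
print(has_tr_string(r'"Hello".trim()'))
```

The regex body is the same as in the answers above; only the string delimiter around it changes from language to language.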

Related

Using regex how do I stop identifiers matching strings

Let's say I have this "code" I want to lex.
var text = 'hello'
Here's my regex.
String: ([a-z\s]+)
Identifier: [a-z]+
Now when I put my code into regexr.com and use the identifier regex, it matches the string as an identifier. How would I stop it from matching strings as identifiers?
What identifies a string? Quotation marks. In your case: single quotes.
Therefore, we want to match the content between quotes as a string. To do so, we can use the following lazy regex:
'.*?'
To allow both quotes, you could use: '.*?'|".*?" or the same with a backreference (['"]).*?\1.
If quotes can be escaped inside strings, it gets even more complicated. I suggest using a recursive regex (this needs an engine with recursion and possessive quantifiers, such as PCRE):
((['"])(?>[^'"\\]++|\\.|(?1))*+\2)
Samples matched:
a = "abc dsfsd", b= ' abc dsfsd'
c ="abc\" dsfsd"
d= "abc\\"
To match any identifiers but the strings you could use:
[a-z]+(?=([^']*['][^']*['])*[^']*$)
(Or here a version that matches both types of quotes: [a-z]+(?=([^'"]*(["'])[^"']*\2)*[^"']*$))
Again, it gets more involved if you want to account for escaped quotes:
[a-z]+(?=([^"'\\]*(\\.|(["'])([^"'\\]*\\.)*[^"'\\]*\3))*[^"']*$)
I hope this helps.
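To see the single-quote lookahead in action, here is a small sketch in Python. It uses the same idea as the first identifier pattern above, with the inner group made non-capturing so that findall returns the identifiers themselves. The names IDENT_OUTSIDE_STRINGS and identifiers are hypothetical.

```python
import re

# An identifier followed by an even number of single quotes up to the end
# of the input, i.e. an identifier that is not inside a quoted string.
IDENT_OUTSIDE_STRINGS = re.compile(r"[a-z]+(?=(?:[^']*'[^']*')*[^']*$)")

def identifiers(code: str) -> list:
    """Return all lowercase identifiers that are outside single-quoted strings."""
    return IDENT_OUTSIDE_STRINGS.findall(code)

print(identifiers("var text = 'hello'"))  # ['var', 'text']
```

Because the lookahead scans to the end of the input for every candidate, this is fine for short snippets but can get slow on large inputs; a real lexer would normally consume string tokens first.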

Regex replace escaped characters

I would like to define a regex pattern which replaces escaped characters with the corresponding value.
For example the string
xy\tz\\x
Should be converted to
xy{tab}z\x
The problem is how to handle things like
xy\\\\\t
this string should become
xy\\{tab}
I don't know how to create a pattern which matches only odd backslashes.
A single fixed substitution cannot do this, so use two passes. First, collapse each pair of backslashes:
s/\\\\/\\/g
This replaces two backslashes with a single one.
Then you can just apply one pattern per escaped character:
s/\\t/\t/g
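The trick here is to escape the backslash you want to replace: this replaces the literal two-character string "\t" with a tab character. One caveat: an escaped backslash followed by a t (\\t in the input) is first collapsed to \t and then wrongly turned into a tab, so this only works when that sequence cannot occur.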
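An alternative that avoids ordering issues between passes, assuming the regex engine offers a replacement callback: match a backslash together with the character after it and decide per match. Below is a minimal sketch in Python. The ESCAPES table and the unescape function are made up for illustration, and leaving unknown escapes untouched is an assumption.

```python
import re

# Hypothetical mapping from the character after the backslash to its value.
ESCAPES = {"t": "\t", "n": "\n", "\\": "\\"}

def unescape(text: str) -> str:
    """Replace backslash escapes in a single left-to-right pass."""
    def replace(match: re.Match) -> str:
        char = match.group(1)
        # Unknown escapes are left exactly as they were (an assumption).
        return ESCAPES.get(char, match.group(0))

    # Each match consumes a backslash plus the next character, so a pair
    # of backslashes is handled as one unit and never scanned again.
    return re.sub(r"\\(.)", replace, text, flags=re.DOTALL)

print(unescape(r"xy\tz\\x") == "xy\tz\\x")   # True
print(unescape(r"xy\\\\\t") == "xy\\\\\t")   # True
```

Since matching proceeds left to right, an odd number of backslashes before t always ends in a \t pair, while an even number leaves the t alone.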

What regex expression will match all characters except ", except when it is \"?

I'm trying to parse an apache log, and I'm having problems with the right syntax for the referer because it is a string inside " (double-quotes), that can also have \" inside it.
"([^"]*)" doesn't work when there is a \" in the string.
How do I start at the 1st double-quote, then take all characters that are not double-quotes, unless it's \", in which case I include it, and keep going?
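You could use this:
"((?:\\"|[^"])*)"
It will match zero or more backslash-double-quote pairs or characters other than a double-quote, all surrounded by double-quotes. In a backtracking engine the \\" alternative has to come first; otherwise [^"] consumes the backslash and the escaped quote is taken as the closing one.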
Could there be other escapes in the string, for example "hello \\"? In that case, you need a more general approach:
"((?:\\.|[^"\\])*)"
How about this? A negative-lookbehind to exclude a \ before the closing "
"(.+?)(?<!\\)"
In basic (grep/sed-style) regex syntax, this will match two quotes with any number of escaped quotes in between:
"\([^"]\|\\"\)*"
First it looks for a quote. Next it searches for zero to infinity of the following:
a non-quote character
a quote character preceded by a backslash
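The more general \\.|[^"\\] approach from the earlier answer can be tried on a log-like line. This is a minimal sketch in Python; the log fragment and the name QUOTED_FIELD are invented for illustration and do not reflect a full Apache log format.

```python
import re

# A double-quoted field made of escaped pairs (backslash + any character)
# or characters that are neither a quote nor a backslash.
QUOTED_FIELD = re.compile(r'"((?:\\.|[^"\\])*)"')

# Made-up log fragment: request, status, size, referer, user agent.
line = r'"GET / HTTP/1.1" 200 512 "http://example.com/?q=\"x\"" "agent"'

fields = QUOTED_FIELD.findall(line)
print(fields)
```

Here the referer comes back as one field with its \" sequences intact, and a field ending in an escaped backslash such as "hello \\" is closed at the right quote.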

Regex - Match string between quotes (") but do not match (\") before the string

I need to match a string that is in quotations, but make sure the first quotation is not escaped.
For example: First \"string\" is "Hello \"World\"!"
Should match only Hello \"World\"!
I am trying to modify (")(?:(?=(\\?))\2.)*?"
I tried adding [^\\"] to ("), and that kinda works, but it matches either only (") or every other letter that isn't (\") and I can't figure out a way to modify ([\\"]") to only match (") if it is not (\")
This is what I have so far ([^\\"]")(?:(?=(\\?))\2.)*?"
I've been trying to figure it out using these two pages, but still cannot get it.
Can Regex be used for this particular string manipulation?
RegEx: Grabbing values between quotation marks
Thanks
You can use negative look behind like this:
(?<!\\)"(.*?)(?<!\\)"
You can see it in action on regex101.
The first match group contains:
Hello \"World\"!
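For a quick check outside regex101, here is the same lookbehind pattern in a short Python sketch. The name UNESCAPED_QUOTED is made up, and the input is the example from the question.

```python
import re

# A quote not preceded by a backslash, the shortest possible content,
# then another quote not preceded by a backslash.
UNESCAPED_QUOTED = re.compile(r'(?<!\\)"(.*?)(?<!\\)"')

text = r'First \"string\" is "Hello \"World\"!"'
match = UNESCAPED_QUOTED.search(text)
print(match.group(1))  # Hello \"World\"!
```

One limitation of the lookbehind approach: it treats any quote after a backslash as escaped, so a string ending in an escaped backslash (\\") would not be closed where expected.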

Regex: finding a string with an undetermined amount of words

I have a tag that is like
tag="text textwithdot. text text"
followed by a further tag that would resemble
tag="text text text"
I wanted to use the following regular expression
tag="\w+"
but that only finds one word. How do I find the whole string within the quotes? What wildcard does that?
This should work for you:
tag="([^"]*)"
That basically means tag=" followed by zero or more characters that are not a double quote, followed by a double quote.
BTW: I'm assuming that there is no such thing as a tag that contains the double quote character. If there is such a thing, it would need some escaping rule applied to it and the regular expression would be more complicated.
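As a small illustration of the negated character class, here is a sketch in Python that pulls out both tag values from the question. The sample string simply joins the two tags from the question on one line, and TAG_VALUE is a hypothetical name.

```python
import re

# tag=" followed by any run of non-quote characters, then the closing quote.
TAG_VALUE = re.compile(r'tag="([^"]*)"')

sample = 'tag="text textwithdot. text text" tag="text text text"'
values = TAG_VALUE.findall(sample)
print(values)  # ['text textwithdot. text text', 'text text text']
```

Because [^"] cannot cross a double quote, each match stops at its own closing quote even when several tags share a line.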
Also,
tag=['"]([^'"]*)['"]
if the tags could be delimited by either ' or "
You could use a non-greedy match-anything pattern:
tag="[\s\S]*?"
Or use . together with the dot-matches-newline flag (assuming \n is a possibility).