I am using this Regex in my Flutter App to find words enclosed by single-quotes that end with a .tr:
r"'[^'\\]*(?:\\.[^'\\]*)*'\s*\.tr\b"
Now I need another expression that is almost the same but looks for words enclosed by double-quotes, ending with .tr, that might contain escaped single-quotes.
I tried simply changing the single quotes to double quotes in the first expression, but Flutter is giving me errors... I need to escape some characters but I cannot make it work. Any idea?
An edge case it should match is:
"Hello, I\'m Chris".tr
You may use this regex for double quoted text that can have any escaped character followed by .tr and word boundary:
r""""[^"\\]*(?:\\.[^"\\]*)*"\s*\.tr\b"""
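You can write the pattern in a single-quoted raw string and put a \ before every " in your RegExp's source (the backslash is optional there, since \" simply matches "). Try this: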
RegExp regExp = new RegExp(r'\"[^\"\\]*(?:\\.[^\"\\]*)*\"\s*\.tr\b');
print("${regExp.hasMatch('"Hello, I\'m Chris".tr')}"); // result = true
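The double-quote pattern from the answers above can also be checked outside Flutter. Here is a minimal sketch in Python (not Dart), assuming Python's `re` module treats this pattern the same way as Dart's RegExp; the sample strings are made up for illustration.

```python
import re

# Same pattern as in the answers, written as a Python raw string.
DOUBLE_QUOTED_TR = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"\s*\.tr\b')

# Hypothetical inputs: the first two should match, the last two should not.
samples = [
    r'"Hello, I\'m Chris".tr',   # escaped single quote inside double quotes
    r'"She said \"hi\"" .tr',    # escaped double quotes, space before .tr
    r'"plain text".trim()',      # .tr is followed by a word character
    r"'single quoted'.tr",       # no double-quoted string at all
]

results = [bool(DOUBLE_QUOTED_TR.search(s)) for s in samples]
print(results)
```

The third sample is rejected by the trailing \b, which is what keeps .trim() from being mistaken for .tr.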
Let's say I have this "code" I want to lex.
var text = 'hello'
Here's my regex.
String: ([a-z\s]+)
Identifier: [a-z]+
Now when I put my code into regexr.com and use the identifier regex, it matches the string as an identifier. How would I stop it from matching strings as identifiers?
What identifies a string? Quotation marks. In your case: single quotes.
Therefore, we want to match the content between quotes as a string. To do so, we can use the following lazy regex:
'.*?'
To allow both quotes, you could use: '.*?'|".*?" or the same with a backreference (['"]).*?\1.
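As a rough illustration of the lazy and backreference patterns, here is a small Python sketch. The input line and the variable names are made up for the example.

```python
import re

# Hypothetical input with one single-quoted and one double-quoted string.
code = '''a = 'hello'; b = "world"'''

# Lazy match between single quotes only.
single = re.compile(r"'.*?'")

# Backreference version: the closing quote must equal the opening one.
either = re.compile(r"""(['"]).*?\1""")

single_quoted = single.findall(code)
any_quoted = [m.group(0) for m in either.finditer(code)]

print(single_quoted)
print(any_quoted)
```

finditer is used for the second pattern because findall would return only the captured quote character rather than the whole string literal.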
If quotes can be escaped inside strings, it gets even more complicated. I suggest using a recursive regex (PCRE syntax) to do so:
((['"])(?>[^'"\\]++|\\.|(?1))*+\2)
Samples matched:
a = "abc dsfsd", b= ' abc dsfsd'
c ="abc\" dsfsd"
d= "abc\\"
To match any identifiers but the strings you could use:
[a-z]+(?=([^']*['][^']*['])*[^']*$)
(Or here a version that matches both types of quotes: [a-z]+(?=([^'"]*(["'])[^"']*\2)*[^"']*$))
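To see what the single-quote lookahead does, here is a minimal Python sketch run against the line from the question. The lookahead only succeeds when an even number of quotes remains before the end of the input, which is what places the identifier outside a string.

```python
import re

# Identifier pattern from the answer: letters followed by an even
# number of single quotes up to the end of the input.
IDENT_OUTSIDE_QUOTES = re.compile(r"[a-z]+(?=([^']*['][^']*['])*[^']*$)")

code = "var text = 'hello'"

identifiers = [m.group(0) for m in IDENT_OUTSIDE_QUOTES.finditer(code)]
plain = re.findall(r"[a-z]+", code)  # the original identifier regex

print(identifiers)
print(plain)
```

The plain pattern also picks up hello, while the lookahead version skips it.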
Again, it gets more involved if you want to account for escaped quotes:
[a-z]+(?=([^"'\\]*(\\.|(["'])([^"'\\]*\\.)*[^"'\\]*\3))*[^"']*$)
I hope this helps.
I would like to define a regex pattern which replaces escaped characters with the corresponding value.
For example the string
xy\tz\\x
Should be converted to
xy{tab}z\x
The problem is how to handle things like
xy\\\\\t
this string should become
xy\\{tab}
I don't know how to create a pattern which matches only odd backslashes.
This is hard to accomplish with a single plain pattern-and-replacement pair. To start, strip out collections of backslashes:
s/\\\\/\\/g
This replaces two backslashes with a single one.
Then you can just apply one pattern per escaped character:
s/\\t/\t/g
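The trick here is to escape the backslash you want to replace. This replaces the literal two-character string "\t" with a tab character. One caveat: because the backslashes were collapsed first, input that contained an escaped backslash followed by a plain t (\\t) is also turned into a tab by this step.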
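One way to sidestep ordering problems between separate substitutions is a single pass that always consumes a backslash together with the character after it, so runs of backslashes pair up from the left. The sketch below is in Python rather than sed syntax; `ESCAPES` and `unescape` are illustrative names, and the table covers only a few escapes as an assumption.

```python
import re

# Assumed escape table; extend it with whatever escapes you need.
ESCAPES = {"t": "\t", "n": "\n", "\\": "\\"}

def unescape(text):
    """Replace each backslash + character pair in one left-to-right pass."""
    def replace(match):
        char = match.group(1)
        # Unknown escapes are left untouched.
        return ESCAPES.get(char, match.group(0))
    return re.sub(r"\\(.)", replace, text)

first = unescape(r"xy\tz\\x")    # one escaped tab, one escaped backslash
second = unescape(r"xy\\\\\t")   # two escaped backslashes, then a tab
third = unescape(r"a\\t")        # escaped backslash followed by a plain t
```

Because each match swallows two characters, an odd trailing backslash before t is the only one that can pair with it, which is the "odd backslashes" condition from the question.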
I'm trying to parse an apache log, and I'm having problems with the right syntax for the referer because it is a string inside " (double-quotes), that can also have \" inside it.
"([^"]*)" doesn't work when there is a \" in the string.
How do I start at the 1st double-quote, then take all characters that are not double-quotes, unless it's \", in which case I include it, and keep going?
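You could use this:
"((?:\\"|[^"])*)"
It will match zero or more items, each being either a backslash-double-quote pair or any single character other than a double-quote, all surrounded by double-quotes. The escaped pair is listed first so that the backslash is not consumed on its own, which would let the escaped quote end the match early.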
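In a backtracking engine the order of the two alternatives matters: if [^"] is tried first it swallows the backslash, and the escaped quote is then accepted as the closing quote. A minimal Python check of both orderings, using a made-up log fragment:

```python
import re

# Hypothetical referer field containing escaped double quotes.
line = r'"http://example.com/?q=\"x\"" 200'

# Plain character class first: the backslash is consumed by [^"].
char_first = re.compile(r'"((?:[^"]|\\")*)"')

# Escaped pair first: \" is consumed as a unit.
escape_first = re.compile(r'"((?:\\"|[^"])*)"')

short = char_first.search(line).group(1)
full = escape_first.search(line).group(1)

print(short)
print(full)
```

The first pattern stops at the first escaped quote, while the second captures the whole field.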
Could there be other escapes in the string, for example "hello \\"? In that case, you need a more general approach:
"((?:\\.|[^"\\])*)"
How about this? A negative-lookbehind to exclude a \ before the closing "
"(.+?)(?<!\\)"
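This will match two quotes with any number of escaped quotes in between (written in the basic-regex syntax of GNU grep and sed, where grouping and alternation are spelled \( \) and \|):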
"\([^"]\|\\"\)*"
First it looks for a quote. Next it matches zero or more of the following:
a non-quote character
a quote character preceded by a backslash
I need to match a string that is in quotations, but make sure the first quotation is not escaped.
For example: First \"string\" is "Hello \"World\"!"
Should match only Hello \"World\"!
I am trying to modify (")(?:(?=(\\?))\2.)*?"
I tried adding [^\\"] to ("), and that kinda works, but it matches either only (") or every other letter that isn't (\") and I can't figure out a way to modify ([\\"]") to only match (") if it is not (\")
This is what I have so far ([^\\"]")(?:(?=(\\?))\2.)*?"
I've been trying to figure it out using these two pages, but still cannot get it.
Can Regex be used for this particular string manipulation?
RegEx: Grabbing values between quotation marks
Thanks
You can use negative look behind like this:
(?<!\\)"(.*?)(?<!\\)"
See it in action on regex101.
The first match group contains:
Hello \"World\"!
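The lookbehind pattern can be tried directly in Python, whose `re` module supports fixed-width lookbehind. This is only a sketch against the example sentence from the question:

```python
import re

# A quote counts as a delimiter only if no backslash precedes it.
UNESCAPED_QUOTED = re.compile(r'(?<!\\)"(.*?)(?<!\\)"')

text = r'First \"string\" is "Hello \"World\"!"'

match = UNESCAPED_QUOTED.search(text)
content = match.group(1)
print(content)
```

Note that this treats any quote after a backslash as escaped, so a string ending in an escaped backslash (\\ followed by the closing quote) would not be handled.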
I have a tag that is like
tag="text textwithdot. text text"
followed by a further tag that would resemble
tag="text text text"
I wanted to use the following regular expression
tag="\w+"
but that only finds one word. How do I find the whole string within the quotes? What wildcard does that?
This should work for you:
tag="([^"]*)"
That basically means tag=" followed by zero or more characters that are not a double quote, followed by a double quote.
BTW: I'm assuming that there is no such thing as a tag that contains the double quote character. If there is such a thing, it would need some escaping rule applied to it and the regular expression would be more complicated.
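A quick way to check the negated character class is to run it over both tags from the question. A minimal Python sketch, assuming the two tags sit on one line:

```python
import re

# Everything up to the next double quote is captured as the tag value.
TAG_VALUE = re.compile(r'tag="([^"]*)"')

text = 'tag="text textwithdot. text text" tag="text text text"'

values = TAG_VALUE.findall(text)
# The original \w+ attempt cannot cross spaces or dots inside the quotes.
word_only = re.findall(r'tag="\w+"', text)

print(values)
print(word_only)
```

Since [^"] cannot run past a closing quote, each match stays inside its own tag even though the quantifier is greedy.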
Also,
tag=['"]([^"]*)['"]
if the tags could use either ' or " as the quote character.
You could use an ungreedy match everything.
tag="[\s\S]*?"
Or use . with the "dot matches newline" flag (assuming \n is a possibility).