Matching single or double quoted strings in Vim - regex

I am having a hard time trying to match single or double quoted strings with Vim's
regular expression engine.
The problem is that I am assigning the regular expression to a variable and then using that
to play with matchlist.
For example, let's assume I know I am in a line that contains a quoted string and I want to match it:
let regex = '\v"(.*)"'
That would work to match anything that is double-quoted. Similarly, this would match single quoted strings:
let regex = "\v'(.*)'"
But if I try to use them both, like:
let regex = '\v['|"](.*)['|"]'
or
let regex = '\v[\'|\"](.*)[\'|\"]'
Then Vim doesn't know how to deal with it because it thinks that some quotes are not being closed in the actual variable definition and it messes up the regular expression.
What would be the best way to catch single or double quoted strings with a regular expression?
Maybe (probably!) I am missing something really simple to be able to use both quotes and not worry about the surrounding quotes for the actual regular expression.
Note that I prefer single quotes for regular expressions because that way I do not need to double every backslash when escaping.

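You need to use a backreference, so that the closing quote has to be the same character as the opening one. Inside a single-quoted Vim string, a literal single quote is written by doubling it (''). Like so: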
let regex = '\([''"]\)\(.\{-}\)\1'
Or with very magic:
let regex = '\v([''"])(.{-})\1'
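Alternatively, you could use alternation, which does not spend a sub-match on the quote character (the contents end up in sub-match 1 for double quotes or sub-match 2 for single quotes):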
let regex = '\%("\([^"]*\)"\|''\([^'']*\)''\)'
or with very magic:
let regex = '\v%("([^"]*)"|''([^'']*)'')'
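The same two ideas (a backreference to the opening quote, or an alternation with one group per quote style) can be tried outside Vim. Here is a minimal sketch in Python rather than Vim script; the names first_quoted and first_quoted_alt are made up for illustration, and Python's .*? plays the role of Vim's .\{-}.

```python
import re

# Group 1 captures the opening quote; \1 requires the same quote to close it.
# (.*?) is non-greedy, like Vim's .\{-}.
BACKREF = re.compile(r"""(['"])(.*?)\1""")

# Alternation: one branch per quote style, each with its own capture group.
ALTERNATION = re.compile(r'"([^"]*)"' + r"|'([^']*)'")


def first_quoted(line):
    """Return the contents of the first quoted string in line, or None."""
    m = BACKREF.search(line)
    return m.group(2) if m else None


def first_quoted_alt(line):
    """Same idea using the alternation pattern."""
    m = ALTERNATION.search(line)
    if not m:
        return None
    # Only one of the two groups takes part in any given match.
    return m.group(1) if m.group(1) is not None else m.group(2)
```

With the backreference version the contents are always in group 2; with the alternation version you have to check which group matched.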

Take a look at this post:
Replacing quote marks around strings in Vim?
It might help in some way.

This is a working script I wrote to highlight quoted strings:
syntax region myString start=/\v"/ skip=/\v(\\[\\"]){-1}/ end=/\v"/
syntax region myString start=/\v'/ end=/\v'/
The skip pattern \v(\\[\\"]){-1} keeps an escaped backslash or an escaped double quote from ending the region.

Related

Regular expression and extracting a value

I need to get hold of the |value| below:
{"token":"<input name=\"__RequestVerificationToken\" type=\"hidden\" value=\"KhWUxVIL697p18Gm3T1b4pCmXjK7iQujsJieYiLOKcKmKbdvC55kgaqg4G-uGqeUzmV3x6EMAV_ejPHe-Ok2kFqnjzVmvZmHySMpwKzGvq01\" />"}
What kind of regular expression would match that?
I have tried to use this:
.check(regex("input name='__RequestVerificationToken' type='hidden' value='([A-Za-z0-9+=/'-'_]+?)'").saveAs("token")))
But it does not match.
Also using a regex tester does not get me anywhere, please help me.
I would use something like this:
regex("<input.+__RequestVerificationToken.+value=\\\\?[\"'](.+?)\\\\?[\"'].+>")
It can be made shorter, but I was not sure what the actual string looks like (whether it contains the escape characters, and whether it uses single or double quotes).
Assuming that the string in your question is exactly the way it appears, with escaped double quotes (\") etc.,
here is the code:
val regexGroupExtractor = """.*value=\\"(.*)\\".*""".r
val regexGroupExtractor(e) = s
// e == "KhWUxVIL697p18Gm3T1b4pCmXjK7iQujsJieYiLOKcKmKbdvC55kgaqg4G-uGqeUzmV3x6EMAV_ejPHe-Ok2kFqnjzVmvZmHySMpwKzGvq01"
In general with regex it is often helpful to think of the pattern in reverse: instead of specifying what is included, specify what is not. In your case there is no need to list which characters are allowed inside the (); instead, focus on where the part you want starts and ends. In your example the quotes are outside the string you want; in fact they are exactly its edges, so my regex captures whatever is between them.
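To make the "match the edges, not the contents" idea concrete, here is a small sketch in Python rather than Scala. The function name extract_token and the short token in the usage note are made up for illustration; the pattern accepts the quote with or without a preceding backslash and captures everything up to the next quote or backslash.

```python
import re

# Edges: value= then an optional backslash and a double quote.
# Contents: anything that is neither a double quote nor a backslash.
TOKEN = re.compile(r'value=\\?"([^"\\]+)\\?"')


def extract_token(body):
    """Return the text of the value attribute, or None if it is absent."""
    m = TOKEN.search(body)
    return m.group(1) if m else None
```

For example, extract_token(r'value=\"abc-123_XYZ\" />') gives abc-123_XYZ, and the same pattern also handles an unescaped value="..." attribute.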

how to form a regex that will parse the following?

I need to parse the following line:
Action(X,X,Cash(50))Action(Y,Y,Material(30,Car,2))Action(I,I,Cash(50))
The output should look like:
Action(X,X,Cash(50))
Action(Y,Y,Material(30,Car,2))
Action(I,I,Cash(50))
The regex I used is:
String tokenRegex = "(Action+\\(([a-zA-Z]+|\\,|([a-zA-Z]+\\(\\d*|[a-zA-Z]+|\\,)\\))+\\))";
It fails to parse "Action(Y,Y,Material(30,Car,2))" but works for "Action(X,X,Cash(50))".
What am I doing wrong? What would be the correct regex?
I think this does it:
String tokenRegex = "(Action\\([a-zA-Z]+,[a-zA-Z]+,[a-zA-Z]+\\(\\d+(,([a-zA-Z]+|\\d+))*\\)\\))";
I removed some of the parentheses that weren't needed for grouping in the regular expression. If you need them for capturing parts of the expression, you'll have to add them back.
Instead of using a regex just do something to the effect of
string.replace(")A", ")\nA");
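If you would rather get the pieces as a list than insert newlines, a lookahead can mark where one Action ends and the next begins. This is a sketch in Python rather than Java; split_actions is a made-up name, and it assumes every record starts with the literal text Action( as in the sample line.

```python
import re

# Match from "Action(" lazily up to a ")" that is followed either by the
# start of the next record or by the end of the input.
ACTION = re.compile(r"Action\(.*?\)(?=Action\(|$)")


def split_actions(line):
    """Return each Action(...) record in line as a separate string."""
    return ACTION.findall(line)
```

Like the replace approach, this does not validate what is inside each record; it only finds the boundaries.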

regular expression ~ extract string

I need to use a regular expression to extract part of the following string:
console.log("This can be anything except double quote"),
followed by comma and any other string
and the extraction output is
console.log("This can be anything except double quote"),
Note that the sample string should not be read literally: "can be anything" stands for any random string or symbols, e.g.
~!##$%^&*
Any idea what the right regular expression is for this case?
Use a regex for a quoted string that allows escaped quotes:
(console\.log\("(?:[^"\\]|\\.)*"\),)
There are many solutions to this.
The simplest I could think of: (console.log[^,]+)
PS: This removes the comma at the end of the console statement. You can manually add that.
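To compare the two suggestions, here is a sketch in Python; extract_log_call is a made-up name and the sample text is invented. The first pattern is the escaped-quote one from above. The second is the simple [^,]+ idea, which stops at the first comma and so cuts the match short when the quoted text itself contains a comma.

```python
import re

# Quoted string that may contain escaped characters such as \" inside it.
STRICT = re.compile(r'console\.log\("(?:[^"\\]|\\.)*"\),')

# Simple version: everything up to (not including) the first comma.
SIMPLE = re.compile(r'console.log[^,]+')


def extract_log_call(text):
    """Return the console.log("..."), call including the comma, or None."""
    m = STRICT.search(text)
    return m.group(0) if m else None
```

The simple version is fine only if you know the quoted text never contains a comma.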

Need to extract text from within first curly brackets

I have strings that look like this
{/CSDC} CHOC SHELL DIP COLOR {17}
I need to extract the value in the first pair of curly brackets. In the above example it would be
/CSDC
So far I have this code, which is not working:
Dim matchCode = Regex.Matches(txtItems.Text, "/\{(.+?)\}/")
Dim itemCode As String
If matchCode.Count > 0 Then
itemCode = matchCode(0).Value
End If
I think the main issue here is that you are confusing your regular expression syntax between different languages.
In languages like JavaScript, Perl, Ruby and others, you create a regular expression object by using the /regex/ notation.
In .NET, when you instantiate a Regex object, you pass it a string of the regular expression, which is delimited by quotes, not slashes. So it is of the form "regex".
So try removing the leading and trailing / from your string and see how you go.
This may not be the whole problem, but it is at least part of it.
Are you getting the whole string instead of just the first value? Regular expressions are greedy by default, so .NET tries to grab the largest matching string.
Try this:
Dim matchCode = Regex.Matches(txtItems.Text, "\{([^}]*)\}")
Dim itemCode As String
If matchCode.Count > 0 Then
' Groups(0) is the whole match including the braces; Groups(1) is the text inside them
itemCode = matchCode(0).Groups(1).Value
End If
Edited: I've tried this in Linqpad and it worked.
It appears you are using a capture group, so try matchCode(0).Groups(1).Value (Groups(0) is the whole match, braces included).
Also, remove the leading / and the trailing / from the pattern.
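The difference between the whole match and the capture group is easy to see in a short sketch. This one is in Python rather than VB.NET, and first_braced is a made-up name; group(0) corresponds to .Value on the match and group(1) to Groups(1).Value.

```python
import re

# [^}]* cannot run past the first closing brace, so greediness is not an issue.
FIRST_BRACES = re.compile(r"\{([^}]*)\}")


def first_braced(text):
    """Return the text inside the first {...} pair, or None if there is none."""
    m = FIRST_BRACES.search(text)
    return m.group(1) if m else None
```

For the sample string, the whole match is {/CSDC} and the capture group is /CSDC.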

Regular Expression Question regarding search&replace

I'm trying to match cases with regular expression to search and replace some text of given pattern. I can match the pattern, but I'd like to keep some of the literals when replacing.
For example, from the string "abcd123," I'd like to keep abcd but remove 123. I can match the pattern using a simple regular expression like [a-zA-Z0-9]+, but when I want to replace it, I don't know what to use for the replacement. Is this even possible with just regular expressions?
Thanks a lot.
The answer depends on what language/regex engine you are using. You typically use parentheses to save sections matched and either $1, $2, ... or \1, \2, ... in the replacement string to refer to those sections.
For example, from JavaScript:
var x = "Hello World";
x.replace( /([A-Z])\w+/g, '$1xx' );
// "Hxx Wxx"
What language or text editor are you using?
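For the abcd123 case specifically, the idea is to capture the part you want to keep and match (but not capture) the part you want to drop. A sketch in Python, where backreferences in the replacement are written \1; strip_trailing_digits is a made-up name, and the pattern assumes "keep the letters, drop the digits that directly follow them".

```python
import re


def strip_trailing_digits(text):
    """Keep each run of letters and drop the digits that follow it."""
    # Group 1 is the letters; the digits are matched but left out of the replacement.
    return re.sub(r"([a-zA-Z]+)[0-9]+", r"\1", text)


def initials_xx(text):
    """Python counterpart of the JavaScript example above."""
    return re.sub(r"([A-Z])\w+", r"\1xx", text)
```

Here strip_trailing_digits("abcd123,") returns "abcd," and initials_xx("Hello World") returns "Hxx Wxx".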