I have a batch file that I need to extract switches from.
The switches are in this format.
/Switch1=Value1 /Switch2="Value 2" /Switch3 /Switch4="C:\Program Files\DIR"
I need to extract Switch=Value, or just Switch when it has no value (e.g. Switch3).
I am a beginner with regex. So far I have tried the expression \/\w+=|\/\w+, but that doesn't give me the value.

Seems like you want this,

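Not much information, but here is something in Perl to get you going (it reads the batch file on standard input and prints every switch on each line):
perl -ne 'while (/\/(\w+)(?:=("[^"]*"|\S+))?/g) { print defined $2 ? "$1=$2\n" : "$1\n" }'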

You could use a lookbehind that searches for "Switch" and a lookahead for the next slash. You will have to trim the values afterwards, but you will get them.

It can get hairy to parse a command line with switches.
Something like below.
# /([^ =]+)(?:=(?|"((?:[^"\\]*(?:\\.|[^"\\]*)*))"|([^ ]*)))?
 /
 ( [^ =]+ )                    # (1)
 (?:
      =
      (?|
           "
           (                             # (2 start)
                (?:
                     [^"\\]*
                     (?:
                          \\ .
                       |  [^"\\]*
                     )*
                )
           )                             # (2 end)
           "
        |
           ( [^ ]* )                     # (2)
      )
 )?
** Grp 0 - ( pos 0 , len 15 )
/Switch1=Value1
** Grp 1 - ( pos 1 , len 7 )
Switch1
** Grp 2 - ( pos 9 , len 6 )
Value1
** Grp 0 - ( pos 16 , len 18 )
/Switch2="Value 2"
** Grp 1 - ( pos 17 , len 7 )
Switch2
** Grp 2 - ( pos 26 , len 7 )
Value 2
** Grp 0 - ( pos 35 , len 8 )
/Switch3
** Grp 1 - ( pos 36 , len 7 )
Switch3
** Grp 2 - NULL
** Grp 0 - ( pos 44 , len 31 )
/Switch4="C:\Program Files\DIR"
** Grp 1 - ( pos 45 , len 7 )
Switch4
** Grp 2 - ( pos 54 , len 20 )
C:\Program Files\DIR
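
For comparison, here is a minimal Python sketch of the same idea. It is not the pattern from the answer above: it uses a simpler pattern that assumes quoted values contain no escaped quotes. The names SWITCH_RE and parse_switches are made up for this example.

```python
import re

# Assumed pattern: a slash, a switch name, then an optional =value that is
# either double-quoted (may contain spaces) or a run of non-space characters.
SWITCH_RE = re.compile(r'/(\w+)(?:=(?:"([^"]*)"|(\S+)))?')

def parse_switches(command_line):
    """Return (name, value) pairs; value is None for a bare switch."""
    result = []
    for m in SWITCH_RE.finditer(command_line):
        quoted, bare = m.group(2), m.group(3)
        value = quoted if quoted is not None else bare
        result.append((m.group(1), value))
    return result

line = r'/Switch1=Value1 /Switch2="Value 2" /Switch3 /Switch4="C:\Program Files\DIR"'
switches = parse_switches(line)
for name, value in switches:
    # Print Switch=Value, or just Switch when there is no value.
    print(name if value is None else f"{name}={value}")
```

The quotes around a value are dropped here; if you need them kept, capture the quoted alternative including its quote characters instead.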


Iterate through captures with boost::regex

I have a regular expression to capture three fields in an HTML tag using boost::regex.
So, from
I get
Porky%E2%80%99s" title="Porky’s – German" lang="de" hreflang="de"
But I'd like to have {de, Porky%E2%80%99s, Deutsch} instead.
How can I make my regex stop matching the second field as soon as it finds the first whitespace?
I tried
So the second field matches everything but whitespace, but I get this crash report:
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::runtime_error> >'
what(): Ran out of stack space trying to match the regular expression.
This might work -
I would use this instead -
( .{1,3}? ) # (1)
( [^\s>"]* ) # (2)
( .*? ) # (3)
** Grp 0 - ( pos 9 , len 98 )
//" title="Porky’s – German" lang="de" hreflang="de">Deutsch<
** Grp 1 - ( pos 11 , len 2 )
** Grp 2 - ( pos 33 , len 15 )
** Grp 3 - ( pos 99 , len 7 )

My regular expression won't match and I can't identify why

Here is an example of the text I am trying to match within a scalar:
1 N [51]Gone Girl [52]Fox $37,513,109 - 3,014 - $12,446 $37,513,109 $61 1
2 N [53]Annabelle [54]WB (NL) $37,134,255 - 3,185 - $11,659 $37,134,255 $6.5 1
3 1 [55]The Equalizer [56]Sony $18,750,375 -45.1% 3,236 - $5,794 $64,236,992 $55 2
4 3 [57]The Boxtrolls [58]Focus $11,979,588 -30.7% 3,464 - $3,458 $32,093,796 $60 2
5 2 [59]The Maze Runner [60]Fox $11,634,764 -33.3% 3,605 -33 $3,227 $73,556,159 $34 3
6 N [61]Left Behind (2014) [62]Free $6,300,147 - 1,825 - $3,452 $6,300,147 $16 1
7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% 2,735 -133 $1,466 $29,012,573 $19.8 3
8 5 [65]Dolphin Tale 2 [66]WB $3,422,377 -28.5% 2,790 -586 $1,227 $37,866,130 $36 4
Here is the regular expression I was using that won't seem to match up. Can anyone identify why?
if ($allData =~ /(\d+)\s+(\d+|[N])\s+(\[\d+\])(.+)\s+(\[\d+\])(.+)\s+(\$\.+)\s+(\-|\+\d+\.\d+%|\-\d+\.\d+%)\s+(\d+)\s+(\-\d+|\-|\+\d+)\s+(\$\.+)\s+(\$\.+)\s+(\.+)\s+(\d+)/g) {
    $current[$i] = $1;
    $last[$i] = $2;
    $title[$i] = $4;
    $week[$i] = $7;
    $cume[$i] = $12;
    printf("%-4s%-4s%-35s%-10s%-10s", $current[$i], $last[$i], $title[$i], $week[$i], $cume[$i]);
    if ($last[$i] ne '-'){
        $gain = $last[$i] - $current[$i];
        if ($gain < $bigloss){
            $bigloss = $gain;
            $losstitle = $title[$i];
        }
        if ($gain > $biggain){
            $biggain = $gain;
            $gaintitle = $title[$i];
        }
    }
    if ($last[$i] eq '-'){
        if ($current[$i] < $bigdebut){
            $bigdebut = $current[$i];
            $bigdebuttitle = $title[$i];
        }
        if ($current[$i] > $weakdebut){
            $weakdebut = $current[$i];
            $weakdebuttitle = $title[$i];
        }
    }
}
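Could be the fix. In your pattern \.+ matches one or more literal dots, not "any characters", so \$\.+ never matches a dollar amount; use \$.+? instead. The theater count contains a comma, so (\d+) has to become ([\d,]+) -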
# /(\d+)\s+(\d+|[N])\s+(\[\d+\])(.+?)\s+(\[\d+\])(.+?)\s+(\$.+?)\s+(\-|\+\d+\.\d+%|\-\d+\.\d+%)\s+([\d,]+)\s+(\-\d+|\-|\+\d+)\s+(\$.+?)\s+(\$.+?)\s+(.+?)\s+(\d+)/g
 ( \d+ )                       # (1)
 \s+
 ( \d+ | [N] )                 # (2)
 \s+
 ( \[ \d+ \] )                 # (3)
 ( .+? )                       # (4)
 \s+
 ( \[ \d+ \] )                 # (5)
 ( .+? )                       # (6)
 \s+
 ( \$ .+? )                    # (7)
 \s+
 (                             # (8 start)
      \-
   |  \+ \d+ \. \d+ %
   |  \- \d+ \. \d+ %
 )                             # (8 end)
 \s+
 ( [\d,]+ )                    # (9)
 \s+
 ( \- \d+ | \- | \+ \d+ )      # (10)
 \s+
 ( \$ .+? )                    # (11)
 \s+
 ( \$ .+? )                    # (12)
 \s+
 ( .+? )                       # (13)
 \s+
 ( \d+ )                       # (14)
Output sample:
** Grp 0 - ( pos 506 , len 98 )
7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% 2,735 -133 $1,466 $29,012,573 $19.8 3
** Grp 1 - ( pos 506 , len 1 )
7
** Grp 2 - ( pos 508 , len 1 )
4
** Grp 3 - ( pos 510 , len 4 )
[63]
** Grp 4 - ( pos 514 , len 25 )
This is Where I Leave You
** Grp 5 - ( pos 540 , len 4 )
[64]
** Grp 6 - ( pos 544 , len 2 )
WB
** Grp 7 - ( pos 547 , len 10 )
$4,009,345
** Grp 8 - ( pos 558 , len 6 )
-41.8%
** Grp 9 - ( pos 565 , len 5 )
2,735
** Grp 10 - ( pos 571 , len 4 )
-133
** Grp 11 - ( pos 578 , len 6 )
$1,466
** Grp 12 - ( pos 585 , len 11 )
$29,012,573
** Grp 13 - ( pos 597 , len 5 )
$19.8
** Grp 14 - ( pos 603 , len 1 )
3
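
If you want to check the corrected pattern outside Perl, here is a rough Python sketch that runs it against one row from the question. The variable names are only for illustration, and the row is typed with single spaces between columns.

```python
import re

# The corrected pattern from the answer above, split across lines for readability.
FIXED = re.compile(
    r'(\d+)\s+(\d+|[N])\s+(\[\d+\])(.+?)\s+(\[\d+\])(.+?)\s+(\$.+?)\s+'
    r'(\-|\+\d+\.\d+%|\-\d+\.\d+%)\s+([\d,]+)\s+(\-\d+|\-|\+\d+)\s+'
    r'(\$.+?)\s+(\$.+?)\s+(.+?)\s+(\d+)'
)

row = ('7 4 [63]This is Where I Leave You [64]WB $4,009,345 -41.8% '
       '2,735 -133 $1,466 $29,012,573 $19.8 3')

# Two pieces of the original pattern that could never match this data:
dots_only = re.search(r'\$\.+', '$4,009,345')   # needs literal dots after the $
no_commas = re.fullmatch(r'\d+', '2,735')       # \d+ does not allow the comma

m = FIXED.search(row)
# Same groups the Perl code reads: rank, last week, title, weekend gross, cume.
current, last, title, week, cume = m.group(1, 2, 4, 7, 12)
print(current, last, title, week, cume)
```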
Try this regex:

Parse Maven Filename

How can I parse a Maven filename into the artifact and the version?
The filenames look like this:
I need to get
So the artifactId is the text before the first instance of a dash and a number and the version is the text after the first instance of a number up to .jar.
I could probably do it with split and several loops and checks but it feels like there should be a simpler way.
Actually, the regex wasn't as complicated as I thought!
new File("test").eachFile() { file ->
    //Drop the extension
    String fileName = file.name[0..file.name.lastIndexOf('.') - 1]
    //Split at the first instance of a dash and a number
    def split = fileName.split("-[\\d]")
    String artifactId = split[0]
    String version = fileName.substring(artifactId.length() + 1, fileName.length())
}
EDIT2: Hmm. It fails on examples such as this:
Basically it's just this: ^(.+?)-(\d.*?)\.jar$
Use it in multi-line mode if there is more than one line.
 ^
 ( .+? )                       # (1)
 -
 ( \d .*? )                    # (2)
 \. jar
 $
** Grp 0 - ( pos 0 , len 29 )
** Grp 1 - ( pos 0 , len 9 )
** Grp 2 - ( pos 10 , len 15 )
** Grp 0 - ( pos 31 , len 22 )
** Grp 1 - ( pos 31 , len 11 )
** Grp 2 - ( pos 43 , len 6 )
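
As a quick check of that pattern, here is a small Python sketch. The file names are invented for the example (the ones from the question are not shown), and split_jar_name is a hypothetical helper.

```python
import re

# Same pattern as above: lazy artifact id, a dash, then a version that starts with a digit.
JAR_RE = re.compile(r'^(.+?)-(\d.*?)\.jar$')

def split_jar_name(file_name):
    """Return (artifact_id, version), or None if the name does not fit."""
    m = JAR_RE.match(file_name)
    if m is None:
        return None
    return m.group(1), m.group(2)

# Hypothetical file names, only to exercise the pattern.
examples = ['my-artifact-1.2.3.jar', 'some-lib-2.0-SNAPSHOT.jar', 'notes.txt']
results = [split_jar_name(name) for name in examples]
for name, parsed in zip(examples, results):
    print(name, '->', parsed)
```

Because the first group is lazy, the split happens at the first dash that is followed by a digit, so an artifact id that itself contains "-<digit>" would be cut short.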

Regex to extract pattern from text

I have a string that contains a bunch of function calls within it. I need to extract every occurrence of the VariableSet function call. Functions can appear in any order. Here is an example:
parsedExpression = "VariableSet(b, 999)If(a = 0,"Black",SetColor(a,b,c))VariableSet("a" ,1.573) VariableSet( c,-2387)"
I need to find every match that starts with "VariableSet(" and ends with the first close parenthesis that follows it. So, for the example above, I need a list like this:
VariableSet(b, 999)
VariableSet("a" ,1.573)
VariableSet( c,-2387)
I planned to use the code below but I have not been able to determine the correct regex pattern. The best I could come up with is "VariableSet\(.*(?i:\)\b)" but it does not produce the list above.
Dim matches As MatchCollection = Regex.Matches(parsedExpression, "VariableSet\(.*(?i:\)\b)")
' Loop over matches.
For Each m As Match In matches
    ' Loop over captures.
    For Each c As Capture In m.Captures
        Dim varName As String = ""
        Dim varValue As String = ""
        Dim firstCommaPosition As Integer
        'For every VariableSet that was found do the following:
        'Parse the captured string to get the variable name and value
        varName = c.Value.Replace("VariableSet(", "").Replace(")", "")
        firstCommaPosition = varName.IndexOf(",")
        varValue = varName.Substring(firstCommaPosition + 1)
        varName = varName.Substring(0, firstCommaPosition).Replace("""", "")
        'Set the variable
        ce.Variables(varName) = ce.Evaluate(varValue)
        'Remove this instance of VariableSet() function from parsedExpression
        parsedExpression = parsedExpression.Replace(c.Value, "")
    Next
Next
I would greatly appreciate it if someone could provide the correct regex pattern.
Maybe this will help you:
Dim strMatch As String = ""
Dim strVar1 As String = ""
Dim strVar2 As String = ""
Dim strExpression As String = "VariableSet(b, 999)If(a = 0,""Black"",SetColor(a,b,c))VariableSet(""a"" ,1.573) VariableSet( c,-2387)"
Dim rx As New RegularExpressions.Regex("VariableSet\((?<V1>.*?),(?<V2>.*?)\)", RegularExpressions.RegexOptions.IgnoreCase)
Dim rxMatch As RegularExpressions.MatchCollection = rx.Matches(strExpression)
For intI As Integer = 0 To rxMatch.Count - 1
    strMatch = rxMatch(intI).Value 'VariableSet(b, 999)
    strVar1 = rxMatch(intI).Groups("V1").ToString 'b
    strVar2 = rxMatch(intI).Groups("V2").ToString ' 999
Next
VariableSet\([^)]*\) should be a direct replacement.
If you wanted to get fancy, all your code could be done using a single regex.
# VariableSet\((\s*"?\s*([^,")]*?)\s*"?\s*(?:,\s*"?\s*([^,")]*?)\s*"?\s*)?)\)
 VariableSet
 \(                            # Open paren
 (                             # (1 start), Inside paren's
      \s* "? \s*
      ( [^,")]*? )                  # (2), Var
      \s* "? \s*
      (?:
           ,                             # Comma
           \s* "? \s*
           ( [^,")]*? )                  # (3), Value
           \s* "? \s*
      )?
 )                             # (1 end)
 \)                            # Close paren
Example input string:
VariableSet(b, 999)
VariableSet("a" ,1.573)
VariableSet( c,-2387)
VariableSet( , 999)
VariableSet( "aadsfasdf")
VariableSet( )
Output matches ( Var / Value ):
** Grp 2 - ( pos 12 , len 1 )
b
** Grp 3 - ( pos 16 , len 3 )
999
** Grp 2 - ( pos 35 , len 1 )
a
** Grp 3 - ( pos 40 , len 5 )
1.573
** Grp 2 - ( pos 63 , len 1 )
c
** Grp 3 - ( pos 65 , len 5 )
-2387
** Grp 2 - ( pos 86 , len 0 ) EMPTY
** Grp 3 - ( pos 88 , len 3 )
999
** Grp 2 - ( pos 108 , len 9 )
aadsfasdf
** Grp 3 - NULL
** Grp 2 - ( pos 136 , len 0 ) EMPTY
** Grp 3 - NULL
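
To see the simpler VariableSet\([^)]*\) pattern at work, here is a minimal Python sketch of the same extract, parse and remove steps. It assumes the arguments never contain a closing parenthesis, keeps the values as plain strings (the ce.Evaluate step is not modelled), and the variable names are made up for the example.

```python
import re

parsed_expression = ('VariableSet(b, 999)If(a = 0,"Black",SetColor(a,b,c))'
                     'VariableSet("a" ,1.573) VariableSet( c,-2387)')

# Everything from "VariableSet(" up to the first closing parenthesis.
CALL_RE = re.compile(r'VariableSet\(([^)]*)\)')

# The full text of every call, in order of appearance.
calls = [m.group(0) for m in CALL_RE.finditer(parsed_expression)]

# Split each argument list at the first comma into name and value.
variables = {}
for m in CALL_RE.finditer(parsed_expression):
    name, _, value = m.group(1).partition(',')
    variables[name.strip().strip('"')] = value.strip()

# Remove every VariableSet(...) call from the expression.
remaining = CALL_RE.sub('', parsed_expression)

print(calls)
print(variables)
print(remaining)
```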

regular expression behaving weird in django urls

Here is regular expression in
url(r'^company_data/(?:[A-Za-z]+)/((?:0?[1-9]|[12][0-9]|3[01])(?:0?[1-9]|1[012])(?:20)?[0-9]{2})*/((?:0?[1-9]|[12][0-9]|3[01])(?:0?[1-9]|1[012])(?:20)?[0-9]{2})*$', 'stats.views.second', name='home'),
def second(request,comp_name,offset_min,offset_max=None):
I am calling in this way from browser /company_data/hello/24092014/25092014
Expecting in the below way
comp_name= "hello", offset_min="24092014",offset_max="25092014"
In reality it is
What did I do wrong here?
Thanks in advance!!
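You're missing capture group 1. The company name sits inside a non-capturing group, (?:[A-Za-z]+), so it is never passed to the view and the two dates shift into the first two arguments.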
Edit: Also note that groups 2 and 3 should be done like below, unless I'm reading you
wrong and you intend to retrieve the last part of particular number groups.
# '^/?company_data/([A-Za-z]+)/((?:(?:0?[1-9]|[12][0-9]|3[01])(?:0?[1-9]|1[012])(?:20)?[0-9]{2})*)/((?:(?:0?[1-9]|[12][0-9]|3[01])(?:0?[1-9]|1[012])(?:20)?[0-9]{2})*)$'
 ^
 /? company_data /
 ( [A-Za-z]+ )                 # (1)
 /
 (                             # (2 start)
      (?:
           (?: 0? [1-9] | [12] [0-9] | 3 [01] )
           (?: 0? [1-9] | 1 [012] )
           (?: 20 )?
           [0-9]{2}
      )*
 )                             # (2 end)
 /
 (                             # (3 start)
      (?:
           (?: 0? [1-9] | [12] [0-9] | 3 [01] )
           (?: 0? [1-9] | 1 [012] )
           (?: 20 )?
           [0-9]{2}
      )*
 )                             # (3 end)
 $
** Grp 0 - ( pos 0 , len 37 )
/company_data/hello/24092014/25092014
** Grp 1 - ( pos 14 , len 5 )
hello
** Grp 2 - ( pos 20 , len 8 )
24092014
** Grp 3 - ( pos 29 , len 8 )
25092014
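
You can see the difference without Django by running both patterns through Python's re module. This is only a sketch: UNIT, ORIGINAL and FIXED are names made up here, and the path is written without the leading slash on the assumption that Django strips it before matching.

```python
import re

# One ddmmyyyy (or ddmmyy) date, as written in the question.
UNIT = r'(?:0?[1-9]|[12][0-9]|3[01])(?:0?[1-9]|1[012])(?:20)?[0-9]{2}'

# The question's pattern: the company name is in a non-capturing group.
ORIGINAL = re.compile(
    r'^company_data/(?:[A-Za-z]+)/(' + UNIT + r')*/(' + UNIT + r')*$'
)

# The corrected pattern: three capture groups, with the * moved inside groups 2 and 3.
FIXED = re.compile(
    r'^/?company_data/([A-Za-z]+)/((?:' + UNIT + r')*)/((?:' + UNIT + r')*)$'
)

path = 'company_data/hello/24092014/25092014'

original_groups = ORIGINAL.match(path).groups()
fixed_groups = FIXED.match(path).groups()
print(original_groups)  # ('24092014', '25092014')
print(fixed_groups)     # ('hello', '24092014', '25092014')
```

Unnamed groups are handed to the view as positional arguments in order, so with only two groups comp_name receives the first date.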