Use a character to look ahead, but only when it exists - regex

Suppose I have these lines:
/path-to-something/section/resource?var=name
/path-to-something/section/resource
I want to use a regular expression to capture the text between /path-to-something/ and the ? sign. So for both cases, I want the output to be:
section/resource
The farthest I can go is to use this regex:
(?<=/path-to-something/).+(?=\?)
But it fails for the second case (where the URL doesn't have the ? sign):
section/resource
[no match]
Is there a way to do something like this in a regular expression? I know I can do this without one, but I want to know whether it is possible in regex.

How about:
(?<=/path-to-something/)[^?]+
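
A minimal sketch of how the suggested pattern behaves on both sample lines, written in Python on the assumption that the target engine supports fixed-width lookbehind. The second pattern is not from the answer; it is one possible way to keep the original lookahead while letting it also succeed at the end of the string.

```python
import re

lines = [
    "/path-to-something/section/resource?var=name",
    "/path-to-something/section/resource",
]

# From the answer: after the prefix, consume every character that is not a '?'.
negated_class = re.compile(r"(?<=/path-to-something/)[^?]+")

# Alternative (an assumption, not from the answer): keep the lookahead,
# but allow it to match either a '?' or the end of the string.
optional_lookahead = re.compile(r"(?<=/path-to-something/).+?(?=\?|$)")

results_class = [negated_class.search(s).group(0) for s in lines]
results_lookahead = [optional_lookahead.search(s).group(0) for s in lines]

print(results_class)
print(results_lookahead)
```

Both patterns return section/resource for each line; the negated character class is the simpler of the two because it needs no lookahead at all.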

Related

Regular expression - How to exclude certain string from the results set?

I am pretty new to regular expressions.
Input
/shop/earrings
/shop/yellow-gold-earrings
/shop/white-gold-earrings
/shop/rose-gold-earrings
/shop/necklaces
/shop/yellow-gold-necklaces
/shop/white-gold-necklaces
/shop/rose-gold-necklaces
/shop/best-buy-earrings
Regular Expression I used
\/shop\/[a-z-]*-?earrings
Desired Result
/shop/earrings
/shop/yellow-gold-earrings
/shop/white-gold-earrings
/shop/rose-gold-earrings
Actual Result
/shop/earrings
/shop/yellow-gold-earrings
/shop/white-gold-earrings
/shop/rose-gold-earrings
/shop/best-buy-earrings
I do not want /shop/best-buy-earrings to be in the result. Please help me to fix the Regular Expression. Thank you.
Simply add gold to the regex before the - and surround it with parentheses:
\/shop\/([a-z-]*gold-)?earrings
Assuming PCRE flavour, you can use:
\/shop\/(?!best-buy)(?:\w+-)*earrings
Or, if you can use a delimiter other than the slash, so the slashes need no escaping:
/shop/(?!best-buy)(?:\w+-)*earrings
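
To compare the two suggestions, here is a small Python sketch that runs both against the input list from the question. Using `fullmatch` is an assumption about how the pattern is applied (each path is tested as a whole); the variable names are illustrative only.

```python
import re

paths = [
    "/shop/earrings",
    "/shop/yellow-gold-earrings",
    "/shop/white-gold-earrings",
    "/shop/rose-gold-earrings",
    "/shop/necklaces",
    "/shop/yellow-gold-necklaces",
    "/shop/white-gold-necklaces",
    "/shop/rose-gold-necklaces",
    "/shop/best-buy-earrings",
]

# First answer: the optional prefix must end in "gold-".
with_gold = re.compile(r"/shop/([a-z-]*gold-)?earrings")

# Second answer: any hyphenated prefix, unless it starts with "best-buy".
no_best_buy = re.compile(r"/shop/(?!best-buy)(?:\w+-)*earrings")

matched_gold = [p for p in paths if with_gold.fullmatch(p)]
matched_lookahead = [p for p in paths if no_best_buy.fullmatch(p)]

print(matched_gold)
print(matched_lookahead)
```

On this input both patterns select the same four paths. They differ on other data: the first is a whitelist (only gold variants), while the second is a blacklist and would also accept a path such as /shop/silver-earrings.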

regular expression replace removes first and last character when using $1

I have string like this:
&breakUp=Mumbai;city,Puma;brand&
where Mumbai;city and Puma;brand are filters separated by a comma (,). I have to add more filters, like Delhi;State.
I am using the following regular expression to find the above string:
&breakUp=.([\w;,]*).&
and following regular expression to replace it:
&breakUp=$1,Delhi;State&
It is finding the string correctly but while replacing it is removing the first and last character and giving the following result:
&breakUp=umbai;city,Puma;bran,Delhi;State&
How to resolve this?
Also, if I have no filters, I don't want that first comma. For example,
&breakUp=&
should become
&breakUp=Delhi;State&
How to do it?
My guess is that your expression is nearly fine; the only problem is the two extra . characters, so we can remove those:
&breakUp=([\w;,]*)&
To skip the empty case &breakUp=& so that it can be handled separately, we can require at least one character inside the group:
&breakUp=([^&]+)&
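Your problem is the leading and trailing periods: each one matches any single character, so the first and last characters of the filter list are consumed outside the capture group and lost in the replacement.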
Try using this regex:
&breakUp=([\w;,]*)&
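
The question also asks how to avoid the leading comma when there are no filters yet. One possible approach, sketched here in Python, is to use the corrected pattern with a replacement function that inserts the comma only when the captured group is non-empty. The helper name `add_filter` is hypothetical, and whether a replacement callback is available depends on the tool actually being used.

```python
import re

BREAKUP = re.compile(r"&breakUp=([\w;,]*)&")

def add_filter(query, new_filter):
    """Append new_filter to the breakUp list, adding a comma only if needed."""
    def repl(match):
        existing = match.group(1)
        joined = f"{existing},{new_filter}" if existing else new_filter
        return f"&breakUp={joined}&"
    return BREAKUP.sub(repl, query)

with_filters = add_filter("&breakUp=Mumbai;city,Puma;brand&", "Delhi;State")
without_filters = add_filter("&breakUp=&", "Delhi;State")

print(with_filters)
print(without_filters)
```

If only plain $1-style replacement strings are available, the same effect can be had with two passes: one pattern using ([^&]+) for the non-empty case and a literal replacement for &breakUp=&.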

Go Regex to match tags with bracket

I want to get the index of all tags inside brackets using Go's regexp package.
str := "[tag=blue]Hello [tag2=red,tag3=blue]Good"
rg := regexp.MustCompile(`(?:^|\W)\[([\w-]+)=([\w-]+)\]`)
rgi := rg.FindAllStringIndex(str, -1)
fmt.Println(rgi)
// Want index for:
// [tag=blue], [tag2=red,tag3=blue]
The regex needs to return indexes for [tag=blue] and [tag2=red,tag3=blue],
but it only returns the index for [tag=blue].
How do I fix this regex (?:^|\W)\[([\w-]+)=([\w-]+)\] so that it also matches the commas when there is more than one tag in the brackets?
I would like to post a comment on the answer by @Avinash Raj, but I don't have enough reputation, so:
Seems like you want something like this,
...
\B\[([\w-]+)=([\w-]+)(?:,[\w-]+=[\w-]+)*\]
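The provided regular expression matches the whole bracket, but its two capture groups hold only the first key=value pair. Making the repeated group capturing does not help much, because a repeated group keeps only its last iteration. Given something like:
[tag=val,tag1=val1,tag2=val2,tag3=val3]
the capture groups give you at most tag, val, tag3 and val3, never the pairs in between.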
If you want to match all of them, I would suggest using pure Go without regular expressions. This should be almost straightforward in Go.
If you actually need only the index for the match, you can use the above regular expression and then parse the tags some other way.
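
The two-step idea above (use the regular expression only to locate each bracket, then parse the pairs separately) can be sketched as follows. The sketch is in Python rather than Go; `m.span()` plays the role that FindAllStringIndex plays in the question, and splitting on "," and "=" is an assumption about the tag format.

```python
import re

text = "[tag=blue]Hello [tag2=red,tag3=blue]Good"

# Locate whole brackets; the repeated part is non-capturing on purpose.
bracket = re.compile(r"\B\[([\w-]+)=([\w-]+)(?:,[\w-]+=[\w-]+)*\]")

found = []
for m in bracket.finditer(text):
    inner = m.group(0)[1:-1]  # strip the surrounding [ and ]
    pairs = [tuple(item.split("=", 1)) for item in inner.split(",")]
    found.append((m.span(), pairs))

for span, pairs in found:
    print(span, pairs)
```

This yields one index pair per bracket plus every key=value pair inside it, which a single match with capture groups cannot provide.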
Seems like you want something like this,
(?<!\w)\[([\w-]+)=([\w-]+)(?:,[\w-]+=[\w-]+)*\]
OR
\B\[([\w-]+)=([\w-]+)(?:,[\w-]+=[\w-]+)*\]
\B matches between two word characters or two non-word characters.
A regexp that Go's regexp package accepts and that matches brackets holding several tag expressions is:
rg := regexp.MustCompile(`\[([\w-]+)=([\w-]+)(?:,([\w-]+)=([\w-]+))*\]`)
See if that is what you were looking for.
UPDATE: Just realized that this was already answered by @ndyakov.

Need help with regular expression

Is it possible to have this done with one regex?
I need to match only those strings that have exactly one period/dot but the restriction is that that period/dot must not be at the end of the string.
Example:
abc.d will match
.abcd will match
abcd. will not match
Yes, you can do it in one regex:
^[^.]*\.[^.]+$
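
A quick Python check of this pattern against the three examples from the question, plus two extra inputs (no dot, and two dots) that are added here only for illustration:

```python
import re

# Any number of non-dots, exactly one dot, then at least one non-dot.
one_dot = re.compile(r"^[^.]*\.[^.]+$")

cases = ["abc.d", ".abcd", "abcd.", "abcd", "a.b.c"]
verdicts = {s: bool(one_dot.match(s)) for s in cases}

for s, ok in verdicts.items():
    print(f"{s!r}: {'match' if ok else 'no match'}")
```

The trailing [^.]+ is what rules out a dot at the end: it demands at least one non-dot character after the single permitted dot.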
I really like @codaddict's answer, but how about something without regex? (C# code below)
// Valid only when there is exactly one '.' and it is not the last character.
if (a.Split('.').Length != 2 || a.EndsWith("."))
{
    Console.WriteLine("invalid");
}
What I like is that this makes it much clearer that you want exactly one . and that the . must not be at the end. It might also be faster than using a regex.

Regular Expression - Want two matches get only one

I'm working with a regular expression and have some lines of JavaScript. My expression should deliver two matches, but it finds only one and I don't know what the problem is.
The Lines in javascript look like this:
if(mode==1) var adresse = "?APPNAME=CampusNet&PRGNAME=ACTION&ARGUMENTS=-A7uh6sBXerQwOCd8VxEMp6x0STE.YaNZDsBnBOto8YWsmwbh7FmWgYGPUHysiL9u0.jUsPVdYQAlvwCsiktBzUaCohVBnkyistIjCR77awL5xoM3WTHYox0AQs65SoHAhMXDJVr7="; else var adresse = "?APPNAME=CampusNet&PRGNAME=ACTION&ARGUMENTS=-AHMqmg-jXIDdylCjFLuixe..udPC2hjn6Kiioq7O41HsnnaP6ylFkQLhaUkaWKINEj4l2JqL2eBSzOpmG.b5Av2AvvUxEinUhMBTt5awdgAL4SkBEgYXGejTGUxcgPE-MfiQjefc=";
My expression looks like this:
(?<Popup>(popUp\(')|(adresse...")).*\?((?<Parameters>APPNAME=CampusNet[^>"']*["']))
I want to have two matches with APPNAME...... as Parameters.
[UPDATE] As Tim Pietzcker wrote, I used the greedy version and should have used the lazy one. While he was writing that, I solved it myself by using .*? instead of .* in the middle, so the expression looks like this:
(?<Popup>(popUp\(')|(adresse...")).*?\\?((?<Parameters>APPNAME=CampusNet[^>"']*["']))
That worked. Thanks to Tim Pietzcker
Your regex matches too much - from the very first adresse until the very last " because it uses a greedy quantifier .*.
If you make that quantifier lazy, i.e.
(?<Popup>(popUp\(')|(adresse...")).*?\?((?<Parameters>APPNAME=CampusNet[^>"']*["']))
you get two matches.
Alternatively, if your data allows it, replace the dot with \S, which matches only non-whitespace characters. This will match faster (but will of course fail if the text you're trying to match could contain spaces):
(?<Popup>(popUp\(')|(adresse..."))\S*\?((?<Parameters>APPNAME=CampusNet[^>"']*["']))
Usually you must apply the regex with the "global" flag to find all matches. I can't really say more until I see the complete code sample you are working with.
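
The difference between the greedy and lazy versions can be reproduced with a short Python sketch. Two things here are assumptions: the ARGUMENTS values are shortened, made-up stand-ins for the long tokens in the question, and the named groups use Python's (?P<name>...) spelling instead of the (?<name>...) form shown above.

```python
import re

# Shortened stand-in for the JavaScript line (ARGUMENTS values are made up).
line = (
    'if(mode==1) var adresse = "?APPNAME=CampusNet&PRGNAME=ACTION&ARGUMENTS=-A7uh="; '
    'else var adresse = "?APPNAME=CampusNet&PRGNAME=ACTION&ARGUMENTS=-AHMq=";'
)

head = r'''(?P<Popup>(popUp\(')|(adresse..."))'''
tail = r'''\?((?P<Parameters>APPNAME=CampusNet[^>"']*["']))'''

greedy = re.compile(head + r".*" + tail)   # .* runs to the last possible '?'
lazy = re.compile(head + r".*?" + tail)    # .*? stops at the first possible '?'

greedy_params = [m.group("Parameters") for m in greedy.finditer(line)]
lazy_params = [m.group("Parameters") for m in lazy.finditer(line)]

print(len(greedy_params), greedy_params)
print(len(lazy_params), lazy_params)
```

The greedy pattern produces a single match that starts at the first adresse and captures only the second parameter string; the lazy pattern produces one match per assignment. Note that `finditer` already returns all matches, which corresponds to the "global" flag mentioned above.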