Regex substring - regex

I'm trying to select a substring using regex and I'm going round in circles. I need to select everything before the first "_".
Example URL: GI_2013_JUNE_10_VOL3_LASTCHANCE
So the result I'm looking for from the URL above would be "GI". The text before the first "_" can vary in length.
Any help would be much appreciated.

The regex would be:
^[^_]+
and grab the whole regex match. But as a comment says, using a substring function is more efficient!

^[^_]*
...is the expression you're looking for.
It basically says: Select everything that is not an underscore, starting at the beginning of the string.
http://regexr.com?356in
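
To make the two answers above concrete, here is a minimal Python sketch (Python is my choice, the question does not name a language). The helper names prefix_regex and prefix_partition are made up for illustration. It shows the regex from the answer next to a plain substring approach, which is what the comment about efficiency was hinting at.

```python
import re

def prefix_regex(s: str) -> str:
    # ^[^_]* matches zero or more non-underscore characters from the start,
    # so it always succeeds (with an empty match if the string starts with "_").
    return re.match(r"^[^_]*", s).group(0)

def prefix_partition(s: str) -> str:
    # No regex needed: partition splits at the first "_" into (head, sep, tail).
    return s.partition("_")[0]

name = "GI_2013_JUNE_10_VOL3_LASTCHANCE"
print(prefix_regex(name))      # GI
print(prefix_partition(name))  # GI
```

The only difference between ^[^_]+ and ^[^_]* is the empty case: with + there is no match at all when the string starts with an underscore, while * gives an empty match.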

Related

How to get only the digits from a URL using a regex?

I need some help with a regex, as it's very new to me.
I have a URL that contains an item number or product ID.
What I want to achieve is to trim the leading part of the URL and the extra part that starts at the % symbol.
Here is what the URL looks like.
https://www.test.com/test-test/test/test-demo-demo-demo-demo.html?piid=12345678%2C24753325#seemoreoptions-b0uksl51j4m
OR
https://www.test.com/test-test/test/test-demo-demo-demo-demo.html?piid=12345678
So from the above URL I am looking to trim https://www.test.com/test-test/test/test-demo-demo-demo-demo.html?piid= and this part %2C24753325#seemoreoptions-b0uksl51j4m
So, this should give me only 12345678.
I have used the following regex:
(.*)(\=) Replace with $2
The regex above trims the first part of the URL, but not the part after the % symbol.
I tried to find a solution on
https://regexr.com/
So for both of the URL examples above, I should get the result
12345678
Thank you in advance
Instead of trimming the parts before and after the digits you want, try another approach: extract the digits directly.
You can use a group (parentheses) in the regex to capture the data you find.
piid=([0-9]+)
It means:
piid= - text to find
[0-9]+ - one or more digits
() - group
You can extract the first group with $1 (or \1 etc., depending on the language you use).
Example: https://regexr.com/758d9
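
As a rough illustration of the group approach, here is a small Python sketch (the language is an assumption, since the question does not say which one is used). The function name extract_piid is hypothetical; the pattern and the two URLs are the ones from the question and answer.

```python
import re
from typing import Optional

# Same pattern as in the answer: literal "piid=" followed by a captured run of digits.
PIID = re.compile(r"piid=([0-9]+)")

def extract_piid(url: str) -> Optional[str]:
    # search() scans the whole URL; group(1) is the text inside the parentheses.
    m = PIID.search(url)
    return m.group(1) if m else None

long_url = ("https://www.test.com/test-test/test/test-demo-demo-demo-demo.html"
            "?piid=12345678%2C24753325#seemoreoptions-b0uksl51j4m")
short_url = "https://www.test.com/test-test/test/test-demo-demo-demo-demo.html?piid=12345678"

print(extract_piid(long_url))   # 12345678
print(extract_piid(short_url))  # 12345678
```

The digit run stops at the % on its own, because % is not in [0-9], so nothing after it has to be trimmed separately.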

URL regex that skips ending periods

I'm trying to create a regex that matches url strings within normal text. I have this:
http[s]?://[^\s]+
This seems to work well with the exception that if the url is at the end of a sentence it will grab the period as well. For example for this string:
I am typing some text with the url http://something.com/some-thing?args=someargs. This is another sentence.
it matches:
http://something.com/some-thing?args=someargs.
I would like it to match:
http://something.com/some-thing?args=someargs
Obviously I can't exclude periods because they are in the url previously but I can't figure out how to tell it to exclude the last period if there is one. I could potentially use a negative lookahead for end of line or whitespace, but if it's in the middle of the line (without a period after it) that would leave off the last character of the url.
Most of the ones I have seen online have the same issue that they match the ending dot so maybe it's not possible? I know basic regex but certainly not a genius with it so if someone has a solution I would be very grateful :).
Also, I can do some post-process in this case to remove the dot if I need to, just seems like there should be a Regex solution...
Try this one
http[s]?://[^\s]+[^. ]
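
For what it's worth, here is a quick Python check of that pattern (Python is an assumption, the question does not name a language). VARIANT is my own tweak, not part of the answer: it excludes any whitespace, not just a space, from the final character, because [^. ] will happily accept a newline as the last character. The example.com text is a made-up test input.

```python
import re

ORIGINAL = re.compile(r"http[s]?://[^\s]+")
SUGGESTED = re.compile(r"http[s]?://[^\s]+[^. ]")
# Assumed variant: the last character may be anything except whitespace or a period.
VARIANT = re.compile(r"https?://\S*[^\s.]")

text = ("I am typing some text with the url "
        "http://something.com/some-thing?args=someargs. This is another sentence.")

print(ORIGINAL.search(text).group(0))   # http://something.com/some-thing?args=someargs.
print(SUGGESTED.search(text).group(0))  # http://something.com/some-thing?args=someargs

# Hypothetical input where the URL is followed directly by a line break.
multiline = "see http://example.com\nnext line"
print(repr(SUGGESTED.search(multiline).group(0)))  # 'http://example.com\n'
print(repr(VARIANT.search(multiline).group(0)))    # 'http://example.com'
```

Both versions work by backtracking: the greedy part first swallows the trailing period, then gives characters back until the final character class is satisfied. Other trailing punctuation such as a comma or closing parenthesis is still kept by either pattern.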

What's the right regular expression to match the exact word at the end of a string and excluding all other urls with more chars at the end?

I have to match an exact string at the end of a URL, but not match other URLs that have more characters after that string.
I can explain better with an example.
I need to match the URL that has the string 'white' at its end: http//mysite.com/white
But I also need to not match URLs that have one or more characters appended to it, like http//mysite.com/white__blue or http//mysite.com/white/yellow or http//mysite.com/white/
How can I do that?
Thanks
Regex to match any URL:
^(https?:\/\/)?([\da-z\.-]+\.[a-z\.]{2,6}|[\d\.]+)([\/:?=&#]{1}[\da-z\.-]+)*[\/\?]?$
Regex to match a URL ending in white:
^(https?:\/\/)?([\da-z\.-]+\.[a-z\.]{2,6}|[\d\.]+)([\/:?=&#]{1}[\da-z\.-]+)*[\/\?]?white$
You can check the regex on regexr.com.
It does not match URLs (which are not valid anyway) like:
httpabrakadabra.co//
http:google.com
http://no-tld-here-folks.a
http://potato.54.211.192.240/
Based on your limited sample inputs, I'd say you could get away with this very minimal pattern:
^http[^\s]+white$
However, depending on what you are truly trying to achieve, what language/function you are implementing this pattern with, and what the full input string looks like, this pattern may need to be refined.
It would be best if you would improve your question to include all of the above relevant information.
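
As a quick sanity check of the minimal pattern, here is a Python sketch (Python is an assumption, since no language was given). SEGMENT is my own stricter variant, not something from the answers: it requires "white" to be the whole last path segment. The offwhite URL is a hypothetical extra input to show where the two differ.

```python
import re

MINIMAL = re.compile(r"^http[^\s]+white$")
# Assumed stricter variant: "white" must directly follow a "/" at the end.
SEGMENT = re.compile(r"^http\S*/white$")

urls = [
    "http//mysite.com/white",         # should match
    "http//mysite.com/white__blue",   # should not match
    "http//mysite.com/white/yellow",  # should not match
    "http//mysite.com/white/",        # should not match
    "http//mysite.com/offwhite",      # hypothetical: ends in "white" but is a different word
]

for url in urls:
    print(url, bool(MINIMAL.search(url)), bool(SEGMENT.search(url)))
```

On the four URLs from the question both patterns agree. They only differ on the last one, which the minimal pattern accepts because it checks the ending characters and not the whole path segment.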

Regex for url route with query string

I am having a hard time learning regex and honestly I have no time at the moment.
I am looking for a regex that would match a URL route with a query string.
What I need is a regex to match population?filter=nation, where nation can be any string.
Based on my current regex knowledge I have also tried the expression /^population\/(?P<filterval>\d+)\/filter$/ to match population/nation/filter, but this does not work.
Any suggestion and help is welcome.
This matches only your first query string format:
population\?filter=[\w]+[-_]?[\w]+
Additionally, it allows - and _ as separators between words. If you know that your string ends right there, you can also add a $ at the end to anchor it.
If you know that the nation consists only of word characters (letters, digits, underscore), you can use the simplified version:
population\?filter=[\w]+
Demo
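
Here is a small Python sketch of both patterns (Python is an assumption; the (?P<name>...) syntax in the question suggests a PCRE-style engine, and Python supports the same named-group syntax). I added the $ anchor mentioned above and reused the group name filterval from the question. The function name filter_value and the sample routes are made up for illustration.

```python
import re
from typing import Optional

# First pattern from the answer, anchored at the end with $.
WITH_SEPARATOR = re.compile(r"population\?filter=[\w]+[-_]?[\w]+$")
# Simplified pattern, with a named group so the value can be read back.
SIMPLE = re.compile(r"population\?filter=(?P<filterval>[\w]+)$")

def filter_value(route: str) -> Optional[str]:
    m = SIMPLE.search(route)
    return m.group("filterval") if m else None

print(filter_value("population?filter=nation"))   # nation
print(filter_value("population/nation/filter"))   # None

# The first pattern accepts one "-" between two words, but needs at least
# two word characters in total because [\w]+ appears twice.
print(bool(WITH_SEPARATOR.search("population?filter=new-zealand")))  # True
print(bool(WITH_SEPARATOR.search("population?filter=x")))            # False
```

The question mark has to be escaped as \? in both patterns, otherwise it would make the preceding n optional instead of matching a literal "?".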

REGEXP to grab all text before second underscore, including second underscore

So I have strings that come across like this:
GRF_STHB_010_00
ABC_AB9_004_01
BGH_NP2_002_03
AG2_BVT_007_010
The text before the first underscore can be any combo of Letters or Numbers.
The text before the second underscore can also be any combo of letters or numbers.
I want to be able to grab the whole string before the 2nd underscore, including the second underscore.
I have come up with this for now:
^([^\d]*)
It works for the first one, and finds:
GRF_STHB_
But for the other three it stops at the first digit it finds:
ABC_AB
BGH_NP
AG
I need this to work in REGEXP because this is being included in a spreadsheet for grabbing data.
How can I adjust it so that it works with numbers and would have a result of:
GRF_STHB_
ABC_AB9_
BGH_NP2_
AG2_BVT_
Here is a quick tester for anyone that can help:
regexpal.com
Thanks!
You can use this regex for this:
^([^_]*_){2}
Online Demo: http://regex101.com/r/cX7hL7
You can use this:
^[^_]*_[^_]*_
You can use this regex:
^([^_]*_[^_]*)_.*$
Demo
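
To compare the three suggestions, here is a short Python sketch. Python is only a stand-in here: the spreadsheet's REGEXP function may use a different regex flavor, so treat this as a check of the patterns themselves. The helper name prefix is made up. Note that with the first pattern you want the whole match, and with the third one the captured group stops before the second underscore.

```python
import re
from typing import Optional

REPEATED = re.compile(r"^([^_]*_){2}")        # first answer
SPELLED_OUT = re.compile(r"^[^_]*_[^_]*_")    # second answer
CAPTURED = re.compile(r"^([^_]*_[^_]*)_.*$")  # third answer

samples = ["GRF_STHB_010_00", "ABC_AB9_004_01", "BGH_NP2_002_03", "AG2_BVT_007_010"]

def prefix(s: str) -> Optional[str]:
    # group(0) is the whole match; group(1) would only hold the last
    # repetition of the group (for example "STHB_").
    m = REPEATED.match(s)
    return m.group(0) if m else None

for s in samples:
    print(prefix(s), SPELLED_OUT.match(s).group(0), CAPTURED.match(s).group(1))
# GRF_STHB_ GRF_STHB_ GRF_STHB
# ABC_AB9_ ABC_AB9_ ABC_AB9
# BGH_NP2_ BGH_NP2_ BGH_NP2
# AG2_BVT_ AG2_BVT_ AG2_BVT
```

If the trailing underscore is required, the first two patterns return it directly, while the third would need the underscore moved inside the parentheses.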