Match last word after / - regex

So, I have some internal URLs, for example "/img/pic/Image1.jpg", "/pic/Image1.jpg", or just "Image1.jpg", and I need to match the "Image1.jpg" part. In other words, I want to match the last character sequence after a /, or, if there is no /, the whole character sequence. Thank you in advance!

.*/(.*) won't work if there are no /s.
([^/]*)$ should work whether there are or aren't.
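As a rough illustration of the second pattern in JavaScript (the helper name lastSegment is made up for this sketch, not something from the question):

```javascript
// Hypothetical helper: return the text after the last "/",
// or the whole string when it contains no "/".
function lastSegment(path) {
  // [^\/]* matches a run of non-slash characters; $ pins that run to the end.
  // The pattern can match an empty string, so match() never returns null here.
  const match = path.match(/([^\/]*)$/);
  return match[1];
}

console.log(lastSegment("/img/pic/Image1.jpg")); // Image1.jpg
console.log(lastSegment("Image1.jpg"));          // Image1.jpg
```

Note that a path ending in a slash yields an empty string with this pattern.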

Actually you don't need regexp for this.
s="this/is/a/test"
s.substr(s.lastIndexOf("/")+1)
=> test
It also works fine for strings without any /, because then lastIndexOf returns -1, so adding 1 gives 0 and substr returns the whole string.
s="hest"
s.substr(s.lastIndexOf("/")+1)
=> hest

.*/([^/]*)
The capturing group matches the last sequence after /.

The following expression would do the trick:
/([\w\d._-]*)$
Or even easier (though I think this has already been posted in another answer):
([^/]+)$

A simple regex that I have tested:
\w+(.)\w+$
Here is a good site you can test it on: http://rubular.com/

In Ruby you would write
([^\/]*)$
Regexps in Ruby are quite universal, and you can test them live here: http://rubular.com/
By the way: maybe there is another solution that does not involve regexps? E.g. File.basename(path) (Ruby again)
Edit: profjim has posted it earlier.

I noticed you said in your comments you're using javascript. You don't actually need a regex for this and I always think it's nice to have an alternative to using regex.
var str = "/pic/Image1.jpg";
str.split("/").pop();
// example:
alert("/pic/Image1.jpg".split("/").pop()); // alerts "Image1.jpg"
alert("Image2.jpg".split("/").pop()); // alerts "Image2.jpg"

Something like .*/(.*)$ (details depend on whether we're talking about Perl, or some other dialect of regular expressions)
First .* matches everything (including slashes). Then there's one slash, then there's .* that matches everything from that slash to the end (that is $).
The * operates greedily from left to right, which means that when you have multiple slashes, the first .* will match all but the last one.
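A small sketch of that greedy behaviour in JavaScript, assuming a JavaScript-style engine (the helper name afterLastSlash is invented for illustration). It also shows the limitation mentioned in another answer: with no slash at all, the pattern does not match.

```javascript
// The first .* is greedy, so it backs off only as far as the last "/".
const greedy = /.*\/(.*)$/;

// Hypothetical helper: return the capture, or null when the pattern fails.
function afterLastSlash(text) {
  const match = greedy.exec(text);
  return match ? match[1] : null;
}

console.log(afterLastSlash("/img/pic/Image1.jpg")); // Image1.jpg
console.log(afterLastSlash("Image1.jpg"));          // null
```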

Related

RegEx for Google Analytics that picks text within urls

I am trying to build a RegEx that picks urls that end with "/topic". These urls have a different number of folders so whereas one might be www.example.com/pijamas/topic another could be www.example.com/pijamas/strippedpijamas/topic
What regular expression can I use to do that? My attempt is ^www.example.com/[a-zA-Z][1,]/topic$ but this hasn't worked. Even if it worked I'd like to have a shorter RegEx to do this really.
Any help on this would be much appreciated.
Thank you, A.
Try this:
^www\.example\.com\/[\w\/]*topic$
You need to make a few changes to your regex. Firstly, the dot (.) is a special character and needs to be escaped by prefacing it with a backslash.
Secondly, you probably meant {1,} instead of [1,] – the latter defines a character class. You can substitute {1,} with +.
Then there's the fact that your second URL has one more subdirectory, so you need to somehow incorporate a / into your regex.
Putting all this together:
^www\.example\.com/[a-zA-Z]+(/[a-zA-Z]+)*/topic$
To shorten it, you can use the i option to match regardless of case, reducing each of the two [a-zA-Z] classes to [a-z].
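For a quick sanity check outside Google Analytics, here is a sketch using JavaScript's regex engine, which may differ in details from the one Google Analytics uses. The variable name topicUrl is just for this example.

```javascript
// Case-insensitive version of the pattern above; slashes are escaped
// because this is a JavaScript regex literal.
const topicUrl = /^www\.example\.com\/[a-z]+(\/[a-z]+)*\/topic$/i;

console.log(topicUrl.test("www.example.com/pijamas/topic"));                 // true
console.log(topicUrl.test("www.example.com/pijamas/strippedpijamas/topic")); // true
console.log(topicUrl.test("www.example.com/pijamas/topic/extra"));           // false
```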

Regular expression to split optional groups

Full string syntax is: "db:server:port"
Server and port are optional, i.e. can have partial strings, such as:
db
or
db:server
Trying to use:
(.*):?(.*)?:?(.*)?
selects the whole string
Please advise.
Give this one a shot:
([^:]*?):?([^:]*?):?([^:]*?)$
Not sure what language you're using, so it may not work.
Example: http://regex101.com/r/eQ6bF0
Note that the example is set for a global/multiline match; beware that the pattern will match across newlines if you don't use the correct modifier.
You didn't specify a language that I can see, so there may be different specific answers, but the basic problem is that .* will match a ":" character. That means the first term will suck the entire string in. I would use ([^:]*) instead of (.*).
You can try this:
([^:]+)(?::([^:]+)(?::([^:]+))?)?
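Since no language was given, here is a sketch of how the nested optional groups could be used in JavaScript. The ^ and $ anchors and the function name parseConnection are my additions, not part of the pattern above.

```javascript
// Same nested optional groups as above, anchored (an assumption) so that
// strings with extra parts are rejected instead of partially matched.
const connection = /^([^:]+)(?::([^:]+)(?::([^:]+))?)?$/;

// Hypothetical helper: split "db[:server[:port]]" into its parts.
function parseConnection(text) {
  const match = connection.exec(text);
  if (!match) return null;
  const [, db, server, port] = match;
  // Groups that did not take part in the match come back as undefined.
  return { db, server, port };
}

console.log(parseConnection("db:server:port"));
console.log(parseConnection("db:server"));
console.log(parseConnection("db"));
```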
I think this is what you're looking for:
(db|:server|:port)
will match any and all of these:
db:server:port
db
db:server
Working example:
http://regex101.com/r/rK1lI5

RegExp extraction

Here's the input string:
loadMedia('mediacontainer1', 'http://www.something.com/videos/JohnsAwesomeVideo.flv', 'http://www.something.com/videos/JohnsAwesomeCaption.xml', '/videos/video-splash-image.gif)
With this RegExp: \'.+.xml\'
... we get this:
'mediacontainer1', 'http://www.something.com/videos/JohnsAwesomeVideo.flv', 'http://www.something.com/videos/JohnsAwesomeCaption.xml'
... but I want to extract only this:
http://www.something.com/videos/JohnsAwesomeCaption.xml
Any suggestions? I'm sure this problem has been asked before, but it's difficult to search for. I'll be happy to Accept a solution.
Thanks!
If you want to get everything within quotes that starts with http:
(?<=')http:[^']+(?=')
If you only want those ending with .xml
(?<=')http:[^']+\.xml(?=')
It doesn't select the quotation marks (as you asked)
It's fast!
Fair warning: it only works if the regex engine you're using can handle lookbehind
Knowing the language would be helpful. Basically, you are having a problem because the + quantifier is greedy, meaning it will match the largest part of the string that it can. You need to use a non-greedy quantifier, which will match as little as possible.
We will need to know the language you're in to know what the syntax for the non-greedy quantifier should be.
Here is a perl recipe. Just as a sidenote, instead of .+, you probably want to match [^.]+.xml.
\'.+?.xml\'
should work if your language supports perl-like regexes.
This should work (tested in javascript, but pretty sure it would work in most cases)
'[^']+?\.xml'
it looks for these rules
starts with '
is followed by anything but '
ends in .xml'
you can demo it at http://RegExr.com?2tp6q
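To see why the "anything but '" rule matters for this particular input, here is a sketch in JavaScript. A lazy dot on its own still begins at the first quote in the string, so the match stays long; excluding quotes inside the match is what narrows it. The capture group is my addition, used to drop the surrounding quotes.

```javascript
const input = "loadMedia('mediacontainer1', 'http://www.something.com/videos/JohnsAwesomeVideo.flv', 'http://www.something.com/videos/JohnsAwesomeCaption.xml', '/videos/video-splash-image.gif)";

// Lazy dot: the match still starts at the very first quote,
// because the engine takes the leftmost possible starting point.
const lazyDot = input.match(/'.+?\.xml'/)[0];

// Negated class: the match cannot cross a quote, so it has to begin
// at the quote that opens the .xml URL. Group 1 leaves the quotes out.
const captionUrl = input.match(/'([^']+\.xml)'/)[1];

console.log(lazyDot.startsWith("'mediacontainer1'")); // true
console.log(captionUrl); // http://www.something.com/videos/JohnsAwesomeCaption.xml
```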
in .net this regex works for me:
\'[\w:/.]+\.xml\'
breaking it down:
a ' character
followed by a word character or ':' or '/' or '.' any number of times (which matches the url bit)
followed by '.xml' (which differentiates the sought string from the other urls which it will match without this)
followed by another ' character
Edit
I missed that you don't want the quotes in the result. In that case, as has been pointed out, you need lookbehind and lookahead so that the quotes are part of the search but not part of the result. Again in .NET:
(?<=')[\w:/.]+\.xml(?=')
but I think the best solution is a combination of those offered already:
(?<=')[^']+\.xml(?=')
which seems the simplest to read, at least to me.

Regexp groovy

I need a regexp to find strings that start with a specific word followed by a colon and whitespace, for example:
"ErrorID: blabla"
Please help. :(
This should work fine:
^(\w+): (.+)$
First match group will give the first word (e.g. ErrorID), second the rest (e.g. blabla).
Exact implementation would depend on the programming language you use.
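The question is about Groovy, but as a language-neutral illustration, here is how the two match groups could be read in JavaScript (the function name splitLabel and the property names are made up for this sketch):

```javascript
// Group 1: the leading word; group 2: everything after ": ".
const labelled = /^(\w+): (.+)$/;

// Hypothetical helper: return both parts, or null if the line has another shape.
function splitLabel(line) {
  const match = labelled.exec(line);
  return match ? { label: match[1], rest: match[2] } : null;
}

console.log(splitLabel("ErrorID: blabla")); // { label: 'ErrorID', rest: 'blabla' }
console.log(splitLabel("no colon here"));   // null
```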
This should do what you want:
^ErrorID: .*$

Regex match everything after question mark?

I have a feed in Yahoo Pipes and want to match everything after a question mark.
So far I've figured out how to match the question mark using..
\?
Now just to match everything that is after/follows the question mark.
\?(.*)
You want the content of the first capture group.
Try this:
\?(.*)
The parentheses are a capturing group that you can use to extract the part of the string you are interested in.
If the string can contain new lines you may have to use the "dot all" modifier to allow the dot to match the new line character. Whether or not you have to do this, and how to do this, depends on the language you are using. It appears that you forgot to mention the programming language you are using in your question.
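As one concrete case of the "dot all" point, here is a sketch in JavaScript, where the modifier is the s flag (available in ES2018 and later engines). The sample text is invented.

```javascript
const text = "page?first line\nsecond line";

// Without the s flag, the dot stops at the newline.
const singleLine = text.match(/\?(.*)/)[1];

// With the s flag, the dot also matches newline characters.
const dotAll = text.match(/\?(.*)/s)[1];

console.log(JSON.stringify(singleLine)); // "first line"
console.log(JSON.stringify(dotAll));     // "first line\nsecond line"
```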
Another alternative that you can use if your language supports fixed width lookbehind assertions is:
(?<=\?).*
With the positive lookbehind technique:
(?<=\?).*
(We're searching for a text preceded by a question mark here)
Input: derpderp?mystring blahbeh
Output: mystring blahbeh
Basically, (?<= ... ) is a lookbehind group: it requires the escaped question mark to appear immediately before the match, without including it in the match.
They perform really well, but not all implementations support them.
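A sketch of the lookbehind version in JavaScript, assuming an engine with lookbehind support (ES2018 or later; in older engines this regex literal is a syntax error). The helper name afterQuestionMark is hypothetical.

```javascript
// Hypothetical helper: text following the first "?", or null if there is none.
function afterQuestionMark(text) {
  // (?<=\?) asserts that a "?" sits just before the match without consuming it.
  const match = text.match(/(?<=\?).*/);
  return match ? match[0] : null;
}

console.log(afterQuestionMark("derpderp?mystring blahbeh")); // mystring blahbeh
console.log(afterQuestionMark("no question mark"));          // null
```

Because the engine takes the leftmost match, a string with several question marks yields everything after the first one.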
\?(.*)$
If you want to match all chars after "?" you can use a group to match any char, and you'd better use the "$" sign to indicate the end of line.
\?(.*\n)+
With this you can get everything, even across new lines.
Check out this site: http://rubular.com/ Basically the site allows you to enter some example text (what you would be looking for on your site) and then as you build the regular expression it will highlight what is being matched in real time.
str.replace(/^.+?\"|^.|\".+/, '');
This has its limits: it is awkward when you want to choose what else to remove between the quotes, and you cannot use it more than twice on one string. The idea is to select whatever is not between the quotes and replace it with nothing.
Even for me it is a bit confusing, but I'll try to explain it. The pattern has three alternatives separated by | (alternation, meaning "or"): ^.+?\" lazily matches from the start of the string up to and including the first ", ^. matches the first character of the string, and \".+ matches a " and everything that comes after it. Without the g flag, replace removes only the first match.