Getting the last backslash in a filepath via regex - regex

Given a file path such as: \\server\folder_A\folder_B\etc\more.mov
I need a regex that'll give me the last backslash so I can extract the actual file name.
My attempt of "$\\" is not returning anything.
I'm using coldfusion.
Suggestions...?

What about
<cfset fileName = GetFileFromPath("\\server\folder_A\folder_B\etc\more.mov") />

Do you just want everything after the last backslash (the filename)?
([^\\]+)$
The filename will be contained in the capture.
To match starting at the last backslash you'd do...
\\[^\\]+$
I'm not familiar with coldfusion, but I'm assuming that if it does regular expressions, it does captures as well. If you do really need the position and can get that from the match, the second expression might be what you want.
(Edited for clarity and to answer comment)
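To see what the two expressions above actually return, here is a minimal sketch in Python rather than ColdFusion (the variable names are my own, and ColdFusion's regex functions may report positions differently, e.g. 1-based):

```python
import re

# Raw string so the backslashes in the UNC path are taken literally.
path = r"\\server\folder_A\folder_B\etc\more.mov"

# ([^\\]+)$ : one or more non-backslash characters anchored at the end.
name_match = re.search(r"([^\\]+)$", path)
file_name = name_match.group(1)

# \\[^\\]+$ : the same tail, but the match starts at the last backslash.
slash_match = re.search(r"\\[^\\]+$", path)
last_slash_index = slash_match.start()  # 0-based index in Python

print(file_name)  # more.mov
print(last_slash_index)
```

The first pattern is enough if only the file name is needed; the second is the one to use when the position of the last backslash matters.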

Do you absolutely have to use regex? Why not split the string and grab the last element?
<cfset fileName = ListLast(filePath, "\\")>
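The same split-and-take-the-last-element idea, sketched in Python for comparison (this is not ColdFusion; `ntpath` is Python's standard Windows path module and is shown only as an alternative):

```python
import ntpath

path = r"\\server\folder_A\folder_B\etc\more.mov"

# Split once from the right on the backslash and keep the last piece.
file_name = path.rsplit("\\", 1)[-1]

# A path-aware library call gives the same result without manual splitting.
base = ntpath.basename(path)

print(file_name)  # more.mov
print(base)       # more.mov
```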

Related

How can I remove from string the four first characters? iMacros

How can I remove the first four characters from a string with iMacros?
I know it's something with the EVAL function, but I don't know how to write it.
TY
I know nothing of iMacros, but a quick look at the docs suggests
SET !SHORTENEDSTRING = EVAL("\"{{!EXTRACT}}\".substr(4)")
The \" escapes the double-quote characters inside the string. {{!EXTRACT}} is for what was extracted by a previous iMacros statement. The .substr(4) is a JavaScript function: MDN substr documentation, and remember that it uses a zero-based index.
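The zero-based indexing is the part that is easy to get wrong. As an illustration only (Python slicing rather than JavaScript, and a made-up value standing in for {{!EXTRACT}}), dropping the first four characters means starting at index 4:

```python
# Hypothetical extracted value standing in for {{!EXTRACT}}.
extracted = "ABCD-rest-of-string"

# Indexes 0..3 hold the first four characters, so the remainder starts at 4.
shortened = extracted[4:]

print(shortened)  # -rest-of-string
```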

Regex Find End of Search Field then remove rest of line

Hi, I have been looking for a regex. I can find most of what I'm after, but it's not quite right.
I'm trying to do a find and replace using regex, which I can get to work, but not quite the way I want.
An example of what i am searching is
10/01/14PUT/a/users/84335httpetcetcetcete
10/01/14GET/a/users/663/badges?thisisatest
10/01/14GET/a/users/8836:thisisatestetc
What I'm trying to do is remove the rest of the line after the end of the user digits, marked below with a % that I have put in temporarily.
10/01/14PUT/a/users/84335%httpetcetcetcete
10/01/14GET/a/users/663%/badges?thisisatest
10/01/14GET/a/users/8836%:thisisatestetc
I have been using s = s.regex.replace(s, "a/users/\d*", " ")
but this is obviously not working. So close, yet so far.
Any assistance is gratefully received.
Many thanks, VBVirg
You were actually on the right track, the regex you came up with is almost what you need:
a/users/\d*
But what your call did was actually replace what you wanted to preserve with a space.
The regex you're looking for would be more like this:
(a\/users\/\d*).*$
And you would use it in the Replace() method as follows:
s = Regex.Replace(s, "(a\/users\/\d*).*$", "$1")
The $1 is a backreference to the capture group (the part of the regex in parentheses). So what this would do is take whatever part of the string matches that regex, and replace it with only what is in the capture group.
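Here is a small sketch of that capture-and-replace approach run against the three sample lines from the question. It is written in Python rather than .NET, so the replacement string uses \1 where Regex.Replace uses $1; the variable names are my own:

```python
import re

lines = [
    "10/01/14PUT/a/users/84335httpetcetcetcete",
    "10/01/14GET/a/users/663/badges?thisisatest",
    "10/01/14GET/a/users/8836:thisisatestetc",
]

# Group 1 keeps "a/users/<digits>"; the trailing .*$ matches what gets dropped.
pattern = re.compile(r"(a/users/\d*).*$")

# Replace the whole match with just the captured group.
trimmed = [pattern.sub(r"\1", line) for line in lines]

for line in trimmed:
    print(line)
# 10/01/14PUT/a/users/84335
# 10/01/14GET/a/users/663
# 10/01/14GET/a/users/8836
```

The date prefix survives because the pattern is not anchored at the start, so only the text from "a/users/" onwards is part of the match.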
How about: s = Regex.Replace(s, "(a/users/\d*).*", "$1")
This saves the "a/users/(digits)" string in a capture group (referenced as $1 in the .NET replacement string), so it doesn't get deleted by the replace function.
I think the following will do what you want:
s = Regex.Replace(s, "^(.*\/users\/\d*).*$", "$1")
It works by capturing the part of the string you are interested in and replacing the whole string with just the part that was captured.

Regex substring

I'm trying to select a substring using regex and I'm going round in circles. I need to select everything before the first "_".
Example URL - GI_2013_JUNE_10_VOL3_LASTCHANCE
So the result I'm looking for from the URL above would be "GI". The text before the first "_" can vary in length.
Any help would be much appreciated.
The regex would be:
^[^_]+
and grab the whole regex match. But as a comment says, using a substring function is more efficient!
^[^_]*
...is the expression you're looking for.
It basically says: Select everything that is not an underscore, starting at the beginning of the string.
http://regexr.com?356in
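The two answers differ only in + versus *, which matters when the string starts with an underscore. A quick sketch in Python (my choice of language, since the question does not name one) that also shows the plain substring alternative mentioned above:

```python
import re

url = "GI_2013_JUNE_10_VOL3_LASTCHANCE"

# ^[^_]+ : at least one non-underscore character from the start.
first_plus = re.match(r"^[^_]+", url).group(0)

# ^[^_]* : zero or more, so it also matches an empty prefix.
first_star = re.match(r"^[^_]*", url).group(0)

# No regex at all: split once on the first underscore.
first_split = url.split("_", 1)[0]

# Edge case: a string that starts with "_".
plus_on_leading = re.match(r"^[^_]+", "_abc")          # None, no match
star_on_leading = re.match(r"^[^_]*", "_abc").group(0)  # empty string

print(first_plus, first_star, first_split)  # GI GI GI
```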

Regex matching in ColdFusion OR condition

I am attempting to write a CF component that will parse wikiCreole text. I am having trouble getting the correct matches with some of my regular expression though. I feel like if I can just get my head around the first one the rest will just click. Here is an example:
The following is sample input:
You can make things **bold** or //italic// or **//both//** or //**both**//.
Character formatting extends across line breaks: **bold,
this is still bold. This line deliberately does not end in star-star.
Not bold. Character formatting does not cross paragraph boundaries.
My first attempt was:
<cfset out = REreplace(out, "\*\*(.*?)\*\*", "<strong>\1</strong>", "all") />
Then I realized that it would not match where the ** is not given, and it should end where there are two carriage returns.
So I tried this:
<cfset out = REreplace(out, "\*\*(.*?)[(\*\*)|(\r\n\r\n)]", "<strong>\1</strong>", "all") />
and it is close but for some reason it gives you this:
You can make things <strong>bold</strong>* or //italic// or <strong>//both//</strong>* or //<strong>both</strong>*//.
Character formatting extends across line breaks: <strong>bold,</strong>
this is still bold. This line deliberately does not end in star-star.
Not bold. Character formatting does not cross paragraph boundaries.
Any ideas?
PS: If anyone has any suggestions for better tags, or a better title for this post I am all ears.
The [...] represents a character class, so this:
[(\*\*)|(\r\n\r\n)]
Is effectively the same as this:
[()*|\r\n]
i.e. it matches a single character (a parenthesis, "*", "|", CR or LF), and the "|" isn't an alternation.
Another problem is that you replace the double linefeed. Even if your match succeeded you would end up merging paragraphs. You need to either restore it or not consume it in the first place. I'd use a positive lookahead to do the latter.
In Perl I'd write it this way:
$string =~ s/\*\*(.*?)(?:\*\*|(?=\n\n))/<strong>$1<\/strong>/sg;
Taking a wild guess, the ColdFusion probably looks like this:
REreplace(out, "\*\*(.*?)(?:\*\*|(?=\r\n\r\n))", "<strong>\1</strong>", "all")
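To check the reasoning above without a ColdFusion server, here is a sketch in Python using a shortened version of the sample input (re.DOTALL plays the role of Perl's /s flag; whether ColdFusion's engine behaves identically is an assumption I have not verified):

```python
import re

text = (
    "You can make things **bold** or //italic//.\r\n"
    "Character formatting extends across line breaks: **bold,\r\n"
    "this is still bold.\r\n"
    "\r\n"
    "Not bold."
)

# The original attempt: [...] is a character class, so it ends the match
# on a single "*", "(", ")", "|", CR or LF.
broken = re.compile(r"\*\*(.*?)[(\*\*)|(\r\n\r\n)]", re.DOTALL)

# Real alternation: either a closing "**", or a lookahead that sees the
# blank line without consuming it.
fixed = re.compile(r"\*\*(.*?)(?:\*\*|(?=\r\n\r\n))", re.DOTALL)

broken_out = broken.sub(r"<strong>\1</strong>", text)
fixed_out = fixed.sub(r"<strong>\1</strong>", text)

print(broken_out)
print(fixed_out)
```

The broken version leaves a stray "*" after each closing tag, as in the question, while the fixed version closes the unterminated bold run at the paragraph break and keeps the blank line.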
You really should change your
(.*?)
to something like
[^*]*?
to match any character except the *. I don't know if that is the problem, but it could be that the any-character . is eating one of your stars. It's also a generally accepted "best practice", when trying to balance matching characters like the double star or HTML start/end tags, to explicitly exclude them from your match set for the inner text.
*Disclaimer, I didn't test this in ColdFusion for the nuances of the regex engine - but the idea should hold true.
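A rough sketch of the negated-class idea in Python (not ColdFusion, and the sample text is made up). Note the trade-off: bold text that itself contains a "*" will no longer match, and an unclosed marker is simply left alone:

```python
import re

text = "You can make things **bold** or **//both//**. An unclosed **marker stays."

# [^*]*? can never run past a star, so each match stays inside one ** ... ** pair.
pattern = re.compile(r"\*\*([^*]*?)\*\*")

out = pattern.sub(r"<strong>\1</strong>", text)

print(out)
```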
I know this is an older question but in response to where Ryan Guill said "I tried the $1 but it put a literal $1 in there instead of the match" for ColdFusion you should use \1 instead of $1
I always use a regex web-page. It seems like I start from scratch every time I use regex.
Try using '$1' instead of \1 for this one - the replace is slightly different... but I think the pattern is what you need to get working.
Getting closer with this:
\*\*(.*?)\*\*|//(.*?)//
The tricky part is the //** or **//
Ok, first checking for **//bold//** then //**bold**// then **bold**, then //bold//
\*\*//(.*?)//\*\*|//\*\*(.*?)\*\*//|\*\*(.*?)\*\*|//(.*?)//
I find this app immensely helpful when I'm doing anything with regex:
http://www.gskinner.com/RegExr/desktop/
Still doesn't help with your actual issue, but could be useful going forward.

Need regexp to find substring between two tokens

I suspect this has already been answered somewhere, but I can't find it, so...
I need to extract a string from between two tokens in a larger string, in which the second token will probably appear again meaning... (pseudo code...)
myString = "A=abc;B=def_3%^123+-;C=123;" ;
myB = getInnerString(myString, "B=", ";" ) ;
method getInnerString(inStr, startToken, endToken){
    return inStr.replace( EXPRESSION, "$1");
}
so, when I run this using expression ".+B=(.+);.+"
I get "def_3%^123+-;C=123;" presumably because it just looks for the LAST instance of ';' in the string, rather than stopping at the first one it comes to.
I've tried using (?=) in search of that first ';' but it gives me the same result.
I can't seem to find a regExp reference that explains how one can specify the "NEXT" token rather than the one at the end.
any and all help greatly appreciated.
Similar question on SO:
Regex: To pull out a sub-string between two tags in a string
Regex to replace all \n in a String, but no those inside [code] [/code] tag
Replace patterns that are inside delimiters using a regular expression call
RegEx matching HTML tags and extracting text
You're using a greedy pattern by not specifying the ? in it. Try this:
".+B=(.+?);.+"
Try this:
B=([^;]+);
The capture group [^;]+ takes one or more characters that are not a semicolon, so it matches everything between B= and the first ; thereafter.
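To compare the quantifiers side by side, here is a sketch in Python (my choice, since the question is pseudocode). It uses a simplified pattern without the surrounding .+ so that the effect of greediness is visible on its own:

```python
import re

my_string = "A=abc;B=def_3%^123+-;C=123;"

# Greedy: .+ runs to the end and backtracks only as far as the LAST ";".
greedy = re.search(r"B=(.+);", my_string).group(1)

# Lazy: .+? grows one character at a time and stops at the FIRST ";".
lazy = re.search(r"B=(.+?);", my_string).group(1)

# Negated class: cannot cross a ";" at all, so no backtracking is needed.
negated = re.search(r"B=([^;]+);", my_string).group(1)

print(greedy)   # def_3%^123+-;C=123
print(lazy)     # def_3%^123+-
print(negated)  # def_3%^123+-
```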
(This is a continuation of the conversation from the comments to Evan's answer.)
Here's what happens when your (corrected) regex is applied: First, the .+ matches the whole string. Then it backtracks, giving up most of the characters it just matched until it gets to the point where the B= can match. Then the (.+?) matches (and captures) everything it sees until the next part, the semicolon, can match. Then the final .+ gobbles up the remaining characters.
All you're really interested in is the "B=" and the ";" and whatever's between them, so why match the rest of the string? The only reason you have to do that is so you can replace the whole string with the contents of the capturing group. But why bother doing that if you can access contents of the group directly? Here's a demonstration (in Java, because I can't tell what language you're using):
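// requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String s = "A=abc;B=def_3%^123+-;C=123;";
Pattern p = Pattern.compile("B=(.*?);");
Matcher m = p.matcher(s);
if (m.find())
{
    // group(1) is the text captured between "B=" and the next ";"
    System.out.println(m.group(1));
}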
Why do a 'replace' when a 'find' is so much more straightforward? Probably because your API makes it easier; that's why we do it in Java. Java has several regex-oriented convenience methods in its String class: replaceAll(), replaceFirst(), split(), and matches() (which returns true iff the regex matches the whole string), but not find(). And there's no convenience method for accessing capturing groups, either. We can't match the elegance of Perl one-liners like this:
print $1 if 'A=abc;B=def_3%^123+-;C=123;' =~ /B=(.*?);/;
...so we content ourselves with hacks like this:
System.out.println("A=abc;B=def_3%^123+-;C=123;"
    .replaceFirst(".+B=(.*?);.+", "$1"));
Just to be clear, I'm not saying not to use these hacks, or that there's anything wrong with Evan's answer--there isn't. I just think we should understand why we use them, and what trade-offs we're making when we do.
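For completeness, here is one way the getInnerString helper from the question could look when built on "find" rather than "replace". This is a sketch in Python under my own assumptions: the name get_inner_string is hypothetical, the tokens are treated as literal text (hence re.escape), and a missing token returns None:

```python
import re


def get_inner_string(in_str, start_token, end_token):
    """Return the text between start_token and the next end_token, or None."""
    # Escape the tokens so characters like "+" or "." are matched literally.
    pattern = re.escape(start_token) + r"(.*?)" + re.escape(end_token)
    match = re.search(pattern, in_str)
    return match.group(1) if match else None


my_string = "A=abc;B=def_3%^123+-;C=123;"

my_b = get_inner_string(my_string, "B=", ";")
missing = get_inner_string(my_string, "Z=", ";")

print(my_b)     # def_3%^123+-
print(missing)  # None
```

Because the search only has to match the two tokens and what lies between them, there is no need for the leading and trailing .+ that the replace-based version requires.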