Using "#" in a regular expression with VB.NET - regex

Assuming I have to check whether "#" exists in a given string, should I put a backslash before it or not? So far I have found that both work for me, but I'm not sure they will work on every Windows host (this is part of a VB.NET application that has to work worldwide).
The string: Hello #world
Pattern 1: Hello #world
Pattern 2: Hello \#world
Which one should I use to get the most precise matching, Pattern 1 or Pattern 2?
I work with VB.NET on VS2010 (.NET FW 3.5)
Thank you

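# is not a special regex character in .NET by default, which means that both patterns behave the same and you can use whichever you prefer. The one exception is RegexOptions.IgnorePatternWhitespace: with that option set, an unescaped # starts a comment, so only the escaped form matches a literal #. If you don't use that option, you should probably stick to the pattern without the backslash for readability's sake.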
You can find a complete list of special regex characters in .NET here.

I would suggest leaving this decision to the regex engine. Just use its Regex.Escape function; it will escape whatever needs escaping.
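As a rough illustration of the answers above, here is a minimal sketch in Python rather than VB.NET. It assumes Python's re.VERBOSE flag and re.escape behave analogously to .NET's RegexOptions.IgnorePatternWhitespace and Regex.Escape for this purpose; treat it as an analogy, not as the .NET API.

```python
import re

text = "Hello #world"

# By default '#' has no special meaning, so both forms match the same text.
plain = re.search(r"Hello #world", text)
escaped = re.search(r"Hello \#world", text)

# In verbose mode an unescaped '#' starts a comment, so the first pattern
# is effectively just "Hello " and cannot match the whole string.
verbose_plain = re.fullmatch(r"Hello\ #world", text, re.VERBOSE)
verbose_escaped = re.fullmatch(r"Hello\ \#world", text, re.VERBOSE)

# Letting the library escape the literal works in either mode.
auto = re.fullmatch(re.escape(text), text, re.VERBOSE)

print(plain is not None, escaped is not None)
print(verbose_plain is None, verbose_escaped is not None, auto is not None)
```

The escaped form is the only one that keeps working once a comment-enabling option is switched on, which is the case an escaping helper is meant to cover.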

Related

EditPad: How to replace multiple search criteria with multiple values?

I did some searching and found tons of questions about multiple replacements with Regex, but I'm working in EditPadPro and so need a solution that works with the regex syntax of that environment. Hoping someone has some pointers as I haven't been able to work out the solution on my own.
Additional disclaimer: I suck with regex. I mean really... it's bad. Like I barely know wtf I'm doing. So that being said, here is what I need to do and how I'm currently approaching it...
I need to replace two possible values, with their corresponding replacements. My two searches are:
(.*)-sm
(.*)-rad
Currently I run these separately and replace each with simple strings:
sm
rad
Basically I need to lop off anything that comes prior to "sm" so I just detect everything up to and including sm, and then replace it all with that string (and likewise for "rad").
But it seems like there should be a way to do this in a single search/replace operation. I can do the search part fine with:
(.*)-sm|(.*)-rad
But then how do I replace each with its matching value? That's where I'm stuck. I tried:
sm|rad
but alas, that just becomes the literal complete string that is used for replacement.
Jonathan, first off let me congratulate you for using EditPad Pro for regex in your text. It's my main text editor, and the main reason I chose it, as a regex lover, is that its support of regex syntax is vastly superior to competing editors. For instance Notepad++ is known for its shoddy support of regular expressions. The reason of course is that EditPad Pro's author Jan Goyvaerts is the author of the legendary RegexBuddy.
A picture is worth a thousand words... So here is how I would do your replacement. Just hit the "Replace All" button. The expression in the regex box assumes that anything before the dash that is not a whitespace character can be stripped, so if this is not what you want, we need to tune it.
Search for:
(.*)-(sm|rad)
Now, when you put something in parentheses in a regex, whatever it matches is stored in a numbered capture group. So whatever matched (.*) is stored in \1 and whatever matched (sm|rad) is stored in \2. Therefore, you want to replace with:
\2
Note that the syntax for referring to a group in the replacement may differ depending on the tool or programming language you are using. In Perl, for example, I would have to use $2 instead.
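The same search and replace can be sketched outside the editor. This is a minimal Python example, not EditPad Pro itself; the input names are made up for illustration, and Python happens to accept \2 in the replacement just as the answer above uses it.

```python
import re

# Hypothetical sample values; only the "-sm" / "-rad" endings come from the question.
names = ["icon-sm", "btn-large-rad", "corner-rad"]

# Group 1 captures everything before the final "-sm" or "-rad";
# group 2 captures the suffix we want to keep.
pattern = re.compile(r"(.*)-(sm|rad)")

# The replacement refers back to group 2, discarding group 1 and the dash.
result = [pattern.sub(r"\2", name) for name in names]
print(result)
```

Because the alternation sits inside a single group, one replacement string covers both cases, which is what the separate searches could not do.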

Regex match first characters of string

I am trying to create a regex that will match the first 3 characters of a string.
If I have a string ABCFFFF, I want to verify that the first 3 characters are ABC.
It's pretty straightforward, the pattern would be ^ABC
As others may point out, using regular expressions for such a task is overkill. Any programming language with regex support can do it better with simple string manipulation.
Just simple regex will work:
/^ABC/
But whether this is a good use case for regex, I am not sure. Consider using a substring or prefix check in whatever language/platform you're using.
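To make the comparison concrete, here is a small Python sketch of both approaches. The function names are hypothetical, and the question does not say which language is in use, so Python is only an assumption here.

```python
import re

def starts_with_abc_regex(s: str) -> bool:
    # ^ anchors the pattern at the start of the string.
    return re.search(r"^ABC", s) is not None

def starts_with_abc_plain(s: str) -> bool:
    # Plain prefix check, no regex engine involved.
    return s.startswith("ABC")

for candidate in ["ABCFFFF", "XABCFFF", "AB"]:
    print(candidate, starts_with_abc_regex(candidate), starts_with_abc_plain(candidate))
```

Both functions agree on every input; the plain version simply states the intent more directly.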
"^ABC" should work. '^' matches the start in most regex implementations.

RegExp extraction

Here's the input string:
loadMedia('mediacontainer1', 'http://www.something.com/videos/JohnsAwesomeVideo.flv', 'http://www.something.com/videos/JohnsAwesomeCaption.xml', '/videos/video-splash-image.gif)
With this RegExp: \'.+.xml\'
... we get this:
'mediacontainer1', 'http://www.something.com/videos/JohnsAwesomeVideo.flv', 'http://www.something.com/videos/JohnsAwesomeCaption.xml'
... but I want to extract only this:
http://www.something.com/videos/JohnsAwesomeCaption.xml
Any suggestions? I'm sure this problem has been asked before, but it's difficult to search for. I'll be happy to Accept a solution.
Thanks!
If you want to get everything within quotes that starts with http:
(?<=')http:[^']+(?=')
If you only want those ending with .xml
(?<=')http:[^']+\.xml(?=')
It doesn't select the quotation marks (as you asked)
It's fast!
Fair warning: it only works if the regex engine you're using can handle lookbehind
Knowing the language would be helpful. Basically, you are having a problem because the + quantifier is greedy, meaning it will match the largest part of the string that it can. You need to use a non-greedy quantifier, which will match as little as possible.
We will need to know the language you're in to know what the syntax for the non-greedy quantifier should be.
Here is a perl recipe. Just as a sidenote, instead of .+, you probably want to match [^']+\.xml, so that the match cannot run across a quote and the dot before xml is taken literally.
\'.+?.xml\'
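is the non-greedy form, if your language supports perl-like regexes. Note, however, that a lazy quantifier only ends the match as early as possible; it does not move the start, so on the input above the match still begins at the first quote.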
This should work (tested in JavaScript, but I'm pretty sure it would work in most engines):
'[^']+?\.xml'
It applies these rules:
starts with '
is followed by anything but '
ends in .xml'
You can demo it at http://RegExr.com?2tp6q
In .NET this regex works for me:
\'[\w:/.]+\.xml\'
Breaking it down:
a ' character
followed by a word character or ':' or '/' or '.', one or more times (which matches the URL part)
followed by '.xml' (which distinguishes the sought string from the other URLs, which it would match without this)
followed by another ' character
I tested it here
Edit
I missed that you don't want the quotes in the result. In that case, as has been pointed out, you need lookbehind and lookahead so that the quotes are required by the search but not included in the match. Again in .NET:
(?<=')[\w:/.]+\.xml(?=')
But I think the best solution is a combination of those offered already:
(?<=')[^']+\.xml(?=')
which seems the simplest to read, at least to me.
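The patterns discussed in this thread can be compared side by side. The following is a minimal sketch in Python (an assumption, since the question never names its language); Python's re module supports the fixed-width lookbehind used here. The input is the string from the question, copied as posted.

```python
import re

source = ("loadMedia('mediacontainer1', "
          "'http://www.something.com/videos/JohnsAwesomeVideo.flv', "
          "'http://www.something.com/videos/JohnsAwesomeCaption.xml', "
          "'/videos/video-splash-image.gif)")

# Greedy and lazy dot patterns both start at the first quote in the input.
greedy = re.search(r"'.+\.xml'", source).group(0)
lazy = re.search(r"'.+?\.xml'", source).group(0)

# A negated character class cannot cross a quote, so only one URL fits.
quoted = re.search(r"'[^']+\.xml'", source).group(0)

# Lookbehind and lookahead require the quotes without capturing them.
bare = re.search(r"(?<=')[^']+\.xml(?=')", source).group(0)

print(greedy == lazy)
print(quoted)
print(bare)
```

In this sketch the negated character class is what narrows the match; the lookarounds only decide whether the quotes are part of the result.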

How to search (using regex) for a regex literal in text?

I just stumbled on a case where I had to remove quotes surrounding a specific regex pattern in a file, and the immediate conclusion I came to was to use vim's search and replace util and just escape each special character in the original and replacement patterns.
This worked (after a little tinkering), but it left me wondering if there is a better way to do these sorts of things.
The original regex (quoted): '/^\//' to be replaced with /^\//
And the search/replace pattern I used:
s/'\/\^\\\/\/'/\/\^\\\/\//g
Thanks!
You can use almost any character as the regex delimiter. This will save you from having to escape forward slashes. You can also use groups to extract the regex and avoid re-typing it. For example, try this:
:s#'\(\\^\\//\)'#\1#
I do not know if this will work for your case, because the example you listed and the regex you gave do not match up. (The regex you listed will match '/^\//', not '\^\//'. Mine will match the latter. Adjust as necessary.)
Could you avoid using regex entirely by using a nice simple string search and replace?
Please check whether this works for you. Either give the line number before the substitute command or place the cursor on the line:
:s:'\(.*\)':\1:
I used vim 7.1 for this. Of course, you can also visually select an area first (use "v" or "V" and move the cursor accordingly); the command will then run on that selection.
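Outside of vim, the suggestion to avoid hand-escaping can be sketched in Python. This is only an illustration under assumptions: the sample line is invented, and re.escape stands in for escaping each special character by hand.

```python
import re

# Hypothetical line containing the quoted regex literal from the question.
line = r"pattern = '/^\//'"
quoted = r"'/^\//'"
unquoted = r"/^\//"

# Option 1: plain string replacement, no regex involved.
plain = line.replace(quoted, unquoted)

# Option 2: let re.escape build the search pattern; a function replacement
# returns the new text verbatim, so its backslashes need no extra escaping.
via_regex = re.sub(re.escape(quoted), lambda m: unquoted, line)

print(plain)
print(via_regex)
```

Both options give the same line with the surrounding quotes removed, and neither requires typing a single escape by hand.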

Regex: How to replace with the string literal "\1"?

I have a string, say r"a". I want to replace every r"a" with the string r"\1", but my regex engine does not understand this.
I have tried:
r"\1" -- crashes (can't match group 1 because there is no group 1)
r"\\1" -- crashes (not sure why)
Is this a limitation of my (proprietary) regex engine, or is it a general problem? Is there an elegant way of solving it? (I could e.g. replace "a" by "/1" and then StrReplace( "/", r"\" )... but that's not nice!)
The correct way would be to use r"\\1" as a replace string. So if your proprietary regex engine/language chokes on a \\, you should fix this bug.
If you look at your example, you don't need a regex engine at all. But perhaps the example is simpler than the actual requirement...
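For comparison, here is how the same replacement behaves in Python's re module. This is a sketch of one engine only and says nothing about the proprietary engine in the question; the input string is made up.

```python
import re

text = "banana"

# r"\\1" is backslash, backslash, 1: the replacement template reads the
# doubled backslash as one literal backslash, followed by a plain "1".
escaped = re.sub("a", r"\\1", text)

# A function replacement is inserted verbatim, with no template processing.
verbatim = re.sub("a", lambda m: r"\1", text)

# r"\1" on its own is a reference to group 1, which this pattern lacks.
try:
    re.sub("a", r"\1", text)
    bare_fails = False
except re.error:
    bare_fails = True

print(escaped, verbatim, bare_fails)
```

Here the doubled backslash works as the answer above describes, and the function form offers a way around template escaping altogether.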