regular expression to check input on multiline textbox - regex

Let's say I have a multiline textbox that I would like to validate so that it contains, say, 1-2 digits per line over a maximum of 5 lines. I found a regular expression pattern in an answer to a similar question on here, but it did not work for me even after I modified it a number of times.
I'm currently using the following without success.
Dim textCheck As New Regex("(^\d{1,2}$\r?\n?){0,5}", RegexOptions.Multiline)
Could somebody help me out with what I am doing wrong?
Thanks

So you want to match a list of up to five 1- or 2-digit numbers separated by newlines? If so, this should work. The last newline is optional, and if there's anything else in the string it doesn't match. (For this, don't use RegexOptions.Multiline.)
I checked this with C#, so I'm not sure if the escape characters are correct. I noticed yours only had one backslash before the d. In C# you need two, but I removed the extra one here to make it look like yours.
Dim textCheck As New Regex("^\d{1,2}((\r|\n|\r\n)\d{1,2}){0,4}(\r|\n|\r\n)?$")
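For anyone who wants to try the idea outside .NET, here is a minimal sketch of the same pattern in Python. The names NEWLINE, PATTERN and is_valid are made up for illustration, and it uses re.fullmatch instead of the ^ and $ anchors, so treat it as an approximation of the VB.NET version rather than a drop-in replacement.

```python
import re

# A line break is \r\n, \r or \n (hypothetical helper name).
NEWLINE = r"(?:\r\n|\r|\n)"

# One line of 1-2 digits, then up to four more lines, then an optional
# trailing line break: at most five lines in total.
PATTERN = re.compile(r"\d{1,2}(?:" + NEWLINE + r"\d{1,2}){0,4}" + NEWLINE + r"?")

def is_valid(text):
    """Return True if the whole text is 1 to 5 lines of 1-2 digits each."""
    return PATTERN.fullmatch(text) is not None

if __name__ == "__main__":
    for sample in ["1\n22\n3", "1\n2\n3\n4\n5\n6", "123", ""]:
        print(repr(sample), is_valid(sample))
```

Because the whole string has to match, a sixth line, a three-digit number or a blank line in the middle all make the check fail.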

First, grab a copy of RegEx Designer. It's free and worth its weight for this kind of thing.
http://www.radsoftware.com.au/?from=RegexDesigner
Then, I think what you might want is something like this
(^\d{1,2}\r?\n?){0,5}\z
and then test that the match includes the entire input. The $ in the middle won't help; the \z forces the match to the end of the string. There are probably some details I've missed, though. Again, RegEx Designer makes playing with regexes so much more enjoyable!

Related

Matching all strings without 3 occurrences of/or final single character in RegEx

Trying to figure out the regex for the title,
i.e.,
foo
foo/bar/foo
foo/bar/foo/bar
foo/bar/d
I don't want it to match the 3rd or the 4th one, but it should match the first two. In the 2nd example, the final foo can be anything except a single d.
You could use a regex but it will be more complicated than just counting the number of slashes and also checking the last character isn't a d. If you want to use a regex to check for the last part not being "/d" you could do something like check that it doesn't match ^.*/d$ but it may be clearer to just use code. (If counting slashes and checking string doesn't end in "/d" isn't exactly what you mean then it will help to have more examples)
Figured it out. See below if anyone is interested.
(^foo/?$)|(^foo/[^/]+/(([^d][^/]*)|(d[^/]+))/?$)
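For comparison, here is a minimal Python sketch of the "just use code" approach suggested in the answer above: count the slashes and check that the string doesn't end in "/d". The function name is hypothetical, and the rule (fewer than three slashes, no final "/d") is my reading of the four examples, so it is looser than the regex, which also insists on a leading foo.

```python
def is_wanted(path):
    """Accept strings with fewer than 3 slashes that do not end in '/d'.

    This is an assumed reading of the examples in the question,
    not an exact equivalent of the regex.
    """
    return path.count("/") < 3 and not path.endswith("/d")

if __name__ == "__main__":
    for sample in ["foo", "foo/bar/foo", "foo/bar/foo/bar", "foo/bar/d"]:
        print(sample, is_wanted(sample))
```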

Vim S&R to remove number from end of InstallShield file

I've got a practical application for a vim regex where I'd like to remove numbers from the end of file location links. For example, if the developer is sloppy and just adds files and doesn't reuse file locations, you'll end up with something awful like this:
PATH_TO_MY_FILES&gt
PATH_TO_MY_FILES1&gt
...
PATH_TO_MY_FILES22&gt
PATH_TO_MY_FILES_ELSEWHERE&gt
PATH_TO_MY_FILES_ELSEWHERE1&gt
...
So all I want to do is to S&R and replace PATH_TO_MY_FILES*\d+ with PATH_TO_MY_FILES* using regex. Obviously I am not doing it quite right, so I was hoping someone here could not spoon feed the answer necessarily, but throw a regex buzzword my way to get me on track.
Here's what I have tried:
:%s\(PATH_TO_MY_FILES\w*\)\(\d+\)&gt:gc
But this doesn't work, i.e. if I just do a vim search on that, it doesn't find anything. However, if I use this:
:%s\(PATH_TO_MY_FILES\w*\)\(\d\)&gt:gc
It will match the string, but the grouping is off, as expected. For example, the string PATH_TO_MY_FILES22 will be grouped as (PATH_TO_MY_FILES2)(2), presumably because the \d only matches the 2, and the \w match includes the first 2.
Question 1: Why doesn't \d+ work?
If I go ahead and use the second string (which is wrong), Vim appears to find a match (even though the grouping is wrong), but then does the replacement incorrectly.
For example, given that we know the \d will only match the last number in the string, I would expect PATH_TO_MY_FILES22&gt to get replaced with PATH_TO_MY_FILES2&gt. However, instead it replaces it with this:
PATH_TO_MY_FILES2PATH_TO_MY_FILES22&gtgt
So basically, it looks like it finds PATH_TO_MY_FILES22&gt, but then replaces only the & with group 1, which is PATH_TO_MY_FILES2.
I tried another regex at Regexr.com to see how it would interpret my grouping, and it looked correct, though it may just be a hack around my lack of regex understanding:
(PATH_TO_\D*)(\d*)&gt
This correctly broke my target string into the PATH part and the entire number, so I was happy. But then when I used this in Vim, it found the match, but still replaced only the &.
Question 2: Why is Vim only replacing the &?
Answer 1:
In Vim's default ("magic") mode you need to escape the + or it will be taken literally. For example, \d\+ works correctly.
Answer 2:
An unescaped & in the replacement portion of a substitution means "the entire matched text". You need to escape it if you want a literal ampersand.
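To see the intended substitution in isolation, here is a minimal sketch in Python rather than Vim. In Python's re module + needs no escaping and & is not special in the replacement string, so only the grouping issue remains. The lazy \w*? is my own addition so that all trailing digits end up in \d+ (Vim's lazy counterpart would be \{-}), and the function name is made up.

```python
import re

# Lazy \w*? leaves every trailing digit to \d+; a greedy \w* would
# swallow all but the last digit, as described in the question.
PATTERN = re.compile(r"(PATH_TO_MY_FILES\w*?)\d+&gt")

def strip_trailing_number(text):
    """Drop the number that sits just before '&gt' in each file location."""
    return PATTERN.sub(r"\1&gt", text)

if __name__ == "__main__":
    for sample in [
        "PATH_TO_MY_FILES&gt",
        "PATH_TO_MY_FILES22&gt",
        "PATH_TO_MY_FILES_ELSEWHERE1&gt",
    ]:
        print(sample, "->", strip_trailing_number(sample))
```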

What is wrong with my simple regex that accepts empty strings and apartment numbers?

So I wanted to limit a textbox which contains an apartment number which is optional.
Here is the regex in question:
([0-9]{1,4}[A-Z]?)|([A-Z])|(^$)
Simple enough eh?
I'm using these tools to test my regex:
Regex Analyzer
Regex Validator
Here are the expected results:
Valid
"1234A"
"Z"
"(Empty string)"
Invalid
"A1234"
"fhfdsahds527523832dvhsfdg"
Obviously if I'm here, the invalid ones are accepted by the regex. The goal of this regex is accept either 1 to 4 numbers with an optional letter, or a single letter or an empty string.
I just can't seem to figure out what's not working, I mean it is a simple enough regex we have here. I'm probably missing something as I'm not very good with regexes, but this syntax seems ok to my eyes. Hopefully someone here can point to my error.
Thanks for all help, it is greatly appreciated.
You need to use the ^ and $ anchors for your first two options as well. Also you can include the second option into the first one (which immediately matches the third variant as well):
^[0-9]{0,4}[A-Z]?$
Without the anchors your regular expression matches because it will just pick a single letter from anywhere within your string.
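A quick way to see the difference is to run both patterns over the test cases from the question. This is a minimal Python sketch; the variable and function names are made up, and it uses re.search for the unanchored original and re.fullmatch for the fixed pattern, which stands in for the ^ and $ anchors.

```python
import re

ORIGINAL = re.compile(r"([0-9]{1,4}[A-Z]?)|([A-Z])|(^$)")  # pattern from the question
FIXED = re.compile(r"[0-9]{0,4}[A-Z]?")                    # anchoring done by fullmatch

def original_accepts(text):
    """Unanchored: succeeds if any part of the text matches."""
    return ORIGINAL.search(text) is not None

def fixed_accepts(text):
    """Anchored: the whole text must match."""
    return FIXED.fullmatch(text) is not None

if __name__ == "__main__":
    for sample in ["1234A", "Z", "", "A1234", "fhfdsahds527523832dvhsfdg"]:
        print(repr(sample), original_accepts(sample), fixed_accepts(sample))
```

The original pattern accepts all five inputs because it finds a matching fragment somewhere inside the invalid ones; the anchored one accepts only the first three.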
Depending on the language, you can also use a negative look ahead.
^[0-9]{0,4}[A-Za-z](?!.*[0-9])
Breakdown:
^[0-9]{0,4} = This looks for a digit, zero to four times, at the beginning of the string
[A-Za-z] = This looks for a single letter (either case)
(?!.*[0-9]) = This only allows the letter if there are no digits anywhere after it.
I haven't quite figured out how to validate against an empty (null) string, but that might be easier done using tools from whatever language you are using. Something along this logic:
if the string doesn't equal $null, then check the regex
Something along those lines, just adjusted for however you would do it in your language.
I used RegEx Skinner to validate the answers.
Edit: Fixed error from comments

Replacing char in a String with Regular Expression

I got a string like this:
PREFIX-('STRING WITH SPACES TO REPLACE')
and I need this:
PREFIX-('STRING_WITH_SPACES_TO_REPLACE')
I'm using Notepad++ for the regex search and replace, but I'm sure any other editor capable of regex replacements can do it too.
I'm using:
PREFIX-\('(.*)(\s)(.*)'\)
for search and
PREFIX-('\1_\3')
for replace
but that replaces only one space from the string.
The regex search feature in Notepad++ is very, very weak. The only way I can see to do this in NPP is to manually select the part of the text you want to work on, then do a standard find/replace with the In selection box checked.
Alternatively, you can run the document through an external script, or you can get a better editor. EditPad Pro has the best regex support I've ever seen in an editor. It's not free, but it's worth paying for. In EPP all I had to do was this:
search: ((?:PREFIX-\('|\G)[^\s']+)\s+
replace: $1_
EDIT: \G matches the position where the previous match ended, or the beginning of the input if there was no previous match. In other words, the first time you apply the regex, \G acts like \A. You can prevent that by adding a negative lookahead, like so:
((?:PREFIX-\('|(?!\A)\G)[^\s']+)\s+
If you want to prevent a match at the very beginning of the text no matter what it starts with, you can move the lookahead outside the group:
(?!\A)((?:PREFIX-\('|\G)[^\s']+)\s+
And, just in case you were wondering, a lookbehind will work just as well as a lookahead:
((?:PREFIX-\('|(?<!\A)\G)[^\s']+)\s+
You have to keep matching from the beginning of the string until you can match no more.
find /(PREFIX-\('[^\s']*)\s([^']*'\))/
replace $1_$2
like: while (s/(PREFIX-\('[^\s']*)\s([^']*'\))/$1_$2/) {}
How about using Replace All about 20 times? Or until you're sure no string contains any more spaces.
Due to the nature of regex, it's not possible to do this in one step with a normal regular expression.
But if I were in your place, I would do the replacement in several steps:
Find such patterns and mark them with a special character (like replacing STRING WITH SPACES TO REPLACE with #STRING WITH SPACES TO REPLACE#).
Replace #([^#\s]*)\s with #\1_ several times.
Remove the markers!
I studied the regex tool in Notepad++ a little because I didn't know its capabilities.
I conclude that it isn't powerful enough to do what you want.
You are obliged to learn and use a programming language with real regex capability. There are a number of them. Personally, I use Python. It would take one minute to do what you want with it.
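As a rough illustration of the Python route, here is a minimal sketch. It assumes the text between the quotes never contains a single quote itself, and the names PATTERN and underscore_spaces are made up for the example.

```python
import re

# Capture the opening PREFIX-(' , the quoted text, and the closing ')
# as three groups. Assumes the quoted text contains no single quote.
PATTERN = re.compile(r"(PREFIX-\(')([^']*)('\))")

def underscore_spaces(text):
    """Replace each whitespace character inside PREFIX-('...') with '_'."""
    def fix(match):
        inner = re.sub(r"\s", "_", match.group(2))
        return match.group(1) + inner + match.group(3)
    return PATTERN.sub(fix, text)

if __name__ == "__main__":
    print(underscore_spaces("PREFIX-('STRING WITH SPACES TO REPLACE')"))
```

The callback handles every space inside one match in a single pass, so there is no need to repeat the replacement, and text outside the PREFIX-('...') part is left alone.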
You'd have to run the replace several times, once for each space, but this regex will work:
/(?<=PREFIX-\(')([^\s]+)\s+/g
Replace with
\1_ or $1_
See it working at http://refiddle.com/10z

Notepad++ Find/Replace Regex Help

I am having issues doing a string replacement in Notepad++, and need some help.
My file:
LastName,(tab)FirstName[optional]MiddleName
Sometimes the data has a middle name, sometimes not.
Public,JohnQ.
Doe,John
Clinton,WilliamJefferson
would be:
Public(tab)John(tab)Q
Doe(tab)John
Clinton(tab)William(tab)Jefferson
I want to split it out into this:
LastName(tab)FirstName(tab)MiddleName
Thanks for adding the sample input. It helps immensely to have that around. Try this and see if it does what you want.
Find, making sure Match case is checked:
([A-Z][a-z]*),([A-Z][a-z]*)(.*)
Replace with:
\1(tab)\2(tab)\3
Of course, (tab) is actually a tab character that you have to place in the replacement string yourself.
An ugly regex like this works for me on the example you've provided:
(\w+),(\w+?)(([A-Z]\w*\.?)?)\n
replace with
\1\t\2\t\3\n
Note:
This only works if the middle name starts with a letter in A-Z. You might be able to replace [A-Z] with [[:upper:]] if Notepad++ supports it (I don't know).
I need that second set of brackets around the middle name part because I need to match at least an empty string when there is no middle name.
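If a script is an option, the same split can be sketched in Python. This is a minimal sketch assuming the first name is one capital letter followed by lowercase letters, that the middle name (if any) starts with a capital, and that a trailing period should be dropped as in the desired output; the names NAME_RE and split_name are made up.

```python
import re

# LastName, optional whitespace after the comma, FirstName, optional
# MiddleName starting with a capital letter, optional trailing period.
NAME_RE = re.compile(r"(\w+),\s*([A-Z][a-z]*)([A-Z]\w*)?\.?")

def split_name(line):
    """Return the name parts joined by tabs, or the line unchanged if it doesn't fit."""
    match = NAME_RE.fullmatch(line.strip())
    if match is None:
        return line
    # Skip the middle-name group when it did not take part in the match.
    return "\t".join(part for part in match.groups() if part)

if __name__ == "__main__":
    for sample in ["Public,JohnQ.", "Doe,John", "Clinton,WilliamJefferson"]:
        print(repr(split_name(sample)))
```

Unlike the find/replace versions, this leaves no trailing tab when there is no middle name.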