Regular Expression: Filenames - regex

Extremely new to this and have been trying to figure this out on my own, but no luck.
It seems simple. I have files that are named either starting with L or P, followed by 6 numbers. I need to have 2 expressions, one that only reads files starting with L and one that only reads files starting with P.
I have tried using derivatives of ^[K-M], ^\L.*
No luck so far. Hoping someone can offer a suggestion.
Thanks for your time!

Try ^P\d{6} and ^L\d{6}. The ^ says start at the beginning of the string. The \d{6} matches 6 digits.
If at some point you wanted to match both in one go, you could do ^[LP]\d{6}. The [LP] says match one of L or P.
If the above doesn't work, you might be working with a more limited regex implementation. You could try ^P\d\d\d\d\d\d and ^L\d\d\d\d\d\d to get the same results.
If that doesn't work, you could try ^P[0-9][0-9][0-9][0-9][0-9][0-9] and ^L[0-9][0-9][0-9][0-9][0-9][0-9] which should work on all regex implementations. The \d is just shorthand for [0-9] anyway.
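For what it's worth, here is a minimal Python sketch of the two patterns from this answer. The filenames are made up for the example, and Python is only an assumption about the environment, since the question doesn't say which tool is being used.

```python
import re

# Hypothetical filenames, invented for this example.
filenames = ["L123456.txt", "P654321.txt", "K000001.txt", "L12.txt"]

starts_with_l = re.compile(r"^L\d{6}")  # L followed by six digits
starts_with_p = re.compile(r"^P\d{6}")  # P followed by six digits

# re.match already starts at the beginning of the string; the ^ is kept
# only so the patterns look the same as in the answer.
l_files = [name for name in filenames if starts_with_l.match(name)]
p_files = [name for name in filenames if starts_with_p.match(name)]

print(l_files)
print(p_files)
```

Note that neither pattern is anchored at the end, so a name with more than six digits after the letter would also be picked up.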

Seth's answer is correct.
If it doesn't matter what comes after the 'P' or 'L' you could also just use ^P and ^L.
In the future, you should try testing how regexes match your input strings using a regex tester such as RegexPal or Regular Expression Editor.

Related

Can not find specific regular expression

I can not find a regular expression that matches what I'm looking for.
I would like a regular expression that matches 15 consecutive characters (except space, exclamation point, comma, period). So far the expression is [^!\?.\s!,]{20}. But I do not want that match if in these 15 characters, 10 are identical.
So match with "jqshjsdfhjsdlfdjqlsmskjm" but not with "thaaaaaaaaaaaaaaank"
thank you
You can achieve something close to that: (([^!\?.\s!,])(?!\1)){15}. See the solution working here.
This solution, however, has a drawback: it fails when it finds patterns like 131 or bab. If the solution works for you even with this drawback, then good. If not, then this is as far as regex goes. You'll have to work out that logic programmatically.
Disclaimer: I'm out of time right now and will edit my answer later to include an explanation of the regex and the reason why it has this drawback. Although someone else could edit this answer and do it for me : ) .
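One way the "work it out programmatically" route could look is sketched below in Python. This is not the regex from the answer: it is a swapped-in approach that uses a regex only to find runs of allowed characters and then counts characters in each 15-character window. The function name, the window length of 15, and the limit of 9 repeats are assumptions taken from the wording of the question.

```python
import re
from collections import Counter

# Runs of at least 15 characters that are not whitespace or ! ? . ,
CANDIDATE_RUN = re.compile(r"[^!?.,\s]{15,}")


def has_varied_window(text, length=15, max_repeats=9):
    """Return True if some window of `length` allowed characters has no
    single character occurring more than `max_repeats` times."""
    for run in CANDIDATE_RUN.findall(text):
        # Slide a window of `length` characters across the run.
        for start in range(len(run) - length + 1):
            window = run[start:start + length]
            if max(Counter(window).values()) <= max_repeats:
                return True
    return False


print(has_varied_window("jqshjsdfhjsdlfdjqlsmskjm"))
print(has_varied_window("thaaaaaaaaaaaaaaank"))
```

Unlike the lookahead pattern, this counts repeats anywhere in the window, so it does not depend on the identical characters being adjacent.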

Regular expression everything after a link

I am a complete regular expression idiot, just keep that in mind :)
I am trying to create a regular expression that will match link:xxxxxx where everything after link: is a wildcard.
Can I just do link:* or am I totally misguided?
link:.* should work correctly.
. matches any character (in most flavors, anything except a newline by default), and you want to repeat it "0 to unlimited" times, so you add *.
If you're new to regex, a good way to learn it is by using regex101.
For your problem, you can check out this regex101 example
(Note that I have also added the g modifier, which means that you want to select all matches, not just the first matching line)
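To see the difference between the two candidates, here is a small Python sketch. The sample text is made up, and Python's re module is just one engine among many, so treat this as an illustration rather than a statement about your particular tool.

```python
import re

text = "see link:xxxxxx here"

# "link:*" means "link" followed by zero or more ":" characters.
colon_only = re.search(r"link:*", text).group()

# "link:.*" means "link:" followed by anything up to the end of the line.
whole_link = re.search(r"link:.*", text).group()

# re.findall plays roughly the role of the g modifier: it returns every match.
all_links = re.findall(r"link:.*", "link:a\nlink:b")

print(colon_only)
print(whole_link)
print(all_links)
```

So link:* is a valid pattern, but the * applies to the colon rather than acting as a wildcard.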

Going from regex to word vba (.Find)

I have this regex
<#([^\s]+).*?>\s?<a href=""(.*?)"".*?>(.*?)</a>(\s?\((Pending|Prepared)\))?
And I really need it in a VBA version for Word's .Find method (I don't need the matching groups). Here is what I have so far:
\<\#*\>*\<a href=*\>*\<\/a\>
But I can't get the last part to work. Here I'm talking about
(\s?\((Pending|Prepared)\))?
I really hope someone can help me, as regex in this case is not an option (although I know I can use regex in VBA!)
Cheers
I don't see an OR | in the documentation (Wildcard character reference) or the examples (Putting regular expressions to work in Word), so instead I suggest splitting it into two separate searches. The Word MVPs site has a good reference on the Word Regex as well if you want more information.
[^\s] can be written in Word's wildcard syntax as [! ] (note the space), and + becomes @. It appears that neither the {n,} nor the {n,m} syntax in Word supports an n value of 0, which makes ? and * hard to implement. One option that the MS guys seem to use is *, which in Word means "Any string of characters". By my testing, * is lazy, meaning the pattern \<#*\> run against the string <#sometag> asdfsadfasdf > will only match <#sometag>. In addition, it can match 0 characters; for example, \<\#*\> will match <#>.
So assuming that the first part is working as you expect, you could try the following two patterns:
\<\#*\>*\<a href=*\>*\<\/a\>*\(Pending\)
and
\<\#*\>*\<a href=*\>*\<\/a\>*\(Prepared\)
The trouble here is that the * will match up until it hits the P of Pending or Prepared, so there could be other text in between, but it's the only way I can see of matching an optional space. If you can guarantee that the space will or will not be there, that would go a long way towards making the pattern safer.
Give that a try and see if it works for you!

What is wrong with my simple regex that accepts empty strings and apartment numbers?

So I wanted to limit a textbox which contains an apartment number which is optional.
Here is the regex in question:
([0-9]{1,4}[A-Z]?)|([A-Z])|(^$)
Simple enough eh?
I'm using these tools to test my regex:
Regex Analyzer
Regex Validator
Here are the expected results:
Valid
"1234A"
"Z"
"(Empty string)"
Invalid
"A1234"
"fhfdsahds527523832dvhsfdg"
Obviously, if I'm here, the invalid ones are accepted by the regex. The goal of this regex is to accept either 1 to 4 digits with an optional letter, or a single letter, or an empty string.
I just can't seem to figure out what's not working, I mean it is a simple enough regex we have here. I'm probably missing something as I'm not very good with regexes, but this syntax seems ok to my eyes. Hopefully someone here can point to my error.
Thanks for all help, it is greatly appreciated.
You need to use the ^ and $ anchors for your first two options as well. You can also fold the second option into the first one (which then covers the third variant, the empty string, as well):
^[0-9]{0,4}[A-Z]?$
Without the anchors, your regular expression matches because it only needs to find a run of digits or a single uppercase letter anywhere within your string.
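A quick way to check this is a small Python sketch that runs the original pattern and the anchored one against the test strings from the question. Python is an assumption here; the question doesn't name a language, and engines can differ in small details.

```python
import re

# The pattern from the question (no anchors on the first two alternatives).
unanchored = re.compile(r"([0-9]{1,4}[A-Z]?)|([A-Z])|(^$)")
# The anchored pattern from this answer.
anchored = re.compile(r"^[0-9]{0,4}[A-Z]?$")

samples = ["1234A", "Z", "", "A1234", "fhfdsahds527523832dvhsfdg"]

for s in samples:
    # search() looks for a match anywhere in the string.
    print(repr(s), bool(unanchored.search(s)), bool(anchored.search(s)))
```

In the unanchored case, "A1234" is accepted because the lone A satisfies the second alternative, and the long string is accepted because of the digits in the middle.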
Depending on the language, you can also use a negative look ahead.
^[0-9]{0,4}[A-Za-z](?!.*[0-9])
Breakdown:
^[0-9]{0,4} = This looks for a digit, 0 to 4 times, at the beginning of the string
[A-Za-z] = This looks for a single letter (either case)
(?!.*[0-9]) = This only allows the letter if there are no digits anywhere after it.
I haven't quite figured out how to validate against an empty (null) value, but that might be easier done using tools from whatever language you are using. Something along this logic:
if String doesn't equal $null then check the regex
Something along those lines, just adjusted for however you would do it in your language.
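In Python, that logic might look roughly like the sketch below. The function name is hypothetical, and treating the empty string as valid before consulting the regex is my reading of the pseudocode above, since the apartment number is optional.

```python
import re

# The lookahead pattern from this answer.
PATTERN = re.compile(r"^[0-9]{0,4}[A-Za-z](?!.*[0-9])")


def is_valid_apartment(value):
    """Hypothetical helper: empty input is allowed, anything else must match."""
    if value == "":
        return True  # optional field, nothing to check
    return PATTERN.search(value) is not None


for s in ["1234A", "Z", "", "A1234", "fhfdsahds527523832dvhsfdg"]:
    print(repr(s), is_valid_apartment(s))
```

Two things to be aware of with this pattern: the letter is mandatory, so a digits-only value such as 1234 is rejected, and there is no $ at the end, so extra letters after the first one are not ruled out.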
I used RegEx Skinner to validate the answers.
Edit: Fixed error from comments

Floating Point - Regular expression

I am struggling to understand this simple regular expression. I have the following attempt:
[0-9]*\.?[0-9]*
I understand this as zero or more digits, followed by zero or one period, and finally zero or more digits.
I do not want to match anything other than exactly the above. I do not want positive/negative support or any other special support types. However, for some reason, the above also matches what appear to be random characters. All of the following match for whatever reason:
f32
32a
32-
=33
In an answer, I am looking for:
An explanation of why my regular expression does not work.
A working version with an explanation of why it does work.
Edit: Due to what seems to be causing trouble, I have added the "Qt" tag, as that is the environment I am working with.
Edit: Due to continued confusion, I am going to add a bit of code. I am starting to think I am either misusing Qt, or Qt has a problem:
void subclassedQDialog::setupTxtFilters()
{
    QRegExp numbers("^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$");
    txtToFilter->setValidator(new QRegExpValidator(numbers,this));
}
This is from within a subclassed QDialog. txtToFilter is a QLineEdit. I can provide more code if someone can suggest what may be relevant. While the expression above is not the original, it is one of the ones from comments below and also fails in the same way.
Your problem is that you haven't escaped the \ properly: inside a C++ string literal you need to write \\. Otherwise the C++ compiler will strip out the \ (at least gcc does this, with a warning) and the regex engine will treat the . as any character.
Put ^ at the start and $ at the end. This anchors your regex to the start and end of the string.
Your expression finds a match in the middle of the string. If you add anchors to the beginning and the end of your expression, the strings from your list will no longer match. Your expression would still match empty strings, but that's the price you pay for being able to match both the .99 and 99. forms.
^[0-9]*\.?[0-9]*$
A better choice would be
^[0-9]*(\.[0-9]+)?$
because it would match the decimal point only if at least one digit is present after it.
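The effect of the anchors can be checked with a short Python sketch. This uses Python's re module rather than QRegExp, so it only illustrates the patterns themselves, not Qt's validator behaviour; the sample list combines the strings from the question with a few extra cases I added.

```python
import re

unanchored = re.compile(r"[0-9]*\.?[0-9]*")      # pattern from the question
anchored = re.compile(r"^[0-9]*\.?[0-9]*$")      # same pattern with anchors
stricter = re.compile(r"^[0-9]*(\.[0-9]+)?$")    # dot requires a digit after it

samples = ["f32", "32a", "32-", "=33", "32", ".99", "99.", ""]

for s in samples:
    print(
        repr(s),
        bool(unanchored.search(s)),  # an empty match still counts as a match
        bool(anchored.search(s)),
        bool(stricter.search(s)),
    )
```

Every sample is accepted by the unanchored pattern because all of its parts are optional, so it can match the empty string at any position. The two anchored patterns differ only on 99., which the stricter one rejects.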
One of them needs to be a + instead of *. Do you want to allow ".9" to be valid, or will you require the leading 0?