Regex - Limiting the amount of characters in a string

This one is giving me trouble on my assignment.
Assume the alphabet of:
-any lowercase or uppercase letter
-0-9 decimal digits
-_
-$
-%
I want to write an expression that gives me strings that:
-start with an uppercase letter or one of the three symbols
-have at most 6 lowercase or uppercase letters
I wanted to try something like
/^[a-z|_|$|%][a-z|A-Z|_|$|%]* {0,3}
but I'm having trouble with keeping track of the "at most" case depending on the initial character
edit: Sorry forgot examples.
_ababab <- OK
ab%$aaaa <- OK
_abababa <- NOT OK, because there are more than 6 alphabetic characters
a$ababab <- NOT OK, because there are more than 6 alphabetic characters

I think you need to add something like (?=.*[A-Z]{,6})(?=.*[a-z]{,6})

I think you want something like this,
^[A-Za-z](?:[^A-Za-z]*[A-Za-z][^A-Za-z]*){5}$|^[_$%](?:[^A-Za-z]*[A-Za-z][^A-Za-z]*){6}$
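Note that the quantifiers {5} and {6} above require exactly that many further letters, while the question asks for "at most" 6. A minimal Python sketch of an "at most" variant follows. It assumes the first character may be a letter of either case (the question's examples start with lowercase letters) and that the only non-letter characters are the digits and the three symbols from the stated alphabet. The names AT_MOST_SIX_LETTERS and is_ok are made up for illustration.

```python
import re

# Assumptions (not confirmed by the question):
#   - the first character is any letter or one of _ $ %  (never a digit)
#   - non-letter characters are limited to 0-9, _, $ and %
# The lookahead checks the first character; the bounded group then
# consumes at most 6 letters, each optionally preceded by non-letters.
AT_MOST_SIX_LETTERS = re.compile(
    r"(?=[A-Za-z_$%])(?:[0-9_$%]*[A-Za-z]){0,6}[0-9_$%]*"
)

def is_ok(text):
    """Return True if the whole string satisfies the pattern."""
    return AT_MOST_SIX_LETTERS.fullmatch(text) is not None

for sample in ["_ababab", "ab%$aaaa", "_abababa", "a$ababab"]:
    print(sample, is_ok(sample))
```

Using fullmatch instead of ^ and $ avoids the Python quirk where $ also matches just before a trailing newline.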

Related

How do I fix this regex to match the strings I want?

I need to find specific user names that meet certain, albeit rather broad, criteria. My knowledge of regex is very limited, so while my regex constructs did match what I wanted, they also matched just about everything else. Can you help me?
The requirements for a valid match are:
string length: exactly 7 characters
contains only alphanumeric characters, mixed upper and lower case, or numbers 0-9
contains at least one number 0-9, can be more than that but never 3 in a row
not all upper case (can be all lower case but never upper case)
Unfortunately, the numbers can be anywhere in the string, and the alphanumeric characters can also be any combination.
Here is an excerpt of the data I need to match:
cgxh21o *
crittaz
Mist246
nOnameR
Gorebag
pu50pce *
rmygy62 *
aeifnz0 *
orp5k1v *
okn5nvr *
The ones marked with * are the ones I want to match. The remaining ones are valid and must not be included.
Is this even possible using regex?
My last attempt was:
/[a-z{0,}A-Z{0,}0-9{1,}]{7}+
but then I found user names that didn't follow that notation at all (more than one number) so it didn't work.
Here's a relatively short and simple regex that will work:
(?=(^.{7}$))(?=.*[a-z])(?=.*\d)(?!.*\d{3})
Explanation:
(?=(^.{7}$)) check that there are exactly 7 characters (and capture them)
(?=.*[a-z]) at least one lower case letter
(?=.*\d) at least one digit
(?!.*\d{3}) there are no 3 digits in a row anywhere
Here's a Python demo:
import re
pattern = re.compile(r"(?=(^.{7}$))(?=.*[a-z])(?=.*\d)(?!.*\d{3})")
ls = ["cgxh21o", "crittaz", "Mist246", "nOnameR", "Gorebag",
      "pu50pce", "rmygy62", "aeifnz0", "orp5k1v", "okn5nvr",
      "OKN5NVR", "short1", "aeifnz0aaaaaa", "12CeE12"]
for elem in ls:
    print(elem, bool(re.search(pattern, elem)))
Output:
cgxh21o True
crittaz False
Mist246 False
nOnameR False
Gorebag False
pu50pce True
rmygy62 True
aeifnz0 True
orp5k1v True
okn5nvr True
OKN5NVR False
short1 False
aeifnz0aaaaaa False
12CeE12 True
You could use lookahead assertions:
^(?=[^a-z\s]*[a-z])(?=[^\d\s]*\d)(?!.*\d{3})[a-zA-Z0-9]{7}$
Explanation
^ Start of string
(?=[^a-z\s]*[a-z]) Assert a lowercase char a-z
(?=[^\d\s]*\d) Assert a digit
(?!.*\d{3}) Assert not 3 digits in a row
[a-zA-Z0-9]{7} Match 7 times any of the listed
$ End of string
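As a quick check, the lookahead pattern above can be run against the sample data from the question. This is only a sketch: the names USERNAME and wanted are made up for illustration, and the ^ and $ anchors are replaced by fullmatch, which should be equivalent for single-line input.

```python
import re

# Same lookaheads as the pattern above; fullmatch stands in for ^ ... $.
USERNAME = re.compile(
    r"(?=[^a-z\s]*[a-z])"   # at least one lowercase letter
    r"(?=[^\d\s]*\d)"       # at least one digit
    r"(?!.*\d{3})"          # never three digits in a row
    r"[a-zA-Z0-9]{7}"       # exactly seven alphanumeric characters
)

def wanted(name):
    """Return True if the user name meets all four requirements."""
    return USERNAME.fullmatch(name) is not None

names = ["cgxh21o", "crittaz", "Mist246", "nOnameR", "Gorebag",
         "pu50pce", "rmygy62", "aeifnz0", "orp5k1v", "okn5nvr"]
matches = [name for name in names if wanted(name)]
print(matches)
```

Unlike a pattern built only from lookaheads over ".", the final character class also rejects non-alphanumeric input such as "ab_de1f".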
Here's a possibility based on your updated question:
^(?![a-zA-Z]{7}|.*[0-9]{3}.*|[A-Z0-9]{7})([a-zA-Z0-9]){7}$
Doesn't match
crittaz
nOnameR
Gorebag
ABCDEFG
aBCDEFG
1234567
Mist246
ABCDEF7
Matches
cgxh21o
pu50pce
rmygy62
aeifnz0
orp5k1v
okn5nvr
There are probably a number of ways to do this. This one looks for 7 alphanumeric characters, but not if they are all alphabetic, not if they are only uppercase letters and digits, and not if there are 3 digits in a row...

newbie with regex, parsing one and only one character

I have to validate strings like:
10y9m12od or 9m12od or 12d or 10y9m or 9m
those are correct.
These are not correct:
10d2m5y, 2m5y10d...
As you can see, order of elements is important but elements are not mandatory...
I have this regex, which I think is fine, but...:
([\d][yY]{1})?([\d][mM]{1})?([\d][o]{0,1}(d|D){1})$
Can anybody help me?
^(\d+[yY])?(\d{1,2}[mM])?(\d{1,2}o?[dD])?$
You need to allow more than one digit before each letter. Years can be any number of digits, months and days can be 1 or 2 digits.
There's no need to wrap \d and o in [].
You need a ^ anchor at the beginning.
There's no need for {1} to match a single repetition; that's the default for all patterns.
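A small Python sketch of the corrected pattern, checked against the examples from the question. Because all three groups are optional, the pattern on its own also accepts an empty string; rejecting empty input here is an assumption, not something the question states. The names DURATION and is_valid_duration are made up for illustration.

```python
import re

# Years: any number of digits; months and days: 1 or 2 digits.
# The optional "o" before d/D follows the question's "12od" examples.
DURATION = re.compile(r"(\d+[yY])?(\d{1,2}[mM])?(\d{1,2}o?[dD])?")

def is_valid_duration(text):
    """True if text is a non-empty y/m/d sequence in that order."""
    # Assumption: an empty string should not count as valid.
    return bool(text) and DURATION.fullmatch(text) is not None

for sample in ["10y9m12od", "9m12od", "12d", "10y9m", "9m",
               "10d2m5y", "2m5y10d"]:
    print(sample, is_valid_duration(sample))
```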

Regex to match digits with decimals that have spaces as well as no spaces

I am looking for a regex string to match a set of numbers:
9.50 (numbers without spaces that have 2 to 4 decimal places)
1 9 . 5 0 (numbers with spaces that have 2 to 4 decimal places)
10 (numbers without spaces and without a decimal point)
So far I have come up with the regex string [0-9\s\.]+, but this is not doing what I want. Any cleaner solutions out there?
Many Thanks
Try this:
[\d\s]+(?:\.(?:\s*\d){2,4})?
This makes the decimal point and the digits/spaces after it optional. If there are digits after, it checks that there are 2-4 of them with {2,4}
If this should only match the whole string, you can anchor it.
^[\d\s]+(?:\.(?:\s*\d){2,4})?\s*$
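A short Python sketch of the anchored version, using fullmatch in place of ^ and $. The names NUMBER and is_number are made up for illustration, and the rejected samples are extra cases beyond those in the question.

```python
import re

# Digits and spaces, optionally followed by a decimal point and
# 2 to 4 digits (each digit may be preceded by spaces).
NUMBER = re.compile(r"[\d\s]+(?:\.(?:\s*\d){2,4})?\s*")

def is_number(text):
    """True if the whole string is a number in one of the accepted forms."""
    return NUMBER.fullmatch(text) is not None

for sample in ["9.50", "1 9 . 5 0", "10", "9.5", "9.50000", "127.0.0.1"]:
    print(repr(sample), is_number(sample))
```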
The problem with your regex is that it will match 127.0.0.1 as well, which is an IPv4 address, not a number.
The following regex should do the trick:
[0-9]+[0-9\s]*(\.(\s*[0-9]){2,4})?
Assumption I've made: there must be at least one digit before the decimal point.
(\d+[\d\s]*\.((\s*\d){2,4})?|\d+)
I was still getting "trailing spaces" selected with the third example of 10
This eliminated them.
Wouldn't this work as well: '[^. 0-9]'?
My full PostgreSQL query looks like this:
split_part(regexp_replace(columnyoudoregexon , '[^. 0-9]', '', 'g'), ' ', 1)
It does the following:
Everything in the column values except digits, spaces and the point (for decimals) is replaced with an empty string.
The resulting string is then split on spaces with split_part(), and you choose which element of the result you want.
I was stuck on this for a while. I hope it helps.

How Can I Create a RegEx Pattern that will Get N Words Using Custom Word Boundary?

I need a RegEx pattern that will return the first N words using a custom word boundary that is the normal RegEx white space (\s) plus punctuation like .,;:!?-*_
EDIT #1: Thanks for all your comments.
To be clear:
I'd like to set the characters that would be the word delimiters
Lets call this the "Delimiter Set", or strDelimiters
strDelimiters = ".,;:!?-*_"
nNumWordsToFind = 5
A word is defined as any contiguous text that does NOT contain any character in strDelimiters
The RegEx word boundary is any contiguous text that contains one or more of the characters in strDelimiters
I'd like to build the RegEx pattern to get/return the first nNumWordsToFind using the strDelimiters.
EDIT #2: Sat, Aug 8, 2015 at 12:49 AM US CT
@maraca definitely answered my question as originally stated.
But what I actually need is to return the number of words ≤ nNumWordsToFind.
So if the source text has only 3 words, but my RegEx asks for 4 words, I need it to return the 3 words. The answer provided by maraca fails if nNumWordsToFind > number of actual words in the source text.
For example:
one,two;three-four_five.six:seven eight nine! ten
It would see this as 10 words.
If I want the first 5 words, it would return:
one,two;three-four_five.
I have this pattern using the normal \s whitespace, which works, but NOT exactly what I need:
([\w]+\s+){<NumWordsOut>}
where <NumWordsOut> is the number of words to return.
I have also found this word boundary pattern, but I don't know how to use it:
a "real word boundary" that detects the edge between an ASCII letter
and a non-letter.
(?i)(?<=^|[^a-z])(?=[a-z])|(?<=[a-z])(?=$|[^a-z])
However, I would want my words to allow numbers as well.
IAC, I have not been able to figure out how to use the above custom word boundary pattern to return the first N words of my text.
BTW, I will be using this in a Keyboard Maestro macro.
Can anyone help?
TIA.
All you have to do is adapt your pattern ([\w]+\s+){<NumWordsOut>} to the following, which also covers some special cases:
^[\s.,;:!?*_-]*([^\s.,;:!?*_-]+([\s.,;:!?*_-]+|$)){<NumWordsOut>}
1. ^[\s.,;:!?*_-]* Match any amount of delimiters before the first word
2. [^\s.,;:!?*_-]+ Match a word (= at least one non-delimiter)
3. [\s.,;:!?*_-]+ The word has to be followed by at least one delimiter
4. |$ Or it can be at the end of the string (in case no delimiter follows at the end)
5. {<NumWordsOut>} Repeat 2. to 4. <NumWordsOut> times
Note how I changed the order of the -, it has to be at the start or end, otherwise it needs to be escaped: \-.
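EDIT #2 in the question asks for up to N words, so that a text with fewer words still produces a match. One possible way to get that, sketched here in Python, is to turn the exact count into a range {1,N}. The function name first_n_words and its parameters are made up for illustration; Keyboard Maestro's own regex engine is not used here, so treat this only as a demonstration of the pattern.

```python
import re

def first_n_words(text, n, delimiters=".,;:!?*_-"):
    """Return the leading part of text holding at most n words.

    A word is a run of characters that are neither whitespace nor
    in `delimiters` (hypothetical helper, not from the original answer).
    """
    # re.escape makes every delimiter safe inside a character class,
    # including "-", which would otherwise form a range.
    delim_class = r"\s" + re.escape(delimiters)
    pattern = re.compile(
        rf"^[{delim_class}]*"                                   # leading delimiters
        rf"(?:[^{delim_class}]+(?:[{delim_class}]+|$)){{1,{n}}}"  # 1..n words
    )
    found = pattern.match(text)
    return found.group(0) if found else ""

sample = "one,two;three-four_five.six:seven eight nine! ten"
print(first_n_words(sample, 5))
print(first_n_words("one two three", 4))
```

With the sample text and n = 5 this returns the same prefix the question expects, and with only three words available it returns all three instead of failing.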
Thanks to @maraca for providing the complete answer to my question.
I just wanted to post the Keyboard Maestro macro that I have built using @maraca's RegEx pattern for anyone interested in the complete solution.
See KM Forum Macro: Get a Max of N Words in String Using RegEx

Regex to check for at least 3 characters?

I have this regex to allow for only alphanumeric characters.
How can I check that the string also contains at least 3 alphabet characters?
My current regex,
if(!/^[a-zA-Z0-9]+$/.test(val))
I want to make sure there are at least 3 consecutive alphabet characters as well, so:
111 // false
aaa1 // true
11a // false
bbc // true
1a1aa // false
+ means "1 or more occurrences."
{3} means "3 occurrences."
{3,} means "3 or more occurrences."
+ can also be written as {1,}.
* can also be written as {0,}.
To enforce three alphabet characters anywhere,
/(.*[a-z]){3}/i
should be sufficient.
Edit. Ah, you've edited your question to say the three alphabet characters must be consecutive. I also see that you may want to enforce that all characters should match one of your "accepted" characters. Then, a lookahead may be the cleanest solution:
/^(?=.*[a-z]{3})[a-z0-9]+$/i
Note that I am using the case-insensitive modifier /i in order to avoid having to write a-zA-Z.
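For readers who want to try the lookahead approach outside the browser, here is a rough Python port checked against the examples from the question. It is a sketch only: the names THREE_CONSECUTIVE and is_valid are made up, re.IGNORECASE stands in for the /i modifier, and fullmatch stands in for ^ and $.

```python
import re

# Lookahead: three letters in a row somewhere in the string.
# Body: the whole string consists of letters and digits only.
THREE_CONSECUTIVE = re.compile(r"(?=.*[a-z]{3})[a-z0-9]+", re.IGNORECASE)

def is_valid(value):
    """True if value is alphanumeric and has 3 consecutive letters."""
    return THREE_CONSECUTIVE.fullmatch(value) is not None

for sample in ["111", "aaa1", "11a", "bbc", "1a1aa"]:
    print(sample, is_valid(sample))
```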
Alternative. You can read more about lookaround assertions here. But it may be a little bit over your head at this stage. Here's an alternative that you may find easier to break down in terms of what you already know:
/^([a-z0-9]*[a-z]){3}[a-z0-9]*$/i
This should do the work:
^([0-9]*[a-zA-Z]){3,}[0-9]*$
It checks for at least 3 "Zero-or-more numerics + 1 Alpha" sequences + Zero-or-more numerics.
You want to match zero or more digits then 3 consecutive letters then any other number of digits?
/\d*(?:[a-zA-Z]){3,}\d*/
This is vanilla JS you guys can use. My problem is solved using this.
const str = "abcdggfhf";
const pattern = "fhf";
if (pattern.length > 2) {
  console.log(str.search(pattern));
}