Regex to check input is not empty plus it's alphanumeric? - regex

I need a single regex to check if input must not be empty plus the input has alphanumeric characters only.
I know the alphanumeric part,^[\s+0-9a-zA-Z]+$, but I am not sure about the not empty requirement.
I can only use a single expression and I can't use any language function.

Simply use this regex to match a non-empty alphanumeric string:
^[a-zA-Z0-9]+$
Details
^ - string start
[a-zA-Z0-9]+ - one or more letters or digits
$ - string end.

I'm going to assume by Not empty you mean not only white space, otherwise you've got the answer you want. + means one or more.
^[a-zA-Z0-9][a-zA-Z0-9\s]*^
will make sure that the string has something other than white space in it.
Additionally if \s is valid then I assume \w is as well, meaning that this could more easily be said as
^[(?:\w|\s)*$
The ?: in the ( ) makes it a non-capture group. If you don't care about capture then this can be omitted, making it the very terse.
^\w(\w|\s)*$

Related

Regex in js may contain spaces at the start of a character after or at the end but must not have only spaces [duplicate]

I need to write a regular expression for form validation that allows spaces within a string, but doesn't allow only white space.
For example - 'Chicago Heights, IL' would be valid, but if a user just hit the space bar any number of times and hit enter the form would not validate. Preceding the validation, I've tried running an if (foo != null) then run the regex, but hitting the space bar still registers characters, so that wasn't working. Here is what I'm using right now which allows the spaces:
^[-a-zA-Z0-9_:,.' ']{1,100}$
It's very simple: .*\S.*
This requires one non-space character, at any place. The regular expression syntax is for Perl 5 compatible regular expressions, if you have another language, the syntax may differ a bit.
The following will answer your question as written, but see my additional note afterward:
^(?!\s*$)[-a-zA-Z0-9_:,.' ']{1,100}$
Explanation: The (?!\s*$) is a negative lookahead. It means: "The following characters cannot match the subpattern \s*$." When you take the subpattern into account, it means: "The following characters can neither be an empty string, nor a string of whitespace all the way to the end. Therefore, there must be at least one non-whitespace character after this point in the string." Once you have that rule out of the way, you're free to allow spaces in your character class.
Extra note: I don't think your ' ' is doing what you intend. It looks like you were trying to represent a space character, but regex interprets ' as a literal apostrophe. Inside a character class, ' ' would mean "match any character that is either ', a space character, or '" (notice that the second ' character is redundant). I suspect what you want is more like this:
^(?!\s*$)[-a-zA-Z0-9_:,.\s]{1,100}$
You could use simple:
^(?=.*\S).+$
if your regex engine supports positive lookaheads. This expression requires at least one non-space character.
See it on rubular.
If we wanted to apply validations only with allowed character set then I tried with USERNAME_REGEX = /^(?:\s*[.\-_]*[a-zA-Z0-9]{1,}[.\-_]*\s*)$/;
A string can contain any number of spaces at the beginning or ending or in between but will contain at least one alphanumeric character.
Optional ., _ , - characters are also allowed but string must have one alphanumeric character.
Try this regular expression:
^[^\s]+(\s.*)?$
It means one or more characters that are not space, then, optionally, a space followed by anything.
Just use \s* to avoid one or more blank spaces in the regular expression between two words.
For example, "Mozilla/ 4.75" and "Mozilla/4.75" both can be matched by the following regular expression:
[A-Z][a-z]*/\s*[0-9]\.[0-9]{1,2}
Adding \s* matches on zero, one or more blank spaces between two words.

Limiting RegEx to match only a string of 1-254 characters length

This is my RegEx:
"^[^\.]([\w-\!\#\$\%\&\'\*\+\-\/\=\`\{\|\}\~\?\^]+)([\.]{0,1})([\w-\!\#\$\%\&\'\*\+\-\/\=\`\{\|\}\~\?\^]+)[^\.]#((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,6}|[0-9]{1,3})(\]?)$"
I need to match only strings less than 255 characters.
I've tried adding the word boundaries at the start of the RegEx but it fails:
"^(?=.{1,254})[^\.]([\w-\!\#\$\%\&\'\*\+\-\/\=\`\{\|\}\~\?\^]+)([\.]{0,1})([\w-\!\#\$\%\&\'\*\+\-\/\=\`\{\|\}\~\?\^]+)[^\.]#((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,6}|[0-9]{1,3})(\]?)$"
You need the $ in the lookahead to make sure it's only up to 254. Otherwise, the lookahead will match even when there are more than 254.
(?=.{1,254}$)
Also, keep in mind that you can greatly simplify your regex because many characters that would usually need to be escaped do not need to when in a character class (square brackets).
"[\w-\!\#\$\%\&\'\*\+\-\/\=\`\{\|\}\~\?\^]"
is the same as this:
"[-\w!#$%&'*+/=`{|}~?^]"
Note that the dash must be first in the character class to be a literal dash, and the caret must not be first.
With some other simplifications, here is the complete string:
"^(?=.{1,254}$)[-\w!#$%&'*+/=`{|}~?^]+(\.[-\w!#$%&'*+/=`{|}~?^]+)*#((\d{1,3}\.){3}\d{1,3}|([-\w]+\.)+[a-zA-Z]{2,6})$"
Notes:
I removed the stipulation that the first char shouldn't be a period ([^.]) because the next character class doesn't match a period anyway, so it's redundant.
I removed many extraneous parens
I replaced [0-9] with \d
I replaced {0,1} with the shorthand "?"
After the # sign, it seemed that you were trying to match an IP address or text domain name, so I separated them more so it couldn't be a combination
I'm not sure what the optional square bracket at the end was for, so I removed it: "(]?)"
I tried it in Regex Hero, and it works. See if it works for you.
This depends on what language you are working in. In Python for example you can regex to split a text into separate strings, and then use len() to remove strings longer than the 255 characters you want
I think this post will help. It shows how to limit certain patterns but I am not sure how you would add it to the entire regex.

Regular expression to allow spaces between words

I want a regular expression that prevents symbols and only allows letters and numbers. The regex below works great, but it doesn't allow for spaces between words.
^[a-zA-Z0-9_]*$
For example, when using this regular expression "HelloWorld" is fine, but "Hello World" does not match.
How can I tweak it to allow spaces?
tl;dr
Just add a space in your character class.
^[a-zA-Z0-9_ ]*$
Now, if you want to be strict...
The above isn't exactly correct. Due to the fact that * means zero or more, it would match all of the following cases that one would not usually mean to match:
An empty string, "".
A string comprised entirely of spaces, " ".
A string that leads and / or trails with spaces, " Hello World ".
A string that contains multiple spaces in between words, "Hello World".
Originally I didn't think such details were worth going into, as OP was asking such a basic question that it seemed strictness wasn't a concern. Now that the question's gained some popularity however, I want to say...
...use #stema's answer.
Which, in my flavor (without using \w) translates to:
^[a-zA-Z0-9_]+( [a-zA-Z0-9_]+)*$
(Please upvote #stema regardless.)
Some things to note about this (and #stema's) answer:
If you want to allow multiple spaces between words (say, if you'd like to allow accidental double-spaces, or if you're working with copy-pasted text from a PDF), then add a + after the space:
^\w+( +\w+)*$
If you want to allow tabs and newlines (whitespace characters), then replace the space with a \s+:
^\w+(\s+\w+)*$
Here I suggest the + by default because, for example, Windows linebreaks consist of two whitespace characters in sequence, \r\n, so you'll need the + to catch both.
Still not working?
Check what dialect of regular expressions you're using.* In languages like Java you'll have to escape your backslashes, i.e. \\w and \\s. In older or more basic languages and utilities, like sed, \w and \s aren't defined, so write them out with character classes, e.g. [a-zA-Z0-9_] and [\f\n\p\r\t], respectively.
* I know this question is tagged vb.net, but based on 25,000+ views, I'm guessing it's not only those folks who are coming across this question. Currently it's the first hit on google for the search phrase, regular expression space word.
One possibility would be to just add the space into you character class, like acheong87 suggested, this depends on how strict you are on your pattern, because this would also allow a string starting with 5 spaces, or strings consisting only of spaces.
The other possibility is to define a pattern:
I will use \w this is in most regex flavours the same than [a-zA-Z0-9_] (in some it is Unicode based)
^\w+( \w+)*$
This will allow a series of at least one word and the words are divided by spaces.
^ Match the start of the string
\w+ Match a series of at least one word character
( \w+)* is a group that is repeated 0 or more times. In the group it expects a space followed by a series of at least one word character
$ matches the end of the string
This one worked for me
([\w ]+)
Try with:
^(\w+ ?)*$
Explanation:
\w - alias for [a-zA-Z_0-9]
"whitespace"? - allow whitespace after word, set is as optional
I assume you don't want leading/trailing space. This means you have to split the regex into "first character", "stuff in the middle" and "last character":
^[a-zA-Z0-9_][a-zA-Z0-9_ ]*[a-zA-Z0-9_]$
or if you use a perl-like syntax:
^\w[\w ]*\w$
Also: If you intentionally worded your regex that it also allows empty Strings, you have to make the entire thing optional:
^(\w[\w ]*\w)?$
If you want to only allow single space chars, it looks a bit different:
^((\w+ )*\w+)?$
This matches 0..n words followed by a single space, plus one word without space. And makes the entire thing optional to allow empty strings.
This regular expression
^\w+(\s\w+)*$
will only allow a single space between words and no leading or trailing spaces.
Below is the explanation of the regular expression:
^ Assert position at start of the string
\w+ Match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
1st Capturing group (\s\w+)*
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\s Match any white space character [\r\n\t\f ]
\w+ Match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
$ Assert position at end of the string
Just add a space to end of your regex pattern as follows:
[a-zA-Z0-9_ ]
This does not allow space in the beginning. But allowes spaces in between words. Also allows for special characters between words. A good regex for FirstName and LastName fields.
\w+.*$
For alphabets only:
^([a-zA-Z])+(\s)+[a-zA-Z]+$
For alphanumeric value and _:
^(\w)+(\s)+\w+$
If you are using JavaScript then you can use this regex:
/^[a-z0-9_.-\s]+$/i
For example:
/^[a-z0-9_.-\s]+$/i.test("") //false
/^[a-z0-9_.-\s]+$/i.test("helloworld") //true
/^[a-z0-9_.-\s]+$/i.test("hello world") //true
/^[a-z0-9_.-\s]+$/i.test("none alpha: ɹqɯ") //false
The only drawback with this regex is a string comprised entirely of spaces. "       " will also show as true.
It was my regex: #"^(?=.{3,15}$)(?:(?:\p{L}|\p{N})[._()\[\]-]?)*$"
I just added ([\w ]+) at the end of my regex before *
#"^(?=.{3,15}$)(?:(?:\p{L}|\p{N})[._()\[\]-]?)([\w ]+)*$"
Now string is allowed to have spaces.
This regex allow only alphabet and spaces:
^[a-zA-Z ]*$
Try with this one:
result = re.search(r"\w+( )\w+", text)

Regular expression to match non-integer values in a string

I want to match the following rules:
One dash is allowed at the start of a number.
Only values between 0 and 9 should be allowed.
I currently have the following regex pattern, I'm matching the inverse so that I can thrown an exception upon finding a match that doesn't follow the rules:
[^-0-9]
The downside to this pattern is that it works for all cases except a hyphen in the middle of the String will still pass. For example:
"-2304923" is allowed correctly but "9234-342" is also allowed and shouldn't be.
Please let me know what I can do to specify the first character as [^-0-9] and the rest as [^0-9]. Thanks!
This regex will work for you:
^-?\d+$
Explanation: start the string ^, then - but optional (?), the digit \d repeated few times (+), and string must finish here $.
You can do this:
(?:^|\s)(-?\d+)(?:["'\s]|$)
^^^^^ non capturing group for start of line or space
^^^^^ capture number
^^^^^^^^^ non capturing group for end of line, space or quote
See it work
This will capture all strings of numbers in a line with an optional hyphen in front.
-2304923" "9234-342" 1234 -1234
++++++++ captured
^^^^^^^^ NOT captured
++++ captured
+++++ captured
I don't understand how your pattern - [^-0-9] is matching those strings you are talking about. That pattern is just the opposite of what you want. You have simply negated the character class by using caret(^) at the beginning. So, this pattern would match anything except the hyphen and the digits.
Anyways, for your requirement, first you need to match one hyphen at the beginning. So, just keep it outside the character class. And then to match any number of digits later on, you can use [0-9]+ or \d+.
So, your pattern to match the required format should be:
-[0-9]+ // or -\d+
The above regex is used to find the pattern in some large string. If you want the entire string to match this pattern, then you can add anchors at the ends of the regex: -
^-[0-9]+$
For a regular expression like this, it's sometimes helpful to think of it in terms of two cases.
Is the first character messed up somehow?
If not, are any of the other characters messed up somehow?
Combine these with |
(^[^-0-9]|^.+?[^0-9])

Regex how to match an optional character

I have a regex that I thought was working correctly until now. I need to match on an optional character. It may be there or it may not.
Here are two strings. The top string is matched while the lower is not. The absence of a single letter in the lower string is what is making it fail.
I'd like to get the single letter after the starting 5 digits if it's there and if not, continue getting the rest of the string. This letter can be A-Z.
If I remove ([A-Z]{1}) +.*? + from the regex, it will match everything I need except the letter but it's kind of important.
20000 K Q511195DREWBT E00078748521
30000 K601220PLOPOH Z00054878524
Here is the regex I'm using.
/^([0-9]{5})+.*? ([A-Z]{1}) +.*? +([A-Z]{1})([0-9]{3})([0-9]{3})([A-Z]{3})([A-Z]{3}) +([A-Z])[0-9]{3}([0-9]{4})([0-9]{2})([0-9]{2})/
Use
[A-Z]?
to make the letter optional. {1} is redundant. (Of course you could also write [A-Z]{0,1} which would mean the same, but that's what the ? is there for.)
You could improve your regex to
^([0-9]{5})+\s+([A-Z]?)\s+([A-Z])([0-9]{3})([0-9]{3})([A-Z]{3})([A-Z]{3})\s+([A-Z])[0-9]{3}([0-9]{4})([0-9]{2})([0-9]{2})
And, since in most regex dialects, \d is the same as [0-9]:
^(\d{5})+\s+([A-Z]?)\s+([A-Z])(\d{3})(\d{3})([A-Z]{3})([A-Z]{3})\s+([A-Z])\d{3}(\d{4})(\d{2})(\d{2})
But: do you really need 11 separate capturing groups? And if so, why don't you capture the fourth-to-last group of digits?
You can make the single letter optional by adding a ? after it as:
([A-Z]{1}?)
The quantifier {1} is redundant so you can drop it.
You have to mark the single letter as optional too:
([A-Z]{1})? +.*? +
or make the whole part optional
(([A-Z]{1}) +.*? +)?
You also could use simpler regex designed for your case like (.*)\/(([^\?\n\r])*) where $2 match what you want.
here is the regex for password which will require a minimum of 8 characters including a number and lower and upper case letter and optional sepecial charactor
/((?=.\d)(?=.[a-z])(?=.*[A-Z])(?![~##$%^&*_-+=`|{}:;!.?"()[]]).{8,25})/
/((?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?![~##\$%\^&\*_\-\+=`|{}:;!\.\?\"()\[\]]).{8,25})/