Regular expression to allow spaces between words - regex

I want a regular expression that prevents symbols and only allows letters and numbers. The regex below works great, but it doesn't allow for spaces between words.
^[a-zA-Z0-9_]*$
For example, when using this regular expression "HelloWorld" is fine, but "Hello World" does not match.
How can I tweak it to allow spaces?

tl;dr
Just add a space in your character class.
^[a-zA-Z0-9_ ]*$
Now, if you want to be strict...
The above isn't exactly correct. Because * means zero or more, it would also match all of the following cases that one would not usually mean to match:
An empty string, "".
A string comprised entirely of spaces, "   ".
A string that leads and / or trails with spaces, "  Hello World  ".
A string that contains multiple spaces in between words, "Hello   World".
Originally I didn't think such details were worth going into, as OP was asking such a basic question that it seemed strictness wasn't a concern. Now that the question's gained some popularity however, I want to say...
...use @stema's answer.
Which, in my flavor (without using \w) translates to:
^[a-zA-Z0-9_]+( [a-zA-Z0-9_]+)*$
(Please upvote @stema regardless.)
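To make the difference concrete, here is a minimal Python sketch comparing the tl;dr pattern with the strict one. Python is only an illustration choice here (the question is tagged vb.net), and the names LOOSE, STRICT and samples are made up for this example. re.fullmatch stands in for the ^...$ anchors.

```python
import re

# Loose: the space is just another allowed character, and * permits "".
LOOSE = re.compile(r"[a-zA-Z0-9_ ]*")
# Strict: one or more words, each pair separated by exactly one space.
STRICT = re.compile(r"[a-zA-Z0-9_]+( [a-zA-Z0-9_]+)*")

# Hypothetical sample inputs covering the edge cases listed above.
samples = ["HelloWorld", "Hello World", "", "   ", " Hello World ", "Hello   World"]

for s in samples:
    # fullmatch requires the whole string to match, like ^...$
    print(repr(s), bool(LOOSE.fullmatch(s)), bool(STRICT.fullmatch(s)))
```

The loose pattern accepts every sample, while the strict one accepts only the first two.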
Some things to note about this (and @stema's) answer:
If you want to allow multiple spaces between words (say, if you'd like to allow accidental double-spaces, or if you're working with copy-pasted text from a PDF), then add a + after the space:
^\w+( +\w+)*$
If you want to allow tabs and newlines (whitespace characters), then replace the space with a \s+:
^\w+(\s+\w+)*$
Here I suggest the + by default because, for example, Windows linebreaks consist of two whitespace characters in sequence, \r\n, so you'll need the + to catch both.
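A small Python sketch of the \r\n point, assuming Python's re module as the engine; SINGLE, MULTI and windows_text are names invented for this illustration.

```python
import re

# One whitespace character between words vs. one or more.
SINGLE = re.compile(r"\w+(\s\w+)*")
MULTI = re.compile(r"\w+(\s+\w+)*")

# A Windows line break is two whitespace characters in a row.
windows_text = "Hello\r\nWorld"

single_ok = bool(SINGLE.fullmatch(windows_text))  # \s consumes \r, then \w+ hits \n
multi_ok = bool(MULTI.fullmatch(windows_text))    # \s+ consumes both \r and \n

print(single_ok, multi_ok)
```

Only the \s+ version accepts the Windows line break.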
Still not working?
Check what dialect of regular expressions you're using.* In languages like Java you'll have to escape your backslashes, i.e. \\w and \\s. In older or more basic languages and utilities, like sed, \w and \s may not be defined, so write them out with character classes, e.g. [a-zA-Z0-9_] and [ \f\n\r\t\v], respectively.
* I know this question is tagged vb.net, but based on 25,000+ views, I'm guessing it's not only those folks who are coming across this question. Currently it's the first hit on google for the search phrase, regular expression space word.

One possibility would be to just add the space to your character class, as acheong87 suggested. Whether that is enough depends on how strict your pattern needs to be, because it would also allow a string starting with 5 spaces, or strings consisting only of spaces.
The other possibility is to define a pattern:
I will use \w, which in most regex flavours is the same as [a-zA-Z0-9_] (in some it is Unicode based).
^\w+( \w+)*$
This will allow a series of at least one word, with the words separated by single spaces.
^ Match the start of the string
\w+ Match a series of at least one word character
( \w+)* is a group that is repeated 0 or more times. In the group it expects a space followed by a series of at least one word character
$ matches the end of the string
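As an illustration of the "in some it is Unicode based" remark, here is a sketch in Python 3, where \w is Unicode-aware for str patterns unless re.ASCII is passed. The sample string and variable names are assumptions made for this example.

```python
import re

PATTERN = r"\w+( \w+)*"

# "naïve café", written with escapes so the accented letters are unambiguous.
text = "na\u00efve caf\u00e9"

unicode_ok = bool(re.fullmatch(PATTERN, text))          # \w includes ï and é
ascii_ok = bool(re.fullmatch(PATTERN, text, re.ASCII))  # \w is [a-zA-Z0-9_] only

print(unicode_ok, ascii_ok)
```

If only ASCII letters should pass, either use the flag or spell out [a-zA-Z0-9_] explicitly.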

This one worked for me
([\w ]+)

Try with:
^(\w+ ?)*$
Explanation:
\w - alias for [a-zA-Z_0-9]
" ?" - allows a space after each word and makes it optional (note that this also permits a trailing space and the empty string)

I assume you don't want leading/trailing space. This means you have to split the regex into "first character", "stuff in the middle" and "last character":
^[a-zA-Z0-9_][a-zA-Z0-9_ ]*[a-zA-Z0-9_]$
or if you use a perl-like syntax:
^\w[\w ]*\w$
Also: If you intentionally worded your regex that it also allows empty Strings, you have to make the entire thing optional:
^(\w[\w ]*\w)?$
If you want to only allow single space chars, it looks a bit different:
^((\w+ )*\w+)?$
This matches 0..n words followed by a single space, plus one word without space. And makes the entire thing optional to allow empty strings.
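A quick Python sketch of how the two shapes differ, using invented names EDGE and WORDS. One thing worth noticing is that the first/middle/last form needs at least two characters, so a single-character input is rejected.

```python
import re

# First character, stuff in the middle, last character.
EDGE = re.compile(r"\w[\w ]*\w")
# Zero or more "word + single space", then a final word.
WORDS = re.compile(r"(\w+ )*\w+")

for s in ["a", "ab", "a b", "a  b", " a b", "a b "]:
    print(repr(s), bool(EDGE.fullmatch(s)), bool(WORDS.fullmatch(s)))
```

EDGE accepts "a  b" (double space in the middle) but not "a", while WORDS accepts "a" but not "a  b". Both reject leading and trailing spaces.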

This regular expression
^\w+(\s\w+)*$
will only allow a single whitespace character (space, tab or line break) between words and no leading or trailing spaces.
Below is the explanation of the regular expression:
^ Assert position at start of the string
\w+ Match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
1st Capturing group (\s\w+)*
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\s Match any white space character [\r\n\t\f ]
\w+ Match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
$ Assert position at end of the string

Just add a space to the end of your character class, as follows:
[a-zA-Z0-9_ ]

This does not allow a space at the beginning (provided the match is anchored at the start of the string), but allows spaces between words. It also allows special characters between words. A good regex for FirstName and LastName fields.
\w+.*$

For letters only (exactly two words separated by whitespace):
^([a-zA-Z])+(\s)+[a-zA-Z]+$
For alphanumeric value and _:
^(\w)+(\s)+\w+$

If you are using JavaScript then you can use this regex:
/^[a-z0-9_.-\s]+$/i
For example:
/^[a-z0-9_.-\s]+$/i.test("") //false
/^[a-z0-9_.-\s]+$/i.test("helloworld") //true
/^[a-z0-9_.-\s]+$/i.test("hello world") //true
/^[a-z0-9_.-\s]+$/i.test("none alpha: ɹqɯ") //false
The only drawback with this regex is a string comprised entirely of spaces. "       " will also show as true.

This was my regex: @"^(?=.{3,15}$)(?:(?:\p{L}|\p{N})[._()\[\]-]?)*$"
I just added ([\w ]+) near the end of my regex, before the *:
@"^(?=.{3,15}$)(?:(?:\p{L}|\p{N})[._()\[\]-]?)([\w ]+)*$"
Now the string is allowed to have spaces.

This regex allows only letters and spaces:
^[a-zA-Z ]*$

Try with this one:
import re  # text is assumed to be the input string defined elsewhere
result = re.search(r"\w+( )\w+", text)

Related

Regex for 2-30 characters plus any alphanumeric characters separated by a single hyphen

I'm trying to come up with a regex for domain names that are 2-30 characters long, made of alphanumeric characters separated by a single hyphen, with no other special characters allowed.
Something like this: thisi67satest-mydomain
What I have at the moment is this: /^[a-z0-9-]{2,30}$/ but this doesn't cover all scenarios, especially with respect to the single hyphen.
I've always tried to google my way through these regexes. The above example will allow more than one hyphen, which I don't want. How can I make the single hyphen mandatory?
Try this:
^(?=.{2,30}$)[a-z0-9]+-[a-z0-9]+$
^ the start of the line/string.
(?=.{2,30}$) ensures that the string is between 2 and 30 characters long.
[a-z0-9]+ one or more lowercase letters or digits.
- one literal -.
[a-z0-9]+ one or more lowercase letters or digits.
$ end of the line/string.
See regex demo
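A minimal Python sketch of the lookahead approach, assuming Python's re module; DOMAIN and the sample inputs are invented for illustration. The lookahead checks the overall length, while the rest of the pattern enforces exactly one hyphen between two alphanumeric runs.

```python
import re

# Length check via lookahead, then: alphanumerics, one hyphen, alphanumerics.
DOMAIN = re.compile(r"(?=.{2,30}$)[a-z0-9]+-[a-z0-9]+")

candidates = [
    "thisi67satest-mydomain",   # one hyphen, 22 characters
    "no-double-hyphen",         # two hyphens
    "nohyphen",                 # hyphen missing
    "a" * 20 + "-" + "b" * 20,  # 41 characters, too long
]

for c in candidates:
    print(c, bool(DOMAIN.fullmatch(c)))
```

Because the body needs at least one character on each side of the hyphen, the shortest string this accepts is three characters, even though the lookahead allows two.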
I think the following pattern will work for you. Let me know if it works.
(\w|-(?!-)){2,30}

Regex in js may contain spaces at the start of a character after or at the end but must not have only spaces [duplicate]

I need to write a regular expression for form validation that allows spaces within a string, but doesn't allow only white space.
For example - 'Chicago Heights, IL' would be valid, but if a user just hit the space bar any number of times and hit enter the form would not validate. Preceding the validation, I've tried running an if (foo != null) then run the regex, but hitting the space bar still registers characters, so that wasn't working. Here is what I'm using right now which allows the spaces:
^[-a-zA-Z0-9_:,.' ']{1,100}$
It's very simple: .*\S.*
This requires one non-space character, at any place. The regular expression syntax is for Perl 5 compatible regular expressions, if you have another language, the syntax may differ a bit.
The following will answer your question as written, but see my additional note afterward:
^(?!\s*$)[-a-zA-Z0-9_:,.' ']{1,100}$
Explanation: The (?!\s*$) is a negative lookahead. It means: "The following characters cannot match the subpattern \s*$." When you take the subpattern into account, it means: "The following characters can neither be an empty string, nor a string of whitespace all the way to the end. Therefore, there must be at least one non-whitespace character after this point in the string." Once you have that rule out of the way, you're free to allow spaces in your character class.
Extra note: I don't think your ' ' is doing what you intend. It looks like you were trying to represent a space character, but regex interprets ' as a literal apostrophe. Inside a character class, ' ' would mean "match any character that is either ', a space character, or '" (notice that the second ' character is redundant). I suspect what you want is more like this:
^(?!\s*$)[-a-zA-Z0-9_:,.\s]{1,100}$
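A short Python sketch of the negative lookahead, assuming Python's re module (the question is about JavaScript, where the pattern reads the same); NOT_BLANK is a name made up for this example.

```python
import re

# (?!\s*$) rejects empty and whitespace-only input before the class is tried.
NOT_BLANK = re.compile(r"(?!\s*$)[-a-zA-Z0-9_:,.\s]{1,100}")

for value in ["Chicago Heights, IL", "     ", "", "  IL  "]:
    print(repr(value), bool(NOT_BLANK.fullmatch(value)))
```

Only the first and last inputs are accepted. Leading and trailing spaces are fine as long as something non-blank is present.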
You could simply use:
^(?=.*\S).+$
if your regex engine supports positive lookaheads. This expression requires at least one non-space character.
See it on rubular.
To validate against an allowed character set only, I tried USERNAME_REGEX = /^(?:\s*[.\-_]*[a-zA-Z0-9]{1,}[.\-_]*\s*)$/;
A string can contain any number of spaces at the beginning or end, but must contain at least one alphanumeric character. Note that, as written, the group is not repeated, so spaces between words are accepted only if a + is added after the group.
Optional ., _ and - characters are also allowed, but the string must have one alphanumeric character.
Try this regular expression:
^[^\s]+(\s.*)?$
It means one or more characters that are not space, then, optionally, a space followed by anything.
Just use \s* in the regular expression to allow zero or more blank spaces between two words.
For example, "Mozilla/ 4.75" and "Mozilla/4.75" both can be matched by the following regular expression:
[A-Z][a-z]*/\s*[0-9]\.[0-9]{1,2}
Adding \s* matches on zero, one or more blank spaces between two words.
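The same pattern as a small Python sketch; UA is a name invented for this example, and re.search is used because the pattern is not anchored.

```python
import re

# \s* between the slash and the version allows zero or more whitespace characters.
UA = re.compile(r"[A-Z][a-z]*/\s*[0-9]\.[0-9]{1,2}")

for s in ["Mozilla/ 4.75", "Mozilla/4.75", "Mozilla/   4.75", "Mozilla 4.75"]:
    print(repr(s), bool(UA.search(s)))
```

The first three inputs match. The last one does not, because the slash is still required.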

Capturing uppercase words in text with regex

I'm trying to find words that are in uppercase in a given piece of text. The words must follow one another to be considered, and there must be at least 4 of them.
I have an "almost" working pattern, but it captures much more: [A-Z]*(?: +[A-Z]*){4,}. The match also includes spaces at the start or the end of those words (like a boundary).
I have a playground if you want to test it out: https://regex101.com/r/BmXHFP/2
Is there a way to make the regex in example capture only the words in the first sentence? The language I'm using is Go and it has no look-behind/ahead.
In your regex, you just need to change the second * for a +:
[A-Z]*(?: +[A-Z]+){4,}
Explanation
With (?: +[A-Z]*), you are matching "a space followed by 0+ letters", so you are also matching bare spaces. When you replace the * with a +, spaces are matched only if uppercase letters follow them.
Demo on regex101
Replace the *s by +s, and your regex only matches the words in the first sentence.
* also matches the empty string. Looking at your regex and ignoring both [A-Z]*, all that remains is a sequence of spaces. Using + makes sure that there is at least one uppercase character after each run of spaces.
You have to require at least one uppercase letter in each group, as in [A-Z]*(?: +[A-Z]+){4,} (see updated regex).
A variant that also allows words with no spaces between them is [A-Z]*(?: *[A-Z]+){4,} (see better regex).
The * after the space makes the space optional, so uppercase letters match even without spaces between them.
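To see the effect of the one-character change, here is a sketch in Python (the question uses Go, but neither pattern needs lookarounds, so the behaviour should carry over). The sample sentence and the names ORIGINAL and FIXED are assumptions made for this illustration.

```python
import re

ORIGINAL = re.compile(r"[A-Z]*(?: +[A-Z]*){4,}")  # a group may be a bare space
FIXED = re.compile(r"[A-Z]*(?: +[A-Z]+){4,}")     # every space must precede a word

# Hypothetical sample text: five uppercase words, then mixed-case words.
text = "THESE ARE FIVE UPPERCASE WORDS but These Are Not"

original_matches = ORIGINAL.findall(text)
fixed_matches = FIXED.findall(text)

print(original_matches)  # ['THESE ARE FIVE UPPERCASE WORDS ']  (trailing space)
print(fixed_matches)     # ['THESE ARE FIVE UPPERCASE WORDS']
```

The original pattern swallows the space after WORDS because a space followed by zero letters still counts as a group.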

Regular Expression for re-verification

I am trying to validate a verification question and this is the regular expression I have. I am not sure what it means, but this expression is not allowing spaces:
^\S+$
For example, if I enter 'Test Me', this expression says it is not valid. How do I fix this to allow spaces?
What exactly are you trying to match?
^ matches the beginning of the string
$ matches the end of the string
+ allows one or more occurrences of the preceding expression
\S stands for anything but a whitespace
\s stands for white-spaces
The expression you have will match any string containing only non-white-space characters. If you could express what exactly you're trying to match, I could help you with it.
                                ^\S+$
                                ^^ ^^
                                || ||
^ start of string---------------+| ||
\S anything but a whitespace-----+ ||
+ one or more of what precedes-----+|
$ end of string---------------------+
(visit regular-expressions.info for a larger reference)
Not sure what you want to change, really, since this regular expression seems to have been written for the sole purpose of not allowing spaces.
^ means "start of the string"
\S is a special keyword in Regex that denotes "non-white space characters"
+ means match the preceding element one or more times
$ means "end of the string"
So in English, this Regex says: starting at the start of the string, find me ONLY non-white space characters one or more times before the end of the string. This is why it doesn't permit white space.
It does not match because \S does not allow whitespace characters in your string.
Something that might serve you better is:
^[\w\s]+$
\w is equivalent to [A-Za-z0-9_]
\s matches whitespace
Keep in mind that this regex will not allow punctuation. If you want that, you may be better off using ^.+$
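A brief Python sketch comparing the three patterns discussed in this thread; the variable names are invented, and re.fullmatch stands in for the ^...$ anchors.

```python
import re

NO_SPACE = re.compile(r"\S+")          # the asker's pattern
WORDS_SPACES = re.compile(r"[\w\s]+")  # word characters and whitespace
ANYTHING = re.compile(r".+")           # any character except a line break

for s in ["TestMe", "Test Me", "Test Me!"]:
    print(
        repr(s),
        bool(NO_SPACE.fullmatch(s)),
        bool(WORDS_SPACES.fullmatch(s)),
        bool(ANYTHING.fullmatch(s)),
    )
```

"Test Me" fails only the original pattern, and "Test Me!" passes only the catch-all one.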
