How to use regex for field validation on whole string? - regex

I've been working for many hours trying to do a "simple thing": use a regex to validate a text field.
I need to make sure of:
1- Only use (a-z), (A-Z) and (0-9) values
2- Add a SINGLE wildcard only at the end.
Ex.
Match
MICHE*
Match
JAMES
No match
MICHE**
No match
MIC_HEAL*
I have this regex till now:
[a-zA-Z0-9\s-]+.\z*?
The problem is that it still matches when I introduce an invalid character, as long as the input contains a matching substring.
What can I do to force a match on the whole string? What am I missing?
Thx!

Use ^ (start of line) and $ (end of line) to only match the whole string:
^[a-zA-Z0-9\s-]+.\z*?$
(If you have a multiline input you can also use \A and \z - start and end of string)
On a second look, I don't understand the end of your regex: . (any character) followed by \z*? (end of string, repeated zero or more times, lazily). This regex will match something like:
Ikdflfdf&
Is that correct? If you only want the character *, you should use:
^[a-zA-Z0-9\s-]+\*?$
Also, as Robbie pointed out, you're including whitespace (\s) and the - in your list of accepted characters. If you only want letters and digits, a shortcut would be \w (word characters), but note that \w also accepts the underscore, so it would let MIC_HEAL* through:
^\w+\*?$
However, depending on whether the matcher is Unicode-aware or not, \w will also match non-ASCII letters and digits, which may or may not be what you want.
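As a rough illustration of the points above, here is a minimal JavaScript sketch. The question does not name a language, so JavaScript is an assumption, and the variable names are made up for the example. JavaScript has no \A or \z; without the m flag, ^ and $ already anchor to the whole string.

```javascript
// Unanchored: succeeds as soon as any substring matches.
const unanchored = /[a-zA-Z0-9]+\*?/;
// Anchored: the whole string has to match.
const anchored = /^[a-zA-Z0-9]+\*?$/;
// \w shortcut: in JavaScript, \w is [A-Za-z0-9_], so "_" is accepted too.
const wordChars = /^\w+\*?$/;

const unanchoredResult = unanchored.test('MIC_HEAL*'); // true: "MIC" alone is enough
const anchoredResult = anchored.test('MIC_HEAL*');     // false: "_" is not in the class
const wordCharsResult = wordChars.test('MIC_HEAL*');   // true: "_" is a word character

console.log(unanchoredResult, anchoredResult, wordCharsResult);
```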

Try this one :
^[a-zA-Z0-9]+\*?$
^ string start
$ string end
* is a metacharacter, so it must be escaped as \* to match a literal asterisk
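A small sketch of how this pattern behaves on the four inputs from the question, assuming JavaScript (the question does not say which language is used). The helper name isValidField is hypothetical.

```javascript
// Hypothetical helper: letters and digits only, optionally followed by one "*".
function isValidField(value) {
  return /^[a-zA-Z0-9]+\*?$/.test(value);
}

// The four sample inputs from the question.
for (const input of ['MICHE*', 'JAMES', 'MICHE**', 'MIC_HEAL*']) {
  console.log(input, isValidField(input));
}
```

The first two inputs should be accepted and the last two rejected, since neither a second * nor an underscore fits between the anchors.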

I think you just need ^ at the beginning and $ at the end
^[a-zA-Z0-9\s-]+\*?$
Also, you don't need the \z
Also, you haven't mentioned that you want to allow spaces and dashes - but you have included them in your allowed character set.

Related

Need a Regex to identify the string "hostname abc_pqr_xyz" (The 3 words can be any thing)

I would like to check all the strings with the format hostname abc_pqr_xyz in a file. Need a regex for this. There should be exactly 2 _'s and 3 words in the string.
I have tried using the regex ^hostname \s+.*_.*_.*
But it is giving a positive result for abc_abc_abc_abc_abc, as it considering abc_abc_abc as one word.
You may use a [^_] negated character class that matches any char but _ instead of .:
^hostname\s+[^_]*_[^_]*_[^_]*$
Note the $ at the end, which checks for the end of the string.
Also, a literal space before \s+ requires one space followed by one or more additional whitespace characters, so a single space after hostname would not match. That is why I removed it from the expression.
Note you may group the _[^_]* and then set the number of repetitions that you may adjust in the future:
^hostname\s+[^_]*(?:_[^_]*){2}$
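A minimal sketch of applying the grouped pattern to the lines of a file, assuming JavaScript. The file content and variable names are invented for illustration.

```javascript
// "hostname", at least one whitespace character, then exactly two underscores.
const hostnameLine = /^hostname\s+[^_]*(?:_[^_]*){2}$/;

// Hypothetical file content, one candidate per line.
const fileText = [
  'hostname abc_pqr_xyz',
  'hostname abc_abc_abc_abc_abc',
  'hostname abc_pqr',
].join('\n');

// Test each line separately so ^ and $ refer to that line only.
const matchingLines = fileText
  .split('\n')
  .filter((line) => hostnameLine.test(line));

console.log(matchingLines);
```

Because [^_]* also matches an empty run and whitespace, a stricter variant could use [^_\s]+ for each word if empty or multi-word parts should be rejected.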

Regex match till end of text

I'm using Regex to match whole sentences in a text containing a certain string. This is working fine as long as the sentence ends with any kind of punctuation. It does not work however when the sentence is at the end of the text without any punctuation.
This is my current expression:
[^.?!]*(?<=[.?\s!])string(?=[\s.?!])[^.?!]*[.?!]
Works for:
This is a sentence with string. More text.
Does not work for:
More text. This is a sentence with string
Is there any way to make this work as intended? I can't find any character class for "end of text".
End of text is matched by the anchor $, not a character class.
You have two separate issues you need to address: (1) the sentence ending directly after string, and (2) the sentence ending sometime after string but with no end-of-sentence punctuation.
To do this, you need to make the match after string optional, but anchor that match to the end of the string. This also means that, after you recognize an (optional) end-of-sentence punctuation mark, you need to match everything that follows, so the end-of-string anchor will match.
My changes: Take everything after string in your original regex and surround it in (?:...)? - the (?:...) being a "non-remembered" group, and the ? making the entire group optional. Follow that with $ to anchor the end of the string.
Within that optional group, you also need to make the end-of-sentence itself optional, by replacing the simple [.?!] with (?:[.?!].*)? - again, the (?:...) is to make a "non-remembered" group, the ? makes the group optional - and the .* allows this to match as much as you want after the end-of-sentence has been found.
[^.?!]*(?<=[.?\s!])string(?:(?=[\s.?!])[^.?!]*(?:[.?!].*)?)?$
The symbol for end-of-text is $ (and, the symbol for beginning-of-text, if you ever need it, is ^).
You probably won't get what you're looking for by just adding the $ to your punctuation list, though (e.g., [.?!$]), because inside a character class $ is a literal dollar sign; it works better as an alternative choice: ([.?!]|$).
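The alternation idea can be sketched as follows, assuming JavaScript. This is a simplified variant rather than the exact pattern above: it uses \b word boundaries instead of the lookbehind and lookahead, and the function name sentencesContaining is hypothetical.

```javascript
// Returns the sentences that contain `word` as a whole word.
// A sentence ends at ., ? or !, or at the end of the text.
function sentencesContaining(text, word) {
  // Assumption: `word` contains no regex metacharacters.
  const pattern = new RegExp(`[^.?!]*\\b${word}\\b[^.?!]*(?:[.?!]|$)`, 'g');
  return (text.match(pattern) || []).map((sentence) => sentence.trim());
}

console.log(sentencesContaining('This is a sentence with string. More text.', 'string'));
console.log(sentencesContaining('More text. This is a sentence with string', 'string'));
```

Both calls should return the sentence containing string, with or without closing punctuation.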
Your regex is way too complex for what you want to achieve.
To match only a word just use
"\bstring\b"
It will match start, end and any non-alphanum delimiters.
It works with the following:
string is at the start
this is the end string
this is a string.
stringing won't match (you don't want a match here)
You should mention the language in your question so that answers can show how to use the regex there.
Here is my example using javascript:
var reg = /^([\w\s\.]*)string([\w\s\.]*)$/;
console.log(reg.test('This is a sentence with string. More text.'));
console.log(reg.test('More text. This is a sentence with string'));
console.log(reg.test('string'))
Note:
* : Match zero or more times.
? : Match zero or one time.
+ : Match one or more times.
You can replace * with ? or + if you need a stricter match.

Regular expression let periods in (.)

My regular expression lets in periods for some reason. How can I keep that from happening?
Rules:
4-15 characters
Any alphanumeric characters
Underscore as long as it's not first or last
[A-Za-z][A-Za-z0-9_]{3,14}
I don't want "bad.example" to work.
Edit: changed to 4-15 characters
Your regex matches example as a substring of bad.example. Use anchors to prevent that:
^[A-Za-z][A-Za-z0-9_]{2,13}[A-Za-z]$
Note that (like your regex) this regex also prevents digits from matching in the first and last position - if they should be allowed (as per your specs), just add 0-9 at the end of the character classes.
^[A-Za-z][A-Za-z0-9_]{3,14}$
try this
This will match any alphanumeric at the beginning and end. In the middle it will accept from two up to thirteen word characters (alphanumerics or underscore), for 4 to 15 characters in total:
^[a-zA-Z\d]\w{2,13}[a-zA-Z\d]$
Your regex does not match bad.example as a whole; it matches only the substring example, because the unanchored pattern accepts any run of 4 to 15 allowed characters. See here:
http://regex101.com/r/xV4eL5/5
To prevent this, you need to match the whole input rather than allow partial matches. Put a ^ start anchor and a $ end anchor around the pattern.
Use
\A[A-Za-z0-9]\w{2,13}[A-Za-z0-9]\Z
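A minimal sketch of the rules as stated in the question (4 to 15 characters, alphanumerics, underscore allowed but not first or last), assuming JavaScript. The middle quantifier {2,13} is derived from those rules: one first character, one last character, and 2 to 13 in between. The name username is illustrative.

```javascript
// First and last: alphanumeric. Middle: 2-13 alphanumerics or underscores.
// Total length is therefore 4-15. ^ and $ force a whole-string match.
const username = /^[A-Za-z0-9][A-Za-z0-9_]{2,13}[A-Za-z0-9]$/;

const samples = ['bad.example', 'example', '_abc', 'abc_', 'a_b1', 'abc'];
for (const sample of samples) {
  console.log(sample, username.test(sample));
}
```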

Regexp to take all numbers from the string in JavaScript and php

Was playing around with regexp to remove all numbers from string
I came up to this:
/([^0-9])$/
But it only works if the string looks like Name123; if you enter Name123Name, it doesn't work.
Can't understand why?
Any ideas?
Best regards,
Ilia
Your regular expression finds one character not in [0-9] at the end of the string.
To check if there is a digit anywhere in the string, remove the anchors:
/[0-9]/
To check that all characters are not digits, add a start of string anchor too:
/^[^0-9]+$/
This approach is called a blacklist - a list of characters you don't want to allow. Note that it's often better to create a whitelist instead - a list of characters that you do want to allow.
Remove the $ at the end, because it anchors the match to the end of the string. You can also use \d instead of [0-9] to match a digit, depending on the language you're using.
For example, /[0-9]$/ matches Name123 because the digits 123 appear at the end of the string, where the $ anchor can match. In Name123Name, however, the $ anchor cannot match after a digit because the digits are in the middle of the string.
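The three checks discussed above can be sketched in JavaScript as follows. The function names are made up for the example, and the PHP side of the question is not covered here.

```javascript
// True if there is a digit anywhere in the string (no anchors).
const hasDigit = (s) => /[0-9]/.test(s);

// True if the string is non-empty and contains no digits at all.
const hasNoDigits = (s) => /^[^0-9]+$/.test(s);

// Removes every digit; the g flag replaces all occurrences, not just the first.
const stripDigits = (s) => s.replace(/[0-9]/g, '');

console.log(hasDigit('Name123Name'));
console.log(hasNoDigits('Name123Name'));
console.log(stripDigits('Name123Name'));
```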

regular expression generation

I need a regular expression to check that a string contains only letters and spaces. No characters other than the letters [A-Z] and space are allowed.
Please help.
The complete regex looks like this
^[A-Z ]+$
You can simply create a character class and put the characters in that you want to allow:
[A-Z ]
if you want to allow also lower case letters then use
[A-Za-z ]
or use the i (IgnoreCase) option
So your character class matches 1 character. you want to repeat it to match more than one character.
+ would be at least one character, where
* would additionally match 0 characters
As last step you need to ensure that the complete string is matched, you can do this using anchors.
^ matches the beginning of the string
$ matches the end of the string (or the end of a line if you use the m (multiline) option)
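To make the effect of the anchors and options concrete, here is a short sketch assuming JavaScript, where options are written as flags after the closing slash. The variable names are illustrative.

```javascript
const upperOnly = /^[A-Z ]+$/;   // whole string: uppercase letters and spaces
const anyCase = /^[A-Z ]+$/i;    // i flag: lowercase letters are accepted too
const perLine = /^[A-Z ]+$/m;    // m flag: ^ and $ also match at line breaks

const results = {
  upper: upperOnly.test('HELLO WORLD'),      // true
  mixedStrict: upperOnly.test('Hello World'), // false: lowercase not allowed
  mixedLoose: anyCase.test('Hello World'),    // true
  twoLinesStrict: upperOnly.test('HELLO\n123'), // false: whole string must match
  twoLinesPerLine: perLine.test('HELLO\n123'),  // true: the first line alone matches
};

console.log(results);
```

The last case shows why the m option is usually unwanted for validation: one valid line is enough for the test to succeed.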
A character class should be sufficient
[A-Z ]+
i.e. one or more of letters between A-Z and space
Check that the string matches the following:
^[a-zA-Z ]*$
Regex character classes can be negated by putting a ^ symbol at the beginning of them.
Your example could be negated like this: [^A-Z]. Add a space to allow the full range of characters you want to check for and you have [^A-Z ].
Now you have a validator that meets your criteria: If that regex returns true then your validation fails.
Since you didn't specify the programming language you're working in, I can't help you much further than that.
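For illustration only, this is how the negated-class check might look in JavaScript (an assumption, since the question does not name a language). The function names are hypothetical, and the empty-string check is an added assumption, because the negated class alone would treat an empty string as valid.

```javascript
// True if the string contains any character other than A-Z or space.
const hasForbiddenChar = (s) => /[^A-Z ]/.test(s);

// Validation fails when a forbidden character is found.
// Assumption: an empty string should also be rejected.
const isValid = (s) => s.length > 0 && !hasForbiddenChar(s);

console.log(isValid('HELLO WORLD'));
console.log(isValid('HELLO, WORLD'));
```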
This will match what you need:
^[A-Z\s]+$
try matching with this regex
^[A-Za-z\s]+$
this should do the trick