Regular expression for alphabet, underscore, hyphen, apostrophe only - regex

I want a regular expression that accepts only letters, hyphens, apostrophes, and underscores.
I tried
/^[ A-Za-z-_']*$/
but it's not working. Please help.

Your regex is wrong. Try this:
/^[0-9A-Za-z_#'-]+$/
OR
/^[\w#'-]+$/
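A hyphen needs to be at the first or last position inside a character class if you don't want to escape it. Also, if an empty string isn't allowed, use + (1 or more) instead of * (0 or more). Note that both patterns above also accept digits and #; drop 0-9 and # (or replace \w with A-Za-z_) if only letters, underscore, hyphen, and apostrophe should pass.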
Explanation:
^ assert position at start of the string
[\w#'-]+ match a single character present in the list below
Quantifier: Between one and unlimited times, as many times as possible
\w match any word character [a-zA-Z0-9_]
#'- a single character in the list #'- literally
$ assert position at end of the string

Move the hyphen to the end or the beginning of the character class, or escape it:
^[ A-Za-z_'-]*$
or
^[- A-Za-z_']*$
or
^[ A-Za-z\-_']*$
If you want all letters:
^[ \pL_'-]*$
or

When using a hyphen in a character class, place it at the end of the class as a best practice.
The hyphen normally signifies a range of characters inside a character class; when it is the last character of the class, there is nothing after it to form a range with, so it is treated as a literal hyphen.
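To make the hyphen-last form concrete, here is a minimal Java sketch. The class and method names (NameCharsDemo, isAsciiName, isUnicodeName) are made up for illustration, and the choice of + over * assumes the empty string should be rejected. The second pattern uses \p{L} on the assumption that letters outside A-Z should also pass.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not taken from any of the answers above.
final class NameCharsDemo {
    // The hyphen is the last character in the class, so it is a literal.
    // '+' rejects the empty string; switch to '*' to allow it.
    static final Pattern ASCII_NAME = Pattern.compile("^[A-Za-z_'-]+$");

    // Same shape, but \p{L} accepts any Unicode letter.
    static final Pattern UNICODE_NAME = Pattern.compile("^[\\p{L}_'-]+$");

    static boolean isAsciiName(String s) {
        return ASCII_NAME.matcher(s).matches();
    }

    static boolean isUnicodeName(String s) {
        return UNICODE_NAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAsciiName("O'Neil-Smith_jr")); // true
        System.out.println(isAsciiName("abc123"));          // false: digits are not in the class
        System.out.println(isAsciiName("Zo\u00eb"));        // false: e-with-diaeresis is not A-Z
        System.out.println(isUnicodeName("Zo\u00eb"));      // true
    }
}
```

If spaces should also be accepted, as in the original attempt, add a space inside the brackets and keep the hyphen last.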

My best bet would be:
/[A-Za-z-\'_#0-9]+/g

You can use the following (in Java):
String acceptHyphenApostropheUnderscoreRegEx = "^(\\p{Alpha}*+((['_-]+)\\p{Alpha})?)*+$";
If you want to have spaces and # also (as some have given above) try:
String acceptHyphenApostropheUnderscoreRegEx = "^(\\p{Alpha}*+((\\s|['#_-]+)\\p{Alpha})?)*+$";

Related

Regex multiple replace with capture group and negated character class

I have a problem with a regex and I cannot figure out if what I'm doing is possible. I was trying to write a regex to replace some strings with the following code
String string = "address='21 Street' and country='United Kingdom'";
Pattern pattern = Pattern.compile(" (address|country)='[^']'");
String replacedString = pattern.matcher(string).replaceAll(" $1='call us'");
System.out.println(replacedString);
What I'm expecting is to print the string
address='call us' and country='call us'
I'm not going to end up implementing this with a regex, as there are other better ways, but I just want to know why this is not working :'(.
What confuses me is that the negated character class [^'] does not "work" and the regex doesn't replace anything.
You want [^']* and not [^']. The former matches any number of characters, the latter matches exactly a single non-' character.
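You're missing a quantifier. By itself, a character class matches exactly one character in the input string, so you need a quantifier of some sort to make it match more than one character.
Try adding a + (one or more) or * (zero or more) after your character class:
Pattern pattern = Pattern.compile(" (address|country)='[^']*'");
Note that this pattern still begins with a space, so address at the very start of the sample string is not replaced. Writing the leading space as \\b (a word boundary) in the Java string covers that case too.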
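For reference, a small runnable sketch of the corrected replacement. The class name ReplaceQuotedDemo and the method mask are made up for this example, and swapping the leading space for \b is my own adjustment so that a field at the start of the string is matched as well.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern discussed above.
final class ReplaceQuotedDemo {
    // [^']* consumes everything up to the closing quote.
    // \b stands in for the leading space (an assumption, see the lead-in).
    static final Pattern FIELD = Pattern.compile("\\b(address|country)='[^']*'");

    static String mask(String input) {
        // $1 re-inserts the captured field name.
        return FIELD.matcher(input).replaceAll("$1='call us'");
    }

    public static void main(String[] args) {
        String s = "address='21 Street' and country='United Kingdom'";
        System.out.println(mask(s)); // address='call us' and country='call us'
    }
}
```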

Extract text before _ in a string using regex

I have a large number of strings that start like DD_filename.
How can I extract the characters before _ using a regular expression?
I tried learning from here, where it says that a.b will retrieve characters starting from a and ending on b.
I tried ^._ similarly, but it is not working for me.
^._ will only match one character before _. Try this pattern:
^.*?(?=_)
Starting from the beginning of the string, capture all non-underscore characters:
"^[^_]*"
The first ^ (caret) character means that the match starts from the beginning of the string. The brackets allow you to define a set of possible characters (character class). The second ^ character means "not". So the character class is "not underscore". The star means "zero or more". So in plain English: "match from the start of the string zero or more non underscore characters".
You can try something like
.*?(?=_)
. matches any character and *? is a reluctant quantifier. (?=_) is a positive lookahead to ensure our match is followed by an _.
If you want to only extract characters that occur at the beginning of a string you can add the ^ anchor: ^.*?(?=_). ^ matches the position before the first character in the string.
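The two anchored patterns behave differently when the string has no underscore at all, which a short Java sketch can show. Java is assumed here since the question doesn't name a language, and the names PrefixDemo, beforeUnderscore, and leadingNonUnderscore are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class comparing the two approaches.
final class PrefixDemo {
    static final Pattern LOOKAHEAD = Pattern.compile("^.*?(?=_)");
    static final Pattern NEGATED = Pattern.compile("^[^_]*");

    // Text before the first underscore, or null when there is no underscore.
    static String beforeUnderscore(String s) {
        Matcher m = LOOKAHEAD.matcher(s);
        return m.find() ? m.group() : null;
    }

    // Leading run of non-underscore characters; the whole string when
    // there is no underscore.
    static String leadingNonUnderscore(String s) {
        Matcher m = NEGATED.matcher(s);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(beforeUnderscore("DD_filename"));      // DD
        System.out.println(beforeUnderscore("nounderscore"));     // null
        System.out.println(leadingNonUnderscore("nounderscore")); // nounderscore
    }
}
```

Pick the lookahead version if a missing underscore should count as "no match", and the negated class if the whole string is an acceptable result in that case.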
Just capture all characters that are not an underscore:
"[^_]*"
Regular Expression to get all characters before "-"
Check out @stema's answer. He gives four ways to do this, but the first is probably the best.
Match result = Regex.Match(text, @"^.*?(?=-)");
Console.WriteLine(result);

How to use regex for field validation on whole string?

I've been working for many hours trying to do a "simple thing": use a regex to validate a text field.
I need to make sure of:
1- Only use (a-z), (A-Z) and (0-9) values
2- Add a SINGLE wildcard only at the end.
Ex.
Match
MICHE*
Match
JAMES
No match
MICHE**
No match
MIC_HEAL*
I have this regex till now:
[a-zA-Z0-9\s-]+.\z*?
The problem is that it still matches when I introduce an invalid character, as long as there is a matching substring.
What can I do to force a match on the whole string? What am I missing?
Thx!
Use ^ (start of line) and $ (end of line) to only match the whole string:
^[a-zA-Z0-9\s-]+.\z*?$
(If you have a multiline input you can also use \A and \z - start and end of string)
On a second look, I don't understand the end of your regex: . (anything) \z * ? (end of string, zero or more times, zero or one time). This regex will match something like:
Ikdflfdf&
Is that correct? If you only want the character *, you should use:
^[a-zA-Z0-9\s-]+\*?$
Also, as Robbie pointed out, you're including spaces and the - in your list of accepted characters. If you only want letters and digits, a shortcut would be using \w (word characters):
^\w+\*?$
However, \w also matches the underscore, so this pattern accepts MIC_HEAL*. In addition, depending on whether the matcher is Unicode-aware or not, \w may also match non-ASCII letters and digits, which may or may not be what you want.
Try this one :
^[a-zA-Z0-9]+\*?$
^ string start
$ string end
* is a metacharacter, so it must be escaped as \* to match it literally
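Here is a quick Java check of that pattern against the four examples from the question. Java is an assumption, since the question doesn't say which engine is used, and WildcardFieldDemo and isValid are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical validator for "letters and digits, optional single trailing *".
final class WildcardFieldDemo {
    // \\*? allows at most one literal asterisk, and only at the end.
    static final Pattern FIELD = Pattern.compile("^[a-zA-Z0-9]+\\*?$");

    static boolean isValid(String s) {
        return FIELD.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("MICHE*"));    // true
        System.out.println(isValid("JAMES"));     // true
        System.out.println(isValid("MICHE**"));   // false: two wildcards
        System.out.println(isValid("MIC_HEAL*")); // false: underscore not allowed
    }
}
```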
I think you just need ^ at the beginning and $ at the end
^[a-zA-Z0-9\s-]+.\*?$
Also, you don't need the \z
Also, you haven't mentioned that you want to allow spaces and dashes - but you have included them in your allowed character set.

regular expression generation

I need a regular expression to check that a string contains only letters and spaces. No characters other than the letters [A-Z] and space are allowed.
Please help.
The complete regex looks like this
^[A-Z ]+$
You can simply create a character class and put the characters in that you want to allow:
[A-Z ]
if you want to allow also lower case letters then use
[A-Za-z ]
or use the i (IgnoreCase) option
So your character class matches 1 character. You want to repeat it to match more than one character.
+ requires at least one character, whereas
* would additionally match 0 characters
As last step you need to ensure that the complete string is matched, you can do this using anchors.
^ matches the beginning of the string
$ matches the end of the string (or the position before a newline if you use the m (multiline) option)
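Putting the three steps together (character class, quantifier, anchors), a rough Java sketch could look like this. The question doesn't name a language, so Java is an assumption, and LettersAndSpacesDemo and its method names are invented for the example. Pattern.CASE_INSENSITIVE plays the role of the i option mentioned above.

```java
import java.util.regex.Pattern;

// Hypothetical validator showing class + quantifier + anchors.
final class LettersAndSpacesDemo {
    // Uppercase letters and spaces only, at least one character.
    static final Pattern UPPER_ONLY = Pattern.compile("^[A-Z ]+$");

    // Same pattern with the ignore-case flag, so a-z is accepted too.
    static final Pattern ANY_CASE =
            Pattern.compile("^[A-Z ]+$", Pattern.CASE_INSENSITIVE);

    static boolean isUpperOnly(String s) {
        return UPPER_ONLY.matcher(s).matches();
    }

    static boolean isAnyCase(String s) {
        return ANY_CASE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isUpperOnly("HELLO WORLD")); // true
        System.out.println(isUpperOnly("Hello World")); // false
        System.out.println(isAnyCase("Hello World"));   // true
        System.out.println(isAnyCase("HELLO2"));        // false: digit
    }
}
```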
A character class should be sufficient
[A-Z ]+
i.e. one or more of letters between A-Z and space
Check that the string matches the following:
^[a-zA-Z ]*$
Regex character classes can be negated by putting a ^ symbol at the beginning of them.
Your example could be negated like this: [^A-Z]. Add a space to allow the full range of characters you want to check for and you have [^A-Z ].
Now you have a validator that meets your criteria: If that regex returns true then your validation fails.
Since you didn't specify the programming language you're working in, I can't help you much further than that.
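Assuming Java purely for illustration, the negated approach might be sketched as follows. InvalidCharFinderDemo, hasInvalidChar, and isValid are hypothetical names. The explicit empty-string check is my own addition: a search for a forbidden character finds nothing in an empty string, so without that check empty input would pass.

```java
import java.util.regex.Pattern;

// Hypothetical validator built on the negated character class.
final class InvalidCharFinderDemo {
    // Matches any single character that is NOT A-Z or a space.
    static final Pattern INVALID = Pattern.compile("[^A-Z ]");

    // True when at least one forbidden character occurs anywhere.
    static boolean hasInvalidChar(String s) {
        return INVALID.matcher(s).find();
    }

    // Validation fails if a forbidden character is found.
    // Rejecting the empty string is an assumption, not part of the answer.
    static boolean isValid(String s) {
        return !s.isEmpty() && !hasInvalidChar(s);
    }

    public static void main(String[] args) {
        System.out.println(isValid("HELLO WORLD"));  // true
        System.out.println(isValid("HELLO, WORLD")); // false: comma
        System.out.println(isValid("hello"));        // false: lowercase
    }
}
```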
This will match what you need:
^[A-Z\s]+$
Try matching with this regex:
^[A-Za-z\s]+$
This should do the trick, but note that \s also accepts tabs and newlines, not just spaces.

Regular Expression related: first character alphabet second onwards alphanumeric+some special characters

I have a question about regular expressions. In my case, I have to make sure that
the first character is a letter, and from the second character onwards it can be any alphanumeric character or one of some special characters.
Regards,
Anto
Try something like this:
^[a-zA-Z][a-zA-Z0-9.,$;]+$
Explanation:
^ Start of line/string.
[a-zA-Z] Character is in a-z or A-Z.
[a-zA-Z0-9.,$;] Alphanumeric or `.` or `,` or `$` or `;`.
+ One or more of the previous token (change to * for zero or more).
$ End of line/string.
The special characters I have chosen are just an example. Add your own special characters as appropriate for your needs. Note that a few characters need escaping inside a character class otherwise they have a special meaning in the regular expression.
I am assuming that by "alphabet" you mean A-Z. Note that in some other countries there are also other characters that are considered letters.
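As a rough illustration in Java (an assumption, since the question doesn't name a language), with the same example set of special characters. LeadingLetterDemo and isValid are made-up names. Note that inside the brackets . and $ are plain characters and need no escaping.

```java
import java.util.regex.Pattern;

// Hypothetical validator: a letter first, then letters, digits, or . , $ ;
final class LeadingLetterDemo {
    // '+' means at least one character must follow the leading letter,
    // so a single-letter string fails; use '*' to accept it.
    static final Pattern ID = Pattern.compile("^[a-zA-Z][a-zA-Z0-9.,$;]+$");

    static boolean isValid(String s) {
        return ID.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("s12353467457458")); // true
        System.out.println(isValid("a$b;c"));           // true
        System.out.println(isValid("1abc"));            // false: starts with a digit
        System.out.println(isValid("a"));               // false: '+' needs a second character
    }
}
```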
Try this (it checks the first character only):
/^[a-zA-Z]/
where
^ -> Starts with
[a-zA-Z] -> characters to match
I think the simplest answer is to pick and match only the first character with regex.
String str = "s12353467457458";
// Test only the first character of the string.
if ((""+str.charAt(0)).matches("^[a-zA-Z]")){
    System.out.println("Valid");
}