I'm working on a regular expressions pattern, but it contains a number of special characters. I'm not really sure how to incorporate them in a normal regex pattern string. Specifically, I need to test to see if a string contains '+/-'...
I've tried using quotes etc. but have had no luck (I'm extremely new to regex). I am coding this in C# 4.0.
One string example is "3Z1Z +/- 5.5"
Any help is much appreciated - Thanks a lot!
Create a simple regex:
foundMatch = Regex.IsMatch(SubjectString, @"\+/-");
Will return true if this sequence of characters is found anywhere in your string. The explanation is left as an exercise to you.
Read more here.
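The same check can be sketched outside C#. The example below uses Python's `re` module purely as an illustration (the thread itself is about .NET); the helper name `contains_plus_minus` is made up for this sketch. Only the `+` needs a backslash, because `/` and `-` have no special meaning outside a character class.

```python
import re

# "+" is a quantifier in regex syntax, so it is escaped.
# "/" and "-" are ordinary characters here and are left alone.
PLUS_MINUS = re.compile(r"\+/-")

def contains_plus_minus(text: str) -> bool:
    """Return True if the literal sequence +/- occurs anywhere in text."""
    return PLUS_MINUS.search(text) is not None

print(contains_plus_minus("3Z1Z +/- 5.5"))
print(contains_plus_minus("3Z1Z + 5.5"))
```

The raw string (`r"..."`) plays the same role as the C# verbatim string: the backslash reaches the regex engine untouched.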
These are part of the special character list (see also). Basically, add them to the pattern by prefixing them with a backslash (\). e.g. + becomes \+
^(\+|\-)$ # exactly one + or -
The same would go for anything else with special meaning, such as ., {, }, (, ), ^, $, |, [, ], etc.
There are some exceptions though. For instance, when creating a class such as: [a-z] the hyphen (-) would have special meaning (all letters from a through z). So if you wanted a literal hyphen you'd have to escape it (unless it falls as the last character of the class). e.g.
[a-z-A-Z] # hyphen should be escaped if you wanted a literal hyphen
[a-z\-A-Z] # the "correct" counter-part
[a-zA-Z-] # actually legal because it's inserted as the last character
# and therefore treated as a literal hyphen despite not being
# escaped.
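A quick way to convince yourself of the two safe spellings is to run them. This is a small sketch in Python's `re` module (an assumption for illustration; the rule is the same in most regex flavours), comparing the escaped hyphen with the hyphen placed last.

```python
import re

# Two spellings of "letters or a literal hyphen".
escaped = re.compile(r"[a-z\-A-Z]+")   # hyphen escaped in the middle of the class
trailing = re.compile(r"[a-zA-Z-]+")   # hyphen last, so no escape is needed

for candidate in ("well-known", "well_known"):
    print(candidate,
          bool(escaped.fullmatch(candidate)),
          bool(trailing.fullmatch(candidate)))
```

Both classes accept the hyphenated word and reject the one with an underscore, which is in neither class.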
$.validator.addMethod('AZ09_', function (value) {
return /^[a-zA-Z0-9.-_]+$/.test(value);
}, 'Only letters, numbers, and _-. are allowed');
When I use something like test-123 it still triggers as if the hyphen is invalid. I tried \- and --
Escaping using \- should be fine, but you can also try putting it at the beginning or the end of the character class. This should work for you:
/^[a-zA-Z0-9._-]+$/
Escaping the hyphen using \- is the correct way.
I have verified that the expression /^[a-zA-Z0-9.\-_]+$/ does allow hyphens. You can also use the \w class to shorten it to /^[\w.\-]+$/.
(Putting the hyphen last in the expression actually causes it to not require escaping, as it then can't be part of a range, however you might still want to get into the habit of always escaping it.)
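Why the original class rejects test-123 becomes visible when both patterns are run side by side. The sketch below uses Python's `re` module as a stand-in for the JavaScript validator (an assumption for illustration). In `.-_` the hyphen forms a range from `.` (46) to `_` (95), which skips the hyphen itself (45) but lets in characters such as `/`.

```python
import re

# Original class: ".-_" is read as a range from "." to "_".
buggy = re.compile(r"^[a-zA-Z0-9.-_]+$")
# Hyphen moved to the end, where it cannot form a range.
fixed = re.compile(r"^[a-zA-Z0-9._-]+$")

for value in ("test-123", "test/123", "test_1.2"):
    print(value, bool(buggy.match(value)), bool(fixed.match(value)))
```

The buggy pattern rejects `test-123` and accepts `test/123`; the fixed pattern does the opposite.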
The \- may not have been working because you passed the whole pattern from the server as a string. If that's the case, you first have to escape the \ itself so that the server-side program handles it correctly too.
In a server side string: \\-
On the client side: \-
In regex (covers): -
Or you can simply put it at the end of the [] brackets.
With the hyphen (-) in a regex, it's important to note the difference between escaping it (\-) and not escaping it (-), because apart from being a character in its own right, the hyphen is also parsed as the range operator inside a character class.
In the first case, with an escaped hyphen (\-), the regex matches only the hyphen itself, as in /^[+\-.]+$/
In the second case, without escaping, as in /^[+-.]+$/, the hyphen sits between the plus and the dot, so the class matches every character with an ASCII value from 43 (plus) to 46 (dot). As a side effect that includes the comma (ASCII 44).
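The comma side effect is easy to reproduce. Here is a minimal sketch using Python's `re` module (chosen only for illustration; the range rule is the same in JavaScript):

```python
import re

unescaped = re.compile(r"^[+-.]+$")   # "+-." is a range: code points 43 to 46
escaped = re.compile(r"^[+\-.]+$")    # exactly three literals: + - .

# The four code points covered by the accidental range.
print([(ch, ord(ch)) for ch in "+,-."])

print(bool(unescaped.match(",")))   # comma slips in through the range
print(bool(escaped.match(",")))     # comma is rejected
```

The hyphen itself (45) happens to fall inside the range, so both patterns accept `+-.`; only the comma gives the difference away.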
\- should work to escape the - in the character range. Can you quote what you tested when it didn't seem to? Because it seems to work: http://jsbin.com/odita3
A more generic way of matching hyphens is to use the Unicode category for hyphens and dashes ("\p{Pd}" without quotes), where the regex engine supports Unicode categories. If you are dealing with text from various cultures and sources, you might find that there are more types of hyphens out there, not just one character. You can add that inside the [] expression.
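Python's built-in `re` module does not understand `\p{Pd}`, so the sketch below swaps in a different technique: it looks up each character's Unicode general category with `unicodedata` and treats `Pd` (dash punctuation) as "some kind of hyphen or dash". The helper names `is_dash` and `normalize_dashes` are made up for this illustration.

```python
import unicodedata

def is_dash(ch: str) -> bool:
    """True if ch belongs to the Unicode category Pd (dash punctuation)."""
    return unicodedata.category(ch) == "Pd"

def normalize_dashes(text: str) -> str:
    """Replace every Pd character with a plain ASCII hyphen-minus."""
    return "".join("-" if is_dash(ch) else ch for ch in text)

# U+2013 is EN DASH, U+2014 is EM DASH; both are in category Pd.
print(normalize_dashes("a\u2013b\u2014c-d"))
```

Normalizing first and then applying an ASCII-only pattern is one way to get the effect of `\p{Pd}` in an engine that lacks it.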
I'm just starting to get to grips with Regular Expressions. My first task is to remove all the characters in a string except a-z (upper and lower case), 0-9, and the characters - \ . : and ,
So I tried
objInstance.mystring.replaceAll("[^A-Za-z0-9\\- .:,]", "")
However, this still removes the hyphen and the backslash.
I suspect it's the placement of the \ but some guidance would be helpful here.
You need to escape the backslash as well as the hyphen. Both have a special meaning in the regex, so they must be escaped to be matched as literal characters.
[^A-Za-z0-9\\\-.:,] should be the correct regex. Inside a Java string literal every backslash has to be doubled once more, which gives "[^A-Za-z0-9\\\\\\-.:,]". There's also a space in yours; there's no mention of it in your question, so I removed it. Keep the leading ^: placed directly after the opening [, it negates the class rather than anchoring to the start of the string, and that negation is what makes replaceAll remove everything except the listed characters.
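To see the negated class at work, here is a minimal sketch in Python rather than Java (an assumption for illustration; `strip_disallowed` is a hypothetical helper name). A raw string is used, so the pattern text is exactly what the regex engine sees: `\\` for a literal backslash and `\-` for a literal hyphen.

```python
import re

# Negated class: anything that is NOT a letter, digit, backslash,
# hyphen, period, colon or comma.
DISALLOWED = re.compile(r"[^A-Za-z0-9\\\-.:,]")

def strip_disallowed(text: str) -> str:
    """Remove every character outside the allowed set."""
    return DISALLOWED.sub("", text)

# "\\" in a normal Python string literal is a single backslash character.
print(strip_disallowed("ab-c\\d e.f:g,h!?"))
```

The space, `!` and `?` are removed while the hyphen and the backslash survive.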
Can anyone provide a regular expression for these statements:
annotations/16/16366.eng
annotations/29/21345.eng
annotations/10/20132.eng
And statements of this type. I have tried 'a(\w+).eng' but it did not work.
To match alphanumeric separated by slashes, ending with .eng you can do:
(\w+\/\w+\/\w+\.eng)
Remember that [ and ] are used for sets. You can specify a word verbatim as a match, without any flags. If you wanted to match anything in the same format with annotations you can do:
annotations\/\w+\/\w+\.eng
Where \/ escapes a / and \. escapes a period.
And to simplify it:
[\w/]*\.eng
Meaning "match any number of repetitions of the set containing the word characters \w and /, followed by .eng".
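The middle pattern can be tried against the sample paths with a short sketch. Python's `re` module is assumed here for illustration; because Python patterns are plain strings rather than `/.../` literals, the slashes do not need a backslash, while the period still does.

```python
import re

# "/" is not special in the pattern itself; "\." matches a literal period.
ANNOTATION = re.compile(r"annotations/\w+/\w+\.eng")

samples = [
    "annotations/16/16366.eng",
    "annotations/29/21345.eng",
    "annotations/10/20132.eng",
    "annotations/16/16366xeng",   # no literal period before "eng"
]

for path in samples:
    print(path, bool(ANNOTATION.fullmatch(path)))
```

The last sample shows why the period is escaped: an unescaped `.` would match any character and accept `16366xeng` as well.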
I need to embed user-input in my regular expression, so it needs to be escaped for any regex special characters, and I don't know in advance what the string will be.
It would be something like
string pattern = "\\d+ " + myEscapeFunction(userData);
Which special characters do I need to escape? Or is there an equivalent function to Qt's QRegExp::escape?
The list of characters that you have to escape depends on which of the various regular expression grammars you're using. If you're using the default ECMAScript, it looks like the list in the QRegExp::escape documentation is a good place to start. It says:
The special characters are $, (, ), *, +, ., ?, [, ], ^, {, | and }.
That list leaves out \ for some reason.
But it's slightly more complicated than that, because inside square brackets most of those characters are not special; there, only \ and ], plus ^ in first position and - between two characters, keep a special meaning.
Further, a ? that comes right after a ( is not special. For example, in (?=x) the ? should not be escaped.
I think that's pretty much it, but I haven't put enough time into this to be sure.
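For user input that is embedded outside any square brackets, the list above (with \ added) is enough to build a small escaping helper. The sketch below is in Python rather than C++ (an assumption for illustration), and `escape_for_regex` is a hypothetical name standing in for the `myEscapeFunction` from the question. Python's standard library also provides `re.escape` for the same job.

```python
import re

# The characters from the list above, plus the backslash it leaves out.
SPECIAL = set("\\^$.|?*+()[]{}")

def escape_for_regex(text: str) -> str:
    """Prefix every regex metacharacter in text with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)

user_data = "a+b (c)?"
pattern = r"\d+ " + escape_for_regex(user_data)

print(pattern)
print(bool(re.fullmatch(pattern, "42 a+b (c)?")))
```

After escaping, the user's text is matched character for character; without it, `+`, `(`, `)` and `?` would be interpreted as regex syntax.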
I'm new to regular expressions and I'm having trouble finding what "\'.-" means.
'/^[A-Z \'.-]{2,20}$/i'
So far from my research, I have found that the regular expression starts (^) and requires two to twenty ({2,20}) alphabetical (A-Z) characters. The expression is also case insensitive (/i).
Any hints about what "\'.-" means?
The character class is the entire expression [A-Z \'.-], meaning any of A-Z, space, single quote, period, or hyphen. The \ is needed to protect the single quote, since it's also being used as the string quote. This charclass must be repeated 2 to 20 times, and because of the leading ^ and trailing $ anchors that must be the entire content of the matching string.
It means to escape the single quote (') that delimits the string holding the regex (so as not to end the string prematurely), followed by a . which means a literal . and a - which means a literal -.
Inside of the character range, the . is treated literally, and if the - isn't part of a valid range, e.g. a-z, then it is treated literally as well.
Your regex says Match the characters a-zA-Z '.- between 2 and 20 times as the entire string, with an optional trailing \n.
This regex is in a string. The backslash is there to escape the single quote so the string doesn't end early, in the middle of the regex. The dot and dash are just what they are, a period and a dash.
So, you were nearly right, except it's 2-20 characters that are letters, space, single quote, period, or dash.
It's quoting the quote.
The regular expression is ^[A-Z '.-]{2,20}$ (applied case-insensitively).
In the programming language you are using, you write it as a quoted string:
'SOMETHING'
To get a single quote in there, it's been backslashed.
Everything inside the square brackets is part of the character class, and will match a single character listed. In your example, the characters listed are the letters A through Z, a space, a single quote, a period, or a hyphen. (Note the hyphen is listed last so that it does not indicate a range, like A-Z.) Your full regular expression will match between 2 and 20 of the listed characters. The backslash before the single quote is needed so the compiler knows you are not ending the string that defines the regular expression.
Some examples of things this will match:
....................
abaca af - .
AAfa- - ..
.z
And so on.
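The behaviour described in these answers can be checked with a short sketch. Python's `re` module is assumed here for illustration (the question's pattern is written as a PHP-style string), with `re.IGNORECASE` standing in for the `/i` flag. The string is double-quoted, so the single quote inside the class needs no backslash at all.

```python
import re

# Letters, space, single quote, period or hyphen; 2 to 20 characters in total.
NAME = re.compile(r"^[A-Z '.-]{2,20}$", re.IGNORECASE)

candidates = [
    "." * 20,              # twenty periods: allowed characters, allowed length
    ".z",
    "O'Brien-Smith Jr.",
    "a",                   # too short
    "R2D2",                # digits are not in the class
    "." * 21,              # too long
]

for text in candidates:
    print(repr(text), bool(NAME.match(text)))
```

The first three candidates match; the last three fail on length or on a character outside the class.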