What is the meaning of this regex? [a-zA-Z]|\d
I know that [a-zA-Z] means any letter from a to z in either case, but what does \d mean?
\d is a digit character. Your code means "any alphabetic or numeric character". It could more easily be expressed as [A-Za-z0-9].
\d just means a digit character, it is equivalent to [0-9].
Here's a good reference: http://www.regular-expressions.info/reference.html
In most regex flavors, \d means any numeric digit, and is the same as [0-9].
Your regex as a whole means "match either a single letter of the alphabet, or a single digit."
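To see that the alternation and the single character class agree, here is a minimal Python sketch. The names alternation, char_class and samples are only illustrative, and re.ASCII is passed so that \d is limited to 0-9 for the comparison:

```python
import re

# The pattern from the question, and the single-class form suggested above.
alternation = re.compile(r"[a-zA-Z]|\d", re.ASCII)
char_class = re.compile(r"[A-Za-z0-9]")

samples = ["a", "Z", "7", "_", " ", "-"]

# For each sample, record whether each pattern matches it as a whole.
results = {
    s: (bool(alternation.fullmatch(s)), bool(char_class.fullmatch(s)))
    for s in samples
}

for s, (alt, cls) in results.items():
    print(repr(s), alt, cls)
```

Both patterns accept the letters and the digit and reject the underscore, the space and the hyphen.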
\d matches any digit.
\d matches any digit ( i.e. 0-9 ).
See for example regular expression list
\d means digit and is synonymous with [0-9]. As I type this I see this question is answered twice more, and I bet with the same information.
My favorite books on regex are
http://www.amazon.com/Mastering-Regular-Expressions-Jeffrey-Friedl/dp/1565922573
and
http://www.amazon.com/Beginning-Regular-Expressions-Programmer/dp/0764574892/ref=sr_1_1?s=books&ie=UTF8&qid=1305497415&sr=1-1
Regular expressions are such a powerful thing to master.
Depending upon what your goal is, you might be able to replace that with \w which is a "word character" i.e. any letter, digit or the underscore character.
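The one practical difference between \w and the original pattern is the underscore. A small Python check, as a sketch (variable names are made up for illustration, and re.ASCII keeps both patterns to ASCII characters):

```python
import re

word = re.compile(r"\w", re.ASCII)
letters_or_digits = re.compile(r"[a-zA-Z]|\d", re.ASCII)

# The underscore is a word character but is not matched by the original pattern.
underscore_as_word = bool(word.fullmatch("_"))
underscore_in_original = bool(letters_or_digits.fullmatch("_"))

print(underscore_as_word, underscore_in_original)
```

So swapping in \w is only safe if accepting _ is fine for your use case.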
Related
What does this RegExp mean please?
[\w-\.]
I know the \w stands for word characters and could alternatively be written as:
[A-Za-z0-9_]
I know the \. means that the point will be treated as an ordinary character.
The only thing I don't really know is the hyphen character. Is this used as a Range Operator here or just the hyphen character in e.g. "fine-tune"?
Hyphen here is just the hyphen character.
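Hyphen is treated as a range operator only when it sits between two characters that can act as range endpoints. \w is a shorthand class rather than a single character, so it cannot start a range.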
Hyphen is a normal character here, so it works as [a-zA-Z0-9_.-] (letters, digits, and these three characters: - _ .).
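If you want to avoid the ambiguity altogether, put the hyphen last in the class, where it is always literal. This also matters for portability: some engines, Python's re among them, reject a hyphen placed directly between \w and another character as a bad range. A sketch in Python, with made-up sample strings:

```python
import re

# Hyphen placed last in the class, so it can only be a literal hyphen.
pattern = re.compile(r"[\w.-]+", re.ASCII)

accepts_hyphenated = bool(pattern.fullmatch("fine-tune_v1.2"))
rejects_space = pattern.fullmatch("fine tune") is None

# Between two literal characters the hyphen is a range operator instead:
# [a-c] matches a, b or c, but not a hyphen.
range_class = re.compile(r"[a-c]")
range_matches_hyphen = bool(range_class.fullmatch("-"))

print(accepts_hyphenated, rejects_space, range_matches_hyphen)
```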
I am using regular expression to validate a pattern followed by a fraction. I found these and they match what I need. Overall I want to match 1 to 2 numbers followed by the fraction. How are these expressions different?
/^[0-9]+(?:[\xbc\xbd\xbe])$/ugm
/^\d+(?:[\xbc\xbd\xbe])$/ugm
/^\w+(?:\w+)$/ugm
I need to match the following:
12½
1¼
11¾
but not match:
111½
11111¼
0¾
Well, to begin with, [0-9] matches exactly the ten ASCII digits 0 to 9, and is not necessarily the same as \d.
\d matches the digits 0-9 and, when the engine is matching in a Unicode-aware mode, digit characters from other scripts as well.
\w matches any word character (letter, number, or underscore)
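How \d behaves depends on the flavor and its flags, so it is worth checking in your own engine. As one illustration (Python rather than the /u flavor used in the question): in Python 3's re module, \d on text patterns matches any Unicode decimal digit unless re.ASCII is set. The variable names below are just for the sketch:

```python
import re

arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE, a Unicode decimal digit
one_half = "\u00bd"      # VULGAR FRACTION ONE HALF, numeric but not a decimal digit

# Default (Unicode) matching: \d accepts the Arabic-Indic digit.
unicode_d = bool(re.fullmatch(r"\d", arabic_three))
# With re.ASCII, \d is restricted to 0-9.
ascii_d = bool(re.fullmatch(r"\d", arabic_three, re.ASCII))
# The explicit class only ever matches 0-9.
bracket = bool(re.fullmatch(r"[0-9]", arabic_three))
# The fraction character itself is not matched by \d.
half_is_digit = bool(re.fullmatch(r"\d", one_half))

print(unicode_d, ascii_d, bracket, half_is_digit)
```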
Although these expressions may all match your sample input, the 3rd one will eventually fail you.
It will also match a string like foobar, which contains neither digits nor Unicode fractions.
In a quick benchmark, your 2nd solution was about 16% slower than your first, and it also matches Unicode and other digit characters.
I would stick with your first expression, and change it to match one or two digits, the first of which is not zero.
/^[1-9][0-9]?(?:[\xbc\xbd\xbe])$/ugm
or even
/^[1-9][0-9]?(?:[\xbc-\xbe])$/ugm
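To check the pattern against the samples from the question, here is a rough Python equivalent. It is a sketch, not the original PHP/PCRE code: the fractions are written as the Unicode code points U+00BC to U+00BE, and fullmatch takes the place of the ^ and $ anchors. The helper name is_valid is made up:

```python
import re

# One non-zero digit, an optional second digit, then one of the
# fraction characters U+00BC (1/4), U+00BD (1/2), U+00BE (3/4).
FRACTION = re.compile(r"[1-9][0-9]?[\u00bc-\u00be]")

def is_valid(text: str) -> bool:
    """Return True if the whole string is 1-2 digits followed by a fraction."""
    return FRACTION.fullmatch(text) is not None

should_match = ["12\u00bd", "1\u00bc", "11\u00be"]
should_not_match = ["111\u00bd", "11111\u00bc", "0\u00be"]

accepted = [s for s in should_match if is_valid(s)]
rejected = [s for s in should_not_match if not is_valid(s)]

print(accepted)
print(rejected)
```

All three wanted inputs are accepted and all three unwanted ones are rejected, including the leading zero case.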
Try the following:
^[1-9][0-9]?[\xbc\xbd\xbe]$
[0-9] and \d are equivalent as long as matching is restricted to ASCII digits. \w matches a "word" character. The expression [1-9] matches a digit which is not zero (since you specifically asked how to exclude that).
This unattractively hard-codes for some legacy 8-bit character set; for future compatibility, you should consider switching to Unicode.
You can try
/^[1-9][0-9]?(?:[\xbc\xbd\xbe])$/ugm
I'm not exactly a pro when it comes to regex and I have a PHP script that runs things through this regex:
^[\d\D]{1,}$
What is this supposed to do? It seems to match everything.
\d matches any digit
\D matches any non-digit.
[\d\D] matches all digits and non-digits.
{1,} asks for the match in [] to be repeated at least 1 time (with no upper limit).
So it matches everything with at least 1 character in it.
Reference: http://www.regular-expressions.info/reference.html
In short all that regex is doing is this:
^.+$
This means: match a string of one or more characters of any kind (digits or non-digits).
^[\d\D]{1,}$ will match a string which contains one or more {1,} of any digit \d or non-digit \D character including newline characters.
In contrast ^.+$ will match a string containing one or more of any character except newlines. If the singleline modifier was added to the regex, i.e. /^.+$/s then the . would also match any character including newlines.
[\d\D] is equivalent to using . in singleline mode, although more commonly [\s\S] is used with the same result.
+ is equivalent to {1,}.
The regex will match the whole of any string that contains at least one character of any kind.
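The newline difference is easy to verify. A short Python sketch, where re.DOTALL plays the role of the singleline modifier /s and the sample text is made up:

```python
import re

text = "line one\nline two"

# [\d\D] matches every character, newlines included.
any_char = bool(re.fullmatch(r"[\d\D]{1,}", text))
# By default . does not match a newline, so the whole string is not matched.
dot_default = bool(re.fullmatch(r".+", text))
# With DOTALL (singleline mode) . matches newlines too.
dot_all = bool(re.fullmatch(r".+", text, re.DOTALL))
# At least one character is required, so the empty string is rejected.
empty_rejected = re.fullmatch(r"[\d\D]{1,}", "") is None

print(any_char, dot_default, dot_all, empty_rejected)
```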
You are right: it matches anything that is at least one character long, but in a kind of overcomplicated and pointless way. [\d\D] is equivalent to . (when . is allowed to match newlines) and {1,} is equivalent to +.
What does \d+ mean in a regular expression?
\d is a digit (a character in the range [0-9]), and + means one or more times. Thus, \d+ means match one or more digits.
For example, the string "42" is matched by the pattern \d+.
You can also find explanations for pieces of regular expressions like this using a tool like Regex101 (online, free) or Regex Coach (downloadable for Windows, free) that will let you enter a regular expression and sample text, then indicate what (if anything) matches the regex. They also try to explain, in words, what the regular expression does.
\d is called a character class and will match digits. It is equal to [0-9].
+ matches 1 or more occurrences of the element before it.
So \d+ means match 1 or more digits.
\d means 'digit'. + means '1 or more times'. So \d+ means one or more digits. It will match 12 and 1.
\d is a digit, + is 1 or more, so a sequence of 1 or more digits
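As a quick illustration of \d+ picking out runs of digits, here is a Python sketch with a made-up sample sentence:

```python
import re

sentence = "Order 42 shipped 7 items in 2011"

# Each match is a maximal run of one or more consecutive digits.
numbers = re.findall(r"\d+", sentence)
print(numbers)  # ['42', '7', '2011']

# A string with no digits produces no matches at all.
no_digits = re.findall(r"\d+", "no numbers here")
print(no_digits)  # []
```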
I need a Regex that matches all instances of any character that is not a-z (space and things like apostrophes need to be selected). Sorry for the noob factor.
//novice
With a somewhat sophisticated regex engine (grep will do just fine) this will be quite general:
/[^[:lower:]]+/
(Note the ^!)
The difference between [:lower:] and [a-z] is that the former should be I18N friendly and match e.g. ü, â etc.
For case-insensitive matching use [:alpha:]; to also include digits use [:alnum:]. [:alnum:] differs from \w in that it doesn't include _ (underscore).
Note that character classes written in this style may be combined as usual (like a-z etc.), e.g. [^[:lower:][:digit:]]+ would match a non-empty string of characters not including any lowercase letters or digits.
Here is regex that will literally match any char that is not a-z. The /g flag indicates a global match which will cover all instances of the match.
/[^a-z]+/g
If you need uppercase letters too, you can either pass the /i flag which indicates case insensitivity:
/[^a-z]+/gi
or include the uppercase chars in character class:
/[^a-zA-Z]+/g
The character class [^a-zA-Z] will match any character that isn't (upper or lowercase) a-z.
I'm sure you can figure out the rest.
\W will match any non-word character, i.e. anything other than a-z, A-Z, 0-9, and underscore.
The following regular expression matches any character other than [a-z]:
/[^a-z]+/
OK.
/[^a-z]+/ will match anything other than lowercase letters.
/[^A-Za-z]+/ will match anything non-alpha.
/\W+/ on most systems will match non-'word' characters. Word characters include A-Z, a-z, 0-9, and '_' (underscore). Note that that is an uppercase W.
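To compare the variants side by side, here is a small Python sketch. The sample text is invented, and re.findall stands in for the /g flag since it returns every match:

```python
import re

text = "don't stop, Believe!"

# Runs of anything that is not a lowercase a-z (the uppercase B counts).
non_lower = re.findall(r"[^a-z]+", text)
# Runs of anything that is not a letter in either case.
non_alpha = re.findall(r"[^a-zA-Z]+", text)
# Same idea using the case-insensitive flag instead of listing A-Z.
non_alpha_flag = re.findall(r"[^a-z]+", text, re.IGNORECASE)
# Runs of non-word characters (not a letter, digit or underscore).
non_word = re.findall(r"\W+", text)

print(non_lower)
print(non_alpha)
print(non_alpha_flag)
print(non_word)
```

On this sample the last three give the same result. They would differ on input containing digits or underscores, which \W treats as word characters and therefore skips.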
If you ever need to create another regex try reading this. Teaching to fish and all that. :)