Regex: Match an alphanumeric string that could contain the character '-'

Eg: "_V9DXkFMCEeGrv54B-L8--A"
\w+ alone will not work

You can use:
[\w-]+

Use this pattern: [-\w]+. \w matches an alphanumeric character or an underscore. The exact pattern depends on your language. For example, in Java you have to write [-\\w]+ in a string literal, and there may also be languages where - is a special character that you have to escape too. So please edit your question and add the language you use.
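As a quick check of the character-class approach, here is a minimal Python sketch. Python is an assumption here, since the question does not name a language, and the helper name is_token is hypothetical. It uses re.fullmatch so that the whole string must consist of word characters or hyphens.

```python
import re

# [\w-]+ : one or more word characters (letters, digits, underscore) or hyphens.
# The hyphen sits at the edge of the class so it is read literally, not as a range.
TOKEN = re.compile(r"[\w-]+")


def is_token(s: str) -> bool:
    """Return True if the whole string consists of word characters and hyphens."""
    return TOKEN.fullmatch(s) is not None


print(is_token("_V9DXkFMCEeGrv54B-L8--A"))  # True
print(is_token("has space"))                # False
```

Plain \w+ fails on the same input because the hyphen is not a word character.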

Related

Why does the regex below not work in vim? (Specifically, the ? in the regex)

[0-9]* (-)?[0-9]* q
This regex does not seem to work in vim for the text below
34778965 -1 q some text here
[SOLVED]
Thank you! I realize it all has to be escaped. What felt inconsistent is some needed to be escaped like \? but some not, like * or $. vimregex.com helped.
You can use \d to denote numbers in vim regex.
A good pattern for your text would be
\d\+ -\?\d\+ q
In general, vim treats most characters as literals. So if you give \d+, it is understood as any digit followed by a plus sign. You therefore have to escape such regex-specific characters in patterns.
In Vim, you need to escape some common regex special characters for them to act as special operators. E.g. the (-) group must be written as \(-\) outside very magic mode. In very magic mode, your pattern works as is - :%s/\v[0-9]* (-)?[0-9]* q/replace/g
In your case, you just do not need the grouping at all because you quantify one single hyphen inside parentheses, so they can be removed:
[0-9]* -\?[0-9]* q
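For comparison, in a Perl-style engine the ? quantifier needs no backslash. The following is a minimal Python sketch (Python is used only for illustration, the question itself is about vim) showing that the grouped and ungrouped forms match the same text from the question.

```python
import re

line = "34778965 -1 q some text here"

# In Perl-style syntax ? is already a quantifier, so no backslash is needed.
ungrouped = re.match(r"[0-9]* -?[0-9]* q", line)
# The parentheses only add a capture group; the overall match is unchanged.
grouped = re.match(r"[0-9]* (-)?[0-9]* q", line)

print(ungrouped.group())  # 34778965 -1 q
print(grouped.group())    # 34778965 -1 q
```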

Regular expression to allow a few special characters

I am new to Validation through RegEx
I want to Validate an input field through regex that
Must have Alphanumeric Characters
Must contain - _ / . ( )
In this case your regex will define a set (you need to escape some special characters with \):
^[a-zA-Z0-9\-_\/\.\(\)]*$
This one will do the job:
^[\w/.()-]+$
\w means [a-zA-Z0-9_]
For alpha numeric characters (assuming English characters), you can use the following: ^[A-Za-z0-9_\/.()-]+$.
Please take a look at this tutorial for more information and here for a more detailed explanation of the regex.
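A minimal Python sketch of this kind of validation follows. The function name is_valid and the sample inputs are hypothetical. It assumes the field may contain only ASCII letters, digits, underscore and the listed punctuation, and that an empty value should be rejected.

```python
import re

# re.ASCII restricts \w to [a-zA-Z0-9_]; without it Python's \w would also
# accept non-ASCII letters and digits.
ALLOWED = re.compile(r"[\w/.()-]+", re.ASCII)


def is_valid(value: str) -> bool:
    """Return True if every character of value is in the allowed set."""
    return ALLOWED.fullmatch(value) is not None


print(is_valid("report_v1.2/(final)-a"))  # True
print(is_valid("bad name!"))              # False
```

fullmatch plays the role of the ^ and $ anchors in the patterns above.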

How can I create an alphanumeric Regex for all languages?

I had this problem today:
This regex matches only English: [a-zA-Z0-9].
If I need support for any language in this world, what regex should I write?
If you use character class shorthands and a Unicode aware regex engine you can do that. The \w class matches "word characters" (letters, digits, and underscores).
Beware of some regex flavors that don't do this so well: JavaScript uses ASCII for \d (digits) and \w, but Unicode for \s (whitespace). XML does it the other way around.
Alphabet/Letter: \p{L}
Number: \p{N}
So for an alphanumeric match in all languages, you can use: [\p{L}\p{N}]+
I was looking for a way to replace all non-alphanum chars for all languages with a space in JS and ended up using the following way to do it:
const regexForNonAlphaNum = new RegExp(/[^\p{L}\p{N}]+/ug);
someText.replace(regexForNonAlphaNum, " ");
Since this is JS, we need to add the u flag to make the regex Unicode-aware. The g flag stands for global, as I wanted to match all instances and not just the first one.
References:
https://www.linkedin.com/pulse/regex-one-pattern-rule-them-all-find-bring-darkness-bind-carranza/?trackingId=U6tRte%2BzTAG6O4AA3CrFmA%3D%3D
https://www.regular-expressions.info/unicode.html
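The same replacement can be sketched in Python, with one caveat: the standard re module does not support \p{L} or \p{N}. The sketch below instead relies on \w being Unicode-aware for str patterns, which is an approximation of [\p{L}\p{N}] rather than an exact equivalent. The function name collapse_non_alnum is hypothetical.

```python
import re

# [\W_]+ : runs of characters that are not Unicode letters or digits.
# \W alone does not match the underscore, so it is added explicitly.
NON_ALNUM = re.compile(r"[\W_]+")


def collapse_non_alnum(text: str) -> str:
    """Replace each run of non-alphanumeric characters with a single space."""
    return NON_ALNUM.sub(" ", text)


print(collapse_non_alnum("Привет, мир! 日本語_123"))  # Привет мир 日本語 123
```

If exact Unicode property classes are needed, the third-party regex package accepts \p{L} and \p{N} directly.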
A regex covering most Western European (Latin-script) languages:
^[A-Za-zÀ-Ÿ\d-]*$
The regex below is the only one that worked for me (Java):
"\\p{LD}+" ==> LD means any letter or digit.
If you want to clean your text from any non alphanumeric characters you can use the following:
text.replaceAll("\\P{LD}+", "");//Note P is capital.

Regex help NOT a-z or 0-9

I need a regex to find all chars that are NOT a-z or 0-9
I don't know the syntax for the NOT operator in regex.
I want the regex to be NOT [a-z, A-Z, 0-9].
Thanks in advance!
It's ^. Your regex should use [^a-zA-Z0-9]. Beware: this character class may have unexpected behavior with non-ascii locales. For instance, this would match é.
Edited
If the regexes are perl-compatible (PCRE), you can use \s to match all whitespace. This expands to include spaces and other whitespace characters. If they're posix-compatible, use [:space:] character class (like so: [^a-zA-Z0-9[:space:]]). I would recommend using [:alnum:] instead of a-zA-Z0-9.
If you want to match the end of a line, you should include a $ at the end. Multiline mode is only needed when your match should extend across multiple lines, and it can reduce performance for larger files since more must be read into memory.
Why don't you include a copy of sample input, the text you want to match, and the program you are using to do so?
It's pretty simple; you just add ^ at the beginning of a character set to negate that character set.
For example, the following pattern will match everything that's not in that character set -- i.e., not a lowercase ASCII character or a digit:
[^a-z0-9]
As a side note, some of the more helpful Regular Expression resources I've found have been this site and this cheat sheet (C# specific).
Put a ^ at the beginning of your character class expression: [^a-z0-9]
Put ^ at the start: [^a-zA-Z0-9]
For the condition, try one of these:
preg_match();
preg_replace();
eregi();
You can also use \W. It's a shorthand for a non-word character (equal to [^a-zA-Z0-9_]).
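To make the negated class concrete, here is a minimal Python sketch with a made-up sample string. It shows the difference between negating only lowercase letters and digits and negating both cases.

```python
import re

sample = "abc-DEF 123!"

# Everything that is not a lowercase ASCII letter or a digit.
not_lower_or_digit = re.findall(r"[^a-z0-9]", sample)
# Everything that is not an ASCII letter (either case) or a digit.
not_alnum = re.findall(r"[^a-zA-Z0-9]", sample)

print(not_lower_or_digit)  # ['-', 'D', 'E', 'F', ' ', '!']
print(not_alnum)           # ['-', ' ', '!']
```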

Regex to match all of a set except certain ones

I'm sure this has been asked before, but I can't seem to find it (or know the proper wording to search for)
Basically I want a regex that matches all non-alphanumeric except hyphens. So basically match \W+ except exclude '-' I'm not sure how to exclude specific ones from a premade set.
\W is a shorthand for [^\w]. So:
[^\w-]+
A bit of background:
[…] defines a set
[^…] negates a set
Generally, every \v (lowercase) set is negated by a \V (uppercase), where v is any letter that defines a set.
For international characters, you may want to look into [[:alpha:]] and [[:alnum:]].
[^\w-]+
will do just that. It matches any character that is neither in the \w set nor a hyphen.
You can use:
[^a-zA-Z0-9_-]
or
[^\w-]
to match a single non-hyphen, non-alphanumeric character. To match one or more of them, append a +.
In Java 7 or above, you need to prepend (?U) (the UNICODE_CHARACTER_CLASS flag) so that \w also covers non-ASCII word characters, e.g.
(?U)[^\w-]
In a Java string (you need to escape \ character with another one):
(?U)[^\\w-]
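A minimal Python sketch of the same idea follows. The function name and sample text are hypothetical. In Python 3, \w is already Unicode-aware for str patterns, so no flag comparable to Java's (?U) is needed.

```python
import re

# [^\w-]+ : runs of characters that are neither word characters nor hyphens.
NOT_WORD_OR_HYPHEN = re.compile(r"[^\w-]+")


def non_word_runs(text: str) -> list[str]:
    """Return every run of characters that is neither \\w nor a hyphen."""
    return NOT_WORD_OR_HYPHEN.findall(text)


text = "foo-bar, baz! qux_1"
print(non_word_runs(text))                # [', ', '! ']
print(NOT_WORD_OR_HYPHEN.sub(" ", text))  # foo-bar baz qux_1
```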