Regex to find special characters in a String with some exceptions - regex

I just had a similar (but not exact) question answered. Now I need help with the question mentioned below.
I want to write a regex which matches a character if it's a non-word, non-digit and non-star (*) character. So, the characters [0-9][a-z][A-Z] * should not match and the others should.
I tried writing [\W[^*]] but it doesn't seem to work.

Try this instead:
[^\w\*]

The simplest regular expression that matches a single character, that is not one of those you described, independent of any particular regular-expression extensions, would be:
[^0-9a-zA-Z *]

[^\w\*]
Simple enough.

Please try the following regex:
[\W_]
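The suggested classes are not interchangeable, which a quick check makes visible. This is a minimal Python sketch; the sample string is made up for illustration, and the results depend on Python's definition of \w, which includes the underscore.

```python
import re

sample = "a_b*c d-e!"  # hypothetical test string

# [^\w*]: anything that is neither a word character nor a star.
# (Escaping the star inside a character class is optional.)
not_word_or_star = re.findall(r"[^\w*]", sample)
print(not_word_or_star)   # [' ', '-', '!']

# Explicit ASCII ranges: "_" now counts as special, the space does not.
explicit_ranges = re.findall(r"[^0-9a-zA-Z *]", sample)
print(explicit_ranges)    # ['_', '-', '!']

# [\W_]: non-word characters plus "_"; note that it still matches the star.
non_word_plus_underscore = re.findall(r"[\W_]", sample)
print(non_word_plus_underscore)  # ['_', '*', ' ', '-', '!']
```

So the right choice depends on whether the underscore and the space should count as special characters.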

Related

Understanding regex with OR

I have a regular expression like this: ('0'|['0'‐'9']+'.'['0'‐'9' 'a'‐'f']*)
In order to test it I am using a handy tool called http://www.regexpal.com/
The thing is that I am getting stuck when trying to understand the logic, inserting a '0' is fine but then I don't get why the OR prevents inserting other characters. Any explanation is appreciated.
I'm not sure you understand how the brackets in the regex are working. It isn't the OR part that is preventing you.
('0'|['0'‐'9']+'.'['0'‐'9' 'a'‐'f']*)
Will match either '0' with the quotes or for example 0000000'z''''9 or anything else like it. The quotes are treated as literal and the period must be escaped because it is a wildcard.
(0|[0-9]+\.[0-9a-f]*)
May be what you are looking for. This will match values such as 0 or 23. or 3.14159
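As a quick check of the corrected pattern, here is a small Python sketch. The helper name is_valid and the sample inputs are made up for illustration, and fullmatch is used on the assumption that the whole string should match.

```python
import re

# Corrected pattern: a lone 0, or digits, a literal dot, then characters from 0-9 / a-f.
pattern = re.compile(r"(0|[0-9]+\.[0-9a-f]*)")

def is_valid(text: str) -> bool:
    """Return True when the entire string matches the pattern."""
    return pattern.fullmatch(text) is not None

for candidate in ["0", "23.", "3.14159", "1.5g", "7"]:
    print(candidate, is_valid(candidate))
```

The first three candidates are accepted; "1.5g" fails because g is outside a-f, and "7" fails because the second alternative requires a dot.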
There are numerous problems in your regex (as others have pointed out), but I'll explain something about alternations.
Most regex flavors will short-circuit alternations.
This means that you should reorder it, if you want it to match the other expression first.
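A short Python sketch of that ordering effect, using the unquoted form of the pattern and a made-up input. Python's re is one of the flavors that tries alternatives from left to right.

```python
import re

text = "0.5"

# The "0" alternative comes first and succeeds, so the engine stops there.
zero_first = re.match(r"0|[0-9]+\.[0-9a-f]*", text)
print(zero_first.group())    # 0

# With the longer alternative first, the whole number is matched.
number_first = re.match(r"[0-9]+\.[0-9a-f]*|0", text)
print(number_first.group())  # 0.5
```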

How do regex positive look-behinds work?

I have been working through old Stack Overflow questions to improve my regex knowledge. As I have a basic knowledge of regex, most of them were easy, but this regex problem is tough.
It asks for a regex that extracts from this kind of string ou=persons,ou=(.*),dc=company,dc=org the last string immediately preceded by a comma not followed by (.*). In the last case, this should give dc=company,dc=org.
The solution is (?<=,(?!.*\Q(.*)\E)).* but I cannot understand its flow. I understood the (?!.*\Q(.*)\E) portion, but the rest is still a mystery to me, especially ?<=, which is a positive look-behind. Does it search from the end of the string? Can anyone explain it to me like I am a 7-year-old kid? And please note that http://regex101.com/ is not helping.
The look-behind portion of the regex (?<=,(?!.*\Q(.*)\E)).* works like this:
1. Start at the first character of the string.
2. Can we match the thing we are looking for? ,(?!.*\Q(.*)\E)
3. If we can't: move forward one character, go to 2, and check for a match again.
4. If a match is found: capture all the remaining characters with .* (or, more generally, try to match the rest of the regex).
For a wordier explanation, consider reading Lookahead and Lookbehind Zero-Length Assertions.
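The steps above can be sketched in Python. Two assumptions: Python's re module has no \Q...\E, so re.escape is used to quote the literal (.*) instead, and the helper scan_by_hand is a hypothetical hand-written imitation of the engine, not how the engine is actually implemented.

```python
import re
from typing import Optional

text = "ou=persons,ou=(.*),dc=company,dc=org"

# Same idea as the answer's regex, with the literal "(.*)" quoted via re.escape.
literal = re.escape("(.*)")
pattern = re.compile(r"(?<=,(?!.*" + literal + r")).*")
regex_result = pattern.search(text).group()
print(regex_result)  # dc=company,dc=org

def scan_by_hand(s: str) -> Optional[str]:
    """Try each position in turn, mirroring the numbered steps."""
    for pos in range(1, len(s) + 1):
        preceded_by_comma = s[pos - 1] == ","           # the look-behind part
        if preceded_by_comma and "(.*)" not in s[pos:]:  # the nested negative lookahead
            return s[pos:]                               # .* takes the rest
    return None

manual_result = scan_by_hand(text)
print(manual_result)  # dc=company,dc=org
```

The scan runs from left to right, not from the end of the string; the first comma is rejected only because (.*) still appears after it.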
A lookbehind allows you to specify a context just before the actual match.
You can say ,(dc=) and only return the capture group, or ,\Kdc=, or (?<=,)dc= to return the match on dc= but require that the comma is present just before the match.
The facility also allows for multiple lookbehinds, so you could do (?<=a.*)(?<=b.*)c to match c only if it is preceded by both a and b somewhere in the input (in flavors that support variable-length lookbehinds).
A lookbehind is basically syntactic sugar, in that you can usually rephrase your conditions using some other regex construct. It can be really handy when you have multiple unanchored constraints, like in the last example.
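The difference between the capture-group form and the lookbehind form can be seen in a small Python sketch (the input string is made up for illustration). The \K variant is left out because Python's built-in re module does not support it.

```python
import re

text = "ou=persons,dc=company,dc=org"

# Capture group: the comma is part of the overall match; group 1 is the part we keep.
with_group = re.search(r",(dc=)", text)
print(with_group.group(0), with_group.group(1))  # ,dc= dc=

# Lookbehind: the comma is required but stays outside the match itself.
with_lookbehind = re.search(r"(?<=,)dc=", text)
print(with_lookbehind.group(0))  # dc=

# The lookbehind match starts one character later, right after the comma.
print(with_group.start(), with_lookbehind.start())  # 10 11
```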

look through a string skipping/ignoring a specific character

I'm trying to wrap my head around this, and all the examples on Google and Stack Overflow aren't helping me understand.
I have these strings:
{{#test: mytest}}
{{#test mytest}}
I want to capture test and mytest from both of those examples, so that whichever form occurs in my string, the same array is returned.
This is what I have so far:
/{{\s*#\s*(.*)\s*:\s*(.*)\s*}}/
This works on the first example but not on the second one.
So I thought maybe the answer would be to search through the string, skipping the :?
You haven't clearly defined your criteria, but assuming that the texts you want to match may not contain whitespace, you can use this:
/{{\s*#\s*([^:\s]*)\s*:?\s*([^}\s]*)\s*}}/
[^:\s]* matches a string that contains neither colons nor whitespace
[^}\s]* matches a string that contains neither closing braces nor whitespace
:? matches an optional colon
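Here is a minimal Python sketch that tries this pattern on both sample strings. The braces are escaped as a precaution, which is an assumption about style rather than part of the answer.

```python
import re

# Same pattern as above, with the literal braces escaped.
pattern = re.compile(r"\{\{\s*#\s*([^:\s]*)\s*:?\s*([^}\s]*)\s*\}\}")

results = [pattern.search(s).groups() for s in ["{{#test: mytest}}", "{{#test mytest}}"]]
print(results)  # [('test', 'mytest'), ('test', 'mytest')]
```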
Try \{\{\s*#([^#:\s]+)\s*:?\s+([^}\s]+)\s*\}\}.
EDIT: Ack, Tim Pietzcker beat me to it. There are some subtle differences between our expressions, but either one should do what you want, I think. I have chosen to use + rather than * in a few places, where my interpretation of your question led me to believe that there had to be at least one of a given character.
EDIT 2: One advantage of my approach would be that it does not match {{#testmy-test}}.
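The difference mentioned in EDIT 2 can be checked with a short Python sketch. The variable names are made up, and the braces in the first pattern are escaped as a precaution.

```python
import re

first = re.compile(r"\{\{\s*#\s*([^:\s]*)\s*:?\s*([^}\s]*)\s*\}\}")
second = re.compile(r"\{\{\s*#([^#:\s]+)\s*:?\s+([^}\s]+)\s*\}\}")

sample = "{{#testmy-test}}"

# The * quantifiers let the first pattern match with an empty second group.
first_groups = first.search(sample).groups()
print(first_groups)           # ('testmy-test', '')

# The second pattern insists on whitespace between the two parts, so it does not match.
print(second.search(sample))  # None

# Both patterns still accept the two forms from the question.
both_ok = all(p.search(s) for p in (first, second)
              for s in ("{{#test: mytest}}", "{{#test mytest}}"))
print(both_ok)                # True
```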

How to write this using regular expression?

I am looking for a regex to match a string like this: 1,2,4-6,9,11-13,20.
Restrictions:
Only numbers, comma and hyphen are allowed
no spaces are allowed
Your question is rather vague. I would suggest improving it, or reading a tutorial on regexes.
Based on your restriction your regex is /^[-\d,]*$/ but I am quite sure that this is not what you want.
You should provide examples of input, output, the regex flavor you will be using and last but not least your attempts to solve the problem.
I am guessing you want to match comma-separated lists of positive integers or positive integer ranges. \d+ matches integers; to allow ranges, you'd use \d+(-\d+)?.
So, the regex
\d+(-\d+)?(,\d+(-\d+)?)*
would do.
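To see how the two suggestions differ, here is a minimal Python sketch. It assumes the whole string must match, hence fullmatch; the names loose and strict and the failing inputs are made up for illustration.

```python
import re

loose = re.compile(r"[-\d,]*")                    # only restricts the alphabet
strict = re.compile(r"\d+(-\d+)?(,\d+(-\d+)?)*")  # numbers or ranges, comma-separated

candidates = ["1,2,4-6,9,11-13,20", "1,,2", "4-", ",,--", "1, 2"]
for candidate in candidates:
    print(repr(candidate),
          bool(loose.fullmatch(candidate)),
          bool(strict.fullmatch(candidate)))
```

The loose pattern accepts everything except the input containing a space, while the strict one accepts only the first candidate.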

Regular expression that rejects all input?

Is it possible to construct a regular expression that rejects all input strings?
Probably this:
[^\w\W]
\w - word character (letter, digit, etc)
\W - opposite of \w
[^\w\W] - should always fail, because any character should belong to one of the character classes - \w or \W
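A quick check in Python, as a sketch only: the sample strings are arbitrary and prove nothing beyond these inputs.

```python
import re

never = re.compile(r"[^\w\W]")

samples = ["", "abc", "123", " \t\n", "*-!", "é"]
any_match = any(never.search(s) for s in samples)
print(any_match)  # False
```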
Other snippets:
$.^
$ - assert position at the end of the string
^ - assert position at the start of the line
. - any char
(?#it's just a comment inside of empty regex)
An empty negative lookahead/lookbehind should work:
(?<!)
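In Python, for instance, both the empty negative lookbehind and its lookahead counterpart (?!) can be tried like this. It is a sketch over a few arbitrary inputs.

```python
import re

samples = ["", "abc", "\n"]

# An empty pattern matches everywhere, so asserting that it does NOT match always fails.
lookbehind_hits = [re.search(r"(?<!)", s) for s in samples]
lookahead_hits = [re.search(r"(?!)", s) for s in samples]
print(lookbehind_hits)  # [None, None, None]
print(lookahead_hits)   # [None, None, None]
```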
The best standard regexes (i.e., no lookahead or back-references) that reject all inputs are (after @aku above)
.^
and
$.
These are flat contradictions: "a string with a character before its beginning" and "a string with a character after its end."
NOTE: It's possible that some regex implementations would reject these patterns as ill-formed (it's pretty easy to check that ^ comes at the beginning of a pattern and $ at the end... with a regular expression), but the few I've checked do accept them. These also won't work in implementations that allow ^ and $ to match newlines.
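The newline caveat can be illustrated with Python's re module. This is a sketch of one implementation's behavior, not a statement about all flavors. In Python, letting ^ and $ match at line breaks (MULTILINE) is not enough on its own; the dot must also be allowed to match a newline (DOTALL).

```python
import re

samples = ["", "a", "a\n", "a\nb"]

# Default flags: neither pattern matches any of the samples.
default_hits = {pat: any(re.search(pat, s) for s in samples) for pat in (r".^", r"$.")}
print(default_hits)  # {'.^': False, '$.': False}

# MULTILINE alone still fails, because "." does not match the newline itself.
multiline_hits = {pat: any(re.search(pat, s, re.MULTILINE) for s in samples)
                  for pat in (r".^", r"$.")}
print(multiline_hits)  # {'.^': False, '$.': False}

# MULTILINE plus DOTALL: the "contradiction" is satisfied by the newline character.
flags = re.MULTILINE | re.DOTALL
before_line_start = re.search(r".^", "a\nb", flags).group()
after_line_end = re.search(r"$.", "a\nb", flags).group()
print(repr(before_line_start), repr(after_line_end))  # '\n' '\n'
```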
(?=not)possible
?= is a positive lookahead. They're not supported in all regexp flavors, but in many.
The expression will look for "not", then look for "possible" starting at the same position (since lookaheads don't move forward in the string).
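A small Python sketch of why this can never match. The second pattern is a made-up satisfiable variant, added only for contrast.

```python
import re

contradiction = re.compile(r"(?=not)possible")

# The lookahead demands "not" at the same position where "possible" must begin.
hits = [contradiction.search(s) for s in ["not possible", "possible", "notpossible"]]
print(hits)  # [None, None, None]

# For contrast: a lookahead that agrees with what follows does match.
agreeing = re.search(r"(?=pos)possible", "possible").group()
print(agreeing)  # possible
```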
One example of why such a thing could possibly be needed is when you want to filter some input with regexes and you pass a regex as an argument to a function.
In spirit of functional programming, for algebraic completeness, you may want some trivial primary regexes like "everything is allowed" and "nothing is allowed".
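As a rough illustration of that use case, here is a Python sketch. The function keep_matching and the constants ALLOW_ALL and ALLOW_NONE are hypothetical names, and the empty negative lookahead (?!) is just one of the never-matching patterns suggested in this thread.

```python
import re
from typing import Iterable, List

ALLOW_ALL = re.compile(r"")       # the empty pattern matches every string
ALLOW_NONE = re.compile(r"(?!)")  # an empty negative lookahead never matches

def keep_matching(lines: Iterable[str], allowed: "re.Pattern") -> List[str]:
    """Keep only the lines in which the given regex finds a match."""
    return [line for line in lines if allowed.search(line)]

lines = ["alpha", "", "beta 42"]
print(keep_matching(lines, ALLOW_ALL))            # ['alpha', '', 'beta 42']
print(keep_matching(lines, ALLOW_NONE))           # []
print(keep_matching(lines, re.compile(r"\d")))    # ['beta 42']
```

With both trivial patterns available, the caller never needs a special case for "no filter" or "block everything".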
To me it sounds like you're attacking a problem the wrong way. What exactly are you trying to solve?
You could do a regular expression that catches everything and negate the result.
e.g. in JavaScript:
if (! str.match( /./ ))
but then you could just do
if (!foo)
instead, as @jan-hani said.
If you're looking to embed such a regex in another regex, you might be looking for $ or ^ instead, or use lookaheads like @henrik-n mentioned.
But as I said, this looks like a "I think I need x, but what I really need is y" problem.
Why would you even want that? Wouldn't a simple if statement do the trick? Something along the lines of:
if ( inputString != "" )
doSomething ()
[^\x00-\xFF]
It depends on what you mean by "regular expression". Do you mean regexps in a particular programming language or library? In that case the answer is probably yes, and you can refer to any of the above replies.
If you mean the regular expressions as taught in computer science classes, then the answer is no. Every regular expression matches some string. It could be the empty string, but it always matches something.
In any case, I suggest you edit the title of your question to narrow down what kinds of regular expressions you mean.
[^]+ should do it.
In answer to aku's comment attached to this, I tested it with an online regex tester (http://www.regextester.com/), and so assume it works with JavaScript. I have to confess to not testing it in "real" code. ;)
EDIT:
[^\n\r\w\s]
Well,
I am not sure if I understood, since I always thought of regular expressions as a way to match strings. I would say the best shot you have is not using regex.
But you can also use a regexp that matches empty lines, like ^$, or a regexp that does not match words/spaces, like [^\w\s] ...
Hope it helps!