Issue while formulating regex for the password field - regex

I am trying to formulate a regexp for a password field, which accepts at least one special character and one alpha numeric character.
I have already tried with this regexp ((?=.*\d)(?=.*[a-zA-Z])(?=.*\W)) on Rubular, which I cooked up. But it's not working properly.
Test String : test#123
Kindly suggest a way to overcome this.
If you can please give some explanation as well.

Your regex actually does match your test string. It seems that you are wanting it to be in your capture group though as you wrapped the look-aheads in parenthesis.
Wrapping a capture group around your look-aheads wont capture anything as they are just looking ahead to verify. You'll have to create a capture group capturing the entire thing after like this:
^(?=.*\d)(?=.*[a-zA-Z])(?=.*\W)(.{6,20})$
The ^ and $ are just checking the entire string passed. The . within the capture group () is just saying to grab the entire match. The {6,20} is saying it has to be between 6 and 20 characters long. You can change the numbers if you want.
Rubular

Related

How to extract characters from a string with optional string afterwards using Regex?

I am in the process of learning Regex and have been stuck on this case. I have a url that can be in two states EXAMPLE 1:
spotify.com/track/1HYcYZCOpaLjg51qUg8ilA?si=Nf5w1q9MTKu3zG_CJ83RWA
OR EXAMPLE 2:
spotify.com/track/1HYcYZCOpaLjg51qUg8ilA
I need to extract the 1HYcYZCOpaLjg51qUg8ilA ID
So far I am using this: (?<=track\/)(.*)(?=\?)? which works well for Example 2 but it includes the ?si=Nf5w1q9MTKu3zG_CJ83RWA when matching with Example 1.
BUT if I remove the ? at the end of the expression then it works for Example 1 but not Example 2! Doesn't that mean that last group (?=\?) is optional and should match?
Where am I going wrong?
Thanks!
I searched a handful of "Questions that may already have your answer" suggestions from SO, and didn't find this case, so I hope asking this is okay!
The capturing group in your regular expression is trying to match anything (.) as much as possible due to the greediness of the quantifier (*).
When you use:
(?<=track\/)(.*)(?=\?)
only 1HYcYZCOpaLjg51qUg8ilA from the first example is captured, as there is no question mark in your second example.
When using:
(?<=track\/)(.*)(?=\??)
You are effectively making the positive lookahead optional, so the capturing group will try to match as much as possible (including the question mark), so that 1HYcYZCOpaLjg51qUg8ilA?si=Nf5w1q9MTKu3zG_CJ83RWA and 1HYcYZCOpaLjg51qUg8ilA are matched, which is not the desired output.
Rather than matching anything, it is perhaps more appropriate for you to match alphanumerical characters \w only.
(?<=track\/)(\w*)(?=\??)
Alternatively, if you are expecting other characters , let's say a hyphen - or a underscore _, you may use a character class.
(?<=track\/)([a-zA-Z0-9_-]*)(?=\??)
Or you might want to capture everything except a question mark ? with a negated character class.
(?<=track\/)([^?]*)(?=\??)
As pointed out by gaganso, a look-behind is not necessary in this situation (or indeed the lookahead), however it is indeed a good idea to start playing around with them. The look-around assertions do not actually consume the characters in the string. As you can see here, the full match for both matches only consists of what is captured by the capture group. You may find more information here.
This should work:
track\/(\w+)
Please see here.
Since track is part of both the strings, and the ID is formed from alphanumeric characters, the above regex which matches the string "track/" and captures the alphanumeric characters after that string, should provide the required ID.
Regex : (\w+(?=\?))|(\w+&)
See the demo for the regex, https://regexr.com/3s4gv .
This will first try to search for word which has '?' just after it and if thats unsuccessful it will fetch the last word.

Complicated regex to match anything NOT within quotes

I have this regex which scans a text for the word very: (?i)(?:^|\W)(very)[\W$] which works. My goal is to upgrade it and avoid doing a match if very is within quotes, standalone or as part of a longer block.
Now, I have this other regex which is matching anything NOT inside curly quotes: (?<![\S"])([^"]+)(?![\S"]) which also works.
My problem is that I cannot seem to combine them. For example the string:
Fred Smith very loudly said yesterday at a press conference that fresh peas will "very, very defintely not" be served at the upcoming county fair. In this bit we have 3 instances of very but I'm only interested in matching the first one and ignore the whole Smith quotation.
What you describe is kind of tricky to handle with a regular expression. It's difficult to determine whether you are inside a quote. Your second regex is not effective as it only ignores the first very that is directly to the right of the quote and still matches the second one.
Drawing inspiration from this answer, that in turn references another answer that describes how to regex match a pattern unless ... I can capture the matches you want.
The basic idea is to use alternation | and match all the things you don't want and then finally match (and capture) what you do want in the final clause. Something like this:
"[^"]*"|(very)
We match quoted strings in the first clause but we don't capture them in a group and then we match (and capture) the word very in the second clause. You can find this match in the captured group. How you reference a captured group depends on your regex environment.
See this regex101 fiddle for a test case.
This regex
(?i)(?<!(((?<DELIMITER>[ \t\r\n\v\f]+)(")(?<FILLER>((?!").)*))))\bvery\b(?!(((?<FILLER2>((?!").)*)(")(?<DELIMITER2>[ \t\r\n\v\f]+))))
could work under two conditions:
your regex engine allows unlimited lookbehind
quotes are delimited by spaces
Try it on http://regexstorm.net/tester

Looking for feedback on regex config

Was looking for a review of a regex I created, as I'm looking to see where improvements could be made
I have the following log message:
2017-02-09T14:12:07.381648+00:00 ATA-CENTER ATA[4844] CEF:0|Microsoft|ATA|1.7.5757.57477|AbnormalBehaviorSuspiciousActivity|Suspicion of identity theft based on abnormal behavior|5|start=2017-02-09T14:07:22.1490000Z app=Kerberos shost=xxx suser=Last Name, First Name msg=text here. cs1Label=url cs1=https://xxx-xxx.xxxx.xxx.xxx/suspiciousActivity/589c7796135ca912ec5b75b0
Here is my regex:
.*?\|ATA\|(?<version>.*?)\|(?:\w+)\|(?<alert>.*?)\|(?<severity>.*?)\|(?:.*?)\s\w+=(?<app>.*?)\s\w+=(?<src_host>.*?)\s\w+=(?<user>.*?)\s\w+=(?<msg>.*?).\s.*?
I'm trying to disregard everything up to ATA, and then disregard everything after the period at the end of the msg (starting at cs1Label).
Would appreciate any feedback.
Thx
There is one small error in your regex. The dot . after the message will match everything. If you really want to match a dot, you need to escape it. (?:\w+) will match every character and every digit. You could also just use \d to match a digit. Moreover, I wouldn't label the capture groups. In my opinion, this does not make the regex more readable.
Otherwise it seems fine to me. Here is a live demo with the fixed dot.

Regex for value.contains() in Google Refine

I have a column of strings, and I want to use a regex to find commas or pipes in every cell, and then make an action. I tried this, but it doesn't work (no syntax error, just doesn't match neither commas nor pipes).
if(value.contains(/(,|\|)/), ...
The funny thing is that the same regex works with the same data in SublimeText. (Yes, I can work it there and then reimport, but I would like to understand what's the difference or what is my mistake).
I'm using Google Refine 2.5.
Since value.match should return captured texts, you need to define a regex with a capture group and check if the result is not null.
Also, pay attention to the regex itself: the string should be matched in its entirety:
Attempts to match the string s in its entirety against the regex pattern p and returns an array of capture groups.
So, add .* before and after the pattern you are looking inside a larger string:
if(value.match(/.*([,|]).*/) != null)
You can use a combination of if and isNonBlank like:
if(isNonBlank(value.match(/your regex/), ...

Find Regex mismatch part in a string using vb.net

I had a regex expression
^\d{9}_[a-zA-Z]{1}_(0[1-9]|1[0-2]).(0[1-9]|[1-2][0-9]|3[0-1]).[0-9]{4}_\d*_[0-9a-zA-Z]*_[0-9a-zA-Z]*
and string that match regex expression
000066874_A_12.31.2014_001_2Q_ICAN14
if user by mistake enters the string other than above format like
000066874_12.31.14_001_2Q_ICAN14
I need to find out in which part of my regex got failed. I tried using Regex.Matches and Regex.Match but using this I couldn't find in which part my string got miss matched with my Regex expression. I am using vb.net
This is very complicated to do with regex. I managed to make this regex, but you still have to check the capture groups after that.
^(?:(?:(\d{9})|.*?)_)?(?:(?:([a-zA-Z]{1})|.*?)_)?(?:(?:((?:0[1-9]|1[0-2]).(?:0[1-9]|[1-2][0-9]|3[0-1]).[0-9]{4})|.*?)_)?(?:(?:(\d*)|.*?)_)?(?:(?:([0-9a-zA-Z]*)|.*?)_)?(?:([0-9a-zA-Z]*)|.*?)$ will work if you, as seen in demo: https://regex101.com/r/aJ1wG1/2
Each part before an underline is a capture group, if a capture group is not there, there's an error in it. As you can see in the example, $3 is not present in 1st example, hence, a mistake in date is there. In second example, the $2 is not present, hence $2 onward are not there. 3rd example is correct and all 6 caputre groups are there.
When regexes get this massive, it's a sign that probably a different method should be used to solve the problem, but this might work for you with some additional code for group result checks.