Can someone please explain what this regexp matches?
#\b(https://exampleurl.com/)([^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#
I have no experience with regexp and I need to know what this one does.
Try it in an online regex explainer; it breaks the whole pattern down:
/[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|))/
[^\s()<>]+ match a single character not present in the list below
Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\s match any white space character [\r\n\t\f ]
()<> a single character in the list ()<> literally (case sensitive)
(?:\([\w\d]+\)|([^[:punct:]\s]|)) Non-capturing group
1st Alternative: \([\w\d]+\)
\( matches the character ( literally
[\w\d]+ match a single character present in the list below
Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\w match any word character [a-zA-Z0-9_]
\d match a digit [0-9]
\) matches the character ) literally
2nd Alternative: ([^[:punct:]\s]|)
1st Capturing group ([^[:punct:]\s]|)
1st Alternative: [^[:punct:]\s]
[^[:punct:]\s] match a single character not present in the list below
[:punct:] matches punctuation characters [POSIX]
\s match any white space character [\r\n\t\f ]
2nd Alternative: (null, matches any position)
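To see the effect on real input, here is a minimal Python sketch using the pattern from the question (which has a `/` in the last alternative, where the breakdown above shows an empty one). Python's `re` module has no `[[:punct:]]` class, so ASCII punctuation from `string.punctuation` stands in for it; that substitution and the name `find_url` are assumptions made for illustration only.

```python
import re
import string

# Assumption: ASCII punctuation stands in for the POSIX class [[:punct:]],
# which Python's re module does not support.
PUNCT = re.escape(string.punctuation)

URL_RE = re.compile(
    # '.' is left unescaped as in the question, so it matches any character.
    r"\b(https://exampleurl.com/)"   # group 1: the fixed prefix
    r"([^\s()<>]+"                   # group 2: run of non-space, non-bracket chars
    r"(?:\([\w\d]+\)"                #   ending in a parenthesised word, e.g. (1)
    r"|([^" + PUNCT + r"\s]|/)))"    #   or in a non-punctuation character or a slash
)

def find_url(text):
    """Return the first match of the pattern in text, or None."""
    m = URL_RE.search(text)
    return m.group(0) if m else None

print(find_url("see https://exampleurl.com/path/page(1), ok"))
print(find_url("go to https://exampleurl.com/foo/bar."))
```

In short, the pattern matches a URL under https://exampleurl.com/ and backtracks so that the match ends in a parenthesised word, a slash, or a non-punctuation character. Trailing punctuation such as a sentence-ending period is left out of the match.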
What does this ((\d\d\d)\s)? regex match?
\d matches digits; exactly which characters count as digits depends on the language you are using.
In Python 3, [0-9] matches only the characters 0123456789, while \d matches [0-9] and other digit characters as well, for example the Eastern Arabic numerals ٠١٢٣٤٥٦٧٨٩.
\s matches any whitespace character
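The difference between \d and [0-9] in Python 3 can be checked directly. This is a small sketch; the helper name `is_all_digits` is made up for illustration.

```python
import re

# Eastern Arabic digits U+0660..U+0669, built from code points to avoid
# any right-to-left display issues in the source.
eastern_arabic = "".join(chr(0x0660 + i) for i in range(10))

def is_all_digits(pattern, text, flags=0):
    """True if the whole text matches the given pattern."""
    return re.fullmatch(pattern, text, flags) is not None

print(is_all_digits(r"\d+", eastern_arabic))            # Unicode-aware \d
print(is_all_digits(r"[0-9]+", eastern_arabic))         # ASCII range only
print(is_all_digits(r"\d+", eastern_arabic, re.ASCII))  # \d restricted to ASCII
```

With the `re.ASCII` flag, \d behaves like [0-9] again.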
\d matches a digit, [0-9].
\s matches whitespace characters such as [ \t\n\r].
? means optional: the overall match still succeeds if the preceding group is absent.
() are used for grouping.
Now the question is what does ((\d\d\d)\s)? match?
Groups are numbered by their opening parenthesis, so the outer group ((\d\d\d)\s) is $1 and the inner group (\d\d\d) is $2.
(\d\d\d) matches 3 consecutive digits and captures them as $2.
((\d\d\d)\s) matches 3 consecutive digits followed by a whitespace character and captures them as $1.
Since the ? follows the outer group, the regex matches three digits followed by whitespace, and it also matches when they are not there.
In that case it matches the empty string at the current position, for example the start of the line.
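The group numbering and the effect of the trailing ? can be verified with a short Python sketch. The names `OPTIONAL_PREFIX` and `groups_for` are hypothetical.

```python
import re

OPTIONAL_PREFIX = re.compile(r"((\d\d\d)\s)?")

def groups_for(text):
    """Return (whole match, group 1, group 2) at the start of text."""
    # match() always succeeds here because the whole pattern is optional.
    m = OPTIONAL_PREFIX.match(text)
    return m.group(0), m.group(1), m.group(2)

print(groups_for("123 abc"))  # digits and whitespace present
print(groups_for("abc"))      # nothing to match: empty match, groups unset
```

When the optional group does not take part in the match, Python reports its captures as None and the overall match is the empty string.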
The regular expression:
The first backslash escapes the open parenthesis that follows it, because ( is a special character; the regex therefore looks for a literal open and close parenthesis in the input string.
Example: (111)
Have a look at this site:
https://regex101.com/r/yS5fU8/2
1st Capturing Group (\d\d\d)
\d matches a digit (equal to [0-9])
\d matches a digit (equal to [0-9])
\d matches a digit (equal to [0-9])
and
\s matches any whitespace character (equal to [\r\n\t\f\v ])
How can I remove all spaces inside square brackets with Notepad++ and regex?
For example:
I have the string [Word1 Word2 Word3]
I need: [Word1Word2Word3]
Thanks
\s++(?=[^[]*])
\s++
\s matches any whitespace character (equal to [\r\n\t\f\v ])
++ Quantifier — Matches between one and unlimited times, as many times as possible, without giving back (possessive)
Positive Lookahead (?=[^[]*])
Assert that the Regex below matches
Match a single character not present in the list below [^[]*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
[ matches the character [ literally (case sensitive)
] matches the character ] literally (case sensitive)
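The lookahead idea in this first pattern can be tried outside Notepad++ as well. Below is a rough Python equivalent, offered as a sketch rather than the exact Notepad++ expression. It uses a plain `\s+` instead of the possessive `\s++` (which Python only accepts from version 3.11) and escapes the brackets inside the character class. The function name `strip_spaces_in_brackets` is made up.

```python
import re

def strip_spaces_in_brackets(text):
    """Remove whitespace that is followed by a ']' with no '[' in between."""
    return re.sub(r"\s+(?=[^\[]*\])", "", text)

print(strip_spaces_in_brackets("[Word1 Word2 Word3]"))
print(strip_spaces_in_brackets("a b [Word1 Word2] c d"))
```

Note that the lookahead only checks that a closing bracket follows before any opening bracket, so whitespace before a stray, unmatched ] would be removed too.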
(?:\[|\G(?!^))[^]\s]*\K\s+
Non-capturing group (?:\[|\G(?!^))
1st Alternative \[
\[ matches the character [ literally (case sensitive)
2nd Alternative \G(?!^)
\G asserts position at the end of the previous match or the start of the string for the first match
Negative Lookahead (?!^)
Assert that the Regex below does not match
^ asserts position at start of the string
Match a single character not present in the list below [^]\s]*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
] matches the character ] literally (case sensitive)
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\K resets the starting point of the reported match. Any previously consumed characters are no longer included in the final match
\s+
\s matches any whitespace character (equal to [\r\n\t\f\v ])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
I'm writing a regex to pull a URL out of an auto-generated email from my monitoring system. For example:
https://mon.contoso.com/mon/call.py?fn=edit&num=1389896156
I need a regex to match:
https://mon.contoso.com/mon/call.py?fn=edit&num=XXXXXXXXX
where the "X"s always change. I run into an issue with the "?". The point of this is to append the URL to a field in JIRA.
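// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
// '.' and '?' are escaped so they match literally; "\\d" is how \d is written in a Java string.
Pattern p = Pattern.compile("https://mon\\.contoso\\.com/mon/call\\.py\\?fn=edit&num=(\\d+)");
Matcher m = p.matcher(inputEmail);
// find() searches inside the email; matches() would require the whole input to be the URL.
return m.find() ? m.group(1) : "";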
This returns the value of num if it is numeric; if it can contain other characters, use \w instead of \d. If you want the whole URL, call m.group() with no argument.
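The trouble with the "?" comes from it being a quantifier: unescaped, `y?` means "an optional y", so the literal question mark in the URL is never matched. A minimal Python sketch of the same idea follows; the sample email text and the name `extract` are invented for illustration.

```python
import re

# Escaped version: '.' and '?' match literally.
URL_RE = re.compile(r"https://mon\.contoso\.com/mon/call\.py\?fn=edit&num=(\d+)")
# Unescaped version, as one might first write it.
UNESCAPED_RE = re.compile(r"https://mon.contoso.com/mon/call.py?fn=edit&num=(\d+)")

def extract(text):
    """Return (whole URL, num value) for the first match, or None."""
    m = URL_RE.search(text)
    return (m.group(0), m.group(1)) if m else None

email = ("Alert raised, see "
         "https://mon.contoso.com/mon/call.py?fn=edit&num=1389896156 for details")
print(extract(email))
print(UNESCAPED_RE.search(email))  # no match: the literal '?' is not consumed
```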
You don't indicate what language you're working in.
In Python and JavaScript, this regex will identify a variety of URLs:
/\[[^\]\n]+\](?:\([^\)\n]+\)|\[[^\]\n]+\])|(?:\/\w+\/|.:\\|\w*:\/\/|\.+\/[./\w\d]+|(?:\w+\.\w+){2,})[./\w\d:/?#\[\]#!$&'()*+,;=\-~%]*/gi
You can refer to this regex101 test for examples of the regex in use.
Explanation:
/\[[^\]\n]+\](?:\([^\)\n]+\)|\[[^\]\n]+\])|(?:\/\w+\/|.:\\|\w*:\/\/|\.+\/[./\w\d]+|(?:\w+\.\w+){2,})[./\w\d:/?#\[\]#!$&'()*+,;=\-~%]*/gi
1st Alternative: \[[^\]\n]+\](?:\([^\)\n]+\)|\[[^\]\n]+\])
\[ matches the character [ literally
[^\]\n]+ match a single character not present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\] matches the character ] literally
\n matches a line-feed (newline) character (ASCII 10)
\] matches the character ] literally
(?:\([^\)\n]+\)|\[[^\]\n]+\]) Non-capturing group
1st Alternative: \([^\)\n]+\)
\( matches the character ( literally
[^\)\n]+ match a single character not present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\) matches the character ) literally
\n matches a line-feed (newline) character (ASCII 10)
\) matches the character ) literally
2nd Alternative: \[[^\]\n]+\]
\[ matches the character [ literally
[^\]\n]+ match a single character not present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\] matches the character ] literally
\n matches a line-feed (newline) character (ASCII 10)
\] matches the character ] literally
2nd Alternative: (?:\/\w+\/|.:\\|\w*:\/\/|\.+\/[./\w\d]+|(?:\w+\.\w+){2,})[./\w\d:/?#\[\]#!$&'()*+,;=\-~%]*
(?:\/\w+\/|.:\\|\w*:\/\/|\.+\/[./\w\d]+|(?:\w+\.\w+){2,}) Non-capturing group
1st Alternative: \/\w+\/
\/ matches the character / literally
\w+ match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\/ matches the character / literally
2nd Alternative: .:\\
. matches any character (except newline)
: matches the character : literally
\\ matches the character \ literally
3rd Alternative: \w*:\/\/
\w* match any word character [a-zA-Z0-9_]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
: matches the character : literally
\/ matches the character / literally
\/ matches the character / literally
4th Alternative: \.+\/[./\w\d]+
\.+ matches the character . literally
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\/ matches the character / literally
[./\w\d]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
./ a single character in the list ./ literally
\w match any word character [a-zA-Z0-9_]
\d match a digit [0-9]
5th Alternative: (?:\w+\.\w+){2,}
(?:\w+\.\w+){2,} Non-capturing group
Quantifier: {2,} Between 2 and unlimited times, as many times as possible, giving back as needed [greedy]
\w+ match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\. matches the character . literally
\w+ match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
[./\w\d:/?#\[\]#!$&'()*+,;=\-~%]* match a single character present in the list below
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
./ a single character in the list ./ literally
\w match any word character [a-zA-Z0-9_]
\d match a digit [0-9]
:/?# a single character in the list :/?# literally
\[ matches the character [ literally
\] matches the character ] literally
#!$&'()*+,;= a single character in the list #!$&'()*+,;= literally (case insensitive)
\- matches the character - literally
~% a single character in the list ~% literally
g modifier: global. All matches (don't return on first match)
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
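For reference, the same pattern can be written in Python by dropping the /.../gi literal syntax: the i modifier becomes `re.IGNORECASE`, and the g modifier corresponds to iterating over all matches. This is only a sketch; the names `BROAD_URL_RE` and `find_urls` and the sample text are assumptions.

```python
import re

BROAD_URL_RE = re.compile(
    # 1st alternative: [text](target) or [text][ref]
    r"\[[^\]\n]+\](?:\([^\)\n]+\)|\[[^\]\n]+\])"
    # 2nd alternative: a path, drive, scheme or dotted-name prefix ...
    r"|(?:\/\w+\/|.:\\|\w*:\/\/|\.+\/[./\w\d]+|(?:\w+\.\w+){2,})"
    # ... followed by any run of URL characters
    r"[./\w\d:/?#\[\]#!$&'()*+,;=\-~%]*",
    re.IGNORECASE,
)

def find_urls(text):
    """Return every non-overlapping match, like the g modifier."""
    return [m.group(0) for m in BROAD_URL_RE.finditer(text)]

print(find_urls(
    "Ticket: https://mon.contoso.com/mon/call.py?fn=edit&num=1389896156 please"
))
```

Because the pattern is deliberately broad, it will also pick up file paths and dotted names, so for the single monitoring URL in the question a specific pattern is likely the safer choice.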
Here are some URLs:
http://www.mywebsite.com/name1-name2-name3-name4-342/46547657/ca
http://www.mywebsite.com/name5-487659826/da
http://www.mywebsite.com/name6-name7-567/5677/ca
http://www.mywebsite.com/name8-name9-name10-48765766/da
http://www.mywebsite.com/name11-name12-name13-name14-name15/11117657/ca
http://www.mywebsite.com/name16-4866626/da
The output should be:
name1-name2-name3-name4-342
name5
name6-name7-567
name8-name9-name10
name11-name12-name13-name14-name15
name16
Could you give me a regex that does that, please?
For the URLs you provided, you could use the following to extract the wanted substrings.
http://[^/]+/\K\w+(?:-(?!\d{4,})\w+)*
Live Demo
http://.*mywebsite\.com/(\w+(?:-(?!\d{4,})\w+)*)
Options: ^ and $ match at line breaks
Match the characters “http://” literally «http://»
Match any single character that is not a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the characters “mywebsite” literally «mywebsite»
Match the character “.” literally «\.»
Match the characters “com/” literally «com/»
Match the regular expression below and capture its match into backreference number 1 «(\w+(?:-(?!\d{4,})\w+)*)»
Match a single character that is a “word character” (letters, digits, etc.) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regular expression below «(?:-(?!\d{4,})\w+)*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “-” literally «-»
Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!\d{4,})»
Match a single digit 0..9 «\d{4,}»
Between 4 and unlimited times, as many times as possible, giving back as needed (greedy) «{4,}»
Match a single character that is a “word character” (letters, digits, etc.) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Created with RegexBuddy
Matches all:
http://mywebsite.com/name6-name7-567 name6-name7-567
http://www.mywebsite.com/name1-name2-name3-name4-342 name1-name2-name3-name4-342
http://www.mywebsite.com/name5 name5
http://www.mywebsite.com/name6-name7-567 name6-name7-567
http://www.mywebsite.com/name8-name9-name10 name8-name9-name10
http://www.mywebsite.com/name11-name12-name13-name14-name15 name11-name12-name13-name14-name15
http://www.mywebsite.com/name16 name16
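The capturing-group version of the pattern can be checked against the URLs from the question with a short Python sketch. Python's `re` module does not support \K, which is why the variant with group 1 is used here. The names `NAME_RE` and `extract_name` are hypothetical.

```python
import re

NAME_RE = re.compile(r"http://.*mywebsite\.com/(\w+(?:-(?!\d{4,})\w+)*)")

URLS = [
    "http://www.mywebsite.com/name1-name2-name3-name4-342/46547657/ca",
    "http://www.mywebsite.com/name5-487659826/da",
    "http://www.mywebsite.com/name6-name7-567/5677/ca",
    "http://www.mywebsite.com/name8-name9-name10-48765766/da",
    "http://www.mywebsite.com/name11-name12-name13-name14-name15/11117657/ca",
    "http://www.mywebsite.com/name16-4866626/da",
]

def extract_name(url):
    """Return the name part captured by group 1, or None if nothing matches."""
    m = NAME_RE.search(url)
    return m.group(1) if m else None

for url in URLS:
    print(extract_name(url))
```

The negative lookahead (?!\d{4,}) is what stops the match before a hyphen followed by four or more digits, while shorter numeric parts such as 342 or 567 are kept.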
I can match an 'a' followed by at least 2 other characters before another 'a' with the following regular expression.
a.{2,}?a
Interestingly, including the question mark makes the regex match the instance with the fewest number of middle characters possible, so for instance, given the following string,
abbabbbba
the regex will match the leftmost abba instead of the whole string. Why does including the question mark cause the regex to match the instance with the fewest number of middle characters?
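The question mark after a quantifier makes the quantifier lazy: it tries the fewest repetitions first and takes more only when the rest of the pattern fails to match. It is a basic feature of regex, and it is worth reading more about it.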
a link: regular-expressions.info
(?:or|and) the one in hwnd comment.
? implies a lazy match
here is the details of your regex
/a.{2,}?a/
a matches the character a literally (case sensitive)
. matches any character (except newline)
{2,} Quantifier: Between 2 and unlimited times
? as few times as possible, expanding as needed [lazy]
a matches the character a literally (case sensitive)
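The difference between the lazy and the greedy quantifier is easy to observe side by side. A minimal Python sketch on the string from the question (the helper name `first_match` is made up):

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

text = "abbabbbba"
print(first_match(r"a.{2,}?a", text))  # lazy: stops at the first 'a' that works
print(first_match(r"a.{2,}a", text))   # greedy: runs to the last 'a' that works
```

In both cases the engine starts at the leftmost possible position; the ? only changes how far the middle part extends from there.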