This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
Beginner here and I'm trying to understand this. Can someone please break down the part in between the single quotes and describe what it does?
grep -oP '(?<=\S\/1\.\d.\s)[345]\d+'
Many thanks in advance!
Positive Lookbehind (?<=\S/1.\d.\s) Assert that the Regex below matches
\S matches any non-whitespace character (equal to [^\r\n\t\f\v ])
\/ matches the character / literally (case sensitive)
1 matches the character 1 literally (case sensitive)
\. matches the character . literally (case sensitive)
\d matches a digit (equal to [0-9])
. matches any character (except for line terminators)
\s matches any whitespace character (equal to [\r\n\t\f\v ])
Match a single character present in the list below [345]
345 matches a single character in the list 345 (case sensitive)
\d+ matches a digit (equal to [0-9])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
Output simply copied from https://regex101.com/r/HfJSNm/1 : very handy to test/share/have automatic explications on regexes.
Related
This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
What is the output of this regular expression?
/\s+(\d+)\s+/
In particular what is the meaning of /\s
In your regex \s+ matches any number of whitespaces sequentially and /d+ matches any number of digits sequentially .
\s and \d matches a single whitespace and single digit respectively the + makes it match any number of sequential whitespaces and digits respectively.
You can find a full explanation at regex101.com.
/\s+(\d+)\s+/
\s+ matches any whitespace character (equal to [\r\n\t\f\v ])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
1st Capturing Group (\d+)
\d+ matches a digit (equal to [0-9])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
\s+ matches any whitespace character (equal to [\r\n\t\f\v ])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
https://regex101.com/
might be useful :D
\s+(\d+)\s+ / ↵ matches the character
↵ literally (case sensitive)
\s+
matches any whitespace character (equal to [\r\n\t\f\v ])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy) 1st Capturing Group
(\d+)
\d+ matches a digit (equal to [0-9])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
\s+ matches any whitespace
character (equal to [\r\n\t\f\v ])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed
Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 5 years ago.
Improve this question
I am new to sed command and regular expressions.
Could anyone help me to explain the below command
echo "FIXTEST_XYZ 23040/tcp # 127.0.0.1:23040" | sed -n 's/^FIXTEST_\([A-Za-z0-9]\+\)\s\+\([0-9]\+\)\/tcp\s\+#\s*\([A-Za-z0-9.]\+\):\([0-9]\+\)\s*/\1 \2 \3 \4/p'
Thank you for your help !!
Check explainshell for command explaination.
Check regex101 for regex explanation:
^FIXTEST_([A-Za-z0-9]+)\s+([0-9]+)\/tcp\s+#\s*([A-Za-z0-9.]+):([0-9]+)\s*
FIXTEST_ matches the characters FIXTEST_ literally (case sensitive)
1st Capturing Group ([A-Za-z0-9]+)
Match a single character present in the list below [A-Za-z0-9]+
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
A-Z a single character in the range between A (index 65) and Z (index 90) (case sensitive)
a-z a single character in the range between a (index 97) and z (index 122) (case sensitive)
0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
\s+ matches any whitespace character (equal to [\r\n\t\f\v ])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Capturing Group ([0-9]+)
Match a single character present in the list below [0-9]+
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
\/ matches the character / literally (case sensitive)
tcp matches the characters tcp literally (case sensitive)
\s+ matches any whitespace character (equal to [\r\n\t\f\v ])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
# matches the character # literally (case sensitive)
\s* matches any whitespace character (equal to [\r\n\t\f\v ])
3rd Capturing Group ([A-Za-z0-9.]+)
: matches the character : literally (case sensitive)
4th Capturing Group ([0-9]+)
\s* matches any whitespace character (equal to [\r\n\t\f\v ])
Global pattern flags
g modifier: global. All matches (don't return after first match)
1st Capturing Group matches: XYZ
2nd Capturing Group matches: 23040
3rd Capturing Group matches: 127.0.0.1
4th Capturing Group matches: 23040
Bash standard output: XYZ 23040 127.0.0.1 23040
Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 5 years ago.
Improve this question
How can I remove all spaces into braces with Notepad++ and RegEx?
For example:
I have string [Word1 Word2 Word3]
I need: [Word1Word2Word3]
Thanks
\s++(?=[^[]*])
\s++
matches any whitespace character (equal to [\r\n\t\f\v ])
++ Quantifier — Matches between one and unlimited times, as many times as possible, without giving back (possessive)
Positive Lookahead (?=[^[]*])
Assert that the Regex below matches
Match a single character not present in the list below [^[]*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
[ matches the character [ literally (case sensitive)
] matches the character ] literally (case sensitive)
(?:\[|\G(?!^))[^]\s]*\K\s+
Non-capturing group (?:\[|\G(?!^))
1st Alternative \[
\[ matches the character [ literally (case sensitive)
2nd Alternative \G(?!^)
\G asserts position at the end of the previous match or the start of the string for the first match
Negative Lookahead (?!^)
Assert that the Regex below does not match
^ asserts position at start of the string
Match a single character not present in the list below [^]\s]*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
] matches the character ] literally (case sensitive)
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\K resets the starting point of the reported match. Any previously consumed characters are no longer included in the final match
\s+
matches any whitespace character (equal to [\r\n\t\f\v ])
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 7 years ago.
Improve this question
I'm writing a regex to pull a URL out of an auto-generated email from my monitoring system. For example:
https://mon.contoso.com/mon/call.py?fn=edit&num=1389896156
I need a regex to match:
https://mon.contoso.com/mon/call.py?fn=edit&num=XXXXXXXXX
whereby the "x"'s always change. I run into an issue with the "?". The point of this is to append the URL to a field in JIRA.
Pattern p = new Pattern("https://mon.contoso.com/mon/call.py?fn=edit&num=(\d+)")
Matcher m = p.matcher(inputEmail);
return m.matches() ? m.group(1) : "";
This returns num if it is numeric, otherwise you might want to use \w instead of \d. If you want the whole URL, remove the group() parameter.
You don't indicate what language you're working in.
In Python and JavaScript, this regex will identify a variety of URLs:
/\[[^\]\n]+\](?:\([^\)\n]+\)|\[[^\]\n]+\])|(?:\/\w+\/|.:\\|\w*:\/\/|\.+\/[./\w\d]+|(?:\w+\.\w+){2,})[./\w\d:/?#\[\]#!$&'()*+,;=\-~%]*/gi
You can refer to this regex101 test for examples of the regex in use.
Explanation:
/\[[^\]\n]+\](?:\([^\)\n]+\)|\[[^\]\n]+\])|(?:\/\w+\/|.:\\|\w*:\/\/|\.+\/[./\w\d]+|(?:\w+\.\w+){2,})[./\w\d:/?#\[\]#!$&'()*+,;=\-~%]*/gi
1st Alternative: \[[^\]\n]+\](?:\([^\)\n]+\)|\[[^\]\n]+\])
\[ matches the character [ literally
[^\]\n]+ match a single character not present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\] matches the character ] literally
\n matches a line-feed (newline) character (ASCII 10)
\] matches the character ] literally
(?:\([^\)\n]+\)|\[[^\]\n]+\]) Non-capturing group
1st Alternative: \([^\)\n]+\)
\( matches the character ( literally
[^\)\n]+ match a single character not present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\) matches the character ) literally
\n matches a line-feed (newline) character (ASCII 10)
\) matches the character ) literally
2nd Alternative: \[[^\]\n]+\]
\[ matches the character [ literally
[^\]\n]+ match a single character not present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\] matches the character ] literally
\n matches a line-feed (newline) character (ASCII 10)
\] matches the character ] literally
2nd Alternative: (?:\/\w+\/|.:\\|\w*:\/\/|\.+\/[./\w\d]+|(?:\w+\.\w+){2,})[./\w\d:/?#\[\]#!$&'()*+,;=\-~%]*
(?:\/\w+\/|.:\\|\w*:\/\/|\.+\/[./\w\d]+|(?:\w+\.\w+){2,}) Non-capturing group
1st Alternative: \/\w+\/
\/ matches the character / literally
\w+ match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\/ matches the character / literally
2nd Alternative: .:\\
. matches any character (except newline)
: matches the character : literally
\\ matches the character \ literally
3rd Alternative: \w*:\/\/
\w* match any word character [a-zA-Z0-9_]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
: matches the character : literally
\/ matches the character / literally
\/ matches the character / literally
4th Alternative: \.+\/[./\w\d]+
\.+ matches the character . literally
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\/ matches the character / literally
[./\w\d]+ match a single character present in the list below
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
./ a single character in the list ./ literally
\w match any word character [a-zA-Z0-9_]
\d match a digit [0-9]
5th Alternative: (?:\w+\.\w+){2,}
(?:\w+\.\w+){2,} Non-capturing group
Quantifier: {2,} Between 2 and unlimited times, as many times as possible, giving back as needed [greedy]
\w+ match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\. matches the character . literally
\w+ match any word character [a-zA-Z0-9_]
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
[./\w\d:/?#\[\]#!$&'()*+,;=\-~%]* match a single character present in the list below
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
./ a single character in the list ./ literally
\w match any word character [a-zA-Z0-9_]
\d match a digit [0-9]
:/?# a single character in the list :/?# literally
\[ matches the character [ literally
\] matches the character ] literally
#!$&'()*+,;= a single character in the list #!$&'()*+,;= literally (case insensitive)
\- matches the character - literally
~% a single character in the list ~% literally
g modifier: global. All matches (don't return on first match)
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
i have text like this:
Date: 01.02.2015 //<-stable format
something
something more
some random more
Date: 02.02.2015
something random
i dont know
so i have many such blocks. Starts with Date... ends with next Date... start.
The text in the lines in the block could be anything, but not Date... format
I need an array at the end, with such blocks:
array[0] = "Date: 01.02.2015
something
something more
some random more"
array[1] = "Date: 02.02.2015
something random
i dont know"
for now i add some unique splitter before Date... than split by the splitter.
Question: is it possible to get such blocks only by regex?
(i use VBA to parse the text, RegExp object)
Instead of split just match using
\bDate:\s\d{1,2}\.\d{1,2}\.\d{4}[\s\S]*?(?=\nDate:|$)
See demo.
https://regex101.com/r/uF4oY4/77
Syntax explanation (from the linked site):
\b assert position at a word boundary: (^\w|\w$|\W\w|\w\W)
Date: matches the characters Date: literally (case sensitive)
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\d{1,2} matches a digit (equal to [0-9]) between 1 and 2 times, as many times as possible, giving back as needed (greedy)
. matches the character . literally (case sensitive)
\d{1,2} matches a digit (equal to [0-9]) between 1 and 2 times, as many times as possible, giving back as needed (greedy)
. matches the character . literally (case sensitive)
\d{4} matches a digit (equal to [0-9]) exactly 4 times
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\S matches any non-whitespace character (equal to [^\r\n\t\f\v ])
*? Quantifier — Matches between zero and unlimited times, as few times as possible, expanding as needed (lazy) , what specified in previous brackets
?= Positive Lookahead - Assert that the following Regex matches
\nDate Option 1
\n matches a line-feed (newline) character (ASCII 10)
Date matches the characters Date: literally (case sensitive)
$: Option 2 - $ asserts position at the end of the string, or before the line terminator right at the end of the string (if any)