I have an input string ("My Email id is abc # gmail.com"). I need to find the email id in this string with a regex and replace it with (xxxxxxx).
I am using the pattern below, but it doesn't work if the email id contains whitespace.
\\w+([-+.']\\w+)*#\\w+([-.]\\w+)*\\.\\w+([-.]\\w+)*
Thanks.
If all you want to do is allow whitespace alongside the word characters while keeping
the structure of the original regex, it starts to get ugly:
// (?=\\s*\\w)[\\w\\s]+(?:[-+.'](?=\\s*\\w)[\\w\\s]+)*#(?=\\s*\\w)[\\w\\s]+(?:[-.](?=\\s*\\w)[\\w\\s]+)*\\.(?=\\s*\\w)[\\w\\s]+(?:[-.](?=\\s*\\w)[\\w\\s]+)*
(?= \s* \w )
[\w\s]+
(?:
[-+.']
(?= \s* \w )
[\w\s]+
)*
#
(?= \s* \w )
[\w\s]+
(?:
[-.]
(?= \s* \w )
[\w\s]+
)*
\.
(?= \s* \w )
[\w\s]+
(?:
[-.]
(?= \s* \w )
[\w\s]+
)*
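To see what the whitespace-tolerant pattern actually matches, here is a minimal Python sketch. Python and the names ANSWER_PATTERN, NARROW_PATTERN and mask are assumptions for illustration; the question does not name a language. Because [\w\s]+ also consumes spaces, the pattern swallows the words in front of the address, so the sketch also includes a narrower variant (not from the answer) that only permits whitespace around the # separator.

```python
import re

TEXT = "My Email id is abc # gmail.com"

# The answer's pattern, written with single backslashes for a Python raw string.
ANSWER_PATTERN = re.compile(
    r"(?=\s*\w)[\w\s]+(?:[-+.'](?=\s*\w)[\w\s]+)*"
    r"#(?=\s*\w)[\w\s]+(?:[-.](?=\s*\w)[\w\s]+)*"
    r"\.(?=\s*\w)[\w\s]+(?:[-.](?=\s*\w)[\w\s]+)*"
)

# Hypothetical narrower variant: the question's pattern plus optional
# whitespace on either side of the separator only.
NARROW_PATTERN = re.compile(
    r"\w+(?:[-+.']\w+)*\s*#\s*\w+(?:[-.]\w+)*\.\w+(?:[-.]\w+)*"
)


def mask(pattern, text, replacement="xxxxxxx"):
    """Replace every match of `pattern` in `text` with `replacement`."""
    return pattern.sub(replacement, text)


print(mask(ANSWER_PATTERN, TEXT))  # the whole sentence is matched and replaced
print(mask(NARROW_PATTERN, TEXT))  # only "abc # gmail.com" is replaced
```

Which behaviour is wanted depends on how much whitespace the input is expected to contain inside the address itself.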
I'm struggling with this one. I want to capture the content of parentheses unless it consists only of digits and an optional %. This means I want to capture (essiccato, ricco di flavonoidi) or (ricco di 23% pollo, in parte essiccato, in parte idrolizzato), but not (23 %) or (23).
Here is an example: https://regex101.com/r/yW4aZ3/896
This is what I have so far: \([^()][^()]*\)
You may use
r'\((?!\s*\d+(?:[.,]\d+)?\s*%?\s*\))[^()]+\)'
See the regex demo and the regex graph:
Details
\( - a ( char
(?!\s*\d+(?:[.,]\d+)?\s*%?\s*\)) - a negative lookahead that matches a location not immediately followed with
\s* - 0+ whitespaces
\d+ - 1+ digits
(?:[.,]\d+)? - an optional occurrence of . or , and 1+ digits
\s* - 0+ whitespaces
%? - an optional % char
\s* - 0+ whitespaces
\) - a ) char
[^()]+ - 1+ chars other than ( and )
\) - a ) char.
You might use a negative lookahead to assert that what follows the opening parenthesis is not just digits, followed by an optional percent sign, up to the closing parenthesis:
\((?!\s*\d+\s*%?\s*\))[^)]+\)
Explanation
\( Match (
(?! Negative lookahead, assert what is on the right is not
\s*\d+\s*%?\s*\) match 1+ digits followed by an optional % till )
) Close lookahead
[^)]+\) Match 1+ times any char except ), then match )
Regex demo
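As a quick check of the negative-lookahead pattern above, here is a small Python sketch. The sample text and the name NOT_ONLY_DIGITS are made up for illustration; only the regex comes from the answer.

```python
import re

# Pattern from the answer: reject "(digits, optional %)" and match the rest.
NOT_ONLY_DIGITS = re.compile(r"\((?!\s*\d+\s*%?\s*\))[^)]+\)")

# Hypothetical ingredient list mixing both kinds of parentheses.
sample = (
    "carne (23 %) frutta (essiccato, ricco di flavonoidi) "
    "pollo (23) (ricco di 23% pollo, in parte essiccato)"
)

matches = NOT_ONLY_DIGITS.findall(sample)
print(matches)
```

With this sample, the two parentheses that hold only a number (with or without %) are skipped and the two descriptive ones are returned.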
Assuming that (...) are all balanced and there is no escaping of parentheses inside, you may use this regex with a character class and 2 negated character classes:
\([\d%\s]*[^%\d\s()][^()]*\)
Updated RegEx Demo
RegEx Details
\(: Match opening (
[\d%\s]*: Match 0 or more characters that are digits, % or whitespace
[^%\d\s()]: Match a character that is not (, ), %, a digit or whitespace
[^()]*: Match 0 or more of any characters that are not ( and not a )
\): Match closing )
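A minimal Python sketch of the character-class approach follows. The classes here include \s so that a value such as (23 %), with a space before the percent sign, is rejected; the name HAS_OTHER_CHAR and the sample strings are assumptions for illustration.

```python
import re

# Leading run of digits, % and whitespace, then at least one character
# that is none of those (and not a parenthesis), then the rest.
HAS_OTHER_CHAR = re.compile(r"\([\d%\s]*[^%\d\s()][^()]*\)")

candidates = [
    "(essiccato, ricco di flavonoidi)",
    "(ricco di 23% pollo, in parte essiccato)",
    "(23% pollo)",
    "(23 %)",
    "(23)",
]

accepted = [c for c in candidates if HAS_OTHER_CHAR.fullmatch(c)]
print(accepted)
```

Only the first three candidates are accepted, because each contains at least one character that is not a digit, a percent sign or whitespace.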
I am trying to get this regex dialed-in to validate whether a URL begins with https and if a port is supplied the only valid values are 443 or 5443. This regex is pretty close but not quite there.
^(https:\/\/)([a-zA-Z\d\.]{2,})\.([a-zA-Z]{2,})(:5{0,1}443)?(.)*
How do I solve this problem?
This is a mainstream URL validator that tests whether the URL sits between whitespace boundaries.
It only allows the https scheme and the port numbers 5443 or 443.
(?<!\S)https://(?:\S+(?::\S*)?@)?(?:(?:(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))|localhost)(?::5?443)?(?:/[^\s]*)?(?!\S)
Readable version
(?<! \S )
https ://
(?:
\S+
(?: : \S* )?
@
)?
(?:
(?:
(?:
[1-9] \d?
| 1 \d\d
| 2 [01] \d
| 22 [0-3]
)
(?:
\.
(?: 1? \d{1,2} | 2 [0-4] \d | 25 [0-5] )
){2}
(?:
\.
(?:
[1-9] \d?
| 1 \d\d
| 2 [0-4] \d
| 25 [0-4]
)
)
| (?:
(?: [a-z\u00a1-\uffff0-9]+ -? )*
[a-z\u00a1-\uffff0-9]+
)
(?:
\.
(?: [a-z\u00a1-\uffff0-9]+ -? )*
[a-z\u00a1-\uffff0-9]+
)*
(?:
\.
(?: [a-z\u00a1-\uffff]{2,} )
)
)
| localhost
)
(?: : 5? 443 )?
(?: / [^\s]* )?
(?! \S )
You should require a / after the optional port group so that no other digits are allowed between the host and the path. Try using this regex:
^(https:\/\/)([a-zA-Z\d\.]{2,})\.([a-zA-Z]{2,})(:5?443)?\/\S*
Notice, I've also changed (:5{0,1}443)? to (:5?443)? and changed the last .* to \S* so that the match cannot include spaces, which are not valid in a URL. Besides that, you can also drop most of the groups in your regex unless you need them.
Regex Demo
Edit:
As you said in the comments, you want to match the following URLs too:
https://example.com
https:example.com
https:example.com:443
To do that, make the \/\S* part optional by wrapping it in a group and placing a ? after it. The modified regex becomes:
^https:\/\/([a-zA-Z\d\.]{2,})\.([a-zA-Z]{2,})(:5?443)?(\/\S*)?
Demo with filepath part being optional
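The regex with the optional path can be exercised with a short Python sketch. The helper name is_allowed and the sample URLs are assumptions for illustration. Note that the pattern has no end anchor, so the sketch uses fullmatch; with a plain prefix match, a URL with a different port would still match up to the host name.

```python
import re

# Pattern from the answer, unchanged.
HTTPS_URL = re.compile(
    r"^https:\/\/([a-zA-Z\d\.]{2,})\.([a-zA-Z]{2,})(:5?443)?(\/\S*)?"
)


def is_allowed(url):
    """True if the whole string is an https URL with no port, 443 or 5443."""
    return HTTPS_URL.fullmatch(url) is not None


for url in [
    "https://example.com",
    "https://example.com:443",
    "https://example.com:5443/a/b?c=d",
    "https://example.com:8080",
    "http://example.com:443",
]:
    print(url, is_allowed(url))
```

The first three URLs are accepted; the last two are rejected because of the port and the scheme respectively.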
Your regex is close. You can add a right-hand boundary, and the colon should sit inside the optional port group so that a URL without a port still matches:
^(https:\/\/)([a-zA-Z\d\.]{2,})\.([a-zA-Z]{2,})(:(5443|443))?$
I added a $ end anchor to bound your original expression on the right. If you need to allow more port numbers, simply add them to this capturing group:
(5443|443)
You can also remove unnecessary boundaries, if you wish.
I have this iframe code, and I want a match only if the string starts with the iframe tag and the "soundcloud" text appears later in the code:
<iframe src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/297769462&color=%23ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false&show_teaser=true" width="100%" height="166" frameborder="no" scrolling="no"></iframe>
My current regex is (<iframe.*?><\/iframe>), which matches the iframe and anything in between.
What I want is to match the opening <iframe and then skip everything in between until it finds soundcloud. If both conditions are fulfilled, then it's a match.
Any help would be great, thank you.
Try this
(?i)<iframe(?=((?:[^>"']|"[^"]*"|'[^']*')*?\s(src\s*=\s*(['"])(?:(?!\3)[\S\s])*?soundcloud(?:(?!\3)[\S\s])*\3)(?:"[\S\s]*?"|'[\S\s]*?'|[^>]*?)+>))\1\s*</iframe\s*>
https://regex101.com/r/KkJH6x/1
Formatted
(?i) # Case insensitive modifier
< iframe # The iframe tag
(?= # Assertion (a pseudo atomic group)
( # (1 start)
(?: [^>"'] | " [^"]* " | ' [^']* ' )*?
\s
( # (2 start), src attribute with 'soundcloud' in value
src \s* = \s*
( ['"] ) # (3), Quote
(?:
(?! \3 )
[\S\s]
)*?
soundcloud # 'Soundcloud'
(?:
(?! \3 )
[\S\s]
)*
\3 # Close quote
) # (2 end)
# The remainder of the tag parts
(?: " [\S\s]*? " | ' [\S\s]*? ' | [^>]*? )+
>
) # (1 end)
)
\1
\s*
</iframe \s* >
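For comparison, here is a deliberately simplified Python sketch of the same idea: match an iframe only when its quoted src value contains "soundcloud". It is not the pattern from the answer. It uses [^>]* for the other attributes, so unlike the full pattern it does not cope with a > inside a quoted attribute value. The name SOUNDCLOUD_IFRAME and the second sample tag are made up for illustration.

```python
import re

SOUNDCLOUD_IFRAME = re.compile(
    r"""<iframe\b[^>]*?\bsrc\s*=\s*(["'])(?:(?!\1).)*?soundcloud(?:(?!\1).)*\1[^>]*>\s*</iframe\s*>""",
    re.IGNORECASE | re.DOTALL,
)

# Shortened version of the iframe from the question.
soundcloud_tag = (
    '<iframe src="https://w.soundcloud.com/player/?url=https%3A//'
    'api.soundcloud.com/tracks/297769462" width="100%" height="166" '
    'frameborder="no" scrolling="no"></iframe>'
)

# Hypothetical iframe where "soundcloud" appears outside the src value.
other_tag = '<iframe src="https://example.com/embed/1" title="soundcloud"></iframe>'

print(bool(SOUNDCLOUD_IFRAME.search(soundcloud_tag)))
print(bool(SOUNDCLOUD_IFRAME.search(other_tag)))
```

The backreference \1 keeps the search for "soundcloud" inside the quotes that opened the src value, which is why the second tag is not matched.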
I am trying to automatically grade a student's answer to a question where the correct answer is:
read and execute for owner and read for everyone
The order of their answer doesn't matter so
read for everyone and read and execute for owner
is an acceptable answer. Then all the fluff (and, for) doesn't matter so I am really just looking for either of these
read execute owner read everyone
read everyone read execute owner
I can get a regex to accept either answer
(?=.*read.*execute.*owner)(?=.*read.*everyone)
But obviously that also accepts answers that are clearly wrong, like "read execute owner read execute everyone". So I tried using a negative lookahead for "execute" with everyone, but then it also hits the "execute" for owner and reports no regex match.
Is there a way to accomplish what I am trying to do? Thanks.
Just make the and/for optional.
# (?=.*(\bread(?:\s+and)?\s+execute(?:\s+for)?\s+owner\b))(?=.*(\bread(?:\s+for)?\s+everyone\b))
(?=
.*
( # (1 start)
\b read
(?: \s+ and )?
\s+ execute
(?: \s+ for )?
\s+
owner \b
) # (1 end)
)
(?=
.*
( # (2 start)
\b read
(?: \s+ for )?
\s+ everyone \b
) # (2 end)
)
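A short Python sketch of how the two lookaheads could be used for grading. The function name is_correct and the use of re.search are assumptions for illustration; the regex is the one-line form shown above.

```python
import re

# Both lookaheads must succeed at the start of the answer, in any order.
GRADER = re.compile(
    r"(?=.*(\bread(?:\s+and)?\s+execute(?:\s+for)?\s+owner\b))"
    r"(?=.*(\bread(?:\s+for)?\s+everyone\b))"
)


def is_correct(answer):
    """True if the answer grants read+execute to owner and read to everyone."""
    return GRADER.search(answer) is not None


print(is_correct("read and execute for owner and read for everyone"))
print(is_correct("read for everyone and read and execute for owner"))
print(is_correct("read execute owner read execute everyone"))
```

The first two answers are accepted and the third is rejected, because "read" must be followed directly (apart from the optional "for") by "everyone".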
Edit: You could also allow arbitrary filler words between the
keywords, as long as the filler itself contains none of the keywords.
Like this -
# (?=.*(\bread(?:\s+(?:(?!\b(?:read|execute|owner|everyone)\b).)+?)?\s+execute(?:\s+(?:(?!\b(?:read|execute|owner|everyone)\b).)+?)?\s+owner\b))(?=.*(\bread(?:\s+(?:(?!\b(?:read|everyone|execute|owner)\b).)+?)?\s+everyone\b))
(?=
.*
( # (1 start)
\b read
(?:
# Optional words Between Keywords -
# not any of this or the other ones keywords
\s+
(?:
(?!
\b
(?:
read # this
| execute # this
| owner # this
| everyone # other
)
\b
)
.
)+?
)?
\s+ execute
(?:
# Optional words Between Keywords
\s+
(?:
(?!
\b
(?:
read
| execute
| owner
| everyone
)
\b
)
.
)+?
)?
\s+
owner \b
) # (1 end)
)
(?=
.*
( # (2 start)
\b read
(?:
# Optional words Between Keywords -
# not any of this or the other ones keywords
\s+
(?:
(?!
\b
(?:
read # this
| everyone # this
| execute # other
| owner # other
)
\b
)
.
)+?
)?
\s+ everyone \b
) # (2 end)
)
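The extended pattern is easier to read when the repeated "filler" part is built once and reused. The Python sketch below assembles an equivalent regex from pieces; the names KEYWORDS, GAP, GRADER and is_correct are assumptions for illustration.

```python
import re

KEYWORDS = r"\b(?:read|execute|owner|everyone)\b"

# Optional filler after a keyword: whitespace, then one or more characters,
# none of which may sit at the start of a whole-word keyword.
GAP = r"(?:\s+(?:(?!" + KEYWORDS + r").)+?)?"

GRADER = re.compile(
    r"(?=.*(\bread" + GAP + r"\s+execute" + GAP + r"\s+owner\b))"
    + r"(?=.*(\bread" + GAP + r"\s+everyone\b))"
)


def is_correct(answer):
    """True if both permission phrases appear, with any keyword-free filler."""
    return GRADER.search(answer) is not None


print(is_correct("read and then execute for the owner, read for everyone"))
print(is_correct("read execute owner read execute everyone"))
```

The first answer passes even though it contains extra words; the second still fails because "execute" is a keyword and therefore cannot be part of the filler before "everyone".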
I want to extract Parameter1 using a regex in Sublime Text. The other parameter will not be used.
Initial tags:
<description><![CDATA[<b>Parameter1</b></br></br>
This not to be copied and can be long]]></description>
This expression in Sublime Text's regex search...
<description><!\[CDATA\[<b>(\w+)</b></br></br>(\w*)\]\]</description>
cannot find what I need (when I reach it stops finding)
Your regex doesn't match the test string.
The text contains whitespace between the words, which \w does not match.
It also won't match non-word characters such as punctuation.
Below are two regexes.
1. This is just to match your test string.
# <description>\s*<!\[CDATA\[\s*<b>([\s\w]+)</b>\s*</br>\s*</br>([\s\w]*)\]\]>\s*</description>
<description>
\s*
<!\[CDATA\[
\s*
<b>
( # (1)
[\s\w]+
)
</b> \s* </br> \s* </br>
( # (2)
[\s\w]*
)
\]\]>
\s*
</description>
2. This is how it should be done if your engine supports lookahead assertions.
# (?s)<description>\s*<!\[CDATA\[\s*<b>((?:(?!\]\]|\s*</b>).)+?)\s*</b>\s*</br>\s*</br>\s*((?:(?!\s*\]\]).)*)\s*\]\]>\s*</description>
(?s)
<description>
\s*
<!\[CDATA\[
\s*
<b>
( # (1)
(?:
(?! \]\] | \s* </b> )
.
)+?
)
\s* </b> \s* </br> \s* </br> \s*
( # (2)
(?:
(?! \s* \]\] )
.
)*
)
\s*
\]\]>
\s*
</description>
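The lookahead-based pattern can be tried outside Sublime Text with a few lines of Python. This is only a sketch: Python's re module stands in for Sublime's engine, the names CDATA_PATTERN and xml are assumptions, and the pattern includes the > that closes the CDATA section (]]>) so that it matches the sample from the question.

```python
import re

CDATA_PATTERN = re.compile(
    r"(?s)<description>\s*<!\[CDATA\[\s*<b>"
    r"((?:(?!\]\]|\s*</b>).)+?)"          # group 1: text inside <b>...</b>
    r"\s*</b>\s*</br>\s*</br>\s*"
    r"((?:(?!\s*\]\]).)*)"                # group 2: remaining CDATA text
    r"\s*\]\]>\s*</description>"
)

# Sample from the question, including the line break inside the CDATA.
xml = (
    "<description><![CDATA[<b>Parameter1</b></br></br>\n"
    "This not to be copied and can be long]]></description>"
)

match = CDATA_PATTERN.search(xml)
print(match.group(1))  # the wanted parameter
print(match.group(2))  # the text that is not needed
```

In a find-and-replace, group 1 would be referenced as the first capture; group 2 is captured only so the rest of the CDATA can be inspected or discarded.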