I want to modify this regex to include apostrophe - regex

This regex is used for validating email addresses, but it doesn't handle the apostrophe ('), which is a valid character in the first part of an email address.
I have tried modifying it myself and using some examples I found, but they don't work.
^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$
How do I modify it slightly to support the ' character (apostrophe)?

Per the documentation for an email address, the apostrophe can appear anywhere before the @ symbol, which in your current regex is:
^([\w-\.]+)@
You should be able to add the apostrophe into the brackets of valid characters:
^([\w-\.']+)@
This would make the entire regex:
^([\w-\.']+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$
EDIT (regex contained in single-quotes)
If you're using this regex inside a string with single-quotes, such as in PHP with $regex = '^([\w ..., you will need to escape the single-quote in the regex with \':
^([\w-\.\']+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$

You need to update the first part as follows:
^([\'\w-\.]+)
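
To check the suggested change, here is a minimal sketch in Python, which is not necessarily the asker's regex engine. It assumes @ is the separator between the local part and the domain. It writes the character class as [\w.'-] because Python's re module rejects the range-like \w-\. form. The names OLD_PATTERN, NEW_PATTERN, accepts and the sample address are illustrative only.

```python
import re

# Domain part of the regex from the question, unchanged.
DOMAIN = (
    r"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))"
    r"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"
)

# Local part without and with the apostrophe in the character class.
# The hyphen is placed last so Python treats it as a literal character.
OLD_PATTERN = re.compile(r"^([\w.-]+)@" + DOMAIN)
NEW_PATTERN = re.compile(r"^([\w.'-]+)@" + DOMAIN)


def accepts(pattern, address):
    """Return True if the whole address matches the pattern."""
    return pattern.match(address) is not None


sample = "o'brien@example.com"
old_result = accepts(OLD_PATTERN, sample)  # apostrophe not allowed
new_result = accepts(NEW_PATTERN, sample)  # apostrophe allowed
print(old_result, new_result)
```

Only the local-part character class differs between the two patterns, so addresses without an apostrophe should behave the same under both.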

Related

Use regex to strip out emails

I know that this is a notoriously difficult topic. The best regex that I've found after trawling many different answers is the one at http://emailregex.com/
It works great at validating an email address, but I'm struggling to alter this regex to find all email addresses in a string.
I'm using the PHP version of the regex.
How would I go about using this regex to find all of the email addresses in a string?
I know about the preg functions, my PHP code isn't as much the problem as adapting that regex.
$redacted = preg_replace_callback(
    "/$emailRegex/i",
    function ($matches) {
        return '[' . $this->getHashedValue($matches[0]) . ']';
    },
    $input
);
If you already have a working regular expression, you can use PHP's preg_replace to replace all (non-overlapping) matches by a certain string, in our case "" (to remove them).
preg_replace($your_regex, "", $your_string)
This should strip all matches from your string.
Also, as @MonkeyZeus commented, if your regex contains the start anchor (^) or the end anchor ($), make sure to remove those before using preg_replace. Otherwise, the only match you can get will be the entire string, if it matches.
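
The effect of the anchors can be sketched in Python rather than PHP. The pattern below is a deliberately simplified stand-in, not the emailregex.com expression, and the sample text and variable names are made up for illustration.

```python
import re

text = "Contact alice@example.com or bob@example.org for details."

# Simplified email pattern (an assumption for this sketch only).
CORE = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"

anchored = re.compile("^" + CORE + "$")  # validates a whole string
unanchored = re.compile(CORE)            # finds addresses inside text

anchored_hits = anchored.findall(text)      # nothing: text is not one address
unanchored_hits = unanchored.findall(text)  # every address in the text

# Replace each address, analogous to preg_replace in the answer above.
redacted = unanchored.sub("[redacted]", text)
print(anchored_hits, unanchored_hits, redacted)
```

The anchored pattern is still the right tool for validating a single address. Only the search-and-replace use case needs the anchors removed.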

Escape string in PowerShell Regex to a regular string

I have a string that contains some regex special characters, such as $(=?. This string is a password that I need to pass to an application that I'm building.
The code that I'm trying to use is:
$x = 'GIWs#K?hks2v&HKXb$S9=HK*AZN=i!(S?7'
[Regex]::Escape($x)
I've already tried [Regex]::Escape(), and it doesn't meet my requirements: I'm trying to pass the string as a password, and it escapes the regex characters with \.
Should I perhaps delete the \ characters from the result after calling [Regex]::Escape()?
After running [Regex]::Escape(), this is the result I get when printing the output:
GIWs#K\?hks2v&HKXb=HK\*AZN=i!\(S\?7
I'm trying to get the string without the \ characters, but still using the Escape function:
GIWs#K?hks2v&HKXb=HK*AZN=i!(S?7
This is not an answer because I don't know what the problem actually is. However, there are some inherent problems with your current attempt to handle the password string. If you use double quotes ("") around a string, PowerShell will interpolate the string inside the quotes. So any alphanumeric characters following an unescaped $, will be considered a variable name during interpolation. If that variable has no value, $variable will be replaced with a null value. You can see this behavior below:
"rt4837s$GT=\"
rt4837s=\
You should use single quotes ('') when quoting string literals (characters that will be left as is). PowerShell will not attempt interpolation when unescaped single quote pairs are encountered unless there is quote nesting. See below:
'rt4837s$GT=\'
rt4837s$GT=\
If you need a regex escaped string, the same rules apply from above and you should use single quotes.
[regex]::escape('dfaseryh$S9=r??*')
dfaseryh\$S9=r\?\?\*
If for any reason, you need to access that string later without the escape characters, then you can use the regex method Unescape().
[regex]::unescape('dfaseryh\$S9=r\?\?\*')
dfaseryh$S9=r??*
Practical Example of Using Regex Replace:
$OriginalString = 'Username = Anonymous; Password = <password>'
$regexReplace = [regex]::Escape('<password>')
$Password = 'GIWs#K?hks2v&HKXb$S9=HK*AZN=i!(S?7'
$OriginalString -replace $regexReplace,($Password -replace '\$','$$$$')
# Output
Username = Anonymous; Password = GIWs#K?hks2v&HKXb$S9=HK*AZN=i!(S?7
In the code above, $OriginalString is just an ordinary string that can be retrieved from any command or set by a coder. It contains a string <password> that we want to replace with a complex password string GIWs#K?hks2v&HKXb$S9=HK*AZN=i!(S?7.
$Password contains the complex password. Since we only care about replacing <password> and are choosing to use regex replace operator -replace, we need a valid regex expression for matching <password>. There is a caveat here though. When using -replace, the $ in the replacement string is used to prefix capture group names. So there can be cases where the literal string has an unintentional replacement. Capture group 0 is always there if there is a match. So $0 will always cause issues without proper escaping. It is probably best to just escape $ regardless.
For the regex match, we use [regex]::Escape('<password>') since we are unsure if <> are special in regex. If there are no special characters, then the string within the regex expression will not be modified. If it does contain special characters, they will be escaped with \.
As a result, <password> is replaced with GIWs#K?hks2v&HKXb$S9=HK*AZN=i!(S?7.
A recap of the syntax is as follows:
'String With Something You Want to Replace' -replace 'Regex Expression to Match String You Want to Replace','Replacement That Is a Literal String With Escaped $'
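
For comparison, the same idea can be sketched in Python; this is an analogue, not a PowerShell translation. In Python's re.sub the special character in a replacement string is the backslash rather than $, so this sketch avoids the issue by passing a function as the replacement. The function's return value is inserted literally. Variable names mirror the example above but are otherwise arbitrary.

```python
import re

original = "Username = Anonymous; Password = <password>"
password = "GIWs#K?hks2v&HKXb$S9=HK*AZN=i!(S?7"

# Escape the placeholder so it is matched as literal text.
pattern = re.escape("<password>")

# A callable replacement is used as-is: no group references are expanded.
result = re.sub(pattern, lambda match: password, original)

# When no regex features are needed, plain string replacement is enough.
plain_result = original.replace("<password>", password)
print(result)
```

If the placeholder is always a fixed string, the plain replace call is probably the simpler choice. The regex version only pays off when the thing being matched is a real pattern.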

Regex to Match a single or multiple email addresses separated by comma

I need to have a regex pattern that matches the following kind of string
#keyword1 a@b.com or #keyword2 a@b.com;b@c.com;d@e.com
The following regex pattern doesn't do exactly what I want:
/(#)(?:keyword1|keyword2)\s([a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)/g
The above regex only matches #keyword1 a@b.com correctly.
But for the second it matches everything before the first semicolon. I need it to match the entire thing. How can I do that please?
I would suggest parsing the string in two steps. First distinguish the keyword from the array of email addresses and then split the array.
First retrieve both the keyword and the array, assuming that is all that the string consists of. I'm using the JavaScript RegExp notation, but you should be able to understand what is happening.
Assume the string is "#keyword2 a@b.com;b@c.com;d@e.com".
/^#(keyword1|keyword2) (.*)$/g
Group 1 will be "keyword2" and group 2 will be "a@b.com;b@c.com;d@e.com". Now apply the following pattern to group 2 and loop through the matches to retrieve each email address.
/([^;]*)(?:;|$)/g
This pattern makes no assumptions about whether or not the email addresses are properly formatted, just that they are separated by a semicolon. This also works if there's only a single email address.
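
The two-step approach can be sketched in Python as follows. The function name parse_line is hypothetical. The second step uses [^;]+ instead of the ([^;]*)(?:;|$) pattern above, because Python's findall would otherwise also report an empty match at the end of the string.

```python
import re

# Step 1: separate the keyword from the semicolon-separated list.
LINE_PATTERN = re.compile(r"^#(keyword1|keyword2) (.*)$")


def parse_line(line):
    """Return (keyword, [addresses]) or None if the line has another shape."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    keyword, address_list = match.groups()
    # Step 2: every run of non-semicolon characters is one address.
    addresses = re.findall(r"[^;]+", address_list)
    return keyword, addresses


multi = parse_line("#keyword2 a@b.com;b@c.com;d@e.com")
single = parse_line("#keyword1 a@b.com")
print(multi, single)
```

As in the answer, nothing here checks that each item is a well-formed address. A validation regex could be applied to each item afterwards if that is needed.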

How to do regular Expression in AutoIt Script

In an AutoIt script, I am unable to write a regular expression for the string below. The numbers in it always change.
Actual String = _WinWaitActivate("RX_IST2_AM [PID:942564 NPID:10991 SID:498702881] sbivvrwm060.dev.ib.tor.Test.com:30000","")
Here the PID, NPID and SID values keep changing, and the rest is always constant.
What I have tried is:
_WinWaitActivate("RX_IST2_AM [PID:'([0-9]{1,6})' NPID:'([0-9]{1,5})' SID:'([0-9]{1,9})' sbivvrwm060.dev.ib.tor.Test.com:30000","")
Can someone please help me?
As stated in the documentation, you should write the prefix REGEXPTITLE: and surround everything with square brackets, but escape all special characters, including the dots (.) and spaces ( ), with a backslash (\). Instead of [0-9] you can use \d. Pass something like the following as the parameter for the Win...(...) functions:
"[REGEXPTITLE:RX_IST2_AM\ \[PID:(\d{1,6})\ NPID:(\d{1,5})\ SID:(\d{1,9})\] sbivvrwm060\.dev\.ib\.tor\.Test\.com:30000]"
You can even omit the round brackets ((...)) but keep their content if you don't want to capture it for further processing, as you would with StringRegExp(...) or StringRegExpReplace(...). With the _WinWaitActivate(...) function, capturing won't make sense anyway, as it only matches and does not replace or return anything from your regular expression.
According to regex101 both work, with the round brackets and without - you should always use a tool like this site to confirm that your expression is actually working for your input string.
Not familiar with autoit, but remember that regex has to completely match your string to capture results. For example, (goat)s will NOT capture the word goat if your string is goat or goater.
You have forgotten to add a ] in your regex, so your pattern doesn't match the string and the capture groups will not be extracted. Also, I'm not completely sold on the usage of '. Based on this page, you can do something like StringRegExp(yourstring, 'RX_IST2_AM \[PID:([0-9]{1,6}) NPID:([0-9]{1,5}) SID:([0-9]{1,9})\]', $STR_REGEXPARRAYGLOBALMATCH), with the literal square brackets escaped, and $1, $2 and $3 would be your results respectively. But maybe your approach works too.
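
The pattern itself can be checked outside AutoIt. The sketch below uses Python's re module, whose syntax is close to but not identical to AutoIt's PCRE engine, so treat it as a sanity check of the escaping and the digit groups rather than as AutoIt code. The variable names are illustrative.

```python
import re

title = ("RX_IST2_AM [PID:942564 NPID:10991 SID:498702881] "
         "sbivvrwm060.dev.ib.tor.Test.com:30000")

# Literal square brackets and dots are escaped; \d replaces [0-9].
TITLE_PATTERN = re.compile(
    r"RX_IST2_AM \[PID:(\d{1,6}) NPID:(\d{1,5}) SID:(\d{1,9})\] "
    r"sbivvrwm060\.dev\.ib\.tor\.Test\.com:30000"
)

match = TITLE_PATTERN.fullmatch(title)
pid, npid, sid = match.groups() if match else (None, None, None)

# Without escaping, "[PID:...]" would be read as a character class,
# so this unescaped variant does not match the title.
unescaped = re.search(r"RX_IST2_AM [PID:(\d{1,6})]", title)
print(pid, npid, sid, unescaped)
```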

parsing url for specific param value

I'm looking to use a regular expression to parse a URL to get a specific section of it, and to get nothing if the pattern isn't found.
An example URL is
/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://value-value.jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5#c452fds-634d-f424fds-cdsa&bf_action=jildape
I wish to get the bolded text in it (the text between id= and the #).
Currently I'm using the regex "d=([^#]*)", but the problem is that I'm also running across URLs of this pattern:
/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://value-value.jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5&bf_action=jildape
and for these I'm getting everything after id= as the match.
I would prefer that this URL produce no match, because it doesn't contain the #.
Regexes are not a magic tool that you should always use just because the problem involves a string. In this case, your language probably has a tool to break apart URLs for you. In PHP, this is parse_url(). In Perl, it's the URI::URL module.
You should almost always prefer an existing, well-tested solution to a common problem like this rather than writing your own.
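
As an illustration of the parser-based route, here is a sketch using Python's standard urllib.parse; the answer names PHP and Perl, so Python is a substitution. It assumes the id always sits inside the nested URL carried by the jklfe parameter, as in the question's examples. The function name id_before_fragment is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def id_before_fragment(url):
    """Return the nested id value, or None when the URL has no # part."""
    parts = urlsplit(url)
    if not parts.fragment:
        return None  # no '#', so the asker wants no result
    # The id lives in the query string of the URL stored in 'jklfe'.
    for nested in parse_qs(parts.query).get("jklfe", []):
        ids = parse_qs(urlsplit(nested).query).get("id")
        if ids:
            return ids[0]
    return None


with_hash = ("/te/file/value/jifle?uil=testing-cdas-feaw:jilk:"
             "&jklfe=https://value-value.jifels/temp.html/topic"
             "?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5"
             "#c452fds-634d-f424fds-cdsa&bf_action=jildape")
without_hash = ("/te/file/value/jifle?uil=testing-cdas-feaw:jilk:"
                "&jklfe=https://value-value.jifels/temp.html/topic"
                "?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5&bf_action=jildape")

found = id_before_fragment(with_hash)
missing = id_before_fragment(without_hash)
print(found, missing)
```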
So you want to match the value of the id parameter, but only if it has a trailing section containing a '#' symbol (without matching the '#' or what's after it)?
Not knowing the specifics of what style of regexes you're using, how about something like:
id=([^#&]*)#
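
A quick check of this pattern against the two URLs from the question, sketched in Python (the question does not say which regex flavour is in use, so this is only indicative). The helper name extract_id is made up.

```python
import re

ID_PATTERN = re.compile(r"id=([^#&]*)#")


def extract_id(url):
    """Return the id value if it is directly followed by '#', else None."""
    match = ID_PATTERN.search(url)
    return match.group(1) if match else None


with_hash = ("/te/file/value/jifle?uil=testing-cdas-feaw:jilk:"
             "&jklfe=https://value-value.jifels/temp.html/topic"
             "?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5"
             "#c452fds-634d-f424fds-cdsa&bf_action=jildape")
without_hash = ("/te/file/value/jifle?uil=testing-cdas-feaw:jilk:"
                "&jklfe=https://value-value.jifels/temp.html/topic"
                "?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5&bf_action=jildape")

first = extract_id(with_hash)
second = extract_id(without_hash)
print(first, second)
```

Excluding & from the character class is what stops the match from running into the next parameter when no # follows the id.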
regex = "id=([\\w-]+?)#"
This will grab everything between 'id=' and '#', assuming every character between them is in the character class [a-zA-Z_0-9-] (i.e. if an '&' is in there, the regex will fail).
id=
-Self explanatory, this looks for the exact match of 'id='
([\\w-]+?)
-This defines a character class, repeats it, and groups the result. The \\w is an escaped \w. '\w' is a predefined character class from Java that is equal to [a-zA-Z_0-9]. I added '-' to this class because of the assumed pattern from your examples. The quantifier sits inside the parentheses so that the group captures the whole value rather than only its last character.
+?
-This is a reluctant quantifier that looks for the shortest possible match of the regex.
#
-The end of the regex, the last character we are looking for to match the pattern.
If you are looking to grab every character between 'id=' and the first '#' following it, the following will work and it uses the same logic as above, but replaces the character class [\\w-] with ., which matches anything.
regex = "id=(.+?)#"