What does the (?i)\\. regular expression mean? - regex

The code uses the following regular expression
img[src~=(?i)\\.(png|jpe?g)]
I'm not sure whether it is the . or the \ that is escaped.

The \ is escaped, which at first glance looks like an error given what it's trying to do.
Actually, you've probably taken that out of context and it sits inside a string literal. If so, the first backslash escapes the second one, and the resulting single backslash then escapes the dot.
The (?i) switches the pattern into case-insensitive mode.
Now that I think about it, this is a hybrid of a CSS-style attribute selector and a regex. The ~= means nothing special in a regex (those are literal characters); it belongs to the selector. In standard CSS, [attr~=value] matches one word in a whitespace-separated list (the "ends with" operator is $=), but some selector engines, jsoup for example, define [attr~=regex] as "the attribute value matches this regular expression". The [ and ] would denote a character set in a regex, but here they delimit the attribute selector.
So the regex proper is only the (?i)\\.(png|jpe?g) part, applied to the src attribute of img elements.

Taken literally, it matches, case-insensitively, any string that contains a backslash, then any single character, then png, jpeg or jpg, for example:
\.png
\.jpeg
\.jpg
But this is dependent on context. If it is used where \ needs to be escaped at a higher level (inside a string literal, say), then it matches, case-insensitively, any string that contains:
.png
.jpeg
.jpg
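The two readings can be compared side by side. The following is a minimal Python sketch, not the original code: the variable names and sample file names are made up for illustration, and Python's re module stands in for whatever engine the original code used.

```python
import re

# Written as an ordinary (non-raw) string literal, as it would appear in
# source code: the first backslash escapes the second, so the regex engine
# receives \. (a literal dot).
in_string_literal = "(?i)\\.(png|jpe?g)"

# Taken literally (raw string): \\ is an escaped backslash and . is a wildcard.
taken_literally = r"(?i)\\.(png|jpe?g)"

print(bool(re.search(in_string_literal, "photo.JPG")))   # True
print(bool(re.search(in_string_literal, "photoXjpg")))   # False
print(bool(re.search(taken_literally, "photo\\.png")))   # True
print(bool(re.search(taken_literally, "photo.png")))     # False
```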

In this expression, the '\' is escaped, which in turn escapes the '.'

Related

How do I select only the files that start with either CH or OTC [duplicate]

I'm creating a javascript regex to match queries in a search engine string. I am having a problem with alternation. I have the following regex:
.*baidu.com.*[/?].*wd{1}=
I want to be able to match strings that have the string 'word' or 'qw' in addition to 'wd', but everything I try is unsuccessful. I thought I would be able to do something like the following:
.*baidu.com.*[/?].*[wd|word|qw]{1}=
but it does not seem to work.
Replace [wd|word|qw] with (wd|word|qw) or (?:wd|word|qw).
[] denotes character sets, () denotes logical groupings.
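The difference is easy to see by running both forms against a few parameter names. This is a small sketch in Python rather than JavaScript (the two flavors agree on this point), and the test strings are invented for illustration.

```python
import re

# A character set: exactly ONE character out of w, d, |, o, r, q, then "=".
char_class = re.compile(r"[wd|word|qw]{1}=")
# A non-capturing group: one of the three whole alternatives, then "=".
grouping = re.compile(r"(?:wd|word|qw)=")

for key in ("wd=", "word=", "qw=", "r=", "|="):
    print(key, bool(char_class.fullmatch(key)), bool(grouping.fullmatch(key)))
```

The first three keys are accepted only by the grouping, while the last two are accepted only by the character set.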
Your expression:
.*baidu.com.*[/?].*[wd|word|qw]{1}=
does need a few changes, including [wd|word|qw] to (wd|word|qw) and getting rid of the redundant {1}, like so:
.*baidu.com.*[/?].*(wd|word|qw)=
But you also need to understand that the first part of your expression (.*baidu.com.*[/?].*) will match baidu.com hello what spelling/handle????????? or hbaidu-com/ or even something like lkas----jhdf lkja$##!3hdsfbaidugcomlaksjhdf.[($?lakshf, because the dot (.) matches any character except newlines... to match a literal dot, you have to escape it with a backslash (like \.)
There are several approaches you could take to match things in a URL, but we could help you more if you tell us what you are trying to do or accomplish - perhaps regex is not the best solution or (EDIT) only part of the best solution?
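To illustrate the point about the unescaped dot, here is a short Python sketch. Both URLs are hypothetical, and the stricter pattern is only one possible tightening, not a complete URL matcher.

```python
import re

# The expression with only the character-set fix applied: dots are wildcards.
loose = re.compile(r".*baidu.com.*[/?].*(wd|word|qw)=")
# Same idea with the dot escaped so that it must be a literal ".".
strict = re.compile(r"baidu\.com.*[/?].*(?:wd|word|qw)=")

real_url = "http://www.baidu.com/s?wd=regex"      # hypothetical example
lookalike = "http://hbaidu-com.example/?qw=x"     # hypothetical example

print(bool(loose.search(real_url)), bool(strict.search(real_url)))     # True True
print(bool(loose.search(lookalike)), bool(strict.search(lookalike)))   # True False
```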

How to do regular Expression in AutoIt Script

In an AutoIt script I am unable to write a regular expression for the string below. The numbers in it change every time.
Actual String = _WinWaitActivate("RX_IST2_AM [PID:942564 NPID:10991 SID:498702881] sbivvrwm060.dev.ib.tor.Test.com:30000","")
Here the PID, NPID and SID values keep changing and the rest is always constant.
What I have tried is below:
_WinWaitActivate("RX_IST2_AM [PID:'([0-9]{1,6})' NPID:'([0-9]{1,5})' SID:'([0-9]{1,9})' sbivvrwm060.dev.ib.tor.Test.com:30000","")
Can someone please help me?
As stated in the documentation, you should write the prefix REGEXPTITLE: and surround everything with square brackets, and escape every character that is special in a regex, such as the dots (.) and the literal square brackets, with a backslash (\). Escaping the spaces as well is optional but harmless. Instead of [0-9] you might use \d, like "[REGEXPTITLE:RX_IST2_AM\ \[PID:(\d{1,6})\ NPID:(\d{1,5})\ SID:(\d{1,9})\]\ sbivvrwm060\.dev\.ib\.tor\.Test\.com:30000]" as your parameter for the Win...(...) functions.
You can even omit the round brackets ((...)) and keep only their content if you don't want to capture it for further processing, as you would with StringRegExp(...) or StringRegExpReplace(...). With the _WinWaitActivate(...) function capturing makes no sense anyway, because it only matches and does not replace or return anything from your regular expression.
According to regex101 both work, with the round brackets and without - you should always use a tool like this site to confirm that your expression is actually working for your input string.
Not familiar with AutoIt, but remember that the whole pattern has to match before any group captures a result. For example, (goat)s will NOT capture the word goat if your string is goat or goater, because the s after the group never matches.
You have forgotten to add a ] in your regex, so your pattern doesn't match the string and capture groups will not be extracted. Also I'm not completely sold on the usage of '. Note that the literal [ and ] have to be escaped, otherwise they open a character class. Based on this page, you can do something like StringRegExp(yourstring, 'RX_IST2_AM \[PID:([0-9]{1,6}) NPID:([0-9]{1,5}) SID:([0-9]{1,9})\]', $STR_REGEXPARRAYGLOBALMATCH) and the three captured values would be the elements of the returned array, in order. But maybe your approach works too.
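The pattern itself can be checked outside AutoIt. The sketch below uses Python's re module as a stand-in (AutoIt's own engine is PCRE, so this only verifies the escaping and the capture groups, not AutoIt's window-matching behaviour); the variable names are illustrative.

```python
import re

title = ("RX_IST2_AM [PID:942564 NPID:10991 SID:498702881] "
         "sbivvrwm060.dev.ib.tor.Test.com:30000")

# Literal brackets and dots are escaped; the three numbers are captured.
pattern = re.compile(
    r"RX_IST2_AM \[PID:(\d{1,6}) NPID:(\d{1,5}) SID:(\d{1,9})\] "
    r"sbivvrwm060\.dev\.ib\.tor\.Test\.com:30000"
)

match = pattern.fullmatch(title)
pid, npid, sid = match.groups()
print(pid, npid, sid)   # 942564 10991 498702881
```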

parsing url for specific param value

I'm looking to use a regular expression to parse a URL and get a specific section of it, or nothing if the pattern cannot be found.
A URL example is
/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://value-value.jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5#c452fds-634d-f424fds-cdsa&bf_action=jildape
I wish to get the bolded text in it.
Currently I'm using the regex "d=([^#]*)", but the problem is that I'm also running across URLs of this pattern:
/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://value-value.jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-jfkaliejs5&bf_action=jildape
and I'm getting the bold section of it.
I would prefer to get no match for this URL because it doesn't contain the #.
Regexes are not a magic tool that you should always use just because the problem involves a string. In this case, your language probably has a tool to break apart URLs for you. In PHP, this is parse_url(). In Perl, it's the URI::URL module.
You should almost always prefer an existing, well-tested solution to a common problem like this rather than writing your own.
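As an illustration of the parser-based route, here is a rough Python sketch using the standard urllib.parse module. It assumes the id belongs to the URL nested inside the jklfe parameter and that "contains a #" means the outer URL has a fragment; the function name id_if_fragment is made up for this example.

```python
from urllib.parse import urlsplit, parse_qs

def id_if_fragment(url):
    """Return the id parameter of the nested jklfe URL, or None when the
    outer URL has no '#' fragment or the parameters are missing."""
    outer = urlsplit(url)
    if not outer.fragment:
        return None
    nested = parse_qs(outer.query).get("jklfe", [None])[0]
    if nested is None:
        return None
    ids = parse_qs(urlsplit(nested).query).get("id")
    return ids[0] if ids else None

with_hash = ("/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://"
             "value-value.jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-"
             "jfkaliejs5#c452fds-634d-f424fds-cdsa&bf_action=jildape")
without_hash = ("/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://"
                "value-value.jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-"
                "jfkaliejs5&bf_action=jildape")

print(id_if_fragment(with_hash))
print(id_if_fragment(without_hash))
```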
So you want to match the value of the id parameter, but only if it has a trailing section containing a '#' symbol (without matching the '#' or what's after it)?
Not knowing the specifics of what style of regexes you're using, how about something like:
id=([^#&]*)#
regex = "id=([\\w-]+?)#"
This will grab everything in the character class [a-zA-Z_0-9-] between 'id=' and '#', assuming everything between 'id=' and '#' is in that character class (i.e. if an '&' is in there, the regex will fail).
id=
-Self explanatory, this looks for the exact match of 'id='
([\\w-]+?)
-This defines a character class, repeats it, and captures the whole run as a group. The \\w is an escaped \w (the backslash is doubled for the string literal). '\w' is a predefined character class from java that is equal to [a-zA-Z_0-9]. I added '-' to this class because of the assumed pattern from your examples.
+?
-This is a reluctant quantifier that looks for the shortest possible match of one or more characters. It has to sit inside the parentheses; with ([\\w-])+? the group would capture only the last character of the value.
#
-The end of the regex, the last character we are looking for to match the pattern.
If you are looking to grab every character between 'id=' and the first '#' following it, the following will work and it uses the same logic as above, but replaces the character class [\\w-] with ., which matches anything.
regex = "id=(.+?)#"
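The regex-based suggestions can be tried against both URLs from the question. The sketch below uses Python's re module with the id=([^#&]*)# pattern suggested above; the helper name extract_id is invented for the example.

```python
import re

ID_BEFORE_HASH = re.compile(r"id=([^#&]*)#")

def extract_id(url):
    """Return the id value only when it is directly followed by '#'."""
    match = ID_BEFORE_HASH.search(url)
    return match.group(1) if match else None

with_hash = ("/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://"
             "value-value.jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-"
             "jfkaliejs5#c452fds-634d-f424fds-cdsa&bf_action=jildape")
without_hash = ("/te/file/value/jifle?uil=testing-cdas-feaw:jilk:&jklfe=https://"
                "value-value.jifels/temp.html/topic?id=e997aad4-92e0-j30e-a3c8-"
                "jfkaliejs5&bf_action=jildape")

print(extract_id(with_hash))      # e997aad4-92e0-j30e-a3c8-jfkaliejs5
print(extract_id(without_hash))   # None
```

Excluding & from the character class is what prevents the second URL from matching up to some later '#'.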

Perl regex with exclamation marks

How do you define/explain this Perl regex:
$para =~ s!//!/!g;
I know the s means search, and g means global (search), but not sure how the exclamation marks ! and extra slashes / fit in (as I thought the pattern would look more like s/abc/def/g).
Perl's regex operators s, m and tr (though tr is not really a regex operator) allow you to use almost any symbol as your delimiter.
What this means is that you don't have to use /; you could use ! instead, as in your question.
# the regex
s!//!/!g
means search and replace all instances of '//' with '/'
you could write the same thing as
s/\/\//\//g
or
s#//#/#g
or
s{//}{/}g
if you really wanted to, but as you can see, the first one, with all the backslashes, is very hard to understand and much more cumbersome.
More information can be found in the perldoc's perlre
The substitution regex (and other regex operators, like m///) can take any punctuation character as delimiter. This saves you the trouble of escaping meta characters inside the regex.
If you want to replace slashes, it would be awkward to write:
s/\/\//\//g;
Which is why you can write
s!//!/!g;
...instead. See http://perldoc.perl.org/perlop.html#Regexp-Quote-Like-Operators
And no, s/// is the substitution. m/// is the search, though I do believe the intended mnemonic is "match".
The exclamation marks are the delimiter; perl lets you choose any character you want, within reason. The statement is equivalent to the (much uglier) s/\/\//\//g — that is, it replaces // with /.
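For readers more at home outside Perl, the effect of the substitution can be mirrored in Python, where the pattern is an ordinary string and no delimiter has to be chosen. This is only an equivalent sketch; the sample path is invented.

```python
import re

para = "/usr//local//bin"           # hypothetical input
# Same effect as Perl's  s!//!/!g : every "//" becomes "/".
collapsed = re.sub(r"//", "/", para)
print(collapsed)                    # /usr/local/bin
```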

eregi_replace to preg_replace conversion stuff

Regular expressions are not my strong point.
I can do simple stuff, but this one has just got my goat !!
So could someone give me a hand with this one.
Here's the comment in the code :
// If utf8 detection didnt work before, strip those weird characters for an underscore, as a last resort.
eregi_replace("[^a-z0-9 \-\.\(\)\/\\]","_",$str);
to (here's what I tried)
preg_replace("{[^a-z0-9 \-\.\(\)\/\\]}i","_",$str);
Any regex pros out there who can give me a hand?
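You need to specify a regexp delimiter such as # or /
preg_replace("#[^a-z0-9 \-\.\(\)\/\\\\]#i","_",$str);
So you should enclose your regular expression in those delimiter characters. Note the four backslashes: the string turns them into two, which the regex reads as one literal backslash; with only two, the regex would see \] and the character class would never be closed.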
First, I believe the { and } are fine as delimiters separating the expression from the flags, but I know there are some regex flavors that don't support them, so it might be a good idea to just use something like ! or #.
Second, I am not sure how the expression worked before, because AFAIK escaping with a \ character does not work inside a bracket expression in ERE. You have to represent special characters like ^, -, and ] by their position within the class (^ cannot be the first character, ] must be the first character, and - must be either the first or the last character). The - character in the first expression would be interpreted as a range specifier (in this case a range from \ to \). Additionally, the \ characters are treated literally, so you've got a confusing-looking and largely redundant regex.
The replacement expression, however, needs to be in preg notation/flavor, so the rules change:
Very few things need to be escaped in a character class, even with the new rules.
The \ character needs to be escaped twice, once for the string and once more for the regex; otherwise it will escape the closing bracket ].
Assuming you want to match a dash (or rather match something OTHER than a dash), it needs to be moved to the end of the class.
So, here is some code (link) that I believe does what you need it to do:
$source = 'hello! ##$%^&* wazzup-dawg?.()/\\[]{}<>:"';
$blah = preg_replace('![^a-z0-9 .()/\\\\-]!i','_',$source);
print($blah);
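The same character class can be exercised in Python to see which characters survive. This is a sketch of the equivalent call, not the PHP code itself: a raw string is used, so the backslash is escaped once (for the regex only) instead of twice.

```python
import re

# One literal backslash in the input, matching PHP's single-quoted '\\'.
source = 'hello! ##$%^&* wazzup-dawg?.()/\\[]{}<>:"'

# Keep letters, digits, space, . ( ) / \ and -; replace everything else.
cleaned = re.sub(r"[^a-z0-9 .()/\\-]", "_", source, flags=re.IGNORECASE)
print(cleaned)
```

Letters, digits, spaces, the dot, parentheses, slash, backslash and dash are kept; every other character becomes an underscore.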
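preg_replace("{[^a-z0-9]-.()/\/}i","_",$str)
runs without errors for me, but note that the ] right after 0-9 closes the character class early, so the rest (-.()/\/) is matched as a literal sequence rather than as part of the set.
I tried it with all of #, / and { as delimiters and they all worked.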