Regular expression lookbehind problem - regex

I use
(?<!value=\")##(.*)##
to match a string like ##MyString## that is not in the form:
<input type="text" value="##MyString##">
This works for the above form, but not for this: (It still matches, should not match)
<input type="text" value="Here is my ##MyString## coming..">
I tried:
(?<!value=\").*##(.*)##
with no luck. Any suggestions will be deeply appreciated.
Edit: I am using PHP preg_match() function

This is not perfect (that's what HTML parsers are for), but it will work for the vast majority of HTML files:
(^|>)[^<>]*##[^#]*##[^<>]*(<|$)
The idea is simple. You're looking for a string that is outside of tags. To be outside of tags, the closest preceding angled bracket to it must be closing (or there's no bracket at all), and the closest following one must be opening (or none). This assumes that angled brackets are not used in attribute values.
If you actually care that the attribute name be "value", then you can match for:
value\s*=\s*"([^\"]|\\\")*##[^#]*##([^\"]|\\\")*\"
... and then simply negate the match (!preg_match(...)).
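As a rough illustration, here are both checks ported to JavaScript regex literals rather than PHP (the patterns themselves are unchanged apart from quoting). The names outsideTags, insideValueAttr and notInValueAttribute are made up for this sketch, and it inherits the same caveat as above: it is not a real HTML parser.

```javascript
// Matches ##...## only when the nearest angle bracket before it is ">" (or there
// is none) and the nearest one after it is "<" (or there is none).
const outsideTags = /(^|>)[^<>]*##[^#]*##[^<>]*(<|$)/;

// Matches ##...## anywhere inside a double-quoted value="..." attribute.
const insideValueAttr = /value\s*=\s*"([^"]|\\")*##[^#]*##([^"]|\\")*"/;

// Hypothetical helper: negate the match, like !preg_match(...) in PHP.
function notInValueAttribute(html) {
  return !insideValueAttr.test(html);
}

const inText = "<p>Hello ##MyString##</p>";
const inAttr = '<input type="text" value="Here is my ##MyString## coming..">';

console.log(outsideTags.test(inText));     // true
console.log(outsideTags.test(inAttr));     // false
console.log(notInValueAttribute(inText));  // true
console.log(notInValueAttribute(inAttr));  // false
```

The first pattern answers "is the placeholder in text content?", the second answers "is it inside a value attribute?", so which one to use depends on which question you are really asking.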

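@OP, you can do it simply without regex.
$text = '<input type="text" value=" ##MyString##">';
// Remove all spaces so that value=" ##..." and value="##..." look the same.
$text = str_replace(" ", "", $text);
if (strpos($text, 'value="##') !== false) {
    // Take what follows value="## and cut it at the closing ##.
    $s = explode('value="##', $text);
    $t = explode("##", $s[1]);
    print "$t[0]\n";
}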

Here is a starting point at least; it works for the given examples.
(?<!<[^>]*value="[^>"]*)##(.*)##
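One caveat worth checking before relying on this: the lookbehind contains [^>]* and is therefore variable-length. As far as I know, PCRE (which preg_match() uses) rejects unbounded lookbehinds, so the sketch below runs the same pattern in JavaScript instead, assuming an engine with lookbehind support such as a recent Node.js. The variable names are only for illustration.

```javascript
// The pattern from above, unchanged, as a JavaScript regex literal.
const notInsideValue = /(?<!<[^>]*value="[^>"]*)##(.*)##/;

const samples = [
  '##MyString##',
  '<input type="text" value="##MyString##">',
  '<input type="text" value="Here is my ##MyString## coming..">',
];

// Captured text for each sample, or null when the lookbehind rejects it.
const lookbehindResults = samples.map((s) => {
  const m = notInsideValue.exec(s);
  return m ? m[1] : null;
});

console.log(lookbehindResults); // [ 'MyString', null, null ]
```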

Related

How to allow single quote in regular expression for WhitelistingTextInputFormatter? [duplicate]

I have a form with three inputs.
The first accepts only numbers; the other two accept all letters from A to Z, spaces and apostrophes.
HTML:
<input type="text" name="number" id="number"/>
<input type="text" name="name" id="name"/>
<input type="text" name="surname" id="surname"/>
Javascript:
$("body").on("click", "#button", function () {
    let regexMessaggio = new RegExp("^[0-9]{1,5}$");
    if (!regexMessaggio.test($('#number').val())) {
        alert("error");
        return false;
    }
    let regexMessaggio2 = new RegExp("^[a-zA-ZÀ-ÿ ']{2,60}$");
    if (!regexMessaggio2.test($('#name').val())) {
        alert("error");
        return false;
    }
    if (!regexMessaggio2.test($('#surname').val())) {
        alert("error");
        return false;
    }
    alert("OK");
});
In every desktop browser this code works, but not on iOS. I have tried Chrome and Safari, but neither recognizes the apostrophe.
I have also tried to change the regular expression with:
- "^([a-zA-ZÀ-ÿ ']{2,60})+$"
- /^[a-zA-ZÀ-ÿ ']{2,60}$/
- "[a-zA-ZÀ-ÿ ']{2,60}"
If "Smart Punctuation" is switched on, all straight single apostrophes are automatically converted to curly apostrophes. Even if you have access to the smartQuotesType property of UI text input fields, users may paste curly apostrophes there and you should account for them.
In the Regex and Smart Punctuation in iOS post, the author suggests adding ‘’ to the regex:
let regexMessaggio2 = /^[a-zA-ZÀ-ÿ ‘’']{2,60}$/;
Or, in case there are any encoding issues use hex representations:
let regexMessaggio2 = /^[a-zA-ZÀ-ÿ \u2018\u2019']{2,60}$/;
Note that a regex literal notation (/.../) is preferable to the RegExp constructor notation when the pattern is static, i.e. you are not passing any variables to the regex pattern.
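To see the difference without an iOS device, here is a small sketch that feeds both kinds of apostrophe to the old and the new pattern. The sample names and variable names are made up, and À-ÿ is written as \u00C0-\u00FF only so that file encoding cannot get in the way.

```javascript
// Original character class: straight apostrophe only.
const straightOnly = /^[a-zA-Z\u00C0-\u00FF ']{2,60}$/;
// Extended class: also accepts the curly quotes U+2018 and U+2019.
const withCurly = /^[a-zA-Z\u00C0-\u00FF \u2018\u2019']{2,60}$/;

const typedOnDesktop = "D'Angelo";       // straight apostrophe (U+0027)
const typedWithSmartQuotes = "D\u2019Angelo"; // right single quotation mark (U+2019)

console.log(straightOnly.test(typedOnDesktop));       // true
console.log(straightOnly.test(typedWithSmartQuotes)); // false
console.log(withCurly.test(typedOnDesktop));          // true
console.log(withCurly.test(typedWithSmartQuotes));    // true
```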
Thanks to @Wiktor Stribiżew for the help.
The correct string is new RegExp("^[a-zA-ZÀ-ÿ ‘’']{2,60}$")
I also found this link, which may help.

Pattern attribute value is not a valid regular expression

My HTML has the following input element (it is intended to accept email addresses that end in ".com"):
<input type="email" name="p_email_ad" id="p_email_ad" value="" required="required" pattern="[\-a-zA-Z0-9~!$%\^&*_=+}{\'?]+(\.[\-a-zA-Z0-9~!$%\^&*_=+}{\'?]+)*@([a-zA-Z0-9_][\-a-zA-Z0-9_]*(\.[\-a-zA-Z0-9_]+)*\.([cC][oO][mM]))(:[0-9]{1,5})?$" maxlength="64">
At some point in the past 2 months, Chrome has started returning the following JavaScript error (and preventing submission of the parent form) when validating that input:
Pattern attribute value
[\-a-zA-Z0-9~!$%\^&*_=+}{\'?]+(\.[\-a-zA-Z0-9~!$%\^&*_=+}{\'?]+)*@([a-zA-Z0-9_][\-a-zA-Z0-9_]*(\.[\-a-zA-Z0-9_]+)*\.([cC][oO][mM]))(:[0-9]{1,5})?$
is not a valid regular expression: Uncaught SyntaxError: Invalid
regular expression:
/[\-a-zA-Z0-9~!$%\^&*_=+}{\'?]+(\.[\-a-zA-Z0-9~!$%\^&*_=+}{\'?]+)*@([a-zA-Z0-9_][\-a-zA-Z0-9_]*(\.[\-a-zA-Z0-9_]+)*\.([cC][oO][mM]))(:[0-9]{1,5})?$/: Invalid escape
Regex101.com likes the regex pattern, but Chrome doesn't. What syntax do I have wrong?
Use
pattern="[-a-zA-Z0-9~!$%^&*_=+}{'?]+(\.[-a-zA-Z0-9~!$%^&*_=+}{'?]+)*@([a-zA-Z0-9_][-a-zA-Z0-9_]*(\.[-a-zA-Z0-9_]+)*\.([cC][oO][mM]))(:[0-9]{1,5})?"
The problem is that some characters that should not be escaped were escaped, like ' inside the character classes: the error message says "Invalid escape", and Chrome rejects an escaped ' here even though more lenient engines accept it. The escapes before ^ and - are unnecessary too. Note that - inside a character class may be escaped, but does not have to be when it is at its start.
Note also that HTML5 engines wrap the whole pattern inside ^(?: and )$ constructs, so there is no need to use the $ end-of-string anchor at the end of the pattern.
Test:
<form>
<input type="email" name="p_email_ad" id="p_email_ad" value="" required="required" pattern="[-a-zA-Z0-9~!$%^&*_=+}{'?]+(\.[-a-zA-Z0-9~!$%^&*_=+}{'?]+)*@([a-zA-Z0-9_][-a-zA-Z0-9_]*(\.[-a-zA-Z0-9_]+)*\.([cC][oO][mM]))(:[0-9]{1,5})?" maxlength="64">
<input type="Submit">
</form>
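If you want to check a pattern outside the browser, the sketch below approximates what the browser does. It rests on an assumption: that the pattern attribute is compiled with the Unicode (u) flag and wrapped in ^(?: and )$, which is my understanding of the HTML specification at the time (newer browsers may use the stricter v flag, which this sketch does not cover). Both function names are hypothetical.

```javascript
// Hypothetical helper: approximates how a browser compiles a pattern attribute,
// assuming the "u" flag and the ^(?: ... )$ wrapper.
function compilesAsPatternAttribute(pattern) {
  try {
    new RegExp("^(?:" + pattern + ")$", "u");
    return true;
  } catch (err) {
    return false; // a SyntaxError such as "Invalid escape"
  }
}

// Same check without the "u" flag, which is roughly what more lenient tools do.
function compilesWithoutUnicodeFlag(pattern) {
  try {
    new RegExp("^(?:" + pattern + ")$");
    return true;
  } catch (err) {
    return false;
  }
}

console.log(compilesAsPatternAttribute("[\\']+"));  // false: escaped apostrophe
console.log(compilesWithoutUnicodeFlag("[\\']+"));  // true: tolerated without "u"
console.log(compilesAsPatternAttribute("[']+"));    // true
console.log(compilesAsPatternAttribute("[-a-z^]+")); // true: no escapes needed
```

This would also explain why a tester that does not apply the u flag accepts a pattern that the browser rejects.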
I was experiencing the same issue with my application but took a slightly different approach to a solution. My regex has the same issue that the accepted answer describes (special characters being escaped in character classes when they didn't need to be), but the regex I'm dealing with comes from an external source, so I could not modify it. This kind of regex is usually fine in most languages (it passes validation in PHP), but as we have found out, it breaks with HTML5.
My simple solution: URL-encode the regex before applying it to the input's pattern attribute. That seems to satisfy the HTML5 engine, and it works as expected. JavaScript's encodeURIComponent is a good fit.

regular expression exclude match that contains a string pattern

I'm trying to narrow down my RegEx to ignore form elements with type="submit". I only want to select the portion of elements up to the part class="*" but still ignore if type="submit" comes before or after the class.
My regular expression thus far:
(<(?:input|select|textarea){1}.*[^type="submit"]class=")(((?!form\-control)[a-zA-Z0-9_ -])*")
Test case:
Line one should match up to the end of class, and line 2 ignored.
<input type="text" name="name" id="test" class="example-class" max-length="7" required="required">
<input type="submit" class="btn-primary" value="send">
Is this achievable?
Thanks for your comments. The answer was a negative look ahead.
Adding (?!.*type="submit.*) to the start of the regex appears to have given me my desired result.
Working Regex:
(?!.*type="submit.*)(<(?:input|select|textarea).*class=")(((?!form\-control)[a-zA-Z0-9_ -])*")
(<(?:input|select|textarea)\s((?!type="submit")[\w\-]+\b="[^"]*"\s?)*>)
This expression is bound to the single tag.
It is better to avoid expressions like .* since they can run on and match a string that begins inside one tag and ends inside another.
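For comparison, here is a quick sketch that runs both expressions above against the two test lines from the question, using JavaScript regex literals (without the u flag, so the \- escape is accepted). The variable names are made up for the illustration.

```javascript
// Lookahead version: matches from the tag start up to the end of the class value.
const upToClass = /(?!.*type="submit.*)(<(?:input|select|textarea).*class=")(((?!form\-control)[a-zA-Z0-9_ -])*")/;
// Attribute-by-attribute version: matches the whole tag and never leaves it.
const wholeTag = /(<(?:input|select|textarea)\s((?!type="submit")[\w\-]+\b="[^"]*"\s?)*>)/;

const line1 = '<input type="text" name="name" id="test" class="example-class" max-length="7" required="required">';
const line2 = '<input type="submit" class="btn-primary" value="send">';

const m1 = upToClass.exec(line1);
console.log(m1[0]); // <input type="text" name="name" id="test" class="example-class"
console.log(upToClass.exec(line2)); // null

const w1 = wholeTag.exec(line1);
console.log(w1[0] === line1);      // true: the entire tag is matched
console.log(wholeTag.exec(line2)); // null
```

Note that the two are not interchangeable: the first stops after the class attribute, while the second consumes the tag through the closing >.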

regex issue for interpolate setting in underscore.js

I have the following regex settings and template for underscore.js.
The problem is that the regex for interpolate is not working properly. What would the correct regex be?
Regex
var settings = {
    evaluate: /\{\{\#([\s\S]+?)\}\}/g,
    interpolate: /\{\{[a-zA-Z](.+?)\}\}/g
};
Template
{{# if ( item ) { }}
{{ item.title }}
{{# } }}
The template compiler will use the last capture group in the expressions to build the JavaScript form of the template. In your case, the interpolate will (as noted by Jerry) ignore the first alphabetic character so {{ab}} will end up looking for b in the template parameters. Your regex also doesn't account for leading whitespace. You'd want something more like this:
/\{\{\s*([a-zA-Z](?:.+?))\s*\}\}/g
or better, just leave out the original group instead of converting it to a non-capturing group:
/\{\{\s*([a-zA-Z].+?)\s*\}\}/g
Or even better, get right to the heart of the problem and say exactly what you mean:
/\{\{([^#].*?)\}\}/g
I think the problem that led you to your original regex is that interpolate is checked before evaluate; that order makes sense when interpolate is looking for <%= ... %> and evaluate is looking for <% ... %> (i.e. the default delimiters are being used). In your case, you need a bit of extra [^#] trickery to get around the regex checking order.
Similarly, we can simplify your evaluate regex:
/\{\{#(.+?)\}\}/g
Demo: http://jsfiddle.net/ambiguous/V6rv2/
I'd also recommend that you add an escape pattern for completeness.
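You can see what each expression captures without involving Underscore at all. The sketch below just runs the regexes and prints the last capture group, which is the part the template compiler would use; lastGroup is a hypothetical helper written for this illustration.

```javascript
const originalInterpolate = /\{\{[a-zA-Z](.+?)\}\}/;
const wholeNameInterpolate = /\{\{\s*([a-zA-Z].+?)\s*\}\}/;
const notHashInterpolate = /\{\{([^#].*?)\}\}/;
const simplifiedEvaluate = /\{\{#(.+?)\}\}/;

// Hypothetical helper: returns the last capture group, or null if there is no match.
function lastGroup(regex, text) {
  const m = regex.exec(text);
  return m ? m[m.length - 1] : null;
}

console.log(lastGroup(originalInterpolate, "{{ab}}"));             // "b": first letter lost
console.log(lastGroup(originalInterpolate, "{{ item.title }}"));   // null: leading space
console.log(lastGroup(wholeNameInterpolate, "{{ item.title }}"));  // "item.title"
console.log(lastGroup(notHashInterpolate, "{{ item.title }}"));    // " item.title "
console.log(lastGroup(notHashInterpolate, "{{# } }}"));            // null: left for evaluate
console.log(lastGroup(simplifiedEvaluate, "{{# if ( item ) { }}")); // " if ( item ) { "
```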

How to write a PCRE regular expression to find a string with different content?

This may be a simple one but I can't seem to find the answer to it.
I have a string in a HTML file that I am looking for:
<div class="button" onclick="document.$name.submit(); return false\">Save</div>
where $name is generated by code, so it can be anything.
I need to write a PCRE regular expression that will find this string in the file but disregard the $name section of the string.
I have tried this :
/<div class=\"button\" document.(.+?).submit\(\); return false\">Save<\/div>/
It returns the group containing whatever is in $name, but does not report the whole string as a match, which is what I need.
The following should work:
/<div class="button" onclick="document\.(.+?)\.submit\(\); return false">Save<\/div>/
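Most likely your problem was that the parentheses after submit() were not escaped, so the regex tried to match submit;. Note that the attempt shown in the question also omits the onclick=" part and leaves the dots unescaped.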
Try this
#<div class="button" onclick="document.(.*?).submit\(\); return false">Save</div>#
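To illustrate the point about the whole string versus the group, here is a sketch in JavaScript rather than PHP. The form name myForm and the surrounding markup are invented for the example. Index 0 of the result holds the entire matched string and index 1 holds the $name part; preg_match() fills its matches array the same way.

```javascript
// Same pattern as the first answer above, as a JavaScript regex literal.
const saveButton = /<div class="button" onclick="document\.(.+?)\.submit\(\); return false">Save<\/div>/;

// Made-up input: the button surrounded by other markup.
const html = '<p>before</p><div class="button" onclick="document.myForm.submit(); return false">Save</div><p>after</p>';

const buttonMatch = saveButton.exec(html);
console.log(buttonMatch[0]); // the whole <div ...>Save</div> string, whatever the name is
console.log(buttonMatch[1]); // myForm
```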