Regex validation of a ColdFusion property doesn't work with commas (,) - regex

I have this little component in ColdFusion 9:
component
displayname = "My Component"
accessors = "true"
{
property
name = "myProperty"
type = "string"
validate = "regex"
validateparams = "{ pattern = '(Eats)|(Shoots)|(Leaves)' }";
}
which works as expected:
<cfscript>
myComponentInstance = new myComponent();
myComponentInstance.setMyProperty('Eats');
// Property is correctly set
myComponentInstance.setMyProperty('Shoots');
// Property is correctly set
myComponentInstance.setMyProperty('Drinks');
// Error: The value does not match the regular expression pattern provided.
</cfscript>
But if I modify the validation regex to allow a value with a comma (,) in it
validateparams = "{ pattern = '(Eats)|(Shoots)|(Leaves)|(Eats, Shoots & Leaves)' }"
then I get an error on the instance creation
<cfscript>
myComponentInstance = new myComponent();
/* Error while parsing the validateparam
'{ pattern = '(Eats)|(Shoots)|(Leaves)|(Eats, Shoots & Leaves)' }'
for property myProperty */
</cfscript>
It seems like ColdFusion can't process a regular expression with a comma, nor have I found a way of escaping it.
If I try to use a backslash (\) as a regex escape character, ColdFusion turns it into a forward slash (/):
validateparams = "{ pattern = '(Eats)|(Shoots)|(Leaves)|(Eats\, Shoots & Leaves)' }"
<cfscript>
myComponentInstance = new myComponent();
/* Error while parsing the validateparam
'{ pattern = '(Eats)|(Shoots)|(Leaves)|(Eats/, Shoots & Leaves)' }'
for property myProperty */
</cfscript>
Other forms of escaping that I have tried, but to no avail, are:
validateparams = "{ pattern = '(Eats)|(Shoots)|(Leaves)|(Eats#chr(44)# Shoots & Leaves)' }"
validateparams = "{ pattern = '(Eats)|(Shoots)|(Leaves)|(Eats,, Shoots & Leaves)' }"

It's a bug in ColdFusion. Raise it as such: https://bugbase.adobe.com/. I can replicate it in CF 9.0.1. I'm working on a work-around... will get back to you if I come up with something.
NB: one can pare the repro validateparams string down to this: {pattern = ","}. I'm guessing Adobe are using the comma as a delim, and it never occurred to them it might be data (they're a bit like that with delimited strings).

Related

Jest cell name won't recognise regex [duplicate]

I want to add a (variable) tag to values with regex. The pattern works fine with PHP, but I have trouble implementing it in JavaScript.
The pattern is (value is the variable):
/(?!(?:[^<]+>|[^>]+<\/a>))\b(value)\b/is
I escaped the backslashes:
var str = $("#div").html();
var regex = "/(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b/is";
$("#div").html(str.replace(regex, "" + value + ""));
But this doesn't seem to be right. I logged the pattern and it's exactly what it should be.
Any ideas?
To create the regex from a string, you have to use JavaScript's RegExp object.
If you also want to match/replace more than one time, then you must add the g (global match) flag. Here's an example:
var stringToGoIntoTheRegex = "abc";
var regex = new RegExp("#" + stringToGoIntoTheRegex + "#", "g");
// at this point, the line above is the same as: var regex = /#abc#/g;
var input = "Hello this is #abc# some #abc# stuff.";
var output = input.replace(regex, "!!");
alert(output); // Hello this is !! some !! stuff.
In the general case, escape the string before using as regex:
Not every string is a valid regex, though: there are some special characters, like ( or [. To work around this issue, simply escape the string before turning it into a regex. A utility function for that goes in the sample below:
function escapeRegExp(stringToGoIntoTheRegex) {
return stringToGoIntoTheRegex.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}
var stringToGoIntoTheRegex = escapeRegExp("abc"); // this is the only change from above
var regex = new RegExp("#" + stringToGoIntoTheRegex + "#", "g");
// at this point, the line above is the same as: var regex = /#abc#/g;
var input = "Hello this is #abc# some #abc# stuff.";
var output = input.replace(regex, "!!");
alert(output); // Hello this is !! some !! stuff.
Note: the regex in the question uses the s modifier, which JavaScript did not support at the time of the question. Today JavaScript does have an s (dotAll) flag.
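To illustrate the s (dotAll) flag mentioned in the note, here is a minimal sketch. It assumes an ES2018-capable engine (current browsers and Node.js); the sample strings are made up for the illustration.

```javascript
// Without the s flag, "." does not match line terminators.
const withoutDotAll = /a.b/.test("a\nb");
// With the s (dotAll) flag, "." also matches "\n".
const withDotAll = /a.b/s.test("a\nb");
// The flags can also be passed as a string to the RegExp constructor.
const viaConstructor = new RegExp("a.b", "is").test("A\nB");

console.log(withoutDotAll, withDotAll, viaConstructor); // false true true
```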
If you are trying to use a variable value in the expression, you must use the RegExp "constructor".
var regex = "(?!(?:[^<]+>|[^>]+<\/a>))\\b(" + value + ")\\b";
new RegExp(regex, "is")
I found I had to double the backslash in \b to get it working. For example, to remove "1x" words from a string using a variable, I needed to use:
str = "1x";
var regex = new RegExp("\\b"+str+"\\b","g"); // same as inv.replace(/\b1x\b/g, "")
inv=inv.replace(regex, "");
You don't need the " to define a regular expression so just:
var regex = /(?!(?:[^<]+>|[^>]+<\/a>))\b(value)\b/is; // this is valid syntax
If value is a variable and you want a dynamic regular expression then you can't use this notation; use the alternative notation.
String.replace also accepts strings as input, so you can do "fox".replace("fox", "bear");
Alternative:
// No / delimiters inside the string, and every backslash must be doubled
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(value)\\b", "is");
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b", "is");
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(.*?)\\b", "is");
Keep in mind that if value contains regular expression characters like (, [ and ?, you will need to escape them.
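Putting the two points together (escape the value, then build the pattern with doubled backslashes) might look like the sketch below. The helper name tagWord and the bracket marker are hypothetical, and the question's lookahead for skipping HTML tags is left out to keep the example short.

```javascript
// Escape regex metacharacters so the value is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wraps whole-word occurrences of `value` in brackets.
function tagWord(input, value) {
  // "\\b" in a string literal becomes \b (word boundary) in the regex.
  const regex = new RegExp("\\b(" + escapeRegExp(value) + ")\\b", "gi");
  return input.replace(regex, "[$1]");
}

console.log(tagWord("A fox and a Fox, not foxes.", "fox"));
// A [fox] and a [Fox], not foxes.
console.log(tagWord("Version 1.0 is not 1x0.", "1.0"));
// Version [1.0] is not 1x0.
```

Without the escaping step, the dot in "1.0" would match any character, so "1x0" would be tagged as well.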
I found this thread useful - so I thought I would add the answer to my own problem.
I wanted to edit a database configuration file (datastax cassandra) from a node application in javascript and for one of the settings in the file I needed to match on a string and then replace the line following it.
This was my solution.
dse_cassandra_yaml='/etc/dse/cassandra/cassandra.yaml'
// a) find the searchString and grab all text on the following line to it
// b) replace all next line text with a newString supplied to function
// note - leaves searchString text untouched
function replaceStringNextLine(file, searchString, newString) {
fs.readFile(file, 'utf-8', function(err, data){
if (err) throw err;
// need to use double escape '\\' when putting regex in strings !
var re = "\\s+(\\-\\s(.*)?)(?:\\s|$)";
var myRegExp = new RegExp(searchString + re, "g");
var match = myRegExp.exec(data);
var replaceThis = match[1];
var writeString = data.replace(replaceThis, newString);
fs.writeFile(file, writeString, 'utf-8', function (err) {
if (err) throw err;
console.log(file + ' updated');
});
});
}
searchString = "data_file_directories:"
newString = "- /mnt/cassandra/data"
replaceStringNextLine(dse_cassandra_yaml, searchString, newString );
After running, it will change the existing data directory setting to the new one:
config file before:
data_file_directories:
- /var/lib/cassandra/data
config file after:
data_file_directories:
- /mnt/cassandra/data
Much easier way: use template literals.
var variable = 'foo'
var expression = `.*${variable}.*`
var re = new RegExp(expression, 'g')
re.test('fdjklsffoodjkslfd') // true
re.test('fdjklsfdjkslfd') // false
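One caveat with the g flag used above: test() on a global regex is stateful, because each successful call moves lastIndex forward. A small sketch with a made-up input string:

```javascript
const variable = "foo";
const globalRe = new RegExp(`.*${variable}.*`, "g");

// The first call matches the whole string and leaves lastIndex at its end.
const first = globalRe.test("xxfooxx");
// The second call resumes from lastIndex, finds nothing, and resets lastIndex to 0.
const second = globalRe.test("xxfooxx");

// Without the g flag every call starts from the beginning of the input.
const plainRe = new RegExp(`.*${variable}.*`);
const third = plainRe.test("xxfooxx");
const fourth = plainRe.test("xxfooxx");

console.log(first, second, third, fourth); // true false true true
```

If the regex is only used for yes/no checks, dropping the g flag avoids this surprise.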
Using string variable(s) content as part of a more complex composed regex expression (es6|ts)
This example will replace all urls using my-domain.com to my-other-domain (both are variables).
You can build dynamic regexes by combining string values and other regex fragments within a raw string template. Using String.raw prevents JavaScript from interpreting escape sequences in the template, so the backslashes reach the regex engine unchanged.
// Strings with some data
const domainStr = 'my-domain.com'
const newDomain = 'my-other-domain.com'
// Make sure your string is regex friendly
// This replaces each '.' with '\.'
const regexUrl = /\./gm;
const substr = `\\\.`;
const domain = domainStr.replace(regexUrl, substr);
// domain is a regex friendly string: 'my-domain\.com'
console.log('Regex expression for domain', domain)
// HERE!!! You can assemble a complex regex using string pieces.
const re = new RegExp( String.raw `([\'|\"]https:\/\/)(${domain})(\S+[\'|\"])`, 'gm');
// now I'll use the regex expression groups to replace the domain
const domainSubst = `$1${newDomain}$3`;
// const page contains all the html text
const result = page.replace(re, domainSubst);
note: Don't forget to use regex101.com to create, test and export REGEX code.
var string = "Hi welcome to stack overflow"
var toSearch = "stack"
// case-insensitive search
// search() returns -1 when nothing matches; index 0 is a valid match
var result = string.search(new RegExp(toSearch, "i")) >= 0 ? 'Matched' : 'notMatched'
https://jsfiddle.net/9f0mb6Lz/
Hope this helps

Search for multiple string occurrence in String via PHP

I am working on an e-commerce website using MVC and PHP. I have a field called description. The user can enter multiple product IDs in the description field.
For example {productID = 34}, {productID = 58}
I am trying to get all product IDs from this field. Just the product ID.
How do I go about this?
This solution doesn't use a capture group. Rather, it uses \K so that the full string elements become what would otherwise be captured using parentheses. This is a good practice because it reduces the array element count by 50%.
$description="{productID = 34}, {productID = 58}";
if(preg_match_all('/productID = \K\d+/',$description,$ids)){
var_export($ids[0]);
}
// output: array(0=>'34',1=>'58')
// \K in the regex means: keep text from this point
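For comparison, JavaScript has no \K, but a lookbehind gives a similar capture-free result. This is only a sketch for readers working on the client side, assuming an engine with lookbehind support (ES2018); it is not part of the PHP answer.

```javascript
const description = "{productID = 34}, {productID = 58}";
// (?<=productID = ) asserts the prefix without making it part of the match.
const ids = description.match(/(?<=productID = )\d+/g) || [];

console.log(ids); // [ '34', '58' ]
```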
Without using regex, something like this should work for returning the string positions:
$html = "dddasdfdddasdffff";
$needle = "asdf";
$lastPos = 0;
$positions = array();
while (($lastPos = strpos($html, $needle, $lastPos)) !== false) {
    $positions[] = $lastPos;
    $lastPos = $lastPos + strlen($needle);
}
// Displays 3 and 10
foreach ($positions as $value) {
    echo $value . "<br />";
}

RegExp and special characters

I need to use regexp for matching and the code below works fine. However, I need to KEEP the dollar sign ($) as a true dollar sign and not a special character.
I've tried excluding but nothing is working.
IE: [^$]
Here's the code. It works as expected except when the text contains a $ or IS the $.
textNode = "$19,000";
regex = RegExp("$19,000",'ig');
text = '$';
textReplacerFunc: function (textNode, regex, text) {
var sTag = '<span class="highlight">';
var eTag = '</span>';
var re = '(?![^<>]*>)(' + text + '(?!#8212;))';
var regExp = new RegExp(re, 'ig');
textNode.data = textNode.data.replace(regExp, sTag + '$1' + eTag);
},
RESULT: $ not highlighted. desired results:
$19,000
Make sure to double-escape the $, as in:
text = '\\$';
This is needed because you are constructing the RegExp instance from a string.
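A short sketch of why the doubled backslash matters. The sample sentence and the <b> tags are made up for the illustration; the highlight function from the question is not reproduced here.

```javascript
// In a string literal "\\$" is the two characters \$, which the regex
// engine reads as a literal dollar sign.
const text = "\\$";
const escaped = new RegExp("(" + text + ")", "ig");
const highlighted = "Price: $19,000".replace(escaped, "<b>$1</b>");
console.log(highlighted); // Price: <b>$</b>19,000

// An unescaped "$" is an end-of-input anchor, so it matches the empty
// string at the end instead of the dollar sign.
const anchored = new RegExp("($)", "ig");
const unescaped = "Price: $19,000".replace(anchored, "<b>$1</b>");
console.log(unescaped); // Price: $19,000<b></b>
```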

String parsing with RegExp in Actionscript

I have a string that is similar to a path, but I have tried some regex patterns that are supposed to parse paths and they don't quite work.
Here's the string
f|MyApparel/Templates/Events/
I need the "name parts" between the slashes.
I tried (\w+) but the array came back [0] = "f" and [1] = "f".
I tested the pattern on http://www.gskinner.com/RegExr/ and it seems to work correctly.
Here's the AS code:
var pattern : RegExp = /(\w+)/g;
var hierarchy : Array = pattern.exec(params.category_id);
params.name = hierarchy.pop() as String;
pattern.exec() works like in JavaScript. For a global regex, it updates the lastIndex property every time it finds a match, and the next call starts from there.
So it does not return an array of all matches, but only the very next match in the string. Hence you must run it in a loop until it returns null:
var myPattern:RegExp = /(\w+)/g;
var str:String = "f|MyApparel/Templates/Events/";
var result:Object = myPattern.exec(str);
while (result != null) {
trace( result.index, "\t", result);
result = myPattern.exec(str);
}
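Since the answer compares the behaviour to JavaScript, here is the same loop as a JavaScript sketch that collects the name parts into an array. The variable names are illustrative only.

```javascript
const pattern = /(\w+)/g;
const str = "f|MyApparel/Templates/Events/";
const parts = [];
let result;
// Each call resumes at pattern.lastIndex; exec returns null once no match remains.
while ((result = pattern.exec(str)) !== null) {
  parts.push(result[1]);
}

console.log(parts); // [ 'f', 'MyApparel', 'Templates', 'Events' ]
```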
I don't know between which two slashes you want but try
var hierarchy : Array = params.category_id.split(/[\/|]/);
[\/|] means a slash or a vertical bar.

Regular expression to remove one parameter from query string

I'm looking for a regular expression to remove a single parameter from a query string, and I want to do it in a single regular expression if possible.
Say I want to remove the foo parameter. Right now I use this:
/&?foo\=[^&]+/
That works as long as foo is not the first parameter in the query string. If it is, then my new query string starts with an ampersand. (For example, "foo=123&bar=456" gives a result of "&bar=456".) Right now, I'm just checking after the regex if the query string starts with ampersand, and chopping it off if it does.
Example edge cases:
Input | Expected Output
-------------------------+--------------------
foo=123 | (empty string)
foo=123&bar=456 | bar=456
bar=456&foo=123 | bar=456
abc=789&foo=123&bar=456 | abc=789&bar=456
Edit
OK, as pointed out in comments, there are way more edge cases than I originally considered. I got the following regex to work with all of them:
/&foo(\=[^&]*)?(?=&|$)|^foo(\=[^&]*)?(&|$)/
This is modified from Mark Byers's answer, which is why I'm accepting that one, but Roger Pate's input helped a lot too.
Here is the full suite of test cases I'm using, and a Javascript snippet which tests them:
$(function() {
var regex = /&foo(\=[^&]*)?(?=&|$)|^foo(\=[^&]*)?(&|$)/;
var escapeHtml = function (str) {
var map = {
'&': '&amp;',
'<': '&lt;',
'>': '&gt;',
'"': '&quot;',
"'": '&#039;'
};
return str.replace(/[&<>"']/g, function(m) { return map[m]; });
};
//test cases
var tests = [
'foo' , 'foo&bar=456' , 'bar=456&foo' , 'abc=789&foo&bar=456'
,'foo=' , 'foo=&bar=456' , 'bar=456&foo=' , 'abc=789&foo=&bar=456'
,'foo=123' , 'foo=123&bar=456' , 'bar=456&foo=123' , 'abc=789&foo=123&bar=456'
,'xfoo' , 'xfoo&bar=456' , 'bar=456&xfoo' , 'abc=789&xfoo&bar=456'
,'xfoo=' , 'xfoo=&bar=456' , 'bar=456&xfoo=' , 'abc=789&xfoo=&bar=456'
,'xfoo=123', 'xfoo=123&bar=456', 'bar=456&xfoo=123', 'abc=789&xfoo=123&bar=456'
,'foox' , 'foox&bar=456' , 'bar=456&foox' , 'abc=789&foox&bar=456'
,'foox=' , 'foox=&bar=456' , 'bar=456&foox=' , 'abc=789&foox=&bar=456'
,'foox=123', 'foox=123&bar=456', 'bar=456&foox=123', 'abc=789&foox=123&bar=456'
];
//expected results
var expected = [
'' , 'bar=456' , 'bar=456' , 'abc=789&bar=456'
,'' , 'bar=456' , 'bar=456' , 'abc=789&bar=456'
,'' , 'bar=456' , 'bar=456' , 'abc=789&bar=456'
,'xfoo' , 'xfoo&bar=456' , 'bar=456&xfoo' , 'abc=789&xfoo&bar=456'
,'xfoo=' , 'xfoo=&bar=456' , 'bar=456&xfoo=' , 'abc=789&xfoo=&bar=456'
,'xfoo=123', 'xfoo=123&bar=456', 'bar=456&xfoo=123', 'abc=789&xfoo=123&bar=456'
,'foox' , 'foox&bar=456' , 'bar=456&foox' , 'abc=789&foox&bar=456'
,'foox=' , 'foox=&bar=456' , 'bar=456&foox=' , 'abc=789&foox=&bar=456'
,'foox=123', 'foox=123&bar=456', 'bar=456&foox=123', 'abc=789&foox=123&bar=456'
];
for(var i = 0; i < tests.length; i++) {
var output = tests[i].replace(regex, '');
var success = (output == expected[i]);
$('#output').append(
'<tr class="' + (success ? 'passed' : 'failed') + '">'
+ '<td>' + (success ? 'PASS' : 'FAIL') + '</td>'
+ '<td>' + escapeHtml(tests[i]) + '</td>'
+ '<td>' + escapeHtml(output) + '</td>'
+ '<td>' + escapeHtml(expected[i]) + '</td>'
+ '</tr>'
);
}
});
#output {
border-collapse: collapse;
}
#output tr.passed { background-color: #af8; }
#output tr.failed { background-color: #fc8; }
#output td, #output th {
border: 1px solid black;
padding: 2px;
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<table id="output">
<tr>
<th>Succ?</th>
<th>Input</th>
<th>Output</th>
<th>Expected</th>
</tr>
</table>
If you want to do this in just one regular expression, you could do this:
/&foo(=[^&]*)?|^foo(=[^&]*)?&?/
This is because you need to match either an ampersand before the foo=..., or one after, or neither, but not both.
To be honest, I think it's better the way you did it: removing the trailing ampersand in a separate step.
/(?<=&|\?)foo(=[^&]*)?(&|$)/
Uses lookbehind and the last group to "anchor" the match, and allows a missing value. Change the \? to ^ if you've already stripped off the question mark from the query string.
Regex is still not a substitute for a real parser of the query string, however.
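As a sketch of the parser-based alternative in JavaScript: URLSearchParams is a standard global in browsers and Node.js, while the helper name removeParam is hypothetical.

```javascript
// Hypothetical helper built on the standard URLSearchParams API.
function removeParam(queryString, name) {
  const params = new URLSearchParams(queryString);
  params.delete(name); // removes every occurrence of the parameter
  return params.toString();
}

console.log(removeParam("foo=123&bar=456", "foo"));         // bar=456
console.log(removeParam("abc=789&foo=123&bar=456", "foo")); // abc=789&bar=456
console.log(removeParam("foo=123", "foo"));                 // (empty string)
console.log(removeParam("xfoo=1&foo=2", "foo"));            // xfoo=1
```

Note that toString() re-serializes the remaining parameters, so a bare key such as bar comes back as bar= and spaces are written as +. If the rest of the query string must stay byte-for-byte identical, that normalization may matter.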
Update: Test script: (run it at codepad.org)
import re
regex = r"(^|(?<=&))foo(=[^&]*)?(&|$)"
cases = {
"foo=123": "",
"foo=123&bar=456": "bar=456",
"bar=456&foo=123": "bar=456",
"abc=789&foo=123&bar=456": "abc=789&bar=456",
"oopsfoo=123": "oopsfoo=123",
"oopsfoo=123&bar=456": "oopsfoo=123&bar=456",
"bar=456&oopsfoo=123": "bar=456&oopsfoo=123",
"abc=789&oopsfoo=123&bar=456": "abc=789&oopsfoo=123&bar=456",
"foo": "",
"foo&bar=456": "bar=456",
"bar=456&foo": "bar=456",
"abc=789&foo&bar=456": "abc=789&bar=456",
"foo=": "",
"foo=&bar=456": "bar=456",
"bar=456&foo=": "bar=456",
"abc=789&foo=&bar=456": "abc=789&bar=456",
}
failures = 0
for input, expected in cases.items():
    got = re.sub(regex, "", input)
    if got != expected:
        print("failed: input=%r expected=%r got=%r" % (input, expected, got))
        failures += 1
if not failures:
    print("Success")
It shows where my approach failed, Mark has the right of it—which should show why you shouldn't do this with regex.. :P
The problem is associating the query parameter with exactly one ampersand, and—if you must use regex (if you haven't picked up on it :P, I'd use a separate parser, which might use regex inside it, but still actually understand the format)—one solution would be to make sure there's exactly one ampersand per parameter: replace the leading ? with a &.
This gives /&foo(=[^&]*)?(?=&|$)/, which is very straightforward and the best you're going to get. Remove the leading & in the final result (or change it back into a ?, etc.). Modifying the test script to do this uses the same cases as above, sets regex to the new pattern, and changes the loop to:
failures = 0
for input, expected in cases.items():
    input = "&" + input
    got = re.sub(regex, "", input)
    if got[:1] == "&":
        got = got[1:]
    if got != expected:
        print("failed: input=%r expected=%r got=%r" % (input, expected, got))
        failures += 1
if not failures:
    print("Success")
Having a query string that starts with & is harmless--why not leave it that way? In any case, I suggest that you search for the trailing ampersand and use \b to match the beginning of foo w/o taking in a previous character:
/\bfoo\=[^&]+&?/
It's a bit silly but I started trying to solve this with a regexp and wanted to finally get it working :)
$str[] = 'foo=123';
$str[] = 'foo=123&bar=456';
$str[] = 'bar=456&foo=123';
$str[] = 'abc=789&foo=123&bar=456';
foreach ($str as $string) {
echo preg_replace('#(?:^|\b)(&?)foo=[^&]+(&?)#e', "'$1'=='&' && '$2'=='&' ? '&' : ''", $string), "\n";
}
the replace part is messed up because apparently it gets confused if the captured characters are '&'s
Also, it doesn't match afoo and the like.
Thanks. Yes it uses backslashes for escaping, and you're right, I don't need the /'s.
This seems to work, though it doesn't do it in one line as requested in the original question.
public static string RemoveQueryStringParameter(string url, string keyToRemove)
{
//if first parameter, leave ?, take away trailing &
string pattern = @"\?" + keyToRemove + "[^&]*&?";
url = Regex.Replace(url, pattern, "?");
//if subsequent parameter, take away leading &
pattern = "&" + keyToRemove + "[^&]*";
url = Regex.Replace(url, pattern, "");
return url;
}
Based on your implementation, I wrote a Java version that seems to work:
public static String removeParameterFromQueryString(String queryString,String paramToRemove) {
Preconditions.checkArgument(queryString != null,"Empty querystring");
Preconditions.checkArgument(paramToRemove != null,"Empty param");
String oneParam = "^"+paramToRemove+"(=[^&]*)$";
String begin = "^"+paramToRemove+"(=[^&]*)(&?)";
String end = "&"+paramToRemove+"(=[^&]*)$";
String middle = "(?<=[&])"+paramToRemove+"(=[^&]*)&";
String removedMiddleParams = queryString.replaceAll(middle,"");
String removedBeginParams = removedMiddleParams.replaceAll(begin,"");
String removedEndParams = removedBeginParams.replaceAll(end,"");
return removedEndParams.replaceAll(oneParam,"");
}
I had trouble with your implementation in some cases because it sometimes did not delete a &, so I did it in multiple steps, which seems easier to understand.
I had a problem with your version, particularly when a param was in the query string multiple times (like param1=toto&param2=xxx&param1=YYY&param3=ZZZ&param1....)
It's never too late, right?
I did this using a conditional lookbehind to be sure it doesn't mess up the &s:
/(?(?<=\?)(foo=[^&]+)&*|&(?1))/g
If a ? is behind, we match foo=bar and a trailing & if it exists.
If no ? is behind, we match &foo=bar.
(?1) refers to the 1st capturing group; in this example it's the same as (foo=[^&]+).
Actually, I needed a one-liner for two similar parameters, page and per-page,
so I altered this expression a bit:
/(?(?<=\?)((per-)?page=[^&]+)&*|&(?1))/g
Works like a charm.
You can use the following regex:
[\?|&](?<name>.*?)=[^&]*&?
If you want to do exact match you can replace (?<name>.*?) with a url parameter.
e.g.:
[\?|&]foo=[^&]*&?
to match any variable like foo=xxxx in any URL.
For anyone interested in replacing GET request parameters:
The following regex also works for more general GET queries (starting with ?), where the marked answer fails if the parameter to be removed is the first one (after the ?).
This (JS flavour) regex can be used to remove the parameter regardless of position (first, last, or in between), leaving the query in a well-formed state.
So just use a regex replace with an empty string.
/&s=[^&]*()|\?s=[^&]*$|s=[^&]*&/
Basically it matches one of the three cases mentioned above (hence the 2 pipes)
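A quick sketch that exercises the regex above on each position. The helper name strip and the sample query strings are made up for the illustration.

```javascript
const removeS = /&s=[^&]*()|\?s=[^&]*$|s=[^&]*&/;
// Hypothetical helper: removes the s parameter from a query string.
const strip = (query) => query.replace(removeS, "");

console.log(strip("?s=1&a=2"));     // first:  ?a=2
console.log(strip("?a=2&s=1&b=3")); // middle: ?a=2&b=3
console.log(strip("?a=2&s=1"));     // last:   ?a=2
console.log(strip("?s=1"));         // only:   (empty string)

// Caveat: the third alternative is not anchored to ? or &, so a
// parameter whose name merely ends in "s" is affected as well.
console.log(strip("?as=1&b=2"));    // ?ab=2
```

If parameter names that end in the same letters can occur, anchoring the third alternative (for example with a lookbehind for ?) would be one way to avoid the last case.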