Regex Match whole word string in coldfusion - regex

I'm trying this example.
First example:
keyword = "star";
myString = "The dog sniffed at the star fish and growled";
regEx = "\b"& keyword &"\b";
if (reFindNoCase(regEx, myString)) {
writeOutput("found it");
} else {
writeOutput("did not find it");
}
Example output -> found it
second example
keyword = "star";
myString = "The dog sniffed at the .star fish and growled";
regEx = "\b"& keyword &"\b";
if (reFindNoCase(regEx, myString)) {
writeOutput("found it");
} else {
writeOutput("did not find it");
}
output -> found it
But I want to match only the whole word. Punctuation is the issue for me: how can I write the regex so that the second example outputs "did not find it"?

ColdFusion does not support lookbehind, so you cannot use a real zero-width boundary check before the keyword. Instead, you can use a group (and, fortunately, a lookahead):
regEx = "(^|\W)"& keyword &"(?=\W|$)";
Here, (^|\W) matches either the start of the string or a non-word character, and (?=\W|$) makes sure the keyword is followed by either a non-word character (\W) or the end of the string ($).
However, make sure you escape your keyword before passing it to the regex. ColdFusion 10 provides reEscape() to prepare string literals for the native RE methods.
Another way is to match spaces or start/end of string:
<cfset regEx = "(^|\s)" & TABLE_NAME & "($|\s)">
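For comparison, here is a minimal Python sketch of the two patterns above. It is not ColdFusion: contains_word is a hypothetical helper, and re.escape stands in for reEscape(). It suggests that the \W version still accepts ".star", because "." is itself a non-word character, while the whitespace version rejects it.

```python
import re

def contains_word(keyword, text, boundary=r"\W"):
    """Return True if keyword occurs delimited by `boundary` or the string edges."""
    # re.escape plays the role that reEscape() plays in ColdFusion.
    pattern = "(^|" + boundary + ")" + re.escape(keyword) + "(?=" + boundary + "|$)"
    return re.search(pattern, text, re.IGNORECASE) is not None

plain = "The dog sniffed at the star fish and growled"
dotted = "The dog sniffed at the .star fish and growled"

print(contains_word("star", plain))          # non-word boundary, first example
print(contains_word("star", dotted))         # non-word boundary, second example
print(contains_word("star", dotted, r"\s"))  # whitespace-only boundary, second example
```

With these inputs the three calls print True, True and False, so only the whitespace-delimited pattern gives the "did not find it" result asked for in the second example.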

Related

Groovy regex with hyphen isn't matching

I want to write a regex that will match any time the substring "my-app" is encountered inside any given string.
I have the following Groovy code:
String regex = ".*my-app*"
String str = getStringFromUserInput()
if(str.matches(regex)) {
println "Match!"
} else {
println "Doesn't match..."
}
When getStringFromUserInput() returns a string like "blahmy-appfizz", the code above still reports Doesn't match.... So I figured that hyphens must be a special character in regexes and tried changing the regex to:
String regex = ".*my--app*"
But still nothing has changed. Any ideas as to where I'm going wrong?
The hyphen is not a special character here (it only has special meaning inside a character class).
matches validates the entire input. Try:
String regex = ".*my-app.*"
Note that p* matches zero or more p's and p.* matches a p followed by zero or more chars (other than line breaks).
This assumes getStringFromUserInput() does not leave a line-break character in the input. If it does, you'd need to call trim() to get rid of it, since .* does not match line-break characters.
String.contains seems like a simpler solution than a regex, e.g.
String stringFromUser = 'my-app'
assert 'foomy-appfoo'.contains(stringFromUser)
assert !'foo'.contains(stringFromUser)
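The difference between the two patterns can be reproduced in Python, where re.fullmatch behaves like matches in that it must consume the whole input. This is only an illustrative sketch; contains_my_app is a hypothetical name, not part of the original code.

```python
import re

def contains_my_app(text):
    # fullmatch must consume the entire string, like String.matches in Groovy/Java.
    return re.fullmatch(r".*my-app.*", text) is not None

sample = "blahmy-appfizz"

# Original pattern: "p*" only allows more p's after "my-ap", so "fizz" is left over.
print(re.fullmatch(r".*my-app*", sample) is not None)

# Corrected pattern: ".*" after "my-app" absorbs the rest of the line.
print(contains_my_app(sample))

# A trailing line break defeats ".*" until the input is stripped.
print(contains_my_app(sample + "\n"))
print(contains_my_app((sample + "\n").strip()))

# Plain substring test, the counterpart of String.contains.
print("my-app" in sample)
```

The calls print False, True, False, True and True in that order.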

Using regex to match certain text

I have been looking for this answer for a while with no luck (sorry if I couldn't describe it well). I am still a newbie with regex. I am trying to match a string that contains only numbers and a certain delimiter. For example, the pattern would be 8/16/32/64/...: the numbers are separated by '/', and there can be an arbitrary amount of them. I couldn't find a way to match them.
My attempt was \d+/\d+? but I couldn't get it to work.
You could remove the '/' delimiter and then test for the existence of a number
Here is some C# as an example:
static void Main(string[] args)
{
string text = "8/16/32/64/";
Console.WriteLine(text);
TestForNum(text);
text = "8/16/32/64/b";
Console.WriteLine(text);
TestForNum(text);
Console.ReadKey();
}
private static void TestForNum(string text)
{
string tmp = Regex.Replace(text, @"/", "");
Match m = Regex.Match(tmp, @"^\d+$");
if(m.Success)
{
Console.WriteLine("\t" + m.Groups[0]);
}
else Console.WriteLine("\tno match");
}
A naive approach would be
[\d/]+
However, this does match //// as well as just 12345. To match only "proper" strings:
\d+(/\d+)+
Reads digits followed by delimiter+digits repeated at least once. If trailing/leading delimiters are allowed, then
/?(\d+/)+\d*
If you're using a flavor that uses slashes to quote the regex (like javascript), you'll need to escape them:
/\d+(\/\d+)+/
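A small Python sketch of the strict and lenient patterns above, applied to the whole string with fullmatch. The helper names is_number_list and split_numbers are made up for illustration.

```python
import re

STRICT = re.compile(r"\d+(/\d+)+")      # digits, then at least one "/digits"
LENIENT = re.compile(r"/?(\d+/)+\d*")   # also allows a leading or trailing "/"

def is_number_list(text, allow_edge_delimiters=False):
    """Check that the whole string is a '/'-delimited list of numbers."""
    pattern = LENIENT if allow_edge_delimiters else STRICT
    return pattern.fullmatch(text) is not None

def split_numbers(text):
    """Pull the individual numbers out of an already validated string."""
    return [int(n) for n in re.findall(r"\d+", text)]

for candidate in ["8/16/32/64", "8/16/32/64/", "////", "12345", "8/16/32/64/b"]:
    print(candidate, is_number_list(candidate), is_number_list(candidate, True))
```

Only "8/16/32/64" passes the strict check. The lenient check additionally accepts "8/16/32/64/", and both reject "////", "12345" and "8/16/32/64/b".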
You can do:
(\d+)(\D|$)
That will match each run of digits delimited by any non-digit, so 1?2!3.4 would match.
If you want a specific delimiter, such as /:
(\d+)(?:/|$)
As simple as possible:
(\d+\/?)+
Every run of digits optionally followed by a slash, as many times as possible. You may use the g flag to get all matches.

Dart: RegExp by example

I'm trying to get my Dart web app to: (1) determine if a particular string matches a given regex, and (2) if it does, extract a group/segment out of the string.
Specifically, I want to make sure that a given string is of the following form:
http://myapp.example.com/#<string-of-1-or-more-chars>[?param1=1&param2=2]
Where <string-of-1-or-more-chars> is just that: any string of 1+ chars, and where the query string ([?param1=1&param2=2]) is optional.
So:
Decide if the string matches the regex; and if so
Extract the <string-of-1-or-more-chars> group/segment out of the string
Here's my best attempt:
String testURL = "http://myapp.example.com/#fizz?a=1";
String regex = "^http://myapp.example.com/#.+(\?)+\$";
RegExp regexp= new RegExp(regex);
Iterable<Match> matches = regexp.allMatches(regex);
String viewName = null;
if(matches.length == 0) {
// testURL didn't match regex; throw error.
} else {
// It matched, now extract "fizz" from testURL...
viewName = ??? // (ex: matches.group(2)), etc.
}
In the above code, I know I'm using the RegExp API incorrectly (I'm not even using testURL anywhere), and on top of that, I have no clue how to use the RegExp API to extract (in this case) the "fizz" segment/group out of the URL.
The RegExp class comes with a convenience method for a single match:
RegExp regExp = new RegExp(r"^http://myapp.example.com/#([^?]+)");
var match = regExp.firstMatch("http://myapp.example.com/#fizz?a=1");
print(match[1]);
Note: I used anubhava's regular expression (yours was not escaping the ? correctly).
Note 2: even though it's not necessary here, it is usually a good idea to use raw strings for regular expressions, since you don't need to escape $ and \ in them. Sometimes triple-quoted raw strings are convenient too: new RegExp(r"""some'weird"regexp\$""").
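The same match-then-extract idea, sketched in Python rather than Dart for illustration. extract_view_name is a hypothetical helper, and the dots in the host name are escaped here, which the pattern above does not do.

```python
import re

# Group 1 captures everything after "#" up to, but not including, the first "?".
VIEW_URL = re.compile(r"^http://myapp\.example\.com/#([^?]+)")

def extract_view_name(url):
    """Return the segment after '#', or None if the URL has the wrong shape."""
    match = VIEW_URL.match(url)
    return match.group(1) if match else None

print(extract_view_name("http://myapp.example.com/#fizz?a=1"))
print(extract_view_name("http://myapp.example.com/#fizz"))
print(extract_view_name("http://other.example.com/#fizz"))
```

The first two calls print fizz and the third prints None. Returning None covers the "didn't match, throw an error" branch of the question.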
Try this regex:
String regex = "^http://myapp.example.com/#([^?]+)";
And then grab: matches.group(1)
String regex = "^http://myapp.example.com/#([^?]+)";
Then:
var match = matches.elementAt(0);
print("${match.group(1)}"); // output : fizz

Using Regex is there a way to match outside characters in a string and exclude the inside characters?

I know I can exclude outside characters in a string using look-ahead and look-behind, but I'm not sure about characters in the center.
What I want is to get a match of ABCDEF from the string ABC 123 DEF.
Is this possible with a Regex string? If not, can it be accomplished another way?
EDIT
For more clarification, in the example above I can use the regex string /ABC.*?DEF/ to sort of get what I want, but this includes everything matched by .*?. What I want is to match with something like ABC(match whatever, but then throw it out)DEF resulting in one single match of ABCDEF.
As another example, I can do the following (in pseudo-code and regex):
string myStr = "ABC 123 DEF";
string tempMatch = RegexMatch(myStr, "(?<=ABC).*?(?=DEF)"); //Returns " 123 "
string FinalString = myStr.Replace(tempMatch, ""); //Returns "ABCDEF". This is what I want
Again, is there a way to do this with a single regex string?
Since the regex replace feature in most languages does not change the string it operates on (but produces a new one), you can do it as a one-liner in most languages. Firstly, you match everything, capturing the desired parts:
^.*(ABC).*(DEF).*$
(Make sure to use the single-line/"dotall" option if your input contains line breaks!)
And then you replace this with:
$1$2
That will give you ABCDEF in one assignment.
Still, as outlined in the comments and in Mark's answer, the engine does match the stuff in between ABC and DEF. It's only the replacement convenience function that throws it out. But that is supported in pretty much every language, I would say.
Important: this approach will of course only work if your input string contains the desired pattern only once (assuming ABC and DEF are actually variable).
Example implementation in PHP:
$output = preg_replace('/^.*(ABC).*(DEF).*$/s', '$1$2', $input);
Or JavaScript (which does not have single-line mode):
var output = input.replace(/^[\s\S]*(ABC)[\s\S]*(DEF)[\s\S]*$/, '$1$2');
Or C#:
string output = Regex.Replace(input, @"^.*(ABC).*(DEF).*$", "$1$2", RegexOptions.Singleline);
A regular expression can contain multiple capturing groups. Each group must consist of consecutive characters so it's not possible to have a single group that captures what you want, but the groups themselves do not have to be contiguous so you can combine multiple groups to get your desired result.
Regular expression
(ABC).*(DEF)
Captures
ABC
DEF
See it online: rubular
Example C# code
string myStr = "ABC 123 DEF";
Match m = Regex.Match(myStr, "(ABC).*(DEF)");
if (m.Success)
{
string result = m.Groups[1].Value + m.Groups[2].Value; // Gives "ABCDEF"
// ...
}
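Both approaches, replacing with the two captured groups and concatenating the groups of a single match, can be sketched in Python as follows. keep_outer and join_groups are hypothetical names, and the delimiters are escaped on the assumption that ABC and DEF may be arbitrary literal strings.

```python
import re

def keep_outer(text, left="ABC", right="DEF"):
    """Replace the whole input with just the two captured delimiters."""
    pattern = "^.*(" + re.escape(left) + ").*(" + re.escape(right) + ").*$"
    # DOTALL lets "." cross line breaks (the single-line option mentioned above).
    return re.sub(pattern, r"\1\2", text, flags=re.DOTALL)

def join_groups(text, left="ABC", right="DEF"):
    """Match once and concatenate group 1 and group 2; None if there is no match."""
    m = re.search("(" + re.escape(left) + ").*(" + re.escape(right) + ")", text, flags=re.DOTALL)
    return m.group(1) + m.group(2) if m else None

print(keep_outer("ABC 123 DEF"))
print(join_groups("xx ABC 123\n456 DEF yy"))
```

Both calls print ABCDEF. When the pattern is absent, keep_outer returns the input unchanged, while join_groups returns None.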

Regex to find substring between two strings

I'd like to capture the value of the Initial Catalog in this string:
"blah blah Initial Catalog = MyCat'"
I'd like the result to be: MyCat
There could or could not be spaces before and after the equal sign and there could or could not be spaces before the single quote.
Tried this and various others but no go:
/Initial Catalog\s?=\s?.*\s?\'/
Using .Net.
You need to put parentheses around the part of the string that you would like to match:
/Initial Catalog\s*=\s*(.*?)\s*'/
Also you would like to exclude as many spaces as possible before the ', so you need \s* rather than \s?. The .*? means that the extracted part of the string doesn't take those spaces, since it is now lazy.
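A quick way to check the pattern is this Python sketch (shown in Python rather than .NET, with initial_catalog as a made-up helper name), which reads the value from group 1.

```python
import re

# Lazy (.*?) stops as early as possible, so the following \s* takes any spaces before the quote.
CATALOG = re.compile(r"Initial Catalog\s*=\s*(.*?)\s*'")

def initial_catalog(text):
    """Return the catalog name captured in group 1, or None if absent."""
    m = CATALOG.search(text)
    return m.group(1) if m else None

print(initial_catalog("blah blah Initial Catalog = MyCat'"))
print(initial_catalog("Initial Catalog=MyCat   '"))
```

Both calls print MyCat, with or without spaces around the equal sign or before the quote.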
This is a nice regex
= *(.*?) *'
Use the idea and add \s and more literal text as needed.
In C# group 1 will contain the match
string resultString = null;
try {
Regex regexObj = new Regex("= *(.*?) *'");
resultString = regexObj.Match(subjectString).Groups[1].Value;
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
Regex rgx = new Regex(@"=\s*([A-Za-z]+)\s*'");
String result = rgx.Match(text).Groups[1].Value;