I have a search requirement.
For example, I want to search for the phrase "Microsoft Account" in a large body of text.
In that text, it may be written as
"Microsoft_Account" or "Microsoft-Account".
My search logic should also find those variants.
Is there any way to implement this using a regular expression?
(It can be done by splitting and searching in a loop, but a solution using a regular expression would be great.)
If you just need the regex, it is: a[ _-]b
where a and b are the two parts of the phrase you are searching for. The hyphen must come first or last inside the brackets; written as [ -_] it defines a range from space to underscore, which also matches digits, uppercase letters and most punctuation.
If you need an algorithm:
First split your search phrase (in many languages this is yourString.split(regex)) with the splitter regex [ _-], which accepts the three different separator characters.
In most cases, split returns an array of strings. You then loop over that array to rebuild your regex.
Algorithm
str = your_search_string
tab_string = str.split("[ _-]")
res = ""
foreach part in tab_string
res = res + part + "[ _-]"
endForeach
res = res[0 length-5] //to remove the trailing "[ _-]"
With this little algorithm, your example gives:
str = "Microsoft Account"
tab_string = ["Microsoft", "Account"]
res = ""
forEach
| res = "Microsoft[ _-]"
| res = "Microsoft[ _-]Account[ _-]"
EndforEach
res = "Microsoft[ _-]Account"
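The pseudocode above could be sketched in Python roughly as follows. This is only an illustration: the function name build_pattern and the sample text are made up for the example, and joining the parts with the separator class avoids having to trim a trailing separator afterwards.

```python
import re

def build_pattern(phrase: str) -> re.Pattern:
    # Split the phrase on any of the three accepted separators.
    parts = [p for p in re.split(r"[ _-]", phrase) if p]
    # Escape each part so that characters such as "." are matched literally,
    # then join the parts with the separator character class.
    return re.compile("[ _-]".join(re.escape(p) for p in parts))

pattern = build_pattern("Microsoft Account")
text = "Use a Microsoft_Account or a Microsoft-Account, not a MicrosoftXAccount."
matches = pattern.findall(text)
print(pattern.pattern)  # Microsoft[ _-]Account
print(matches)          # ['Microsoft_Account', 'Microsoft-Account']
```

Placing the hyphen last in the class keeps it literal, so "MicrosoftXAccount" is not matched.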
I need a regex for filtering out a query. For example, I get a query input as below.
state:CA AND country:US OR postalcode:8888
Here, I need to extract terms based on " AND ", " OR " (any case). Can someone please provide the regex with which I can extract terms like "state:CA", "country:US" etc?
I want to consider the spaces before and after the AND, OR as the other terms might contain "and", "or" as part of string.
Eg: state:OR AND country:US
UPDATE:
I have tried something like this
\sAND\s|\sOR\s
With this, I could find the patterns " AND ", " OR ". But, how to make it case-insensitive?
What flavor of regex are you using?
If the value in each of your key/value pairs is always a single word, this would do:
\w+:\w+
Test it here.
Update:
Since your values can consist of more than one word, I think you should split the string into key/value pairs instead of matching them with a regex.
Here's how you could do it in JavaScript:
var s = 'state:New York AND country:US OR postalcode:8888'
// match AND/OR only when surrounded by whitespace; the i flag makes it case-insensitive
var dataBlocks = s.replace(/\s+(?:AND|OR)\s+/gi, '|').split('|')
for(var i = 0; i < dataBlocks.length; i++) dataBlocks[i] = dataBlocks[i].trim()
//your resulting array would look like
//Array [ "state:New York", "country:US", "postalcode:8888" ]
The same solution, in C#:
Regex r = new Regex(@"\s+(?:AND|OR)\s+", RegexOptions.IgnoreCase);
var s = "state:New York AND country:US OR postalcode:8888";
var keyValuePairs = r.Replace(s, "|").Split(new char[] { '|' }).Select(z =>
{
var keyValue = z.Trim().Split(new char[] { ':' });
return new KeyValuePair<string, string>(keyValue.FirstOrDefault(), keyValue.LastOrDefault());
});
foreach (var keyValuePair in keyValuePairs)
Console.WriteLine("Key: {0}\tValue:{1}", keyValuePair.Key, keyValuePair.Value);
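For comparison, here is a small Python sketch of the same idea: split on AND/OR only when they are surrounded by whitespace, ignoring case, and then split each term at its first colon. The names SEPARATOR and parse_query are invented for this example.

```python
import re

# Separator: AND or OR surrounded by whitespace, in any letter case.
SEPARATOR = re.compile(r"\s+(?:AND|OR)\s+", re.IGNORECASE)

def parse_query(query: str) -> list[tuple[str, str]]:
    pairs = []
    for term in SEPARATOR.split(query.strip()):
        # Split at the first colon only, so values may contain spaces.
        key, _, value = term.partition(":")
        pairs.append((key.strip(), value.strip()))
    return pairs

pairs = parse_query("state:OR and country:US Or city:New York")
print(pairs)  # [('state', 'OR'), ('country', 'US'), ('city', 'New York')]
```

Because the separator requires whitespace on both sides, the value "OR" in state:OR is left alone.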
I have a stored procedure (SPROC) that contains multiple instances of a string, say '#TRML_CLOSE'.
I want to append a sequence number to each of them.
Eg:
Search and find string '#TRML_CLOSE'
And
Replace the 1st Instance with '#TRML_CLOSE_1',
Replace the 2nd Instance with '#TRML_CLOSE_2',
Replace the 3rd Instance with '#TRML_CLOSE_3',
and so on.
How do I achieve this in Notepad++ using regular expressions?
I don't know to what extent you can script Notepad++, but I do know you can throw together a quick JavaScript snippet to do what you want. http://jsfiddle.net/x4eSr/
Just go to the JS fiddle, and hit the button.
document.getElementById("btn").onclick = function() {
var elm = document.getElementById("txt");
var val = elm.value;
var cnt = 1;
val = val.replace(/#TRML_CLOSE(?!_)/g, function(m) { // (?!_) skips occurrences that already have a suffix
return m + "_" + cnt++;
});
elm.value = val;
};
This uses JavaScript's string.replace(regex, function(){}), which calls the function on each match, together with a "cnt" variable that is incremented on every call.
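The same callback technique can be sketched in Python, in case a script outside the browser is more convenient. The function name number_occurrences and the sample SQL text are made up for illustration; the negative lookahead (?!_) leaves occurrences that already carry a suffix untouched.

```python
import re
from itertools import count

def number_occurrences(text: str, token: str) -> str:
    counter = count(1)
    # Match the token only when it is not already followed by an underscore.
    pattern = re.compile(re.escape(token) + r"(?!_)")
    # The callback runs once per match and appends the next sequence number.
    return pattern.sub(lambda m: f"{m.group(0)}_{next(counter)}", text)

sql = "SELECT #TRML_CLOSE; UPDATE #TRML_CLOSE; DROP #TRML_CLOSE_9;"
numbered = number_occurrences(sql, "#TRML_CLOSE")
print(numbered)  # SELECT #TRML_CLOSE_1; UPDATE #TRML_CLOSE_2; DROP #TRML_CLOSE_9;
```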
How can I locate all positions of some word in a text in one call using regular expressions in ActionScript?
For example, I have this regular expression:
var wordsRegExp:RegExp = /[^a-zA-Z0-9]?(include|exclude)[^a-zA-Z0-9]?/g;
and it finds words "include" and "exclude" in text.
I am using
var match:Array;
match = wordsRegExp.exec(text)
to locate the words, but it finds only the first one. I need to find every occurrence of "include" and "exclude" along with its position, so I do this:
var res:Array = new Array();
var match:Array;
while (match = wordsRegExp.exec(text)) {
res[res.length]=match;
}
This does the trick, but it is very slow for large amounts of text. I searched for another method and didn't find one.
Please help, and thanks in advance.
EDIT: I tried var arr:Array = text.match(wordsRegExp);
It finds all the words, but not their positions in the string.
I think that's the nature of the beast. I don't know what you mean by "large amount of text", but if you want better performance, you should write your own parsing function. This shouldn't be that complicated, as your search expression is fairly simple.
I've never compared the performance of the String search functions and RegExp, because I assumed they are based on the same implementation. If String.match() is faster, then you should try String.search(). With the index it returns, you could compute the substring for the next search iteration.
Found this on the help.adobe.com site,...
"Methods for using regular expressions with strings: The exec() method"
… The array also includes an index property, indicating the index position of the start of the substring match …
var pattern:RegExp = /\w*sh\w*/gi;
var str:String = "She sells seashells by the seashore";
var result:Array = pattern.exec(str);
while (result != null)
{
trace(result.index, "\t", pattern.lastIndex, "\t", result);
result = pattern.exec(str);
}
//output:
// 0 3 She
// 10 19 seashells
// 27 35 seashore
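As an illustration of the same loop in another language, here is a Python sketch that collects every match together with its start and end offsets. Note one deliberate difference that is an assumption on my part: it uses \b word boundaries instead of the question's optional [^a-zA-Z0-9]? delimiters, so "included" is not reported. The name find_positions is hypothetical.

```python
import re

# \b restricts matches to whole words (an assumption about the intended behavior).
WORDS = re.compile(r"\b(include|exclude)\b")

def find_positions(text: str) -> list[tuple[str, int, int]]:
    # finditer yields one match object per occurrence, each carrying its offsets.
    return [(m.group(1), m.start(1), m.end(1)) for m in WORDS.finditer(text)]

positions = find_positions("include a.txt; exclude b.txt; included")
print(positions)  # [('include', 0, 7), ('exclude', 15, 22)]
```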
I have a list of several phrases in the following format
thisIsAnExampleSentance
hereIsAnotherExampleWithMoreWordsInIt
and I'm trying to end up with
This Is An Example Sentance
Here Is Another Example With More Words In It
Each phrase has the white space condensed and the first letter is forced to lowercase.
Can I use regex to add a space before each A-Z and have the first letter of the phrase be capitalized?
I thought of doing something like
([a-z]+)([A-Z])([a-z]+)([A-Z])([a-z]+) // etc
$1 $2$3 $4$5 // etc
but on 50 records of varying length, my idea is a poor solution. Is there a way to regex in a way that will be more dynamic? Thanks
A Java fragment I use looks like this (now revised):
result = source.replaceAll("(?<=^|[a-z])([A-Z])|([A-Z])(?=[a-z])", " $1$2");
result = result.substring(0, 1).toUpperCase() + result.substring(1);
This, by the way, converts the string givenProductUPCSymbol into Given Product UPC Symbol. Make sure this is how you want acronyms to be handled.
Finally, a single-line version could be:
result = source.substring(0, 1).toUpperCase() + source.substring(1).replaceAll("(?<=^|[a-z])([A-Z])|([A-Z])(?=[a-z])", " $1$2");
Also, in an example similar to the one given in the question comments, the string hiMyNameIsBobAndIWantAPuppy will be changed to Hi My Name Is Bob And I Want A Puppy.
For the space problem, it's easy if your language supports zero-width look-behind:
var result = Regex.Replace(@"thisIsAnExampleSentanceHereIsAnotherExampleWithMoreWordsInIt", "(?<=[a-z])([A-Z])", " $1");
or even if it doesn't support it:
var result2 = Regex.Replace(@"thisIsAnExampleSentanceHereIsAnotherExampleWithMoreWordsInIt", "([a-z])([A-Z])", "$1 $2");
I'm using C#, but the regexes should be usable in any language that supports $1...$n in the replacement.
The lower-to-upper case conversion, however, can't be done directly in the regex. You can match the first character with a regex like ^[a-z], but you can't convert it.
For example in C# you could do
var result3 = Regex.Replace(result, "^([a-z])", m =>
{
return m.ToString().ToUpperInvariant();
});
using a match evaluator to change the input string.
You could then even fuse the two together
var result4 = Regex.Replace(@"thisIsAnExampleSentanceHereIsAnotherExampleWithMoreWordsInIt", "^([a-z])|([a-z])([A-Z])", m =>
{
if (m.Groups[1].Success)
{
return m.ToString().ToUpperInvariant();
}
else
{
return m.Groups[2].ToString() + " " + m.Groups[3].ToString();
}
});
A Perl example with unicode character support:
s/\p{Lu}/ $&/g;
s/^./\U$&/;
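For readers who want to try the approach without Java, C# or Perl at hand, here is a hedged Python sketch. It only handles ASCII letters, and the function name split_camel_case is made up for this example. The second alternative in the pattern is there so that runs of capitals behave like the Java answer above (UPC stays together, while I and A become separate words).

```python
import re

# Split points: lowercase followed by uppercase, or an uppercase letter
# followed by an uppercase letter that starts a new lowercase word.
BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")

def split_camel_case(phrase: str) -> str:
    spaced = BOUNDARY.sub(" ", phrase)
    # Uppercase only the first character; str.capitalize() would lowercase the rest.
    return spaced[:1].upper() + spaced[1:]

print(split_camel_case("thisIsAnExampleSentance"))      # This Is An Example Sentance
print(split_camel_case("givenProductUPCSymbol"))        # Given Product UPC Symbol
print(split_camel_case("hiMyNameIsBobAndIWantAPuppy"))  # Hi My Name Is Bob And I Want A Puppy
```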
I need an AS3 regular expression that allows me to find/replace in strings like these:
var str1:String = '<value1 att="1"> some text</value1>';
var str2:String = '<value1 att="1" var="a">  some text and more</value1>';
var str3:String = '<value1 att="ok" var="b" def="12">     some text</value1>';
to this:
str1 = '<value1 att="1">*some text</value1>';
str2 = '<value1 att="1" var="a">**some text and more</value1>';
str3 = '<value1 att="ok" var="b" def="12">*****some text</value1>';
I want to replace the spaces at the beginning of the text (between the > and the <) with another character. It shouldn't affect the characters to the right of those spaces or the attributes in the value1 definition.
Assuming that there are no "* " sequences in the text blocks, this should work:
var s:String = "<value1 att='ok' var='b' def='12'> some text</value1>";
//find all spaces after a tag closing bracket and replace with a *
s = s.replace(/>\s/g, ">*");
//find all spaces after a * and replace it with a *
//keep doing this until no more can be found
while (s.match(/>\*+\s/g).length) {
s = s.replace(/\*\s/g, "**");
}
I can't think of a way to do it in one replace though.
I think the easiest way to accomplish what you need is to pass a function to replace().
var replaceMethod:Function = function (match:String, tagName:String, tagContent:String, spaces:String, targetText:String, index:int, whole:String) : String
{
trace("\t", "found", spaces.length,"spaces in tag '"+tagName+"'");
trace("\t", "matched string:", match);
// check tag name or whatever you may want
// do something with found spaces
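// a string pattern would replace only the first space, so use a global regex
var replacement:String = spaces.replace(/ /g, "*");
// tagContent already starts with the space that follows the tag name
return "<"+tagName+tagContent+">"+replacement+targetText;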
}
var str1:String = '<value1 att="1"> some text</value1>';
var exp:RegExp = /<(\w+)([ >].*?)>(\s+)(some text)/gm;
trace("before:", str1);
str1 = str1.replace(exp, replaceMethod);
trace("after:", str1);
It's not performance-safe though; if you are processing huge blocks of text and/or running this routine very frequently, you may want to do something more complicated but optimized. One optimization technique is reducing the number of arguments of replaceMethod().
P.S. I think this can be done with one replace() expression and without using replaceMethod(). Look at positive lookaheads and noncapturing groups; maybe you can figure it out. http://livedocs.adobe.com/flex/3/html/help.html?content=12_Using_Regular_Expressions_09.html
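To show the replacement-function idea in a compact form, here is a Python sketch that swaps each space directly after an opening tag for an asterisk in a single substitution. It assumes attribute values never contain < or >, and the names LEADING and star_leading_spaces are invented for the example; it is not a general markup parser.

```python
import re

# Group 1: an opening tag (closing tags start with "</" and are skipped).
# Group 2: the run of spaces immediately after that tag.
LEADING = re.compile(r"(<\w+[^<>]*>)( +)")

def star_leading_spaces(markup: str) -> str:
    # Keep the tag unchanged and emit one "*" per leading space.
    return LEADING.sub(lambda m: m.group(1) + "*" * len(m.group(2)), markup)

result = star_leading_spaces('<value1 att="ok" var="b">   some text</value1>')
print(result)  # <value1 att="ok" var="b">***some text</value1>
```

The space inside "some text" is untouched because it does not directly follow a tag.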