How to use regex in JavaScript to capture everything between two segments

Take these URLs:
http://service.com/room/dothings?adsf=asdf&dafadsf=dfasdf
http://service.com/room/saythings?adsf=asdf&dafadsf=dfasdf
Say I want to capture dothings and saythings.
I know the following regex:
/room\/(.+)\?/.exec(url)
and this is the result I get:
["room/dothings?", "dothings"]
What should I write to obtain just the captured string, as an array with only one item?

I know this doesn't answer your question, but parsing a URL with regex is not easy, and in some cases not even safe. I would do the parsing without regex.
In browser:
var parser = document.createElement('a');
parser.href = 'http://example.com/room/dothings?adsf=asdf&dafadsf=dfasdf';
In node.js:
var url = require('url');
var parser = url.parse('http://example.com/room/dothings?adsf=asdf&dafadsf=dfasdf');
And then in both cases:
console.log(parser.pathname.split('/')[2]);
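For what it's worth, current browsers and Node.js also ship a standard URL class, so the same idea can be written without a DOM element or the url module. A minimal sketch, assuming an environment where the global URL constructor is available; the helper name roomAction is made up for illustration:

```javascript
// Hypothetical helper: returns the path segment that follows "/room/".
function roomAction(href) {
  // The pathname of ".../room/dothings?x=1" is "/room/dothings".
  var segments = new URL(href).pathname.split('/');
  // segments is ["", "room", "dothings"], so index 2 is the action.
  return segments[2];
}

var action = roomAction('http://service.com/room/dothings?adsf=asdf&dafadsf=dfasdf');
console.log(action);
```

The query string never reaches the split, because pathname stops before the question mark.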

That's actually easy. You were almost there.
With all the obligatory disclaimers about parsing html in regex...
<script>
var subject = 'http://service.com/room/dothings?adsf=asdf&dafadsf=dfasdf';
var regex = /room\/(.+)\?/g;
var group1Caps = [];
var match = regex.exec(subject);
while (match != null) {
if( match[1] != null ) group1Caps.push(match[1]);
match = regex.exec(subject);
}
if(group1Caps.length > 0) document.write(group1Caps[0],"<br>");
</script>
Output: dothings
If you add more strings to subject, you can loop over group1Caps (for example with for (key in group1Caps)) and it will print all the matches.
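To get back to the literal question (an array with only one item): exec always puts the whole match at index 0, so either read match[1] or make the whole match equal to the part you want. The sketch below does the latter with a lookbehind. It assumes an engine that supports lookbehind assertions (recent V8-based browsers and Node.js do, older engines may not), and it uses [^?]+ instead of .+ so the match stops at the first question mark.

```javascript
var url = 'http://service.com/room/dothings?adsf=asdf&dafadsf=dfasdf';

// (?<=room\/) requires "room/" right before the match without including it,
// so the whole match is just the segment and no capture group is needed.
var result = /(?<=room\/)[^?]+/.exec(url);

console.log(result.length, result[0]);
```

If lookbehind is not available, reading index 1 of the original result is the portable route.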


Jest cell name won't recognise regex [duplicate]

I want to add a (variable) tag to values with regex. The pattern works fine with PHP, but I have trouble implementing it in JavaScript.
The pattern is (value is the variable):
/(?!(?:[^<]+>|[^>]+<\/a>))\b(value)\b/is
I escaped the backslashes:
var str = $("#div").html();
var regex = "/(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b/is";
$("#div").html(str.replace(regex, "" + value + ""));
But this doesn't seem to be right. I logged the pattern and it's exactly what it should be.
Any ideas?
To create the regex from a string, you have to use JavaScript's RegExp object.
If you also want to match/replace more than one time, then you must add the g (global match) flag. Here's an example:
var stringToGoIntoTheRegex = "abc";
var regex = new RegExp("#" + stringToGoIntoTheRegex + "#", "g");
// at this point, the line above is the same as: var regex = /#abc#/g;
var input = "Hello this is #abc# some #abc# stuff.";
var output = input.replace(regex, "!!");
alert(output); // Hello this is !! some !! stuff.
In the general case, escape the string before using it as a regex:
Not every string is a valid regex: some characters, such as ( or [, have a special meaning. To work around this, escape the string before turning it into a regex. The utility function in the sample below does that:
function escapeRegExp(stringToGoIntoTheRegex) {
return stringToGoIntoTheRegex.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}
var stringToGoIntoTheRegex = escapeRegExp("abc"); // this is the only change from above
var regex = new RegExp("#" + stringToGoIntoTheRegex + "#", "g");
// at this point, the line above is the same as: var regex = /#abc#/g;
var input = "Hello this is #abc# some #abc# stuff.";
var output = input.replace(regex, "!!");
alert(output); // Hello this is !! some !! stuff.
Note: the regex in the question uses the s modifier, which didn't exist in JavaScript at the time of the question. JavaScript has since gained an s (dotAll) flag.
If you are trying to use a variable value in the expression, you must use the RegExp "constructor".
// backslashes are doubled because the pattern is written as a string literal
var regex = "(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b";
new RegExp(regex, "is")
I found I had to double the backslash in \b to get it working. For example, to remove "1x" words from a string using a variable, I needed to use:
str = "1x";
var regex = new RegExp("\\b"+str+"\\b","g"); // same as inv.replace(/\b1x\b/g, "")
inv=inv.replace(regex, "");
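The reason for the doubling is that inside a JavaScript string literal \b is the backspace character, so the regex engine never sees a word-boundary token. A small sketch of the difference (the sample text is made up):

```javascript
var word = '1x';
var inv = '1x apple 21x 1x';

// '\b' in a string literal is a single character (backspace), not "\" + "b".
var broken = new RegExp('\b' + word + '\b', 'g');
// '\\b' is a backslash followed by b, which the regex reads as a word boundary.
var working = new RegExp('\\b' + word + '\\b', 'g');

var untouched = inv.replace(broken, '');
var cleaned = inv.replace(working, '');

console.log(JSON.stringify(untouched));
console.log(JSON.stringify(cleaned));
```

The broken pattern matches nothing, so the input comes back unchanged, while the working one removes the standalone "1x" words and leaves "21x" alone.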
You don't need quotes to define a regular expression, so just:
var regex = /(?!(?:[^<]+>|[^>]+<\/a>))\b(value)\b/is; // this is valid syntax
If value is a variable and you want a dynamic regular expression then you can't use this notation; use the alternative notation.
String.replace also accepts strings as input, so you can do "fox".replace("fox", "bear");
Alternative:
// inside a string: no / delimiters, and every backslash doubled
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(value)\\b", "is");
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b", "is");
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(.*?)\\b", "is");
Keep in mind that if value contains regular expression special characters like (, [ and ?, you will need to escape them.
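Putting the pieces of this thread together, here is a hedged sketch of wrapping every whole-word occurrence of a variable in a tag. The function name wrapWord and the <b> tag are placeholders, and it works on a plain string rather than on the page HTML from the question.

```javascript
// Escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap each whole-word, case-insensitive match in <b>.
function wrapWord(input, value) {
  var regex = new RegExp('\\b(' + escapeRegExp(value) + ')\\b', 'gi');
  // $1 re-inserts the text as it appeared in the input, preserving its case.
  return input.replace(regex, '<b>$1</b>');
}

var plain = wrapWord('A fox met another Fox.', 'fox');
var dotted = wrapWord('a.b is not axb', 'a.b');
console.log(plain);
console.log(dotted);
```

Without the escaping step, the dot in "a.b" would match any character and "axb" would be wrapped as well. Note that \b only helps when value starts and ends with a word character.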
I found this thread useful - so I thought I would add the answer to my own problem.
I wanted to edit a database configuration file (datastax cassandra) from a node application in javascript and for one of the settings in the file I needed to match on a string and then replace the line following it.
This was my solution.
var fs = require('fs');
var dse_cassandra_yaml = '/etc/dse/cassandra/cassandra.yaml';
// a) find the searchString and grab all text on the line following it
// b) replace that next-line text with the newString supplied to the function
// note - leaves searchString text untouched
function replaceStringNextLine(file, searchString, newString) {
  fs.readFile(file, 'utf-8', function (err, data) {
    if (err) throw err;
    // need to use double escape '\\' when putting regex in strings !
    var re = "\\s+(\\-\\s(.*)?)(?:\\s|$)";
    var myRegExp = new RegExp(searchString + re, "g");
    var match = myRegExp.exec(data);
    // exec returns null when searchString is not found in the file
    if (match === null) throw new Error(searchString + ' not found in ' + file);
    var replaceThis = match[1];
    var writeString = data.replace(replaceThis, newString);
    fs.writeFile(file, writeString, 'utf-8', function (err) {
      if (err) throw err;
      console.log(file + ' updated');
    });
  });
}
var searchString = "data_file_directories:";
var newString = "- /mnt/cassandra/data";
replaceStringNextLine(dse_cassandra_yaml, searchString, newString);
After running, it will change the existing data directory setting to the new one:
config file before:
data_file_directories:
- /var/lib/cassandra/data
config file after:
data_file_directories:
- /mnt/cassandra/data
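The same find-then-replace-next-line idea can be sketched as a pure string function, which is easier to test than one that reads and writes a file. This is an illustration under assumptions, not the code above: the name replaceLineAfter is invented, the search string is escaped before it goes into the regex, and a replacer function is used so that a $ in the new line is not treated as a replacement pattern.

```javascript
// Escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: replace the line that follows the first
// occurrence of searchString at the end of a line.
function replaceLineAfter(text, searchString, newLine) {
  // Group 1 is the search string plus its line break; the rest of the
  // pattern consumes the following line up to (not including) its line break.
  var re = new RegExp('(' + escapeRegExp(searchString) + '[ \\t]*\\r?\\n)[^\\r\\n]*');
  return text.replace(re, function (whole, head) {
    return head + newLine;
  });
}

var before = 'a: 1\ndata_file_directories:\n- /var/lib/cassandra/data\nb: 2\n';
var after = replaceLineAfter(before, 'data_file_directories:', '- /mnt/cassandra/data');
console.log(after);
```

Text that does not contain the search string is returned unchanged, so the caller can compare input and output to detect a miss.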
Much easier way: use template literals.
var variable = 'foo'
var expression = `.*${variable}.*`
var re = new RegExp(expression) // no 'g' flag: with it, test() keeps state in lastIndex between calls
re.test('fdjklsffoodjkslfd') // true
re.test('fdjklsfdjkslfd') // false
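One caveat when combining test() with the g flag: a global regex remembers where the last match ended in its lastIndex property, so reusing it on another string can give a surprising false. A small sketch (the sample strings are made up):

```javascript
var variable = 'foo';

// With the g flag, the regex carries state from one call to the next.
var globalRe = new RegExp(`.*${variable}.*`, 'g');
var first = globalRe.test('xxfooxx');         // match found, lastIndex moves to the end of the match
var positionAfterFirst = globalRe.lastIndex;
var second = globalRe.test('foo');            // search starts at lastIndex, past the end of 'foo'

// Without the g flag, every call starts from index 0.
var plainRe = new RegExp(`.*${variable}.*`);
var third = plainRe.test('xxfooxx');
var fourth = plainRe.test('foo');

console.log(first, positionAfterFirst, second, third, fourth);
```

For a plain yes/no check, leaving out the g flag avoids the problem.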
Using string variable(s) content as part of a more complex composed regex expression (es6|ts)
This example will replace all urls using my-domain.com to my-other-domain (both are variables).
You can build dynamic regexes by combining string values and other regex fragments within a raw string template. String.raw keeps the backslashes in the template exactly as written, so they don't need to be doubled.
// Strings with some data
const domainStr = 'my-domain.com'
const newDomain = 'my-other-domain.com'
// Make sure your string is regex friendly
// This replaces each dot with '\.'.
const regexUrl = /\./gm;
const substr = `\\\.`;
const domain = domainStr.replace(regexUrl, substr);
// domain is a regex friendly string: 'my-domain\.com'
console.log('Regex expression for domain', domain)
// HERE!!! You can assemble a complex regex using string pieces.
const re = new RegExp( String.raw `([\'|\"]https:\/\/)(${domain})(\S+[\'|\"])`, 'gm');
// now I'll use the regex expression groups to replace the domain
const domainSubst = `$1${newDomain}$3`;
// const page contains all the html text
const result = page.replace(re, domainSubst);
note: Don't forget to use regex101.com to create, test and export REGEX code.
var string = "Hi welcome to stack overflow"
var toSearch = "stack"
//case insensitive search
// search() returns -1 when there is no match, and 0 for a match at the very start
var result = string.search(new RegExp(toSearch, "i")) !== -1 ? 'Matched' : 'notMatched'
https://jsfiddle.net/9f0mb6Lz/
Hope this helps
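Note that search() returns the index of the first match, which is 0 when the match is at the very start of the string and -1 when there is none. A short sketch of why the comparison should be against -1 (the helper name containsIgnoreCase is made up, and toSearch is assumed to contain no regex special characters):

```javascript
// Hypothetical helper: case-insensitive "contains" built on search().
function containsIgnoreCase(text, toSearch) {
  return text.search(new RegExp(toSearch, 'i')) !== -1;
}

var atStart = 'Stack Overflow'.search(new RegExp('stack', 'i'));
var missing = 'Stack Overflow'.search(new RegExp('heap', 'i'));

console.log(atStart, missing, containsIgnoreCase('Stack Overflow', 'stack'));
```

A test of the form "> 0" would report the first case as not matched.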

How to build regex to capture all possible matching group

I have a string which contains data in XML format, like this:
str = "<p><a>_a_10gd_</a><a>_a_xy8a_</a><a>_a_1020_</a><a>_a_dfa7_</a><a>_a_ABCD_</a></p>";
I want to capture the value in _a_(Value)_ from every possible match. This is what I have tried.
Say I am doing this in JavaScript:
var regex = /_a_(.+)_/g ;
var str = "<a>_a_10gd_</a><a>_a_xy8a_</a><a>_a_1020_</a><a>_a_dfa7_</a><a>_a_ABCD_</a>";
while(m = regex.exec(str)){
console.log(m[1]); // m[1] should contain each match
}
I want to get all matching groups in an array like this:
var a = ['10gd', 'xy8a', '1020', 'dfa7', 'ABCD'];
Please tell me what regex is required, and explain it too, because I am new to regex and capturing groups.
Just change (.+) to (.+?). The greedy (.+) runs on to the last _ in the string, while the lazy (.+?) stops at the first one. See:
var regex = /_a_(.+?)_/g ;
var str = "<a>_a_10gd_</a><a>_a_xy8a_</a><a>_a_1020_</a><a>_a_dfa7_</a><a>_a_ABCD_</a>";
while(m = regex.exec(str)){
console.log(m[1]); // m[1] should contain each match
}
For more information about greediness, see What do lazy and greedy mean in the context of regular expressions?
Another option is to accept only characters except _ before the _ (instead of . which you have used), like so:
var regex = /_a_([^_]+)_/g ;
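To make the difference concrete, the sketch below collects group 1 for the greedy, lazy and negated-class variants into arrays. The helper name collectGroup is made up for illustration; it expects a regex with the g flag, since otherwise the exec loop would never advance.

```javascript
// Hypothetical helper: gather capture group 1 from every match of a global regex.
function collectGroup(regex, text) {
  if (!regex.global) {
    throw new Error('collectGroup needs a regex with the g flag');
  }
  var found = [];
  var m;
  regex.lastIndex = 0; // start from the beginning even if the regex was used before
  while ((m = regex.exec(text)) !== null) {
    found.push(m[1]);
  }
  return found;
}

var str = '<a>_a_10gd_</a><a>_a_xy8a_</a><a>_a_1020_</a><a>_a_dfa7_</a><a>_a_ABCD_</a>';

var greedy = collectGroup(/_a_(.+)_/g, str);     // one long match up to the last underscore
var lazy = collectGroup(/_a_(.+?)_/g, str);      // one short match per <a> element
var negated = collectGroup(/_a_([^_]+)_/g, str); // same values as the lazy version here

console.log(greedy.length, lazy, negated);
```

The greedy version produces a single entry that swallows most of the string, which is why the original loop never printed the individual values.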

RegEx not finding all matches

I have the following code (AS3 & CS 5.5):
var regEx:RegExp = new RegExp(/(?:^|\s)(\#[^\s$]+)/g);
var txt:String = "This #asd is a test tweet #hash1 test #hash2 test";
var matches:Object = regEx.exec(txt);
trace(matches);
The trace returns '#asd,#asd'. I really don't understand why it would do this, as in my RegEx testing application 'RegExhibit' it returns '#asd,#hash1,#hash2', which is what I'd expect. Can anyone shed any light on this please?
Thanks in advance!
If you are using .exec, you should run it multiple times to get all results:
In the following example, the g (global) flag is set in the regular expression, so you can use exec() repeatedly to find multiple matches:
var myPattern:RegExp = /(\w*)sh(\w*)/ig;
var str:String = "She sells seashells by the seashore";
var result:Object = myPattern.exec(str);
while (result != null) {
trace (result.index, "\t", result);
result = myPattern.exec(str);
}
Source: http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/RegExp.html
A better alternative is probably to use String.match:
If pattern is a regular expression, in order to return an array with more than one matching substring, the g (global) flag must be set in the regular expression
An example should be (not tested):
var regEx:RegExp = /(?:^|\s)(\#[^\s$]+)/g;
var txt:String = "This #asd is a test tweet #hash1 test #hash2 test";
var matches:Object = txt.match(regEx);
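One detail about String.match with the g flag, sketched here in JavaScript (whose RegExp behaves much like ActionScript's in this respect, though that is an assumption worth checking in AS3): it returns the whole matches, not the capture groups, so with this pattern each entry keeps its leading whitespace. Looping over exec gives access to group 1.

```javascript
var regEx = /(?:^|\s)(#[^\s$]+)/g;
var txt = 'This #asd is a test tweet #hash1 test #hash2 test';

// match() with the g flag: an array of whole matches, leading space included.
var wholeMatches = txt.match(regEx);

// exec() in a loop: one match per call, with the capture group at index 1.
var tags = [];
var m;
while ((m = regEx.exec(txt)) !== null) {
  tags.push(m[1]);
}

console.log(wholeMatches, tags);
```

A single exec call returns only the first match (whole match plus group 1), which lines up with the '#asd,#asd' trace in the question.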

String parsing with RegExp in Actionscript

I have a string that is similar to a path, but I have tried some regex patterns that are supposed to parse paths and they don't quite work.
Here's the string
f|MyApparel/Templates/Events/
I need the "name parts" between the slashes.
I tried (\w+) but the array came back [0] = "f" and [1] = "f".
I tested the pattern on http://www.gskinner.com/RegExr/ and it seems to work correctly.
Here's the AS code:
var pattern : RegExp = /(\w+)/g;
var hierarchy : Array = pattern.exec(params.category_id);
params.name = hierarchy.pop() as String;
pattern.exec() works like in JavaScript. For a global regex, it updates the lastIndex property every time it finds a match, and the next time you run it the search starts from there.
So it does not return an array of all matches, but only the very next match in the string. Hence you must run it in a loop until it returns null:
var myPattern:RegExp = /(\w+)/g;
var str:String = "f|MyApparel/Templates/Events/";
var result:Object = myPattern.exec(str);
while (result != null) {
trace( result.index, "\t", result);
result = myPattern.exec(str);
}
I don't know which two slashes you mean, but try:
var hierarchy : Array = params.category_id.split(/[\/|]/);
[\/|] means a slash or a vertical bar.
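For reference, here is what that split produces in JavaScript (assumed here to behave like ActionScript's split for this input). The trailing slash leaves an empty string at the end, so it is filtered out before taking the last name:

```javascript
var categoryId = 'f|MyApparel/Templates/Events/';

// Split on either a slash or a vertical bar.
var parts = categoryId.split(/[\/|]/);

// Drop the empty string produced by the trailing slash.
var names = parts.filter(function (part) {
  return part.length > 0;
});

var last = names[names.length - 1];
console.log(parts, names, last);
```

Calling pop() directly on the unfiltered array would return the empty string rather than "Events".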

What's wrong with my ActionScript Regex?

I am trying to pull a field out of a string and return it.
My function:
public function getSubtype(ut:String):String {
var pattern:RegExp = new RegExp("X=(\w+)","i");
var nut:String = ut.replace(pattern, "$1");
trace("nut is " + nut);
return nut;
}
I'm passing it the string:
http://foo.bar.com/cgi-bin/ds.pl?type=boom&X=country&Y=day&Z=5
The trace statements return the above string without modification.
I've tried the pattern out on Ryan Swanson's Flex 3 Regular Expression Explorer and it returns: X=country. The result I want is "country".
Must be obvious, but I can't see it. Any help will be appreciated.
TB
Changed my function to the following and it works:
public function getSubtype2(ut:String):String {
trace("searching " + ut);
var pattern:RegExp = new RegExp("X=([a-z]+)");
var r:Object = pattern.exec(ut);
trace("result is " + r[1]);
return r[1].toString();
}
Interestingly, though, using X=(\w+) does not match and causes an error. ????
The replace method is for replacing: use it when you want a modified copy of the given string. It returns the whole string with the matched portion swapped out, not just the captured field.
I think you are looking for the match method, that produces an array of matches, see below.
function getSubtype(ut:String):String {
var pattern:RegExp = new RegExp("X=([a-z]+)","i");
var nut:Array = ut.match(pattern);
trace("nut is " + nut[1]);
return nut[1];
}
nut[0] is the full matched string, followed by nut[1] for the first parenthesised group, and so on.
The replace method does not mutate the string it operates on, it returns a new string. Try:-
var nut:String = ut.replace(pattern, "$1");
Note: I don't know ActionScript...
Your RE Explorer seems to return the matched pattern, see if there is a possibility to see the captures as well.
And if AS behaves like most languages I know, your replace() call replaces X=country with country.
Instead of
var pattern:RegExp = new RegExp("X=(\w+)","i");
You can write this:
var pattern:RegExp = /X=(\w+)/i;
Then you will not have problems with backslashes.
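The backslash problem can be reproduced in JavaScript, which is used here on the assumption that ActionScript string literals treat \w the same way: inside a quoted string the backslash is dropped, so the constructor receives "X=(w+)" and the pattern no longer matches the URL. A minimal sketch:

```javascript
var ut = 'http://foo.bar.com/cgi-bin/ds.pl?type=boom&X=country&Y=day&Z=5';

// In a string literal "\w" collapses to "w", so this pattern is really X=(w+).
var fromString = new RegExp("X=(\w+)", "i");
// A regex literal keeps the backslash, so \w means "word character".
var fromLiteral = /X=(\w+)/i;

var unchanged = ut.replace(fromString, "$1"); // nothing matches, input comes back as is
var subtype = fromLiteral.exec(ut)[1];        // capture group 1 of the first match

console.log(fromString.source, unchanged === ut, subtype);
```

Writing "X=(\\w+)" in the string form would give the same pattern as the literal.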
var pattern : RegExp = /[\\?&]X=([^&#]*)/g;
var XValue : String = pattern.exec(ut)[1];
See http://livedocs.adobe.com/flex/3/langref/RegExp.html#exec%28%29 for further explanations.
I have also found this flex regexp testing tool to be quite helpful.