Coming up with a regex solution - regex

I have a page with content from which I need to extract a number using a regex.
Here is an example of the format of the page:
Text here
<script type="text/javascript">
var USER_ACCOUNT_NUMBER = "12345";
var USER_FULL_NAME = "";
var IS_INTERNATIONAL_USER = false;
</script>
Text here
From this content I need to get the value of USER_ACCOUNT_NUMBER, which would be 12345.
I tried something like USER_ACCOUNT_NUMBER([\s\S])=([\s\S])""; which finds the part I need, but I'm not sure how to get only what's inside the quotes.

I assume the number of spaces around = isn't fixed, so you can use this regex:
USER_ACCOUNT_NUMBER\s*=\s*"(.*)"

/USER_ACCOUNT_NUMBER = "(.*)";/
Returns: 12345
http://regex101.com/r/dW2hK7
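As a minimal sketch of how the capture group from the patterns above is read back in JavaScript, assuming the page source is already available as a string. The `html` sample and the `extractAccountNumber` name are made up for illustration. It uses `[^"]*` rather than `.*` so the match cannot run past the closing quote.

```javascript
// Hypothetical sample of the page source from the question.
const html = [
  'Text here',
  '<script type="text/javascript">',
  'var USER_ACCOUNT_NUMBER = "12345";',
  'var USER_FULL_NAME = "";',
  'var IS_INTERNATIONAL_USER = false;',
  '</script>',
  'Text here',
].join('\n');

// Returns the quoted value, or null when the variable is not present.
function extractAccountNumber(source) {
  const match = /USER_ACCOUNT_NUMBER\s*=\s*"([^"]*)"/.exec(source);
  return match ? match[1] : null; // match[1] is the first capture group
}

const accountNumber = extractAccountNumber(html);
console.log(accountNumber); // 12345
```

The whole match (`match[0]`) still contains the variable name and the quotes; only `match[1]` holds what was inside the parentheses.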

I would try something like:
USER_ACCOUNT_NUMBER = "([0-9]+)"

Related

Regex to grab data from massive HTML string

I am grabbing a HTML source dump that includes some sort of JSON props created by react.
I'm trying to grab data in syntax like this: "siteName":"Example Site". I want to grab the Example Site text without the quotation marks.
I know I could be using an HTML parser but this is actually within some JS code in the source.
Any thoughts on how I could do this? Thanks
This regex gets it, but I would rather use something like a JSON parser:
var regex = /"siteName":"(.+?)"/g;
var str = `{"siteName":"ABC Example Business","contactName":"Jeff","siteKey":"abcexample","tabKey":"service","entityKey":"1192289","siteId":152285976,"entityId":13123055221,"phone":"","mobile":"0100 000 000",}`;
var result = regex.exec(str);
console.log(result[1]);
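A rough sketch of the JSON parser route mentioned above, next to the lazy regex. It assumes the props sit in a single well-formed object literal inside the script. The `window.__PROPS__` assignment and the sample values are invented for illustration, and real page source may need a more careful way of locating the object.

```javascript
// Hypothetical props blob; the real one would come from the page source.
const source =
  'window.__PROPS__ = {"siteName":"ABC Example Business","contactName":"Jeff","siteId":152285976};';

// Option 1: the lazy capture stops at the first closing quote.
const viaRegex = /"siteName":"(.+?)"/.exec(source);
const siteNameFromRegex = viaRegex ? viaRegex[1] : null;

// Option 2: cut out the object literal and let JSON.parse deal with escaping.
const jsonText = source.slice(source.indexOf('{'), source.lastIndexOf('}') + 1);
const siteNameFromJson = JSON.parse(jsonText).siteName;

console.log(siteNameFromRegex); // ABC Example Business
console.log(siteNameFromJson);  // ABC Example Business
```

The parser version also copes with escaped quotes inside the value, which the regex does not. Note that `JSON.parse` rejects trailing commas, so a blob like the one in the answer above would need cleaning first.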
How about this (the value is matched with [^\"]+ so it stops at the first closing quote):
\"siteName\":\"([^\"]+)\"

RegEx to normalize XML syntax

I have some XML in which some tags generate XML parse errors (Error #1090). The problem is attribute values that need to be quoted:
<div class=treeview>
Please help me write a regular expression that turns them into the following:
<div class="treeview">
This one should work:
var pattern:RegExp = /(\w+)(=)(\w+)/g;
trace('regexTest:', pString.replace(pattern, '$1$2"$3"'));
because there must be three groups: attribute_name, = (equals), attribute_value.
Could you try the following code:
var regExp:RegExp = /(class\=)(\w+)/g;
var sourceText:String = "<div class=treeview>";
var replacedText:String = sourceText.replace(regExp, '$1"$2"');
trace(replacedText);
In a nutshell, this RegExp means:
Find 2 groups: (class=) and (any-word-after-it)
Add a quote before and after group 2.
You should try the following regex:
regex = /(<div[^>]*class=)(\S+)([^>]*>)/g;
sourceString.replace(regex, '$1"$2"$3');
Try using a general purpose markup repair tool such as John Cowan's TagSoup. This is likely to be much more robust than anything you attempt yourself (for example, most of the suggested regular expressions don't even check that the keyword=value construct is within a start tag).
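To illustrate the point about start tags, here is a small JavaScript sketch (not ActionScript, and not a replacement for a real repair tool) that only touches name=value pairs found inside a start tag and skips values that are already quoted. The `quoteAttributes` name is hypothetical, and the sketch assumes no quoted attribute value contains a `>` character.

```javascript
function quoteAttributes(markup) {
  // Visit each start tag: "<", a letter, then everything up to the next ">".
  return markup.replace(/<[A-Za-z][^<>]*>/g, (tag) =>
    // Inside the tag, either skip an already quoted value (group 1)
    // or capture an unquoted name=value pair (groups 2 and 3).
    tag.replace(
      /("[^"]*"|'[^']*')|([\w:-]+)=([^\s"'<>=]+)/g,
      (whole, quoted, name, value) =>
        quoted !== undefined ? quoted : `${name}="${value}"`
    )
  );
}

console.log(quoteAttributes('<div class=treeview>'));
// <div class="treeview">
console.log(quoteAttributes('x=1 <a href="a.php?x=1" class=k>x=1</a>'));
// x=1 <a href="a.php?x=1" class="k">x=1</a>
```

Text outside tags and the contents of quoted values are left alone, which the simpler `(\w+)=(\w+)` patterns above cannot guarantee.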

Regular expression in node js

I'm trying to strip out facebook.com from a URL. Can someone please help me with the RegEx to do this in NodeJs and express?
"http://www.facebook.com/RalphLauren"
I need to be left with "RalphLauren" as a string.
Thanks in advance!!!
Update:
This did the trick for what I wanted:
var url = 'http://www.facebook.com/RalphLauren';
var name = url.substr(url.lastIndexOf('/')+1);
No need for a regular expression. Use the parse method of Node's url module and extract the path.
var url = require('url');
var parts = url.parse("http://www.facebook.com/RalphLauren");
console.log(parts.path); // '/RalphLauren'
Try (with the dots escaped so they only match a literal "."):
var newString = "http://www.facebook.com/RalphLauren".replace( /http:\/\/(www\.)?facebook\.com\/(.+)$/, "$2" );
You can use the replace method on strings. It returns a new string, so keep the result:
var url = "http://www.facebook.com/RalphLauren";
var name = url.replace(/^http:\/\/.*\.facebook\.com\//, ''); // 'RalphLauren'
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace
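For comparison, a short sketch using the standard `URL` class, which is available as a global in current Node.js versions and in browsers. The `profileName` helper is a made-up name, and the sketch assumes the profile name is always the first path segment.

```javascript
// Returns the first non-empty path segment, or null if there is none.
function profileName(urlString) {
  const { pathname } = new URL(urlString); // e.g. '/RalphLauren'
  const segments = pathname.split('/').filter(Boolean);
  return segments.length > 0 ? segments[0] : null;
}

console.log(profileName('http://www.facebook.com/RalphLauren')); // RalphLauren
console.log(profileName('https://facebook.com/RalphLauren/?ref=x')); // RalphLauren
```

Because only `pathname` is inspected, the scheme, the `www.` prefix, and any query string do not affect the result.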

Can a sizzle selector evaluate a regular expression?

I need to select links with a specific format of URLs. Can I use sizzle to evaluate a link's href attribute against a regular expression?
For example, can I do something like this:
var arrayOfLinks = Sizzle('a[HREF=[0-9]+$]');
to create an array of all links on the page whose URL ends in a number?
Give this a try. I've attempted to convert the jQuery regex selector that Kobi linked to into a Sizzle selector extension. Seems to work, but I haven't put it through a lot of testing.
Sizzle.selectors.filters.regex = function(elem, i, match){
    var matchParams = match[3].split(',', 2);
    var attr = matchParams[0];
    var pattern = matchParams[1];
    var regex = new RegExp(pattern.replace(/^\s+|\s+$/g,''), 'ig');
    return regex.test(elem.getAttribute(attr));
};
In this case, your example would be written as:
var arrayOfLinks = Sizzle('a:regex(href,[0-9]+$)');
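The matching logic of the filter above can be tried outside a browser with a stand-in element. This is only a sketch: `attributeMatches` and `fakeLink` are hypothetical names, and nothing here calls Sizzle itself. One difference from the filter is that it splits on the first comma only, because `split(',', 2)` drops everything after a second comma, which would truncate a pattern such as `[0-9]{1,3}$`.

```javascript
// args has the form "attributeName, pattern", as passed to :regex(...).
function attributeMatches(elem, args) {
  const commaAt = args.indexOf(',');
  if (commaAt === -1) return false; // no pattern supplied
  const attr = args.slice(0, commaAt).trim();
  const pattern = args.slice(commaAt + 1).trim();
  const value = elem.getAttribute(attr);
  // getAttribute returns null when the attribute is absent.
  return value !== null && new RegExp(pattern, 'i').test(value);
}

// Stand-in for a DOM element so the sketch runs without a document.
const fakeLink = (href) => ({
  getAttribute: (name) => (name === 'href' ? href : null),
});

console.log(attributeMatches(fakeLink('/articles/42'), 'href,[0-9]+$'));       // true
console.log(attributeMatches(fakeLink('/articles/about'), 'href,[0-9]+$'));    // false
console.log(attributeMatches(fakeLink('/id/123'), 'href, [0-9]{1,3}$'));       // true
```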

How to use regular expression in WatiN

I'm working with the WatiN automation tool and I'm having a problem with a regular expression. I have a situation where I have to enter some text and click a button in a popup window. I'm using the AttachToIE method and the URL attribute ("http://192.168.25.10:215/admin/SelectUsers.aspx?Type=FeedbackID=ef5ad7ef5490-4656-9669-32464aeba7cd") of the popup to attach to it.
The problem is that each time the popup appears, the ID value in the URL changes, so I'm not able to access the popup. Can anyone please help by giving me a regular expression for the changing ID value in the URL below?
("http://192.168.25.10:215/admin/SelectUsers.aspx?Type=FeedbackID=ef5ad7ef5490-4656-9669-32464aeba7cd")
Thank you.
It appears that you have a URL with 2 query string parameters Type and ID and your pattern is:
"http://192.168.25.10:215/admin/SelectUsers.aspx?Type=Feedback&ID={some id}"
You can use the Find.ByUrl() attribute constraint method and pass it to AttachToIE() as shown below with the regex for matching that pattern.
string url = "http://192.168.25.10:215/admin/SelectUsers.aspx?Type=Feedback&ID=";
// Escape the literal part so "." and "?" are not treated as regex operators.
Regex regex = new Regex(Regex.Escape(url) + "[-a-z0-9]+", RegexOptions.IgnoreCase);
IE ie = IE.AttachToIE(Find.ByUrl(regex));
string baseUrl = "http://192.168.25.10:215/admin/SelectUsers.aspx?Type=FeedbackID=";
Regex urlIE = new Regex(Regex.Escape(baseUrl) + "[\\w-]+", RegexOptions.IgnoreCase);
IE ie = IE.AttachToIE(Find.ByUrl(urlIE));
I'm not familiar with WatiN, but it looks like it runs on .NET, so perhaps this might help:
var desiredId = "000000000000-0000-0000-000000000000";
var url = "http://192.168.25.10:215/admin/SelectUsers.aspx?Type=FeedbackID=ef5ad7ef5490-4656-9669-32464aeba7cd&someMoreStuff";
var pattern = @"(?i)(?<=FeedBackId=)[-a-z0-9]+";
var result = Regex.Replace(url, pattern, desiredId);
Console.WriteLine(result);
//Output: http://192.168.25.10:215/admin/SelectUsers.aspx?Type=FeedbackID=000000000000-0000-0000-000000000000&someMoreStuff
The following pattern should have the same effect but is more defensive: it only matches inside the query string, requires the ID to be exactly 35 characters, and won't match similar parameter names like "PreviousFeedBackId".
var pattern = @"(?i)(?<=\?.*\bFeedBackId=)[-a-z0-9]{35}\b";
If you just want to extract the id:
var id = Regex.Match(url, pattern).Value;
Console.WriteLine(id);
//output: ef5ad7ef5490-4656-9669-32464aeba7cd
WatiN has a feature that lets us match on the URL while ignoring the query string. Below is the code, which works fine for me.
string baseUrl = "http://192.168.25.10:215/admin/SelectUsers.aspx";
IE ie = IE.AttachToIE(Find.ByUrl(baseUrl,true));