Regular Expression - Extract Tracking ID from Google Analytic script


I'm trying to extract UA-123456-7 from the following Google Analytics snippet using a regular expression. I think I'm close, but I'm not sure it is even possible.
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
var pageTracker = _gat._getTracker("UA-123456-7");
pageTracker._trackPageview();
</script>
Here is what I get when I run it at www.regular-expressions.info:
Regex: ^[<>%\w_\/.:;()\+-=?"]*(.*?)[<>\w_.;()]*$
Replacement text: $1
Result: gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));var pageTracker = _gat._getTracker("UA-140422-1"
Could someone please shed some light? Thanks in advance!

Why not use a substring instead?
Find the position of the string "_getTracker(" -> Pos A.
Find the position of the next ")" after Pos A -> Pos B.
Then take the substring between Pos A and Pos B.
Is that helpful?
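The substring idea can be sketched in JavaScript as below, next to a regex alternative. This is only a sketch: the function names are made up for illustration, and the regex assumes the ID always has the UA-digits-digits shape shown in the question.

```javascript
// The relevant part of the snippet from the question.
const snippet = 'var pageTracker = _gat._getTracker("UA-123456-7");\npageTracker._trackPageview();';

// Substring approach: locate the marker, then the closing quote after it.
function extractBySubstring(source) {
  const marker = '_getTracker("';
  const start = source.indexOf(marker);
  if (start === -1) return null;          // marker not present
  const from = start + marker.length;     // first character of the ID
  const end = source.indexOf('"', from);  // closing quote of the ID
  return end === -1 ? null : source.substring(from, end);
}

// Regex approach: capture only the ID instead of matching the whole script.
// Assumption: IDs look like UA-<digits>-<digits>.
function extractByRegex(source) {
  const match = /_getTracker\(\s*["'](UA-\d+-\d+)["']\s*\)/.exec(source);
  return match ? match[1] : null;
}

console.log(extractBySubstring(snippet)); // UA-123456-7
console.log(extractByRegex(snippet));     // UA-123456-7
```

Anchoring the regex on _getTracker( rather than on the start of the script avoids having to list every character that can appear before the ID.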

Related

Matching word pattern with character pattern

So I have a very interesting question where I have a long string s such as:
eatsleepeatwalksleepwalk
and a smaller string p such as:
esetst
so on a quick look you can deduce that:
eat = e
sleep = s
walk = t
The problem statement is to tell whether the pattern of characters in smaller string p matches the words in the bigger string s
Size of s = 0 to 1000
Size of p = 0 to 1000
I'm aware of simple pattern matching using KMP, however this problem seems quite tricky and I'm unable to get to a starting point of solving this problem.
Any hints?
Edit 1: Look at @Neverever's answer below. It seems quite interesting; its space/time complexity still needs to be examined.

I tried to solve it using a JavaScript RegExp with backreferences:
$("button").click(function() {
  let p = $("#p").val(),
      s = $("#s").val(),
      regMap = [],   // pattern characters in order of first appearance
      regStr = "";
  for (let c of p) {
    let idx = regMap.indexOf(c);
    if (idx === -1) {
      // First time this character is seen: open a new capture group
      regMap.push(c);
      regStr += "(.+)";
    } else {
      // Seen before: require the same text again via a backreference
      regStr += "\\" + (idx + 1);
    }
  }
  let reg = new RegExp("^" + regStr + "$");
  console.log("RegExp used: " + regStr);
  console.log("Result: " + reg.test(s));
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<label>String `s`: <input type="text" id="s" value="eatsleepeatwalksleepwalk" /></label><br>
<label>String `p`: <input type="text" id="p" value="esetst" /></label><br>
<button type="button">Run</button>
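The same idea can be written as a standalone function without jQuery or a page. This is a minimal sketch with a hypothetical function name. Two caveats apply: nothing stops two different pattern characters from capturing the same word, and the regex engine may backtrack heavily on long inputs, so the cost for strings near the 1000-character limit is not guaranteed to be acceptable.

```javascript
// Returns true if s can be split into words so that equal characters of p
// always correspond to the same word.
function matchesWordPattern(p, s) {
  const seen = [];   // pattern characters in order of first appearance
  let source = "";
  for (const c of p) {
    const idx = seen.indexOf(c);
    if (idx === -1) {
      seen.push(c);
      source += "(.+)";             // new character: new capture group
    } else {
      source += "\\" + (idx + 1);   // repeated character: backreference
    }
  }
  return new RegExp("^" + source + "$").test(s);
}

console.log(matchesWordPattern("esetst", "eatsleepeatwalksleepwalk")); // true
console.log(matchesWordPattern("aa", "xyxz"));                          // false
```

For "esetst" the generated expression is ^(.+)(.+)\1(.+)\2\3$.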

Delete all text after the nth occurrence of a whitespace

I used this snippet to insert line breaks inside text. How can I completely delete the text after the nth whitespace?
var str:String = ("This is just a test string").replace(/(( [^ ]+){2}) /, "$1\n");
regards

This works using the regex (([^ ]* ){2}).* and the replacement pattern $1:
function removeAfterNthSpace() {
  var nth = parseInt($("#num").val());
  var regEx = new RegExp("(([^ ]* ){" + nth + "}).*", "g");
  var str = ("This is just a test string").replace(regEx, "$1");
  console.log(str);
}
$('#num').change(removeAfterNthSpace);
removeAfterNthSpace();
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<input type="number" id="num" value="2" />
See it working on regex101.
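A variant that takes the text and the count as parameters might look like the following sketch. The parameter names are illustrative, and it assumes a single-line string and a non-negative integer n. Note that the nth space itself is kept in the result.

```javascript
// Keeps everything up to and including the nth space and drops the rest.
// If the text contains fewer than n spaces, it is returned unchanged.
function removeAfterNthSpace(text, n) {
  const re = new RegExp("^((?:[^ ]* ){" + n + "}).*$");
  return text.replace(re, "$1");
}

console.log(JSON.stringify(removeAfterNthSpace("This is just a test string", 2)));
// "This is "
```

Calling trimEnd() on the result removes the trailing space if it is not wanted.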

How to use regex in JavaScript to capture everything between two segments

Take these URLs:
http://service.com/room/dothings?adsf=asdf&dafadsf=dfasdf
http://service.com/room/saythings?adsf=asdf&dafadsf=dfasdf
Say I want to capture dothings and saythings.
I know the following regex:
/room\/(.+)\?/.exec(url)
and as a result I get this:
["room/dothings?", "dothings"]
What should I write to obtain only the captured string, with just one item in the array?

I know this doesn't answer your question, but parsing a URL with regex is not easy, and in some cases not even safe. I would do the parsing without regex.
In browser:
var parser = document.createElement('a');
parser.href = 'http://example.com/room/dothings?adsf=asdf&dafadsf=dfasdf';
In node.js:
var url = require('url');
var parser = url.parse('http://example.com/room/dothings?adsf=asdf&dafadsf=dfasdf');
And then in both cases:
console.log(parser.pathname.split('/')[2]);
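In current browsers and Node.js the global URL class offers the same pathname property without creating an anchor element or requiring a module. A short sketch, using one of the URLs from the question:

```javascript
// URL is a global in modern browsers and in Node.js.
const parsed = new URL("http://service.com/room/dothings?adsf=asdf&dafadsf=dfasdf");

// pathname is "/room/dothings", so the segments are ["", "room", "dothings"].
const segment = parsed.pathname.split("/")[2];

console.log(segment); // dothings
```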

That's actually easy. You were almost there.
With all the obligatory disclaimers about parsing URLs with regex...
<script>
  var subject = 'http://service.com/room/dothings?adsf=asdf&dafadsf=dfasdf';
  var regex = /room\/(.+)\?/g;
  var group1Caps = [];
  var match = regex.exec(subject);
  while (match != null) {
    // match[1] holds the text captured by the first group
    if (match[1] != null) group1Caps.push(match[1]);
    match = regex.exec(subject);
  }
  if (group1Caps.length > 0) document.write(group1Caps[0], "<br>");
</script>
Output: dothings
If subject contains several matches, you can loop over group1Caps (for example with for (key in group1Caps)) to print all of them.
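A shorter route to the same result is to read index 1 of the array returned by exec, since index 0 is always the whole match. The sketch below uses a hypothetical helper name and swaps the greedy (.+) for a character class that stops at the next "/", "?" or "#", so a second "?" later in the string cannot widen the capture.

```javascript
// Returns the path segment that follows "/room/", or null if there is none.
function roomAction(url) {
  const match = /\/room\/([^\/?#]+)/.exec(url);
  return match ? match[1] : null;  // match[0] is the full match, match[1] the group
}

console.log(roomAction("http://service.com/room/dothings?adsf=asdf&dafadsf=dfasdf"));  // dothings
console.log(roomAction("http://service.com/room/saythings?adsf=asdf&dafadsf=dfasdf")); // saythings
```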

Regular expression to match word pairs joined with colons

I don't know regular expressions at all. Can anybody help me with one very simple regular expression:
extracting 'word:word' from a sentence, e.g. "Java Tutorial Format:Pdf With Location:Tokyo Javascript"?
A small modification:
the first 'word' comes from a list, but the second can be anything: "word1 in [ABC, FGR, HTY]".
The situation demands one more modification:
the matching form can be "word11:word12 word13 .. " up to the next "word21: ... ".
Things are getting complex... I have to learn regex :(
Thanks in advance.

You can use the regex:
\w+:\w+
Explanation:
\w - a single character that is a letter (uppercase or lowercase), a digit, or an underscore.
\w+ - one or more of those characters, basically a word.
So \w+:\w+
matches a pair of words separated by a colon.
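As a quick illustration, here is the pattern applied to the sentence from the question in JavaScript. The g flag is needed to get every pair rather than only the first.

```javascript
const sentence = "Java Tutorial Format:Pdf With Location:Tokyo Javascript";

// match() with the g flag returns all matches, or null when there are none.
const pairs = sentence.match(/\w+:\w+/g) || [];

console.log(pairs); // [ 'Format:Pdf', 'Location:Tokyo' ]
```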

Try \b(\S+?):(\S+?)\b. Group 1 will capture "Format" and group 2, "Pdf".
A working example:
<html>
<head>
<script type="text/javascript">
function test() {
  var re = /\b(\S+?):(\S+?)\b/g; // without 'g' matches only the first
  var text = "Java Tutorial Format:Pdf With Location:Tokyo Javascript";
  var match = null;
  while ( (match = re.exec(text)) != null) {
    alert(match[1] + " -- " + match[2]);
  }
}
</script>
</head>
<body onload="test();">
</body>
</html>
A good reference for regexes is https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Global_Objects/RegExp

Use this snippet:
$str = " this is pavun:kumar hello world bk:systesm";
if (preg_match_all('/(\w+\:\w+)/', $str, $val)) {
    print_r($val);
} else {
    print "Not matched \n";
}

Continuing Jaú's function with your additional requirement:
function test() {
  var words = ['Format', 'Location', 'Size'],
      text = "Java Tutorial Format:Pdf With Location:Tokyo Language:Javascript",
      match = null;
  var re = new RegExp( '(' + words.join('|') + '):(\\w+)', 'g');
  while ( (match = re.exec(text)) != null) {
    alert(match[1] + " = " + match[2]);
  }
}
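The last requirement in the question, a value that runs on until the next "word:", could be handled with a lazy match and a lookahead. The following is only a sketch: the function name is hypothetical, the keys are assumed to be plain words with no regex metacharacters, and String.prototype.matchAll requires an ES2020 runtime.

```javascript
// Returns [key, value] pairs; each value extends up to the next "word:" or the end of text.
function extractPairs(text, keys) {
  const re = new RegExp(
    "\\b(" + keys.join("|") + "):\\s*(.*?)(?=\\s+\\w+:|$)",
    "g"
  );
  const result = [];
  for (const m of text.matchAll(re)) {
    result.push([m[1], m[2]]);
  }
  return result;
}

console.log(extractPairs("Format:Pdf Ebook Location:Tokyo Japan", ["Format", "Location", "Size"]));
// [ [ 'Format', 'Pdf Ebook' ], [ 'Location', 'Tokyo Japan' ] ]
```

The lookahead treats any following "word:" as the end of a value, including words that are not in the key list.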

I am currently solving this problem in my Node.js app and found that the following seems suitable for colon-separated pairs:
([\w]+:)("(([^"])*)"|'(([^'])*)'|(([^\s])*))
It also matches quoted values, like a:"b" c:'d e' f:g
Example in ES6:
const regex = /([\w]+:)("(([^"])*)"|'(([^'])*)'|(([^\s])*))/g;
const str = `category:"live casino" gsp:S1aik-UBnl aa:"b" c:'d e' f:g`;
let m;
while ((m = regex.exec(str)) !== null) {
  // This is necessary to avoid infinite loops with zero-width matches
  if (m.index === regex.lastIndex) {
    regex.lastIndex++;
  }
  // The result can be accessed through the `m`-variable.
  m.forEach((match, groupIndex) => {
    console.log(`Found match, group ${groupIndex}: ${match}`);
  });
}
Example in PHP:
$re = '/([\w]+:)("(([^"])*)"|\'(([^\'])*)\'|(([^\s])*))/';
$str = 'category:"live casino" gsp:S1aik-UBnl aa:"b" c:\'d e\' f:g';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the entire match result
var_dump($matches);
You can test your regular expressions using this online tool: https://regex101.com

Here's the non-regex way: in your favourite language, split on whitespace, go through the elements, check each for ":", and print it if found. E.g. in Python:
>>> s = "Java Tutorial Format:Pdf With Location:Tokyo Javascript"
>>> for i in s.split():
...     if ":" in i:
...         print(i)
...
Format:Pdf
Location:Tokyo
You can do further checks to make sure it's really "someword:someword" by splitting again on ":" and checking that the resulting list has 2 elements, e.g.
>>> for i in s.split():
...     if ":" in i:
...         a = i.split(":")
...         if len(a) == 2:
...             print(i)
...
Format:Pdf
Location:Tokyo

([^:]+):(.+)
Meaning: (any character except ":", one or more times), ":", (any character, one or more times)
You'll find good manuals on the net... Maybe it's time for you to learn...

Replace each RegExp match with different text in ActionScript 3

I'd like to know how to replace each match with a different text?
Let's say the source text is:
var strSource:String = "find it and replace what you find.";
..and we have a regex such as:
var re:RegExp = /\bfind\b/g;
Now, I need to replace each match with different text (for example):
var replacement:String = "replacement_" + increment.toString();
So the output would be something like:
output = "replacement_1 it and replace what you replacement_2";
Any help is appreciated..

You could also use a replacement function, something like this:
var increment : int = -1; // start at -1 so the first replacement will be 0
strSource = strSource.replace( /(\b_)(.*?_ID\b)/gim , function() {
    // pre-increment so the counter is advanced before it is used
    return arguments[1] + "replacement_" + (++increment).toString();
} );

I finally came up with a solution.
Here it is, if anyone needs it:
var re:RegExp = /(\b_)(.*?_ID\b)/gim;
var increment:int = 0;
var output:Object = re.exec(strSource);
while (output != null)
{
    var replacement:String = output[1] + "replacement_" + increment.toString();
    strSource = strSource.substring(0, output.index) + replacement + strSource.substring(re.lastIndex, strSource.length);
    // The string length changed, so resume searching right after the inserted text
    re.lastIndex = output.index + replacement.length;
    output = re.exec(strSource);
    increment++;
}
Thanks anyway...

Leave off the g (global) flag and repeat the search with the appropriate replacement string. Loop until the search fails.

Not sure about actionscript, but in many other regex implementations you can usually pass a callback function that will execute logic for each match and replace.
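To illustrate the callback approach, here is a sketch in JavaScript, whose String.replace accepts a function in the same way. The helper name is made up, and the example uses the source text and regex from the question.

```javascript
// Replaces every match with the text returned by makeReplacement(match, n),
// where n counts the matches starting at 1.
function replaceEach(text, pattern, makeReplacement) {
  let count = 0;
  return text.replace(pattern, (match) => makeReplacement(match, ++count));
}

const output = replaceEach(
  "find it and replace what you find.",
  /\bfind\b/g,
  (match, n) => "replacement_" + n
);

console.log(output); // replacement_1 it and replace what you replacement_2.
```

Because the callback runs once per match, no manual bookkeeping of positions in the string is needed.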
