XPath regex combined with preg_match

I have a simple error in my code that I can't spot. Can you help me?
With this code in my PHP file:
$location = $xpath2->query("//script")->item(1)->textContent;
I got (select) this:
<script class="" type="text/javascript">
//<![CDATA[
var html = '';
var lat = 44.793530904744074;
var lang = 20.5364727973938;
if (GBrowserIsCompatible())
{
var map = new GMap2(document.getElementById("map_canvas"));
var ct = new GLatLng(lat, lang);
map.setCenter(ct, 15);
map.addControl( new GSmallMapControl() );
//map.addControl( new GHierarchicalMapTypeControl () );
var gm=new GMarker(ct);
if(html != '') {
GEvent.addListener(gm, "click", function() {
this.openInfoWindowHtml( html );
});
}
map.addOverlay(gm);
map.enableContinuousZoom();
map.enableInfoWindow();
}
//]]>
</script>
Then I try to fetch 'lat' and 'lang' with this code:
$location = $xpath2->query("//script")->item(1)->textContent;
preg_match('/var\s+lat\s+=\s+(\d+\.\d+)\s*;/', $location, $lat);
preg_match('/var\s+lang\s+=\s+(\d+\.\d+)\s*;/', $location, $lng);
$data['lat'] = $lat[1];
$data['lng'] = $lng[1];
But lat and lang always come out as 0, 0 when they should be 44.34534 and 20.5345.
Please help! Where do you think I'm going wrong? (My English is not very good, sorry for that.)

Maybe something like the patterns below. Beware, though, that you're trying to parse JavaScript with a regex.
preg_match('/(?:^|(?<=\s))var\s+lat \s* = \s* (?=[^;]*\d) ([+-]?\d*\.?\d*)\s*; /x', $location, $lat);
preg_match('/(?:^|(?<=\s))var\s+lang\s* = \s* (?=[^;]*\d) ([+-]?\d*\.?\d*)\s*; /x', $location, $lng);
Run sample: http://www.ideone.com/SEgVb
Or, just try to get more general information:
preg_match('/(?:^|(?<=\s))var\s+lat \s*=\s* ([^;]*) \s*; /x', ...

Try like this:
preg_match('/var\s+lat\s+=\s+([\d.-]+)/', $location, $lat);
preg_match('/var\s+lang\s+=\s+([\d.-]+)/', $location, $lng);
The [\d.-]+ character class matches any run of digits, dots, and minus signs (lat/lon can be negative).
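To make the capture step concrete, here is a minimal sketch of the same idea in JavaScript rather than PHP. The sample text and the helper name extractNumber are assumptions made for illustration. The pattern uses \s* around the equals sign so it also tolerates `var lat=...` with no spaces.

```javascript
// Hypothetical sample of the text returned by the XPath query.
const scriptText = [
  "var html = '';",
  "var lat = 44.793530904744074;",
  "var lang = 20.5364727973938;",
].join("\n");

// extractNumber is a hypothetical helper: it looks for `var <name> = <number>`
// and returns the number, or null when the declaration is not found.
function extractNumber(source, name) {
  const re = new RegExp("var\\s+" + name + "\\s*=\\s*(-?\\d+(?:\\.\\d+)?)");
  const m = re.exec(source);
  return m ? parseFloat(m[1]) : null;
}

const lat = extractNumber(scriptText, "lat");
const lng = extractNumber(scriptText, "lang");
console.log(lat, lng);
```

Checking for null before reading the capture group avoids silently ending up with an empty value when the script layout changes.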

Related

Regex which matches anchor tags wrapping an url

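I have the following text:
<a href="https://google.com">https://google.com</a>
<a href="https://google.com">website</a>
<a href="https://google.com"><em>https://google.com</em></a>
which I want to transform into:
https://google.com
<a href="https://google.com">website</a>
<em>https://google.com</em>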
by replacing anchor tags that contain URLs with just the URL.
I got this far: <a.*?href="http.*?>(.*?)<\/a> but I'm struggling to make the group stricter. It should check for the http string and allow wrapping tags such as <em>.
Any help is appreciated, thanks!

I came up with:
// your code goes here
var s =
'https://google.com\n' +
' website \n' +
'website\n' +
'<em>https://google.com</em>\n' +
' <em>https://google.com</em> \n' +
'<a href="https://www.google.com">\n' +
' <em>https://www.google.com</em>\n' +
'</a>\n';
var re = /<a\s+href="([^"]+)"\s*>\s*(.+?)\s*<\/a>/isg;
var new_s = s.replace(re, function(match, p1, p2) {
if (p2.indexOf('http') == -1)
return match; /* in effect, no substitution */
return p2;
});
console.log(new_s);
See demo

You can try using DOMParser
let str = `https://google.com
website
<em>https://google.com</em>`
let html = new DOMParser()
let parsed = html.parseFromString(str, 'text/html')
let final = [...parsed.getElementsByTagName('a')].map(tag=>{
let href = tag.href
if(tag.innerHTML.includes(tag.href.replace(/\/$/,''))){
return tag.innerHTML
}
return tag
})
console.log(final)
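For environments without a DOM, a stricter regex-based variant can be sketched as follows. This is only an illustration under assumptions: the function name unwrapUrlAnchors is made up, anchors are assumed not to be nested, and an anchor is unwrapped only when its visible text (with inner tags stripped) is a single http or https URL.

```javascript
// Hypothetical helper: replaces <a ...>URL</a> with its inner content,
// but leaves anchors alone when their visible text is not a URL.
function unwrapUrlAnchors(html) {
  const anchor = /<a\b[^>]*>([\s\S]*?)<\/a>/gi;
  return html.replace(anchor, (match, inner) => {
    // Strip wrapping tags such as <em> before testing the text.
    const text = inner.replace(/<[^>]+>/g, "").trim();
    return /^https?:\/\/\S+$/.test(text) ? inner.trim() : match;
  });
}

const input = [
  '<a href="https://google.com">https://google.com</a>',
  '<a href="https://google.com">website</a>',
  '<a href="https://google.com"><em>https://google.com</em></a>',
].join("\n");

const output = unwrapUrlAnchors(input);
console.log(output);
```

Testing the stripped text instead of the raw inner HTML is what lets wrapping tags such as <em> through while still rejecting link text like "website".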

req.params is working with $regex of mongodb [duplicate]

I have a find statement like this
collSession.find({"Venue.type": /.*MT.*/}).toArray(function (err, _clsSession)
{
console.log(_clsSession);
});
It returns results, but I need to use the value of a variable instead of the hardcoded value MT.
How can I achieve this?
Thanks.
UPDATE: I tried "/."+searchterm+"./"
It's not working.

Instead of using the literal syntax to create a regular expression, you can also use the RegExp constructor to build one from a string:
var searchPhrase = "MT";
var regularExpression = new RegExp(".*" + searchPhrase + ".*");
collSession.find({"Venue.type": regularExpression}) [...]

Replace /.*MT.*/ with new RegExp( ".*" + variable + ".*" )
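One caveat when the variable comes from user input: characters such as . or * are interpreted as regex syntax. A minimal sketch of escaping the term first is shown below. The helper names escapeRegExp and buildContainsQuery are assumptions for illustration, not part of MongoDB or its driver. Since an unanchored pattern already matches anywhere in the string, the surrounding .* is left out.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds a query object matching documents whose
// `field` contains `term` literally.
function buildContainsQuery(field, term) {
  return { [field]: new RegExp(escapeRegExp(term)) };
}

const query = buildContainsQuery("Venue.type", "MT");
console.log(query["Venue.type"].test("XMTY"));
```

The resulting object could then be passed to find in place of the hardcoded {"Venue.type": /.*MT.*/}.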

Try this:
var pattern = 'concatenate string' + here,
regexp = new RegExp(pattern); // note the capital E: the constructor is RegExp, not Regexp

Finally I got it from here:
it is "Venue.type": new RegExp(queryParams.et)

Take a look at this code (I'm using Mongoose):
exports.getSearchPosts = (req, res, next) => {
const keyword = req.body.keyword;
// Match any post whose title contains the keyword
Post.find({ postTitle: new RegExp( ".*" + keyword + ".*" ) }).then(posts => {
res.render('post/search', {
pageTitle: 'Search result for: ' + keyword,
posts: posts,
});
}).catch(err => console.log(err));
}
I think you will find it helpful.

How do I outdent by a specific amount of tabs?

I'm trying to create a function that I can use to outdent (versus indent) a specific amount.
Here is what I have so far. This removes all tabs at the beginning of the lines. I think I need to create a dynamic pattern or use a function but I'm stuck:
var outdentPattern:RegExp = /([\t ]*)(.+)$/gm;
function outdent(input:String, outdentAmount:String = "\t"):String {
var outdentedText:String = input.replace(outdentPattern, outdentAmount + "$2");
return outdentedText;
}
Here is test data:
<s:BorderContainer>
<html:htmlOverride><![CDATA[
<script>
var test:Boolean = true;
test = "string";
</script>]]>
</html:htmlOverride>
</s:BorderContainer>
The tests would be: remove one tab, remove two tabs, and so on.
Expected results at one tab would be:
<s:BorderContainer>
<html:htmlOverride><![CDATA[
<script>
var test:Boolean = true;
test = "string";
</script>]]>
</html:htmlOverride>
</s:BorderContainer>
And two tabs:
<s:BorderContainer>
<html:htmlOverride><![CDATA[
<script>
var test:Boolean = true;
test = "string";
</script>]]>
</html:htmlOverride>
</s:BorderContainer>
And three tabs with the inner tabs (whitespace) collapsing down:
<s:BorderContainer x="110" height="160" width="240" y="52">
<html:htmlOverride><![CDATA[
<script>
var test:Boolean = true;
</script>
]]></html:htmlOverride>
</s:BorderContainer>
Interesting note:
The editor on SO outdents when you click the code button and the code is already indented.

You could either construct a RegExp object from a template, or you could use the regular expression several times:
var temp:String = '^[\t ]{0,';
function outdent(input:String, amount:Number = 1):String {
return input.replace(new RegExp(temp + amount.toString() + '}', 'gm'), '');
}
Or:
var pattern:RegExp = /^[\t ]/gm;
function outdent(input:String, amount:Number = 1):String {
for (var i:Number = 0; i < amount; i++)
input = input.replace(pattern, '');
return input;
}
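The dynamic-pattern approach can also be sketched in JavaScript (the original is ActionScript). This version differs from the answers above in one way, which is an assumption on my part: it treats either one tab or a run of spaces as a single indent level. The spacesPerLevel parameter is hypothetical.

```javascript
// Remove up to `amount` indent levels from the start of every line.
// One level is a tab or `spacesPerLevel` spaces (an assumed convention).
function outdent(input, amount = 1, spacesPerLevel = 4) {
  const level = "(?:\\t| {" + spacesPerLevel + "})";
  const pattern = new RegExp("^" + level + "{0," + amount + "}", "gm");
  return input.replace(pattern, "");
}

const sample = "\t\tfirst\n    second\nthird";
console.log(JSON.stringify(outdent(sample, 1)));
console.log(JSON.stringify(outdent(sample, 2)));
```

Because the quantifier is {0,amount}, lines with less indentation than requested simply lose what they have and are otherwise left alone.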

Reg-ex query is too greedy

Consider the following snippet:
<Offering id=1 blah blah templateid=abc something=blah>
gretre
rtert
ret
tr
</Offering>
<Offering id=2 blah blah templateid=def something=blah>
gretre
rtert
ret
tr
</Offering>
<Offering id=3 blah blah templateid=ghi something=blah>
gretre
rtert
ret
tr
</Offering>
Given that all I know is the template id, I need to return the whole Offering node that contains it. i.e. for templateid=def, I need to return:
<Offering id=2 blah blah templateid=def something=blah>
gretre
rtert
ret
tr
</Offering>
I've tried all sorts but the closest I can get is something along the lines of (?s)<Offering.+?templateid=def.+?</Offering> which returns from the first offering until the end of the offering containing my template id. I understand why but nothing I've tried can fix it. I'm guessing lookarounds but I just can't get it right.
How can I return the whole offering node?

You could modify your regex using negation and I would probably use a word boundary as well.
<Offering[^>]*\btemplateid=def[^>]*>[^<]*</Offering>
If you have other tags inside of this tag, you could do ...
(?s)<Offering[^>]*\btemplateid=def.+?</Offering>

This should work but please notice that I escaped the / character, and you may not need to do that depending on what language you're using:
(<Offering[^>]* templateid=ghi [^>]*>[^<]*<\/Offering>)
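To show why the negated class fixes the over-matching, here is a small JavaScript sketch. The sample data is a shortened, assumed version of the snippet in the question, and findOffering is a hypothetical helper. Because [^>]* cannot cross the end of the opening tag, the templateid must be found inside the same tag where the match starts.

```javascript
const xml = [
  "<Offering id=1 blah templateid=abc something=blah>",
  "one",
  "</Offering>",
  "<Offering id=2 blah templateid=def something=blah>",
  "two",
  "</Offering>",
  "<Offering id=3 blah templateid=ghi something=blah>",
  "three",
  "</Offering>",
].join("\n");

// Hypothetical helper; templateId is assumed to contain only word characters.
function findOffering(source, templateId) {
  const re = new RegExp(
    "<Offering[^>]*\\btemplateid=" + templateId + "\\b[^>]*>[\\s\\S]*?</Offering>"
  );
  const m = re.exec(source);
  return m ? m[0] : null;
}

const found = findOffering(xml, "def");
console.log(found);
```

The lazy [\s\S]*? plays the role of (?s).+? and stops at the first closing tag, so nested Offering elements would still need a real parser.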

As you say you "need to return the whole Offering node", the arguably simpler, safer and more readable way would be a DOM parser. I've included examples of how you might do this in JavaScript and PHP below.
PHP
$doc = new DOMDocument();
@$doc->loadHTML($testStr); // The @ is only needed if you're loading HTML like in the example, which has repeated attributes and other things that could cause errors
$body = $doc->getElementsByTagName('body')->item(0);
$templateID = 'def';
$myNode = null;
foreach($body->childNodes as $node)
{
if($node->nodeName=='offering')
{
if($node->attributes->getNamedItem('templateid')->nodeValue == $templateID)
{
$myNode = $node;
}
}
}
//$id = $myNode->attributes->getNamedItem('id')->nodeValue;
//$html = $doc->saveHTML($myNode);
JavaScript
var testStr = document.getElementById('str_container').innerHTML;
var parser = new DOMParser();
var doc = parser.parseFromString(testStr,'text/html');
var templateID = 'def';
var myEl = null;
for(var i=0,c=doc.body.children.length;i<c;i++)
{
if(doc.body.children[i].getAttribute('templateid')===templateID)
{
myEl = doc.body.children[i];
}
}
//var id = myEl.id;
//var html = myEl.outerHTML;
console.log(myEl || 'not found');
JavaScript >= IE8
var testStr = document.getElementById('str_container').innerHTML;
var parser = new DOMParser();
var doc = parser.parseFromString(testStr,'text/html');
var templateID = 'def';
var myEl = doc.body.querySelector('offering[templateid='+templateID+']');
//var id = myEl.id;
//var html = myEl.outerHTML;
console.log(myEl || 'not found');

RegExp in Node Js not working: Cannot call method 'toString' of null

This exact function does not work in Node.js, even though it works fine in a regular browser. How can I make it work in Node.js?
Example: http://jsfiddle.net/wC53N/
var MyHereDoc = function() {
var bodyEmail = function(){
var test = 1;
/*HEREDOC
<div>
Hola1
<p>
hola2
</p>
</div>
HEREDOC*/
};
var here = "HEREDOC";
var reobj = new RegExp("/\\*"+here+"\\n[\\s\\S]*?\\n"+here+"\\*/", "m");
str = reobj.exec(bodyEmail).toString();
str = str.replace(new RegExp("/\\*"+here+"\\n",'m'),'').toString();
toPrint = str.replace(new RegExp("\\n"+here+"\\*/",'m'),'').toString();
document.getElementById("demo").innerHTML = toPrint;
}
MyHereDoc();
If you try to run this same code in Node.js you will get this error:
Cannot call method 'toString' of null
But why? Any idea how to make it work, or what I am doing wrong?
Thanks
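The error message means that exec returned null, which is what it does when the pattern finds no match. The thread does not confirm the cause. One possibility, offered only as an assumption, is that the function source seen by Node contains \r\n line endings or indentation before the closing tag, which the original pattern does not allow. A sketch that tolerates both and guards against null (extractHereDoc is a hypothetical name):

```javascript
// Returns the text between /*TAG and TAG*/ in a function's source,
// or null when no such comment is found.
function extractHereDoc(fn, tag) {
  const re = new RegExp(
    "/\\*" + tag + "\\r?\\n([\\s\\S]*?)\\r?\\n\\s*" + tag + "\\*/"
  );
  const m = re.exec(String(fn)); // null when nothing matches
  return m ? m[1] : null;
}

const bodyEmail = function () {
  /*HEREDOC
  <div>Hola</div>
  HEREDOC*/
};

const text = extractHereDoc(bodyEmail, "HEREDOC");
console.log(text === null ? "no heredoc found" : text.trim());
```

This relies on the function's string form keeping its comments, so it will not survive a minifier that strips them.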
