Fixing the URL in a string using replace - ColdFusion

I am making an HTTP call with the cfx_http5 tag, as it is faster and better than cfhttp. The links in the response come back as:
sort A
So I add the base URL using replace:
<cfset lnk = ReplaceNoCase(objget, 'href="', 'href="http://mysubdomain.domain.com/', 'all')>
Some of the links come out correct, but for others the base URL is prepended to a link that already has it:
http://mysubdomain.domain.com/http://mysubdomain.domain.com/e9.asp?rpttype=298&sortBy=1&sortOrder=2
So I want to make it conditional: if the base URL is already there, do not add it; otherwise add it.
Any ideas?

You can use regular expressions with negative lookahead like this:
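<cfset lnk = reReplaceNoCase(objget, 'href="(?!https?://)', 'href="http://mysubdomain.domain.com/', 'ALL')>
This will work for both types of links: the base URL (with its trailing slash) is only added when the href does not already start with http:// or https://.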
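For anyone who wants to check the lookahead outside ColdFusion, here is a minimal sketch of the same idea in Python. The sample markup and the function name absolutize_hrefs are made up for illustration; only the base URL comes from the question.

```python
import re

BASE = "http://mysubdomain.domain.com/"  # base URL from the question

def absolutize_hrefs(html: str, base: str = BASE) -> str:
    """Prefix `base` to every href that does not already start with http(s)://."""
    # (?!https?://) is a negative lookahead: match href=" only when an
    # absolute URL does NOT follow it.
    pattern = re.compile(r'href="(?!https?://)', re.IGNORECASE)
    # A function replacement keeps the matched text as-is and avoids any
    # backslash interpretation in `base`.
    return pattern.sub(lambda m: m.group(0) + base, html)

# Hypothetical response body: one relative link, one already absolute.
sample = (
    '<a href="e9.asp?rpttype=298&sortBy=1&sortOrder=2">sort A</a> '
    '<a href="http://mysubdomain.domain.com/e9.asp?rpttype=298">sort B</a>'
)
result = absolutize_hrefs(sample)
print(result)
```

Running it on the sample leaves the absolute link alone and prefixes only the relative one, so no doubled base URL appears.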

Simply create a conditional if statement for each individual link value:
// lnk holds a single link, which may be relative or already absolute
lnk = 'e9.asp?rpttype=298&sortBy=1&sortOrder=2';
if (!findNoCase('mysubdomain.domain.com', lnk)) {
// only prefix the base URL when it is not already present
lnk = 'http://mysubdomain.domain.com/' & lnk;
}

Related

How to get the value of a specific string in ColdFusion with regex

I am trying to get the URL value out of the following string with ColdFusion. I am using a list function, but I am lost. How do I get it? Please advise.
<cfsavecontent variable="foo">
function modalwindow() {
url = "http://www.idea.com?mycode=9&pagenum=-1&sortBy=1&sortOrder=1";
mywin = window.open (url,"win",'toolbar=yes,location=yes,resizable=yes,copyhistory=yes,scrollbars=yes,width=878,height=810');
mywin.focus();
return false;
}
</cfsavecontent>
<cfset a = listgetat(foo,2,"url")>
<cfoutput>#a#</cfoutput>
But I am getting weird results. I need to fetch the URL value.
This regex will return the URL and accounts for the whitespace being optional, however I should add the disclaimer that this is a little brittle and probably not a good way of going about whatever it is you're after for several reasons.
#reReplaceNoCase( foo, '.*url\s*=\s*"(.*?)".*', '\1' )#
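As a rough cross-check of that pattern, here is the same capture written in Python. It searches for the assignment instead of replacing the whole string, which avoids depending on how "." treats newlines. The function name extract_url is invented for this sketch, and the window.open arguments are shortened.

```python
import re
from typing import Optional

# The JavaScript from the question, with the window.open options shortened.
js_source = '''
function modalwindow() {
url = "http://www.idea.com?mycode=9&pagenum=-1&sortBy=1&sortOrder=1";
mywin = window.open (url,"win",'toolbar=yes');
mywin.focus();
return false;
}
'''

def extract_url(text: str) -> Optional[str]:
    """Return the quoted value assigned to `url`, or None if there is none."""
    # \s* makes the whitespace around "=" optional; (.*?) stops at the
    # first closing double quote.
    match = re.search(r'\burl\s*=\s*"(.*?)"', text, re.IGNORECASE)
    return match.group(1) if match else None

print(extract_url(js_source))
```

It is just as brittle as the ColdFusion one-liner: a second variable named url, or single quotes around the value, would break it.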

Variable name in the method of ColdFusion Object

I am trying to set a variable in a cffunction.
The result is this:
<cfset local.layouts.appLayout = '../../app/layouts' & local.appController.new()>
The above code works. It appends the return value of the new method of appController to the path and assigns the result to local.layouts.appLayout. That is what I need it to do.
My problem is that I need to dynamically assign the method portion of that statement. I have another variable coreRoute.action that equals "new" in that function but I cannot seem to get the syntax right.
I have tried this:
<cfset local.layouts.appLayout = '../../app/layouts' & local.appController.coreRoute.action()>
That does not work and I can see why. I have also tried this:
<cfset local.layouts.appLayout = '../../app/layouts' & local.appController & #coreRoute.action# & '()'>
I have tried many variations of this syntax and I just cannot get it right.
Does anyone have any ideas about how to do this? I am stuck.
Thanks in advance for any help.
UPDATE: With Todd Sharp's help I ended up using this and it worked great:
<cfinvoke component="#local.appController#" method="#coreRoute.action#" returnvariable="local.act">
<cfset local.layouts.appLayout = '../../app/layouts' & local.act>
You should look into using <cfinvoke> for dynamic method invocation. Try a Google search for "coldfusion dynamic method invocation" - here's one of the top results:
http://www.bennadel.com/blog/1320-ColdFusion-CFInvoke-Eliminates-The-Need-For-Evaluate-When-Dynamically-Executing-User-Defined-Functions.htm
In addition, if you want to do it entirely in script, you can, using this approach:
dynFn = this["foo" & bar];
dynFn(stuff);
This is in a cfc, if you're doing it from outside the cfc or not using a cfc at all, just change "this" to wherever your method is.
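The same look-up-by-name idea exists in most dynamic languages. Purely as an illustration (this is Python, not ColdFusion), here is a sketch where AppController, its return value and call_action are hypothetical stand-ins for the question's component and route variable.

```python
class AppController:
    """Hypothetical stand-in for the question's appController component."""

    def new(self) -> str:
        # The returned path fragment is invented for this example.
        return "/new.cfm"


def call_action(controller, action: str):
    """Look up a method by its name at runtime and call it."""
    method = getattr(controller, action, None)
    if not callable(method):
        raise AttributeError(f"no such action: {action}")
    return method()


# The action name would come from something like coreRoute.action.
layout = "../../app/layouts" + call_action(AppController(), "new")
print(layout)
```

Checking that the attribute is callable before invoking it plays the same role as letting cfinvoke fail on an unknown method name, but with a clearer error.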

How to use Regexp to retrieve URL where link text has number in the bracket

I want to retrieve all links from the page where the link text is in the format below.
(10)
I tried the method below, but it didn't work.
There are many similar links on the same page. The numbers are not in sequence and many of them are repeated, so I want to first collect those web elements and then read the URL from their attribute.
Similar to this page.
http://www.dmoz.org/search?q=surat&start=0&type=more&all=no&cat=
I want the link after we click on those numbers present in the bracket.
List<WebElement> catLinks = driver.findElements(By.xpath("//html/body/div[@id='doc']/div[@id='bd-cross']/ol/li[1]/a[2]"));
for (WebElement catLink : catLinks) {
System.out.println(nLink + ". " + catLink.getAttribute("href"));
}
The link XPath is:
//html/body/div[@id='doc']/div[@id='bd-cross']/ol/li[1]/a[2]
Using the above XPath I can get the first link's URL. What can I do to get the URLs of all the links?
I tried using a regexp:
//html/body/div[@id='doc']/div[@id='bd-cross']/ol/li[\\d\\.\\*]/a[2]
But it is not working.
I also tried using below method.
List<WebElement> catLinks = driver.findElements(By.linkText("\\d\.\*"));
for (WebElement catLink : catLinks) {
System.out.println(nLink + ". " + catLink.getAttribute("href"));
}
but no luck.
Now what can I do to get all the link URLs? I tried using regex:
//html/body/div[@id='doc']/div[@id='bd-cross']/ol/li[\\d\\.\\*]/a[2]
Nope. Use:
/html/body/div[@id='doc']/div[@id='bd-cross']/ol/li/a[2]
Less is more.
You don't need to include the /html/body/ in the xpath locator, this will just make it more fragile if the page structure changes. Try this much simpler xpath locator: id('bd-cross')//li/a[2]
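To see why dropping the index is enough, here is a small sketch using Python's standard-library ElementTree, which supports this subset of XPath. The markup is an invented, simplified version of the page structure described in the question, not the real page.

```python
import xml.etree.ElementTree as ET

# Invented markup mirroring the structure from the question:
# each <li> holds a category link followed by a "(n)" link.
page = ET.fromstring("""
<html><body><div id="doc"><div id="bd-cross"><ol>
  <li><a href="/cat/1">Cat one</a> <a href="/search?c=1">(10)</a></li>
  <li><a href="/cat/2">Cat two</a> <a href="/search?c=2">(3)</a></li>
  <li><a href="/cat/3">Cat three</a> <a href="/search?c=3">(10)</a></li>
</ol></div></div></body></html>
""")

# li[1] restricts the match to the first list item only.
first_only = page.findall(".//div[@id='bd-cross']/ol/li[1]/a[2]")

# Without the index, the step matches every li, so a[2] is taken from each.
all_items = page.findall(".//div[@id='bd-cross']/ol/li/a[2]")

hrefs = [a.get("href") for a in all_items]
print(len(first_only), hrefs)
```

A position predicate only narrows a step; XPath 1.0 predicates are not regular expressions, which is why the \\d attempt matched nothing.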

How do I get a number in a link out of HTML code with preg_match?

I need to find out, whether in an array there is a specific HTML code. The array contains HTML codes and I need to get a number, that is included in a link.
This is what I am searching for (the number 10 is the number I want):
class = "active" href = "http://www.example.com/something-10
So I tried the following using preg_match:
if(preg_match('/class = "active" href = "http://www.example.com/something-(.*)/',$array["crawler"],$arr)) { print_r($arr,true); }
Unfortunately this gives me nothing as a result, so I guess something is wrong with my preg_match. I already checked all the manuals, but I still don't get what I am doing wrong.
Could someone help me with this? Thank you!
phpheini
Aside from advising you to not parse HTML using regular expressions, your particular regular expression needs different delimiters:
preg_match('~class = "active" href = "http://www\.example\.com/something-(\d+)~', ...)
Alternatively, you could have escaped the slashes within the regex, but that leads to LSS (leaning slash syndrome):
preg_match('/class = "active" href = "http:\/\/www\.example\.com\/something-(.*)/', ...)
And that's just ugly.
You should have gotten an error, if your error_reporting is turned on.
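For comparison, here is the same capture in a language whose regex API takes no delimiters, so the slashes in the URL need no special treatment. This is only a sketch: the two HTML snippets and the function name active_number are invented, standing in for the entries of the crawler array.

```python
import re

# Hypothetical array entries; only the first contains the active link.
snippets = [
    '<a class = "active" href = "http://www.example.com/something-10">x</a>',
    '<a href = "http://www.example.com/other">y</a>',
]

# Dots are escaped so they match literally; (\d+) captures only the digits.
pattern = re.compile(
    r'class = "active" href = "http://www\.example\.com/something-(\d+)'
)

def active_number(html):
    """Return the number at the end of the active link, or None if absent."""
    m = pattern.search(html)
    return int(m.group(1)) if m else None

numbers = [active_number(s) for s in snippets]
print(numbers)
```

Using (\d+) instead of (.*) also stops the capture at the end of the number rather than swallowing the rest of the string.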

How to use regex in selenium locators

I'm using selenium RC and I would like, for example, to get all the links elements with attribute href that match:
http://[^/]*\d+com
I would like to use:
sel.get_attribute( '//a[regx:match(@href, "http://[^/]*\d+.com")]/@name' )
which would return a list of the name attribute of all the links that match the regex.
(or something like it)
thanks
The answer above is probably the right way to find ALL of the links that match a regex, but I thought it'd also be helpful to answer the other part of the question, how to use regex in Xpath locators. You need to use the regex matches() function, like this:
xpath=//div[matches(@id,'che.*boxes')]
(this, of course, would click the div with 'id=checkboxes', or 'id=cheANYTHINGHEREboxes')
Be aware, though, that the matches function is not supported by all native browser implementations of Xpath (most conspicuously, using this in FF3 will throw an error: invalid xpath[2]).
If you have trouble with your particular browser (as I did with FF3), try using Selenium's allowNativeXpath("false") to switch over to the JavaScript Xpath interpreter. It'll be slower, but it does seem to work with more Xpath functions, including 'matches' and 'ends-with'. :)
You can use the Selenium command getAllLinks to get an array of the ids of the links on the page. You can then loop through them and check each href using getAttribute, which takes the locator followed by an @ and the attribute name. For example, in Java this might be:
String[] allLinks = selenium.getAllLinks();
List<String> matchingLinks = new ArrayList<String>();
for (String linkId : allLinks) {
// attribute locator format is "locator@attributeName"
String linkHref = selenium.getAttribute("id=" + linkId + "@href");
// String.matches() must match the whole href, not just a part of it
if (linkHref.matches("http://[^/]*\\d+.com")) {
matchingLinks.add(linkId);
}
}
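The filtering step on its own can be sketched without a browser. Below, a plain list of made-up hrefs stands in for what getAttribute would return, and fullmatch mirrors Java's String.matches. One assumption to flag: the dot before "com" is escaped here, which the pattern in the question does not do.

```python
import re

# Assumption: the dot before "com" is meant literally, so it is escaped.
HREF_PATTERN = re.compile(r"http://[^/]*\d+\.com")

def matching_links(hrefs):
    """Keep only hrefs that match the pattern in full."""
    # fullmatch() requires the entire string to match, like String.matches().
    return [h for h in hrefs if HREF_PATTERN.fullmatch(h)]

# Made-up hrefs standing in for values read from the page.
hrefs = [
    "http://site42.com",
    "http://site.com",
    "http://site42.com/page",
    "http://shop7.com",
]
print(matching_links(hrefs))
```

If links with a path after the host should also count, search() with an anchored pattern would be the more forgiving choice.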
A possible solution is to use sel.get_eval() and write a JS script that returns a list of the links. something like the following answer:
selenium: Is it possible to use the regexp in selenium locators
Here's some alternate methods as well for Selenium RC. These aren't pure Selenium solutions, they allow interaction with your programming language data structures and Selenium.
You can also get the HTML page source, then run a regular expression over it to return a match set of links. Use regex grouping to separate out URLs, link text/ID, etc., and you can then pass them back to Selenium to click on or navigate to.
Another method is to get the HTML page source or innerHTML (via DOM locators) of a parent/root element, then convert the HTML to XML as a DOM object in your programming language. You can then traverse the DOM with the desired XPath (with regular expressions or not) and obtain a nodeset of only the links of interest. From there, parse out the link text/ID or URL and pass it back to Selenium to click on or navigate to.
Upon request, I'm providing examples below. It's mixed languages since the post didn't appear to be language specific anyways. I'm just using what I had available to hack together for examples. They aren't fully tested or tested at all, but I've worked with bits of the code before in other projects, so these are proof of concept code examples of how you'd implement the solutions I just mentioned.
//Example of element attribute processing by page source and regex (in PHP)
$pgSrc = $sel->getPageSource();
//simple hyperlink extraction via regex below, replace with better regex pattern as desired
preg_match_all("/<a\s[^>]*href=\"([^\"]*)\"/",$pgSrc,$matches,PREG_PATTERN_ORDER);
//$matches is a 2D array, $matches[0] is array of whole string matched, $matches[1] is array of what's in parenthesis
//you either get an array of all matched link URL values in parenthesis capture group or an empty array
$links = count($matches) >= 2 ? $matches[1] : array();
//now do as you wish, iterating over all link URLs
//NOTE: these are URLs only, not actual hyperlink elements
//Example of XML DOM parsing with Selenium RC (in Java)
String locator = "id=someElement";
String htmlSrcSubset = sel.getEval("this.browserbot.findElement(\""+locator+"\").innerHTML");
//using JSoup XML parser library for Java, see jsoup.org
Document doc = Jsoup.parse(htmlSrcSubset);
/* once you have this document object, can then manipulate & traverse
it as an XML/HTML node tree. I'm not going to go into details on this
as you'd need to know XML DOM traversal and XPath (not just for finding locators).
But this tutorial URL will give you some ideas:
http://jsoup.org/cookbook/extracting-data/dom-navigation
the example there seems to indicate first getting the element/node defined
by content tag within the "document" or source, then from there get all
hyperlink elements/nodes and then traverse that as a list/array, doing
whatever you want with an object oriented approach for each element in
the array. Each element is an XML node with properties. If you study it,
you'd find this approach gives you the power/access that WebDriver/Selenium 2
now gives you with WebElements but the example here is what you can do in
Selenium RC to get similar WebElement kind of capability
*/
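To make the "parse the source, then pick out the links" idea a bit more concrete, here is a small untested-against-Selenium sketch using only Python's standard-library HTML parser. The fragment stands in for the innerHTML you would fetch from the browser; the class name LinkCollector and the sample links are invented.

```python
import re
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs for anchors whose href matches a regex."""

    def __init__(self, href_pattern):
        super().__init__()
        self.href_pattern = re.compile(href_pattern)
        self.links = []        # collected (href, text) tuples
        self._current = None   # href of the matching <a> being read, if any
        self._text = []        # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and self.href_pattern.search(href):
                self._current = href
                self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append((self._current, "".join(self._text).strip()))
            self._current = None

# Stand-in for the innerHTML returned by the browser.
fragment = (
    '<div id="someElement">'
    '<a href="http://a1.com/x">One</a><a href="/local">Two</a>'
    '</div>'
)
collector = LinkCollector(r"^http://[^/]*\d+\.com")
collector.feed(fragment)
print(collector.links)
```

The collected hrefs or link texts are what you would hand back to Selenium as locators or URLs.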
Selenium's By.Id and By.CssSelector methods do not support Regex and By.XPath only does where XPath 2.0 is enabled. If you want to use Regex, you can do something like this:
void MyCallingMethod(IWebDriver driver)
{
//Search by ID:
string attrName = "id";
//Regex = 'a number that is 1-10 digits long'
string attrRegex= "[0-9]{1,10}";
SearchByAttribute(driver, attrName, attrRegex);
}
IEnumerable<IWebElement> SearchByAttribute(IWebDriver driver, string attrName, string attrRegex)
{
List<IWebElement> elements = new List<IWebElement>();
//Allows spaces around equal sign. Ex: id = 55
string searchString = attrName +"\\s*=\\s*\"" + attrRegex +"\"";
//Search page source
MatchCollection matches = Regex.Matches(driver.PageSource, searchString, RegexOptions.IgnoreCase);
//iterate over matches
foreach (Match match in matches)
{
//Get exact attribute value
Match innerMatch = Regex.Match(match.Value, attrRegex);
string cssSelector = "[" + attrName + "='" + innerMatch.Value + "']";
//Find element by exact attribute value
elements.Add(driver.FindElement(By.CssSelector(cssSelector)));
}
return elements;
}
Note: this code is untested. Also, you can optimize this method by figuring out a way to eliminate the second search.
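One possible way to drop the second search is to put a capture group around the value, so the exact attribute value comes out of the first match. Here is a sketch of just that step in Python; the function name selectors_for_attribute and the sample source are invented, and the resulting selectors would still need to be passed to the driver.

```python
import re

def selectors_for_attribute(page_source, attr_name, attr_regex):
    """Build CSS attribute selectors for values of attr_name matching attr_regex."""
    # The outer group captures the exact value, so no second regex pass is
    # needed. \s* allows spaces around the equals sign, e.g. id = "55".
    pattern = re.compile(
        r'\b' + re.escape(attr_name) + r'\s*=\s*"(' + attr_regex + r')"',
        re.IGNORECASE,
    )
    selectors = []
    for m in pattern.finditer(page_source):
        selector = f'[{attr_name}="{m.group(1)}"]'
        if selector not in selectors:  # skip repeated values
            selectors.append(selector)
    return selectors

# Made-up page source: two numeric ids (one repeated) and one non-numeric id.
source = '<div id="55"></div><span id = "abc"></span><p id="1234567"></p><b id="55"></b>'
print(selectors_for_attribute(source, "id", "[0-9]{1,10}"))
```

Quoting the value inside the selector matters for numeric values, since an unquoted attribute value that starts with a digit is not a valid CSS identifier.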