How to extract href value from an HTML response in Postman

I have been trying to figure out how to extract the value of an href attribute from an HTML response and have not had any luck.
I have the following response:
<body id="bodytag" class="taskTab">
<script></script>
<div id="downloads">
<div class="files">Files.pdf
</div>
</div>
</body>
I have gathered that I can use cheerio to load the HTML and potentially get the value, but the only thing I have managed to get is the text Files.pdf. What I need is the path in the href attribute so that I can store it in a variable to use in a subsequent request.
This is just one example of what I have tried:
const $ = cheerio.load(pm.response.text());
console.log($('.files', '#downloads').text());
I also tried to use xpath without any luck. Any help would be greatly appreciated.

Try
const $ = cheerio.load(pm.response.text()); // cheerio needs the body as a string, not the response object
console.log($('.files').attr('href'));
This should return the href of the element. See the cheerio documentation for attr().

I was close:
const $ = cheerio.load(pm.response.text());
var href = $('.files a').attr('href');
pm.environment.set('downloadLink', href);
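A slightly more defensive variant of the snippet above, as a sketch only: it fails loudly when no link is found instead of silently storing undefined. The name saveDownloadLink and the injected load and env parameters are made up for illustration; inside Postman they are assumed to be backed by cheerio.load and pm.environment.

```javascript
// Hypothetical helper: look up the link under .files and store its href.
// `load` stands in for cheerio.load and `env` for pm.environment; both are
// passed in so the function can also be tried outside Postman.
function saveDownloadLink(load, html, env) {
  const $ = load(html);
  const href = $('.files a').attr('href');
  if (!href) {
    // Without this guard the variable would be set to undefined.
    throw new Error('No link with an href found under .files');
  }
  env.set('downloadLink', href);
  return href;
}

// In a Postman test script, cheerio and pm are provided by the sandbox.
if (typeof pm !== 'undefined' && typeof cheerio !== 'undefined') {
  saveDownloadLink((html) => cheerio.load(html), pm.response.text(), pm.environment);
}
```

The stored value can then be referenced in the next request as {{downloadLink}}.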

How to inject HTML via Tag-Manager on specific place?

I have to inject a code snippet at a specific place in an HTML page.
For example:
<section class="newsletter">
content content content
</section>
Now I want Tag Manager to inject other code in place of the newsletter section:
<div class="injector">
content content content
</div>
I've tried some scripts, but none of them showed the content. The custom HTML was loaded, but it did not replace the section.
How should I do this?
Thank you!
You can use a custom script to append the content to an HTML tag,
for example, with the provided data, using jQuery.
The only thing you have to take care of is that the function actually gets executed; in this case I'm using an immediately invoked anonymous function.
<script>
(function () {
  // Build the heading and the wrapper with jQuery.
  var h1 = $('<h1>').text('title');
  var div = $('<div>')
    .attr('class', 'injection')
    .text('content content content')
    .append(h1);
  // Append the new element to the target container.
  $('.post-text').append(div);
})();
</script>
I made it this way:
<script>
jQuery('div#container_to_hang_on').append('<div id="injector"></div>');
</script>
But you, @Kemen Paulos Plaza, said it's not recommended at all. Why? Is there a better way?
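Both snippets above append content rather than replace the section the question asks about. A minimal sketch of a replacement, assuming the browser supports Element.replaceWith (Internet Explorer does not) and that the tag fires after the section exists in the DOM. The function name replaceNewsletter is hypothetical, and the code sticks to ES5 syntax on the assumption that it goes inside a script element in a Custom HTML tag.

```javascript
// Hypothetical helper: swap <section class="newsletter"> for <div class="injector">.
// `doc` is passed in so the function can be tried outside a browser; on a real
// page you would call replaceNewsletter(document).
function replaceNewsletter(doc) {
  var section = doc.querySelector('section.newsletter');
  if (!section) {
    return false; // Target not in the DOM (yet), e.g. the tag fired too early.
  }
  var div = doc.createElement('div');
  div.className = 'injector';
  div.textContent = 'content content content';
  section.replaceWith(div); // Removes the section and puts the div in its place.
  return true;
}

if (typeof document !== 'undefined') {
  replaceNewsletter(document);
}
```

If the function returns false, the trigger may be firing before the page is rendered; a DOM Ready trigger is one thing to try.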

Postman: How to extract value from html response and pass it on to next request in postman

Example url: https://abc.xyz.com/m#
HTML Response:
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
.
.
</head>
<body class="abc">
.
.
.
<script>xab.start('{\"first\":\"123xyz\",\"second\":\"abc123\",\"third"..;</script>
</div>
</body>
</html>
From the response above, I want to extract the value of the parameter second (\"second\":\"abc123\") and pass it on to the next request.
It would be simpler if the response were JSON, but in this case it is an HTML response.
I was able to do this in JMeter using a regex but am having a hard time doing it in Postman.
Thanks!
You could look at using Cheerio to get the values, it's one of the built in modules within the Postman native application.
You could add something like this example, to extract the value from the HTML.
This is getting the value from the title html tag, of the jsonplaceholder page, then setting it as an environment variable:
const $ = cheerio.load(pm.response.text())
pm.test("it should return a title", () => {
pm.expect($('title').text()).to.not.be.empty
})
pm.environment.set('title', $('title').text())
I'm sure you could use this to get the value you need from your example.
Here is my example of how I get data from a script tag in an HTML response:
const $ = cheerio.load(pm.response.text())
var script = ($('script').text().replace("window.__STATE__ = ",""));
var jsonData = JSON.parse(script);
var uid = pm.environment.get("uid");
if (jsonData.stared == uid) pm.environment.set('reactArticle', false);
else {pm.environment.set('reactArticle', true);}
Try parsing the relevant part of the response with JSON.parse, or cut the value out with JavaScript's string methods such as slice or substring, whichever you are more comfortable with.
After that, save the value in a variable (for example with pm.environment.set) so that you can use it in other requests.
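Since the question mentions a regex working in JMeter, here is a rough regex-based sketch for the same job in plain JavaScript. The helper name extractEscapedValue is made up, the sample HTML is a shortened stand-in for the real response, and the key is assumed to contain only letters and digits (it is not regex-escaped).

```javascript
// Hypothetical helper: pull the value of a key out of JSON embedded in raw HTML,
// whether or not the quotes are backslash-escaped (\"key\":\"value\").
function extractEscapedValue(html, key) {
  // \\?" matches a quote with an optional leading backslash.
  const pattern = new RegExp('\\\\?"' + key + '\\\\?"\\s*:\\s*\\\\?"([^"\\\\]*)');
  const match = html.match(pattern);
  return match ? match[1] : null;
}

// Shortened sample; String.raw keeps the backslashes as they appear in the response.
const sampleHtml = String.raw`<script>xab.start('{\"first\":\"123xyz\",\"second\":\"abc123\"}');</script>`;
console.log(extractEscapedValue(sampleHtml, 'second')); // abc123

// In a Postman test script, pm is provided by the sandbox.
if (typeof pm !== 'undefined') {
  const second = extractEscapedValue(pm.response.text(), 'second');
  if (second !== null) {
    pm.environment.set('second', second);
  }
}
```

The next request can then use {{second}} wherever the value is needed.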

Regex for HTML RESPONSE BODY present under div tag

I need to build a regex to extract the contents of the value attribute,
i.e. "f70a8c3d0a6cbe2e235c7fd1dd27d052df7412ea"
HTML response body:
Note: I have pasted just a small part of the response, but the formToken key is unique.
<div class="hidden">
<input name="formToken" type="hidden"
value="f70a8c3d0a6cbe2e235c7fd1dd27d052df7412ea"
/>
</div>
I wrote the below regex but it returned nothing:
regex("formToken" type="hidden" value="([^"]*)"/>).find(0).exists, found nothing
Can you try this?
regex("type="hidden".*value="(.*?)[ \t]*"/>).find(0).exists
Instead of a regex, you could use a css selector check which is probably way easier once you have ids or css classes to search for.
Thank you all....I was able to get formToken using css
.check(css("input[name='formToken']", "value").saveAs("formTokex"))
Works like this for me:
.exec(http("request_1")
.get("<<<<YOUR_URL>>>>>")
.check(css("form[name='signInForm']", "action").saveAs("urlPath"))
and later printing it:
println(session( "urlPath" ).as[String])
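On the original regex attempt: one likely reason it found nothing is that the markup breaks the line between type="hidden" and value=, while the pattern expects single spaces. The sketch below shows the difference in plain JavaScript. It assumes the input tag reads name="formToken" (with the closing quote), and that Gatling's regex check treats whitespace the same way; firstGroup is a made-up helper name.

```javascript
// Sample markup from the question, with attributes split across lines.
const sampleBody = [
  '<div class="hidden">',
  '<input name="formToken" type="hidden"',
  'value="f70a8c3d0a6cbe2e235c7fd1dd27d052df7412ea"',
  '/>',
  '</div>',
].join('\n');

// A literal space between attributes does not match the line break in the markup.
const strict = /name="formToken" type="hidden" value="([^"]*)"/;
// \s+ accepts any run of whitespace, including line breaks.
const tolerant = /name="formToken"\s+type="hidden"\s+value="([^"]*)"/;

// Hypothetical helper: return the first capture group or null.
function firstGroup(pattern, text) {
  const match = pattern.exec(text);
  return match ? match[1] : null;
}

console.log(firstGroup(strict, sampleBody));   // null
console.log(firstGroup(tolerant, sampleBody)); // f70a8c3d0a6cbe2e235c7fd1dd27d052df7412ea
```

The CSS selector check is still the sturdier choice, since it does not depend on attribute order or whitespace.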

Select every text node in a HTML document except script nodes with XPath

I am currently writing a web crawler with Scrapy, and I would like to fetch all the text displayed on the screen of every HTML document with a single XPath query.
Here is the HTML I'm working with:
<body>
<div>
<h1>Main title</h1>
<div>
<script>var grandson;</script>
<p>Paragraph</p>
</div>
</div>
<script>var child;</script>
</body>
As you can see, there are some script tags that I want to filter when getting the text inside the body tag
Here is my first XPath query and its result:
XPath: /body/*//text()
Result: Main title / var grandson; / Paragraph / var child;
This is not good because it also fetches the text inside the script tag.
Here is my second try:
XPath: /body/*[not(self::script)]//text()
Result: Main title / var grandson; / Paragraph
Here, the last script tag (which is body's child) is filtered, but the inner script is not.
How would you filter all the script tags ? Thanks in advance.
Try
//*[not(self::script)]/text()
This XPath does what you want:
.//text()[not(parent::script)]
Here we check what the parent of each text node is.
A more interesting example, which I can use on any element that contains HTML:
.//text()[not(ancestor::script|ancestor::style|ancestor::noscript)]
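To make the logic of the ancestor filter concrete, here is a plain-JavaScript tree walk that mirrors it. This is an illustration only, not how Scrapy evaluates XPath: the node shape ({ tag, children }, with plain strings as text nodes) and the name visibleText are invented for the example.

```javascript
// Hypothetical minimal tree for the HTML in the question.
const body = {
  tag: 'body',
  children: [
    { tag: 'div', children: [
      { tag: 'h1', children: ['Main title'] },
      { tag: 'div', children: [
        { tag: 'script', children: ['var grandson;'] },
        { tag: 'p', children: ['Paragraph'] },
      ] },
    ] },
    { tag: 'script', children: ['var child;'] },
  ],
};

const SKIPPED = new Set(['script', 'style', 'noscript']);

// Collect text nodes, never descending into a skipped element. Skipping the
// whole subtree is what not(ancestor::script|...) expresses in XPath.
function visibleText(node, out = []) {
  if (typeof node === 'string') {
    out.push(node);
  } else if (!SKIPPED.has(node.tag)) {
    node.children.forEach((child) => visibleText(child, out));
  }
  return out;
}

console.log(visibleText(body)); // [ 'Main title', 'Paragraph' ]
```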

How to filter the html markups when render a template with jinja2?

I'm building a Django project that uses Jinja2 for its templates. Some page content is submitted by the client through a WYSIWYG editor, and everything works fine on the detail pages.
But the list pages, which show a slice of the content, are wrong.
My code:
<div class="summary ">
<div class="content">{{ question.content[:200]|e}}...</div>
</div>
But the output is:
<p>what i want to show here is raw text without markups</p>...
The expected result is that the html markups like <p></p> <section>.... are gone (filtered or eliminated) and only the raw text shows!
So how can I fix it? Thanks in advance!
Use striptags filter:
striptags(value)
Strip SGML/XML tags and replace adjacent whitespace
by one space.
<div class="content">{{ question.content|striptags}}...</div>
Jinja2 striptags filter test will also help you to understand how it works.
Hope that helps.