RegExing a viewstate - regex

First of all, what is a viewstate?
In test automation I probably need to correlate this value, since it is unique for every user logging in?
How can I get the 'value' / token below using regex?
<div>
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTUxOTg3NDM2NGQYCAUkY3RsMDAkTWFpbk1lbnUkTWVudSRjdGwwMSRjdGwwMCRNZW51DxQrAA5kZGRkZGRkPCsABQACBWRkZGYC/////w9kBSRjdGwwMCRNYWluTWVudSRNZW51JGN0bDAzJGN0bDAwJE1lbnUPFCsADmRkZGRkZGQ8KwAEAAIEZGRkZgL/////D2QFJGN0bDAwJE1haW5NZW51JE1lbnUkY3RsMDQkY3RsMDAkTWVudQ8UKwAOZGRkZGRkZDwrAAcAAgdkZGRmAv////8PZAUkY3RsMDAkTWFpbk1lbnUkTWVudSRjdGwwNiRjdGwwMCRNZW51DxQrAA5kZGQCAmRkZDwrAAkAAglkZGRmAv////8PZAUkY3RsMDAkTWFpbk1lbnUkTWVudSRjdGwwMiRjdGwwMCRNZW51DxQrAA5kZGQCAmRkZDwrAAsAAgtkZGRmAv////8PZAUpY3RsMDAkRm9vdGVyUmVnaW9uJGN0bDAwJEZvb3RlckxpbmtzJExpc3QPD2ZkZAUTY3RsMDAkY3RsMDMkUnNzTGlzdA8PZmRkBSRjdGwwMCRNYWluTWVudSRNZW51JGN0bDA1JGN0bDAwJE1lbnUPFCsADmRkZGRkZGQ8KwAJAAIJZGRkZgL/////D2R3kjxauWd2eu+C/bmZz+/bI7YRkg==" />

Read this: RegEx match open tags except XHTML self-contained tags
then if you still want to have a go, use this:
(?<=input )(?:.*)(value\=\".*\")
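If a regex is still the tool of choice, a tighter pattern avoids the greedy .* in the expression above, which only works while value is the last quoted attribute on the line. The following is a minimal Python sketch, not a definitive implementation: the names VIEWSTATE_RE and extract_viewstate are hypothetical, the sample value is a shortened stand-in for the real one, and the pattern assumes the name attribute appears before value inside the same tag.

```python
import re

# Shortened stand-in for the real page; the actual __VIEWSTATE value is much longer.
html = ('<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" '
        'value="/wEPDwUKMTUxOTg3NDM2NGQYCAUk+abc==" />')

# Anchor on the field name, stay inside the same tag ([^>]*?), then capture
# everything up to the closing quote. [^"]* cannot run past the end of the
# attribute the way a greedy .* can.
VIEWSTATE_RE = re.compile(r'name="__VIEWSTATE"[^>]*?\bvalue="([^"]*)"')


def extract_viewstate(page):
    """Return the __VIEWSTATE value, or None if the field is absent."""
    match = VIEWSTATE_RE.search(page)
    return match.group(1) if match else None


print(extract_viewstate(html))
```

The captured string can then be sent back with the next request in the test script.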

Related

Can't read the XML node elements in ColdFusion

I'm trying to read some values from the XML file which I created, but it gives me the following error:
coldfusion.runtime.UndefinedElementException: Element MYXML.UPLOAD is undefined in XMLDOC.
Here is my code
<cffile action="read" file="#expandPath("./config.xml")#" variable="configuration" />
<cfset xmldoc = XmlParse(configuration) />
<div class="row"><cfoutput>#xmldoc.myxml.upload-file.size#</cfoutput></div>
Here is my config.xml
<myxml>
<upload-file>
<size>15</size>
<accepted-format>pdf</accepted-format>
</upload-file>
</myxml>
Can someone help me figure out what the error is?
When I am printing the entire variable as <div class="row"><cfoutput>#xmldoc#</cfoutput></div> it is showing the values as
15 pdf
The problem is the hyphen (-) in the <upload-file> element name within your XML: in dot notation ColdFusion reads it as a minus sign. If you are in control of the XML contents, the easiest fix is to not use hyphens in your element names. If you cannot control the XML contents, then you will need to do more to get around this issue.
Ben Nadel has a pretty good blog article on the topic - Accessing XML Nodes Having Names That Contain Dashes In ColdFusion
From that article:
To get ColdFusion to see the dash as part of the node name, we have to "escape" it, for lack of a better term. To do so, we either have to use array notation and define the node name as a quoted string; or, we have to use xmlSearch() where we can deal directly with the underlying document object model.
He goes on to give examples. As he states in that article, one option is to use array notation and quote the node name to access the data:
<div class="row">
<cfoutput>#xmldoc.myxml["upload-file"].size#</cfoutput>
</div>
Or you can use the xmlSearch() function with an XPath expression to find the node for you. Note that this returns an array of matching nodes, so you have to index into it:
<cfset xmlarray = xmlSearch(xmldoc,"/myxml/upload-file")>
<div class="row">
<cfoutput>#xmlarray[1].size#</cfoutput>
</div>
Both of these examples will output 15.
I created a gist for you to see these examples as well.

Regex for HTML RESPONSE BODY present under div tag

I need to build a regex that extracts the contents of the value attribute,
i.e. "f70a8c3d0a6cbe2e235c7fd1dd27d052df7412ea"
HTML response body:
Note: I have pasted only a small part of the response, but the formToken key is unique.
<div class="hidden">
<input name="formToken" type="hidden"
value="f70a8c3d0a6cbe2e235c7fd1dd27d052df7412ea"
/>
</div>
I wrote the below regex but it returned nothing:
regex("formToken" type="hidden" value="([^"]*)"/>).find(0).exists, found nothing
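Can you try this? The attributes are on separate lines in the response, so the pattern has to allow newlines between them (by default . does not match a newline):
regex("""type="hidden"\s+value="(.*?)"\s*/>""").find(0).exists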
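The likely reason the original pattern found nothing is that it expects single spaces between the attributes, while the response puts them on separate lines. Here is a minimal sketch of that effect, written in Python rather than Gatling's Scala DSL so it can be run on its own. It assumes the missing quote after formToken in the pasted fragment is a paste error; the names strict and tolerant are illustrative only.

```python
import re

# Response fragment from the question, with the attributes on separate lines.
body = '''<div class="hidden">
<input name="formToken" type="hidden"
value="f70a8c3d0a6cbe2e235c7fd1dd27d052df7412ea"
/>
</div>'''

# Expects exactly one space between attributes, so the newlines break it.
strict = re.compile(r'"formToken" type="hidden" value="([^"]*)"/>')

# \s+ and \s* accept any run of whitespace, including newlines.
tolerant = re.compile(
    r'name="formToken"\s+type="hidden"\s+value="([^"]*)"\s*/>')

print(strict.search(body))              # no match object
print(tolerant.search(body).group(1))   # the captured token
```

The same whitespace-tolerant pattern should carry over to a Gatling regex check, since Java regular expressions treat \s the same way.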
Instead of a regex, you could use a css selector check which is probably way easier once you have ids or css classes to search for.
Thank you all. I was able to get the formToken using a CSS selector:
.check(css("input[name='formToken']", "value").saveAs("formTokex"))
Works like this for me:
.exec(http("request_1")
.get("<<<<YOUR_URL>>>>>")
.check(css("form[name='signInForm']", "action").saveAs("urlPath")))
and later printing it:
println(session( "urlPath" ).as[String])

Django: How can I invisibly pass a variable to another template?

I have three templates in my project—we'll call them first.html, second.html, third.html.
first.html gets a string from the user, using an <input> tag:
<input type="radio" name="selection" value="example_string" />
second.html displays this string using {{selection}}. (In my views.py, I got the value using request.POST.get and render_to_response().)
The question is: how do I send this value from second.html to third.html? One of my attempts—using a <span> tag to save the information in a variable—is illustrated below, but it doesn't seem to work.
<span name="selection" value={{selection}}>{{selection}}</span>
Edit: The following line works by creating a dummy single radio button. I don't know why it shouldn't be possible to create a variable without an <input> tag [visible to the user].
<input type="radio" name="selected" value={{selected}} checked="checked" />
You need to understand how the web works: each page is entirely separate, and is requested using a separate request.
Your basic options are: save data on the client side, or post it back to the server.
Both options can be performed with javascript, or posting back can also be performed by posting the form back to the server.
If you want to send it back to the server, it will have to be stored in the current session, or in a model.
There are many javascript libraries. If you want to use them, I suggest you google around the subject.
Answering my own question, now that I've found the answer in Django's documentation.
There's a special kind of <input> tag precisely for this: "hidden". The following line accomplishes what was asked in the question, but without a dummy element visible to the user. Note the quotes around the value, so that strings containing spaces survive:
<input type="hidden" name="selected" value="{{ selected }}" />

Could anyone tell me why / how this XSS vector works in the browser?

I have suffered a number of XSS attacks against my site. The following HTML fragment is the XSS vector that has been injected by the attacker:
<a href="mailto:">
<a href=\"http://www.google.com onmouseover=alert(/hacked/); \" target=\"_blank\">
<img src="http://www.google.com onmouseover=alert(/hacked/);" alt="" /> </a></a>
It looks like the script shouldn't execute, but using IE9's developer tools, I could see that the browser translates the HTML to the following:
<a href="mailto:"/>
<a onmouseover="alert(/hacked/);" href="\"http://www.google.com" target="\"_blank\"" \?="">
</a/>
After some testing, it turns out that the \" makes the "onmouseover" attribute "live", but I don't know why. Does anyone know why this vector succeeds?
So to summarize the comments:
Sticking a character in front of the quote turns the quote into part of the attribute value, instead of marking the beginning and end of the value.
This works just as well:
href=a"http://www.google.com onmouseover=alert(/hacked/); \"
HTML allows unquoted attribute values, and an unquoted value ends at the first whitespace, so href and onmouseover become two separate attributes with the given values.
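The splitting can be observed outside a browser as well. The sketch below feeds the injected anchor to Python's standard html.parser and lists the attributes it reports. This parser is not a browser and may differ from IE9 in details, but it treats unquoted attribute values in a similar way; the class name AttrCollector is made up for the illustration.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Records the attributes of every start tag the parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


# The injected anchor, with literal backslashes in front of the quotes.
injected = (r'<a href=\"http://www.google.com '
            r'onmouseover=alert(/hacked/); \" target=\"_blank\">')

collector = AttrCollector()
collector.feed(injected)
collector.close()

tag, attrs = collector.tags[0]
for name, value in attrs.items():
    print(name, "=>", value)
```

Because the href value starts with a backslash rather than a quote, it is read as an unquoted value that stops at the space, and onmouseover shows up as an attribute of its own.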

Pythonic way to find a regular expression match

Is there a more succinct/correct/pythonic way to do the following:
url = "http://0.0.0.0:3000/authenticate/login"
re_token = re.compile("<[^>]*authenticity_token[^>]*value=\"([^\"]*)")
for line in urllib2.urlopen(url):
    if re_token.match(line):
        token = re_token.findall(line)[0]
        break
I want to get the value of the input tag named "authenticity_token" from an HTML page:
<input name="authenticity_token" type="hidden" value="WTumSWohmrxcoiDtgpPRcxUMh/D9m7O7T6HOhWH+Yw4=" />
Could you use Beautiful Soup for this? The code would essentially look something like so:
import urllib2
from BeautifulSoup import BeautifulSoup
url = "http://0.0.0.0:3000/authenticate/login"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)
# find() returns the whole tag; index it to read the value attribute
token = soup.find("input", {'name': 'authenticity_token'})['value']
Something like that should work. I didn't test this but you can read the documentation to get it exact.
You don't need the findall call. Instead use:
m = re_token.match(line)
if m:
    token = m.group(1)
    ....
I second the recommendation of BeautifulSoup over regular expressions though.
There's nothing "pythonic" about using a regex. If you don't want to use BeautifulSoup (which you ideally should), just use Python's excellent string manipulation capabilities:
for line in open("file"):
    line = line.strip()
    if "<input name" in line and "value=" in line:
        item = line.split()
        for i in item:
            if "value" in i:
                print i
output
$ more file
<input name="authenticity_token" type="hidden" value="WTumSWohmrxcoiDtgpPRcxUMh/D9m7O7T6HOhWH+Yw4=" />
$ python script.py
value="WTumSWohmrxcoiDtgpPRcxUMh/D9m7O7T6HOhWH+Yw4="
As to why you shouldn't use regular expressions to search HTML, there are two main reasons.
The first is that HTML is defined recursively, and regular expressions, which compile into stackless state machines, don't do recursion. You can't write a regular expression that can tell, when it encounters an end tag, which of the start tags it passed along the way that end tag belongs to; there's nowhere to save that information.
The second is that parsing HTML (which BeautifulSoup does) normalizes all kinds of things that are allowable in HTML and that you're probably not going to ever consider in your regular expressions. To pick a trivial example, what you're trying to parse:
<input name="authenticity_token" type="hidden" value="xxx"/>
could just as easily be:
<input name='authenticity_token' type="hidden" value="xxx"/>
or
<input type = "hidden" value = "xxx" name = 'authenticity_token' />
or any one of a hundred other permutations that I'm not thinking about right now.
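To make the normalization point concrete, here is a minimal sketch that uses only the standard library's html.parser (Python 3) instead of BeautifulSoup. The names TokenFinder and find_token are hypothetical. The parser hands over attributes as name/value pairs, so quoting style, spacing and attribute order stop mattering.

```python
from html.parser import HTMLParser


class TokenFinder(HTMLParser):
    """Captures the value of the first <input name="authenticity_token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (self.token is None and tag == "input"
                and attributes.get("name") == "authenticity_token"):
            self.token = attributes.get("value")


def find_token(html):
    """Return the token value found in html, or None if there is none."""
    finder = TokenFinder()
    finder.feed(html)
    finder.close()
    return finder.token


# The three permutations listed above.
variants = [
    '<input name="authenticity_token" type="hidden" value="xxx"/>',
    "<input name='authenticity_token' type=\"hidden\" value=\"xxx\"/>",
    "<input type = \"hidden\" value = \"xxx\" name = 'authenticity_token' />",
]
for variant in variants:
    print(find_token(variant))
```

All three variants go through the same code path, which is the property a single regular expression struggles to provide.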