iFrame srcdoc - possible to use dynamic content / regex?

I know the srcdoc attribute on iframes is relatively new (hence why I've been unable to find much information on it), but I've been wondering (and hoping!) whether it can be used dynamically, i.e. to take a section of a website and display it in an iframe.
For example, say I wanted to take just the breaking news section of the BBC website, where the content keeps changing, and put that bit alone in an iframe. Is srcdoc able to do that?

This is not possible, and this is also not what srcdoc is designed for. The srcdoc attribute is available in WebKit browsers to set the content of an iframe rather than have it load a separate page from the server. Learn more about it at w3schools.
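For clarity, here is a minimal sketch of what srcdoc is actually for, assuming an iframe already exists on the page:

// srcdoc supplies the iframe's markup inline, instead of loading a URL:
document.querySelector("iframe").srcdoc = "<p>Hello from <strong>srcdoc</strong>!</p>";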
If you have an iframe with a page from another server (like the BBC website) then you have no way of accessing that page to extract or manipulate its contents because of the browser's same-origin policy.
The way to do it is to have a server-side script download the BBC page, extract the section you want, and send it to the client.
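As a rough illustration of that server-side approach, here is a sketch in Node; the express, node-fetch and cheerio libraries and the ".breaking-news" selector are assumptions, not part of the original answer:

const express = require("express");
const fetch = require("node-fetch");
const cheerio = require("cheerio");

const app = express();

app.get("/breaking-news", async (req, res) => {
  // Download the remote page server-side, where the same-origin policy doesn't apply.
  const html = await (await fetch("https://www.bbc.co.uk/news")).text();
  const $ = cheerio.load(html);
  // Keep only the fragment we care about and hand it to the client,
  // e.g. to be placed in an iframe's srcdoc.
  res.send($(".breaking-news").html() || "Section not found");
});

app.listen(3000);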

Related

Is a web page without text fields and text areas immune to XSS?

If a web page consists of drop-downs, radio buttons, check boxes etc. for user input, and avoids text fields and text areas so as to evade untrusted data (malicious JavaScript entered by the user):
Is such a web page immune to XSS?
If not, how can such an application be secured using ESAPI?
You can edit the form in your browser using, for example, Firebug, and just add any field with any name.
Even more so, you can forge whole POST/GET requests with any data you like (using curl or many other tools).
So: no, it is not.
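To make that concrete: a spoofed request can include fields the form never offered. The URL and field name below are made up:

// No text field needed: any client can invent fields and values outright.
fetch("https://example.com/submit", {
  method: "POST",
  headers: { "Content-Type": "application/x-www-form-urlencoded" },
  body: new URLSearchParams({ comment: "<script>alert(1)</script>" }),
});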
Not necessarily. The input type doesn't matter, because requests can be spoofed (easy with GET, not much harder with POST). What matters is that the form's result is sanitized before it is inserted into the page.
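A minimal sketch of such output sanitization; the element and input names are assumptions, and in practice you should use a vetted encoder such as ESAPI's encodeForHTML rather than rolling your own:

// Escape the characters HTML treats as markup before inserting user input.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Escaped this way, a script payload renders as inert text.
outputElement.innerHTML = escapeHtml(untrustedInput);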

Is QtWebkit needed to fetch data from websites that need login?

As the title implies,
I need to fetch data from a certain website which needs a login to use.
The login procedure might need cookies, or sessions.
Do I need QtWebkit, or can I get away with just QNetworkAccessManager?
I have no experience with either, and will start learning as I go.
So please save me a bit of time comparing the two ^^
Thank you in advance,
Evan
Edit: Having read some related answers,
I'll add some clarifications:
The website in question does not have an API, so I will need to scrape the data out of its web pages myself.
Can I do that with just QNetworkAccessManager?
No, in most cases you don't need a fully simulated web browser; performing the same web requests a real browser would is usually enough.
Try recording the web requests in your browser, using a plugin like "Live HTTP Headers" or "Firebug" in Firefox; I think Chrome provides a similar tool out of the box. These tools record the GET and POST requests made when you submit a form on the page.
Another option is to inspect the HTML code of the login page. Find the <form> tag and its fields. Put them together in a GET / POST request in your application to simulate the same form.
Remember that some pages use randomized "tokens" in their forms, and some set the tokens as cookies. In such cases, you need to request the login page itself in your application first (before sending the filled-in form). Both QWebView and QNetworkAccessManager have cookie support.
To sum things up, I think QWebView provides a far more elegant way to simulate user interaction with a web page. The manual way is, however, more "lightweight", as you don't need Webkit and your application might be faster (because only the HTML page is loaded, without any linked resources like images, CSS, javascript files).
QWebView, as the class name states, is a view: it displays something (in this case web pages). If you don't need to display the loaded page, then you don't need a view. QNetworkAccessManager can do the work, but you need some knowledge of the HTTP protocol, and also of the target site: how it handles logins, what type of request you have to send to log in, etc.
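To make that flow concrete, here is a rough sketch of the request sequence. It is written in JavaScript with fetch purely for illustration (the URLs, field names and token pattern are all hypothetical); the same steps map onto QNetworkAccessManager requests sharing a cookie jar:

async function logInAndFetch() {
  // 1. Request the login page first, so any token or session cookie gets set.
  const loginPage = await (await fetch("https://example.com/login")).text();

  // 2. Pull the hidden anti-forgery token out of the form markup.
  const token = loginPage.match(/name="csrf_token" value="([^"]+)"/)[1];

  // 3. Submit the same fields the real <form> would send.
  await fetch("https://example.com/login", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({ username: "evan", password: "secret", csrf_token: token }),
    credentials: "same-origin", // reuse cookies, as a Qt cookie jar would
  });

  // 4. Later requests ride on the session cookie and return logged-in HTML to scrape.
  return (await fetch("https://example.com/members/data")).text();
}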

Does Facebook support Hash Bang #! Ajax Crawlable Urls?

Does Facebook support Google's ajax crawling specification and, if so, what do you need to do to implement it?
I am trying to get the Facebook "Like" button to work with AJAX crawlable urls as defined here: code.google.com/web/ajaxcrawling/docs/specification.html
I have this url which I can go to directly and it loads. Note the "#!" in the url:
http://www.idkshouldi.com/?#!idkDetails_idkKey=agppZGtzaG91bGRpcmMLEiljb21faWRrc2hvdWxkaV93ZWJfc2VydmVyX2dhZV9vYmpfSWRrVXNlciIDamltDAsSKWNvbV9pZGtzaG91bGRpX3dlYl9zZXJ2ZXJfZ2FlX29ial9JZGtJdGVtGN6kBgw
When I "Like" this page it should crawl this "escaped fragment" url:
http://www.idkshouldi.com/?_escaped_fragment_=idkDetails_idkKey=agppZGtzaG91bGRpcmMLEiljb21faWRrc2hvdWxkaV93ZWJfc2VydmVyX2dhZV9vYmpfSWRrVXNlciIDamltDAsSKWNvbV9pZGtzaG91bGRpX3dlYl9zZXJ2ZXJfZ2FlX29ial9JZGtJdGVtGN6kBgw
Why won't it crawl this page? The Facebook linter (developers.facebook.com/tools/debug) is not properly crawling my page.
It won't properly crawl an AJAX-enabled URL with "#!" in it. Per Google's specification, what Facebook's lint crawler needs to do is replace the "#!" with "_escaped_fragment_"; it doesn't appear to do that with my AJAX-enabled links.
This is also a big problem for me, but unfortunately it appears Facebook does not support this Google URL notation. Facebook's crawler/parser does not translate from hash bang (#!) to an _escaped_fragment_ format URL.
Like you, I have tested my page with Facebook's URL linter, and it only picks up the static Open Graph tags in the dynamic original page, rather than the page-specific Open Graph tags in the _escaped_fragment_ server-side variant. Unfortunately, this means Facebook sees my Open Graph tags as site-specific rather than page-specific.
It is rather ironic that this appears to be unsupported, as Facebook itself uses this approach to let Google's crawlers pick up Facebook pages.
One potential workaround, that may help you a little bit, is:
1) Use your _escaped_fragment_ page version in Facebook links
2) On the _escaped_fragment_ variant, add an automatic redirect to the proper version (sketched below).
This should mean that Facebook will pick up the proper meta tags, and the user will click the link and end up on the correct page. The downside of this approach is that the user has to know the rather ugly _escaped_fragment_ URL. In other words, it will probably only be you that knows it, unless you add some sort of 'generate shareable link' button to your page.
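A sketch of that redirect: the _escaped_fragment_ page can carry a small script that bounces real visitors to the hash-bang URL, while crawlers that don't execute JavaScript stay on the static variant and read its tags. The URL scheme mirrors the one above:

// Served only on the _escaped_fragment_ variant of the page.
(function () {
  var marker = "_escaped_fragment_=";
  var at = window.location.search.indexOf(marker);
  if (at !== -1) {
    var fragment = window.location.search.slice(at + marker.length);
    // Browsers execute this and land on the normal #! URL; crawlers don't.
    window.location.replace("/?#!" + decodeURIComponent(fragment));
  }
})();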
It is surely only a matter of time before Facebook adds support for this as single-page hash bang sites are only going to become more prevalent.

Retrieve information from URL to share it on my website

I am about to develop a new feature on my website that allows the user to give me a URL. I would then use this URL to get the site's title, description and image(s), and store that information on my website. Is there any script that can do that, or a web service that would take the URL and give me the information I need, or shall I develop this from scratch?
Also, I would like to know whether there is any kind of standard for this information-sharing mechanism, as I want to allow the user to share a video or photo from the web.
There is no single script that can extract information from all sites, because the source HTML for most websites is different. You will need to write code specifically for the sites you are scraping.
As for syndicating the content, you can use RSS (Really Simple Syndication), which is an XML format commonly used for sharing content.
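As a rough illustration of the scraping side, here is a sketch in Node; node-fetch and cheerio are assumed libraries, and the selectors cover only common cases:

const fetch = require("node-fetch");
const cheerio = require("cheerio");

async function extractPageInfo(url) {
  const html = await (await fetch(url)).text();
  const $ = cheerio.load(html);
  // Title, meta description and the first image; each site may need its own rules.
  return {
    title: $("title").text(),
    description: $('meta[name="description"]').attr("content"),
    image: $("img").first().attr("src"),
  };
}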

cookie or localStorage with chrome extensions

I've read all the other questions here on this topic but couldn't solve my problem.
On my website I'm setting the user's email in localStorage, and I want to retrieve it in the extension:
localStorage.setItem("user", "andrei.br92#gmail.com" );
But when I try to retrieve it in the Chrome extension, it fails:
value = localStorage.getItem("user");
Which way is easier, cookies or localStorage? I'm not picky.
Please see this:
http://code.google.com/chrome/extensions/content_scripts.html#host-page-communication
Content scripts are run in a separate JavaScript world, which means the content script's localStorage is different from the website's localStorage. The only thing the two share is the DOM, so you'll need to use DOM nodes/events to communicate.
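One common shape of that DOM-based handoff uses window.postMessage; a minimal sketch, where the message type name is made up:

// In the page's own script: publish the value through the shared window.
window.postMessage({ type: "USER_EMAIL", user: localStorage.getItem("user") }, "*");

// In the content script: listen for it on the same window.
window.addEventListener("message", function (event) {
  if (event.source !== window) return; // ignore messages from other frames
  if (event.data && event.data.type === "USER_EMAIL") {
    console.log("Got user from page:", event.data.user);
  }
});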
Use chrome.storage.local instead of localStorage. Content scripts using chrome.storage see the same data that the extension's pages see. More at https://developer.chrome.com/extensions/storage.html
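A minimal sketch of that API; it requires the "storage" permission in the manifest:

// Works identically in content scripts and extension pages, unlike localStorage.
chrome.storage.local.set({ user: "andrei.br92@gmail.com" }, function () {
  console.log("user saved");
});

chrome.storage.local.get("user", function (items) {
  console.log("user =", items.user);
});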
Please see the information on Chrome content scripts. I'm betting you fell into the same initial trap that I did: trying to read localStorage from a page-injected script, yes?
You do not want to use cookies when localStorage will do. That is because:
Cookies can be accessed/modified through the background page only.
Cookies are stored in the context of a URL/domain, not of the extension, so you would have to store a cookie for every domain you wish to operate upon.
With every HTTP request, all the cookies associated with the corresponding URL/domain get transmitted to the server, so in effect you would be adding overhead to the user's requests.