How to Check if a website contains Flash - c++

I've created a web browser using mfc and i'm using IHhmlReader to read the contents of html when the user enters a url in the browser and page is completely loaded.Now i want to check if the webpage has any flash in it.
Any Helps would be highly appreciated.
Thank You.

I think this is a bit difficult to do, just reading from the HTML source, unless you try to instantiate the page and see if it's making a call to the Flash object. I have listed some options you can try, but you'll need to make sure that the code element is not commented out and check include files and iframes to see if Flash is called from there.
* Look for the OBJECT and EMBED tags (see http://kb2.adobe.com/cps/127/tn_12701.html)
* In page's JavaScript, look for SWFObject() call
* Look for the call to .swf file (could even be in an img tag)
Good luck...

Related

content empty when using scrapy

Thanks for everyone in advance.
I encountered a problem when using Scrapy on Python 2.7.
The webpage I tried to crawl is a discussion board for Chinese stock market.
When I tried to get the first number "42177" just under the banner of this page (the number you see on that webpage may not be the number you see in the picture shown here, because it represents the number of times this article has been read and is updated realtime...), I always get an empty content. I am aware that this might be the dynamic content issue, but yet don't have a clue how to crawl it properly.
The code I used is:
item["read"] = info.xpath("div[#id='zwmbti']/div[#id='zwmbtilr']/span[#class='tc1']/text()").extract()
I think the xpath is set correctly and I have checked the return value of this response and it indeed told me that there is nothing under this directory. Results shown here:'read': [u'<div id="zwmbtilr"></div>']
If it has something, there should be something between <div id="zwmbtilr"> and </div>.
Really appreciated if you guys share any thoughts on this!
I just opened your link in Firefox with NoScript enabled. There nothing inside the <div #id='zwmbtilr'></div>. If I enable the javascripts, I can see the content you want. So, as you already new, it is a dynamic content issue.
Your first option is try to identify the request generated by javascript. If you can do that, you can send the same request from scrapy. If you can't do it, the next option is usually to use some package with javascript/browser emulation or someting like that. Something like ScrapyJS or Scrapy + Selenium.

django - ckeditor bug renders text/string in raw html format

am using django ckeditor. Any text/content entered into its editor renders raw html output on the webpage.
for ex: this is rendered output of ckeditor field (RichTextField) on a webpage;
<p><span style="color:rgb(0, 0, 0)">this is a test file ’s forces durin</span><span style="color:rgb(0, 0, 0)">galla’s good test is one that fails Thereafter, never to fail in real environment. </span></p>
I have been looking for a solution for a long time now but unable to find one :( There are some questions which are similar but none of those have been able to help. It will be helpful if any changes suggested are provided with the exact location where it needs to be changed. Needless to say I am a newbie.
Thanks
You need to mark the relevant variable that contains the html snippet in your template as safe
Obviously you should be sure, that the text comes from trusted users and is safe, because with the safe filter you are disabling a security feature (autoescaping) that Django applies per default.
If your ckeditor is part of a comment form and your mark the entered text as safe, anybody with access to the form could inject Javascipt and other (potentially nasty) stuff in your page.
The whole story is explained pretty well in the official docs: https://docs.djangoproject.com/en/dev/topics/templates/#automatic-html-escaping

How to ensure users only embed SoundCloud iFrame?

I am building a social networking website for musicians and I would like them to be able to enter the embed code provided by SoundCloud, so that they may have a sound clip on their posts.
However, I am unsure how I would sanitise the input, to ensure that it's only a SoundCloud iframe embed code that they enter. I want to avoid them pasting in embed code for say, YouTube or anything else for that matter.
An example embed code from SoundCloud looks like:
<iframe width="100%" height="166" scrolling="no" frameborder="no" src="https://w.soundcloud.com/player/?url=http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F85146642"></iframe>
I am using the HTML parser, jSoup to sanitise input.
The key fragment to this is the src content:
https://w.soundcloud.com/player/?url=http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F85146642
One possibility I thought of, was to extract the src parameters value and then rebuild the iframe myself, this way, only storing the URL and ensuring that any HTML output to the browser is that which I have created myself. Doing this may also allow me to run checks on the domain name etc.
I'm wondering what the best approach would be for this?
Appreciate any input you may have.
Thanks,
Michael.
PS - I am using Railo (ColdFusion server) and the Java jSoup library, but I guess the same principles would apply regardless of what language one would use.

How to provide image data for embedded web control in C++

In my C++ app I'm embedding (via COM) a web browser (Internet Explorer) control (CLSID_WebBrowser).
I can display my own html in that control by using IHTMLDocument2::write() method but if the html has <img src="foo.png"> element, it's not displayed.
I assume there is a way for me to provide the data for foo.png somehow to the web control, but I can't find the right place to hook this functionality?
I need to be in full control of providing the content of foo.png, so work-arounds like using res:// protocol or saving to disk and using file:// protocol are not good enough. I just want to plug my code somehow so that when embedded CLSID_WebBrowser control sees <img src="foo.png"> in html data given with IHTMLDocument2::write() it will ask me to provide this data.
To answer my own question, the solution that finally worked for me is:
register custom IInternetProtocol/IInternetProtocolInfo/ via custom IClassFactory given to IInternetSession::RegisterNameSpace(). For reasons that seem like a bug to me, it has to be a protocol already known to IE (I've chosen "its") even though it would be much better if it was my own, unique namespace.
feed html data via custom IMoniker through IPersistentMoniker::Load() and make sure that IMoniker::GetDisplayName() (which is a base url according to which relative links in provided html will be resolved) starts with that protocol scheme (in my case "its://"). That way relative link "foo.png" in the html data will be its://foo.png to IE which will make urlmon call IInternetProtocol::Start() and IInternetProtocol::Read() to ask for the data for that url.
This is all rather complicated, you can look at the actual (BSD-licensed) code here:
http://code.google.com/p/sumatrapdf/source/browse/trunk/src/utils/HtmlWindow.cpp
You can embed a small webserver such as mongoose and reference those impage from there.
In mongoose, you can attach callback to specific path, thus returning images from C++ code.
We use this for our debugging tools, where each images is accessible from a web interface
The easiest solution would be a Data URI. You'd inline out the image directly with IHTMLDocument2::write().

How set a website as homepage in IE, Firefox, Chrome and Safari with C++?

Is there a way to set a website like google.com as homepage through C++ or C ? How ?
Not sure what your motive is, but I don't think of this as something I want any code on my system to be setting out from under me. It sounds like the kind of thing adware/malware would do to your grandparents (who wouldn't know how to fix it once it's set). Note the negative comments when the question was asked of how to do it from JavaScript:
How can I set default homepage in FF and Chrome via javascript?
It's better to point people at instructions for doing it themselves. Remind with a banner which says "Make us your homepage!", and link to something along these lines:
http://www.makeuseof.com/tag/how-to-change-your-homepage-in-5-browsers/
If not for the aesthetic reasons, there are technical reasons not to try and write code for it. Each browser stores this information in its own place. In IE's case, there appears to be a registry setting:
HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main\Start Page
So you'd use calls to the Windows Registry API to query it and set it. But Firefox doesn't save this in the registry, it saves it in something called prefs.js and you'll be looking for:
user_pref("browser.startup.homepage", .... );
Then there's Opera, Safari, Chrome, etc. All told, better to just give people directions and put them in control of their experience!
Imports Microsft.Win32
...
Module Util
Sub SetHomePage(Dim theUrl As String)
Registry.SetValue("HKCU\Software\Microsoft\Internet Explorer\Main", "Start Page", theUrl)
End Sub
End Module
Yes.
Find the way each browser saves its configuration to disk and edit that (*). It may be a file, or records in a database, or some data in a central registry, or some other scheme --- the browser documentation should tell you.
To open/read/write/save/close a file, the C functions declared in the header <stdio.h> may be helpful.
(*) for Firefox it's a file named "prefs.ini" in a directory somewhere under the users home path; there may be more than 1 such file if the user has more than 1 profile.