I am learning Boost.Asio. I can create an endpoint and active and passive sockets. Now I want to write something like a simple client application that will get specific data from web pages.
So I have a few questions about that:
If I have opened a socket connected to a web server, how can I refer to specific content on a page? For example, I want to get an image. There are many images on the page, and not only images; I want to identify one specific image. How can I do that (maybe by an "id" from the HTML, or in some other way)?
After that, I want to get that specific image onto my PC. How can I download and save it?
What if it is not an image? If I want to work with an audio file, a video file, text, a hyperlink, etc., how do I generalize this to any type of content?
How can I follow links on a web page?
You may also use boost/beast in your answer to this question.
(Off topic: I know C++ is not a great choice for this kind of work.)
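Since the question invites boost/beast, here is a minimal synchronous sketch that downloads a single image over plain HTTP and saves it to disk. The host and path are made-up placeholders: in practice you would first GET the page's HTML, locate the <img> tag you want (e.g. by its id attribute, using an HTML parser such as Gumbo, or a regex for quick experiments), and use its src attribute as the target. Following links works the same way: extract the href attributes of <a> tags and issue new GET requests.

    #include <boost/asio/connect.hpp>
    #include <boost/asio/ip/tcp.hpp>
    #include <boost/beast/core.hpp>
    #include <boost/beast/http.hpp>
    #include <fstream>

    namespace beast = boost::beast;
    namespace http  = beast::http;
    namespace net   = boost::asio;
    using tcp = net::ip::tcp;

    int main()
    {
        // Placeholder host and image path -- take the real values from the
        // src attribute of the <img> tag you found in the page's HTML.
        std::string const host   = "example.com";
        std::string const target = "/images/logo.png";

        net::io_context ioc;
        tcp::resolver resolver{ioc};
        beast::tcp_stream stream{ioc};

        // Resolve and connect (plain HTTP; HTTPS would need an SSL stream).
        stream.connect(resolver.resolve(host, "80"));

        // Send an HTTP/1.1 GET request for the image.
        http::request<http::empty_body> req{http::verb::get, target, 11};
        req.set(http::field::host, host);
        req.set(http::field::user_agent, "beast-sketch");
        http::write(stream, req);

        // Read the response; the raw image bytes are in the body.
        beast::flat_buffer buffer;
        http::response<http::dynamic_body> res;
        http::read(stream, buffer, res);

        // Save the body to disk. The Content-Type header (image/png,
        // audio/mpeg, text/html, ...) tells you what kind of content
        // arrived, which is how this generalizes to any content type.
        std::ofstream out("logo.png", std::ios::binary);
        out << beast::buffers_to_string(res.body().data());

        beast::error_code ec;
        stream.socket().shutdown(tcp::socket::shutdown_both, ec);
    }

Note that beast::tcp_stream requires Boost 1.70 or newer.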
As the title implies,
I need to fetch data from a certain website that requires a login.
The login procedure might involve cookies or sessions.
Do I need QtWebkit, or can I get away with just QNetworkAccessManager?
I have no experience with either, and will start learning as I go.
So please save me a bit of time comparing the two ^^
Thank you in advance,
Evan
Edit: Having read some related answers,
I'll add some clarifications:
The website in question does not have an API, so I will need to scrape the web elements for the data myself.
Can I do that with just QNetworkAccessManager?
No, in most cases you don't need a fully simulated web browser; just performing the same web requests that a web browser would is usually enough.
Try recording the web requests in your browser, using a plugin like "Live HTTP Headers" or "Firebug" in Firefox. I think Chrome provides a similar tool out of the box. These tools record the GET and POST requests the browser makes when you submit a form on a web page.
Another option is to inspect the HTML code of the login page. Find the <form> tag and its fields, and put them together in a GET/POST request in your application to simulate the same form (see the sketch at the end of this answer).
Remember that some pages use randomized "tokens" in their forms, and some set the tokens as cookies. In such cases, your application first needs to request the login page itself (before sending the filled-in form). Both QWebView and QNetworkAccessManager have cookie support.
To sum up, I think QWebView provides a far more elegant way to simulate user interaction with a web page. The manual way is, however, more "lightweight": you don't need WebKit, and your application might be faster, because only the HTML page is loaded, without any linked resources like images, CSS, or JavaScript files.
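For illustration, here is a minimal sketch of the manual approach with QNetworkAccessManager (Qt 5). The URL and the field names are invented; the real ones must be read off the login page's <form> tag, as described above:

    #include <QCoreApplication>
    #include <QDebug>
    #include <QNetworkAccessManager>
    #include <QNetworkReply>
    #include <QNetworkRequest>
    #include <QUrlQuery>

    int main(int argc, char *argv[])
    {
        QCoreApplication app(argc, argv);

        // The manager keeps an in-memory cookie jar by default, so the
        // session cookie set by the login survives for later requests.
        // If the site uses a token, GET the login page first with this
        // same manager before sending the form.
        QNetworkAccessManager manager;

        // Invented form fields -- copy the real names from the <form> tag.
        QUrlQuery form;
        form.addQueryItem("username", "evan");
        form.addQueryItem("password", "secret");

        QNetworkRequest request(QUrl("https://example.com/login"));
        request.setHeader(QNetworkRequest::ContentTypeHeader,
                          "application/x-www-form-urlencoded");

        QNetworkReply *reply =
            manager.post(request, form.toString(QUrl::FullyEncoded).toUtf8());

        QObject::connect(reply, &QNetworkReply::finished, [&]() {
            // Once logged in, further GETs through the same manager can
            // fetch the pages you want to scrape.
            qDebug() << reply->readAll();
            reply->deleteLater();
            app.quit();
        });

        return app.exec();
    }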
QWebView, as the class name states, is a view, so it displays something (in this case, web pages). If you don't need to display the loaded page, you don't need a view. QNetworkAccessManager can do the work, but you need some knowledge of the HTTP protocol, and also about the target site: how it handles logins, what type of request you have to send to log in, etc.
I've made a basic HTML5/JS comic creation tool that uses the canvas element.
I want users to be able to upload their comics via the Facebook API.
I don't believe Facebook allows posting images in the form of base64 strings from the canvas.toDataURL() method, and I don't want to use my own server to convert these images and store them temporarily.
What's the best way to go about this? Possibilities I've wondered about: convert the canvas to a Blob? Store the blob via a web service (if so, any suggestions?)? Upload the blob directly to Facebook (is that even possible)?
I don't see why this should not be possible with a "normal" upload. You can create a new photo for a user by posting to PROFILE_ID/photos, with a source parameter of type multipart/form-data.
So the first thing I'd try is getting the picture data from the canvas object into a "normal" form (writing it into an input element in the right format(?)) and sending that to Facebook. If this step succeeds, I'd check whether jQuery or some other library's form.serialize method can build requests of type multipart/form-data. If that's also possible, then there should be no further problem in taking the data in that format and posting it via FB.api (although you might want to tell your users to be patient, because that might take a while).
Can’t tell for sure if this’ll work, but I’d give it a try.
Facebook partners with Heroku for free app hosting; you can use that as the temporary server.
I am about to develop a new feature on my website that allows the user to give me a URL. I would then use this URL to get the site's title, description, and image(s), and store this information on my website. I need to know whether there is a script that can do that, or a web service that takes the URL and returns the information I need, or whether I should develop this from scratch.
I would also like to know whether there is any kind of standard for this information-sharing mechanism, as I want to allow the user to share a video or photo from the web.
There is no single script that can extract information from all sites, because the source HTML for most websites is different. You will need to write code specifically for the sites you are scraping.
As for syndicating the content, you can use RSS (Really Simple Syndication), which is an XML format commonly used for sharing content.
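On the standards question: many sites expose exactly this information (title, description, preview image) through Open Graph <meta> tags, so it is worth checking for those before writing per-site scrapers. As a rough illustration, here is a C++ sketch that pulls the title and Open Graph tags out of HTML you have already fetched; the sample HTML is made up, and regexes only approximate HTML, so a real parser is more robust for production use:

    #include <iostream>
    #include <regex>
    #include <string>

    int main()
    {
        // Stand-in for a page you have already downloaded.
        std::string html =
            "<html><head><title>Example Page</title>"
            "<meta property=\"og:description\" content=\"A demo page\">"
            "<meta property=\"og:image\" content=\"https://example.com/p.jpg\">"
            "</head><body></body></html>";

        // The <title> element.
        std::smatch m;
        if (std::regex_search(html, m, std::regex("<title>([^<]*)</title>")))
            std::cout << "title: " << m[1] << '\n';

        // Open Graph tags such as og:description and og:image.
        std::regex og("<meta property=\"og:([a-z:]+)\" content=\"([^\"]*)\"");
        for (auto it = std::sregex_iterator(html.begin(), html.end(), og);
             it != std::sregex_iterator(); ++it)
            std::cout << "og:" << (*it)[1] << " = " << (*it)[2] << '\n';
    }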
Can someone please help me out with how I can completely prevent activity from being posted back to the user's wall after they like a link? I really find that annoying. After all, mine is an application that needs to be integrated into an image gallery viewer serving more than 7.5K photos, each with its own Like button.
If this seems impossible, is there a way to specifically set an image as the thumbnail, description, etc., as is the case with the Feed and Send buttons?
My application is purely dynamic in nature, built out of 100% JavaScript, and more than 80% of its content is generated by Ajax calls under a single static URL. As a result, the Like button's activity stream always ends up pulling the wrong image and description (but this is not so for the Feed and Send buttons).
Thank you
No, you won't have control over items being sent to people's activity feeds when they click Like, unless your domain gets blocked for spam. You would need to create a dynamic URL or hashbang URL for each individual image, and when one of those URLs is requested, the page hosting it would need to have the proper Open Graph meta tags set for the thumbnail image, description, etc. (see the example below). Then, for each Like button in the gallery, you would need to set the href property to that URL.
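For each per-image URL, the hosted page would carry Open Graph tags along these lines (all values here are placeholders):

    <head>
      <meta property="og:title"       content="Photo #1234" />
      <meta property="og:description" content="Caption for this photo" />
      <meta property="og:image"       content="https://example.com/photos/1234.jpg" />
      <meta property="og:url"         content="https://example.com/gallery#!/photo/1234" />
    </head>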
As the title suggests, I want to implement a two-step picture-uploading mechanism:
1. The user selects a picture to upload and clicks "Upload". Once the server receives the request that contains the picture, it saves the picture in a temporary location, on disk or in memory, and resizes it to a standard size if needed. The server then renders a new response for the user to preview the uploaded picture.
2. After previewing the picture, the user clicks "Save" to confirm. The server then moves the picture from the temporary location to a permanent one and updates the corresponding entry in the DB.
What's a good way to implement this? What are some of the apps out there that might be able to help me? Thanks.
I would suggest either rolling your own (you might find it surprisingly easy; start with reading the docs on forms) or customizing django-photologue or django-filebrowser.