Simple interface for getting HTML content in Boost.Asio - c++

There are a lot of examples showing how to make an HTTP request to a server and read the reply with the Boost.Asio library. However, I couldn't find a good example of a simple interface, and I am wondering whether I need to implement it myself.
For instance, if I need to get the content of http://www.foo.bar/path/to/default.html, is there any way to get it without validating the URL, making the HTTP request, and parsing the server's answer myself?
Basically, I am looking for something like this:
std::string str = boost::asio::get_content("http://www.foo.bar/path/to/default.html");
std::cout << str;
Expected output:
<HTML>
<BODY>
Simple HTML page!
</BODY>
</HTML>
There are a couple of things that I would like to avoid doing by hand when using Boost.Asio:
Parsing and validating the URL.
Manually creating the HTTP request.
Cutting the HTTP response headers away from the HTML page content.

Since this was asked, a newcomer has appeared: the C++ Network Library (cpp-netlib), as pointed out here.
You wanted to use Asio. I suppose you fancied the portability and the ease of use of this lib, so cpp-netlib will be a great choice in that case. It is based on the same principles as Boost, and its authors aim to integrate it into Boost.
It is pretty simple to use:
#include <boost/network/protocol/http/client.hpp>
#include <iostream>
namespace http = boost::network::http;
int main() {
    http::client client;
    // Create a request for the URL.
    http::client::request request("http://www.foo.bar/path/to/default.html");
    // Send the GET request and get the response from the HTTP server.
    http::client::response response = client.get(request);
    // Print the response body to the console.
    std::cout << body(response) << std::endl;
}
I haven't tried this one but it seems to be possible to do exactly what you need:
cout << body(client().get(client::request("http://www.foo.bar/path/to/default.html")));
This question was asked a long time ago, sorry for digging it out of its grave.

You'll need to implement these functions yourself. Boost.Asio is primarily a socket library that can be used to implement various protocols, but there are no built-in convenience functions for a specific protocol like HTTP or SMTP. (Well, actually there's built-in DNS resolution, but that's about it.)
However, the Boost.Asio source code comes with pre-made examples of an HTTP client/server, so you can easily start with that.

boost.asio is powerful and sophisticated, but probably overkill for this.
Have you looked at libcurl?

From the person who wrote Boost.Asio:
http://think-async.com/Urdl/doc/html/urdl/getting_started/integrating_with_boost_asio.html
Urdl is a library for reading URLs into strings easily. Note that it is a standalone library built on Boost.Asio, not part of Boost itself.

boost.asio doesn't provide such functionality. But I believe there are a number of libraries that do. See POCO libraries for example.

Meanwhile, Boost.Beast made an appearance, which wraps Boost.Asio to provide a simpler interface (not only) for HTTP GET.

Related

C++: Cloud computing library: is there such a library where I don't need to write much network stuff?

I want my server app to be able to send data to be processed by a bunch of various clients, and then have the processed data returned to the server.
Ideally, I'd have some call like some_process = send_to_client_for_calculating(connection, data)
I just need to be able to send a bunch of data to a client, tell the client what to do (preferably in the same message, which can be done with an array [command, data]), and then return the data...
I'm breaking up pieces of a neural network (tis very large), and then assembling them all later.
If I need to be clearer, let me know how.
I'm shocked no one has thrown it out there... how about boost::asio.
Why don't you have a look at using Apache ActiveMQ? It's a Java JMS server, but it has C++ bindings, and does what you want with a minimum of writing networking code. You basically just subscribe to messages, and send responses back. The MQ server takes care of dispatch and message persistence for you.
You could try using beanstalkd, a fast working queue. I don't know if it fits your purposes. There is a client library written in C, which you should be able to use from C++.
I'd suggest looking at gSOAP, which implements SOAP in C++, including networking.

C++ library for dealing with multiple HTTP connections

I'm looking for a library to deal with multiple simultaneous HTTP connections (pref. on a single thread) to use in C++ in Windows so it can be Win32 API based. This is for a client application that must process a list of requests but keep 4 running at all times until the list is complete.
So far, I have tried cURL (multi interface) which seems to be the most appropriate that I have found but my problem is that I may have a queue of 200 requests but I need to only run 4 of them at a time. This becomes problematic when one request may take 2 seconds and another may take 2 mins as you have to wait on all handles and receive the result of all requests in one block. If anyone knows a way round this it would be very useful.
I have also tried rolling my own using WinHTTP but I need to throttle the requests so they would ideally need to be on a single thread and use callbacks for data which WinHTTP does not do.
The best thing I've found which would solve all my problems is ASIHTTPRequest but unfortunately it's Mac OSX only.
Thanks,
J
Have you looked at boost.asio? http://www.boost.org/doc/libs/1_46_1/doc/html/boost_asio.html
It's meant to scale well and has HTTP server examples:
http://www.boost.org/doc/libs/1_46_1/doc/html/boost_asio/examples.html
Have you tried Boost.Asio?
It is multiplatform, has stellar performance, and comes with nice HTTP examples.
http://www.boost.org/doc/libs/1_46_1/doc/html/boost_asio.html
Asio is a great library, but it's generic. The HTTP examples are just that: examples. There's no support for redirection, authentication and so on.
I know of two libraries built on top of Boost & Asio that support the HTTP protocol: cpp-netlib and Pion Network Library but AFAIK neither directly supports what you want.
All that being said if you're comfortable with using libcurl it should be pretty easy to use the "easy" interface with callbacks and implement the requests queue yourself.
libcurl's multi interface supports exactly what you're asking for.

Download a file from the web in C++ without using non-standard libraries on Linux

When I say non-standard libraries, I am referring to things like Boost, libCurl and anything else that may be able to do this far easier than standard C++ can. The reason for this is that I am writing an application as a piece of coursework (the class is dedicated to C++) and I am required to use only standard libraries and functions.
I am looking to download an RSS file, using a URL that the user will supply (I'm building a rudimentary RSS client), and the biggest problem I'm facing is that I'm not sure how to get the file down. Once I get past that bit, parsing it for the XML tags and displaying the content will be relatively straightforward. I've been looking around and I've only found solutions that say to use non-standard libraries, usually libCurl. If someone could just give me a quick heads up about what I should be looking at for this, then I'd be grateful.
Also, if you think you're helping me cheat, you're not. The assignment is to build an application of our choice and we're being graded on our use of the various features of the language (it must contain so many classes, use these variable types, etc).
Check out Beej's Guide to Network Programming for a quick but excellent introduction to sockets. If you cannot use any non-standard libraries, your only option is to manually connect on port 80 and make the request yourself.
Assuming even a beginner-level knowledge of C++, that should be all you need.
First off, it can't be done using only standard C++. There is no network interface in either standard C++ or standard C.
If you're required to take a "do-it-yourself" approach, then probably the intention is that you would use your platform's sockets API. In the case of Linux, this is part of the POSIX standard, not C++, and is available from <sys/socket.h>.
The basic procedure is: parse the URL; look up the IP address of the domain; create a socket; connect the socket; write an HTTP request to the socket; read the HTTP response back from the socket; clean up.
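A minimal sketch of the procedure above using the POSIX sockets API, assuming a plain http:// URL on port 80 with no redirects and no TLS. The names UrlParts, split_http_url and fetch_raw are made up for illustration. It sends an HTTP/1.0 request so that the server closes the connection once the document is complete, which keeps the read loop simple.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <stdexcept>
#include <string>

// Hypothetical helper type: the two URL parts the request needs.
struct UrlParts {
    std::string host;
    std::string path;
};

// Step 1: parse a plain "http://host/path" URL (no port, no https).
UrlParts split_http_url(const std::string& url) {
    const std::string scheme = "http://";
    if (url.compare(0, scheme.size(), scheme) != 0)
        throw std::invalid_argument("only http:// URLs are handled");
    const std::size_t slash = url.find('/', scheme.size());
    if (slash == std::string::npos)
        return {url.substr(scheme.size()), "/"};
    return {url.substr(scheme.size(), slash - scheme.size()), url.substr(slash)};
}

// Remaining steps: look up the host, connect, send the request,
// read until the server closes the connection, clean up.
std::string fetch_raw(const std::string& url) {
    const UrlParts parts = split_http_url(url);

    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // TCP
    addrinfo* results = nullptr;
    if (getaddrinfo(parts.host.c_str(), "80", &hints, &results) != 0)
        throw std::runtime_error("host lookup failed");

    // Try each resolved address until one accepts the connection.
    int fd = -1;
    for (addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd == -1) continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    if (fd == -1) throw std::runtime_error("could not connect");

    const std::string request =
        "GET " + parts.path + " HTTP/1.0\r\nHost: " + parts.host + "\r\n\r\n";
    std::size_t sent = 0;
    while (sent < request.size()) {
        const ssize_t n = send(fd, request.data() + sent, request.size() - sent, 0);
        if (n <= 0) {
            close(fd);
            throw std::runtime_error("send failed");
        }
        sent += static_cast<std::size_t>(n);
    }

    // Collect everything the server sends: status line, headers and document.
    std::string response;
    char buffer[4096];
    ssize_t n;
    while ((n = recv(fd, buffer, sizeof buffer, 0)) > 0)
        response.append(buffer, static_cast<std::size_t>(n));
    close(fd);
    return response;
}
```

Calling fetch_raw("http://www.example.com/") would return the raw response, so the status line and headers still have to be separated from the document.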
Obviously, an HTTP library is far more convenient, especially since an HTTP download can get more complicated than what I describe above (for example, if the server responds with a redirect). Pretty much all linux distributions will provide libcurl, and/or the curl and wget programs.
Writing a program to make a socket connection is relatively trivial.
http://www.linuxhowtos.org/C_C++/socket.htm
Now that you have a socket open to an HTTP server you need to understand how to ask for a document and how to decode the reply:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html
Basically you need to send:
GET<SP><URL><SP>HTTP/1.1<CRLF>
Where:
SP: Single Space
CRLF: \r\n
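URL: The path of the page on the server (e.g. /index.html). A full URL including the server name is only needed when talking to a proxy; otherwise the server name goes in a mandatory Host: header line, and an empty line (one more CRLF) ends the request.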
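As a rough sketch, the request text could be assembled like this. The function name build_get_request is made up for illustration, and it assumes the client talks directly to the server, so it sends only the path on the request line and names the server in a Host header (which HTTP/1.1 requires). The Connection: close header is an optional extra that asks the server to close the connection after the reply.

```cpp
#include <string>

// Builds the text of a GET request for an origin server (hypothetical helper).
std::string build_get_request(const std::string& host, const std::string& path) {
    std::string request;
    request += "GET " + path + " HTTP/1.1\r\n";  // GET<SP><path><SP>HTTP/1.1<CRLF>
    request += "Host: " + host + "\r\n";         // mandatory in HTTP/1.1
    request += "Connection: close\r\n";          // ask the server to close when done
    request += "\r\n";                           // empty line ends the headers
    return request;
}
```

For example, build_get_request("www.foo.bar", "/path/to/default.html") gives the bytes to write to the socket.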
What you get back will be
http://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html#sec6
HTTP/1.1<SP>200<SP>OK<CRLF>
(<Header><CRLF>)*
<CRLF>
<Document>
The above means:
The first line is the response line that should contain 200 OK.
If it does not then there is some kind of error and you should just give up.
This is followed by 0 or more header lines
Just ignore these lines
There will be 1 empty line to mark the end of the headers.
Then the document will be on the stream.
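The steps above could be sketched as follows, assuming the whole response has already been read into a string. The name extract_document is made up for illustration. One caveat: an HTTP/1.1 server may send the document with chunked transfer encoding, which this sketch does not decode, so it is only suitable when the body arrives as-is.

```cpp
#include <stdexcept>
#include <string>

// Returns the document part of a complete raw HTTP response (hypothetical helper).
std::string extract_document(const std::string& response) {
    // The response line runs up to the first CRLF, e.g. "HTTP/1.1 200 OK".
    const std::size_t line_end = response.find("\r\n");
    if (line_end == std::string::npos)
        throw std::runtime_error("no response line");
    const std::string status_line = response.substr(0, line_end);

    // The status code follows the first space; anything but 200 is given up on.
    const std::size_t first_space = status_line.find(' ');
    if (first_space == std::string::npos ||
        status_line.compare(first_space + 1, 3, "200") != 0)
        throw std::runtime_error("unexpected status: " + status_line);

    // The header lines are ignored; the empty line shows up as CRLF CRLF.
    const std::size_t header_end = response.find("\r\n\r\n");
    if (header_end == std::string::npos)
        throw std::runtime_error("headers not terminated");

    // Everything after the empty line is the document.
    return response.substr(header_end + 4);
}
```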
If you really want to do it without using libcurl you can always open a tcp socket and then send:
GET /myurl
(http 1.0 or preferably use http 1.1)
Basically you're writing a very simple http protocol client implementation.
You can download the source code of the standard wget utility.
Since you are not allowed to use non-standard libraries, you could write your own primitive wrapper class for the linux "curl" command (I'm assuming you are using linux). Curl is a very powerful command, and it can probably do what you need it to.
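A rough sketch of such a wrapper, written as plain functions rather than a class and assuming a POSIX system where the curl program is on the PATH. The names shell_quote, run_and_capture and download_with_curl are made up for illustration. The URL is quoted before it is handed to the shell, because it comes from the user.

```cpp
#include <stdio.h>

#include <stdexcept>
#include <string>

// Wraps an argument in single quotes so the shell treats it as one literal word.
std::string shell_quote(const std::string& arg) {
    std::string quoted = "'";
    for (char c : arg) {
        if (c == '\'')
            quoted += "'\\''";  // close the quote, add an escaped quote, reopen
        else
            quoted += c;
    }
    quoted += "'";
    return quoted;
}

// Runs a command through the shell and returns everything it wrote to stdout.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr) throw std::runtime_error("popen failed");
    std::string output;
    char buffer[4096];
    std::size_t n;
    while ((n = fread(buffer, 1, sizeof buffer, pipe)) > 0)
        output.append(buffer, n);
    // A non-zero status means the command could not run or reported an error.
    if (pclose(pipe) != 0) throw std::runtime_error("command failed: " + command);
    return output;
}

// Downloads a URL by delegating to the curl program (assumed to be installed).
std::string download_with_curl(const std::string& url) {
    // -s: no progress meter, -f: fail on HTTP errors, -L: follow redirects
    return run_and_capture("curl -s -f -L " + shell_quote(url));
}
```

Here download_with_curl(url) returns the document as a string, or throws if curl exits with an error.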

Help streaming over http in C++

I'm looking to use a web service that offers a streaming api. This api can typically be used by the java method java.net.URL.openStream();
Problem is I am trying to design my program in C++ and have no idea what libraries (I've heard the cUrl library is very good at this sort of thing) to use, or how to use them to do what I want.
The idea is that after opening the file as a stream I can access continually updating data in realtime.
Any help would be much appreciated.
Boost.Asio socket iostreams seem to be what you're after. Your code will look like this:
ip::tcp::iostream stream("www.someserver.com", "http");
if (!stream)
{
// Can't connect.
}
// Use stream as a regular C++ input stream:
std::string text;
std::getline(stream, text);
If you're new to C++ and have no experience with iostreams then this page is an excellent source of information. In particular, check the docs of the istream class to see what kind of operations your Boost.ASIO stream will support. You'll find that they're not so different from those in the Java IO API.
EDIT: Eric is right, you'll have to send some requests to the server (using the same stream) so it's probably less similar to Java's openStream than I thought. The following example shows how to make those requests:
http://blog.think-async.com/2007_01_01_archive.html
It depends what you're after. Manuel's suggestion of boost::asio::ip::tcp::iostream is good if you want something at a lower level, directly returning the "raw" server response (However, I suspect that something is missing in the example provided in his answer: I think that a "GET" request should be written to the stream before reading from it. See this example from the Asio docs).
I have no experience with java.net.URL.openStream(), but it seems that it is at a slightly higher level, in that it only returns the body (and not the headers) of the reply, takes care of HTTP redirects, etc. In that case, yes, libcurl may be more what you want. You could also take a look at the cpp-netlib library, which is built on top of Boost.Asio. It is still in its infancy, but its http::client seems to already provide something pretty similar to what is provided by Java's URL.openStream().

Download a URL in C++

I want to be able to download a URL in C++. Something as simple as:
std::string s;
s=download("http://www.example.com/myfile.html");
Ideally, this would include URLs like:
ftp://example.com/myfile.dat
file:///usr/home/myfile.dat
https://example.com/myfile.html
I was using Asio in Boost, but it didn't really seem to have the code for handling protocols like FTP and HTTPS. Now I discovered Qt has more of what I need (http://doc.trolltech.com/2.3/network.html).
It's tempting to make the switch to Qt, but it seems a bit heavy and intersects a lot of Boost functionality. Is it worth learning yet another API (Qt) or can Boost do more than I think?
Not a direct answer, but you might like to consider libCURL, which is almost exactly what you describe.
There are sample applications here, and in particular this demonstrates how simple usage can be.
I wouldn't go to Qt just for the networking stuff, since it's really not all that spectacular; there are a lot of missing pieces. I'd only switch if you need the GUI stuff, for which it is top notch.
libCURL is pretty simple to use and more robust than the Qt stuff.
You can use URLDownloadToFile.
#include <urlmon.h>  // link with urlmon.lib
HRESULT hr;  // URLDownloadToFile returns an HRESULT, not a HANDLE
hr = URLDownloadToFile(NULL, L"http://www.example.com/myfile.html", L"mylocalfile.html", BINDF_GETNEWESTVERSION, NULL);
According to MSDN, BINDF_GETNEWESTVERSION - is a "Value that indicates that the bind operation retrieves the newest version of the data or object available. In URL monikers, this flag maps to the WinInet flag, INTERNET_FLAG_RELOAD, which forces a download of the requested resource".
The Poco Project has classes for cross-platform HTTP and FTP (and a lot of other stuff). There is overlap with boost. I recently found this, but have not used it.
You can use URLDownloadToFile or URLOpenBlockingStream, although cURL/libcurl are the proper tools for that kind of job.
I got it working without either libcurl or WinSock: https://stackoverflow.com/a/51959694/1599699
Special thanks to Nick Dandoulakis for suggesting URLOpenBlockingStream! I like it.