Get the HTML of a site [closed] - c++

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I'm trying to get the HTML of a page into a string (or a char[]).
I know how to use basic sockets and connect as a client/server.
I've written a client in the past that takes an IP and port, connects to it, and sends images and such between the client and the server using sockets.
I've searched the internet a bit and found that I can connect to the website and send a GET request to get the content of a page and store it in a variable, though I have a few problems:
1) I'm trying to get the HTML of a page that isn't the main page of a site: not stackoverflow.com, but stackoverflow.com/help and such (not the "official page of the site", but something inside that site).
2) I'm not sure how to either send the GET request or store the data I get back from it.
I saw there are external libraries I could use, but I'd rather use sockets only.
By the way, I'm using Windows 7, and I only need it to work on Windows (so it's fine if it won't work on Linux).
Thanks for your help! :)

To access a resource on some host you just specify the path to the resource in the first line of the request, just after the 'GET'. E.g. check http://www.jmarshall.com/easy/http/#http1.1
GET /path/file.html HTTP/1.1
Host: www.host1.com:80
[blank line here]
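As a minimal sketch of the request format above, the text to send over an already connected socket could be assembled like this. The function name build_get_request is hypothetical, and the "Connection: close" header is an assumption added so the server closes the connection once the response is complete:

```cpp
#include <string>

// Build a minimal HTTP/1.1 GET request for `path` on `host`.
// Every line ends with CRLF, and an empty line terminates the header section.
std::string build_get_request(const std::string& host, const std::string& path) {
    std::string request;
    request += "GET " + path + " HTTP/1.1\r\n";
    request += "Host: " + host + "\r\n";
    request += "Connection: close\r\n";  // assumption: ask the server to close when done
    request += "\r\n";                   // the blank line that ends the request
    return request;
}
```

For the page in the question, build_get_request("stackoverflow.com", "/help") gives the bytes to pass to send() after connecting to the host.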
I'd also recommend using a portable library like Boost.ASIO instead of raw sockets. Better still, I'd strongly recommend an existing, portable library that implements the HTTP protocol, unless the point is to learn how to implement it yourself.
Even if you want to implement it by yourself it'd be worth knowing the existing solutions. For instance this is how you can get a webpage using cpp-netlib (http://cpp-netlib.org/0.10.1/index.html):
using namespace boost::network;
using namespace boost::network::http;
client::request request_("http://127.0.0.1:8000/");
request_ << header("Connection", "close");
client client_;
client::response response_ = client_.get(request_);
std::string body_ = body(response_);
This is how you can do it using cURL library (http://curl.haxx.se/libcurl/c/simple.html):
#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
    CURL *curl;
    CURLcode res;

    curl = curl_easy_init();
    if(curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "http://example.com");
        /* example.com is redirected, so we tell libcurl to follow redirection */
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);

        /* Perform the request, res will get the return code */
        res = curl_easy_perform(curl);
        /* Check for errors */
        if(res != CURLE_OK)
            fprintf(stderr, "curl_easy_perform() failed: %s\n",
                    curl_easy_strerror(res));

        /* always cleanup */
        curl_easy_cleanup(curl);
    }
    return 0;
}
Both libraries are portable but if you'd like to use some Windows-specific API you might check WinINet (http://msdn.microsoft.com/en-us/library/windows/desktop/aa383630%28v=vs.85%29.aspx) but it's less pleasant to use.

Related

Is it possible to do API streaming using curl library similar to python's request (or some other C++ lib)?

I have a python test script that performs an API streaming using requests.post(). It looks like so:
response = requests.post(url_events, data="XYZ", stream=True, headers={"A": "B"})
if response.ok:
    for chunk in response.iter_content(chunk_size=256):
        print(chunk)
I'm trying to figure out how can I have the same logic but using C++. From what I found the curl library may help, however I cannot find how to pass data field. This is the code I have so far:
CURL* connection = curl_easy_init();
// set url
curl_easy_setopt(connection, CURLOPT_URL, url_events);
// set header
struct curl_slist* headers = NULL;
headers = curl_slist_append(headers, "A:B");
code = curl_easy_setopt(connection, CURLOPT_HTTPHEADER, headers);
// set streaming callback that will print every received message
curl_easy_setopt(connection, CURLOPT_WRITEFUNCTION, printCallback);
// start connection
code = curl_easy_perform(connection);
// ...
curl_easy_cleanup(connection);
curl_slist_free_all(headers);
I was looking through the curl.h file trying to find how to specify the data field, but nothing seems to fit (based on the name)?
Am I on the right track? Would curl be the right approach for my task, or should I be looking into some other C/C++ libraries? An example that does the same task as the requests.post() call above would be appreciated, or a suggestion on how to achieve the same using curl.

How to send a request to the WooCommerce API

I'm currently building a solution for a company as an intern, and I need to use the WooCommerce REST API features in my C++ project to send data to the website.
I've so far, after 2 long painful days, managed to install the cURL library (through vcpkg) and tested the library a bit with the many examples that you can find on the internet. But for now, what I found doesn't seem to match with what the people at WooCommerce put in their documentation.
For example, in this section, they show how to create a product on the platform using cURL, but I can't understand how to translate it in cURL language inside the C++ project. Heck, the command doesn't even work when I use it in the command prompt with my parameters.
#include <curl/curl.h>
#include <string>

// cURL declaration
CURL* curl;
CURLcode res;
std::string readBuffer;
std::string URL = "http://www.example.com";

curl_global_init(CURL_GLOBAL_ALL);
curl = curl_easy_init();
if (curl) {
    // libcurl expects a C string, so pass URL.c_str() rather than the std::string itself
    curl_easy_setopt(curl, CURLOPT_URL, URL.c_str());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
    res = curl_easy_perform(curl);
    // Check for errors
    if (res != CURLE_OK) {
        std::string error = "curl_easy_perform() failed: ";
        error += curl_easy_strerror(res);
        error += "\nImpossible de se connecter au site WooCommerce fourni. Veuillez verifier vos paramètres et redémarrer l'application.";
        wxMessageBox(error);
    }
    else {
        std::string success = "Connexion au domaine ";
        success += URL;
        success += " réussie.\nPour changer de domaine, veuillez consulter la page Paramètres.";
        wxMessageBox(success);
    }
}
// cleanup
curl_easy_cleanup(curl);
curl_global_cleanup();
This code works fine. I know that I have to use the company's website instead of the example, but I can't figure out where to add my client key and client secret (basically like in the example shown in the WooCommerce doc). The basic cURL commands work fine in my local command prompt, but the example doesn't even work.
I know that my request for help may be kind of basic and easy to solve but I just spent the last 2 days and a half working on this and I'm starting to lose it.
Thanks for your help, I tried to speak the best english I could, so sorry in advance for any typo, or sorry if my post doesn't live up to the presentation standards of this platform, I'm kinda new around here :D
OK, I've figured it out, for those who pass by and may have the same problem I had. The commands you run with cURL in the terminal and the calls you make with the library are totally different:
In the command prompt, you have to enter curl -X POST https://blablablabla
In the C++ library, you have to call the curl_easy_setopt() function with parameters that specify each component of the request: CURLOPT_URL is the URL of the request, CURLOPT_POSTFIELDS is the data you want to POST, and other options such as CURLOPT_WRITEFUNCTION and CURLOPT_WRITEDATA handle the response from the server.
For me, this example was really useful, I don't know how I could have missed it :D Thanks Jesper Juhl for the advice, it is crucial to understand how HTTP and HTTPS works to figure this out.

How to submit a form in this page c++ [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I want to create a void function that submits a form on this page: https://fs9.formsite.com/9jr4Rm/x1ncox1ipr/
How do I do it?
void sendForm(char *answer)
{
//??
}
You can also do something like this: https://fs9.formsite.com/9jr4Rm/x1ncox1ipr/fill?1=hello
I don't want the program to open a site, just to send the data.
I'm not exactly sure what you're planning to achieve, but here's my guess:
Your easiest solution is to use libcurl (available on various OSes):
Documentation is here: https://curl.haxx.se/libcurl/c/
Or, if you prefer examples (I do) here's a simple example to post some stuff on a remote server using http:
https://curl.haxx.se/libcurl/c/http-post.html
If you want to use GET (like the ?1=hello parameter), you can try the same without the CURLOPT_POSTFIELDS line. Just use:
curl_easy_setopt(curl, CURLOPT_URL, "https://fs9.formsite.com/9jr4Rm/x1ncox1ipr/fill?1=hello");
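One detail worth sketching for the GET variant: if the answer text can contain spaces, '&', '=' or other special characters, it has to be percent-encoded before it is appended to the URL. The helpers below are a minimal sketch; the names url_encode and build_fill_url are hypothetical, and the field name "1" is taken from the example URL in the question.

```cpp
#include <cctype>
#include <string>

// Percent-encode a value for use in a URL query string.
// Unreserved characters (letters, digits, '-', '.', '_', '~') pass through unchanged;
// every other byte becomes %XX with uppercase hex digits.
std::string url_encode(const std::string& value) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : value) {
        if (std::isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: build the GET URL that fills field "1" of the form.
std::string build_fill_url(const std::string& answer) {
    return "https://fs9.formsite.com/9jr4Rm/x1ncox1ipr/fill?1=" + url_encode(answer);
}
```

When libcurl is already in use, its own curl_easy_escape() function does the same job, so a hand-written encoder like this is mainly useful for the raw socket approach.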
If you can't (or don't want to) use curl, you can implement HTTP over a simple socket connection. I've done that on an arduino, no problem. Connect to fs9.formsite.com's TCP port 80 (it's the default), send:
GET /9jr4Rm/x1ncox1ipr/fill?1=hello HTTP/1.1\r\n
Host: fs9.formsite.com\r\n
Connection: close\r\n
\r\n
(where \r\n is a CRLF, i.e. DOS-style, line ending; HTTP requires it, although many servers also accept a bare \n)
You could tell us more about the system you'd like this to run on for better responses (e.g. OS, maybe compiler name would be nice).
Edit: I've modified the above curl example. Here it is:
#include <stdio.h>
#include <curl/curl.h>

CURL *curl;
CURLcode res;

void sendForm(const char *answer)
{
    curl = curl_easy_init();
    if(curl)
    {
        curl_easy_setopt(curl, CURLOPT_URL, answer);
        res = curl_easy_perform(curl);
        if(res != CURLE_OK)
            fprintf(stderr, "curl_easy_perform() failed: %s\n", curl_easy_strerror(res));
        curl_easy_cleanup(curl);
    }
    else
    {
        fprintf(stderr, "curl is NULL\n");
    }
}

int main(void)
{
    curl_global_init(CURL_GLOBAL_ALL);
    sendForm("https://fs9.formsite.com/9jr4Rm/x1ncox1ipr/fill?1=hello");
    curl_global_cleanup();
    return 0;
}
Then you can compile it with whatever you use. I use gcc on Linux:
gcc curltest.c -l curl -o curltest
Then run:
./curltest
Just one more suggestion: don't use void for this return type. Network connection should always be considered unreliable, and returning a success/error code is nice.

Sending and receiving strings over http via curl

I have a situation where my program on a server (a Windows machine) outputs some strings. I need to send those strings from the server to the client via HTTP using curl. Once they are sent, I need to receive the data on the client side as a string, decode it and perform subsequent actions.
I already achieved this using C sockets with the Berkeley API, as I was familiar with that. But for some reason I am not allowed to use a program of my own.
I poked around and it seems curl could be my solution. However, I am very new to curl and can't figure out how to achieve this with it. On the client side I found this, which may be useful:
#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
    CURL *curl;
    CURLcode res;

    curl = curl_easy_init();
    if(curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "http://example.com");

        /* Perform the request, res will get the return code */
        res = curl_easy_perform(curl);
        /* Check for errors */
        if(res != CURLE_OK)
            fprintf(stderr, "curl_easy_perform() failed: %s\n",
                    curl_easy_strerror(res));

        /* always cleanup */
        curl_easy_cleanup(curl);
    }
    return 0;
}
I understand that you have to use write callback functions to receive data?
Also, on the client side I need to develop a program using curl that receives and decodes a string whenever the server sends one. Any pointers to tutorials related to these specific problems would be highly appreciated. Or if someone has already tried this, I'd highly appreciate any help.
Thanks.
Take a look at this example code from their site. It details how to get your response data written to a region of memory rather than a file:
http://curl.haxx.se/libcurl/c/getinmemory.html
Also take a look at the generic tutorial on the curl website:
http://curl.haxx.se/libcurl/c/libcurl-tutorial.html
One final thing to consider: if you are using C++, make sure your callbacks are free functions or static member functions, not non-static member functions (see here: libcurl - unable to download a file).
This should get you started at least.
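To illustrate the point about callbacks, here is a minimal sketch of two callback shapes that fit the plain function pointer libcurl expects for CURLOPT_WRITEFUNCTION: a free function and a static member function. The names append_to_string and ResponseBuffer are hypothetical, and the curl_easy_setopt calls appear only in a comment so the block does not depend on libcurl being installed.

```cpp
#include <cstddef>
#include <string>

// Free function with the write callback signature. `userdata` is whatever pointer
// was registered with CURLOPT_WRITEDATA; here it is assumed to be a std::string*.
std::size_t append_to_string(char* ptr, std::size_t size, std::size_t nmemb, void* userdata) {
    const std::size_t bytes = size * nmemb;
    static_cast<std::string*>(userdata)->append(ptr, bytes);
    return bytes;  // returning a different count tells libcurl the write failed
}

// A static member function also works: it has no hidden `this` parameter,
// so the object is passed explicitly through `userdata`.
class ResponseBuffer {
public:
    static std::size_t on_data(char* ptr, std::size_t size, std::size_t nmemb, void* userdata) {
        ResponseBuffer* self = static_cast<ResponseBuffer*>(userdata);
        self->body_.append(ptr, size * nmemb);
        return size * nmemb;
    }
    const std::string& body() const { return body_; }

private:
    std::string body_;
};

// Registration would look like this (not compiled here):
//   std::string body;
//   curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, append_to_string);
//   curl_easy_setopt(curl, CURLOPT_WRITEDATA, &body);
```

The callback may be invoked several times per transfer, once for each chunk received, which is why both versions append to the buffer instead of overwriting it.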

Downloading a file from URL to disk in C++

I have a simple question. Is it possible to write simple C++ code (for Mac OS X) to download a file from the internet (from a URL to disk) without using libraries like curl?
I have seen some examples, but all of them use the curl library.
I use this code in my Xcode project, but I get some compilation (linking) errors:
#define CURL_STATICLIB
#include <stdio.h>
#include <curl/curl.h>
/* <curl/types.h> no longer exists in current libcurl releases; <curl/curl.h> is enough */
#include <curl/easy.h>
#include <string>

/* Write callback: libcurl passes the pointer set with CURLOPT_WRITEDATA as the last argument */
size_t write_data(void *ptr, size_t size, size_t nmemb, FILE *stream) {
    size_t written;
    written = fwrite(ptr, size, nmemb, stream);
    return written;
}

int main(void) {
    CURL *curl;
    FILE *fp;
    CURLcode res;
    const char *url = "http://localhost/aaa.txt";
    char outfilename[FILENAME_MAX] = "bbb.txt";

    curl = curl_easy_init();
    if (curl) {
        fp = fopen(outfilename, "wb");
        if (fp) {
            curl_easy_setopt(curl, CURLOPT_URL, url);
            curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
            curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
            res = curl_easy_perform(curl);
            if (res != CURLE_OK)
                fprintf(stderr, "curl_easy_perform() failed: %s\n", curl_easy_strerror(res));
            fclose(fp);
        }
        curl_easy_cleanup(curl);
    }
    return 0;
}
How can I link the curl library to my Xcode project?
You can launch a console command, it is very simple :D
system("curl -o ...")
or
system("wget ...")
"Downloading a file from a URL" basically means doing a GET request to some remote HTTP server, so your application needs to know how to make that HTTP request.
But HTTP is now quite a complex protocol. Its specification alone is long (more than a hundred pages). libcurl is a good library implementing it.
Why do you want to avoid using a good free library implementing a complex protocol? Of course, you could implement the full HTTP protocol yourself (which would probably take years of work), or write a minimal program that doesn't implement all the details of HTTP but might work (though not with unusual HTTP servers).
You have to learn a bit of "socket programming" and implement a very basic HTTP client. The minimal approach is to send a string like "GET /this/path/to/file.png HTTP/1.0\r\n\r\n" to the site; it will then likely answer with an HTTP header that you have to parse to learn at least the length of the binary data that follows (if the request succeeded; otherwise you have to handle HTTP errors, or an unexpected content type such as an HTML page).
This guide should give you the basics to start with. As for HTTP, it depends on your needs: sometimes sending a "raw" GET is enough, sometimes not.
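The header parsing step described above could be sketched as follows. This is a minimal illustration, not a full HTTP parser: it assumes the whole response has already been read into one string, it ignores chunked transfer encoding, and the names HttpResponse and parse_response are hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

struct HttpResponse {
    int status = 0;            // e.g. 200 or 404
    long content_length = -1;  // -1 means the Content-Length header was absent
    std::string body;
};

// Split a raw response into status code, Content-Length and body.
// Returns false when the text does not look like an HTTP response.
bool parse_response(const std::string& raw, HttpResponse& out) {
    const std::size_t header_end = raw.find("\r\n\r\n");
    if (header_end == std::string::npos) return false;

    // Status line: "HTTP/1.x <3-digit code> <reason>"
    const std::size_t space = raw.find(' ');
    if (space == std::string::npos || space + 4 > header_end) return false;
    for (std::size_t i = 1; i <= 3; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(raw[space + i]))) return false;
    }
    out.status = std::stoi(raw.substr(space + 1, 3));

    // Header names are case-insensitive, so search a lowercased copy.
    std::string headers = raw.substr(0, header_end);
    std::transform(headers.begin(), headers.end(), headers.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    const std::string key = "\r\ncontent-length:";
    out.content_length = -1;
    const std::size_t pos = headers.find(key);
    if (pos != std::string::npos) {
        try {
            out.content_length = std::stol(headers.substr(pos + key.size()));
        } catch (...) {
            return false;  // header present but not a number
        }
    }

    out.body = raw.substr(header_end + 4);  // everything after the blank line
    return true;
}
```

A caller would check that status is 200 before writing body to disk, and can compare body.size() with content_length to detect a truncated download.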
EDIT
Changed the request so that it appears to come from an HTTP/1.0 client, since HTTP/1.1 requires the Host header to be sent, as a commenter has rightly pointed out.
EDIT2
The OP changed the question, which became something about how to link with a library in Xcode. There's already a similar question on SO.