My current curl setup calls a webpage, saves it into a string, and repeats the process after sleeping for a second. This is the code that writes into the string:
#include <curl/curl.h>
#include <string>
#include <iostream>
#include <thread>
#include <chrono>
size_t curl_writefunc(void* ptr, size_t size, size_t nmemb, std::string* data)
{
data->append((const char*)ptr, size * nmemb);
return size * nmemb;
}
void curl_handler(std::string& data)
{
long http_code = 0; // CURLINFO_RESPONSE_CODE expects a pointer to long
CURL* curl;
// Initialize cURL
curl = curl_easy_init();
// Set the function to call when there is new data
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, curl_writefunc);
// Set the parameter to append the new data to
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &data);
// Set the URL to download; just for this question.
curl_easy_setopt(curl, CURLOPT_URL, "http://www.example.com/");
// Download
curl_easy_perform(curl);
// Get the HTTP response code
curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &http_code);
// Clean up
curl_easy_cleanup(curl);
curl_global_cleanup();
}
int main()
{
bool something = true;
std::string data;
while (something)
{
curl_handler(data);
std::cout << data << '\n';
data.clear();
std::this_thread::sleep_for(std::chrono::seconds(1));
}
}
However it runs into a problem about 20 minutes into runtime and this is the message it confronts me with:
140377776379824:error:02001018:system library:fopen:Too many open files:bss_file.c:173:fopen('/etc/ssl/openssl.cnf','rb')
140377776379824:error:2006D002:BIO routines:BIO_new_file:system lib:bss_file.c:178:
140377776379824:error:0E078002:configuration file routines:DEF_LOAD:system lib:conf_def.c:199:
It seems to stem from an OpenSSL file that is not closed once it has fulfilled its task in a single iteration. When the function runs more than once, the open files add up and are bound to cause an error at some point.
I am still very much a beginner programmer and therefore don't want to start messing with OpenSSL, so I came here to ask whether there is a solution for this kind of problem. Could it be solved by declaring the curl object outside of the repeatedly called function?
All you have to do is declare the handle and set its options once, before the loop that fetches the data. Only the actual download and the retrieval of its response code are then repeated in the loop. Re-using a handle as often as needed is encouraged, since some of its resources (like the files opened in this session) can be used again by the next transfer.
Goal: To send requests to the same URL without having to wait for the request-sending function to finish executing.
Currently, when I send a request to a URL, I have to wait around 10 ms for the server's response before sending another request using the same function. The aim is to detect changes on a webpage slightly faster than the program currently does, so I want the WHILE loop to behave in a non-blocking manner.
Question: Using libcurl C++, if I have a WHILE loop that calls a function to send a request to a URL, how can I avoid waiting for the function to finish executing before sending another request to the SAME URL?
Note: I have been researching libcurl's multi-interface but I am struggling to determine if this interface is more suited to parallel requests to multiple URLs rather than sending requests to the same URL without having to wait for the function to finish executing each time. I have tried the following and looked at these resources:
an attempt at multi-threading a C program using libcurl requests
How to do curl_multi_perform() asynchronously in C++?
http://www.godpatterns.com/2011/09/asynchronous-non-blocking-curl-multi.html
https://curl.se/libcurl/c/multi-single.html
https://curl.se/libcurl/c/multi-poll.html
Here is my attempt at sending a request to one URL, but I have to wait for the request() function to finish and return a response code before sending the same request again.
#include <vector>
#include <string>
#include <iostream>
#include <curl/curl.h>
size_t write_callback(char *ptr, size_t size, size_t nmemb, void *userdata) {
std::vector<char> *response = reinterpret_cast<std::vector<char> *>(userdata);
response->insert(response->end(), ptr, ptr + size * nmemb);
return size * nmemb; // number of bytes handled, as libcurl expects
}
long request(CURL *curl, const std::string &url) {
std::vector<char> response;
long response_code = 0;
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);
auto res = curl_easy_perform(curl);
// The response code is only valid after the transfer has completed
if (res == CURLE_OK)
curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
// ...
// Print variable "response"
// ...
return response_code;
}
int main() {
curl_global_init(CURL_GLOBAL_ALL);
CURL *curl = curl_easy_init();
while (true) {
// blocking: request() must complete before executing again
long response_code = request(curl, "https://example.com");
// ...
// Some condition breaks loop
}
curl_easy_cleanup(curl);
curl_global_cleanup();
return 0;
}
I'm at a point where I have tried to understand the multi-interface documentation as best as possible, but still struggle to fully understand it / determine if it's actually suited to my particular problem. Apologies if this question appears to have not provided enough of my own research, but there are gaps in my libcurl knowledge I'm struggling to fill.
I'd appreciate it if anyone could suggest / explain ways in which I can modify my single libcurl example above to behave in a non-blocking manner.
EDIT:
When I run the program below, taken from libcurl's C example called "multi-poll", the URL's content is printed. But because it only prints once despite the WHILE (1) loop, I'm confused as to whether it is sending repeated non-blocking requests to the URL (which is the aim), or sending just one request and then waiting on some other change/event.
#include <stdio.h>
#include <string.h>
/* somewhat unix-specific */
#include <sys/time.h>
#include <unistd.h>
/* curl stuff */
#include <curl/curl.h>
int main(void)
{
CURL *http_handle;
CURLM *multi_handle;
int still_running = 1; /* keep number of running handles */
curl_global_init(CURL_GLOBAL_DEFAULT);
http_handle = curl_easy_init();
curl_easy_setopt(http_handle, CURLOPT_URL, "https://example.com");
multi_handle = curl_multi_init();
curl_multi_add_handle(multi_handle, http_handle);
while (1) {
CURLMcode mc; /* curl_multi_poll() return code */
int numfds;
/* we start some action by calling perform right away */
mc = curl_multi_perform(multi_handle, &still_running);
if(still_running) {
/* wait for activity, timeout or "nothing" */
mc = curl_multi_poll(multi_handle, NULL, 0, 1000, &numfds);
}
// if(mc != CURLM_OK) {
// fprintf(stderr, "curl_multi_wait() failed, code %d.\n", mc);
// break;
// }
}
curl_multi_remove_handle(multi_handle, http_handle);
curl_easy_cleanup(http_handle);
curl_multi_cleanup(multi_handle);
curl_global_cleanup();
return 0;
}
You need to move curl_multi_add_handle and curl_multi_remove_handle inside the while loop. Below is an extract from the curl documentation at https://curl.se/libcurl/c/libcurl-multi.html:
> When a single transfer is completed, the easy handle is still left added to the multi stack. You need to first remove the easy handle with curl_multi_remove_handle and then close it with curl_easy_cleanup, or possibly set new options to it and add it again with curl_multi_add_handle to start another transfer.
Scenario:
Before updating at a scheduled time, a web page has a HTTP status code of 503. When new data is added to the page after the scheduled time, the HTTP status code changes to 200.
Goal:
Using a non-blocking loop, to detect this change in the HTTP status code from 503 to 200 as fast as possible. With the current code seen further below, a WHILE loop successfully listens for the change in HTTP status code and prints out a success statement. Once 200 is detected, a break statement stops the loop.
However, it seems that the program must wait for a response every time a HTTP request is made before moving to the next WHILE loop iteration, behaving in a blocking manner.
Question:
Using libcurl C++, how can the below program be modified to transmit requests (to a single URL) to detect a HTTP status code change without having to wait for the response before sending another request?
Please note: I am aware that excessive requests may be deemed as unfriendly (this is an experiment for my own URL).
Before posting this question, the following SO questions and resources have been consulted:
How to do curl_multi_perform() asynchronously in C++?
Is curl_easy_perform() synchronous or asynchronous?
http://www.godpatterns.com/2011/09/asynchronous-non-blocking-curl-multi.html
https://curl.se/libcurl/c/multi-single.html
https://curl.se/libcurl/c/multi-poll.html
What's been tried so far:
Using multi-threading with a FOR loop in C to repeatedly call function to detect HTTP code change, which had a slight latency advantage. See code here: https://pastebin.com/73dBwkq3
Utilised OpenMP, again when using a FOR loop instead of the original WHILE loop. Latency advantage wasn't substantial.
Using the libcurl documentation C tutorials to try to replicate a program that listens to just one URL for changes, using the asynchronous multi-interface with difficulty.
Current attempt using the easy interface (curl_easy_setopt and curl_easy_perform):
#include <iostream>
#include <iomanip>
#include <vector>
#include <string>
#include <curl/curl.h>
// Function for writing callback
size_t write_callback(char *ptr, size_t size, size_t nmemb, void *userdata) {
std::vector<char> *response = reinterpret_cast<std::vector<char> *>(userdata);
response->insert(response->end(), ptr, ptr + size * nmemb);
return size * nmemb; // number of bytes handled, as libcurl expects
}
long request(CURL *curl, const std::string &url) {
std::vector<char> response;
long response_code = 0;
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);
auto res = curl_easy_perform(curl);
// The response code is only valid after the transfer has completed
if (res == CURLE_OK)
curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
if (response_code == 200) {
std::cout << "SUCCESS" << std::endl;
}
return response_code;
}
int main() {
curl_global_init(CURL_GLOBAL_ALL);
CURL *curl = curl_easy_init();
while (true) {
long response_code = request(curl, "www.example.com");
if (response_code == 200) {
break; // Page updated
}
}
curl_easy_cleanup(curl);
curl_global_cleanup();
return 0;
}
Summary:
Using C++ and libcurl, does anyone know how a WHILE loop can be used to repeatedly send a request to one URL only, without having to wait for the response in between sending requests? The aim of this is to detect the change as quickly as possible.
I understand that there is ample libcurl documentation, but have had difficulties grasping the multi-interface aspects to help apply them to this issue.
/* get us the resource without a body - use HEAD! */
curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);
If HEAD does not work for you (the server may reject HEAD requests), here is another solution:
size_t header_callback(char *buffer, size_t size, size_t nitems, void *userdata) {
CURL *curl = static_cast<CURL *>(userdata); // easy handle, passed in via CURLOPT_HEADERDATA
long response_code = 0;
curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code);
if (response_code != 200)
return 0; // Aborts the request.
return size * nitems;
}
curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, header_callback);
curl_easy_setopt(curl, CURLOPT_HEADERDATA, curl);
The second solution consumes more network traffic, so HEAD is much better. Once you receive 200, you can send a GET request.
I'm trying to teach myself C++ by writing a simple program that sends a cURL request to a JSON API, parses the data and then stores it either in a text document or database for a web application to access. I have done this task in PHP and figured C++ wouldn't be much harder but I can't even get cURL to return a string and display it.
I get this to compile with no errors, but the response "JSON data: " doesn't display anything where the JSON data should be.
Where did I go wrong? URL-to-API is the actual URL, so I believe I'm using a wrong setopt function, or not setting one. In PHP, "CURLOPT_RETURNTRANSFER" made it return as a string, but I get an error with it:
error: ‘CURLOPT_RETURNTRANSFER’ was not declared in this scope
curl_easy_setopt(curl, CURLOPT_RETURNTRANSFER, true);
I'm using g++ compiler on Ubuntu and added -lcurl to the command line argument.
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>
#include <curl/curl.h>
//#include "json.hpp"
using namespace std;
//using json = nlohmann::json;
size_t WriteCallback(char *contents, size_t size, size_t nmemb, void *userp) {
((std::string*)userp)->append((char*)contents, size * nmemb);
return size * nmemb;
}
string getJSON(string URL) {
CURL *curl;
CURLcode res;
string readBuffer;
curl = curl_easy_init();
if(curl) {
curl_easy_setopt(curl, CURLOPT_HEADER, 1);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, false);
curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, true); // follow redirect
//curl_easy_setopt(curl, CURLOPT_RETURNTRANSFER, true); // return as string
curl_easy_setopt(curl, CURLOPT_HEADER, false);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
curl_easy_setopt(curl, CURLOPT_URL, URL);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
res = curl_easy_perform(curl);
/* always cleanup */
curl_easy_cleanup(curl);
return readBuffer;
}
return ""; // curl_easy_init() failed; "return 0" would build a std::string from a null pointer
}
int main() {
string data = getJSON("URL-to-api");
cout << "JSON Data: \n" << data;
return 0;
}
When I uncomment the JSON for Modern C++ include and namespace line I get this error:
error This file requires compiler and library support for the ISO C++ 2011 standard. This support must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
Along with a bunch of errors for functions in that library. I just downloaded the most recent version of g++ before embarking on this project, so what do I need to do?
I'm using g++ 5.4.0 on Ubuntu.
UPDATE:
So I added a check under res = curl_easy_perform(curl) and it doesn't return the error message, and res gets displayed as 6. This seems to be much more difficult than it should be:
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>
#include <curl/curl.h>
//#include "json.hpp"
using namespace std;
//using json = nlohmann::json;
size_t WriteCallback(char *contents, size_t size, size_t nmemb, void *userp) {
((std::string*)userp)->append((char*)contents, size * nmemb);
return size * nmemb;
}
string getJSON(string URL) {
CURL *curl;
CURLcode res;
string readBuffer;
curl = curl_easy_init();
if(curl) {
curl_easy_setopt(curl, CURLOPT_HEADER, 1);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, false);
curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, true); // follow redirect
curl_easy_setopt(curl, CURLOPT_HEADER, false);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
curl_easy_setopt(curl, CURLOPT_URL, URL);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
res = curl_easy_perform(curl);
cout << res << endl;
if (!res) {
cout << "cURL didn't work\n";
}
/* always cleanup */
curl_easy_cleanup(curl);
curl = NULL;
return readBuffer;
}
return ""; // curl_easy_init() failed
}
int main() {
string data = getJSON("");
cout << "JSON Data: \n" << data;
return 0;
}
I get the following output when I run the program:
6
JSON Data:
In PHP "CURLOPT_RETURNTRANSFER" made it return as a string but I get an error:
error: ‘CURLOPT_RETURNTRANSFER’ was not declared in this scope
curl_easy_setopt(curl, CURLOPT_RETURNTRANSFER, true);
There is no CURLOPT_RETURNTRANSFER option documented for curl_easy_setopt(). It is an option specific to PHP's curl extension (it controls what curl_exec() returns) and doesn't exist in libcurl itself. CURLOPT_WRITEFUNCTION is the correct way to go in this situation.
When I uncomment the JSON for Modern C++ include and namespace line I get:
error This file requires compiler and library support for the ISO C++ 2011 standard. This support must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
The error is self-explanatory. Your JSON library requires C++11, but you are not compiling with C++11 enabled. Some compilers still default to an older C++ version (usually C++98) and require you to explicitly enable C++11 (or later) when invoking the compiler on the command line or in your project's makefile configuration. For g++, that means adding -std=c++11 to the command line, next to the -lcurl you already pass.
In the case of g++, the current version (8.2) defaults to (the GNU dialect of) C17 for C and C++14 for C++, if not specified otherwise via the -std parameter. Your version (5.4) defaults to (the GNU dialect of) C11 and C++98, respectively.
UPDATE: there are other mistakes in your code:
You are passing a std::string object to curl_easy_setopt() where a char* pointer is expected for CURLOPT_URL. You need to change this:
curl_easy_setopt(curl, CURLOPT_URL, URL);
To this instead:
curl_easy_setopt(curl, CURLOPT_URL, URL.c_str());
You are not testing the return value of curl_easy_perform() correctly. Per the documentation, curl_easy_perform() returns 0 (CURLE_OK) on success, and non-zero on error, so you need to change this:
if (!res)
To this instead:
if (res != CURLE_OK)
So I added a check under res = curl_easy_perform(curl) ..., and res gets displayed as 6.
That is CURLE_COULDNT_RESOLVE_HOST, which makes sense as your updated example is passing a blank URL to getJSON():
string data = getJSON(""); // should be "URL-to-api" instead!
I have a long base64 encoded text string. It's about 1024 characters. From my Objective C code, I want to send it to my PHP script, have it dump it to a log, and return an "OK" response back. I tried this cookbook example, but it only has an example of upload and download (not both combined), and it doesn't even work in my case.
I'd be willing to switch this to a C++ solution if I knew how.
The Objective C Client Code (command line client)
NSString *sMessage = @"My Long Base64 Encoded Message";
NSString *sURL = @"http://example.com/request.php";
NSURL *oURL = [NSURL URLWithString:sURL];
NSData *data = [NSData dataWithBytes:sMessage.UTF8String length:sMessage.length];
NSURLSessionDataTask *downloadTask = [[NSURLSession sharedSession]
dataTaskWithURL:oURL completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
NSLog(@"\n\nDATA\n\n%@",data);
NSLog(@"\n\nRESPONSE\n\n%@",response);
NSLog(@"\n\nERROR\n\n%@",error);
}];
[downloadTask resume];
The PHP Web Server Code
<?php
error_reporting(E_ALL);
ini_set('display_errors','On');
$sRaw = file_get_contents('php://input');
file_put_contents('TEST.TXT',$sRaw);
die('OK');
There's a far easier route using ordinary C++. You'll have to rename your .m file to .mm in order to be able to mix Objective C and C++ code.
The PHP code is good and doesn't require a revision. Here's the C++ example I used that worked. It uses the STL and curl. I was doing this on a Mac, and by default OSX has the curl libraries pre-installed. Note that the example below is synchronous: it blocks program execution until the server call is completed. (I wanted this in my case; you may not.)
The C++ Client Code (class)
#pragma once
#include <string>
#include <sstream>
#include <iostream>
#include <curl/curl.h>
class Webby {
public:
static size_t write_data(void *ptr, size_t size, size_t nmemb, void *stream) {
std::string buf = std::string(static_cast<char *>(ptr), size * nmemb);
std::stringstream *response = static_cast<std::stringstream *>(stream);
response->write(buf.c_str(), (std::streamsize)buf.size());
return size * nmemb;
}
static std::string sendRawHTTP(std::string sHostURL, std::string &sStringData) {
CURL *curl;
CURLcode res;
curl = curl_easy_init();
if (curl) {
std::stringstream response;
curl_easy_setopt(curl, CURLOPT_URL, sHostURL.c_str());
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, sStringData.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, Webby::write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);
res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
return response.str();
}
return "";
}
}; // end class
My problem is that in one part of my program a double variable gets incorrectly set to 2.71179e-308 on one specific (virtual) computer (not any other). No crashes or anything like that. After much work I narrowed the problem down to a call to curl_easy_perform (the call has no other connection with said variable). If I fake the curl_easy_perform call and return before it executes the variable is not modified.
My first thought was that there was some problem with the callback write function, but having looked long and hard at it I can't find anything wrong, and replacing it with an empty function still didn't help the error.
My second thought was that perhaps my curl settings strings went out of scope before being used, but in my version of curl (7.43.0) they should be copied and stored by curl itself.
Now I'm at my wits' end, so I turn to SO for some sort of hint at what could be wrong. This is the code for the class I use for all communication:
Header:
#include <string>
class HTTPClient
{
public:
enum Verb
{
V_GET,
V_POST,
V_PUT,
V_DELETE,
};
// Constructor is not thread safe. Make sure to initialize in main thread.
HTTPClient(const std::string& userAgent, const std::string& certPath, const std::string& cookieFile, const std::string& proxy = "");
~HTTPClient(void);
int MakeRequest(Verb verb, const std::string& url, int& responseCode, std::string& response);
private:
static size_t WriteFunction(void *ptr, size_t size, size_t nmemb, void *custom);
void* m_curl;
};
Source:
#include "HTTPClient.h"
#include <curl/curl.h>
HTTPClient::HTTPClient(const std::string& userAgent, const std::string& certPath, const std::string& cookieFile, const std::string& proxy)
{
curl_global_init(CURL_GLOBAL_DEFAULT);
m_curl = curl_easy_init();
if (m_curl)
{
// Indicate where the certificate is located
curl_easy_setopt(m_curl, CURLOPT_CAINFO, certPath.c_str());
// Indicate where the cookie file is located
curl_easy_setopt(m_curl, CURLOPT_COOKIEFILE, cookieFile.c_str());
curl_easy_setopt(m_curl, CURLOPT_COOKIEJAR, cookieFile.c_str());
// Set user agent
curl_easy_setopt(m_curl, CURLOPT_USERAGENT, userAgent.c_str());
// Set the response function
curl_easy_setopt(m_curl, CURLOPT_WRITEFUNCTION, &HTTPClient::WriteFunction);
// Set proxy if specified
if (!proxy.empty())
curl_easy_setopt(m_curl, CURLOPT_PROXY, proxy.c_str());
}
}
HTTPClient::~HTTPClient(void)
{
if (m_curl)
curl_easy_cleanup(m_curl);
curl_global_cleanup();
}
int HTTPClient::MakeRequest(Verb verb, const std::string& url, int& responseCode, std::string& response)
{
std::string protocol, server, path, parameters;
if (!m_curl)
return CURLE_FAILED_INIT;
// Set response data
response.clear();
curl_easy_setopt(m_curl, CURLOPT_WRITEDATA, &response);
switch (verb)
{
case V_GET:
curl_easy_setopt(m_curl, CURLOPT_CUSTOMREQUEST, "GET");
curl_easy_setopt(m_curl, CURLOPT_URL, url.c_str());
break;
// Other cases removed for brevity
}
// Execute command
CURLcode res = curl_easy_perform(m_curl); // <-- When this executes the variable gets damaged
// Get response code
long code;
curl_easy_getinfo(m_curl, CURLINFO_RESPONSE_CODE, &code);
responseCode = code;
return res;
}
size_t HTTPClient::WriteFunction(void *ptr, size_t size, size_t nmemb, void *custom)
{
size_t full_size = size * nmemb;
std::string* p_response = reinterpret_cast<std::string*>(custom);
p_response->reserve(p_response->size() + full_size); // make room for the existing data plus the new chunk
for (size_t i=0; i<full_size; ++i)
{
char c = reinterpret_cast<char*>(ptr)[i];
p_response->push_back(c);
}
return full_size;
}
Edit:
I have now run tests with WinDbg and have found out pretty much what happens (but not how to solve the problem).
In the release build, the double that gets trashed is kept in the mmx7 register. It sits there unchanged until curl_easy_perform is called, and it is overwritten before curl_easy_perform calls WriteFunction.
Other than compiling my own version of curl so I can debug deeper I don't know how to get past this problem.