How to do curl_multi_perform() asynchronously in C++? - c++

So far I have been using curl synchronously to make an HTTP request. My question is: how can I do it asynchronously?
My searches led me to the documentation of the curl_multi_* interface, via this question and this example, but it didn't solve anything for me.
My simplified code:
CURLM *curlm;
int handle_count = 0;
curlm = curl_multi_init();

CURL *curl = NULL;
curl = curl_easy_init();

if(curl)
{
    curl_easy_setopt(curl, CURLOPT_URL, "https://stackoverflow.com/");
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeCallback);
    curl_multi_add_handle(curlm, curl);
    curl_multi_perform(curlm, &handle_count);
}

curl_global_cleanup();
The callback method writeCallback doesn't get called and nothing happens.
Please advise me.
EDIT:
According to @Remy's answer below, I got this far, but it seems that it's not quite what I really need, because using a loop like this is still blocking. Please tell me if I'm doing something wrong or misunderstanding something. I'm actually pretty new to C++.
Here's my code again:
int main(int argc, const char * argv[])
{
    using namespace std;

    CURLM *curlm;
    int handle_count;
    curlm = curl_multi_init();

    CURL *curl1 = NULL;
    curl1 = curl_easy_init();
    CURL *curl2 = NULL;
    curl2 = curl_easy_init();

    if(curl1 && curl2)
    {
        curl_easy_setopt(curl1, CURLOPT_URL, "https://stackoverflow.com/");
        curl_easy_setopt(curl1, CURLOPT_WRITEFUNCTION, writeCallback);
        curl_multi_add_handle(curlm, curl1);

        curl_easy_setopt(curl2, CURLOPT_URL, "http://google.com/");
        curl_easy_setopt(curl2, CURLOPT_WRITEFUNCTION, writeCallback);
        curl_multi_add_handle(curlm, curl2);

        CURLMcode code;
        while(1)
        {
            code = curl_multi_perform(curlm, &handle_count);
            if(handle_count == 0)
            {
                break;
            }
        }
    }

    curl_global_cleanup();
    cout << "Hello, World!\n";
    return 0;
}
I can now do two HTTP requests simultaneously. The callbacks are called, but they still have to finish before the following lines execute. Will I have to use a thread?

Read the documentation again more carefully, particularly these portions:
http://curl.haxx.se/libcurl/c/libcurl-multi.html
Your application can acquire knowledge from libcurl when it would like to get invoked to transfer data, so that you don't have to busy-loop and call that curl_multi_perform(3) like crazy. curl_multi_fdset(3) offers an interface using which you can extract fd_sets from libcurl to use in select() or poll() calls in order to get to know when the transfers in the multi stack might need attention. This also makes it very easy for your program to wait for input on your own private file descriptors at the same time or perhaps timeout every now and then, should you want that.
http://curl.haxx.se/libcurl/c/curl_multi_perform.html
When an application has found out there's data available for the multi_handle or a timeout has elapsed, the application should call this function to read/write whatever there is to read or write right now etc. curl_multi_perform() returns as soon as the reads/writes are done. This function does not require that there actually is any data available for reading or that data can be written, it can be called just in case. It will write the number of handles that still transfer data in the second argument's integer-pointer.
If the amount of running_handles is changed from the previous call (or is less than the amount of easy handles you've added to the multi handle), you know that there is one or more transfers less "running". You can then call curl_multi_info_read(3) to get information about each individual completed transfer, and that returned info includes CURLcode and more. If an added handle fails very quickly, it may never be counted as a running_handle.
When running_handles is set to zero (0) on the return of this function, there is no longer any transfers in progress.
In other words, you need to run a loop that polls libcurl for its status, calling curl_multi_perform() whenever there is data waiting to be transferred, repeating as needed until there is nothing left to transfer.
The blog article you linked to mentions this looping:
The code can be used like
Http http;
http.AddRequest("http://www.google.com");

// In some update loop called each frame
http.Update();
You are not doing any looping in your code; that is why your callback is not being called. No data has been received yet at the time of your single curl_multi_perform() call.
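To make that concrete, here is a minimal sketch of such a loop using curl_multi_fdset() and select(), along the lines of curl's own multi examples. It assumes your two easy handles have already been added to curlm as in your snippet, and it needs the usual POSIX headers (sys/select.h); on Windows you would use the winsock equivalents. Note that this loop still runs on whichever thread calls it, so if you want the rest of your program to keep going, you would run it on its own thread or fold it into your application's event loop:
int handle_count = 0;
curl_multi_perform(curlm, &handle_count);

while(handle_count)
{
    fd_set read_fds, write_fds, exc_fds;
    FD_ZERO(&read_fds);
    FD_ZERO(&write_fds);
    FD_ZERO(&exc_fds);
    int max_fd = -1;
    curl_multi_fdset(curlm, &read_fds, &write_fds, &exc_fds, &max_fd);

    // ask libcurl how long we may sleep at most
    long timeout_ms = -1;
    curl_multi_timeout(curlm, &timeout_ms);
    if(timeout_ms < 0)
        timeout_ms = 1000;

    struct timeval timeout;
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    if(max_fd == -1)
    {
        // libcurl has no sockets to watch yet (e.g. still resolving); wait a little
        struct timeval wait = { 0, 100 * 1000 };
        select(0, NULL, NULL, NULL, &wait);
    }
    else
    {
        // sleep until a socket needs attention or the timeout elapses
        select(max_fd + 1, &read_fds, &write_fds, &exc_fds, &timeout);
    }

    // let libcurl do whatever reading/writing it can right now
    curl_multi_perform(curlm, &handle_count);
}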

Related

Using libcurl "multi" interface for single file download - C++

I am creating a C++ DLL and I need to make a single non-blocking HTTP request. What I mean is that at a certain point in the DLL, I need to kick off a download of a file in the background, but resume the program after the single download is kicked off.
I am currently using libcurl as my HTTP client. I read that the multi interface (https://curl.haxx.se/libcurl/c/libcurl-multi.html) will give me non-blocking functionality, but all of the examples I see make multiple HTTP requests simultaneously and then wait for all of them to finish, with a while loop for example. Am I able to use this interface to make a single HTTP request and resume my program while it downloads? And if so, can it fire a callback upon completion? Or do I have to continually check the handle to see if the download is complete?
Example of single request with libcurl multi interface:
int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);

    // init easy curl session
    auto curl_easy = curl_easy_init();

    // set url
    auto url = "https://[foo].exe";
    curl_easy_setopt(curl_easy, CURLOPT_URL, url);

    // set curl to write file to directory
    auto path = "C:\\foo\\app.exe";
    auto filePtr = std::fopen(path, "wb");
    curl_easy_setopt(curl_easy, CURLOPT_WRITEFUNCTION, writeFile);
    curl_easy_setopt(curl_easy, CURLOPT_WRITEDATA, filePtr);

    // init multi curl session
    auto multi_handle = curl_multi_init();

    // add easy handle to multi handle
    curl_multi_add_handle(multi_handle, curl_easy);

    // drive the transfer until no handles are still running
    auto running = 0;
    auto res = curl_multi_perform(multi_handle, &running);
    while (running) {
        res = curl_multi_perform(multi_handle, &running);
    }
}
I'd like to not do the while (running) loop and instead go do other things in main while the file downloads. I may be misunderstanding the "non-blocking" nature of curl_multi_perform(). Is it non-blocking only in the sense that many requests can happen simultaneously? If so, I don't think using it to make a single request gets me anything I wouldn't have with curl_easy_perform. Apologies if this is veering outside of Stack Overflow; I don't want a libcurl tutorial. But should I instead be using something like a blocking libcurl call (curl_easy_perform) inside of std::async()?
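For reference, the std::async idea mentioned at the end of the question could look roughly like the sketch below. It is only an illustration, not a recommendation over the multi interface: the URL, the output path and writeFile are placeholders standing in for the question's own values, and error handling is omitted.
#include <curl/curl.h>
#include <cstdio>
#include <future>

static size_t writeFile(void *ptr, size_t size, size_t nmemb, void *stream) {
    return std::fwrite(ptr, size, nmemb, static_cast<std::FILE*>(stream));
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);

    // run the blocking easy transfer on a worker thread
    std::future<CURLcode> download = std::async(std::launch::async, [] {
        std::FILE *file = std::fopen("app.exe", "wb");   // placeholder output path
        CURL *curl = curl_easy_init();
        curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/app.exe"); // placeholder URL
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeFile);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, file);
        CURLcode res = curl_easy_perform(curl);          // blocks only this worker thread
        curl_easy_cleanup(curl);
        std::fclose(file);
        return res;
    });

    // ... main() can go do other things here while the file downloads ...

    CURLcode result = download.get(); // wait for completion only when the result is needed
    curl_global_cleanup();
    return result == CURLE_OK ? 0 : 1;
}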

curl multi interface program with progress meter

I came across this example that demonstrates the curl multi interface being used to download a single file: curl multi single. I have added this code to my program. My requirement is as follows.
I want to download and upload a file, and while the file is being downloaded/uploaded, I want the average upload/download rate to be displayed on the screen.
I was initially using the curl easy interface with a single call to curl_easy_perform. Since this is synchronous/blocking, I was not able to have the screen-update thread update the rate on the screen.
That is what drove me to switch to the curl multi interface (as it is non-blocking). After switching to the curl multi interface, I find that the screen update is still not happening. Is the curl multi interface expected to help in my situation? Are there any other solutions you can suggest?
This is the relevant portion of my code.
curl_multi_add_handle(m_multiCurl, m_curl);
curl_multi_perform(m_multiCurl, &stillRunning);

while(stillRunning) {
    CURLMcode mc;
    int numFds;

    mc = curl_multi_wait(m_multiCurl, NULL, 0, 1000, &numFds);
    if(mc != CURLM_OK) {
        m_logger->errorf("curl_multi_wait() failed, code %d.\n", mc);
        break;
    }

    if(!numFds) {
        repeats++;
        if(repeats > 1) {
            WAITMS(100);
        }
    } else {
        repeats = 0;
    }

    curl_multi_perform(m_multiCurl, &stillRunning);
}
Slightly counter-intuitively, you need to set CURLOPT_NOPROGRESS per easy handle (to zero) to get the progress meter output per easy handle to occur. See example below.
But, and this is I think a fairly important but: when you do more than one transfer concurrently, outputting the built-in progress meter per transfer is probably not what you want.
When doing more than one transfer at any one time, I would imagine that what you want to do is to implement the CURLOPT_XFERINFOFUNCTION callback and implement your own progress meter that can show the progress for all transfers at the same time in a nice way.
CURLOPT_NOPROGRESS example:
CURL *curl = curl_easy_init();
if(curl) {
curl_easy_setopt(curl, CURLOPT_URL, "http://example.com");
/* enable progress meter */
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L);
/* Perform the request */
curl_easy_perform(curl);
}
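For the per-transfer callback approach mentioned above, a minimal sketch of a CURLOPT_XFERINFOFUNCTION callback could look like this (the callback name and output format are just illustrative; with several concurrent transfers you would distinguish them via the pointer passed through CURLOPT_XFERINFODATA):
static int xferinfo(void *clientp, curl_off_t dltotal, curl_off_t dlnow,
                    curl_off_t ultotal, curl_off_t ulnow)
{
    (void)clientp; (void)ultotal; (void)ulnow;
    fprintf(stderr, "downloaded %" CURL_FORMAT_CURL_OFF_T " of %" CURL_FORMAT_CURL_OFF_T " bytes\r",
            dlnow, dltotal);
    return 0; /* returning non-zero aborts the transfer */
}

/* per easy handle: */
curl_easy_setopt(curl, CURLOPT_XFERINFOFUNCTION, xferinfo);
curl_easy_setopt(curl, CURLOPT_XFERINFODATA, NULL); /* becomes the clientp argument */
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L);     /* callbacks only fire with NOPROGRESS disabled */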

Libcurl progress callback not working with multi

I'm trying to manage the progress of a download with libcurl in C++.
I have managed to do this with curl_easy, but the issue with curl_easy is that it blocks the program until the request has completed.
I need to use curl_multi so the HTTP request is asynchronous, but when I try changing to curl_multi, my progress function stops working.
I have the following curl_easy request code:
int progressFunc(void* p, double TotalToDownload, double NowDownloaded, double TotalToUpload, double NowUploaded) {
    std::cout << TotalToDownload << ", " << NowDownloaded << std::endl;
    return 0;
}
FILE* file = std::fopen(filePath.c_str(), "wb");
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, false);
curl_easy_setopt(curl, CURLOPT_XFERINFOFUNCTION, progressFunc);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeData);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, file);
CURLcode res = curl_easy_perform(curl);
which works perfectly and prints to the console the progress of the download.
However, when trying to modify this code to use curl_multi instead, the file does not download correctly (shows 0 bytes) and the download progress callback function shows only 0, 0.
FILE* file = std::fopen(filePath.c_str(), "wb");
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, false);
curl_easy_setopt(curl, CURLOPT_XFERINFOFUNCTION, progressFunc);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeData);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, file);
curl_multi_add_handle(curlm, curl);
int runningHandles;
CURLMcode res = curl_multi_perform(curlm, &runningHandles);
TL;DR: you are supposed to call curl_multi_perform in a loop. If you don't use an event loop with poll/epoll, you should probably stick with using curl_easy in a separate thread.
The whole point of the curl_multi API is not blocking: instead of magically downloading the entire file in a single call, you use epoll or similar means to monitor curl's non-blocking sockets and invoke curl_multi_perform each time some data arrives from the network. When you use its multi mode, curl itself does not start any internal threads and does not monitor its sockets; you are expected to do that yourself. This allows writing highly performant event loops that run multiple simultaneous curl transfers in the same thread. People who need that usually already have the necessary harness or can easily write it themselves.
The first time you invoke curl_multi_perform it will most likely return before the DNS resolution completes and/or before the TCP connection is accepted by the remote side. So the amount of payload data transferred in the first call will indeed be 0. Depending on server configuration, the second call might not transfer any payload either. By "payload" I mean actual application data (as opposed to DNS requests, SSL negotiation, HTTP headers and HTTP/2 frame metadata).
To actually complete a transfer you have to repeatedly invoke epoll_wait, curl_multi_perform and a number of other functions until you are done. Curl's corresponding example stops after completing one transfer, but in practice it is more beneficial to create a permanently running thread that handles all HTTP transfers for the application's lifetime.
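As a rough sketch of that loop, here is one way to drive the transfer to completion with curl_multi_wait() and then pick up the result with curl_multi_info_read(). It assumes curlm and the easy handle are set up as in the question; the progress callback is then invoked as data actually arrives:
int runningHandles = 0;
curl_multi_perform(curlm, &runningHandles);

while(runningHandles) {
    int numfds = 0;
    // sleep until libcurl's sockets need attention, or at most 1000 ms
    CURLMcode mc = curl_multi_wait(curlm, NULL, 0, 1000, &numfds);
    if(mc != CURLM_OK)
        break;
    curl_multi_perform(curlm, &runningHandles);
}

// collect per-transfer results once nothing is running any more
int msgsLeft = 0;
while(CURLMsg *msg = curl_multi_info_read(curlm, &msgsLeft)) {
    if(msg->msg == CURLMSG_DONE)
        std::cout << "transfer finished: " << curl_easy_strerror(msg->data.result) << std::endl;
}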

Using libcurl unsuccessfully with multiple threads

I'm trying to make use of libcurl with multiple threads. While reading the documentation, I understood that the correct way of doing it is to use a separate CURL* handle for each thread.
This is what I'm trying to do:
static size_t WriteCallback(void *contents, size_t size, size_t nmemb, void *userp)
{
    ((std::string*)userp)->append((char*)contents, size * nmemb);
    return size * nmemb;
}

bool KeyIsValid(std::string keytocheck) {
    CURL *curl;
    CURLcode res;
    std::string content;
    curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "http://localhost/mypage.php");
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &content);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, ("something=hello&somethingtwo=" + keytocheck).c_str());
        res = curl_easy_perform(curl);
        curl_easy_cleanup(curl);
        // std::cout << content << std::endl;
        if (content.find("id or key is not correct") == std::string::npos) // if i use an != the correct key (abc) doesn't get printed
        {
            return false;
        }
        else {
            return true;
        }
    }
}
Summarizing this code: I'm working with a new handle in each thread. It makes the request to my localhost and, after the POST request is performed, a callback stores the content in a std::string.
After everything, I check whether the webpage contains the identifier for an incorrect id/key. Specifically, the page prints this:
id or key is not correct
when the id/key is not correct. This is how I call the method KeyIsValid():
if (KeyIsValid(currentKey))
{
    std::cout << "key tested with success -> " << currentKey << '\n';
    return 1; // 1 = success
}
but when I check every key stored in an array (one key in the array equals one new thread), I get some "misinterpretations":
key tested with success -> abc
key tested with success -> hello
key tested with success -> hello
key tested with success -> hello
while the only correct key is abc. I'm not sure why the program prints the correct key abc followed by the other, incorrect keys.
But if I change the array to just two items (abc and hello, and so use two threads), everything seems to work properly, as only the abc key gets printed.
I did some searching on the internet, and this is what I found:
I have a question regarding the safety of performing parallel HTTP requests using libcurl (C++). When reading this question, please bear in mind I have limited knowledge about HTTP requests in general. Basically, let's say I have two (or more) threads, and each thread makes an HTTP request once per second. (All the requests made are to the same server.) How does my program (or something else?) keep track of which HTTP response belongs to which thread? I mean, can I be sure that if request A was sent from thread 1, and request B from thread 2 at the same time, and the responses are retrieved at the same time, the correct response (response A) goes to thread 1 and response B to thread 2? Please excuse my ignorance in this matter. Thanks.
this person is asking the same question as me, just less specifically (he didn't show any code).
So I'm going to ask exactly this:
Can I be sure that if request A was sent from thread 1, and request B from thread 2 at the same time, and the responses are retrieved at the same time, the correct response (response A) goes to thread 1 and response B to thread 2?
with reference to my code. Maybe I'm analyzing the page incorrectly, I don't know.
Sorry for my ignorance in this matter.
Edit:
After two days I tried to change my callback code, but still nothing works properly.
Can I be sure that if request A was sent from thread 1, and request B from thread 2 at the same time, and the responses are retrieved at the same time, the correct response (response A) goes to thread 1 and response B to thread 2?
Yes, you can be absolutely sure of that. Each easy handle manages its own connection, and the callbacks registered on a handle only ever receive data from that handle's own transfer, so a response cannot end up in the wrong thread.

Synchronized curl requests

I'm trying to do HTTP requests to multiple targets, and I need them to run (almost) exactly at the same moment.
I'm trying to create a thread for each request, but I don't know why curl is crashing when doing the perform. I'm using an easy handle per thread, so in theory everything should be OK...
Has anybody had a similar problem? Or does anyone know if the multi interface allows you to choose when to perform all the requests?
Thanks a lot.
EDIT:
Here is an example of the code:
void Clazz::function(std::vector<std::string> urls, const std::string& data)
{
    for (auto it : urls)
    {
        std::thread thread(&Clazz::DoRequest, this, it, data);
        thread.detach();
    }
}

int Clazz::DoRequest(const std::string& url, const std::string& data)
{
    CURL* curl = curl_easy_init();
    curl_slist *headers = NULL;
    headers = curl_slist_append(headers, "Expect:");
    headers = curl_slist_append(headers, "Content-Type: application/json");

    curl_easy_setopt(curl, CURLOPT_POST, 1);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, data.c_str());
    curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 15);
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
    curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1L);

    //curlMutex.lock();
    curl_easy_perform(curl);
    //curlMutex.unlock();

    long responseCode = 404;
    curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &responseCode);

    curl_easy_cleanup(curl);
    curl_slist_free_all(headers);

    return responseCode;
}
I hope this can help, thanks!
Are you calling curl_global_init anywhere? Perhaps rather early in your main() method?
Quoting from http://curl.haxx.se/libcurl/c/curl_global_init.html:
This function is not thread safe. You must not call it when any other thread in the program (i.e. a thread sharing the same memory) is running. This doesn't just mean no other thread that is using libcurl. Because curl_global_init calls functions of other libraries that are similarly thread unsafe, it could conflict with any other thread that uses these other libraries.
Quoting from http://curl.haxx.se/libcurl/c/curl_easy_init.html:
If you did not already call curl_global_init, curl_easy_init does it automatically. This may be lethal in multi-threaded cases, since curl_global_init is not thread-safe, and it may result in resource problems because there is no corresponding cleanup.
It sounds like you're not calling curl_global_init, and letting curl_easy_init take care of it for you. Since you're doing it on two threads simultaneously, you're hitting the thread unsafe scenario, with the lethal result that was mentioned.
After being able to debug properly on the device, I have found that the problem is an old known issue with curl:
http://curl.haxx.se/mail/lib-2010-11/0181.html
After using CURLOPT_NOSIGNAL on every curl handle, the crash has disappeared. :)
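Putting the two points together (curl_global_init() before any threads exist, plus CURLOPT_NOSIGNAL on every handle used from a thread), a minimal sketch could look like this; the URLs and payload are placeholders and error handling is left out:
#include <curl/curl.h>
#include <string>
#include <thread>
#include <vector>

static void DoRequest(const std::string& url, const std::string& data)
{
    CURL* curl = curl_easy_init();
    if(!curl)
        return;
    curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, data.c_str());
    curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1L); // avoid signal/alarm handling in threads
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);
}

int main()
{
    // must run before any other thread exists; it is not thread safe itself
    curl_global_init(CURL_GLOBAL_DEFAULT);

    std::vector<std::thread> workers;
    workers.emplace_back(DoRequest, "https://example.com/a", "{}"); // placeholder targets
    workers.emplace_back(DoRequest, "https://example.com/b", "{}");
    for(auto& t : workers)
        t.join();

    curl_global_cleanup();
    return 0;
}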