libcurl downloads no data to buffer - libcurl

I am using the following code to download data from a URL into memory (a stream). About 2% of the time, the resulting stream is empty. I can download the data correctly from the same failing URL if I try again. I am not sure whether this is a network issue, a CPU usage issue, or just the code not covering some corner case. Please advise. Thanks!
static size_t write_data(char *ptr, size_t size, size_t nmemb, void *userdata)
{
std::vector<uchar> *stream = (std::vector<uchar>*)userdata;
size_t count = size * nmemb;
stream->insert(stream->end(), ptr, ptr + count);
return count;
}
static void CurlUrl(const char* img_url, std::vector<uchar>* stream) {
CURL *curl = curl_easy_init(); // curl_global_init is called elsewhere.
curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(curl, CURLOPT_URL, img_url);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, stream);
curl_easy_setopt(curl, CURLOPT_TIMEOUT, 10);
CURLcode res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
}

If it didn't deliver any download data into the buffer via the callback, it means that the transfer either failed or that there was exactly zero bytes to transfer.
1. Check the return code from curl_easy_perform() as it might actually tell you exactly what happened.
2. Use CURLOPT_VERBOSE to see what's going on if (1) is not enough.
3. Use CURLOPT_ERRORBUFFER to get a better error description if it fails and (2) is not enough.

Related

Extract specific data from webpage

Basically, this is my code:
int main()
{
CURL *curl;
FILE *fp;
CURLcode res;
std::string readBuffer;
curl = curl_easy_init();
char outfilename[FILENAME_MAX] = "C:\\Users\\admin\\desktop\\test.txt";
if(curl) {
fp = fopen(outfilename,"wb");
curl_easy_setopt(curl, CURLOPT_URL, "http://www.example.com");
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "user=123&pass=123");
curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
res = curl_easy_perform(curl);
Sleep(1000);
curl_easy_cleanup(curl);
fclose(fp);
}
return EXIT_SUCCESS;
}
The output is successfully saved in the text file.
My concern is how to extract specific content between specific tags.
For example, I want only the content between <bla> ... </bla>.
What's the easiest way? Thank you.
In your example, you are dumping the response from the website to a file. libcurl writes the data returned by the page exactly as received; it makes no attempt to restructure it.
You can instead collect the data in memory by defining a write_data function, which must have the following signature:
size_t write_data(char *ptr, size_t size, size_t nmemb, void *userdata);
Once you have the data in memory, you can parse it and restructure it as required.
See the linked example for using a write_data function.
For XML parsing, you may use the linked sample code.
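To make the "parse it once it is in memory" step concrete, here is a minimal sketch that pulls out the text between the first <bla> and the following </bla> with a plain substring search. The helper name extract_between is hypothetical, it assumes the response body is already in a std::string, and it only matches a bare tag without attributes. It is not a real HTML or XML parser, so for anything beyond a simple, known page layout a proper parser is likely the better choice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of libcurl): copies the text between the
// first "<tag>" and the next "</tag>" into out. Returns false if either
// marker is missing. Tags with attributes are not matched.
bool extract_between(const std::string& body, const std::string& tag,
                     std::string& out) {
    assert(!tag.empty());
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";

    const std::size_t start = body.find(open);
    if (start == std::string::npos) return false;

    // Content begins right after the opening tag.
    const std::size_t content = start + open.size();
    const std::size_t end = body.find(close, content);
    if (end == std::string::npos) return false;

    out = body.substr(content, end - content);
    return true;
}
```

You would call it on the buffer filled by your write callback after curl_easy_perform has returned, for example extract_between(readBuffer, "bla", result).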

Still reachable leak summary in Valgrind for libcurl c++ code

The following functions use libcurl to save a file and to return the HTTP status code. However, when I run this under valgrind, it reports 0 bytes for "definitely lost", "indirectly lost" and "possibly lost", but 47448 bytes for "still reachable". I'm trying to resolve the "still reachable" bytes.
Are there any potential memory leaks in the code below?
size_t write_data(void *ptr, size_t size, size_t nmemb, FILE *stream){
size_t written = fwrite(ptr, size, nmemb, stream);
return written;
}
void connectAndSaveFile(char* url, char* output_file_name){
CURL *curl;
curl = curl_easy_init();
if (curl) {
FILE *fp = fopen(output_file_name,"wb");
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
curl_easy_perform(curl);
curl_easy_cleanup(curl);
fclose(fp);
}
}
string get_http_status_code(string URL) {
CURL *session;
session = curl_easy_init();
curl_easy_setopt(session, CURLOPT_URL, URL.c_str());
curl_easy_setopt(session, CURLOPT_NOBODY, true);
CURLcode curl_code = curl_easy_perform (session);
long http_code = 0;
curl_easy_getinfo (session, CURLINFO_RESPONSE_CODE, &http_code);
curl_easy_cleanup(session);
std::ostringstream buff;
buff << http_code;
return buff.str();
}
"still reachable" is most frequently not actually a leak
you might get slightly less memory reachable if you use curl_global_init and curl_global_cleanup
Most of the code above uses libcurl, so the place to look is its documentation, which describes the API and the recommended usage.
However, in the function below, the caller passes in a FILE* that fwrite writes to. That resource must be released by the caller (with fclose) once it is no longer needed.
size_t write_data(void *ptr, size_t size, size_t nmemb, FILE *stream)
In idiomatic C++, however, we should use std::fstream and std::string so that we need not worry about resource management. For more information you may refer to the following link:
https://stackoverflow.com/a/22048298/2724703

CURL finish executing and timeout

I'm performing a server request with curl in C++. The server returns the response in pieces, and the size of each piece may vary.
Each time a piece arrives, the callback function is called. The problem is that I can't detect when the connection has finished, so that I can make another callback to my parent class.
By the way, I would also like to know whether we can set and detect a timeout for a curl request.
Here is my code in short:
CURL *curl = curl_easy_init();
curl_global_init(CURL_GLOBAL_ALL);
curl_easy_setopt(curl, CURLOPT_URL, "My URL");
curl_easy_setopt(curl, CURLOPT_POST, 1);
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "My Postfields");
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeCallback);
curl_easy_perform(curl);
curl_easy_cleanup(curl);
curl_global_cleanup();
The default callback:
size_t writeCallback(char* buf, size_t size, size_t nmemb, void* up)
{
//do something
//But how can I detect the last callback when connection finished
//in order to call an another one?
return size*nmemb;
}
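curl_easy_perform blocks until the transfer has finished, so there is no need to detect the last callback: the data you want can be saved during the callbacks and then used once curl_easy_perform returns. Example: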
curl_global_init(CURL_GLOBAL_ALL); // NOTE: moved before curl_easy_init
CURL *curl = curl_easy_init();
// NOTE: added to accumulate data.
std::string result;
curl_easy_setopt(curl, CURLOPT_URL, "My URL");
curl_easy_setopt(curl, CURLOPT_POST, 1L);
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "My Postfields");
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeCallback);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &result); // NOTE: added
curl_easy_perform(curl);
// TODO: do something with your data stored in result
curl_easy_cleanup(curl);
curl_global_cleanup();
And in your write callback:
size_t writeCallback(char* buf, size_t size, size_t nmemb, void* up)
{
std::string* pstr = static_cast<std::string*>(up);
std::copy(buf, buf+size*nmemb, std::back_inserter(*pstr));
return size*nmemb;
}
or something along those lines. I leave all the error checking to you (and sorry for any typos; I don't have a compiler to validate this on immediately available to me).
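Regarding timeout length, there are a multitude of timeout options available to an easy-mode curl request, too many to list here. For example, CURLOPT_TIMEOUT limits the whole transfer and CURLOPT_CONNECTTIMEOUT limits the connection phase; when a timeout expires, curl_easy_perform returns CURLE_OPERATION_TIMEDOUT. See the documentation for curl_easy_setopt, in particular the connection options approximately two thirds of the way down the page.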
Best of luck.

"Failed writing body" CURLOPT_WRITEDATA

The code below gets a response from a server using WSDL. The problem is that curl receives the response but I am unable to print it.
Error:
Failed writing body
Failed writing data
#include<stdio.h>
#include<string.h>
#include"../include/curl.h"
size_t write_data(void *ptr, size_t size, size_t count, void *stream)
{
/* ptr - your string variable.
stream - data chuck you received */
printf("%.*s", size, (char*)stream);
}
int main()
{
int res=0,i=0;
char buffer[4098]="",buff[128]="",buf[256]="",buf7[30]="",buf6[30]="",buf5[30]="";
char machineid[]="SUBANI";
char filename1[50]="";
int refno=0,paymode=0,taxtype=0;
FILE *fbc;
memset(filename1,0,sizeof(filename1));
sprintf(filename1,"/mnt/jffs2/Response_Details1.xml");
lk_dispclr();
lk_disptext(1,0,(unsigned char *)"Sending Request",0);
lk_disptext(2,0,(unsigned char *)"Please Wait",0);
memset(buffer,0,sizeof(buffer));
sprintf(buffer,"<?xml version=\"1.0\" encoding=\"utf-8\"?>\
<soap:Envelope xmlns:soap=\"http://www.w3.org/2003/05/soap-envelope\" xmlns:log=\"http://wsdlclassess.application.sims.test.com\">\
<soap:Header>\
</soap:Header>\
<soap:Body>\
<log:loginMethod>\
<log:loginid>%s</log:loginid>\
<log:password>%s</log:password>\
</log:loginMethod>\
</soap:Body>\
</soap:Envelope>","raja","test");
res=GET_FILE1(buffer,filename1);
return 0;
}
int GET_FILE1(char *buffer,char *filename)
{
CURL *curl;
CURLcode res;
struct curl_slist *headers = NULL;
FILE *out_fd = (FILE *) 0;
char errorbuf[300] = "",tmpbuff[128]="";
char errmsg[256];
int Timeout=120; //Default timeout is = 2 mins
int buffer_size = 0;
char urlbuff[256]="";
char mstr[10240];
memset(urlbuff,0,sizeof(urlbuff));
memset(tmpbuff,0,sizeof(tmpbuff));
buffer_size = strlen(buffer);
strcpy(tmpbuff,"http://10.10.1.111:8081/test_server/services/application?wsdl");
tmpbuff[strlen(tmpbuff)]='\0';
curl = curl_easy_init();
if(curl)
{
out_fd = fopen (filename, "w");
curl_easy_setopt(curl, CURLOPT_FILE, out_fd);
printf("%s:Sign-In Request\n", __func__);
headers = curl_slist_append(headers, "Content-type:application/soap+xml; charset=utf-8; action=\"http://wsdlclassess.application.sims.test.com/loginMethod\"");
curl_easy_setopt(curl, CURLOPT_URL, tmpbuff);
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0);
curl_easy_setopt(curl, CURLOPT_VERBOSE, 1);
curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, mstr);
curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, buffer_size);
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, buffer);
curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
curl_easy_setopt(curl, CURLOPT_TIMEOUT, Timeout);
curl_easy_setopt(curl, CURLOPT_ERRORBUFFER,errmsg);
printf("The Server%s:Performing Transaction.....\n",__func__);
res = curl_easy_perform(curl);
printf("res=after culreasey perform%d\n",res);
curl_slist_free_all(headers);
curl_easy_cleanup(curl);
printf("\nerrorbuf:%s\n",errmsg);
fclose(out_fd);
if(CURLE_OK != res)
{
puts("error occured is\n" );
//ppp_close();
return -1;
}
}
return 0;
}
The error is that you don't return the correct value from the function, in fact you don't return anything.
Also, the data provided to the function is actually the first ptr argument.
I agree that the documentation is not very clear, but it says:
The size of the data pointed to by ptr is size multiplied with nmemb, it will not be zero terminated.
The above line (emphasis mine) tells you that the data is in ptr which is the first argument in the function declaration provided in the documentation.
The documentation also states:
Return the number of bytes actually taken care of. If that amount differs from the amount passed to your function, it'll signal an error to the library. This will abort the transfer and return CURLE_WRITE_ERROR.
You don't return a value from the function, and so you have undefined behavior with a seemingly random value being returned causing the whole operation to fail. To fix this you should return size * count.
You also use size to print the string, but size is the size of the underlying element type (probably 1); your count variable is the number of elements received by curl. To work fully without invoking more undefined behavior (since the data is not terminated), you should call printf like:
printf("%.*s", (int)(size * count), (char *)ptr);
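Putting those points together, a write callback along these lines should satisfy the contract quoted from the documentation. This is only a sketch: the name collect_body is made up, and it assumes the pointer passed with CURLOPT_WRITEDATA is a std::string*. It reads from the first argument, uses an explicit length because the data is not zero terminated, and returns the number of bytes it handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same parameter list libcurl uses for CURLOPT_WRITEFUNCTION.
// Assumption: userdata (set with CURLOPT_WRITEDATA) points at a std::string.
std::size_t collect_body(char* ptr, std::size_t size, std::size_t nmemb,
                         void* userdata) {
    assert(userdata != nullptr);
    std::string* out = static_cast<std::string*>(userdata);

    // The received data is in ptr and is size * nmemb bytes long.
    const std::size_t bytes = size * nmemb;

    // Append with an explicit length; ptr is not zero terminated.
    out->append(ptr, bytes);

    // Returning anything other than bytes makes libcurl abort the transfer.
    return bytes;
}
```

Since the callback may be called several times per transfer, the string is only complete once curl_easy_perform has returned; print or save it at that point.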

uploading file with libcurl

Take a look at the following code
static size_t reader(void *ptr, size_t size, size_t nmemb, FILE *stream) {
size_t retcode = fread(ptr, size, nmemb, stream);
cout << "*** We read " << retcode << " bytes from file" << endl;
return retcode;
}
void upload() { //upload() is called from outside
FILE *pFile;
pFile = fopen("map.txt" , "r");
struct stat file_info;
stat("map.txt", &file_info);
size_t size = (size_t)file_info.st_size;
uploadFile(pFile, size);
}
bool uploadFile(void* data, size_t datasize) {
CURL *curl;
CURLcode res;
curl = curl_easy_init();
if (curl) {
char *post_params = ...;
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_UPLOAD, 1L);
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, post_params);
curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, (long) strlen(post_params));
curl_easy_setopt(curl, CURLOPT_READFUNCTION, reader);
curl_easy_setopt(curl, CURLOPT_READDATA, data);
curl_easy_setopt(curl, CURLOPT_INFILESIZE_LARGE, (curl_off_t) datasize);
res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
}
return true;
}
When the code is executed, the following is output:
*** We read 490 bytes from file
*** We read 0 bytes from file
After that the app does nothing (it does not even exit).
Can someone point out what's wrong here?
I will be grateful for any help!
There is some serious confusion in this code. Let me try to explain:
CURLOPT_UPLOAD - this will ask libcurl to PUT the file when the protocol of choice is HTTP
CURLOPT_POSTFIELDS - tells libcurl to POST the data that is provided in the additional argument (which has the size set with CURLOPT_POSTFIELDSIZE)
CURLOPT_READFUNCTION - gives libcurl an alternative to CURLOPT_POSTFIELDS for obtaining data, which allows a POST that reads its data from a file. When using CURLOPT_UPLOAD this is the only way to provide data.
So in the end the questions left for you are:
Do you want PUT or POST?
Do you want to provide the data as a string or do you want it provided with a callback?
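If you go the callback route, the read callback is expected to copy at most size * nmemb bytes into the buffer libcurl hands it, return how many bytes it copied, and return 0 once there is nothing left. Below is a rough sketch of that contract using an in-memory buffer instead of a FILE*; the UploadSource struct and the name read_from_memory are invented for illustration, and the pointer set with CURLOPT_READDATA is assumed to point at such a struct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical upload state: the payload and how much was already handed out.
struct UploadSource {
    const char* data;
    std::size_t size;
    std::size_t offset;
};

// Same parameter list libcurl uses for CURLOPT_READFUNCTION.
// Assumption: userdata (set with CURLOPT_READDATA) points at an UploadSource.
std::size_t read_from_memory(char* dest, std::size_t size, std::size_t nmemb,
                             void* userdata) {
    assert(userdata != nullptr);
    UploadSource* src = static_cast<UploadSource*>(userdata);

    const std::size_t room = size * nmemb;             // space libcurl offers
    const std::size_t left = src->size - src->offset;  // bytes still to send
    const std::size_t n = std::min(room, left);

    if (n > 0) {
        std::memcpy(dest, src->data + src->offset, n);
        src->offset += n;
    }
    return n;  // 0 tells libcurl that the data has ended
}
```

Whichever you choose, pick one source for the request body: either a string given with CURLOPT_POSTFIELDS or a callback like this one, not both.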