download function with libcurl, but it works incomplete [closed]


Closed. This question needs debugging details. It is not currently accepting answers.
Closed 3 years ago.
Greetings to everyone reading this topic. My platform is Win32, and I'm having a problem with libcurl.
My goal is to write a download program with libcurl that requests a URL, saves the file locally (fwrite), and shows a progress bar while downloading.
The problem is that it downloads very small files fine, but when I request a larger file, around 30 MB, it stops before it's done.
How can I debug this program so that it works with files of any size?
I'm not familiar with libcurl, so any simple detail could help. I'd welcome an explanation of how the curl_easy functions call multiple callbacks, whether either of my two callbacks is coded improperly, or whether I'm missing some libcurl rule.
Any answer is welcome.
Things I've tried:
1. I've tried recompiling several versions of libcurl. I'm now using libcurl 7.64 compiled with "WITH_SSL=static".
2. I've tried many sites and found a clue: very small files (around 80 KB) download completely with the progress bar, but larger files (around 30 MB) are incomplete. My guess is that the transfer stops because of some problem related to the larger size.
Code:
static FILE * fp;
static size_t write_callback(char *ptr, size_t size, size_t nmemb, void *userdata)
{
size_t nWrite = fwrite(ptr, size, nmemb, fp);
return nWrite;
}
static int progress_callback(void *clientp, curl_off_t dltotal, curl_off_t dlnow, curl_off_t ultotal, curl_off_t ulnow)
{
(void)ultotal;
(void)ulnow;
int totaldotz = 40;
double fractiondownloaded = (double)dlnow / (double)dltotal;
int dotz = (int)(fractiondownloaded * totaldotz);
printf("%3.0f%% [", fractiondownloaded * 100); //print the number percentage of the progress
int i = 0;
for (; i < dotz; i++) { //print "=" to show progress
printf("=");
}
for (; i < totaldotz; i++) { //print space to occupy the rest
printf(" ");
}
printf("]\r");
fflush(stdout);
return 0;
}
int download_function(CURL *curl,const char * url, const char * path)
{
curl = curl_easy_init();
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_XFERINFOFUNCTION, progress_callback);
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L);
fopen_s(&fp, path, "ab+");
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);
curl_easy_setopt(curl, CURLOPT_MAXREDIRS, 5L);
curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, false);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, false);
curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 3L);
curl_easy_setopt(curl, CURLOPT_TIMEOUT, 3L);
char * error = NULL;
curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, error);
CURLcode retcCode = curl_easy_perform(curl);
fclose(fp);
const char* pError = curl_easy_strerror(retcCode);
if (curl) {
curl_easy_cleanup(curl);
}
return 0;
}

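#ccxxshow seems right. Setting the timeout option gives me a CURLE_OPERATION_TIMEDOUT error. CURLOPT_TIMEOUT limits the time allowed for the entire transfer, so a 3-second limit is too short for a large file.
After removing this line, I can download a PDF file of about 9 MB successfully: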
curl_easy_setopt(curl, CURLOPT_TIMEOUT, 3L);
My complete code:
#include <curl/curl.h>
static FILE * fp;
static size_t write_callback(char *ptr, size_t size, size_t nmemb, void *userdata)
{
size_t nWrite = fwrite(ptr, size, nmemb, fp);
return nWrite;
}
static int progress_callback(void *clientp, curl_off_t dltotal, curl_off_t dlnow, curl_off_t ultotal, curl_off_t ulnow)
{
(void)ultotal;
(void)ulnow;
int totaldotz = 40;
double fractiondownloaded = (dltotal > 0) ? (double)dlnow / (double)dltotal : 0.0; //dltotal is 0 until the total size is known
int dotz = (int)(fractiondownloaded * totaldotz);
printf("%3.0f%% [", fractiondownloaded * 100); //print the number percentage of the progress
int i = 0;
for (; i < dotz; i++) { //print "=" to show progress
printf("=");
}
for (; i < totaldotz; i++) { //print space to occupy the rest
printf(" ");
}
printf("]\r");
fflush(stdout);
return 0;
}
int download_function(CURL *curl, const char * url, const char * path)
{
curl = curl_easy_init();
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_XFERINFOFUNCTION, progress_callback);
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L);
fopen_s(&fp, path, "ab+");
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);
curl_easy_setopt(curl, CURLOPT_MAXREDIRS, 5L);
curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
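curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L); //libcurl expects a long here; 0L disables certificate verification
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0L); //libcurl expects a long here; 0L disables host name verification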
curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 3L);
//curl_easy_setopt(curl, CURLOPT_TIMEOUT, 3L);
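char error[CURL_ERROR_SIZE] = ""; //CURLOPT_ERRORBUFFER needs a buffer of at least CURL_ERROR_SIZE bytes, not a NULL pointer
curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, error);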
CURLcode retcCode = curl_easy_perform(curl);
fclose(fp);
const char* pError = curl_easy_strerror(retcCode);
if (curl) {
curl_easy_cleanup(curl);
}
return 0;
}
int main()
{
CURL *testCurl = NULL;
const char *fileAddr = "https://gotocon.com/dl/goto-cph-2015/slides/AndersLybecker_and_SebastianBrandes_DevelopingIoTSolutionsWithWindows10AndAzure.pdf";
download_function(testCurl, fileAddr, "my-9MB.pdf");
}
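libcurl calls the progress callback with dltotal set to 0 while the total size is still unknown, so the fraction needs a guard before dividing. A minimal sketch of the bar computation as a separate function follows. The name render_progress and the clamping to the range 0..1 are assumptions for illustration, not part of libcurl or of the original code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of libcurl): builds the bar text for a
// given download state. `total` may be 0 while the size is still unknown.
std::string render_progress(std::int64_t now, std::int64_t total, int width = 40) {
    double fraction = 0.0;
    if (total > 0) {
        fraction = static_cast<double>(now) / static_cast<double>(total);
        // Clamp so that an inconsistent pair of values cannot overflow the bar.
        if (fraction < 0.0) fraction = 0.0;
        if (fraction > 1.0) fraction = 1.0;
    }
    const int filled = static_cast<int>(fraction * width);
    std::string bar = "[";
    bar.append(static_cast<std::size_t>(filled), '=');          // completed part
    bar.append(static_cast<std::size_t>(width - filled), ' ');  // remaining part
    bar += "]";
    return bar;
}
```

Inside progress_callback, the two printing loops could then be replaced by printing render_progress(dlnow, dltotal) followed by "\r".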

Related

How to prevent libcurl (C++) from downloading binary data?

I am making a web crawler and I have the following code, but the problem is that it also downloads binary data, and I don't want that to happen. How do I prevent it?
size_t HTML::WriteMemoryCallback(void *contents, size_t size, size_t nmemb, void *userp) {
size_t realsize = size * nmemb;
if(contents!=NULL&&userp!=NULL){ // both pointers must be valid before they are used
std::string* str=(std::string*)userp;
str->append((const char*)contents, realsize); // append the whole chunk at once
}
return realsize;
}
HTML_CODE HTML::get_html(std::string url) {
std::string chunk;
CURL *curl_handle=curl_easy_init();
CURLcode res;
if(curl_handle) {
curl_easy_setopt(curl_handle, CURLOPT_URL, url.c_str());
curl_easy_setopt(curl_handle, CURLOPT_FOLLOWLOCATION, 1L);
curl_easy_setopt(curl_handle, CURLOPT_WRITEFUNCTION, WriteMemoryCallback);
curl_easy_setopt(curl_handle, CURLOPT_WRITEDATA, (void *)&chunk);
curl_easy_setopt(curl_handle, CURLOPT_USERAGENT, USER_AGENT);
res = curl_easy_perform(curl_handle);
if(res != CURLE_OK) {
std::cout<<"Can't get html content from "<<url<<"\n";
fprintf(stderr, "error: %s\n", curl_easy_strerror(res));
curl_easy_cleanup(curl_handle); // release the handle on the error path too
return {"",""};
}
curl_easy_cleanup(curl_handle);
}
else{
std::cout<<"Error: Couldn't create a curl instance"<<std::endl;
return {"",""};
}
return {.url=url,.content=chunk};
}
Things I have tried:
Checking whether the data has a null terminator
Checking whether each char is an ASCII letter (this won't work with other languages)
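One possible approach, sketched here under assumptions, is to decide based on the Content-Type the server reports rather than by inspecting the bytes. With libcurl that value could come from curl_easy_getinfo with CURLINFO_CONTENT_TYPE after the transfer, or from a header callback during it. The helper name is_textual_content_type and the list of accepted types are hypothetical choices for illustration.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns true when a Content-Type value looks like
// text a crawler would want to keep. The accepted types are an assumption.
bool is_textual_content_type(const std::string& content_type) {
    // Keep only the media type: drop parameters such as "; charset=utf-8".
    std::string media = content_type.substr(0, content_type.find(';'));
    // Trim trailing whitespace left in front of the ';'.
    while (!media.empty() && std::isspace(static_cast<unsigned char>(media.back()))) {
        media.pop_back();
    }
    // Media types are case-insensitive, so compare in lowercase.
    for (char& c : media) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    if (media.compare(0, 5, "text/") == 0) {
        return true;
    }
    return media == "application/xhtml+xml" ||
           media == "application/xml" ||
           media == "application/json";
}
```

A crawler could call this before storing the body and skip anything for which it returns false. Servers can report a wrong type, so this is a heuristic rather than a guarantee.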

How to extract header information via CURLOPT_HEADERFUNCTION?

I want to extract header information by using CURLOPT_HEADERFUNCTION in my C++ program.
The question "How can I use CURLOPT_HEADERFUNCTION to read a single response header field?" provides a solution for getting that header information, but I want to know why my code is not working, and a possible solution with an example.
//readHeader function which returns the specific header information
size_t readHeader(char* header, size_t size, size_t nitems, void *userdata) {
Erza oprations; //class which contains string function like startsWith etc
if (oprations.startsWith(header, "Content-Length:")) {
std::string header_in_string = oprations.replaceAll(header, "Content-Length:", "");
long size = atol(header_in_string.c_str());
file_size = size; // file_size is global variable
std::cout << size; // here it is showing correct file size
}
else if (oprations.startsWith(header, "Content-Type:")) {
// do something
}else {
// do something
}
return size * nitems;
}
// part of main function
curl = curl_easy_init();
if (curl) {
fp = fopen(path, "wb");
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_CAINFO, "./ca-bundle.crt");
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, false);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, false);
curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, readHeader);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
fclose(fp);
std::cout << file_size; // showing value 0
I get the correct file size inside the readHeader function, but 0 in the main function.

As shown in your GitHub repository, oprations (operations?) is a local variable and is destroyed at the end of the readHeader function. One way to have readHeader record the correct file size on a given Erza instance is to pass a pointer to that instance as the userdata value. The Erza class may be rewritten as:
class Erza : public Endeavour {
//... your class body
public:
bool download (const char *url,const char* path){
curl = curl_easy_init();
if (curl) {
fp = fopen(path, "wb");
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_CAINFO, "./ca-bundle.crt");
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, false);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, false);
curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, readHeader);
curl_easy_setopt(curl, CURLOPT_HEADERDATA, this ); //<-- set this pointer to userdata value used in the callback.
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
fclose(fp);
return false;
}else
return true;
}
size_t analyseHeader( char* header, size_t size, size_t nitems ){
if (startsWith(header, "Content-Length:")) {
std::string header_in_string = replaceAll(header, "Content-Length:", "");
long size = atol(header_in_string.c_str());
file_size = size; // file_size is a member variable
std::cout << size; // here it is showing correct file size
}
else if (startsWith(header, "Content-Type:")) {
// do something
}else {
// do something
}
return size * nitems;
}
};//Eof class Erza
size_t readHeader(char* header, size_t size, size_t nitems, void *userdata) {
//get the called context (Erza instance pointer set in userdata)
Erza * oprations = (Erza *)userdata;
return oprations->analyseHeader( header, size, nitems );
}
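The header buffer that libcurl passes to the callback is not guaranteed to be null-terminated, and header names are case-insensitive, so parsing by length is safer than relying on C string functions. A minimal sketch of such a parser follows. The name parse_content_length is hypothetical, and overflow of very large values is not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: extracts the value of a "Content-Length" header line.
// `header` is NOT assumed to be null-terminated; `len` is size * nitems.
// Returns true and stores the value in `out` on success.
bool parse_content_length(const char* header, std::size_t len, long long& out) {
    static const std::string name = "content-length:";
    if (header == nullptr || len < name.size()) {
        return false;
    }
    // Case-insensitive match of the header name.
    for (std::size_t i = 0; i < name.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(header[i])) != name[i]) {
            return false;
        }
    }
    // Skip optional whitespace between the colon and the value.
    std::size_t pos = name.size();
    while (pos < len && (header[pos] == ' ' || header[pos] == '\t')) {
        ++pos;
    }
    if (pos == len || !std::isdigit(static_cast<unsigned char>(header[pos]))) {
        return false;
    }
    // Read the decimal digits; the trailing "\r\n" ends the loop.
    long long value = 0;
    while (pos < len && std::isdigit(static_cast<unsigned char>(header[pos]))) {
        value = value * 10 + (header[pos] - '0');
        ++pos;
    }
    out = value;
    return true;
}
```

In analyseHeader this could be called as parse_content_length(header, size * nitems, value), with the result stored in file_size when it returns true.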

Curl gives segmentation fault error

I am trying to download a .txt file from a server which I can access via the web browser on my raspberry pi.
The curl library gives a segmentation fault when I try to do this. Here is the code I am using.
size_t write_data(void *ptr, size_t size, size_t nmemb, FILE *stream) {
size_t written = fwrite(ptr, size, nmemb, stream);
return written;
}
int checkNewFiles(){
CURL *curl;
FILE *fp;
CURLcode res;
string url = "http://52.233.176.151:1880/files/device/software/text.txt";
char outfilename[FILENAME_MAX] = "/home/pi/Desktop/project/cpp/ab.txt";
curl = curl_easy_init();
if (curl) {
fp = fopen(outfilename, "wb");
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
fclose(fp);
}
return 0;
}

I found the problem: what is url.c_str() doing here?
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
Change this to
curl_easy_setopt(curl, CURLOPT_URL, url);
with url declared as a const char *, as in the example below.
Example: a curl program that downloads a text file.
Of course, you need to include the necessary header files:
#include <stdio.h>
#include <curl/curl.h>
size_t write_data(void *ptr, size_t size, size_t nmemb, FILE *stream) {
size_t written = fwrite(ptr, size, nmemb, stream);
return written;
}
int main(void) {
CURL *curl;
FILE *fp;
CURLcode res;
const char *url = "http://localhost/yourfile.txt";
char outfilename[FILENAME_MAX] = "C:\\outfile.txt";
curl = curl_easy_init();
if (curl) {
fp = fopen(outfilename,"wb");
curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1L); /* enable failure on http errors; the option takes a long */
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
res = curl_easy_perform(curl);
if(res != CURLE_OK) { /* check that the operation was successful */
printf("curl_easy_perform(): %s\n", curl_easy_strerror(res));
}
/* always cleanup */
curl_easy_cleanup(curl);
fclose(fp);
}
return 0;
}

I noticed you're not checking for errors after fopen. If it fails, it returns a NULL pointer, which would cause a segfault when curl attempts to write to it.
I'm not convinced that c_str() was the cause of the segfault in the original question, as I have used it in numerous applications with no problems.
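The null-stream case can also be guarded inside the write callback itself. This is a minimal sketch, and the name write_data_checked is hypothetical. Returning a count different from the amount passed in tells libcurl the write failed, so the transfer stops with CURLE_WRITE_ERROR rather than crashing in fwrite.

```cpp
#include <cstddef>
#include <cstdio>

// Same shape as the write_data callback above, with a guard for the case
// where fopen failed and the stream pointer is null.
size_t write_data_checked(void* ptr, size_t size, size_t nmemb, FILE* stream) {
    if (stream == nullptr) {
        return 0;  // signals a write error to libcurl instead of dereferencing null
    }
    return fwrite(ptr, size, nmemb, stream);
}
```

Checking the result of fopen before calling curl_easy_perform is still the clearer fix; the guard only keeps a missed check from turning into a segfault.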

Heap increase - Curl HTTP request

I am currently using the CURL library. I tried a simple example and noticed that the heap memory increases every time I make a request. This is a serious problem, especially when using multiple threads.
Does anyone know what causes it?
static int Swriter(char *data, size_t size, size_t nmemb, std::string *writerData)
{
if(writerData == NULL)
return 0;
writerData->append(data, size*nmemb);
return size * nmemb;
}
static void RequestReadJson(std::string url, std::string &content)
{
CURL *curl = curl_easy_init();
if (curl) {
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, Swriter);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &content);
curl_easy_perform(curl);
curl_easy_cleanup(curl);
}
}
int main(int argc, wchar_t* argv[]) {
curl_global_init(CURL_GLOBAL_DEFAULT);
std::string content;
std::string url("www.google.com");
for(int i=0;i<300;i++)
RequestReadJson(url, content); //Heap increase
curl_global_cleanup();
}

You append the new downloaded content to the old, hence the heap increase:
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &content);
which leads to:
writerData->append(data, size*nmemb);
You'd better return a fresh string:
static std::string RequestReadJson(std::string url)
{
std::string content;
CURL *curl = curl_easy_init();
if (curl) {
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, Swriter);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &content);
curl_easy_perform(curl);
curl_easy_cleanup(curl);
}
return content;
}
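The effect can be reproduced without any network access. In the sketch below, fake_request is a hypothetical stand-in for one transfer that delivers a 5-byte body to a callback with the same logic as Swriter. Reusing the string without clearing it grows the buffer on every call, while clearing it first (or using a fresh string, as above) keeps the size constant.

```cpp
#include <cstddef>
#include <string>

// Same logic as the Swriter callback: append one received chunk.
static std::size_t append_chunk(char* data, std::size_t size, std::size_t nmemb,
                                std::string* out) {
    if (out == nullptr) {
        return 0;
    }
    out->append(data, size * nmemb);
    return size * nmemb;
}

// Hypothetical stand-in for one request: delivers a fixed body to the
// callback and returns the buffer size afterwards.
std::size_t fake_request(std::string& content, bool clear_first) {
    if (clear_first) {
        content.clear();  // drop the previous response before reusing the buffer
    }
    char body[] = "hello";
    append_chunk(body, 1, 5, &content);
    return content.size();
}
```

Three calls with clear_first set to false leave 15 bytes in the string; with clear_first set to true the size stays at 5.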

libcurl 404 detection

I'm doing a file download with libcurl in my C++ program. How can I detect that the response is a 404 and skip writing the file? The code is:
void GameImage::DownloadImage(string file_name) {
string game_name;
game_name = file_name.substr(file_name.find_last_of("/")+1);
CURL *curl;
FILE *fp;
CURLcode res;
string url = "http://site/"+game_name+".png";
string outfilename = file_name+".png";
cout<<"INFO; attempting to download "<<url<<"..."<<endl;
curl = curl_easy_init();
if (curl) {
cout<<"INFO; downloading "<<url<<"..."<<endl;
fp = fopen(outfilename.c_str(), "wb");
cout<<"INFO; trying to open "<<outfilename<<" for file output"<<endl;
if (fp != NULL) {
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, GameImage::WriteData);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
curl_easy_setopt(curl, CURLOPT_VERBOSE, 1);
curl_easy_setopt(curl, CURLOPT_FAILONERROR, true);
res = curl_easy_perform(curl);
long http_code = 0;
curl_easy_getinfo (curl, CURLINFO_RESPONSE_CODE, &http_code);
curl_easy_cleanup(curl);
fclose(fp);
}
else {
cout<<"GameImage::DownloadImage; Couldn't open output file"<<endl;
}
}
}
size_t GameImage::WriteData(void *ptr, size_t size, size_t nmemb, FILE *stream) {
size_t written;
written = fwrite(ptr, size, nmemb, stream);
return written;
}
I can delete the 404 response after the transfer occurs, but it would be good to not even save the response.

You can check the return value of curl_easy_perform against CURLE_HTTP_RETURNED_ERROR.
This is returned if CURLOPT_FAILONERROR is set to true and the HTTP server returns an error code that is >= 400. You can't get the specific HTTP response code from this return value alone, but it should be enough to accomplish what you want.

I know this is an old post, but the mistake you're making is that you're not checking the return value of curl_easy_perform. Setting CURLOPT_FAILONERROR will not crash the program; instead, it notifies you of the error through the return variable you named res. To get rid of the empty file, you could do something like this:
void GameImage::DownloadImage(string file_name) {
string game_name;
game_name = file_name.substr(file_name.find_last_of("/")+1);
CURL *curl;
FILE *fp;
CURLcode res;
string url = "http://site/"+game_name+".png";
string outfilename = file_name+".png";
cout<<"INFO; attempting to download "<<url<<"..."<<endl;
curl = curl_easy_init();
if (curl) {
cout<<"INFO; downloading "<<url<<"..."<<endl;
fp = fopen(outfilename.c_str(), "wb");
cout<<"INFO; trying to open "<<outfilename<<" for file output"<<endl;
if (fp == NULL) {
cout<<"GameImage::DownloadImage; Couldn't open output file"<<endl;
curl_easy_cleanup(curl);
return;
}
curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, GameImage::WriteData);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1L);
res = curl_easy_perform(curl);
fclose(fp);
if (res != CURLE_OK) {
cout<<"GameImage::DownloadImage; Failed to download file"<<endl;
remove(outfilename.c_str());
}
curl_easy_cleanup(curl);
}
}
size_t GameImage::WriteData(void *ptr, size_t size, size_t nmemb, FILE *stream) {
size_t written;
written = fwrite(ptr, size, nmemb, stream);
return written;
}
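A variation on the same idea, offered as an alternative and not as what the answer above does, is to download into a temporary name and only move the file into place when the transfer succeeded. The helper name finalize_download and the use of a separate temporary path are assumptions for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: call it after the transfer has finished and the
// file has been closed. `ok` would be (res == CURLE_OK) in the code above.
bool finalize_download(bool ok, const std::string& temp_path,
                       const std::string& final_path) {
    if (!ok) {
        std::remove(temp_path.c_str());  // discard the partial or error response
        return false;
    }
    // Remove any old copy first: renaming onto an existing file is
    // implementation-defined and fails on some platforms.
    std::remove(final_path.c_str());
    return std::rename(temp_path.c_str(), final_path.c_str()) == 0;
}
```

With this approach a failed request never leaves a file under the final name, even if the program is interrupted halfway through the transfer.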
