I would like to download some page content from Wiktionary. I use curl in a loop. The first iteration is OK, but the others give me the same result as the first. What is missing or wrong? Thank you. This is the loop:
std::string buffer;
size_t curl_write( void *ptr, size_t size, size_t nmemb, void *stream)
{
buffer.append((char*)ptr, size*nmemb);
return size*nmemb;
}
int main(int argc, char **argv)
{
CURL *curl = curl_easy_init();
string data;
data="http://fr.wiktionary.org/w/api.php?format=json&action=query&titles=";
//Page titles are read from local file. The code is not shown to make short.
while ( not_end_of_file){
//list_of_page_title is pages requested for the current iteration.
data=data+list_of_page_title+"prop=revisions&rvprop=content";
curl_easy_setopt(curl, CURLOPT_URL, data.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, curl_write);
curl_easy_perform(curl);
curl_easy_reset(curl);
}
curl_easy_cleanup(curl);
return 0;
}
I am new to curl, so I may be missing many things. Thank you for the help.
data=data+list_of_page_title appends the new title to your previous URL instead of replacing the previous one. By the end you'll have a gigantic URL full of garbage. The server is probably paying attention to the first title and ignoring the rest.
And this would be obvious if you just output your URL as the first step of debugging... "Am I requesting what I think I'm requesting?"
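A minimal sketch of building the URL from scratch on every iteration, so nothing from the previous request leaks into the next one. The helper name build_url is hypothetical, and the '&' in front of prop is an assumption about what the query string was meant to be (the code in the question omits it).

```cpp
#include <string>

// Hypothetical helper: returns a complete request URL for one batch of titles.
// The base string is never modified, so every call starts from a clean URL.
std::string build_url(const std::string& titles) {
    static const std::string base =
        "http://fr.wiktionary.org/w/api.php?format=json&action=query&titles=";
    // Assumption: query parameters must be separated by '&'.
    return base + titles + "&prop=revisions&rvprop=content";
}
```

Inside the loop, the result of build_url(list_of_page_title) could be stored in a local std::string and passed to CURLOPT_URL via c_str(). Printing it first shows exactly what is being requested.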
One problem is that you are not resetting your buffer variable.
while ( not_end_of_file){
buffer = ""; // reset buffer to empty string
//list_of_page_title is pages requested for the current iteration.
data="http://fr.wiktionary.org/w/api.php?format=json&action=query&titles=" +
list_of_page_title +
"&prop=revisions&rvprop=content"; // '&' separates the titles from the next parameter
curl_easy_setopt(curl, CURLOPT_URL, data.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, curl_write);
curl_easy_perform(curl);
curl_easy_reset(curl);
}
And, as Peter points out, your handling of the data variable has a very similar problem.
I am using the following code to download data from a URL to memory (stream). About 2% of the time, the size of the stream is zero. I can download proper data from the same failing URL if I try again. I am not sure whether this is a network issue, a CPU usage issue, or just the code not covering some corner cases. Please advise. Thanks!
static size_t write_data(char *ptr, size_t size, size_t nmemb, void *userdata)
{
std::vector<uchar> *stream = (std::vector<uchar>*)userdata;
size_t count = size * nmemb;
stream->insert(stream->end(), ptr, ptr + count);
return count;
}
static void CurlUrl(const char* img_url, std::vector<uchar>* stream) {
CURL *curl = curl_easy_init(); // curl_global_init is called elsewhere.
curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(curl, CURLOPT_URL, img_url);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, stream);
curl_easy_setopt(curl, CURLOPT_TIMEOUT, 10);
CURLcode res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
}
If it didn't deliver any download data into the buffer via the callback, the transfer either failed or there were exactly zero bytes to transfer.
1. Check the return code from curl_easy_perform(), as it might tell you exactly what happened.
2. Use CURLOPT_VERBOSE to see what's going on if (1) is not enough.
3. Use CURLOPT_ERRORBUFFER to get a better error description of the failure if (2) is not enough.
Basically, this is my code:
int main()
{
CURL *curl;
FILE *fp;
CURLcode res;
std::string readBuffer;
curl = curl_easy_init();
char outfilename[FILENAME_MAX] = "C:\\Users\\admin\\desktop\\test.txt";
if(curl) {
fp = fopen(outfilename,"wb");
curl_easy_setopt(curl, CURLOPT_URL, "http://www.example.com");
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "user=123&pass=123");
curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
res = curl_easy_perform(curl);
Sleep(1000);
curl_easy_cleanup(curl);
fclose(fp);
}
return EXIT_SUCCESS;
}
The output is successfully saved in the text file.
My concern is how to extract specific content between specific tags.
For example, I want only the content between <bla> .............. </bla>.
What's the easiest way? Thank you.
In your example, you are dumping the response from the website to a file. libcurl writes the data returned by the web page exactly as it is; it makes no attempt to restructure it.
You can receive the data in memory instead by defining your own write_data function, which must have the following signature:
size_t write_data(char *ptr, size_t size, size_t nmemb, void *userdata);
Once you have the data in memory, you can parse it and restructure it as required.
See the example here for using the write_data function.
For XML parsing, you may use this sample code.
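Once the response is in memory as a std::string, a plain substring search is enough for the simple case in the question. This is only a sketch under assumptions: the function name extract_between is hypothetical, and it finds just the first occurrence without handling nesting or tag attributes. A real XML or HTML parser is the safer choice for anything less regular.

```cpp
#include <cstddef>
#include <string>

// Returns the text between the first `open` marker and the next `close`
// marker after it, or an empty string if either marker is missing.
std::string extract_between(const std::string& text,
                            const std::string& open,
                            const std::string& close) {
    const std::size_t start = text.find(open);
    if (start == std::string::npos) return "";
    const std::size_t content = start + open.size();   // first char after <open>
    const std::size_t end = text.find(close, content);
    if (end == std::string::npos) return "";
    return text.substr(content, end - content);
}
```

For instance, extract_between(response, "<bla>", "</bla>") would return whatever sits between the first pair of those tags in response.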
The code below gets a response from a server using WSDL. The problem is that curl returns a response, but I am unable to print it.
Error:
Failed writing body
Failed writing data
#include<stdio.h>
#include<string.h>
#include"../include/curl.h"
size_t write_data(void *ptr, size_t size, size_t count, void *stream)
{
/* ptr - your string variable.
stream - data chuck you received */
printf("%.*s", size, (char*)stream);
}
int main()
{
int res=0,i=0;
char buffer[4098]="",buff[128]="",buf[256]="",buf7[30]="",buf6[30]="",buf5[30]="";
char machineid[]="SUBANI";
char filename1[50]="";
int refno=0,paymode=0,taxtype=0;
FILE *fbc;
memset(filename1,0,sizeof(filename1));
sprintf(filename1,"/mnt/jffs2/Response_Details1.xml");
lk_dispclr();
lk_disptext(1,0,(unsigned char *)"Sending Request",0);
lk_disptext(2,0,(unsigned char *)"Please Wait",0);
memset(buffer,0,sizeof(buffer));
sprintf(buffer,"<?xml version=\"1.0\" encoding=\"utf-8\"?>\
<soap:Envelope xmlns:soap=\"http://www.w3.org/2003/05/soap-envelope\" xmlns:log=\"http://wsdlclassess.application.sims.test.com\">\
<soap:Header>\
</soap:Header>\
<soap:Body>\
<log:loginMethod>\
<log:loginid>%s</log:loginid>\
<log:password>%s</log:password>\
</log:loginMethod>\
</soap:Body>\
</soap:Envelope>","raja","test");
res=GET_FILE1(buffer,filename1);
return 0;
}
int GET_FILE1(char *buffer,char *filename)
{
CURL *curl;
CURLcode res;
struct curl_slist *headers = NULL;
FILE *out_fd = (FILE *) 0;
char errorbuf[300] = "",tmpbuff[128]="";
char errmsg[256];
int Timeout=120; //Default timeout is = 2 mins
int buffer_size = 0;
char urlbuff[256]="";
char mstr[10240];
memset(urlbuff,0,sizeof(urlbuff));
memset(tmpbuff,0,sizeof(tmpbuff));
buffer_size = strlen(buffer);
strcpy(tmpbuff,"http://10.10.1.111:8081/test_server/services/application?wsdl");
tmpbuff[strlen(tmpbuff)]='\0';
curl = curl_easy_init();
if(curl)
{
out_fd = fopen (filename, "w");
curl_easy_setopt(curl, CURLOPT_FILE, out_fd);
printf("%s:Sign-In Request\n", __func__);
headers = curl_slist_append(headers, "Content-type:application/soap+xml; charset=utf-8; action=\"http://wsdlclassess.application.sims.test.com/loginMethod\"");
curl_easy_setopt(curl, CURLOPT_URL, tmpbuff);
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0);
curl_easy_setopt(curl, CURLOPT_VERBOSE, 1);
curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, mstr);
curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, buffer_size);
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, buffer);
curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
curl_easy_setopt(curl, CURLOPT_TIMEOUT, Timeout);
curl_easy_setopt(curl, CURLOPT_ERRORBUFFER,errmsg);
printf("The Server%s:Performing Transaction.....\n",__func__);
res = curl_easy_perform(curl);
printf("res=after culreasey perform%d\n",res);
curl_slist_free_all(headers);
curl_easy_cleanup(curl);
printf("\nerrorbuf:%s\n",errmsg);
fclose(out_fd);
if(CURLE_OK != res)
{
puts("error occured is\n" );
//ppp_close();
return -1;
}
}
return 0;
}
The error is that you don't return the correct value from the function; in fact, you don't return anything.
Also, the data provided to the function is actually in the first argument, ptr.
I agree that the documentation is not very clear, but it says:
The size of the data pointed to by ptr is size multiplied with nmemb, it will not be zero terminated.
The above line (emphasis mine) tells you that the data is in ptr, which is the first argument in the function declaration provided in the documentation.
The documentation also states:
Return the number of bytes actually taken care of. If that amount differs from the amount passed to your function, it'll signal an error to the library. This will abort the transfer and return CURLE_WRITE_ERROR.
You don't return a value from the function, so you have undefined behavior: a seemingly random value is returned, which causes the whole operation to fail. To fix this, return size * count.
You also use size to print the string, but size is the size of the underlying element type (probably 1); your count variable is the number of elements received from curl. To work correctly without invoking more undefined behavior (since the data is not terminated), call printf like this:
printf("%.*s", (int)(size * count), (char *)ptr); /* the precision argument must be an int */
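Putting both fixes together, a corrected callback might look like the sketch below. It keeps the parameter names from the question and uses fwrite instead of printf, on the assumption that the goal is simply to echo the body to standard output; fwrite takes an explicit length, so the missing terminator is not a problem.

```cpp
#include <cstddef>
#include <cstdio>

// Same shape as the callback libcurl expects for CURLOPT_WRITEFUNCTION.
// ptr points at size * count bytes that are NOT zero-terminated.
std::size_t write_data(void *ptr, std::size_t size, std::size_t count, void *stream) {
    (void)stream;                              // user pointer, unused in this sketch
    const std::size_t bytes = size * count;
    // Write exactly `bytes` bytes; returns the number actually written.
    const std::size_t written = std::fwrite(ptr, 1, bytes, stdout);
    return written;                            // equals `bytes` on success
}
```

Returning anything other than size * count tells libcurl the write failed, which is what produced the "Failed writing body" message.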
I am trying to read the content of a PHP/HTML file on a remote web server using C++, but haven't found a way to do it. I want to pass GET parameters to it, e.g. http://example.com/login.php?user=abc&password=def.
How would I do it?
Your best bet is to use an external library. libcurl is popular and fairly easy to use.
Here's a simple example; you need to add error checking, though:
string data;
CURL *curl = curl_easy_init();
curl_easy_setopt(curl, CURLOPT_URL, url_.c_str());
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &data);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, curlWrite);
curl_easy_perform(curl);
curl_easy_cleanup(curl); // release the handle when done
Your callback would look something like this:
size_t curlWrite(void *ptr, size_t size, size_t nmemb, void *usrPtr)
{
size_t bytes = size * nmemb;
string *data = static_cast<string *>(usrPtr);
data->append(static_cast<const char *>(ptr), bytes);
return bytes;
}
You can add your GET parameters on the end of the URL.
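As a rough illustration of appending GET parameters, the sketch below joins key/value pairs onto a base URL. The helper name with_query is hypothetical, and it assumes the keys and values contain no characters that need percent-encoding; for arbitrary input, libcurl's curl_easy_escape can encode each value first.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: appends key=value pairs to a base URL as a query string.
// Assumption: nothing in the keys or values needs percent-encoding.
std::string with_query(const std::string& base,
                       const std::vector<std::pair<std::string, std::string>>& params) {
    std::string url = base;
    char separator = '?';              // first parameter follows '?', the rest '&'
    for (const auto& kv : params) {
        url += separator;
        url += kv.first + "=" + kv.second;
        separator = '&';
    }
    return url;
}
```

The resulting string can be passed to CURLOPT_URL with c_str(), just like url_ in the example above.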
I am currently trying to make an updater for my software project. I need it to be able to download multiple files. I don't mind if they download in sync or one after another, whichever is easier (file size is not an issue). I followed the example from the libcurl webpage and a few other resources and came up with this:
#include <iostream>
#include <stdio.h>
#include <curl/curl.h>
#include <string.h>
size_t write_data(void *ptr, size_t size, size_t nmemb, FILE *stream) {
size_t written;
written = fwrite(ptr, size, nmemb, stream);
return written;
}
int main(void){
for (int i = 0; i < 2;){ //download 2 files (loop twice)
CURL *curl;
FILE *fp;
CURLcode res;
char *url = "http://sec7.org/1024kb.txt"; //first file URL
char outfilename[FILENAME_MAX] = "C:\\users\\grant\\desktop\\1024kb.txt";
curl = curl_easy_init();
if (curl){
fp = fopen(outfilename,"wb");
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
fclose(fp);
}
url = "http://sec7.org/index.html"; //I want to get a new file this time
outfilename[FILENAME_MAX] = "C:\\users\\grant\\desktop\\index.html";
}
return 0;
}
The first issue is that if I remove the new file assignments (*url = "http://...") and just try to loop the download code twice, the program simply stops responding. This happens whenever the download is called more than once in the program. The other issue is that I am unable to change the value of the character array outfilename[FILENAME_MAX]. I feel like this is just some silly error I am making, but no solution comes to mind. Thank you!
Why not put this in a function and call it twice?
Your syntax for the arrays is all wrong, plus all the variables inside the loop are local, which means they are destroyed after each loop iteration.
What Conspicuous Compiler said. That's what's causing your program to freeze; it's stuck in an infinite loop because i is never incremented, so i < 2 is always true.
Put your code into a function like so:
void downloadFile(const char* url, const char* fname) {
CURL *curl;
FILE *fp;
CURLcode res;
curl = curl_easy_init();
if (curl){
fp = fopen(fname, "wb");
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_data);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);
res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
fclose(fp);
}
}
And call it twice with the relevant file names and urls:
downloadFile("http://sec7.org/1024kb.txt", "C:\\users\\grant\\desktop\\1024kb.txt");
downloadFile("http://sec7.org/index.html", "C:\\users\\grant\\desktop\\index.html");
This function is only a bare-bones example, though. You should alter it to check for failures (for instance from fopen and curl_easy_perform) and to return error codes or throw exceptions.