Receiving web pages with Windows sockets (C++), but getting unexpected characters - c++

I am trying to fetch a web page over a socket using HTTP GET. I do get the page, but something is slightly wrong.
Sometimes it arrives intact, but sometimes it contains stray characters, like this:
**<td class="c_ba2636">09</t
1ff8
d>**
It should be:
<td class="c_ba2636">09</td>
I do not know why there is a "1ff8" and some "\r\n".
It happens in different places from time to time. Sometimes it looks like this:
06
Again, it should be:
<td class="c_ba2636">06</td>
This is how I receive the page from the socket and save it:
ofstream out("webpage.html");
char text[2050]="";
int recvbytes=0;
string content;
while ( (recvbytes = recv(sock, text, 2048, 0)) > 0)
{
content=string(text,recvbytes);
out << content.c_str();
//System::Console::Write(gcnew String(content.c_str()));
}
closesocket(sock);
out.close();
I also tried out << text; but that did not work either.
Does anyone know what is wrong with my code?
I am using VS2010, and this is a WinForms program.

It may be normal if your input text is UTF-8 encoded and contains characters outside the ASCII range.

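Now I got it done. It turns out that those "1ff8", "2000" and similar values come from the HTTP protocol: they are the hexadecimal chunk-size lines of chunked transfer encoding (Transfer-Encoding: chunked), and each one gives the length of the chunk that follows. I just need to delete those lines and rejoin the lines they interrupt. So I added a function: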
private: void rearrangment()
{
ifstream ifile("webpage.html");
ofstream ofile("web.html");
char line1[2048]="";
char line2[2048]="";
char line3[2048]="";
ifile.getline(line1,2047);
//ifile.getline(line2,2047);
while(!ifile.eof())
{
ifile.getline(line2,2047);
if(string(line2,0,3)!=" "
&& line2[0]!='<' && line2[1]!='<' && line2[2]!='<')//they are "1ff8"s
{
ifile.getline(line3,2047);
for(int i=0;i<2046;++i)
{
if(line1[i]==13)
{
line1[i]=0;
break;
}
}
strcat(line1,line3);
}
else
{
ofile<<line1<<endl;
strcpy(line1,line2);
//ofile<<line1;
}
//ofile<<line1;
}
ifile.close();
ofile.close();
}
Now it works well.
Sorry about this stupid question; I should have searched before I asked.
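The line-based cleanup above relies on how the HTML happens to be laid out. As an alternative, here is a minimal sketch of decoding the chunk framing directly. It assumes the response headers have already been stripped and that the whole body is in memory; decode_chunked is a hypothetical helper name, not part of any library, and chunk extensions and trailers are simply ignored.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decodes an HTTP/1.1 chunked message body (headers already removed).
// Each chunk is "<hex size>[;extensions]\r\n<data>\r\n"; a size of 0 ends the body.
std::string decode_chunked(const std::string& body)
{
    std::string out;
    std::size_t pos = 0;
    for (;;) {
        // Find the end of the chunk-size line.
        std::size_t eol = body.find("\r\n", pos);
        if (eol == std::string::npos)
            throw std::runtime_error("missing chunk-size line");

        // std::stoul stops at the first non-hex character, so any
        // ";extension" after the size is ignored.
        std::size_t size = std::stoul(body.substr(pos, eol - pos), nullptr, 16);
        pos = eol + 2;                      // skip the CRLF after the size

        if (size == 0)
            break;                          // last chunk; trailers are ignored
        if (body.size() - pos < size)
            throw std::runtime_error("truncated chunk");

        out.append(body, pos, size);        // copy exactly `size` data bytes
        pos += size + 2;                    // skip the data and its trailing CRLF
    }
    return out;
}
```

Sending the request as HTTP/1.0, or checking for the Transfer-Encoding header before decoding, avoids running this on a body that is not chunked.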

Related

Finding where a segmentation fault happens when comparing two files

I have the following structure:
int main(int argc, char **argv) {
try {
FX3USB3Connection fx3USB3Connection = FX3USB3Connection();
fx3USB3Connection.send_text_file();
}
catch (ErrorOpeningLib& e) {
printf("Error opening library\n");
return -1;
}
catch (NoDeviceFound& e) {
printf("No device found\n");
return 0;
}
return 0;
}
Within send_text_file, the last thing I do is compare two text files as follows:
printf("Loopback recieved, checking if I received the same that I sended\n");
files_match(out_text_filename, in_text_filename);
printf("Exited without problem");
return; // (actually implicit)
I have already tried two versions of the files_match function; the latest one is an exact copy of the one from this question: Compare two files
bool FX3USB3Connection::files_match(const std::string &p1, const std::string &p2) {
bool files_match;
std::ifstream f1(p1, std::ifstream::binary|std::ifstream::ate);
std::ifstream f2(p2, std::ifstream::binary|std::ifstream::ate);
if (f1.fail() || f2.fail()) {
return false; //file problem
}
if (f1.tellg() != f2.tellg()) {
return false; //size mismatch
}
//seek back to beginning and use std::equal to compare contents
f1.seekg(0, std::ifstream::beg);
f2.seekg(0, std::ifstream::beg);
files_match = std::equal(std::istreambuf_iterator<char>(f1.rdbuf()),
std::istreambuf_iterator<char>(),
std::istreambuf_iterator<char>(f2.rdbuf()));
f1.close();
f2.close();
if (files_match) { printf("Files match\n"); }
else { printf("Files not equal\n"); }
return files_match;
}
Sometimes I get an error and sometimes I don't. When I get the error I get:
Loopback recieved, checking if I received the same that I sended
Files match
Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
So the print after the call to files_match is not shown, which made me guess the problem was inside the function. However, a print placed just before the return statement is shown correctly.
PS: If I comment out the call to files_match, I have no problems.
PS1: The files can contain arbitrary characters, such as ¥.
Yes, as @john suggested, I had to add a call to fflush(). With that I realized the error is actually outside this function: it happens when leaving the try {} block. It seems that the fx3USB3Connection object is not being destroyed correctly.
Thank you! I was misled, not knowing that printf was actually buffered.
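To illustrate why the last message never appeared, here is a small sketch that shows buffered stdio output reaching its destination only after fflush(). It uses a regular file instead of stdout so the effect can be measured; sizes_around_flush is a hypothetical helper written for this illustration, and the exact buffering behaviour is an assumption about typical C library implementations.

```cpp
#include <cstdint>
#include <cstdio>
#include <filesystem>
#include <utility>

// Writes a short message to a fully buffered FILE* and reports the on-disk
// file size before and after fflush().
std::pair<std::uintmax_t, std::uintmax_t>
sizes_around_flush(const std::filesystem::path& path)
{
    std::FILE* f = std::fopen(path.string().c_str(), "w");
    if (!f)
        return {0, 0};

    // Ask for full buffering with a library-allocated buffer of BUFSIZ bytes.
    std::setvbuf(f, nullptr, _IOFBF, BUFSIZ);

    // No newline and no flush: the text stays in the stdio buffer.
    std::fprintf(f, "Exited without problem");
    std::uintmax_t before = std::filesystem::file_size(path);

    // Push the buffered bytes to the operating system.
    std::fflush(f);
    std::uintmax_t after = std::filesystem::file_size(path);

    std::fclose(f);
    return {before, after};
}
```

In the program above, the same idea means calling fflush(stdout) after each diagnostic printf, or writing diagnostics to stderr, so that a later crash cannot swallow them.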

Call HPDF_SaveToFile() with japanese filename

I'm trying to save a PDF to a path that contains a Japanese user name. In this case, HPDF_SaveToFile crashes my app on Windows. Is there a compile option or something else that helps? Any idea how to support Unicode filenames with libharu? I do not want to create a PDF with Japanese encoding; I want to write a PDF with a Japanese filename.
Here is a solution using Qt. If you use plain C++, you can use fstream/ofstream (::write) instead. If you use C, you can use fwrite.
QFile file(path);
if (file.open(QIODevice::WriteOnly))
{
HPDF_SaveToStream(m_pdf);
HPDF_ResetStream(m_pdf); /* rewind the stream before reading it back */
/* get the data from the stream and write it to file. */
for (;;)
{
HPDF_BYTE buf[4096];
HPDF_UINT32 siz = 4096;
HPDF_STATUS ret = HPDF_ReadFromStream(m_pdf, buf, &siz);
if (siz == 0)
{
break;
}
if (-1 == file.write(reinterpret_cast<const char *>(buf), siz))
{
qDebug() << "Write PDF error";
break;
}
}
}
HPDF_Free(m_pdf);
Reference: Libharu usage examples

Receiving only necessary data with C++ Socket

I'm just trying to get the contents of a page with their headers...but it seems that my buffer of size 1024 is either too large or too small for the last packet of information coming through...I don't want to get too much or too little, if that makes sense. Here's my code. It's printing out the page just fine with all the information, but I want to ensure that it's correct.
//Build HTTP Get Request
std::stringstream ss;
ss << "GET " << url << " HTTP/1.0\r\nHost: " << strHostName << "\r\n\r\n";
std::string req = ss.str();
// Send Request
send(hSocket, req.c_str(), strlen(req.c_str()), 0);
// Read from socket into buffer.
do
{
nReadAmount = read(hSocket, pBuffer, sizeof pBuffer);
printf("%s", pBuffer);
}
while(nReadAmount != 0);
nReadAmount = read(hSocket, pBuffer, sizeof pBuffer);
printf("%s", pBuffer);
This is broken. You can only use the %s format specifier for a C-style (zero-terminated) string. How is printf supposed to know how many bytes to print? That information is in nReadAmount, but you don't use it.
Also, you call printf even if read fails.
The simplest fix:
do
{
nReadAmount = read(hSocket, pBuffer, (sizeof pBuffer) - 1);
if (nReadAmount <= 0)
break;
pBuffer[nReadAmount] = 0;
printf("%s", pBuffer);
} while(1);
The correct way to read an HTTP reply is to read until you have received a full LF-delimited line (some servers use bare LF even though the official spec says to use CRLF). That first line contains the response code and version. Then keep reading LF-delimited lines, which are the headers, until you encounter a 0-length line, which marks the end of the headers. Finally, analyze the headers to figure out how the remaining data is encoded, so you know the proper way to read it and how it is terminated. There are several different possibilities; refer to RFC 2616 Section 4.4 for the actual rules.
In other words, your code needs to use this kind of structure instead (pseudo code):
// Send Request
send(hSocket, req.c_str(), req.length(), 0);
// Read Response
std::string line = ReadALineFromSocket(hSocket);
int rescode = ExtractResponseCode(line);
std::vector<std::string> headers;
do
{
line = ReadALineFromSocket(hSocket);
if (line.length() == 0) break;
headers.push_back(line);
}
while (true);
if (
((rescode / 100) != 1) &&
(rescode != 204) &&
(rescode != 304) &&
(request is not "HEAD")
)
{
if ((headers has "Transfer-Encoding") && (Transfer-Encoding != "identity"))
{
// read chunks until a 0-length chunk is encountered.
// refer to RFC 2616 Section 3.6 for the format of the chunks...
}
else if (headers has "Content-Length")
{
// read how many bytes the Content-Length header says...
}
else if ((headers has "Content-Type") && (Content-Type == "multipart/byteranges"))
{
// read until the terminating MIME boundary specified by Content-Type is encountered...
}
else
{
// read until the socket is disconnected...
}
}
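The decision at the end of the pseudo code can be sketched as runnable C++. This is only an illustration under stated assumptions: the headers are assumed to be already parsed into a map keyed by lower-cased header name, and BodyFraming, to_lower and choose_framing are hypothetical names introduced here. Reading the body itself is left out.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

enum class BodyFraming { None, Chunked, ContentLength, MultipartByteranges, UntilClose };

// Lower-cases ASCII so header values compare case-insensitively.
std::string to_lower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Picks how the message body is delimited, in the order used by the
// pseudo code above. `headers` maps lower-cased names to values.
BodyFraming choose_framing(int status, bool is_head_request,
                           const std::map<std::string, std::string>& headers)
{
    // 1xx, 204, 304 and replies to HEAD never carry a body.
    if (status / 100 == 1 || status == 204 || status == 304 || is_head_request)
        return BodyFraming::None;

    auto te = headers.find("transfer-encoding");
    if (te != headers.end() && to_lower(te->second) != "identity")
        return BodyFraming::Chunked;

    if (headers.count("content-length") != 0)
        return BodyFraming::ContentLength;

    // The Content-Type value may carry parameters, so compare the prefix only.
    auto ct = headers.find("content-type");
    if (ct != headers.end() &&
        to_lower(ct->second).rfind("multipart/byteranges", 0) == 0)
        return BodyFraming::MultipartByteranges;

    return BodyFraming::UntilClose;
}
```

With an HTTP/1.0 request like the one in the question, most servers answer with either Content-Length or no framing header at all, so the last branch (read until the socket is closed) is the common case.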

Reading COM port in c++, getting errors

First-time poster, long-time reader.
I've been playing around with reading data from a Bluetooth GPS unit.
I can connect to it using hyperterm and see the data.
The following log is from hyperterm:
$GPRMC,195307.109,A,5208.2241,N,00027.7689,W,000.0,345.8,310712,,,A*7E
$GPVTG,345.8,T,,M,000.0,N,000.0,K,A*07
$GPGGA,195308.109,5208.2242,N,00027.7688,W,1,04,2.1,58.9,M,47.3,M,,0000*7E
$GPGSA,A,3,19,03,11,22,,,,,,,,,5.5,2.1,5.0*3F
$GPRMC,195308.109,A,5208.2242,N,00027.7688,W,000.0,345.8,310712,,,A*73
$GPVTG,345.8,T,,M,000.0,N,000.0,K,A*07
$GPGGA,195309.109,5208.2243,N,00027.7688,W,1,04,2.1,58.9,M,47.3,M,,0000*7E
END LOG
The following log is from my C++ program
$GPGSV,3,3,12,14,20,105,16,28,18,323,,08,07,288,,16,01,178,*7A
$GPRMC,195,3,2ÿþÿÿÿL.š945.109,A,5208.2386,N,00027.7592,W,000.0,169.5,8,323,,08,07,288,,16,01,178,*7A
$GPRMC,195,3,2ÿþÿÿÿL.š310712,,,A*70
$GPVTG,169.5,T,,M,000.0,N,000.0,K,A*06
8,07,288,,16,01,178,*7A
$GPRMC,195,3,2ÿþÿÿÿL.š310712,,,A*70
$GPVTG,169.5,T,,M,000.0,N,000.0,K,A*06
8,07,288,,16,01,178,*7A
$GPRMC,195,3,2ÿþÿÿÿL.š$GPGGA,195946.109,5208.2386,N,00027.7592,W,1.0,K,A*06
8,07,288,,16,01,178,*7A
END LOG
THE PROBLEM
I've left the line feeds as they come. The C++ output has extra line feeds, and I'm not sure why.
The C++ log also has some funky characters.
The Code
for (int n=0;n<100;n++) {
char INBUFFER[100];
cv::waitKey(1000);
bStatus = ReadFile(comport, // Handle
&INBUFFER, // Incoming data
100, // Number of bytes to read
&bytes_read, // Number of bytes read
NULL);
cout << "bStatus " << bStatus << endl;
if (bStatus != 0)
{
// error processing code goes here
}
LogFile << INBUFFER;
}
I'm using settings...
comSettings.BaudRate = 2400;
comSettings.StopBits = ONESTOPBIT;
comSettings.ByteSize = 8;
comSettings.Parity = NOPARITY;
comSettings.fParity = FALSE;
...which as far as I can tell are the same as the settings used by hyperterm.
Any hints on what I'm doing wrong?
cheers!
UPDATE
So after updating to use bytes_read and account for the extra LF at the end of NMEA data I now have...
if (bytes_read!=0) {
for (int i=0; i < bytes_read; i++) {
LogFile << INBUFFER[i];
}
}
Which appears to have fixed things!
$GPGGA,215057.026,5208.2189,N,00027.7349,W,1,04,6.8,244.6,M,47.3,M,,0000*41
$GPGSA,A,3,32,11,01,19,,,,,,,,,9.7,6.8,7.0*3D
$GPRMC,215057.026,A,5208.2189,N,00027.7349,W,002.0,208.7,310712,,,A*74
$GPVTG,208.7,T,,M,002.0,N,003.8,K,A*09
$GPGGA,215058.026,5208.2166,N,00027.7333,W,1,04,6.8,243.1,M,47.3,M,,0000*42
Thanks folks, your help was much appreciated.
You have a bytes_read var, but you don't do anything with it? Seems to me that you're dumping the entire INBUFFER to the file, no matter how many/few bytes are actually loaded into it?

Inconsistent behaviour with fopen in C/C++

I'm working with a library which opens the same file many times. It checks the header of the file to make sure that it is the correct format. The first 1212 times it opens the file, it behaves correctly. The 1213th time, the bytes read out from the file are different. Can anyone suggest why this might be happening?
Unfortunately I can't make a small reproducible example - and it takes 20 minutes to run through to this point. So I'm wondering if there are any subtleties of fopen which I might have missed, or something else which might have a bearing on this execution.
The code is below. Many instances of the class are created, and each one has initialise() called with the same filename. The first 1212 times, the output is:
Expecting: '?'
?lon-1800????%#LYB1800????%#LYB100????%#LYB
lat-900??p-2?%#HYB900??p-2?%#HYB10??p-2?%#HYB
? soilcode0 ?? ?-2?&#AYB12 ?? ?-2?&#AYB1 ?? ?-2?&#AYBmtemp-600??x.2?&#6YB600??x.2?&#6YB10??x.2?&#6YB
?mprec0???H2?&#.YB99999???H2?&#.YB1999???H2?&#.YB?msun0???A2?&#%YB1000???A2?&#%YB100???A2?&#%YB
?
Got: '?'
?lon-1800????%#LYB1800????%#LYB100????%#LYB
lat-900??p-2?%#HYB900??p-2?%#HYB10??p-2?%#HYB
? soilcode0 ?? ?-2?&#AYB12 ?? ?-2?&#AYB1 ?? ?-2?&#AYBmtemp-600??x.2?&#6YB600??x.2?&#6YB10??x.2?&#6YB
?mprec0???H2?&#.YB99999???H2?&#.YB1999???H2?&#.YB?msun0???A2?&#%YB1000???A2?&#%YB100???A2?&#%YB
?
The last time I get:
Expecting: '?'
?lon-1800????%#LYB1800????%#LYB100????%#LYB
lat-900??p-2?%#HYB900??p-2?%#HYB10??p-2?%#HYB
? soilcode0 ?? ?-2?&#AYB12 ?? ?-2?&#AYB1 ?? ?-2?&#AYBmtemp-600??x.2?&#6YB600??x.2?&#6YB10??x.2?&#6YB
?mprec0???H2?&#.YB99999???H2?&#.YB1999???H2?&#.YB?msun0???A2?&#%YB1000???A2?&#%YB100???A2?&#%YB
?
Got: ' lon lat year
The function is as follows:
class Archive {
private:
FILE* pfile;
<snip>
bool initialise(char* filename) {
int i;
unsigned char* pheader;
if (pfile) fclose(pfile);
pfile=fopen(filename,"rb");
if (!pfile || pfile == NULL ) {
printf("Could not open %s for input\n",filename);
return false;
}
pheader=new unsigned char[CRU_1901_2002_HEADERSIZE-4];
if (!pheader) {
printf("Out of memory\n");
fclose(pfile);
pfile=NULL;
return false;
}
::rewind(pfile);
fread(pheader,CRU_1901_2002_HEADERSIZE-4,1,pfile);
printf( "Expecting: '%s'\n", CRU_1901_2002_HEADER);
for( int j = 0; j < CRU_1901_2002_HEADERSIZE-4;j++ )
printf( "%c", CRU_1901_2002_HEADER[j]);
printf( "\nGot: '%s'\n", pheader);
for( int j = 0; j < CRU_1901_2002_HEADERSIZE-4;j++ )
printf( "%c", pheader[j]);
printf( "\n");
for (i=0;i<CRU_1901_2002_HEADERSIZE-4;i++) {
if (pheader[i]!=CRU_1901_2002_HEADER[i]) {
fclose(pfile);
pfile=NULL;
delete[] pheader;
return false;
}
}
delete[] pheader;
::rewind(pfile);
fseek(pfile,CRU_1901_2002_HEADERSIZE+CRU_1901_2002_DATA_LENGTH*CRU_1901_2002_NRECORD,SEEK_CUR);
recno=0;
iseof=false;
return true;
}
public:
Archive() {
pfile=NULL;
}
~Archive() {
if (pfile) fclose(pfile);
}
Are you sure there is data at the 1213th position, and that this data is correct?
I suggest building a file with more than 1213 records and testing whether a read error really occurs at that position.
It turns out this is down to too many files being open. Changing the program elsewhere to open fewer files fixes it.
Checking the return value of fread shows 1 every time except the last, where it returns 0.
However, I don't understand why fopen gives back a non-null file pointer when it can't open the file. In test code it returns NULL, which is then caught as expected.
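Since the failure only showed up through the return value of fread, here is a minimal sketch of a header check that tests both fopen and fread and closes the file before returning. header_matches is a hypothetical helper, not the library's code; unlike initialise() above, it does not keep the file open, which is an assumption about what the caller needs.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Returns true only if the file opens, exactly `size` header bytes can be
// read, and those bytes equal `expected`.
bool header_matches(const char* filename, const unsigned char* expected,
                    std::size_t size)
{
    std::FILE* file = std::fopen(filename, "rb");
    if (!file) {
        std::perror("fopen");               // reports e.g. "Too many open files"
        return false;
    }

    std::vector<unsigned char> header(size);
    // One item of `size` bytes: fread returns 1 on success, 0 on a short read.
    std::size_t items = std::fread(header.data(), size, 1, file);
    std::fclose(file);                      // release the handle on every path

    if (items != 1)
        return false;
    return std::memcmp(header.data(), expected, size) == 0;
}
```

Closing the handle on every path keeps the number of simultaneously open files constant, however many times the check runs.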