I have to write an HTTP proxy. On the proxy side I have to parse the HTTP request sent by the user.
The question is: how do I read binary data from the client so that in the end I get an array of char that contains the full request?
What I did is: read 1 byte at a time.
char c;
int n = read(con,&c,1);
I saw many implementations that use 1024 bytes as the buffer size, but are we sure that the request will not exceed 1024 bytes?
Normally I first have to allocate memory for the buffer array, so how can I know the size of the request in order to allocate that much memory?
My full methods:
void readToken(int con,char *token){
char c;
int i=0;
do{
int n = read(con,&c,1);
*token++ = c;
}while(c!=' ' && c!='\n');
}
void readLine(int con,char *line){
char c;int i=0;
do{
int n = read(con,&c,1);
*line++ = c;
}while(c!='\n');
}
char * handleRequest(int con){
char resource[30];
char version[5];
char method[4] ;
//I read 4 bytes to get the method type
int n = read(con,&method,4);
//here i read until i get a blank space
readToken(con,resource);
//readToken(con,version);
printf("the method is%s\n",method);
printf("the resource asked is%s\n",resource);
//printf("the resource asked is%s\n",version);
printf("the method read is %s",firstLine);
readLine(con,hostLine);
printf("the method read is %s",hostLine);
}
Reading one character at a time is terribly inefficient and slows you down tremendously. Instead, you should read chunks of an appropriate size (1024 seems as good an initial guess as any) in a loop and append each chunk to the data read so far. This is extremely easy to do with C++ std::vector.
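A minimal sketch of the chunked approach described above, assuming a POSIX descriptor and a request whose head ends with the usual blank line ("\r\n\r\n"). The name read_request_head and the 64 KiB cap are assumptions made for illustration, not part of the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>
#include <unistd.h>   // read(), POSIX

// Hypothetical helper: reads from fd in 1024-byte chunks and appends them to
// `request` until the blank line that ends an HTTP request head is seen.
// Returns false on error, early close, or when max_size is reached first.
bool read_request_head(int fd, std::vector<char>& request,
                       std::size_t max_size = 64 * 1024)
{
    static const char terminator[] = "\r\n\r\n";
    char chunk[1024];
    while (request.size() < max_size) {
        ssize_t n = read(fd, chunk, sizeof chunk);
        if (n <= 0)
            return false;   // error or connection closed before the head ended
        // Re-scan the last 3 old bytes too: the terminator may straddle two chunks.
        std::size_t scan_from = request.size() >= 3 ? request.size() - 3 : 0;
        request.insert(request.end(), chunk, chunk + n);
        auto hit = std::search(request.begin() + scan_from, request.end(),
                               terminator, terminator + 4);
        if (hit != request.end())
            return true;
    }
    return false;           // head larger than max_size
}
```

The vector grows as needed, so the request size does not have to be known in advance. Note that the last chunk may contain bytes after the blank line (the start of a body), which the caller would still have to handle.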
Parsing an HTTP request is a quite complicated task, I think it could be easier to use a library like http-parser, which does the parsing for you in a very efficient way.
Related
I am trying to implement a simple file transfer. Below here is two methods that i have been testing:
Method one: sending and receiving without splitting the file.
I hard coded the file size for easier testing.
sender:
send(sock,buffer,107,NULL); //sends a file with 107 size
receiver:
char * buffer = new char[107];
recv(sock_CONNECTION,buffer,107,0);
std::ofstream outfile (collector,std::ofstream::binary);
outfile.write (buffer,107);
The output is as expected, the file isn't corrupted because the .txt file that i sent contains the same content as the original.
Method two: sending and receiving by splitting the contents on receiver's side. 5 bytes each loop.
sender:
send(sock,buffer,107,NULL);
Receiver:
char * buffer = new char[107]; //total file buffer
char * ptr = new char[5]; //buffer
int var = 5;
int sizecpy = size; //orig size
while(size > var ){ //collect bytes
recv(sock_CONNECTION,ptr,5,0);
strcat(buffer,ptr); //concatenate
size= size-var; //decrease
std::cout<<"Transferring.."<<std::endl;
}
std::cout<<"did it reach here?"<<std::endl;
char*last = new char[size];
recv(sock_CONNECTION,last,2,0); //last two bytes
strcat(buffer,last);
std::ofstream outfile (collector,std::ofstream::binary);
outfile.write (buffer,107);
Output: The text file contains invalid characters especially at the beginning and the end.
Questions: How can i make method 2 work? The sizes are the same but they yield different results. the similarity of the original file and the new file on method 2 is about 98~99% while it's 100% on method one. What's the best method for transferring files?
What's the best method for transferring files?
Usually I don't answer questions like "What's the best method?", but in this case it's obvious:
You send the file size and a checksum in network byte order when starting a transfer
Optionally, you send more header data (e.g. the filename)
The client reads the file size and the checksum, and decodes them to host byte order
You send the file's data in reasonably sized chunks (5 bytes isn't a reasonable size); chunks should match the maximum payload size available in a TCP/IP frame
You receive chunk by chunk on the client side until the previously sent file size is matched
You calculate the checksum of the received data on the client side, and check whether it matches the one that was received beforehand
Note: You don't need to combine all chunks in memory at the client side, but just append them to a file at a storage medium. Also the checksum (CRC) usually can be calculated from running through data chunks.
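The header and running-checksum steps above could be sketched as follows. Everything named here (TransferHeader, encode_header, decode_header, crc32_update) is hypothetical, the 32-bit size field is an assumption that limits files to 4 GiB, and CRC-32 is just one possible choice of checksum.

```cpp
#include <arpa/inet.h>  // htonl(), ntohl() (POSIX; Winsock offers the same functions)
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed header layout: 4-byte file size followed by 4-byte CRC-32, both big-endian.
struct TransferHeader {
    std::uint32_t file_size;
    std::uint32_t checksum;
};

// Running CRC-32 (reflected polynomial 0xEDB88320). Start with crc = 0 and feed
// each chunk as it arrives; the result equals the CRC of the whole data.
std::uint32_t crc32_update(std::uint32_t crc, const unsigned char* data, std::size_t len)
{
    crc = ~crc;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Sender side: host byte order -> network byte order.
void encode_header(const TransferHeader& h, unsigned char out[8])
{
    std::uint32_t size_be = htonl(h.file_size);
    std::uint32_t crc_be = htonl(h.checksum);
    std::memcpy(out, &size_be, 4);
    std::memcpy(out + 4, &crc_be, 4);
}

// Receiver side: network byte order -> host byte order.
TransferHeader decode_header(const unsigned char in[8])
{
    std::uint32_t size_be, crc_be;
    std::memcpy(&size_be, in, 4);
    std::memcpy(&crc_be, in + 4, 4);
    return TransferHeader{ntohl(size_be), ntohl(crc_be)};
}
```

The receiver would read the 8 header bytes first, then call crc32_update on every chunk it appends to the output file, and compare the final value with the checksum from the header once file_size bytes have arrived.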
Disagree with Galik. Better not to use strcat, strncat, or anything but the intended output buffer.
TCP is kinda fun. You never really know how much data you are going to get from a single call, but you will get it all eventually, or an error.
This will read up to MAX bytes at a time. #define MAX to whatever you want.
std::unique_ptr<char[]> buffer (new char[size]);
int loc = 0; // where in buffer to write the next batch of data
int remaining = size; // bytes still expected; size itself stays intact
int bytesread; // how much data was read? recv returns -1 on error and 0 if the peer closed
while(remaining > 0)
{ //collect bytes
    bytesread = recv(sock_CONNECTION,&buffer[loc],MAX>remaining?remaining:MAX,0);
    if (bytesread <= 0)
    {
        //handle error or premature close
        break;
    }
    loc += bytesread;
    remaining -= bytesread; //decrease
    std::cout<<"Transferring.."<<std::endl;
}
std::ofstream outfile (collector,std::ofstream::binary);
outfile.write (buffer.get(),loc); // write only what actually arrived
Even more fun, write each chunk straight to the output file so you don't have to store the whole file in memory. In this case MAX should be a bigger number.
std::ofstream outfile (collector,std::ofstream::binary);
char buffer[MAX];
int bytesread; //how much data was read? recv will return -1 on error
while(size)
{ //collect bytes
bytesread = recv(sock_CONNECTION,buffer,MAX>size?size:MAX,0);
// MAX>size?size:MAX is like a compact if-else: if (MAX>size){size}else{MAX}
if (bytesread <= 0)
{
//handle error (-1) or a connection closed early (0); without the break, a 0 would loop forever
break;
}
outfile.write (buffer,bytesread);
size -= bytesread; //decrease
std::cout<<"Transferring.."<<std::endl;
}
The initial problems I see are with std::strcat. You can't use it on an uninitialized buffer. Also you are not copying a null terminated c-string. You are copying a sized buffer. Better to use std::strncat for that:
char * buffer = new char[107]; //total file buffer
char * ptr = new char[5]; //buffer
int var = 5;
int sizecpy = size; //orig size
// initialize buffer
*buffer = '\0'; // add null terminator
while(size > var ){ //collect bytes
recv(sock_CONNECTION,ptr,5,0);
strncat(buffer, ptr, 5); // strncat only 5 chars
size= size-var; //decrease
std::cout<<"Transferring.."<<std::endl;
}
Beyond that, you should really add error checking so the sockets library can tell you if anything went wrong with the communication.
Does anyone know how to read in a file with raw encoding? So stumped.... I am trying to read in floats or doubles (I think). I have been stuck on this for a few weeks. Thank you!
File that I am trying to read from:
http://www.sci.utah.edu/~gk/DTI-data/gk2/gk2-rcc-mask.raw
Description of raw encoding:
http://teem.sourceforge.net/nrrd/format.html#encoding
- "raw" - The data appears on disk exactly the same as in memory, in terms of byte values and byte ordering. Produced by write() and fwrite(), suitable for read() or fread().
Info of file:
http://www.sci.utah.edu/~gk/DTI-data/gk2/gk2-rcc-mask.nhdr - I think the only things that matter here are the big endian (still trying to understand what that means from google) and raw encoding.
My current approach, uncertain if it's correct:
//Function ripped off from example of c++ ifstream::read reference page
void scantensor(string filename){
ifstream tdata(filename, ifstream::binary); // not sure if I should put ifstream::binary here
// other things I tried
// ifstream tdata(filename) ifstream tdata(filename, ios::in)
if(tdata){
tdata.seekg(0, tdata.end);
int length = tdata.tellg();
tdata.seekg(0, tdata.beg);
char* buffer = new char[length];
tdata.read(buffer, length);
tdata.close();
double* d;
d = (double*) buffer;
} else cerr << "failed" << endl;
}
/* P.S. I attempted to print the first 100 elements of the array.
Then I print 100 other elements at some arbitrary array indices (i.e. 9,900 - 10,000). I actually kept increasing the number of 0's until I ran out of bound at 100,000,000 (I don't think that's how it works lol but I was just playing around to see what happens)
Here's the part that makes me suspicious: ifstream has different constructors, like the ones I tried above.
the first 100 values are always the same.
if I use ifstream::binary, then I get some values for the 100 arbitrary printing
if I use the other two options, then I get -6.27744e+066 for all 100 of them
So for now I am going to assume that ifstream::binary is the correct one. The thing is, I am not sure if the file I provided is how binary files actually look like. I am also unsure if these are the actual numbers that I am supposed to read in or just casting gone wrong. I do realize that my casting from char* to double* can be unsafe, and I got that from one of the threads.
*/
I really appreciate it!
Edit 1: Right now the data being read in using the above method is apparently "incorrect" since in paraview the values are:
Dxx,Dxy,Dxz,Dyy,Dyz,Dzz
[0, 1], [-15.4006, 13.2248], [-5.32436, 5.39517], [-5.32915, 5.96026], [-17.87, 19.0954], [-6.02961, 5.24771], [-13.9861, 14.0524]
It's a 3 x 3 symmetric matrix, so 7 distinct values, 7 ranges of values.
The floats that I am currently parsing from the file are wildly outside those ranges (e.g. -4.68855e-229, -1.32351e+120).
Perhaps somebody knows how to extract the floats from Paraview?
Since you want to work with doubles, I recommend reading the data from the file into a buffer of doubles:
const long machineMemory = 0x40000000; // 1 GB
FILE* file = fopen("c:\\data.bin", "rb");
if (file)
{
size_t size = machineMemory / sizeof(double);
if (size > 0)
{
double* data = new double[size];
size_t read = 0; // fread returns the number of complete elements as a size_t
while ((read = fread(data, sizeof(double), size, file)) > 0)
{
// Process data here (read = number of doubles)
}
delete [] data;
}
fclose(file);
}
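Since the header linked in the question declares the data as big endian, a little-endian machine (such as x86) has to reverse the bytes of each value before interpreting it. The sketch below assumes IEEE-754 floating point on the host; decode_big_endian is a hypothetical name, and the element type T has to be taken from the type field of the .nhdr file, which is not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: interprets `bytes` as consecutive big-endian values of type T.
// Trailing bytes that do not form a complete value are ignored.
template <typename T>
std::vector<T> decode_big_endian(const std::vector<unsigned char>& bytes)
{
    // Detect the host byte order at run time: the first byte of the integer 1
    // is 1 only on a little-endian machine.
    const bool host_is_little_endian = [] {
        const unsigned probe = 1;
        unsigned char first;
        std::memcpy(&first, &probe, 1);
        return first == 1;
    }();

    std::vector<T> values(bytes.size() / sizeof(T));
    for (std::size_t i = 0; i < values.size(); ++i) {
        unsigned char raw[sizeof(T)];
        std::memcpy(raw, bytes.data() + i * sizeof(T), sizeof(T));
        if (host_is_little_endian)
            std::reverse(raw, raw + sizeof(T));
        // memcpy instead of a char* -> T* cast avoids alignment and aliasing problems.
        std::memcpy(&values[i], raw, sizeof(T));
    }
    return values;
}
```

If the decoded numbers still look implausible, the element type is probably wrong (for example float instead of double), which is worth checking before suspecting the stream flags.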
I have the code:
unsigned char *myArray = new unsigned char[40000];
char pixelInfo[3];
int c = 0;
while(!reader.eof()) //reader is a ifstream open to a BMP file
{
reader.read(pixelInfo, 3);
myArray[c] = (unsigned char)pixelInfo[0];
myArray[c + 1] = (unsigned char)pixelInfo[1];
myArray[c + 2] = (unsigned char)pixelInfo[2];
c += 3;
}
reader.close();
delete[] myArray; //I get HEAP CORRUPTION here
After some tests, I found it to be caused by the cast in the while loop, if I use a signed char myArray I don't get the error, but I must use unsigned char for the rest of my code.
Casting pixelInfo to unsigned char also gives the same error.
Is there any solution to this?
This is what you should do:
reader.read((char*)myArray, myArrayLength); /* note, that isn't (sizeof myArray) */
if (!reader) { /* report error */ }
If there's processing going on inside the loop, then
int c = 0;
while (c + 2 < myArrayLength) //reader is a ifstream open to a BMP file; myArrayLength is the allocated size, as above
{
reader.read(pixelInfo, 3);
myArray[c] = (unsigned char)pixelInfo[0];
myArray[c + 1] = (unsigned char)pixelInfo[1];
myArray[c + 2] = (unsigned char)pixelInfo[2];
c += 3;
}
Trying to read after you've hit the end is not a problem -- you'll get junk in the rest of the array, but you can deal with that at the end.
Assuming your array is big enough to hold the whole file invites buffer corruption. Buffer overrun attacks involving image files with carefully crafted incorrect metadata are quite well-known.
in Mozilla
in Sun Java
in Internet Explorer
in Windows Media Player
again in Mozilla
in MSN Messenger
in Windows XP
Do not rely on the entire file content fitting in the calculated buffer size.
reader.eof() will only tell you if the previous read hit the end of the file, which causes your final iteration to write past the end of the array. What you want instead is to check if the current read hits the end of file. Change your while loop to:
while(reader.read(pixelInfo, 3)) //reader is a ifstream open to a BMP file
{
// ...
}
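An alternative that sidesteps the fixed-size array entirely is to let a std::vector grow with the file. This is only a sketch: read_all_bytes is a hypothetical name and the 4096-byte chunk size is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a stream to the end into a growable byte buffer.
std::vector<unsigned char> read_all_bytes(std::istream& in)
{
    std::vector<unsigned char> bytes;
    char chunk[4096];
    // read() reports failure on the final short chunk, so gcount() is checked
    // as well to keep the bytes that were still extracted.
    while (in.read(chunk, sizeof chunk) || in.gcount() > 0) {
        bytes.insert(bytes.end(), chunk, chunk + in.gcount());
    }
    return bytes;
}
```

With the whole file in a vector, the pixel loop can index bytes[c], bytes[c + 1], bytes[c + 2] while c + 2 < bytes.size(), so a file that is larger than expected can no longer overrun the heap block.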
Note that you are reading 3 bytes at a time. If the total number of bytes is not a multiple of 3, then only part of the pixelInfo array will be filled with valid data on the last read, which may cause an error in your program. You could try the following piece of untested code.
while(!reader.eof()) //reader is a ifstream open to a BMP file
{
    reader.read(pixelInfo, 3);
    for (int i = 0; i < reader.gcount(); i++) {
        myArray[c+i] = pixelInfo[i];
    }
    c += reader.gcount(); //advance only by the bytes actually read
}
Your code does follow the documentation on cplusplus.com very well: the eof bit is set after an incomplete read, so the loop terminates after your last read. However, as I mentioned before, the likely cause of your issue is that you are copying junk data into the heap array, since pixelInfo[x] is not necessarily set if fewer than 3 bytes were read.
I have a trouble, my server application sends packet 8 bytes length - AABBCC1122334455 but my application receives this packet in two parts AABBCC1122 and 334455, via "recv" function, how can i fix that?
Thanks!
To sum up a little bit:
A TCP connection doesn't operate with packets or messages at the application level; you're dealing with a stream of bytes. From this point of view it's similar to reading from and writing to a file.
Both send and recv can send and receive less data than provided in the argument. You have to deal with it correctly (usually by applying proper loop around the call).
As you're dealing with streams, you have to find the way to convert it to meaningful data in your application. In other words, you have to design serialisation protocol.
From what you've already mentioned, you most probably want to send some kind of messages (well, it's usually what people do). The key thing is to discover the boundaries of messages properly. If your messages are of fixed size, you simply grab the same amount of data from the stream and translate it to your message; otherwise, you need a different approach:
If you can come up with a character which cannot exist in your message, it could be your delimiter. You can then read the stream until you reach the character and it'll be your message. If you transfer ASCII characters (strings) you can use zero as a separator.
If you transfer binary data (raw integers etc.), all byte values can appear in your message, so nothing can act as a delimiter. Probably the most common approach in this case is to use a fixed-size prefix containing the size of your message. The size of this extra field depends on the maximum size of your message (you will probably be safe with 4 bytes, but if you know the maximum size, you can use a smaller field). Your packet would then look like SSSS|PPPPPPPPP... (a stream of bytes), where S is the additional size field and P is your payload (the real message in your application; the number of P bytes is determined by the value of S). You know every packet starts with 4 special bytes (the S bytes), so you can read them as a 32-bit integer. Once you know the size of the encapsulated message, you read all the P bytes. After you're done with one packet, you're ready to read another one from the socket.
Good news though, you can come up with something completely different. All you need to know is how to deserialise your message from a stream of bytes and how send/recv behave. Good luck!
EDIT:
Example of function receiving arbitrary number of bytes into array:
bool recv_full(int sock, char *buffer, size_t size)
{
size_t received = 0;
while (received < size)
{
ssize_t r = recv(sock, buffer + received, size - received, 0);
if (r <= 0) break;
received += r;
}
return received == size;
}
And an example of receiving a packet with a 2-byte prefix defining the size of the payload (the payload is then limited to 65,535 bytes; the prefix is read in the host's byte order, so both ends must agree on it):
uint16_t msgSize = 0;
char msg[0xffff];
if (recv_full(sock, reinterpret_cast<char *>(&msgSize), sizeof(msgSize)) &&
recv_full(sock, msg, msgSize))
{
// Got the message in msg array
}
else
{
// Something bad happened to the connection
}
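For the delimiter-based variant described earlier in this answer, one possible sketch keeps a pending buffer between recv calls and peels off every complete message. The name extract_messages and the use of std::string as the buffer are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: moves every complete delim-terminated message out of
// `pending` and leaves any trailing partial message there for the next recv().
std::vector<std::string> extract_messages(std::string& pending, char delim)
{
    std::vector<std::string> messages;
    std::size_t start = 0;
    for (std::size_t end = pending.find(delim, start);
         end != std::string::npos;
         end = pending.find(delim, start)) {
        messages.push_back(pending.substr(start, end - start));
        start = end + 1;   // skip the delimiter itself
    }
    pending.erase(0, start);   // drop what has been handed out
    return messages;
}
```

After each successful recv of n bytes into buf, the caller would do pending.append(buf, n) and then process whatever extract_messages returns. A message split across two recv calls simply stays in pending until its delimiter arrives.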
That's just how recv() works on most platforms. You have to check the number of bytes you receive and continue calling it in a loop until you get the number that you need.
You "fix" that by reading from TCP socket in a loop until you get enough bytes to make sense to your application.
my server application sends packet 8 bytes length
Not really. Your server sends 8 individual bytes, not a packet 8 bytes long. TCP data is sent over a byte stream, not a packet stream. TCP neither respects nor maintains any "packet" boundary that you might have in mind.
If you know that your data is provided in quanta of N bytes, then call recv in a loop:
std::vector<char> read_packet(int N) {
    std::vector<char> buffer(N);
    int total = 0, count;
    // write each batch after the bytes already received, not past the end of the buffer
    while ( total < N && (count = recv(sock_fd, &buffer[total], N-total, 0)) > 0 )
        total += count;
    return buffer;
}
std::vector<char> packet = read_packet(8);
If your packet is variable length, try sending its length before the data itself:
int read_int() {
std::vector<char> buffer = read_packet(sizeof (int));
int result;
memcpy((void*)&result, (void*)&buffer[0], sizeof(int));
return result;
}
int length = read_int();
std::vector<char> data = read_packet(length);
Okay, so I am trying to be more efficient in my programming by passing multiple strings as one char array[100]. I can pass all this info from my client perfectly fine, but I am now trying to use substr or strchr to extract the values I passed in the char array on the server side.
server will look like this on the other end, with the data passed from the client being stored into memory under vel_rec:
recv(sock, vel_rec, 100, 0);
The program is set up to send some basic numerical data to a threaded server to run computations on and return results. I will most likely be using atoi function to get the integer values back out of the strings and then return a result back to the client. This returned result again will be a character array.
client:
char vel_snd[100], vel_rec[100], char buffer[100];
memset(buffer, 0, 99);
memset(vel_snd, 0, 99);
memset(vel_rec, 0, 99);
recv(sock, buffer, 100, 0);
//are we connected?
cout << "connection status = " << buffer << endl;
string v0, a, time, space = "\n", stop; stop = 'g';
while (stop != "x")
{
cout<<":::Press 'x' then 'enter' at any time to quit:::"<< endl;
cout<<"enter data? "<< endl;
cin>>v0;
stop = v0;
cout<<"some more data? "<<endl;
cin>>a;
stop = a;
cout<<"even more data? "<<endl;
cin>>time;
stop = time;
v0 += space + a + space + time;
strcpy(vel_snd, v0.c_str());
send(sock,vel_snd,strlen(vel_snd), 0);
cout<<"Calculating...\n";
recv(sock, vel_rec, 100, 0);
cout<<"\nThe data in "<<time<<" seconds will be "<<vel_rec<<endl;
}
okay so this is all fine and dandy, I am sure there exist infinitely better solutions to concatenation of the null characters which I am not even sure will work, just left them there for you to pick over if you can use them on the server side of things. I was attempting to use them to break out my original strings, but without success. I just want the server to receive this char array and copy sections of it into strings. I am assuming this is possible, but if not I could always send multiple char arrays through, it just seems to me that they would get jumbled up on the server if not properly flagged and organized as the data comes in. one in - one out, seems very clean to me and preferable honestly. The use of strings may not even be needed here as well, I just thought it would be better than using int values and converting, when i can just pass them as strings to begin with. Either way, I still need to implement some method of breaking out the values sent to the server from the client and the substr doesn't want to work with the char array.
I am also incredibly unfamiliar with anything networking and this is my first attempt at it from some tutorials i have seen. I had a 30 min lecture on the subject, with no further explanation other than "it works". That being said please be constructive in your responses. Hopefully I can learn a bit more than just this, since I plan on implementing something similar in a networked game I am working on. oh and before i forget this is using wsock32.lib.
recap:
[client]
char array1[100] = string 1 + string 2 + string 3
server<--char array1
char array2<--- server
print
[server]
char array1<---client string 1 + string 2 + string 3
int A = string 1
int B = string 2
int C = string 3
<some math...>
<char array 2 = itoa A B C>
client <--- char array2
Thanks a bunch
Shawn
Not sure I understood the question, but a simple technique would be, for example, to newline-terminate everything the client enters and to send the message once you have finished inputting data. When the message is sent, you also send the size of the array to the server. So, for example, on the server side:
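char* temp;
int i=0;
nr=read(sd,buf,sizeof(buf)-1); //leave room for the terminator
if (nr > 0) {
    buf[nr] = '\0'; //strtok needs a null-terminated string
    temp = strtok(buf,"\n");
    //Got the first token within the string.
    //I'm pretending its an int to switch case the server
    if (temp != NULL) { i = atoi(temp); }
    switch(i){
    case 1:
        temp = strtok(NULL,"\n");
        //Get the next token
        printf("CASE 1 with: %s",temp);
        break;
    }
}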
etc...
I have not tested this code and just wrote it off the top of my head, so if there are faults in the syntax I apologize.
Good luck
Suppose you store the received data in std::string received; then you can split it using this function:
std::vector<std::string> string_split(const std::string & str, char delim)
{
std::vector<std::string> elems;
std::stringstream strstream(str);
std::string item;
while(std::getline(strstream, item, delim))
elems.push_back(item);
return elems;
}
then call
std::vector<std::string> messages = string_split(received, '\n');
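Since the server ultimately needs integers, the split and the conversion can also be done in one pass. This is a sketch under assumptions: parse_int_fields is a hypothetical name, and it uses std::stoi rather than atoi so that malformed input is reported instead of silently becoming 0.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses newline-separated integers such as "10\n2\n5".
// Returns false if any field is empty, not a complete integer, or out of range.
bool parse_int_fields(const std::string& received, std::vector<int>& out)
{
    std::istringstream stream(received);
    std::string field;
    out.clear();
    while (std::getline(stream, field, '\n')) {
        try {
            std::size_t used = 0;
            int value = std::stoi(field, &used);
            if (used != field.size())
                return false;   // trailing garbage such as "12abc"
            out.push_back(value);
        } catch (const std::exception&) {
            return false;       // std::invalid_argument or std::out_of_range
        }
    }
    return !out.empty();
}
```

On the server this would be called with std::string(vel_rec, bytes_received), where bytes_received is the value returned by recv, so that only the bytes actually received are parsed.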
EDIT
Btw, you can replace this code
char vel_snd[100], vel_rec[100], char buffer[100];
memset(buffer, 0, 99);
memset(vel_snd, 0, 99);
memset(vel_rec, 0, 99);
with this code
char vel_snd[100] = {0}, vel_rec[100] = {0}, buffer[100] = {0};
Not exactly the same (this zeroes all 100 bytes rather than 99), but good enough when working with char arrays.
well i finally just broke down and sent separate char arrays each containing the necessary values to the multithreaded server, did atoi, did the calculations, stored the int calculated result back using itoa, and sent it back to the clients. Just needed to be smacked in the face with KISS. Tried making a gui to go with it in vs2010 but that was a headache, so switched to Qt, so i will be back with questions on that I am sure. Thanks for all the replies. I tried to vote, but I am too low on reputation.