Use stringstream to read from TCP socket - c++

I am using a socket library (I'd rather keep using it) whose recv operation works with std::string, but it is just a wrapper around a single call of the recv socket function, so it is likely that I only get part of the message I wanted. My first instinct was to go in a loop and append the received string to another string until I get everything, but this seems inefficient. Another possibility was to do the same with a char array, but this seems messy. (I'd have to check the string's size before adding it to the array, and if it overflowed I'd need to store the string somewhere until the array is empty again.)
So I was thinking about using a stringstream. I use a TLV protocol, so I need to first extract two bytes into an unsigned short, then get a certain amount of bytes from the stringstream and then loop again until I reach a delimiter field.
Is there any better way to do this? Am I completely on the wrong track? Are there any best practices? So far I've only ever seen direct use of the socket library with char arrays, so I'm curious why using `std::string` with stringstreams could be a bad idea.
Edit: Replying to the comment below: The library is one we use internally, it's not public (it's nothing special though, mostly just a wrapper around the socket library to add exceptions, etc.).
I should mention that I have a working prototype using the socket library directly.
This works something like:
int lengthFieldSize = sizeof(unsigned short);
int endOfBuffer= 0;//Pointer to last valid position in buffer.
while(true) {
char buffer[RCVBUFSIZE];
while(true) {
int offset= endOfBuffer;
int rs= 0;
rs= recv(sock, buffer+offset, sizeof(buffer)-offset, 0);
endOfBuffer+= rs;
if(rs < 1) {
// Received nothing or error.
break;
} else if(endOfBuffer == RCVBUFSIZE) {
// Buffer full.
break;
} else if(rs > 0 && endOfBuffer > 1) {
unsigned short msglength= 0;
memcpy((char *) &msglength, buffer+endOfBuffer-lengthFieldSize, lengthFieldSize);
if(msglength == 0) {
break; // Received a full transmission.
}
}
}
unsigned int startOfData = 0;
unsigned short protosize= 0;
while(true) {
// Copy first two bytes into protosize (length field)
memcpy((char *) &protosize, buffer+startOfData, lengthFieldSize);
// Is the last length field the delimiter?
// Then reply and return. (We're done.)
// Otherwise: Is the next message not completely in the buffer?
// Then break. (Outer while will take us back to receiving)
if(protosize == 0) {
// Done receiving. Now send:
SendReplyMsg(sock, lengthFieldSize);
// Clean up.
close(sock);
return;
} else if((endOfBuffer-lengthFieldSize-startOfData) < protosize) {
memmove(buffer, buffer+startOfData, RCVBUFSIZE-startOfData);
//Adjust endOfBuffer:
endOfBuffer-=startOfData;
break;
}
startOfData+= lengthFieldSize;
gtControl::gtMsg gtMessage;
if(!gtMessage.ParseFromArray(buffer+startOfData, protosize)) {
cerr << "Failed to parse gtMessage." << endl;
close(sock);
return;
}
// Move position pointer forward by one message (length+pbuf)
startOfData+= protosize;
PrintGtMessage(&gtMessage);
}
}
So basically I have a big loop which contains a receiving loop and a parsing loop. There's a character array being passed back and forth as I can't be sure to have received everything until I actually parse it. I'm trying to replicate this behaviour using "proper" C++ (i.e. std::string)

My first instinct was to go in a loop and append the received string to another string until I get everything, but this seems inefficient.
String concatenation is technically platform dependent, but probably str1 + str2 will require one dynamic allocation and two copies (from str1 and str2). That's sorta slow, but it's far faster than network access! So my first piece of advice would be to go with your first instinct, to find out whether it's both correct and fast enough.
If it's not fast enough, and your profiler shows that the redundant string copies are to blame, consider maintaining a list of strings (std::vector<string*>, perhaps) and joining all the strings together once at the end. This requires some care, but should avoid a bunch of redundant string copying.
But definitely profile first!
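As a minimal sketch of the "append and parse" approach, assuming the framing from the question (a two-byte length field in host byte order, with a zero length acting as the delimiter): the names `TlvScan` and `extractRecords` are made up for illustration and are not part of any library.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Result of scanning the accumulated bytes for complete records.
struct TlvScan {
    std::vector<std::string> payloads; // complete message bodies, in order
    bool sawDelimiter = false;         // true once a zero length field is read
};

// Removes every complete record from the front of `pending` and returns the
// payloads. Bytes of an incomplete trailing record stay in `pending`.
TlvScan extractRecords(std::string& pending) {
    constexpr std::size_t kLenSize = sizeof(unsigned short);
    TlvScan out;
    std::size_t pos = 0;
    while (pending.size() - pos >= kLenSize) {
        unsigned short len = 0;
        // Host byte order, as in the prototype from the question.
        std::memcpy(&len, pending.data() + pos, kLenSize);
        if (len == 0) {                 // delimiter field: transmission done
            out.sawDelimiter = true;
            pos += kLenSize;
            break;
        }
        if (pending.size() - pos - kLenSize < len) {
            break;                      // body not fully received yet
        }
        out.payloads.emplace_back(pending, pos + kLenSize, len);
        pos += kLenSize + len;
    }
    pending.erase(0, pos);              // drop consumed bytes, keep the tail
    return out;
}
```

In the receive loop, each string returned by the library's recv wrapper would be appended to `pending` before calling `extractRecords(pending)`; every returned payload could then be handed to `ParseFromArray` as in the prototype.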

Related

Cannot serve png files and other binary files in hobby HTTP server

I am writing an HTTP server in C++, and serving static files is mostly OK; however, when reading .PNG files or other binaries, every method I have tried fails. My main problem is that when I open Dev tools, an example image shows a transferred size of 29.56 kB and a size of 29.50 kB with my current method. These sizes also do not match what du -sh gives, which is 32 kB.
My first method was to push the contents of a file onto a string, and call a function to serve that. However, this would serve only ~6 kB, if memory serves correctly.
My current method is to read the file using std::ifstream in binary mode. I am getting the size of the file using C++17's filesystem header and std::filesystem::file_size. I read the contents into a buffer and then call a function to send the buffer contents 1 byte at a time:
void WebServer::sendContents(std::string contents) {
if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
throw std::runtime_error("Server accept: " + std::string(strerror(errno)));
}
}
void WebServer::sendFile(std::string path) {
path = "./" + path;
std::string fileCont; //File contents
std::string mimeType; //Mime type of the file
std::string contLength;
std::string::size_type idx = path.rfind('.');
if (idx != std::string::npos) mimeType = this->getMimeType(path.substr(idx + 1));
else mimeType = "text/html";
std::filesystem::path reqPath = std::filesystem::path("./" + path).make_preferred();
std::filesystem::path parentPath = std::filesystem::path("./");
std::filesystem::path actualPath = std::filesystem::canonical(parentPath / reqPath);
if (!this->isSubDir(actualPath, parentPath)) { this->sendRoute("404"); return; }
std::ifstream ifs;
ifs.open(actualPath, std::ios::binary);
if (ifs.is_open()) {
//Get the size of the static file being server
std::filesystem::path staticPath{path};
std::size_t length = std::filesystem::file_size(staticPath);
char* buffer = new char[length];
*buffer = { 0 }; //Initalize the buffer that will send the static file
ifs.read(buffer, sizeof(char) * length); //Read the buffer
std::string resp = "HTTP/1.0 200 OK\r\n"
"Server: webserver-c\r\n"
"Content-Length" + std::to_string(length) + "\r\n"
"Content-type: " + mimeType + "\r\n\r\n";
if (!ifs) std::cout << "Error! Only " << std::string(ifs.gcount()) << " could be read!" << std::endl;
this->sendContents(resp); //Send the headers
for (size_t i=0; i < length; i++) {
std::string byte = std::string(1, buffer[i]);
this->sendContents(byte);
}
delete buffer; //We do not need megs of memory stack up, that shit will grow quick
buffer = nullptr;
} else {
this->sendContents("HTTP/1.1 500 Error\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n"); return;
}
ifs.close();
}
It should be noted that this->newFd is a socket descriptor
It should also be noted that I have tried to take a look at this question here, however the same problem still occurs for me
if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
There are two bugs for the price of one, here.
sendContents gets used, here, to send the contents of the binary file one byte at a time. This is horribly inefficient, but it's not the bug. The first bug is as follows.
Your binary file has plenty of bytes that are 00.
In that case, contents will proudly contain this 00 byte, here. c_str() returns a pointer to it. strlen() then reaches the conclusion that it is receiving an empty string, for input, and makes a grandiose announcement that the string contains 0 characters.
In the end, send's third parameter will be 0.
No bytes will get sent, at all, instead of the famous 00 byte.
The second bug will come into play once the inefficient algorithm gets fixed, and sendContents gets used to send more than one byte at a time.
send() holds a secret: this system call may return values other than the -1 that indicates failure. Such as the actual number of bytes that were sent. So, if send() was called to send, say, 100 bytes, it may decide to send only 30 bytes, return 30, and leave you holding the bag with the remaining 70 unsent bytes.
This is actually, already, an existing bug in the shown code. sendContents() also gets used to send the entire resp string. Which is, eh, in the neighborhood of 100 bytes. Give or take a dozen.
You are relying on this house of cards: on send() always doing its complete job, in this particular case, not slacking off, and actually sending the entire HTTP/1.0 response string.
But, send() is a famous slacker, and you have no guarantees, whatsoever, that this will actually happen. And I have it on good authority that an upcoming Friday the 13th your send() will decide to slack off, all of a sudden.
So, to fix the shown code:
Implement the appropriate logic to handle the return value from send().
Do not use c_str(), followed by strlen(), because: A) it's broken, for strings containing binary data, B) this elaborate routine simply reinvents a wheel called size(). You will be happy to know that size() does exactly what its name claims it does.
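A minimal sketch of both fixes together, under the assumption that the sending step is passed in as a callable so the loop can be tried without a real socket. `sendAll` and `sendSome` are hypothetical names; `sendSome` is expected to behave like send(): it returns the number of bytes it accepted, or -1 on failure.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Keeps calling sendSome(ptr, n) until every byte of `data` has gone out.
template <typename SendSome>
void sendAll(SendSome&& sendSome, const std::string& data) {
    std::size_t sent = 0;
    while (sent < data.size()) {
        // size(), not strlen(): embedded 00 bytes are counted correctly.
        long n = sendSome(data.data() + sent, data.size() - sent);
        if (n <= 0) {
            // -1 is an error; 0 would mean no progress, so stop either way.
            throw std::runtime_error("send failed");
        }
        sent += static_cast<std::size_t>(n); // a short write just loops again
    }
}
```

With a real descriptor this might be called as `sendAll([fd](const char* p, std::size_t n) { return send(fd, p, n, 0); }, contents);`, where `fd` stands in for `this->newFd`.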
One other bug:
char* buffer = new char[length];
It is possible for an exception to get thrown from the subsequent code. This memory then gets leaked, because delete does not get called.
C++ gurus know a weird trick: they rarely use new, but instead use containers, like std::vector, and they don't have to worry about leaking memory, because of that.

Serving HTML as C array from Arduino code - String size limit problem

I've been working on a HTML / websocket server on a Wiznet W5100S-EVB-Pico, programmed in the Arduino IDE. It all worked fine up until now but I'm running into, I think, a string size limit. I guess it is in the way the code handles the const char but I don't know how to do it properly.
I hope someone is willing to help :)
Let me explain:
I convert the index.html to a index_html.h file containing a const char array:
const char c_index_html[] = {
0x3c,0x21,0x44,0x4f,0x43,..., ..., 0x6d,0x6c,0x3e};
In my code I include the index_html.h file:
#include "index_html.h"
Now the code that actually serves the "HTML"
if (web_client){
Serial.println("New client");
// an http request ends with a blank line
bool currentLineIsBlank = true;
while (web_client.connected()){
if (web_client.available()){
char c = web_client.read();
if (c == '\n' && currentLineIsBlank) // if you've gotten to the end of the line (received a newline
{ // character) and the line is blank, the http request has ended,
Serial.println(F("Sending response")); // so you can send a reply
String strData;
strData = c_index_html;
web_client.println(strData);
break;
}
if (c == '\n')
{
// you're starting a new line
currentLineIsBlank = true;
}
else if (c != '\r')
{
// you've gotten a character on the current line
currentLineIsBlank = false;
}
}
}
This is not the prettiest code, it's smashed together from examples and now the main culprit seems to be:
String strData;
strData = c_index_html;
web_client.println(strData);
When I add extra code to the HTML and view the page source, the code is incomplete. I tested reducing the HTML to a minimum and that solves the problem.
So my main question is:
How do I serve the 'const char c_index_html' without use of 'String'?
But also:
How could I prettify the whole 'if (web_client)' serving function?
Thank you very much for making it all the way through this post and if you have a suggestion I would very much appreciate it ;)
Edit: There is a bug in the ethernet library shown in this post.
I don't know if it affects you; you should look at your library implementation.
I'm assuming that web_client is an instance of EthernetClient from the Arduino libraries.
EthernetClient::println is inherited from Print via Stream and is defined in terms of write, which is:
size_t EthernetClient::write(const uint8_t *buf, size_t size)
{
if (_sockindex >= MAX_SOCK_NUM) return 0;
// This library code is not correct:
if (Ethernet.socketSend(_sockindex, buf, size)) return size;
setWriteError();
return 0;
}
So we see that it asks the socket to send the buffer up to some size. The socket can respond with a size or 0 (see edit); if it responds with 0 then there's an error condition to check.
Edit: This is how it's supposed to work. Since write is always returning the requested size and not telling you how much was written, you can't fix your problem using the print/write facilities and need to directly use socketSend.
You're not checking the result of this write (which is supposed to come through println) so you don't know whether the socket sent size bytes, 0 bytes, or some number in between.
In EthernetClient::connect we see that it's opening a TCP stream:
_sockindex = Ethernet.socketBegin(SnMR::TCP, 0);
When you call socketSend you're actually just copying your buffer into a buffer in the network stack. The TCP driver writes out that buffer when it can. If you're writing into that buffer faster than it's being flushed to the network then you'll fill it up and your socketSend calls will start returning < size bytes. See Does send() always send whole buffer?.
So you're probably right that your string is too long. What you need to do is spread your writes out. There are countless tutorials covering this on the web; it's roughly like this in your example:
...
size_t bytesRemaining = 0;
while (web_client.connected()){
if (bytesRemaining > 0) {
// Still responding to the last request
char const* const cursor = c_index_html
+ sizeof(c_index_html)
- bytesRemaining;
size_t const bytesWritten = web_client.write(cursor, bytesRemaining);
if (!bytesWritten) {
// check for error
}
bytesRemaining -= bytesWritten;
if (bytesRemaining == 0) {
// End the message. This might not write!
// We should add the '\n' to the source array so
// it's included in our write-until-finished routine.
web_client.println();
// Stop listening
break;
}
} else if (web_client.available()){
// Time for a new request
char c = web_client.read();
if (c == '\n' && currentLineIsBlank)
{
Serial.println(F("Sending response"));
// Start responding to this request
bytesRemaining = sizeof(c_index_html);
continue;
}
...
This is what I think is going on. I'm not an expert so I might be wrong, but it seems to make sense.
This is not an answer as in "solution" but I found out there is a 2k buffer size limit using the W5100S-EVB-Pico. And indeed, if I keep the HTML below 2k it works. Turns out that I actually got Matt Murphy's suggestion to work but the 2k limit was the problem. It looks like a hardware/library limitation, not completely sure on that.
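If the limit applies to a single write call rather than to the whole response (an assumption, not something confirmed here), one thing that could be tried is sending the array in pieces smaller than the limit. The sketch below is untested on the hardware; `writeChunked` and `writeSome` are hypothetical names, and `writeSome` stands in for a client write call that returns the number of bytes accepted, with 0 meaning an error.

```cpp
#include <cstddef>
#include <cstdint>

// Writes `size` bytes in pieces of at most `maxChunk` bytes and returns the
// total number of bytes the writer accepted.
template <typename WriteSome>
std::size_t writeChunked(WriteSome&& writeSome, const std::uint8_t* data,
                         std::size_t size, std::size_t maxChunk) {
    std::size_t done = 0;
    while (done < size) {
        std::size_t want = size - done;
        if (want > maxChunk) {
            want = maxChunk;          // never hand over more than the cap
        }
        std::size_t got = writeSome(data + done, want);
        if (got == 0) {
            break;                    // error: stop and let the caller decide
        }
        done += got;
    }
    return done;
}
```

In the sketch from the post, `writeSome` could be a lambda forwarding to `web_client.write(p, n)`, with `c_index_html` cast to `const uint8_t*` and a cap such as 1024. Note that if the library's write reports the requested size regardless of what was sent, as described in the edit above, the return value cannot be trusted and this loop will not detect dropped bytes.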
For now I'll shrink my HTML and Javascript to a minimum and compact it even more with for example textfixer.com. I think I might write some python code to do that
Maybe there is a path to a solution in the link below but at this moment I'll try to live with the limitations
Link:
https://github.com/khoih-prog/EthernetWebServer/issues/7

Parse sequences of protobuf messages from contiguous chunks of a fixed-size byte buffer

I've been struggling with this for two days straight, given my limited knowledge of C++. What I need to do is parse sequences of messages from a big file using the protobuf C++ API; the file may contain millions of such messages. Reading straight from the file is easy, as I can always just call "ReadVarint32" to get the size and then call ParseFromCodedStream with the limit pushed on the CodedInputStream, as described in this post. However, the I/O-level API I am working with (libuv, actually) requires a fixed-size buffer to be allocated for every read callback. That block size has nothing to do with the size of the messages I am reading.
This makes my life hard. Every time I read from the file and fill the fixed-size buffer (say 16K), that buffer will probably contain hundreds of complete protobuf messages, but its last chunk will likely be an incomplete message. So I thought: read as many messages as I can, then extract the last chunk and attach it to the beginning of the next 16K buffer I read, and keep going until I reach the end of the file. I use ReadVarint32() to get the size and compare that number with the number of bytes left in the buffer; if the message size is smaller, I keep reading.
There is an API called GetDirectBufferPointer, which I tried to use to record the pointer position before I even read the next message's size. However, I suspect that due to some endianness weirdness, if I just extract the rest of the byte array from where the pointer starts and attach it to the next chunk, Parse won't succeed; in fact the first several bytes (8, I think) are completely messed up.
Alternatively, if I call codedStream.ReadRaw(), write the residual stream into the buffer and then attach it to the head of the new chunk, the data doesn't get corrupted. But this time I lose the "size" byte information, as it has already been read by ReadVarint32! And even if I remember the size I read last time and call message.ParseFromCodedStream() directly in the next iteration, it ends up reading one byte too few, and some parts even get corrupted, so the object cannot be restored.
std::vector<char> mCheckBuffer;
std::vector<char> mResidueBuffer;
char bResidueBuffer[READ_BUFFER_SIZE];
char temp[READ_BUFFER_SIZE];
google::protobuf::uint32 size;
//"in" is the file input stream
while (in.good()) {
in.read(mReadBuffer.data(), READ_BUFFER_SIZE);
mCheckBuffer.clear();
//merge the last remaining chunk that contains incomplete message with
//the new data chunk I got out from buffer. Excuse my terrible C++ foo
std::merge(mResidueBuffer.begin(), mResidueBuffer.end(),
mReadBuffer.begin(), mReadBuffer.end(), std::back_inserter(mCheckBuffer));
//Treat the new merged buffer array as the new CIS
google::protobuf::io::ArrayInputStream ais(&mCheckBuffer[0],
mCheckBuffer.size());
google::protobuf::io::CodedInputStream cis(&ais);
//Record the pointer location on CIS in bResidueBuffer
cis.GetDirectBufferPointer((const void**)&bResidueBuffer,
&bResidueBufSize);
//No size information, probably first time or last iteration
//coincidentally read a complete message out. Otherwise I simply
//skip reading size again as I've already populated that from last
//iteration when I got an incomplete message
if(size == 0) {
cis.ReadVarint32(&size);
}
//Have to read this again to get remaining buffer size
cis.GetDirectBufferPointer((const void**)&temp, &mResidueBufSize);
//Compare the next message size with how much left in the buffer, if
//message size is smaller, I know I can read at least one more message
//out, keep reading until I run out of buffer, or, it's the end of message
//and my buffer just allocated larger so size should be 0
while (size <= mResidueBufSize && size != 0) {
//If this cis I constructed didn't have the size info at the beginning,
//and I just read straight from it hoping to get the message out from
//the "size" I got from last iteration, it simply doesn't work
//(read one less byte in fact, and some part of the message corrupted)
//push the size constraint to the input stream;
int limit = cis.PushLimit(size);
//parse message from the input stream
message.ParseFromCodedStream(&cis);
cis.PopLimit(limit);
google::protobuf::TextFormat::PrintToString(message, &str);
printf("%s", str.c_str());
//do something with the parsed object
//Now I have to record the new pointer location again
cis.GetDirectBufferPointer((const void**)&bResidueBuffer,
&bResidueBufSize);
//Read another time the next message's size and go back to while loop check
cis.ReadVarint32(&size);
}
//If I do the next line, bResidueBuffer will have the correct CIS information
//copied over, but not having the "already read" size info
cis.ReadRaw(bResidueBuffer, bResidueBufSize);
mResidueBuffer.clear();
//I am constructing a new vector that receives the residual chunk of the
//current buffer that isn't enough to restore a message
//If I don't do ReadRaw, this copy completely messes up at least the first 8
//bytes of the copied buffer's value, due to I suspect endianness
mResidueBuffer.insert(mResidueBuffer.end(), &bResidueBuffer[0],
&bResidueBuffer[bResidueBufSize]);
}
I'm really out of idea now. Is it even possible to gracefully use protobuf with APIs that requires fixed-sized intermediate buffer at all? Any inputs very much appreciated, thanks!
I see two major problems with your code:
std::merge(mResidueBuffer.begin(), mResidueBuffer.end(),
mReadBuffer.begin(), mReadBuffer.end(), std::back_inserter(mCheckBuffer));
It looks like you are expecting std::merge to concatenate your buffers, but in fact this function performs a merge of two sorted arrays into a single sorted array in the sense of MergeSort. This doesn't make any sense in this context; mCheckBuffer will end up containing nonsense.
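To make the difference concrete, here is a small sketch contrasting the two operations on the same inputs. The helper names `concatBuffers` and `mergeBuffers` are made up for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Appends `chunk` after `residue`, preserving byte order. This is what a
// "carry the leftover bytes into the next buffer" step needs.
std::vector<char> concatBuffers(const std::vector<char>& residue,
                                const std::vector<char>& chunk) {
    std::vector<char> joined;
    joined.reserve(residue.size() + chunk.size());
    joined.insert(joined.end(), residue.begin(), residue.end());
    joined.insert(joined.end(), chunk.begin(), chunk.end());
    return joined;
}

// What std::merge does instead: it interleaves two sorted ranges into one
// sorted range, so the original byte order is not kept.
std::vector<char> mergeBuffers(const std::vector<char>& a,
                               const std::vector<char>& b) {
    std::vector<char> out;
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(out));
    return out;
}
```

For a residue of `{'c','d'}` and a chunk of `{'a','b'}`, `concatBuffers` yields `c d a b`, while `mergeBuffers` yields `a b c d`.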
cis.GetDirectBufferPointer((const void**)&bResidueBuffer,
&bResidueBufSize);
Here you are casting &bResidueBuffer to an incompatible pointer type. bResidueBuffer is a char array, so &bResidueBuffer is a pointer to a char array, which is not a pointer to a pointer. This is admittedly confusing because arrays can be implicitly converted to pointers (where the pointer points to the first element of the array), but this is actually a conversion -- bResidueBuffer is itself not a pointer, it can just be converted to one.
I think you're also misunderstanding what GetDirectBufferPointer() does. It looks like you want it to copy the rest of the buffer into bResidueBuffer, but the method never copies any data. The method gives you back a pointer that points into the original buffer.
The correct way to call it is something like:
const void* ptr;
int size;
cis.GetDirectBufferPointer(&ptr, &size);
Now ptr will point into the original buffer. You could now compare this against a pointer to the beginning of the buffer to find out where you are in the stream, like:
size_t pos = (const char*)ptr - &mCheckBuffer[0];
But, you shouldn't do that, because CodedInputStream already has the method CurrentPosition() for exactly this purpose. That will return the current byte offset in the buffer. So, use that instead.
Okay, thanks to Kenton's help in pointing out the major issues in my question, I have now revised the code and tested that it works. I will post my solution here. With that said, I am not happy about all the complexity and edge-case checks I needed; I think it's error-prone. Even with this, what I will probably do for real is move my direct "read from stream" blocking call into another thread outside my libuv main thread, so I am not forced to use the libuv API. But for the sake of completeness, here's my code:
std::vector<char> mCheckBuffer;
std::vector<char> mResidueBuffer;
std::vector<char> mReadBuffer(READ_BUFFER_SIZE);
google::protobuf::uint32 size = 0; //must start at 0 so the first iteration reads a size
//"in" is the file input stream
while (in.good()) {
//This part is tricky as you're not guaranteed that what end up in
//mReadBuffer is everything you read out from the file. The same
//happens with libuv's assigned buffer, after EOF, what's rest in
//the buffer could be anything
in.read(mReadBuffer.data(), READ_BUFFER_SIZE);
//merge the last remaining chunk that contains incomplete message with
//the new data chunk I got out from buffer. I couldn't find a more
//efficient way doing that
mCheckBuffer.clear();
mCheckBuffer.reserve(mResidueBuffer.size() + mReadBuffer.size());
mCheckBuffer.insert(mCheckBuffer.end(), mResidueBuffer.begin(),
mResidueBuffer.end());
mCheckBuffer.insert(mCheckBuffer.end(), mReadBuffer.begin(),
mReadBuffer.end());
//Treat the new merged buffer array as the new CIS
google::protobuf::io::ArrayInputStream ais(&mCheckBuffer[0],
mCheckBuffer.size());
google::protobuf::io::CodedInputStream cis(&ais);
//No size information, probably first time or last iteration
//coincidentally read a complete message out. Otherwise I simply
//skip reading size again as I've already populated that from last
//iteration when I got an incomplete message
if(size == 0) {
cis.ReadVarint32(&size);
}
bResidueBufSize = mCheckBuffer.size() - cis.CurrentPosition();
//Compare the next message size with how much left in the buffer, if
//message size is smaller, I know I can read at least one more message
//out, keep reading until I run out of buffer. If, it's the end of message
//and size (next byte I read from stream) happens to be 0, that
//will trip me up, cos when I push size 0 into PushLimit and then try
//parsing, it will actually return true even if it reads nothing.
//So I can get into an infinite loop, if I don't do the check here
while (size <= bResidueBufSize && size != 0) {
//If this cis I constructed didn't have the size info at the
//beginning, and I just read straight from it hoping to get the
//message out from the "size" I got from last iteration
//push the size constraint to the input stream
int limit = cis.PushLimit(size);
//parse the message from the input stream
bool result = message.ParseFromCodedStream(&cis);
//Parse fail, it could be because last iteration already took care
//of the last message and that size I read last time is just junk
//I choose to only check EOF here when result is not true, (which
//leads me to having to check for size=0 case above), cos it will
//be too many checks if I check it everytime I finish reading a
//message out
if(!result) {
if(in.eof()) {
log.info("Reached EOF, stop processing!");
break;
}
else {
log.error("Read error or input mal-formatted! Log error!");
exit(1);
}
}
cis.PopLimit(limit);
google::protobuf::TextFormat::PrintToString(message, &str);
//Do something with the message
//This is when the last message read out exactly reach the end of
//the buffer and there is no size information available on the
//stream any more, in which case size will need to be reset to zero
//so that the beginning of next iteration will read size info first
if(!cis.ReadVarint32(&size)) {
size = 0;
}
bResidueBufSize = mCheckBuffer.size() - cis.CurrentPosition();
}
if(in.eof()) {
break;
}
//Now I am copying the residual buffer into the intermediate
//mResidueBuffer, which will be merged with newly read data in next iteration
mResidueBuffer.clear();
mResidueBuffer.reserve(bResidueBufSize);
mResidueBuffer.insert(mResidueBuffer.end(),
mCheckBuffer.begin() + cis.CurrentPosition(), mCheckBuffer.end());
}
if(!in.eof()) {
log.error("Something else other than EOF happened to the file, log error!");
exit(1);
}
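For comparison, the carry-over logic can also be sketched without CodedInputStream by decoding the varint length prefix by hand and keeping the prefix together with the incomplete tail, so no size has to be remembered between reads. This is a different technique from the code above, not a drop-in replacement; `readVarint32` and `feed` are hypothetical helpers, and a length prefix longer than five bytes is simply treated as incomplete here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decodes a base-128 varint starting at data[pos]. On success stores the
// value, advances pos and returns true. Returns false and leaves pos
// untouched if the buffer ends before the varint does.
bool readVarint32(const std::string& data, std::size_t& pos,
                  std::uint32_t& value) {
    std::uint32_t result = 0;
    std::size_t p = pos;
    for (int shift = 0; shift < 35 && p < data.size(); shift += 7) {
        std::uint8_t byte = static_cast<std::uint8_t>(data[p++]);
        result |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {       // high bit clear: last varint byte
            value = result;
            pos = p;
            return true;
        }
    }
    return false;
}

// Appends `chunk` to `pending`, then moves every complete length-prefixed
// message body into the result. An incomplete tail, including its length
// prefix, stays in `pending` for the next call.
std::vector<std::string> feed(std::string& pending, const char* chunk,
                              std::size_t n) {
    pending.append(chunk, n);
    std::vector<std::string> bodies;
    std::size_t pos = 0;
    for (;;) {
        std::size_t next = pos;
        std::uint32_t len = 0;
        if (!readVarint32(pending, next, len)) break; // prefix incomplete
        if (pending.size() - next < len) break;       // body incomplete
        bodies.emplace_back(pending, next, len);
        pos = next + len;
    }
    pending.erase(0, pos);
    return bodies;
}
```

Each returned body could then be passed to `message.ParseFromArray(body.data(), body.size())`. Only the bytes actually read in each callback should be fed in, not the whole fixed-size buffer.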

Winsock - read integer from Java client in C++

I have a client-server application, with the server part written in C++ (Winsock) and the client part in Java.
When sending data from the client, I first send its length followed by the actual data. For sending the length, this is the code:
clientSender.print(text.length());
where clientSender is of type PrintWriter.
On the server side, the code that reads this is
int iDataLength;
if(recv(client, (char *)&iDataLength, sizeof(iDataLength), 0) != SOCKET_ERROR)
//do something
I tried printing the value of iDataLength within the if and it always turns out to be some random large integer. If I change iDataLength's type to char, I get the correct value. However, the actual value could well exceed a char's capacity.
What is the correct way to read an integer passed over a socket in C++ ?
I think the problem is that PrintWriter is writing text and you are trying to read a binary number.
Here is what PrintWriter does with the integer it sends:
http://docs.oracle.com/javase/7/docs/api/java/io/PrintWriter.html#print%28int%29
Prints an integer. The string produced by String.valueOf(int) is
translated into bytes according to the platform's default character
encoding, and these bytes are written in exactly the manner of the
write(int) method.
Try something like this:
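#include <sys/socket.h> // POSIX header; with Winsock include <winsock2.h> instead
#include <cerrno> // for errno
#include <cstring> // for std::strerror()
#include <iostream> // for std::cerr
#include <string> // for std::string and std::stoi()
// ... stuff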
char buf[1024]; // buffer to receive text
int len;
if((len = recv(client, buf, sizeof(buf), 0)) == -1)
{
std::cerr << "ERROR: " << std::strerror(errno) << std::endl;
return 1;
}
std::string s(buf, len);
int iDataLength = std::stoi(s); // convert text back to integer
// use iDataLength here (after sanity checks)
Are you sure the endianness is not the issue? (Maybe Java encodes it as big endian and you read it as little endian).
Besides, you might need to implement a recvall function (similar to sendall - as here), to make sure you receive the exact number of bytes specified, because recv may receive fewer bytes than it was told to.
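If the client were changed to send the length as four binary bytes in big-endian order (for example with Java's DataOutputStream.writeInt, which writes the high byte first), the receiving side could decode it independently of the host's byte order. This is only a sketch of that alternative; with PrintWriter.print as in the question, the bytes on the wire are text and this does not apply. `decodeBigEndian32` is a made-up helper name.

```cpp
#include <cstdint>

// Interprets four bytes as a big-endian (network byte order) 32-bit
// integer, whatever the byte order of the machine running this code.
std::uint32_t decodeBigEndian32(const unsigned char* bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
           static_cast<std::uint32_t>(bytes[3]);
}
```

The four bytes would have to be collected first, looping over recv until all of them have arrived.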
You have a confusion between numeric values and their ASCII representation.
When in Java you write clientSender.print(text.length()); you are actually writing an ascii string - if length is 15, you will send characters 1 (code ASCII 0x31) and 5 (code ASCII 0x35)
So you must either :
send a binary length in a portable way (in C or C++ you have htonl and ntohl, but I'm unsure about Java)
add a separator (newline) after the textual length from Java side and decode that in C++ :
char buffer[1024]; // a size big enough to read the packet
int iDataLength, l;
// receive the text into buffer, leaving room for the terminating 0
l = recv(client, buffer, sizeof(buffer) - 1, 0);
if (l != SOCKET_ERROR && l > 0) {
buffer[l] = 0;
char *ptr = strchr(buffer, '\n');
// sscanf returns the number of fields converted, not the value itself
if (ptr == NULL || sscanf(buffer, "%d", &iDataLength) != 1) {
// should never happen : peer does not respect protocol
...
}
ptr += 1; // ptr now points after the length
//do something
}
Java part should be : clientSender.println(text.length());
EDIT :
From Remy Lebeau's comment, There is no 1-to-1 relationship between sends and reads in TCP. recv() can and does return arbitrary amounts of data, so you cannot assume that a single recv() will read the entire line of text.
Above code should not do a simple recv but be ready to concatenate multiple reads to find the separator (left as exercise for the reader :-) )
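One way that exercise could look, as a sketch: keep a string of pending bytes, append whatever each recv returns, and only parse once the newline has arrived. `takeLength` is a hypothetical helper, not part of Winsock.

```cpp
#include <cstddef>
#include <string>

// Appends newly received bytes to `pending`. Once a full line is present,
// parses the decimal length before the '\n', removes that line from
// `pending` and returns true. Returns false while the line is incomplete.
// std::stoi throws std::invalid_argument or std::out_of_range if the peer
// sent something that is not a valid number.
bool takeLength(std::string& pending, const char* data, std::size_t n,
                int& length) {
    pending.append(data, n);
    std::size_t eol = pending.find('\n');
    if (eol == std::string::npos) {
        return false;                   // separator not received yet
    }
    // A trailing '\r' (from a "\r\n" line ending) is ignored by std::stoi.
    length = std::stoi(pending.substr(0, eol));
    pending.erase(0, eol + 1);          // keep any bytes after the line
    return true;
}
```

Whatever remains in `pending` after a successful call is the beginning of the actual data, which still has to be completed with further recv calls until `length` bytes are available.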

Concatenating strings into own protocol

I'm writing network programs using socket.h for my studies. I have written simple server and client programs that can transfer files between them using a buffer size given by the user.
Server
void transfer(string name)
{
char *data_to_send;
ifstream myFile;
myFile.open(name.c_str(),ios::binary);
if(myFile.is_open))
{
while(myFile.eof))
{
data_to_send = new char [buffer_size];
myFile.read(data_to_send, buffer_size);
send(data_to_send,buffer_size);
delete [] data_to_send;
}
myFile.close();
send("03endtransmission",buffer_size);
}
else
{
send("03error",buffer_size);
}
}
Client
void download(string name)
{
char *received_data;
fstream myFile;
myFile.open(name.c_str(),ios::out|ios::binary);
if(myFile.is_open())
{
while(1)
{
received_data = new char[rozmiar_bufora];
if((receivedB = recv(sockfd, received_data, buffer_size,0)) == -1) {
perror("recv");
close(sockfd);
exit(1);
}
if(strcmp(received_data,"03endoftransmission") == 0)
{
cout<<"End of transmission"<<endl;
break;
}
else if (strcmp(received_data,"03error") == 0)
{
cout<<"Error"<<endl;
break;
}
myFile.write(received_data,buffer_size);
}
myFile.close();
}
The problem occurs, when I want to implement my own protocol- two chars (control), 32 chars hash, and the rest of package is data. I tried few times to split it and I end up with this code:
Server
#define PAYLOAD 34
void transfer(string name)
{
char hash[] = "12345678901234567890123456789012"; //32 chars
char *data_to_send;
ifstream myFile;
myFile.open(name.c_str(),ios::binary);
if(myFile.is_open))
{
while(myFile.eof))
{
data_to_send = new char [buffer_size-PAYLOAD];
myFile.read(data_to_send, buffer_size-PAYLOAD);
concatenation = new char[buffer_size];
strcpy(concatenation,"02");
strcat(concatenation,hash);
strcat(concatenation,data_to_send);
send(concatenation,buffer_size);
delete [] data_to_send;
delete [] concatenation;
}
myFile.close();
send("03endtransmission",buffer_size);
}
else
{
send("03error",buffer_size);
}
}
Client
void download(string name)
{
char *received_data;
fstream myFile;
myFile.open(name.c_str(),ios::out|ios::binary);
if(myFile.is_open())
{
while(1)
{
received_data = new char[buffer_size];
if((receivedB = recv(sockfd, received_data, buffer_size,0)) == -1) {
perror("recv");
close(sockfd);
exit(1);
}
if(strcmp(received_data,"03endoftransmission") == 0)
{
cout<<"End of transmission"<<endl;
break;
}
else if (strcmp(received_data,"03error") == 0)
{
cout<<"Error"<<endl;
break;
}
control = new char[3];
strcpy(control,"");
strncpy(control, received_data,2);
control[2]='\0';
hash = new char[33];
strcpy(hash,"");
strncpy(hash,received_data+2,32);
hash[32]='\0';
data = new char[buffer_size-PAYLOAD+1];
strcpy(data,"");
strncpy(data,received_data+34,buffer_size-PAYLOAD);
myFile.write(data,buffer_size-PAYLOAD);
}
myFile.close();
}
But this one inputs to file some ^# instead of real data. Displaying "data" to console looks the same on server and client. If you know how I can split it up, I would be very grateful.
You have some issues which may or may not be your problem.
(1) send/recv can return less than you requested. You may ask to receive 30 bytes but only get 10 on the recv call so all of these have to be coded in loops and buffered somewhere until you actually get the number you wanted. Your first set of programs was lucky to work in this regard and probably only because you tested on a limited amount of data. Once you start to push through more data your assumptions on what you are reading (and comparing) will fail.
(2) There is no need to keep allocating char buffers in the loops; allocate them before the loop or just use a local buffer rather than the heap. What you are doing is inefficient and in the second program you have memory leaks because you don't delete them.
(3) You can get rid of the strcpy/strncpy statements and just use memmove(), which copies an explicit number of bytes instead of stopping at the first zero byte.
Your specific problem is not jumping out at me, but maybe this will push you in the right direction. More information about what is being transmitted properly, and exactly where in the data you are seeing problems, would be helpful.
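As an illustration of point (3), here is a minimal sketch that assembles and splits a packet with explicit lengths instead of the str* functions, assuming the layout from the question (2 control bytes, 32 hash bytes, then data). `buildPacket` and `packetData` are made-up names.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t kControlSize = 2;
constexpr std::size_t kHashSize = 32;
constexpr std::size_t kHeaderSize = kControlSize + kHashSize; // 34

// Lays out control + hash + data in one buffer. Lengths are explicit, so
// zero bytes inside `data` are copied like any other byte.
std::vector<char> buildPacket(const char* control, const char* hash,
                              const char* data, std::size_t dataLen) {
    std::vector<char> packet(kHeaderSize + dataLen);
    std::memcpy(packet.data(), control, kControlSize);
    std::memcpy(packet.data() + kControlSize, hash, kHashSize);
    if (dataLen > 0) {
        std::memcpy(packet.data() + kHeaderSize, data, dataLen);
    }
    return packet;
}

// Returns the data portion of a received packet of `packetLen` bytes, or an
// empty vector if the packet is shorter than the header.
std::vector<char> packetData(const char* packet, std::size_t packetLen) {
    if (packetLen < kHeaderSize) {
        return {};
    }
    return std::vector<char>(packet + kHeaderSize, packet + packetLen);
}
```

On the sending side, `dataLen` would be the number of bytes the file read actually produced (`myFile.gcount()`), and on the receiving side the file write would use the returned vector's `size()` rather than a fixed buffer size.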
But this one inputs to file some ^# instead of real data. Displaying
"data" to console looks the same on server and client. If you know how
I can split it up, I would be very grateful.
You say that the data (I presume the complete file rather than the '^#') is the same on both client and server? If this is the case, then your issue is likely writing the data to file, rather than the actual transmission of the data itself.
If this is the case, you'll probably want to check your assumptions about how the program writes to the file: for example, are you passing text data or binary data to be written? If you're writing binary data through functions that expect a NUL-terminated string, chances are they will stop early, treating a valid zero byte in the data as the terminator.
If it's text mode, you might want to consider initialising all strings with memset to a default character (other than NUL) to see whether garbage data is being output.
If both server and client display the '^#' (or whatever data), note that binary char data is incompatible with the strcpy/strcat functions, as these rely on NUL termination (whereas binary data is delimited by an explicit size instead).
I can't track down the specific problem, but maybe this might offer an insight or two that helps.