I am writing an HTTP server in C++, and serving static files mostly works; however, when reading .PNG files or other binaries, every method I have tried fails. My main problem is that when I open up Dev Tools, requesting an example image gives a transferred size of 29.56 kB, and a size of 29.50 kB, with my current method. The sizes given also do not match what du -sh gives, which is 32 kB.
My first method was to push the contents of the file into a string and call a function to serve that. However, this would only serve ~6 kB, if memory serves correctly.
My current method is to read the file using std::ifstream in binary mode. I am getting the size of the file using C++17's filesystem header and std::filesystem::file_size. I read the contents into a buffer and then call a function to send the buffer contents one byte at a time.
void WebServer::sendContents(std::string contents) {
    if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
        throw std::runtime_error("Server accept: " + std::string(strerror(errno)));
    }
}

void WebServer::sendFile(std::string path) {
    path = "./" + path;
    std::string fileCont;  //File contents
    std::string mimeType;  //Mime type of the file
    std::string contLength;
    std::string::size_type idx = path.rfind('.');
    if (idx != std::string::npos) mimeType = this->getMimeType(path.substr(idx + 1));
    else mimeType = "text/html";
    std::filesystem::path reqPath = std::filesystem::path("./" + path).make_preferred();
    std::filesystem::path parentPath = std::filesystem::path("./");
    std::filesystem::path actualPath = std::filesystem::canonical(parentPath / reqPath);
    if (!this->isSubDir(actualPath, parentPath)) { this->sendRoute("404"); return; }
    std::ifstream ifs;
    ifs.open(actualPath, std::ios::binary);
    if (ifs.is_open()) {
        //Get the size of the static file being served
        std::filesystem::path staticPath{path};
        std::size_t length = std::filesystem::file_size(staticPath);
        char* buffer = new char[length];
        *buffer = { 0 }; //Initialize the buffer that will send the static file
        ifs.read(buffer, sizeof(char) * length); //Read the buffer
        std::string resp = "HTTP/1.0 200 OK\r\n"
                           "Server: webserver-c\r\n"
                           "Content-Length" + std::to_string(length) + "\r\n"
                           "Content-type: " + mimeType + "\r\n\r\n";
        if (!ifs) std::cout << "Error! Only " << std::string(ifs.gcount()) << " could be read!" << std::endl;
        this->sendContents(resp); //Send the headers
        for (size_t i = 0; i < length; i++) {
            std::string byte = std::string(1, buffer[i]);
            this->sendContents(byte);
        }
        delete buffer; //We do not need megs of memory stacking up, that will grow quickly
        buffer = nullptr;
    } else {
        this->sendContents("HTTP/1.1 500 Error\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n");
        return;
    }
    ifs.close();
}
It should be noted that this->newFd is a socket descriptor.
It should also be noted that I have tried to take a look at this question here; however, the same problem still occurs for me.
if (send(this->newFd, contents.c_str(), strlen(contents.c_str()), 0) == -1) {
There are two bugs for the price of one, here.
This is used to send the contents of the binary file, one byte at a time: sendContents gets called, apparently, once per byte. That is horribly inefficient, but it's not the bug. The first bug is as follows.
Your binary file has plenty of bytes that are 00.
In that case, contents will proudly contain this 00 byte, here. c_str() returns a pointer to it. strlen() then reaches the conclusion that it is receiving an empty string for input, and makes a grandiose announcement that the string contains 0 characters.
In the end, send's third parameter will be 0.
No bytes will get sent, at all, instead of the famous 00 byte.
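A tiny standalone demonstration of that first bug (an illustration only, not the question's code):

#include <cstring>
#include <iostream>
#include <string>

int main() {
    std::string contents("\0", 1);  // one 00 byte, like a byte taken from a PNG
    std::cout << contents.size() << "\n";                // prints 1: the real length
    std::cout << std::strlen(contents.c_str()) << "\n";  // prints 0: strlen stops at the 00 byte
}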
The second bug will come into play once the inefficient algorithm gets fixed, and sendContents gets used to send more than one byte at a time.
send() holds a secret: this system call may return values other than -1 (which indicates failure), such as the actual number of bytes that were sent. So, if send() was called to send, say, 100 bytes, it may decide to send only 30 bytes, return 30, and leave you holding the bag with the remaining 70 unsent bytes.
This is actually, already, an existing bug in the shown code. sendContents() also gets used to send the entire resp string, which is, eh, in the neighborhood of 100 bytes, give or take a dozen.
You are relying on a house of cards: on send() always doing its complete job in this particular case, not slacking off, and actually sending the entire HTTP/1.0 response string.
But send() is a famous slacker, and you have no guarantees whatsoever that this will actually happen. And I have it on good authority that on some upcoming Friday the 13th your send() will decide to slack off, all of a sudden.
So, to fix the shown code:
Implement the appropriate logic to handle the return value from send().
Do not use c_str() followed by strlen(), because: A) it's broken for strings containing binary data, B) this elaborate routine simply reinvents a wheel called size(). You will be happy to know that size() does exactly what its name claims.
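For illustration, here is one possible shape of a corrected sendContents (a sketch only, reusing the question's newFd member and exception-on-error style; it assumes the usual headers such as <sys/socket.h>, <cerrno>, <cstring>, <stdexcept>):

void WebServer::sendContents(const std::string &contents) {
    const char *data = contents.data();
    std::size_t remaining = contents.size();  // size() is correct even for binary data
    while (remaining > 0) {
        ssize_t sent = send(this->newFd, data, remaining, 0);
        if (sent == -1) {
            throw std::runtime_error("send: " + std::string(strerror(errno)));
        }
        // send() may have accepted fewer bytes than requested; advance and retry
        data += sent;
        remaining -= static_cast<std::size_t>(sent);
    }
}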
One other bug:
char* buffer = new char[length];
It is possible for an exception to get thrown from the subsequent code. This memory gets leaked, because delete never gets called.
C++ gurus know a weird trick: they rarely use new, but instead use containers, like std::vector, and they don't have to worry about leaking memory, because of that.
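For instance, the new[]/delete dance in the question could be replaced by something like this (a sketch using the question's ifs and length, plus #include <vector>):

std::vector<char> buffer(length);  // freed automatically, even if an exception is thrown
ifs.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
// build the string from pointer + size, so embedded 00 bytes survive
this->sendContents(std::string(buffer.data(), buffer.size()));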
Related
I've been working on an HTML/websocket server on a Wiznet W5100S-EVB-Pico, programmed in the Arduino IDE. It all worked fine up until now, but I'm running into, I think, a string size limit. I guess it is in the way the code handles the const char array, but I don't know how to do it properly.
I hope someone is willing to help :)
Let me explain:
I convert the index.html to an index_html.h file containing a const char array:
const char c_index_html[] = {
0x3c,0x21,0x44,0x4f,0x43,..., ..., 0x6d,0x6c,0x3e};
In my code I include the index_html.h file:
#include "index_html.h"
Now the code that actually serves the "HTML"
if (web_client){
    Serial.println("New client");
    // an http request ends with a blank line
    bool currentLineIsBlank = true;
    while (web_client.connected()){
        if (web_client.available()){
            char c = web_client.read();
            // if you've gotten to the end of the line (received a newline character)
            // and the line is blank, the http request has ended, so you can send a reply
            if (c == '\n' && currentLineIsBlank)
            {
                Serial.println(F("Sending response"));
                String strData;
                strData = c_index_html;
                web_client.println(strData);
                break;
            }
            if (c == '\n')
            {
                // you're starting a new line
                currentLineIsBlank = true;
            }
            else if (c != '\r')
            {
                // you've gotten a character on the current line
                currentLineIsBlank = false;
            }
        }
    }
This is not the prettiest code; it's smashed together from examples, and the main culprit now seems to be:
String strData;
strData = c_index_html;
web_client.println(strData);
When I add extra code to the HTML and view the page source, the code is incomplete. I tested reducing the HTML to a minimum and that solves the problem.
So my main question is:
How do I serve the 'const char c_index_html' without use of 'String'?
But also:
How could I prettify the whole 'if (web_client)' serving function?
Thank you very much for making it all the way through this post and if you have a suggestion I would very much appreciate it ;)
Edit: There is a bug in the ethernet library shown in this post.
I don't know if it affects you; you should look at your library implementation.
I'm assuming that web_client is an instance of EthernetClient from the Arduino libraries.
EthernetClient::println is inherited from Print via Stream and is defined in terms of write, which is:
size_t EthernetClient::write(const uint8_t *buf, size_t size)
{
    if (_sockindex >= MAX_SOCK_NUM) return 0;
    // This library code is not correct:
    if (Ethernet.socketSend(_sockindex, buf, size)) return size;
    setWriteError();
    return 0;
}
So we see that it asks the socket to send the buffer up to some size. The socket can respond with a size or 0 (see edit); if it responds with 0 then there's an error condition to check.
Edit: This is how it's supposed to work. Since write is always returning the requested size and not telling you how much was written, you can't fix your problem using the print/write facilities and need to directly use socketSend.
You're not checking the result of this write (which is supposed to come through println) so you don't know whether the socket sent size bytes, 0 bytes, or some number in between.
In EthernetClient::connect we see that it's opening a TCP stream:
_sockindex = Ethernet.socketBegin(SnMR::TCP, 0);
When you call socketSend you're actually just copying your buffer into a buffer in the network stack. The TCP driver writes out that buffer when it can. If you're writing into that buffer faster than it's being flushed to the network then you'll fill it up and your socketSend calls will start returning < size bytes. See Does send() always send whole buffer?.
So you're probably right that your string is too long. What you need to do is spread your writes out. There are countless tutorials covering this on the web; it's roughly like this in your example:
...
size_t bytesRemaining = 0;
while (web_client.connected()){
    if (bytesRemaining > 0) {
        // Still responding to the last request
        char const* const cursor = c_index_html
                                   + sizeof(c_index_html)
                                   - bytesRemaining;
        size_t const bytesWritten = web_client.write(cursor, bytesRemaining);
        if (!bytesWritten) {
            // check for error
        }
        bytesRemaining -= bytesWritten;
        if (bytesRemaining == 0) {
            // End the message. This might not write!
            // We should add the '\n' to the source array so
            // it's included in our write-until-finished routine.
            web_client.println();
            // Stop listening
            break;
        }
    } else if (web_client.available()){
        // Time for a new request
        char c = web_client.read();
        if (c == '\n' && currentLineIsBlank)
        {
            Serial.println(F("Sending response"));
            // Start responding to this request
            bytesRemaining = sizeof(c_index_html);
            continue;
        }
...
This is what I think is going on. I'm not an expert so I might be wrong, but it seems to make sense.
This is not an answer as in "solution", but I found out there is a 2 kB buffer size limit when using the W5100S-EVB-Pico. And indeed, if I keep the HTML below 2 kB, it works. It turns out that I actually got Matt Murphy's suggestion to work, but the 2 kB limit was the problem. It looks like a hardware/library limitation; I'm not completely sure about that.
For now I'll shrink my HTML and Javascript to a minimum and compact it even more with, for example, textfixer.com. I think I might write some Python code to do that.
Maybe there is a path to a solution in the link below, but for the moment I'll try to live with the limitations.
Link:
https://github.com/khoih-prog/EthernetWebServer/issues/7
I am trying to send a PNG file from C++ over stdout to NodeJS. However, when I send it, it sometimes seems to get cut in half when I read it in NodeJS, even though I only flush after I have sent the whole PNG in C++. What causes this behaviour?
My code to send the image:
void SendImage(Mat image)
{   //from: https://stackoverflow.com/questions/41637438/opencv-imencode-buffer-exception
    std::vector<uchar> buffer;
#define MB image_size.width*image_size.height
    buffer.resize(200 * MB);
    cv::imencode(".png", image, buffer);
    printf("image ");
    for(int i = 0; i < buffer.size(); i++)
        printf("%c", buffer[i]);
    fflush(stdout);
}
Then, I receive it in Nodejs and just test what I receive:
this.puckTracker.stdout.on('data', (data) => {
    console.log("DATA");
    var str = data.toString();
    console.log(str);
    // first check if it's an image being sent. C++ prints "image 'imageData'",
    // so try to see if the first characters are 'image'.
    const possibleImage = str.slice(0, 5);
    console.log("POSSIBLEIMAGE: " + possibleImage);
});
I have tried the following commands in C++ to try and remove automatic flushes:
//disable sync between libraries. This makes the stdout much faster, but you must either use cout or printf, no mixes. Since printf is faster, use printf everywhere.
std::ios_base::sync_with_stdio(false);
//make sure C++ ONLY flushes when I say so, so no data gets broken in half.
std::setvbuf(stdout, nullptr, _IOFBF, BUFSIZ);
When I run the C++ program with a visible terminal, it seems to be alright.
What I expect the NodeJS console to print is:
DATA
image ëPNG
IHDR ... etc, all the image data.
POSSIBLEIMAGE: image
and this for every image I send.
Instead I get:
DATA
image �PNG
IHDT ...
POSSIBLEIMAGE: image
DATA
-m5VciVWjՖҬvXjvXm9kV[d嬭v
POSSIBLEIMAGE: -m5V
DATA
image �PNG
etc.
It seems to cut each image once as far as I can tell.
Here is a pastebin in case someone needs the full log. (Printing some additional stuff, but that shouldn't matter.) https://pastebin.com/VJEbm6V5
for(int i = 0; i < buffer.size(); i++)
printf("%c", buffer[i]);
fflush(stdout);
There are no guarantees whatsoever that only the final fflush will send all the data, in one chunk.
You never had, nor will you ever have, any guarantee whatsoever that stdout will get flushed only when you explicitly want it to. Typical implementations of stdout, or of its C++ equivalent, use a fixed-size buffer that gets automatically flushed when it's full, whether you want it or not. As each character goes out the door, it gets added to this fixed-size buffer. When it's full, the buffer gets flushed to the output. The only thing fflush does is make this happen explicitly, flushing out the partially-filled buffer.
Then, that's not the whole story.
When you are reading from a network connection, you also have no guarantees whatsoever that you will read everything that was written in one chunk, even if it was flushed in one chunk. Sockets and pipes don't work this way. Anywhere along the way, the data can get broken up into intermediate chunks and delivered to your reading process one chunk at a time.
//make sure C++ ONLY flushes when I say so, so no data gets broken in half.
std::setvbuf(stdout, nullptr, _IOFBF, BUFSIZ);
This does not turn off buffering, nor does it make the buffer effectively infinite. From the Linux documentation of what happens with a null buffer pointer:
If the argument buf is NULL, only the mode is affected; a new buffer
will be allocated on the next read or write operation.
All this does is give you a default buffer, with the default size. Which stdout already has anyway.
Now, you could certainly create a custom buffer that's as big as your image, so that everything gets buffered up front. But, as I explained, that won't accomplish anything useful whatsoever. The data will still likely be broken up in transit, and you will read it in nodejs one chunk at a time.
This entire approach is completely wrong. You need to send the number of bytes separately, up front; read that first, so you know how many bytes to expect, then read the given number of bytes.
printf("image ");
Put the number of bytes to follow, here, read it in nodejs, parse it, and then you know how many bytes to keep reading, until you get everything.
Of course, keep in mind that, for the reasons I explained above, the very first thing your nodejs code reads (unlikely, but it can happen, and a good programmer will write proper code that correctly handles all possibilities) could be just:
image 123
with the "40" part read in the next chunk, indicating that 12340 bytes follow. Or, it could equally well read just:
ima
with the rest following.
Conclusion: you have no guarantees that whatever you read, in whatever way, will always match exactly the byte counts of whatever was written, no matter how it was buffered on the write end, or when it was flushed. Sockets and pipes never gave you this guarantee (there are some limited read/write semantics documented for pipes, but they are irrelevant here). You will need to code the reading side accordingly: no matter how little or how much is read, your code will need to logically parse "image ### " one character at a time, stopping at the space after the last digit. Parsing this gives you the byte count; your code then needs to read exactly that many bytes to follow. It's possible that the header and the first chunk of data will arrive in the very first read. It's also possible that the first thing you read will be just the "i". You never know what to expect. It's like playing the lottery. You don't have any guarantees, but that's how things work. No, this is not easy to do correctly.
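To illustrate that framing idea on the sending side (a minimal sketch, not the asker's code; the function name and the exact header format are just one possible choice):

#include <cstdio>
#include <vector>

// Write "image <byte-count> " followed by the raw PNG bytes.
// The NodeJS side parses the count, then keeps reading until it has collected that many bytes.
void SendFramedImage(const std::vector<unsigned char>& png)
{
    std::printf("image %zu ", png.size());
    std::fwrite(png.data(), 1, png.size(), stdout);
    std::fflush(stdout);
}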
I have fixed it and it works now. I'm placing my code here, in case someone in the future needs it.
Sending side C++
To be able to concatenate my buffer and parse it correctly, I have added "stArt" and "eNd" around the message I send. Example: stArtimage‰PNG..IHDR..binary data..eNd.
You could probably also do this by using the default start and end of the PNG itself, or even only the start, taking everything before the next start. However, I need to send custom data as well. The C++ code is now:
void SendImage(Mat image)
{
    std::vector<uchar> buffer;
    cv::imencode(".png", image, buffer);
    //stArt (with those caps) is the word to split the data chunks on in nodejs.
    cout << "stArtimage";
    fwrite(buffer.data(), 1, buffer.size(), stdout);
    cout << "eNd";
    fflush(stdout);
}
Very important: add this at the start of your program, otherwise the image becomes unreadable:
#include <io.h>
#include <fcntl.h>
//sets the stdout to binary. If this is not done, it replaces \n by \r\n, which gives issues when sending PNG images.
_setmode(_fileno(stdout), O_BINARY);
Receiving side NodeJS
When new data comes in, I concatenate it with the previous unused data. If I can find both a stArt and an eNd, the data is complete and I use the piece in between. I then store all the bytes after eNd, so I can use them the next time I get data. In my code this is placed in a class, so if it doesn't compile, do that :). I also use SocketIO to send data from NodeJS to the browser; that is the eventDispatcher.emit you are seeing.
this.puckTracker.stdout.on('data', (data) => {
    try {
        this.bufferArray.push(data);
        var buff = Buffer.concat(this.bufferArray);
        // data is sent in like: concat ["stArt"][5 letters of dataType][data itself]["eNd"]
        // dataTypes: "PData" = puck data, "image" = png image, "Track" = tracking is running
        // example image: stArtimage*binaryPNGdata*eNd
        // example: stArtPData[]eNdStArtPData[{"ID": "0", "pos": [881.023071, 448.251221]}]eNd
        var startBuf = buff.indexOf("stArt");
        var endBuf = buff.indexOf("eNd");
        if (startBuf != -1 && endBuf != -1) {
            var dataType = buff.subarray(startBuf + 5, startBuf + 10).toString(); // extract the five-letter datatype directly behind stArt.
            var realData = buff.subarray(startBuf + 10, endBuf); // extract the data behind the datatype, before the end of data.
            switch (dataType) {
                // sending the PNG image.
                case "image":
                    this.eventDispatcher.emit('PNG', realData);
                    this.refreshBuffer(endBuf, buff);
                    break;
                // sending custom JSON data
                case "customData": // do something with your custom realData
                    this.refreshBuffer(endBuf, buff);
                    break;
            }
        }
        else {
            this.bufferArray.length = 0; // empty the array
            this.bufferArray.push(buff); // buff contains the full concatenated buffer of the previous bufferArray; it therefore saves all previous unused data in index 0.
        }
    } catch (error) {
        console.error(error);
        console.error(data.toString());
    }
});

refreshBuffer(endBuf, buff) {
    // do this in all cases (but not if there is no match of dataType)
    var tail = buff.subarray(endBuf + 3); // save the unused data of the previous buffer
    this.bufferArray.length = 0; // empty the array
    this.bufferArray.push(tail); // fill the first spot of the array with the tail of the previous buffer.
}
Client side Javascript
To just make the answer complete, to render the PNG in the browser, use the following code, and make sure you have a canvas ready in your HTML.
socket.on('PNG', (PNG) => {
    var blob = new Blob([PNG], { type: "image/png" });
    var img = new Image();
    var c = document.getElementById("canvas");
    var ctx = c.getContext("2d");
    img.onload = function (e) {
        console.log("PNG Loaded");
        ctx.drawImage(img, 0, 0);
        window.URL.revokeObjectURL(img.src);
        img = null;
    };
    img.onerror = img.onabort = function (error) {
        console.error("ERROR!", error);
        img = null;
    };
    img.src = window.URL.createObjectURL(blob);
});
Make sure you don't use SendImage too often, or you will overflow stdout and the connection with data, producing it faster than the browser or server can handle.
I have a relatively simple web server I have written in C++. It works fine for serving text/html pages, but the way it is written, it seems unable to send binary data, and I really need to be able to send images.
I have been searching and searching but can't find an answer specific to this question that is written in real C++ (fstream as opposed to using file pointers, etc.), and whilst this kind of thing is necessarily low level and may well require handling bytes in a C-style array, I would like the code to be as C++ as possible.
I have tried a few methods, this is what I currently have:
int sendFile(const Server* serv, const ssocks::Response& response, int fd)
{
    // some other stuff to do with headers etc. ........ then:
    // open file
    std::ifstream fileHandle;
    fileHandle.open(serv->mBase + WWW_D + resource.c_str(), std::ios::binary);
    if(!fileHandle.is_open())
    {
        // error handling code
        return -1;
    }
    // send file
    ssize_t buffer_size = 2048;
    char buffer[buffer_size];
    while(!fileHandle.eof())
    {
        fileHandle.read(buffer, buffer_size);
        status = serv->mSock.doSend(buffer, fd);
        if (status == -1)
        {
            std::cerr << "Error: socket error, sending file\n";
            return -1;
        }
    }
    return 0;
}
And then elsewhere:
int TcpSocket::doSend(const char* message, int fd) const
{
    if (fd == 0)
    {
        fd = mFiledes;
    }
    ssize_t bytesSent = send(fd, message, strlen(message), 0);
    if (bytesSent < 1)
    {
        return -1;
    }
    return 0;
}
As I say, the problem is that when the client requests an image, it won't work. I get on std::cerr: "Error: socket error, sending file".
EDIT: I got it working using the advice in the answer I accepted. For completeness, and to help those finding this post, I am also posting the final working code.
For sending, I decided to use a std::vector rather than a char array, primarily because I feel it is a more C++ approach and it makes clear that the data is not a string. This is probably not necessary, but a matter of taste. I then counted the bytes read from the stream and passed that over to the send function like this:
// send file
std::vector<char> buffer(SEND_BUFFER);
while(!fileHandle.eof())
{
    fileHandle.read(&buffer[0], SEND_BUFFER);
    status = serv->mSock.doSend(&buffer[0], fd, fileHandle.gcount());
    if (status == -1)
    {
        std::cerr << "Error: socket error, sending file\n";
        return -1;
    }
}
Then the actual send function was adapted like this:
int TcpSocket::doSend(const char* message, int fd, size_t size) const
{
    if (fd == 0)
    {
        fd = mFiledes;
    }
    ssize_t bytesSent = send(fd, message, size, 0);
    if (bytesSent < 1)
    {
        return -1;
    }
    return 0;
}
The first thing you should change is the while (!fileHandle.eof()) loop, because it will not work as you expect: it will iterate once too many times, since the eof flag isn't set until after you try to read beyond the end of the file. Instead do e.g. while (fileHandle.read(...)).
The second thing you should do is check how many bytes were actually read from the file, and only send that many bytes.
Lastly, you read binary data, not text, so you can't use strlen on the data you read from the file.
A little explanation of the binary file problem: As you should hopefully know, C-style strings (the ones you use strlen to get the length of) are terminated by a zero character '\0' (in short, a zero byte). Random binary data can contain lots of zero bytes anywhere inside it; there, a zero byte is a valid byte with no special meaning.
When you use strlen to get the length of binary data there are two possible problems:
There's a zero byte in the middle of the data. This will cause strlen to terminate early and return the wrong length.
There's no zero byte in the data. That will cause strlen to go beyond the end of the buffer to look for the zero byte, leading to undefined behavior.
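Putting those points together, a minimal sketch of the send loop, using the question's names and assuming a doSend overload that takes an explicit byte count (as in the question's edit):

char buffer[2048];
while (fileHandle.read(buffer, sizeof(buffer)) || fileHandle.gcount() > 0)
{
    // gcount() is the number of bytes actually read; it can be less than
    // sizeof(buffer) on the final, partial read
    if (serv->mSock.doSend(buffer, fd, static_cast<size_t>(fileHandle.gcount())) == -1)
    {
        std::cerr << "Error: socket error, sending file\n";
        return -1;
    }
}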
I am creating a simple Apache module to capture all HTTP traffic for real-time processing by security software. My goal is to get the headers and body from both the request and the response. So far I have managed to get everything I need except the request body. What's the best way to get the request body in an output filter, or in any other hook/handler, so I can get the request-response "tuple" with all related information?
static apr_status_t ef_output_filter(ap_filter_t *f, apr_bucket_brigade *bb)
{
    apr_status_t rv;
    request_rec *r = f->r;
    apr_bucket *e = APR_BRIGADE_FIRST(bb);
    const char *data;
    apr_size_t length;
    std::ofstream outfile;
    outfile.open("/var/log/apache2/test.txt", std::ios_base::app);
    outfile << r->method << r->unparsed_uri << std::endl;
    apr_table_do(loop_table, &outfile, r->headers_in, NULL);
    //!!! READ REQUEST BODY HERE !!!!
    outfile << r->status << std::endl;
    apr_table_do(loop_table, &outfile, r->headers_out, NULL);
    outfile << std::endl;
    while (e != APR_BRIGADE_SENTINEL(bb)) {
        apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
        e = APR_BUCKET_NEXT(e);
        outfile << data;
    }
    outfile.flush();
    outfile.close();
    return ap_pass_brigade(f->next, bb);
}
Any help appreciated.
You can read the body from the request_rec pointer you're deriving from the ap_filter_t pointer variable.
As a first step, you should tell apache you want to read data from the client, by calling ap_setup_client_block, passing the request_rec pointer and a "read policy" as argument.
Second, you call ap_should_client_block (passing the request_rec pointer as argument) to check that everything is OK, especially on the client side (expecting true as the result).
Then you call ap_get_client_block (as many times as needed) with the request_rec as argument, a buffer where the data will go, and the size of your buffer. You get back the number of bytes read, and the data will be in your buffer. If you asked for at most X bytes and got X bytes back, you should call it again to get the remaining bytes. Note that the "Content-Length" header should be used to avoid trying to read too much data, which might cause crashes...
So you'd go for something along the lines of:
char buffer[SOME_BUFFER_SIZE];
int ret_code = ap_setup_client_block(r, REQUEST_CHUNKED_ERROR);
if (ret_code == OK) {
    if (ap_should_client_block(r)) {
        int dataBytesRead = ap_get_client_block(r, buffer, SOME_BUFFER_SIZE);
        ...
    }
}
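To flesh that out a little (a hedged sketch only; the buffer size, error handling, and writing the accumulated body to the question's outfile are illustrative choices), the "as many times as needed" part could look like this:

char buffer[8192];
std::string body;
if (ap_setup_client_block(r, REQUEST_CHUNKED_ERROR) == OK && ap_should_client_block(r)) {
    long bytesRead;
    // ap_get_client_block returns the number of bytes read, 0 at the end of the body, -1 on error
    while ((bytesRead = ap_get_client_block(r, buffer, sizeof(buffer))) > 0) {
        body.append(buffer, static_cast<std::size_t>(bytesRead));
    }
    if (bytesRead < 0) {
        // error while reading the request body
    }
}
outfile << body << std::endl;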
As of writing, you can find more info here: https://docstore.mik.ua/orelly/apache_mod/139.htm or here: http://byteandbits.blogspot.com/2013/09/example-apache-module-for-reading.html
Hope it helps...
I'm trying to copy a file, but whatever I try, the copy seems to be a few bytes short.
_file is an ifstream set to binary mode.
void FileProcessor::send()
{
    //If no file is opened return
    if(!_file.is_open()) return;
    //Reset position to beginning
    _file.seekg(0, ios::beg);
    //Result buffer
    char * buffer;
    char * partBytes = new char[_bufferSize];
    //Packet *p;
    //Read the file and send it over the network
    while(_file.read(partBytes,_bufferSize))
    {
        //buffer = Packet::create(Packet::FILE,std::string(partBytes));
        //p = Packet::create(buffer);
        //cout<< p->getLength() << "\n";
        //writeToFile(p->getData().c_str(),p->getLength());
        writeToFile(partBytes,_bufferSize);
        //delete[] buffer;
    }
    //cout<< *p << "\n";
    delete [] partBytes;
}
_writeFile is the file to be written to.
void FileProcessor::writeToFile(const char *buffer,unsigned int size)
{
    if(_writeFile.is_open())
    {
        _writeFile.write(buffer,size);
        _writeFile.flush();
    }
}
In this case I'm trying to copy a zip file.
But opening both the original and the copy in Notepad, I noticed that while they look identical, they differ at the end, where the copy is missing a few bytes.
Any suggestions?
You are assuming that the file's size is a multiple of _bufferSize. You have to check what's left in the buffer after the while loop:
while(_file.read(partBytes,_bufferSize)) {
    writeToFile(partBytes,_bufferSize);
}
if(_file.gcount())
    writeToFile(partBytes, _file.gcount());
Your while loop will terminate when it fails to read _bufferSize bytes because it hits an EOF.
The final call to read() might have read some data (just not a full buffer) but your code ignores it.
After your loop you need to check _file.gcount() and if it is not zero, write those remaining bytes out.
Are you copying from one type of media to another? Perhaps different sector sizes are causing the apparent weirdness.
What if _bufferSize doesn't divide evenly into the size of the file? That might cause extra bytes to be written.
You don't want to always do writeToFile(partBytes, _bufferSize); since it's possible (at the end) that fewer than _bufferSize bytes were read. Also, as pointed out in the comments on this answer, the ifstream is no longer "true" once EOF is reached, so the last chunk isn't copied (this is your posted problem). Instead, use gcount() to get the number of bytes read:
do
{
    _file.read(partBytes, _bufferSize);
    writeToFile(partBytes, (unsigned int)_file.gcount());
} while (_file);
For comparisons of zip files, you might want to consider using a non-text editor to do the comparison; HxD is a great (free) hex editor with a file compare option.