Read/write operation doesn't work correctly - C++

I am programming a face detection algorithm. In my code I parse an XML file (recursively, which is very inefficient; it takes about 4 minutes to parse the whole XML file). I'd like to save the XML content to a file as raw binary using iostreams. I'm using a C++ struct in order to work with the raw data.
My goal is to parse the XML only if the raw data file does not exist.
The method works like this:
If the raw data file does not exist, parse the XML file and save the data to a file.
If the raw data file exists, read the raw data from the file.
My problem: whenever I open the raw data file and read from it, I only get a small number of bytes back. I don't know exactly how many, but from a certain point on I receive only 0x00 bytes in my buffer.
My guess: I believe this has to do with the OS buffer, which has a certain capacity for read and write operations. I might be wrong about this. I'm also not sure which of the two operations misbehaves; it could be either the write or the read.
I was thinking of writing/reading the raw data char by char or line by line. On the other hand, the file doesn't contain text, which means I can't read it line by line or char by char.
The raw data size is:
size_t datasize = DataSize(); // == 196876 bytes
which is computed by this function:
/* Get the upper bound for the predefined cascade size */
size_t CCacadeInterpreter::DataSize()
{
    // this is an upper boundary for the whole hidden cascade size
    size_t datasize = sizeof(HaarClassifierCascade) * TOTAL_CASCADE +
                      sizeof(HaarStageClassifier) * TOTAL_STAGES +
                      sizeof(HaarClassifier) * TOTAL_CLASSIFIERS +
                      sizeof(void*) * (TOTAL_CASCADE + TOTAL_STAGES + TOTAL_CLASSIFIERS);
    return datasize;
}
The method looks like this:
BYTE * CCacadeInterpreter::Interpreter()
{
    printf("|Phase - Load cascade from memory | CCacadeInterpreter::Interpreter | \n");
    size_t datasize = DataSize();
    // Create a memory structure
    nextFreeSpace = pStartMemoryLocation = new BYTE[datasize];
    memset(nextFreeSpace, 0x00, datasize);
    // Try to open a predefined cascade file in the current folder (instead of parsing the XML again)
    fstream stream;
    stream.open(cascadeSavePath); // ...try existing file
    if (stream.is_open())
    {
        stream.seekg(0, ios::beg);
        stream.read((char*)pStartMemoryLocation, datasize); // read from file
        stream.close();
        printf("|Load cascade from saved memory location | CCacadeInterpreter::Interpreter | \n");
        printf("Completed\n\n");
        return pStartMemoryLocation;
    }
    // Open the cascade file and parse the cascade XML file
    std::fstream cascadeFile;
    cascadeFile.open(cascadeDestanationPath, std::fstream::in); // open the file read-only
    if (!cascadeFile.is_open())
    {
        printf("Error: couldn't open cascade XML file\n");
        delete[] pStartMemoryLocation;
        return NULL;
    }
    // Read the XML file, line by line
    string buffer, str;
    getline(cascadeFile, str);
    while (cascadeFile)
    {
        buffer += str;
        getline(cascadeFile, str);
    }
    cascadeFile.close();
    split(buffer, '<', m_tokens);
    // Parsing begins
    pHaarClassifierCascade = (HaarClassifierCascade*)nextFreeSpace;
    nextFreeSpace += sizeof(HaarClassifierCascade);
    pHaarClassifierCascade->count = 0;
    pHaarClassifierCascade->orig_window_size_height = 20;
    pHaarClassifierCascade->orig_window_size_width = 20;
    m_deptInTree = 0;
    m_numOfStage = 0;
    m_numOfTotalClassifiers = 0;
    while (m_tokens.size())
    {
        Parsing();
    }
    // Save the current cascade into a file
    SaveBlockToMemory(pStartMemoryLocation, datasize);
    printf("\nCompleted\n\n");
    return pStartMemoryLocation;
}
bool CCacadeInterpreter::SaveBlockToMemory(BYTE * pStartMemoryLocation, size_t dataSize)
{
    fstream stream;
    if (stream.is_open())
        stream.close();
    stream.open(cascadeSavePath); // ...try existing file
    if (!stream.is_open()) // ...else, create new file...
        stream.open(cascadeSavePath, ios_base::in | ios_base::out | ios_base::trunc);
    stream.seekg(0, ios::beg);
    stream.write((char*)pStartMemoryLocation, dataSize);
    stream.close();
    return true;
}

Try using the Boost IOstreams library.
It has easy-to-use wrappers for file handling.
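A minimal sketch of that approach, assuming Boost.Iostreams is available (LoadSavedCascade is a hypothetical helper; BYTE, cascadeSavePath, pStartMemoryLocation, and datasize come from the question). Memory-mapping hands back the raw bytes directly, so no text-mode translation can interfere:

#include <boost/iostreams/device/mapped_file.hpp>
#include <cstring>
#include <string>

// Sketch: load the previously saved cascade through a memory mapping.
bool LoadSavedCascade(const std::string& cascadeSavePath,
                      BYTE* pStartMemoryLocation, size_t datasize)
{
    try {
        boost::iostreams::mapped_file_source file(cascadeSavePath);
        if (file.size() < datasize)
            return false;                 // file exists but is truncated
        std::memcpy(pStartMemoryLocation, file.data(), datasize);
        return true;
    } catch (const std::exception&) {
        return false;                     // file missing or mapping failed
    }
}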

Related

Access violation error while attempting to transfer a large file using HTTP / REST server

Using a REST library, I am trying to set it up as a file sharing server, but I am running into issues when transferring large files.
As I understand it, the file transfer should mean opening a stream to the file, reading its buffer into a stringstream, then writing that as the response body. This seems to work with small files of only a few bytes or KB, but anything larger fails.
std::string filePath = "some_accessible_file";
struct stat st;
if (stat(filePath.c_str(), &st) != 0)
{
    // handle it
}
size_t fileSize = st.st_size;
std::streamsize sstreamSize = fileSize;
std::fstream str;
str.open(filePath.c_str(), std::ios::in);
std::ostringstream sstream;
sstream << str.rdbuf();
const std::string str1(sstream.str());
const char* ptr = str1.c_str();
response.headers().add(("Content-Type"), ("application/octet-stream"));
response.headers().add(("Content-Length"), fileSize);
if (auto resp = request.respond(std::move(response))) // respond returns a shared pointer to a response type
{
    resp->write(ptr, sstreamSize); // Access violation for large files
}
Not quite sure why large files would fail. Does the file type make a difference? I was able to transfer small text files, but a small PDF failed...
The root cause of this error was std::fstream not reading the entire file because it was opened in text mode. On Windows, text mode makes reading stop at an end-of-file (0x1A) character.
The fix is to open the file in std::ios::binary mode.
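Applied to the code in the question, that is a one-word change to the open call:

std::fstream str;
str.open(filePath.c_str(), std::ios::in | std::ios::binary); // binary mode: no newline or 0x1A translation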

RapidJSON C++: efficient and scalable way to append a JSON object to a file

I have a JSON file that contains an array of JSON objects, and I am using RapidJSON in C++.
I want to append a new object to the JSON array inside this file.
Currently, I read the whole file into a document using a file read stream, add the new member (the new JSON object) with AddMember inside the array of the document I just read, then overwrite the file with the new document, repeating the process for each new object.
This solution is not scalable. Can someone point out another solution using RapidJSON or raw file streams? Help will be appreciated; I've been looking all over the internet but no luck.
Is there something like appending to the file incrementally?
Any other scalable solution would also help, because my file will get very large over time, so reading the whole file, appending a new object, and rewriting the whole file every time is a waste of memory and CPU time.
This question is from some years ago, but this answer is still relevant.
The goal is to append a JSON object with RapidJSON to a (potentially already existing) file which contains a JSON array. The following is satisfied:
No reading or parsing of the already existing file.
The new object is added directly to the already existing file, without document merging.
Time does not depend on what has been appended previously.
Here is the code with comments:
#include <cstdio>
#include <string>
#include "rapidjson/document.h"
#include "rapidjson/filewritestream.h"
#include "rapidjson/writer.h"

bool appendToFile(const std::string& filename, const rapidjson::Document& document)
{
    using namespace rapidjson;
    // create the file with an empty array if it doesn't exist
    if (FILE* fp = fopen(filename.c_str(), "r"); !fp)
    {
        if (fp = fopen(filename.c_str(), "w"); !fp)
            return false;
        fputs("[]", fp);
        fclose(fp);
    }
    else
        fclose(fp); // the file already exists; close the probe handle
    // add the document to the file
    if (FILE* fp = fopen(filename.c_str(), "rb+"); fp)
    {
        // check that the first character is [
        std::fseek(fp, 0, SEEK_SET);
        if (getc(fp) != '[')
        {
            std::fclose(fp);
            return false;
        }
        // is the array empty?
        bool isEmpty = false;
        if (getc(fp) == ']')
            isEmpty = true;
        // check that the last character is ]
        std::fseek(fp, -1, SEEK_END);
        if (getc(fp) != ']')
        {
            std::fclose(fp);
            return false;
        }
        // position on the closing ] so it gets overwritten
        fseek(fp, -1, SEEK_END);
        // separate from the previous element unless the array is empty
        if (!isEmpty)
            fputc(',', fp);
        // append the document
        char writeBuffer[65536];
        FileWriteStream os(fp, writeBuffer, sizeof(writeBuffer));
        Writer<FileWriteStream> writer(os);
        document.Accept(writer);
        os.Flush(); // make sure the buffered JSON is on disk before the final ]
        // close the array
        std::fputc(']', fp);
        fclose(fp);
        return true;
    }
    return false;
}
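For illustration, a hypothetical caller (the file name and member values are made up):

#include "rapidjson/document.h"

int main()
{
    rapidjson::Document doc;
    doc.SetObject();
    doc.AddMember("event", "login", doc.GetAllocator());
    // every call appends one more object to the array stored in events.json
    return appendToFile("events.json", doc) ? 0 : 1;
}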
I do not know whether there is a ready-made library for that, but if you decide to do it yourself it is not impossible.
In a few steps you could:
1) Load all the JSON into RAM.
2) Take every request to append JSON and save it to a log file (a sketch of this step follows the list).
3) Update the JSON in RAM after the request has been written to the log.
4) Every x seconds, block changes, write the whole JSON to disk, and clear the log file.
5) Unblock changes.
6) Go to 2.
Further optimizations could be:
1) On start (after a crash), check for the log file and apply the logged requests.
2) When you write the JSON file, do not rewrite it completely; check whether there were only appends at the end and write only the new part.
How does this sound?
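A minimal sketch of the append-only log from step 2, assuming the incoming request has already been serialized to a string (the file name requests.log is illustrative):

#include <fstream>
#include <string>

// Append one serialized JSON object as a line in the log file.
// std::ios::app makes every write go to the end of the file.
bool appendToLog(const std::string& jsonRecord)
{
    std::ofstream log("requests.log", std::ios::app | std::ios::binary);
    if (!log)
        return false;
    log << jsonRecord << '\n';
    return static_cast<bool>(log);
}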

Loading file data into memory and saving it back out

I'm working on a rough storage system for a small personal project. I have a struct that holds data for each stored file:
struct AssetTableRow {
    std::string id = "Unnamed"; // a unique name given by the user
    std::string guid = "";      // a guid generated from the file data, used to detect duplicate files
    std::string data;           // binary data of the file
};
I load a file into it like this:
std::streampos size;
char* memblock;
std::ifstream file(filePath, std::ios::in | std::ios::binary | std::ios::ate);
if (file.is_open()) {
    size = file.tellg();
    memblock = new char[size];
    file.seekg(0, std::ios::beg);
    file.read(memblock, size);
    file.close();
    AssetTableRow row = AssetTableRow();
    row.id = "myid";
    row.guid = "myguid";
    row.data = std::string(memblock);
    AssetTable.push_back(row);
}
And then to try to write it back out to a file I used this:
std::ofstream file(destPath, std::ios::out | std::ios::binary);
if (file.is_open()) {
    printf("Writing...\n", id.c_str());
    // I think this is where it might be messing up
    file.write(row.data.c_str(), row.data.size());
    file.close();
    printf("Done!\n", id.c_str());
}
Now when I try to open the file that got written out (.png sprite sheet) the photo viewer tells me it can't open a file of that type (but opening the original is fine).
If I open up the 2 files in Notepad++ (original on the left) I can see that they are indeed very different, there is almost no data in the output file!
I'm guessing this has something to do with the length on the write or read, but I've tried every different possible value I can think of for them and it doesn't seem to change anything.
If I print the data out to the console after it's read from the original file it appears as it does in the written file, so that leads me to believe the problem is with how I'm reading the file, but I fail to see any problems with that part of the code.
What is wrong with how I'm reading the file that it doesn't appear to be reading the whole file?
Also please forgive any awful mistakes I've made in my code, I'm still learning c++ and don't fully understand some parts of it so my code may not be the best.
EDIT:
As per Superman's advice about strings, I changed my code to use char* for holding the data instead.
struct AssetTableRow {
    std::string id = "Unnamed"; // a unique name given by the user
    std::string guid = "";      // a guid generated from the file data, used to detect duplicate files
    char* data;                 // binary data of the file
};
And changed the read function so it's reading into the struct's data member:
std::ifstream file(actualPath, std::ios::in | std::ios::binary | std::ios::ate);
if (file.is_open()) {
    AssetTableRow row = AssetTableRow();
    size = file.tellg();
    row.data = new char[size];
    file.seekg(0, std::ios::beg);
    file.read(row.data, size);
    file.close();
    row.id = "myid";
    row.guid = "myguid";
    printf("%s\n", row.data);
}
However I still see the same output from when I was using a string, so now I'm even more confused as to why this is happening.
EDIT2:
Upon further investigation I found that the size variable for reading the file reports the correct size in bytes. So now my guess is that, for whatever reason, it's not reading in the whole file.
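One detail worth checking, offered as a guess consistent with EDIT2: both std::string(memblock) in the first version and printf("%s", row.data) stop at the first zero byte, and PNG data is full of zero bytes, so the file may well be read completely while the copy and the console output are truncated. A sketch of length-aware handling for the first version:

// Construct the string with an explicit length so embedded zero
// bytes (common in binary formats such as PNG) are preserved.
row.data = std::string(memblock, static_cast<std::size_t>(size));

// ...and write it back using the stored length, not strlen semantics:
file.write(row.data.data(), row.data.size());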

Reading a file and saving the same exact file c++

I am actually writing a C++ program that reads any kind of file and saves it as a bmp file, but first I need to read the file, and that's where the issue is:
char fileName[] = "test.jpg";
FILE* inFileForGettingSize; // this is for getting the file size
fopen_s(&inFileForGettingSize, fileName, "r");
fseek(inFileForGettingSize, 0L, SEEK_END);
int fileSize = ftell(inFileForGettingSize);
fclose(inFileForGettingSize);
ifstream inFile; // this is for reading the file
inFile.open(fileName);
if (inFile.fail()) {
    cerr << "Error Opening File" << endl;
}
char* data = new char[fileSize];
inFile.read(data, fileSize);
ofstream outFile; // writing the file back again
outFile.open("out.jpg");
outFile.write(data, fileSize);
outFile.close();
cin.get();
But when I read the file, let's say a plain-text file, it always outputs some weird characters at the end, for example:
assdassaasd
sdaasddsa
sdadsa
comes out as:
assdassaasd
sdaasddsa
sdadsaÍÍÍ
So when I do this with a jpg, exe, etc., it corrupts the file.
I am not trying to COPY a file; I know there are other ways to do that. I'm just trying to read a complete file byte by byte. Thanks.
EDIT:
I found out that the number of 'Í' characters equals the number of line endings in the file, but this doesn't help me much.
This is caused by newline handling.
You open the files in text mode (because you use "r" instead of "rb" for fopen and because you don't pass ios::binary to your fstream open calls), and on Windows, text mode translates "\r\n" pairs to "\n" on reading and back to "\r\n" when writing. The result is that the in-memory size is going to be shorter than the on-disk size, so when you try to write using the on-disk size, you go past the end of your array and write whatever random stuff happens to reside in memory.
You need to open files in binary mode when working with binary data:
fopen_s(&inFileForGettingSize, fileName, "rb");
inFile.open(fileName, ios::binary);
outFile.open("out.jpg", ios::binary);
For future reference, your copy routine could be improved. Mixing FILE* I/O with iostream I/O feels awkward, opening and closing the file twice is extra work, and (most importantly) if your routine is ever run on a large enough file, it will exhaust memory trying to load the entire file into RAM. Copying a block at a time would be better:
const int BUFFER_SIZE = 65536;
char buffer[BUFFER_SIZE];
std::ifstream source(fileName, std::ios::binary);  // both streams opened in binary mode
std::ofstream dest("out.jpg", std::ios::binary);
while (source.good()) {
    source.read(buffer, BUFFER_SIZE);
    dest.write(buffer, source.gcount());
}
It's a binary file, so you need to read and write the file as binary; otherwise it's treated as text, and assumed to have newlines that need translation.
In your call to fopen(), you need to add the "b" designator:
fopen_s(&inFileForGettingSize, fileName, "rb");
And in your fstream::open calls, you need to add std::fstream::binary:
inFile.open(fileName, std::fstream::binary);
// ...
outFile.open("out.jpg", std::fstream::binary);

Extracting bytes from byte stream

I receive a binary file via POST in a C++ CGI script and I'm using the Cgicc library to get its contents like so:
std::ofstream myfile;
myfile.open("file.out", std::ios::in | std::ios::binary);
try
{
    cgicc::Cgicc cgi;
    cgicc::const_file_iterator file = cgi.getFile("bitmap");
    if (file != cgi.getFiles().end())
    {
        file->writeToStream(myfile);
    }
}
catch (std::exception& exception)
{
    std::cout << exception.what();
}
The result is a binary file containing the bytes.
Now, because each byte should represent one pixel of an 8-bit bitmap file, I want to construct the entire bitmap file. To achieve this, I think I can use the easyBMP library, but since I need to create the image pixel by pixel, I need to somehow iterate over the received bytes. Does anyone know how this can be achieved? Can I somehow get an iterator to a std::ostream / std::ostrstream / std::ostringstream?
std::ostringstream stream;
file->writeToStream(stream);
//foreach byte in stream do { ... }
If you use std::ostringstream, you can get a std::string from it using the std::ostringstream::str function: http://cplusplus.com/reference/iostream/ostringstream/str/ . Alternatively, you can open your output file and read it back.
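For example, a small sketch of that loop (the easyBMP calls are omitted; file is the const_file_iterator from the question):

#include <sstream>
#include <string>

std::ostringstream stream;
file->writeToStream(stream);                  // Cgicc writes the uploaded bytes into the stream
const std::string bytes = stream.str();
for (std::string::size_type i = 0; i < bytes.size(); ++i)
{
    unsigned char pixel = static_cast<unsigned char>(bytes[i]);
    // set pixel i of the 8-bit bitmap here, e.g. via easyBMP's SetPixel
}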