compare data in a JSON::Value variable and then update to file - c++

I am trying to write data to two JSON files, with the filenames provided at run time.
This is the updateToFile function, which writes the data stored in a Json::Value variable to two different files from two different threads.
void updateToFile()
{
    while(runInternalThread)
    {
        std::unique_lock<std::recursive_mutex> invlock(mutex_NodeInvConf);
        FILE * pFile;
        std::string conff = NodeInvConfiguration.toStyledString();
        pFile = fopen (filename.c_str(), "wb");
        std::ifstream file(filename);
        fwrite (conff.c_str() , sizeof(char), conff.length(), pFile);
        fclose (pFile);
        sync();
    }
}
thread 1:
std::thread nt(&NodeList::updateToFile,this);
thread 2:
std::thread it(&InventoryList::updateToFile,this);
Right now it updates the files even if no data has changed since the previous execution. I want to write a file only if its data differs from what was previously stored; if there is no change, it should print that the data is the same.
Can anyone please help with this?
Thanks.

You can check if it has changed before writing.
void updateToFile()
{
    std::string previous;
    while(runInternalThread)
    {
        std::unique_lock<std::recursive_mutex> invlock(mutex_NodeInvConf);
        std::string conf = NodeInvConfiguration.toStyledString();
        if (conf != previous)
        {
            // TODO: error handling missing like in OP
            std::ofstream file(filename);
            file.write (conf.c_str() , conf.length());
            file.close();
            previous = std::move(conf);
            sync();
        }
    }
}
However, such constant polling in a loop is likely inefficient. You may add sleeps to make it less busy. Another option is to have NodeInvConfiguration itself track whether it has changed and clear that flag when storing.
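A minimal sketch of that second option, assuming a hypothetical change flag (here called nodeInvConfChanged) that is set wherever NodeInvConfiguration is modified, and that <thread>, <chrono>, <fstream> and <iostream> are available:
void updateToFile()
{
    while (runInternalThread)
    {
        {
            std::unique_lock<std::recursive_mutex> invlock(mutex_NodeInvConf);
            if (nodeInvConfChanged)                 // hypothetical "dirty" flag
            {
                std::ofstream file(filename);
                file << NodeInvConfiguration.toStyledString();
                nodeInvConfChanged = false;         // clear the flag after storing
                sync();
            }
            else
            {
                std::cout << "data is same" << std::endl;
            }
        }
        std::this_thread::sleep_for(std::chrono::seconds(1));  // don't spin at full speed
    }
}
The sleep keeps the thread from spinning, and the flag makes sure the file is rewritten only when the configuration actually changed.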

Related

Qt: How can I copy a big data using QT?

I want to read a big file and then write it to a new file using Qt.
I have tried reading a big file that has only one line, testing with readAll() and readLine().
If the data file is about 600 MB, my code runs, although it is slow.
If the data file is about 6 GB, my code fails.
Can you give me some suggestions?
Update
My test code is as following:
#include <QApplication>
#include <QFile>
#include <QTextStream>
#include <QTime>
#include <QDebug>

#define qcout qDebug()

void testFile07()
{
    QFile inFile("../03_testFile/file/bigdata03.txt");
    if (!inFile.open(QIODevice::ReadOnly | QIODevice::Text))
    {
        qcout << inFile.errorString();
        return;
    }
    QFile outFile("../bigdata-read-02.txt");
    if (!outFile.open(QIODevice::WriteOnly | QIODevice::Truncate))
        return;
    QTime time1, time2;
    time1 = QTime::currentTime();
    while(!inFile.atEnd())
    {
        QByteArray arr = inFile.read(3*1024);
        outFile.write(arr);
    }
    time2 = QTime::currentTime();
    qcout << time1.msecsTo(time2);
}

void testFile08()
{
    QFile inFile("../03_testFile/file/bigdata03.txt");
    if (!inFile.open(QIODevice::ReadOnly | QIODevice::Text))
        return;
    QFile outFile("../bigdata-readall-02.txt");
    if (!outFile.open(QIODevice::WriteOnly | QIODevice::Truncate))
        return;
    QTime time1, time2, time3;
    time1 = QTime::currentTime();
    QByteArray arr = inFile.readAll();
    qcout << arr.size();
    time3 = QTime::currentTime();
    outFile.write(arr);
    time2 = QTime::currentTime();
    qcout << time1.msecsTo(time2);
}

int main(int argc, char *argv[])
{
    testFile07();
    testFile08();
    return 0;
}
After my tests, here is my experience:
read() and readAll() seem to be about equally fast at reading; in fact, read() is slightly faster.
The real difference is in writing.
When the file size is 600 MB:
Using the read function, reading and writing the file cost about 2.1 s, with 875 ms for reading.
Using the readAll function, reading and writing the file cost about 10 s, with 907 ms for reading.
When the file size is 6 GB:
Using the read function, reading and writing the file cost about 162 s, with 58 s for reading.
Using the readAll function, I get the wrong answer 0 and it fails to run.
Open both files as QFiles. In a loop, read a fixed number of bytes, say 4K, into an array from the input file, then write that array into the output file. Continue until you run out of bytes.
However, if you just want to copy a file verbatim, you can use QFile::copy
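A minimal sketch of that chunked copy (the file names and the 4K chunk size are just placeholders):
#include <QFile>

bool copyInChunks(const QString &inName, const QString &outName)
{
    QFile inFile(inName);
    QFile outFile(outName);
    if (!inFile.open(QIODevice::ReadOnly) ||
        !outFile.open(QIODevice::WriteOnly | QIODevice::Truncate))
        return false;

    while (!inFile.atEnd())
    {
        QByteArray chunk = inFile.read(4 * 1024);   // 4K at a time
        if (outFile.write(chunk) != chunk.size())
            return false;                           // write error
    }
    return true;
}
// For a verbatim copy in one call: QFile::copy(inName, outName);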
You can use QFile::map and use the pointer to the mapped memory to write in a single shot to the target file:
void copymappedfile(QString in_filename, QString out_filename)
{
    QFile in_file(in_filename);
    if(in_file.open(QFile::ReadOnly))
    {
        QFile out_file(out_filename);
        if(out_file.open(QFile::WriteOnly))
        {
            const qint64 filesize = in_file.size();
            uchar * mem = in_file.map(0, filesize, QFileDevice::MapPrivateOption);
            out_file.write(reinterpret_cast<const char *>(mem) , filesize);
            in_file.unmap(mem);
            out_file.close();
        }
        in_file.close();
    }
}
One thing to keep in mind:
With read() you specify a maximum size for the chunk currently being read (in your example 3*1024 bytes); with readAll() you tell the program to read the entire file at once.
In the first case you (repeatedly) hold 3072 bytes in memory, write them out, and release them when the current loop iteration ends. In the second case you hold the entire file in memory. Keeping 600 MB in memory at once might be the reason for your performance issues; if you try to hold 6 GB at once you may simply run out of memory/address space, causing your program to crash.

RapidJSON c++ efficient and scalable way to append json object to file

I have a JSON file that contains an array of JSON objects, and I am using RapidJSON in C++.
I want to append a new object to the JSON array inside this file.
Currently I read the whole file into a Document using a file read stream, add the new member (the new JSON object) to the array with AddMember, then overwrite the file with the updated document, and repeat the process for each new object.
This solution is not scalable. Can someone point out another solution using RapidJSON or raw file streams? Help would be appreciated; I've been looking all over the internet but with no luck.
Is there something like appending to a file incrementally for JSON?
My file will get very large over time, so reading the whole file every time, appending a new object, and then rewriting the whole file is a waste of memory and CPU time.
Please help me with this one.
This question is from some years ago, but this answer is still relevant.
The goal is to append a JSON object with RapidJSON to a potentially already existing file which contains a JSON array. The following requirements are satisfied:
No reading or parsing of the already existing file.
The new object is added directly to the already existing file without document merging.
The time taken does not depend on how much has been appended previously.
Here is the code with comments:
bool appendToFile(const std::string& filename, const rapidjson::Document& document)
{
    using namespace rapidjson;

    // create file if it doesn't exist
    if (FILE* fp = fopen(filename.c_str(), "r"); !fp)
    {
        if (fp = fopen(filename.c_str(), "w"); !fp)
            return false;
        fputs("[]", fp);
        fclose(fp);
    }

    // add the document to the file
    if (FILE* fp = fopen(filename.c_str(), "rb+"); fp)
    {
        // check if first is [
        std::fseek(fp, 0, SEEK_SET);
        if (getc(fp) != '[')
        {
            std::fclose(fp);
            return false;
        }

        // is array empty?
        bool isEmpty = false;
        if (getc(fp) == ']')
            isEmpty = true;

        // check if last is ]
        std::fseek(fp, -1, SEEK_END);
        if (getc(fp) != ']')
        {
            std::fclose(fp);
            return false;
        }

        // replace ] by ,
        fseek(fp, -1, SEEK_END);
        if (!isEmpty)
            fputc(',', fp);

        // append the document
        char writeBuffer[65536];
        FileWriteStream os(fp, writeBuffer, sizeof(writeBuffer));
        Writer<FileWriteStream> writer(os);
        document.Accept(writer);

        // close the array
        std::fputc(']', fp);
        fclose(fp);
        return true;
    }

    return false;
}
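Possible usage (the field names and the file name are only examples):
rapidjson::Document d;
d.SetObject();
d.AddMember("id", 42, d.GetAllocator());
d.AddMember("name", "example", d.GetAllocator());
appendToFile("records.json", d);   // appends {"id":42,"name":"example"} to the array in the file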
I do not know if there is a ready-made library for that, but if you decide to do it yourself it is not impossible.
In a few steps you could:
1) Load the whole JSON into RAM.
2) Take every request to append JSON and save it to a log file.
3) Update the JSON in RAM after the request has been written to the log.
4) Every x seconds, block changes, write the whole JSON to disk, and clear the log file.
5) Unblock changes.
6) Go to 2.
Further optimizations could be:
1) Check for the log file on start (after a crash) and apply the logged requests.
2) When you write the JSON file, do not rewrite it completely; check whether there were only appends at the end and write only the new part.
How does this sound?
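For illustration only, a rough sketch of steps 2)-5) with RapidJSON could look like the following; it assumes db is a Document holding the whole array in RAM (db.SetArray() done at startup), and the function names, the one-object-per-line log format and the minimal locking are all just assumptions:
#include <cstdio>
#include <mutex>
#include <string>
#include "rapidjson/document.h"
#include "rapidjson/stringbuffer.h"
#include "rapidjson/writer.h"

std::mutex dbMutex;

// Steps 2)+3): append the request to the log, then update the JSON in RAM.
bool appendRequest(rapidjson::Document& db, const std::string& objectJson,
                   const std::string& logName)
{
    std::lock_guard<std::mutex> lock(dbMutex);
    FILE* log = fopen(logName.c_str(), "a");
    if (!log) return false;
    fputs(objectJson.c_str(), log);
    fputc('\n', log);
    fclose(log);

    rapidjson::Document obj(&db.GetAllocator());
    obj.Parse(objectJson.c_str());
    if (obj.HasParseError()) return false;
    db.PushBack(obj, db.GetAllocator());          // RapidJSON moves the value
    return true;
}

// Steps 4)+5): every x seconds, write the whole document and clear the log.
bool flushToDisk(const rapidjson::Document& db, const std::string& fileName,
                 const std::string& logName)
{
    std::lock_guard<std::mutex> lock(dbMutex);
    rapidjson::StringBuffer sb;
    rapidjson::Writer<rapidjson::StringBuffer> writer(sb);
    db.Accept(writer);

    FILE* fp = fopen(fileName.c_str(), "w");
    if (!fp) return false;
    fputs(sb.GetString(), fp);
    fclose(fp);

    if (FILE* log = fopen(logName.c_str(), "w")) // truncate the log
        fclose(log);
    return true;
}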

Why did getline not reach the end of the file in C++?

I have a simple function that edits an HTML file. All it does is just to replace some texts in the file. This is the code for the function:
void edit_file(char* data1, char* data1_token, char* data2, char* data2_token) {
    std::ifstream filein("datafile.html");
    std::ofstream fileout("temp.html");
    std::string line;
    //bool found = false;
    while(std::getline(filein, line))
    {
        std::size_t for_data1 = line.find(data1_token);
        std::size_t for_data2 = line.find(data2_token);
        if (for_data1 != std::string::npos) {
            line.replace(for_data1, 11, data1);
        }
        if (for_data2 != std::string::npos) {
            line.replace(for_data2, 19, data2);
        }
        fileout<<line;
    }
    filein.close();
    fileout.close();
}

void edit_file_and_copy_back(char* data1, char* data1_token, char* data2, char* data2_token)
{
    edit_file(data1, data1_token, data2, data2_token);
    MoveFileEx("temp.html", "datafile.html", MOVEFILE_REPLACE_EXISTING);
}
For some reason I need to call this function multiple times, but it only works the first time; on later calls getline stops somewhere in the middle of the file.
The replace logic works without any problems (it works the first time). However, the second time, the while loop ends after reading only some of the lines.
I have tried filein.close() and the file.seekg function, but neither fixes the problem. What causes the incorrect execution and how do I solve it?
Buffering is biting you. Here's what you're doing:
Opening datafile.html for read and temp.html for write
Copying lines from datafile.html to temp.html
When you're done, without closing or flushing temp.html, you open a separate handle to temp.html for read (which won't share the buffer with the original handle, so unflushed data isn't seen)
You open a separate handle to datafile.html for write, and copying from the second temp.html handle to the new datafile.html handle
But the copy in steps 3 & 4 is missing the data still in the buffer of the temp.html handle opened in step 1. And each time you call this, if the input and output buffer sizes don't match, or the iostream implementation you're using doesn't flush until you write buffer size + 1 bytes, you'll drop up to another buffer's worth of data.
Change your code so the scope of the original handles ends before you call the copy back function:
void edit_file(char* data1, char* data1_token, char* data2, char* data2_token) {
    { // New scope; when it ends, files are closed
        ifstream filein("datafile.html");
        ofstream fileout("temp.html");
        string strTemp;
        std::string line;
        //bool found = false;
        while(std::getline(filein, line))
        {
            std::size_t for_data1 = line.find(data1_token);
            std::size_t for_data2 = line.find(data2_token);
            if (for_data1 != std::string::npos) {
                line.replace(for_data1, 11, data1);
            }
            if (for_data2 != std::string::npos) {
                line.replace(for_data2, 19, data2);
            }
            fileout<<line;
        }
    } // End of new scope, files closed at this point
    write_back_file();
}

void write_back_file() {
    ifstream filein("temp.html");
    ofstream fileout("datafile.html");
    fileout<<filein.rdbuf();
}
Mind you, this still has potential errors; if both data tokens are found, and data1_token occurs before data2_token, the index for data2_token will be stale when you use it; you need to delay the scan for data2_token until after you scan and replace data1_token (or if the data1_token replacement might create a data2_token that shouldn't be replaced, you'll need to compare the hit indices and perform the replacement for the later hit first, so the earlier index remains valid).
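For example, one safer ordering is to search for data2_token only after the first replacement has been applied. A sketch of the loop body (using strlen on the tokens instead of the hard-coded lengths 11 and 19 is an extra assumption and needs <cstring>):
std::size_t for_data1 = line.find(data1_token);
if (for_data1 != std::string::npos) {
    line.replace(for_data1, std::strlen(data1_token), data1);
}
// Only now search for the second token, so its index is computed
// against the already-updated string and cannot go stale.
std::size_t for_data2 = line.find(data2_token);
if (for_data2 != std::string::npos) {
    line.replace(for_data2, std::strlen(data2_token), data2);
}
fileout << line;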
Similarly, from a performance and atomicity perspective, you probably don't want to copy from temp.html back to datafile.html; other threads and processes would be able to see the incomplete datafile.html in that case, rather than seeing the old version atomically replaced with the new version. It also means you need to worry about removing temp.html at some point. Typically, you just move the temporary file over the original file:
rename("temp.html", "datafile.html");
If you're on Windows, that won't work atomically to replace an existing file; you'd need to use MoveFileEx to force replacing of existing files:
MoveFileEx("temp.html", "datafile.html", MOVEFILE_REPLACE_EXISTING);
void edit_file(char* data1, char* data1_token, char* data2, char* data2_token) {
    ifstream filein("datafile.html");
    ofstream fileout("temp.html");

    // STUFF

    // At this point the two streams are still open and
    // may not have been flushed to the file system.
    // You now call this function.
    write_back_file();
}

void write_back_file() {
    // You are opening files that are already open.
    // Do not think there are any guarantees about the content at this point.
    // So what is copied is questionable.
    ifstream filein("temp.html");
    ofstream fileout("datafile.html");
    fileout<<filein.rdbuf();
}
Do not call write_back_file() from within edit_file(). Rather provide a wrapper that calls both.
void edit_file_and_copy_back(char* data1, char* data1_token, char* data2, char* data2_token)
{
    edit_file(data1, data1_token, data2, data2_token);
    write_back_file();
}

How to read multiple structs from a binary file

I have written two instances ck1, ck2 of a struct named Cookie and saved them in a binary file named "mydat" by calling this function:
bool s_cookie(Cookie myck,std::string fname) {
    std::ofstream ofs(fname,std::ios::binary | std::ios::app);
    if(!ofs) return false;
    ofs.write((char *) &myck, sizeof(Cookie));
    ofs.close();
    return true;
}
Of course myck can be ck1, ck2, etc., and fname is the name of the "mydat" binary file, so both structs have been saved to the same file.
Now I want to read them back into ck3 and ck4 respectively. How do I do that? Cookie looks like this:
struct Cookie {
    std::string name;
    std::string value;
    unsigned short duration;
    bool expired;
};
Thanks
Do something similar to the writing, but read instead, if Cookie is a POD:
std::ifstream ifs(fname,std::ios::binary);
Cookie ck3, ck4;
ifs.read((char *) &ck3, sizeof(Cookie));
ifs.read((char *) &ck4, sizeof(Cookie));
Also, you should check the result of each open and read operation and handle failures.
Update: After your update showing the Cookie definition, you cannot simply write it to a file like that, because its std::string members are not trivially copyable. You should serialize it or define a well-specified format for reading/writing the data.
A simple workaround is (read the comments):
// Assume name and value are not longer than 99 chars
// and you don't care about wasted space in the file
struct CookiePOD {
    CookiePOD(const Cookie &p)
    {
        // I ignored bounds checking!
        std::copy(p.name.begin(), p.name.end(), name);
        name[p.name.size()] = 0;
        std::copy(p.value.begin(), p.value.end(), value);
        value[p.value.size()] = 0;
        duration = p.duration;
        expired = p.expired;
    }

    char name[100];
    char value[100];
    unsigned short duration;
    bool expired;
};
And then try to read/write CookiePOD instead of Cookie.
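A minimal sketch of what that could look like (the helper names save_cookie/load_cookie and the conversion back to Cookie are just one possible way to do it):
#include <fstream>
#include <string>

bool save_cookie(const Cookie& ck, const std::string& fname)
{
    std::ofstream ofs(fname, std::ios::binary | std::ios::app);
    if (!ofs) return false;
    CookiePOD pod(ck);                                   // convert to the fixed-size layout
    ofs.write(reinterpret_cast<const char*>(&pod), sizeof pod);
    return static_cast<bool>(ofs);
}

bool load_cookie(std::ifstream& ifs, Cookie& ck)
{
    CookiePOD pod(Cookie{});                             // CookiePOD above has no default ctor
    if (!ifs.read(reinterpret_cast<char*>(&pod), sizeof pod))
        return false;                                    // check every read
    ck.name = pod.name;                                  // the char arrays are NUL-terminated
    ck.value = pod.value;
    ck.duration = pod.duration;
    ck.expired = pod.expired;
    return true;
}

// Usage: read ck3 and ck4 back in the order they were written.
// std::ifstream ifs("mydat", std::ios::binary);
// Cookie ck3, ck4;
// load_cookie(ifs, ck3);
// load_cookie(ifs, ck4);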

Concatenating strings into own protocol

I'm writing network programs using socket.h for my studies. I have written simple server and client programs that can transfer files between them using a buffer size given by the user.
Server
void transfer(string name)
{
    char *data_to_send;
    ifstream myFile;
    myFile.open(name.c_str(),ios::binary);
    if(myFile.is_open())
    {
        while(!myFile.eof())
        {
            data_to_send = new char [buffer_size];
            myFile.read(data_to_send, buffer_size);
            send(data_to_send,buffer_size);
            delete [] data_to_send;
        }
        myFile.close();
        send("03endtransmission",buffer_size);
    }
    else
    {
        send("03error",buffer_size);
    }
}
Client
void download(string name)
{
    char *received_data;
    fstream myFile;
    myFile.open(name.c_str(),ios::out|ios::binary);
    if(myFile.is_open())
    {
        while(1)
        {
            received_data = new char[buffer_size];
            if((receivedB = recv(sockfd, received_data, buffer_size,0)) == -1) {
                perror("recv");
                close(sockfd);
                exit(1);
            }
            if(strcmp(received_data,"03endoftransmission") == 0)
            {
                cout<<"End of transmission"<<endl;
                break;
            }
            else if (strcmp(received_data,"03error") == 0)
            {
                cout<<"Error"<<endl;
                break;
            }
            myFile.write(received_data,buffer_size);
        }
        myFile.close();
    }
}
The problem occurs when I want to implement my own protocol: two chars (control), a 32-char hash, and the rest of the packet is data. I tried a few times to split it up and ended up with this code:
Server
#define PAYLOAD 34
void transfer(string name)
{
    char hash[] = "12345678901234567890123456789012"; //32 chars
    char *data_to_send;
    ifstream myFile;
    myFile.open(name.c_str(),ios::binary);
    if(myFile.is_open())
    {
        while(!myFile.eof())
        {
            data_to_send = new char [buffer_size-PAYLOAD];
            myFile.read(data_to_send, buffer_size-PAYLOAD);
            concatenation = new char[buffer_size];
            strcpy(concatenation,"02");
            strcat(concatenation,hash);
            strcat(concatenation,data_to_send);
            send(concatenation,buffer_size);
            delete [] data_to_send;
            delete [] concatenation;
        }
        myFile.close();
        send("03endtransmission",buffer_size);
    }
    else
    {
        send("03error",buffer_size);
    }
}
Client
void download(string name)
{
    char *received_data;
    fstream myFile;
    myFile.open(name.c_str(),ios::out|ios::binary);
    if(myFile.is_open())
    {
        while(1)
        {
            received_data = new char[buffer_size];
            if((receivedB = recv(sockfd, received_data, buffer_size,0)) == -1) {
                perror("recv");
                close(sockfd);
                exit(1);
            }
            if(strcmp(received_data,"03endoftransmission") == 0)
            {
                cout<<"End of transmission"<<endl;
                break;
            }
            else if (strcmp(received_data,"03error") == 0)
            {
                cout<<"Error"<<endl;
                break;
            }
            control = new char[3];
            strcpy(control,"");
            strncpy(control, received_data,2);
            control[2]='\0';
            hash = new char[33];
            strcpy(hash,"");
            strncpy(hash,received_data+2,32);
            hash[32]='\0';
            data = new char[buffer_size-PAYLOAD+1];
            strcpy(data,"");
            strncpy(data,received_data+34,buffer_size-PAYLOAD);
            myFile.write(data,buffer_size-PAYLOAD);
        }
        myFile.close();
    }
}
But this version writes some ^# characters to the file instead of the real data. Displaying "data" on the console looks the same on server and client. If you know how I can split the packet up correctly, I would be very grateful.
You have some issues which may or may not be your problem.
(1) send/recv can return less than you requested. You may ask to receive 30 bytes but only get 10 from the recv call, so all of these calls have to be coded in loops, buffering data until you actually get the number of bytes you wanted (see the sketch after this list). Your first pair of programs was lucky to work in this regard, probably only because you tested on a limited amount of data. Once you start to push through more data, your assumptions about what you are reading (and comparing) will fail.
(2) There is no need to keep allocating char buffers in the loops; allocate them before the loop or just use a local buffer rather than the heap. What you are doing is inefficient and in the second program you have memory leaks because you don't delete them.
(3) You can get rid of the strcpy/strncpy statements and just use memmove()
Your specific problem is not jumping out at me, but maybe this will push you in the right direction. More information about what is being transmitted properly and exactly where in the data you are seeing problems would be helpful.
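As an illustration of point (1), a helper along these lines could be used before parsing a fixed-size frame (a sketch only; recv_all is a made-up name and error handling is minimal):
#include <sys/types.h>
#include <sys/socket.h>
#include <cstddef>

bool recv_all(int sockfd, char* buf, std::size_t len)
{
    std::size_t got = 0;
    while (got < len)
    {
        ssize_t n = recv(sockfd, buf + got, len - got, 0);
        if (n <= 0)          // 0 = connection closed, -1 = error
            return false;
        got += static_cast<std::size_t>(n);
    }
    return true;
}
// Usage in the client: recv_all(sockfd, received_data, buffer_size)
// guarantees the whole fixed-size packet is present before parsing control/hash/data.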
But this one inputs to file some ^# instead of real data. Displaying
"data" to console looks the same on server and client. If you know how
I can split it up, I would be very grateful.
You say that the data (I presume the complete file rather than the '^#') is the same on both client and server? If this is the case, then your issue is likely writing the data to file, rather than the actual transmission of the data itself.
If this is the case, you'll probably want to check your assumptions about how the program writes to the file - for example, are you writing text data or binary data? If you're writing binary data but handling it as a NUL-terminated string, chances are the write will stop early, treating a valid binary byte as the terminating NUL.
If it's text mode, you might want to consider initialising all strings with memset to a default character (other than NUL) to see whether garbage data is being output.
If both server and client display the '^#' (or whatever data), binary char data is incompatible with the strcpy/strcat functions, as these rely on NUL termination (whereas binary data has to be handled using explicit sizes instead).
I can't track down the specific problem, but maybe this might offer an insight or two that helps.
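For example, the sending loop could assemble the frame with explicit sizes instead of strcpy/strcat. This is a sketch of the loop body only; it reuses buffer_size, PAYLOAD, hash, data_to_send, concatenation, myFile and send from the question, needs <cstring>, and the gcount() call for the number of bytes actually read is an extra suggestion:
data_to_send = new char [buffer_size-PAYLOAD];
myFile.read(data_to_send, buffer_size-PAYLOAD);
std::size_t data_len = static_cast<std::size_t>(myFile.gcount()); // bytes actually read

concatenation = new char[buffer_size];
std::memcpy(concatenation,      "02", 2);                 // control code, no NUL terminator needed
std::memcpy(concatenation + 2,  hash, 32);                // 32-char hash
std::memcpy(concatenation + 34, data_to_send, data_len);  // raw file bytes, may contain NUL
send(concatenation, buffer_size);                         // fixed-size frame, as in the question

delete [] data_to_send;
delete [] concatenation;
The receiver would likewise copy the three fields out with std::memcpy at offsets 0, 2 and 34 instead of strncpy, and write only data_len payload bytes (which in a real protocol means carrying data_len in the header as well).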