I need help using RapidJSON - c++

I'm trying to parse a JSON file using RapidJSON that has thousands of objects like this one
"Amateur Auteur": {
"layout": "normal",
"name": "Amateur Auteur",
"manaCost": "{1}{W}",
"cmc": 2,
"colors": [
"White"
],
"type": "Creature — Human",
"types": [
"Creature"
],
"subtypes": [
"Human"
],
"text": "Sacrifice Amateur Auteur: Destroy target enchantment.",
"power": "2",
"toughness": "2",
"imageName": "amateur auteur",
"colorIdentity": [
"W"
]
},
I believe I have stored the JSON as a C string correctly I just can't really understand how to use the RapidJSON library to return the values I want from each object.
This is the code for storing the JSON as a C string and then parsing it in case I am doing something incorrect here.
std::ifstream input_file_stream;
input_file_stream.open("AllCards.json", std::ios::binary | std::ios::ate); //Open file in binary mode and seek to end of stream
if (input_file_stream.is_open())
{
std::streampos file_size = input_file_stream.tellg(); //Get position of stream (We do this to get file size since we are at end)
input_file_stream.seekg(0); //Seek back to beginning of stream to start reading
char * bytes = new char[file_size]; //Allocate array to store data in
if (bytes == nullptr)
{
std::cout << "Failed to allocate the char byte block of size: " << file_size << std::endl;
}
input_file_stream.read(bytes, file_size); //read the bytes
document.Parse(bytes);
input_file_stream.close(); //close file since we are done reading bytes
delete[] bytes; //Clean up where we allocated bytes to prevent memory leak
}
else
{
std::cout << "Unable to open file for reading.";
}

Your post seems to ask multiple questions. Let's start from the beginning.
I believe I have stored the JSON as a C string correctly I just can't
really understand how to use the RapidJSON library to return the
values I want from each object.
This is a big no-no in software engineering. Never believe or assume; it will come back to haunt you on release day. Instead, validate your assumption. Here are a few steps, from the easiest to the more involved.
Print out your C-String.
The easiest way to confirm the content of a variable, especially string data, is to print it to the screen. Nothing is easier than seeing your JSON data on the screen to confirm you have read it in correctly.
std::cout.write(bytes, file_size);
Breakpoint / Debugger
If you have some reason not to print your variable, compile your code with debugging enabled and load it in GDB if you're using g++, LLDB if you're using clang++, or simply place a breakpoint if you're using Visual Studio or VS Code. Once at the breakpoint, you can inspect the contents of your variable.
However, before we move on, I wouldn't be helping you if I didn't point out that reading files in C++ can be much easier than the way you're doing it.
// Open File
std::ifstream in("AllCards.json", std::ios::binary);
if(!in)
throw std::runtime_error("Failed to open file.");
// dont skip on whitespace
std::noskipws(in);
// Read in content
std::istreambuf_iterator<char> head(in);
std::istreambuf_iterator<char> tail;
std::string data(head, tail);
At the end of the above code, the entire content of your file has been read into a std::string, which manages a null-terminated buffer (a C string). You can access that data by calling .c_str() on your string instance. Done this way, you no longer have to worry about calling new or delete[], as the std::string class takes care of the buffer for you. Just make sure the string stays alive as long as you're using it with RapidJSON.
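The file-reading steps above can be wrapped in one function. This is a minimal sketch; the name readWholeFile is made up for illustration. One reason it matters: a buffer filled with read() does not get a terminating '\0' added for you, whereas c_str() on a std::string is always null-terminated, which is what a parser that takes a plain const char* relies on.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the whole file as a std::string.
std::string readWholeFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("Failed to open file: " + path);
    }
    // istreambuf_iterator reads raw characters, so whitespace is preserved.
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

The returned string owns its buffer, so there is nothing to delete[] afterwards.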
This is the code for storing the JSON as a C string and then parsing
it in case I am doing something incorrect here.
No. According to the RapidJSON documentation, you create a Document object and have it parse the string.
Document d;
d.Parse(data.c_str());
However, that just creates the element for querying the document. You can ask the document whether a specific member exists (d.HasMember("...")), ask for the content of a string-typed member (d["name"].GetString()), or use anything else listed in the documentation. You can read the tutorial here.
By the way, welcome to SO. Next time you post, I would suggest asking a more targeted question. What exactly are you trying to do with the parsed JSON element?
I just can't really understand how to use the RapidJSON library to
return the values I want from each object.
I cannot answer this question as asked. What are you trying to extract? What have you tried? Have you read the documentation and not understood a specific item?
Here is a good place to read up on asking better questions. Please don't think I am coming down on you. I bring this up because asking better questions will get you better and more specific answers. Poorly asked questions always run the risk of being ignored or, dare I say it, of getting the good old "does Google not work today?" response.
https://stackoverflow.com/help/how-to-ask
** Updates **
To your question. You can iterate over all objects.
using namespace rapidjson;
Document d;
d.Parse(data.c_str());
for (auto itr = d.MemberBegin(); itr != d.MemberEnd(); ++itr){
std::cout << itr->name.GetString() << '\n';
}

Related

Pass Binary string/file content from c++ to node js

I'm trying to pass the content of a binary file from C++ to Node using node-gyp. I have a process that creates a binary file in the .fit format, and I need to pass the content of that file to JS to process it. My first approach was to read the content of the file into a string and try to pass it to Node like this.
char c;
std::string content="";
while (file.get(c)){
content+=c;
}
I'm using the following code to pass it to Node
v8::Local<v8::ArrayBuffer> ab = v8::ArrayBuffer::New(args.GetIsolate(), (void*)content.data(), content.size());
args.GetReturnValue().Set(ab);
In Node I get an ArrayBuffer, but when I write its content to a file it differs from what the C++ side prints with cout.
How can I pass the binary data successfully?
Thanks.
Probably the best approach is to write your data to a binary disk file. Write to disk in C++; read from disk in NodeJS.
Very importantly, make sure you specify BINARY MODE.
For example:
myFile.open ("data2.bin", ios::out | ios::binary);
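As a rough illustration of the binary-mode advice, here is a minimal sketch of the C++ side: it writes raw bytes with std::ios::binary and reads them back for comparison. The helper names and the use of std::vector<std::uint8_t> are assumptions made for this example, not part of the original code.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: writes raw bytes to `path` without any text translation.
bool writeBinaryFile(const std::string& path, const std::vector<std::uint8_t>& bytes) {
    std::ofstream out(path, std::ios::out | std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(bytes.data()),
              static_cast<std::streamsize>(bytes.size()));
    out.flush();  // surface write errors before reporting success
    return static_cast<bool>(out);
}

// Hypothetical helper: reads the file back as raw bytes.
std::vector<std::uint8_t> readBinaryFile(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}
```

Bytes such as 0x00, 0x0A and 0x0D are the usual casualties of text mode or C-string handling, so a round trip through these two functions is a quick way to check that nothing is being translated or truncated.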
Do not use "strings" (at least not unless you want to uuencode). Use buffers. Here is a good example:
How to read binary files byte by byte in Node.js
var fs = require('fs');
fs.open('file.txt', 'r', function(status, fd) {
if (status) {
console.log(status.message);
return;
}
var buffer = Buffer.alloc(100); // new Buffer(size) is deprecated
fs.read(fd, buffer, 0, 100, 0, function(err, num) {
...
});
});
You might also find these links helpful:
https://nodejs.org/api/buffer.html
<= Has good examples for specific Node APIs
http://blog.paracode.com/2013/04/24/parsing-binary-data-with-node-dot-js/
<= Good discussion of some of the issues you might face, including "endianness" and "interpreting numbers"
ADDENDUM:
The OP clarified that he's considering using C++ as a NodeJS add-on (not a standalone C++ program).
Consequently, using buffers is definitely an option. Here is a good tutorial:
https://community.risingstack.com/using-buffers-node-js-c-plus-plus/
If you choose to go this route, I would DEFINITELY download the example code and play with it first, before implementing buffers in your own application.
It depends, but you could, for example, use Redis:
Values can be strings (including binary data) of every kind, for
instance you can store a jpeg image inside a value. A value can't be
bigger than 512 MB.
If the file is bigger than 512 MB, you can store it in chunks.
However, I wouldn't recommend that, since Redis is an in-memory data store.
It's easy to use from both C++ and Node.js.

C++ Read specific parts of a file with start and endpoint

I am serializing multiple objects and want to save the given Strings to a file. The structure is the following:
A few string and long attributes, then a variable number of maps<long, map<string, variant> >. My first idea was to create one valid JSON file, but this is very hard to do (all of the maps are very big and I don't have enough memory to hold them at once). Since I can't serialize everything together, I have to do it piece by piece. I am planning on doing that, and I then want to save the resulting strings to a file. Here is what it will look like:
{ "Name": "StackOverflow"}
{"map1": //map here}
{"map2": //map here}
As you can see, this is not one valid JSON object but three valid JSON objects in one file. Now I want to deserialize, and I need to give a valid JSON object to the deserializer. I already save tellp() every time I write a new JSON object to the file, so in this example I would have the following addresses saved: 26, endofmap1, endofmap2.
Here is what I want to do: I want to use these addresses to extract the strings from the file I wrote to. I need one string from 0 to (26-1), one string from 26 to (endofmap1-1), and one string from endofmap1 to (endofmap2-1). Since these strings would be valid JSON objects, I could deserialize them without a problem.
How can I do this?
I would create a serialize and deserialize class that you can use as part of a hierarchy.
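So for instance, in rough C++ pseudo-code:
#include <fstream>
#include <map>
#include <string>
// Common interface for everything that can write itself to a stream.
class Serializable {
public:
    virtual ~Serializable() = default;
    virtual bool serialize(std::fstream& fs) const = 0;
    // same for deserialize
};
class Compound : public Serializable {
public:
    std::map<long, std::string> things; // placeholder element types
    bool serialize(std::fstream& fs) const override {
        for (const auto& thing : things) {
            fs << thing.first << ' ' << thing.second << '\n';
        }
        fs.flush();
        return fs.good();
    }
};
class Object : public Serializable {
public:
    int a;
    float b;
    Compound c;
    bool serialize(std::fstream& fs) const override {
        fs << a << ' ';
        fs << b << ' ';
        c.serialize(fs); // each member writes itself to the stream
        fs.flush();
        return fs.good();
    }
};
With this you can use JSON, as the file will be written as you walk the hierarchy.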
Update:
To extract a specific string from a file you can use something like this:
// pass in an open stream (streams are good for unit testing!)
std::string extractString(std::fstream& fs) {
    std::streamoff location = /* the location of the start from file */;
    std::size_t length = /* length of the string you want to extract */;
    std::string str;
    str.resize(length);
    fs.seekg(location); // seekg moves the read position (seekp is for writing)
    fs.read(&str[0], static_cast<std::streamsize>(length)); // read into the string's buffer
    return str;
}
Based on you saying "my temporary memory is not big enough", I'm going to assume two possibilities (though some kind of code example may help us help you!).
Possibility one: the file is too big
The issue you would be facing here isn't a new one - a file too large for memory, assuming your algorithm isn't buffering all the data, and your stack can handle the recursion of course.
On Windows you can use the MapViewOfFile function; MSDN has plenty of detail on it. This function effectively grabs a "view" of a section of a file, allowing you to load just enough of the file to modify what you need before closing it and opening a view at a later offset.
If you are on a different platform, there will be similar functions.
Possibility two: you are doing too much at once
The other option is more of a "software engineering" issue. You have so much data that, when holding it all in your std::maps, you run out of heap memory.
If this is the case, you are going to need to use some clever thinking. Here are some ideas!
Don't load all your data into the maps. Wherever the data is coming from, take a CRC, index, or filename of the data source. Store that information in the map, and leave the actual "big strings" on the hard disk. This way you can load each item of data when you need it.
This works really well for data that needs to be sorted, or correlated.
Process or load your data when you need to write it. If you don't need to sort or correlate the data, why load it into a map beforehand at all? Just load each "big string" of data in sequence, then write them to the file with an ofstream.

C++ read text line-by-line, speed/efficiency savings needed

I have a series of large text files (10s - 100s of thousands of lines) that I want to parse line-by-line. The idea is to check if the line has a specific word/character/phrase and to, for now, record to a secondary file if it does.
The code I've used so far is:
ifstream infile1("c:/test/test.txt");
while (getline(infile1, line)) {
if (line.empty()) continue;
if (line.find("mystring") != std::string::npos) {
outfile1 << line << '\n';
}
}
The end goal is to be writing those lines to a database. My thinking was to write them to the file first and then to import the file.
The problem I'm facing is the time taken to complete the task. I'm looking to minimize the time as far as possible, so any suggestions as to time savings on the read/write scenario above would be most welcome. Apologies if anything is obvious, I've only just started moving into C++.
Thanks
EDIT
I should say that I'm using VS2015
EDIT 2
So this was my own dumb fault, when switching to Release and changing the architecture type I had noticeable speed increases. Thanks to everyone for pointing me in that direction. I'm also looking at the mmap stuff and that's proving useful too. Thanks guys!
When you use ifstream to read and process really big files, it can help to increase the default buffer size, which depends on the implementation and can be as small as 512 bytes.
The best buffer size depends on your needs, but as a hint you can use the partition block size of the file(s) you're reading/writing. To find that information you can use a variety of tools or even code.
Example in Windows:
fsutil fsinfo ntfsinfo c:
Now create a new buffer for the ifstream like this:
size_t newBufferSize = 4 * 1024; // 4K
char * newBuffer = new char[newBufferSize];
std::string line;
ifstream infile1;
infile1.rdbuf()->pubsetbuf(newBuffer, newBufferSize); // must happen before open()
infile1.open("c:/test/test.txt");
while (getline(infile1, line)) {
/* ... */
}
infile1.close(); // stop using the buffer before freeing it
delete[] newBuffer; // memory from new[] must be released with delete[]
Do the same with the output stream, and don't forget to set the new buffer before opening the file, or it may not take effect.
You can experiment with different values to find the best size for your case.
You should notice the difference.
C-style I/O functions are often faster than fstream.
You can use fgets/fputs to read and write each line of text.
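A minimal sketch of the fgets/fputs approach for the filtering task in the question follows. The function name and the 4096-byte line buffer are assumptions; a line longer than the buffer would arrive in several chunks, so a match that straddles a chunk boundary would be missed. Whether this actually beats ifstream depends on the standard library and build settings, so it is worth timing both in a Release build.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper: copies every line containing `needle` from `in` to `out`
// and returns how many lines were copied.
// Assumption: no line is longer than the buffer below.
int copyMatchingLines(std::FILE* in, std::FILE* out, const char* needle) {
    char line[4096];
    int copied = 0;
    while (std::fgets(line, sizeof line, in) != nullptr) {
        if (std::strstr(line, needle) != nullptr) {
            std::fputs(line, out);  // fgets keeps the trailing '\n', so none is added
            ++copied;
        }
    }
    return copied;
}
```

The two FILE* arguments would come from std::fopen on the input and output paths, and both should be closed with std::fclose when done.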

Writing to a .csv file with C++?

TL;DR I am trying to take a stream of data and make it write to a .csv file. Everything is worked out except the writing part, which I think is simply due to me not referencing the .csv file correctly. But I'm a newbie to this stuff, and can't figure out how to correctly reference it, so I need help.
Hello, and a big thank you in advance to anyone that can help me out with this! Some advance info, my IDE is Xcode, using C++, and I'm using the Myo armband from Thalmic Labs as a device to collect data. There is a program (link for those interested enough to look at it) that is supposed to stream the EMG, accelerometer, gyroscope, and orientation values into a .csv file. I am so close to getting the app to work, but my lack of programming experience has finally caught up to me, and I am stuck on something rather simple. I know that the app can stream the data, as I have been able to make it print the EMG values in the debugging area. I can also get the app to open a .csv file, using this code:
const char *path= "/Users/username/folder/filename";
std::ofstream file(path);
std::string data("data to write to file");
file << data;
But no data ends up being streamed/printed into that file after I end the program. The only thing that I can think might be causing this is that the print function is not correctly referencing this file pathway. I would assume that to be a straightforward thing, but like I said, I am inexperienced, and do not know exactly how to address this. I am not sure what other information is necessary, so I'll just provide everything that I imagine might be helpful.
This is the function structure that is supposed to open the files: (Note: The app is intended to open the file in the same directory as itself)
void openFiles() {
time_t timestamp = std::time(0);
// Open file for EMG log
if (emgFile.is_open())
{
emgFile.close();
}
std::ostringstream emgFileString;
emgFileString << "emg-" << timestamp << ".csv";
emgFile.open(emgFileString.str(), std::ios::out);
emgFile << "timestamp,emg1,emg2,emg3,emg4,emg5,emg6,emg7,emg8" << std::endl;
This is the helper to print accelerometer and gyroscope data (There doesn't appear to be anything like this to print EMG data, but I know it does, so... Watevs):
void printVector(std::ofstream &path, uint64_t timestamp, const myo::Vector3< float > &vector)
{
path << timestamp
<< ',' << vector.x()
<< ',' << vector.y()
<< ',' << vector.z()
<< std::endl;
}
And this is the function structure that utilizes the helper:
void onAccelerometerData(myo::Myo *myo, uint64_t timestamp, const myo::Vector3< float > &accel)
{
printVector(accelerometerFile, timestamp, accel);
}
I spoke with a staff member at Thalmic Labs (the guy who made the app actually) and he said it sounded like, unless the app was just totally broken, I was potentially just having problems with the permissions on my computer. There are multiple users on this computer, so that may very well be the case, though I certainly hope not, and I'd still like to try and figure it out one more time before throwing in the towel. Again, thanks to anyone who can be of assistance! :)
My imagination is failing me. Have you tried writing to or reading from ostringstream or istringstream objects? That might be informative. Here's a line that's correct:
std::ofstream outputFile( strOutputFilename.c_str(), std::ios::app );
Note that C++ doesn't have any native support for the .csv format, so you may have to do those conversions yourself. :( Things may also work better if you replace the "/"s with doubled "//"s ...

C++ reading/writing objects with binary files

I have gone through hours of time trying to fix the issue of binary file manipulation.
The task is to read and write BookStoreBook objects to/from a binary file
The BookStoreBook class contains the following member variables:
string isbn;
string title;
Author author;
string publisher;
Date dateAdded;
int quantityOnHand;
double wholesaleCost;
double retailPrice;
The code for reading books is as shown:
fstream file("inventory.txt", ios::binary | ios::in | ios::out);
vector<BookStoreBook> books;
BookStoreBook *book = (BookStoreBook *)new char[sizeof(BookStoreBook)];
file.read((char*)book, sizeof(BookStoreBook));
while (!file.eof())
{
books.push_back(*book);
file.read((char*)book, sizeof(BookStoreBook));
}
The code for writing books is as shown:
vector<BookStoreBook> writeBooks = library.getBooks(); //library contains books
file.close();
file.open("inventory.txt", ios::out | ios::binary);
for(int i = 0; i < writeBooks.size(); i++)
{
BookStoreBook *book = (BookStoreBook *)new char[sizeof(BookStoreBook)];
book = &writeBooks[i];
file.write((char*)book, sizeof(BookStoreBook));
file.clear();
}
file.clear();
file.close();
I don't want to convert any string to a c_str(), as this is prohibited in the assignment requirements.
Some notes:
As soon as I run the program, it tries to read books from the file, and
that is when I get a Windows error window. Later, when I debug, I get the following message:
Unhandled exception at 0x56b3caa4 (msvcr100d.dll) in FinalProject.exe: 0xC0000005: Access violation reading location 0x0084ef10
The funny thing is, sometimes the program runs perfectly fine, and
sometimes it crashes when it first reads the books from the file.
However, whenever the program has successfully read some contents, and
I don't modify the books, and then reopen the program, the program keeps
running perfectly.
Nothing seems to work. Please help!
Your problem here is that certain parts of your BookStoreBook class contain pointers, even though they are not visible. std::string for example has a pointer to the memory location where the contents of the string are kept.
It is virtually always considered bad practice to write C++ data structures to disk exactly as they appear in memory. Doing this does not account for the different endianness of different machines or for differing word widths (int or long may differ in size between 32-bit and 64-bit machines), and you run into all the pointer trouble.
You should push each of the fields of your BookStoreBook to the output stream, along the lines of
file << book.isbn << ' ';
file << book.title << ' ';
...
Note that the above is very bad practice, as the decoding gets horribly difficult. I suggest you use Boost.Serialization for this, or write your own methods that can read/write key-value-pairs from a file, or you might want to look into jsoncpp or tinyxml2. This whole topic can get quite convoluted, so sticking with Boost is a good idea, even if just to figure out how to solve the issue yourself (assuming this is a homework assignment).