Rapidjson returning reference to Document Value

Rapidjson returning reference to Document Value - c++

I'm having some trouble with the following method and I need some help trying to figure out what I am doing wrong.
I want to return a reference to a Value in a document. I am passing the Document from outside the function so that when I read a json file into it I don't "lose it".
const rapidjson::Value& CTestManager::GetOperations(rapidjson::Document& document)
{
const Value Null(kObjectType);
if (m_Tests.empty())
return Null;
if (m_current > m_Tests.size() - 1)
return Null;
Test& the_test = m_Tests[m_current];
CMyFile fp(the_test.file.c_str()); // non-Windows use "r"
if (!fp.is_open())
return Null;
u32 operations_count = 0;
CFileBuffer json(fp);
FileReadStream is(fp.native_handle(), json, json.size());
if (document.ParseInsitu<kParseCommentsFlag>(json).HasParseError())
{
(...)
}
else
{
if (!document.IsObject())
{
(...)
}
else
{
auto tests = document.FindMember("td_tests");
if (tests != document.MemberEnd())
{
for (SizeType i = 0; i < tests->value.Size(); i++)
{
const Value& test = tests->value[i];
if (test["id"].GetInt() == the_test.id)
{
auto it = test.FindMember("operations");
if (it != test.MemberEnd())
{
//return it->value; is this legitimate?
return test["operations"];
}
return Null;
}
}
}
}
}
return Null;
}
Which I am calling like this:
Document document;
auto operations = TestManager().GetOperations(document);
When I inspect the value of test["operations"] inside the function I can see everything I would expect (debug code removed from the abode code).
When I inspect the returned value outside the function I can see that it's an array (which I expect). the member count int the array is correct as well, but when print it out, I only see garbage instead.
When I "print" the Value to a string inside the methods, I get what I expect (i.e. a well formated json), but when I do it outside all keys show up as "IIIIIIII" and values that aren't strings show up correctly.
rapidjson::StringBuffer strbuf2;
rapidjson::PrettyWriter<rapidjson::StringBuffer> writer2(strbuf2);
ops->Accept(writer2);
As this didn't work I decided to change the method to receive a Value as a parameter and do a deep copy into it like this
u32 CTestManager::GetOperationsEx(rapidjson::Document& document, rapidjson::Value& operations)
{
(...)
if (document.ParseInsitu<kParseCommentsFlag>(json).HasParseError())
{
(...)
}
else
{
if (!document.IsObject())
{
(...)
}
else
{
auto tests = document.FindMember("tests");
if (tests != document.MemberEnd())
{
for (SizeType i = 0; i < tests->value.Size(); i++)
{
const Value& test = tests->value[i];
if (test["id"].GetInt() == the_test.id)
{
const Value& opv = test["operations"];
Document::AllocatorType& allocator = document.GetAllocator();
operations.CopyFrom(opv, allocator); //would Swap work?
return operations.Size();
}
}
}
}
}
return 0;
}
Which I'm calling like this:
Document document;
Value operations(kObjectType);
u32 count = TestManager().GetOperationsEx(document, operations);
But... I get same thing!!!!
I know that it's going to be something silly but I can't put my hands on it!
Any ideas?

The problem in this case lies with the use of ParseInSitu. When any of the GetOperations exist the CFileBuffer loses scope and is cleaned up. Because the json is being parsed in-situ when the buffer to the file goes, so goes the data.

Related

using a bytes field as proxy for arbitrary messages

Hello nano developers,
I'd like to realize the following proto:
message container {
enum MessageType {
TYPE_UNKNOWN = 0;
evt_resultStatus = 1;
}
required MessageType mt = 1;
optional bytes cmd_evt_transfer = 2;
}
message evt_resultStatus {
required int32 operationMode = 1;
}
...
The dots denote, there are more messages with (multiple) primitive containing datatypes to come. The enum will grow likewise, just wanted to keep it short.
The container gets generated as:
typedef struct _container {
container_MessageType mt;
pb_callback_t cmd_evt_transfer;
} container;
evt_resultStatus is:
typedef struct _evt_resultStatus {
int32_t operationMode;
} evt_resultStatus;
The field cmd_evt_transfer should act as a proxy of subsequent messages like evt_resultStatus holding primitive datatypes.
evt_resultStatus shall be encoded into bytes and be placed into the cmd_evt_transfer field.
Then the container shall get encoded and the encoding result will be used for subsequent transfers.
The background why to do so, is to shorten the proto definition and avoid the oneof thing. Unfortunately syntax version 3 is not fully supported, so we can not make use of any fields.
The first question is: will this approach be possible?
What I've got so far is the encoding including the callback which seems to behave fine. But on the other side, decoding somehow skips the callback. I've read issues here, that this happened also when using oneof and bytes fields.
Can someone please clarify on how to proceed with this?
Sample code so far I got:
bool encode_msg_test(pb_byte_t* buffer, int32_t sval, size_t* sz, char* err) {
evt_resultStatus rs = evt_resultStatus_init_zero;
rs.operationMode = sval;
pb_ostream_t stream = pb_ostream_from_buffer(buffer, sizeof(buffer));
/*encode container*/
container msg = container_init_zero;
msg.mt = container_MessageType_evt_resultStatus;
msg.cmd_evt_transfer.arg = &rs;
msg.cmd_evt_transfer.funcs.encode = encode_cb;
if(! pb_encode(&stream, container_fields, &msg)) {
const char* local_err = PB_GET_ERROR(&stream);
sprintf(err, "pb_encode error: %s", local_err);
return false;
}
*sz = stream.bytes_written;
return true;
}
bool encode_cb(pb_ostream_t *stream, const pb_field_t *field, void * const *arg) {
evt_resultStatus* rs = (evt_resultStatus*)(*arg);
//with the below in place a stream full error rises
// if (! pb_encode_tag_for_field(stream, field)) {
// return false;
// }
if(! pb_encode(stream, evt_resultStatus_fields, rs)) {
return false;
}
return true;
}
//buffer holds previously encoded data
bool decode_msg_test(pb_byte_t* buffer, int32_t* sval, size_t msg_len, char* err) {
container msg = container_init_zero;
evt_resultStatus res = evt_resultStatus_init_zero;
msg.cmd_evt_transfer.arg = &res;
msg.cmd_evt_transfer.funcs.decode = decode_cb;
pb_istream_t stream = pb_istream_from_buffer(buffer, msg_len);
if(! pb_decode(&stream, container_fields, &msg)) {
const char* local_err = PB_GET_ERROR(&stream);
sprintf(err, "pb_encode error: %s", local_err);
return false;
}
*sval = res.operationMode;
return true;
}
bool decode_cb(pb_istream_t *istream, const pb_field_t *field, void **arg) {
evt_resultStatus * rs = (evt_resultStatus*)(*arg);
if(! pb_decode(istream, evt_resultStatus_fields, rs)) {
return false;
}
return true;
}
I feel, I don't have a proper understanding of the encoding / decoding process.
Is it correct to assume:
the first call of pb_encode (in encode_msg_test) takes care of the mt field
the second call of pb_encode (in encode_cb) handles the cmd_evt_transfer field
If I do:
bool encode_cb(pb_ostream_t *stream, const pb_field_t *field, void * const *arg) {
evt_resultStatus* rs = (evt_resultStatus*)(*arg);
if (! pb_encode_tag_for_field(stream, field)) {
return false;
}
if(! pb_encode(stream, evt_resultStatus_fields, rs)) {
return false;
}
return true;
}
then I get a stream full error on the call of pb_encode.
Why is that?

Yes, the approach is reasonable. Nanopb callbacks do not care what the actual data read or written by the callback is.
As for why your decode callback is not working, you'll need to post the code you are using for decoding.
(As an aside, Any type does work in nanopb and is covered by this test case. But the type_url included in all Any messages makes them have a quite large overhead.)

Copy only necessary objects from PDF file

I've got a huge PDF file with more than 100 pages and I want to separate them to single PDF files (containing only one page each). Problem is, that PoDoFo does not copy just the page, but the whole document because of the references (and so each of the 100 PDF files have same size as the 100-page PDF). A relevant mailing list post can be found, unfortunately there is no solution provided.
In source code of function InsertPages there is explanation:
This function works a bit different than one might expect.
Rather than copying one page at a time - we copy the ENTIRE document
and then delete the pages we aren't interested in.
We do this because
1) SIGNIFICANTLY simplifies the process
2) Guarantees that shared objects aren't copied multiple times
3) offers MUCH faster performance for the common cases
HOWEVER: because PoDoFo doesn't currently do any sort of "object
garbage collection" during a Write() - we will end up with larger
documents, since the data from unused pages will also be in there.
I have tried few methods to copy only relevant objects, but each of them failed.
Copy all pages and remove irrelevant ones
Use XObject wrapping: FillXObjectFromDocumentPage and FillXObjectFromExistingPage
Copy object by object
Use RenumberObjects with bDoGarbageCollection = true
but none of them worked out. Does anybody have an idea or working solution for this problem?

The only solution is to use another PDF library. Or wait for garbage collection to be implemented.
The problem is stated in the quote you mentioned:
> during a Write() - we will end up with larger documents, since the
> data from unused pages will also be in there.
This means podofo always puts the entire PDF content in your file, no matter what. The whole PDF is there, you just don't see parts of it.

Dennis from the podofo support sent me an working example of optimized version of InsertPages function which is actually fixing page references and decreases document size significantly!
void PdfMemDocument::InsertPages2(const PdfMemDocument & rDoc, std::vector<int> pageNumbers)
{
std::unordered_set<PdfObject*> totalSet;
std::vector<pdf_objnum> oldObjNumPages;
std::unordered_map<pdf_objnum, pdf_objnum> oldObjNumToNewObjNum;
std::vector<PdfObject*> newPageObjects;
// Collect all dependencies from all pages that are to be copied
for (int i = 0; i < pageNumbers.size(); ++i) {
PdfPage* page = rDoc.GetPage(pageNumbers[i]);
if (page) {
oldObjNumPages.push_back(page->GetObject()->Reference().ObjectNumber());
std::unordered_set<PdfObject*> *set = page->GetPageDependencies();
totalSet.insert(set->begin(), set->end());
delete set;
}
}
// Create a new page object for every copied page from the old document
// Copy all objects the pages depend on to the new document
for (auto it = totalSet.begin(); it != totalSet.end(); ++it) {
unsigned int length = static_cast<unsigned int>(GetObjects().GetSize() + GetObjects().GetFreeObjects().size());
PdfReference ref(static_cast<unsigned int>(length+1), 0);
PdfObject* pObj = new PdfObject(ref, *(*it));
pObj->SetOwner(&(GetObjects()));
if ((*it)->HasStream()) {
PdfStream *stream = (*it)->GetStream();
pdf_long length;
char* buf;
stream->GetCopy(&buf, &length);
PdfMemoryInputStream inputStream(buf, length);
pObj->GetStream()->SetRawData(&inputStream, length);
free(buf);
}
oldObjNumToNewObjNum.insert(std::pair<pdf_objnum, pdf_objnum>((*it)->Reference().ObjectNumber(), length+1));
GetObjects().push_back(pObj);
newPageObjects.push_back(pObj);
}
// In all copied objects, fix the object numbers so they are valid in the new document
for (auto it = newPageObjects.begin(); it != newPageObjects.end(); ++it) {
FixPageReferences(GetObjects(), *it, oldObjNumToNewObjNum);
}
// Insert the copied pages into the pages tree
for (auto it = oldObjNumPages.begin(); it != oldObjNumPages.end(); ++it) {
PdfObject* pageObject = GetObjects().GetObject(PdfReference(oldObjNumToNewObjNum[(*it)], 0));
PdfPage *page = new PdfPage(pageObject, std::deque<PdfObject*>());
GetPagesTree()->InsertPage(GetPageCount() - 1, page);
}
}
std::unordered_set<PdfObject *>* PdfPage::GetPageDependencies() const
{
std::unordered_set<PdfObject *> *set = new std::unordered_set<PdfObject *>();
const PdfObject* pageObj = GetObject();
if (pageObj) {
PdfVecObjects* objects = pageObj->GetOwner();
if (objects) {
set->insert((PdfObject*)pageObj);
objects->GetObjectDependencies2(pageObj, *set);
}
}
return set;
}
// Optimized version of PdfVecObjects::GetObjectDependencies
void PdfVecObjects::GetObjectDependencies2(const PdfObject* pObj, std::unordered_set<PdfObject*> &refMap) const
{
// Check objects referenced from this object
if (pObj->IsReference())
{
PdfObject* referencedObject = GetObject(pObj->GetReference());
if (referencedObject != NULL && refMap.count(referencedObject) < 1) {
(refMap).insert((PdfObject *)referencedObject); // Insert referenced object
GetObjectDependencies2((const PdfObject*)referencedObject, refMap);
}
}
else {
// Recursion
if (pObj->IsArray())
{
PdfArray::const_iterator itArray = pObj->GetArray().begin();
while (itArray != pObj->GetArray().end())
{
GetObjectDependencies2(&(*itArray), refMap);
++itArray;
}
}
else if (pObj->IsDictionary())
{
TCIKeyMap itKeys = pObj->GetDictionary().GetKeys().begin();
while (itKeys != pObj->GetDictionary().GetKeys().end())
{
if ((*itKeys).first != PdfName("Parent")) {
GetObjectDependencies2((*itKeys).second, refMap);
}
++itKeys;
}
}
}
}
void FixPageReferences(PdfVecObjects& objects, PdfObject* pObject, std::unordered_map<pdf_objnum, pdf_objnum>& oldNumToNewNum) {
if( !pObject)
{
PODOFO_RAISE_ERROR( ePdfError_InvalidHandle );
}
if( pObject->IsDictionary() )
{
TKeyMap::iterator it = pObject->GetDictionary().GetKeys().begin();
while( it != pObject->GetDictionary().GetKeys().end() )
{
if ((*it).first != PdfName("Parent")) {
FixPageReferences(objects, (*it).second, oldNumToNewNum);
}
++it;
}
}
else if( pObject->IsArray() )
{
PdfArray::iterator it = pObject->GetArray().begin();
while( it != pObject->GetArray().end() )
{
FixPageReferences(objects, &(*it), oldNumToNewNum),
++it;
}
}
else if( pObject->IsReference() )
{
//PdfObject* referencedObj = objects.GetObject(pObject->GetReference());
pdf_objnum oldnum = pObject->GetReference().ObjectNumber();
pdf_objnum newnum = oldNumToNewNum[oldnum];
if (!newnum) throw new std::exception("No new object number for old object number");
*pObject = PdfReference(newnum, 0);
}
}

is there a better way to make this software flow

I have several functions that try and evaluate some data. Each function returns a 1 if it can successfully evaluate the data or 0 if it can not. The functions are called one after the other but execution should stop if one returns a value of 1.
Example functions look like so:
int function1(std::string &data)
{
// do something
if (success)
{
return 1;
}
return 0;
}
int function2(std::string &data)
{
// do something
if (success)
{
return 1;
}
return 0;
}
... more functions ...
How would be the clearest way to organise this flow? I know I can use if statements as such:
void doSomething(void)
{
if (function1(data))
{
return;
}
if (function2(data))
{
return;
}
... more if's ...
}
But this seems long winded and has a huge number of if's that need typing. Another choice I thought of is to call the next function from the return 0 of the function like so
int function1(std::string &data)
{
// do something
if (success)
{
return 1;
}
return function2(data);
}
int function2(std::string &data)
{
// do something
if (success)
{
return 1;
}
return function3(data);
}
... more functions ...
Making calling cleaner because you only need to call function1() to evaluate as far as you need to but seems to make the code harder to maintain. If another check need to be inserted into the middle of the flow, or the order of the calls changes, then all of the functions after the new one will need to be changed to account for it.
Am I missing some smart clear c++ way of achieving this kind of program flow or is one of these methods best. I am leaning towards the if method at the moment but I feel like I am missing something.

void doSomething() {
function1(data) || function2(data) /* || ... more function calls ... */;
}
Logical-or || operator happens to have the properties you need - evaluated left to right and stops as soon as one operand is true.

I think you can make a vector of lambdas where each lambdas contains specific process on how you evaluate your data. Something like this.
std::vector<std::function<bool(std::string&)> listCheckers;
listCheckers.push_back([](std::string& p_data) -> bool { return function1(p_data); });
listCheckers.push_back([](std::string& p_data) -> bool { return function2(p_data); });
listCheckers.push_back([](std::string& p_data) -> bool { return function3(p_data); });
//...and so on...
//-----------------------------
std::string theData = "Hello I'm a Data";
//evaluate all data
bool bSuccess = false;
for(fnChecker : listCheckers){
if(fnChecker(theData)) {
bSuccess = true;
break;
}
}
if(bSuccess ) { cout << "A function has evaluated the data successfully." << endl; }
You can modify the list however you like at runtime by: external objects, config settings from file, etc...

C++ Create std::list in function and return through arguments

How to correct return created std::list through function argument? Now, I try so:
bool DatabaseHandler::tags(std::list<Tag> *tags)
{
QString sql = "SELECT * FROM " + Tag::TABLE_NAME + ";";
QSqlQueryModel model;
model.setQuery(sql);
if(model.lastError().type() != QSqlError::NoError) {
log(sql);
tags = NULL;
return false;
}
const int count = model.rowCount();
if(count > 0)
tags = new std::list<Tag>(count);
else
tags = new std::list<Tag>();
//some code
return true;
}
After I can use it:
std::list<Tag> tags;
mDB->tags(&tags);
Now, I fix my function:
bool DatabaseHandler::tags(std::list<Tag> **tags)
{
QString sql = "SELECT * FROM " + Tag::TABLE_NAME + ";";
QSqlQueryModel model;
model.setQuery(sql);
if(model.lastError().type() != QSqlError::NoError) {
log(sql);
*tags = NULL;
return false;
}
const int count = model.rowCount();
if(count > 0)
*tags = new std::list<Tag>(count);
else
*tags = new std::list<Tag>();
for(int i = 0; i < count; ++i) {
auto record = model.record(i);
Tag tag(record.value(Table::KEY_ID).toInt());
(*tags)->push_back(tag);
}
return true;
}
It works but list return size 4 although loop executes only 2 iterations and empty child objects (if I just called their default constructor). The Tag class hasn't copy constructor.

Since you passed an already instantiated list as a pointer to the function, there is no need to create another list.
In that sense, you question is pretty unclear. I'd suggest you read up a bit on pointers, references and function calls in general.
http://www.cplusplus.com/doc/tutorial/pointers/
http://www.cplusplus.com/doc/tutorial/functions/
UPDATE: I still strongly suggest you read up on the mentioned topics, since you don't know these fundamental points.
Anyway, this is what you probably want to do (event though I would suggest using references, here is the solution with pointers):
bool someFunc(std::list<Tag> **tags) {
// by default null the output argument
*tags = nullptr;
if (error) {
return false;
}
// dereference tags and assign it the address to a new instance of list<Tag>
*tags = new std::list<Tag>();
return true
}
std::list<Tag> *yourList;
if (someFunc(&yourList)) {
// then yourList is valid
} else {
// then you had an error and yourList == nullptr
}
However, this is not idiomatic C++. Please read a modern book or tutorial.

Use a reference.
bool DatabaseHandler::tags(std::list<Tag>& tags);
std::list<Tag> tags;
mDB->tags(tags);
You'll have to change all the -> to ., of course. Every operation done on the reference in the function will be done to the original tags list it was called with.
EDIT: If you want to create the list inside the function and return it, you have a couple options. The closest, I think, is to just return a list pointer, and return nullptr if the function fails.
//beware, pseudocode ahead
std::list<Tag>* DatabaseHandler::tags() //return new list
{
if (success)
return new std::list<Tag>(...); //construct with whatever
else
return nullptr; //null pointer return, didn't work
}
std::list<Tag> tags* = mDB->tags();
You could alternatively have it return an empty list instead, depending on how you want it to work. Taking a reference to a pointer would work the same way, too.
bool DatabaseHandler::tags(std::list<Tag>*&); //return true/false
std::list<Tag>* tags;
mDB->tags(tags); //tags will be set to point to a list if it worked

Header functions

EDIT: by not working I mean that in my main array mA in main doesn't show any change to the elements within the array.
I have been checking my functions as I develop the headers and they have been working perfectly: Until I got to the final header MonitorArray.h.
mA.getScreen(i).checkScreen();
Didn't work and I couldn't work out why. So I created a new function within MonitorArray to do a similar job using the same function, and to my surprise it worked.
mA.pollScreens();
Which uses (Inside MonitorArray.h):
monitorArray[i].checkScreen();
Function getScreen:
ScreenArray MonitorArray::getScreen(int arrayPointer)
{
if (arrayPointer<0 || arrayPointer>=monitors)
{
return false;
}
else
{
return monitorArray[arrayPointer];
}
}
Function checkScreen and addArray:
void ScreenArray::checkScreen()
{
HDC dMonitor;
PixelArray pArray;
int lenX = 0, lenY = 0;
dMonitor = CreateDC(iMonitor.szDevice, iMonitor.szDevice, NULL, NULL);
lenX = (iMonitor.rcWork.right - iMonitor.rcWork.left) - 1;
lenY = (iMonitor.rcWork.bottom - iMonitor.rcWork.top) - 1;
pArray.setColour(0, GetPixel(dMonitor, 0, 0));
pArray...
...
...
addArray(&pArray);
ReleaseDC(NULL, dMonitor);
}
void ScreenArray::addArray(PixelArray* pA)
{
if (previousCheck(*pA))
{
arrayPosition = 0;
screenArray[arrayPosition] = *pA;
arrayPosition++;
}
else
{
screenArray[arrayPosition] = *pA;
arrayPosition++;
}
if (arrayPosition==11)
{
//Run screen saver on monitor
}
}
Why does running the command within the header file through a new function work but running the functions from main not?

Assuming that "didn't work" means "didn't affect the ScreenArray in my MonitorArray", it's because getScreen returns a copy of the array element
ScreenArray MonitorArray::getScreen(int arrayPointer)
while the new member function most likely works with the array directly.
You'll need to return a pointer to the array element instead:
ScreenArray* MonitorArray::getScreen(int arrayPointer)
{
if (arrayPointer<0 || arrayPointer>=monitors)
{
return NULL;
}
else
{
return &monitorArray[arrayPointer];
}
}
(BTW: the implicit conversion from bool to ScreenArray looks very odd.)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Rapidjson returning reference to Document Value - c++

The problem in this case lies with the use of ParseInSitu. When any of the GetOperations exist the CFileBuffer loses scope and is cleaned up. Because the json is being parsed in-situ when the buffer to the file goes, so goes the data.

Related

using a bytes field as proxy for arbitrary messages

Copy only necessary objects from PDF file

is there a better way to make this software flow

C++ Create std::list in function and return through arguments

Header functions

Categories

Resources