Highlight some words in PDF using c++ MuPDF - c++

I am trying to highlight some words inside a PDF. I searched for a good C++ library for this and found MuPDF, so I downloaded the latest version and compiled it.
Now I am starting to write code to highlight text in the PDF. There are no examples of this task in C++, so I started experimenting myself.
fz_document *doc;
fz_context *ctx;
ctx = fz_new_context(NULL, NULL, FZ_STORE_UNLIMITED);
fz_register_document_handlers(ctx);
doc = fz_open_document(ctx, "D:/b.pdf");
cout << fz_count_pages(ctx, doc) << endl;
fz_page *page = fz_load_page(ctx, doc, 0);
fz_quad *q;
fz_search_page(ctx, page, "more", q, 1);
fz_rect rec = fz_rect_from_quad((*q));
fz_stext_page *pp = fz_new_stext_page(ctx, rec);
fz_point point;
point.x = 0;
point.y = 0;
fz_highlight_selection(ctx, pp, point, point, q, 16);
fz_buffer *buffer = fz_new_buffer_from_stext_page(ctx, pp);
fz_save_buffer(ctx, buffer, "D:/Final.pdf");
That is what I have tried so far. It crashes at some point, but I am not sure where. I am using it with Qt 5.13 and MSVC 2017. What did I do wrong? Does anyone have a good example of doing this, or of using the library in general? It lacks examples as far as I can tell from my search: nearly all of them are in Python, Java, and other languages, and for C++ there are just the two examples that ship with the library.
If there is another good C++ library that offers this feature, please share it.
Thanks in advance.

So it seems you are making the common newbie error of thinking that just because an API takes a pointer, you must declare a pointer. That is not correct; instead you should declare an object and pass the address of that object. So for example this
fz_quad *q;
fz_search_page(ctx, page, "more", q, 1);
fz_rect rec = fz_rect_from_quad((*q));
should actually be this
fz_quad q; // object not pointer
fz_search_page(ctx, page, "more", &q, 1); // address of the object to get the pointer
fz_rect rec = fz_rect_from_quad(q);
The idea is that fz_search_page will fill in the fz_quad object. Your version fails because you gave an uninitialised pointer to fz_search_page which will result in memory corruption when fz_search_page tries to use that pointer.
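To make the difference concrete, here is a minimal sketch of the same pattern without MuPDF. Quad and find_first are hypothetical stand-ins invented for illustration (they are not MuPDF types or functions); the only point is that the caller owns a real object and passes its address.

```cpp
#include <cstdio>

// Hypothetical stand-ins for illustration only; NOT MuPDF types or functions.
struct Quad { float x0, y0, x1, y1; };

// Writes one result through the pointer it is given, like many C-style APIs.
// Returns the number of results written (0 or 1).
int find_first(const char *needle, Quad *out)
{
    if (needle == nullptr || out == nullptr)
        return 0;
    *out = Quad{10.0f, 20.0f, 50.0f, 32.0f}; // the callee fills the caller's object
    return 1;
}

// Correct usage: declare an object, then pass its address.
Quad search_example()
{
    Quad q{};                              // real storage, zero-initialised
    int hits = find_first("more", &q);     // &q is a valid pointer to that storage
    if (hits == 0)
        std::printf("no match\n");
    return q;
}
```

Declaring `Quad *q;` and passing `q` instead would hand find_first a pointer that points nowhere in particular, which is the situation in the question.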
You should also definitely add the sanity check
doc = fz_open_document(ctx, "D:/b.pdf");
if (doc == nullptr) // check if we can open the document
{
std::cerr << "cannot open document\n"; // or whatever error handling you prefer
exit(1);
}
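Opening files or documents can fail for all sorts of reasons and you should always check that it works. Note that, as far as I know, MuPDF reports most failures by throwing through its fz_try/fz_catch mechanism rather than by returning a null pointer, so it is worth wrapping the call in fz_try as well.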
There's probably lots else that needs improving but those issues stood out for me.

To elaborate on john's answer, you need to allocate space for the results, not just a pointer. This can be stack allocated.
fz_quad q[100]; // stack allocate array of 100 quads
int n = fz_search_page(ctx, page, "more", q, 100);
However, there seem to be some more areas of confusion as to what the APIs actually do.
The fz_search_page function returns a list of quads covering the search hits. fz_highlight_selection also returns a list of quads but this time based on the location on a page a user has dragged using the start and end coordinates of the selection.
fz_new_buffer_from_stext_page returns a plain text version of the structured text data. This is NOT in PDF format.
If you want to add a highlight annotation, then you should create a highlight annotation covering the area you want highlighted:
n = pdf_search_page(ctx, page, "more", q, 100);
if (n > 0) {
pdf_annot *annot = pdf_create_annot(ctx, page, PDF_ANNOT_HIGHLIGHT);
for (int i = 0; i < n; ++i)
pdf_add_annot_quad_point(ctx, annot, q[i]);
pdf_update_annot(ctx, annot);
}
Then you can save the new modified PDF document:
pdf_save_document(ctx, doc, "out.pdf", NULL);

Related

Trying to load SFML sprites into array, then access attributes of each?

I've been learning c++ through a book I recently bought by John Horton that teaches you via game creation. It briefly mentions placing SFML sprites into an array that can then be accessed later, but doesn't cover it much. I've been trying to figure it out myself, but the SFML documentation doesn't really clarify it entirely for me (or I just don't understand it correctly yet). Anyway, this is what I'm trying, but I get the error "expression must have handle or pointer" for clouds[i] in the cout line.
I'm trying to make it so I can use this function to retrieve the positions of each cloud and, if needed, be able to change it. I thought I needed a reference or pointer but neither are working.
int GetClouds()
{
Texture textureCloud;
textureCloud.loadFromFile("graphics/cloud.png");
Sprite spriteCloud;
spriteCloud.setTexture(textureCloud);
Sprite clouds[3];
for (int i = 2; i >= 0; i--)
{
clouds[i] = spriteCloud;
}
for (int i = 0; i != 2; i++)
{
std::cout << clouds[i]->getPosition;
}
return 0;
}

When creating threads using lambda expressions, how to give each thread its own copy of the lambda expression?

I have been working on a program that basically uses brute force to work backward and find a way to reach a given number using a given set of operations. So, for example, if I give it a set of operations +5, -7, *10, /3, a target number of, say, 100 (this example probably won't come up with a solution), and a maximum number of moves (let's say 8), it will attempt to come up with a sequence of these operations that gets to 100. This part works using a single thread, which I have tested in an application.
However, I wanted it to be faster and I came to multithreading. I have worked a long time to even get the lambda function to work, and after some serious debugging have realized that the solution "combo" is technically found. However, before it is tested, it is changed. I wasn't sure how this was possible considering the fact that I had thought that each thread was given its own copy of the lambda function and its variables to use.
In summary, the program starts by parsing the input, then passes the pieces produced by the parser as parameters into an array of operation objects (somewhat like functors). It then uses an algorithm that generates combinations, which are then executed by the operation objects. The algorithm, put simply, takes in the number of operations, assigns each one a char value (each char value corresponds to an operation), then outputs a char value. It generates all possible combinations.
That is a summary of how my program works. Everything seems to be working fine and in order other than two things. There is another error which I have not added to the title because there is a way to fix it, but I am curious about alternatives. This way is also probably not good for my computer.
Going back to the problem with the lambda expression passed to the threads, here is what I saw using breakpoints in the debugger. The two threads did not appear to generate their own combos independently; they switched correctly on the first number but alternated between combos. So it would go 1111, 2211 rather than 1111, 2111. (These are generated as described in the previous paragraph, but a char at a time, combined using a stringstream.) Once the threads got out of the loop that fills the combo, combos would get lost. It would randomly switch between the two and never test the correct combo, because the combinations seemed to get scrambled randomly. I realized this must have something to do with race conditions and mutual exclusion. I thought I had avoided all that by not changing any variables from outside the lambda expression, but it appears that both threads are using the same lambda expression.
I want to know why this occurs, and how to make it so that I can, say, create an array of these expressions and assign each thread its own, or something similar that avoids having to deal with mutual exclusion altogether.
Now, the other problem happens when I at the end delete my array of operation objects. The code which assigns them and the deleting code is shown below.
operation *operations[get<0>(functions)];
for (int i = 0; i < get<0>(functions); i++)
{
//creates a new object for each operation in the array and sets it to the corresponding parameter
operations[i] = new operation(parameterStrings[i]);
}
delete[] operations;
get<0>(functions) is where the number of functions is stored in a tuple, and it is the number of objects to be stored in the array. parameterStrings is a vector holding the strings used as parameters for the constructor of the class. This code results in an "Exception trace/breakpoint trap." If I use "*operations" instead, I get a segmentation fault in the file where the class is defined, on the first line where it says "class operation." The alternative is just to comment out the delete part, but I am pretty sure that would be a bad idea, considering that the objects are created using the "new" operator and this might cause memory leaks.
Below is the code for the lambda expression and the corresponding code that creates the threads. I re-added the code inside the lambda expression so it can be inspected for possible causes of race conditions.
auto threadLambda = [&](int thread, char *letters, operation **operations, int beginNumber) {
int i, entry[len];
bool successfulComboFound = false;
stringstream output;
int outputNum;
for (i = 0; i < len; i++)
{
entry[i] = 0;
}
do
{
for (i = 0; i < len; i++)
{
if (i == 0)
{
output << beginNumber;
}
char numSelect = *letters + (entry[i]);
output << numSelect;
}
outputNum = stoll(output.str());
if (outputNum == 23513511)
{
cout << "strange";
}
if (outputNum != 0)
{
tuple<int, bool> outputTuple;
int previousValue = initValue;
for (int g = 0; g <= (output.str()).length(); g++)
{
operation *copyOfOperation = (operations[((int)(output.str()[g])) - 49]);
//cout << copyOfOperation->inputtedValue;
outputTuple = (*operations)->doOperation(previousValue);
previousValue = get<0>(outputTuple);
if (get<1>(outputTuple) == false)
{
break;
}
debugCheck[thread - 1] = debugCheck[thread - 1] + 1;
if (previousValue == goalValue)
{
movesToSolve = g + 1;
winCombo = outputNum;
successfulComboFound = true;
break;
}
}
//cout << output.str() << ' ';
}
if (successfulComboFound == true)
{
break;
}
output.str("0");
for (i = 0; i < len && ++entry[i] == nbletters; i++)
entry[i] = 0;
} while (i < len);
if (successfulComboFound == true)
{
comboFoundGlobal = true;
finishedThreads.push_back(true);
}
else
{
finishedThreads.push_back(true);
}
};
Threads created here :
thread *threadArray[numberOfThreads];
for (int f = 0; f < numberOfThreads; f++)
{
threadArray[f] = new thread(threadLambda, f + 1, lettersPointer, operationsPointer, ((int)(workingBeginOperations[f])) - 48);
}
If any more of the code is needed to help solve the problem, please let me know and I will edit the post to add the code. Thanks in advance for all of your help.
Your lambda captures the surrounding variables by reference ([&]), so each copy of the lambda used by a thread references the same shared objects, and the threads race and clobber each other.
This is assuming things like movesToSolve and winCombo come from captures (it is not clear from the code, but it seems like it). winCombo is updated when a successful result is found, but another thread might immediately overwrite it right after.
So every thread is using the same data, and data races abound.
You want to ensure that your lambda works only on three types of data:
1. Private data
2. Shared, constant data
3. Properly synchronized mutable shared data
Generally you want to have almost everything in category 1 and 2, with as little as possible in category 3.
Category 1 is the easiest, since you can use e.g., local variables within the lambda function, or captured-by-value variables if you ensure a different lambda instance is passed to each thread.
For category 2, you can use const to ensure the relevant data isn't modified.
Finally you may need some shared global state, e.g., to indicate that a value is found. One option would be something like a single std::atomic<Result *> where when any thread finds a result, they create a new Result object and atomically compare-and-swap it into the globally visible result pointer. Other threads check this pointer for null in their run loop to see if they should bail out early (I assume that's what you want: for all threads to finish if any thread finds a result).
A more idiomatic way would be to use std::promise.
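As a rough illustration of the three categories, here is a minimal sketch under some assumptions: parallel_find_root is a hypothetical toy search (not the asker's combination search), and it uses a plain std::atomic<int> as the shared result instead of the std::atomic<Result *> described above, purely to keep the example short.

```cpp
#include <algorithm>
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical toy problem: find a value in [0, limit) whose square is target.
// Returns -1 when no such value exists.
int parallel_find_root(int target, int limit, int num_threads)
{
    std::atomic<int> result{-1};            // category 3: the only shared mutable state
    std::vector<std::thread> workers;
    const int chunk = (limit + num_threads - 1) / num_threads;

    for (int t = 0; t < num_threads; ++t) {
        const int begin = t * chunk;
        const int end = std::min(limit, begin + chunk);
        // begin, end and target are captured by value, so every thread owns
        // private copies (categories 1 and 2). Only result is shared by reference.
        workers.emplace_back([begin, end, target, &result] {
            for (int candidate = begin; candidate < end; ++candidate) {
                if (result.load() != -1)
                    return;                 // another thread already found a result
                if (static_cast<long long>(candidate) * candidate == target) {
                    int expected = -1;
                    // Compare-and-swap: only the first successful thread stores its value.
                    result.compare_exchange_strong(expected, candidate);
                    return;
                }
            }
        });
    }
    for (std::thread &worker : workers)
        worker.join();
    return result.load();
}
```

Because nothing except result is captured by reference, no thread can overwrite another thread's loop state, which is the kind of scrambling described in the question.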

wxWidgets / wxStyledTextCtrl - Highlight all occurrences when doubleclicking

I am using wxWidgets 3.0.2 in a static unicode build on Windows 10. I am using a wxStyledTextCtrl, which is a near 1-to-1 mapping of Scintilla.
I am looking for functionality similar to Notepad++ where upon double-clicking on something in the editor, all occurrences of that item get highlighted. It is hard to find good examples that really demonstrate styling. I've looked at wxWidgets documentation, Scintilla documentation, Notepad++ source and Code::Blocks source (the latter two use Scintilla as their text editors) and still haven't had much luck.
I've tried many different variations of the following code and it never quite works right. Either nothing is highlighted or the whole document is highlighted. I know I'm missing something, but I can't figure out what.
//textarea is a wxStyledTextCtrl*
textarea->StyleSetBackground(styleHightlightAllSelected, wxColor(80, 255, 80));
wxString selectedText = textarea->GetSelectedText();
int selSize = selectedText.size();
int selStart = textarea->GetSelectionStart();
int pos = 0;
int curr = 0;
int maxPos = textarea->GetLastPosition();
while(pos != -1){
pos = textarea->FindText(curr, maxPos, selectedText);
if(pos == selStart){ //skip the actual highlighted item
curr = pos + selSize;
} else if(pos != -1){
textarea->StartStyling(pos, 0x1F);
textarea->SetStyling(selSize, styleHightlightAllSelected);
curr = pos + selSize;
}
}
The search part of the loop does successfully find the selected text; it's just that the styling doesn't seem to take hold.
So my questions that I couldn't really find answers to are:
styleHightlightAllSelected is an int set to 100. When I had it as 0, the whole document turned green when doubleclicking. I see that styles 32-39 are predefined. Are there other styles that are predefined-but-not-really-documented; meaning, is 100 ok?
Do I have to set the entire style up, or can I just set the background color as I do above?
Is it enough to do StartStyling() and SetStyling() when I find an occurrence and be done with it, or is there more?
StartStyling() in wxWidgets has a mask argument, but the Scintilla counterpart does not. I can't clearly determine what I should set this to. It seems to be 31 (00011111) to preserve the 5 existing styling/lexer bits? Essentially, I'm not sure what to set this to if all I want to do is modify the background color of each occurrence.
My program will regularly deal with files that are dozens or more megabytes in size, so should I just be highlighting occurrences that are visible, and adjust as necessary when scrolling/jumping? At the moment it searches and (fails to) set styling on each occurrence, and it takes about a second on a 50MB file. I've observed that on the same file loaded in Notepad++, it happens instantly, so I'm assuming it does it on a visible basis?
I ended up asking about this on the GitHub issues page for the Notepad++ project, and the correct way to do this is to not use styles, but rather to use indicators instead. So my code above changes to this:
int maxPos = textarea->GetLastPosition();
textarea->IndicatorClearRange(0, maxPos);
textarea->IndicatorSetStyle(styleHightlightAllSelected, wxSTC_INDIC_ROUNDBOX);
textarea->IndicatorSetAlpha(styleHightlightAllSelected, 100);
textarea->IndicatorSetUnder(styleHightlightAllSelected, true);
textarea->IndicatorSetForeground(styleHightlightAllSelected, wxColor(0, 255, 0));
wxString selectedText = textarea->GetSelectedText();
int selSize = selectedText.size();
int selStart = textarea->GetSelectionStart();
int pos = 0;
int curr = 0;
vector<int> selectionList;
while((pos = textarea->FindText(curr, maxPos, selectedText)) != -1){
selectionList.push_back(pos);
curr = pos + selSize;
}
textarea->SetIndicatorCurrent(styleHightlightAllSelected);
for(unsigned int i = 0; i < selectionList.size(); i++){
if(selectionList[i] != selStart){
textarea->IndicatorFillRange(selectionList[i], selSize);
}
}
This doesn't factor in, however, only highlighting the visible range and only highlighting new occurrences as they scroll into view (I will add this later), so for files that are dozens of megabytes in size, it will take 2-3 seconds for the highlighting to finish.
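For the visible-range part, the search itself could be restricted to a window of positions. The following is only a sketch of that idea on a plain std::string rather than on a wxStyledTextCtrl; findInWindow and its parameters are hypothetical names, and in a real editor the window bounds would have to come from the control's visible-line queries.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns the start offsets of non-overlapping matches of needle that lie
// fully inside [windowStart, windowEnd), skipping the match at skipPos
// (for example, the occurrence the user has selected).
std::vector<std::size_t> findInWindow(const std::string &text,
                                      const std::string &needle,
                                      std::size_t windowStart,
                                      std::size_t windowEnd,
                                      std::size_t skipPos)
{
    std::vector<std::size_t> hits;
    if (needle.empty())
        return hits;                          // avoid looping forever on an empty needle
    windowEnd = std::min(windowEnd, text.size());

    std::size_t pos = text.find(needle, windowStart);
    while (pos != std::string::npos && pos + needle.size() <= windowEnd) {
        if (pos != skipPos)
            hits.push_back(pos);
        pos = text.find(needle, pos + needle.size());
    }
    return hits;
}
```

Each returned offset would then be passed to IndicatorFillRange together with the selection length, as in the loop above, and the window would be recomputed whenever the view scrolls.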

qt ASSERT failure in QList<T>::at: "index out of range"

I am still relatively new to Qt and I have recently been working on a large project. When I attempt to run the project I get this error:
ASSERT failure in QList<T>::at: "index out of range", file c:\qt\qt5.3.0\5.3\msvc2013_64\include\qtcore\qlist.h, line 479
Just wondering if anyone knows what this means or how I might go about tracking down the source of the problem?
[edit] I believe that the addition of this code is causing the error
autAtom *aP = new autAtom(Principal);
autAtom *aQ = new autAtom(Principal);
autData *P = new autData (DataAtom, aP);
autData *Q = new autData (DataAtom, aQ);
autData *X = new autData (AnyData);
AUTPostulate *p;
autStatementList preList;
{
preList.clear();
//autData *d1 = new autData(NotHereData, X);
autStatement *pre1 = new autStatement(aP, believes, X);
autStatement *goal = new autStatement(aP, sees, X);
preList.append(pre1);
p = new AUTPostulate("BS", BS, goal, preList);
cout << "" << p->getString().toStdString() << endl;
AUTPostulates.append(p);
}
When this is taken out the tool runs fine.
I ran into a similar issue because I did a connect on itemChanged before populating the widget and then while populating my slot code was called. After I put in a guard that prevented signal handling during widget population, I found I could populate the widget fine and I could also handle the signal fine after. Hope this helps.
Index out of range means you are trying to access an index that does not exist in a QList object (or in an object of a QList subclass). So if you have a QList with a length of 5 and you try to access index 5, it will be out of range, because the valid indices are 0 to 4.
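A guard of the following kind is one way to track such accesses down. This is just a sketch that uses std::vector as a stand-in for QList so that it compiles without Qt; isValidIndex and valueOr are hypothetical helper names.

```cpp
#include <cstddef>
#include <vector>

// True when index refers to an existing element, i.e. 0 <= index < size.
bool isValidIndex(const std::vector<int> &items, std::size_t index)
{
    return index < items.size();
}

// Returns the element at index, or fallback when the index is out of range.
int valueOr(const std::vector<int> &items, std::size_t index, int fallback)
{
    return isValidIndex(items, index) ? items[index] : fallback;
}
```

With five elements, index 4 is the last valid one and index 5 falls through to the fallback. Putting a breakpoint or log message on the fallback path shows which caller passed the bad index.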
Also, it looks like your code contains a lot of classes that are not standard to Qt or C++. At least I don't recognize them. It's difficult to say what's going on here without knowing about those classes.

How to move an item up and down in a wxListCtrl (wxwidgets)

This should be pretty easy but I'm having a heck of a time doing it. Basically I want to move a row in my wxListCtrl up or down. I posted this to wxwidgets forum and got the following code.
m_list->Freeze();
wxListItem item;
item.SetId(item_id); // the one which is selected
m_list->GetItem(item); // Retrieve the item
m_list->DeleteItem(item_id); // Remove it
item.SetId(item_id - 1); // Move it up
m_list->SetItem(item); // Apply it's new pos in the list
m_list->Thaw();
which doesn't work. The element is deleted but not moved up (I guess the setitem line is not working). Then I thought to just switch the text and the image but I can't even get the text from the row reliably. I have
int index = m_right->GetNextItem(-1, wxLIST_NEXT_ALL, wxLIST_STATE_SELECTED);
wxString label = m_right->GetItemText(index);
if(index == 0)
return;
wxListItem item;
item.SetId(index);
bool success = m_right->GetItem(item);
wxString text = item.GetText();
but text is blank even though there is text and the index is correct. So, I'm stuck not even being able to do the most basic task. Anybody know how to do this? The code runs in a button callback (the user presses a little up arrow and my code executes to try to move it). I'm using 2.9.1 on windows.
I made it work like this with wxWidgets 2.9.3 :
void FileSelectionPanel::OnMoveUp( wxCommandEvent& WXUNUSED(evt) )
{
int idx = _listCtrl->GetNextItem( -1, wxLIST_NEXT_ALL, wxLIST_STATE_SELECTED );
if( idx == 0) idx = _listCtrl->GetNextItem( 0, wxLIST_NEXT_ALL, wxLIST_STATE_SELECTED );
_listCtrl->Freeze();
while( idx > -1 ) {
wxListItem item;
item.SetId(idx); _listCtrl->GetItem(item);
item.SetId(idx-1); _listCtrl->InsertItem(item);
_listCtrl->SetItemData( idx-1, _listCtrl->GetItemData( idx+1 ));
for( int i = 0; i < _listCtrl->GetColumnCount(); i++ ) {
_listCtrl->SetItem( idx-1, i, _listCtrl->GetItemText( idx+1, i ));
}
_listCtrl->DeleteItem( idx + 1 );
idx = _listCtrl->GetNextItem( idx-1, wxLIST_NEXT_ALL, wxLIST_STATE_SELECTED );
}
_listCtrl->Thaw();
}
The thing I noticed is that wxListItem is more of a convenience struct for storing the state of the view and for passing values into the wxListCtrl "nicely". It is in no way bound to what is actually inside the wxListCtrl.
Hope this still helps someone!
Even though there is already an accepted answer, I had the same problem, but my list is unordered. By looking into the wxWidgets code I found out that there is another important piece of information inside the wxListItem object: the mask. I got my reordering to work correctly by setting the mask value to -1, which means that all data shall be copied. This includes the item text as well as other information, like the item data (which was important in my case).
wxListItem item;
item.SetId(item_id); // set needed id
item.SetMask(-1); // set needed data
m_list->GetItem(item); // actually retrieve the item
m_list->DeleteItem(item_id); // remove old copy
item.SetId(item_id - 1); // move item up
m_list->InsertItem(item); // insert copy of item
I also had to use "InsertItem" instead of "SetItem". Otherwise, there was no new item inserted, but an existing one overwritten (see also tomcat31's answer).
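The remove-then-insert order can be sketched on a plain container. This uses a std::vector of strings as a stand-in for the list rows, not wxListCtrl itself, and moveUp is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Moves rows[index] up by one position. Returns false when the row is
// already at the top or the index does not exist.
bool moveUp(std::vector<std::string> &rows, std::size_t index)
{
    if (index == 0 || index >= rows.size())
        return false;
    const auto offset = static_cast<std::ptrdiff_t>(index);
    std::string row = std::move(rows[index]);             // take the full item first
    rows.erase(rows.begin() + offset);                    // remove the old copy
    rows.insert(rows.begin() + (offset - 1), std::move(row)); // insert, do not overwrite
    return true;
}
```

Overwriting rows[index - 1] instead of inserting would lose the row that used to be there, which mirrors the SetItem versus InsertItem difference described above.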
Is the list ordered? If it is auto-ordering, it may be ignoring the order you are trying to apply.
From what I recall, the internal order is not necessarily sequential; you might have to get the index of the previous item and go one before it.