Better, or advantages in different ways of coding similar functions - c++

I'm writing the code for a GUI (in C++), and right now I'm concerned with the organisation of text in lines. One of the problems I'm having is that the code is getting very long and confusing, and I'm starting to get into an n^2 scenario where, for every option I add for the text's presentation, the number of functions I have to write is the square of that. In trying to deal with this, a particular design choice has come up, and I don't know which method is better, or the extent of the advantages or disadvantages of each:
I have two methods which are very similar in flow, i.e., they iterate through the same objects, taking into account the same constraints, but ultimately perform different operations within this flow. For anyone interested, one method renders the text, and the other determines whether any text overflows the line, due to wrapping the text around other objects or simply reaching the end of the line.
These functions need to be copied and rewritten for left, right or centred text, which have different flow, so whatever design choice I make would be repeated three times.
Basically, I could continue what I have now, which is two separate methods to handle these different actions, or I could merge them into one function, which has if statements within it to determine whether or not to render the text or figure out if any text overflows.
Is there a generally accepted right way of going about this? Otherwise, what are the tradeoffs involved, and what are the signs that might indicate one way should be used over the other? Is there some other way of doing things that I've missed?
I've edited through this a few times to try and make it more understandable, but if it isn't please ask me some questions so I can edit and explain. I can also post the source code of the two different methods, but they use a lot of functions and objects that would take too long to explain.
// EDIT: Source Code //
Function 1:
void GUITextLine::renderLeftShifted(const GUIRenderInfo& renderInfo) {
if(m_renderLines.empty())
return;
Uint iL = 0;
Array2t<float> renderCoords;
renderCoords.s_x = renderInfo.s_offset.s_x + m_renderLines[0].s_x;
renderCoords.s_y = renderInfo.s_offset.s_y + m_y;
float remainingPixelsInLine = m_renderLines[0].s_y;
for (Uint iTO= 0;iTO != m_text.size();++iTO)
{
if(m_text[iTO].s_pixelWidth <= remainingPixelsInLine)
{
string preview = m_text[iTO].s_string;
m_text[iTO].render(&renderCoords);
remainingPixelsInLine -= m_text[iTO].s_pixelWidth;
}
else
{
FSInternalGlyphData intData = m_text[iTO].stealFSFastFontInternalData();
float characterWidth = 0;
Uint iFirstCharacterOfRenderLine = 0;
for(Uint iC = 0;;++iC)
{
if(iC == m_text[iTO].s_string.size())
{
// wrap up
string renderPart = m_text[iTO].s_string;
renderPart.erase(iC, renderPart.size());
renderPart.erase(0, iFirstCharacterOfRenderLine);
m_text[iTO].s_font->renderString(renderPart.c_str(), intData,
&renderCoords);
break;
}
characterWidth += m_text[iTO].s_font->getWidthOfGlyph(intData,
m_text[iTO].s_string[iC]);
if(characterWidth > remainingPixelsInLine)
{
// Can't push in the last character
// No more space in this line
// First though, render what we already have:
string renderPart = m_text[iTO].s_string;
renderPart.erase(iC, renderPart.size());
renderPart.erase(0, iFirstCharacterOfRenderLine);
m_text[iTO].s_font->renderString(renderPart.c_str(), intData,
&renderCoords);
if(++iL != m_renderLines.size())
{
remainingPixelsInLine = m_renderLines[iL].s_y;
renderCoords.s_x = renderInfo.s_offset.s_x + m_renderLines[iL].s_x;
// Cool, so now try rendering this character again
--iC;
iFirstCharacterOfRenderLine = iC;
characterWidth = 0;
}
else
{
// Quit
break;
}
}
}
}
}
// Done!
}
Function 2:
vector<GUIText> GUITextLine::recalculateWrappingContraints_LeftShift()
{
m_pixelsOfCharacters = 0;
float pixelsRemaining = m_renderLines[0].s_y;
Uint iRL = 0;
// Go through every text object, fitting them into render lines
for(Uint iTO = 0;iTO != m_text.size();++iTO)
{
// If an entire text object fits in a single line
if(pixelsRemaining >= m_text[iTO].s_pixelWidth)
{
pixelsRemaining -= m_text[iTO].s_pixelWidth;
m_pixelsOfCharacters += m_text[iTO].s_pixelWidth;
}
// Otherwise, character by character
else
{
// Get some data now so we don't fetch it on every function call
FSInternalGlyphData intData = m_text[iTO].stealFSFastFontInternalData();
for(Uint iC = 0; iC != m_text[iTO].s_string.size();++iC)
{
float characterWidth = m_text[iTO].s_font->getWidthOfGlyph(intData, m_text[iTO].s_string[iC]);
if(characterWidth < pixelsRemaining)
{
pixelsRemaining -= characterWidth;
m_pixelsOfCharacters += characterWidth;
}
else // End of render line!
{
m_pixelsOfWrapperCharacters += pixelsRemaining; // we might track how much wrapping px we use
// If this is true, then we ran out of render lines before we ran out of text. Means we have some overflow to return
if(++iRL == m_renderLines.size())
{
return harvestOverflowFrom(iTO, iC);
}
else
{
pixelsRemaining = m_renderLines[iRL].s_y;
}
}
}
}
}
vector<GUIText> emptyOverflow;
return emptyOverflow;
}
So basically, render() takes renderCoordinates as a parameter and gets from it the global position of where it needs to render from. calcWrappingConstraints figures out how much text in the object goes over the allocated space, and returns that overflowing text.
m_renderLines is a std::vector of a two-float structure, where .s_x is where rendering can start and .s_y is how large the space for rendering is. Note that .s_y is essentially the width of the 'renderLine', not where it ends.
m_text is a std::vector of GUIText objects, which contain a string of text and some data such as style, colour, size, etc. Each also contains, under s_font, a reference to a font object, which performs rendering, calculates the width of a glyph, etc.
Hopefully this clears things up.

There is no generally accepted way in this case.
However, common practice in any programming scenario is to remove duplicated code.
I think you're getting stuck on how to divide code by direction, when direction changes the outcome too much to make this division. In these cases, focus on the common portions of the three algorithms and divide them into tasks.
I did something similar when I duplicated WinForms flow layout control for MFC. I dealt with two types of objects: fixed positional (your pictures etc.) and auto positional (your words).
In the example you provided I can list out common portions of your example.
Write Line (direction)
bool TestPlaceWord (direction) // returns false if it cannot place word next to previous word
bool WrapPastObject (direction) // returns false if it runs out of line
bool WrapLine (direction) // returns false if it runs out of space for new line.
Each of these would be performed no matter what direction you are faced with.
Ultimately, the algorithm for each direction is just too different to simplify any more than that.

How about an implementation of the Visitor Pattern? It sounds like it might be the kind of thing you are after.
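One way to read the Visitor suggestion, as a minimal sketch rather than the asker's actual code: write the shared traversal once and let a small visitor object decide what happens at each placement. RenderLine, TextItem, walkLeftAligned, PlacementRecorder and WidthCounter are all hypothetical stand-ins, and the sketch wraps whole items only (no character-by-character splitting).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the question's types; only widths matter here.
struct RenderLine { float start; float width; };
struct TextItem   { float pixelWidth; };

// Shared flow, written once: fit items into lines from left to right and tell
// the visitor about every placement. Returns the index of the first item that
// did not fit (items.size() if everything fit).
template <typename Visitor>
std::size_t walkLeftAligned(const std::vector<RenderLine>& lines,
                            const std::vector<TextItem>& items,
                            Visitor& visitor)
{
    assert(!lines.empty());
    std::size_t line = 0;
    float remaining = lines[0].width;
    float x = lines[0].start;
    for (std::size_t i = 0; i < items.size(); ++i) {
        // Move on to later lines until the item fits or the lines run out.
        while (items[i].pixelWidth > remaining) {
            if (++line == lines.size())
                return i;                       // overflow starts at item i
            remaining = lines[line].width;
            x = lines[line].start;
        }
        visitor.place(i, line, x, items[i].pixelWidth);
        x += items[i].pixelWidth;
        remaining -= items[i].pixelWidth;
    }
    return items.size();
}

// Visitor 1: records where each item would be drawn (a real one would render).
struct PlacementRecorder {
    struct Placement { std::size_t item; std::size_t line; float x; };
    std::vector<Placement> placements;
    void place(std::size_t item, std::size_t line, float x, float /*width*/) {
        placements.push_back({item, line, x});
    }
};

// Visitor 2: only measures, as the wrapping-constraints pass would.
struct WidthCounter {
    float usedPixels = 0.0f;
    void place(std::size_t, std::size_t, float, float width) {
        usedPixels += width;
    }
};
```

A rendering visitor and an overflow-measuring visitor can then share walkLeftAligned; a right-aligned or centred walk would be a second traversal function that reuses the same visitors.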


When creating threads using lambda expressions, how to give each thread its own copy of the lambda expression?

I have been working on a program that basically used brute force to work backward to find a method using a given set of operations to reach the given number. So, for example, if I gave in a set of operations +5,-7,*10,/3, and a given number say 100(*this example probably won't come up with a solution), and also a given max amount of moves to solve (let's say 8), it will attempt to come up with a use of these operations to get to 100. This part works using a single thread which I have tested in an application.
However, I wanted it to be faster and I came to multithreading. I have worked a long time to even get the lambda function to work, and after some serious debugging have realized that the solution "combo" is technically found. However, before it is tested, it is changed. I wasn't sure how this was possible considering the fact that I had thought that each thread was given its own copy of the lambda function and its variables to use.
In summary, the program starts off by parsing the information, then passes the information, which the parser has divided into parameters, into an array of operation objects (somewhat like functors). It then uses an algorithm which generates combinations, which are then executed by the operation objects. The algorithm, put simply, takes in the number of operations, assigns each one a char value (each char value corresponds to an operation), then outputs a char value. It generates all possible combinations.
That is a summary of how my program works. Everything seems to be working fine and in order other than two things. There is another error which I have not added to the title because there is a way to fix it, but I am curious about alternatives. This way is also probably not good for my computer.
So, going back to the problem with the lambda expression passed to the threads: this is what I saw using breakpoints in the debugger. It appeared that the two threads were not generating individual combos, but rather correctly switching between the first number while alternating combos. So it would go 1111, 2211 rather than generating 1111, 2111. (These are generated as the previous paragraph described, but a char at a time, combined using a stringstream.) Once they got out of the loop that filled the combo up, combos would get lost. It would randomly switch between the two and never test the correct combo, because the combinations seemed to get scrambled randomly. I realized this must have something to do with race conditions and mutual exclusion. I had thought I had avoided it all by not changing any variables from outside the lambda expression, but it appears that both threads are using the same lambda expression.
I want to know why this occurs, and how to make it so that I can say create an array of these expressions and assign each thread its own, or something similar to that which avoids having to deal with mutual exclusion as a whole.
Now, the other problem happens when I at the end delete my array of operation objects. The code which assigns them and the deleting code is shown below.
operation *operations[get<0>(functions)];
for (int i = 0; i < get<0>(functions); i++)
{
//creates a new object for each operation in the array and sets it to the corresponding parameter
operations[i] = new operation(parameterStrings[i]);
}
delete[] operations;
The get<0>(functions) is where the number of functions is stored in a tuple, and it is the number of objects to be stored in the array. parameterStrings is a vector in which the strings used as parameters for the constructor of the class are stored. This code results in an "Exception trace/breakpoint trap." If I use "*operations" instead, I get a segmentation fault in the file where the class is defined, on the first line where it says "class operation." The alternative is just to comment out the delete part, but I am pretty sure that would be a bad idea, considering that the objects are created using the "new" operator and this might cause memory leaks.
Below is the code for the lambda expression and the corresponding code for the creation of the threads. I re-added the code inside the lambda expression so it could be examined for possible causes of race conditions.
auto threadLambda = [&](int thread, char *letters, operation **operations, int beginNumber) {
int i, entry[len];
bool successfulComboFound = false;
stringstream output;
int outputNum;
for (i = 0; i < len; i++)
{
entry[i] = 0;
}
do
{
for (i = 0; i < len; i++)
{
if (i == 0)
{
output << beginNumber;
}
char numSelect = *letters + (entry[i]);
output << numSelect;
}
outputNum = stoll(output.str());
if (outputNum == 23513511)
{
cout << "strange";
}
if (outputNum != 0)
{
tuple<int, bool> outputTuple;
int previousValue = initValue;
for (int g = 0; g <= (output.str()).length(); g++)
{
operation *copyOfOperation = (operations[((int)(output.str()[g])) - 49]);
//cout << copyOfOperation->inputtedValue;
outputTuple = (*operations)->doOperation(previousValue);
previousValue = get<0>(outputTuple);
if (get<1>(outputTuple) == false)
{
break;
}
debugCheck[thread - 1] = debugCheck[thread - 1] + 1;
if (previousValue == goalValue)
{
movesToSolve = g + 1;
winCombo = outputNum;
successfulComboFound = true;
break;
}
}
//cout << output.str() << ' ';
}
if (successfulComboFound == true)
{
break;
}
output.str("0");
for (i = 0; i < len && ++entry[i] == nbletters; i++)
entry[i] = 0;
} while (i < len);
if (successfulComboFound == true)
{
comboFoundGlobal = true;
finishedThreads.push_back(true);
}
else
{
finishedThreads.push_back(true);
}
};
Threads created here :
thread *threadArray[numberOfThreads];
for (int f = 0; f < numberOfThreads; f++)
{
threadArray[f] = new thread(threadLambda, f + 1, lettersPointer, operationsPointer, ((int)(workingBeginOperations[f])) - 48);
}
If any more of the code is needed to help solve the problem, please let me know and I will edit the post to add the code. Thanks in advance for all of your help.
Your lambda captures variables by reference ([&]), so each copy of the lambda used by a thread references the same shared objects, and so the various threads race and clobber each other.
This is assuming things like movesToSolve and winCombo come from captures (it is not clear from the code, but it seems like it). winCombo is updated when a successful result is found, but another thread might immediately overwrite it right after.
So every thread is using the same data, data races abound.
You want to ensure that your lambda works only on three types of data:
1. Private data
2. Shared, constant data
3. Properly synchronized mutable shared data
Generally you want to have almost everything in category 1 and 2, with as little as possible in category 3.
Category 1 is the easiest, since you can use e.g., local variables within the lambda function, or captured-by-value variables if you ensure a different lambda instance is passed to each thread.
For category 2, you can use const to ensure the relevant data isn't modified.
Finally you may need some shared global state, e.g., to indicate that a value is found. One option would be something like a single std::atomic<Result *> where when any thread finds a result, they create a new Result object and atomically compare-and-swap it into the globally visible result pointer. Other threads check this pointer for null in their run loop to see if they should bail out early (I assume that's what you want: for all threads to finish if any thread finds a result).
A more idiomatic way would be to use std::promise.
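A minimal sketch of the three categories, assuming a simplified search in place of the asker's combo generator. parallelSearch, reachesGoal and SharedResult are hypothetical names, and the shared result is a plain std::atomic<long long> rather than the std::atomic<Result *> mentioned above, to keep the example short.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Category 3: the only mutable state the workers share, and it is atomic.
struct SharedResult {
    std::atomic<long long> winner{-1};   // -1 means "nothing found yet"
};

// Hypothetical stand-in for "does this combo reach the goal?".
inline bool reachesGoal(long long candidate, long long goal) {
    return candidate * candidate == goal;
}

// Splits the candidates 0..limit-1 across threadCount threads.
// Returns a candidate that reaches the goal, or -1 if there is none.
long long parallelSearch(long long limit, long long goal, unsigned threadCount)
{
    assert(threadCount > 0);
    SharedResult shared;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threadCount; ++t) {
        // t, limit, goal and threadCount are captured by value, so every
        // thread's copy of the lambda owns them (categories 1 and 2).
        // Only `shared` is captured by reference.
        workers.emplace_back([t, limit, goal, threadCount, &shared] {
            for (long long c = t; c < limit; c += threadCount) {
                if (shared.winner.load() != -1)
                    return;                  // another thread already won
                if (reachesGoal(c, goal)) {
                    long long expected = -1;
                    // Only the first finder publishes its result.
                    shared.winner.compare_exchange_strong(expected, c);
                    return;
                }
            }
        });
    }
    for (std::thread& w : workers)
        w.join();
    return shared.winner.load();
}
```

The loop variable c and everything else declared inside the lambda body is private to each thread, which is why no mutex is needed here.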

C++ do while loop

I have a vector holding 10 items (all of the same class; for simplicity, call it 'A'). What I want to do is check that 'A' isn't either a) hitting the walls or b) hitting another 'A'. I have a collisions function that does this.
The idea is simply to have this loop go through and move 'A' to the next position; if that position causes a collision, then it needs to give itself a new random position on the screen. Because the screen is small, there is a good chance that the element will be put on top of another one (or on top of the wall, etc.). The logic of the code works well in my head, but when debugging the code the object just gets stuck in the loop and stays in the same position. 'A' is supposed to move about the screen, but it stays still!
When I comment out the do-while loop and move the moveObject() function up, the code works perfectly: the 'A's move about the screen. It is only when I try to add the extra functionality that it doesn't work.
void Board::Loop(void){
//Display the position of that Element.
for (unsigned int i = 0; i <= 10; ++i){
do {
if (checkCollisions(i)==true){
moveObject(i);
}
else{
objects[i]->ResetPostion();
}
}
while (checkCollisions(i) == false);
objects[i]->SetPosition(objects[i]->getXDir(),objects[i]->getYDir());
}
}
The function below is the collision detection. I will expand this later.
bool Board::checkCollisions(int index){
char boundry = map[objects[index]->getXDir()][objects[index]->getYDir()];
//There has been no collisions - therefore don't change anything
if(boundry == SYMBOL_EMPTY){
return false;
}
else{
return true;
}
}
Any help would be much appreciated. I will buy you a virtual beer :-)
Thanks
Edit:
ResetPostion -> this will give the element A a random position on the screen
moveObject -> this will look at the direction of the object and adjust the x and Y cord's appropriately.
I guess you need: do { ...
... } while (checkCollisions(i));
Also, if you have 10 elements, then i = 0; i < 10; i++
And btw. don't write if (something == true), simply if (something) or if (!something)
for (unsigned int i = 0; i <= 10; ++i){
is wrong because that's a loop for eleven items, use
for (unsigned int i = 0; i < 10; ++i){
instead.
You don't define what 'doesn't work' means, so that's all the help I can give for now.
There seems to be a lot of confusion here over basic language structure and logic flow. Writing a few very simple test apps that exercise different language features will probably help you a lot. (So will a step-thru debugger, if you have one)
do/while() is a fairly advanced feature that some people spend whole careers never using, see: do...while vs while
I recommend getting a solid foundation with while and if/else before even using for. Your first look at do should be when you've just finished a while or for loop and realize you could save a mountain of duplicate initialization code if you just changed the order of execution a bit. (Personally I don't even use do for that any more, I just use an iterator with while(true)/break since it lets me pre and post code all within a single loop)
I think this simplifies what you're trying to accomplish:
void Board::Loop(void) {
//Display the position of that Element.
for (unsigned int i = 0; i < 10; ++i) {
while(IsGoingToCollide(i)) //check is first, do while doesn't make sense
objects[i]->ResetPosition();
moveObject(i); //same as ->SetPosition(XDir, YDir)?
//either explain difference or remove one or the other
}
}
This function name seems ambiguous to me:
bool Board::checkCollisions(int index) {
I'd recommend changing it to:
// returns true if moving to next position (based on inertia) will
// cause overlap with any other object's or structure's current location
bool Board::IsGoingToCollide(int index) {
In contrast checkCollisions() could also mean:
// returns true if there is no overlap between this object's
// current location and any other object's or structure's current location
bool Board::DidntCollide(int index) {
Final note: Double check that ->ResetPosition() puts things inside the boundaries.
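As a self-contained sketch of the "check first, then move" loop above: Board here is a hypothetical, reduced version with a character map in which ' ' is an empty cell, and Mover, step and cellAt are invented names. It assumes the map has at least one empty cell from which the mover can proceed; otherwise the loop would not terminate.

```cpp
#include <cassert>
#include <random>
#include <string>
#include <utility>
#include <vector>

struct Mover { int x, y, dx, dy; };   // position plus direction of travel

class Board {
public:
    Board(std::vector<std::string> map, unsigned seed)
        : map_(std::move(map)), rng_(seed)
    {
        assert(!map_.empty() && !map_[0].empty());
    }

    // Anything outside the map is treated as a wall.
    char cellAt(int x, int y) const {
        bool inside = y >= 0 && y < static_cast<int>(map_.size()) &&
                      x >= 0 && x < static_cast<int>(map_[y].size());
        return inside ? map_[y][x] : '#';
    }

    // True if the cell the mover is about to enter is not empty.
    bool isGoingToCollide(const Mover& m) const {
        return cellAt(m.x + m.dx, m.y + m.dy) != ' ';
    }

    void step(Mover& m) {
        while (isGoingToCollide(m))   // check first, reposition while blocked
            resetPosition(m);
        m.x += m.dx;
        m.y += m.dy;
    }

private:
    // Picks random cells until it lands on an empty one, so the mover is
    // never left inside a wall.
    void resetPosition(Mover& m) {
        std::uniform_int_distribution<int> row(0, static_cast<int>(map_.size()) - 1);
        do {
            m.y = row(rng_);
            std::uniform_int_distribution<int> col(
                0, static_cast<int>(map_[m.y].size()) - 1);
            m.x = col(rng_);
        } while (cellAt(m.x, m.y) != ' ');
    }

    std::vector<std::string> map_;
    std::mt19937 rng_;
};
```

The reset keeps the direction and only changes the position, which matches the description of ResetPostion in the question; whether the real function also changes direction is not stated.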

Logic Help: comparing values and taking the smallest distance, while removing it from the list of "available to compare"

Okay, I have been set with the task of comparing this list of Photons using one method (IU) and comparing it with another (TSP). I need to take the first IU photon and compare distances with all of the TSP photons, find the smallest distance, and "pair" them (i.e. set them both in arrays with the same index). Then, I need to take the next photon in the IU list, and compare it to all of the TSP photons, minus the one that was chosen already.
I know I need to use a Boolean array of sorts, with keeping a counter. I can't seem to logic it out entirely.
The code below is NOT standard C++ syntax, as it is written to interact with ROOT (CERN data analysis software).
If you have any questions with the syntax to better understand the code, please ask. I'll happily answer.
I have the arrays and variables declared already. The types that you see are called EEmcParticleCandidate and that's a type that reads from a tree of information, and I have a whole set of classes and headers that tell that how to behave.
Thanks.
Bool_t used[2];
if (num[0]==2 && num[1]==2) {
TIter photonIterIU(mPhotonArray[0]);
while(IU_photon=(EEmcParticleCandidate_t*)photonIterIU.Next()){
if (IU_photon->E > thresh2) {
distMin=1000.0;
index = 0;
IU_PhotonArray[index] = IU_photon;
TIter photonIterTSP(mPhotonArray[1]);
while(TSP_photon=(EEmcParticleCandidate_t*)photonIterTSP.Next()) {
if (TSP_photon->E > thresh2) {
Float_t Xpos_IU = IU_photon->position.fX;
Float_t Ypos_IU = IU_photon->position.fY;
Float_t Xpos_TSP = TSP_photon->position.fX;
Float_t Ypos_TSP = TSP_photon->position.fY;
distance_1 = find distance //formula didnt fit here //
if (distance_1 < distMin){
distMin = distance_1;
for (Int_t i=0;i<2;i++){
used[i] = false;
} //for
used[index] = true;
TSP_PhotonArray[index] = TSP_photon;
index++;
} //if
} //if thresh
} // while TSP
} //if thresh
} // while IU
That's all I have at the moment... work in progress, I realize all of the braces aren't closed. This is just a simple logic question.
This may take a few iterations.
As a particle physicist, you should understand the importance of breaking things down into their component parts. Let's start with iterating over all TSP photons. It looks as if the relevant code is here:
TIter photonIterTSP(mPhotonArray[1]);
while(TSP_photon=(EEmcParticleCandidate_t*)photonIterTSP.Next()) {
...
if(a certain condition is met)
TSP_PhotonArray[index] = TSP_photon;
}
So TSP_photon is a pointer, you will be copying it into the array TSP_PhotonArray (if the energy of the photon exceeds a fixed threshold), and you go to a lot of trouble keeping track of which pointers have already been so copied. There is a better way, but for now let's just consider the problem of finding the best match:
distMin=1000.0;
while(TSP_photon= ... ) {
distance_1 = compute_distance_somehow();
if (distance_1 < distMin) {
distMin = distance_1;
TSP_PhotonArray[index] = TSP_photon; // <-- BAD
index++; // <-- VERY BAD
}
}
This is wrong. Suppose you find a TSP_photon with the smallest distance yet seen. You haven't yet checked all TSP photons, so this might not be the best, but you store the pointer anyway, and increment the index. Then if you find another match that's even better, you'll store that one too. Conceptually, it should be something like this:
distMin=1000.0;
best_photon_yet = NULL;
while(TSP_photon= ... ) {
distance_1 = compute_distance_somehow();
if (distance_1 < distMin) {
distMin = distance_1;
best_photon_yet = TSP_photon;
}
}
// We've now finished searching the whole list of TSP photons.
TSP_PhotonArray[index] = best_photon_yet;
index++;
Post a comment to this answer, telling me if this makes sense; if so, we can proceed, if not, I'll try to clarify.
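Putting the "best so far" search together with the used flags from the question could look like the sketch below. It is written in plain C++ rather than ROOT, and Point and pairNearest are hypothetical names; the energy threshold is left out. Note that this pairs greedily in list order, as the question describes, which is not necessarily the pairing with the smallest total distance.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x, y; };   // stand-in for a photon position

// For each IU point, in order, picks the closest TSP point not yet taken.
// result[i] is the index of the TSP point paired with iu[i], or -1 if no
// TSP point was left.
std::vector<int> pairNearest(const std::vector<Point>& iu,
                             const std::vector<Point>& tsp)
{
    std::vector<bool> used(tsp.size(), false);
    std::vector<int> match(iu.size(), -1);
    for (std::size_t i = 0; i < iu.size(); ++i) {
        double best = std::numeric_limits<double>::infinity();
        int bestIndex = -1;
        for (std::size_t j = 0; j < tsp.size(); ++j) {
            if (used[j])
                continue;                    // already paired with an earlier IU point
            double d = std::hypot(iu[i].x - tsp[j].x, iu[i].y - tsp[j].y);
            if (d < best) {
                best = d;
                bestIndex = static_cast<int>(j);
            }
        }
        // Store the result only after the whole TSP list has been searched.
        if (bestIndex >= 0) {
            assert(!used[bestIndex]);
            used[bestIndex] = true;
            match[i] = bestIndex;
        }
    }
    return match;
}
```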

Optimizing WordWrap Algorithm

I have a word-wrap algorithm that basically generates lines of text that fit the width of the text. Unfortunately, it gets slow when I add too much text.
I was wondering if I overlooked any major optimizations that could be made. Also, if anyone has a better design that would still allow strings of lines or string pointers of lines, I'd be open to rewriting the algorithm.
Thanks
void AguiTextBox::makeLinesFromWordWrap()
{
textRows.clear();
textRows.push_back("");
std::string curStr;
std::string curWord;
int curWordWidth = 0;
int curLetterWidth = 0;
int curLineWidth = 0;
bool isVscroll = isVScrollNeeded();
int voffset = 0;
if(isVscroll)
{
voffset = pChildVScroll->getWidth();
}
int AdjWidthMinusVoffset = getAdjustedWidth() - voffset;
int len = getTextLength();
int bytesSkipped = 0;
int letterLength = 0;
size_t ind = 0;
for(int i = 0; i < len; ++i)
{
//get the unicode character
letterLength = _unicodeFunctions.bringToNextUnichar(ind,getText());
curStr = getText().substr(bytesSkipped,letterLength);
bytesSkipped += letterLength;
curLetterWidth = getFont().getTextWidth(curStr);
//push a new line
if(curStr[0] == '\n')
{
textRows.back() += curWord;
curWord = "";
curLetterWidth = 0;
curWordWidth = 0;
curLineWidth = 0;
textRows.push_back("");
continue;
}
//ensure word is not longer than the width
if(curWordWidth + curLetterWidth >= AdjWidthMinusVoffset &&
curWord.length() >= 1)
{
textRows.back() += curWord;
textRows.push_back("");
curWord = "";
curWordWidth = 0;
curLineWidth = 0;
}
//add letter to word
curWord += curStr;
curWordWidth += curLetterWidth;
//if we need a Vscroll bar start over
if(!isVscroll && isVScrollNeeded())
{
isVscroll = true;
voffset = pChildVScroll->getWidth();
AdjWidthMinusVoffset = getAdjustedWidth() - voffset;
i = -1;
curWord = "";
curStr = "";
textRows.clear();
textRows.push_back("");
ind = 0;
curWordWidth = 0;
curLetterWidth = 0;
curLineWidth = 0;
bytesSkipped = 0;
continue;
}
if(curLineWidth + curWordWidth >=
AdjWidthMinusVoffset && textRows.back().length() >= 1)
{
textRows.push_back("");
curLineWidth = 0;
}
if(curStr[0] == ' ' || curStr[0] == '-')
{
textRows.back() += curWord;
curLineWidth += curWordWidth;
curWord = "";
curWordWidth = 0;
}
}
if(curWord != "")
{
textRows.back() += curWord;
}
updateWidestLine();
}
There are two main things making this slower than it could be, I think.
The first, and probably less important: as you build up each line, you're appending words to the line. Each such operation may require the line to be reallocated and its old contents copied. For long lines, this is inefficient. However, I'm guessing that in actual use your lines are quite short (say 60-100 characters), in which case the cost is unlikely to be huge. Still, there's probably some efficiency to be won there.
The second, and probably much more important: you're apparently using this for a text-area in some sort of GUI, and I'm guessing that it's being typed into. If you're recomputing for every character typed, that's really going to hurt once the text gets long.
As long as the user is only adding characters at the end -- which is surely the most common case -- you can make effective use of the fact that with your "greedy" line-breaking algorithm changes never affect anything on earlier lines: so just recompute from the start of the last line.
If you want to make it fast even when the user is typing (or deleting or whatever) somewhere in the middle of the text, your code will need to do more work and store more information. For instance: whenever you build a line, remember "if you start a line with this word, it ends with that word and this is the whole resulting line". Invalidate this information when anything changes within that line. Now, after a little editing, most changes will not require very much recalculation. You should work out the details of this for yourself because (1) it's a good exercise and (2) I need to go to bed now.
(To save on memory, you might prefer not to store whole lines at all -- whether or not you implement the sort of trick I just described. Instead, just store here's-the-next-line-break information and build up lines as your UI needs to render them.)
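A rough sketch of two of the ideas above, storing only line-start offsets and re-wrapping from the start of the last line after an append. It is not AguiTextBox code: wrapFrom, rewrapAfterAppend and WidthFn are hypothetical names, widths are looked up per byte (no Unicode handling), and the only break opportunity is after a space.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using WidthFn = std::function<int(char)>;   // assumed per-character width lookup

// Greedy wrap of text[from..end): appends the start offset of each line to
// `starts`. Breaks after the last space on the line when possible, otherwise
// in the middle of the word.
void wrapFrom(const std::string& text, std::size_t from, int maxWidth,
              const WidthFn& widthOf, std::vector<std::size_t>& starts)
{
    assert(from <= text.size());
    const std::size_t none = std::string::npos;
    std::size_t lineStart = from;
    std::size_t lastSpace = none;   // last space seen on the current line
    int lineWidth = 0;
    int widthAtSpace = 0;           // line width up to and including lastSpace
    starts.push_back(lineStart);
    for (std::size_t i = from; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '\n') {            // hard break: the next line starts after it
            lineStart = i + 1;
            lastSpace = none;
            lineWidth = 0;
            starts.push_back(lineStart);
            continue;
        }
        const int w = widthOf(c);
        if (lineWidth + w > maxWidth && i > lineStart) {
            if (lastSpace != none && lineWidth - widthAtSpace + w <= maxWidth) {
                lineStart = lastSpace + 1;   // carry the partial word over
                lineWidth -= widthAtSpace;
            } else {
                lineStart = i;               // no usable space: split here
                lineWidth = 0;
            }
            lastSpace = none;
            starts.push_back(lineStart);
        }
        lineWidth += w;
        if (c == ' ') {
            lastSpace = i;
            widthAtSpace = lineWidth;
        }
    }
}

// After characters were appended to `text`, only the last line can change:
// drop its start offset and wrap again from there.
void rewrapAfterAppend(const std::string& text, int maxWidth,
                       const WidthFn& widthOf, std::vector<std::size_t>& starts)
{
    assert(!starts.empty());
    const std::size_t from = starts.back();
    starts.pop_back();
    wrapFrom(text, from, maxWidth, widthOf, starts);
}
```

A line's text can then be produced on demand with text.substr(starts[k], starts[k + 1] - starts[k]) when the UI needs to draw it.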
It's probably more complication than you want to take on board right now, but you should also look up Donald Knuth's dynamic-programming-based line-breaking algorithm. It's substantially more complicated than yours but can still be made quite quick, and it produces distinctly better results. See, e.g., http://defoe.sourceforge.net/folio/knuth-plass.html.
Problems with algorithms often come with problems with data structures.
Let's make a few observations, first:
paragraphs can be treated independently
editing at a given index only invalidates the current word and those that follow
it is unnecessary to copy the whole words when their index would suffice for retrieving them and only their length matter for the computation
Paragraph
I would begin by introducing the notion of a paragraph, determined by user-introduced line breaks. When an edit takes place, you need to locate the paragraph concerned, which requires a look-up structure.
The "ideal" structure here would be a Fenwick tree; for a small text box, however, this seems overkill. We'll just have each paragraph store the number of displayed lines that make up its representation, and you'll count from the beginning. Note that an access to the last displayed line is an access to the last paragraph.
The paragraphs are thus stored as a contiguous sequence; in C++ terms, we'll probably take the hit of an indirection (i.e. storing pointers) to save moving them around when a paragraph in the middle is removed.
Each paragraph will store:
its content, the simplest being a single std::string to represent it.
its display, in editable form (which we need to determine still)
Each paragraph will cache its display, this paragraph cache will be invalidated whenever an edit is made.
The actual rendering will be made for only a couple of paragraphs at a time (and better, a couple of displayed lines): those which are visible.
Displayed Line
A paragraph is displayed as at least one line, but there is no maximum. We need to store the "display" in editable form, that is, a form suitable for editing.
A single chunk of characters with \n thrown in is not suitable. Changes imply moving lots of characters around, and users are supposed to be changing the text, so we need better.
Using lengths, instead of characters, we may actually only store a mere 4 bytes (if the string takes more than 3GB... I don't guarantee much about this algorithm).
My first idea was to use the character index; however, after an edit all subsequent indexes change, and the propagation is error-prone. Lengths are offsets, so we have an index relative to the position of the previous word. This does raise the question of what a word (or token) is. Notably, do you collapse multiple spaces? How do you handle them? Here I'll assume that words are separated from one another by a single whitespace character.
For "fast" retrieval, I'll store the length of the whole displayed line as well. This allows quickly skipping the first displayed lines when an edit is made at character 503 of the paragraph.
A displayed line will thus be composed of:
a total length (less than the maximum displayed length of the box, once the computation has ended)
a sequence of word (token) lengths
This sequence should be editable efficiently at both ends (since for wrapping we'll push/pop words at both ends depending on whether an edit added or removed words). It's not so important if in the middle we're not that efficient, because only one line at a time is edited in the middle.
In C++, either a vector or deque should be fine. While in theory a list would be "perfect", in practice its poor memory locality and high memory overhead will offset its asymptotic guarantees. A line is composed of few words, so the asymptotic behavior does not matter and high constants do.
Rendering
For the rendering, pick up a buffer of already sufficient length (a std::string with a call to reserve will do). Normally, you'd clear and rewrite the buffer each time, so no memory allocation occurs.
You need not display what cannot be seen, but do need to know how many lines there are, to pick up the correct paragraph.
Once you get the paragraph:
set offset to 0
for each line hidden, increment offset by its length (+ 1 for the space after it)
a word is accessed as a substring of _content; you can use the insert method on buffer: buffer.insert(buffer.end(), _content.begin() + offset, _content.begin() + offset + length)
The difficulty is in maintaining offset, but that's what makes the algorithm efficient.
Structures
struct Paragraph; // forward declaration: LineDisplay refers back to its paragraph
struct LineDisplay: private boost::noncopyable
{
Paragraph& _paragraph;
uint32_t _length;
std::vector<uint16_t> _words; // copying around can be done with memmove
};
struct Paragraph
{
std::string _content;
boost::ptr_vector<LineDisplay> _lines;
};
With this structure, implementation should be straightforward, and should not slow down as much when the content grows.
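The rendering steps can be sketched against a simplified version of these structures. This is an illustration under assumptions, not the layout above verbatim: it uses a plain std::vector instead of boost::ptr_vector, drops the back-reference to the paragraph, assumes words are separated by exactly one space, and assumes a line's length counts the spaces between its words. renderLine is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct LineDisplay {
    std::uint32_t length;               // characters on the line, inner spaces included
    std::vector<std::uint16_t> words;   // length of each word on the line
};

struct Paragraph {
    std::string content;
    std::vector<LineDisplay> lines;
};

// Rebuilds one displayed line into `buffer`, reusing the buffer's storage.
void renderLine(const Paragraph& p, std::size_t lineIndex, std::string& buffer)
{
    assert(lineIndex < p.lines.size());
    std::size_t offset = 0;
    // Skip the hidden lines: each contributes its length plus the one space
    // that separated it from the next line.
    for (std::size_t i = 0; i < lineIndex; ++i)
        offset += p.lines[i].length + 1;

    buffer.clear();
    const LineDisplay& line = p.lines[lineIndex];
    for (std::size_t w = 0; w < line.words.size(); ++w) {
        if (w != 0)
            buffer.push_back(' ');
        buffer.append(p.content, offset, line.words[w]);   // copy one word
        offset += line.words[w] + 1;                       // step over the space
    }
}
```

Calling reserve on the buffer once, as suggested above, keeps repeated renderLine calls from allocating.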
General change to the algorithm -
work out if you need the scroll bar as cheap as you can, ie. count the number of \n in the text and if it's greater then the vheight turn on the scroll, check lengths so on.
prepare the text into appropriate lines for the control now that you know you need a scroll bar or not.
This allows you to remove or reduce the test if(!isVscroll && isVScrollNeeded()), which is run on almost every character. isVScrollNeeded() is probably not cheap; the example code doesn't seem to pass knowledge of lines to the function, so I can't see how it tells whether a scroll bar is needed.
Assuming textRows is a vector<string>, textRows.back() += is kind of expensive: not so much the lookup of back() as the repeated += on the string. I'd gather the row in a std::ostringstream (ostrstream is deprecated) and push it in when it is done.
getFont().getWidth() is likely to be expensive. Is the font changing? How much does the width differ between the narrowest and widest letter? Fixed-width fonts allow shortcuts.
Use native methods where possible to get the size of a word since you don't want to break them - GetTextExtentPoint32
Often there will be sufficient space to allow for the VScroll when you change between the two. Restarting the measuring from the beginning could cost you up to twice the time. Store the width of the line with each line so you can skip over the ones that still fit.
Or don't build the line strings directly; keep the words separate, each with its size.
How accurate does it really need to be? Apply some pragmatism...
Just assume VScroll will be needed; mostly the wrapping won't change much even if it isn't (1 letter words at the end/start of a line).
Try to work more with words than with letters - checking the remaining space for each letter can waste time. Assume each letter in the word is the widest letter: if letters x widest < space, then put it in.
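The "assume every letter is the widest letter" shortcut might be sketched as follows. surelyFits and maxGlyphWidth are hypothetical names; the widest glyph width is assumed to have been queried from the font once, up front.

```cpp
#include <cstddef>
#include <string>

// Hypothetical fast check: true when `word` is guaranteed to fit.
// `maxGlyphWidth` is the width of the widest glyph in the font and
// `remaining` is the space left on the line, in the same units.
// A false result only means "measure the word exactly", not "does not fit".
inline bool surelyFits(const std::string& word, int maxGlyphWidth, int remaining) {
    return static_cast<long long>(word.size()) * maxGlyphWidth <= remaining;
}
```

Only when this returns false do you need to fall back to the exact (and more expensive) width measurement for that word.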

Checking lists and running handlers

I find myself writing code that looks like this a lot:
set<int> affected_items;
while (string code = GetKeyCodeFromSomewhere())
{
if (code == "some constant" || code == "some other constant") {
affected_items.insert(some_constant_id);
} else if (code == "yet another constant" || code == "the constant I didn't mention yet") {
affected_items.insert(some_other_constant_id);
} // else if etc...
}
for (set<int>::iterator it = affected_items.begin(); it != affected_items.end(); it++)
{
switch(*it)
{
case some_constant_id:
RunSomeFunction(with, these, params);
break;
case some_other_constant_id:
RunSomeOtherFunction(with, these, other, params);
break;
// etc...
}
}
The reason I end up writing this code is that I need to only run the functions in the second loop once even if I've received multiple key codes that might cause them to run.
This just doesn't seem like the best way to do it. Is there a neater way?
One approach is to maintain a map from strings to booleans. The main logic can start with something like:
if(done[code])
continue;
done[code] = true;
Then you can perform the appropriate action as soon as you identify the code.
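A minimal sketch of that check, wrapped in a small class so the map is not a global. SeenCodes and firstTime are names invented for this example.

```cpp
#include <map>
#include <string>

// Hypothetical wrapper around the "done" map: returns true the first
// time a code is seen and false on every later occurrence.
class SeenCodes {
public:
    bool firstTime(const std::string& code) {
        bool& done = done_[code];  // operator[] value-initialises new entries to false
        if (done) return false;
        done = true;
        return true;
    }

private:
    std::map<std::string, bool> done_;
};
```

Note that this deduplicates per code. If two different codes trigger the same function, as in the question, you would key the map by the action id rather than by the code string.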
Another approach is to store something executable (object, function pointer, whatever) into a sort of "to do list." For example:
while (string code = GetKeyCodeFromSomewhere())
{
todo[code] = codefor[code];
}
Initialize codefor to contain the appropriate function pointer, or object subclassed from a common base class, for each code value. If the same code shows up more than once, the appropriate entry in todo will just get overwritten with the same value that it already had. At the end, iterate over todo and run all of its members.
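Here is one way the to-do list might look, as a sketch rather than a drop-in solution. It is a slight variation: the list is keyed by a handler id instead of by the code, so two codes sharing a handler still run it only once. Handler, runOnce and the two lookup tables are assumptions made for the example.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

using Handler = std::function<void()>;

// Runs each handler at most once, however many of its codes appear.
// `codefor` maps a key code to a handler id and `handlers` maps that id
// to the callable; both tables are illustrative assumptions.
void runOnce(const std::vector<std::string>& codes,
             const std::map<std::string, int>& codefor,
             const std::map<int, Handler>& handlers) {
    std::map<int, const Handler*> todo;
    for (const std::string& code : codes) {
        auto id = codefor.find(code);
        if (id == codefor.end()) continue;  // unknown code: ignore it
        auto h = handlers.find(id->second);
        if (h != handlers.end()) todo[h->first] = &h->second;  // overwriting is harmless
    }
    for (const auto& entry : todo) (*entry.second)();
}
```

Handlers that need parameters can capture them in a lambda when the handlers table is built.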
Since you don't seem to care about the actual values in the set you could replace it with setting bits in an int. You can also replace the linear time search logic with log time search logic. Here's the final code:
// Ahead of time you build a static map from your strings to bit values.
std::map< std::string, int > codesToValues;
codesToValues[ "some constant" ] = 1;
codesToValues[ "some other constant" ] = 1;
codesToValues[ "yet another constant" ] = 2;
codesToValues[ "the constant I didn't mention yet" ] = 2;
// When you want to do your work
int affected_items = 0;
while (string code = GetKeyCodeFromSomewhere())
affected_items |= codesToValues[ code ];
if( affected_items & 1 )
RunSomeFunction(with, these, params);
if( affected_items & 2 )
RunSomeOtherFunction(with, these, other, params);
// etc...
It's certainly not neater, but you could maintain a set of flags that say whether you've called that specific function or not. That way you avoid having to save things off in a set; you just have the flags.
Since the number of different if/else blocks is (presumably, from the way it is written) fixed at compile time, you can do this pretty easily with a bitset.
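A bitset version could be sketched like this. The Action enum, ActionSet and markAction are hypothetical names; the string constants are the placeholders from the question.

```cpp
#include <bitset>
#include <string>

// Hypothetical action ids; kActionCount must stay last.
enum Action { kSomeAction, kSomeOtherAction, kActionCount };
using ActionSet = std::bitset<kActionCount>;

// Marks the action triggered by `code`, if any. Setting a bit twice is harmless.
inline void markAction(const std::string& code, ActionSet& pending) {
    if (code == "some constant" || code == "some other constant") {
        pending.set(kSomeAction);
    } else if (code == "yet another constant" ||
               code == "the constant I didn't mention yet") {
        pending.set(kSomeOtherAction);
    }
}
```

After the input loop, test each bit once, e.g. if (pending.test(kSomeAction)) RunSomeFunction(with, these, params);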
Obviously, it will depend on the specific circumstances, but it might be better to have the functions that you call keep track of whether they've already been run and exit early if required.
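If you go that way, the bookkeeping can live in a small guard object instead of inside each function. This is only a sketch; OncePerPass is a made-up name, and you have to remember to reset it before the next batch of key codes.

```cpp
// Hypothetical guard: runs the callable only on the first call after a reset.
class OncePerPass {
public:
    template <typename F>
    bool run(F&& f) {
        if (ran_) return false;  // already ran during this pass: exit early
        ran_ = true;
        f();
        return true;
    }

    void reset() { ran_ = false; }  // call before processing the next batch

private:
    bool ran_ = false;
};
```

You would keep one guard per function and call something like guard.run([&] { RunSomeFunction(with, these, params); }); as soon as a matching code is seen.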