Prioritization of output by more than one factor in List


public Transform m_targetPos;
public List<Transform> l_targetList = new List<Transform>();

private void GetPriority()
{
    l_targetList = l_targetList.OrderBy(x => Vector3.Distance(_path.m_start.position,
        (x.transform == _PC.transform) ? x.transform.position + new Vector3(0, 10, 0) : x.transform.position))
        .ToList();
    m_targetPos = l_targetList[0];
}
This method picks a single Transform and stores it in m_targetPos, which another method then uses for things like targeting an AoE attack. Right now the list is sorted by one factor (the distance from a point called _path.m_start.position), and that basic behaviour works as intended.
However, how can I add a second distinguishing factor?
Let me explain what I want:
There are two tags, tagA and tagB. If l_targetList contains objectAwithTagA at a distance of 10 and objectBwithTagB at a distance of 7, GetPriority() stores objectBwithTagB because it is closer. However, I want tagA to take priority, so GetPriority() should ignore the distance factor (or compensate for it by some amount) in favour of the object with tagA.
I feel like I haven't explained this very clearly.

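Instead of just x => Vector3.Distance(...) do something along the lines of x => Vector3.Distance(...) * CalculateTagFactor(x). To have no effect from the tag, just have CalculateTagFactor return 1. To favour a tag, return a factor below 1 (for example, 0.5 makes objects with that tag count as half as far away). To push anything with a given tag to the end of the ordering, return float.PositiveInfinity. Avoid float.NaN for this: the default float comparison in .NET orders NaN before every other value, so those objects would end up first instead of last.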
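To make the idea concrete, here is a minimal sketch of the weighted key. It is written in plain C++ rather than C#/Unity so it can be run on its own; the Target struct, the tagFactor function and the 0.5 weight are assumptions for illustration, not part of the question's code. Since only the first element is needed, it takes the minimum instead of sorting the whole list.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical stand-in for a Unity Transform with a tag.
struct Target {
    std::string name;
    std::string tag;
    float x, y, z;
};

// Assumed weighting: a factor below 1 makes a tag look closer than it is.
float tagFactor(const Target& t) {
    return t.tag == "tagA" ? 0.5f : 1.0f;
}

// Sort key: distance from the origin point, scaled by the tag factor.
float priorityKey(const Target& t, float ox, float oy, float oz) {
    float dx = t.x - ox, dy = t.y - oy, dz = t.z - oz;
    return std::sqrt(dx * dx + dy * dy + dz * dz) * tagFactor(t);
}

// Returns the target with the smallest weighted key; the list must not be empty.
const Target& pickTarget(const std::vector<Target>& targets,
                         float ox, float oy, float oz) {
    assert(!targets.empty());
    return *std::min_element(targets.begin(), targets.end(),
        [&](const Target& a, const Target& b) {
            return priorityKey(a, ox, oy, oz) < priorityKey(b, ox, oy, oz);
        });
}
```

With the numbers from the question (tagA at distance 10, tagB at distance 7) the weighted keys are 5 and 7, so the tagA object wins. How strongly a tag should outweigh distance is a tuning decision.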

Related

custom reduce functions in crossfilter on 2 fields

My data looks like this
field1,field2,value1,value2
a,b,1,1
b,a,2,2
c,a,3,5
b,c,6,7
d,a,6,7
The ultimate goal is to get value1+value2 for each distinct value of field1 and field2 : {a:15(=1+2+5+7),b:9(=1+2+6),c:10(=3+7),d:6(=6)}
I don't have a good way of rearranging that data so let's assume the data has to stay like this.
Based on this previous question (thanks @Gordon), I mapped using:
cf.dimension(function(d) { return [d.field1,d.field2]; }, true);
But I am a bit puzzled as to how to write the custom reduce functions for my use case. The main question is: from within the reduceAdd and reduceRemove functions, how do I know which key is currently being "worked on"? In other words, how do I know whether I'm supposed to take value1 or value2 into account in my sum?
(have tagged dc.js and reductio because it could be useful for users of those libraries)

OK so I ended up doing the following for defining the group :
reduceAdd: (p, v) => {
    if (!p.hasOwnProperty(v.field1)) {
        p[v.field1] = 0;
    }
    if (!p.hasOwnProperty(v.field2)) {
        p[v.field2] = 0;
    }
    p[v.field1] += +v.value1;
    p[v.field2] += +v.value2;
    return p;
},
reduceRemove: (p, v) => {
    p[v.field1] -= +v.value1;
    p[v.field2] -= +v.value2;
    return p;
},
reduceInitial: () => {
    return {};
}
When you use the group in a chart, you just change the valueAccessor to (d) => d.value[d.key] instead of the usual (d) => d.value.
This is a small inefficiency, since every group value stores totals for more keys than it needs, but unless you have millions of distinct values it's basically negligible.
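To check the arithmetic of this add/remove scheme independently of crossfilter, here is a small language-neutral sketch, written in C++ only so that it runs standalone. The Row struct and the reduceAll helper are assumptions for illustration. Each row contributes value1 to its field1 key and value2 to its field2 key, which is exactly what the reducers above do.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Row {
    std::string field1, field2;
    int value1, value2;
};

using Totals = std::map<std::string, int>;

// Adding a row credits value1 to field1's key and value2 to field2's key.
void reduceAdd(Totals& p, const Row& v) {
    p[v.field1] += v.value1;  // operator[] starts missing keys at 0
    p[v.field2] += v.value2;
}

// Removing a row undoes exactly what reduceAdd did for it.
void reduceRemove(Totals& p, const Row& v) {
    assert(p.count(v.field1) && p.count(v.field2));  // row must have been added
    p[v.field1] -= v.value1;
    p[v.field2] -= v.value2;
}

// Folds a whole data set into one totals map.
Totals reduceAll(const std::vector<Row>& rows) {
    Totals p;
    for (const Row& r : rows) reduceAdd(p, r);
    return p;
}
```

Feeding in the five sample rows from the question gives a = 15, b = 9, c = 10 and d = 6, matching the totals the question asks for.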

You always have a good way to rearrange the data: after you have fetched it and before you feed it to crossfilter ;)
In fact, it's pretty much mandatory as soon as you handle non-string fields (numeric or date).
You can do a reduceSum on multiple fields. Note that reduceSum is a method of a group, not of a dimension:
dimension.group().reduceSum(function(d) { return +d.value1 + +d.value2; });

Machine Learning Algorithm using recursion

I am currently working on a very beginners version of the ID3 machine learning algorithm. I am stuck on how to recursively call my build_tree function to actually make the rest of the decision tree and output it in a nice format. I have calculated gains, entropies, gain ratios, etc. but I have no clue how to integrate recursion into my function.
I am given a data set which, after doing all the calculations mentioned above, I have split into two data sets. Now I need to call the function recursively until both the left and right data sets become pure (which can easily be checked by a function I wrote called dataset.is_pure()), all while keeping track of the threshold at each node. I know that all my calculations and split methods are working, as I have tested them individually. It is just the recursive part that I am having trouble with.
Here is my build_tree function that I am having a recursion nightmare with. I am currently working in a linux environment with the g++ compiler. The code I have right now compiles, but when run gives me a segmentation error. Any and all help would be greatly appreciated!
struct node
{
    vector<vector<string>> data;
    double atrb;
    node* parent;
    node* left = NULL;
    node* right = NULL;
    node(node* parent) : parent(parent) {}
};
node* root = new node(NULL);

void build_tree(node* current, dataset data_set)
{
    vector<vector<string>> l_d;
    vector<vector<string>> r_d;
    double global_entropy = calc_entropy(data_set.get_col(data_set.n_col()-1));
    int best_col = this->get_best_col(data_set, global_entropy);
    hash_map selected_atrb(data_set.n_row(), data_set.truncate(best_col));
    double threshold = get_threshold(selected_atrb, global_entropy);
    cout << threshold << "\n";
    split_data(threshold, best_col, data_set, l_d, r_d);
    dataset right_data(r_d);
    dataset left_data(l_d);
    right_data.delete_col(best_col);
    left_data.delete_col(best_col);
    if(left_data.is_pure())
        return;
    else
    {
        node* new_left = new node(current);
        new_left->atrb = threshold;
        current->left = new_left;
        new_left->data = l_d;
        return build_tree(new_left, left_data);
    }
    if(right_data.is_pure())
        return;
    else
    {
        node* new_right = new node(current);
        new_right->atrb = threshold;
        current->right = new_right;
        new_right->data = r_d;
        return build_tree(new_right, right_data);
    }
}

id3(dataset data)
{
    build_tree(root, data);
}
};
This is only a part of my class. If you wish to see any other code, just let me know!

Hello,
I will explain how the recursive function works using pseudocode, and I will also leave you the JavaScript code I wrote to implement this algorithm.
Before going into detail, here are the concepts and classes I use:
Attribute: A characteristic of the data set; usually the name of a column of the data set.
Class: The decision characteristic; generally binary, and usually the last column of the data set.
Value: A possible value of an attribute in the data set, for example (Sunny, Cloudy, Rainy).
Tree: A class that holds a number of nodes associated with each other.
Node: The entity that stores an attribute (the question); it also has a list of arcs.
Arc: Contains one value of an attribute and a reference to the following child node.
Leaf: Contains a class. This node is the result of a decision, for example (Yes or No).
Best feature: The attribute with the highest information gain.
Function to create the tree from a set of data:
Obtain the values of the class.
Evaluate whether there is only one class value in the data set, for example (Yes). If so, create a Leaf object and return it.
Obtain the information gain of each remaining attribute.
Choose the attribute with the highest information gain.
Create a node with the best feature.
Obtain the values of the best feature.
Iterate over the list of those values. For each value:
     - Filter the list so that only records with that value remain (save it in a temporary variable).
     - Create an Arc with this value.
     - Assign the Arc's next node (here comes the recursion): call this same function again, passing the filtered list of records, the class, the list of attributes without the best feature, and the list of general attribute values without those of the best feature.
     - Add the arc to the node.
Return the node.
This is the segment of code that is responsible for creating the tree:
let crearArbol = (ejemplosLista, clase, atributos, valores) => {
    // Base case 1: every example has the same class, so return a leaf (Hoja).
    let valoresClase = obtenerValoresAtributo(ejemplosLista, clase);
    if (valoresClase.length == 1) {
        autoIncremental++;
        return new Hoja(valoresClase[0], autoIncremental);
    }
    // Base case 2: no attributes left to split on, so return a leaf with the majority class.
    if (atributos.length == 0) {
        let claseDominante = claseMayoritaria(ejemplosLista);
        autoIncremental++;
        return new Hoja(claseDominante, autoIncremental);
    }
    // Pick the attribute with the highest information gain and make it a node (Atributo).
    let gananciaAtributos = obtenerGananciaAtributos(ejemplosLista, valores, atributos);
    let atributoMaximo = atributos[maximaGanancia(gananciaAtributos)];
    autoIncremental++;
    let nodo = new Atributo(atributoMaximo, [], autoIncremental);
    // One arc (Arco) per value of that attribute; each arc's child is built recursively.
    let valoresLista = obtenerValoresAtributo(ejemplosLista, atributoMaximo);
    valoresLista.forEach((valor) => {
        let ejemplosFiltrados = arrayDistincAtributos(ejemplosLista, atributoMaximo, valor);
        let arco = new Arco(valor);
        arco.sigNodo = crearArbol(ejemplosFiltrados, clase, [...eliminarAtributo(atributoMaximo, atributos)], [...eliminarValores(atributoMaximo, valores)]);
        nodo.hijos.push(arco);
    });
    return nodo;
};
Unfortunately, the code is only in Spanish.
This is the repository that contains my project with this implementation Source code of id3
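Since the question is in C++, here is a minimal sketch of the same recursive shape in that language. It is not the asker's class: the Sample and Node types are assumptions, and the threshold is simply the midpoint of the feature range, standing in for the gain-based choice. The point it illustrates is that both subsets are built before the function returns. In the question's build_tree, the `return build_tree(new_left, left_data);` line exits before the right subset is ever looked at.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Sample { double feature; bool label; };

struct Node {
    bool isLeaf = false;
    bool label = false;      // meaningful for leaves only
    double threshold = 0.0;  // meaningful for inner nodes only
    std::unique_ptr<Node> left, right;
};

bool isPure(const std::vector<Sample>& d) {
    return std::all_of(d.begin(), d.end(),
        [&](const Sample& s) { return s.label == d.front().label; });
}

std::unique_ptr<Node> buildTree(const std::vector<Sample>& data) {
    assert(!data.empty());
    std::unique_ptr<Node> node(new Node());
    if (isPure(data)) {                       // base case: nothing left to split
        node->isLeaf = true;
        node->label = data.front().label;
        return node;
    }
    // Placeholder threshold choice (assumption): midpoint of the feature range.
    auto mm = std::minmax_element(data.begin(), data.end(),
        [](const Sample& a, const Sample& b) { return a.feature < b.feature; });
    node->threshold = (mm.first->feature + mm.second->feature) / 2.0;
    std::vector<Sample> l, r;
    for (const Sample& s : data) (s.feature <= node->threshold ? l : r).push_back(s);
    if (l.empty() || r.empty()) {             // cannot split: majority-label leaf
        std::size_t yes = std::count_if(data.begin(), data.end(),
            [](const Sample& s) { return s.label; });
        node->isLeaf = true;
        node->label = 2 * yes >= data.size();
        return node;
    }
    node->left = buildTree(l);                // recurse on BOTH halves,
    node->right = buildTree(r);               // then return the finished node
    return node;
}

bool classify(const Node& root, double feature) {
    const Node* cur = &root;
    while (!cur->isLeaf)
        cur = (feature <= cur->threshold ? cur->left : cur->right).get();
    return cur->label;
}
```

Returning the finished subtree from each call, rather than mutating a shared root, also means that every pointer is either null or owns a fully built node.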

Low Memory Shortest Path Algorithm

I have a global unique path table which can be thought of as a directed un-weighted graph. Each node represents either a piece of physical hardware which is being controlled, or a unique location in the system. The table contains the following for each node:
A unique path ID (int)
Type of component (char - 'A' or 'L')
String which contains a comma separated list of path ID's which that node is connected to (char[])
I need to create a function which, given a starting and an ending node, finds the shortest path between the two nodes. Normally this is a pretty simple problem, but here is the issue I am having: I have a very limited amount of memory/resources, so I cannot use any dynamic memory allocation (i.e. a queue/linked list). It would also be nice if it wasn't recursive (but it wouldn't be too big of an issue if it was, as the table/graph itself is really small. Currently it has 26 nodes, 8 of which will never be hit. In the worst case there would be about 40 nodes total).
I started putting something together, but it doesn't always find the shortest path. The pseudo code is below:
bool shortestPath(int start, int end)
    if start == end
        if pathTable[start].nodeType == 'A'
            Turn on part
        end if
        return true
    else
        mark the current node
        bool val
        for each node in connectedNodes
            if node is not marked
                val = shortestPath(node.PathID, end)
            end if
        end for
        if val == true
            if pathTable[start].nodeType == 'A'
                turn on part
            end if
            return true
        end if
    end if
    return false
end function
Anyone have any ideas how to either fix this code, or know something else that I could use to make it work?
----------------- EDIT -----------------
Taking Aasmund's advice, I looked into implementing a Breadth First Search. Below I have some c# code which I quickly threw together using some pseudo code I found online.
pseudo code found online:
Input: A graph G and a root v of G
procedure BFS(G,v):
    create a queue Q
    enqueue v onto Q
    mark v
    while Q is not empty:
        t ← Q.dequeue()
        if t is what we are looking for:
            return t
        for all edges e in G.adjacentEdges(t) do
            u ← G.adjacentVertex(t,e)
            if u is not marked:
                mark u
                enqueue u onto Q
    return none
C# code which I wrote using this code:
public static bool newCheckPath(int source, int dest)
{
    Queue<PathRecord> Q = new Queue<PathRecord>();
    Q.Enqueue(pathTable[source]);
    pathTable[source].markVisited();
    while (Q.Count != 0)
    {
        PathRecord t = Q.Dequeue();
        if (t.pathID == pathTable[dest].pathID)
        {
            return true;
        }
        else
        {
            string connectedPaths = pathTable[t.pathID].connectedPathID;
            for (int x = 0; x < connectedPaths.Length && connectedPaths != "00"; x = x + 3)
            {
                int nextNode = Convert.ToInt32(connectedPaths.Substring(x, 2));
                PathRecord u = pathTable[nextNode];
                if (!u.wasVisited())
                {
                    u.markVisited();
                    Q.Enqueue(u);
                }
            }
        }
    }
    return false;
}
This code runs just fine; however, it only tells me whether a path exists, which doesn't really work for me. Ideally, in the block "if (t.pathID == pathTable[dest].pathID)" I would like to have either a list or some other way to see which nodes I had to pass through to get from the source to the destination, so that I can process those nodes there rather than returning a list to process elsewhere. Any ideas on how I could make that change?

The most effective solution, if you're willing to use static memory allocation (or automatic, as I seem to recall that the C++ term is), is to declare a fixed-size int array (of size 41, if you're absolutely certain that the number of nodes will never exceed 40). By using two indices to indicate the start and end of the queue, you can use this array as a ring buffer, which can act as the queue in a breadth-first search.
Alternatively: Since the number of nodes is so small, Bellman-Ford may be fast enough. The algorithm is simple to implement, does not use recursion, and the required extra memory is only a distance (int, or even byte in your case) and a predecessor id (int) per node. The running time is O(VE), alternatively O(V^3), where V is the number of nodes and E is the number of edges.
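A minimal sketch of the fixed-array approach, which also records a predecessor per node so the actual path can be recovered (the follow-up in the question's edit). The Graph layout, MAX_DEGREE and the function names are assumptions for illustration; the real table stores its edges as a comma-separated string. Because a breadth-first search enqueues each node at most once, the head and tail indices never pass the end of the array here, so no wrap-around is needed.

```cpp
#include <cassert>

const int MAX_NODES = 41;  // array size suggested in the answer above
const int MAX_DEGREE = 8;  // assumed upper bound on outgoing edges per node

struct Graph {
    int degree[MAX_NODES];
    int next[MAX_NODES][MAX_DEGREE];
};

void addEdge(Graph& g, int from, int to) {
    assert(from >= 0 && from < MAX_NODES && to >= 0 && to < MAX_NODES);
    assert(g.degree[from] < MAX_DEGREE);
    g.next[from][g.degree[from]++] = to;
}

// Writes a shortest path (start first, dest last) into pathOut and returns
// its length in nodes, or 0 if dest cannot be reached. No heap allocation.
int shortestPath(const Graph& g, int start, int dest, int pathOut[MAX_NODES]) {
    int queue[MAX_NODES];
    int pred[MAX_NODES];                 // -1 means "not visited yet"
    for (int i = 0; i < MAX_NODES; ++i) pred[i] = -1;
    int head = 0, tail = 0;
    queue[tail++] = start;
    pred[start] = start;                 // the start is its own predecessor
    while (head != tail) {
        int t = queue[head++];
        if (t == dest) break;
        for (int k = 0; k < g.degree[t]; ++k) {
            int u = g.next[t][k];
            if (pred[u] == -1) {         // first visit is the shortest one
                pred[u] = t;
                queue[tail++] = u;
            }
        }
    }
    if (pred[dest] == -1) return 0;
    int len = 1;                         // count nodes by walking backwards
    for (int n = dest; n != start; n = pred[n]) ++len;
    int i = len - 1;                     // then fill the output back to front
    for (int n = dest; ; n = pred[n]) {
        pathOut[i--] = n;
        if (n == start) break;
    }
    return len;
}
```

The caller can then loop over pathOut and turn on every node of type 'A' along the way, instead of doing that work inside the search.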

Data structure for selecting groups of machines

I have this old batch system. The scheduler stores all computational nodes in one big array. Now that's OK for the most part, because most queries can be solved by filtering for nodes that satisfy the query.
The problem I have now is that apart from some basic properties (number of cpus, memory, OS), there are also these weird grouping properties (city, infiniband, network scratch).
Now the issue with these is that when a user requests nodes with infiniband I can't just give him any nodes, but I have to give him nodes connected to one infiniband switch, so the nodes can actually communicate using infiniband.
This is still OK, when user only requests one such property (I can just partition the array for each of the properties and then try to select the nodes in each partition separately).
The problem comes with combining multiple such properties, because then I would have to generate all combinations of the subsets (partitions of the main array).
The good thing is that most of the properties are in a sub-set or equivalence relation (It sort of makes sense for machines on one infiniband switch to be in one city). But this unfortunately isn't strictly true.
Is there some good data structure for storing this kind of semi-hierarchical mostly-tree-like thing?
EDIT: example
node1 : city=city1, infiniband=switch03, networkfs=server01
node2 : city=city1, infiniband=switch03, networkfs=server01
node3 : city=city1, infiniband=switch03
node4 : city=city1, infiniband=switch03
node5 : city=city2, infiniband=switch03, networkfs=server02
node6 : city=city2, infiniband=switch03, networkfs=server02
node7 : city=city2, infiniband=switch04, networkfs=server02
node8 : city=city2, infiniband=switch04, networkfs=server02
Users request:
2x node with infiniband and networkfs
The desired output would be: (node1, node2) or (node5,node6) or (node7,node8).
In a good situation this example wouldn't happen, but we actually have these weird cross-site connections in some cases. If the nodes in city2 would be all on infiniband switch04, it would be easy. Unfortunately now I have to generate groups of nodes, that have the same infiniband switch and same network filesystem.
In reality the problem is much more complicated, since users don't request entire nodes, and the properties are many.
Edit: added the desired output for the query.

Assuming you have p grouping properties and n machines, a bucket-based solution is the easiest to set up and provides O(2^p·log(n)) access and updates.
You create a bucket-heap for every group of properties (so you would have a bucket-heap for "infiniband", a bucket-heap for "networkfs" and a bucket-heap for "infiniband × networkfs"); this means 2^p bucket-heaps.
Each bucket-heap contains a bucket for every combination of values (so the "infiniband" bucket-heap would contain a bucket for key "switch04" and one for key "switch03"); this means a total of at most n·2^p buckets split across all bucket-heaps.
Each bucket is a list of servers (possibly partitioned into available and unavailable). The bucket-heap is a standard heap (see std::make_heap) where the value of each bucket is the number of available servers in that bucket.
Each server stores references to all buckets that contain it.
When you look for servers that match a certain group of properties, you just look in the corresponding bucket-heap for that property group, and climb down the heap looking for a bucket that's large enough to accommodate the number of servers requested. This takes O(log(p)·log(n)).
When servers are marked as available or unavailable, you have to update all buckets containing those servers, and then update the bucket-heaps containing those buckets. This is an O(2^p·log(n)) operation.
If you find yourself having too many properties (and the 2^p grows out of control), the algorithm allows for some bucket-heaps to be built on demand from other bucket-heaps: if the user requests "infiniband × networkfs" but you only have a bucket-heap available for "infiniband" or "networkfs", you can turn each bucket in the "infiniband" bucket-heap into a bucket-heap of its own (use a lazy algorithm so you don't have to process all buckets if the first one works) and use a lazy heap-merging algorithm to find an appropriate bucket. You can then use an LRU cache to decide which property groups are stored and which are built on demand.

My guess is that there won't be an "easy, efficient" algorithm and data structure to solve this problem, because what you're doing is akin to solving a set of simultaneous equations. Suppose there are 10 categories (like city, infiniband and network) in total, and the user specifies required values for 3 of them. The user asks for 5 nodes, let's say. Your task is then to infer values for the remaining 7 categories, such that at least 5 records exist that have all 10 category fields equal to these values (the 3 specified and the 7 inferred). There may be multiple solutions.
Still, provided there aren't too many different categories, and not too many distinct possibilities within each category, you can do a simple brute force recursive search to find possible solutions, where at each level of recursion you consider a particular category, and "try" each possibility for it. Suppose the user asks for k records, and may choose to stipulate any number of requirements via required_city, required_infiniband, etc.:
either(x, y) := if defined(x) then [x] else y
For each city c in either(required_city, [city1, city2]):
    For each infiniband i in either(required_infiniband, [switch03, switch04]):
        For each networkfs nfs in either(required_nfs, [undefined, server01, server02]):
            Do at least k records of type [c, i, nfs] exist? If so, return them.
The either() function is just a way of limiting the search to the subspace containing points that the user gave constraints for.
Based on this, you will need a way to quickly look up the number of points (rows) for any given [c, i, nfs] combination -- nested hashtables will work just fine for this.
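As a rough illustration of that lookup, the sketch below groups machines by their (infiniband, networkfs) combination in an ordered map and returns every group that has at least k members. The Machine struct and the function names are assumptions; a std::map keyed on a pair is used in place of nested hashtables to keep the example short, and an empty string stands for "property not set".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Machine {
    std::string name, city, infiniband, networkfs;  // "" = property absent
};

using GroupKey = std::pair<std::string, std::string>;  // (infiniband, networkfs)
using Groups = std::map<GroupKey, std::vector<std::string>>;

// Buckets machine names by the combination of both grouping properties.
Groups groupBySwitchAndFs(const std::vector<Machine>& machines) {
    Groups groups;
    for (const Machine& m : machines) {
        if (m.infiniband.empty() || m.networkfs.empty()) continue;
        groups[GroupKey(m.infiniband, m.networkfs)].push_back(m.name);
    }
    return groups;
}

// Every group from which a request for k machines could be satisfied.
std::vector<std::vector<std::string>> candidateGroups(
        const std::vector<Machine>& machines, std::size_t k) {
    assert(k > 0);
    std::vector<std::vector<std::string>> result;
    for (const auto& entry : groupBySwitchAndFs(machines))
        if (entry.second.size() >= k) result.push_back(entry.second);
    return result;
}
```

For the eight sample nodes in the question and k = 2, this yields the three groups listed there as desired output: (node1, node2), (node5, node6) and (node7, node8).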

Step 1: Create an index for each property. E.g. for each property+value pair, create a sorted list of nodes with that property. Put each such list into an associative array of some kind, that is, something like an STL map, one for each property, indexed by value. When you are done, you have a near constant time function that can return a list of nodes matching a single property+value pair. The list is simply sorted by node number.
Step 2: Given a query, for each property+value pair required, retrieve the list of nodes.
Step 3: Starting with the shortest list, call it list 0, compare it to each of the other lists in turn removing elements from list 0 that are not in the other lists.
You should now have just the nodes that have all the properties requested.
Your other option would be to use a database, it is already set up to support queries like this. It can be done all in memory with something like BerkeleyDB with the SQL extensions.
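Steps 1 to 3 can be sketched roughly as follows. This version makes some simplifying assumptions: a single std::map keyed on a "property=value" string replaces the one-map-per-property layout, node numbers are plain ints, and the names Index and matchAll are made up for the example. std::set_intersection does the pairwise pruning and relies on the lists being sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <vector>

using NodeList = std::vector<int>;              // node numbers, kept sorted
using Index = std::map<std::string, NodeList>;  // "property=value" -> nodes

// Returns the nodes that appear in the list of every required key.
NodeList matchAll(const Index& index, const std::vector<std::string>& required) {
    assert(!required.empty());
    std::vector<const NodeList*> lists;
    for (const std::string& key : required) {
        Index::const_iterator it = index.find(key);
        if (it == index.end()) return NodeList();  // no node has this value
        lists.push_back(&it->second);
    }
    // Start from the shortest list so the working set is as small as possible.
    std::sort(lists.begin(), lists.end(),
        [](const NodeList* a, const NodeList* b) { return a->size() < b->size(); });
    NodeList result = *lists[0];
    for (std::size_t i = 1; i < lists.size(); ++i) {
        NodeList kept;
        std::set_intersection(result.begin(), result.end(),
                              lists[i]->begin(), lists[i]->end(),
                              std::back_inserter(kept));
        result.swap(kept);
    }
    return result;
}
```

Note that this filters on concrete values only; it does not by itself find groups that share an unspecified value, which is the harder part of the question.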

If sorting the list by every criteria mentioned in the query is viable (or having the list pre-sorted by each relative criteria), this works very well.
By "relative criteria", I mean criteria not of the form "x must be 5", which are trivial to filter against, but criteria of the form "x must be the same for each item in the result set". If there are also criteria of the "x must be 5" form, then filter against those first, then do the following.
It relies on using a stable sort on multiple columns to find the matching groups quickly (without trying out combinations).
The complexity is number of nodes * number of criteria in the query (for the algorithm itself) + number of nodes * log(number of nodes) * number of criteria (for the sort, if not pre-sorting). So Nodes*Log(Nodes)*Criteria.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace bleh
{
    class Program
    {
        static void Main(string[] args)
        {
            List<Node> list = new List<Node>();
            // create a random input list
            Random r = new Random();
            for (int i = 1; i <= 10000; i++)
            {
                Node node = new Node();
                for (char c = 'a'; c <= 'z'; c++) node.Properties[c.ToString()] = (r.Next() % 10 + 1).ToString();
                list.Add(node);
            }
            // if you have any absolute criteria, filter the list first according to it, which is very easy
            // i am sure you know how to do that
            // only look at relative criteria after removing nodes which are eliminated by absolute criteria
            // example
            List<string> criteria = new List<string> {"c", "h", "r", "x" };
            criteria = criteria.OrderBy(x => x).ToList();
            // order the list by each relative criteria, using a ***STABLE*** sort
            foreach (string s in criteria)
                list = list.OrderBy(x => x.Properties[s]).ToList();
            // size of sought group
            int n = 4;
            // this is the algorithm
            int sectionstart = 0;
            int sectionend = 0;
            for (int i = 1; i < list.Count; i++)
            {
                bool same = true;
                foreach (string s in criteria) if (list[i].Properties[s] != list[sectionstart].Properties[s]) same = false;
                if (same == true) sectionend = i;
                else sectionstart = i;
                if (sectionend - sectionstart == n - 1) break;
            }
            // print the results
            Console.WriteLine("\r\nResult:");
            for (int i = sectionstart; i <= sectionend; i++)
            {
                Console.Write("[" + i.ToString() + "]" + "\t");
                foreach (string s in criteria) Console.Write(list[i].Properties[s] + "\t");
                Console.WriteLine();
            }
            Console.ReadLine();
        }
    }
}

I would do something like this (obviously instead of strings you should map them to int, and use int's as codes)
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct structNode
{
    std::set<std::string> sMachines;
    std::map<std::string, int> mCodeToIndex;
    std::vector<structNode> vChilds;
};

// vCodes is taken by const reference so that temporaries can be passed in.
void Fill(std::string strIdMachine, int iIndex, structNode* pNode, const std::vector<std::string> &vCodes)
{
    if(iIndex < (int)vCodes.size())
    {
        // Add "Empty" if Needed
        if(pNode->vChilds.size() == 0)
        {
            pNode->mCodeToIndex.insert(pNode->mCodeToIndex.begin(), std::make_pair("empty", 0));
            pNode->vChilds.push_back(structNode());
        }
        // Add for "Empty"
        pNode->vChilds[0].sMachines.insert(strIdMachine);
        Fill(strIdMachine, (iIndex + 1), &pNode->vChilds[0], vCodes );
        if(vCodes[iIndex] == "empty")
            return;
        // Add for "Any"
        std::map<std::string, int>::iterator mIte = pNode->mCodeToIndex.find("any");
        if(mIte == pNode->mCodeToIndex.end())
        {
            mIte = pNode->mCodeToIndex.insert(pNode->mCodeToIndex.begin(), std::make_pair("any", pNode->vChilds.size()));
            pNode->vChilds.push_back(structNode());
        }
        pNode->vChilds[mIte->second].sMachines.insert(strIdMachine);
        Fill(strIdMachine, (iIndex + 1), &pNode->vChilds[mIte->second], vCodes );
        // Add for "Segment"
        mIte = pNode->mCodeToIndex.find(vCodes[iIndex]);
        if(mIte == pNode->mCodeToIndex.end())
        {
            mIte = pNode->mCodeToIndex.insert(pNode->mCodeToIndex.begin(), std::make_pair(vCodes[iIndex], pNode->vChilds.size()));
            pNode->vChilds.push_back(structNode());
        }
        pNode->vChilds[mIte->second].sMachines.insert(strIdMachine);
        Fill(strIdMachine, (iIndex + 1), &pNode->vChilds[mIte->second], vCodes );
    }
}

//////////////////////////////////////////////////////////////////////
// Get
//
// NULL on empty group
//////////////////////////////////////////////////////////////////////
std::set<std::string>* Get(structNode* pNode, int iIndex, const std::vector<std::string> &vCodes, int iMinValue)
{
    if(iIndex < (int)vCodes.size())
    {
        std::map<std::string, int>::iterator mIte = pNode->mCodeToIndex.find(vCodes[iIndex]);
        if(mIte != pNode->mCodeToIndex.end())
        {
            if((int)pNode->vChilds[mIte->second].sMachines.size() < iMinValue)
                return NULL;
            else
                return Get(&pNode->vChilds[mIte->second], (iIndex + 1), vCodes, iMinValue);
        }
        else
            return NULL;
    }
    return &pNode->sMachines;
}
To fill the tree with your sample
structNode stRoot;
const char* dummy[] = { "city1", "switch03", "server01" };
const char* dummy2[] = { "city1", "switch03", "empty" };
const char* dummy3[] = { "city2", "switch03", "server02" };
const char* dummy4[] = { "city2", "switch04", "server02" };
// Fill the tree with the sample
Fill("node1", 0, &stRoot, vector<std::string>(dummy, dummy + 3));
Fill("node2", 0, &stRoot, vector<std::string>(dummy, dummy + 3));
Fill("node3", 0, &stRoot, vector<std::string>(dummy2, dummy2 + 3));
Fill("node4", 0, &stRoot, vector<std::string>(dummy2, dummy2 + 3));
Fill("node5", 0, &stRoot, vector<std::string>(dummy3, dummy3 + 3));
Fill("node6", 0, &stRoot, vector<std::string>(dummy3, dummy3 + 3));
Fill("node7", 0, &stRoot, vector<std::string>(dummy4, dummy4 + 3));
Fill("node8", 0, &stRoot, vector<std::string>(dummy4, dummy4 + 3));
Now you can easily obtain all the combinations that you want. For example, your query would be something like this:
vector<std::string> vCodes;
vCodes.push_back("empty"); // Discard first property (cities)
vCodes.push_back("any"); // Any value for infiniband
vCodes.push_back("any"); // Any value for networkfs (except empty)
set<std::string>* pMachines = Get(&stRoot, 0, vCodes, 2);
And, for example, only city2 on switch03 with networkfs not empty:
vector<std::string> vCodes;
vCodes.push_back("city2"); // Only city2
vCodes.push_back("switch03"); // Only switch03
vCodes.push_back("any"); // Any value for networkfs (except empty)
set<std::string>* pMachines = Get(&stRoot, 0, vCodes, 2);

Better, or advantages in different ways of coding similar functions

I'm writing the code for a GUI (in C++), and right now I'm concerned with the organisation of text in lines. One of the problems I'm having is that the code is getting very long and confusing, and I'm starting to get into an n^2 scenario where, for every option I add for the text's presentation, the number of functions I have to write is the square of that. In trying to deal with this, a particular design choice has come up, and I don't know which method is better, or the extent of the advantages and disadvantages of each:
I have two methods which are very similar in flow, i.e. they iterate through the same objects and take the same constraints into account, but ultimately perform different operations within this flow. For anyone interested: one method renders the text, and the other determines whether any text overflows the line, either because the text wraps around other objects or simply because the line ends.
These functions need to be copied and rewritten for left, right or centred text, which have different flow, so whatever design choice I make would be repeated three times.
Basically, I could continue what I have now, which is two separate methods to handle these different actions, or I could merge them into one function, which has if statements within it to determine whether or not to render the text or figure out if any text overflows.
Is there a generally accepted right way to going about this? Otherwise, what are the tradeoffs concerned, what are the signs that might indicate one way should be used over the other? Is there some other way of doing things I've missed?
I've edited through this a few times to try and make it more understandable, but if it isn't please ask me some questions so I can edit and explain. I can also post the source code of the two different methods, but they use a lot of functions and objects that would take too long to explain.
// EDIT: Source Code //
Function 1:
void GUITextLine::renderLeftShifted(const GUIRenderInfo& renderInfo) {
    if(m_renderLines.empty())
        return;
    Uint iL = 0;
    Array2t<float> renderCoords;
    renderCoords.s_x = renderInfo.s_offset.s_x + m_renderLines[0].s_x;
    renderCoords.s_y = renderInfo.s_offset.s_y + m_y;
    float remainingPixelsInLine = m_renderLines[0].s_y;
    for (Uint iTO = 0; iTO != m_text.size(); ++iTO)
    {
        if(m_text[iTO].s_pixelWidth <= remainingPixelsInLine)
        {
            string preview = m_text[iTO].s_string;
            m_text[iTO].render(&renderCoords);
            remainingPixelsInLine -= m_text[iTO].s_pixelWidth;
        }
        else
        {
            FSInternalGlyphData intData = m_text[iTO].stealFSFastFontInternalData();
            float characterWidth = 0;
            Uint iFirstCharacterOfRenderLine = 0;
            for(Uint iC = 0;;++iC)
            {
                if(iC == m_text[iTO].s_string.size())
                {
                    // wrap up
                    string renderPart = m_text[iTO].s_string;
                    renderPart.erase(iC, renderPart.size());
                    renderPart.erase(0, iFirstCharacterOfRenderLine);
                    m_text[iTO].s_font->renderString(renderPart.c_str(), intData,
                        &renderCoords);
                    break;
                }
                characterWidth += m_text[iTO].s_font->getWidthOfGlyph(intData,
                    m_text[iTO].s_string[iC]);
                if(characterWidth > remainingPixelsInLine)
                {
                    // Can't push in the last character
                    // No more space in this line
                    // First though, render what we already have:
                    string renderPart = m_text[iTO].s_string;
                    renderPart.erase(iC, renderPart.size());
                    renderPart.erase(0, iFirstCharacterOfRenderLine);
                    m_text[iTO].s_font->renderString(renderPart.c_str(), intData,
                        &renderCoords);
                    if(++iL != m_renderLines.size())
                    {
                        remainingPixelsInLine = m_renderLines[iL].s_y;
                        renderCoords.s_x = renderInfo.s_offset.s_x + m_renderLines[iL].s_x;
                        // Cool, so now try rendering this character again
                        --iC;
                        iFirstCharacterOfRenderLine = iC;
                        characterWidth = 0;
                    }
                    else
                    {
                        // Quit
                        break;
                    }
                }
            }
        }
    }
    // Done!
}
Function 2:
vector<GUIText> GUITextLine::recalculateWrappingContraints_LeftShift()
{
    m_pixelsOfCharacters = 0;
    float pixelsRemaining = m_renderLines[0].s_y;
    Uint iRL = 0;
    // Go through every text object, fitting them into render lines
    for(Uint iTO = 0; iTO != m_text.size(); ++iTO)
    {
        // If an entire text object fits in a single line
        if(pixelsRemaining >= m_text[iTO].s_pixelWidth)
        {
            pixelsRemaining -= m_text[iTO].s_pixelWidth;
            m_pixelsOfCharacters += m_text[iTO].s_pixelWidth;
        }
        // Otherwise, character by character
        else
        {
            // Get some data now so we don't get it every function call
            FSInternalGlyphData intData = m_text[iTO].stealFSFastFontInternalData();
            for(Uint iC = 0; iC != m_text[iTO].s_string.size(); ++iC)
            {
                float characterWidth = m_text[iTO].s_font->getWidthOfGlyph(intData, '-');
                if(characterWidth < pixelsRemaining)
                {
                    pixelsRemaining -= characterWidth;
                    m_pixelsOfCharacters += characterWidth;
                }
                else // End of render line!
                {
                    m_pixelsOfWrapperCharacters += pixelsRemaining; // we might track how much wrapping px we use
                    // If this is true, then we ran out of render lines before we ran out of text. Means we have some overflow to return
                    if(++iRL == m_renderLines.size())
                    {
                        return harvestOverflowFrom(iTO, iC);
                    }
                    else
                    {
                        pixelsRemaining = m_renderLines[iRL].s_y;
                    }
                }
            }
        }
    }
    vector<GUIText> emptyOverflow;
    return emptyOverflow;
}
So basically, render() takes renderCoordinates as a parameter and gets from it the global position it needs to render from. calcWrappingConstraints figures out how much text in the object goes over the allocated space, and returns that text.
m_renderLines is a std::vector of a two-float structure, where .s_x = where rendering can start and .s_y = how large the space for rendering is. Note that it is essentially the width of the 'renderLine', not where it ends.
m_text is a std::vector of GUIText objects, which contain a string of text and some data such as style, colour, size, etc. Under s_font it also contains a reference to a font object, which performs rendering, calculates the width of a glyph, etc.
Hopefully this clears things up.

There is no generally accepted way in this case.
However, common practice in any programming scenario is to remove duplicated code.
I think you're getting stuck on how to divide code by direction, when direction changes the outcome too much to make this division. In these cases, focus on the common portions of the three algorithms and divide them into tasks.
I did something similar when I duplicated WinForms flow layout control for MFC. I dealt with two types of objects: fixed positional (your pictures etc.) and auto positional (your words).
In the example you provided I can list out common portions of your example.
Write Line (direction)
bool TestPlaceWord (direction) // returns false if it cannot place word next to previous word
bool WrapPastObject (direction) // returns false if it runs out of line
bool WrapLine (direction) // returns false if it runs out of space for new line.
Each of these would be performed no matter what direction you are faced with.
Ultimately, the algorithm for each direction is just too different to simplify anymore than that.
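One way to share the common flow between the render pass and the overflow pass is to write the walk once and hand it the per-step action as a callback. The sketch below is a simplified illustration, not the asker's GUITextLine: it assumes words are never split character by character, and layoutLeft, firstOverflow and the callback signature are all made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Called for each word that fits: (word index, line index, x offset in the line).
using PlaceFn = std::function<void(std::size_t, std::size_t, float)>;

// Walks the words once, filling each line from the left. Returns the index of
// the first word that does not fit (wordWidths.size() if everything fits).
std::size_t layoutLeft(const std::vector<float>& wordWidths,
                       const std::vector<float>& lineWidths,
                       const PlaceFn& onPlace) {
    assert(static_cast<bool>(onPlace));
    std::size_t line = 0;
    float used = 0.0f;
    for (std::size_t w = 0; w < wordWidths.size(); ++w) {
        // Move on to the next line until this word fits or lines run out.
        while (line < lineWidths.size() && used + wordWidths[w] > lineWidths[line]) {
            ++line;
            used = 0.0f;
        }
        if (line == lineWidths.size()) return w;  // overflow starts here
        onPlace(w, line, used);
        used += wordWidths[w];
    }
    return wordWidths.size();
}

// The overflow check is the same walk with an action that does nothing.
std::size_t firstOverflow(const std::vector<float>& wordWidths,
                          const std::vector<float>& lineWidths) {
    return layoutLeft(wordWidths, lineWidths,
                      [](std::size_t, std::size_t, float) {});
}
```

A render pass would call layoutLeft with a callback that draws the word at the given line and offset. Right-aligned and centred layouts would still need their own walks, but each walk is then written once rather than once per operation.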

How about an implementation of the Visitor Pattern? It sounds like it might be the kind of thing you are after.
