Understanding Curve Global Approximation algorithm - c++

Problem description
I am trying to understand and implement the Curve Global Approximation, as proposed here:
https://pages.mtu.edu/~shene/COURSES/cs3621/NOTES/INT-APP/CURVE-APP-global.html
To implement the algorithm it is necessary to calculate base function coefficients, as described here:
https://pages.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/bspline-curve-coef.html
I have trouble wrapping my head around some of the details.
First there is some trouble with variable nomenclature. Specifically I am tripped up by the fact there is as function parameter as well as input and . Currently I assume, that first I decide how many knot vectors I want to find for my approximation. Let us say I want 10. So then my parameters are:
I assume this is what is input parameter in the coefficient calculation algorithm?
The reason this tripped me up is because of the sentence:
Let u be in knot span
If input parameter was one of the elements of the knot vector , then there was no need for an interval. So I assume is actually one of these elements ( ?), defined earlier:
Is that assumption correct?
Most important question. I am trying to get my N to work with the first of the two links, i.e. the implementation of the Global Curve Approximation. As I look at the matrix dimensions (where P, Q, N dimensions are mentioned), it seems that N is supposed to have n rows and h-1 columns. That means, N has rows equal to the amount of data points and columns equal to the curve degree minus one. However when I look at the implementation details of N in the second link, an N row is initialized with n elements. I refer to this:
Initialize N[0..n] to 0; // initialization
But I also need to calculate N for all parameters which correspond to my parameters which in turn correspond to the datapoints. So the resulting matrix is of dimension ( n x n ). This does not correspond to the previously mentioned ( n x ( h - 1 ) ).
To go further, in the link describing the approximation algorithm, N is used to calculate Q. However directly after that I am asked to calculate N which I supposedly already had, how else would I have calculated Q? Is this even the same N? Do I have to calculate a new N for the desired amount of control points?
Conclusion
If somebody has any helpful insight on this, please do share. I aim to implement this using C++ with Eigen, since it is useful for solving M * P = Q and for matrix calculations in general. Currently I am at a loss though. Everything seems more or less clear, except for N: its dimensions in particular, and whether it needs to be calculated multiple times or not.
Additional media
In the last image it is supposed to say, "[...] used before in the calculation of Q"

The 2nd link tells you how to compute the basis functions of a B-spline curve at parameter u, where the B-spline curve is defined by its degree, knot vector [u0,...,um] and control points. So, for your first question, if you want to have 10 knots in your knot vector, then a typical knot vector will look like:
[0, 0, 0, 0, 0.3, 0.7, 1, 1, 1, 1]
This will be a B-spline curve of degree 3 with 6 control points: the 10 knots give m = 9, and with degree p = 3 the number of control points is n + 1 = m - p = 6.
For your 2nd question: the input parameter u is generally not one of the knots [u0, u1,...,um]. It is simply the parameter at which we would like to evaluate the B-spline curve. The value of u varies from 0 to 1 (assuming the knot vector also ranges from 0 to 1).
For your 3rd question: N (in the first link) represents a matrix where each element is an Ni,p(tj). So, basically, the N[] array computed by the algorithm in the 2nd link is one row of the matrix N in the first link.
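To make the row idea concrete, here is a minimal sketch of the coefficient algorithm from the 2nd link, assuming a clamped knot vector. The function name basisRow and the use of std::vector are my own choices, not something either page prescribes. It returns N[0..n] for a single parameter u, so calling it once per data-point parameter tj gives you the rows of the matrix N.

```cpp
#include <cassert>
#include <vector>

// Returns N[i] = N_{i,p}(u) for i = 0..n, where U holds the m+1 clamped knots
// u0..um and n = m - p - 1. Sketch only: no handling of malformed knot vectors.
std::vector<double> basisRow(int p, const std::vector<double>& U, double u) {
    assert(p >= 0 && static_cast<int>(U.size()) >= 2 * (p + 1));
    const int m = static_cast<int>(U.size()) - 1;
    const int n = m - p - 1;
    std::vector<double> N(n + 1, 0.0);           // initialization
    if (u <= U[0]) { N[0] = 1.0; return N; }     // u at the first knot
    if (u >= U[m]) { N[n] = 1.0; return N; }     // u at the last knot

    // Find the knot span [U[k], U[k+1]) that contains u.
    int k = p;
    while (k < n && u >= U[k + 1]) ++k;

    N[k] = 1.0;                                  // degree-0 coefficient
    for (int d = 1; d <= p; ++d) {               // raise the degree step by step
        // Leftmost new entry only has a "right" term.
        N[k - d] = (U[k + 1] - u) / (U[k + 1] - U[k - d + 1]) * N[k - d + 1];
        // Interior entries combine a left and a right term.
        for (int i = k - d + 1; i <= k - 1; ++i) {
            N[i] = (u - U[i]) / (U[i + d] - U[i]) * N[i]
                 + (U[i + d + 1] - u) / (U[i + d + 1] - U[i + 1]) * N[i + 1];
        }
        // Rightmost entry only has a "left" term.
        N[k] = (u - U[k]) / (U[k + d] - U[k]) * N[k];
    }
    return N;
}
```

With the knot vector above and u = 0.5, only N[1]..N[4] are non-zero and the row sums to 1. Note that the approximation page fixes the first and last control points to the first and last data points, so check its index ranges when you copy these rows into its N.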
I hope my answers have cleared up some of your confusion.

Related

Finding shortest path in a graph, with additional restrictions

I have a graph with 2n vertices where every edge has a defined length. [Figure not available: the graph, with blue, red and black edges between the endpoints u and v.]
I'm trying to find the length of the shortest path from u to v (smallest sum of edge lengths), with 2 additional restrictions:
The number of blue edges that the path contains is the same as the number of red edges.
The number of black edges that the path contains is not greater than p.
I have come up with an exponential-time algorithm that I think would work. It iterates through all binary combinations of length n - 1 that represent the path starting from u in the following way:
0 is a blue edge
1 is a red edge
There's a black edge whenever
the combination starts with 1. The first edge (from u) is then the first black one on the left.
the combination ends with 0. The last edge (to v) is then the last black one on the right.
adjacent digits are different. That means we went from a blue edge to a red edge (or vice versa), so there's a black one in between.
This algorithm would ignore the paths that don't meet the 2 requirements mentioned earlier and calculate the length for the ones that do, and then find the shortest one. However doing it this way would probably be awfully slow and I'm looking for some tips to come up with a faster algorithm. I suspect it's possible to achieve with dynamic programming, but I don't really know where to start. Any help would be very appreciated. Thanks.
Seems like a dynamic programming problem to me.
Here, v and u are arbitrary nodes.
Source node: s
Target node: t
For a node v whose outgoing edges are (v,u1) [red/blue] and (v,u2) [black]:
D(v,i,k) = min { ((v,u1) is red ? D(u1,i+1,k) : D(u1,i-1,k)) + w(v,u1),
                 D(u2,i,k-1) + w(v,u2) }
D(t,0,k) = 0          for 0 <= k <= p
D(v,i,k) = infinity   for k < 0     // note: for any v
D(t,i,k) = infinity   for i != 0
Explanation:
v - the current node
i - (#reds_traversed - #blues_traversed)
k - #black_edges_left
The stop clauses are at the target node: you end when reaching it, and you are allowed to reach it only with i=0 and with k>=0, i.e. without having used more than p black edges.
The recursive call is checking at each point "what is better? going through black or going though red/blue", and choosing the best solution out of both options.
The idea is, D(v,i,k) is the optimal result to go from v to the target (t), #reds-#blues used is i, and you can use up to k black edges.
From this, we can conclude D(s,0,p) is the optimal result to reach the target from the source.
Since |i| <= n, k<=p<=n - the total run time of the algorithm is O(n^3), assuming implemented in Dynamic Programming.
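Here is a rough sketch of such a DP, written bottom-up by column rather than as the recursion from t. It rests on assumptions that are mine, since the figure is not reproduced here: the graph is a ladder with a blue top rail, a red bottom rail and black rungs, u is the top-left vertex, v is the bottom-right one, and a path only ever moves to the right. The function name and the three arrays are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Assumed ladder layout (hypothetical, the original figure is missing):
//   blue[j]  : top-rail edge from column j to column j+1
//   red[j]   : bottom-rail edge from column j to column j+1
//   black[j] : rung joining the two rails at column j
// u is the top-left vertex, v the bottom-right one; paths only move right.
double shortestBalancedPath(const std::vector<double>& blue,
                            const std::vector<double>& red,
                            const std::vector<double>& black, int p) {
    const double INF = std::numeric_limits<double>::infinity();
    const int steps = static_cast<int>(blue.size());
    assert(p >= 0 && red.size() == blue.size() && black.size() == blue.size() + 1);

    // layer[side][r][k]: cheapest way to stand in the current column on `side`
    // (0 = blue rail, 1 = red rail) after r red edges and k black edges.
    using Table = std::vector<std::vector<double>>;
    auto emptyLayer = [&] {
        return std::vector<Table>(2, Table(steps + 1, std::vector<double>(p + 1, INF)));
    };

    std::vector<Table> cur = emptyLayer();
    cur[0][0][0] = 0.0;                       // start at u, nothing used yet
    for (int j = 0; j <= steps; ++j) {
        // Optionally cross the rung at column j (at most once per column).
        std::vector<Table> crossed = cur;
        for (int s = 0; s < 2; ++s)
            for (int r = 0; r <= steps; ++r)
                for (int k = 0; k < p; ++k)
                    crossed[1 - s][r][k + 1] =
                        std::min(crossed[1 - s][r][k + 1], cur[s][r][k] + black[j]);
        cur = crossed;
        if (j == steps) break;                // last column: no rail edge left

        // Move one column to the right along the current rail.
        std::vector<Table> next = emptyLayer();
        for (int r = 0; r <= steps; ++r)
            for (int k = 0; k <= p; ++k) {
                next[0][r][k] = cur[0][r][k] + blue[j];
                if (r > 0) next[1][r][k] = cur[1][r - 1][k] + red[j];
            }
        cur = next;
    }
    // Equal red and blue counts means exactly steps/2 red edges.
    if (steps % 2 != 0) return INF;
    const std::vector<double>& atV = cur[1][steps / 2];
    return *std::min_element(atV.begin(), atV.end());
}
```

The blue count never has to be stored: after j rail edges with r of them red, the other j - r are blue. The function returns infinity when no path satisfies both restrictions.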
Edit: Somehow I looked at the "Finding shortest path" phrase in the question and ignored the "length of" phrase where the original question later clarified intent. So both my answers below store lots of extra data in order to easily backtrack the correct path once you have computed its length. If you don't need to backtrack after computing the length, my crude version can change its first dimension from N to 2 and just store one odd J and one even J, overwriting anything older. My faster version can drop all the complexity of managing J,R interactions and also just store its outer level as [0..1][0..H] None of that changes the time much, but it changes the storage a lot.
To understand my answer, first understand a crude N^3 answer: (I can't figure out whether my actual answer has better worst case than crude N^3 but it has much better average case).
Note that N must be odd, represent that as N=2H+1. (P also must be odd. Just decrement P if given an even P. But reject the input if N is even.)
Store costs using 3 real coordinates and one implied coordinate:
J = column 0 to N
R = count of red edges 0 to H
B = count of black edges 0 to P
S = side odd or even (S is just B%2)
We will compute/store cost[J][R][B] as the lowest cost way to reach column J using exactly R red edges and exactly B black edges. (We also used J-R blue edges, but that fact is redundant).
For convenience write to cost directly but read it through an accessor c(j,r,b) that returns BIG when r<0 || b<0 and returns cost[j][r][b] otherwise.
Then the innermost step is just:
if (S)
    cost[J+1][R][B] = red[J]+min( c(J,R-1,B), c(J,R-1,B-1)+black[J] );
else
    cost[J+1][R][B] = blue[J]+min( c(J,R,B), c(J,R,B-1)+black[J] );
Initialize cost[0][0][0] to zero and for the super crude version initialize all other cost[0][R][B] to BIG.
You could super crudely just loop through in increasing J sequence and whatever R,B sequence you like computing all of those.
At the end, we can find the answer as:
min( min(cost[N][H][all odd]), black[N]+min(cost[N][H][all even]) )
But half the R values aren't really part of the problem. In the first half any R>J are impossible and in the second half any R<J+H-N are useless. You can easily avoid computing those. With a slightly smarter accessor function, you could avoid using the positions you never computed in the boundary cases of ones you do need to compute.
If any new cost[J][R][B] is not smaller than a cost of the same J, R, and S but lower B, that new cost is useless data. If the last dim of the structure were map instead of array, we could easily compute in a sequence that drops that useless data from both the storage space and the time. But that reduced time is then multiplied by log of the average size (up to P) of those maps. So probably a win on average case, but likely a loss on worst case.
Give a little thought to the data type needed for cost and the value needed for BIG. If some precise value in that data type is both as big as the longest path and as small as half the max value that can be stored in that data type, then that is a trivial choice for BIG. Otherwise you need a more careful choice to avoid any rounding or truncation.
If you followed all that, you probably will understand one of the better ways that I thought was too hard to explain: This will double the element size but cut the element count to less than half. It will get all the benefits of the std::map tweak to the basic design without the log(P) cost. It will cut the average time way down without hurting the time of pathological cases.
Define a struct CB that contains cost and black count. The main storage is a vector<vector<CB>>. The outer vector has one position for every valid J,R combination. Those are in a regular pattern so we could easily compute the position in the vector of a given J,R or the J,R of a given position. But it is faster to keep those incrementally so J and R are implied rather than directly used. The vector should be reserved to its final size, which is approx N^2/4. It may be best if you pre compute the index for H,0
Each inner vector has C,B pairs in strictly increasing B sequence and within each S, strictly decreasing C sequence . Inner vectors are generated one at a time (in a temp vector) then copied to their final location and only read (not modified) after that. Within generation of each inner vector, candidate C,B pairs will be generated in increasing B sequence. So keep the position of bestOdd and bestEven while building the temp vector. Then each candidate is pushed into the vector only if it has a lower C than best (or best doesn't exist yet). We can also treat all B<P+J-N as if B==S so lower C in that range replaces rather than pushing.
The implied (never stored) J,R pairs of the outer vector start with (0,0) (1,0) (1,1) (2,0) and end with (N-1,H-1) (N-1,H) (N,H). It is fastest to work with those indexes incrementally, so while we are computing the vector for implied position J,R, we would have V as the actual position of J,R and U as the actual position of J-1,R and minU as the first position of J-1,? and minV as the first position of J,? and minW as the first position of J+1,?
In the outer loop, we trivially copy minV to minU and minW to both minV and V, and pretty easily compute the new minW and decide whether U starts at minU or minU+1.
The loop inside that advances V up to (but not including) minW, while advancing U each time V is advanced, and in typical positions using the vector at position U-1 and the vector at position U together to compute the vector for position V. But you must cover the special case of U==minU in which you don't use the vector at U-1 and the special case of U==minV in which you use only the vector at U-1.
When combining two vectors, you walk through them in sync by B value, using one, or the other to generate a candidate (see above) based on which B values you encounter.
Concept: Assuming you understand how a value with implied J,R and explicit C,B is stored: Its meaning is that there exists a path to column J at cost C using exactly R red branches and exactly B black branches, and there does not exist a path to column J using exactly R red branches and the same S in which one of C' or B' is better and the other not worse.
Your exponential algorithm is essentially a depth-first search tree, where you keep track of the cost as you descend.
You could make it branch-and-bound by keeping track of the best solution seen so far, and pruning any branches that would go beyond the best so far.
Or, you could make it a breadth-first search, ordered by cost, so as soon as you find any solution, it is among the best.
The way I've done this in the past is depth-first, but with a budget.
I prune any branches that would go beyond the budget.
Then I run it with budget 0.
If it doesn't find any solutions, I run it with budget 1.
I keep incrementing the budget until I get a solution.
This might seem like a lot of repetition, but since each run visits many more nodes than the previous one, the previous runs are not significant.
This is exponential in the cost of the solution, not in the size of the network.

Perfect hashing function in a hash table implementation of a sparse matrix class

I'm currently implementing a sparse matrix for my matrix library - it will be a hash table. I already implemented a dense matrix as a nested vector, and since I'm doing it just to learn new stuff, I decided that my matrices will be multi-dimensional (not just a 2D table of numbers, but also cubes, tesseracts, etc).
I use an index type which holds n numbers of type size_t, n being a number of dimensions for this particular index. Index of dimension n may be used only as an address of an element in a matrix of appropriate dimension. It is simply a tuple with implicit constructor for easy indexing, like Matrix[{1,2,3}].
My question is centered around the hashing function I plan on using for my sparse matrix implementation. I think that the function is always minimal, but is perfect only up to a certain point - to the point of size_t overflow, or an overflow of intermediate operation of the hashing function (they are actually unsigned long long). Sparse matrices have huge boundaries, so it's practically guaranteed to overflow at some point (see below).
What the hashing function does is assign consecutive numbers to matrix elements as follows:
[1 2 3 4 5 6 7 8 ...]^T //transposed 1-dimensional matrix
1 4 7
2 5 8
3 6 9 //2-dimensional matrix
and so on. Unfortunately, I'm unable to show you the ordering for higher order matrices, but I hope that you get the idea - the value increases top to bottom, left to right, back to front (for cube matrices), etc.
The hashing function is defined like this:
value = x1+d1*x2+d1*d2*x3+d1*d2*d3*x4+...+d1*d2*d3*...*d|n-1|*xn
where:
x1...xn are index members - row, column, height, etc - {x1, x2, x3, ..., xn}
d1...d|n-1| are matrix boundary dimensions - one past the end of matrix in the appropriate direction
I'm actually using a recursive form of this function (simple factoring, but complexity becomes O(n) instead of O(n^2)):
value = x1+d1(x2+d2(x3+d3(...(x|n-1|+d|n-1|(xn))...)))
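For illustration, the factored form can be sketched as a single loop that runs from the last index member to the first. This is only a sketch under my own assumptions: the name linearIndex is made up, indices are 0-based, and overflow of size_t is not detected.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Horner-style evaluation of value = x1 + d1*(x2 + d2*(x3 + ...)).
// x holds the index members, d the matrix boundary dimensions (same length;
// the last dimension only bounds x and never enters the product).
// Sketch only: wraps around silently if the result exceeds size_t.
std::size_t linearIndex(const std::vector<std::size_t>& x,
                        const std::vector<std::size_t>& d) {
    assert(!x.empty() && x.size() == d.size());
    std::size_t value = 0;
    for (std::size_t i = x.size(); i-- > 0; ) {
        assert(x[i] < d[i]);             // index must lie inside the matrix
        value = x[i] + d[i] * value;     // one multiply-add per dimension
    }
    return value;
}
```

For a 3x3 matrix, linearIndex({2, 1}, {3, 3}) gives 5, which is the element labelled 6 in the 1-based table above.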
I'm assuming that elements will be distributed randomly (uniformly) across the matrix, but the bucket number is hash(key) mod numberOfBuckets, so collisions are practically guaranteed despite the fact that the function is perfect.
Is there any way to exploit the features of my hash function in order to minimize collisions?
How should I choose the load factor? Should I leave the choice to the user? Is there any good default value for this case?
Is the hash table actually a good solution for this problem? Are there any other data structures that guarantee average O(1) complexity given that I can roll the index into a number and a number into an index (mind the size_t overflow)? I am aware of different ways to store a sparse matrix, but I want to try the DOK way first.

Partition an n-dimensional "square" space into cubes

Right now I am stuck solving the following "semi"-mathematical problem.
I would like to partition an n-dimensional restricted space (a hypercube, to be precise)
D={(x_1, ...,x_n), x_i \in IR and -limits<=x_i<=limits \forall i<=n} into smaller cubes.
Meaning I would like to specify n,limits,m where m would be the number of partitions per side of the cube - 2*limits/m would be the length of the small cubes and I would get m^n such cubes.
Now I would like to return a vector of vectors containing some distinct coordinates of these small cubes. (or perhaps one could represent the cubes as objects which are characterized by a vector pointing to the "left" outer corner ? )
Basically I have no idea whether something like that is even doable using C++. Implementing this for fixed n does not pose a problem. But I would like to enable the user to have free choice of the dimension.
Background: Something like that would be priceless in optimization, where one would partition the space into smaller ones, run e.g. a genetic algorithm on each of the subspaces and later compare the results. Huge initial populations could thus be avoided and the search results drastically improved.
Also I am just curious whether something like that is doable :)
My Suggestion: Use B+ Trees ?
Let m be the number of partitions per dimension, i.e. per edge, of the hypercube D.
Then there are m^n different subspaces S of D, like you say. Let the subspaces S be uniquely represented by integer coordinates S=[y_1,y_2,...,y_n] where the y_i are integers in the range 1, ..., m. In Cartesian coordinates, then, S=(x_1,x_2,...,x_n) where Delta*(y_i-1)-limits <= x_i < Delta*y_i-limits, and Delta=2*limits/m.
The "left outer corner" or origin of S you were looking for is just the point corresponding to the smallest x_i, i.e. the point (Delta*(y_1-1)-limits, ..., Delta*(y_n-1)-limits). Instead of representing the different S by this point, it makes a lot more sense (and will be faster in a computer) to represent them using the integer coordinates above.
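As a minimal sketch of this for a dimension n chosen at run time: the function below steps through all m^n integer coordinate tuples like an odometer and stores the "left outer corner" of each subspace. The name cellCorners is made up, and the integer coordinates are 0-based here, so y_i in the code corresponds to y_i - 1 in the formulas above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the "left outer corner" of every small cube, one vector per cube.
// n = dimension, limits = half the side length of D, m = partitions per side.
std::vector<std::vector<double>> cellCorners(std::size_t n, double limits, std::size_t m) {
    assert(n > 0 && m > 0 && limits > 0.0);
    const double delta = 2.0 * limits / static_cast<double>(m);
    std::vector<std::vector<double>> corners;
    std::vector<std::size_t> y(n, 0);            // integer coordinates, 0..m-1
    while (true) {
        std::vector<double> corner(n);
        for (std::size_t i = 0; i < n; ++i)
            corner[i] = delta * static_cast<double>(y[i]) - limits;
        corners.push_back(corner);

        // Odometer increment: bump the first coordinate, carry on overflow.
        std::size_t i = 0;
        while (i < n && ++y[i] == m) { y[i] = 0; ++i; }
        if (i == n) break;                       // every coordinate wrapped: done
    }
    return corners;
}
```

For n = 2, limits = 1 and m = 2 this yields the four corners (-1,-1), (0,-1), (-1,0) and (0,0). Keep in mind that the result has m^n entries, so it grows very quickly with n.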

Fitting data to a 3rd degree polynomial

I'm currently writing a C++ program where I have vectors of independent and dependent data that I would like to fit to a cubic function. However, I'm having trouble generating a polynomial that can fit my data.
Part of the problem is that I can't use various numerical packages, such as GSL (long story); it's possible that it might even be overkill for my case. I don't need a very generalized solution for least squares fitting. I specifically want to fit my data to a cubic function. I do have access to Sony's vector library, which supports 4x4 matrices and can calculate their inverses, among other things.
While prototyping this in Scilab, I used a function like:
function p = polyfit(x, y, n)
m = length(x);
aa = zeros(m, n+1)
aa(:,1) = ones(m,1)
for k = 2:n+1
aa(:,k) = x.^(k-1)
end
p = aa\y
endfunction
Unfortunately, this doesn't map well to my current environment. The above example needs to support a matrix of M x N+1 dimensions. In my case, that's M x 4, where M depends on how much sample data that I have. There's also the problem of left division. I would need a matrix library that supported the inverse of matrices of arbitrary dimensions.
Is there an algorithm for least squares where I can avoid having to calculate aa\y, or at least limit it to a 4x4 matrix? I suppose that I'm trying to rewrite the above algorithm into a simpler case that works for fitting to a cubic polynomial. I'm not looking for a code solution, but if someone can point me in the right direction, I'd appreciate it.
Here is the page I am working from, although that page itself doesn't address your question directly. The summary of my answer would be:
If you can't work with Nx4 matrices directly, then do those matrix computations "manually" until you have the problem down to something that has only 4x4 or smaller matrices. In this answer I'll outline how to do the specific matrix computations you need "manually."
--
Let's suppose you have a bunch of data points (x1,y1)...(xn,yn) and you are looking for the cubic equation y = ax^3 + bx^2 + cx + d that fits those points best.
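Then following the link above, you'd write this equation (one row per data point):
    [ 1  x1  x1^2  x1^3 ]   [ d ]   [ y1 ]
    [ 1  x2  x2^2  x2^3 ] * [ c ] = [ y2 ]
    [ ...               ]   [ b ]   [ ...]
    [ 1  xn  xn^2  xn^3 ]   [ a ]   [ yn ]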
I'll write A, x, and B for those matrices. Then following my link above, you'd like to multiply by the transpose of A, which will give you the 4x4 matrix A^T * A that you can invert. In equations, the following is the plan:
A * x = B ............................ [what we started with]
(A^T * A) * x = A^T * B .............. [multiply by A^T]
x = (A^T * A)^-1 * A^T * B ........... [multiply by the inverse of A^T * A]
You said you are happy with inverting 4x4 matrices, so if we can code a way to get at these matrices without actually using matrix objects, we should be okay.
So, here is a method, although it might be a little bit too much like making your own matrix library for your taste. :)
Write an explicit equation for each of the 16 entries of the 4x4 matrix. The (i,j)th entry (I'm starting with (0,0)) is given by
x1^i * x1^j + x2^i * x2^j + ... + xN^i * xN^j.
Invert that 4x4 matrix using your matrix library. That is (A^T * A)^-1.
Now all we need is A^T * B, which is a 4x1 matrix. The ith entry of it is given by x1^i * y1 + x2^i * y2 + ... + xN^i * yN.
Multiply our hand-created 4x4 matrix (A^T * A)^-1 by our hand-created 4x1 matrix A^T * B to get the 4x1 matrix of least-squares coefficients for your cubic.
Good luck!
Yes, we can limit the problem to computing with "a 4x4 matrix". The least squares fit of a cubic, even for M data points, only requires the solution of four linear equations in four unknowns. Assuming all the x-coordinates are distinct the coefficient matrix is invertible, so in principle the system can be solved by inverting the coefficient matrix. We assume that M is more than 4, as would typically be the case for least squares fits.
Here's a write-up for Maple, Fitting a cubic to data, that hides almost completely the details of what is being solved. The first-order minimum criteria (first derivatives with respect to coefficients as parameters of sum of squares error) gets us the four linear equations, often called the normal equations.
You can "assemble" these four equations in code, then apply your matrix inverse or a more sophisticated solution strategy. Obviously you need to have the data points stored in some form. One possibility is two linear arrays, one for the x-coordinates and one for the y-coordinates, both of length M, the number of data points.
NB: I'm going to discuss this matrix assembly in terms of 1-based array subscripts. The polynomial coefficients are actually one application where 0-based array subscripts make things cleaner and simpler, but rewriting it in C or any other language that favors 0-based subscripts is left as an exercise for the reader.
The linear system of normal equations is most easily expressed in matrix form by referring to an Mx4 array A whose entries are powers of x-coordinate data:
A(i,j) = x-coordinate of ith data point raised to power j-1
Let A' denote the transpose of A, so that A'A is a 4x4 matrix.
If we let d be a column of length M containing the y-coordinates of data points (in the given order), then the system of normal equations is just this:
A'A u = A' d
where u = [p0,p1,p2,p3]' is the column of coefficients for the cubic polynomial with least squares fit:
P(x) = p0 + p1*x + p2*x^2 + p3*x^3
Your objections seem to center on a difficulty in storing and/or manipulating the Mx4 array A or its transpose. Therefore my answer will focus on how to assemble matrix A'A and column A'd without explicitly storing all of A at one time. In other words we will be doing the indicated matrix-matrix and matrix-vector multiplications implicitly to get a 4x4 system that you can solve:
C u = f
If you think about the entry C(i,j) being the product of the ith row of A' with the jth column of A, plus the fact that the ith row of A' is really just the transpose of the ith column of A, it should be clear that:
C(i,j) = SUM x^(i+j-2) over all data points
This is certainly one place where the exposition would be simplified by using 0-based subscripts!
It might make sense to accumulate the entries for matrix C, which depend only on the value of i+j, i.e. a so-called Hankel matrix, in a linear array of length 7 such that:
W(k) = SUM x^k over all data points
where k = 0,..,6. The 4x4 matrix C has a "striped" structure that means only these seven values appear. Looping over the list of x-coordinates of data points, you can accumulate the appropriate contributions of each power of each data point in the appropriate entry of W.
A similar strategy can be used to assemble the column f = A' d, namely to loop over the data points and accumulate the following four summations:
f(k) = SUM (x^k)*y over all data points
where k = 0,1,2,3. [Of course in the above sums the values x,y are the coordinates for a common data point.]
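The assembly just described can be sketched in C++ as follows, using 0-based subscripts so that C(i,j) = W(i+j). The struct and function names are invented for this example, and std::array stands in for whatever 4x4 type your vector library provides.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// The 4x4 system C u = f of the normal equations for a cubic fit.
struct NormalEquations {
    std::array<std::array<double, 4>, 4> C;
    std::array<double, 4> f;
};

// Accumulates W(k) = SUM x^k (k = 0..6) and f(k) = SUM x^k * y (k = 0..3)
// in a single pass over the data, without ever storing the Mx4 matrix A.
NormalEquations assembleCubicNormalEquations(const std::vector<double>& x,
                                             const std::vector<double>& y) {
    assert(x.size() == y.size());
    std::array<double, 7> W{};                   // zero-initialised power sums
    NormalEquations ne{};                        // C and f start at zero
    for (std::size_t s = 0; s < x.size(); ++s) {
        double power = 1.0;                      // x^k for the current point
        for (int k = 0; k < 7; ++k) {
            W[k] += power;
            if (k < 4) ne.f[k] += power * y[s];
            power *= x[s];
        }
    }
    // Hankel structure: each entry depends only on i + j.
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            ne.C[i][j] = W[i + j];
    return ne;
}
```

The resulting C and f can then be handed to a 4x4 inverse or, preferably, to a solver. The conditioning caveat below applies to this sketch as well.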
Caveats: This satisfies the goal of working only with a 4x4 matrix. However one typically tries to avoid the explicit formation of the matrix of coefficients for the normal equations because these matrices are often what in numerical analysis is called ill-conditioned. In particular the cases where x-coordinates are closely spaced can cause difficulty when one tries to solve the system by inverting the matrix of coefficients.
A more sophisticated approach to solving these normal equations would be the conjugate gradient method on the normal equations, which can be done with code that computes the matrix-vector products A u and A' v one entry at a time (using what we said above about the entries of A).
The accuracy of the conjugate gradient method is often satisfactory because of its natural iterative approach, esp. when one can compute the required dot-products with a little extra precision.
You should never do full matrix inversion for stability reasons. Do LU decomposition and forward-back substitution. The other solutions are spot on otherwise.
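A minimal sketch of that advice for the fixed 4x4 case: Gaussian elimination with partial pivoting followed by back substitution, which is equivalent to an LU factorisation with forward and back substitution and never forms the inverse. The names Mat4, Vec4 and solve4 are made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

using Mat4 = std::array<std::array<double, 4>, 4>;
using Vec4 = std::array<double, 4>;

// Solves A x = b for a 4x4 system. A and b are taken by value because the
// elimination overwrites them. Sketch only: a singular matrix trips the assert.
Vec4 solve4(Mat4 A, Vec4 b) {
    for (int col = 0; col < 4; ++col) {
        // Partial pivoting: pick the row with the largest entry in this column.
        int pivot = col;
        for (int r = col + 1; r < 4; ++r)
            if (std::fabs(A[r][col]) > std::fabs(A[pivot][col])) pivot = r;
        assert(A[pivot][col] != 0.0);
        std::swap(A[col], A[pivot]);
        std::swap(b[col], b[pivot]);

        // Eliminate the entries below the pivot (forward step).
        for (int r = col + 1; r < 4; ++r) {
            const double factor = A[r][col] / A[col][col];
            for (int c = col; c < 4; ++c) A[r][c] -= factor * A[col][c];
            b[r] -= factor * b[col];
        }
    }
    // Back substitution on the upper-triangular system.
    Vec4 x{};
    for (int r = 3; r >= 0; --r) {
        double sum = b[r];
        for (int c = r + 1; c < 4; ++c) sum -= A[r][c] * x[c];
        x[r] = sum / A[r][r];
    }
    return x;
}
```

Pivoting helps with rounding during the elimination, but it does not cure an ill-conditioned matrix of normal equations.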

C++ How to generate the set of cartesian product of n-dimensional tuples

I wish to generate some data that represents the co-ordinates of a cloud of points representing an n-cube of n dimensions. These points should be evenly distributed throughout the n-space and should be able to be generated with a user-defined spacing between them. This data will be stored in an array.
I have found an implementation of a cartesian product using Boost.MPL.
There is an actual Cartesian product in Boost as well but that is a preprocessor directive, I assume it is of no use for you.
To keep things simple here's an example for an ordinary cube, ie one with 3 dimensions. Let it have side length 1 and suppose you want points spaced at intervals of 1/n. (This is leading to a uniform rectangular distribution of points, not entirely sure that this is what you want).
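Now some pseudo-code (C++-style; addPointAt stands for whatever stores a point):
for (int i = 0; i <= n; ++i)          // NB i<=n because there will be n+1 points along each axis-parallel line
    for (int j = 0; j <= n; ++j)
        for (int k = 0; k <= n; ++k)
            addPointAt(double(i)/n, double(j)/n, double(k)/n);  // floating-point division required here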
Note that this is not the Cartesian product of anything but seems to satisfy (a special case of) your criteria. If you want the points spaced differently, adjust the loop start and end indices or the interval size.
Generalising this to any specified higher dimension is easy: add more loops.
To generalise to any higher dimension which is not known until run time is only slightly more difficult. Instead of declaring an N-dimensional array, declare a 1-D array with the same number of elements. Then you have to write the index arithmetic explicitly instead of having the compiler write it for you.
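A minimal sketch of that explicit index arithmetic, assuming a unit cube and n intervals per axis: a single loop runs over all (n+1)^dims points and decodes each flat index into its coordinates with division and remainder. The function name gridPoints is made up, and the points are stored as a vector of vectors for readability rather than as one flat array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Generates the (n+1)^dims points of a unit hypercube spaced 1/n apart.
// dims is only known at run time, so one flat loop replaces the nested loops.
std::vector<std::vector<double>> gridPoints(std::size_t dims, std::size_t n) {
    assert(dims > 0 && n > 0);
    const std::size_t perAxis = n + 1;           // n+1 points along each axis
    std::size_t total = 1;
    for (std::size_t d = 0; d < dims; ++d) total *= perAxis;

    std::vector<std::vector<double>> points(total, std::vector<double>(dims));
    for (std::size_t idx = 0; idx < total; ++idx) {
        std::size_t rest = idx;
        for (std::size_t d = 0; d < dims; ++d) {
            // The remainder selects the step along axis d; the quotient
            // carries the remaining axes.
            points[idx][d] = static_cast<double>(rest % perAxis) / static_cast<double>(n);
            rest /= perAxis;
        }
    }
    return points;
}
```

For example, gridPoints(3, 2) returns the 27 points of the ordinary cube with spacing 1/2, from (0,0,0) to (1,1,1).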
I expect that you are now going to tell me that this is not what you want! If it isn't, could you clarify?
You can do this recursively (pseudocode):
Function Hypercube(int dimensions, int current, string partialCoords)
{
    for i=0, i<=steps, i++
    {
        // extend the coordinates collected so far with i
        coords = (current==0) ? "(" + i : partialCoords + ", " + i;
        if (current==dimensions-1)
            print coords + ")\n";   // last dimension reached: the tuple is complete
        else
            Hypercube(dimensions, current+1, coords);
    }
}
You call it: Hypercube(n,0,""); This will print the coordinates of all points, but you can also store them in a structure.