needed dtw like in R package - c++

There is a function of the dtw package
dtw(x, y=NULL, dist.method="Euclidean", step.pattern=symmetric2, window.type="none", keep.internals=FALSE, distance.only=FALSE, open.end=FALSE, open.begin=FALSE, ... )
In the function, there are three methods of calculating distances
symmetric1 , symmetric2 , asymmetric
I am interested in the method step.pattern = symmetric2.
I have a C ++ function that works exactly like symmetric1
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double dtw_rcpp(const NumericVector& x, const NumericVector& y) {
size_t n = x.size(), m = y.size();
NumericMatrix res = no_init(n + 1, m + 1);
std::fill(res.begin(), res.end(), R_PosInf);
res(0, 0) = 0;
double cost = 0;
size_t w = std::abs(static_cast<int>(n - m));
for (size_t i = 1; i <= n; ++i) {
for (size_t j = std::max(1, static_cast<int>(i - w)); j <= std::min(m, i + w); ++j) {
cost = std::abs(x[i - 1] - y[j - 1]);
res(i, j) = cost + std::min(std::min(res(i - 1, j), res(i, j - 1)), res(i - 1, j - 1));
}
}
return res(n, m);
}
What do I need to change in this с++ function that it considered the method of distance symmetric2.
I do not understand how it works symmetric2.
here it is said very little about it
1. Well-known step patterns
These common transition types are used in quite a lot of implementations.
symmetric1 (or White-Neely) is the commonly used quasi-symmetric, no local constraint, non-normalizable. It is biased in favor of oblique steps. symmetric2 is normalizable, symmetric, with no local slope constraints. Since one diagonal step costs as much as the two equivalent steps along the sides, it can be normalized dividing by N+M (query+reference lengths).
in the source code, I could not understand because I am a beginner programmer
I do not speak English so forgive me for the mistakes.
thank you

OP is asking about dynamic time warping alignments in R. Printing the symmetric2 object should clarify the recursion rule:
g[i,j] = min(
g[i-1,j-1] + 2 * d[i ,j ] ,
g[i ,j-1] + d[i ,j ] ,
g[i-1,j ] + d[i ,j ] ,
)
g is the global cost matrix, d the local distance. I can't comment on the rest of your code.
If you only need the distance value under this specific step pattern, and no other features, the code may be much simplified (see e.g. the pseudocode on Wikipedia).

Related

Need help understanding this line in an FFT algorithm

In my program I have a function that performs the fast Fourier transform. I know there are very good implementations freely available, but this is a learning thing so I don't want to use those. I ended up finding this comment with the following implementation (it originated from the Italian entry for the FFT):
void transform(complex<double>* f, int N) //
{
ordina(f, N); //first: reverse order
complex<double> *W;
W = (complex<double> *)malloc(N / 2 * sizeof(complex<double>));
W[1] = polar(1., -2. * M_PI / N);
W[0] = 1;
for(int i = 2; i < N / 2; i++)
W[i] = pow(W[1], i);
int n = 1;
int a = N / 2;
for(int j = 0; j < log2(N); j++) {
for(int k = 0; k < N; k++) {
if(!(k & n)) {
complex<double> temp = f[k];
complex<double> Temp = W[(k * a) % (n * a)] * f[k + n];
f[k] = temp + Temp;
f[k + n] = temp - Temp;
}
}
n *= 2;
a = a / 2;
}
free(W);
}
I've made a lot of changes by now but this was my starting point. One of the changes I made was to not cache the twiddle factors, because I decided to see if it's needed first. Now I've decided I do want to cache them. The way this implementation seems to do it is it has this array W of length N/2, where every index k has the value . What I don't understand is this expression:
W[(k * a) % (n * a)]
Note that n * a is always equal to N/2. I get that this is supposed to be equal to , and I can see that , which this relies on. I also get that modulo can be used here because the twiddle factors are cyclic. But there's one thing I don't get: this is a length-N DFT, and yet only N/2 twiddle factors are ever calculated. Shouldn't the array be of length N, and the modulo should be by N?
But there's one thing I don't get: this is a length-N DFT, and yet only N/2 twiddle factors are ever calculated. Shouldn't the array be of length N, and the modulo should be by N?
The twiddle factors are equally spaced points on the unit circle, and there is an even number of points because N is a power-of-two. After going around half of the circle (starting at 1, going counter clockwise above the X-axis), the second half is a repeat of the first half but this time it's below the X-axis (the points can be reflected through the origin). That is why Temp is subtracted the second time. That subtraction is the negation of the twiddle factor.

Need help understanding how to work with 2D/3D glyphs

Here's the code snippet I'd like help understanding
for (i = 0; i < samplesX; i++)
for (j = 0; j < samplesY; j++)
{
newI = DIM * i / samplesX;
newJ = DIM * j / samplesY;
idx = (round(newJ) * DIM) + round(newI);
if (color_dir == 1 && draw_vecs == 1) {
direction_to_color(vx[idx], vy[idx], color_dir);
}
if (color_dir == 1 && draw_vecs == 2) {
direction_to_color(fx[idx], fy[idx], color_dir);
}
else if (color_dir == 2) {
scalar = rho[idx];
set_colormap(scalar, min, max, clampLow, clampHigh);
}
else if (color_dir == 3) {
scalar = sqrt(vx[idx] * vx[idx] + vy[idx] * vy[idx]);
set_colormap(scalar, min, max, clampLow, clampHigh);
}
else if (color_dir == 4) {
scalar = sqrt(fx[idx] * fx[idx] + fy[idx] * fy[idx]);
set_colormap(scalar, min, max, clampLow, clampHigh);
}
/*if (draw_vecs == 1) {
glVertex2f(wn + (fftw_real)newI * wn, hn + (fftw_real)newJ * hn);
glVertex2f((wn + (fftw_real)newI * wn) + vec_scale * vx[idx], (hn + (fftw_real)newJ * hn) + vec_scale * vy[idx]);
}
else if (draw_vecs == 2) {
glVertex2f(wn + (fftw_real)newI * wn, hn + (fftw_real)newJ * hn);
glVertex2f((wn + (fftw_real)newI * wn) + vec_scale * fx[idx], (hn + (fftw_real)newJ * hn) + vec_scale * fy[idx]);
}*/
if (draw_vecs == 1) {
glVertex2f(wn + (fftw_real)i * wn, hn + (fftw_real)j * hn);
glVertex2f((wn + (fftw_real)i * wn) + vec_scale * vx[idx], (hn + (fftw_real)j * hn) + vec_scale * vy[idx]);
}
else if (draw_vecs == 2) {
glVertex2f(wn + (fftw_real)i * wn, hn + (fftw_real)j * hn);
glVertex2f((wn + (fftw_real)i * wn) + vec_scale * fx[idx], (hn + (fftw_real)j * hn) + vec_scale * fy[idx]);
}
}
glEnd();
}
What this currently does, as far as my understanding goes, is display these two-dimensional lines/arrows (hedgehogs) that visualize force/velocity in 2D as can be seen in the picture below.
Sadly, my understanding of linear algebra, calculus and computer graphics in general only goes so far and I'm having trouble dissecting this piece.
Ideally I'd like to understand this and also understand how I can take this pre-existing code and also add in functionality that can display two other glyph types that show a vector and/or scalar field such as
three-dimensional cones
three-dimensional ellipsoids
If I'm missing anything here, please let me know!
Some of the variables included in the above snippet:
const int DIM = 50; //size of simulation grid
int color_dir = 0; //use direction color-coding or not
float scalar;
int newI, newJ;
float temp;
float vec_scale = 1000; //scaling of hedgehogs
int draw_vecs = 1; //draw the vector field or not
The code snippet you have there could have been written simpler (also it takes some educated guessing what some of the variables and functions mean).
Let's break it down.
The first two lines are easy to understand, they're the standard stanza to iterate over a 2D array
for (i = 0; i < samplesX; i++)
for (j = 0; j < samplesY; j++)
i and j are running indices, that will iterate over every discrete coordinate tuple in (i,j) ∈ [i, samplesX) × [j, samplesY). The next two lines remap the 2D indices into into a new value range, specifically [i, samplesX)×[j, samplesY) → [0, DIM)×[0, DIM). A missing piece of information is, what type is DIM of. It would make for it to be some floating point type.
newI = DIM * i / samplesX;
newJ = DIM * j / samplesY;
The next line is bug prone. It translates newI and newJ into a running 1D index for a 1D array, that's addressed by i and j.
Why is this problematic? Because in the conversion to DIM-space information may have been lost. This kind of information loss may lead to security bugs(!), as a matter of fact, Skia, the rendering library used by Google Chrome, Android and other projects had exactly this kind of bug recently; the writeup is a worthwhile read: https://googleprojectzero.blogspot.com/2019/02/the-curious-case-of-convexity-confusion.html
The correct way to implement this is to have DIM be an integer and perform fixed point arithmetic on it, eventually truncating the fractional digits. But I digress. The next block is essentially performing a poor man's lookup table lookup. vx``vy and fx``fy are some flattened 2D arrays, accessed through an 1D index, and direction_to_color maps either to a value presumably to a call of glColor; the same probably also goes for set_colormap. This is a bad use of OpenGL.
The whole remapping from i and j to DIM and then the lookups are just poor implementation of a texture lookup. OpenGL already has textures. Just load as texture coordinate array and enable texturing.
Finally for each spine, two calls of glVertex are made, one with the staring point, which lies on grid centers (wn, hn), to an offset location (wn, hn) + (i, j).
My verdict of that code: Utter garbage! All of this could have been done far more elegantly, even back in 1994 with OpenGL-1.0, which is code seems to have been written for. If you want to implement your own vector field plot, don't use this as a starting point.
These days we have programmable GPUs with shaders. All of that bulk up there can be done is a few lines of shader code.

Cholesky decomposition in Halide

I'm trying to implement a Cholesky decomposition in Halide. Part of common algorithm such as crout consists of an iteration over a triangular matrix. In a way that, the diagonal elements of the decomposition are computed by subtracting a partial column sum from the diagonal element of the input matrix. Column sum is calculated over squared elements of a triangular part of the input matrix, excluding the diagonal element.
Using BLAS the code would in C++ look as follows:
double* a; /* input matrix */
int n; /* dimension */
const int c__1 = 1;
const double c_b12 = 1.;
const double c_b10 = -1.;
for (int j = 0; j < n; ++j) {
double ajj = a[j + j * n] - ddot(&j, &a[j + n], &n, &a[j + n], &n);
ajj = sqrt(ajj);
a[j + j * n] = ajj;
if (j < n) {
int i__2 = n - j;
dgemv("No transpose", &i__2, &j, &c_b10, &a[j + 1 + n], &n, &a[j + n], &b, &c_b12, &a[j + 1 + j * n], &c__1);
double d__1 = 1. / ajj;
dscal(&i__2, &d__1, &a[j + 1 + j * n], &c__1);
}
}
My question is if a pattern like this is in general expressible by Halide? And if so, how would it look like?
I think Andrew may have a more complete answer, but in the interest of a timely response, you can use an RDom predicate (introduced via RDom::where) to enumerate triangular regions (or their generalization to more dimensions). A sketch of the pattern is:
Halide::RDom triangular(0, extent, 0, extent);
triangular.where(triangular.x < triangular.y);
Then use triangular in a reduction.
I once had a fast Cholesky written in Halide. Unfortunately I can't find the code. I put the outer loop in C and wrote a good block-panel update routine that operated on something like a 32-wide panel at a time. This was before Halide had triangular iteration, so maybe you can do better now.

More efficient way of finding edit distance over a large array

I have a large array of words (300k words) and I want to find the edit distance between each word, so I was just iterating over it and doing running through this version of the levenstein algorithm:
unsigned int edit_distance(const std::string& s1, const std::string& s2)
{
const std::size_t len1 = s1.size(), len2 = s2.size();
std::vector<std::vector<unsigned int>> d(len1 + 1, std::vector<unsigned int>(len2 + 1));
d[0][0] = 0;
for (unsigned int i = 1; i <= len1; ++i) d[i][0] = i;
for (unsigned int i = 1; i <= len2; ++i) d[0][i] = i;
for (unsigned int i = 1; i <= len1; ++i)
for (unsigned int j = 1; j <= len2; ++j)
// note that std::min({arg1, arg2, arg3}) works only in C++11,
// for C++98 use std::min(std::min(arg1, arg2), arg3)
d[i][j] = std::min({ d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + (s1[i - 1] == s2[j - 1] ? 0 : 1) });
return d[len1][len2];
}
So what I was wondering is, if there was a more efficient way of doing this, I heard about Levenshtein Autonoma but I wasn't sure if that would be any more efficient.
I would imagine that there you could avoid processing the same thing over and over again by preprocessing something but I have no idea how to actually achieve it (some approximate calculations would be to preprocess everything would be around 10^28 operations so that would not be an improvement)
As stated in his comment, The OP is actually looking for all the pairs with edit distance of less than 2.
Given an input of n words, a naive approach would be to make n(n-1)/2 comparisons, but less comparison may be required when L in an edit distance which is a metric space for strings.
Levenshtein distance is a metric space, and satisfies the 4 required metric axioms - including the triangle inequality.
Edit:
Given this, we can use the method proposed by Sergey Brin (Google's co-founder) in his paper Near Neighbor Search in Large Metric Spaces back in 1995, to solve our problem.
Quoting from the paper: Given a metric space (X, d), a data set Y ⊆ X, a query point x ∈ X, and a range r ∈ R, the near neighbors of x are the set of points y ∈ Y, such that d(x, y) ≤ r.
In this paper, Brin introduced GNAT (Geometric Near-neighbor Access Tree) - a data structure to solve this problem. Brin actually test the performance of his algorithm using the Levenshtein distance (which he calls "Edit distance") against two text corpora.
Over the years GNAT become well-known and widely used. Some improvements to GNAT where suggested in Geometric Near-neighbor Access Tree (GNAT) revisited - Fredriksson 2016.
If as indicated in the comments what you actually want is to find pairs with edit distance at most two, you can generate from each word all possibilities of deleting at most two characters (should be at most 500 or so), and store these in a hash table. Then, you only need to check each pair of words in a hash bucket, which is probably not hard to do by looking at whether deletions coincide.

How to compute sum of evenly spaced binomial coefficients

How to find sum of evenly spaced Binomial coefficients modulo M?
ie. (nCa + nCa+r + nCa+2r + nCa+3r + ... + nCa+kr) % M = ?
given: 0 <= a < r, a + kr <= n < a + (k+1)r, n < 105, r < 100
My first attempt was:
int res = 0;
int mod=1000000009;
for (int k = 0; a + r*k <= n; k++) {
res = (res + mod_nCr(n, a+r*k, mod)) % mod;
}
but this is not efficient. So after reading here
and this paper I found out the above sum is equivalent to:
summation[ω-ja * (1 + ωj)n / r], for 0 <= j < r; and ω = ei2π/r is a primitive rth root of unity.
What can be the code to find this sum in Order(r)?
Edit:
n can go upto 105 and r can go upto 100.
Original problem source: https://www.codechef.com/APRIL14/problems/ANUCBC
Editorial for the problem from the contest: https://discuss.codechef.com/t/anucbc-editorial/5113
After revisiting this post 6 years later, I'm unable to recall how I transformed the original problem statement into mine version, nonetheless, I shared the link to the original solution incase anyone wants to have a look at the correct solution approach.
Binomial coefficients are coefficients of the polynomial (1+x)^n. The sum of the coefficients of x^a, x^(a+r), etc. is the coefficient of x^a in (1+x)^n in the ring of polynomials mod x^r-1. Polynomials mod x^r-1 can be specified by an array of coefficients of length r. You can compute (1+x)^n mod (x^r-1, M) by repeated squaring, reducing mod x^r-1 and mod M at each step. This takes about log_2(n)r^2 steps and O(r) space with naive multiplication. It is faster if you use the Fast Fourier Transform to multiply or exponentiate the polynomials.
For example, suppose n=20 and r=5.
(1+x) = {1,1,0,0,0}
(1+x)^2 = {1,2,1,0,0}
(1+x)^4 = {1,4,6,4,1}
(1+x)^8 = {1,8,28,56,70,56,28,8,1}
{1+56,8+28,28+8,56+1,70}
{57,36,36,57,70}
(1+x)^16 = {3249,4104,5400,9090,13380,9144,8289,7980,4900}
{3249+9144,4104+8289,5400+7980,9090+4900,13380}
{12393,12393,13380,13990,13380}
(1+x)^20 = (1+x)^16 (1+x)^4
= {12393,12393,13380,13990,13380}*{1,4,6,4,1}
{12393,61965,137310,191440,211585,203373,149620,67510,13380}
{215766,211585,204820,204820,211585}
This tells you the sums for the 5 possible values of a. For example, for a=1, 211585 = 20c1+20c6+20c11+20c16 = 20+38760+167960+4845.
Something like that, but you have to check a, n and r because I just put anything without regarding about the condition:
#include <complex>
#include <cmath>
#include <iostream>
using namespace std;
int main( void )
{
const int r = 10;
const int a = 2;
const int n = 4;
complex<double> i(0.,1.), res(0., 0.), w;
for( int j(0); j<r; ++j )
{
w = exp( i * 2. * M_PI / (double)r );
res += pow( w, -j * a ) * pow( 1. + pow( w, j ), n ) / (double)r;
}
return 0;
}
the mod operation is expensive, try avoiding it as much as possible
uint64_t res = 0;
int mod=1000000009;
for (int k = 0; a + r*k <= n; k++) {
res += mod_nCr(n, a+r*k, mod);
if(res > mod)
res %= mod;
}
I did not test this code
I don't know if you reached something or not in this question, but the key to implementing this formula is to actually figure out that w^i are independent and therefore can form a ring. In simpler terms you should think of implement
(1+x)^n%(x^r-1) or finding out (1+x)^n in the ring Z[x]/(x^r-1)
If confused I will give you an easy implementation right now.
make a vector of size r . O(r) space + O(r) time
initialization this vector with zeros every where O(r) space +O(r) time
make the first two elements of that vector 1 O(1)
calculate (x+1)^n using the fast exponentiation method. each multiplication takes O(r^2) and there are log n multiplications therefore O(r^2 log(n) )
return first element of the vector.O(1)
Complexity
O(r^2 log(n) ) time and O(r) space.
this r^2 can be reduced to r log(r) using fourier transform.
How is the multiplication done, this is regular polynomial multiplication with mod in the power
vector p1(r,0);
vector p2(r,0);
p1[0]=p1[1]=1;
p2[0]=p2[1]=1;
now we want to do the multiplication
vector res(r,0);
for(int i=0;i<r;i++)
{
for(int j=0;j<r;j++)
{
res[(i+j)%r]+=(p1[i]*p2[j]);
}
}
return res[0];
I have implemented this part before, if you are still cofused about something let me know. I would prefer that you implement the code yourself, but if you need the code let me know.