2D convolution by breaking up mask - c++

Is there a way to perform convolution of two matrices (image and mask) by breaking up the mask into 2 smaller chunks and combining the result of the 2 convolutions to get the original single mask convolution result?

Yes, due to linearity of convolution, you can break things up as:
I * M = I * (M1 + M2) = I * M1 + I * M2
where M is your original mask and M1 and M2 are the two smaller chunks.
For example, M could be
M = [ 1 1 2
2 1 3
2 1 8 ]
and
M1 = [ 0 0 0
0 1 3
0 1 8 ]
M2 = [ 1 1 2
2 0 0
2 0 0 ]
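Just be careful: if you do this and want to represent M1 as the smaller matrix
M1 = [ 1 3
1 8 ]
then you'll have to align the two results properly before you add them back together, i.e. shift the result for M1 by the offset of that block within M (here one row and one column).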
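The split can be checked numerically. The following is a minimal sketch, not a tuned implementation: it assumes a "full" convolution (output size image + mask - 1), a small arbitrary test image, and hypothetical helper names (`convolveAdd`, `splitMatchesWhole`) that do not come from the original answer. The cropped 2x2 M1 is accumulated at offset (1, 1), which is where it sat inside M.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Accumulate the full 2D convolution of img with mask into out.
// (rowOff, colOff) is where the mask's top-left element sat inside the
// original, larger mask; (0, 0) means the mask is not cropped.
void convolveAdd(const Matrix& img, const Matrix& mask,
                 std::size_t rowOff, std::size_t colOff, Matrix& out) {
    // out must be large enough to hold the shifted result
    assert(out.size() >= img.size() + rowOff + mask.size() - 1);
    for (std::size_t i = 0; i < img.size(); ++i)
        for (std::size_t j = 0; j < img[i].size(); ++j)
            for (std::size_t r = 0; r < mask.size(); ++r)
                for (std::size_t c = 0; c < mask[r].size(); ++c)
                    out[i + rowOff + r][j + colOff + c] += img[i][j] * mask[r][c];
}

// Returns true if I * M equals I * M1 + I * M2 for the masks in the answer.
bool splitMatchesWhole() {
    const Matrix img = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};  // arbitrary test image
    const Matrix M = {{1, 1, 2}, {2, 1, 3}, {2, 1, 8}};
    const Matrix M1 = {{1, 3}, {1, 8}};                     // cropped lower-right block of M
    const Matrix M2 = {{1, 1, 2}, {2, 0, 0}, {2, 0, 0}};

    // The full convolution of a 3x3 image with a 3x3 mask is 5x5.
    Matrix whole(5, std::vector<int>(5, 0));
    Matrix parts(5, std::vector<int>(5, 0));
    convolveAdd(img, M, 0, 0, whole);
    convolveAdd(img, M1, 1, 1, parts);  // shift by the crop offset before adding
    convolveAdd(img, M2, 0, 0, parts);
    return whole == parts;
}
```

Calling `splitMatchesWhole()` from `main` should return true. Passing offset (0, 0) for the cropped M1 instead would misalign the two partial results.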

A many-to-one mapping in the natural domain using discrete input variables?

I would like to find a mapping f:X --> N, with multiple discrete natural variables X of varying dimension, where f produces a unique number from 0 up to (but not including) the product of all dimensions. For example, assume X = {a,b,c}, with dimensions |a| = 2, |b| = 3, |c| = 2. f should produce 0 to 11 (2*3*2 = 12 distinct values).
a b c | f(X)
0 0 0 | 0
0 0 1 | 1
0 1 0 | 2
0 1 1 | 3
0 2 0 | 4
0 2 1 | 5
1 0 0 | 6
1 0 1 | 7
1 1 0 | 8
1 1 1 | 9
1 2 0 | 10
1 2 1 | 11
This is easy when all dimensions are equal. Assume binary for example:
f(a=1,b=0,c=1) = 1*2^2 + 0*2^1 + 1*2^0 = 5
Using this naively with varying dimensions we would get overlapping values:
f(a=0,b=1,c=1) = 0*2^2 + 1*3^1 + 1*2^0 = 4
f(a=1,b=0,c=0) = 1*2^2 + 0*3^1 + 0*2^0 = 4
A computationally fast function is preferred as I intend to use/implement it in C++. Any help is appreciated!
OK, the most important part here is the math and the algorithm. You have dimensions of sizes d0, d1, ..., dn, ordered from least significant to most significant. A tuple (x0, x1, ..., xn) with 0 <= xi < di represents the following number: x0 + d0 * x1 + ... + d0 * d1 * ... * dn-1 * xn
In pseudo-code, I would write:
result = 0
loop for i=n to 0 step -1
result = result * d[i] + x[i]
To implement it in C++, my advice would be to create a class whose constructor takes the number of dimensions and the dimensions themselves (or simply a vector<int> containing the dimensions), and a method that accepts an array or a vector of the same size containing the values. Optionally, you could check that every input value is less than its dimension.
A possible C++ implementation could be:
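#include <stdexcept>
#include <vector>

class F {
    std::vector<int> dims;  // dims[0] is the least significant dimension
public:
    F(const std::vector<int>& d) : dims(d) {}

    int to_int(const std::vector<int>& x) const {
        if (x.size() != dims.size()) {
            throw std::invalid_argument("Wrong size");
        }
        int result = 0;
        // Start at the most significant digit and work down
        for (int i = static_cast<int>(dims.size()) - 1; i >= 0; i--) {
            if (x[i] < 0 || x[i] >= dims[i]) {
                throw std::invalid_argument("Value out of range for its dimension");
            }
            result = result * dims[i] + x[i];
        }
        return result;
    }
};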
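As a usage sketch, the same loop can be written as a free function together with its inverse. The names `encode` and `decode` are hypothetical and not part of the answer above. Because index 0 is the least significant digit here, reproducing the table from the question (where c varies fastest) means passing the values in the order (c, b, a).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mixed-radix encoding; x[0] is the least significant digit.
int encode(const std::vector<int>& dims, const std::vector<int>& x) {
    assert(dims.size() == x.size());
    int result = 0;
    for (std::size_t i = dims.size(); i-- > 0;)  // most significant digit first
        result = result * dims[i] + x[i];
    return result;
}

// Inverse of encode: peel digits off starting with the least significant.
std::vector<int> decode(const std::vector<int>& dims, int value) {
    std::vector<int> x(dims.size());
    for (std::size_t i = 0; i < dims.size(); ++i) {
        x[i] = value % dims[i];
        value /= dims[i];
    }
    return x;
}
```

With dimensions (|c|, |b|, |a|) = (2, 3, 2), `encode({2, 3, 2}, {1, 1, 0})` gives 3 for a=0, b=1, c=1 and `encode({2, 3, 2}, {0, 0, 1})` gives 6 for a=1, b=0, c=0, so the two tuples that collided in the naive formula now map to different values, matching the table.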

shifting with re-sampling in time series data

Assume that I have this time-series data:
           A  B
timestamp
1          1  2
2          1  2
3          1  1
4          0  1
5          1  0
6          0  1
7          1  0
8          1  1
I am looking for a resampling window that gives me at least a specific count of occurrences for some frequency.
If I resample the data from 1 to 8 with 2S, I get a different maximum than if I start from 2 to 8 with the same window size (2S):
ds = series.resample( str(tries) +'S').sum()
import pandas as pd

for shift in range(1, 100):
    tries = 1
    series = pd.read_csv("file.csv", index_col='timestamp')[shift:]
    ds = series.resample(str(tries) + 'S').sum()
    # max is a method and must be called; use `and` for boolean logic
    while ds.A.max() + ds.B.max() < 4 and tries < len(ds):
        ds = series.resample(str(tries) + 'S').sum()
        tries = tries + 1
    # other lines
I am looking for a performance improvement, as this takes prohibitively long to finish for large data.

How Domain maps map indexes to target locales array in multi-dimension case

I couldn't find how a domain map maps the indices of a multi-dimensional domain to multi-dimensional target locales.
1.) How is the (one-dimensional) target locales array arranged into a multi-dimensional form that matches the rank of the distribution, so that indices can be mapped to it?
2.) The documentation states that in the multi-dimensional case the computation is done in every dimension. Take the domain {1..8, 1..8} ==> dom
and assume dom is block-distributed over 6 target locales.
Steps in the mapping:
1. For the 1st dimension (1..8), do the computation:
if low <= idx <= high, then the locale index is
floor((idx-low)*N / (high-low+1)), which gives me an index, say i.
2. Repeat the same for the 2nd dimension, which gives me an index, say j.
Now I have a tuple (i, j).
How is this mapped to the target locales array of dimension 2?
What does the domain map do to change the 1D target locales array to the rank of the distribution?
Is it something like a reshape function?
Please let me know if this lacks sufficient information.
The specific details about how a domain's indices are mapped to a program's locales are not defined by the Chapel language itself, but rather by the implementation of the domain map used to declare the domain. In the comments under your question, you mention that you're referring to the Block distribution, so I'll focus on that in my answer (documented here), but note that any other domain map could take a different approach.
The Block distribution takes an optional targetLocales argument which permits you to specify the set of locales to be targeted, as well as their virtual topology. For instance, if I declare and populate a few arrays of locales:
var grid1: [1..3, 1..2] locale,  // a 3 x 2 array of locales
    grid2: [1..2, 1..3] locale;  // a 2 x 3 array of locales
for i in 1..3 {
  for j in 1..2 {
    grid1[i,j] = Locales[(2*(i-1) + j-1)%numLocales];
    grid2[j,i] = Locales[(3*(j-1) + i-1)%numLocales];
  }
}
I can then pass them in as the targetLocales arguments to a few instances of a Block-distributed domain:
use BlockDist;
config const n = 8;
const D = {1..n, 1..n},
      D1 = D dmapped Block(D, targetLocales=grid1),
      D2 = D dmapped Block(D, targetLocales=grid2);
Each domain will distribute its n rows to the first dimension of its targetLocales grid and its n columns to the second dimension. We can see the results of this distribution by declaring arrays of integers over these domains and assigning them in parallel to make each element store its owning locale's ID, as follows:
var A1: [D1] int,
    A2: [D2] int;
forall a in A1 do
  a = here.id;
forall a in A2 do
  a = here.id;
writeln(A1, "\n");
writeln(A2, "\n");
When running on six or more locales (./a.out -nl 6), the output is as follows, revealing the underlying grid structure:
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
2 2 2 2 3 3 3 3
2 2 2 2 3 3 3 3
2 2 2 2 3 3 3 3
4 4 4 4 5 5 5 5
4 4 4 4 5 5 5 5
0 0 0 1 1 1 2 2
0 0 0 1 1 1 2 2
0 0 0 1 1 1 2 2
0 0 0 1 1 1 2 2
3 3 3 4 4 4 5 5
3 3 3 4 4 4 5 5
3 3 3 4 4 4 5 5
3 3 3 4 4 4 5 5
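The two grids printed above can be reproduced by hand with the per-dimension formula quoted in the question. The following is a sketch in C++ rather than Chapel, under two assumptions: the bounding box equals the domain {1..8, 1..8}, and the locale ids are laid out row by row in the target grid, as grid1 and grid2 are populated above. The function names `blockIndex` and `ownerId` are hypothetical and are not part of the Block distribution's interface.

```cpp
#include <cassert>

// Per-dimension block index: floor((idx - low) * N / (high - low + 1)).
// Integer division equals floor here because all operands are non-negative.
int blockIndex(int idx, int low, int high, int n) {
    assert(low <= idx && idx <= high);
    return (idx - low) * n / (high - low + 1);
}

// Owner of element (r, c) of {1..8, 1..8} on a gridRows x gridCols locale
// grid whose ids are numbered row by row (an assumption of this sketch).
int ownerId(int r, int c, int gridRows, int gridCols) {
    int i = blockIndex(r, 1, 8, gridRows);  // position along the first grid dimension
    int j = blockIndex(c, 1, 8, gridCols);  // position along the second grid dimension
    return i * gridCols + j;                // the tuple (i, j) indexes the 2D grid
}
```

For the 3 x 2 grid, `ownerId(4, 5, 3, 2)` is 3, which agrees with row 4, column 5 of the first output; for the 2 x 3 grid, `ownerId(5, 7, 2, 3)` is 5, which agrees with the second. The tuple (i, j) is simply used as a 2D index into the target locales array.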
For a 1-dimensional targetLocales array, the documentation says:
If the rank of targetLocales is 1, a greedy heuristic is used to reshape the array of target locales so that it matches the rank of the distribution and each dimension contains an approximately equal number of indices.
For example, if we distribute to a 1-dimensional 4-element array of locales:
var grid3: [1..4] locale;
for i in 1..4 do
  grid3[i] = Locales[(i-1)%numLocales];
var D3 = D dmapped Block(D, targetLocales=grid3);
var A3: [D3] int;
forall a in A3 do
  a = here.id;
writeln(A3);
we can see that the target locales form a square, as expected:
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
2 2 2 2 3 3 3 3
2 2 2 2 3 3 3 3
2 2 2 2 3 3 3 3
2 2 2 2 3 3 3 3
The documentation is intentionally vague about how a 1D targetLocales argument will be reshaped if it's not a perfect square, but we can find out what's done in practice by using the targetLocales() query on the domain. Also, note that if no targetLocales array is supplied, the entire Locales array (which is 1D) is used by default. As an illustration of both these things, if the following code is run on six locales:
var D0 = D dmapped Block(D);
writeln(D0.targetLocales());
we get:
LOCALE0 LOCALE1
LOCALE2 LOCALE3
LOCALE4 LOCALE5
illustrating that the current heuristic matches our explicit grid1 declaration above.

OpenGL: How are base vectors laid out in memory

This topic has been discussed quite a few times. There is a lot of information on the memory layout of matrices in OpenGL on the internet. Sadly, different sources often contradict each other.
My question boils down to:
Suppose I have the three base vectors of my matrix, bx, by and bz. If I want to build a matrix out of them to pass to a shader, how are they laid out in memory?
Let's clarify what I mean by base vector, because I suspect this can also mean different things:
When I have a 3D model that is Z-up and I want to lay it down flat in my world space along the X-axis, then bz is [1 0 0]. That is, a vertex [0 0 2] in model space will be transformed to [2 0 0] when that vertex is multiplied by my matrix that has bz as the base vector for the Z-axis.
Coming to OpenGL matrix memory layout:
The GLSL spec (GLSL Spec p.110) says:
vec3 v, u;
mat3 m;
u = v * m;
is equivalent to
u.x = dot(v, m[0]); // m[0] is the left column of m
u.y = dot(v, m[1]); // dot(a,b) is the inner (dot) product of a and b
u.z = dot(v, m[2]);
So, in order to have best performance, I should premultiply my vertices in the vertex shader (that way the GPU can use the dot product and so on):
attribute vec4 vertex;
uniform mat4 mvp;
void main()
{
    gl_Position = vertex * mvp;
}
Now OpenGL is said to be column-major (GLSL Spec p 101). I.e. the columns are laid out contiguously in memory:
[ column 0 | column 1 | column 2 | column 3 ]
[ 0 1 2 3 | 4 5 6 7 | 8 9 10 11 | 12 13 14 15 ]
or:
[
0 4 8 12,
1 5 9 13,
2 6 10 14,
3 7 11 15,
]
This would mean that I have to store my base vectors in the rows like this:
bx.x bx.y bx.z 0
by.x by.y by.z 0
bz.x bz.y bz.z 0
0 0 0 1
So for my example with the 3D model that I want to lay flat down, it has the base vectors:
bx = [0 0 -1]
by = [0 1 0]
bz = [1 0 0]
The model vertex [0 0 2] from above would be transformed like this in the vertex shader:
// m[0] is [ 0 0 1 0]
// m[1] is [ 0 1 0 0]
// m[2] is [-1 0 0 0]
// v is [ 0 0 2 1]
u.x = dot([ 0 0 2 1], [ 0 0 1 0]);
u.y = dot([ 0 0 2 1], [ 0 1 0 0]);
u.z = dot([ 0 0 2 1], [-1 0 0 0]);
// u is [ 2 0 0]
Just as expected!
On the contrary, the SO question "Correct OpenGL matrix format?" and consequently the OpenGL FAQ state:
For programming purposes, OpenGL matrices are 16-value arrays with base vectors laid out contiguously in memory. The translation components occupy the 13th, 14th, and 15th elements of the 16-element matrix, where indices are numbered from 1 to 16 as described in section 2.11.2 of the OpenGL 2.1 Specification.
This says that my base vectors should be laid out in columns like this:
bx.x by.x bz.x 0
bx.y by.y bz.y 0
bx.z by.z bz.z 0
0 0 0 1
To me, these two sources, which are both official documentation from Khronos, seem to contradict each other.
Can somebody explain this to me? Have I made a mistake? Is there indeed some wrong information?
The FAQ is correct; it should be:
bx.x by.x bz.x 0
bx.y by.y bz.y 0
bx.z by.z bz.z 0
0 0 0 1
and it's your reasoning that is flawed.
Assuming that your base vectors bx, by, bz are the model basis given in world coordinates, the transformation from the model-space vertex v to the world-space vertex B*v is given by a linear combination of the base vectors:
B*v = bx*v.x + by*v.y + bz*v.z
It is not a dot product of b with v. Instead, it's the matrix multiplication where B is of the above form.
Taking a dot product of a vertex u with bx would answer the inverse question: given a world-space u, what would be its coordinate in model space along the axis bx? Therefore, multiplying by the transposed matrix transpose(B) would give you the transformation from world space to model space.
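On the CPU side, this layout can be sketched as follows. It is an illustration only: `Vec3`, `Mat4`, `fromBasis` and `transformPoint` are hypothetical names, not part of OpenGL. The 16 floats are stored column by column, so each base vector occupies four consecutive elements, and `transformPoint` computes B*v as the linear combination above.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<float, 3>;
using Mat4 = std::array<float, 16>;  // column-major: elements 0..3 are column 0

// Store the base vectors as the first three columns, i.e. contiguously in memory.
Mat4 fromBasis(const Vec3& bx, const Vec3& by, const Vec3& bz) {
    return Mat4{bx[0], bx[1], bx[2], 0.0f,
                by[0], by[1], by[2], 0.0f,
                bz[0], bz[1], bz[2], 0.0f,
                0.0f,  0.0f,  0.0f,  1.0f};
}

// B * v for a point v (w = 1): a linear combination of the columns of B,
// plus the translation column (elements 12..14).
Vec3 transformPoint(const Mat4& m, const Vec3& v) {
    Vec3 out{};
    for (int row = 0; row < 3; ++row)
        out[row] = m[0 + row] * v[0] + m[4 + row] * v[1] +
                   m[8 + row] * v[2] + m[12 + row];
    return out;
}
```

With the base vectors from the question, bx = [0 0 -1], by = [0 1 0], bz = [1 0 0], `transformPoint` maps [0 0 2] to [2 0 0]. An array in this layout would be uploaded without transposition (for example with the transpose argument of glUniformMatrix4fv set to GL_FALSE) and applied in the shader as mvp * vertex rather than vertex * mvp.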

Formula that uses previous value

In Stata I want to have a variable calculated by a formula, which includes multiplying by the previous value, within blocks defined by a variable ID. I tried using a lag but that did not work for me.
In the formula below the Y-1 is intended to signify the value above (the lag).
gen Y = 0
replace Y = 1 if count == 1
sort ID
by ID: replace Y = (1+X)*Y-1 if count != 1
X   Y  count  ID
.   1      1   1
2   3      2   1
1   6      3   1
3  24      4   1
2  72      5   1
.   1      1   2
1   2      2   2
7  16      3   2
Your code can be made a little more concise. Here's how:
input X count ID
. 1 1
2 2 1
1 3 1
3 4 1
2 5 1
. 1 2
1 2 2
7 3 2
end
gen Y = count == 1
bysort ID (count) : replace Y = (1 + X) * Y[_n-1] if count > 1
The creation of a dummy (indicator) variable can exploit the fact that true or false expressions are evaluated as 1 or 0.
Sorting before by and the subsequent by command can be condensed into one. Note that I spelled out that within blocks of ID, count should remain sorted.
This is really a comment, not another answer, but it would be less clear if presented as such.
Y-1, the lag in your formula, translates to Y[_n-1], as shown below.
gen Y = 0
replace Y = 1 if count == 1
sort ID
by ID: replace Y = (1+X)*Y[_n-1] if count != 1