Implementing iterative autocorrelation process in C++ using for loops - c++

I am implementing pitch tracking using an autocorrelation method in C++ but I am struggling to write the actual line of code which performs the autocorrelation.
I have an array containing a certain number ('values') of amplitude values of a pre-recorded signal, and I am performing the autocorrelation function on a set number (N) of these values.
In order to perform the autocorrelation I have taken the original array and reversed it so that point 0 = point N, point 1 = point N-1 etc.; this array is called revarray.
Here is what I want to do mathematically:
(array[0] * revarray[0])
(array[0] * revarray[1]) + (array[1] * revarray[0])
(array[0] * revarray[2]) + (array[1] * revarray[1]) + (array[2] * revarray[0])
(array[0] * revarray[3]) + (array[1] * revarray[2]) + (array[2] * revarray[1]) + (array[3] * revarray[0])
...and so on. This will be repeated for array[900]->array[1799] etc until autocorrelation has been performed on all of the samples in the array.
The number of times the autocorrelation is carried out is:
values / N = measurements
Here is the relevant section of my code so far:
for (k = 0; k = measurements; ++k){
for (i = k*(N - 1), j = k*N; i >= 0; i--, j++){
revarray[j] = array[i];
for (a = k*N; a = k*(N - 1); ++a){
autocor[a]=0;
for (b = k*N; b = k*(N - 1); ++b){
autocor[a] += //**Here is where I'm confused**//
}
}
}
}
I know that I want to keep iteratively adding new values to autocor[a], but my problem is that the value that needs to be added to will keep changing. I've tried using an increasing count like so:
for (i = (k*N); i = k*(N-1); ++i){
autocor[i] += array[i] * revarray[i-1]
}
But I clearly know this won't work as when the new value is added to the previous autocor[i] this previous value will be incorrect, and when i=0 it will be impossible to calculate using revarray[i-1]
Any suggestions? Been struggling with this for a while now. I managed to get it working on just a single array (not taking N samples at a time) as seen here but I think using the inverted array is a much more efficient approach, I'm just struggling to implement the autocorrelation by taking sections of the entire signal.

It is not entirely clear to me, but I'll assume that you need to perform as many iterations as there are elements in the array (if it is in fact only half that many, adjust the code accordingly).
N is also assumed to be the size of the array, so the index of the last element is N-1.
The loops would look like this:
for(size_t i = 0; i < N; ++i){
autocorr[i] = 0;
for(size_t j = 0; j <= i; ++j){
const size_t idxA = j
, idxR = i - j; // direct and reverse indices in the array
autocorr[i] += array[idxA] * array[idxR];
}
}
Basically you run the outer loop as many times as there are elements in your array, and for each of those iterations you run a shorter inner loop up to the current index of the outer loop.
All that is left to be done now is to properly calculate the indices into array and revarray and to accumulate a running sum at the current outer loop's index.
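The index arithmetic for processing the signal N samples at a time can be sketched as follows. This is only one possible reading of the question: the function name blockAutocorrelation is made up for illustration, the reversed copy is replaced by indexing the block backwards, and any trailing samples that do not fill a whole block are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// For every complete block of N samples, computes
//   autocor[k*N + i] = sum over j = 0..i of x[j] * rev[i - j],
// where x is the k-th block and rev[t] = x[N-1-t] is that block reversed.
// This equals the block's autocorrelation at lag N-1-i.
// (Hypothetical helper name; not taken from the original post.)
std::vector<double> blockAutocorrelation(const std::vector<double>& signal,
                                         std::size_t N) {
    assert(N > 0);
    const std::size_t measurements = signal.size() / N;  // partial tail is ignored
    std::vector<double> autocor(measurements * N, 0.0);

    for (std::size_t k = 0; k < measurements; ++k) {
        const std::size_t base = k * N;  // first sample of block k
        for (std::size_t i = 0; i < N; ++i) {
            double sum = 0.0;
            for (std::size_t j = 0; j <= i; ++j) {
                const std::size_t idxA = base + j;                  // array[j]
                const std::size_t idxR = base + (N - 1) - (i - j);  // revarray[i-j]
                sum += signal[idxA] * signal[idxR];
            }
            autocor[base + i] = sum;
        }
    }
    return autocor;
}
```

Because the reversed element is looked up directly in the original block, no separate revarray has to be filled, and the case i = 0 needs no special handling.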

Related

Trying to understand this solution

I was trying to solve a question and I got into a few obstacles that I failed to solve, starting off here is the question: Codeforces - 817D
I tried to brute force it: for each segment of the array I find the min and max, subtract them, and add the differences together to get the final imbalance. This looked good, but it gave me a time limit exceeded, because there are n*(n+1)/2 subsegments and n can be up to 10^6. I could not find a way around it, and after a couple of hours without any new ideas I decided to look at a solution, which to be honest I could not understand at all. Here is the solution:
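#include <bits/stdc++.h>
using namespace std;
#define ll long long
// INF is not defined in the snippet as posted; any sentinel larger than every |a[i]| works.
const int INF = INT_MAX;
int a[1000000], l[1000000], r[1000000];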
int main(void) {
int i, j, n;
scanf("%d",&n);
for(i = 0; i < n; i++) scanf("%d",&a[i]);
ll ans = 0;
for(j = 0; j < 2; j++) {
vector<pair<int,int>> v;
v.push_back({-1,INF});
for(i = 0; i < n; i++) {
while (v.back().second <= a[i]) v.pop_back();
l[i] = v.back().first;
v.push_back({i,a[i]});
}
v.clear();
v.push_back({n,INF});
for(i = n-1; i >= 0; i--) {
while (v.back().second < a[i]) v.pop_back();
r[i] = v.back().first;
v.push_back({i,a[i]});
}
for(i = 0; i < n; i++) ans += (ll) a[i] * (i-l[i]) * (r[i]-i);
for(i = 0; i < n; i++) a[i] *= -1;
}
cout << ans;
}
I tried tracing it but I keep wondering why was the vector used , the only idea I got is he wanted to use the vector as a stack since they both act the same(Almost) but then the fact that I don't even know why we needed a stack here and this equation ans += (ll) a[i] * (i-l[i]) * (r[i]-i); is really confusing me because I don't get where did it come from.
Well, that's a beast of a calculation. I must confess that I don't understand it completely either. The problem with the brute-force solution is that you have to calculate the same values all over again.
In a slightly modified example, you calculate the following values for an input of 2, 4, 1 (I reordered it by "distance"):
[2, *, *] (from index 0 to index 0), imbalance value is 0; i_min = 0, i_max = 0
[*, 4, *] (from index 1 to index 1), imbalance value is 0; i_min = 1, i_max = 1
[*, *, 1] (from index 2 to index 2), imbalance value is 0; i_min = 2, i_max = 2
[2, 4, *] (from index 0 to index 1), imbalance value is 2; i_min = 0, i_max = 1
[*, 4, 1] (from index 1 to index 2), imbalance value is 3; i_min = 2, i_max = 1
[2, 4, 1] (from index 0 to index 2), imbalance value is 3; i_min = 2, i_max = 1
where i_min and i_max are the indices of the element with the minimum and maximum value.
For a better visual understanding, i wrote the complete array, but hid the unused values with *
So in the last case [2, 4, 1], brute-force looks for the minimum value over all values, which is not necessary, because you already calculated the values for a sub-space of the problem, by calculating [2,4] and [4,1]. But comparing only the values is not enough, you also need to keep track of the indices of the minimum and maximum element, because those can be reused in the next step, when calculating [2, 4, 1].
The idea behind this is a concept called dynamic programming, where results from a calculation are stored to be used again. As so often, you have to choose between speed and memory consumption.
So to come back to your question, here is what I understood:
the arrays l and r store, for each element, the index of the nearest greater number to its left and to its right
vector v is used to find the last number (and its index) that is greater than the current one (a[i]). It keeps track of rising number series, e.g. for the input 5,3,4 at first the 5 is stored, then the 3, and when the 4 comes, the 3 is popped but the index of 5 is needed (to be stored in l[2])
then there is this fancy calculation (ans += (ll) a[i] * (i-l[i]) * (r[i]-i)). The stored indices of the maximum (and in the second run the minimum) elements are combined with the value a[i], which does not make much sense to me yet, but seems to work (sorry).
at last, all values in the array a are multiplied by -1, which means the old maximums are now the minimums, and the calculation is done again (2nd run of the outer for-loop over j)
This last step (multiply a by -1) and the outer for-loop over j are not necessary but it's an elegant way to reuse the code.
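One way to read the formula: a[i] is the maximum of exactly those subsegments that start somewhere after l[i] and end somewhere before r[i], and there are (i-l[i]) * (r[i]-i) of them, so a[i] contributes that many times to the sum of all maxima. The sketch below restates the posted idea with an index stack instead of the pair vector and the INF sentinel; the function names are made up for illustration and it is not the original author's code.

```cpp
#include <cassert>
#include <vector>

// Sum of max(segment) over all contiguous segments of a, by counting for each
// a[i] the number of segments in which it is the (leftmost) maximum.
long long sumOfSegmentMaxima(const std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    std::vector<int> left(n), right(n), stack;

    // left[i]: nearest index to the left holding a value strictly greater than a[i]
    for (int i = 0; i < n; ++i) {
        while (!stack.empty() && a[stack.back()] <= a[i]) stack.pop_back();
        left[i] = stack.empty() ? -1 : stack.back();
        stack.push_back(i);
    }
    stack.clear();
    // right[i]: nearest index to the right holding a value >= a[i]
    // (strict on one side only, so equal values are not counted twice)
    for (int i = n - 1; i >= 0; --i) {
        while (!stack.empty() && a[stack.back()] < a[i]) stack.pop_back();
        right[i] = stack.empty() ? n : stack.back();
        stack.push_back(i);
    }

    long long total = 0;
    for (int i = 0; i < n; ++i)
        total += static_cast<long long>(a[i]) * (i - left[i]) * (right[i] - i);
    return total;
}

// Imbalance = sum of maxima - sum of minima. Negating the array turns minima
// into maxima, which is the same trick as the a[i] *= -1 loop in the solution.
long long imbalance(std::vector<int> a) {
    const long long maxSum = sumOfSegmentMaxima(a);
    for (int& x : a) x = -x;
    return maxSum + sumOfSegmentMaxima(a);  // second term is -(sum of minima)
}
```

For the input 2, 4, 1 used above, the six segments have imbalances 0, 0, 0, 2, 3, 3, so the function should return 8.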
Hope this helps a bit.

knapsack with weight only

Suppose I am given a maximum weight, say W=20, and a set of weights, say m=[5,7,12,18]. How can I calculate the largest total weight not exceeding W that can be formed from m? In this case the answer is 19 (12+7=19), but my code gives me 18. Please help me with this.
int weight(int W, vector<int> &m) {
int current_weight = 0;
int temp;
for (size_t i = 0; i < m.size(); i++) {
for (int j = i + 1; j < m.size(); j++) {
if (m[i] < m[j]) {
temp = m[j];
m[j] = m[i];
m[i] = temp;
}
}
}
for (size_t i = 0; i < m.size(); ++i) {
if (current_weight + m[i] <= W) {
current_weight += m[i];
}
}
return current_weight;
}
The problem you describe looks more like a version of the maximum subset sum problem. Basically, there is nothing wrong with your implementation as such; you have correctly implemented a greedy algorithm for the problem. That being said, this algorithm does not produce an optimal solution for every input. The instance you have found is such an example.
However, the problem can be solved using a different approach termed dynamic programming, which can be seen as form of organization of a recursive formulation of the solution.
Let m = { m_1, ..., m_n } be the set of positive item sizes and W a capacity constraint, where n is a positive integer. Organize an array A[n][W] as a state space where
A[i][j] = the maximum weight at most j attainable for the set of items
with indices from 1 to i if such a solution exists and
minus infinity otherwise
for each i in {1,...,n} and j in {1,...,W}; for ease of presentation, suppose that A has a value of minus infinity everywhere else. Note that for each such i and j the recurrence relation
A[i][j] = max { A[i-1][j-m_i] + m_i, A[i-1][j] }
holds, where the first case corresponds to selecting item i into the solution and the second case corresponds to not selecting item i into the solution.
Next, organize a loop which fills this table in order of increasing values of i and j, where the initialization for i = 1 has to be done beforehand. After filling the state space, the maximum feasible value in the last row
max{ A[n][j] : j in {1,...,W}, A[n][j] is not minus infinity }
yields the optimal solution. If the associated set of items is also desired, either some backtracking or suitable auxiliary data structures have to be used.
So it feels like this solution can be a trivial change to the commonly existing 0-1 knapsack problem, by passing the copy of the weight array as the value array.
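A compact variant of this dynamic program, sketched under the assumption that all weights are positive integers, keeps a single row that records which total weights are reachable instead of the full A[n][W] table. The function name maxWeightWithinCapacity is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Largest total weight <= W that can be formed by picking each item at most once.
// reachable[j] is true when some subset of the items seen so far weighs exactly j.
int maxWeightWithinCapacity(int W, const std::vector<int>& m) {
    assert(W >= 0);
    std::vector<char> reachable(static_cast<std::size_t>(W) + 1, 0);
    reachable[0] = 1;  // the empty selection weighs 0

    for (int item : m) {
        if (item <= 0 || item > W) continue;  // cannot contribute
        // Walk j downwards so that each item is used at most once.
        for (int j = W; j >= item; --j) {
            if (reachable[j - item]) reachable[j] = 1;
        }
    }
    for (int j = W; j > 0; --j) {
        if (reachable[j]) return j;
    }
    return 0;
}
```

With W = 20 and m = [5, 7, 12, 18] this returns 19, whereas the greedy approach from the question stops at 18.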

Treats for the cows - bottom up dynamic programming

The full problem statement is here. Suppose we have a double ended queue of known values. Each turn, we can take a value out of one or the other end and the values still in the queue increase as value*turns. The goal is to find maximum possible total value.
My first approach was to use straightforward top-down DP with memoization. Let i,j denote starting, ending indexes of "subarray" of array of values A[].
f(i,j,age) = A[i]*age                                                     if i == j
f(i,j,age) = max(f(i+1,j,age+1) + A[i]*age, f(i,j-1,age+1) + A[j]*age)    otherwise
This works, however, proves to be too slow, as there are superfluous stack calls. Iterative bottom-up should be faster.
Let m[i][j] be the maximum reachable value of the "subarray" of A[] with begin/end indexes i,j. Because i <= j, we care only about the lower triangular part.
This matrix can be built iteratively using the fact that m[i][j] = max(m[i-1][j] + A[i]*age, m[i][j-1] + A[j]*age), where age is at its maximum on the diagonal (the size of A[]) and decreases linearly as A.size()-(i-j).
My attempt at an implementation results in a bus error.
Is the described algorithm correct? What is the cause for the bus error?
Here is the only part of the code where the bus error might occur:
for(T j = 0; j < num_of_treats; j++) {
max_profit[j][j] = treats[j]*num_of_treats;
for(T i = j+1; i < num_of_treats; i++)
max_profit[i][j] = max( max_profit[i-1][j] + treats[i]*(num_of_treats-i+j),
max_profit[i][j-1] + treats[j]*(num_of_treats-i+j));
}
for(T j = 0; j < num_of_treats; j++) {
Inside this loop, j is clearly a valid index into the array max_profit. But you're not using just j.
The bus error is caused by trying to access the array via a negative index when j=0 and i=1, as I should have noticed during debugging. The algorithm is wrong as well. First, the relationship used to construct the max_profit[][] array should be
max_profit[i][j] = max( max_profit[i+1][j] + treats[i]*(num_of_treats-j+i),
max_profit[i][j-1] + treats[j]*(num_of_treats-j+i));
Second, the array must be filled diagonally, so that max_profit[i+1][j] and max_profit[i][j-1] are already computed, with the exception of the main diagonal.
Third, the data structure chosen is extremely inefficient. I am using only half of the space allocated for max_profit[][]. Plus, at each iteration, I only need the last computed diagonal. An array of size num_of_treats should suffice.
Here is a working code using this improved algorithm. I really like it. I even used bit operators for the first time.
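For illustration, a diagonal fill that keeps only one array of size num_of_treats could look like the sketch below. It is not the code referred to above; it assumes i is the begin index and j the end index of the remaining treats (i <= j), so that the treat taken next from that range is sold at age n - (j - i). The function name maxTreatProfit is made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// best[i] holds the maximum profit for the range of the current length that
// starts at index i; it is overwritten diagonal by diagonal.
long long maxTreatProfit(const std::vector<int>& treats) {
    const std::size_t n = treats.size();
    if (n == 0) return 0;

    std::vector<long long> best(n);
    // Main diagonal: a single remaining treat is sold last, at age n.
    for (std::size_t i = 0; i < n; ++i)
        best[i] = static_cast<long long>(treats[i]) * static_cast<long long>(n);

    for (std::size_t len = 2; len <= n; ++len) {
        const long long age = static_cast<long long>(n - len + 1);
        for (std::size_t i = 0; i + len <= n; ++i) {
            const std::size_t j = i + len - 1;
            // Before the update, best[i + 1] is the range [i+1, j]
            // and best[i] is the range [i, j-1].
            best[i] = std::max(best[i + 1] + treats[i] * age,
                               best[i] + treats[j] * age);
        }
    }
    return best[0];
}
```

Iterating i upwards matters here: best[i + 1] must still hold the value from the previous diagonal when best[i] is updated.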

Writing MATLAB arrays in C/C++

The MATLAB code below samples part of the background of a grayscale image by creating a cell array backgroundSample{1}, backgroundSample{2}, ..., backgroundSample{9}. Here halfRows and halfCols are half the image dimensions.
backgroundSample is an array that contains nine 2-D matrices, and I am unsure how to write this code in C/C++. Can I get the elements of backgroundSample{i} using something like backgroundSample[i].elements[m][n]?
MATLAB code:
offset = [-60, -20, 20, 60];
for i = 1: 1: 3
for j = 1: 1: 3
backgroundSample{(i - 1) * 3 + j} =
background(halfRows + offset(i): halfRows + offset(i + 1), ...
halfCols + offset(j): halfCols + offset(j + 1));
end;
end;
EDIT:
As we can assign a matrix simply by A = B in MATLAB. For an example, backgroundSample{1} = background(60: 100, 60: 100) in my question and this assignment is in the loops of i: 1→3 and j: 1→3. However, when assigning a matrix in C/C++, it should assign every element one by one. Maybe like this:
for(int i = 0; i < 3; i++)
for(int j = 0; j < 3; j++)
// to get every elements
for(int m = 0 ...)
for(int n = 0 ...)
// not sure whether there is such usage of "->" in array
backgroundSample[(i - 1) * 3 + j]->elements[m][n] = background[i iteration][j iteration]
So there are conflicts between indices of matrix backgroundSample[m][n] and background[i][j]. How to resolve the issue?
The simplest way to implement what you're describing is to declare a multidimensional array:
int backgroundSample[9][3][3];
where the dimensions of each 2-D matrix is assumed to be 3×3. To access the (m, n) element in the k-th matrix, you write backgroundSample[k][m][n], e.g:
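for (int m = 0; m < 3; ++m)
{
for(int n = 0; n < 3; ++n)
{
// i and j are the 0-based block indices (0..2); copy element (m, n) of that block
backgroundSample[i * 3 + j][m][n] = background[halfRows + offset[i] + m][halfCols + offset[j] + n];
}
}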
Alternatively, if each sample in this array stores more information, you can declare a structure:
typedef struct
{
int elements[3][3];
// More fields...
} TSample;
and then create an array of these:
TSample backgroundSample[9];
To access an element you would write backgroundSample[k].elements[m][n].
There's also the possibility of allocating the memory dynamically (at runtime, for when you don't know in advance how many of these structures you need):
TSample* backgroundSample;
In C++ the actual process of memory allocation would look like this:
backgroundSample = new TSample[9];
Accessing an element is still done by writing backgroundSample[k].elements[m][n]: indexing the pointer with [k] already yields a TSample object, so the member is reached with the . operator. The -> operator is only needed when you hold a pointer to a single structure, e.g. (backgroundSample + k)->elements[m][n].
Note: each new[] needs to be accompanied by a matching delete[] when done in order to release the memory, i.e:
delete[] backgroundSample;
Hope that helps!
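If the block size is not fixed at compile time (in the MATLAB code each sample spans 41 rows and 41 columns), a std::vector based sketch avoids manual memory management. The names Matrix, subMatrix and sampleBackground are made up for illustration, and halfRows/halfCols are assumed to be 0-based centre indices here, which shifts everything by one compared to MATLAB.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Copies rows r0..r1 and columns c0..c1 (inclusive, 0-based) out of src,
// like src(r0:r1, c0:c1) in MATLAB apart from the index base.
Matrix subMatrix(const Matrix& src, int r0, int r1, int c0, int c1) {
    assert(0 <= r0 && r0 <= r1 && r1 < static_cast<int>(src.size()));
    Matrix out;
    for (int r = r0; r <= r1; ++r) {
        assert(0 <= c0 && c0 <= c1 && c1 < static_cast<int>(src[r].size()));
        out.emplace_back(src[r].begin() + c0, src[r].begin() + c1 + 1);
    }
    return out;
}

// Builds the nine samples; samples[i * 3 + j] plays the role of
// backgroundSample{(i - 1) * 3 + j} with 0-based i and j.
std::vector<Matrix> sampleBackground(const Matrix& background,
                                     int halfRows, int halfCols) {
    const int offset[4] = {-60, -20, 20, 60};
    std::vector<Matrix> samples;
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            samples.push_back(subMatrix(background,
                                        halfRows + offset[i], halfRows + offset[i + 1],
                                        halfCols + offset[j], halfCols + offset[j + 1]));
        }
    }
    return samples;
}
```

An element is then read as samples[k][m][n], and whole matrices can be assigned with = just like in MATLAB, because std::vector copies its contents.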

Converting MatLab code - Confused

Basically, I have this final piece of code to convert from MatLab to C++.
The function takes in a 2D vector and then checks the elements of the 2D vector against 2 criteria and if not matched, it removes the blocks. But I'm confused to what the code in MatLab wants to be returned, a 2D or a 1D vector? Here is the code:
function f = strip(blocks, sumthresh, zerocrossthresh)
% This function removes leading and trailing blocks that do
% not contain sufficient energy or frequency to warrent consideration.
% Total energy is measured by summing the entire vector.
% Frequency is measured by counting the number of times 0 is crossed.
% The parameters sumthresh and zerocrossthrech are the thresholds,
% averaged across each sample, above which consideration is warrented.
% A good sumthresh would be 0.035
% A good zerocrossthresh would be 0.060
len = length(blocks);
n = sum(size(blocks)) - len;
min = n+1;
max = 0;
sumthreshtotal = len * sumthresh;
zerocrossthreshtotal = len * zerocrossthresh;
for i = 1:n
currsum = sum(abs(blocks(i,1:len)));
currzerocross = zerocross(blocks(i,1:len));
if or((currsum > sumthreshtotal),(currzerocross > zerocrossthreshtotal))
if i < min
min = i;
end
if i > max;
max = i;
end
end
end
% Uncomment these lines to see the min and max selected
% max
% min
if max > min
f = blocks(min:max,1:len);
else
f = zeros(0,0);
end
Alternatively, instead of returning another vector (whether it be 1D or 2D) might it be better to actually send the memory location of the vector and remove the blocks from it? So for example..
for(unsigned i=0; (i < theBlocks.size()); i++)
{
for(unsigned j=0; (j < theBlocks[i].size()); j++)
{
// handle theBlocks[i][j] ....
}
}
Also, I do not understand this line:
currsum = sum(abs(blocks(i,1:len)));
Basically the: (i,1:len)
Any ideas? Thanks :)
blocks(i,1:len) selects row i of blocks, columns 1 through len, i.e. blocks[i][1 to the end] (MATLAB indices start at 1). So for row i it is taking something like:
blocks[i][1]
blocks[i][2]
blocks[i][3]
.
.
.
blocks[i][end]
sum(abs(...)) then takes the absolute value of each of those elements and adds them together. The function returns a 2-D matrix, whose size is either 0x0 or (max-min+1) x len.
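A rough C++ counterpart that returns a new 2-D vector might look like the sketch below. The MATLAB helper zerocross is not shown in the question, so countZeroCrossings is a guessed stand-in that counts sign changes; the other names are also made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Blocks = std::vector<std::vector<double>>;

// Assumed behaviour of the MATLAB zerocross helper: number of sign changes
// between neighbouring samples.
std::size_t countZeroCrossings(const std::vector<double>& block) {
    std::size_t count = 0;
    for (std::size_t k = 1; k < block.size(); ++k) {
        if ((block[k - 1] < 0.0) != (block[k] < 0.0)) ++count;
    }
    return count;
}

// Drops leading and trailing blocks whose summed magnitude and zero-crossing
// count are both at or below their per-sample thresholds.
Blocks stripBlocks(const Blocks& blocks, double sumThresh, double zeroCrossThresh) {
    std::size_t first = 0, last = 0;
    bool found = false;
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        const double len = static_cast<double>(blocks[i].size());
        double sum = 0.0;
        for (double v : blocks[i]) sum += std::fabs(v);  // sum(abs(blocks(i,1:len)))
        const double crossings = static_cast<double>(countZeroCrossings(blocks[i]));
        if (sum > len * sumThresh || crossings > len * zeroCrossThresh) {
            if (!found) first = i;
            last = i;
            found = true;
        }
    }
    // Mirrors "if max > min": a single qualifying block also yields an empty result.
    if (!found || last <= first) return {};
    return Blocks(blocks.begin() + first, blocks.begin() + last + 1);
}
```

Returning a new vector keeps the caller's data intact; removing the blocks in place through a non-const reference would also work, at the cost of modifying the input.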