Reordering with Accelerate framework - c++

I would like to use the Accelerate Framework libraries for sorting data (pairs of x and y values).
I used the function vDSP_vsorti to obtain the vector of indices that sorts the x data. Now I need to reorder the y data according to those sorting indices.
How could I do it? Is there a function in the Accelerate framework to reorder a vector?

Can you use vDSP_vgathr? This API grabs values from a vector using another vector of indices.
https://developer.apple.com/library/mac/documentation/Accelerate/Reference/vDSPRef/index.html#//apple_ref/c/func/vDSP_vgathr
Here is a summary:
Single-Vector Gathering
The functions in this group use either indices or pointers stored within one source vector to generate a new vector containing the chosen elements from either a second source vector or from memory.
vDSP_vgathr
Vector gather; single precision.
Declaration
SWIFT
func vDSP_vgathr(_ __vDSP_A: UnsafePointer<Float>,
_ __vDSP_B: UnsafePointer<vDSP_Length>,
_ __vDSP_J: vDSP_Stride,
_ __vDSP_C: UnsafeMutablePointer<Float>,
_ __vDSP_K: vDSP_Stride,
_ __vDSP_N: vDSP_Length)
OBJECTIVE-C
void vDSP_vgathr ( const float *__vDSP_A, const vDSP_Length *__vDSP_B, vDSP_Stride __vDSP_IB, float *__vDSP_C, vDSP_Stride __vDSP_IC, vDSP_Length __vDSP_N );
Parameters
__vDSP_A
Single-precision real input vector
__vDSP_B
Integer vector containing indices
__vDSP_J
Stride for B
__vDSP_C
Single-precision real output vector
__vDSP_K
Stride for C
__vDSP_N
The number of elements to process
Discussion
Performs the following operation:
Uses elements of vector B as indices to copy selected elements of vector A to sequential locations in vector C. Note that 1, not zero, is treated as the first location in the input vector when evaluating indices. This function can only be done out of place.
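The index convention matters when combining a sort with a gather, so here is a minimal portable sketch of the idea in plain C++. It does not call Accelerate: sort_indices, gather_one_based and reorder_by are hypothetical helper names that only emulate the behaviour described above (c[n] = a[b[n] - 1]). It assumes the sort step yields zero-based indices, which should be checked against the vDSP_vsorti documentation before relying on it.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the zero-based permutation that sorts x in ascending order
// (the role the sorting step plays in the question).
std::vector<std::size_t> sort_indices(const std::vector<float>& x) {
    std::vector<std::size_t> idx(x.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::stable_sort(idx.begin(), idx.end(),
                     [&x](std::size_t a, std::size_t b) { return x[a] < x[b]; });
    return idx;
}

// Emulates a one-based gather: c[n] = a[b[n] - 1].
// Every index in b is assumed to lie in 1..a.size().
std::vector<float> gather_one_based(const std::vector<float>& a,
                                    const std::vector<std::size_t>& b) {
    std::vector<float> c(b.size());
    for (std::size_t n = 0; n < b.size(); ++n) {
        c[n] = a[b[n] - 1];
    }
    return c;
}

// Reorders y so that it follows the ascending order of x.
std::vector<float> reorder_by(const std::vector<float>& x,
                              const std::vector<float>& y) {
    std::vector<std::size_t> idx = sort_indices(x);
    for (std::size_t& i : idx) {
        ++i;  // shift zero-based sort indices to the one-based gather convention
    }
    return gather_one_based(y, idx);
}
```

If the indices were passed to a one-based gather without the shift, every element would be off by one and index 0 would read before the start of the input.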

How to use accumulate function to sum a row of values in a variable array?

I've been working on a program where I need to be able to sum rows in a two-dimensional array whose number of columns is variable. I should also add that the rows are "split" into two parts (part A and part B) whose sizes depend on user input.
I can obviously sum a row just using a for loop, but I wanted a more elegant solution that would also be easier to set up across the whole program. I stumbled across the accumulate function out of the numeric library, but all examples that I was able to find were exclusively for one-dimensional arrays.
Here's a sample of my problem code:
total = partNum[PART_A] + partNum[PART_B];
partStart[PART_A] = 0;
partEnd[FUNC_A] = partNum[PART_A];
partStart[PART_B] = partNum[PART_A];
partEnd[FUNC_B] = total;
double stat[5][total];
double mass_sum = 0.0
func = PART_A;
accumulate(stat[MASS][partStart[func]], stat[MASS][partStart[func]], mass_sum);
However, I get a buildtime error which states that:
Indirection requires pointer operand ('double' invalid)
I assume this is a syntax error, but changing how I defined the array's start and end did nothing to fix the error.
The first two arguments of accumulate are iterators that the function uses to traverse the range, but you are passing actual elements of the array.
An iterator in C++ is a concept that requires certain operations to be valid on your object, as defined by the standard. For instance, pointer types satisfy LegacyRandomAccessIterator, meaning that you can use them much like an array (you can increment them with ++, dereference them with *, access the element at position i with [], etc.). I won't go into full detail about iterators in C++ because it's a whole topic of its own and you can find plenty of references, in particular the two links I provided.
Back to your problem: what you want is to give accumulate iterators to the beginning and the end of your row, or to the beginning and the end of your subranges. There are two ways to do this:
Take the address of the element stat[MASS][partStart[func]], i.e., &stat[MASS][partStart[func]], which will give you the beginning of the range, and then &stat[MASS][partEnd[func]] will give you the end of the range. This works because stat is declared as double stat[5][total], the access operator ([]) gives you a reference (a double&) whose address you can take, and the elements of a row are contiguous in memory (this would not work for a column).
Use stat[MASS] + partStart[func] and stat[MASS] + partEnd[func]. In this case, you take the beginning of the row (stat[MASS]), which is (or is implicitly convertible to) a pointer to double (double*), and you offset that pointer by partStart[func] or partEnd[func], giving you the addresses of the elements you want in the row.
So basically:
std::accumulate(&stat[MASS][partStart[func]], &stat[MASS][partEnd[func]], mass_sum);
// or
std::accumulate(stat[MASS] + partStart[func], stat[MASS] + partEnd[func], mass_sum);
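To make the pointer-as-iterator idea concrete, here is a small self-contained sketch. The sizes (kRows, kCols), the row index kMass and the helper sum_range are made up for illustration. A fixed column count is used because a runtime-sized array such as double stat[5][total] is a compiler extension rather than standard C++.

```cpp
#include <cstddef>
#include <numeric>

// Hypothetical sizes and row index, chosen only for this example.
constexpr std::size_t kRows = 5;
constexpr std::size_t kCols = 6;
constexpr std::size_t kMass = 0;

// Sums the half-open range [begin, end) of one row.
// 'row' points at the first element of the row; because the elements of a
// row are contiguous, row + begin and row + end are valid iterators.
double sum_range(const double* row, std::size_t begin, std::size_t end) {
    // The initial value 0.0 is a double, so the sum is accumulated as double.
    return std::accumulate(row + begin, row + end, 0.0);
}
```

With double stat[kRows][kCols], part A of the mass row would then be sum_range(stat[kMass], partStart[PART_A], partEnd[PART_A]). Note that the third argument of std::accumulate fixes the type of the running total: passing the integer 0 instead of 0.0 would truncate each partial sum to int.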

Reshaping Fortran arrays

I have a huge m by 1 array (m is very large) called X, which is the result of a Fortran matmul operation. My problem is to store this apparently 2D array in a 1D array Y of size m.
I tried Y = reshape(X, [[2]]), and this results in some elements being NaN. Can anyone point me to the Fortran commands to do this quickly? The elements of X may be zero or non-zero.
The second argument of reshape (or the one with keyword shape=) is the shape of the function's result. In your call, you have requested shape [2].
An array with shape [2] is a rank-1 array with two elements. You want a rank-1 array with m elements:
Y = RESHAPE(X, [m])
Now, in this case there's no need to use reshape:
Y = X(:,1)
where the right-hand side is the rank-1 array section of X.
When you have Y=reshape(X,[2]), if Y is not allocatable and not of size 2 then you have a problem, which may indeed result in your compiler deciding (as it is quite entitled to do) to give you a few NaNs.
Note also that you may not need to reshape your array, depending on how you intend to later use it.

Armadillo: efficient RAM sparse batch insertion

I know that sparse matrix support in Armadillo is still preliminary.
I'm using the Armadillo library in my quantum systems research and I'm having trouble constructing a sparse matrix in a memory-efficient way.
So far I have been using my own implementation of sparse matrices, but I would like to have an optimized matrix class.
I'm filling elements in batch mode:
umat loc(2,size);
cx_vec val(size);
// calculate loc and val
...
//
sp_cx_mat Hamiltonian(loc, val);
This copies the values from loc and val into the Hamiltonian constructor, so for a few seconds it requires twice the RAM. I am computing huge matrices (the size is about 2**L, where L = 22, 24, ...), so I would like the code to be well optimised for memory.
For comparison, matrix size: 705432x705432 - RAM and "filling time":
my implementation (COO format): time 7.95s, memory 317668kB
armadillo (CSC format): time 5.32s, memory 715000kB
Is it possible to deallocate fragments of the vectors loc and val on the fly, element by element, to save memory?
The answer here will be to use the other sparse matrix constructor that takes the CSC format, so you will need to modify your // calculate loc and val code, instead filling the following three arrays:
values (length equal to number of points)
row_indices (length equal to number of points)
col_ptrs (length equal to number of columns plus one)
The points should be arranged in column-major ordering in the values and row_indices vectors, and each entry col_ptrs[i] contains the number of nonzero elements before the beginning of column i. That is, col_ptrs[0] will always contain 0, col_ptrs[1] will contain the number of nonzero elements in the first column, col_ptrs[2] will contain the number of nonzero elements in the first and second columns, and the last entry, col_ptrs[n_cols], will contain the number of nonzero elements in the matrix.
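As an illustration of that layout, the following sketch builds the three arrays with plain standard-library containers. It does not use Armadillo itself. Entry, CscArrays and to_csc are hypothetical names, and double values stand in for the complex values in the question. The input is assumed to be already in column-major order (sorted by column, then by row).

```cpp
#include <cstddef>
#include <vector>

// One nonzero element in coordinate form (hypothetical helper type).
struct Entry {
    std::size_t row;
    std::size_t col;
    double value;
};

// The three CSC arrays described above (hypothetical helper type).
struct CscArrays {
    std::vector<double> values;
    std::vector<std::size_t> row_indices;
    std::vector<std::size_t> col_ptrs;  // length n_cols + 1
};

// Builds the CSC arrays from entries that are ALREADY in column-major order.
CscArrays to_csc(const std::vector<Entry>& entries, std::size_t n_cols) {
    CscArrays csc;
    csc.values.reserve(entries.size());
    csc.row_indices.reserve(entries.size());
    csc.col_ptrs.assign(n_cols + 1, 0);
    for (const Entry& e : entries) {
        csc.values.push_back(e.value);
        csc.row_indices.push_back(e.row);
        ++csc.col_ptrs[e.col + 1];  // count the nonzeros of column e.col
    }
    // A running sum turns the per-column counts into "nonzeros before column c".
    for (std::size_t c = 0; c < n_cols; ++c) {
        csc.col_ptrs[c + 1] += csc.col_ptrs[c];
    }
    return csc;
}
```

For a 3x3 matrix with nonzeros at (0,0), (2,0) and (1,2), this gives col_ptrs = {0, 2, 2, 3}: two elements before column 1, still two before column 2, and three in total. In a memory-constrained setting the arrays would be filled directly during the calculation rather than converted from an intermediate list, which is the point of the suggestion above.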
For more documentation on this constructor, see the "Batch constructors" section of http://arma.sourceforge.net/docs.html#SpMat ; this is the fourth entry in that list.
If you cannot easily modify your calculation code to adhere to that format, then you might be better off trying to specify sort_locations = false to the constructor you are using, if you are not already doing that.

Are functions on matrices applied to the entire matrix or each row in Fortran?

I've never written in Fortran, but I'm trying to adapt a script to R and the following lines are confusing me. So this is how the variable is defined:
real, dimension(n,nd) :: x
Does this mean x is n arrays, each filled with nd real values, or an n x nd matrix?
Then
amax = maxval(abs(x))
x = x/amax
is applied. Is the variable amax a global max of the absolute values in x or is it an array of n max values, one for each row? This is important to know if the x = x/amax is being applied to each row or the entire matrix. The purpose of this function seems to be some type of normalization.
The question of the title is much more general than that of the body, so I'll come to that later.
The result of maxval(array) is a scalar, being the maximum value in array (if it's of non-zero size).
In your example, x is a single array of rank 2 (which is commonly thought of as being a matrix). Thus, maxval(x) is indeed what you call the global maximum of that matrix. An alternative form of maxval is required to give the row-by-row maxima: maxval(x,dim=2).
Now, there is something else to note from your example:
x = x/amax
has a requirement about the shapes of x and amax.
You don't give a declaration for amax but there are two possibilities:
amax has the same shape as x; or
amax is a scalar.
[Note that amax needn't be a scalar just because it is assigned a scalar result from that maxval reference. However, you will see that amax won't be declared as rank 1 with size the number of rows of x, so that's another clue that maxval is giving the global maximum.]
These two possibilities come from conformability rules for division. With amax a scalar each element of x is divided by that value; with amax an array each element of x is divided by the corresponding element in amax.
If you want to normalize each individual row of x then you just can't use that division expression with amax a rank 1 array.
Coming to the more general question: even though it's an either/or question the answer is "no". There is no single way. Each function acts as it is defined.
As a general rule, though, the intrinsic functions of Fortran rarely care about the specific case of arrays which have "rows". But one useful thought is that a function acts either:
on all elements individually, returning an array of the same shape;
on the array as a whole, returning a scalar.
This is moderated by the fact that many of them accept a dim argument, which causes the function to act on slices instead.
The first line means that the variable x is an array of two dimensions (n,nd) and not n arrays of nd values. The function maxval returns the maximum value in this array.
See page 130 (in the PDF not the printed number) in F90_notes.pdf (you will also find a whole chapter concerning the arrays in the same document).
To add to Baruchel's answer: x/amax divides each element of the 2D array x by the scalar amax.

What data structure to use?

I need a data structure with the following properties:
Access to elements must be very fast
Elements that have not been added shouldn't take memory (ideally, the size of an empty structure is close to zero)
Each element has two integer coordinates (x,y) (access to elements only by them)
Max count of elements known at creation time (over 10^3)
Element contains few float values
It would be good if you could also point me to an implementation of this structure in C or C++.
Are you looking for a sparse matrix?
Check this out - you could alter the element type to float if this does everything you want.
Concise Sparse Matrix Package in C
For C++ you could use Boost.uBLAS - sparse_matrix details here.
If your X and Y are relatively small then a two dimensional array of pointers would work. 10000 pointers would be 40K in 32 bit code.
#include <map>
#include <utility>
// Key type: the (x, y) coordinates of an element.
typedef std::pair<int, int> ElementAccessor;
struct Element
{
float f1;
float f2;
//etc.
};
std::map< ElementAccessor, Element > myElementMap;
You can now use this map as a matrix. ElementAccessor refers to x,y. Just make sure to check whether the element exists in the map (for example with find) before you access it with operator[]; otherwise a default-constructed element is inserted.
http://www.cplusplus.com/reference/std/utility/pair/
http://www.cplusplus.com/reference/stl/map/find/
edit: the template brackets are showing up for the map. the map key type is ElementAccessor, the value is Element. Also, for the pair, the templating is int, int.
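The existence check can be sketched as follows. ElementMap and find_element are hypothetical names added for illustration, and the two-float Element is just a stand-in for whatever payload is needed.

```cpp
#include <map>
#include <utility>

typedef std::pair<int, int> ElementAccessor;  // (x, y) coordinates

struct Element {
    float f1;
    float f2;
};

typedef std::map<ElementAccessor, Element> ElementMap;

// Returns a pointer to the element stored at (x, y), or nullptr if nothing
// is stored there. Unlike operator[], find never inserts a new element.
const Element* find_element(const ElementMap& m, int x, int y) {
    ElementMap::const_iterator it = m.find(ElementAccessor(x, y));
    return it == m.end() ? nullptr : &it->second;
}
```

Elements are added with myElementMap[ElementAccessor(x, y)] = value; and read back through find_element, so coordinates that were only queried never take up memory. Lookups in std::map are logarithmic in the number of stored elements. If that is not fast enough, a hash-based container might be worth measuring, although that goes beyond what this answer proposes.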