Calculate order statistics from a matrix in SAS


I have a matrix in SAS/IML:
x = {7 6 3 3 8,
2 3 5 2 5,
2 6 4 3 8,
7 4 8 1 3,
8 8 6 8 7,
3 2 6 1 5 };
I want to create a new matrix that contains the highest k values of each column in x. For example, if k=3, I want the result matrix to contain:
8 8 8 8 8
7 6 6 3 8
7 6 6 3 7
because, for instance, the largest 3 numbers in the first column of x are 8, 7, and 7.
I've unsuccessfully tried to figure out how to do this using the rank function.

Your code looks fine. Here's a minor revision:
do c=1 to ncol(x);
   r = rank(x[,c]);
   y = x[loc(r>=nrow(x)-k+1), c];
   call sort(y);
   tops[,c] = y;
end;
As to avoiding the loop to make it faster, it's not necessary. Even with 10,000 columns, this code runs in a fraction of a second. Try running the following timing code:
x = j(500, 10000);
call randgen(x,"normal");
k = 3;
t0=time();
tops = j(k,ncol(x),0);
do c=1 to ncol(x);
   r = rank(x[,c]);
   y = x[loc(r>=nrow(x)-k+1), c];
   call sort(y);
   tops[,c] = y;
end;
t=time()-t0;
print t;

Here's a partial answer I've come up with:
k = 3;
tops = j(k,ncol(x),0);
do c=1 to ncol(x);
   r = rank(x[,c]);
   h=loc(r>=nrow(x)-k+1);
   tops[,c] = x[,c][h];
end;
This approach uses a loop, which I'd like to avoid, so please post improvements if possible!
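To check the expected result outside SAS/IML, here is a minimal Python 3 sketch of the same per-column top-k selection. The function name top_k_per_column is hypothetical, and it sorts each column rather than using ranks, so it illustrates the goal and is not a translation of the IML code above.

```python
def top_k_per_column(x, k):
    """Return a k-row matrix holding the k largest values of each column of x."""
    # zip(*x) yields the columns of a row-major list of lists.
    tops = [sorted(col, reverse=True)[:k] for col in zip(*x)]
    # Transpose back so that each column of the result matches a column of x.
    return [list(row) for row in zip(*tops)]


x = [
    [7, 6, 3, 3, 8],
    [2, 3, 5, 2, 5],
    [2, 6, 4, 3, 8],
    [7, 4, 8, 1, 3],
    [8, 8, 6, 8, 7],
    [3, 2, 6, 1, 5],
]

for row in top_k_per_column(x, 3):
    print(*row)
```

With the matrix from the question and k = 3, this prints the three rows listed in the question, largest values first.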

Related

What will be the output of the following pseudo code for input 7?

Please help me understand the following code and what its possible output will be.
What will be the output of the following pseudo code for input 7?
1. Input n
2. Set m = 1, T = 0
3. if (m > n)
4.     Go to step 9
5. else
6.     T = T + m
7.     m = m + 1
8. Go to step 3
9. Print T

0
n is less than n so go to step 9 which is print T which is equal to 0 as set in step 2.

T should be 28. The loop runs until m > 7 (since n = 7), and each iteration adds the current m to T and then increments m by 1. T starts at 0, so the result is 1 + 2 + 3 + ... + 7 = 28. The loop ends at that point because m is now 8.

m takes the values 1, 2, 3, 4, 5, 6, 7. When m = 8, the test m > n is true and control goes to step 9.
T = T + m takes the values 1, 3, 6, 10, 15, 21, 28. In other words, T is a series in which each term adds the next m (2, 3, 4, 5, 6, 7) to the previous term.
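The trace above can be confirmed by translating the pseudo code into a loop. This is a minimal Python 3 sketch; the function name run_pseudocode is hypothetical, and the "go to" steps are expressed as a while loop.

```python
def run_pseudocode(n):
    """Follow steps 2 to 9 of the pseudo code and return the printed T."""
    m, t = 1, 0          # step 2
    while not m > n:     # step 3: leave the loop once m > n
        t = t + m        # step 6
        m = m + 1        # step 7
    return t             # step 9


print(run_pseudocode(7))
```

For input 7 the loop body runs seven times, so the printed value is 28, not 0.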

Downscale array for decimal factor

Is there an efficient way to downscale the number of elements in an array by a decimal factor?
I want to reduce the number of elements in an array by a certain factor.
Example:
If I have 10 elements and need to scale down by factor 2.
1 2 3 4 5 6 7 8 9 10
scaled to
1.5 3.5 5.5 7.5 9.5
by grouping the elements two by two and taking the arithmetic mean of each group.
My problem is: what if I need to downsize an array with 10 elements to 6 elements? In theory I should group about 1.67 (10/6) elements at a time and take their arithmetic mean, but how do I do that?

Before suggesting a solution, let's define "downsize" in a more formal way. I would suggest this definition:
Downsizing starts with an array a[N] and produces an array b[M] such that the following is true:
M <= N - otherwise it would be upsizing, not downsizing
SUM(b) = (M/N) * SUM(a) - The sum is reduced proportionally to the number of elements
Elements of a participate in computation of b in the order of their occurrence in a
Let's consider your example of downsizing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 to six elements. The total for your array is 55, so the total for the new array would be (6/10)*55 = 33. We can achieve this total in two steps:
Walk the array a, totaling its elements until the count reaches the integer part of N/M (the fraction N/M is at least 1 by the first rule above).
Say a[i] is the last element of a that can be taken whole in the current iteration. Take the portion of a[i+1] equal to the fractional part of N/M.
Continue to the next element of b, starting with the remaining portion of a[i+1].
Once you are done, your array b contains M numbers that total SUM(a). Walk b once more and divide each element by N/M (that is, multiply by M/N).
Here is how it works with your example:
b[0] = a[0] + (2/3)*a[1] = 2.33333
b[1] = (1/3)*a[1] + a[2] + (1/3)*a[3] = 5
b[2] = (2/3)*a[3] + a[4] = 7.66666
b[3] = a[5] + (2/3)*a[6] = 10.6666
b[4] = (1/3)*a[6] + a[7] + (1/3)*a[8] = 13.3333
b[5] = (2/3)*a[8] + a[9] = 16
--------
Total = 55
Scaling down by 6/10 produces the final result:
1.4 3 4.6 6.4 8 9.6 (Total = 33)
Here is a simple implementation in C++:
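// Assumes a and b are std::vector<double>, b is zero-initialized, and b.size() <= a.size().
double need = ((double)a.size()) / b.size();
double have = 0;
size_t pos = 0;
for (size_t i = 0 ; i != a.size() ; i++) {
    if (need >= have+1) {
        b[pos] += a[i]; // a[i] fits entirely in the current element of b
        have++;
    } else {
        double frac = (need-have); // frac is less than 1 because of the "if" condition
        b[pos++] += frac * a[i]; // frac of a[i] goes to current element of b
        have = 1 - frac;
        if (pos < b.size()) { // guard: rounding on the last element can push pos past the end
            b[pos] += have * a[i]; // (1-frac) of a[i] goes to the next position of b
        }
    }
}
for (size_t i = 0 ; i != b.size() ; i++) {
    b[i] /= need; // scale so that SUM(b) = (M/N) * SUM(a)
}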
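The same weighting can be written more compactly by computing, for each output element, how much of each input element falls inside its window. The following Python 3 sketch takes that per-window approach instead of the single pass used above; the function name downsize is hypothetical, and the window boundaries are computed as j * N / M to limit rounding error.

```python
def downsize(a, m):
    """Shrink list a to m elements by averaging over windows of width len(a)/m."""
    n = len(a)
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= len(a)")
    width = n / m
    b = []
    for j in range(m):
        start = j * n / m          # left edge of the window, in units of elements
        end = (j + 1) * n / m      # right edge of the window
        total = 0.0
        i = int(start)
        while i < n and i < end:
            # Portion of a[i] (which spans [i, i+1)) that lies inside the window.
            overlap = min(end, i + 1) - max(start, i)
            total += overlap * a[i]
            i += 1
        b.append(total / width)
    return b


result = downsize([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 6)
print([round(v, 4) for v in result])
```

For the ten-element example this reproduces the values 1.4, 3, 4.6, 6.4, 8, 9.6 up to floating-point rounding, and with m = 5 it reduces to the plain pairwise means.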

You will need to resort to some form of interpolation, as the number of elements to average is not an integer.
You can consider computing the prefix sum of the array, i.e.
0 1 2 3 4 5 6 7 8 9
1 2 3 4 5 6 7 8 9 10
yields by summation
0 1 2 3 4 5 6 7 8 9
1 3 6 10 15 21 28 36 45 55
Then perform linear interpolation to get the intermediate values that you are lacking, like at 0*, 10/6, 20/6, 30/6*, 40/6, 50/6, 60/6*. (Those with an asterisk are readily available).
0 1 10/6 2 3 20/6 4 5 6 40/6 7 8 50/6 9
1 3 15/3 6 10 35/3 15 21 28 100/3 36 45 145/3 55
Now you get fractional sums by subtracting values in pairs. The first average is
(15/3-1)/(10/6) = 12/5
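Here is a minimal Python 3 sketch of the prefix-sum idea. It makes one assumption that differs from the tables above: the running total is 0 at position 0 and reaches SUM(a) at position N, so position p means "the sum of the first p elements". With that indexing the first average for the ten-element example comes out as 1.4 rather than 12/5. The function name downsize_prefix is hypothetical.

```python
from itertools import accumulate


def downsize_prefix(a, m):
    """Downsize a to m elements using an interpolated prefix sum."""
    n = len(a)
    # prefix[i] is the sum of the first i elements, so prefix[0] == 0.
    prefix = [0.0] + list(accumulate(a))

    def interpolated_sum(pos):
        # Linear interpolation between prefix[i] and prefix[i + 1].
        i = min(int(pos), n - 1)
        return prefix[i] + (pos - i) * a[i]

    width = n / m
    cuts = [interpolated_sum(j * n / m) for j in range(m + 1)]
    # Subtract neighbouring cut values to get window sums, then average.
    return [(cuts[j + 1] - cuts[j]) / width for j in range(m)]


values = downsize_prefix([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 6)
print([round(v, 4) for v in values])
```

Once the prefix sum is built, each output element costs a constant amount of work, which is the appeal of this formulation.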

I can't think of anything in the C++ library that will crank out something like this, all fully cooked and ready to go.
So you'll have to, pretty much, roll up your sleeves and go to work. At this point, the question of what's the "efficient" way of doing it boils down to its very basics. Which means:
1) Calculate how big the output array should be. Based on the description of the issue, you should be able to make that calculation even before looking at the values in the input array. You know the input array's size(), you can calculate the size() of the destination array.
2) So, you resize() the destination array up front. Now, you no longer need to worry about the time wasted in growing the size of the dynamic output array, incrementally, as you go through the input array, making your calculations.
3) So what's left is the actual work: iterating over the input array, and calculating the downsized values.
auto b=input_array.begin();
auto e=input_array.end();
auto p=output_array.begin();
Don't see many other options here, besides brute force iteration and calculations. Iterate from b to e, getting your samples, calculating each downsized value, and saving the resulting value into *p++.

Array elements not getting edited in a Python List

sticks = int(raw_input());
stickList= map(int,raw_input().split()) ;
stickList = sorted(stickList);
for i in xrange(0,len(stickList)):
    stickList[i] = stickList[i]-stickList[0];
print stickList;
Given Input is :
6
5 4 4 2 2 8
Why is the output [0, 2, 4, 4, 5, 8]
instead of [0, 0, 2, 2, 3, 6]?

That is because you are changing the values of stickList itself inside the for loop.
After the first iteration, stickList[0] is 0, so the remaining iterations subtract 0.
As ShadowRanger mentioned, iterating over the list in reverse will do:
stickList = map(int, "5 4 4 2 2 8".split())
stickList.sort()
for i in reversed(xrange(len(stickList))):
    stickList[i] -= stickList[0]
print stickList
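Another way to avoid the problem is to save the smallest value before changing anything. The sketch below uses Python 3 syntax (the code above is Python 2), and the function name subtract_smallest is hypothetical.

```python
def subtract_smallest(sticks):
    """Return the sorted values with the smallest value subtracted from each."""
    ordered = sorted(sticks)
    smallest = ordered[0]  # captured once, so later updates cannot affect it
    return [s - smallest for s in ordered]


print(subtract_smallest([5, 4, 4, 2, 2, 8]))
```

Because the result is built as a new list, the order of iteration no longer matters.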

Mapping the subdivision of a matrix to a vector

I am trying to map the subdivision of a matrix to an array.
By subdivision of a matrix I mean a box like the 3x3 boxes in a 9x9 sudoku matrix.
To do so I use :
grid[x][y] = box[x/3 + (y/3)*3];
But it does not work. Any suggestions for a solution, and an explanation of why it does not work?
EDIT:
I know how to map a vector to a matrix.
I want to map a vector to a portion of a square matrix, just like in the sudoku game.
EDIT2:
Basically, what I want is to be able to map a tuple to a box number,
for example with 3x3 boxes and a 9x9 matrix
(0,0) => 1
(0,1) => 1
(8,8) => 9

Updated Answer to Edit2:
If you want a mapping like:
1 2 3
4 5 6
7 8 9
then your original code is almost what you want (just add 1):
for (int y = 0; y < 9; ++y)
{
    for (int x = 0; x < 9; ++x)
    {
        int index = x/3 + (y/3) * 3 + 1;
        printf("%d ", index);
    }
    printf("\n");
}
Which outputs:
1 1 1 2 2 2 3 3 3
1 1 1 2 2 2 3 3 3
1 1 1 2 2 2 3 3 3
4 4 4 5 5 5 6 6 6
4 4 4 5 5 5 6 6 6
4 4 4 5 5 5 6 6 6
7 7 7 8 8 8 9 9 9
7 7 7 8 8 8 9 9 9
7 7 7 8 8 8 9 9 9
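The same formula as a small Python 3 function, checked against the three examples from the question. The name box_index is hypothetical, and // is Python's integer division, which plays the role of the integer / in the C code above.

```python
def box_index(x, y, box_size=3):
    """Map cell (x, y) of a 9x9 grid to its 1-based 3x3 box number."""
    boxes_per_row = 3
    return x // box_size + (y // box_size) * boxes_per_row + 1


for cell in [(0, 0), (0, 1), (8, 8)]:
    print(cell, "=>", box_index(*cell))
```

The three cells map to boxes 1, 1 and 9, as requested in the question.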

Most efficient sorting algorithm to continuously sort a vector<vector<double>>

What is the fastest algorithm to keep a
vector<vector<double>>
continuously "merge" sorted being able to handle updates in realtime?
For example, at T0 the vector<vector<double>> is empty.
At T1 the following vectors arrive (in fact, only one vector<double> comes in at a time):
A = 1, 2, 4
B = 1, 3, 4, 5
C = 6, 7
The vector<vector> gets merge-sorted into,
1
1
2
3
4
4
5
6
7
At T2
C = 0, 4
D = 3, 7
The new list would be
0
1
1
2
3
3
4
4
4
5
7
So first we have to remove the old values of C, then "insert" the new values of C in the correct positions.
A function such as AVL_Tree Func(vector<vector<double>> vecvec, vector<double> newVec) that returns a tree would seem to be best. Would an AVL tree work? Can someone show me a templatized C++ version of code that would work? Using Boost, the STL, etc. is fine.
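To make the update rule concrete, here is a minimal Python 3 sketch of the remove-then-insert step on a plain sorted list. It is not the requested C++ tree: the class name MergedView and its members are hypothetical, and because a Python list shifts elements on insertion and deletion, each value costs O(n) here, whereas a balanced tree would bring that down to O(log n).

```python
from bisect import bisect_left, insort


class MergedView:
    """Keep the values of several named vectors in one sorted list."""

    def __init__(self):
        self.sources = {}   # name -> values currently contributed by that vector
        self.merged = []    # all contributed values, kept sorted

    def update(self, name, values):
        # Remove the values this vector contributed previously, if any.
        for v in self.sources.get(name, []):
            del self.merged[bisect_left(self.merged, v)]
        # Insert the new values at their sorted positions.
        for v in values:
            insort(self.merged, v)
        self.sources[name] = list(values)


view = MergedView()
view.update("A", [1, 2, 4])
view.update("B", [1, 3, 4, 5])
view.update("C", [6, 7])
print(view.merged)   # state at T1
view.update("C", [0, 4])
view.update("D", [3, 7])
print(view.merged)   # state at T2
```

The two printed lists correspond to the merged sequences shown for T1 and T2 in the question.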
