Indicator for Top3 and ranking across rows - SAS

What I am trying to do is the following: I want to find out whether an observation (A) is in the top 3 across the other columns.
For example,
   A  B  C  D  E  F  G  H  TOP3-A
1  20 30 40 50 60 70 80 90 N
2  80 90 70 80  0  0  0  0 Y
3  70  0  0 80 90  0  0  0 Y
4  60 70 80 90  0  0  0  0 N
I am thinking transpose + rank + transpose + if <4 then Y else N; however, that seems too cumbersome, and to be honest, as a newbie I do not know how to code all these steps correctly...

Your method would work, but there's a much simpler way of doing it.
You could use an array, which reads across rows; however, I'm using an even easier way of reading across a row.
The OF keyword can be used in conjunction with a summary function to calculate values across a row rather than down a column. The LARGEST function returns the nth largest value from a range, so you can compare field A to the 3rd largest value in the row.
I've given you the answer that produces Y/N, plus an alternative that produces 1/0, which is even simpler.
data have;
input A B C D E F G H;
datalines;
20 30 40 50 60 70 80 90
80 90 70 80 0 0 0 0
70 0 0 80 90 0 0 0
60 70 80 90 0 0 0 0
;
run;
data want;
set have;
if A >= largest(3, of A--H) then top3_A = 'Y'; /* A--H references all columns between A and H */
else top3_A = 'N';
/* or */
top3_A2 = (A >= largest(3, of A--H)); /* returns 1 for true, 0 for false */
run;

Related

Sorting a 2-d vector but getting an unexpected output

I am trying to sort a 2-D vector, and I get the desired output for input n less than 15, but above that the result is not arranged in the order that I want. If all the first-column values are 0, then the second column should keep its increasing order.
#include <bits/stdc++.h>
using namespace std;

bool sortcol(const vector<long long int>& v1, const vector<long long int>& v2)
{
    return v1[0] < v2[0];
}

int main()
{
    int n;
    cin >> n;
    vector<vector<long long int>> arr(n, vector<long long int>(2));
    for (int i = 0; i < n; i++) {
        arr[i][0] = 0;
        arr[i][1] = i;
    }
    sort(arr.begin(), arr.end(), sortcol);
    for (int i = 0; i < n; i++) {
        cout << i << " - " << arr[i][0] << " , " << arr[i][1] << endl;
    }
}
The output I want is:
15 0
0 - 0 , 0
1 - 0 , 1
2 - 0 , 2
3 - 0 , 3
4 - 0 , 4
5 - 0 , 5
6 - 0 , 6
7 - 0 , 7
8 - 0 , 8
9 - 0 , 9
10 - 0 , 10
11 - 0 , 11
12 - 0 , 12
13 - 0 , 13
14 - 0 , 14
But what I am getting is:
50 0
0 - 0 , 38
1 - 0 , 26
2 - 0 , 27
3 - 0 , 28
4 - 0 , 29
5 - 0 , 30
6 - 0 , 31
7 - 0 , 32
8 - 0 , 33
9 - 0 , 34
10 - 0 , 35
11 - 0 , 36
12 - 0 , 37
13 - 0 , 25
14 - 0 , 39
15 - 0 , 40
16 - 0 , 41
17 - 0 , 42
18 - 0 , 43
19 - 0 , 44
20 - 0 , 45
21 - 0 , 46
22 - 0 , 47
23 - 0 , 48
24 - 0 , 49
25 - 0 , 13
26 - 0 , 1
27 - 0 , 2
28 - 0 , 3
29 - 0 , 4
30 - 0 , 5
31 - 0 , 6
32 - 0 , 7
33 - 0 , 8
34 - 0 , 9
35 - 0 , 10
36 - 0 , 11
37 - 0 , 12
38 - 0 , 0
39 - 0 , 14
40 - 0 , 15
41 - 0 , 16
42 - 0 , 17
43 - 0 , 18
44 - 0 , 19
45 - 0 , 20
46 - 0 , 21
47 - 0 , 22
48 - 0 , 23
49 - 0 , 24
I am running this code in VS Code.
As others have also noted in the comments, your sortcol() always returns false because v1[0] and v2[0] are always 0. Since the predicate sortcol() tells the sorting algorithm which elements are considered to be "smaller"/"less" than other elements, no element is considered smaller than another one. This implies that all elements are considered to be equal: If a<b is false and b<a is false, this implies a==b is true. In other words, the STL sorting algorithms assume a strict weak ordering, compare e.g. this post and this one.
So all your elements are considered to be equal by the sorting algorithm. The order of elements considered to be equal is implementation defined for std::sort(). Quote for std::sort:
The order of equal elements is not guaranteed to be preserved.
Hence, in your case the implementation is free to change the order of all elements as it sees fit, since all elements are considered to be equal. In practice, std::sort() switches to a different algorithm once the input reaches a certain size. For libstdc++ (the STL of gcc), this happens for n>16 (see the constant _S_threshold in the source code). That is the reason why you see a jump in behavior for n>16 with std::sort().
Other implementations of the STL might use other thresholds (e.g., the Microsoft STL seems to use a value of 32).
On the other hand, std::stable_sort() guarantees that the order of equal elements remains the same. Quote for std::stable_sort:
The order of equivalent elements is guaranteed to be preserved.
Of course, preserving the order is not free, and hence std::stable_sort can be slower.
So, if your sortcol() is really the predicate you want (although, in the example, it does not make much sense), using std::stable_sort() is the solution you are looking for.
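For reference, a minimal sketch of the fix applied to the program above; the only change is swapping std::sort() for std::stable_sort():
#include <bits/stdc++.h>
using namespace std;

bool sortcol(const vector<long long int>& v1, const vector<long long int>& v2)
{
    return v1[0] < v2[0];
}

int main()
{
    int n;
    cin >> n;
    vector<vector<long long int>> arr(n, vector<long long int>(2));
    for (int i = 0; i < n; i++) {
        arr[i][0] = 0;
        arr[i][1] = i;
    }
    // stable_sort guarantees that equal elements keep their relative order,
    // so the second column stays 0, 1, 2, ... for any n
    stable_sort(arr.begin(), arr.end(), sortcol);
    for (int i = 0; i < n; i++) {
        cout << i << " - " << arr[i][0] << " , " << arr[i][1] << endl;
    }
}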

Categorise the number of values in a list

I have the list [5,15,25,27,30,39,45,50,55]
How do I code the categorisation of the values in a list, where the categories are groups of roughly 10, such that I get the following result? I have absolutely no clue where to start, and I am only just learning to code.
0 - 9 =1
10 - 19 =1
20 - 29 =2
30 - 39 =2
40 - 49 =1
50 - 59 =2
Thank you.
(I did think of something like
if list[1] > 0 and < 10 make Group[1] == 1
list[2] > 0 and < 10 make Group[1] == 2
but this was going to generate LOADS of bulky code )
In your example, the number to the left of the = is the value's group: x // 10 picks the group, and multiplying back by 10 gives the range bounds. So accumulate the group counts in a dictionary:
from collections import defaultdict

a = [5, 15, 25, 27, 30, 39, 45, 50, 55]
b = defaultdict(int)
for x in a:
    b[x // 10] += 1  # integer division by 10 picks the group
for k, v in b.items():
    print("%d - %d =%d" % (k * 10, k * 10 + 9, v))
result:
0 - 9 =1
10 - 19 =1
20 - 29 =2
30 - 39 =2
40 - 49 =1
50 - 59 =2

Density of fractions between 2 given numbers

I'm trying to do some analysis over a simple Fraction class and I want some data to compare that type with doubles.
The problem
Right now I'm looking for a good way to get the density of Fractions between 2 numbers. A Fraction is basically 2 integers (e.g. pair<long, long>), and the density between s and t is the amount of representable numbers in that range. It needs to be exact, or a very good approximation, computed in O(1) or at least very fast.
To make it a bit simpler, let's say I want all the distinct values a/b (not all fraction representations) between s and t, where 0 <= s <= a/b < t <= M, and 0 <= a, b <= M (b > 0; a and b are integers).
Example
If my fractions were of a data type which only count to 6 (M = 6), and I want the density between 0 and 1, the answer would be 12. Those numbers are:
0, 1/6, 1/5, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5, 5/6.
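A quick brute-force check of this example (a sketch, assuming C++17 for std::gcd):
// Count reduced fractions a/b with 0 <= a/b < 1 and 0 <= a, b <= M (b > 0).
#include <iostream>
#include <numeric> // std::gcd

int main() {
    const long M = 6;
    long count = 1; // the value 0 itself (0/1)
    for (long b = 2; b <= M; b++)
        for (long a = 1; a < b; a++)  // 0 < a/b < 1
            if (std::gcd(a, b) == 1)  // count reduced forms only, so each value once
                count++;
    std::cout << count << '\n';       // prints 12
}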
What I thought already
A very naive approach would be to cycle through all the possible fractions and count those which cannot be simplified. Something like:
long fractionsIn(double s, double t) {
    long density = 0;
    long M = LONG_MAX;
    for (long d = 1; d <= M; d++) {
        /* numerators n with s <= n/d < t and n <= M */
        for (long n = (long)ceil(d * s); n < d * t && n <= M; n++) {
            if (gcd(n, d) == 1) /* count only reduced forms, so each value once */
                density++;
        }
    }
    return density;
}
But this is far too slow with gcd() in the inner loop, so it doesn't work. I also tried doing some math, but I couldn't get to anything good.
Solution
Thanks to m69's answer, I made this code for Fraction = pair<long, long>:
// This should give the density of fractions between first and last, or slightly less.
double fractionsIn(unsigned long long first, unsigned long long last) {
    double pi = 3.141592653589793238462643383279502884;
    double max = LONG_MAX; // I can't use LONG_MAX directly
    double zeroToOne = max/pi * max/pi * 3; // = approx. amount of numbers in the Farey sequence of order LONG_MAX
    double res = 0;
    if (first == 0) {
        res = zeroToOne;
        first++;
    }
    for (double i = first; i < last; i++) {
        res += zeroToOne / (i * (i + 1)); // note the parentheses: i*(i+1), not i*i+1
        if (i == i + 1) // i is so large that incrementing it no longer changes it
            i = nextafter(i + 1, last); // I might skip some fractions here, but there is no other choice
    }
    return floor(res);
}
The main change is nextafter, which is important with big numbers (1e17)
The result
As I explained at the beginning, I was trying to compare Fractions with doubles. Here is the result for Fraction = pair<long, long> (and here is how I got the density of doubles):
Density between        Doubles               Fraction
0 , 1                  4607182418800017408   2.58584e+37
1 , 2                  4503599627370496      1.29292e+37
1e6 , 1e6+1            8589934592            2.58584e+25
1e14 , 1e14+1          64                    2.58584e+09
1e15-1 , 1e15          8                     2.58584e+07
1e17-10 , 1e17         1                     2585
1e19-10000 , 1e19      5                     1
1e19-1000 , 1e19       0                     0
Density between 0 and 1
If the integers with which you express the fractions are in the range 0~M, then the density of fractions between the values 0 (inclusive) and 1 (exclusive) is:
M: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0~(1): 1 2 4 6 10 12 18 22 28 32 42 46 58 64 72 80 96 102 120 128 140 150 172 180 200 212 230 242 270 278 308 ...
This is sequence A002088 on OEIS. If you scroll down to the formula section, you'll find information about how to approximate it, e.g.:
Φ(n) = (3/π²)·n² + O(n·(ln n)^(2/3)·(ln ln n)^(4/3))
(Unfortunately, no more detail is given about the constants involved in the O[x] part. See discussion about the quality of the approximation below.)
Distribution across range
The interval from 0 to 1 contains half of the total number of unique fractions that can be expressed with numbers up to M; e.g. this is the distribution when M = 15 (i.e. 4-bit integers):
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
72 36 12 6 4 2 2 2 1 1 1 1 1 1 1 1
for a total of 144 unique fractions. If you look at the sequence for different values of M, you'll see that the steps in this sequence converge:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1: 1 1
2: 2 1 1
3: 4 2 1 1
4: 6 3 1 1 1
5: 10 5 2 1 1 1
6: 12 6 2 1 1 1 1
7: 18 9 3 2 1 1 1 1
8: 22 11 4 2 1 1 1 1 1
9: 28 14 5 2 2 1 1 1 1 1
10: 32 16 5 3 2 1 1 1 1 1 1
11: 42 21 7 4 2 2 1 1 1 1 1 1
12: 46 23 8 4 2 2 1 1 1 1 1 1 1
13: 58 29 10 5 3 2 2 1 1 1 1 1 1 1
14: 64 32 11 5 4 2 2 1 1 1 1 1 1 1 1
15: 72 36 12 6 4 2 2 2 1 1 1 1 1 1 1 1
Not only is the density between 0 and 1 half of the total number of fractions, but the density between 1 and 2 is a quarter, and the density between 2 and 3 is close to a twelfth, and so on.
As the value of M increases, the distribution of fractions across the ranges 0-1, 1-2, 2-3 ... converges to:
1/2, 1/4, 1/12, 1/24, 1/40, 1/60, 1/84, 1/112, 1/144, 1/180, 1/220, 1/264 ...
This sequence can be calculated by starting with 1/2 and then:
0-1: 1/2 x 1/1 = 1/2
1-2: 1/2 x 1/2 = 1/4
2-3: 1/4 x 1/3 = 1/12
3-4: 1/12 x 2/4 = 1/24
4-5: 1/24 x 3/5 = 1/40
5-6: 1/40 x 4/6 = 1/60
6-7: 1/60 x 5/7 = 1/84
7-8: 1/84 x 6/8 = 1/112
8-9: 1/112 x 7/9 = 1/144 ...
You can of course calculate any of these values directly, without needing the steps in between:
0-1: 1/2
6-7: 1/2 x 1/6 x 1/7 = 1/84
(Also note that the second half of the distribution sequence consists of 1's; these are all the integers divided by 1.)
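A minimal sketch that computes these limiting shares directly (the function name is mine):
#include <cstdio>

// Limiting fraction of all representable values that falls in interval i ~ i+1.
double share(long i) {
    if (i == 0) return 0.5;              // interval 0 ~ 1 holds half the values
    return 0.5 / (double)(i * (i + 1));  // 1/2 x 1/i x 1/(i+1)
}

int main() {
    for (long i = 0; i < 9; i++)
        std::printf("%ld-%ld: %g\n", i, i + 1, share(i)); // 0.5, 0.25, 0.0833333, ...
}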
Approximating the density in given interval
Using the formulas provided on the OEIS page, you can calculate or approximate the density in the interval 0-1; multiplied by 2, this is the total number of unique values that can be expressed as fractions.
Given two values s and t, you can then calculate and sum the densities in the intervals s ~ s+1, s+1 ~ s+2, ... t-1 ~ t, or use an interpolation to get a faster but less precise approximate value.
Example
Let's assume that we're using 10-bit integers, capable of expressing values from 0 to 1023. Using this table linked from the OEIS page, we find that the density between 0~1 is 318452, and the total number of fractions is 636904.
If we wanted to find the density in the interval s~t = 100~105:
100~101: 1/2 x 1/100 x 1/101 = 1/20200 ; 636904/20200 = 31.53
101~102: 1/2 x 1/101 x 1/102 = 1/20604 ; 636904/20604 = 30.91
102~103: 1/2 x 1/102 x 1/103 = 1/21012 ; 636904/21012 = 30.31
103~104: 1/2 x 1/103 x 1/104 = 1/21424 ; 636904/21424 = 29.73
104~105: 1/2 x 1/104 x 1/105 = 1/21840 ; 636904/21840 = 29.16
Rounding these values gives the sum:
32 + 31 + 30 + 30 + 29 = 152
A brute force algorithm gives this result:
32 + 32 + 30 + 28 + 28 = 150
So we're off by 1.33% for this low value of M and small interval with just 5 values. If we had used linear interpolation between the first and last value:
100~101: 31.53
104~105: 29.16
average: 30.345
total: 151.725 -> 152
we'd have arrived at the same value. For larger intervals, the sum of all the densities will probably be closer to the real value, because rounding errors will cancel each other out, but the results of linear interpolation will probably become less accurate. For ever larger values of M, the calculated densities should converge with the actual values.
Quality of approximation of Φ(n)
Using this simplified formula:
Φ(n) = (3/π²)·n²
the results are almost always smaller than the actual values, but they are within 1% for n ≥ 182, within 0.1% for n ≥ 1880 and within 0.01% for n ≥ 19494. I would suggest hard-coding the lower range (the first 50,000 values can be found here), and then using the simplified formula from the point where the approximation is good enough.
Here's a simple code example with the first 182 values of Φ(n) hard-coded. The approximation of the distribution sequence seems to add an error of a similar magnitude as the approximation of Φ(n), so it should be possible to get a decent approximation. The code simply iterates over every integer in the interval s~t and sums the fractions. To speed up the code and still get a good result, you should probably calculate the fractions at several points in the interval, and then use some sort of non-linear interpolation.
function fractions01(M) {
var phi = [0,1,2,4,6,10,12,18,22,28,32,42,46,58,64,72,80,96,102,120,128,140,150,172,180,200,212,230,242,270,278,308,
324,344,360,384,396,432,450,474,490,530,542,584,604,628,650,696,712,754,774,806,830,882,900,940,964,1000,
1028,1086,1102,1162,1192,1228,1260,1308,1328,1394,1426,1470,1494,1564,1588,1660,1696,1736,1772,1832,1856,
1934,1966,2020,2060,2142,2166,2230,2272,2328,2368,2456,2480,2552,2596,2656,2702,2774,2806,2902,2944,3004,
3044,3144,3176,3278,3326,3374,3426,3532,3568,3676,3716,3788,3836,3948,3984,4072,4128,4200,4258,4354,4386,
4496,4556,4636,4696,4796,4832,4958,5022,5106,5154,5284,5324,5432,5498,5570,5634,5770,5814,5952,6000,6092,
6162,6282,6330,6442,6514,6598,6670,6818,6858,7008,7080,7176,7236,7356,7404,7560,7638,7742,7806,7938,7992,
8154,8234,8314,8396,8562,8610,8766,8830,8938,9022,9194,9250,9370,9450,9566,9654,9832,9880,10060];
if (M < 182) return phi[M];
return Math.round(M * M * 0.30396355092701331433 + M / 4); // experimental; see below
}
function fractions(M, s, t) {
var half = fractions01(M);
var frac = (s == 0) ? half : 0;
for (var i = (s == 0) ? 1 : s; i < t && i <= M; i++) {
if (2 * i < M) {
var f = Math.round(half / (i * (i + 1)));
frac += (f < 2) ? 2 : f;
}
else ++frac;
}
return frac;
}
var M = 1023, s = 100, t = 105;
document.write(fractions(M, s, t));
Comparing the approximation of Φ(n) with the list of the 50,000 first values suggests that adding M÷4 is a workable substitute for the second part of the formula; I have not tested this for larger values of n, so use with caution.
[Graph: blue = simplified formula; red = improved simplified formula.]
Quality of approximation of distribution
Comparing the results for M=1023 with those of a brute-force algorithm, the errors are small in real terms, never more than -7 or +6, and above the interval 205~206 they are limited to -1 ~ +1. However, a large part of the range (57~1024) has fewer than 100 fractions per integer, and in the interval 171~1024 there are only 10 fractions or fewer per integer. This means that small errors and rounding errors of -1 or +1 can have a large impact on the result, e.g.:
interval: 241 ~ 250
fractions/integer: 6
approximation: 5
total: 50 (instead of 60)
To improve the results for intervals with few fractions per integer, I would suggest combining the method described above with a separate approach for the last part of the range:
Alternative method for last part of range
As already mentioned, and implemented in the code example, the second half of the range, M÷2 ~ M, has 1 fraction per integer. Also, the interval M÷3 ~ M÷2 has 2; the interval M÷4 ~ M÷3 has 4. This is of course the Φ(n) sequence again:
M/2 ~ M : 1
M/3 ~ M/2: 2
M/4 ~ M/3: 4
M/5 ~ M/4: 6
M/6 ~ M/5: 10
M/7 ~ M/6: 12
M/8 ~ M/7: 18
M/9 ~ M/8: 22
M/10 ~ M/9: 28
M/11 ~ M/10: 32
M/12 ~ M/11: 42
M/13 ~ M/12: 46
M/14 ~ M/13: 58
M/15 ~ M/14: 64
M/16 ~ M/15: 72
M/17 ~ M/16: 80
M/18 ~ M/17: 96
M/19 ~ M/18: 102 ...
Between these intervals, one integer can have a different number of fractions, depending on the exact value of M, e.g.:
interval fractions
202 ~ 203 10
203 ~ 204 10
204 ~ 205 9
205 ~ 206 6
206 ~ 207 6
The interval 204 ~ 205 lies on the edge between intervals, because M ÷ 5 = 204.6; it has 6 + 3 = 9 fractions because M modulo 5 is 3. If M had been 1022 or 1024 instead of 1023, it would have 8 or 10 fractions. (This example is straightforward because 5 is a prime; see below.)
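These edge-interval counts are cheap to verify by brute force for a given M. A minimal sketch (assuming C++17 for std::gcd) that counts the distinct reduced fractions n/d in each interval i ~ i+1 for M = 1023:
#include <cstdio>
#include <numeric> // std::gcd

int main() {
    const long M = 1023;
    for (long i = 202; i < 207; i++) {
        long count = 0;
        for (long d = 1; d <= M / i; d++)  // need d*i <= M for any valid numerator
            for (long n = d * i; n < d * (i + 1) && n <= M; n++)
                if (std::gcd(n, d) == 1)   // reduced forms only
                    count++;
        std::printf("%ld ~ %ld: %ld\n", i, i + 1, count); // 10, 10, 9, 6, 6
    }
}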
Again, I would suggest using the hard-coded values for Φ(n) to calculate the number of fractions for the last part of the range. If you use the first 17 values as listed above, this covers the part of the range with fewer than 100 fractions per integer, so that would reduce the impact of rounding errors below 1%. The first 56 values would give you 0.1%, the first 182 values 0.01%.
Together with the values of Φ(n), you could hard-code the number of fractions of the edge intervals for each modulo value, e.g.:
modulo: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
M/ 2 1 2
M/ 3 2 3 4
M/ 4 4 5 5 6
M/ 5 6 7 8 9 10
M/ 6 10 11 11 11 11 12
M/ 7 12 13 14 15 16 17 18
M/ 8 18 19 19 20 20 21 21 22
M/ 9 22 23 24 24 25 26 26 27 28
M/10 28 29 29 30 30 30 30 31 31 32
M/11 32 33 34 35 36 37 38 39 40 41 42
M/12 42 43 43 43 43 44 44 45 45 45 45 46
M/13 46 47 48 49 50 51 52 53 54 55 56 57 58
M/14 58 59 59 60 60 61 61 61 61 62 62 63 63 64
M/15 64 65 66 66 67 67 67 68 69 69 69 70 70 71 72
M/16 72 73 73 74 74 75 75 76 76 77 77 78 78 79 79 80
M/17 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96
M/18 96 97 97 97 97 98 98 99 99 99 99 100 100 101 101 101 101 102
This is exactly the same as the sum of phi(k) for 0 <= k <= M, where phi(k) is the Euler totient function and phi(0) = 1 (as defined by the problem). There is no known closed form for this sum; however, there are many optimizations, as mentioned in the wiki link. It is known as the Totient Summatory Function on Wolfram MathWorld, which also links to the series A002088 and provides a few asymptotic approximations.
The reasoning is this: consider the values of the form {1/M, 2/M, ..., (M-1)/M, M/M}. Any fraction that is reducible to a smaller denominator is not counted in phi(M), because its numerator is not relatively prime to M; it appears instead in the summation for another totient.
For example, with M = 6 the sum phi(1) + ... + phi(6) = 1 + 1 + 2 + 2 + 4 + 2 = 12, which matches the example above: the 0 is counted in place of the fraction 1/1.
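A minimal sketch of that summation, computing phi(k) for every k up to M with a sieve (feasible for moderate M; the question's M = LONG_MAX is of course out of reach this way):
#include <cstdio>
#include <vector>

int main() {
    const long M = 1023;
    // phi sieve: after the loops, phi[k] holds Euler's totient of k
    std::vector<long> phi(M + 1);
    for (long i = 0; i <= M; i++) phi[i] = i;
    for (long p = 2; p <= M; p++)
        if (phi[p] == p)                     // p was never reduced => p is prime
            for (long k = p; k <= M; k += p)
                phi[k] -= phi[k] / p;        // multiply phi[k] by (1 - 1/p)
    long long sum = 0;                       // totient summatory function Phi(M)
    for (long k = 1; k <= M; k++) sum += phi[k];
    // Phi(M) equals the density between 0 and 1 (the 0 stands in for 1/1);
    // for M = 1023 this should match the 318452 quoted in the answer above.
    std::printf("Phi(%ld) = %lld\n", M, sum);
}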

SAS proc optmodel: trying to find the optimal cut-off

I'm new to proc optmodel and would appreciate any help to solve the problem at hand.
Here's my problem:
My dataset is like below:
data my_data;
input A B C;
cards;
0 240 3
3.4234 253 2
0 258 7
0 272 4
0 318 7
0 248 8
0 260 2
0.2555 305 5
0 314 5
1.7515 235 7
32 234 4
0 301 3
0 293 5
0 302 12
0 234 2
0 258 4
0 289 2
0 287 10
0 313 3
0.7725 240 7
0 268 3
1.4411 286 9
0 234 13
0.0474 318 2
0 315 4
0 292 5
0.4932 272 3
0 288 4
0 268 4
0 284 6
0 270 4
50.9188 293 3
0 272 3
0 284 2
0 307 3
;
run;
There are 3 variables (A, B, C), and I want to classify observations into three classes (H, M, L) based on these 3 variables.
For class H, I want to maximize A and minimize B and C;
for class M, I want A, B and C to be near their medians;
for class L, I want to minimize A and maximize B and C.
Also, the constraint is that I want to limit the total observations classified into H to less than 5%, and the total classified into M to less than 7%.
The final target is finding the cut-offs of A, B, C for classifying obs into the three different classes.
Since the three classes are equally weighted, I scaled the vars first and created a risk var, where risk = A + (1-B) + (1-C);
Thanks in advance for any help.
My SAS code:
proc stdize data=my_data out=my_data1 method=RANGE;
var A B C;
run;
data new;
set my_data1;
risk = A+(1-B)+(1-C);
run;
proc sort data=new out=range;
by risk;
run;
proc optmodel;
/* read data */
set CUTOFF;
/* str risk_level {CUTOFF}; */
num a {CUTOFF};
num b {CUTOFF};
num c {CUTOFF};
read data my_data1 into CUTOFF=[_n_] a=A b=B c=C;
impvar risk{p in CUTOFF} = a[p]+(1-b[p])+(1-c[p]);
var indh {CUTOFF} binary;
var indmh {CUTOFF} binary;
var indo {CUTOFF} binary;
con sum{p in CUTOFF} indh[p] le 10;
con sum{p in CUTOFF} indmh[p] le 6;
con sum{p in CUTOFF} indo[p] le 19;
con class{p in CUTOFF}:indh[p]+indmh[p]+indo[p] le 1;
max new = sum{p in CUTOFF}(10*indh[p]+4*indmh[p]+indo[p])*risk[p];
solve;
print a b c risk indh indmh indo new;
quit;
So now my problem is how to find the min risk value in each class. Thanks!

Bash: expand a list of coordinates (sed?)

I have a list of simple coordinates (longitude, latitude pairs) like
110 30
-120 0
130 -30
0 30
and try to expand it to this:
110 30 110\272E 30\272N 110 30 LON0
-120 0 120\272W 0\272 -120 0 LON0
130 -30 130\272E 30\272S 130 -30 LON0
0 30 0\272 30\272N 0 30 LON0
Examining the first line:
110 30 110\272E 30\272N 110 30 LON0
110 30 The first two values just stay the same
110\272E the third value is basically the first value with an added (octal \272) degree symbol and an E for positive values or a W for negative values
30\272N similar to the third value, this is the latitude with an added degree symbol and an N for positive and an S for negative values.
110 30 is just a repetition of the first two values
LON0 is a fixed string for later replacement.
Things tried so far:
I played around with sed, but was unable to achieve anything remotely useful. I wasn't able to manipulate the matched values depending on them being negative or positive.
Any help is greatly appreciated.
All the best,
Chris
EDIT: @jaypal suggested adding the different possible cases that can occur. The original was only one case with minor deviations in value.
EDIT2: Had to adjust the example data because I had not updated all values in the sample data. My apologies.
Can you use awk? It will be very easy:
$ cat file
110 30
-120 0
130 -30
0 30
awk '
function abs(x) {
x = x > 0 ? x : x * -1
return x
}
{
print abs($1),abs($2), ($1>0?abs($1)"\272E":$1==0?$1"\272":abs($1)"\272W"), ($2>0?abs($2)"\272N":$2==0?$2"\272":abs($2)"\272S"), abs($1), abs($2), "LON0"
}' file
110 30 110ºE 30ºN 110 30 LON0
120 0 120ºW 0º 120 0 LON0
130 30 130ºE 30ºS 130 30 LON0
0 30 0º 30ºN 0 30 LON0
If you want to print \272 instead of º, just add another backslash to prevent it from being interpreted. So modify the above script and use \\272 wherever you see \272.
We print the fields as you desire in your output, and the following two expressions:
($1>0?$1"\272E":$1"\272W")
($2>0?$2"\272N":$2"\272S")
are ternary operators that check whether the values are positive. If the first is positive, use E, else W. If the second is positive, use N, else use S.
Update:
awk '
function abs(x) {
x = x > 0 ? x : x * -1
return x
}
{
print $1,$2,($1>0?$1"\\272E":$1==0?$1"\\272":abs($1)"\\272W"),($2>0?$2"\\272N":$2==0?$2"\\272":abs($2)"\\272S"),$1,$2, "LON0"
}' file
110 30 110\272E 30\272N 110 30 LON0
-120 0 120\272W 0\272 -120 0 LON0
130 -30 130\272E 30\272S 130 -30 LON0
0 30 0\272 30\272N 0 30 LON0