I am looking at some Fortran code from an old scanned paper. The scan quality is not great so I may have copied it wrong. I tried to run this using an online Fortran compiler but it bombs out. Not being familiar with Fortran, I was wondering if someone can point out where the syntax does not make sense? The code is from a paper on sediment dynamics:
Komar, P.D. and Miller, M.C., 1975. On the comparison between the threshold of sediment motion under waves and unidirectional currents with a discussion of the practical evaluation of the threshold: Reply. Journal of Sedimentary Research, 45(1).
PROGRAM TSHOLD
REAL LI, LO
G = 981.0
PIE = 3.1416
RHOW = 1.00
READ (6O,1) DIAM, RHOS
1 FORMAT (2X, F6.3,2X, F5.3)
IF(DIAM .LT. 0.05) GO TO 5
A = 0.463 * PIE
B = 0.25
GO TO 7
5 A = 0.21
B = 0.50
7 PWR = 1.0 / (2.0 - B)
FAC = (A * (RHOS - RHOW) * G/(RHOW * PIE**B))**PWR
FAC1 = FAC * DIAM**((1.0 - B) * PWR)
T = 1.0
15 J = 1.20
LD = 156.13 * (T**2)
UM = FAC1 * T**(B*PWR)
WRITE(61,9) DIAM, T, UM
9 FORMAT(1H0, 10X, 17HGRAIN DIAMETER = ,F6.3,1X,2HCM //
1 11X, 14HWAVE PERIOD = ,F5.2, 1X, 3HSEC //
2 11X, 22HORBITAL VELOCITY, UM = ,F6.2, 1X, 6HCM/SECl //
3 20X, 6HHEIGHT, 5X, 5HDEPTH, 8X, 3HH/L, 6X, 7HH/DEPTH //
4 22X, 2HCM, 8X, 2HCM /)
C INCREMENT WAVE HEIGHT, CALCULATE DEPTH
H = 10.0
DO 12 K = 1.60
SING = PIE * H / (UM * T)
X = SING
IF(X.LT.1.0) GO TO 30
30 ASINH = X - 0.16666*X**3.0 + 0.07500* X ** 5.0 - 0.04464 * X ** 7.0
1 + 0.03038 * X ** 9.0 - 0.02237 * X ** 11.0
32 LI = LD * (SINH(ASINH)/COSH(ASINH))
OPTH = ASINH * LI / 6.2832
C CHECK WAVE STABILITY
RATIO = H / DPTH
IF(RATIO.GE.0.78) GO TO 11
STEEP = H / LI
TEST = 0.142 * (SINH(ASINH)/COSH(ASINH))
IF(STEEP.GE.TEST) GO TO 11
WRITE(61,10) H, OPTH, STEEP, RATIO
I0 FORMAT(IH0, 20X, F5.1, 4X, E9.3, 4X, F5.3, 4X, F4.2)
11 H = H + 10.0
12 CONTINUE
T = T + 1.0
15 CONTINUE
END
The problem is more likely that old Fortran requires fixed form code formatting where the number of spaces before a statement is very important.
Here are some general rules
Normal statements start at column 7 and beyond
Lines cannot exceed 72 columns
Any character placed on column 6 indicates the line is a continuation from the line above. I see that on the code above in the lines following 9 FORMAT(..
A number placed between columns 1-5 indicates a label, which can be a target of a GO TO statement, a DO statement or a formatting specification.
The character C on the first column, and sometimes any character on the first column indicate the line is a comment line.
see https://people.cs.vt.edu/~asandu/Courses/MTU/CS2911/fortran_notes/node4.html for more info.
Based on the rules above, here is how to enter the code, with the correct spacing. I run the F77 code through a converter to make it compatible with F90 and F77 at the same time. The code below might compile with the online compiler now.
PROGRAM TSHOLD
REAL LI, LO
G = 981.0
PIE = 3.1416
RHOW = 1.00
READ (60,1) DIAM, RHOS
1 FORMAT (2X, F6.3,2X, F5.3)
IF(DIAM .LT. 0.05) GO TO 5
A = 0.463 * PIE
B = 0.25
GO TO 7
5 A = 0.21
B = 0.50
7 PWR = 1.0 / (2.0 - B)
FAC = (A * (RHOS - RHOW) * G/(RHOW * PIE**B))**PWR
FAC1 = FAC * DIAM**((1.0 - B) * PWR)
T = 1.0
DO 15 J=1,20
LD = 156.13 * (T**2)
UM = FAC1 * T**(B*PWR)
WRITE(61,9) DIAM, T, UM
9 FORMAT(1H0, 10X, 17HGRAIN DIAMETER = ,F6.3,1X,2HCM // &
& 11X, 14HWAVE PERIOD = ,F5.2, 1X, 3HSEC // &
& 11X, 22HORBITAL VELOCITY, UM = ,F6.2, 1X, 6HCM/SECl // &
& 20X, 6HHEIGHT, 5X, 5HDEPTH, 8X, 3HH/L, 6X, 7HH/DEPTH // &
& 22X, 2HCM, 8X, 2HCM /)
! INCREMENT WAVE HEIGHT, CALCULATE DEPTH
H = 10.0
DO 12 K = 1,60
SING = PIE * H / (UM * T)
X = SING
IF(X.LT.1.0) GO TO 30
30 ASINH = X - 0.16666*X**3.0 + 0.07500* X ** 5.0 - 0.04464 * X ** 7.&
& + 0.03038 * X ** 9.0 - 0.02237 * X ** 11.0
32 LI = LD * (SINH(ASINH)/COSH(ASINH))
OPTH = ASINH * LI / 6.2832
! CHECK WAVE STABILITY
RATIO = H / DPTH
IF(RATIO.GE.0.78) GO TO 11
STEEP = H / LI
TEST = 0.142 * (SINH(ASINH)/COSH(ASINH))
IF(STEEP.GE.TEST) GO TO 11
WRITE(61,10) H, OPTH, STEEP, RATIO
10 FORMAT(G14.4, 20X, F5.1, 4X, E9.3, 4X, F5.3, 4X, F4.2)
11 H = H + 10.0
12 CONTINUE
T = T + 1.0
15 CONTINUE
END
I found several transcription errors, replacing commas with dots, zeros with the letter O, and a missing DO statement.
I want a new data set in which the variable y is equal to the value in the n row minus the lags values.
The original data set:
data test;
input x;
datalines;
20
40
2
5
74
;
run;
I used the dif function, but It returns the difference with a one lag:
data want;
set test;
y = dif(x);
run;
And I want:
_n_ = 1 y = 20
_n_ = 2 y = 40 - 20 = 20
_n_ = 3 y = 2 - (40 + 20) = -58
_n_ = 4 y = 5 - (2 + 40 + 20) = - 57
_n_ = 5 y = 74 - (5 + 2 + 40 + 20) = 7
Thanks.
No need for lag() or dif(). Just make another variable to retain the running total.
data want ;
set test;
y=x-cumm;
output;
cumm+x;
run;
I kept the extra column and output the values before updating the running total to make it clearer what value was used in the calculation of Y.
Obs x y cumm
1 20 20 0
2 40 20 20
3 2 -58 60
4 5 -57 62
5 74 7 67
Possible solution (thanks to Longfish for suggestions):
data want;
set test;
retain total 0;
total = total + x;
y = x - coalesce(lag(total), 0);
run;
I answered a SAS question a few minutes ago and realized there is a generalization that might be more useful than that one (here). I didn't see this question already in StackOverflow.
The general question is: How can you process and keep an entire BY-group based on some characteristic of the BY-group that you might not know until you have looked at all the observations in the group?
Using input data similar to that from the earlier question:
* For some reason, we are tasked with keeping only observations that
* are in groups of ID_1 and ID_2 that contain at least one obs with
* a VALUE of 0.;
* In the following data, the following ID and ID_2 groups should be
* kept:
* A 2 (2 obs)
* B 1 (3 obs)
* B 3 (2 obs)
* B 4 (1 obs)
* The resulting dataset will have 8 observations.;
data x;
input id $ id_2 value;
datalines;
A 1 1
A 1 1
A 1 1
A 2 0
A 2 1
B 1 0
B 1 1
B 1 3
B 2 1
B 3 0
B 3 0
B 4 0
C 2 4
;
run;
Double DoW loop solution:
data have;
input id $ id_2 value;
datalines;
A 1 1
A 1 1
A 1 1
A 2 0
A 2 1
B 1 0
B 1 1
B 1 3
B 2 1
B 3 0
B 3 0
B 4 0
C 2 4
;
run;
data want;
do _n_ = 1 by 1 until(last.id_2);
set have;
by id id_2;
flag = sum(flag,value=0);
end;
do _n_ = 1 to _n_;
set have;
if flag then output;
end;
drop flag;
run;
I've tested this against the point approach using ~55m rows and found no appreciable difference in performance. Dataset used:
data have;
do ID = 1 to 10000000;
do id_2 = 1 to ceil(ranuni(1)*10);
do value = floor(ranuni(2) * 5);
output;
end;
end;
end;
run;
My answer might not be the most efficient, especially for large datasets, and I'm interested in seeing other possible answers. Here it is:
* For some reason, we are tasked with keeping only observations that
* are in groups of ID_1 and ID_2 that contain at least one obs with
* a VALUE of 0.;
* In the following data, the following ID and ID_2 groups should be
* kept:
* A 2 (2 obs)
* B 1 (3 obs)
* B 3 (2 obs)
* B 4 (1 obs)
* The resulting dataset will have 8 observations.;
data x;
input id $ id_2 value;
datalines;
A 1 1
A 1 1
A 1 1
A 2 0
A 2 1
B 1 0
B 1 1
B 1 3
B 2 1
B 3 0
B 3 0
B 4 0
C 2 4
;
run;
* I realize the data are already sorted, but I think it is better
* not to assume they are.;
proc sort data=x;
by id id_2;
run;
data obstokeep;
keep id id_2 value;
retain startptr haszero;
* This SET statement reads through the dataset in sequence and
* uses the CUROBS option to obtain the observation number. In
* most situations, this will be the same as the _N_ automatic
* variable, but CUROBS is probably safer.;
set x curobs=myptr;
by id id_2;
* When this is the first observation in a BY-group, save the
* current observation number (pointer).
* Also initialize a flag variable that will become 1 if any
* obs contains a VALUE of 0;
* The variables are in a RETAIN statement, so they keep their
* values as the SET statement above is executed for each obs
* in the BY-group.;
if first.id_2
then do;
startptr=myptr;
haszero=0;
end;
* This statement is executed for each observation. We check
* whether VALUE is 0 and, if so, record that fact.;
if value = 0
then haszero=1;
* At the end of the BY-group, we check to see if there were
* any observations with VALUE = 0. If so, we go back using
* another SET statement, re-read them via direct access, and
* write them to the output dataset.
* (Note that if VALUE order is not relevant, you can gain a bit
* more efficiency by writing the current obs first, then going
* back to get the rest.);
if last.id_2 and haszero
then do;
* When LAST and FIRST at the same time, there is only one
* obs, so no need to backtrack, just output and go on.;
if first.id_2
then output obstokeep;
else do;
* Here we assume that the observations are sequential
* (which they will be for a sequential SET statement),
* so we re-read these observations using another SET
* statement with the POINT option for direct access
* starting with the first obs of the by-group (the
* saved pointer) and ending with the current one (the
* current pointer).;
do i=startptr to myptr;
set x point=i;
output obstokeep;
end;
end;
end;
run;
proc sql;
select a.*,b.value from (select id,id_2 from have where value=0)a left join have b
on a.id=b.id and a.id_2=b.id_2;
quit;
Here is my full problem:
Information:
*Max. total investment: $125
*Pay-off is the sum of the units bought x pay-off/unit
*Cost per investment: Buy-in cost + cost/unit x number of units if you buy at least one unit
*The cost is sum of the costs per investment
Constraints:
*You may not invest in both 2 and 5.
*You may invest in 1 only if you invest at least one of 2 and 3.
*You must invest at least two of 3,4,5.
*You may not invest more than max number of units.
Problem: Maximize profit : pay-off - cost
xi: # of units i ∈ {1,2,3,4,5}
yi=1 if xi>0 else yi=0
cost = sum{i in I} buyInCost_i * yi + cost-unit_i*xi
pay-off = sum{i in I} (pay-off/unit)_i*xi
profit = pay-off - cost
Maximize profit
Subject to
y2+y5 <= 1
y1<= y2+y3
y3+y4+y5 >= 2
x1<=5, x2<=4, x3<=5, x4<=7, x5<=3
cost<=125
Here is my question:
For example I have this binary variable y
yi=1 if xi>0 else yi=0 and i ∈ {1,2,3,4,5}
I declared i as a data set
set I;
data;
set I := 1 2 3 4 5;
I don't know how to add if else condition to y variable in glpk. Can you please help me out?
My modelling :
set I;
/*if x[i]>0 y[i]=1 else y[i]=0 ?????*/
var y{i in I}, binary;
param a{i in I};
/* buy-in cost of investment i */
param b{i in I};
/* cost per unit of investment i */
param c{i in I};
/* pay-off per unit of investment i */
param d{i in I};
/* max number of units of investment i */
var x{i in I} >=0;
/* Number of units that is bought of investment i */
var po := sum{i in I} c[i]*x[i];
var cost := sum{i in I} a[i]*y[i] + b[i]*x[i];
maximize profit: po-cost;
s.t. c1: y[2]+y[5]<=1;
s.t. c2: y[1]<y[2]+y[3];
s.t. c3: y[3]+y[4]+y[5]>=2;
s.t. c4: x[1]<=5
x[2]<=4
x[3]<=5
x[4]<=7
x[5]<=3;
s.t. c5: cost <=125;
s.t. c6{i in I}: M * y[i] > x[i]; // if condition of y[i]
set I := 1 2 3 4 5;
param a :=
1 25
2 35
3 28
4 20
5 40;
param b :=
1 5
2 7
3 6
4 4
5 8;
param c :=
1 15
2 25
3 17
4 13
5 18;
param d :=
1 5
2 4
3 5
4 7
5 3;
param M := 10000;
I am getting this syntax error:
problem.mod:21: syntax error in variable statement
Context: ...I } ; param d { i in I } ; var x { i in I } >= 0 ; var po :=
MathProg model processing error
You can't directly do that (there is no way to write 'directly' an if constraint in a LP).
However, there are workarounds for this.
For example, you can write:
M * yi > xi
where M is a large constant (greater than any value of xi).
This way:
if xi > 0, then the constraint is equivalent to yi > 0, that is yi == 1 since yi is binary (if M is large enough).
if xi == 0, then the constraint is always verified, and yi will be equal to 0 since your objective is increasing with yi and you are minimizing.
in both case, the constraint is equivalent to the if test.
I am trying to make some existing values to missing values (not deleting them).
Here is the basic structure of my data set.
I want to treat AGE and GENDER as missing whenever A is less than B. For example, when A=1 and B=3, I want to treat values of AGE and GENDER on the last two rows as missing (as shown on the data sets).
In my data both A and B go from 1 to 4 and have every combination of them.
Asterisks mean I have more data between them. Thanks in advance!
BEFORE
ID A B AGE GENDER
--------------
1 1 1 35 M
* * * * *
* * * * *
5 1 2 23 F
5 1 2 21 M
6 1 2 42 F
6 1 2 43 M
* * * * *
* * * * *
20 1 3 43 F
20 1 3 39 M
20 1 3 23 M
21 1 3 32 F
21 1 3 39 M
21 1 3 23 F
* * * * *
* * * * *
55 2 4 32 M
55 2 4 12 M
55 2 4 31 F
55 2 4 43 M
* * * * *
* * * * *
AFTER
ID A B AGE GENDER
--------------
1 1 1 35 M
* * * * *
* * * * *
5 1 2 23 F
5 1 2 . .
6 1 2 42 F
6 1 2 . .
* * * * *
* * * * *
20 1 3 43 F
20 1 3 . .
20 1 3 . .
21 1 3 32 F
21 1 3 . .
21 1 3 . .
* * * * *
* * * * *
55 2 4 32 M
55 2 4 12 M
55 2 4 . .
55 2 4 . .
* * * * *
* * * * *
How about now?
data temp;
retain idcount 0;
set olddata;
** Create an observation counter for each id **;
prev_id = lag(id);
if id ^= prev_id then idcount = 0;
idcount = idcount + 1;
run;
** Sort the obs by ID in reverse order **;
proc sort data=temp;
by id descending idcount;
run;
data temp2;
retain misscount 0;
set temp;
by id descending idcount;
** Keep the previous age and gender **;
old_age = age;
old_gender = gender;
** Count the number that should be missing **;
if a < b then nummiss = b - a;
else nummiss = 0;
** Set a counter of obs that we will set to missing **;
if first.id then misscount = 0;
** Set the appropriate number of rows to missing and update the counter **;
if misscount < nummiss then do;
misscount = misscount + 1;
call missing(age, gender);
end;
run;
proc sort data=temp2 out=temp3(drop=misscount nummiss idcount prev_id);
by id idcount;
run;