Please explain the value of j in output - sas

data primes;
length status $12.;
do i=3 to 6;
status='Prime';
do j=2 to i-1;
if mod(i, j) = 0 then do;
status='Composite';
leave; *exit loop;
end;
end;
Output;
end;
run;
proc print data = primes;
run;
I wrote this code and got the output below. Can someone please explain how the value of j is picked here? How can j=i in the output when the loop only goes up to j=i-1?
Obs  status     i  j
1    Prime      3  3
2    Composite  4  2
3    Prime      5  5
4    Composite  6  2

It has to do with the way the loop is stopped. The check is done at the top of the loop, after the index variable has been incremented. If the index is greater than the stop value, the loop stops, so j is left at i when the loop runs to completion. You could stop your loop with until(j eq i-1) and see the value you expect. The reason the DO loop tests for GT stop rather than equality is that the index may never take exactly the value of stop, for example when a BY increment steps over it.
Also note this is all in the documentation, under:
DO Statement, Iterative
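The behaviour described above can be checked with a small sketch. The loop bounds here are made up for illustration and are not taken from the question; the values in the comments follow from the rule that the index is incremented first and then compared with the stop value.

```sas
data _null_;
  /* Plain TO loop: j is incremented to 5, the test 5 > 4 ends the loop */
  do j = 2 to 4;
  end;
  put 'TO 4:' j=;            /* j=5 */

  /* UNTIL is tested at the bottom of the loop, before the increment */
  do j = 2 to 4 until (j = 4);
  end;
  put 'UNTIL (j = 4):' j=;   /* j=4 */

  /* With BY, the index never equals the stop value: 1, 5, 9, then 13 */
  do j = 1 to 10 by 4;
  end;
  put 'TO 10 BY 4:' j=;      /* j=13 */
run;
```

The last loop shows why the test has to be "greater than stop": an equality test against 10 would never succeed.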

mean of 10 variables with different starting point (SAS)

I have 19 numeric variables, pm25_total2000 to pm25_total2018.
Each person has a starting year between 2013 and 2018; we can call that variable "reqyear".
Now I want to calculate, for each person, the mean over the 10 years up to and including the starting year.
For example, if a person has starting year 2015 I want mean(of pm25_total2006-pm25_total2015).
Or if a person has starting year 2013 I want mean(of pm25_total2004-pm25_total2013).
How can I do this?
data _null_;
set scapkon;
reqyear=substr(iCDate,1,4)*1;
call symput('reqy',reqyear);
run;
data scatm;
set scapkon;
/* Mean of the 10 years before the recruitment year */
pm25means=mean(of pm25_total%eval(&reqy.-9)-pm25_total%eval(&reqy.));
run;
%eval(&reqy.-9) resolves to a single constant value (the same for every person rather than one per person), in my case 2007.
So that doesn't work.
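You can compute the mean with a traditional loop. Note that, as written, the index start - 1999 - _n_ picks the 10 years before start (start-1 down to start-10); to include the starting year as in the question's examples, use start - 1998 - _n_ instead.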
data want;
set have;
array x x2000-x2018;
call missing(sum, mean, n);
do _n_ = 1 to 10;
v = x ( start - 1999 -_n_ );
if not missing(v) then do;
sum + v;
n + 1;
end;
end;
if n then mean = sum / n;
run;
If you want to flex your SAS skill, you can use POKE and PEEK concepts to copy a fixed length slice (i.e. a fixed number of array elements) of an array to another array and compute the mean of the slice.
Example:
You will need to add sentinel elements and range checks on start to prevent errors when start-10 < 2000.
data have;
length id start x2000-x2018 8;
do id = 1 to 15;
start = 2013 + mod(id,6);
array x x2000-x2018;
do over x;
x = _n_;
_n_+1;
end;
output;
end;
format x: 5.;
run;
data want;
length id start mean10yrPriorStart 8;
set have;
array x x2000-x2018;
array slice(10) _temporary_;
call pokelong (
peekclong ( addrlong ( x(start-1999-10) ) , 10*8 ) ,
addrlong ( slice (1))
);
mean10yrPriorStart = mean(of slice(*));
run;
Use an array and a loop:
Index the array with years.
Accumulate the sum of the values.
Accumulate the count to account for any missing values.
Divide to obtain the mean value.
data want;
set have;
array _pm(2000:2018) pm25_total2000 - pm25_total2018;
do year=reqyear to (reqyear-9) by -1;
*add totals;
total = sum(total, _pm(year));
*add counts;
nyears = sum(nyears,not missing(_pm(year)));
end;
*accounts for possible missing years;
mean = total/nyears;
run;
Note this loop goes in reverse (start year to 9 years previous) because it's slightly easier to understand this way IMO.
If you have no missing values you can remove the nyears step, but not a bad thing to include anyways.
NOTE: My first answer did not address the OP's question, so this is a redux.
For this solution, I used Richard's code for generating test data. However, I added a line to randomly add missing values.
x = _n_;
if ranuni(1) < .1 then x = .;
_n_+1;
This alternative does not perform any explicit checks for missing values. The sum() and n() functions inherently handle missing values appropriately. The loop over the dynamic slice of the data array only transfers the values to a temporary array. The final sum and count are computed on the temporary array outside of the loop.
data want;
set have;
array x(2000:2018) x:;
array t(10) _temporary_;
j = 1;
do i = start-9 to start;
t(j) = x(i);
j + 1;
end;
sum = sum(of t(*));
cnt = n(of t(*));
mean = sum / cnt;
drop x: i j;
run;
Result:
id  start  sum  cnt  mean
1   2014   72   7    10.285714286
2   2015   305  10   30.5
3   2016   458  9    50.888888889
4   2017   631  9    70.111111111

Random number generation without repetition in SAS

I am trying to generate random numbers without repetition. My idea is to write a do while loop that runs 5 times. Inside it I get a random number, store it in an array, and check at every iteration whether the picked number is already in the array, to decide whether this pick is a repetition or not.
Here is my code where I try to implement this idea, but something is wrong and I do not know where I made a mistake.
data WithoutRepetition;
counter = 0;
array temp (5) _temporary_;
do while(1);
rand=round(4*ranuni(0) +1,1);
if counter = 0 then
do;
temp(1) = rand;
counter=counter+1;
output;
continue;
end;
do a=1 to counter by 1;
if temp(a) = rand then continue ;
end;
temp(counter) = rand;
output;
counter=counter+1;
if counter = 5 then do;
leave;
end;
end;
run;
Sounds like you want a random permutation.
data _null_;
seed=12345;
array r[5] (1:5);
put r[*];
call ranperm(seed,of r[*]);
put r[*];
run;
1 2 3 4 5
5 1 4 3 2
This is a simplified version of what you are trying to do.
data WithoutRepetition;
i=0;
array temp[5];
do r=1 by 1 until(i eq dim(temp));
rand=round(4*ranuni(0)+1,1);
if rand not in temp then do; i+1; temp[i]=rand; end;
end;
drop i rand;
run;
You were reasonably close to a working, if convoluted solution. For educational purposes, although data _null_'s answer is much cleaner, here's why your code wasn't working:
Your leave statement is inside a do-end block which is inside another do-loop. Leave statements only break out of the innermost do-loop, so yours has no effect.
The same is true of your continue statements, the first of which is completely unnecessary.
Because you are updating your array with newly-found unique values before you increment counter, previously populated values are being overwritten. This often results in duplicates of the overwritten values appearing in your output.
I would place continue and leave in the same category as goto - avoid using them if at all possible, as they tend to make code difficult to debug. It's clearer to set the exit conditions for all your loops at the point of entry.
Just for fun, though, here's a fixed version of your original code:
data WithoutRepetition;
counter = 0;
array temp (5) _temporary_;
do while(1);
rand=round(4*ranuni(0) +1,1);
if counter = 0 then do;
temp(1) = rand;
counter +1;
output;
end;
dupe = 0;
do a=1 to counter;
if temp(a) = rand then dupe=1;
end;
if dupe then continue;
counter +1;
temp(counter) = rand;
output;
if counter = 5 then leave;
end;
run;
And here is an equivalent version with all the leave and continue statements replaced with more readable alternatives:
data WithoutRepetition;
counter = 0;
array temp (5) _temporary_;
do while(counter < 5);
rand=round(4*ranuni(0) +1,1);
if counter = 0 then do;
temp(1) = rand;
counter +1;
output;
end;
else do;
dupe = 0;
do a=1 to counter while(dupe = 0);
if temp(a) = rand then dupe=1;
end;
if dupe = 0 then do;
counter +1;
temp(counter) = rand;
output;
end;
end;
end;
run;
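To see the scoping of LEAVE in isolation, here is a minimal sketch with made-up loop bounds (not the OP's data). LEAVE sits directly in the inner iterative loop, so only that loop is exited and the outer loop keeps running; CONTINUE is scoped to the innermost iterative loop in the same way.

```sas
data _null_;
  do outer = 1 to 3;
    do inner = 1 to 5;
      if inner = 2 then leave;   /* exits the inner loop only */
    end;
    /* Still executed for every value of outer */
    put outer= inner=;
  end;
run;
```

The log shows three lines, outer=1 inner=2, outer=2 inner=2 and outer=3 inner=2, which is why a flag such as dupe is needed to carry a decision made in the inner loop out to the outer one.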

sas variable not found in simulating binomial distribution

I am new to SAS and I was wondering how to cure the variable not found problem in creating a binomial distribution?
DATA additional (KEEP=X);
DO REPEAT = 1 TO 1000;
CALL STREAMINIT(1234);
DO I=1 TO 1000;
X=RAND("BINOMIAL",0.6,10); /*NUMBER OF WINS IN TEN TOSSES*/
END;
IF X GE 5 THEN WINNER + 1;
ELSE LOSER + 1;
OUTPUT;
END;
RUN;
PROC PRINT DATA=additional;
VAR WINNER LOSER;
RUN;
I am creating a binomial random variable: if X is at least 5 it counts one for the winner, otherwise it counts one for the loser. The question asks how many times I am a winner and how many times I am a loser. I keep getting a "variable not found" error. Am I doing something wrong in generating the binomial distribution?
Edit: this is the problem I was given.
You are given $10. Let the variable money = 10.
You play a game 10 times. The probability that you win a game is 0.4,
and the probability that you lose a game is 0.6.
If you win a game, you win $1. If you lose a game, you lose $1. So if
you win the first game, money becomes 11. But if you lose the first
game, money becomes 9.
After you have played the game 10 times, money is the amount that you
go home with. If you end up with at least $10, call yourself a winner.
Otherwise, call yourself a loser. Define the variable result as winner
or loser.
(a) Write a data step to generate random numbers and simulate your
result 1000 times. So that I can easily check your outputs, use
1234 as your seed for the random number generator. (You do not
need to show me the 1000 results.)
(b) Write a proc step to show how many times you are a winner, and
how many times you are a loser.
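I don't fully understand what you want to do with the simulation. From your code: you output only 1000 records, and each one holds just the last X of the inner loop, because the inner loop ends before the OUTPUT statement; CALL STREAMINIT should be the first statement, before the loops; and because you keep only X (KEEP=X), the WINNER and LOSER variables are not in the data set, so PROC PRINT cannot find them.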
I guess maybe you could try this.
DATA additional;
CALL STREAMINIT(1234);
DO REPEAT= 1 TO 1000; *numbers of sample;
DO I=1 TO 100; *size of sample;
X=RAND("BINOMIAL",0.6,10); /*NUMBER OF WINS IN TEN TOSSES*/
IF X GE 5 THEN results='WINNER';
ELSE results='LOSER';
OUTPUT;
END;
END;
RUN;
proc freq data=additional;
by repeat;
table results;
run;
Edit: It seems that you want the final results. You can get them from the code above by making results a numeric variable. Here is the modified code, where a win counts as +1 and a loss as -1.
DATA additional;
CALL STREAMINIT(1234);
DO REPEAT= 1 TO 100; *numbers of sample;
DO I=1 TO 10; *size of sample;
X=RAND("BINOMIAL",0.6,10); /*NUMBER OF WINS IN TEN TOSSES*/
IF X GE 5 THEN results+1;
ELSE results=results-1;
OUTPUT;
END;
results=0;
END;
RUN;
proc freq data=additional;
by repeat;
table results;
run;
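For comparison, here is a minimal sketch that follows the problem statement literally: each of the 10 games is a single win/lose draw with win probability 0.4, and money starts at 10. The data set name game and the variables rep and play are illustrative choices, not names from the question, and this is one possible reading of the assignment rather than a definitive solution.

```sas
data game (keep=rep money result);
  call streaminit(1234);            /* seed once, before any RAND call */
  length result $6;
  do rep = 1 to 1000;               /* 1000 simulated evenings */
    money = 10;
    do play = 1 to 10;              /* 10 games per evening */
      /* RAND("BERNOULLI", 0.4) returns 1 with probability 0.4 (a win) */
      if rand("BERNOULLI", 0.4) = 1 then money = money + 1;
      else money = money - 1;
    end;
    if money >= 10 then result = 'winner';
    else result = 'loser';
    output;                         /* one row per simulated evening */
  end;
run;

proc freq data=game;
  tables result;                    /* counts of winner and loser */
run;
```

Because the OUTPUT statement sits after the inner loop, the data set has exactly 1000 rows, one per repetition, and PROC FREQ answers part (b) directly.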

SAS do loop substring index with if statement

I have a binary string like '100111111100001111111111000'
It shows as a char variable in SAS.
How can I capture every single change either from 1 to 0 or 0 to 1?
My ideal output would be like
type  position
1-0   2
0-1   4
1-0   11
0-1   15
1-0   25
I'm stuck on how to write a recursive statement (I need to process about 20,000 strings at once, and each string could be really long). I'm thinking I could have
zero=index(string,'0'); one=index(string,'1'); if zero>one then
string=substr(string, zero); else if zero
Is this the right direction? How should I put it in a DO LOOP statement?
Thank you very much
Aaron
Seems reasonable to me. Slightly simplified.
do position = 1 to length(String)-1;
if subpad(string,position,2)='10' then do;
... output a row for the 1-0 change ...
end;
else if subpad(string,position,2)='01' then do;
... output a row for the 0-1 change ...
end;
end;
With you doing whatever it is you want to output (I assume something like setting a variable to '1-0' and then output;).
I use SUBPAD there sort of out of habit, SUBSTR should work just as well as long as you check the string length properly. SUBPAD won't error if it goes past the end of the string is all.
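Filling in the placeholders, the loop above might look like the following sketch. The data set name changes and the variables i and pair are illustrative; position is reported as the index of the first character after the change, which is the convention used in the question.

```sas
data changes (keep=type position);
  length type $3 pair $2;
  string = '100111111100001111111111000';
  do i = 1 to length(string) - 1;
    pair = substr(string, i, 2);     /* two adjacent characters */
    if pair in ('10', '01') then do;
      /* Build '1-0' or '0-1' from the two characters of the pair */
      type = catx('-', char(pair, 1), char(pair, 2));
      position = i + 1;              /* where the new value starts */
      output;
    end;
  end;
run;
```

For this 27-character string the rows are 1-0 at 2, 0-1 at 4, 1-0 at 11, 0-1 at 15 and 1-0 at 25. To process many strings, replace the assignment to string with a SET statement reading the variable from an existing data set.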
Please try this code to see if it is what you are looking for:
data have (keep=type pos);
retain type pos;
x = '100111111100001111111111000';
ct01 = count(x,'01');
ct10 = count(x,'10');
pos = 1;
do i =1 to ct01;
pos = find(x,'01',pos)+1;
type='0-1';
output;
end;
pos = 1;
do i =1 to ct10;
pos = find(x,'10',pos)+1;
type='1-0';
output;
end;
run;
proc sort data=have;
by pos;
run;

Shortcut code writing with loop

I am trying to write a loop that will shorten my code.
I want every variable from x1 to x30 to equal x raised to the power i, where i is the variable's index (e.g. 1 for x1).
For example, x7 will be x7=x**7;
I wrote some code, but it doesn't work and I don't know how to fix it. I would be glad for your help.
DATA maarah (drop = i e);
e = constant("e");
do i = -10 to 10 by 0.01;
x=i;
y=e**x;
output;
end;
length x1-x30 $2001;
do i =1 to 30 by 1;
x i=x**i;
output;
end;
run;
You're close. You need to declare an array. You don't explain what the first half is (the e**i part), so it's not clear what you want here - do you want a few thousand rows with powers of e, and then some rows with x1-x30? And why do you output each time in the second loop? To answer the core question, here:
DATA maarah (drop = i e);
e = constant("e");
do i = -10 to 10 by 0.01;
x=i;
y=e**x;
output;
end;
*length x1-x30 $2001; *what is this? Why do you want it 2001 characters, instead of numeric?;
array xs x1-x30; *you would need a $ after this if you truly wanted character;
do i =1 to 30 by 1;
xs[i]=x**i;
*output; *You probably do not want this. Output is probably outside of the loop.;
end;
run;
I would guess what you really want is this:
DATA maarah (drop = i e);
e = constant("e");
do i = -10 to 10 by 0.01;
x=i;
y=e**x;
*length x1-x30 $2001; *what is this? Why do you want it 2001 characters, instead of numeric?;
array xs x1-x30; *you would need a $ after this if you truly wanted character;
do j =1 to 30;
xs[j]=x**j;
end; *the x1-x30 loop;
output;
end; *the outer loop;
run;