SAS - "where variable" versus "where not missing(variable)"

In trying to make my code more readable, I face the following situation.
Consider a data step in which you want to select only observations that have a value for a given variable. There seem to be two ways to do this with a WHERE statement: name the variable alone, or use the MISSING function.
For example,
Case 1. Where VARIABLE
data where_var;
set sashelp.electric;
where AllPower;
run;
Case 2. Where not missing(VARIABLE)
data where_not_missing;
set sashelp.electric;
where not missing(AllPower);
run;
These produce the same result. However, I'm unsure whether this is necessarily the case.
Are these functionally equivalent?
Is Case 1 merely syntactic sugar for Case 2?
Are there instances when they will produce different results?

In SAS, any numeric value other than 0 or missing is true. See https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000780367.htm for more info.
That means that your Case 1 and Case 2 are not 100% equivalent. If AllPower = 0, the results differ: Case 1 drops the observation, while Case 2 keeps it.

If you are trying to determine if AllPower has a value then only Case 2 will work.
Case 1 is selecting records where AllPower is non-zero. Zero is a value and is not the same as missing.
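A small sketch of the difference, using a made-up three-row dataset (the dataset name `have` and its values are assumptions for illustration, not taken from sashelp.electric):

```sas
/* Hypothetical test data: one nonzero value, one zero, one missing */
data have;
  input id AllPower;
  datalines;
1 10
2 0
3 .
;
run;

/* Case 1: a bare numeric variable is true only when nonzero and nonmissing */
data where_var;
  set have;
  where AllPower;                 /* keeps id 1 only */
run;

/* Case 2: tests for missingness only, so the zero row survives */
data where_not_missing;
  set have;
  where not missing(AllPower);    /* keeps ids 1 and 2 */
run;
```

If the variable never contains 0, as appears to be the case in the question's data, the two forms select the same rows, which is why the original comparison showed no difference.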

Dax Switch Function 3rd value skipping

I use Power BI version 2.112.603.0 64-bit (December 2022). In this version, when I write a DAX measure with the SWITCH function, after I enter the second result and a comma, IntelliSense asks for a third result and skips the third value. What am I doing wrong here?
Intellisense isn't always correct. Just provide the correct number of arguments and the DAX engine will interpret them correctly.
The SWITCH function takes a minimum of three parameters: Expression, Value, and Result. In that minimal form it behaves like the IF function. You can also add as many further Value, Result pairs as you want, in which case it is more like a CASE WHEN statement.
To come back to your example, you could add further HASONEVALUE calls with their respective Result values, and as the last parameter the else value (returned if no Value is matched).

Why does SAS not give p-values for the fixed effects in the mixed model?

I work with a mixed model to see the effects of variables. The code I use is:
proc mixed data=pb2;
class treat_a treat_b hoknr_ day;
model conc=treat_a|treat_b hoknr_/outp=residuals1 residual;
repeated day/subject=hoknr_(treat_a treat_b)type=vc;
run;
The output has no p-values for treat_a, treat_b or treat_a|treat_b, but it does for hoknr_. I excluded the REPEATED statement, simplified the model, and changed the CLASS statement, but I still did not get p-values for all of my fixed effects. I have used this model before and it worked; now that I have fitted it to this dataset, it does not fully work.
Edit1
The table of Type 3 Tests of Fixed Effects looks like this:
(image: Type 3 table)
The treatments could be non-estimable (treat_a is yes or no, likewise for treat_b). I changed yes/no to 0 or 1, but that did not change the Type 3 table. I have worked before with treatments expressed in words, which did not result in a table like this.
Edit2
When SOLUTION is added to the MODEL statement, this is the result:
(image: Solution for Fixed Effects)
What is wrong with this model that it does not show p-values for all fixed effects?
You might need to specify the type of test that you want with the htype= option. It sounds like one of those procs where someone didn't program the function initially, and it was kind of an afterthought late in development (not unlike the showpvalues option in proc glmselect; to this day I think that's the weirdest option in a regression proc).
https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_mixed_sect015.htm#statug.mixed.mixedmodelhtype
Type 3 Tests of Fixed Effects
You can use the HTYPE= option in the MODEL statement to obtain tables
of Type 1 (sequential) tests and Type 2 (adjusted) tests in addition
to or instead of the table of Type 3 (partial) tests.
The ODS table names are "Tests1" for the Type 1 tests, "Tests2" for the Type 2 tests, and "Tests3" for the Type 3 tests.
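As a rough sketch of the HTYPE= suggestion, the question's model could be rerun with Type 1 tests requested. The output dataset name `type1_tests` is an assumption made for illustration, and whether this restores the missing p-values depends on why they are missing in the first place:

```sas
proc mixed data=pb2;
  class treat_a treat_b hoknr_ day;
  /* htype=1 requests the Type 1 (sequential) tests instead of the default Type 3 */
  model conc = treat_a|treat_b hoknr_ / solution htype=1;
  repeated day / subject=hoknr_(treat_a treat_b) type=vc;
  /* "Tests1" is the ODS table name for the Type 1 tests; type1_tests is a hypothetical name */
  ods output Tests1=type1_tests;
run;
```

Comparing the Type 1 table with the Type 3 table can help show whether an effect is being absorbed by terms entered alongside it.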
Or, it could be that some of your fixed effects are not estimable.
The zeros in your solution table appear because treat_a and treat_b are categorical variables, and treat_a = 1 and treat_b = 1 are the reference levels. Reference levels get no p-values in the solution table. Likewise, an interaction term has no p-value if it includes treat_a = 1 or treat_b = 1.

first and last order SAS

I have two lines of data,
Order
17/01/2016
01/02/2014
Basically I want to run a logic like so;
data A.test_active;
set A.Weekly_Email_files_cleaned4;
length active :8.;
length inactive :8.;
if first.Order between '01Jan2014'd and '31Dec2015'd then active= 1;
if last.order between '01Jan2014'd and '31Dec2015'd then inactive= 1;
run;
the field "Order" is formatted by DDMMYY10 when I checked the file properties, but I keep getting this error
ERROR 388-185: Expecting an arithmetic operator.
Can anyone help or suggest something different in the same vein?
In SAS, between is only valid in SQL contexts: either actual PROC SQL, or WHERE statements, generally. It is not otherwise valid in SAS. You would use in (firstval:lastval) instead, if those values are integers (dates are). If they're not integers, you need to use if firstval le val le lastval or similar (can also use ge/lt/gt/>/< or whatever you like, depending on the ordering of things).
Second, first.order and last.order are boolean values - 1 or 0, nothing else, that indicate that you are on a row that is the first row for a new value when sorted by that variable, or the last row similarly. You also must have a by statement by that variable if you're going to use them.
Third, your length statements are wrong; you're confusing some three different things here, I think. Length statements for numerics aren't needed if you're using default length 8, and if you do like having them anyway, you need:
length active 8;
No : or ., both are used for different purposes.
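Putting the three points together, a minimal sketch might look like the following. The input dataset `have`, the grouping variable `ID`, and the sample rows are assumptions for illustration; the question does not say which variable defines a group, so adjust the BY statement to your data:

```sas
/* Hypothetical sample data: two customers with two orders each */
data have;
  input ID $ Order :ddmmyy10.;
  format Order ddmmyy10.;
  datalines;
alex 23/01/2015
alex 03/04/2013
bob 17/01/2016
bob 01/02/2014
;
run;

/* first. and last. variables require BY-group processing on sorted data */
proc sort data=have;
  by ID Order;
run;

data test_active;
  set have;
  by ID Order;
  length active inactive 8;   /* plain numeric length: no colon, no period */
  /* first.ID and last.ID are 1/0 flags, so the date test is a separate condition */
  if first.ID and '01Jan2014'd le Order le '31Dec2015'd then active = 1;
  if last.ID  and '01Jan2014'd le Order le '31Dec2015'd then inactive = 1;
run;
```

Here `active` is set on a customer's earliest order and `inactive` on the latest one, each only when that order date falls inside the window. This mirrors the structure of the original IF statements rather than any confirmed business rule.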
ID first_order Order
alex 01/01/2013 23/01/2015
alex 01/01/2013 23/01/2015
alex 01/01/2013 03/04/2013
Basically, if an order exists after the first order and falls within a certain timeframe (within a year of the date of the first order), then the user is "active".
Any ideas are much appreciated.
Thanks

SAS Do-Loop and IF Statement to compare Current and previous row values

I’m pretty new with do loops in SAS and I know that I am trying to make this loop work like a MATLAB script. I haven’t found many helpful tips online as most of the do-loop examples are just for calculations, not actually checking to see if the row before the current one has the same value.
Here is my issue that I need to solve:
I want to look at each policy number below and see whether the one before it is the same; if it is, I want to flag it.
Policy
26X0118907
26X0375309
26X0375309
26X0527509
I would consider i=1 to be the first policy(26X0118907) and i=2 to be the second policy (26X0375309).
In this case according to the code (that doesn't work) below this increment would be flagged as ‘B’. Do you know how to properly code a situation like this?
data AF_Inforce_&thestate.;
set AF_Inforce_&thestate.;
by Rating_St;
if first.Rating_St then counter=0;
counter+1;
myloop:
do i=2 to counter;
P2(i)=Policy(i);
P1(i)=Policy(i-1);
if P1(i)=P2(i) then flag='A';
else flag='B';
end;
return;
run;
The first thing you need to learn coming from MATLAB or a similar language is that SAS is different. In particular, the DATA step is its own DO loop, looping over records.
Second, it's a bit complicated to access data across rows. However, there are a few tricks.
Vasja showed you one (lag, which doesn't actually go to a previous record, but sort of acts like it does). dif does the same thing except that it returns the difference from the lagged value, so if your policynum had been numeric, Vasja's code could be rewritten as dif(policy)=0 instead of policy=lag(policy) (though this is only for numerics).
A better trick in my opinion in your case is to use by group processing. Normally this works with sorted fields, but here it doesn't matter if it's sorted: you just want to know if two consecutive rows are identical, right?
data want;
set have;
by rating_st policy notsorted;
if first.policy and last.policy then recflag='A';
else if first.rating_st then recflag='A';
else recflag='B';
run;
I don't know that I understand your rules entirely, but they're probably going to be some form of this. I put the two possibilities there, you might just want the second one (ie, you don't care if it's singular or just the first). The first would flag only singular policies.
Try looking at the LAG function (it "remembers" the values of a variable in a queue).
Your code should go like this:
data AF_Inforce_&thestate.;
set AF_Inforce_&thestate.;
by Rating_St;
if first.Rating_St = 0 and Policy=LAG(Policy) then flag='A';
else flag='B';
run;
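Because LAG returns values from a queue that is updated only when the function actually executes, a common precaution is to call it unconditionally on every row and test the stored result afterwards. A sketch of that variant follows; the dataset names `have` and `want`, the helper variable `prev_policy`, and the `Rating_St` value are assumptions for illustration:

```sas
/* Hypothetical sample data using the policy numbers from the question */
data have;
  input Rating_St $ Policy :$10.;
  datalines;
TX 26X0118907
TX 26X0375309
TX 26X0375309
TX 26X0527509
;
run;

data want;
  set have;
  by Rating_St notsorted;
  prev_policy = lag(Policy);    /* executed on every row, so the queue stays in step */
  /* flag 'A' when the previous row in the same Rating_St has the same policy */
  if not first.Rating_St and Policy = prev_policy then flag = 'A';
  else flag = 'B';
  drop prev_policy;
run;
```

With this sample data only the third row (the second 26X0375309) gets flag 'A'; the other rows get 'B'.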

CONTRAST for CLASS variable with more than two levels in PROC GLM

Background: When we test the significance of a categorical variable that has been coded as dummy variables, we need to simultaneously test all dummy variables are 0. For example, if X takes on values of 0, 1, 2, 3 and 4, I would fit dummy variables for levels 1-4 (assuming I want 0 to be baseline), then want to simultaneously test B1=B2=B3=B4=0.
If this is the only variable in my data set, I can use the overall F-statistic to achieve this. However, if I have other covariates, the overall F-test doesn't work.
In Stata, for example, this is (very, very) simply carried out by the testparm command as:
testparm i.x (after fitting the desired regression model), where the i. prefix tells Stata that X is a categorical variable to be treated as dummy variables.
Question/issue: I'm wondering how I can do this in SAS with a CONTRAST (or ESTIMATE?) statement while fitting a regression model with PROC GLM. Since I have scoured the internet and haven't found what I'm looking for, I'm guessing I'm missing something very obvious. However, all of the examples I've seen are NOT for categorical (class) variables, but rather two separate (say continuous) variables. The contrast statement in that case would simply be something like
CONTRAST 'Contrast1' y 1 z 1;
Otherwise, they're for calculating hypotheses like H_0: B1-B2=0.
I feel like I need to break down the hypotheses into smaller pieces and determine the set that defines the whole relationship, but I'm not doing it correctly. For example, for B1=B2=B3=B4=0, I thought I might say B1=B2=B3=-B4, then define (1) B1=-B4, (2) B2=-B4 and (3) B2=B3. I was trying to code this as a CONTRAST statement as follows (say X is in descending order in the data set: 4-0):
CONTRAST 'Contrast' x -1 0 0 1 0
x -1 0 1 0 0
x 0 1 1 0 0;
I know this is not correct, and I tried many, many variations and whatever random logic I could come up with. My problem is I have relatively novice-level knowledge of CONTRAST (and unfortunately have not found great documentation to help with this) and also of how this hypothesis test should really be formulated for the sake of estimation (do I try to split it up into pieces as I did above, or...?).
From my note above, you actually can get SAS to do this for you with PROC GENMOD and the CLASS statement and a TYPE3 specification.
proc genmod data=input;
class classvar ;
model slope= classvar othervar/ type3;
run;
quit;
In the example above, my class levels are in the classvar variable. The othervar is my other covariate.
At the end of the output, you see a table labeled LR Statistics For Type 3 Analysis. The row for classvar is the LR test of all the class effects=0.
Another option is PROC REG with a TEST statement (for example, TEST x1=0, x2=0, x3=0, x4=0). That doesn't answer my initial question about PROC GLM, but it works if PROC REG gets the job done for your type of model.
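For PROC GLM itself, one possible approach is a single CONTRAST statement with several rows separated by commas, which produces one joint test with as many degrees of freedom as there are independent rows. The sketch below assumes a five-level class variable `x`, a response `y`, and one other covariate `z` in a dataset called `input`; all of these names are placeholders:

```sas
proc glm data=input;
  class x;
  model y = x z / solution;
  /* Each row compares one level of x with the last level; together the four
     rows test that all five level effects are equal, which is the same
     hypothesis as all four dummy coefficients being 0 */
  contrast 'x: all levels equal'
    x 1 0 0 0 -1,
    x 0 1 0 0 -1,
    x 0 0 1 0 -1,
    x 0 0 0 1 -1;
run;
quit;
```

The coefficients follow the sorted order of the levels of `x`. The resulting 4-df F test should agree with the Type III test that PROC GLM reports for `x` by default, so the explicit contrast mainly serves as a check or as a template for testing a subset of levels.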