I am getting a log warning stating:
WARNING: 21 observations omitted due to missing ID values
I was transposing the dataset using this code:
PROC TRANSPOSE DATA= PT OUT= PT;
BY SOC_NM PT_NM;
ID TREATMENT;
VAR COUNT;
RUN;
I want to remove this warning from the log. Is there any option available in SAS for this?
Thank you for your help.
You need to decide whether you are keeping the TREATMENT=' ' records or not. If you want to keep them, then you need to assign a nonmissing value to TREATMENT. If not, then the WHERE statement like vasja's answer will work.
Will adding a WHERE statement do the job for you?
PROC TRANSPOSE DATA= PT OUT= PT;
BY SOC_NM PT_NM;
ID TREATMENT;
VAR COUNT;
WHERE NOT MISSING(TREATMENT);
RUN;
Before transposing, recode the missing values in a data step (this assumes TREATMENT is numeric):
if TREATMENT=. then TREATMENT=99;
After transposing, drop the resulting variable "_99".
There's no option to remove warning messages from the log. If you really must keep your code as is, you can use PROC PRINTTO to temporarily divert the log output to an external file. However, this means you won't see anything in the log for that particular step, so I would not recommend it unless you are very sure of what you are doing. In the example code below, only the steps creating tables a and c show in the log.
data a;
run;
proc printto log='c:\temp\temp.log';
run;
data b;
run;
proc printto;
run;
data c;
run;
Working code:
data t2;
set t1;
where a like "%SR";
run;
Code errored:
data t2;
set t1;
if a like "%SR";
run;
Error message:
ERROR 388-185: Expecting an arithmetic operator.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
It complained about 'like'.
Any idea?
LIKE is not an operator that ordinary SAS expressions, such as a DATA step IF, understand. The only reason it works in WHERE is that the WHERE statement supports SQL syntax such as LIKE and BETWEEN, to make it easier to push the WHERE condition into a remote database.
Use some other way to test if the last two letters are SR. Here are two methods.
if 'SR' = substrn(a,length(a)-1);
if 'RS' =: left(reverse(a));
The most similar solution is to use prxmatch:
data t2;
set t1;
if prxmatch("/SR\s*$/o",a); /* anchored to the end of the value; \s* allows for trailing blanks */
run;
Note that this is much slower than WHERE with LIKE, and Tom's solutions are faster if there's a reasonable way to do them (as there is in the example).
The LIKE operator is not understood by the DATA Step IF statement.
LIKE is available to DATA Step in the WHERE statement, the WHERE= data set option, or PROC SQL WHERE clause.
data have;
input text $CHAR20.;
datalines;
ABCEFG
YESSR
Mark JR
Mark SR
;
data want;
set have;
where text like '%SR'; /* where statement */
run;
data want;
set have(where=(text like '%SR')); /* where= option */
run;
proc sql;
create table want as
select text from have
where text like '%SR' /* where clause */
;
quit;
Is there a way to get a list of all the outputs (datasets/files) created by a step(iteration) in SAS?
I tried using the automatic variables but all that I could get was the last created dataset using &syslast and &sysdsn variables. But what if a data step creates multiple datasets? How can I get their names/details automatically in SAS without using any list, etc keywords? Is there a way possible?
Please Suggest!
Thank you!
I don't believe this is possible. The only way I can think of is to parse the log following your data step / iteration.
For this you can use something like:
/* set up a fresh log prior to your iteration */
%let logloc=%sysfunc(pathname(work))/mylog.txt;
proc printto log="&logloc" new;
run;
/* run your iteration */
data mystep with lots of output datasets;
set something;
run;
/* return to normal logging */
proc printto log=log;
run;
data _null_;
infile "&logloc";
input;
if _infile_=:'data' then do;
/* perform log scanning */
/* will likely need some complex logic to be robust!*/
end;
run;
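As one possible way to fill in the log-scanning placeholder above, here is a minimal sketch. It assumes the step writes the usual "NOTE: The data set LIBREF.NAME has ..." lines to the captured log and that the data set name sits on the same line as the start of the note. The names created, dsname and rx are made up for this example, and external files written with a FILE statement are not covered.

```sas
/* Sketch only: assumes &logloc points at the log captured by PROC PRINTTO above */
data created(keep=dsname);
  length dsname $41;              /* 8-char libref + dot + 32-char member name */
  infile "&logloc" truncover;
  input;
  /* Compile the pattern once; group 1 captures LIBREF.MEMBER */
  retain rx;
  if _n_ = 1 then rx = prxparse('/^NOTE: The data set (\S+) has/');
  if prxmatch(rx, _infile_) then do;
    dsname = prxposn(rx, 1, _infile_);
    output;
  end;
run;
```

The resulting table has one row per data set reported in the log, which can then be loaded into a macro variable or used to drive later steps.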
PROC SCAPROC will report this in the log, with the caveat that you have to run the process first and then you'll get the output.
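A rough sketch of the PROC SCAPROC approach follows. The record file path and the data sets b and c are placeholders for illustration only; in this sketch the analysis is written to the record file rather than read from the log, and, as far as I recall, outputs appear there as comment lines mentioning DATASET OUTPUT.

```sas
/* Start recording; the path is a placeholder for this sketch */
proc scaproc;
  record 'c:\temp\scaproc_record.txt';
run;

/* The step(s) whose outputs you want listed */
data b c;
  set sashelp.class;
run;

/* Stop recording and write the analysis to the record file */
proc scaproc;
  write;
run;
```

The record file can then be read back with a data step and filtered for the output entries.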
In SAS, how can I assign a value coming from either the OUTEST or OUTSTAT options to a variable to be used in a loop?
For example, say I want to run some sort of iterative analysis until my mean (average) reaches a certain threshold. I know how to extract the mean using either OUTEST or OUTSTAT, but then how can I perform operations or blocks of code on it?
Thank you.
If you are interested in details, I am trying to perform backward selection of VIFs (to remove multicollinearity). Unfortunately, SAS doesn't seem to have a 'SELECTION=BACKWARD' feature for this...
EDIT: Updated with sample code:
%MACRO MULTICOLLINEARITY(TABLE_SUFFIX,YVAR,FIELDS,MAX_VIF);
/* PRELIMINARY PROC REG ON ALL FIELDS*/
PROC REG DATA=TABLE_&TABLE_SUFFIX. NOPRINT;
MODEL &YVAR = &FIELDS / VIF COLLIN NOINT;
ODS OUTPUT PARAMETERESTIMATES=PAREST1;
RUN;
/* RETAIN NON-NULL VIF FIELDS ONLY */
DATA NO_NULL_VIF;
SET PAREST1 (WHERE=(VarianceInflation <> .));
RUN;
/* CREATE VARIABLE LIST OF NON-NULL VIF FIELDS */
PROC SQL;
SELECT VARIABLE
INTO :NO_NULL_VIF_FIELDS SEPARATED BY ' '
FROM NO_NULL_VIF;
QUIT;
/* RE-RUN REGRESSION WITH NON-NULL VIF FIELDS ONLY */
PROC REG DATA=TABLE_&TABLE_SUFFIX. NOPRINT;
MODEL &YVAR = &NO_NULL_VIF_FIELDS / VIF COLLIN NOINT;
ODS OUTPUT PARAMETERESTIMATES=PAREST2;
RUN;
/* START ITERATION OF DROPPING THE HIGHEST VIF UNTIL THE CRITERIA IS MET */
???
%MEND;
%MULTICOLLINEARITY(, RESPONSE, &INPUT_FIELDS,???)
And by criteria I mean VIF_MAX < N, where N is some threshold specified in the macro. For example, if we want to retain only fields with VIF less than 5, then it should drop the highest one, re-run the PROC REG, drop the highest, re-run, and so on until the highest one is less than 5.
First off - I'd verify that you can't do this using PROC MODEL. I'm not a regression guy so I don't know for sure. Might be worth posting on a more stat-focused site; CV isn't really appropriate since they're not generally trying to answer software questions, but maybe communities.sas.com . I would find it surprising if this wasn't directly possible in PROC MODEL and/or in one of the more complicated procs.
Second, the way I'd approach this is to write a recursive macro. Take out the first part (the non-null VIF fields) and either move that to an outer macro that just runs once, or make it an expectation of the programmer to do on his/her own (unless this is not feasible, and/or can change with iterations - not something I'm knowledgeable of). Then do something like this:
%MACRO MULTICOLLINEARITY(TABLE_SUFFIX,YVAR,FIELDS,MAX_VIF);
ods _all_ close;
%put Running with &fields; *note which fields currently running;
*also may want to include a run # counter as parameter;
PROC REG DATA=TABLE_&TABLE_SUFFIX.;
MODEL &YVAR = &FIELDS / VIF COLLIN NOINT;
ODS OUTPUT PARAMETERESTIMATES=PAREST2;
RUN;
quit;
*Data step to analyse PAREST2 and see if any of the fields can be dropped;
proc sort data=parest2;
by descending varianceinflation;
run;
data _null_;
set parest2(obs=1);
if varianceinflation > &max_vif then do;
fields_run = tranwrd("&fields",trim(variable),' ');
if not missing(fields_run) then do;
call_string = cats('%multicollinearity(',"&table_suffix.,&yvar.,",fields_run,",&max_vif.)");
call execute(call_string);
end;
end;
else do;
put "Stopped with Max VIF:" variable "=" varianceinflation;
end;
run;
ods preferences;
%MEND MULTICOLLINEARITY;
Then you call it once with the full field list, and it calls itself in the CALL EXECUTE if there is still a parameter left. An incremented # of runs may be helpful (both to see how many times it ran in your log, and to be able to make sure that you don't end up in an infinite loop if you make a mistake with the fields variable deletion.)
I would run this with OPTION NONOTES NOSOURCE; and none of the symbogen/mprint stuff on, so you can just get the %put/put statements in your log.
I am trying to determine the number of observations in a dataset, then convert this number into a macro variable that I can use as part of a loop. I've searched the web for answers and not had much luck. I would post some example code I've tried, but I have literally no idea how to approach this.
Could anybody assist?
Thanks
Chris
SAS stores dataset information, such as number of observations, separately, so the key is to access this information without having to read in the entire dataset.
The following code does just that. The condition if 0 is never true, so no observations are read at run time, but the NOBS= option is filled in when the step is compiled, so n is still available.
data _null_;
if 0 then set sashelp.class nobs=n;
call symputx('numobs',n); /* symputx converts the number without a conversion note and strips blanks */
stop;
run;
%put n=&numobs;
You can also get it from dictionary.tables like this:
proc sql noprint;
select nobs into :nobs
from dictionary.tables
where libname='YOURLIBRARY' and memname='YOURDATASETNAME'; /* names are stored in uppercase */
quit;
Here it is:
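Create macro variable (note that this reads every row, and the macro variable is never set if the dataset is empty):
data _null_;
set sashelp.class;
call symputx("nbobs",_N_); /* overwritten on each row; the last row's _N_ is the count */
run;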
See result:
%put &nbobs;
Use it:
data test;
do i = 1 to &nbobs;
put i;
end;
run;
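Another option, sketched here as an alternative to the answers above, is to read the count from the dataset header in pure macro code with the OPEN, ATTRN and CLOSE functions. The macro variable names dsid, nobs2 and rc are made up for this example.

```sas
/* Sketch: read the observation count without running a data step */
%let dsid  = %sysfunc(open(sashelp.class));
%let nobs2 = %sysfunc(attrn(&dsid,nlobs));  /* NLOBS excludes rows marked for deletion */
%let rc    = %sysfunc(close(&dsid));
%put nobs2=&nobs2;
```

Because no step boundary is involved, this can be used in the middle of other macro logic, for example just before a %DO loop that runs from 1 to &nobs2.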
I have a SAS program that loops through certain sets of data and generates a bunch of reports to an ODS HTML destination.
Sometimes, due to small sets of data I run these reports for, a certain PROC REPORT will not generate because, for this set of data I'm on, there is no data to report. I get this message for those instances:
WARNING: A GROUP, ORDER, or ACROSS variable is missing on every observation.
What I want in the HTML is to display some sort of message for these like "did not generate" or something.
I tried to use return/error codes or the warning text above to detect this, but the error code is 0 (no problem, really?) and the warning text doesn't reset if the next PROC REPORT generates OK.
If it is of any importance, I'm using a data step with CALL EXECUTE to get all this PROC REPORT code generated for these sets of data.
Is there any way to generate this "did not generate" message or at least to catch these warnings per PROC REPORT?
You can substitute in a value for the missing observations in your report.
First redefine missing values to some character. The MISSING= system option accepts only a single character.
options missing='M';
Then make sure to use the "missing" option in your PROC REPORT.
proc report data=somedata nowd headline missing;
....
run;
EDITS BASED ON COMMENTS
To get comments to show up, I see a few possibilities.
One, scan the data set and check for missing values. If any are present, write out a message.
Data _Null_;
Set dataset;
file print notitles;
if obs = . then do;
put #01 'DID NOT COMPUTE';
stop;
end;
run;
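Two, add a column with a compute:
define xx /computed "(Message)";
compute xx /char length=28 ;
if obs =. then xx = 'did not compute value in row';
endcomp;
Three, a conditional line using compute. PROC REPORT does not run LINE statements conditionally, so set the text length in the IF and print the text with $VARYING (a length of 0 prints nothing):
compute after obs;
length msg $15;
msg = 'DID NOT COMPUTE';
if obs = . then len = 15;
else len = 0;
line @1 msg $varying15. len;
endcomp;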
See: http://www2.sas.com/proceedings/sugi26/p095-26.pdf
Look for the MTANYOBS macro and the section on printing a 'no observations' page.