I have following code in proc SQL and I want to move my case statement to data step.
Proc SQL;
select
Xas,
Yas,
case when missing(prj_role_desc) eq 1 then 'Unknown' else prj_role_desc end as prj_role_desc,
case when job_descr eq 'X' or project_status in ('Open', 'Filled', 'Pending') then 'TB'end as tb_status
from employee;
quit;
You have two case statements.
data pies;
set employee;
if missing(prj_role_desc) then prj_role_desc='Unknown';
if job_descr eq 'X' or project_status in ('Open', 'Filled', 'Pending')
then tb_status='TB'
keep Xas Yas prj_role_desc tb_status;
run;
Related
I'm trying to use the hash object as something like an sql query.
What I wanted to know is the equivalent of a conditional if statement for SAS hash objects.
I want a conditional statement on my second hash object that filters based on effective_date.
Something like
If effective_date<"27.05.1966"'d then output.
rc=h.find() returns a 0/1 depending if found or not found but how do I add if statements or where statements to the hash objects?
I was thinking something like
if rc=0 and effective_date<"27.05.1966"'d then do;
set want;
hoh.add();
output;
end;
But I don't think the hash table would know that effective_date is being referenced this way.
Original proc sql statement I'm trying to translate:
proc sql;
create table tb2 as
select distinct
a.effective_date,
a.S_FACILITY_CUSTOMER_ID as customer_id,
a.s_facility_id as facility_id,
a.s_facility_type as facility_type,
a.CUSTOMER_ASSET_CLASS_ID,
a.FACILITY_START_DATE,
a.FACILITY_end_DATE,
sum
(
case
when midas_type_id in ("KC", "KR", "KO","KF") then sum(a.loan_prinicpal_local,a.loan_interest_unpaid_local)
else 0
end
)
as arrear_amount,
sum(a.LOAN_PRINICPAL_LOCAL) as amount_local
from tbexport.tb_export_full_all a
where a.effective_date between "&repdate_from"dt and "&repdate_to"dt
AND BUSINESS_SEGMENT NOT IN ("MCR", "SE", "COR")
group by s_facility_id, effective_date;
quit;
This is what I come up with:
data _NULL_;
if 0 then set tbexport.tb_export_full_all;
dcl hash HoH(ordered : 'D');
HoH.definekey('s_facility_id','effective_date');
HoH.definedata('effective_date', 'S_FACILITY_CUSTOMER_ID', 's_facility_type','s_facility_id','CUSTOMER_ASSET_CLASS_ID',
'FACILITY_START_DATE','FACILITY_end_DATE','midas_type_id','BUSINESS_SEGMENT','loan_prinicpal_local','loan_interest_unpaid_local');
HoH.definedone();
dcl hiter HoHiter('HoH');
do until (lr);
set tbexport.tb_export_full_all end=lr;
where '&repdate_from.'dt<=effective_date<='&repdate_to.'dt;
if HoH.find() ne 0 then do;
dcl hash h(multidata : 'Y', ordered : 'D');
h.definekey('effective_date','s_facility_id');
h.definedata('effective_date', 'S_FACILITY_CUSTOMER_ID', 's_facility_type','s_facility_id','CUSTOMER_ASSET_CLASS_ID',
'FACILITY_START_DATE','FACILITY_end_DATE','midas_type_id','BUSINESS_SEGMENT','loan_prinicpal_local','loan_interest_unpaid_local');
h.definedone();
dcl hiter hi('h');
HoH.add();
end;
h.add();
end;
do until(HoHiter.next() = 1);
set tbexport.tb_export_full_all;
if hoh.find()=0 and midas_type_id in ("KC" , "KR" , "KO" , "KF") then do;
arrear_amount= sum(loan_prinicpal_local,loan_interest_unpaid_local);
end;
hoh.output(dataset:'tb2',ordered:'A');
end;
run;
But it gives me this error:
ERROR: An exception has been encountered.
Please contact technical support and provide them with the following traceback information:
The SAS task name is [DATASTEP]
Segmentation Violation
This should be doable without a hash of hashes. This roughly replicates what you're doing - creating a summary by several variables. The logic is pretty straightforward to add to it at this point I think - it can be a where statement, probably. Just make sure to add it to both of the dataset reads (probably best as a where data set option, perhaps?).
data want;
if 0 then set sashelp.prdsale;
if _n_ eq 1 then do;
declare hash h();
h.defineKey('country','region');
h.defineData('actual_sum','predict_sum');
h.defineDone();
do _n_ = 1 to prdsale_recs;
set sashelp.prdsale nobs=prdsale_recs point=_n_;
rc = h.find();
actual_sum = sum(actual_sum,actual);
predict_sum = sum(predict_sum,predict);
rc = h.replace();
end;
end;
set sashelp.prdsale;
rc = h.find();
run;
So I'd like to do a macro code mixed with proc sql and data step. I have the following code in SAS:
data work.calendar;
set work.calendar;
if business_day_count^=-1 then do;
num_seq + 1;
drop num_seq;
business_day_count = num_seq;
end;
else
business_day_count = -1;
run;
I'd like to put it into macro code, but it doesn't work.
My macro code:
%macro1();
data work.job_calendar;
set work.job_calendar;
%if business_day_count^=-1 %then %do;
num_seq + 1;
drop num_seq;
business_day_count = num_seq;
%end;
else
business_day_count = -1;
run;
%mend;
The whole code is:
%macro update_day(date);
proc sql;
update work.job_calendar
set business_day_count =
case when datepart(calendar_date) = "&date"d then -1
else business_day_count
end;
quit;
proc sql;
update work.job_calendar
set status_ind =
case when business_day_count = -1 then 'N'
else status_ind
end;
quit;
proc sql;
update work.job_calendar
set rundate_ind =
case when business_day_count = -1 then 'N'
else status_ind
end;
quit;
proc sql;
update work.job_calendar
set daily_rundate_ind =
case when business_day_count = -1 then 'N'
else status_ind
end;
quit;
proc sql;
update work.job_calendar
set weekly_rundate_ind =
case when business_day_count = -1 then 'N'
else status_ind
end;
quit;
proc sql;
update work.job_calendar
set monthly_rundate_ind =
case when business_day_count = -1 then 'N'
else status_ind
end;
quit;
data work.job_calendar;
set work.job_calendar;
if business_day_count^=-1 then do;
num_seq + 1;
drop num_seq;
business_day_count = num_seq;
end;
else
business_day_count = -1;
%mend;
The error code is: ERROR 180 - 322 Statement is not valid or it is used out of proper order. I don't know what I'm doing wrong.
You're mixing data step and macro code. Not sure what you're trying to achieve so no idea on what to propose but removing the %IF/%THEN will allow your code to work.
%macro1();
data work.job_calendar;
set work.job_calendar;
if business_day_count^=-1 then do;
num_seq + 1;
drop num_seq;
business_day_count = num_seq;
end;
else
business_day_count = -1;
run;
%mend;
%macro1;
Here is a tutorial on converting working code to a macro and one on overall macro programming.
To define a macro you need to use the %MACRO statement. Also why did you change lines 3 and 7 from data statements to macro statements?
That code cannot work. First the %IF is always true since the string business_day_count is never going to match the string -1. Second you have an else statement without any previous if statement.
Try something like this instead.
%macro macro1();
data work.job_calendar;
set work.job_calendar;
if business_day_count^=-1 then do;
num_seq + 1;
drop num_seq;
business_day_count = num_seq;
end;
else business_day_count = -1;
run;
%mend macro1;
/*create macro variables*/
PROC SQL NOPRINT;
SELECT RESTRICTIONS
INTO :RESTRI1 - :RESTRI35
FROM SASDATA.RESTRICTIONLIST;
QUIT;
%PUT &RESTRI2;
/*the resolved value is: */
gender = 'M' and state = 'CA'
I want to create a data set sasdata.newlist&i when the ith restriction
is &&restri&i (eg: gender = 'M' and state = 'CA').
I only want the observations which meet the restriction &&restri&I* in this new created dataset
While the sasdata.newlist2 contains all data in sasdata.oldlist, the if condition doesn't work. Anybody can help me to solve this problem?
%Macro testing(I);
data sasdata.newlist&i;
set sasdata.oldlist;
%if &&restri&i %then;
run;
%mend testing;
%testing(2)
You are not resolving the macro variables in the proper context. When applying the restriction code, resolve it so it can be compiled (data step-wise) as part of the DATA step.
%Macro testing(I);
data sasdata.newlist&i;
set sasdata.oldlist;
/* %if &&restri&i %then; NO-no-no, incorrect context */
* apply ith restriction as a sub-setting IF statement;
if &&restri&i;
run;
%mend testing;
%testing(2)
While it's hard to tell when to use macro statement, when not.
for example: Do I need to put % in the if -then-else statement and do while statement in code below? By the way, can I use "Do i = 1 to n while (condition)" statement here like this?
%MACRO FUNDSOURCE(I);
DATA SASDATA.STUDENT&I;
SET SASDATA.STUDENTLIST
DO M = 1 TO 310 WHILE(&&BUDG&I > 0); /*loop through all observations_ALL
STUDENTS*/
IF &&BUDG&I LE 3000- FA_TOT1 THEN do;
DISBURSE = &&BUDG&I;
FA_TOT1+DISBURE;
&&BUDG&I - DISBURSE;
end;
ELSE IF &&BUDG&I GT (3000- FA_TOT1) THEN DO;
DISBURSE = 3000-FA_TOT1;
FA_TOT1+DISBURSE;
&&BUDG&I - DISBURSE;
END;
END;
IF _n_ > M THEN DELETE; /*if budget are all gone, delete other observations,
keep observations only for the student who get funds*/
RUN;
%MEND FUNDSOURCE;
The following code, built using the Summary Statistics task from SAS Enterprise Guide, finds the min of each column of a table.
How can I find the second smallest value?
I tried replacing MIN with SMALLEST(2) but doesn't work.
Thank you.
TITLE;
TITLE1 "Summary Statistics";
TITLE2 "Results";
FOOTNOTE;
FOOTNOTE1 "Generated by the SAS System (&_SASSERVERNAME, &SYSSCPL) on
%TRIM (%QSYSFUNC(DATE(), NLDATE20.)) at
%TRIM(%SYSFUNC(TIME(), TIMEAMPM12.))";
PROC MEANS DATA=WORK.SORTTempTableSorted
NOPRINT
CHARTYPE
MIN NONOBS ;
VAR A B C;
OUTPUT OUT=WORK.MEANSummaryStats(LABEL="Summary Statistics for
WORK.QUERY_FOR_TRNSTRANSPOSEDPD__0001")
MIN()=
/ AUTONAME AUTOLABEL INHERIT
;
RUN;
Using the ExtremeValue table from PROC UNIVARIATE.
ods select none;
ods output ExtremeValues=ExtremeValues(where=(loworder=2) drop=high:);
proc univariate data=sashelp.class NEXTRVAL=2;
run;
ods select all;
proc print;
run;
I don't think there's any way to accomplish this within proc means. There are ways using a variety of other procs. The Univariate procedure highlights one method using the Extreme Observations.
https://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_univariate_sect058.htm
title 'Extreme Blood Pressure Observations';
ods select ExtremeObs;
proc univariate data=BPressure;
var Systolic Diastolic;
id PatientID;
run;
proc print data=ExtremeObs;
run;
I will assume you are interested in all numeric columns. If the IFLIST macro variable is longer than 64k bytes in length due to the number of numeric variables and the length of their names, this code will fail. It should work for all reasonably narrow data sets.
UNTESTED CODE
We get a list of the variables in your data set.
PROC CONTENTS DATA=SORTTempTableSorted OUT=md NOPRINT ;
RUN ;
We use that list to create statements and expressions.
IFLIST is a block of statements to store the minimum value of fieldname in fieldname_1 and the second lowest in fieldname_2. If the comparison is LT, then we keep distinct values, not necessarily the order statistics. If the comparision is LE and there are multiple observations with the minimum value, fieldname_1 and fieldname_2 will be equal to each other. I will assume you want distinct values.
MAXLIST is an expression that will resolve to the largest numeric value in the data set :)
MINLIST and MINLIST2 are created for use in RETAIN and KEEP statements.
PROC SQL STIMER NOPRINT EXEC ;
SELECT 'IF ' || name || ' LT ' || name '_1 THEN DO;' ||
name || '_2=' || name || '_1;' ||
name || '_1=' || name || ';END;ELSE IF ' ||
name || ' LT ' || name || '_2 THEN ' ||
name || '_2=' || name,
'MAX(' || name || ')',
name || '_1',
name || '_2'
INTO :iflist SEPARATED BY '; ',
:maxlist SEPARATED BY '<>'
:minlist SEPARATED BY ' ',
:min2list SEPARATED BY ' '
FROM md
WHERE type EQ 1
;
Now we get the largest numeric value from the data set:
SELECT &maxlist
INTO :maxval
FROM SORTTempTableSorted
;
QUIT ;
Now we do the work. The END option sets "eof" to 1 on the last observation, which is the only time we want to write a record to the output data set.
DATA min2 ;
SET SORTTempTableSorted END=eof;
RETAIN &minlist &min2list &maxval;
KEEP &minlist &min2list ;
&iflist ;
IF eof THEN
OUTPUT ;
RUN ;
A data step solution that should work for any number of columns without ever running into macro limitations:
proc sql noprint;
select count(*) into :NUM_COUNT from dictionary.columns
where LIBNAME='SASHELP' and MEMNAME = 'CLASS' and TYPE = 'num';
quit;
data class_min2;
do until(eof);
set sashelp.class end = eof;
array min2[&NUM_COUNT,2] _temporary_;
array nums[*] _numeric_;
do _n_ = 1 to &NUM_COUNT;
min2[_n_,1] = min(min2[_n_,1],nums[_n_]);
if min2[_n_,1] < nums[_n_] then min2[_n_,2] = min(nums[_n_],min2[_n_,2]);
end;
end;
do _iorc_ = 1 to 2;
do _n_ = 1 to &NUM_COUNT;
nums[_n_] = min2[_n_,_iorc_];
end;
output;
end;
keep _NUMERIC_;
run;
This outputs the two lowest distinct values of each numeric variable, without transposing the data in the same way that proc univariate does. You can easily allow for duplicate minimum values with a bit of tweaking.
Sort the values in ascending order. Delete the first value. This would be the minimum value. Now the value left at the first position is your second minimum.
I want to create a series of tables using SAS macro language, but the strings I am trying to pass through have spaces in them. Any ideas on what to add to make them valid table names?
%macro has_spaces(string);
proc sql;
create table &string. as
select
*
from my_table
;
quit;
%mend;
%has_spaces(has 2 spaces);
Thanks.
Another option is translate:
%macro has_spaces(string);
proc sql;
create table %sysfunc(translate(&string.,_,%str( ))) as
select *
from my_table
;
quit;
%mend;
You could do something like this as this will catch pretty much anything that isnt valid for a SAS table name and replace it with an underscore. We use a similar approach when creating file names based on customer names that contain all kinds of weird symbols and spaces etc... :
Macro Version:
%macro clean_tablename(iField=);
%local clean_variable;
%let clean_variable = %sysfunc(compress(&iField,,kns));
%let clean_variable = %sysfunc(compbl(&clean_variable));
%let clean_variable = %sysfunc(translate(&clean_variable,'_',' '));
&clean_variable
%mend;
Test Case 1:
%let x = "kjJDHF f'ke''''j d (kdj-328) *#& J#ld!!!";
%put %clean_variable(iField=&x);
Result:
kjJDHF_fkej_d_kdj328_Jld
Your test case:
%macro has_spaces(string);
proc sql;
create table %clean_variable(iField=&string) as
select *
from sashelp.class
;
quit;
%mend;
%has_spaces(has 2 spaces);
Result:
NOTE: Table WORK.HAS_2_SPACES created, with 19 rows and 5 columns.
FCMP Version:
proc fcmp outlib=work.funcs.funcs;
function to_valid_sas_name(iField $) $32;
length clean_variable $32;
clean_variable = compress(iField,'-','kns');
clean_variable = compbl(clean_variable);
clean_variable = translate(cats(clean_variable),'_',' ');
clean_variable = lowcase(clean_variable);
return (clean_variable);
endsub;
run;
Example FCMP Usage:
data x;
length invalid_name valid_name $100;
invalid_name = "kjJDHF f'ke''''j d (kdj-328) *#& J#ld!!!";
valid_name = to_valid_sas_name(invalid_name);
put _all_;
run;
Result:
invalid_name=kjJDHF f'ke''''j d (kdj-328) *#& J#ld!!! valid_name=kjjdhf_fkej_d_kdj-328_jld
Please note that there are limits to what you can name a table in SAS. Ie. it must start with an underscore or character, and must be no more than 32 chars long. You can add additional logic to do that if needed...
Compress out the spaces - one method is to use the datastep compress() function within a %SYSFUNC, e.g.
%macro has_spaces(string);
proc sql;
create table %SYSFUNC(compress(&string)) as
select
*
from my_table
;
quit;
%mend;
%has_spaces(has 2 spaces);
Just put the table name in quotes followed by an 'n' eg if your table name is "Table one"
then pass this as the argument "Table one"n.