I have a table with observations from the date 01.08.2016 to 30.08.2016.
How can I create 12 tables in the following way:
the first one contains observations from the date 01.08.2016 to 20.08.2016;
the second one contains observations from the date 01.08.2016 to 21.08.2016;
...
the 12th one contains observations from the date 01.08.2016 to 30.08.2016.
I think it would be best to do this with a loop, but I don't know how.
This assumes that the date is in SAS date format. You can use character comparison if your date is in character format.
The program data vector still contains the observation after the output statement is executed. So as long as each condition is true, the data step will write the same observation to multiple datasets. Also, I think the date comparisons need to run through 31 August if you want 12 datasets.
data want1 want2 want3 ... want12;
set have;
if date <= '20AUG2016'd then output want1;
if date <= '21AUG2016'd then output want2;
if date <= '22AUG2016'd then output want3;
.
.
.
if date <= '31AUG2016'd then output want12;
run;
It is probably better to use WHERE statements than to make separate tables. But to do either without hardcoding you need to use code generation. That is normally done using macro logic.
%macro split(start,stop);
%local i n;
%let n=%sysfunc(intck(day,&start,&stop));
%let n=%eval(&n+1);
DATA
%do i=1 %to &n;
WANT&i
%end;
;
set have ;
%do i=1 %to &n ;
if date <= %sysfunc(intnx(day,&start,&i-1)) then output WANT&i ;
%end;
run;
%mend split;
%split('20AUG2016'd,'31AUG2016'd);
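To illustrate the WHERE alternative mentioned above, here is a minimal sketch that analyses one cumulative window in place instead of creating a WANTn table. HAVE and DATE come from the question; the VALUE variable, the generated test rows, and the choice of PROC MEANS are assumptions made only so the example runs on its own.

```sas
/* Hypothetical test data: one row per day in August 2016 */
data have;
  do date = '01AUG2016'd to '31AUG2016'd;
    value = ranuni(1);   /* illustrative analysis variable */
    output;
  end;
  format date date9.;
run;

/* Same rows that WANT2 would hold, but no extra table is written */
proc means data=have;
  where date <= '21AUG2016'd;
run;
```

Changing the cutoff date in the WHERE statement gives any of the other windows without storing twelve overlapping copies of the data.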
Good afternoon,
I am decent with SAS but I've never written macros.
I have a DB where I need to break out in separate datasets ID's where the date in a field occurs. E.g. All ID's with a date in Jan 2018 would be one dataset, All ID's with a date in Feb 2018 would be another data set, so on and so forth. Field name is ZDate.
I found this which seems to do exactly what I want. However I think my date isn't in the correct format. The date I'm pulling is in a timestamp in snowflake and I'm converting it to a date with to_date. It's showing as formatted date (16FEB2020) in the original vintagedata data set but the subsequent data sets are completely blank.
%macro month;
%local mindate maxdate i date month ;
proc sql noprint;
select min(ZDate),max(ZDate)
into :mindate , :maxdate
from vintagedata
;
quit;
data
%do i=0 %to %sysfunc(intck(month,&mindate,&maxdate));
%let date=%sysfunc(intnx(month,&mindate,&i));
%let month=%sysfunc(putn(&date,monyy7.));
&month
%end;
;
set vintagedata;
%do i=0 %to %sysfunc(intck(month,&mindate,&maxdate));
%let date=%sysfunc(intnx(month,&mindate,&i));
%let month=%sysfunc(putn(&date,monyy7.));
if intnx('month',date,0)=&date then output &month ;
%end;
run;
%mend;
%month;
So I have this macro: stab_index(yearmonth,period).
Let's say that I have to run it 5 times (maybe more) with different parameters like this
%stab_index(201601,01/2016);
%stab_index(201602,02/2016);
%stab_index(201603,03/2016);
%stab_index(201604,04/2016);
%stab_index(201605,05/2016);
in order to generate an adequate dataset to run another macro: Stab_Ind_DYNAMICS.
But I don't want to run it 5 times to get the result; I would like to run all of them at once without having to fill in the parameters every time.
Can someone point me in the direction of how I would set this up?
Thanks!
This assumes your parameter values always exist within your data. If you can get your dataset down to every unique combination of yearmonth and period (how my unique dataset looks below), then you don't need to input anything, just let the data do the work which can accommodate changing data:
** create test data **;
data have0;
year = 2016;
do i=1 to 12;
temp=i;
output;
end;
run;
data have; set have0;
temp1 = strip(put(year,best4.))||strip(put(temp,z2.));
yearmonth=intnx('month', input(temp1, yymmn6.), 1)-1; /* temp1 is already character, so no put() is needed */
period=yearmonth;
format yearmonth yymmn6. period mmyys7.;
run;
** get data down to every unique combination of yearmonth and period **;
proc sort data = have out=unique(keep=yearmonth period) nodupkey;
by yearmonth period;
run;
** create a macro string dynamically using data **;
data create_macro_string; set unique;
macro_str=%nrstr("%stab_index")||"("||strip(put(yearmonth,yymmn6.))||","||strip(put(period,mmyys7.))||");";
keep yearmonth period macro_str;
run;
** put all your macros into a list **;
proc sql noprint;
select macro_str
into: macro_list separated by " "
from create_macro_string;
quit;
** call your macros **;
%put &macro_list.;
You can achieve this with another macro which loops through your list of parameters.
%let param1 = 201601 201602 201603 201604 201605;
%let param2 = 01/2016 02/2016 03/2016 04/2016 05/2016;
%macro loop();
%do i=1 %to %sysfunc(countw(&param1,%str( )));
%let thisparam1=%scan(&param1,&i,%str( ));
%let thisparam2=%scan(&param2,&i,%str( ));
%put &thisparam1 &thisparam2;
%stab_index(&thisparam1,&thisparam2);
%end;
%mend loop;
%loop;
You first need to define your lists of parameters (I called them param1 & param2 here).
Then you can loop from 1 to the number of words, retrieve the i'th parameter from each list, and use it in your stab_index macro.
In case your parameters contain spaces, you can use a separator other than a space for your lists and pass it as the second argument to the countw function (%sysfunc(countw(&param1,'-'))) and as the third argument to the scan function (%scan(&param1,&i,'-')).
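A minimal sketch of the alternative-separator idea, assuming a hypothetical list called labels whose items contain spaces and are separated by a pipe. The macro name show_labels and the list values are illustrative only; %str(|) is used here to pass the delimiter without quote characters.

```sas
/* Hypothetical list: items contain spaces, so | is the separator */
%let labels = first label|second label|third label;

%macro show_labels;
  %local i;
  /* countw and %scan both receive | as the only delimiter */
  %do i = 1 %to %sysfunc(countw(&labels,%str(|)));
    %put Item &i: %scan(&labels,&i,%str(|));
  %end;
%mend show_labels;

%show_labels
```

Each pass through the loop writes one complete item to the log, spaces included, where the %put could be replaced by a call such as %stab_index.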
I have multiple tables in a library called snap1:
cust1, cust2, cust3, etc
I want to generate a loop that gets the records' count of the same column in each of these tables and then insert the results into a different table.
My desired output is:
Table Count
cust1 5,000
cust2 5,555
cust3 6,000
I'm trying this but it's not working:
%macro sqlloop(data, byvar);
proc sql noprint;
select &byvar.into:_values SEPARATED by '_'
from %data.;
quit;
data_&values.;
set &data;
select (%byvar);
%do i=1 %to %sysfunc(count(_&_values.,_));
%let var = %sysfunc(scan(_&_values.,&i.));
output &var.;
%end;
end;
run;
%mend;
%sqlloop(data=libsnap, byvar=membername);
First off, if you just want the number of observations, you can get that trivially from dictionary.tables or sashelp.vtable without any loops.
proc sql;
select memname, nlobs
from dictionary.tables
where libname='SNAP1';
quit;
NLOBS is the number of logical observations, that is, rows not marked as deleted. It is fine for retrieving the number of rows; it only differs from the physical count if something has deleted rows in place, usually a delete in proc sql.
Second, if you're interested in the number of valid responses, there are easier non-loopy ways too.
For example, given whatever query that you can write determining your table names, we can just put them all in a set statement and count in a simple data step.
%let varname=mycol; *the column you are counting;
%let libname=snap1;
proc sql;
select cats("&libname..",memname)
into :tables separated by ' '
from dictionary.tables
where libname=upcase("&libname.");
quit;
data counts;
set &tables. indsname=ds_name end=eof; *9.3 or later;
retain count dataset_name;
if _n_=1 then count=0;
if ds_name ne lag(ds_name) and _n_ ne 1 then do;
output;
count=0;
end;
dataset_name=ds_name;
count = count + ifn(&varname.,1,1,0); *true, false, missing; *false is 0 only;
if eof then output;
keep count dataset_name;
run;
Macros are rarely needed for this sort of thing, and macro loops like you're writing even less so.
If you did want to write a macro, the easier way to do it is:
Write code to do it once, for one dataset
Wrap that in a macro that takes a parameter (dataset name)
Create macro calls for that macro as needed
That way you don't have to deal with %scan and troubleshooting macro code that's hard to debug. You write something that works once, then just call it several times.
proc sql;
select cats('%mymacro(name=',"&libname..",memname,')')
into :macrocalls separated by ' '
from dictionary.tables
where libname=upcase("&libname.");
quit;
&macrocalls.;
Assuming you have a macro, %mymacro, which does whatever counting you want for one dataset.
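As a minimal sketch of what such a macro might look like (the answer does not show %mymacro, so the body below is an assumption), the following counts the rows of one dataset and appends the result to a hypothetical table named all_counts:

```sas
/* Hypothetical body for %mymacro: count the rows of one dataset
   and append the result to ALL_COUNTS (name chosen for illustration). */
%macro mymacro(name=);
  proc sql noprint;
    create table _one as
    select "&name" as dataset_name length=41,
           count(*) as row_count
    from &name;
  quit;

  /* PROC APPEND creates the base table on the first call */
  proc append base=all_counts data=_one;
  run;
%mend mymacro;

/* Check that it works once before generating calls from dictionary.tables */
%mymacro(name=sashelp.class)
```

Because the macro takes a single dataset name, it can be tested on its own and then driven by the generated list of calls.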
* Updated *
In the future, please post the log so we can see what is specifically not working. I can see some issues in your code, particularly where your macro variables are being declared, and a select statement that is not doing anything. Here is an alternative process to achieve your goal:
Step 1: Read all of the customer datasets in the snap1 library into a macro variable:
proc sql noprint;
select memname
into :total_cust separated by ' '
from sashelp.vmember
where upcase(memname) LIKE 'CUST%'
AND upcase(libname) = 'SNAP1';
quit;
Step 2: Count the total number of obs in each data set, output to permanent table:
%macro count_obs;
%do i = 1 %to %sysfunc(countw(&total_cust) );
%let dsname = %scan(&total_cust, &i);
%let dsid=%sysfunc(open(snap1.&dsname) ); /* qualify with the library; a bare name would be looked up in WORK */
%let nobs=%sysfunc(attrn(&dsid,nobs) );
%let rc=%sysfunc(close(&dsid) );
data _total_obs;
length Member_Name $15.;
Member_Name = "&dsname";
Total_Obs = &nobs;
format Total_Obs comma8.;
run;
proc append base=Total_Obs
data=_total_obs;
run;
%end;
proc datasets lib=work nolist;
delete _total_obs;
quit;
%mend;
%count_obs;
You will need to delete the permanent table Total_Obs if it already exists, but you can add code to handle that if you wish.
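One possible way to handle that, sketched under the assumption that Total_Obs lives in the WORK library as in the code above: run a PROC DATASETS delete with the NOWARN option before calling %count_obs, so that the step is harmless when the table does not exist yet.

```sas
/* Remove any previous Total_Obs before appending fresh counts.
   NOWARN keeps the step quiet if the table is not there. */
proc datasets lib=work nolist nowarn;
  delete Total_Obs;
quit;
```

If Total_Obs is stored in a different library, the lib= value would need to change accordingly.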
If you want to get the total number of non-missing observations for a particular column, do the same code as above, but delete the 3 %let statements below %let dsname = and replace the data step with:
data _total_obs;
length Member_Name $7.;
set snap1.&dsname end=eof;
retain Member_Name "&dsname";
if(NOT missing(var) ) then Total_Obs+1;
if(eof);
format Total_Obs comma8.;
run;
(Update: Fixed %do loop in step 2)
UPDATE I've been told this isn't possible using arrays because of the way they are stored. This changes my question a bit, but the gist is still the same. How can I most efficiently generate the tables I need from a given vector of values (ex: day, week, month, year) without just repeating the code multiple times? Is there any way to simply substitute the given date value into INTNX in a loop?
Ok, this is my last question on this subject, I promise. After some good advice, I'm using the INTNX function. However, I'd like to just loop through the different categories I select and create tables. I tried this, but to no avail.
data;
array period [*] $ day week month year;
run;
%MACRO sqlloop;
proc sql;
%DO k = 1 %TO dim(&period); /* in case i decide to drop/add from array later */
%LET bucket = &period[&k];
CREATE TABLE output.t_&bucket AS (
SELECT INTX( "&bucket.", date_field, O, 'E') AS test FROM table);
%END
quit;
%MEND
%sqlloop
Sadly this doesn't work because I'm fouling up the array reference somehow. If I can get this step I'll be in good shape.
You could replace your array with a macro variable string:
%let period=day week month year;
In your macro then, you loop over the words in the macro variable:
%MACRO sqlloop;
proc sql;
%DO k = 1 %TO %sysfunc(countw(&period.)); /*fixed extra s*/
%LET bucket = %scan(&period.,&k.);
CREATE TABLE output.t_&bucket AS (
SELECT INTNX( "&bucket.", date_field, 0, 'E') AS test FROM table);
%END;
quit;
%MEND;
%sqlloop
Edit: you apparently forgot some semicolons. :p
How can you create a SAS data set from another dataset using only the last n observations from the original dataset? This is easy when you know the value of n. If I don't know 'n', how can this be done?
This assumes you have a macro variable that says how many observations you want. NOBS tells you the number of observations in the dataset currently without reading the whole thing.
%let obswant=5;
data want;
set sashelp.class nobs=obscount;
if _n_ gt (obscount-&obswant.);
run;
Using Joe's example of a macro variable to specify the number of observations you want, here is another answer:
%let obswant = 10;
data want;
do _i_=max(1, nobs-(&obswant-1)) to nobs; /* max() guards against asking for more rows than exist */
set have point=_i_ nobs=nobs;
output;
end;
stop; /* Needed to stop data step */
run;
This should perform better since it only reads the specific observations you want.
If the dataset is large, you might not want to read the whole dataset. Instead you could try a construction that reads the total number of observations in the dataset first. So if you want the last &number observations:
data t;
input x;
datalines;
1
2
3
4
;
%let dsid=%sysfunc(open(t));
%let num=%sysfunc(attrn(&dsid,nlobs));
%let rc=%sysfunc(close(&dsid));
%let number = 2;
data tt;
set t (firstobs = %eval(&num.-&number.+1));
run;
For the sake of variety, here's another approach (not necessarily a better one)
%let obswant=5;
proc sql noprint;
select nlobs-&obswant.+1 into :obscalc
from dictionary.tables
where libname='SASHELP' and upcase(memname)='CLASS';
quit;
data want;
set sashelp.class (firstobs=&obscalc.);
run;
You can achieve this using the NOBS= option and the automatic variable _N_. First, create a temporary variable (_nobs_ below) to store the total number of observations. Then compare the automatic variable _N_ to _nobs_.
data a;
set sashelp.class nobs=_nobs_;
if _N_ gt _nobs_ -5;
run;