I have a SAS data set. In it I have some variables following a pattern:
-W 51 Sales
-W 52 Sales
-W 53 Sales
and so on.
Now I want to rename all of these variables dynamically, so that W 51 is replaced by the starting date of that week and the new name becomes, for example, 5/2/2013 Sales.
The reason I want to rename them is that I have sales data for all 53 weeks in a year, and the data set would be easier for me to understand if the variable names held the starting date of each week instead of W(week_no) Sales.
Is there any way I can do that in SAS?
You really don't want to rename your variables. You may think you do, but it'll just bite you eventually.
What you can do instead is give them descriptive labels. This can be done via proc datasets.
proc datasets library=<lib>;
modify <dataset>;
label <variable> = '5/2/2013 sales';
run;
quit;
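If you want the labels generated rather than typed, something along these lines might work. It is an untested sketch that assumes the variables are named w1_sales through w53_sales (all 53 must exist, or PROC DATASETS will complain) and that week 1 starts on the Sunday on or before 1 January. The macro name label_weeks and its parameters are made up for illustration.

```sas
%macro label_weeks(lib, ds, year);
  proc datasets library=&lib. nolist;
    modify &ds.;
    /* one LABEL statement; each loop pass adds a variable="label" pair */
    label
    %do i = 1 %to 53;
      %let wk = %sysfunc(intnx(week, "01jan&year."d, %eval(&i. - 1)));
      w&i._sales = "%sysfunc(putn(&wk., mmddyy10.)) sales"
    %end;
    ;
  quit;
%mend label_weeks;

/* hypothetical call: library WORK, dataset SALES, calendar year 2013 */
%label_weeks(work, sales, 2013);
```

The variable names stay short and stable for code, while procedures that print labels show the week's start date.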
Just for fun, let's assume you want to do this anyway. The safest thing to do is create a copy of the dataset for your output.
This code assumes your variables are named like w1_sales and that the new names will look like _30DEC2012_sales (the leading underscore is there because a SAS variable name can't begin with a digit).
data newDataSet;
set oldDataSet;
run;
%MACRO rename_vars(mdataset,year);
data &mdataset.;
set &mdataset.;
%do i = 1 %to 53;
%* start of week i, counting from the Sunday on or before 1 January. For Monday-based weeks use week.2 as the interval;
%let weekStartDate = %sysfunc(intnx(week,"01jan&year."d,%eval(&i.-1)));
%* formats as ddMONyyyy. Substitute whatever format you want;
%let weekStartDateFormatted = %sysfunc(putn(&weekStartDate.,date9.));
rename w&i._sales = _&weekStartDateFormatted._sales;
%end;
run;
%MEND rename_vars;
%rename_vars(newDataSet,2013);
I don't have time to test this right now, so somebody let me know if I screwed it up somewhere. This should at least get you going, though. Alternatively, you can post some code that reads a small sample dataset with variables like those, and I'll debug it (genericize it a bit if you need to avoid sharing proprietary information).
I am calculating around 12 metrics (Say Sales for each month individually for latest 12 months). Every month I need to go manually and change the month everywhere. If there is any way to automate it, it would be very helpful. My code is
proc sql;
create table inter.calls as
select a.district_name,
sum(01JAN2016,01FEB2016,01MAR2016)/terr_count as q1_workingdays,
sum(01APR2016,01JUN2016,01MAY2016)/terr_count as q2_workingdays,
sum(01AUG2016,01SEP2016,01JUL2016)/terr_count as q3_workingdays,
sum(01NOV2016,01DEC2016,01OCT2016)/terr_count as q4_workingdays
from inter.calls_made_bymon_reg3 a left join inter.territory_count b
on a.district_name=b.district_name;
quit;
Now when I refresh for JAN2017, I need to change the code so that it covers FEB2016 to JAN2017, the latest 12 months. Changing the code manually every time is tedious.
I will be very thankful if I get any help!!
It's difficult to understand your problem completely, because as Reeza mentioned the arguments in your sum() functions are invalid (the names begin with a number).
However, I do understand the desire not to manually change dates all over the place. You might find a macro like the below helpful:
%macro prev_month(n);
%let latest_date = %sysfunc(intnx(month,%sysfunc(inputn(&latest_month.,monyy7.)),-&n.));
days_%sysfunc(putn(&latest_date.,monyy7.))
%mend prev_month;
And you would then use it in your query like so:
%let latest_month = JAN2017;
proc sql;
create table inter.calls as
select a.district_name,
sum(%prev_month(11),%prev_month(10),%prev_month(9))/terr_count as q1_workingdays,
sum(%prev_month(8) ,%prev_month(7) ,%prev_month(6))/terr_count as q2_workingdays,
sum(%prev_month(5) ,%prev_month(4) ,%prev_month(3))/terr_count as q3_workingdays,
sum(%prev_month(2) ,%prev_month(1) ,%prev_month(0))/terr_count as q4_workingdays
from inter.calls_made_bymon_reg3 a left join inter.territory_count b
on a.district_name=b.district_name;
quit;
Hope it helps.
I'm trying to have a SAS data set automatically limit the results based on date, but I don't want to have to change the date manually through a %LET statement.
If I try
%let BeginDate = %EVAL(MDY(MONTH(TODAY()), 1, YEAR(TODAY()));
I get an "Open code statement recursion detected" error. I've tried &SYSFUNC and &SYSEVALF but had no luck with those either. It seems like this should be much simpler. Any suggestions would be appreciated.
Thanks!
@Joe's method is the most straightforward. Additionally, if you wanted to do this in a data step with similar syntax, you could do:
data _null_;
call symputx('BeginDate_ds',mdy(month(today()),1,year(today())));
run;
%put &BeginDate_ds.;
Depending on what you're doing, you either don't need anything, or you need %SYSFUNC.
If you want to have &begindate evaluate to an actual date value, you would use %SYSFUNC.
However, you have five functions there - that's going to require a bunch of sysfuncs, though I think we can do two not five.
%let begindate = %sysfunc(intnx(MONTH,%sysfunc(today()),0,b));
%put &begindate;
We use INTNX with the MONTH and B(eginning) options to tell SAS to go ahead 0 months (so current month) and to go to the Beginning of that month. A second SYSFUNC grabs TODAY(). You could simplify this more:
%let begindate = %sysfunc(intnx(MONTH,"&sysdate."d,0,b));
%put &begindate;
&SYSDATE is a macro variable that stores the system date when SAS was started up; so only use that if you're okay with that (i.e., if SAS likely/definitely started up today).
With SYSFUNC don't forget that you need to drop quotation marks, with the one big exception of the date constant above - that is okay to use them - but note "MONTH" and "b" are not quoted.
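To tie this back to limiting results by date, here is a rough sketch of how &begindate could be used. The dataset have and the date variable txn_date are placeholders I made up; the macro variable holds the raw SAS date number, so it can be compared directly against a date variable.

```sas
/* first day of the current month, stored as a SAS date number */
%let begindate = %sysfunc(intnx(MONTH,%sysfunc(today()),0,b));

/* print it in readable form to check the value */
%put begindate as a date: %sysfunc(putn(&begindate,date9.));

/* hypothetical input: dataset HAVE with a numeric date variable TXN_DATE */
data recent;
  set have;
  where txn_date >= &begindate;
run;
```

Because the value is recomputed on every run, nothing has to be edited by hand when the month rolls over.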
I have the following code:
ods tagsets.excelxp file = 'G:\CPS\myworkwithoutmissing.xml'
style = printer;
proc tabulate data = final;
Class Year Self_Emp_Inc Self_Emp_Uninc Self_Emp Multi_Job P_Occupation Full_Part_Time_Status;
table Year, P_Occupation*n;
table Year, (P_Occupation*Self_Emp_Inc)*n;
table Year, (Self_Emp_Inc*P_Occupation)*n;
run;
ods tagsets.excelxp close;
When I run this code, I get the following error message:
WARNING: A class, frequency, or weight variable is missing on every observation.
WARNING: A class, frequency, or weight variable is missing on every observation.
WARNING: A class, frequency, or weight variable is missing on every observation.
Now in order to circumvent this issue, I add the "missing" option at the end of the class statement such that:
class year self_emp_inc ....... Full_Part_Time_Status/ missing;
This fixes the problem in that it no longer gives me the warning and it creates the table. However, my table now also counts the missing values, which I do not want. For example, my variable self_emp_inc has values of 1 and . (for missing). When I run the code with the missing option, I get a count of P_Occupation for all the missing values as well, but I only want the count for when the value of self_emp_inc is 1. How can I accomplish that?
This is one of those frustrating things in SAS that for some reason SAS hasn't given us a "good" option to work around. Depending on what you're working with, there are a few solutions.
The real problem here is not that you have missings - in a 1x1 table (1 var by 1 var), excluding missings is what you want. It's because you're calling for multiple tables and each table is affected by missings in the class variables in the other table.
As such, oftentimes the easiest answer is simply to split the tables into multiple proc tabulate statements. This might occasionally be too complicated or too onerous in terms of runtime, but I suspect the majority of the time this is the best solution - it often is for me, anyway.
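Using the variables from the question, splitting could look roughly like this. It is a sketch only: each step lists just the class variables its tables use, so missing values in the other variables no longer knock out observations.

```sas
ods tagsets.excelxp file = 'G:\CPS\myworkwithoutmissing.xml'
  style = printer;

/* table that only needs Year and P_Occupation */
proc tabulate data = final;
  class Year P_Occupation;
  table Year, P_Occupation*n;
run;

/* tables that also need Self_Emp_Inc; rows where it is missing are dropped here only */
proc tabulate data = final;
  class Year P_Occupation Self_Emp_Inc;
  table Year, (P_Occupation*Self_Emp_Inc)*n;
  table Year, (Self_Emp_Inc*P_Occupation)*n;
run;

ods tagsets.excelxp close;
```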
Since you're only working with n, you could instead construct the tabulation with the missings, output to a dataset, then filter them out and re-print or export that dataset. That's the easiest solution, typically.
How exactly you want to do this of course depends on what exactly you want. For example:
data test_cars;
set sashelp.cars;
if _n_=5 then call missing(make);
if _n_=7 then call missing(model);
if _n_=10 then call missing(type);
if _n_=13 then call missing(origin);
run;
proc tabulate data=test_cars out=test_tabulate(rename=n=count);
class make model type origin/missing;
tables (make model type),origin*n;
run;
data test_tabulate_want;
set test_tabulate;
if cmiss(of make model type origin)>2 then delete;
length colvar $200;
colvar = coalescec(of make model type);
run;
proc tabulate data=test_tabulate_want missing;
class colvar origin/order=data;
var count;
tables colvar,origin*count*sum;
run;
This isn't perfect, though it can be made a lot better with some more work on the formatting - this is just a quick example.
If you're using percents, of course, this doesn't exactly work. You either need to refactor the percents in that data step - which is a bit of work, but doable - or you need separate tabulates for each class variable.
I want to perform some regressions and I would like to count the number of nonmissing observations for each variable. But I don't know yet which variables I will use. I've come up with the following solution, which does not work. Any help?
Here I basically put each one of my explanatory variables in a variable. For example,
var1 var2 -> w1 = var1, w2 = var2. Notice that I don't know in advance how many variables I have, so I leave room for ten variables.
Then I store the potential variables using symput.
data _null_;
cntw=countw("&parameters");
i = 1;
array w{10} $15.;
do while(i <= cntw);
w[i]= scan("&parameters",i, ' ');
i = i +1;
end;
/* store a variable globally*/
do j=1 to 10;
call symput("explanVar"||left(put(j,3.)), w(j));
end;
run;
My next step is to run a proc sql using the variables I've stored. It does not work if I have fewer than 10 variables.
proc sql;
select count(&explanVar1), count(&explanVar2),
count(&explanVar3), count(&explanVar4),
count(&explanVar5), count(&explanVar6),
count(&explanVar7), count(&explanVar8),
count(&explanVar9), count(&explanVar10)
from estimation
;quit;
Can this code work with less than 10 variables?
You haven't provided the full context for this project, so it's unclear if this will work for you - but I think this is what I'd do.
First off, you're in SAS, use SAS where it's best - counting things. Instead of the PROC SQL and the data step, use PROC MEANS:
proc means data=estimation n;
var &parameters.;
run;
That, without any extra work, gets you the number of nonmissing values for all of your variables in one nice table.
Secondly, if there is a reason to do the PROC SQL, it's probably a bit more logical to structure it this way.
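%macro count_params; /* %do loops are only valid inside a macro definition */
proc sql;
select
%do i = 1 %to %sysfunc(countw(&parameters.));
count(%scan(&parameters.,&i.) ) as Parameter_&i., /* or could reuse the %scan result to name this better*/
%end; count(1) as Total_Obs
from estimation;
quit;
%mend count_params;
%count_params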
The final Total Obs column is useful to simplify the code (dealing with the extra comma is mildly annoying). You could also put it at the start and prepend the commas.
You finally could also drive this from a dataset rather than a macro variable. I like that better, in general, as it's easier to deal with in a lot of ways. If your parameter list is in a data set somewhere (one parameter per row, in the dataset "Parameters", with "var" as the name of the column containing the parameter), you could do
proc sql;
select cats('%countme(var=',var,')') into :countlist separated by ','
from parameters;
quit;
%macro countme(var=);
count(&var.) as &var._count
%mend countme;
proc sql;
select &countlist from estimation;
quit;
This I like the best, as it is the simplest code and is very easy to modify. You could even drive it from a contents of estimation, if it's easy to determine what your potential parameters might be from that (or from dictionary.columns).
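For the dictionary.columns variant, a minimal sketch might look like the following. It assumes estimation lives in WORK, that you want every numeric column counted, and that the column names are short enough for name_count to stay within the 32-character limit.

```sas
proc sql noprint;
  /* build "count(x) as x_count" for each numeric column of WORK.ESTIMATION */
  select 'count(' || strip(name) || ') as ' || strip(name) || '_count'
    into :countlist separated by ', '
  from dictionary.columns
  where libname = 'WORK'          /* libname and memname are stored in upper case */
    and memname = 'ESTIMATION'
    and type = 'num';
quit;

proc sql;
  select &countlist from estimation;
quit;
```

This version skips the helper macro entirely, at the cost of a slightly longer select expression.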
I'm not sure about your SAS macro, but the SQL query will work with these two notes:
1) If you don't follow your COUNT() functions with an alias, as in "COUNT(...) AS VAR1", your results will not have field headings. If that's OK with you, then you may not need to worry about it. But if you export the data, it will help to name the columns by adding "... AS MY_NAME".
2) For observations with fewer than 10 variables, the query will return NULL values. So don't worry about not getting all of the results with what you have, because as long as the table you're querying has space for 10 variables (10 separate fields), you will get data back.
I tried to recode the missing values but instead lost all my other variables within a dataset
BEFORE:
AFTER:
data work.newdataset;
if (year =.) then year = 2000;
run;
You are missing the SET statement.
data want;
set have;
myvar=5;
run;
will create a new dataset, want, from have, with the new variable value applied (or the recode or whatever). You could also do
data have;
set have;
myvar=5;
run;
That would replace have with itself plus the recode/whatever. This is actually less common in SAS; it is often preferable to do all recodes in one step, but to create a new dataset (so that the code is reversible easily).
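Applied to the code in the question, the fix would look something like this. The input name work.olddataset is a placeholder, since the question doesn't say what the source dataset is called.

```sas
data work.newdataset;
  set work.olddataset;            /* placeholder name: read the existing rows and variables */
  if year = . then year = 2000;   /* recode missing year */
run;
```

Without the SET statement the step reads nothing, so the output has a single observation containing only year.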