SAS: Drop column in a if statement - sas

I have a dataset called have with one entry with multiple variables that look like this:
message reference time qty price
x 101 35000 100 .
the above dataset changes every time in a loop where message can be ="A". If the message="X" then this means to remove 100 qty from the MASTER set where the reference number equals the reference number in the MASTER database. The price=. is because it is already in the MASTER database under reference=101. The MASTER database aggregates all the available orders at some price with quantity available. If in the next loop message="A" then the have dataset would look like this:
message reference time qty price
A 102 35010 150 500
then this mean to add a new reference number to the MASTER database. In other words, to append the line to the MASTER.
I have the following code in my loop to update the quantity in my MASTER database when there is a message X:
data b.master;
modify b.master have(where=(message="X")) updatemode=nomissingcheck;
by order_reference_number;
if _iorc_ = %sysrc(_SOK) then do;
replace;
end;
else if _iorc_ = %sysrc(_DSENMR) then do;
output;
_error_ = 0;
end;
else if _iorc_ = %sysrc(_DSEMTR) then do;
_error_ = 0;
end;
else if _iorc_ = %sysrc(_DSENOM) then do;
_error_ = 0;
end;
run;
I use the replace to update the quantity. But since my entry for price=. when message is X, the above code sets the price='.' where reference=101 in the MASTER via the replace statement...which I don't want. Hence, I prefer to delete the price column is message=X in the have dataset. But I don't want to delete column price when message=A since I use this code
proc append base=MASTER data=have(where=(msg_type="A")) force;
run;
Hence, I have this code price to my Modify statement:
data have(drop=price_alt);
set have; if message="X" then do;
output;end;
else do; /*I WANT TO MAKE NO CHANGE*/
end;run;
but it doesn't do what I want. If the message is not equal X then I don't want to drop the column. If it is equal X, I want to drop the column. How can I adapt the code above to make it work?

Its a bit of a strange request to be honest, such that it raises questions about whether what you're doing is the best way of doing it. However, in the spirit of answering the question...
The answer by DomPazz gives the option of splitting the data into two possible sets, but if you want code down the line to always refer to a specific data set, this creates its own complications.
You also can't, in the one data step, tell SAS to output to the "same" data set where one instance has a column and one instance doesn't. So what you'd like, therefor, is for the code itself to be dynamic, so that the data step that exists is either one that does drop the column, or one that does not drop the column, depending on whether message=x. The answer to this, dynamic code, like many things in SAS, resolves to the creative use of macros. And it looks something like this:
/* Just making your input data set */
data have;
message='x';
time=35000;
qty=1000;
price=10.05;
price_alt=10.6;
run;
/* Writing the macro */
%macro solution;
%local id rc1 rc2;
%let id=%sysfunc(open(work.have));
%syscall set(id);
%let rc1=%sysfunc(fetchobs(&id, 1));
%let rc2=%sysfunc(close(&id));
%IF &message=x %THEN %DO;
data have(drop=price_alt);
set have;
run;
%END;
%ELSE %DO;
data have;
set have;
run;
%END;
%mend solution;
/* Running the macro */
%solution;

Try this:
data outX(drop=price_alt) outNoX;
set have;
if message = "X" then
output outX;
else
output outNoX;
run;
As #sasfrog says in the comments, a table either has a column or it does not. If you want to subset things where MESSAGE="X" then you can use something like this to create 2 data sets.

Related

Insert text into all cells of first column in a sas dataset

I've output 'Moments' from Proc Univariate to datasets. Many.
Example: Moments_001.sas7bdat through to Moments_237.sas7bdat
For the first column of each dataset (new added first column, and probably new dataset, as opposed to the original) I would like to have a particular text in every cell going down to bottom row.
The exact text would be the name of the respective dataset file: say, "Moments_001".
I do not have to 'grab' the filename, per se, if that's not possible. As I know what the names are already, I can put that text into the procedure. However, grabbing the filenames, if possible, would be easier from my standpoint.
I'd greatly appreciate any help anyone could provide to accomplish this.
Thanks,
Nicholas Kormanik
Are you looking for the INDSNAME option of the SET statement? You need to define two variables because the one generated by the option is automatically dropped.
data want;
length moment dsn $41 ;
set Moments_001 - Moments_237 indsname=dsn ;
moment=dsn;
run;
I think something along these lines should be what you're after. Assuming you have a list of moments, you can loop through it and add a new variable as the first column of each dataset.
%let list_of_moments = moments_001 moments_002 ... moments_237;
%macro your_macro;
%do i = 1 %to %sysfunc(countw(&list_of_moments.));
%let this_moment = %scan(&list_of_moments., &i.);
data &this_moment._v2;
retain new_variable;
set &this_moment.;
new_variable = "&this_moment.";
run;
%end;
%mend your_macro;
%your_macro;
The brute force entering of text into column 1 looks like this:
data moments_001;
length text $ 16;
set moments_001;
text="Moments_001";
run;
You could also write a macro that would loop through all 237 data sets and insert the text.
UNTESTED CODE
%macro do_all;
%do i=1 %to 237;
%let num = %sysfunc(putn(&i,z3.));
data moments_#
length text & 16;
set moments_#
text="Moments_&num";
run;
%end;
%mend
%do_all
It seems to me (not knowing your problem) that if you use PROC UNIVARIATE with the BY option, then you wouldn't need 237 different data sets, all of your output would be in one data set and the BY variable would also be in the data set. Does that solve your problem?

Check if a value exists in a dataset or column

I have a macro program with a loop (for i in 1 to n). With each i i have a table with many columns - variables. In these columns, we have one named var (who has 3 possible values: a b and c).
So for each table i, I want to check his column var if it exists the value "c". If yes, I want to export this table into a sheet of excel. Otherwise, I will concatenate this table with others.
Can you please tell me how can I do it?
Ok, in your macro at step i you have to do something like this
proc sql;
select min(sum(case when var = 'c' then 1 else 0 end),1) into :trigger from table_i;
quit;
then, you will get macro variable trigger equal 1 if you have to do export, and 0 if you have to do concatenetion. Next, you have to code something like this
%if &trigger = 1 %then %do;
proc export data = table_i blah-blah-blah;
run;
%end;
%else %do;
data concate_data;
set concate_data table_i;
run;
%end;
Without knowing the whole nine yard of your problem, I am at risk to say that you may not need Macro at all, if you don't mind exporting to .CSV instead of native xls or xlsx. IMHO, if you do 'Proc Export', meaning you can't embed fancy formats anyway, you 'd better off just use .CSV in most of the settings. If you need to include column headings, you need to tap into metadata (dictionary tables) and add a few lines.
filename outcsv '/share/test/'; /*define the destination for CSV, change it to fit your real settings*/
/*This is to Cat all of the tables first, use VIEW to save space if you must*/
data want1;
set table: indsname=_dsn;
dsn=_dsn;
run;
/*Belowing is a classic 2XDOW implementation*/
data want;
file outcsv(test.csv) dsd; /*This is your output CSV file, comma delimed with quotes*/
do until (last.dsn);
set want1;
by dsn notsorted; /*Use this as long as your group is clustered*/
if var='c' then _flag=1; /*_flag value will be carried on to the next DOW, only reset when back to top*/
end;
do until (last.dsn);
set want1;
by dsn notsorted;
if _flag=1 then put (_all_) (~); /*if condition meets, output to CSV file*/
else output; /*Otherwise remaining in the Cat*/
end;
drop dsn _flag;
run;

Running a SAS macro for every day in a range of dates

I am an intermediate user of SAS, but I have limited knowledge of arrays and macros. I have a set of code that prompts the user to enter a date range. For example, the user might enter December 1, 2015-December 5,2015. For simplicity, imagine the code looks like:
data new; set old;
if x1='December 1, 2015'd then y="TRUE";
run;
I need to run this same code for every day in the date prompt range, so for the 1st, 2nd, 3rd, 4th, and 5th. My thought was to create an array that contains the dates, but I am not sure how I would do that. My second thought was to create a macro, but I can't figure out out to feed a list through a macro.
Also, just FYI, the code is a lot longer and more complicated than just a data step.
The following macro can be used as a framework for your code:
%MACRO test(startDate, endDAte);
%DO i=&startDate %to &endate;
/* data steps go here */
/* example */
DATA test;
SET table;
IF x1 = &i THEN y = "true";
RUN;
%END;
%MEND;
Look into call execute to call your macro and a data null step using a do loop to loop through the days. Getting the string correct for the call execute can sometimes be tricky, but worth the effort overall.
data sample;
do date='01Jan2014'd to '31Jan2014'd;
output;
end;
run;
%macro print_date(date);
proc print data=sample;
where date="&date"d;
format date date9.;
run;
%mend;
%let date_start=05Jan2014;
%let date_end=11Jan2014;
data _null_;
do date1="&date_start"d to "&date_end"d by 1;
str='%print_date('||put(date1, date9.)||');';
call execute(str);
end;
run;

Sas renaming variable with do loop and if then condition

I'm trying to rename variables x0 - x40 so that x0 will become y_q1_2014, x1 will become y_q4_2013, x2 will become y_q3_2013 and so on till x40 that will become y_q1_2004.
I want my new variable to display in its name the quarter and year of the observation. Now I have the following macro in SAS that is not working properly: the values of j and k are not changing according to the if - then condition. What am i doing wrong?
%macro rename(data);
%let j=1;
%let k=2014;
%do i = 0 %to 40 %by 1;
data mydata;
set &data.;
y_q&j._&k. = x&i.;
if &j.=1 then do k = &k.-1 and j = 4;
else do j=&j.-1;
run;
%end;
%mend;
This will likely be easier to do using the data step rather than a macro loop (as most things are!).
In this case, you have two problems:
How to mass-rename variables
How to convert x# to y_q#_####
An easy way to rename variables is to create a dataset with the variable names as rows, then create the new variable names. You can then pull that into a rename list very easily.
So something like this would do that.
*Create dataset with names in it.
data names;
set sashelp.vcolumn;
where memname='HAVE' and libname='WORK' and name =: 'X';
keep name;
run;
*some operation to determine new_name needs to go in that dataset also - coming later;
*Now create a list of rename macro calls.
proc sql;
select cats('%rename(var=',name,',newvar=',new_name,')')
into :renamelist separated by ' '
from names;
quit;
*Here is the simple rename macro.
%macro rename(var=,newvar=);
rename &var.=&newvar.;
%mend rename;
*Now do the renames. Can also go in a data step.
proc datasets lib=work;
modify have;
&renamelist.
quit;
How to convert is a more interesting question, and begs the question: is this a one time thing, or is this a repeated process? If it's a repeated process, does X0 always mean the most recent quarter in the data, or does it always mean q1 2014?
Assuming it is always the most recent quarter, you can use intnx to do this.
%let initdate='01JAN2014'd;
data have;
do x = 0 to 40;
qtr = intnx('QUARTER',&initdate,-1*x);
format qtr YYQ.;
output;
end;
run;
You can thus use this code (the portion inside the do loop, operating on an x that you pull out of the name in the dataset) in the earlier names data step to create new_name however you want. You might use the YYQ format in your new name if you have flexibility here (as it's standard, and the easiest solution). Otherwise, you would want to pull this apart either using put and then substring, or quarter() and year() functions off of the date variable here.

An efficient way to update multiple columns using join

I am really struggling with it guys.
The table needs to be updates has ~15M rows and ~200 columns.
I need to update few columns using a work table table.
This is (partly) what I need to do:
%macro condition;
%if &row_count>0 %then %do;
data _null_;
set W4TWGKJ6 end=final;
if _n_ = 1 then call execute("proc sql ;");
call execute
("update dds.insurance_policy set X_STORNO_BY_VERSION="||TOSNUM||" where policy_no='"||cats(polid)||"' and X_INSURANCE_PRODUCT_CD='"||cats(prodid)||"'
and X_INSURER_SERIAL_NO = "||X_INSURER_SERIAL_NO||" and x_source_system_cd ="||'"5"'||" and x_source_system_category_cd ="||'"5"'||" and x_current_ind = "||'"Y"'||";,
update dds.insurance_policy set STATUS_CHANGE_DT="||ISSUE_DT||" where policy_no='"||cats(polid)||"' and X_INSURANCE_PRODUCT_CD='"||cats(prodid)||"'
and X_INSURER_SERIAL_NO = "||X_INSURER_SERIAL_NO||" and x_source_system_cd ="||'"5"'||" and x_source_system_category_cd ="||'"5"'||" and x_current_ind = "||'"Y"'||";");
if final then call execute('quit;'); run;
%end;
%mend;
%condition;
I first check if there are rows in table (&row_count)
if there are,
I update 2 columns (I need to update 5, I just cut them from the example)
using a work table called W4TWGKJ6.
This update takes forever.
In fact, I stopped the process every single time, as it worked for hours without returning anything....
Does anyone knows a better solution for this problem?
Thanks in advance,
Gal.
I'd suggest using MODIFY statement in datastep:
You should have same column names in both tables for BY variables and have them sorted by those variables.
data dds.insurance_policy;
modify
dds.insurance_policy
W4TWGKJ6 (keep= POLICY_NO X_INSURER_SERIAL_NO /* key variables */
X_STORNO_BY_VERSION STATUS_CHANGE_DT /* ... other variables from source to update target */
updatemode=nomissingcheck;
by POLICY_NO X_INSURER_SERIAL_NO;
if _iorc_ = %sysrc(_SOK) then do;
* Update row ;
replace;
end;
else _error_ = 0;
run;
See SAS: How not to overwrite a dataset when the "where" condition in a "Modify" statement does not hold? for complete reference of iorc return values.