SAS: insert table into HTML email

Is it possible to loop through the records of a table to populate an html email without repeating the beginning and the end of the email?
With this example I get a mail with 5 tables of 1 row each (because WORK.MyEmailTable is a table of 5 records and set creates a loop in the data step):
data _null_;
file mymail;
set WORK.MyEmailTable;
put '<html><body><table>';
***loop through all records;
put '<tr>';
put %sysfunc(cats('<td>',var1,'</td>'));
put %sysfunc(cats('<td>',var2,'</td>'));
put %sysfunc(cats('<td>',var3,'</td>'));
put '</tr>';
put '</table></body></html>';
run;
And I'm looking to have 1 table of 5 rows.
I don't know if there is a way to avoid writing the beginning and the end of the mail on every iteration when you use set in the data step.
(Let me know if it's not clear I'll update.)
Thank you,

You can use the _n_ automatic datastep variable to let you know when you are on the first observation, and the set statement option end= to know that you are on the last observation:
data _null_;
file mymail;
set WORK.MyEmailTable end=eof;
if _n_ eq 1 then do;
put '<html><body><table>';
end;
/*loop through all records*/
put '<tr>';
put %sysfunc(cats('<td>','_n_=',_n_,' eof=',eof,' ',var1,'</td>'));
put %sysfunc(cats('<td>','_n_=',_n_,' eof=',eof,' ',var2,'</td>'));
put %sysfunc(cats('<td>','_n_=',_n_,' eof=',eof,' ',var3,'</td>'));
put '</tr>';
if eof then do;
put '</table></body></html>';
end;
run;
I've added the values _n_ and eof to the output so you can see clearly how they work.

Rob's method is pretty much the standard, but there is another option if you prefer scripting an explicit loop (which can be more comfortable for non-SAS programmers to read). This will function exactly like Rob's answer, and may well compile to the same machine code even.
data _null_;
file mymail;
put '<html><body><table>';
do _n_ = 1 by 1 until (eof);
/*loop through all records*/
set WORK.MyEmailTable end=eof;
put '<tr>';
put %sysfunc(cats('<td>',var1,'</td>'));
put %sysfunc(cats('<td>',var2,'</td>'));
put %sysfunc(cats('<td>',var3,'</td>'));
put '</tr>';
end;
put '</table></body></html>';
stop;
run;
_n_ here doesn't have any special meaning (like it does in Rob's answer); it's used by convention since this way it does effectively have the same meaning as it does normally.
You need to use the end=eof to create a variable eof which is true on the last record of the dataset; otherwise the data step will terminate prematurely (before actually hitting your final statement). You also need the stop to tell it to not go back to the start - otherwise it will, and will put a new starting section, then terminate instantly when it hits the set. (Try it and see.)
do _n_=1 by 1 until (eof); is a SAS-specific way of writing an incremental loop. It is similar to the C/C++ for (_n_=1; !eof; _n_++), except that until checks the condition at the bottom of the loop, so the body always runs at least once. It gives you an auto-incremented do loop with a separate, unrelated stopping criterion.

Related

SAS Do Loop is Omitting Rows in Processing

I have the following code. I am trying to test a paragraph (descr) for a list of keywords (key_words). When I execute this code, the log reads in all the variables for the array, but will only test 2 of the 20,000 rows in the do loop (do i=1 to 100 and on). Any suggestions on how to fix this issue?
data JE.KeywordMatchTemp1;
set JE.JEMasterTemp end=eof;
if _n_ = 1 then do i = 1 by 1 until (eof);
set JE.KeyWords;
array keywords[100] $30 _temporary_;
keywords[i] = Key_Words;
end;
match = 0;
do i = 1 to 100;
if index(descr, keywords[i]) then match = 1;
end;
drop i;
run;
Your problem is that your end=eof is in the wrong place.
This is a trivial example calculating the 'rank' of the age value for each SASHELP.CLASS respondent.
See where I put the end=eof. That's because you need to use it to control the array filling operation. Otherwise, your loop do i = 1 by 1 until (eof); doesn't do what it says it should: it never terminates at eof, because eof is defined on the first set statement and so is not true while the array is being filled. Instead the loop ends because the second set statement reads beyond the end of its dataset, which is specifically what you don't want.
That's what the end=eof is doing: it's preventing you from trying to pull a row when the array filling dataset is finished, which terminates the whole data step. Any time you see a data step terminate after exactly 2 iterations, you can be confident that's what the problem is likely to be - it is a very common issue.
data class_ranks;
set sashelp.class; *This dataset you are okay iterating over until the end of the dataset and then quitting the data step, like a normal data step.;
array ages[19] _temporary_;
if _n_=1 then do;
do _i = 1 by 1 until (eof); *iterate until the end of the *second* set statement;
set sashelp.class end=eof; *see here? This eof is telling this loop when to stop. It is okay that it is not created until after the loop is.;
ages[_i] = age;
end;
call sortn(of ages[*]); *ordering the ages loaded by number so they are in proper order for doing the trivial rank task;
end;
age_rank = whichn(age,of ages[*]); *determine where in the list the age falls. For a real version of this task you would have to check whether this ever happens, and if not you would have to have logic to find the nearest point or whatnot.;
run;
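Applied to the code in the question, the same fix might look like the sketch below. It assumes JE.KeyWords has at least one row and no more than 100 rows (the array size from the question). The strip() call, the check for blank array slots, and dropping Key_Words are additions that are not in the original code.

```sas
data JE.KeywordMatchTemp1;
  set JE.JEMasterTemp;                 /* main dataset: iterate normally, no end= needed */
  array keywords[100] $30 _temporary_;
  if _n_ = 1 then do i = 1 by 1 until (eof);
    set JE.KeyWords end=eof;           /* end= goes on the dataset that fills the array */
    keywords[i] = Key_Words;
  end;
  match = 0;
  do i = 1 to 100;
    /* assumption: skip unfilled slots and ignore trailing blanks in each keyword */
    if not missing(keywords[i]) then
      if index(descr, strip(keywords[i])) then match = 1;
  end;
  drop i Key_Words;                    /* Key_Words would otherwise hold the last keyword on every row */
run;
```

With end=eof on the second set statement, the fill loop stops after reading the last keyword instead of reading past the end of JE.KeyWords, so the data step goes on to process every row of JE.JEMasterTemp.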

SAS Data Step if statement not working

With the below code the email always gets sent. 1 obviously does not equal 0, but yet it still runs. I have tried removing the do part but still get the same issue.
data _null_;
set TestTable;
if 1 = 0 then do;
file sendit email
to=("email@gmail.com")
subject="Some Subject Line";
end;
run;
While the file statement is considered an executable statement (and so you would expect it not to run behind a false if condition), that is not the whole story. SAS sees the file statement during compilation and knows that it needs to open a file to write to, so part of its effect happens at compile time. That's what is happening here: SAS creates the file (in this case, the email) as a result of the compiler's activity, then doesn't populate it with anything, but still has an email at the end of the day.
The same thing happens with any other file - like so:
data _null_;
set sashelp.class;
if 0 then do;
file "c:\temp\test_non_zero.txt";
put name $;
end;
run;
A blank file is created by that code.
If you need to conditionally send emails, I would recommend wrapping your email code in a macro, and then calling that macro using call execute or similar from the dataset. Like so:
%macro write_email(parameters);
data _null_;
file sendit email
to=("email@gmail.com")
subject="Some Subject Line";
run;
%mend write_email;
data _null_;
set TestTable;
if 0 then do;
call execute('%write_email(parameters)');
end;
run;
Use email directives to abort the message.
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002058232.htm
data _null_;
file sendit email to=("email@gmail.com") subject="Some Subject Line";
if nobs=0 then put '!EM_ABORT!';
set TestTable nobs=nobs;
....
run;

SAS Data Step | Between 2 Dates

Probably a simple question. I have a simple dataset with scheduled payment dates in it.
DATA INFORM2;
INFORMAT previous_pmt_date scheduled_pmt_date MMDDYY10.;
INPUT previous_pmt_date scheduled_pmt_date;
FORMAT previous_pmt_date scheduled_pmt_date MMDDYYS10.;
DATALINES;
11/16/2015 12/16/2015
12/17/2015 01/16/2016
01/17/2016 02/16/2016
;
What I'm trying to do is to create a binary latest row indicator. For example, if I wanted to know the latest row as of 1/31/2016, I'd want row 2 to be flagged as the latest row. What I had been doing before is finding out where 1/31/2016 falls between the previous_pmt_date and the scheduled_pmt_date, but that isn't correct for my purposes. I'd like to do this in a data step as opposed to SQL subqueries. Any ideas?
Want:
previous_pmt_date scheduled_pmt_date latest_row_ind
11/16/2015 12/16/2015 0
12/17/2015 01/16/2016 1
01/17/2016 02/16/2016 0
Here's a solution that does it all in the single existing datastep without any additional sorting. First I'm going to modify your data slightly to include account as the solution really should take that into account as well:
DATA INFORM2;
INFORMAT previous_pmt_date scheduled_pmt_date MMDDYY10.;
INPUT account previous_pmt_date scheduled_pmt_date;
FORMAT previous_pmt_date scheduled_pmt_date MMDDYYS10.;
DATALINES;
1 11/16/2015 12/16/2015
1 12/17/2015 01/16/2016
1 01/17/2016 02/16/2016
2 11/16/2015 12/16/2015
2 12/17/2015 01/16/2016
2 01/17/2016 02/16/2016
;
run;
Specify a cutoff date:
%let cutoff_date = %sysfunc(mdy(1,31,2016));
This solution uses the approach from this question to save the variables in the next row of data, into the current row. You can drop the vars at the end if desired (I've commented out for the purposes of testing).
data want;
set inform2 end=eof;
by account scheduled_pmt_date;
recno = _n_ + 1;
if not eof then do;
set inform2 (keep=account previous_pmt_date scheduled_pmt_date
rename=(account = next_account
previous_pmt_date = next_previous_pmt_date
scheduled_pmt_date = next_scheduled_pmt_date)
) point=recno;
end;
else do;
call missing(next_account, next_previous_pmt_date, next_scheduled_pmt_date);
end;
select;
when ( next_account eq account and next_scheduled_pmt_date gt &cutoff_date ) flag='a';
when ( next_account ne account ) flag='b';
otherwise flag = 'z';
end;
*drop next:;
run;
This approach works by taking the current observation number in the dataset (obtained via _n_) and adding 1 to it to get the number of the next observation. We then use a second set statement with the point= option to load in that next observation and rename the variables at the same time so that they don't overwrite the current variables.
We then use some logic to flag the necessary records. I'm not 100% sure of the logic you require for your purposes, so I've provided some sample logic and used different flags to show which logic is being triggered.
Some notes...
The by statement isn't strictly necessary but I'm including it to (a) ensure that the data is sorted correctly, and (b) help future readers understand the intent of the datastep as some of the logic requires this sort order.
The call missing statement is simply there to clean up the log. SAS doesn't like it when you have variables that don't get assigned values, and this will happen on the very last observation so this is why we include this. Comment it out to see what happens.
The end=eof syntax basically creates a temporary variable called eof that has a value of 1 when we get to the last observation on that set statement. We simply use this to determine if we're at the last row or not.
Finally, and very importantly, make sure you keep only the required variables when you load in the second dataset, and rename each of them; otherwise you will overwrite existing vars in the original data.
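If the indicator from the question is what's needed, the sample flags could be swapped for a 0/1 variable along the lines of the sketch below. It assumes the "latest row" is the last row per account whose scheduled_pmt_date is on or before the cutoff, which matches the Want table in the question but is not confirmed there. It reuses inform2 (the version with account) and the same cutoff as above; the output name want_ind is made up for this example.

```sas
%let cutoff_date = %sysfunc(mdy(1,31,2016));

data want_ind;
  set inform2 end=eof;
  by account scheduled_pmt_date;
  recno = _n_ + 1;
  if not eof then do;
    /* look ahead one row, renaming so the current row is not overwritten */
    set inform2 (keep=account scheduled_pmt_date
                 rename=(account = next_account
                         scheduled_pmt_date = next_scheduled_pmt_date)
                ) point=recno;
  end;
  else do;
    call missing(next_account, next_scheduled_pmt_date);
  end;
  /* assumed rule: this row is due on or before the cutoff, and the next row
     is either for a different account or is due after the cutoff */
  latest_row_ind = (scheduled_pmt_date le &cutoff_date)
                   and (next_account ne account
                        or next_scheduled_pmt_date gt &cutoff_date);
  drop next_:;
run;
```

With the sample data and a cutoff of 1/31/2016, this rule flags the 12/17/2015 to 01/16/2016 row for each account and leaves the others at 0.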

SAS: do not send email when no records

I'm new to Base SAS and I am trying to add a line saying all IDs transferred when nobs=0. Here's what I added to the code Alex gave me, which prints the IDs when there are any in the data set workgo.recds_not_processed. I added the nobs=0 condition, but now it is not sending the IDs when they are present.
data _null_;
length id_list $ 3000;
retain id_list '';
file mymail;
SET set workgo.recds_not_processed nobs = nobs end = eof;
IF nobs = 0 then PUT "All ID's transferred successfully";
else if _n_ = 1 then do;
put 'Number of records not processed=' nobs;
put 'The IDs are:';
end;
/* Print the IDs in chunks */
if length(strip(id_list)) > 2000 then do;
put id_list;
call missing(id_list);
end;
call catx(', ', id_list, id);
if eof then put id_list;
run;
You are very nearly there - the unmatched end spotted by Joe and the double set that I pointed out were the only obvious errors preventing your code from working.
The reason you got no output when dealing with an empty dataset is that SAS terminates a data step when it tries to read a record from a set statement and there are none left to read. One solution is to move your empty dataset logic before the set statement.
Also, you can achieve the desired result without resorting to retain and call catx by using a double trailing @@ in your put statement, which holds the output line so that multiple observations from the input dataset are written to the same line:
data recds_not_processed;
do id=1 to 20;
output;
end;
run;
data _null_;
/*nobs is populated when the data step is compiled, so we can use it before the set statement first executes*/
IF nobs=0 then
do;
PUT "All IDs transferred successfully";
stop;
end;
/*We have to put the set statement after the no-records logic because SAS terminates the data step after trying to read a record when there are none remaining.*/
SET recds_not_processed nobs=nobs end=eof;
if _n_=1 then
do;
put 'Number of records not processed=' nobs;
put 'The IDs are:';
end;
/* Print the IDs in chunks */
if eof or mod(_N_, 5)=0 then
put id;
else
put id +(-1) ', ' @@;
run;

Get the current observation count in SAS

I have a file in which the first line is a header line containing some meta-data information.
How can I get the current observation number (say =1 for the first observation) that the SAS processor is dealing with, so that I can put in an IF clause to handle such a special data line?
Follow up: I want to process the first line and keep one of the column values in a local variable for further processing. I don't want to keep this line in my final output. is this possible?
The automatic variable _N_ returns the current iteration number of the SAS data step loop. For a traditional data step, i.e.:
data something;
set something;
(code);
run;
_N_ is equivalent to the row number (since one row is retrieved for each iteration of the data step loop).
So if you wanted to only do something once, on the first iteration, this would accomplish that:
data something;
set something;
if _n_ = 1 then do;
(code);
end;
(more code);
run;
For your follow up, you want something like this:
data want;
set have;
retain _temp;
if _n_ = 1 then do;
_temp = x;
end;
... more code ...
drop _temp;
run;
DROP and RETAIN statements can appear anywhere in the code and have the same effect; I placed them in their human-logical locations. RETAIN says not to reset the variable to missing each time through the data step loop, so you can access it further down.
If you are reading a particularly large text file, you may want to avoid having to execute the (if _n_=1 then) condition for every iteration. You can do this by reading the file twice - once to extract the header row, and again to read in the file, as follows:
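To also keep the header line out of the final output, as the follow-up asks, one option is a delete statement inside the _n_ = 1 block. This is a minimal sketch: the have dataset, its variables x and y, and header_value are made up for illustration.

```sas
/* hypothetical input: the first row is the header carrying a value in x */
data have;
  input x y;
  datalines;
2016 0
1 10
2 20
;
run;

data want;
  set have;
  retain _temp;
  if _n_ = 1 then do;
    _temp = x;   /* remember the header value for later rows */
    delete;      /* end this iteration without writing the header row */
  end;
  header_value = _temp;   /* later rows can still see the retained value */
  drop _temp;
run;
```

delete stops processing the current observation and returns to the top of the data step, so nothing after it runs for the header row and that row is never written to want.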
data _null_; /* create dummy file for demo purposes */
file "c:\myfile.txt";
put 'blah'; output;
put 'blah blah blah 666'; output;
data _null_; /* read in header info */
infile "c:\myfile.txt";
input myvar:$10.; /* or wherever the info is that you need */
call symput('myvar',myvar);/* create macro variable with relevant info */
stop; /* no further processing at this point */
data test; /* read in data FROM SECOND LINE */
infile "c:\myfile.txt" firstobs=2 ; /* note the FIRSTOBS option */
input my $ regular $ input $ statement ;
remember="&myvar";
run;
For short / simple stuff though, Joe's answer is better as it's more readable (and may be more efficient for small files).