Creating vertical detail tables - sas

I'm looking for a way to create vertical tables in SAS where the variables are each treated as rows (as opposed to each row being an observation).
For example, let's say I have some data for a bunch of companies, some of which is more important than the rest. It is easy to make proc report spit out a summary table with a few variables like this:
Name Price Shares MarketCap
co1 $5 100 $500
co2 $1 100 $100
co3 $2 200 $400
What I want to do after this is print a page of detailed information for each company which is essentially a table with a column for the description and a column for the value (and maybe a third column for the calculation).
Company 1
Location: CA
CEO: Bob Johnson
Industry: Semiconductors
Shares: 100
Share Price: $5
Market Cap: $500
The only way I can think of to do this in SAS is to transpose everything, create a character variable that holds the label (Location, Stock Price, etc.) and a second character variable that holds the value, and then make a two-column report BY company to get a page for each. This is messy because some of the values are numeric and others are character, so displaying them in one column requires creating a new character variable and filling it with text versions of the numeric variables.
I figure there has got to be an easier way to create a vertical table since there are so many easy ways to create the horizontal tables.

There is also this solution which is probably better for your needs.
First create a HTML file that will be used as a template. Wherever you want to put a value, use a macro variable as a placeholder like so:
<html>
<h1> My title is &title </h1><br>
Name: &name <br>
Value of Blah: &blah
</html>
Make it as attractive looking as you like.
Next create a macro that will import the HTML template, replace the placeholders with actual values and save the result to a new file:
/*****************************************************************************
** PROGRAM: MACRO.RESOLVE_FILE.SAS
**
** READS IN A FILE AND REPLACES ANY MACRO REFERENCES IN THE FILE WITH THE
** ACTUAL MACRO VALUES. EG. IF THE FILE WAS AN HTML FILE AND IT CONTAINED
** THE FOLLOWING HTML:
**
** <TITLE>&HTML_TITLE</TITLE>
**
** THEN THE PROGRAM WOULD READ THE FILE IN AND RESOLVE IT SO THAT THE OUTPUT
** LOOKED LIKE THIS:
**
** <TITLE>ROB</TITLE>
**
** ... WHEN THE MACRO VARIABLE "HTML_TITLE" EXISTED AND CONTAINED A VALUE OF
** "ROB". THIS IS USEFUL WHEN YOU NEED TO CREATE "DYNAMIC" HTML FILES FROM
** SAS BUT DONT WANT TO DO IT FROM A DATASTEP USING PUT STATEMENTS. DOING
** IT THIS WAY IS MUCH CLEANER.
**
** PARAMETERS: NONE
**
******************************************************************************
** HISTORY:
** 1.0 MODIFIED: 22-JUL-2010 BY:RP
** - CREATED.
** 1.1 MODIFIED: 18-FEB-2011 BY:RP
** - ADDED LRECL OF 32K TO STOP TRUNCATION
*****************************************************************************/
%macro resolve_file(iFileIn=, iFileOut=);
data _null_;
length line $32767;
infile "&iFileIn" truncover lrecl=32767;
file "&iFileOut" lrecl=32767;
input;
line = resolve(_infile_);
len = length(line);
put line $varying. len;
run;
%mend;
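The key call in the macro is resolve(), which expands macro references in a string at DATA step run time. A minimal sketch of that one step, using a one-line template held in a string instead of a file (the value assigned to &name is only an illustration):

```sas
/* Illustrative value only */
%let name = Rob;

data _null_;
  length line $200;
  /* Single quotes keep &name unresolved until resolve() runs */
  line = resolve('Name: &name <br>');
  put line;
run;
```

The log should show the line with &name replaced by Rob.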
Create some test data. Also create some commands to call the above macro and pass in the values from the dataset:
data mydata;
attrib name length=$10 format=$10. label='FirstName'
blah length=6 format=comma6. label='SomeValue'
cmd1 length=$1000
cmd2 length=$1000
;
title = 1;
name = "Rob" ;
blah = 1000;
cmd1 = cats('%let title=',title,';',
'%let name=',name,';',
'%let blah=',blah,';');
cmd2 = cats('%resolve_file(iFileIn=c:\template.html, iFileOut=c:\result',title,'.html);');
output;
title = 2;
name = "Pete";
blah = 100 ;
cmd1 = cats('%let title=',title,';',
'%let name=',name,';',
'%let blah=',blah,';');
cmd2 = cats('%resolve_file(iFileIn=c:\template.html, iFileOut=c:\result',title,'.html);');
output;
run;
Use call execute to run the cmd1 and cmd2 commands created in the previous data step. Process one row at a time so that each file is built with the macro variable values from its own row; a loop does this. First count the rows in your dataset using your preferred technique:
proc sql noprint;
select count(*) into :nobs from mydata;
quit;
Then iterate through the dataset executing the commands one at a time and building each row to a new file:
%macro publish;
%local tmp;
%do tmp = 1 %to &nobs;
data _null_;
set mydata(firstobs=&tmp obs=&tmp);
call execute (cmd1);
call execute (cmd2);
run;
%end;
%mend;
%publish;
That should do the trick.
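A possible alternative to processing one row per DATA step, offered as a sketch rather than a tested replacement: wrapping each command in %nrstr() should delay the %let statements and macro calls until after the DATA step finishes, so that they run in row order and a single pass over the data is enough. It assumes the mydata dataset and the resolve_file macro defined above.

```sas
data _null_;
  set mydata;
  /* %nrstr masks the macro triggers so CALL EXECUTE queues them as text
     instead of executing them while this step is still running */
  call execute(cats('%nrstr(', cmd1, ')'));
  call execute(cats('%nrstr(', cmd2, ')'));
run;
```

Check the log on a couple of rows first to confirm that each result file picked up its own values.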

Perhaps I'm missing something but didn't you answer your own question? It should be as easy as:
Create some sample data. Be sure that every column has a format and label applied:
data mydata;
attrib name length=$10 format=$10. label='FirstName'
blah length=6 format=comma6. label='SomeValue';
bygroup = 1; name = "Rob" ; blah = 1000; output;
bygroup = 2; name = "Pete"; blah = 100 ; output;
run;
Transpose the data to make it tall:
proc transpose data=mydata out=trans;
by bygroup;
var _all_;
run;
Print it Out:
data _null_;
set trans;
by bygroup;
if first.bygroup then do;
put bygroup;
put "------";
end;
/* PROC TRANSPOSE names the transposed value column COL1 by default */
put _label_ ":" col1;
run;
Result:
1
------
FirstName :Rob
SomeValue :1,000
2
------
FirstName :Pete
SomeValue :100
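If a formatted report is preferred over log output, the same tall layout can be fed to PROC REPORT with a BY statement. This is a sketch under assumptions: it reuses mydata from above, relies on PROC TRANSPOSE's default COL1 column, and the names tall and value are chosen here for illustration.

```sas
/* Tall layout: one row per variable, with the values in VALUE */
proc transpose data=mydata out=tall(rename=(col1=value));
  by bygroup;
  var name blah;
run;

options nobyline;                    /* suppress the default BY line */
title "Company #byval(bygroup)";     /* BY value goes into the title */

proc report data=tall nowd;
  by bygroup;                        /* one table per BY group */
  column _label_ value;
  define _label_ / display 'Description';
  define value   / display 'Value';
run;

title;
options byline;
```

Each BY group produces its own two-column table, which is close to the one-page-per-company layout in the question.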

How about one of these solutions instead, then? Open a table in Base SAS to view it, then go to View->Form View. That puts it in the layout you requested. It may not look exactly the way you want, but it's a quick option.
The alternative is to write your own. Create a macro that takes a dataset and anything else you want to specify as parameters and display it using ODS, the put statement, or whatever other technique you would like.
I'm not aware of any other built in method in SAS to do this.
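As a rough sketch of the "write your own" option, the macro below prints each observation vertically using the put statement. The macro name form_view and its data= parameter are hypothetical, and it assumes the dataset has at least one character and one numeric variable. Character variables are listed before numeric ones, so the original column order is not preserved.

```sas
/* Hypothetical macro name and parameter, for illustration */
%macro form_view(data=);
  data _null_;
    set &data;
    /* Arrays are declared before any new variables are created, so they
       pick up only the variables that come from the input dataset */
    array _c {*} _character_;
    array _n {*} _numeric_;
    put 'Observation ' _n_;
    put '------';
    do _i = 1 to dim(_c);
      _lab = vlabel(_c[_i]);    /* label, or the name if no label is set */
      _val = vvalue(_c[_i]);    /* formatted value as text */
      put _lab ': ' _val;
    end;
    do _i = 1 to dim(_n);
      _lab = vlabel(_n[_i]);
      _val = vvalue(_n[_i]);
      put _lab ': ' _val;
    end;
    put;
  run;
%mend form_view;

%form_view(data=mydata);
```

Because vvalue() returns the formatted value as text, numeric and character variables can share one output column without a manual conversion step.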

Related

How to select a single value from a table, to use for comparison(greater than/less than)?

I am handing over some code to a colleague, which is to be run daily to generate reports.
Once every month a new cycle starts, and we have to update the code for cycle_start_date
data mtd_table;
set ytd_table;
where entry_date> '10Mar2021'd; /*different every month*/
run;
Since he'll be running them from now on, along with other reports from other teams, I don't want to bother him every month to tweak the code. So I devised this:
I run (once a month):
data shared1.cycle_start_date;
cycle_start_date='10Mar2021'd;
run;
He runs (every day):
data mtd_table;
set ytd_table;
where entry_date>/*(select cycle_start_date from shared1.cycle_start_date)*/;
run;
I'm not sure how to correctly implement this (select cycle_start_date from shared1.cycle_start_date) part, since it is from proc sql. Would appreciate help.
When you store program parameters in a data set (called control data) one use case is having later code extract the values into macro variables, at which point other code can resolve the macro variable for replacement at (automatic) step compile and run time. Two ways to extract values into macro variables are:
Proc SQL, SELECT ... INTO :<macro-variable>, and
DATA _NULL_, CALL SYMPUT(<macro-variable>, <data step expression>);
Remember that macro resolution substitutes the macro variable's value as source code text. A date held in a macro variable can be either the SAS date value (the text representation of the SAS date integer) or the text portion of a date literal (such as 10-MAR-2021), which is written in source as "&<macro-variable>"D when it must act as a date value. The date literal form is useful when you also want to show the date in human-readable form in output; for example: TITLE "cycle start: &cycle_start_date";
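The two representations can be compared directly. In this small sketch the macro variable names d_value and d_literal are made up for illustration:

```sas
/* SAS date value as text (an integer) */
%let d_value   = %sysfunc(mdy(3,10,2021));
/* Text portion of a date literal */
%let d_literal = %sysfunc(putn(&d_value, date11.));
%put &=d_value &=d_literal;

data _null_;
  /* Both forms resolve to the same date in source code */
  same = (&d_value = "&d_literal"d);
  put same=;
run;
```

The step writes same=1 when both forms denote the same day.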
Control data (you)
Rebuild or edit values in data set (name it parameters to be more useful)
data shared1.parameters;
cycle_start_date = '10Mar2021'd; * stored as a SAS date value (integer);
run;
Note: Some control data layouts use a name/value organization and have one row per parameter.
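A name/value layout might look like the following sketch. The dataset name work.parameters_nv and the second parameter are invented for illustration; CALL SYMPUTX then creates one macro variable per row.

```sas
/* Hypothetical name/value control data: one row per parameter */
data work.parameters_nv;
  length name $32 value $200;
  name = 'cycle_start_date'; value = '10MAR2021'; output;
  name = 'report_owner';     value = 'team_a';    output;
run;

/* Create one macro variable per row */
data _null_;
  set work.parameters_nv;
  call symputx(name, value);
run;

%put &=cycle_start_date &=report_owner;
```

Here the date is stored as literal text, so it would be used as "&cycle_start_date"d in a WHERE clause.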
Other
Extract date value as SAS date value text, and as date literal text portion and use.
proc sql noprint;
select
cycle_start_date
, cycle_start_date format=date11.
into
:cycle_start_date_value trimmed
, :cycle_start_date_literal trimmed
from
shared1.parameters
;
%put &=cycle_start_date_value;
%put &=cycle_start_date_literal;
/*
* will log the macro variable value as follows:
* CYCLE_START_DATE_VALUE=22349 and
* CYCLE_START_DATE_LITERAL=10-MAR-2021
*/
data ...
set ...;
where date >= &cycle_start_date_value; *resolve parameter as text representation of a SAS date value (integer);
...
title "Cycle starts: &cycle_start_date_literal";
proc print data=...; * title in output shows human readable part of date;
run;
Another approach is to use a common source code file that is %included by others. You would edit or recreate the parameters file by whatever process you want.
parameters.sas
%let cycle_start_date = 10-Mar-2021;
use
%include 'parameters.sas';
data ...
set ...;
where date >= "&cycle_start_date"D; *resolve parameter as part of date literal;
...
title "Cycle starts: &cycle_start_date";
proc print data=...; * title in output shows human readable part of date literal;
run;
One possible solution would be to put the date from the cycle_start_date table that is in the shared library shared1 into a macro-variable date that will be used in your data step to filter the ytd_table table based on the entry_date variable.
proc sql noprint;
select cycle_start_date into :date
from shared1.cycle_start_date;
quit;
data mtd_table;
set ytd_table;
where entry_date > &date.;
run;
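If the subquery form from the question is preferred, the whole step can be written in PROC SQL instead of a DATA step. A minimal sketch, assuming shared1.cycle_start_date holds exactly one row:

```sas
proc sql;
  create table mtd_table as
  select *
  from ytd_table
  /* Scalar subquery: must return a single value */
  where entry_date > (select cycle_start_date
                      from shared1.cycle_start_date);
quit;
```

This avoids the macro variable entirely, at the cost of moving the daily step from a DATA step to SQL.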

SAS pass-through facility. How to insert a big list from a local table in a query?

I need to query a large table in a server (REMOTE_TBL) using the SAS pass-through facility. In order to make the query shorter, I want to send a list of IDs extracted from a local table (LOCAL_TBL).
My first step is to get the IDs into a variable called id_list using an INTO statement:
select distinct ID into: id_list separated by ',' from WORK.LOCAL_TBL
Then I pass these IDs to the pass-through query:
PROC SQL;
CONNECT TO sybaseiq AS dbcon
(host="name.cl" server=alias db=iws user=sas_user password=XXXXXX);
create table WANT as
select * from connection to dbcon(
select *
from dbo.REMOTE_TBL
where ID in (&id_list)
);
QUIT;
The code runs fine except that I get the following message:
The length of the value of the macro variable exceeds the maximum length
Is there an easier way to send the selected ID's to the pass-through query?
Is there a way to store the selected ID's in two or more variables?
Store the values into multiple macro variables and then store the names of the macro variables into another macro variable.
So this code will make a series of macro variables named M1, M2, .... and then set ID_LIST to &M1,&M2....
data _null_;
length list $20200 mlist $20000;
retain mlist; * keep the accumulated macro variable names across iterations;
do until(eof or length(list)>20000);
set LOCAL_TBL end=eof;
list=catx(',',list,id);
end;
call symputx(cats('m',_n_),list);
mlist=catx(',',mlist,cats('&m',_n_));
if eof then call symputx('id_list',mlist);
run;
Then when you expand ID_LIST the macro processor will expand all of the individual Mx macro variables. This little data step will create a couple of example macro variables to demonstrate the idea.
data _null_;
call symputx('m1','a,b,c');
call symputx('m2','d,e,f');
call symputx('id_list','&m1,&m2');
run;
Results:
70 %put ID_LIST=%superq(id_list);
ID_LIST=&m1,&m2
71 %put ID_LIST=&id_list;
ID_LIST=a,b,c,d,e,f
You are passing many data values in your IN (…) clause. The number of values allowed varies by database; some may limit it to 250 values per clause, and the length of a statement might also be limited. If the macro variable creates a list of values 20,000 characters long, the remote side might not like that.
When dealing with a lookup of perhaps > 100 values, take some time first to communicate your need to the DB admin for creating temporary tables. When you have such rights, your queries will be more efficient remote side.
… upload id values to #myidlist …
create table WANT as
select * from connection to dbcon(
select *
from dbo.REMOTE_TBL
where ID in (select id from #myidlist)
);
QUIT;
If you can't get the proper permissions, you would have to chop up the id list into pieces and have a macro create a series of ORed IN searches.
1=0
OR ID IN ( … list-values-1 … )
…
OR ID IN ( … list-values-N … )
For example:
data have;
do id = 1 to 44;
output;
end;
run;
%let IDS_PER_MACVAR = 10; * <---------- make as large as you want until error happens again;
* populated the macro vars holding the chopped up ID list;
data _null_;
length macvar $20; retain macvar;
length macval $32000; retain macval;
set have end=end;
if mod(_n_-1, &IDS_PER_MACVAR) = 0 then do;
if not missing(macval) then call symput(macvar, trim(macval));
call symputx ('VARCOUNT', group);
group + 1;
macvar = cats('idlist',group);
macval = '';
end;
macval = catx(',',macval,id);
if end then do;
if not missing(macval) then call symput(macvar, trim(macval));
call symputx ('MVARCOUNT', group);
end;
run;
* macro that assembles the chopped up bits as a series of ORd INs;
%macro id_in_ors (N=,NAME=);
%local i;
1 = 0
%do i = 1 %to &N;
OR ID IN (&&&NAME.&i)
%end;
%mend;
* use %put to get a sneak peek at what will be passed through;
%put %id_in_ors(N=&MVARCOUNT,NAME=IDLIST);
* actual sql with pass through;
...
create table WANT as
select * from connection to dbcon(
select *
from dbo.REMOTE_TBL
where ( %ID_IN_ORS(N=&MVARCOUNT,NAME=IDLIST) ) %* <--- idlist piecewise ors ;
);
...
I suggest that you first save all the distinct values into a table, and then (again using proc sql + into) load the values into a few stand-alone macrovariables, reading the table several times in a few sets; indeed they have to be mutually exclusive yet jointly exhaustive.
Do you have access to and CREATE privileges in the DB where your dbo.REMOTE_TBL resides? If so you might also think about copying your WORK.LOCAL_TBL into a temporary table in the DB and run an inner join right there.
Another option - write out the query to a temporary file and then %include it. No macro logic needed!
proc sort
data = WORK.LOCAL_TBL(keep = ID)
out = distinct_ids
nodupkey;
run;
data _null_;
set distinct_ids end = eof;
file "%sysfunc(pathname(work))/temp.sas";
if _n_ = 1 then put "PROC SQL;
CONNECT TO sybaseiq AS dbcon
(host=""name.cl"" server=alias db=iws user=sas_user password=XXXXXX);
create table WANT as
select * from connection to dbcon(
select *
from dbo.REMOTE_TBL
where ID in (" @;
put ID @;
if not(eof) then put "," @;
if eof then put ");QUIT;" @;
put;
run;
/*Use nosource2 to avoid cluttering the log*/
%include "%sysfunc(pathname(work))/temp.sas" /nosource2;

enter column in a dataset to an array

I have 33 different datasets with one column and all share the same column name/variable name;
net_worth
I want to load the values into arrays and use them in a datastep. But the array that I use should depend on the the by groups in the datastep (country by city). There are total of 33 datasets and 33 groups (country by city). each dataset correspond to exactly one by group.
here is an example what the by groups look like in the dataset: customers
UK 105 (other fields)
UK 102 (other fields)
US 291 (other fields)
US 292 (other fields)
Could I get some advice on how to go about and enter the columns in arrays and then use them in a datastep. or do you suggest to do it in another way?
%let var1 = uk105;
%let var2 = uk102;
.....
%let var33 = jk12;
data want;
set customers;
by country city;
if _n_ = 1 then do;
*set datasets and create and populate arrays*;
* use array values in calculations with fields from dataset customers, depending on which by group. if the by group is uk and city is 105 then i need to use the created array corresponding to that by group;
It is a little hard to understand what you want.
It sounds like you have one dataset named CUSTOMERS that has all of the main variables and a bunch of single-variable datasets that hold the values of NET_WORTH for a lot of different things (countries?).
Assuming that the observations in all of the datasets are in the same order then I think you are asking for how to generate a data step like this:
data want;
set customers;
set uk105 (rename=(net_worth=uk105));
set uk102 (rename=(net_worth=uk102));
....
run;
Which might just be easiest to do using a data step.
filename code temp;
data _null_;
input name $32. ;
file code ;
put ' set ' name '(rename=(net_worth=' name '));' ;
cards;
uk105
uk102
;;;;
data want;
set customers;
%include code / source2;
run;
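If the values really do need to sit in an array, as the question asks, one of the small datasets can be loaded into a temporary array on the first iteration. This is only a sketch for a single dataset, under several assumptions: uk105 has at most 1000 rows, city is numeric, and customers has no variable called net_worth. The variable avg_net_worth is invented to show a use of the array.

```sas
data want;
  /* 1000 is an assumed upper bound on the rows in uk105 */
  array nw_uk105 {1000} _temporary_;

  /* On the first iteration only, read all of uk105 into the array */
  if _n_ = 1 then do;
    do _i = 1 by 1 until (_done);
      set uk105 (keep=net_worth) end=_done;
      nw_uk105[_i] = net_worth;
    end;
  end;

  set customers;

  /* Illustrative use: apply the array only to the matching BY group */
  if country = 'UK' and city = 105 then
    avg_net_worth = mean(of nw_uk105[*]);

  drop _i net_worth;
run;
```

Temporary arrays keep their values across iterations, so the load happens once. Covering all 33 datasets would mean generating one such array and load loop per dataset, for example with the %include technique above.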

Text manipulation of macro list variables to stack datasets with automated names

I have written a macro that accepts a list of variables, runs a proc mixed model using each variable as a predictor, and then exports the results to a dataset with the variable name appended to it. I am trying to figure out how to stack the results from all of the variables in a single data set.
Here is the macro:
%macro cogTraj(cog,varlist);
%let j = 1;
%let var = %scan(&varlist, %eval(&j));
%let solution = sol;
%let outsol = &solution.&var.;
%do %while (&var ne );
proc mixed data = datuse;
model &cog = &var &var*year /solution cl;
random int year/subject = id;
ods output SolutionF = &outsol;
run;
%let j = %eval(&j + 1);
%let var = %scan(&varlist, %eval(&j));
%let outsol = &solution.&var.;
%end;
%mend;
/* Example */
%cogTraj(mmmscore, varlist = bio1 bio2 bio3);
The result would be the creation of Solbio1, Solbio2, and Solbio3.
I have created a macro variable containing the "varlist" (Ideally, I'd like to input a macro variable list as the argument but I haven't figured out how to deal with the scoping):
%let biolist = bio1 bio2 bio3;
I want to stack Solbio1, Solbio2, and Solbio3 by using text manipulation to add "Sol" to the beginning of each variable. I tried the following, outside of any data step or macro:
%let biolistsol = %add_string( &biolist, Sol, location = prefix);
without success.
Ultimately, I want to do something like this;
data Solbio_stack;
set %biolistsol;
run;
with the result being a single dataset in which Solbio1, Solbio2, and Solbio3 are stacked, but I'm sure I don't have the right syntax.
Can anyone help me with the text string/dataset stacking issue? I would be extra happy if I could figure out how to change the macro to accept %biolist as the argument, rather than writing out the list variables as an argument for the macro.
I would approach this differently. A good approach for the problem is to drive it with a dataset; that's what SAS is good at, really, and it's very easy.
First, construct a dataset that has a row for each variable you're running this on, and a variable name that contains the variable name (one per row). You might be able to construct this using PROC CONTENTS or sashelp.vtable or dictionary.tables, if you're using a set of variables from one particular dataset. It can also come from a spreadsheet you import, or a text file, or anything else really - or just written as datalines, as below.
So your example would have this dataset:
data vars_run;
input name $ cog $;
datalines;
bio1 mmmscore
bio2 mmmscore
bio3 mmmscore
;;;;
run;
If your 'cog' is fairly consistent you don't need to put it in the data; if it is something that might change, you can carry it as a variable in the data, as I do in the example above.
Then, you write the macro so it does one pass on the PROC MIXED - ie, the inner part of the %do loop.
%macro cogTraj(cog=,var=, sol=sol);
proc mixed data = datuse;
model &cog = &var &var*year /solution cl;
random int year/subject = id;
ods output SolutionF = &sol.&var.;
run;
%mend cogTraj;
I put the default for &sol in there. Now, you generate one call to the macro from each row in your dataset. You also generate a list of the sol sets.
proc sql;
select cats('%cogTraj(cog=',cog,',var=',name,',sol=sol)')
into :callList
separated by ' '
from vars_run;
select cats('sol',name) into :solList separated by ' '
from vars_run;
quit;
Next, you run the macro:
&callList.
And then you can do this:
data sol_all;
set &solList.;
run;
All done, and a lot less macro variable parsing which is messy and annoying.
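For readers who still want the text-manipulation route from the question, a small macro can prefix every word in a list. The macro add_prefix below is a hypothetical helper written for this sketch, not a built-in; it assumes the list is blank-delimited.

```sas
/* Hypothetical helper: returns the list with a prefix on each word */
%macro add_prefix(list, prefix);
  %local i word out;
  %let i = 1;
  %let word = %scan(&list, &i, %str( ));
  %do %while (%length(&word) > 0);
    %let out = &out &prefix&word;
    %let i = %eval(&i + 1);
    %let word = %scan(&list, &i, %str( ));
  %end;
  &out
%mend add_prefix;

%let biolist = bio1 bio2 bio3;
%let biolistsol = %add_prefix(&biolist, Sol);
%put &=biolistsol;
```

The resulting macro variable is then referenced with an ampersand, as in set &biolistsol; inside the stacking data step.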

How to combine text and numbers in catx statement

The variable upc is already defined in my cool dataset. How do I convert it to a macro variable? I am trying to combine both text and numbers. For example blah should equal upc=123;
data cool;
set cool;
blah = catx("","upc=&upc","ccc")
run;
If upc is a numeric variable and you just want to include its value into some character string then you don't need to do anything special. Concatenation function will convert it into character before concatenating automatically:
data cool;
set cool;
blah = catx("","upc=",upc,"ccc");
run;
The result:
upc----blah
123 upc= 123ccc
BTW, if you want to concatenate strings without blanks between them, you can use function CATS(), which strips all leading and trailing spaces from each argument.
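A short sketch of the difference, with a made-up value for upc and an explicit delimiter for CATX:

```sas
data _null_;
  upc = 123;
  /* CATS strips blanks and joins with nothing in between */
  a = cats('upc=', upc);
  /* CATX strips blanks and inserts the delimiter between items */
  b = catx('-', 'upc', upc, 'ccc');
  put a= b=;
run;
```

Here a holds upc=123 and b holds upc-123-ccc; in both cases the numeric upc is converted to text automatically.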
The following test code works for my SAS 9.3 x64 PC.
Please note that:
1. symputx() provides the connection between the dataset and macro variables.
2. cats() is more appropriate than catx() if delimiting characters are not needed.
3. If you are not creating a new dataset, data _NULL_ is fine.
You can check the log to see that the correct values are being assigned.
Bill
data a;
input ID $ x y @@;
datalines;
A 1 10 A 2 20 A 3 30
;
run;
options SymbolGen MPrint MLogic MExecNote MCompileNote=all;
data _NULL_;
set a;
call symputx(cats("blah",_N_),cats(ID,x),"G");
run;
%put blah1=&blah1;
%put blah2=&blah2;
%put blah3=&blah3;