Insert text into all cells of first column in a sas dataset - sas

I've output 'Moments' from Proc Univariate to datasets. Many.
Example: Moments_001.sas7bdat through to Moments_237.sas7bdat
For the first column of each dataset (new added first column, and probably new dataset, as opposed to the original) I would like to have a particular text in every cell going down to bottom row.
The exact text would be the name of the respective dataset file: say, "Moments_001".
I do not have to 'grab' the filename, per se, if that's not possible. As I know what the names are already, I can put that text into the procedure. However, grabbing the filenames, if possible, would be easier from my standpoint.
I'd greatly appreciate any help anyone could provide to accomplish this.
Thanks,
Nicholas Kormanik

Are you looking for the INDSNAME option of the SET statement? You need to define two variables because the one generated by the option is automatically dropped.
data want;
length moment dsn $41 ;
set Moments_001 - Moments_237 indsname=dsn ;
moment=dsn;
run;

I think something along these lines should be what you're after. Assuming you have a list of moments, you can loop through it and add a new variable as the first column of each dataset.
%let list_of_moments = moments_001 moments_002 ... moments_237;
%macro your_macro;
%do i = 1 %to %sysfunc(countw(&list_of_moments.));
%let this_moment = %scan(&list_of_moments., &i.);
data &this_moment._v2;
retain new_variable;
set &this_moment.;
new_variable = "&this_moment.";
run;
%end;
%mend your_macro;
%your_macro;

The brute force entering of text into column 1 looks like this:
data moments_001;
length text $ 16;
set moments_001;
text="Moments_001";
run;
You could also write a macro that would loop through all 237 data sets and insert the text.
UNTESTED CODE
%macro do_all;
%do i=1 %to 237;
%let num = %sysfunc(putn(&i,z3.));
data moments_&num;
length text $ 16;
set moments_&num;
text="Moments_&num";
run;
%end;
%mend do_all;
%do_all
It seems to me (not knowing your problem) that if you use PROC UNIVARIATE with a BY statement, you wouldn't need 237 different data sets: all of your output would be in one data set, and the BY variable would be in that data set too. Does that solve your problem?
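A minimal sketch of the BY-group idea, assuming the raw data sit in a single data set with a grouping variable and one analysis variable. The names have, grp, x, and moments_all are placeholders and do not come from the question:

```sas
/* Assumed names: have, grp, x, moments_all are placeholders. */
proc sort data=have;
  by grp;                        /* BY processing needs sorted (or indexed) input */
run;

ods output Moments=moments_all;  /* capture the Moments table from every BY group */
proc univariate data=have;
  by grp;
  var x;
run;
```

The resulting moments_all data set carries grp as a column, so each row is already labelled with the group it came from.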

Related

SAS-Creating Panel by several datasets

Suppose there are ten datasets with same structure: date and price, particularly they have same time period but different price
date price
20140604 5
20140605 7
20140607 9
I want to combine them and create a panel dataset. Since there is no name in each datasets, I attempt to add a new variable name into each data and then combine them.
The following codes are used to add name variable into each dataset
%macro name(sourcelib=,from=,going=);
proc sql noprint; /*read datasets in a library*/
create table mytables as
select *
from dictionary.tables
where libname = &sourcelib
order by memname ;
select count(memname)
into:obs
from mytables;
%let obs=&obs.;
select memname
into : memname1-:memname&obs.
from mytables;
quit;
%do i=1 %to &obs.;
data
&going.&&memname&i;
set
&from.&&memname&i;
name=&&memname&i;
run;
%end;
%mend;
So, is this strategy correct? Is there a different way to create panel data?
There are really two ways to setup repeated measures data. You can use the TALL method that your code will create. That is generally the most flexible. The other would be a wide format with each PRICE being stored in a different variable. That is usually less flexible, but can be easier for some analyses.
You probably do not need to use macro code or even code generation to combine 10 datasets. You might find that it is easier to just type the 10 dataset names than to write complex code to pull the names from metadata. So a data step like this will let you list any number of datasets in the SET statement and use the membername as the value for the new PANEL variable that distinguishes the source dataset.
data want ;
length dsn $41 panel $32 ;
set in1.panel1 in1.panela in1.panelb indsname=dsn ;
panel = scan(dsn,-1,'.') ;
run;
And if your dataset names follow a pattern that can be used as a member list in the SET statement then the code is even easier to write. So you could have a list of names that have a numeric suffix.
set in1.panel1-in1.panel10 indsname=dsn ;
or perhaps names that all start with a particular prefix.
set in1.panel: indsname=dsn ;
If the different panels are for the same dates then perhaps the wide format is easier? You could then merge the datasets by DATE and rename the individual PRICE variables. That is generate a data step that looks like this:
data want ;
merge in1.panel1 (rename=(price=price1))
in1.panel2 (rename=(price=price2))
...
;
by date;
run;
Or perhaps it would be easier to add a BY statement to the data step that makes the TALL dataset and then transpose it into the WIDE format.
data tall;
length dsn $41 panel $32 ;
set in1.panel1 in1.panela in1.panelb indsname=dsn ;
by date ;
panel = scan(dsn,-1,'.') ;
run;
proc transpose data=tall out=want ;
by date;
id panel;
var price ;
run;
I can't comment on the SQL code but the strategy is correct. Add a name to each data set and then panel on the name with the PANELBY statement.
That is a valid way to achieve what you are looking for.
You are going to need two periods between the macro variable and the member name for library.data syntax. The first period ends the macro variable reference. The second is emitted as a literal period.
I assume you will want to append all of these data sets together. You can add
data &going..want;
set
%do i=1 %to &obs;
&from..&&memname&i
%end;
;
run;
You can combine your loop that adds the names and that data step like this:
data &going..want;
set
%do i=1 %to &obs;
&from..&&memname&i (in=d&i)
%end;
;
length name $32;
%do i=1 %to &obs;
if d&i then
name = "&&memname&i";
%end;
run;

Check if a value exists in a dataset or column

I have a macro program with a loop (for i in 1 to n). At each i, I have a table with many columns (variables). One of these columns is named var and has three possible values: a, b, and c.
So for each table i, I want to check whether its column var contains the value "c". If it does, I want to export the table to an Excel sheet. Otherwise, I will concatenate the table with the others.
Can you please tell me how can I do it?
Ok, in your macro at step i you have to do something like this
proc sql;
select min(sum(case when var = 'c' then 1 else 0 end),1) into :trigger from table_i;
quit;
Then the macro variable trigger equals 1 if you have to export and 0 if you have to concatenate. Next, code something like this:
%if &trigger = 1 %then %do;
proc export data = table_i blah-blah-blah;
run;
%end;
%else %do;
data concate_data;
set concate_data table_i;
run;
%end;
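One detail in the %ELSE branch above: the SET statement expects concate_data to exist already, which may not hold on the first pass of the loop. As a hedged alternative, PROC APPEND creates the base data set when it is missing. The sketch keeps the table names from the answer and assumes all tables share the same columns:

```sas
/* Possible body for the %ELSE %DO branch above.                       */
/* PROC APPEND creates concate_data on first use, then adds rows to it. */
proc append base=concate_data data=table_i;
run;
```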
Without knowing the whole nine yards of your problem, I'd venture that you may not need a macro at all, if you don't mind exporting to .CSV instead of native xls or xlsx. IMHO, if you are using PROC EXPORT, which means you can't embed fancy formats anyway, you are better off just using .CSV in most settings. If you need to include column headings, you need to tap into the metadata (dictionary tables) and add a few lines.
filename outcsv '/share/test/'; /*define the destination for CSV, change it to fit your real settings*/
/*This is to Cat all of the tables first, use VIEW to save space if you must*/
data want1;
set table: indsname=_dsn;
dsn=_dsn;
run;
/*Below is a classic double DOW-loop implementation*/
data want;
file outcsv(test.csv) dsd; /*This is your output CSV file, comma delimited with quotes*/
do until (last.dsn);
set want1;
by dsn notsorted; /*Use this as long as your group is clustered*/
if var='c' then _flag=1; /*_flag value will be carried on to the next DOW, only reset when back to top*/
end;
do until (last.dsn);
set want1;
by dsn notsorted;
if _flag=1 then put (_all_) (~); /*if condition meets, output to CSV file*/
else output; /*Otherwise remaining in the Cat*/
end;
drop dsn _flag;
run;

Text manipulation of macro list variables to stack datasets with automated names

I have written a macro that accepts a list of variables, runs a proc mixed model using each variable as a predictor, and then exports the results to a dataset with the variable name appended to it. I am trying to figure out how to stack the results from all of the variables in a single data set.
Here is the macro:
%macro cogTraj(cog,varlist);
%let j = 1;
%let var = %scan(&varlist, %eval(&j));
%let solution = sol;
%let outsol = &solution.&var.;
%do %while (&var ne );
proc mixed data = datuse;
model &cog = &var &var*year /solution cl;
random int year/subject = id;
ods output SolutionF = &outsol;
run;
%let j = %eval(&j + 1);
%let var = %scan(&varlist, %eval(&j));
%let outsol = &solution.&var.;
%end;
%mend;
/* Example */
%cogTraj(mmmscore, varlist = bio1 bio2 bio3);
The result would be the creation of Solbio1, Solbio2, and Solbio3.
I have created a macro variable containing the "varlist" (Ideally, I'd like to input a macro variable list as the argument but I haven't figured out how to deal with the scoping):
%let biolist = bio1 bio2 bio3;
I want to stack Solbio1, Solbio2, and Solbio3 by using text manipulation to add "Sol" to the beginning of each variable. I tried the following, outside of any data step or macro:
%let biolistsol = %add_string( &biolist, Sol, location = prefix);
without success.
Ultimately, I want to do something like this;
data Solbio_stack;
set %biolistsol;
run;
with the result being a single dataset in which Solbio1, Solbio2, and Solbio3 are stacked, but I'm sure I don't have the right syntax.
Can anyone help me with the text string/dataset stacking issue? I would be extra happy if I could figure out how to change the macro to accept %biolist as the argument, rather than writing out the list variables as an argument for the macro.
I would approach this differently. A good approach for the problem is to drive it with a dataset; that's what SAS is good at, really, and it's very easy.
First, construct a dataset that has a row for each variable you're running this on, and a variable name that contains the variable name (one per row). You might be able to construct this using PROC CONTENTS or sashelp.vcolumn or dictionary.columns, if you're using a set of variables from one particular dataset. It can also come from a spreadsheet you import, or a text file, or anything else really, or just be written as datalines, as below.
So your example would have this dataset:
data vars_run;
input name $ cog $;
datalines;
bio1 mmmscore
bio2 mmmscore
bio3 mmmscore
;;;;
run;
If your 'cog' is fairly consistent you don't need to put it in the data, if it is something that might change you might also have a variable for it in the data. I do in the above example include it.
Then, you write the macro so it does one pass on the PROC MIXED - ie, the inner part of the %do loop.
%macro cogTraj(cog=,var=, sol=sol);
proc mixed data = datuse;
model &cog = &var &var*year /solution cl;
random int year/subject = id;
ods output SolutionF = &sol.&var.;
run;
%mend cogTraj;
I put the default for &sol in there. Now, you generate one call to the macro from each row in your dataset. You also generate a list of the sol sets.
proc sql;
select cats('%cogTraj(cog=',cog,',var=',name,',sol=sol)')
into :callList
separated by ' '
from vars_run;
select cats('sol',name) into :solList separated by ' '
from vars_run;
quit;
Next, you run the macro:
&callList.
And then you can do this:
data sol_all;
set &solList.;
run;
All done, and a lot less macro variable parsing which is messy and annoying.

Get data filtered by dynamic column list in SAS stored process

My goal is to create a SAS stored process that returns data for a single dataset and filters the columns in that dataset based on a multi-value input parameter passed into the stored process.
Is there a simple way to do this?
Is there a way to do this at all?
Here's what I have so far. I'm using a macro to dynamically generate the KEEP statement to define which columns to return. I've defined macro variables at the top to mimic what gets passed into the stored process when called through SAS BI Web Services, so unfortunately those have to remain as they are. That's why I've tried to use the VVALUEX method to turn the column name strings into variable names.
Note - I'm new to SAS
libname fetchlib meta library="lib01" metaserver="123.12.123.123"
password="password" port=1234
repname="myRepo" user="myUserName";
/* This data represents input parameters to stored process and
* is removed in the actual stored process*/
%let inccol0=3;
%let inccol='STREET';
%let inccol1='STREET';
%let inccol2='ADDRESS';
%let inccol3='POSTAL';
%let inccol_count=3;
%macro keepInputColumns;
%if &INCCOL_COUNT = 1 %then
&inccol;
%else
%do k=1 %to (&INCCOL_COUNT);
var&k = VVALUEX(&&inccol&k);
%end;
KEEP
%do k=1 %to (&INCCOL_COUNT);
var&k
%end;
;
%mend;
data test1;
SET fetchlib.Table1;
%keepInputColumns;
run;
/*I switch this output to _WEBOUT in the actual stored process*/
proc json out='C:\Logs\Log1.txt';
options firstobs=1 obs=10;
export test1 /nosastags;
run;
There are some problems with this. The output uses var1, var2 and var3 as the column names and not the actual column names. It also doesn't filter by any columns when I change the output to _webout and run it using BI Web Services.
OK, I think I have some understanding of what you're doing here.
You can use KEEP and RENAME in conjunction to get your variable names back.
KEEP
%do k=1 %to (&INCCOL_COUNT);
var&k
%end;
;
This has an equivalent
RENAME
%do k=1 %to (&INCCOL_COUNT);
var&k = &&inccol&k.
%end;
;
and now, as long as the user doesn't separately keep the original variables, you're okay. (If they do, then you will get a conflict and an error).
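Note that the input parameters in the question carry their own quotes ('STREET'), while RENAME expects bare variable names. Here is a sketch of one way to handle that, assuming the parameters keep that quoted form. The macro name renameInputColumns is hypothetical:

```sas
%macro renameInputColumns;
  %local k;
  rename
  %do k = 1 %to &INCCOL_COUNT;
    /* DEQUOTE strips the surrounding quotes: 'STREET' becomes STREET */
    var&k = %sysfunc(dequote(&&inccol&k))
  %end;
  ;
%mend renameInputColumns;
```

It would be called inside the data step, alongside the KEEP statement that retains only var1 through var&INCCOL_COUNT.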
If this way doesn't work for your needs, and I don't have a solution for the _webout as I don't have a server to play with, you might consider trying this in a slightly different way.
proc format;
value agef
11-13 = '11-13'
14-16 = '14-16';
quit;
ods output report=mydata(drop=_BREAK_);
proc report data=sashelp.class nowd;
format age agef.;
columns name age;
run;
ods output close;
The first part is just a proc format to show that this grabs the formatted value not the underlying value. (I assume that's desired, as if it's not this is a LOT easier.)
Now you have the data in a dataset a bit more conveniently, I think, and can put it out to JSON however you want. In your example you'd do something like
ods output report=work.mydata(drop=_BREAK_);
proc report data=fetchlib.Table1 nowd;
columns
%do k=1 %to (&INCCOL_COUNT);
&&inccol&k.
%end;
;
run;
ods output close;
And then you can send that dataset to JSON or whatever. It's actually possible that you might be able to go more directly than that even, but I know almost nothing about PROC JSON.
Reading more about JSON, you may actually have an easier way to do this.
On the export line, you have the various format options. So, assuming we have a dataset that is just a subset of the original:
proc json out='C:\Logs\Log1.txt';
options firstobs=1 obs=10;
export fetchlib.Table1
(keep=
%do k=1 %to (&INCCOL_COUNT);
&&inccol&k.
%end;
)
/ nosastags FMTCHARACTER FMTDATETIME FMTNUMERIC ;
run;
This method doesn't allow for the variable order to be changed; if you need that, you can use an intermediate dataset:
data intermediate/view=intermediate;
/* RETAIN must come before SET for it to control the variable order */
retain
%do k=1 %to (&INCCOL_COUNT);
&&inccol&k.
%end;
;
set fetchlib.Table1;
keep
%do k=1 %to (&INCCOL_COUNT);
&&inccol&k.
%end;
;
run;
and then write that out. I'm just guessing that you can use a view in this context.
It turns out that the simplest way to implement this was to change the way that the columns (aka SAS variables) were passed into the stored process. Although Joe's answer was helpful, I ended up solving the problem by passing in the columns to the keep statement as a space-separated column list, which greatly simplified the SAS code because I didn't have to deal with a dynamic list of columns.
libname fetchlib meta library="lib01" metaserver="123.12.123.123"
password="password" port=1234
repname="myRepo" user="myUserName";
proc json out=_webout;
export fetchlib.&tablename(keep=&columns) /nosastags;
run;
Where &columns gets set to something like this:
Column1 Column2 Column3

SAS: Drop column in a if statement

I have a dataset called have with one entry with multiple variables that look like this:
message reference time qty price
x 101 35000 100 .
the above dataset changes every time in a loop where message can be ="A". If the message="X" then this means to remove 100 qty from the MASTER set where the reference number equals the reference number in the MASTER database. The price=. is because it is already in the MASTER database under reference=101. The MASTER database aggregates all the available orders at some price with quantity available. If in the next loop message="A" then the have dataset would look like this:
message reference time qty price
A 102 35010 150 500
This means a new reference number should be added to the MASTER database. In other words, the line is appended to MASTER.
I have the following code in my loop to update the quantity in my MASTER database when there is a message X:
data b.master;
modify b.master have(where=(message="X")) updatemode=nomissingcheck;
by order_reference_number;
if _iorc_ = %sysrc(_SOK) then do;
replace;
end;
else if _iorc_ = %sysrc(_DSENMR) then do;
output;
_error_ = 0;
end;
else if _iorc_ = %sysrc(_DSEMTR) then do;
_error_ = 0;
end;
else if _iorc_ = %sysrc(_DSENOM) then do;
_error_ = 0;
end;
run;
I use REPLACE to update the quantity. But since price=. in my entry when message is X, the above code sets price to missing where reference=101 in MASTER via the REPLACE statement, which I don't want. Hence, I would prefer to delete the price column if message=X in the have dataset. But I don't want to delete the price column when message=A, since I use this code
proc append base=MASTER data=have(where=(msg_type="A")) force;
run;
Hence, I have this code prior to my MODIFY statement:
data have(drop=price_alt);
set have; if message="X" then do;
output;end;
else do; /*I WANT TO MAKE NO CHANGE*/
end;run;
but it doesn't do what I want. If the message is not equal X then I don't want to drop the column. If it is equal X, I want to drop the column. How can I adapt the code above to make it work?
It's a bit of a strange request to be honest, such that it raises questions about whether what you're doing is the best way of doing it. However, in the spirit of answering the question...
The answer by DomPazz gives the option of splitting the data into two possible sets, but if you want code down the line to always refer to a specific data set, this creates its own complications.
You also can't, in one data step, tell SAS to output to the "same" data set where one instance has a column and one instance doesn't. So what you'd like, therefore, is for the code itself to be dynamic, so that the data step that runs is either one that drops the column or one that does not, depending on whether message=x. The answer to this, dynamic code, like many things in SAS, resolves to the creative use of macros. And it looks something like this:
/* Just making your input data set */
data have;
message='x';
time=35000;
qty=1000;
price=10.05;
price_alt=10.6;
run;
/* Writing the macro */
%macro solution;
%local id rc1 rc2;
%let id=%sysfunc(open(work.have));
%syscall set(id);
%let rc1=%sysfunc(fetchobs(&id, 1));
%let rc2=%sysfunc(close(&id));
%IF &message=x %THEN %DO;
data have(drop=price_alt);
set have;
run;
%END;
%ELSE %DO;
data have;
set have;
run;
%END;
%mend solution;
/* Running the macro */
%solution;
Try this:
data outX(drop=price_alt) outNoX;
set have;
if message = "X" then
output outX;
else
output outNoX;
run;
As @sasfrog says in the comments, a table either has a column or it does not. If you want to subset things where MESSAGE="X", you can use a data step like the one above to create two data sets.