What's the easiest way to get SAS to do this? - sas

I have a dataset that looks like this but with many, many more variable pairs:
Stuff2016 Stuff2008 Earth2016 Earth2008 Fire2016 Fire2008
123456 5646743 45 456 456 890101
541351 543534534 45 489 489 74456
352352 564889 98 489489 1231 189
464646 542235423 13 15615 1561 78
987654 4561889 44 1212 12121 111
For each pair of almost identically named variables,
I want SAS to subtract 2016 data - 2008 data without typing the variable names.
What's the easiest way to tell SAS to do this without having to specifically type the variable names? Is there a way to tell it to subtract every other variable minus the one that precedes it without mentioning the specific variable names?
Thanks a lot!!!!

I would probably recommend three arrays, but you could do it with a single array over the input variables. The approach below depends entirely on the order of the variables in the dataset, which isn't a safe assumption in my book. Also, how would you name the results automatically?
data want;
set have;
array vars(*) stuff2016--fire2008;
array diffs(*) diffs1-diffs20; *something big enough to hold the differences;
do i=1 to dim(vars)-1 by 2; *step through the pairs: 2016 variable, then its 2008 partner;
diffs((i+1)/2) = vars(i)-vars(i+1);
end;
drop i;
run;
Instead, I'd strongly suggest using the dictionary tables to query your variable names and dynamically generate the variable lists, which are then passed to three different arrays: one for 2016, one for 2008 and one for the differences. Because every list is ordered by name, the i-th element of each array refers to the same prefix, provided every prefix has both a 2016 and a 2008 variable. The libname and memname are stored in uppercase in the dictionary tables, so keep that in mind.
data have;
input Stuff2016 Stuff2008 Earth2016 Earth2008 Fire2016 Fire2008;
cards;
123456 5646743 45 456 456 890101
541351 543534534 45 489 489 74456
352352 564889 98 489489 1231 189
464646 542235423 13 15615 1561 78
987654 4561889 44 1212 12121 111
;
run;
proc sql;
select name into :var2016 separated by " "
from sashelp.vcolumn
where libname='WORK'
and memname='HAVE'
and name like '%2016'
order by name;
select name into :var2008 separated by " "
from sashelp.vcolumn
where libname='WORK'
and memname='HAVE'
and name like '%2008'
order by name;
select catx("_", compress(name, ,'d'), "diff") into :vardiff separated by " "
from sashelp.vcolumn
where libname='WORK'
and memname='HAVE'
and name like '%2016'
order by name;
quit;
%put &var2016.;
%put &var2008.;
%put &vardiff.;
data want;
set have;
array v2016(*) &var2016;
array v2008(*) &var2008;
array diffs(*) &vardiff;
do i=1 to dim(v2016);
diffs(i)=v2016(i)-v2008(i);
end;
run;
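The array pairing above relies on every prefix having both a 2016 and a 2008 variable. A minimal sketch of a sanity check, assuming the same WORK.HAVE dataset and that stripping the digits from a name yields its prefix (the column aliases prefix and n_years are made up for illustration):

```sas
/* List any prefix that does not have exactly two year variables. */
/* dictionary.columns is the table that sashelp.vcolumn is a view of. */
proc sql;
  select compress(name, , 'd') as prefix,
         count(*) as n_years
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'HAVE'
    and (name like '%2016' or name like '%2008')
  group by calculated prefix
  having count(*) ne 2;
quit;
```

If the query selects no rows, the three macro variable lists should line up element by element; any row it does return names a prefix that would shift the pairing.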

Related

enter column in a dataset to an array

I have 33 different datasets with one column and all share the same column name/variable name;
net_worth
I want to load the values into arrays and use them in a data step, but the array that I use should depend on the by groups in the data step (country by city). There are 33 datasets and 33 by groups in total (country by city), and each dataset corresponds to exactly one by group.
here is an example what the by groups look like in the dataset: customers
UK 105 (other fields)
UK 102 (other fields)
US 291 (other fields)
US 292 (other fields)
Could I get some advice on how to load the columns into arrays and then use them in a data step? Or would you suggest doing it another way?
%let var1 = uk105;
%let var2 = uk102;
.....
%let var33 = jk12;
data want;
set customers;
by country city;
if _n_ = 1 then do;
*set datasets and create and populate arrays*;
* use array values in calculations with fields from dataset customers, depending on which by group. if the by group is uk and city is 105 then i need to use the created array corresponding to that by group;
It is a little hard to understand what you want.
It sounds like you have one dataset named CUSTOMERS that has all of the main variables, and a bunch of single-variable datasets that hold the values of NET_WORTH for a lot of different things (countries?).
Assuming that the observations in all of the datasets are in the same order, I think you are asking how to generate a data step like this:
data want;
set customers;
set uk105 (rename=(net_worth=uk105));
set uk102 (rename=(net_worth=uk102));
....
run;
The easiest way to generate those SET statements might be to write them to a temporary file with a data step and then %INCLUDE that file.
filename code temp;
data _null_;
input name $32. ;
file code ;
put ' set ' name '(rename=(net_worth=' name '));' ;
cards;
uk105
uk102
;;;;
data want;
set customers;
%include code / source2;
run;

how to use select into creating a macro array with numeric value descending

I am using select into to create a macro array:
proc sql;
select numValue into:num_value separated by ' ' from tableA;
quit;
%put %scan(&num_value,1);
However, the values in the macro variable num_value are not arranged in numeric order (from small to large).
So how can I arrange the values in descending or ascending order, or make the macro array keep the same order as the original table?
Thanks!
If I understand your question correctly, you want to order the values in the macro variable. You could do something like this:
proc sql;
select height into:height separated by ' ' from sashelp.class order by height;
quit;
%put &height;
The default order is ascending if you don't specify one. To get the values in descending order, add desc:
proc sql;
select height into:height separated by ' ' from sashelp.class order by height desc;
quit;
%put &height;
Log:
72 69 67 66.5 66.5 65.3 64.8 64.3 63.5 62.8 62.5 62.5 59.8 59 57.5 57.3 56.5 56.3 51.3
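Once the list is ordered, individual elements can be pulled out by position. The following is a sketch only; the macro name show_values is hypothetical. Note that the default delimiters of %scan include the period, so a blank is passed explicitly to keep values such as 66.5 in one piece.

```sas
proc sql noprint;
  select height into :height separated by ' '
  from sashelp.class
  order by height desc;
quit;

/* show_values is a made-up helper name, not part of the original answer */
%macro show_values;
  %local i n;
  /* count the blank-delimited values in the list */
  %let n = %sysfunc(countw(&height, %str( )));
  %do i = 1 %to &n;
    /* third argument: split on blanks only, not on periods */
    %put Value &i: %scan(&height, &i, %str( ));
  %end;
%mend show_values;
%show_values
```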

Get data set with maximum date in name by proc SQL

Suppose I have some data sets in library lib, their names look like Table_YYYYMMDD (e.g. Table_20150101).
I want to get a name of a data set with maximum date (YYYYMMDD) and store it in a macro variable.
I'm using proc sql and selecting from dictionary.tables.
First I extract a YYYYMMDD part of name. Then I should convert it to date and then find MAX. And I want to be sure that I have at least one data set in library.
proc sql;
select put(MAX(input(scan(memname, 2, '_'), yymmdd8.)), yymmddn8.)
into :mvTable_MaxDate
from dictionary.tables
where libname = 'LIB';
quit;
So,
Is it right to use sas functions like scan in proc sql?
How could I check whether the query is not empty (mvTable_MaxDate hasn't missing value)?
Thanks for your help:)
The cause of the error is that you are using the INPUTN() function, which expects the second argument to be a text literal or the name of a variable. If you change to INPUT(), it will avoid the error.
Also note you need to upcase the literal value of the library name on your where clause. Dictionary.tables stores libnames in upcase.
As written, the value of the macro variable will be a SAS date value. If you want it formatted as YYMMDDN8. you will need to add that.
Here's an example:
74 data a_20151027
75 a_20141022
76 a_20130114
77 ;
78 x=1;
79 run;
NOTE: The data set WORK.A_20151027 has 1 observations and 1 variables.
NOTE: The data set WORK.A_20141022 has 1 observations and 1 variables.
NOTE: The data set WORK.A_20130114 has 1 observations and 1 variables.
80
81 proc sql noprint;
82 select COALESCE(MAX(input(scan(memname, 2, '_'), yymmdd8.)), 0)
83 into :mvTable_MaxDate
84 from dictionary.tables
85 where libname = 'WORK';
86 quit;
87
88 %put &mvTable_MaxDate;
20388
89 %put %sysfunc(putn(&mvTable_MaxDate,yymmddn8));
20151027
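On the question of detecting an empty result: because of the COALESCE, the macro variable holds 0 when no table name yields a valid date. A minimal sketch of a check built on that, assuming the query above has already run (the macro name check_max_date is hypothetical):

```sas
/* check_max_date is a made-up name; it only inspects &mvTable_MaxDate */
%macro check_max_date;
  %if &mvTable_MaxDate = 0 %then
    %put NOTE: No dated tables were found in the library.;
  %else
    %put NOTE: Latest table is A_%sysfunc(putn(&mvTable_MaxDate, yymmddn8.));
%mend check_max_date;
%check_max_date
```

The %if needs to live inside a macro on older SAS releases, which is why the check is wrapped rather than written in open code.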
As a side-comment, often life becomes much easier if you can just combine all your data into one dataset, and store the dataset name date suffix as a variable.
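A rough sketch of that idea, assuming the three A_YYYYMMDD datasets from the log above and SAS 9.2 or later for the INDSNAME= option (the variable names source and table_date are illustrative):

```sas
data all;
  length source $41;
  /* a_2: picks up every WORK dataset whose name starts with A_2 */
  set work.a_2: indsname=_src;
  /* the INDSNAME= variable is not written to the output, so copy it */
  source = _src;
  /* take the part after the last underscore and read it as a date */
  table_date = input(scan(_src, -1, '_'), yymmdd8.);
  format table_date yymmddn8.;
run;
```

With the date stored as a variable, finding the latest data becomes an ordinary MAX over table_date rather than a lookup on dataset names.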

proc contents is truncating the values in the output dataset. How to get full values in the output dataset?

I'm trying to get all the dataset names in a library into a dataset.
proc datasets library=LIB1 memtype=data ;
contents data=_all_ noprint out=Datasets_in_Lib1(keep=memname) ;
run;
The final dataset (Datasets_in_Lib1) has all the dataset names that are in LIB1, but the names are truncated to 6 characters. Is there any way to get the full dataset names without truncation?
Ex: If the dataset name is x123456789, Datasets_in_Lib1 will have x12345 only.
Thanks in advance,
Sam.
Agree with the comment, in 9.3 memname is $32. I don't think there was a version of SAS where data set names were limited to 6 characters. They went from 8 characters to 32 characters in v7 (I think).
Here's a log from running your code, showing it works as you want in 9.3
51 data work.x123456789;
52 x=1;
53 run;
NOTE: The data set WORK.X123456789 has 1 observations and 1 variables.
54
55 proc datasets library=work memtype=data nolist;
56 contents data=_all_ noprint out=Datasets_in_Lib1(keep=memname) ;
57 run;
NOTE: The data set WORK.DATASETS_IN_LIB1 has 1 observations and 1 variables.
58
59 data _null_;
60 set Datasets_in_Lib1;
61 put _all_;
62 run;
MEMNAME=X123456789 _ERROR_=0 _N_=1
NOTE: There were 1 observations read from the data set WORK.DATASETS_IN_LIB1.
You can also query the sashelp.vtable to obtain the list of datasets:
proc sql;
create table mem_list as
select memname
from sashelp.vtable
where libname='LIB1' and memtype='DATA';
quit;

Combining data from different rows into one variable

I have a table as below:
id sprvsr phone name
2 123 5232 ali
2 128 5458 ali
3 145 7845 oya
3 125 4785 oya
I would like to collapse the rows that share the same id and name into one row, with the sprvsr values concatenated into one column and the phone values into another, as below:
id sprvsr phone name
2 123-128 5232-5458 ali
3 145-125 7845-4785 oya
Edit (follow-up question):
I followed the approach you showed me and it works. Thank you! I have one more related problem. For example:
sprvsr name
5232-5458 ali
5232-5458 ali
5458-5232 ali
Is there any way to put the concatenated values in the same order?
If you need the values in the same order, you'll need to use a temporary array and sort it. This requires having some idea of how many rows you might have per id, and it also requires the input to be sorted by id. This is a bit more complicated than the previous solution (in a previous revision).
data have;
input id sprvsr $ phone $ name $;
datalines;
2 123 5232 ali
2 128 5458 ali
3 145 7845 oya
3 125 4785 oya
4 128 5458 ali
4 123 5232 ali
;
run;
data want;
array phones[99] $8 _temporary_; *initialize these two to some reasonably high number;
array sprvsrs[99] $3 _temporary_;
length phone_all sprvsr_all $200; *same;
set have;
by id;
if first.id then do; *for each id, start out clearing the arrays;
call missing(of phones[*] sprvsrs[*]);
_counter=0;
end;
_counter+1; *increment counter;
phones[_counter]=phone; *assign current phone/sprvsr to array elements;
sprvsrs[_counter]=sprvsr;
if last.id then do; *now, create concatenated list and output;
call sortc(of phones[*]); *sort the lists;
call sortc(of sprvsrs[*]);
phone_all = catx('-',of phones[*]); *concatenate them together;
sprvsr_all= catx('-',of sprvsrs[*]);
output;
end;
drop phone sprvsr;
rename
phone_all=phone
sprvsr_all=sprvsr;
run;
The construction array[*] means "All variables of that array". So catx('-',of phones[*]) means put all phones elements in the catx (fortunately, missing ones are ignored by catx).
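To see the sort-then-concatenate step in isolation, here is a small sketch with made-up values in a three-element temporary array, one of them blank:

```sas
data _null_;
  /* illustrative values only; the blank stands for an unused slot */
  array p[3] $8 _temporary_ ('5458' ' ' '5232');
  call sortc(of p[*]);           /* blanks sort first, then 5232, then 5458 */
  joined = catx('-', of p[*]);   /* catx skips the blank element */
  put joined=;
run;
```

This should write joined=5232-5458 to the log, which is why the unused slots of the 99-element arrays above do not leave stray delimiters in the result.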
This is a way to do that:
data have;
input id sprvsr $ phone $ name $;
datalines;
2 123 5232 ali
2 128 5458 ali
3 145 7845 oya
3 125 4785 oya
;
run;
data want (drop=lag_sprvsr lag_phone);
format id;
length sprvsr $7 phone $9;
set have;
by id;
lag_sprvsr=lag(sprvsr);
lag_phone=lag(phone);
if lag(id)=id then do;
sprvsr=catx('-',lag_sprvsr,sprvsr);
phone=catx('-',lag_phone,phone);
end;
if last.id then output;
run;
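Just pay attention to the possible lengths of the input variables and that of the concatenated string. The input dataset must be sorted by id. Note that, as written, this only combines two rows per id, because lag() looks back a single observation.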
The catx() function removes the leading and trailing blanks and concatenates with a delimiter.
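If an id can have more than two rows, one possible variation is to accumulate the strings in retained variables instead of using lag(). This is a sketch under the same assumptions (input sorted by id); the names sprvsr_all and phone_all and the length of 200 are arbitrary choices, and the values are kept in input order rather than sorted.

```sas
data want;
  length sprvsr_all phone_all $200;
  retain sprvsr_all phone_all;      /* keep the running strings across rows */
  set have;
  by id;
  if first.id then call missing(sprvsr_all, phone_all);  /* reset per id */
  sprvsr_all = catx('-', sprvsr_all, sprvsr);
  phone_all  = catx('-', phone_all, phone);
  if last.id then output;           /* one row per id */
  drop sprvsr phone;
  rename sprvsr_all=sprvsr phone_all=phone;
run;
```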