Convert string with spaces to valid table name - sas

I want to create a series of tables using SAS macro language, but the strings I am trying to pass through have spaces in them. Any ideas on what to add to make them valid table names?
%macro has_spaces(string);
proc sql;
create table &string. as
select
*
from my_table
;
quit;
%mend;
%has_spaces(has 2 spaces);
Thanks.

Another option is translate:
%macro has_spaces(string);
proc sql;
create table %sysfunc(translate(&string.,_,%str( ))) as
select *
from my_table
;
quit;
%mend;

You could do something like this, which removes pretty much anything that isn't valid in a SAS table name and replaces each run of spaces with a single underscore. We use a similar approach when creating file names based on customer names that contain all kinds of weird symbols, spaces, and so on:
Macro Version:
%macro clean_variable(iField=);
%local clean_variable;
%let clean_variable = %sysfunc(compress(&iField,,kns));
%let clean_variable = %sysfunc(compbl(&clean_variable));
%let clean_variable = %sysfunc(translate(&clean_variable,_,%str( )));
&clean_variable
%mend;
Test Case 1:
%let x = "kjJDHF f'ke''''j d (kdj-328) *#& J#ld!!!";
%put %clean_variable(iField=&x);
Result:
kjJDHF_fkej_d_kdj328_Jld
Your test case:
%macro has_spaces(string);
proc sql;
create table %clean_variable(iField=&string) as
select *
from sashelp.class
;
quit;
%mend;
%has_spaces(has 2 spaces);
Result:
NOTE: Table WORK.HAS_2_SPACES created, with 19 rows and 5 columns.
FCMP Version:
proc fcmp outlib=work.funcs.funcs;
function to_valid_sas_name(iField $) $32;
length clean_variable $32;
clean_variable = compress(iField,'-','kns');
clean_variable = compbl(clean_variable);
clean_variable = translate(cats(clean_variable),'_',' ');
clean_variable = lowcase(clean_variable);
return (clean_variable);
endsub;
run;
Example FCMP Usage (the CMPLIB option tells SAS where to find the compiled function):
options cmplib=work.funcs;
data x;
length invalid_name valid_name $100;
invalid_name = "kjJDHF f'ke''''j d (kdj-328) *#& J#ld!!!";
valid_name = to_valid_sas_name(invalid_name);
put _all_;
run;
Result:
invalid_name=kjJDHF f'ke''''j d (kdj-328) *#& J#ld!!! valid_name=kjjdhf_fkej_d_kdj-328_jld
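Please note that there are limits to what you can name a table in SAS, i.e. the name must start with an underscore or a letter and must be no more than 32 characters long. You can add additional logic to handle that if needed. Note also that the FCMP version as written keeps hyphens (the '-' argument to compress()), which are not valid in a standard SAS name; drop that argument if you need a strictly valid name.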
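As a rough sketch of that additional logic, the macro below takes an already-cleaned name, prefixes an underscore when the first character is a digit, and truncates the result to 32 characters. The name fix_name is hypothetical and not part of the answer above, and the sketch assumes the input is non-empty and contains only letters, digits and underscores.

```sas
/* Hypothetical helper (not from the original answer):            */
/* make a cleaned string start with a valid character and fit 32. */
%macro fix_name(name);
  %local out;
  %let out = &name;
  /* anydigit() returns the position of the first digit; 1 means a leading digit */
  %if %sysfunc(anydigit(&out)) = 1 %then %let out = _&out;
  /* SAS table names are limited to 32 characters */
  %if %length(&out) > 32 %then %let out = %substr(&out, 1, 32);
  &out
%mend fix_name;

%put %fix_name(2_spaces_here);
```

With the sample argument this should write _2_spaces_here to the log. It could be chained after the cleaning macro, for example %fix_name(%clean_variable(iField=&string)).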

Compress out the spaces. One method is to use the DATA step compress() function within %SYSFUNC, e.g.
%macro has_spaces(string);
proc sql;
create table %SYSFUNC(compress(&string)) as
select
*
from my_table
;
quit;
%mend;
%has_spaces(has 2 spaces);

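Just put the table name in quotes followed by an n (a SAS name literal). For example, if your table name is Table one, pass "Table one"n as the argument. Table names with spaces are only accepted when the VALIDMEMNAME=EXTEND system option is in effect.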
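A minimal sketch of the name-literal approach applied to the macro from the question follows. It assumes SAS 9.3 or later (where VALIDMEMNAME=EXTEND is available) and uses sashelp.class as a stand-in for my_table.

```sas
/* Allow table names containing spaces and other special characters */
options validmemname=extend;

%macro has_spaces(string);
  proc sql;
    /* "&string."n resolves to a name literal such as "has 2 spaces"n */
    create table "&string."n as
    select *
    from sashelp.class;
  quit;
%mend has_spaces;

%has_spaces(has 2 spaces)
```

The table then has to be referenced with the same name literal everywhere, e.g. proc print data="has 2 spaces"n; which is the main trade-off compared with converting the string to a conventional name.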

Related

dynamically exclude column names in proc sql macro

I have a proc sql statement in a macro function that selects column names from dictionary.columns. I would like to exclude column names based on multiple string patterns that are passed as arguments (see below):
%symdel keepnames;
%macro test(data=, col=);
%global keepnames;
%let data_lib = %sysfunc(upcase(%sysfunc(scan("&data", 1, "."))));
%let data_data = %sysfunc(upcase(%sysfunc(scan("&data", 2, "."))));
%put &data_lib;
%put &data_data;
proc sql noprint;
select name into :keepnames separated by " "
from dictionary.columns
where libname = "&data_lib" and
memname = "&data_data" and
upcase(name) not like upcase("&col.");
quit;
%mend test;
%test(data=sashelp.cars, col=mpg w)
%put &keepnames;
Ideally, the col argument would turn into %mpg%, %w% thereby excluding any column names with mpg or w in their name.
There are a couple issues I'm encountering. First, I can't quite figure out how to hide the % from the macro processor. I tried using %str() in several ways but without luck. Second, I can't easily add % symbols around the words in the col argument. Any help is appreciated!
Change the macro parameter names to something more informative, for example:
%macro fetch_names (data=, dropPattern=, resultVar=fetchedNames);
...
%mend;
Consider passing a regular expression instead of a space separated list of values that would have to be iterated over.
%let fetchedNames = ;
%fetch_names (
data = sashelp.cars
, dropPattern = mpg|w /* <------- regular expression pattern */
, resultVar = fetchedNames
)
The body of the macro would be similar, with two changes:
change into :keepnames to into :&resultVar
change upcase(name) not like upcase("&col.") to not prxmatch("/&dropPattern./i", name)
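Put together, the regular-expression version might look like the sketch below. It is assembled from the two changes just described and the original macro; the use of %upcase and %scan to split the data set name is my own simplification, not something the answer specifies.

```sas
%macro fetch_names(data=, dropPattern=, resultVar=fetchedNames);
  %local data_lib data_data;
  /* Split LIB.MEMBER into its two parts; dictionary.columns stores them in upper case */
  %let data_lib  = %upcase(%scan(&data, 1, .));
  %let data_data = %upcase(%scan(&data, 2, .));
  proc sql noprint;
    select name into :&resultVar separated by ' '
    from dictionary.columns
    where libname = "&data_lib"
      and memname = "&data_data"
      /* keep only names that do NOT match the pattern (case-insensitive) */
      and not prxmatch("/&dropPattern./i", name);
  quit;
%mend fetch_names;

/* Create the result variable first so INTO writes to the global one, */
/* not to a variable local to the macro.                              */
%let fetchedNames = ;
%fetch_names(data=sashelp.cars, dropPattern=mpg|w, resultVar=fetchedNames)
%put &=fetchedNames;
```

Because the pattern is a regular expression, no % wildcards are needed at all, which sidesteps the problem of hiding them from the macro processor.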
I made use of the contains operator in the following and looped through the arguments in col to generate separate tests for each exclusion. For some reason I don't have cars so I used class:
%symdel keepnames;
%macro test(data=, col=);
%global keepnames;
%let data_lib = %sysfunc(upcase(%sysfunc(scan("&data", 1, "."))));
%let data_data = %sysfunc(upcase(%sysfunc(scan("&data", 2, "."))));
%put &data_lib;
%put &data_data;
%let i = 1;
%let exclusion = %scan(&col,&i);
proc sql noprint;
select name into :keepnames separated by " "
from dictionary.columns
where libname = "&data_lib" and
memname = "&data_data"
%do %while(&exclusion ne );
and upcase(name) not contains upcase("&exclusion")
%let i = %eval(&i + 1);
%let exclusion = %scan(&col,&i);
%end;
;
quit;
%mend test;
option mprint;
%test(data=sashelp.class, col=ame x)
%put &keepnames;
%test(data=sashelp.class)
%put &keepnames;
%test(data=sashelp.class, col=e)
%put &keepnames;

sas + formatting macro variable result created using proc sql select into:

I use code to write tables that captures the total count for each column as a macro variable, then uses it in the labels statement to complete the table column headers.
The count cohort&cnum._tot is created as:
proc sql noprint;
select count(*) into : cohort&cnum._tot from &analytic_file. (&&cohort&cnum);
quit;
And is used:
proc print data=TABLES.&tbl noobs label split="*";
var label_
c1_STAT1 c2_STAT1 c12_stat
c3_STAT1 c4_STAT1 c34_stat
c5_STAT1 c6_STAT1 c56_stat ;
* labeling step creates column header detail ;
label
%do i=1 %to &num;
c&i._STAT1 = "&&&c&i.lab. * N= &&cohort&i._tot. * N"
%end;
c12_stat = "* * * % of row"
c34_stat = "* * * % of row"
c56_stat = "* * * % of row"
;
run;
I've looked around and can't find a solution ... so I'm here asking is there a way to format &&cohort&i._tot. so that it returns 8,675,309 instead of 8675309?
Thanks!
You can format the count(*) in the select by using the PUT function. In this example the row count is multiplied to get a number large enough to require commas. The TRIMMED option removes leading and trailing spaces from the value before sticking it into the macro variable.
proc sql noprint;
select put( 123456789 * count(*),comma18.-L) into :count trimmed from sashelp.class;
quit;
%put !&count.!;
The alternative is to store the count unformatted and format the macro value with %sysfunc at the point of use. Either of these two ways works:
%put %sysfunc(sum(&count.), comma12.); %* format feature of sysfunc evaluation;
%put %sysfunc(putn(&count , comma12.)); %* versus putn function;
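You can also assign the format in your proc sql using format=comma12.
Your code would look like this (adding TRIMMED after the macro variable name would strip the leading blanks that the format width introduces):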
proc sql noprint;
select count(*) format=comma12. into : cohort&cnum._tot from &analytic_file. (&&cohort&cnum);
quit;

How to scan a numeric variable

I have a table like this:
Lista_ID 1 4 7 10 ...
in total there are 100 numbers.
I want to pass each of these numbers to a macro I created. I was trying to use scan, but I read that it is only for character variables.
Here is the code I ran:
proc sql;
select ID INTO: LISTA_ID SEPARATED BY '*' from
WORK.AMOSTRA;
run;
PROC SQL;
SELECT COUNT(*) INTO: NR SEPARATED BY '*' FROM
WORK.AMOSTRA;
RUN;
%MACRO CICLO_teste();
%LET LIM_MSISDN = %EVAL(NR);
%LET I = %EVAL(1);
%DO %WHILE (&I<= &LIM_MSISDN);
%LET REF = %SCAN(LISTA_ID,&I,,'*');
DATA WORK.UP&REF;
SET WORK.BASE&REF;
FORMAT PERC_ACUM 9.3;
IF FIRST.ID_CLIENTE THEN PERC_ACUM=0;
PERC_ACUM+PERC;
RUN;
%LET I = %EVAL(&I+1);
%END;
%MEND;
%CICLO_TESTE;
The errors were:
VARIABLE PERC IS UNINITIALIZED and
VARIABLE FIRST.ID_CLIENTE IS UNINITIALIZED.
What I want is to run this macro for each of the IDs in the list I showed above, which are referenced in work.base&ref and work.up&ref.
How can I do it? What am I doing wrong?
thanks!
Here's the CALL EXECUTE version.
%MACRO CICLO_teste(REF);
DATA WORK.UP&REF;
SET WORK.BASE&REF;
BY ID_CLIENTE;
FORMAT PERC_ACUM 9.3;
IF FIRST.ID_CLIENTE THEN PERC_ACUM=0;
PERC_ACUM+PERC;
RUN;
%MEND CICLO_teste;
DATA _NULL_;
SET amostra;
*CREATE YOUR MACRO CALL;
STR = CATT('%CICLO_TESTE(', ID, ')');
CALL EXECUTE(STR);
RUN;
First, note that SAS macro variable resolution is essentially a text-based copy-and-paste operation. That is, all user-defined macro variables hold text. Therefore, %eval is unnecessary in this case.
Other miscellaneous corrections include:
Check the %scan() function for correct usage. The first argument should be a text string WITHOUT QUOTES, and a macro variable must be referenced with an ampersand (&lista_id, &nr).
run is redundant in proc sql, since each SQL statement is executed as soon as it is submitted. Use quit; to exit proc sql.
A semicolon is not required after a macro call (it sometimes causes unexpected problems).
Use %do %to for loops.
FIRST.ID_CLIENTE only exists when the DATA step has a BY ID_CLIENTE statement (and the data are sorted accordingly), which is why it was reported as uninitialized.
The code below should work.
data work.amostra;
input id;
cards;
1
4
7
10
;
run;
proc sql noprint;
select id into :lista_id separated by ' ' from work.amostra;
select count(*) into :nr separated by ' ' from work.amostra;
quit;
* check;
%put lista_id=&lista_id nr=&nr;
%macro ciclo_teste();
%local ref;
%do i = 1 %to &nr;
%let ref = %scan(&lista_id, &i);
%*check;
%put ref = &ref;
/* your task below */
/* data work.up&ref;*/
/* set work.base&ref;*/
/* format perc_acum 9.3;*/
/* if first.id_cliente then perc_acum=0;*/
/* perc_acum + perc;*/
/* run; */
%end;
%mend;
%ciclo_teste()
tested on SAS 9.4 win7 x64
Edited:
In fact, I would recommend the following instead, to avoid repeatedly scanning a long string, which is inefficient.
%macro tester();
/* get the number of obs (a more efficient way) */
%local NN;
proc sql noprint;
select nobs into :NN
from dictionary.tables
where upcase(libname) = 'WORK'
and upcase(memname) = 'AMOSTRA';
quit;
/* assign &ref by random access */
%do i = 1 %to &NN;
data _null_;
a = &i;
set work.amostra point=a;
call symputx('ref',id,'L');
stop;
run;
%*check;
%put ref = &ref;
/* your task below */
%end;
%mend;
%tester()
Please let me know if you have further questions.
Wow that seems like a lot of work. Why not just do the following:
data work.amostra;
input id;
cards;
1
4
7
10
;
run;
%macro test001;
proc sql noprint;
select count(*) into: cnt
from amostra;
quit;
%let cnt = &cnt;
proc sql noprint;
select id into: x1 - :x&cnt
from amostra;
quit;
%do i = 1 %to &cnt;
%let x&i = &&x&i;
%put &&x&i;
%end;
%mend test001;
%test001;
Now the variables &x1 - &&x&cnt hold your values and you can process them however you like.
In general, if your list is small enough (macro variables are limited to 64K characters), you are better off passing the list in a single delimited macro variable instead of multiple macro variables. Remember that PROC SQL automatically sets the macro variable SQLOBS to the number of rows returned, so there is no need to run the query twice. Or you can use %sysfunc(countw()) to count the number of entries in your delimited list.
proc sql noprint ;
select id into :idlist separated by '|' from .... ;
%let nr=&sqlobs;
quit;
...
%do i=1 %to &nr ;
%let id=%scan(&idlist,&i,|);
data up&id ;
...
%end;
If you do generate multiple macro variables there is no need to set the upper bound in advance as SAS will only create the number of macro variables it needs based on the number of observations returned by the query.
select id into :idval1 - from ... ;
%let nr=&sqlobs;
If you are using an older version of SAS, then you need to set an upper bound on the macro variable range.
select id into :idval1 - :idval99999 from ... ;
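The countw() variant mentioned above is not shown in the answer, so here is a small sketch of it using the sample work.amostra data from earlier in the thread. The macro name loop_ids and the %put line are placeholders for the real per-ID work.

```sas
data work.amostra;
  input id;
  cards;
1
4
7
10
;
run;

proc sql noprint;
  select id into :idlist separated by '|' from work.amostra;
quit;

%macro loop_ids;
  %local i n id;
  /* countw() counts the |-delimited entries, so no second query is needed */
  %let n = %sysfunc(countw(&idlist,|));
  %do i = 1 %to &n;
    %let id = %scan(&idlist, &i, |);
    /* placeholder for the real work, e.g. data up&id; set base&id; ... */
    %put NOTE: processing id=&id;
  %end;
%mend loop_ids;

%loop_ids
```

Counting from the list itself keeps the loop correct even if other SQL statements have run in between and overwritten SQLOBS.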

Create a sequence of new column names

I have a hundred or so columns which I would like to rename in SAS using the following macro:
%macro rename1(oldvarlist, newvarlist);
%let k=1;
%let old = %scan(&oldvarlist, &k);
%let new = %scan(&newvarlist, &k);
%do %while(("&old" NE "") & ("&new" NE ""));
rename &old = &new;
%let k = %eval(&k + 1);
%let old = %scan(&oldvarlist, &k);
%let new = %scan(&newvarlist, &k);
%end;
%mend;
The columns are currently named C5, C7, C9, ..., C205 and I would like to rename them AR_0, AR_1, ..., AR_100.
With the macro above, how can I put these new names after the comma of the following code without writing each and every one of them?
%rename1(C5--C205, # new names here #);
This is a somewhat longer solution, but it's fairly dynamic and it's easy to see how things work. I'm assuming you'll use the rename statement in proc datasets. Otherwise you could just be lazy and use arrays to copy the values and then drop the old variables, though that isn't efficient.
proc sql;
create table oldvar as
select name, varnum
from sashelp.vcolumn
where upcase(libname)='SASHELP'
and upcase(memname)='CLASS'
order by varnum;
quit;
data rename;
set oldvar;
new_var=catx("_", "AR",varnum);
run;
proc sql noprint;
select catx("=", name, new_var) into :rename_list
separated by " "
from rename;
quit;
%put rename &rename_list;
proc datasets library=work;
modify my_dataset;
rename &rename_list;
run;quit;
This first finds the old columns, generates the new names AR_#, and creates a macro variable, varlist, that you can use:
proc sql noprint;
create table newvar as
select name
from sashelp.vcolumn
where libname="SASHELP" and memname="CLASS"
order by name;
quit;
data newvar;
set newvar;
name=compress("AR_"!!put(_n_,4.));
run;
proc sql noprint;
select name into :varlist separated by " "
from newvar;
quit;
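Something like this would probably do the job, provided oldvarlist is an explicit space-separated list of names (%scan does not expand a range such as C5--C205) and numbering from 1 is acceptable: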
%macro rename2(oldvarlist, newPrefix);
%let k=1;
%let old = %scan(&oldvarlist, &k);
%do %while(("&old" NE ""));
rename &old = &newPrefix.&k.;
%let k = %eval(&k + 1);
%let old = %scan(&oldvarlist, &k);
%end;
%mend;
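Since the old names follow a regular pattern, another option is to generate both sides of each rename pair from a counter, so that neither list has to be typed out. The sketch below is only an illustration: the macro name rename_seq, its parameters and the three-column test data set are all hypothetical, and it assumes the old names really are a prefix followed by evenly spaced numbers.

```sas
/* Hypothetical helper: emits pairs such as C5 = AR_0 C7 = AR_1 ... */
%macro rename_seq(first=5, last=205, step=2, oldPrefix=C, newPrefix=AR_);
  %local i k;
  %let k = 0;
  %do i = &first %to &last %by &step;
    &oldPrefix.&i = &newPrefix.&k
    %let k = %eval(&k + 1);
  %end;
%mend rename_seq;

/* Small test data set with columns C5, C7 and C9 */
data work.my_dataset;
  C5 = 1; C7 = 2; C9 = 3;
run;

proc datasets library=work nolist;
  modify my_dataset;
  /* the macro only generates the pairs; the statement's semicolon stays here */
  rename %rename_seq(first=5, last=9);
quit;
```

For the columns in the question, calling %rename_seq() with its defaults would generate C5 = AR_0 through C205 = AR_100.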

sas how to use a list to store count distinct of all the variables in a table

I want to store the count distinct of each variable from a table in another. I wanted to use a loop for it, over the list of the variables. So first, I stored the variables names in "vars", doing this:
proc sql ;
select name
into :vars separated by ' '
from dictionary.columns
where libname eq 'HW' and
memname eq "ORDERS";
quit;
Then, I created another list with the result of the count distinct with the following code:
%macro g();
%let b=;
%do i = 1 %to 3;
%let a=%scan(&vars,&i);
proc sql;
select count(distinct &a)
into :gaby from hw.ORDERS;
quit;
%let b=&b &gaby;
%end;
%put &b;
%mend g;
%g();
After this, I wanted to add both to a table, but I can add the vars variable but not the b variable.
data a;
call symput('lista', symget('vars'));
call symput('lista1', symget('b'));
do i=1 to 3;
timept=i;
variable=scan("&vars",i);
dist=scan("&b",i);
output;
end;
run;
The table shows correctly the name of the variables but instead of showing the count distinct (that were stored in b) shows the letter "b".
Is there a way to perform this? also, is there a way to perform it easily?
Thanks!!!!!!!!!!
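You're pretty close. The letter b most likely shows up because the macro variable b is local to %g, so "&b" is left unresolved in the DATA step and scan() returns the literal text b. I would just use a single SQL pass and create an output table directly. If you want it in column form, then use PROC TRANSPOSE.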
proc sql noprint;
select name
into :vars separated by ' '
from dictionary.columns
where libname eq 'SASHELP' and
memname eq "SHOES";
quit;
%put &vars;
%macro create_table();
proc sql noprint;
%local i n var;
%let n = %sysfunc(countw(&vars));
create table output as
select
%do i=1 %to %eval(&n-1);
%let var = %scan(&vars,&i);
count(distinct &var) as &var,
%end;
%let var = %scan(&vars,&n);
count(distinct &var) as &var
from sashelp.shoes;
quit;
%mend;
%create_table;
proc transpose data=output out=want(rename=(_NAME_=variable COL1=Dist));
run;