Syntax error using CATX in SAS PROC SQL

I am getting a syntax error in SAS 9.4 when trying to use CATX("|", of a1-a5) in PROC SQL.
Why do the first two outputs work, but the third fails?
data test;
input a1 $ a2 $ a3 $ a4 $ a5 $;
cards;
a b c d e
f g h i j
k l m n o
p q r s t
u v w x y
;
run;
proc sql;
select CATX('|',a1,a2,a3,a4,a5) as catx from test;
quit;
data test2;
set test;
catx=CATX('|',OF a1-a5);
run;
proc print data=test2; run;
proc sql;
select CATX('|',OF a1-a5) as catx from test;
quit;
The first proc sql and the data step produce the expected "a|b|c|d|e", etc. But the third proc sql produces a syntax error pointed at the "a1":
32 proc sql;
33 select CATX('|',OF a1-a5) as catx from test;
--
22
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, *, **, +, ',', -, '.', /, <, <=, <>, =, >, >=, ?, AND, BETWEEN,
CONTAINS, EQ, EQT, GE, GET, GT, GTT, LE, LET, LIKE, LT, LTT, NE, NET, OR, ^=, |, ||, ~=.
Thanks

You have hit one of those walls in PROC SQL where some Base SAS features are not fully supported; here it is the OF variable-list syntax. As someone mentioned earlier, you're better off creating a macro variable containing the columns you need for your concatenation.
Here is a quick example:
proc sql noprint;
select name into :cols separated by ','
from dictionary.columns
where libname = "WORK" and
memname = "TEST";
quit;
%put &cols;
proc sql;
select CATX('|',&cols) as catx from test;
quit;
Obviously your WHERE clause will need to be more complex if your original data set contains columns that are not required in the CATX expression.
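One way to narrow the column list is sketched below. It assumes the wanted columns are exactly those named a followed by digits; the regular expression and the ORDER BY on varnum are illustrative choices, not part of the original answer.

```sas
proc sql noprint;
  /* Keep only columns named a<digits>; this naming pattern is an assumption */
  select name into :cols separated by ','
    from dictionary.columns
    where libname = 'WORK'
      and memname = 'TEST'
      and prxmatch('/^a\d+\s*$/i', name) > 0
    order by varnum;   /* keep the columns in data set order */
quit;

%put &cols;

proc sql;
  select catx('|', &cols) as catx from test;
quit;
```

The `\s*` before the end anchor allows for the trailing blanks that pad the name column in dictionary.columns.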

The "OF a1-a5" variable-list syntax is only valid in DATA step code.
PROC SQL implements plain SQL syntax, so DATA step variable lists can't be used inside it.

Related

Partitions in Proc SQL

Can we use partitions inside PROC SQL? If not, could you please help me with the equivalent logic?
Proc sql;
select user_id,
content_title,
calendar_date,
sum(watch_second)/3600 as hours_watched,
sum(hours_watched) over (partition by user_id, content_title order by calendar_date) AS cumulative_hours,
available_hours,
cumulative_hours / available_hours as pct_completed
from watch_history as a
inner join dim_user s
ON s.USER_ID = a.USER_ID
left join dim_content_meta pb
ON pb.metrics_video_id = a.metrics_video_id
inner join sascidm.coop_top50 v
on pb.hummus_show_id = v.hummus_show_id
where pb.hummus_playback_type in ('VOD')
AND pb.hummus_show_type = 'series'
order by 1,2,3;
quit;
Error:
sum(watch_second)/3600 as hours_watched,
29 sum(hours_watched) over (partition by user_id, content_title order by calendar_date) AS cumulative_hours,
____
22
76
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, ',', -, /, <, <=, <>, =, >, >=, ?, AND, BETWEEN,
CONTAINS, EQ, EQT, GE, GET, GT, GTT, LE, LET, LIKE, LT, LTT, NE, NET, OR, ^=, |, ||, ~=.
ERROR 76-322: Syntax error, statement will be ignored.
No, proc sql supports only a relatively limited set of SQL that's close to ANSI SQL from a few decades ago, and does not support partition by as that's much newer.
For the most part, a SAS programmer would instead use SAS procs to compute things like this rather than SQL - PROC TABULATE could probably do this I suspect.
It is simple to generate a cumulative sum in a DATA step. Just use a SUM statement. To reset the sum for each group, use the FIRST. flags to detect when a new group is starting.
Proc sql;
create table raw as
select
user_id
,content_title
,calendar_date
,sum(watch_second)/3600 as hours_watched
/*
,sum(hours_watched) over (partition by user_id, content_title order by calendar_date) AS cumulative_hours
*/
,available_hours
/*
,cumulative_hours / available_hours as pct_completed
*/
from watch_history as a
inner join dim_user s
ON s.USER_ID = a.USER_ID
left join dim_content_meta pb
ON pb.metrics_video_id = a.metrics_video_id
inner join sascidm.coop_top50 v
on pb.hummus_show_id = v.hummus_show_id
where pb.hummus_playback_type in ('VOD')
AND pb.hummus_show_type = 'series'
order by 1,2,3
;
quit;
data want;
set raw ;
by user_id content_title calendar_date;
if first.content_title then cumulative_hours=0;
cumulative_hours + hours_watched;
pct_completed = cumulative_hours / available_hours;
run;

Using the result of a query as a parameter for another query

I am trying to use a query result as a parameter for another query.
As below:
PROC SQL;
SELECT mydate INTO : varmydate FROM work.MyTable WHERE codigo = 1234;
QUIT;
PROC SQL;
SELECT * FROM work.MyOtherTable
WHERE codata = &varmydate;
QUIT;
But, unfortunately, this didn't work.
In this example, the varmydate macro variable receives a date value from the first query.
96 PROC SQL;
97 SELECT codata FORMAT date9., valor FROM work.MyOtherTable WHERE codata = &varmydate;
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
NOTE: Line generated by the macro variable "VARMYDATE".
97 01MAR2020
_______
22
76
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -, /, <, <=, <>, =, >, >=, AND, EQ, EQT, GE, GET,
GROUP, GT, GTT, HAVING, LE, LET, LT, LTT, NE, NET, OR, ORDER, ^=, |, ||, ~=.
ERROR 76-322: Syntax error, statement will be ignored.
98 QUIT;
After executing the first query, the command below:
%PUT &varmydate;
will have the following result:
01MAR2020
But the second query returns an error.
If I use the syntax below:
PROC SQL;
SELECT * FROM work.MyOtherTable
WHERE codata = '01MAR2020'd;
QUIT;
then it will work.
You did not show what the log reported when the query failed.
I presume the macro variable holds the date9. representation of the date, either because column mydate in table MyTable is a character column or because it is a numeric column with the date9. format attached (SELECT INTO stores the formatted value). If it had instead been an unformatted numeric date, the macro variable varmydate would have a value such as 21975.
Thus, you need to resolve the date9. representation inside a SAS date literal:
WHERE codata = "&varmydate"d;
Double quotes are needed because macro resolution will not occur within single quoted literals.
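A minimal end-to-end sketch of this fix follows. The two small tables are made up for illustration, and mydate is assumed to be a numeric date with the date9. format attached.

```sas
/* Hypothetical sample data standing in for the real tables */
data work.MyTable;
  codigo = 1234;
  mydate = '01MAR2020'd;
  format mydate date9.;
run;

data work.MyOtherTable;
  format codata date9.;
  codata = '01MAR2020'd; valor = 10; output;
  codata = '02MAR2020'd; valor = 20; output;
run;

proc sql noprint;
  /* INTO stores the formatted value, i.e. text in date9. form */
  select mydate into :varmydate
    from work.MyTable
    where codigo = 1234;
quit;

proc sql;
  /* Double quotes let the macro variable resolve inside the date literal */
  select codata, valor
    from work.MyOtherTable
    where codata = "&varmydate"d;
quit;
```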

list the values instead of the formatted values with proc sql select into

When doing a proc sql select into, I get the list of formatted values.
proc sql noprint;
select germ
into :oklist separated by '","'
from maxposition
where max <=10
;run;quit;
%put oklist=("&oklist");
oklist=("B. pertussis","Campylobacter","C. trachomatis","E. coli (VTEC)","Giardia","L.
pneumophila","Salmonella","Hepatitis A virus","Hepatitis B virus","Hepatitis C virus","Influenza
virus","Mumps virus")
How can you list the unformatted values instead?
(I mean without changing or removing the format)
Thanks!
You can use the format= column modifier in PROC SQL to assign a generic format.
proc sql noprint;
select germ format=$20.
into :oklist separated by '","'
from maxposition
where max <=10
;run;quit;
%put oklist=("&oklist");
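The `$20.` format above only applies if germ is a character column. If germ is instead a numeric code with a user-defined format attached (an assumption; the question does not say), a numeric format would be needed. A minimal sketch, where the format germf and the sample rows are hypothetical:

```sas
/* Hypothetical user-defined format and sample data */
proc format;
  value germf 1 = 'B. pertussis'
              2 = 'Campylobacter';
run;

data maxposition;
  format germ germf.;
  germ = 1; max = 5; output;
  germ = 2; max = 8; output;
run;

proc sql noprint;
  /* best12. overrides germf. for this query only; the stored format is unchanged */
  select germ format=best12.
    into :oklist separated by ','
    from maxposition
    where max <= 10;
quit;

%put oklist=(&oklist);
```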

SAS macro to resolve in proc sql into statement

Can anyone help me with the syntax error here?
Y is a dataset that contains some values, let's say 1, 2, 3, 4 (it actually contains many records).
/*This is working fine*/
proc sql;
select count(*) into:m from y;
select x into:a1 - :a4 from y;
quit;
%put &m &a1 &a2 &a3 &a4;
/* When I try to create macro variables a1 to a4 this way, I get an error. Below is my approach. */
proc sql;
select count(*) into:m from y;
select x into:a1 - :a||trim(left(&m.)) from y;
quit;
%put &m &a1 &a2 &a3 &a4;
Can someone please help me with this and explain the reason for the error?
You no longer need to say how many. SQL will create just enough macro variables.
data y;
do x = 1 to 4;
output;
end;
run;
proc sql;
select count(*) into:m from y;
select x into:a1- from y;
quit;
%put &m &a1 &a2 &a3 &a4;
This is valid at least as of 9.3.
The syntax error is probably caused by not understanding how the macro processor works. It is just a tool for generating text. The generated text needs to be valid SAS code for it to execute. So trying to write something like:
into :a1 - :a||trim(left(&m.))
will not work. The only macro trigger there is the reference to the M macro variable. So that would evaluate to:
into :a1 - :a||trim(left( 4))
but the INTO clause just wants the name of the upper bound of the macro variable list there. It cannot handle the || concatenation operator or calls to functions like TRIM() or LEFT().
Fortunately you do not need these, since the INTO clause of PROC SQL is smart enough to generate only as many macro variables as you need. If you are using a current version of SAS you can leave the upper bound empty.
into :a1 -
or, if you are running an old version, just use an upper bound that is larger than any expected number of observations.
into :a1-:a999999
Also, you do not need to run the query twice to find out how many records SQL found. PROC SQL stores the count in the automatic macro variable SQLOBS. So your query becomes:
proc sql noprint;
select x into :a1- from y;
%let m=&sqlobs;
quit;
While DN is correct in the other answer that you really don't have to bother with this anymore, I thought it useful to show you the error in your attempt - which was not an unreasonable approach.
You can't use trim the way you did. TRIM operates on character expressions, not on program code, and program code is what a macro variable produces. You've got a few options for how to do this.
First, there is the %trim autocall macro. That's how you'd do what you were trying to do directly.
proc sql;
select count(*) into :ccount trimmed from sashelp.class;
select name into :a1 - :a%trim(&ccount.) from sashelp.class;
quit;
%put &ccount &a1 &a2 &a3 &a4;
Second, SQL will actually do this for you, in two possible ways: the trimmed keyword, or separated by.
Using the trimmed keyword as part of the into clause, or using separated by even when there's only one resulting value, will cause a trimmed result (so, left-aligned with no spaces following or preceding) to be stored in the destination macro variable. (Why the defaults are different for separated by vs. not is one of the weird things in SAS related to not breaking code that works, even when it's a silly result.)
proc sql;
select count(*) into :ccount trimmed from sashelp.class;
select name into :a1 - :a&ccount. from sashelp.class;
quit;
%put &ccount &a1 &a2 &a3 &a4;

How to count distinct of the concatenation/cross of two variables in SAS Proc Sql?

I know that in Teradata or other SQL platforms you can find the distinct count of a combination of variables by doing:
select count(distinct x1||x2)
from db.table
And this will give the number of unique (x1, x2) pairs.
This syntax, however, does not work in PROC SQL.
Is there any way to perform such a count in PROC SQL?
Thanks.
That syntax works perfectly fine in PROC SQL.
proc sql;
select count(distinct name||sex)
from sashelp.class;
quit;
If the fields are numeric, you must convert them to character (using put) or use cat or one of its siblings, which happily take either numeric or character arguments.
proc sql;
select count(distinct cats(age,sex))
from sashelp.class;
quit;
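One caveat worth illustrating: concatenating without a delimiter can merge pairs that are actually different once the values are trimmed, as CATS does. The sketch below uses a made-up table, pairs, to compare CATS with CATX using a delimiter, and with a subquery that avoids concatenation altogether. The delimiter '|' is assumed not to occur in the data.

```sas
/* Hypothetical data: two different pairs that collide once trimmed and joined */
data pairs;
  input x1 $ x2 $;
  cards;
ab c
a bc
;
run;

proc sql;
  select count(distinct cats(x1, x2))      as no_delim,   /* both rows become 'abc'  */
         count(distinct catx('|', x1, x2)) as with_delim  /* 'ab|c' and 'a|bc' differ */
    from pairs;

  /* Alternative without concatenation: count the rows of a DISTINCT subquery */
  select count(*) as n_pairs
    from (select distinct x1, x2 from pairs);
quit;
```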
This may be redundant, but when you mentioned "combination", it instantly triggered "permutation" in my mind. So here is one solution to differentiate the two:
DATA TEST;
INPUT (X1 X2) (:$8.);
CARDS;
A B
B A
C D
C D
;
PROC SQL;
SELECT COUNT(*) AS TOTAL, COUNT(DISTINCT CATS(X1,X2)) AS PERMUTATION,
COUNT(DISTINCT CATS(IFC(X1<=X2,X1,X2),IFC(X1>X2,X1,X2))) AS COMBINATION
FROM TEST;
QUIT;