Using IIF in Proc SQL - sas

I want to translate the following snippet of code for use in Proc SQL for SAS:
SUM( IIF( INLIST( a.Department, DptGI, DptOncology, DptSurgery ), 1, 0 )) as TotalApts,;
However IIF() is not recognized by PROC SQL.
Can I implement an if/else or some kind of CASE statement?
None of that seems to be working for me.

The SAS version of IIF is IFN or IFC, depending on numeric or character return value (doesn't matter what the input columns are, just what you want back). As long as you're using it in native SAS PROC SQL (and not pass-through SQL) this should work.
However, INLIST is also not a SAS function, so you would have to rewrite that part with the normal IN operator as well.
You should be able to use CASE WHEN expressions as well; if you would like help with those, post the code you have tried.

IIF() is not a standard SQL function. It is probably implemented as an extension in some dialects of SQL, most likely those created by Microsoft.
I am not sure what INLIST() is either. I will assume it tests whether the first argument appears among the other arguments. If so, you could replace it with one of the SAS functions WHICHN() or WHICHC(), depending on whether the values in the variables are numbers or strings. Both return the position of the first match, or 0 if there is none.
The normal SQL syntax for that type of operation is CASE.
SUM( case when ( a.Department= DptGI) then 1
when ( a.Department=DptOncology) then 1
when ( a.Department=DptSurgery ) then 1
else 0 end
) as TotalApts
If you are just trying to get a result of 1 or 0 and you are OK with SAS-specific syntax, then you can just sum the result of the boolean expression, because SAS evaluates a true comparison to 1 and a false one to 0.
SUM( (a.Department=DptGI) or (a.Department=DptOncology)
or (a.Department=DptSurgery)) as TotalApts
Now if DPTGI and the other values are constants and not variables then you could use the IN operator.
SUM(a.Department in ('GI','Oncology','Surgery')) as TotalApts
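The CASE form and the "sum a boolean" form count the same rows. A minimal Python sketch of that counting logic follows; it is not PROC SQL, and the sample rows are made up for illustration, with the department codes borrowed from the IN example above.

```python
# Hypothetical appointment rows; only the Department value matters here.
departments = ["GI", "Cardiology", "Oncology", "Surgery", "GI", None]
targets = {"GI", "Oncology", "Surgery"}


def count_case(rows):
    """Mirror SUM(CASE WHEN ... THEN 1 ELSE 0 END): add 1 or 0 per row."""
    total = 0
    for dept in rows:
        total += 1 if dept in targets else 0
    return total


def count_boolean(rows):
    """Mirror SUM(<boolean expression>): True counts as 1, False as 0."""
    return sum(dept in targets for dept in rows)


total_case = count_case(departments)
total_bool = count_boolean(departments)
print(total_case, total_bool)
```

Both functions return the same total, which is why either SQL form can be used for TotalApts.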

Related

Dax Switch Function 3rd value skipping

I use Power BI version 2.112.603.0 64-bit (December 2022). In this version, when I write a DAX measure with the SWITCH function, after I enter the second result and a comma the editor prompts for a third result, skipping the third value. What am I doing wrong here?
Intellisense isn't always correct. Just provide the correct number of arguments and the DAX engine will interpret them correctly.
The SWITCH function takes a minimum of 3 parameters: Expression, Value, Result. With only those three it behaves like the IF function. You can also add as many further Value, Result pairs as you want, in which case it is more similar to a CASE WHEN statement.
To come back to your example, you could add other HASONEVALUE calls with their respective Result values, and as the last parameter the else value (returned if no Value is matched).
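The argument layout described above (an expression, then value/result pairs, then an optional else value) can be modelled in a few lines. This is a Python sketch of the matching rule only, not DAX; the function name `switch` is hypothetical, and `None` stands in for the BLANK that DAX returns when nothing matches and no else value is given.

```python
def switch(expression, *args):
    """Model of SWITCH(expression, value1, result1, ..., [else_result])."""
    if len(args) < 2:
        raise TypeError("switch needs at least one value/result pair")
    # An odd number of trailing arguments means the last one is the else result.
    has_else = len(args) % 2 == 1
    pairs = args[:-1] if has_else else args
    # Walk the (value, result) pairs in order and return the first match.
    for value, result in zip(pairs[0::2], pairs[1::2]):
        if expression == value:
            return result
    # No match: fall back to the else result, or None (a stand-in for BLANK).
    return args[-1] if has_else else None


print(switch(2, 1, "one", 2, "two"))
print(switch(3, 1, "one", 2, "two", "other"))
```

Whether the final argument is read as an else value depends only on how many arguments are supplied, which is why an editor hint about the "next" parameter can look wrong while the expression itself is still valid.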

SAS Data integration studio

Long time reader first-time questioner.
Using SAS Data Integration studio, when you create a summary transformation in the table options advanced tab you can add a where statement to your code automatically. Unfortunately, it adds some code that makes this resolve incorrectly. Putting the following in the where text box:
TESTFIELD = "TESTVALUE"
creates
%let _INPUT_options = %nrquote(WHERE = %(TESTFIELD = %"TESTVALUE%"%));
This is then used in the code as
proc tabulate data = &_INPUT (&_INPUT_options)
But it resolves to
WHERE = (TESTFIELD = "TESTVALUE")
_
22
ERROR: Syntax error while parsing WHERE clause. ERROR 22-322: Syntax
error, expecting one of the following: a name, a quoted string, a
numeric constant, a datetime constant,
a missing value, (, *, +, -, :, INPUT, NOT, PUT, ^, ~.
My question is this: Is there a way to add a function to the where statement box that would allow this quotation mark to be properly added here?
Note that all functions get the preceding % when added to the where statement automatically and I have no control over that. This seems like something that should be relatively easy to fix but I haven't found a simple way yet.
The % are simply escaping the " and () characters; they're perfectly harmless, really. The bigger problem is the %NRQUOTE "quotes" (which are nonprinting characters that tell SAS this is macro-quoted); they mess up the WHERE processing.
Use %UNQUOTE( ... ) to remove these.
Example:
data have;
testfield="TESTVALUE";
output;
testfield="AMBASDF";
output;
run;
%let _INPUT_options = %nrquote(WHERE = %(TESTFIELD = %"TESTVALUE%"%));
%put &=_input_options;
data want;
set have(%unquote(&_INPUT_options.));
run;
Thank you all for your responses. Long story short, I ended up creating a SAS Troubleshooting ticket. The analyst told me that they have now documented the issue, which should now be resolved in a future iteration of DI.
The temporary solution was to create a new transformation with a slight alteration: adding a %UNQUOTE (as mentioned above by Joe) to the source code around the input options:
proc tabulate data = &_INPUT (%unquote(&_INPUT_options)) %unquote(&procOptions);
For those interested, you will need to create the transformation in a public subfolder of your project so that others can use it. Not what I was hoping for, but a workable solution while waiting for the version update.

SAS - "where variable" versus "where not missing(variable)"

In trying to make my code more readable, I face the following situation.
Consider a data step in which you want to select only observations which have a value for variable. It seems there are two ways to do this using a WHERE statement: express the variable alone or use the MISSING function.
For example,
Case 1. Where VARIABLE
data where_var;
set sashelp.electric;
where AllPower;
run;
Case 2. Where not missing(VARIABLE)
data where_not_missing;
set sashelp.electric;
where not missing(AllPower);
run;
These produce the same result. However, I'm unsure whether this is necessarily the case.
Are these functionally equivalent?
Is Case 1 merely syntactic sugar for Case 2?
Are there instances when they will produce different results?
In SAS, any numeric value other than 0 or missing is true. See https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000780367.htm for more info.
That means that your Case 1 and Case 2 are not 100% equivalent. If AllPower = 0, the result will be different.
If you are trying to determine if AllPower has a value then only Case 2 will work.
Case 1 is selecting records where AllPower is non-zero. Zero is a value and is not the same as missing.
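The difference between the two WHERE conditions can be illustrated with a small Python model. This is only a sketch of the rule stated above: `None` stands in for a SAS missing value (special missing values such as .A are ignored here), and the sample values are made up.

```python
def sas_true(x):
    """SAS treats a numeric value as true unless it is 0 or missing."""
    return x is not None and x != 0


def not_missing(x):
    """Equivalent of NOT MISSING(x) for this simplified model."""
    return x is not None


# Hypothetical AllPower values: positive, zero, missing, negative.
all_power = [12.5, 0, None, -3.0]

case1 = [x for x in all_power if sas_true(x)]     # where AllPower;
case2 = [x for x in all_power if not_missing(x)]  # where not missing(AllPower);

print(case1)
print(case2)
```

Only the row with 0 separates the two results, so the two forms agree on any data set that happens to contain no zeros.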

How to choose indexed assignment variable dynamically in SAS?

I am trying to build a custom transformation in SAS DI. This transformation will "act" on columns in an input data set, producing the desired output. For simplicity let's assume the transformation will use input_col1 to compute output_col1, input_col2 to compute output_col2, and so on up to some specified number of columns to act on (let's say 2).
In the Code Options section of the custom transformation users are able to specify (via prompts) the names of the columns to be acted on; for example, a user could specify that input_col1 should refer to the column named "order_datetime" in the input dataset, and either make a similar specification for input_col2 or else leave that prompt blank.
Here is the code I am using to generate the output for the custom transformation:
data cust_trans;
set &_INPUT0;
i=1;
do while(i<3);
call symputx('index',i);
result = myfunc("&&input_col&index");
output_col&index = result; /*what is proper syntax here?*/
i = i+1;
end;
run;
Here myfunc refers to a custom function I made using proc fcmp which works fine.
The custom transformation works fine if I do not try to take into account the variable number of input columns to act on (i.e. if I use "&&input_col&i" instead of "&&input_col&index" and just use the column result on the output table).
However, I'm having two issues with trying to make the approach more dynamic:
I get the following warning on the line containing
result = myfunc("&&input_col&index"):
WARNING: Apparent symbolic reference INDEX not resolved.
I do not know how to have the assignment to the desired output column happen dynamically; i.e., depending on the iteration of the do loop I'd like to assign the output value to the corresponding output column.
I feel confident that the solution to this must be well known amongst experts, but I cannot find anything explaining how to do this.
Any help is greatly appreciated!
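You can't use macro variables that depend on data step variables in this manner. Macro variable references are resolved when the step is compiled, before any row is processed, whereas CALL SYMPUTX only assigns the value at run time; that is why &index is still unresolved when that line is compiled.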
So you either have to
%do i = 1 %to .. ;
which is fine if you're in a macro (it won't work outside of an actual macro), or you need to use an array.
data cust_trans;
set &_INPUT0;
array in[2] &input_col1 &input_col2; *or however you determine the input columns;
array output_col[2]; *automatically names the results;
do i = 1 to dim(in);
result = myfunc(in[i]); *You quote the input - I cannot see what your function is doing, but it is probably wrong to do so;
output_col[i] = result; *assigns to the i-th output column (output_col1, output_col2);
end;
run;
That's the way you'd normally do that. I don't know what myfunc does, and I also don't know why you quote "&&input_col&index." when you pass it in; that would be a strange way to operate unless you want the name of the input column as text (and don't care what data is in that variable). If you do, then pass vname(in[i]), which returns the name of the variable as a character string.
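The idea behind the array approach is that the list of input columns is fixed before any row is read, and a run-time index then pairs each input with its output column. A rough Python sketch of that pairing follows; it is not SAS, and `my_func`, the column names and the sample row are all hypothetical stand-ins.

```python
def my_func(value):
    """Hypothetical stand-in for the FCMP function in the question."""
    return value * 2


# Column names are known up front, like the macro variables that are
# resolved when the array statement is compiled.
input_cols = ["order_datetime", "ship_datetime"]


def transform(row, cols):
    """Compute output_col1..output_colN from the listed input columns."""
    out = dict(row)
    for i, name in enumerate(cols, start=1):
        # The loop index selects both the input value and the output name.
        out[f"output_col{i}"] = my_func(row[name])
    return out


row = {"order_datetime": 10, "ship_datetime": 20}
result = transform(row, input_cols)
print(result)
```

The loop never needs to build a variable name from text at run time, which is the part the macro-based attempt could not do.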

Tuples in MDX SSAS OLAP queries (SS management studio)

I have a question about MDX tuples, I would like to gain some insight on something that seems confusing to me.
Most of the literature I have read talks about tuples being a set of coordinates, essentially pointing to a cell which contains a measure value. From what I understand, a tuple is defined as containing only one distinct member from each dimension. Typically when writing queries we don't specify every member for every dimension; we let the SSAS engine use the default members and aggregate the measure data accordingly.
Straight out of the adventure works sample OLAP database (cube) "adventure works"
A super simple query that I understand represents a tuple:
SELECT
([Date].[Calendar Quarter of Year].&[CY Q3],[Measures].[Sales Amount]) --Tuple
ON COLUMNS
FROM [Adventure Works]
SS Management studio returns this result
No problem here: the tuple specified by the &[CY Q3] member points to the cell containing the displayed measure amount. Clearly a tuple has been returned.
Typically though I use this sort of thing more often:
select
non empty ([Date].[Calendar].[Calendar Quarter],[Measures].[Sales Amount]) --Tuple??
ON COLUMNS
FROM [Adventure Works]
Which returns all the quarter totals across all years for said measure (not a great example but it's just an example):
I see this result as a set because more than one distinct member has been returned from the same dimension (Date). In fact, by default all members are returned. If so, how can it be a tuple?
So my question is this. The parentheses around the "tuple" in the query above indicate to me that I am selecting a tuple. The query engine processes it and returns a result that looks to me like a set, not because more than one cell value is returned but because more than one member from the Date dimension has been used.
The query indicates that a tuple is being selected, and the query engine seems to accept it as one. However, the result includes multiple members from the same dimension and their corresponding cell values, which tells me that more than one tuple is returned, i.e. a set.
Also, The query engine throws no error when I treat it as a set and use set functions on it:
select
nonempty({([Date].[Calendar].[Calendar Quarter],[Measures].[Sales Amount])}) --Set
ON COLUMNS
FROM [Adventure Works]
Assuming I am correct and the results do in fact represent a set (a set of tuples, one per distinct member), why does the query engine allow parentheses, which indicate selection of a tuple, to return something that is not a tuple?
This makes more sense to me :
SELECT
nonemptycrossjoin(
{[Date].[Calendar].[Calendar Quarter]}, --Set 1
{[Measures].[Sales Amount]} --Set 2
)
ON COLUMNS
FROM [Adventure Works]
At least this code reflects the result set that's returned. Thoughts?
Or is it all just Analysis Services semantics?
Thanks
Unfortunately, MDX is an ambiguous language. From what I've understood, the () notation is either a tuple or an operator-precedence notation. And:
{...} , {...}
is actually a crossjoin:
{...} * {...}
Then, when you specify a level where a set is expected, level.members is used by default. When you specify a tuple where a set is expected, a singleton set is created containing only that tuple. So:
( [level], [measures].[amount] )
is equivalent to:
crossjoin( [level].members, { [measures].[amount] } )
The only tuple (a member is a tuple) specified is [Measures].[Amount] which by the way does not use the () notation ;-)
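The expansion described in this answer (a level becomes its members, a single member becomes a singleton set, and the comma acts as a crossjoin) can be modelled with a Cartesian product. The Python sketch below only illustrates that reading and is not how Analysis Services is implemented; the member names and the helper functions `as_set` and `crossjoin` are hypothetical.

```python
from itertools import product

# Hypothetical members of a quarter level, and a single measure member.
quarter_members = ["Q1", "Q2", "Q3", "Q4"]
measure = "Sales Amount"


def as_set(x):
    """A list is already a set of members; a lone member becomes a singleton set."""
    return list(x) if isinstance(x, (list, tuple)) else [x]


def crossjoin(*sets):
    """Every combination of one member from each set, i.e. a set of tuples."""
    return list(product(*sets))


result = crossjoin(as_set(quarter_members), as_set(measure))
for tup in result:
    print(tup)
```

Under this reading the parenthesised expression yields one tuple per quarter, which matches the set-like result the question describes.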
You mention that you typically use this syntax - but I write mdx nearly every day and never use this syntax -
SELECT
NON EMPTY ([Date].[Calendar].[Calendar Quarter],[Measures].[Sales Amount]) ON COLUMNS
FROM [Adventure Works];
I'm a little surprised it runs, as it seems to be telling Analysis Services to create a tuple from a level and a measure member.
MarcP mentions that MDX is ambiguous, but I would rather say it can be written ambiguously. Unfortunately it fails quite gracefully most of the time, so you get no numbers or the wrong numbers returned. I wish it threw more errors and enforced tighter syntax rules, as this might make it more understandable.
In your original script I would just use the * operator rather than typing out the full crossjoin function when you need it. In your script it is much more readable to move the measure onto the rows and delete that tuple. As Marc mentions, SSAS applies the MEMBERS function implicitly; I find it more readable to include it explicitly when it is being used:
SELECT
NON EMPTY
[Date].[Calendar].[Calendar Quarter].MEMBERS ON 0,
[Measures].[Sales Amount] ON 1
FROM [Adventure Works];
[Date].[Calendar].[Calendar Quarter]
As Marc mentioned, this is actually the same as [Date].[Calendar].[Calendar Quarter].MEMBERS in this case. MSDN is your friend here (and for MDX, unlike some other Microsoft languages, I find MSDN very good). Here is the definition of the MEMBERS function:
https://msdn.microsoft.com/en-us/library/ms144851.aspx
telling you the return type:
Returns the set of members in a dimension, level, or hierarchy.
nonemptycrossjoin
You mentioned nonemptycrossjoin. This is not needed any more; a simple crossjoin via * is all that is needed with the last two or three versions of SSAS.