Displaying 'NA' when it's an empty row in SAS - sas

This is the given data
Company_NO Hierarchy_1 Hierarchy_2
1234 Insurance A
1234 Insurance A
1234 Auto B
5678 Claims B
5678 Claims B
5678 New C
In the above table, the column hierarchy_2 has three distinct values: A, B, and C. For company_no=1234 there is no row with hierarchy_2='C', but that row should still appear in the output, as company_no=1234, hierarchy_1='NA', hierarchy_2='C'.
Expected Output:
Company_NO Hierarchy_1 Hierarchy_2
1234 Insurance A
1234 Insurance A
1234 Auto B
1234 NA C
5678 Claims B
5678 Claims B
5678 New C
5678 NA A
As you can see above, an extra row is added with hierarchy_1='NA' whenever there is an empty row. Please help! Thank you!

One option is to first create all combinations of company_no and hierarchy_2 and then left join your dataset on this table:
data have;
    length company_no 8. hierarchy_1 hierarchy_2 $20;
    input company_no hierarchy_1 $ hierarchy_2 $;
    datalines;
1234 Insurance A
1234 Insurance A
1234 Auto B
5678 Claims B
5678 Claims B
5678 New C
;
run;

proc sql;
    create table want as
    select a.company_no
          ,case when missing(c.hierarchy_1) then "NA"
                else c.hierarchy_1
           end as hierarchy_1
          ,b.hierarchy_2
    from (select distinct company_no from have) as a
    cross join (select distinct hierarchy_2 from have) as b
    left join have as c
        on  a.company_no  = c.company_no
        and b.hierarchy_2 = c.hierarchy_2
    ;
quit;
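To see what the left join is attached to, it can help to materialize the combinations on their own. The following is only a sketch, assuming the HAVE data set created above; the table name COMBOS is made up for illustration.

```sas
proc sql;
    /* every distinct company_no paired with every distinct hierarchy_2 */
    /* value: 2 companies x 3 values = 6 rows for the sample data       */
    create table combos as
    select a.company_no
          ,b.hierarchy_2
    from (select distinct company_no from have) as a
    cross join (select distinct hierarchy_2 from have) as b
    order by a.company_no, b.hierarchy_2
    ;
quit;

proc print data=combos noobs;
run;
```

Any row of COMBOS without a match in HAVE comes back from the left join with a missing hierarchy_1, which is what the CASE expression turns into "NA".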

Related

Data step issue in sas enterprise guide

I need to write a DATA step in SAS that assigns sequence numbers to a column, starting from a particular number.
For example right now my table looks like this:
Column 1 Column 2
abc book1
xyz book2
zex book3
I want my table to look like this:
Column 1 Column 2 Column3
abc book1 151
xyz book2 152
zex book3 153
How can I add Column 3 with a sequence number starting from a particular number?
How about this
data have;
input Column1 $ Column2 $;
datalines;
abc book1
xyz book2
zex book3
;
data want;
    /* start at 151 so the first row gets 151, as in the desired output */
    do Column3 = 151 by 1 until (lr);
        set have end=lr;
        output;
    end;
run;
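Another way to get the same numbering is to lean on the automatic variable _N_, which counts DATA step iterations. This is a sketch, assuming the HAVE data set above; the macro variable name START is made up for illustration.

```sas
%let start = 151;   /* hypothetical macro variable: first sequence number */

data want;
    set have;
    /* _n_ is 1 on the first iteration, so the first row gets &start */
    Column3 = &start + _n_ - 1;
run;
```

For the three sample rows this assigns 151, 152 and 153. It only works this simply because the step reads one observation per iteration with no subsetting.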

Create multiple rows from a single row in PowerBI

I have a sales dataset where, in addition to (Sale#) and several foreign keys ID1, ID2, ID3, ID4 I have up to 3 Invoices (associated with a Sale#). E.g. Invoice#1, SaleAmount1, Date1, Invoice#2, SaleAmount2, Date2, Invoice#3, SaleAmount3, Date3; all as columns.
I need the information for these three invoices as rows (as shown below) instead of columns. Any idea how this can be done in Power BI?
Sales# ID1 ID2 ID3 Invoice# SaleAmount Date
----------------------------------------------------------
Sales#1 123 XYZ A234Y Invoice#1 SaleAmount1 Date1
Sales#1 123 XYZ A234Y Invoice#2 SaleAmount2 Date2
Sales#1 123 XYZ A234Y Invoice#3 SaleAmount3 Date3
Yep, you just need to unpivot sets of columns.
For example, if you select the 3 date columns in the following table
Sales# ID1 Date1 Date2 Date3
-----------------------------------------
Sale#1 123 1/1/2018 1/2/2018 1/3/2018
Sale#2 456 2/2/2018 3/3/2018 4/4/2018
and go to Transform > Unpivot Columns in the query editor, then you'll get this:
Sales# ID1 Attribute Value
--------------------------------
Sale#1 123 Date1 1/1/2018
Sale#1 123 Date2 1/2/2018
Sale#1 123 Date3 1/3/2018
Sale#2 456 Date1 2/2/2018
Sale#2 456 Date2 3/3/2018
Sale#2 456 Date3 4/4/2018
Then pick which column(s) you want to keep and rename appropriately.
You can do the other sets of 3 columns in the same way.

Unable to get details from both the tables in sas using join statement

I am using the code below, but in the final output the name is missing in the first row, where income is 234234. How do I get the name to appear there?
data names;
input name $ age;
datalines;
John 10
Mary 12
Sally 12
Fred 1
Paul 2
;
run;
data check;
input name $ income;
datalines;
Mary 121212
Fred 334343
Ben 234234
;
Proc sql;
title 'Inner Join';
create table common_names as
select * from names as n right join check as c on
n.name = c.name;
run;
Proc print data = common_names;
run;
Output
Inner Join
Obs name age income
1 . 234234
2 Fred 1 334343
3 Mary 12 121212
You cannot create two variables with the same name, in this case the variable NAME. So either create two variables
select n.name as name1, c.name as name2, ....
or use the COALESCE() function to create a single variable.
select coalesce(n.name,c.name) as name, ....
You might also want to look at SAS's NATURAL join. That will link tables on variables with the same name and automatically coalesce the key variable values.
create table common_names as
select *
from names as n
natural right join check as c
;
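Written out in full, the COALESCE() option might look like the sketch below. It assumes the NAMES and CHECK data sets from the question and lists the columns explicitly instead of using select *.

```sas
proc sql;
    create table common_names as
    select coalesce(n.name, c.name) as name   /* falls back to CHECK's name */
          ,n.age
          ,c.income
    from names as n
    right join check as c
        on n.name = c.name
    ;
quit;

proc print data=common_names;
run;
```

With the sample data, Ben's row should now show his name, with AGE missing because he has no row in NAMES.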

SAS: How to use results from one table to search and count based on matches in a second table

I'm attempting to put together a list of people and the number of times a claim is submitted in a unique combination.
Table A structure is setup like this:
PERSON_ID CLAIM_ID
123456 A123C
123456 Z321C
123456 B123C
111111 A123C
111111 Z321C
Table B structure is setup like this:
PERSON_ID CLAIM_1 CLAIM_2 CLAIM_3
123456 A123C Z321C B123C
123456 A123C B123C
123456 B123C
111111 A123C Z321C
111111 A321C
The results I need to produce is like this:
PERSON_ID CLAIM_ID NUM_TIMES_CLAIMED
123456 A123C 2
123456 Z321C 1
123456 B123C 3
111111 A123C 1
111111 Z321C 2
I can do this in MSAccess using loops with open recordsets and I've tried researching on how to open a SAS recordset to loop through (macros) it but I can't seem to sort out how to implement it correctly.
Any ideas?
EDIT
The steps that I think I have to take are:
Step 1 - Isolate a single person's distinct list of CLAIM_IDs
Step 2 - For each CLAIM_ID, scan across 25 variables to find a match
Step 3 - Count each time a match is found
Step 4 - Save observation (PERSON_ID, CLAIM_ID, NUM_TIMES_CLAIMED)
Moving from VBA to SAS, I can't work out how to isolate a single person's distinct list of claims and loop through them while also looping through each of the 25 variables in TABLE B.
Here's what I use to evaluate if one claim is billed with another which is what I think I need to automate somehow:
data LOCALPC.SEL_ASMT_DEL;
SET LOCALPC.FY2014_CC_FINAL;
ARRAY FSC{25} $ FSC1-FSC25;
DO I = 1 TO 25;
IF FIND (FSC{I},'A123A') THEN
DO N = I+11 TO 25;
IF FIND (FSC{N},'Z321A') THEN
OUTPUT;
END;
END;
RUN;
I think you can get the result from 'Table A' alone, assuming all the claims are inserted into Table A as rows and duplicated claims for a person_id are kept.
SELECT PERSON_ID, CLAIM_ID, COUNT(1)
FROM [TABLE A] A
GROUP BY PERSON_ID, CLAIM_ID
If not, then please describe your table structures and relations between them so that we could help you.
Not sure why you would ever use loops to answer a straightforward join. It would be easier, though, if you first converted table B to a more normalized form.
First get your sample data into datasets:
data A ;
length PERSON_ID CLAIM_ID $10 ;
input PERSON_ID CLAIM_ID ;
cards;
123456 A123C
123456 Z321C
123456 B123C
111111 A123C
111111 Z321C
;;;;
data B ;
length PERSON_ID CLAIM_1 - CLAIM_3 $10 ;
input PERSON_ID CLAIM_1-CLAIM_3 ;
cards;
123456 A123C Z321C B123C
123456 A123C B123C .
123456 B123C . .
111111 A123C Z321C .
111111 A321C . .
;;;;
Then just join the tables and count the number of matching rows.
proc sql ;
create table want as
select a.*,count(b.person_id) as num_times_claimed /* counts 0 when nothing in B matches */
from a
left join b
on a.person_id = b.person_id
and (a.claim_id = b.claim_1
or a.claim_id = b.claim_2
or a.claim_id = b.claim_3
)
group by 1,2
order by 1,2
;
quit;
proc print; run;
Results:
PERSON_ID CLAIM_ID num_times_claimed
111111 A123C 1
111111 Z321C 1
123456 A123C 2
123456 B123C 3
123456 Z321C 1
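The normalized form mentioned above could be built roughly as follows. This is a sketch, assuming the A and B data sets created above; the names B_LONG and WANT_LONG are made up for illustration, and with 25 claim columns only the array range would need to change.

```sas
/* reshape B to one row per person and non-missing claim */
data b_long;
    set b;
    array claims{*} claim_1-claim_3;
    length claim_id $10;
    do i = 1 to dim(claims);
        if not missing(claims{i}) then do;
            claim_id = claims{i};
            output;
        end;
    end;
    keep person_id claim_id;
run;

proc sql;
    create table want_long as
    select a.person_id
          ,a.claim_id
          ,count(b.claim_id) as num_times_claimed   /* 0 when no match */
    from a
    left join b_long as b
        on  a.person_id = b.person_id
        and a.claim_id  = b.claim_id
    group by a.person_id, a.claim_id
    order by a.person_id, a.claim_id
    ;
quit;
```

Once B is in long form the join condition is a plain equality, so it no longer needs one OR branch per claim column.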

Compare each row of one dataset with another dataset

Just a general question: say I have two datasets, dataset1 and dataset2, and I want to compare each row of dataset1 against all of dataset2. Below is an example of the two datasets:
Dataset1
EmployeeID Name Employeer
12345 John Microsoft
1234567 Alice SAS
1234565 Jim IBM
Dataset2
EmployeeID2 Name DateAbsent
12345 John 25/06/2009
12345 John 26/06/2009
1234567 Alice 27/06/2010
1234567 Alice 30/06/2011
1234567 Alice 2/8/2012
12345 John 28/06/2009
12345 John 25/07/2009
12345 John 25/08/2009
1234565 Jim 26/08/2009
1234565 Jim 27/08/2010
1234565 Jim 28/08/2011
1234565 Jim 29/08/2012
I have written out my logic below; it is not SAS code, just pseudocode:
for item in dataset1:
    for item2 in dataset2:
        if item.EmployeeID = item2.EmployeeID2 and item.Name = item2.Name then output newSet
This is an inner join.
proc sql noprint;
create table output as
select a.EmployeeId,
a.Name,
a.Employeer,
b.DateAbsent
from dataset1 as a
inner join
dataset2 as b
on a.EmployeeID = b.EmployeeID2
and a.Name = b.name;
quit;
I recommend reading the SAS documentation on PROC SQL if you are unfamiliar with the syntax.
To do this in a Data step, the data sets need to be sorted by the variables to join on (or indexed). Also the variable names need to be the same, so I will assume both variables are EmployeeID.
/*sort*/
proc sort data=dataset1;
by EmployeeID Name;
run;
proc sort data=dataset2;
by EmployeeID Name;
run;
data output;
merge dataset1 (in=ds1) dataset2 (in=ds2);
by EmployeeID Name;
if ds1 and ds2;
run;
The data step does the loop for you. It needs sorted sets because it only takes 1 pass over the data sets. The if clause checks to make sure you are getting a value from both data sets.
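If the key in dataset2 is still called EmployeeID2, as in the question, the names can be aligned while sorting. This is a sketch, assuming dataset1 and dataset2 exist as shown in the question and that both ID variables have the same type; the name DATASET2_SORTED is made up for illustration.

```sas
proc sort data=dataset1;
    by EmployeeID Name;
run;

/* sort by the original name, rename the key in the output copy */
proc sort data=dataset2
          out=dataset2_sorted (rename=(EmployeeID2=EmployeeID));
    by EmployeeID2 Name;
run;

data output;
    merge dataset1 (in=ds1) dataset2_sorted (in=ds2);
    by EmployeeID Name;
    if ds1 and ds2;   /* keep only keys present in both data sets */
run;
```

Writing the sorted copy to a new data set leaves dataset2 itself unchanged.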
Is your goal to compare the two datasets and see where there are differences? Proc Compare will do this for you. You can compare specific columns or the entire dataset.
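As a rough illustration of PROC COMPARE, the sketch below compares two versions of the same table. Both data sets (EMP_OLD and EMP_NEW) and the changed employer value are made up for this example; PROC COMPARE is mainly useful when the two data sets share the same columns, which is not the case for dataset1 and dataset2 in the question.

```sas
data emp_old;   /* hypothetical: original version */
    input EmployeeID Name :$20. Employeer :$20.;
    datalines;
12345 John Microsoft
1234565 Jim IBM
1234567 Alice SAS
;

data emp_new;   /* hypothetical: Jim's employer has changed */
    input EmployeeID Name :$20. Employeer :$20.;
    datalines;
12345 John Microsoft
1234565 Jim Oracle
1234567 Alice SAS
;

/* both inputs are already in EmployeeID order, as the ID statement expects */
proc compare base=emp_old compare=emp_new;
    id EmployeeID;        /* match rows on the key instead of by position */
    var Name Employeer;   /* restrict the comparison to these columns */
run;
```

The report should flag the Employeer value for EmployeeID 1234565 as the only difference.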