I'm very new to SAS.
I want to check whether a value exists for each date.
Example: I have a table with dates from 1may to 3may and a variable called y. I want to check whether y="hi" exists for every day. If it doesn't exist, I want to create the field.
input
| date | y |
|:-----|:--|
|1may |la |
|1may |le |
|1may |hi |
|2may |la |
|2may |le |
|3may |la |
|3may |le |
|3may |hi |
output
| date | y |
|:-----|:--|
|1may | la|
|1may | le|
|1may |hi |
|2may |la |
|2may |le |
|2may |hi |
|3may |la |
|3may |le |
|3may |hi |
Sorry for my English.
Thank you
It looks like you want to add an observation, not make a new variable (aka FIELD).
One way to do this is to make a dataset with one observation per DATE and with 'hi' as the value of Y, and then combine that with the existing dataset.
So if your input dataset is named HAVE and it is already sorted by DATE and Y, your code could be:
data all_dates;
set have ;
by date ;
if first.date ;
y= 'hi';
keep date y;
run;
data want ;
merge all_dates have ;
by date y ;
run;
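To try the steps above, the question's input can be set up first. This is only a sketch: the year 2024 is an assumption (the question gives none), and DATE is assumed to be a real SAS date variable.

```sas
/* Sample input; the year 2024 is an assumption, the question gives none */
data have;
  input date :date9. y $;
  format date date9.;
  datalines;
01MAY2024 la
01MAY2024 le
01MAY2024 hi
02MAY2024 la
02MAY2024 le
03MAY2024 la
03MAY2024 le
03MAY2024 hi
;

/* BY-group processing needs the data sorted by DATE and Y */
proc sort data=have;
  by date y;
run;
```

Sorting by both DATE and Y matters because the final MERGE uses `by date y`.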
You can assign the result of a logical evaluation to a variable. An evaluation will return 0 for false and 1 for true.
Example:
data want;
set have;
hi_flag = (y='hi'); * new variable hi_flag contains result of test (y='hi');
run;
Related
Given the following table have, I would like to delete the records that satisfy the conditions based on the to_delete table.
data have;
infile datalines delimiter="|";
input id :8. item :$8. datetime : datetime18.;
format datetime datetime18.;
datalines;
111|Basket|30SEP20:00:00:00
111|Basket|30SEP21:00:00:00
111|Basket|31DEC20:00:00:00
111|Backpack|31MAY22:00:00:00
222|Basket|31DEC20:00:00:00
222|Basket|30JUN20:00:00:00
;
+-----+----------+------------------+
| id | item | datetime |
+-----+----------+------------------+
| 111 | Basket | 30SEP20:00:00:00 |
| 111 | Basket | 30SEP21:00:00:00 |
| 111 | Basket | 31DEC20:00:00:00 |
| 111 | Backpack | 31MAY22:00:00:00 |
| 222 | Basket | 31DEC20:00:00:00 |
| 222 | Basket | 30JUN20:00:00:00 |
+-----+----------+------------------+
data to_delete;
infile datalines delimiter="|";
input id :8. item :$8. datetime : datetime18.;
format datetime datetime18.;
datalines;
111|Basket|30SEP20:00:00:00
111|Backpack|31MAY22:00:00:00
222|Basket|30JUN20:00:00:00
;
+-----+----------+------------------+
| id | item | datetime |
+-----+----------+------------------+
| 111 | Basket | 30SEP20:00:00:00 |
| 111 | Backpack | 31MAY22:00:00:00 |
| 222 | Basket | 30JUN20:00:00:00 |
+-----+----------+------------------+
In the past, I have used the catx() function to concatenate the conditions in a where statement, but I wonder if there is a better way of doing this:
proc sql;
delete from have
where catx('|',id,item,datetime) in
(select catx('|',id,item,datetime) from to_delete);
quit;
+-----+--------+------------------+
| id | item | datetime |
+-----+--------+------------------+
| 111 | Basket | 30SEP21:00:00:00 |
| 111 | Basket | 31DEC20:00:00:00 |
| 222 | Basket | 31DEC20:00:00:00 |
+-----+--------+------------------+
Please note that it should allow the have table to have more columns than the table to_delete.
You can use EXCEPT to compute the set difference of the two tables:
proc sql;
create table want as
select * from have except select * from to_delete
;
quit;
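Note that EXCEPT compares whole rows by column position, so it assumes both tables have the same columns, and it also drops duplicate rows from the result. If have carries more columns than to_delete, a correlated NOT EXISTS subquery on just the key columns is one possible alternative. A minimal sketch, assuming id, item and datetime together identify the rows to drop:

```sas
proc sql;
  /* keep every row of HAVE that has no matching key in TO_DELETE */
  create table want as
  select h.*
  from have as h
  where not exists (
    select *
    from to_delete as d
    where d.id = h.id
      and d.item = h.item
      and d.datetime = h.datetime
  );
quit;
```

This writes a new table want rather than deleting in place, so the original have stays available for checking.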
I also asked this on SAS Communities, but haven't gotten a correct response:
https://communities.sas.com/t5/SAS-Programming/Identifying-overlap-medication-use/m-p/628115#M185541
I have a problem similar to the one in:
https://communities.sas.com/t5/SAS-Programming/Concomitant-drug-medication-use/m-p/339879#M77587
However, in my case prescriptions of the same drug can overlap as well.
E.g.:
+----+------+-----------+-----------+-----------+
| ID | DRUG | START_DT | DAYS_SUPP | END_DT |
+----+------+-----------+-----------+-----------+
| 1 | A | 2/17/2010 | 30 | 3/19/2010 |
| 1 | A | 3/17/2010 | 30 | 4/16/2010 |
| 1 | A | 4/12/2010 | 30 | 5/12/2010 |
| 1 | A | 8/20/2010 | 30 | 9/19/2010 |
| 1 | B | 5/6/2009 | 30 | 6/5/2009 |
+----+------+-----------+-----------+-----------+
Here the three A prescriptions overlap.
So using the code in the link gives me combinations like A-A-B,
which I don't want.
Instead, I want to account for the overlapping days for drug A. So I want to shift the second prescription to run from 3/20/2010 to 4/19/2010, and similarly for the third A prescription.
The code I have tried:
data have2;
set have_sorted1;
format NEW_START_DT NEW_END_DT _lagEND_DT date9.;
_lagID = lag(patient_ID);
_lagDRUG = lag(drg_cls);
_lagEND_DT = lag(rx_ed_dt);
if patient_ID = _lagID and drg_cls= _lagDRUG and rx_st_dt <= _lagEND_DT then flag=1;
else flag = 0;
retain NEW_START_DT NEW_END_DT;
if flag=0 then do;
NEW_START_DT = rx_st_dt;
NEW_END_DT = rx_ed_dt;
end;
else do;
New_start_dt = NEW_End_DT + 1;
NEW_END_DT = new_start_dt + DAY_SUPP ;
end;
/* drop flag _:;*/
run;
But even then I get an incorrect result:
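+----+------+------------+----------+-----------+-----------+-----------+
| id | Drug | drug_start | day_supp | drug_end  | New_start | New_end   |
+----+------+------------+----------+-----------+-----------+-----------+
| 15 | A    | 6-Sep-15   | 30       | 5-Oct-15  | 6-Sep-15  | 5-Oct-15  |
| 15 | A    | 24-Sep-15  | 90       | 22-Dec-15 | 6-Oct-15  | 4-Jan-16  |
| 15 | A    | 6-Dec-15   | 90       | 4-Mar-16  | 5-Jan-16  | 4-Apr-16  |
| 15 | A    | 26-Feb-16  | 90       | 25-May-16 | 5-Apr-16  | 4-Jul-16  |
| 15 | A    | 29-May-16  | 90       | 26-Aug-16 | 29-May-16 | 26-Aug-16 |
| 15 | A    | 7-Dec-16   | 90       | 6-Mar-17  | 7-Dec-16  | 6-Mar-17  |
| 15 | A    | 17-Feb-17  | 90       | 17-May-17 | 7-Mar-17  | 5-Jun-17  |
+----+------+------------+----------+-----------+-----------+-----------+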
It might be easier to track the 'flag' state implicitly in a shift variable that holds how many days to shift each prescription forward.
Example:
The shift is always applied, but it is zero when no overlap occurs. The prior end date, after shifting, is tracked in a retained variable, so the code does not need to rely on LAG.
data have;
infile cards firstobs=3 dlm='|';
input ID DRUG: $ START_DT: mmddyy10. DAYS_SUPP END_DT: mmddyy10.;
format start_dt end_dt mmddyy10.;
datalines;
| ID | DRUG | START_DT | DAYS_SUPP | END_DT |
+----+------+-----------+-----------+-----------+
| 1 | A | 2/17/2010 | 30 | 3/19/2010 |
| 1 | A | 3/17/2010 | 30 | 4/16/2010 |
| 1 | A | 4/12/2010 | 30 | 5/12/2010 |
| 1 | A | 8/20/2010 | 30 | 9/19/2010 |
| 1 | B | 5/6/2009 | 30 | 6/5/2009 |
;
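data want;
set have;
by id drug; /* assumes HAVE is sorted by id and drug, then by start_dt */
retain shift prior_shifted_end;
/* shift = days to move this prescription forward (0 when there is no overlap) */
select;
when (first.drug) shift = 0;
/* starting on the prior end date also counts as overlap, as in the test used in the question */
when (prior_shifted_end >= start_dt) shift = prior_shifted_end - start_dt + 1;
otherwise shift = 0;
end;
/* keep the original dates for reference */
original_start_dt = start_dt;
original_end_dt = end_dt;
/* sum statements: add the shift to both dates */
start_dt + shift;
end_dt + shift;
prior_shifted_end = end_dt; /* remembered for the next row */
format prior: original: mmddyy10.;
run;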
I have a database with 3 columns: ID, Date and amount. It is ordered by ID and Date. All I want to do is add a row after the latest occurrence of every ID, with the same ID, Date = Date + 1 month and Amount = 0.
As an illustration, I want to go from this:
id | Date |amount |
A | 01JAN| 1 |
A | 01FEB| 1 |
B | 01FEB| 0 |
B | 01MAR| 1 |
to this:
id | Date |amount |
A | 01JAN| 1 |
A | 01FEB| 1 |
A | 01MAR| 0 | <- ADD THIS ROW
B | 01FEB| 0 |
B | 01MAR| 1 |
B | 01APR| 0 |<- ADD THIS ROW
I know I should use intnx, but beyond that I don't really know what to do. I appreciate any input.
Assuming that the DATE variable has actual date values in it, you just need to output twice on the last observation in each group:
data want;
set have;
by id;
output;
if last.id then do;
date=intnx('month',date,1,'b');
amount=0;
output;
end;
run;
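The fourth argument of INTNX controls where in the next interval the result lands. The answer uses 'b' (beginning), which suits dates that always fall on the first of the month. If the dates can fall mid-month, 's' (same day) may be closer to "Date + 1 month". A small sketch comparing the two on a hypothetical date (15JAN2024 is not from the question):

```sas
data _null_;
  d = '15JAN2024'd;                     /* hypothetical mid-month date */
  next_b = intnx('month', d, 1, 'b');   /* first day of the following month */
  next_s = intnx('month', d, 1, 's');   /* same day of the following month */
  put next_b= date9. next_s= date9.;    /* written to the SAS log */
run;
```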
I am trying to merge two tables. Table A has an id column, a date column, and an amount value for every date in a period.
Table B has both id and date, but also other columns with details. However, B only has an entry when the details change, so I do not know how to merge with normal joins. For every entry in A, I want the details populated from the latest row available in B for that ID on or before the date in A.
Table A
| ID | date | amount |
| 1 | 01JAN| 56 |
| 1 | 02JAN| 54 |
| 1 | 03JAN| 23 |
| 1 | 04JAN| 43 |
Table B
| ID | date | details|
| 1 | 01JAN| x |
| 1 | 03JAN| y |
Wanted Output
Table A
| ID | date | amount | details |
| 1 | 01JAN| 56 | x |
| 1 | 02JAN| 54 | x |
| 1 | 03JAN| 23 | y |
| 1 | 04JAN| 43 | y |
For the 02JAN entry, the latest available details as of that date is 'x'; for 03JAN it is 'y'.
Thank you in advance for any guidance you could provide.
This will work for the question you have asked literally:
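data want;
retain details_last;
merge table1 table2;
by ID date;
/* remember the latest non-missing details; otherwise carry it forward */
if not missing(details) then details_last = details;
else if first.ID then call missing(details_last); /* do not carry details over from the previous ID */
else details = details_last;
drop details_last;
run;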
But this only works if your data meets the conditions you have presented: the dates in table B should always fall within the date range of table A, not outside it (i.e. only interpolation, no extrapolation).
I was wondering if there is a way to separate a column into 2 or more columns. The values are separated by a semicolon.
Here is how my data currently looks:
+------------+
| Col1 |
+------------+
| 541.6;I345 |
+------------+
I would like something like the following:
+--------+------+
| Col1 | Col2 |
+--------+------+
| 541.6 | I345 |
+--------+------+
Try scan:
data want;
set have;
col2 = scan(col1,2,";");
col1 = scan(col1,1,";");
run;
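If the number of pieces per value is not fixed at two, the same SCAN idea can be extended with COUNTW and an array. This is only a sketch: the output names part1-part5, the limit of 5 pieces and the length of 20 are all assumptions, not something the question specifies.

```sas
data want;
  set have;
  /* part1-part5, character, length 20 each (assumed limits) */
  array part[5] $ 20;
  /* COUNTW gives the number of ;-separated pieces in this row */
  do i = 1 to min(countw(col1, ';'), dim(part));
    part[i] = scan(col1, i, ';');
  end;
  drop i;
run;
```

Pieces beyond the fifth are ignored here; widening the array is the obvious adjustment if more are expected.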
Let me know in case of any queries.