I cannot seem to find a way to use an if-then or just an if statement below the unique command in the SAS merge code below. I am matching all of the same S2_Liab numbers, just pulling a different number where there is a different title in the "Liability_Lmt" column.
/*--------------------------------------------------------------------------------------*/
/*---- Additional Partner and/or Corporation charge ----------*/
/*--------------------------------------------------------------------------------------*/
data full;
set whole_FR;
record_num=_N_;
S2_Liab = S2LiabLimit;
run;
proc sort;
by S2_Liab;
run;
data unique(keep=S2_Liab S2Partners P_charge FD_charge);
set WORK.FR_S2_Mandatory_Liab_Cov;
S2_Liab = Liability_Lmt;
if Additional = "Partner_Corp" then P_charge = round(&thestate.,0.01);
if Additional = "Farm_Dwelling" then FD_charge = round(&thestate.,0.01);
run;
proc sort nodupkey;
by S2_Liab;
run;
data match nonmatch;
merge full(in=a) unique(in=b);
by S2_Liab;
if a=1 and b=1 then output match;
if a=1 and b=0 then output nonmatch;
run;
data whole_FR;
set match nonmatch;
proc sort;
by record_num;
run;
The first if-then statement above gives me no results; the code seems to skip directly to the second if-then statement.
My results show all of the numbers for the second if statement ("FD_charge") but blanks for "P_charge". I have also tried an if-then-else statement, but I get the same answers.
Does anybody have a clue how to make this work?
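One possible explanation, offered as a guess from the code alone: each IF fills its charge only on rows with the matching Additional value, and PROC SORT NODUPKEY then keeps a single row per S2_Liab, so the other charge is dropped before the merge. A minimal sketch of one way around that follows. It assumes FR_S2_Mandatory_Liab_Cov has separate rows for "Partner_Corp" and "Farm_Dwelling" within each Liability_Lmt and that &thestate. resolves to a numeric column name. The table name unique_charges is hypothetical, and S2Partners is left out because it is not clear which row it should come from.

```sas
/* Sketch only: collapse to one row per liability limit carrying both charges. */
proc sql;
  create table unique_charges as
  select Liability_Lmt as S2_Liab,
         /* MAX skips the missing values produced by the non-matching rows */
         max(case when Additional = "Partner_Corp"
                  then round(&thestate., 0.01) else . end) as P_charge,
         max(case when Additional = "Farm_Dwelling"
                  then round(&thestate., 0.01) else . end) as FD_charge
  from WORK.FR_S2_Mandatory_Liab_Cov
  group by Liability_Lmt
  order by S2_Liab;
quit;
```

If the guess is right, merging full with unique_charges by S2_Liab would bring both P_charge and FD_charge onto each matched record.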
Related
PROC PRINT DATA = pg1.eu_occ obs=10 label;
RUN;
I tried to print the first 10 observations with their labels, but it doesn't work at all. Any idea how to solve this problem?
Thanks.
obs= is a data set option, and thus must be specified in parentheses after the data set name. A name=value coded directly on a PROC statement is known as a procedure option.
Your code should be
proc print data=pg1.eu_occ(obs=10) label;
run;
Now I have a bigger problem: I am getting "this range is repeated or overlapped". To be specific, the labels in my format repeat, something like a=aa b=aa c=as. How do I resolve this error? When I use hlo=M as the multilabel option, it gives double the data.
I am mapping like below.
Santhan=Santhan
Chintu=Santhan
Please suggest a solution.
To convert data to a FORMAT use the CNTLIN= option on PROC FORMAT. But first make sure the data describes a valid format. So read the data from the file.
data myfmt ;
infile 'myfile.txt' dsd truncover ;
/* CNTLIN= expects the decode in a variable named LABEL */
length fmtname $32 start $100 label $200 ;
fmtname = '$MYFMT';
input start label ;
run;
Make sure to set the lengths of START and LABEL to be long enough for any actual values your source file might have.
Then make sure it is sorted and you do not have duplicate codes (START values).
proc sort data=myfmt out=myfmt_clean nodupkey ;
by start;
run;
The SAS log will show if any observations were deleted because of duplicate START values.
If you do have duplicate values then examine the dataset or original text file to understand why and determine how you want to handle the duplicates. The PROC SORT step above will keep just one of the duplicates. You might just have exact duplicates, in which case keeping only one is fine. Or you might want to collapse the duplicate observations into a single observation and concatenate the multiple decodes into one long decode.
If you want you can add a record that will add the functionality of the OTHER keyword of the VALUE statement in PROC FORMAT. You can use that to set a default value, like 'Value not found', to decode any value you might encounter that was not in your original source file.
data myfmt_final;
set myfmt_clean end=eof;
output;
if eof then do;
start = ' ';
label = 'Value not found';
hlo = 'O' ;
output;
end;
run;
Then use PROC FORMAT to make the format from the cleaned up data file.
proc format cntlin = myfmt_final;
run;
To convert a FORMAT to a dataset use the CNTLOUT= option on PROC FORMAT.
For example if you had created this format previously.
proc format ;
value $myfmt 'ABC'='ABC' 'BCD'='BCD' 'BCD1'='BCD' 'BCD2'='BCD' ;
run;
then you can use another PROC FORMAT step to make a dataset. Use the SELECT statement if your format catalog has more than one format defined and you just want one (or some) of them.
proc format cntlout=myfmt ;
select $myfmt ;
run;
Then you can use that dataset to easily make a text file. For example a comma delimited file.
data _null_;
set myfmt ;
file 'myfmt.txt' dsd ;
put start label;
run;
The result would be a text file that looks like this:
ABC,ABC
BCD,BCD
BCD1,BCD
BCD2,BCD
You get this error because you have the same code that maps to two different categories. I'm going to guess you likely did not import your data correctly from your text file and ended up getting some values truncated but without the full process it's an educated guess.
This will work fine:
proc format;
value $ test
'a'='aa' 'b'='aa' 'c'='as'
;
run;
This version will not work, because a is mapped to two different values, so SAS will not know which one to use.
proc format;
value $ badtest
'a'='aa'
'a' = 'ba'
'b' = 'aa'
'c' = 'as';
run;
This generates the error regarding overlaps in your data.
The way to fix this is to find the duplicates and determine which code they should actually map to. PROC SORT can be used to get your duplicate records.
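As a rough sketch of that PROC SORT approach, assuming the mapping sits in a dataset named myfmt with START and LABEL columns (the output names myfmt_nodup and myfmt_dups are hypothetical), the DUPOUT= option writes the discarded rows to their own dataset so they can be inspected:

```sas
/* Sketch: keep the first row per START and send the extra rows to myfmt_dups. */
proc sort data=myfmt out=myfmt_nodup nodupkey dupout=myfmt_dups;
  by start;
run;

/* Each row listed here is a repeated code whose label needs a decision. */
proc print data=myfmt_dups;
  var start label;
run;
```

Note that myfmt_dups holds only the second and later occurrences of each code, so compare them against the row kept in myfmt_nodup before deciding which label is correct.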
I have data with the fields Income, Age and Cond, where Income and Age are numeric and Cond contains conditions (strings) like "If Income>=10000", "If Age<=35", etc.
I want to use the Cond field to filter the data.
I am using call symput to create a runtime macro variable inside a data step, but I am unable to use it as a filtering criterion.
data T2;
set T1;
CALL SYMPUT("Condition", Cond);
&Condition.; /*this is the line which is not working*/
run;
You are mixing scopes.
A running data step cannot change its own source code, so you can't have a data step set a macro variable's value and then expect that same data step to use the resolution of the macro variable as source code.
You can use a variety of techniques to evaluate an expression.
CALL EXECUTE
You can use CALL EXECUTE to %EVAL an expression in the macro environment while a DATA step is running. The result can be retrieved with SYMGET.
Example of idea
%let x = 0;
data _null_;
length expression $1000;
expression = '%let x = %eval(10 + 20)';
call execute (expression);
x = symget('x');
put x=;
run;
Using the idea
data want;
set have;
condition = tranwrd(condition, 'age', cats(age));
condition = tranwrd(condition, 'income', cats(income));
call execute (cats('%let result = %eval(', condition, ')'));
result = symget('result');
* subsetting if based on dynamic evaluation of conditional expression;
if result;
run;
Other
Other ways to execute dynamic code are the functions RESOLVE and DOSUBL.
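A minimal sketch of the RESOLVE route, under some assumptions not in the question: the dataset have and its rows are made up for illustration, and the condition text is a bare comparison with no leading "If". RESOLVE evaluates the macro expression while the DATA step runs and returns the result as text.

```sas
/* Hypothetical input: each row carries its own comparison as text. */
data have;
  infile datalines dsd truncover;
  length condition $50;
  input condition income age;
  datalines;
income>=10000,1000,10
income>=10000,10000,100
age<=35,50000,30
;
run;

data want;
  set have;
  length expr $200;
  /* Swap the variable names for this row's values (TRANWRD is case-sensitive). */
  expr = tranwrd(lowcase(condition), 'income', cats(income));
  expr = tranwrd(expr, 'age', cats(age));
  /* RESOLVE runs %SYSEVALF immediately; a true comparison comes back as 1. */
  result = input(resolve(cats('%sysevalf(', expr, ')')), 8.);
  /* Subsetting IF: keep only the rows whose own condition held. */
  if result;
run;
```

The single quotes around '%sysevalf(' matter: they stop the macro processor from touching the text when the step is compiled, so the expression is only evaluated row by row.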
Good day,
I assume you have only a single condition in the table (the same reason on every row)? Otherwise you need to select the proper reason by some other criterion. This takes the reason in the last row.
First, let's generate some dummy data.
data T1;
infile datalines delimiter=',';
input cond $21. Income age ;
cards;
"If Income>=10000" , 1000 , 10
"If Income>=10000" , 10000 , 100
"If Income>=10000" , 100000 , 1000
;run;
What we do here is create the global macro variable &condition, into which we put the last value of the Cond column. You can also use PROC SQL with an INTO clause to select the desired string into the variable.
data _NULL_;
set T1;
CALL SYMPUT("Condition", Cond);
run;
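The PROC SQL alternative mentioned above could look roughly like this. It is only a sketch: taking the first row of T1 via obs=1 is an arbitrary choice made here for illustration, and a WHERE clause could pick a different row.

```sas
/* Sketch: load one value of Cond into the macro variable Condition. */
proc sql noprint;
  select Cond into :Condition trimmed
  from T1(obs=1);
quit;

/* Write the stored text to the log to confirm what was captured. */
%put NOTE: Condition resolves to &Condition.;
```

Unlike CALL SYMPUT, the TRIMMED keyword strips leading and trailing blanks from the stored value.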
Here we begin with set T1 and apply the String &condition, which contains the rule. I needed to remove the quotes from the command in order to get SAS to perform the function. It is a bit unconventional to apply command this way but works.
data T2;
set T1;
%qsysfunc(dequote(&Condition.));
run;
EDIT: Based on further elaboration, the following has been tested to work. One feeds the conditions to a macro loop, which selects the condition. If there are duplicates, I suggest you split the conditions and the data into different sets.
%macro Create_subset(Row);
data _NULL_;
set T1(firstobs=&row obs=&row.);
CALL SYMPUT("Condition", Cond);
run;
data Subset_&row.;
set T1;
%qsysfunc(dequote(&Condition.));
applied_cond = &Condition.;
run;
%mend Create_subset;
data _NULL_;
set T1;
call execute('%nrstr(%Create_subset('||strip(_N_)|| '))');
run;
I am new to this and I already posted this question. But I think I did not explain it well.
I have a DATA inside SAS.
Some of the cells are empty (nothing in them), and in the SAS output window they show a dot in the cell.
When I run the results, at the end of the table it adds "MISSING FREQUENCY = 7" or whatever the number is.
How do I make SAS disregard the missing frequency and only use the rows that have a result?
Please see my screenshot, code and my CSV (the output data, and the result with the missing frequency at the bottom):
/* Generated Code (IMPORT) */
/* Source File:2012_16_ChathamPed.csv */
/* Source Path: /home/cwacta0/my_courses/Week2/ACCIDENTS */
PROC IMPORT
DATAFILE='/home/cwacta0/my_courses/Week2/ACCIDENTS/2012_16_ChathamPed.csv'
OUT=imported REPLACE;
GETNAMES=YES;
GUESSINGROWS=32767;
RUN;
proc contents data=work.imported;
run;
libname mydata"/courses/d1406ae5ba27fe300" access=readonly;
run;
/* sorting data by location*/
PROC SORT ;
by LocationOfimpact;
LABEL Route="STREET NAME" Fatalities="FATALITIES" Injuries="INJURIES"
SeriousInjuries="SERIOUS INJURIES" LocationOfimpact="LOCATION OF IMPACT"
MannerOfCollision="MANNER OF COLLISION"
U1Factors="PRIMARY CAUSES OF ACCIDENT"
U1TrafficControl="TRAFFIC CONTROL SIGNS AT THE LOCATION"
U2Factors="SECONDARY CAUSES OF ACCIDENT"
U2TrafficControl="OTHER TRAFFIC CONTROL SIGNS AT THE LOCATION"
Light="TYPE OF LIGHTHING AT THE TIME OF THE ACCIDENT"
DriverAge1="AGE OF THE DRIVER" DriverAge2="AGE OF THE CYCLIST";
/* Here I was unable to extract the drivers aged 25 or less and the drivers who disregarded a stop sign. Here is how I coded it:
IF DriverAge1 LE 25;
IF U1Factors="Failed to Yield" OR U1Factors= "Disregard Stop Sign";
Run;
Also, I want to remove the missing data under the results. But in the data, those are just blank cells. How do I tell SAS to disregard a blank cell and not add it to the result?
Here is what I did and it does not work...
if U1Factors="BLANK" Then U1Factors=".";
Please help me figure this out. Thanks.
IF U1Factors="." Then call missing(U1Factors)*/;
Data want;
set imported;
IF DriverAge1 LE 25 And U1Factors in ("Failed to Yield", "Wrong Side of Road",
"Inattentive");
IF Light in ("DarkLighted", "DarkNot Lighted", "Dawn");
run;
proc freq ;
tables /*Route Fatalities Injuries SeriousInjuries LocationOfimpact MannerOfCollision*/
U1Factors /*U1TrafficControl U2Factors U2TrafficControl*/
light DriverAge1 DriverAge2;
RUN;
SAS will display missing numeric variables using a period. So if there was nothing in column for DriverAge1 in the CSV file then that observation will have a missing value. If your variable is character then SAS will also normally convert values of just a period in the input stream into blanks in the SAS variable.
Missing numeric values are considered less than any real number. So if you want to use conditions like less than or equal to, then missing values will be included unless you exclude them with some other condition.
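A small illustration of that ordering, using made-up ages (the dataset names ages, le25_naive and le25_safe are hypothetical):

```sas
/* Hypothetical ages: one missing, one young driver, one older driver. */
data ages;
  input DriverAge1;
  datalines;
.
18
30
;
run;

data le25_naive le25_safe;
  set ages;
  /* A missing value sorts below every number, so it passes this test too. */
  if DriverAge1 le 25 then output le25_naive;
  /* Ruling out missing values first keeps only real ages of 25 or less. */
  if not missing(DriverAge1) and DriverAge1 le 25 then output le25_safe;
run;
```

Here le25_naive ends up with two observations (the missing age and 18), while le25_safe keeps only the 18.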
You can use a WHERE statement on procs to filter the data. If you want to append to the WHERE condition in a separate statement you can use the WHERE ALSO syntax to add the extra conditions.
If you want the missing category to appear in the PROC FREQ output add the MISSPRINT option to the TABLES statement. Or add the MISSING option and it will appear and also be counted in statistics.
proc freq ;
where . < DriverAge1 <= 25
and U1Factors in ("Failed to Yield", "Wrong Side of Road","Inattentive")
;
where also Light in ("DarkLighted", "DarkNot Lighted", "Dawn");
tables U1Factors light DriverAge1 DriverAge2 / missing;
run;
The WHERE conditions will apply to the whole dataset. So if you exclude missing DriverAge1 and missing U1Factors
proc freq ;
where not missing(U1Factors) and not missing(DriverAge1);
tables U1Factors DriverAge1 ;
run;
then only the observations that are not missing for both will be included. So you might want to generate the statistics separately for each variable.
proc freq ;
where not missing(U1Factors);
tables U1Factors ;
run;
proc freq ;
where not missing(DriverAge1);
tables DriverAge1 ;
run;
I am learning drop_conditions in SAS. I am requesting a sample on three variables, however only two observations are retrieved. Please advise! Thank you!
data temp;
set mydata.ames_housing_data;
format drop_condition $30.;
if (LotArea in (7000:9000)) then drop_condition = '01: Between 7000-9000 sqft';
else if (LotShape EQ 'IR3') then drop_condition = '02: Irregular Shape of Property';
else if (condition1 in 'Artery' OR 'Feedr') then drop_condition = '03: Proximity to arterial/feeder ST';
run;
proc freq data=temp;
tables drop_condition;
title 'Sample Waterfall';
run; quit;
Your conditions/comparisons aren't specified correctly. I think you're looking for the IN operator to do multiple comparisons in a single line. I'm surprised there aren't errors in your log.
Rather than the following:
if (LotArea = 7000:9000)
Try:
if (lotArea in (7000:9000))
and
if (condition1 EQ 'Artery' OR 'Feedr')
should be
if (condition1 in ('Artery', 'Feedr'))
EDIT:
You also need to specify a length for the drop_condition variable, instead of a format to ensure the variable is long enough to hold the text specified. It's also useful to verify your answer afterwards with a proc freq against the specified conditions, for example:
proc freq data=temp;
where drop_condition=:'01';
tables drop_condition*LotArea;
run;
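Putting the corrections together, the data step might look something like the sketch below. The $40 length is an assumption chosen to fit the longest label, which is 35 characters; the dataset and variable names are taken from the question.

```sas
data temp;
  set mydata.ames_housing_data;
  /* LENGTH must come before the first assignment so no label is truncated. */
  length drop_condition $40;
  if LotArea in (7000:9000) then
    drop_condition = '01: Between 7000-9000 sqft';
  else if LotShape = 'IR3' then
    drop_condition = '02: Irregular Shape of Property';
  /* IN with a parenthesised list replaces the invalid EQ ... OR ... form. */
  else if condition1 in ('Artery', 'Feedr') then
    drop_condition = '03: Proximity to arterial/feeder ST';
run;
```

Keep in mind that 7000:9000 is an integer range, so a non-integer LotArea between those bounds would not match.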