I have a task in SAS.
I need to assign the October Current_Value (60) to the column Fake_1 and repeat it only for October, November and December. The same logic should populate column Fake_2: assign the November Current_Value to both November and December.
This is how my table should look:
Product_Code Division Category Payment_Frequency Selling_Type expiring_month Current_Value Fake_1 Fake_2
C611720 17822 AZ Monthly NSD 1 63
C611720 17822 AZ Monthly NSD 2 72
C611720 17822 AZ Monthly NSD 3 23
C611720 17822 AZ Monthly NSD 4 24
C611720 17822 AZ Monthly NSD 5 90
C611720 17822 AZ Monthly NSD 6 87
C611720 17822 AZ Monthly NSD 7 56
C611720 17822 AZ Monthly NSD 8 43
C611720 17822 AZ Monthly NSD 9 57
C611720 17822 AZ Monthly NSD 10 60 60
C611720 17822 AZ Monthly NSD 11 48 60 48
C611720 17822 AZ Monthly NSD 12 32 60 48
How can this be done? I think the solution is the retain statement, but I am struggling: when I use it, SAS gives me 48 and 32 for the November and December values in column Fake_1.
data sox_8;
set sox_7;
retain Fake_1;
if Expiring_Month > 9 then Fake_1 = Current_Value;
run;
Thank you for the support
The condition > 9 is true for months 10, 11, and 12, so Fake_1 is overwritten on each of those rows. You want to update only on month 10, so:
data sox_8;
set sox_7;
retain Fake_1;
if Expiring_Month = 10 then Fake_1 = Current_Value;
run;
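To see why the original condition overwrote November and December, the RETAIN behaviour can be mimicked outside SAS. This is a minimal Python sketch, not SAS code; the function name fill_fake_1 and the list-of-tuples layout are assumptions made only for illustration.

```python
# Rows are (expiring_month, current_value), in the order of the question's table.
rows = [(1, 63), (2, 72), (3, 23), (4, 24), (5, 90), (6, 87),
        (7, 56), (8, 43), (9, 57), (10, 60), (11, 48), (12, 32)]

def fill_fake_1(rows, should_update):
    """Mimic a DATA step with RETAIN: the value persists from row to row
    until the condition assigns a new one."""
    retained = None  # stands in for a SAS missing value
    out = []
    for month, value in rows:
        if should_update(month):
            retained = value
        out.append((month, retained))
    return out

overwritten = fill_fake_1(rows, lambda m: m > 9)   # condition from the question
carried = fill_fake_1(rows, lambda m: m == 10)     # condition from the answer

print(overwritten[-3:])  # Oct-Dec rows: each month overwrites the retained value
print(carried[-3:])      # Oct-Dec rows: the October value is carried forward
```

With the condition from the question, each of the last three rows replaces the retained value; with the equality test, only October does.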
Now, you may want to extend your code like so:
data sox_8;
set sox_7;
array Fake[*] Fake_1-Fake_2;
retain Fake_1-Fake_2;
if Expiring_Month gt 9 and Expiring_Month lt 12 then
Fake[Expiring_Month-9] = Current_Value;
run;
This will update for both 10 and 11. Presumably you'll eventually have code that, for 2021 perhaps, needs to do this for more than two months. This gives you the beginning of a program that automatically picks up what it needs; you will probably also want to calculate the number of Fake variables from the month you run in, using a macro variable:
%let curmonth=10;
data sox_8;
set sox_7;
array Fake[*] Fake_1-Fake_%sysevalf(12-&curmonth);
retain Fake_:;
if Expiring_Month ge &curmonth and Expiring_Month lt 12 then
Fake[Expiring_Month - &curmonth + 1] = Current_Value;
run;
Of course, please don't use Fake as a variable name in your real code!
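The indexing in the macro version can be checked with a small simulation. This is a Python sketch under stated assumptions, not SAS: fill_fakes and cur_month are hypothetical names, and the list index is 0-based where the SAS array is 1-based.

```python
# Rows are (expiring_month, current_value); only the tail of the year is needed.
rows = [(9, 57), (10, 60), (11, 48), (12, 32)]

def fill_fakes(rows, cur_month):
    """Mimic the array + RETAIN step: one retained slot per month from
    cur_month up to November, sized as 12 - cur_month like the answer."""
    n_fakes = 12 - cur_month
    retained = [None] * n_fakes  # None stands in for a SAS missing value
    out = []
    for month, value in rows:
        if cur_month <= month < 12:
            # SAS uses Fake[month - cur_month + 1]; Python lists start at 0.
            retained[month - cur_month] = value
        out.append((month, *retained))
    return out

result = fill_fakes(rows, cur_month=10)
for row in result:
    print(row)
```

For cur_month = 10 this reproduces the Fake_1 and Fake_2 columns of the table in the question.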
I have two tables, say table 1 and table 2.
Table 1:
Region2 Apr May
North 50 1200
South 75 1500
East 100 750
West 150 220
Table 2:
Region2 Apr May
North 5 12
South 10 15
East 10 15
West 15 11
I need a table 3 that is the cell-by-cell division of table 1 by table 2.
Table 3:
Region2 Apr May
North 10 100
South 7.5 100
East 10 50
West 10 20
I solved it by creating a measure that is a ratio.
In my case, table 1 held values in, say, rupees, and table 2 held values in, say, litres.
I created a measure that divides the sum of rupees by the sum of litres. When I added that measure to the values of the pivot matrix, it divided the two tables cell by cell.
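The effect of such a ratio measure can be illustrated outside Power BI. This is a plain Python sketch, not DAX; the dictionary names rupees and litres follow the units mentioned above, and the nested-dictionary layout is an assumption for illustration.

```python
# Table 1 (rupees) and Table 2 (litres), keyed by region and then month.
rupees = {"North": {"Apr": 50, "May": 1200},
          "South": {"Apr": 75, "May": 1500},
          "East": {"Apr": 100, "May": 750},
          "West": {"Apr": 150, "May": 220}}
litres = {"North": {"Apr": 5, "May": 12},
          "South": {"Apr": 10, "May": 15},
          "East": {"Apr": 10, "May": 15},
          "West": {"Apr": 15, "May": 11}}

# A ratio measure is evaluated once per cell, so each (region, month) cell
# becomes that cell's rupees divided by that cell's litres.
ratio = {region: {month: rupees[region][month] / litres[region][month]
                  for month in months}
         for region, months in rupees.items()}

for region, months in ratio.items():
    print(region, months)
```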
I have the following tbl_Episodes table (50K records):
ID Month
22 01/01/2019
22 02/01/2019
22 03/01/2019
22 04/01/2019
22 05/01/2019
23 03/01/2020
23 06/01/2020
I need to create a calculated column in DAX that holds 1 on each row that is the beginning or the end of a quarter, and 0 otherwise:
ID Month NewColumn
22 01/01/2019 1
22 02/01/2019 0
22 03/01/2019 1
22 04/01/2019 0
22 05/01/2019 0
23 03/01/2020 1
23 06/01/2020 1
There are only four quarters, so the simplest way is a SWITCH over the quarter boundary dates.
Add this column to your calendar table (assuming the calendar table has a "Year" column; note that DAX's DATE takes its arguments in the order year, month, day):
SWITCH([MONTH],
DATE([Year],1,1),1, DATE([Year],3,31),1,
DATE([Year],4,1),1, DATE([Year],6,30),1,
DATE([Year],7,1),1, DATE([Year],9,30),1,
DATE([Year],10,1),1, DATE([Year],12,31),1,
0)
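The rule the SWITCH encodes (1 when a date is the first or last day of a calendar quarter, otherwise 0) can be written without listing the eight dates. This is a Python sketch rather than DAX, and is_quarter_boundary is a hypothetical helper name. Like the SWITCH, it matches exact dates, so a first-of-month date such as 1 March is not flagged.

```python
from datetime import date, timedelta

QUARTER_START_MONTHS = (1, 4, 7, 10)

def is_quarter_boundary(d):
    """Return 1 if d is the first or the last day of a calendar quarter, else 0."""
    first_day = d.day == 1 and d.month in QUARTER_START_MONTHS
    # The last day of a quarter is the day before the next quarter starts.
    next_day = d + timedelta(days=1)
    last_day = next_day.day == 1 and next_day.month in QUARTER_START_MONTHS
    return int(first_day or last_day)

for d in (date(2019, 1, 1), date(2019, 2, 1), date(2019, 3, 31), date(2019, 12, 31)):
    print(d.isoformat(), is_quarter_boundary(d))
```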
I need to select a median value for each id, in each age range. So in the following table, for id = 1, in age_range of 6 months, I need to select value for row 2. Basically, I need to create a column per id where only median for each range is selected.
id wt age_range
1 22 6
1 23 6
1 24 6
2 25 12
2 24 12
2 44 18
If I understand correctly, you want a new column that holds, for each id and age_range, the median value for comparison. You could do this in base SAS by using proc means to output the medians and then merging them back onto the original dataset. However, proc sql does this in one step and lets you name the new column directly.
proc sql;
create table want as
select id, wt, age_range, median(wt) as median_wt
from have
group by id, age_range;
quit;
id wt age_range median_wt
1 24 6 23
1 22 6 23
1 23 6 23
2 24 12 24.5
2 25 12 24.5
2 44 18 44
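For readers who want to verify the medians by hand, the same group-then-attach logic looks like this in Python. It is only a sketch: the names have and want mirror the SAS table names, and the tuple layout is an assumption.

```python
from collections import defaultdict
from statistics import median

# Rows are (id, wt, age_range), as in the question.
have = [(1, 22, 6), (1, 23, 6), (1, 24, 6),
        (2, 25, 12), (2, 24, 12), (2, 44, 18)]

# Collect the weights of every (id, age_range) group.
groups = defaultdict(list)
for id_, wt, age_range in have:
    groups[(id_, age_range)].append(wt)

# One median per group, then attached to every row of that group.
medians = {key: median(values) for key, values in groups.items()}
want = [(id_, wt, age_range, medians[(id_, age_range)])
        for id_, wt, age_range in have]

for row in want:
    print(row)
```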
data1 is data from 1990 and it looks like
Panelkey Region income
1 9 30
2 1 20
4 2 40
data2 is data from 2000 and it looks like
Panelkey Region income
3 2 40
2 1 30
1 1 20
I want to add a column of where each person lived in 1990.
Panelkey Region income Region1990
3 2 40 .
2 1 30 1
1 1 20 9
How can I do this in Stata?
The following code handles panels that live in multiple regions in the same year by keeping the region with the larger income. This would make sense if income were proportional to the fraction of the year spent in a region. Ties in income are broken by taking the highest Region value. Other types of aggregation might make sense (take a look at the -collapse- command).
Note that I tweaked your data by adding a second row for the last observation in each year:
clear
input Panelkey Region income
1 9 30
2 1 20
4 2 40
4 10 80
end
rename (Region income) =1990
bysort Panelkey (income Region): keep if _n==_N
isid Panelkey
save "data1990.dta", replace
clear
input Panelkey Region income
3 2 40
2 1 30
1 1 20
1 9 20
end
bysort Panelkey (income Region): keep if _n==_N
isid Panelkey
merge 1:1 Panelkey using "data1990.dta", keep(match master) nogen
list, clean noobs
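The two steps above (reduce each year to one row per Panelkey, then attach the 1990 region to the 2000 rows) can be traced by hand with a short Python sketch. It is not Stata; keep_top_row is a hypothetical helper, and unlike the merge above it carries over only Region1990, not income1990.

```python
def keep_top_row(rows):
    """One row per Panelkey: highest income, ties broken by highest Region."""
    best = {}  # Panelkey -> (Region, income)
    for key, region, income in rows:
        if key not in best or (income, region) > (best[key][1], best[key][0]):
            best[key] = (region, income)
    return best

# Rows are (Panelkey, Region, income), including the extra rows added above.
data1990 = [(1, 9, 30), (2, 1, 20), (4, 2, 40), (4, 10, 80)]
data2000 = [(3, 2, 40), (2, 1, 30), (1, 1, 20), (1, 9, 20)]

old = keep_top_row(data1990)
new = keep_top_row(data2000)

# Keep every 2000 row and look up its 1990 region; None marks "no match",
# which Stata would show as a missing value.
merged = {key: (region, income, old.get(key, (None, None))[0])
          for key, (region, income) in new.items()}

for key in sorted(merged):
    print(key, merged[key])
```

Panelkey 4 appears only in 1990, so it is dropped, just as keep(match master) drops unmatched rows from the using file.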
I am trying to make Stata select the minimum value of ice_cream eaten by every person (Amanda, Christian, Paola) so that I end up with just 3 rows:
person ice_cream
Amanda 16
Amanda 27
Amanda 29
Amanda 40
Amanda 96
Amanda 97
Christian 19
Christian 23
Christian 26
Christian 27
Christian 28
Christian 34
Christian 62
Christian 70
Christian 78
Paola 5
Paola 11
Paola 28
Paola 97
A one-line solution:
collapse (min) ice_cream, by(person)
An answer that avoids creating a new variable:
sort person ice_cream
by person: keep if _n == 1
This should work:
* Generate a variable with the group minimums
sort person
by person: egen Min = min(ice_cream)
* Only keep observations with same value as group minimums
keep if Min == ice_cream
* Delete minimum variable
drop Min
Note: this keeps only the observations with the minimum value of ice_cream. If several observations in a group share the minimum, you will get several observations for that group (this does not happen in the data above, but is likely if, for instance, ice_cream were a categorical variable). If you want a single observation per group, you could then add:
duplicates drop person, force
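The group-minimum logic shared by these answers can be checked against the data with a few lines of Python. This is a sketch for verification only, not Stata; the names eaten and minimums are illustrative.

```python
# Rows are (person, ice_cream), copied from the question.
eaten = [("Amanda", 16), ("Amanda", 27), ("Amanda", 29), ("Amanda", 40),
         ("Amanda", 96), ("Amanda", 97),
         ("Christian", 19), ("Christian", 23), ("Christian", 26),
         ("Christian", 27), ("Christian", 28), ("Christian", 34),
         ("Christian", 62), ("Christian", 70), ("Christian", 78),
         ("Paola", 5), ("Paola", 11), ("Paola", 28), ("Paola", 97)]

# Keep the smallest value seen so far for each person: one entry per group,
# so ties cannot produce duplicate rows.
minimums = {}
for person, ice_cream in eaten:
    if person not in minimums or ice_cream < minimums[person]:
        minimums[person] = ice_cream

for person, ice_cream in minimums.items():
    print(person, ice_cream)
```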
If you simply want to display the minimum value of ice_cream eaten by Amanda, Christian and Paola without altering your dataset, you can use the summarize command instead:
clear
input str20 person ice_cream
Amanda 16
Amanda 27
Amanda 29
Amanda 40
Amanda 96
Amanda 97
Christian 19
Christian 23
Christian 26
Christian 27
Christian 28
Christian 34
Christian 62
Christian 70
Christian 78
Paola 5
Paola 11
Paola 28
Paola 97
end
bysort person: summarize ice_cream
---------------------------------------------------------------------------
-> person = Amanda
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
ice_cream | 6 50.83333 36.18517 16 97
---------------------------------------------------------------------------
-> person = Christian
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
ice_cream | 9 40.77778 22.63171 19 78
---------------------------------------------------------------------------
-> person = Paola
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
ice_cream | 4 35.25 42.30347 5 97