I plan to run the following cross-sectional regression for 10 years and plot the coefficient estimate for variable x in one graph.
Thanks to this post, I wrote the following and it works:
forvalues i=1/10 {
    reg y x if year==`i'
    estimates store year`i'
    local allyears `allyears' year`i' ||
    local labels `labels' `i'
}
coefplot `allyears', keep(x) vertical bycoefs bylabels(`labels')
I want to add the following to the same graph but don't know how:
A horizontal line segment x=5 for year 1 to year 5, and another horizontal line segment x=4 for year 6 to year 10.
A shaded area ranging from x=4 to x=6 for year 1 to year 5, and another shaded area ranging from x=2 to 4 for year 6 to year 10.
(Note that my horizontal axis is year, and my vertical axis is coefficient for x.)
Any help is greatly appreciated!
Here's an example based on the nlswork toy dataset:
clear
use http://www.stata-press.com/data/r12/nlswork.dta
forvalues i = 70/73 {
    regress ln_w grade if year==`i'
    estimates store year`i'
    local allyears `allyears' year`i' ||
    local labels `labels' `i'
}
coefplot `allyears', keep(grade) vertical bycoefs bylabels(`labels') ///
addplot(scatteri 0.08 1 0.08 3, recast(connected) || ///
scatteri 0.09 1 0.09 3, recast(connected) || ///
scatteri 0.065 2 0.065 3 0.075 3 0.075 2, recast(area) lwidth(none))
Hello, I am new to Power BI and still find it hard to work with.
I have a matrix like this:
DATE      Sales   Refund
26 Agu    45      5
  p1      10      3
  p2      15      2
  p3      20      0
27 Agu    60      1
  p1      15      1
  p2      20      0
  p3      25      0
The date rows show subtotals, as they normally do. However, I want them to show that day's average instead, and then use the average for conditional formatting. If a Sales cell is below the day's average, I want to mark it with a red point; for Refund, I want to mark the values that are above the average.
Is there a way to do that? I searched for a while but could not find one.
The output I want looks like this (a star stands for a red point):
DATE      Sales   Refund
26 Agu    15      1.66
  p1      10*     3*
  p2      15      2*
  p3      20      0
27 Agu    20      0.33
  p1      15*     1*
  p2      20      0
  p3      25      0
Thanks.
You can colour the background. For example, create this measure:
AVG =
IF( SELECTEDVALUE(RefundTab[Sale] ) < CALCULATE(AVERAGE(RefundTab[Sale]), ALL(RefundTab[Code])),0,1)
Then, from the menu, go to Conditional formatting -> Background color and base the rule on this measure.
OR
you can create a measure that returns a string instead of a number, appending a Unicode character to flag the value:
SumSaleIf =
var _sale = sum(RefundTab[Sale])
var _IfAVG = CALCULATE(AVERAGE(RefundTab[Sale]), ALL(RefundTab[Code]))
var _check = if(_sale < _IfAVG, _sale & UNICHAR(128315), _sale &"")
return _check
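The comparison both measures rely on can be checked outside Power BI. The following is only a sketch in Python, not DAX, using the rows from the question; the names `rows`, `by_date` and `flags` are made up for illustration. It computes each day's average and flags sales below it and refunds above it, which is the pattern the asker describes.

```python
from collections import defaultdict
from statistics import mean

# Rows from the question: (date, product, sales, refund)
rows = [
    ("26 Agu", "p1", 10, 3),
    ("26 Agu", "p2", 15, 2),
    ("26 Agu", "p3", 20, 0),
    ("27 Agu", "p1", 15, 1),
    ("27 Agu", "p2", 20, 0),
    ("27 Agu", "p3", 25, 0),
]

# Group the product rows under their date
by_date = defaultdict(list)
for date, product, sales, refund in rows:
    by_date[date].append((product, sales, refund))

# flags[(date, product)] = (sales below day average, refund above day average)
flags = {}
for date, items in by_date.items():
    avg_sales = mean(s for _, s, _ in items)
    avg_refund = mean(r for _, _, r in items)
    for product, sales, refund in items:
        flags[(date, product)] = (sales < avg_sales, refund > avg_refund)

for key in sorted(flags):
    print(key, flags[key])
```

The flagged cells match the starred cells in the desired output: a value equal to the average (p2 sales on both days) is not flagged, because the comparison is strict.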
I have the following lottery numbers in a macro:
global lottery 6 9 4 32 98
How can I simulate a variable with 50 observations, where each observation is randomly obtained from the numbers stored in the macro?
The code below produces an error:
set obs 50
global lottery 6 9 4 32 98
g lot=$lottery
invalid '9'
r(198);
Here are two similar methods:
clear
set obs 50
set seed 2803
local lottery 6 9 4 32 98
* method 1
generate x1 = ceil(5 * runiform())
tabulate x1
generate y1 = .
forvalues j = 1/5 {
    replace y1 = real(word("`lottery'", `j')) if x1 == `j'
}
* method 2
set seed 2803
generate x2 = runiform()
generate y2 = cond(x2 <= 0.2, 6, cond(x2 <= 0.4, 9, cond(x2 <= 0.6, 4, cond(x2 <= 0.8, 32, 98))))
tab1 y?
I am assuming that you want equal probabilities, which is not explicit in the question. Setting the seed is crucial for reproducibility. As in @Pearly Spencer's answer, using a local macro is widely considered (much) better style.
To spell it out: in this answer the probabilities are equal, but the frequencies will fluctuate from sample to sample. In @Pearly's answer the frequencies are guaranteed equal, given that 50 is a multiple of 5, and randomness is manifest only in the order in which they arrive. This answer is like tossing a die with five faces 50 times; @Pearly's answer is like drawing from a deck of 50 cards with 5 distinct types.
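The die-versus-deck distinction can be illustrated outside Stata. The following Python sketch is not taken from either answer; the function names `draw_with_replacement` and `draw_balanced_deck` are hypothetical, and the seed is reused from above only for familiarity (Python's generator will not reproduce Stata's draws).

```python
import random
from collections import Counter

LOTTERY = [6, 9, 4, 32, 98]  # the values stored in the macro


def draw_with_replacement(values, n, seed):
    """Die analogy: each observation is an independent, equally likely draw."""
    rng = random.Random(seed)
    return [rng.choice(values) for _ in range(n)]


def draw_balanced_deck(values, n, seed):
    """Deck analogy: equal counts of each value, only the order is random."""
    if n % len(values) != 0:
        raise ValueError("n must be a multiple of the number of values")
    rng = random.Random(seed)
    deck = values * (n // len(values))  # new list; the input is not modified
    rng.shuffle(deck)
    return deck


independent = draw_with_replacement(LOTTERY, 50, seed=2803)
balanced = draw_balanced_deck(LOTTERY, 50, seed=2803)

print(Counter(independent))  # frequencies fluctuate around 10
print(Counter(balanced))     # every value appears exactly 10 times
```

Which scheme is appropriate depends on whether the frequencies themselves are meant to be random, which the question leaves open.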
proc sgplot data= group;
scatter x=test1 y=test2 / markerattrs=(symbol=circlefilled color=black);
xaxis label='Test 1 Scores'
min =30
max =95
;
yaxis label='Test 2 Scores'
min=35
max=95
;
run;
I've been trying to adjust the values shown on the axes of my scatterplot so that they are divisible by 5. However, although I set the min and max values to 35 and 95, the axis shows 40 and 100. I also tried setting the min value to 30, but the axis still started at 40. What exactly am I doing wrong here?
I have a variable age, 13 variables x1 to x13, and 802 observations in a Stata dataset. age has values ranging 1 to 9. x1 to x13 have values ranging 1 to 13.
I want to know how to count the number of 1 .. 13 in x1 to x13 according to different values of age. For example, for age 1, in x1 to x13, count the number of 1,2,3,4,...13.
I first converted x1 to x13 into a matrix using
mkmat x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13, matrix (a)
Then, I want to count using the following loop:
gen count = 0
quietly forval i = 1/802 {
quietly forval j = 1/13 {
replace count = count + inrange(a[r'i', x'j'], 0, 1), if age==1
}
}
I failed.
I am still somewhat uncertain as to what you would like to achieve. But if I understand you correctly, here is one way to do it.
First, a simple dataset in which age ranges from one to three and the four variables x1-x4 each take integer values between 5 and 7.
clear
input age x1 x2 x3 x4
1 5 6 6 6
1 7 5 6 5
2 5 7 6 6
3 5 6 7 7
3 7 6 6 6
end
Then we create three count variables (n5, n6 and n7) that count the number of 5s, 6s, and 7s for each subject across x1-x4.
forval i=5/7 {
egen n`i'=anycount(x1 x2 x3 x4),v(`i')
}
Below is what the data look like now. To explain, the first "1" under n5 indicates that the first subject has only one "5" across x1-x4.
     +----------------------------------------+
     | age   x1   x2   x3   x4   n5   n6   n7 |
     |----------------------------------------|
  1. |   1    5    6    6    6    1    3    0 |
  2. |   1    7    5    6    5    2    1    1 |
  3. |   2    5    7    6    6    1    2    1 |
  4. |   3    5    6    7    7    1    1    2 |
  5. |   3    7    6    6    6    0    3    1 |
     +----------------------------------------+
It sounds to me like your ultimate goal is to have sums calculated separately for each value in age. Assuming this is true, let's create a 3x3 matrix to store such results.
mat A=J(3,3,.) // age (1-3) and values (5-7)
mat rown A=age1 age2 age3
mat coln A=value5 value6 value7
forval i=5/7 {
    forval j=1/3 {
        qui su n`i' if age==`j'
        local k=`i'-4 // the first column for value5
        mat A[`j',`k']=r(sum)
    }
}
The matrix looks like this. To explain, the first "3" under value5 indicates that, across all children of age 1, the value 5 appears a total of three times in x1-x4.
A[3,3]
      value5  value6  value7
age1       3       4       1
age2       1       2       1
age3       1       4       3
With Aspen's example, you could do this:
gen id = _n
reshape long x, i(id)
tab age x
Note that your sample code doesn't loop over different ages and there is an incorrect comma in the replace command. I won't try to fix the code, as there are many more direct methods, one of which is above. tabulate has an option to save the table as a matrix.
Here is another solution closer to the original idea. Warning: code not tested.
matrix count = J(9, 13, 0)
forval i = 1/9 {
    forval j = 1/13 {
        forval J = 1/13 {
            qui count if age == `i' & x`J' == `j'
            matrix count[`i', `j'] = count[`i', `j'] + r(N)
        }
    }
}
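As a cross-check on the logic the answers share (pool x1 to x4 within each age, then count each value), here is a minimal sketch in Python rather than Stata, run on Aspen's toy data. The names `rows` and `counts` are made up for illustration.

```python
from collections import Counter, defaultdict

# Toy data from Aspen's answer: (age, x1, x2, x3, x4)
rows = [
    (1, 5, 6, 6, 6),
    (1, 7, 5, 6, 5),
    (2, 5, 7, 6, 6),
    (3, 5, 6, 7, 7),
    (3, 7, 6, 6, 6),
]

# One Counter per age, pooling all x variables of all rows with that age
counts = defaultdict(Counter)
for age, *xs in rows:
    counts[age].update(xs)

# Print one line per age with the counts of the values 5, 6 and 7
for age in sorted(counts):
    print(age, [counts[age][v] for v in (5, 6, 7)])
```

The per-age counts agree with the matrix A shown earlier, which suggests that the reshape-and-tabulate route and the looped count should give the same table on the full data.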
I'm trying to calculate the frequency of fractions in my data set (excluding whole numbers).
For example, my variable P takes values 24+1/2, 97+3/8, 12+1/4, 57+1/2, etc. and I'm looking to find the frequency of 1/2, 3/8, and so on. Can anyone help?!
Thanks in advance!
Clyde013
Clyde013, here is one way, assuming that p is of character type. hth. cheers, chang
> Pulled from SAS-L
/* test data -- if p is a character var */
data one;
input p $ @@;
cards;
24+1/2
97+3/8
12+1/4
57+1/2
36 3/8
;
run;
/* frequencies of fractions? */
data two;
set one;
whole = scan(p, 1, "+");
frac = scan(p, 2, "+");
run;
proc freq data=two;
tables frac;
run;
/* on lst
                                 Cumulative    Cumulative
frac    Frequency     Percent     Frequency      Percent
---------------------------------------------------------
1/2            2       50.00             2        50.00
1/4            1       25.00             3        75.00
3/8            1       25.00             4       100.00

                  Frequency Missing = 2
*/
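The same split-on-"+" idea can be sketched in Python, as an illustration rather than a SAS replacement. The helper name `fractional_part` is hypothetical, and the input list mirrors the six tokens the DATA step above reads ("36" and "3/8" arrive as separate values because they are separated by a blank).

```python
from collections import Counter

# The six tokens read by the DATA step above
values = ["24+1/2", "97+3/8", "12+1/4", "57+1/2", "36", "3/8"]


def fractional_part(p):
    """Return the text after '+', or None when there is no '+'."""
    whole, sep, frac = p.partition("+")
    return frac if sep else None


freq = Counter(fractional_part(p) for p in values)
missing = freq.pop(None, 0)  # tokens without a '+' count as missing

for frac, n in sorted(freq.items()):
    print(frac, n)
print("missing:", missing)
```

As in the PROC FREQ listing, values written without a "+" end up as missing, so data stored as "36 3/8" would need a different delimiter rule.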