Hi I'm trying to create a pie chart that has a lot of slices. For some reason I get an error when running this code.
My code
graph pie ccounter if year==1900 & ccounter>100 & labforce==2, over(occ1950)
and I get this error
(note: areastyle p193pie not found in scheme, default attributes used)
(note: areastyle p194pie not found in scheme, default attributes used)
(note: areastyle p195pie not found in scheme, default attributes used)
(note: areastyle p196pie not found in scheme, default attributes used)
option min() incorrectly specified
Note that the variable occ1950 has more than 100 distinct values. I don't know whether this is what is causing the problem.
Extra Information
I use this code to create the variable ccounter
bys mcdstr year occ: gen counter=_n
bys mcdstr year occ: egen ccounter=max(counter)
I used this to calculate the number of people working in each industry by year and location.
The problem is that the variable occ1950 has too many unique values. Let us examine the problem using a CSV dataset of only 40 countries.
country,fdi
Afghanistan,141.391
Algeria,541.478
Angola,238.637
Antigua and Barbuda,1.653
Argentina,205.691
Bahamas,21.927
Bahrain,1.317
Bangladesh,50.298
Barbados,2.816
"Bolivia, Plurinational State of",41.572
Botswana,87.649
Brazil,455.5649999999999
British Virgin Islands,12387.568
Brunei Darussalam,21.02
Cambodia,672.6800000000001
Cameroon,30.159
Cape Verde,3.783
Cayman Islands,15323.116
Chile,53.149
Colombia,49.047
Congo,112.104
"Congo, Democratic Rep. of",302.505
Costa Rica,.826
Côte d' Ivoire,27.099
Dominican Republic,.112
Ecuador,93.673
Egypt,191.59
Equatorial Guinea,80.21700000000001
Eritrea,16.269
Ethiopia,205.824
Fiji,38.742
Gabon,76.66500000000001
Ghana,129.068
Guinea,97.413
Guyana,79.899
Honduras,2.6
"Hong Kong, China",124987.422
India,291.567
Indonesia,850.0709999999999
"Iran, Islamic Republic of",480.71
After loading this into Stata 14, we can observe that
graph pie fdi, over(country)
produces the error: option min() incorrectly specified.
If we now reduce the dataset to just 30 countries with drop if _n > 30, we do get a pie chart.
This suggests that you should collapse your data, take the n categories with the largest ccounter and then classify the other classes as "other".
The magic number is 36. So you can have at most 36 unique categories in your pie chart.
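A minimal sketch of that lumping step, using the auto dataset shipped with Stata rather than the data from the question. The variable name slice and the cutoff of 10 categories are choices made for this illustration only; any cutoff below the limit stated above should do.

```stata
sysuse auto, clear

* One row per category, holding the total to be plotted
* (a no-op here, since each make occurs once, but needed with repeated categories)
collapse (sum) price, by(make)

* Largest categories first, then keep the top 10 and lump the rest
gsort -price
gen str18 slice = cond(_n <= 10, make, "Other")

* graph pie sums price within each value of slice
graph pie price, over(slice)
```

With the data from the question, ccounter and occ1950 would take the place of price and make, after restricting to the year and labour-force status of interest.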
Related
I use Oracle APEX (v22.1) and have created a line chart on a page, but I have the following problem with how the chart is displayed:
On the y-axis it is not possible to show the values in the format 'hh:mi', and I need help with this.
Details for the axis:
x-axis: A date column represented as a string: to_char(time2, 'YYYY-MM')
y-axis: Two date columns and the average of the difference will be calculated: AVG(time2 - time1); the date time2 is the same as the date in the x-axis.
So I have the following SQL query for the visualization of the series:
SELECT DISTINCT to_char(time2, 'YYYY-MM') AS YEAR_MONTH, -- x-axis
AVG(time2 - time1) AS AVERAGE_VALUE -- y-axis
FROM users
GROUP BY to_char(time2, 'YYYY-MM')
ORDER BY to_char(time2, 'YYYY-MM')
I would prefer to avoid JavaScript if possible, because I am not familiar with it. I am new to APEX, but I have seen in various tutorials that you can use JS. If JS is the only solution, I would appreciate a short description of what I must do on the page.
(I don't know if this point is important for this case: The values time1 and time2 are updated daily.)
On the attributes of the chart I enabled the 'Time Axis Type' under Settings
On the y-axis I changed the format to "Time - Short" and tried different patterns such as ##:##, but then every value, including the labels on the y-axis, is shown as '01:00', although the line itself is drawn correctly. When I change the format to Decimal, the values are shown correctly and match the line.
I also tried the EXTRACT function, as in 'EXTRACT(HOUR FROM AVG(time2 - time1))|| ':' || EXTRACT(MINUTE FROM AVG(time2 - time1))', but in this case I get an error message.
So where is my mistake or is it more difficult to solve this?
ROUND(TRUNC(avg(time2 - time1)/60) + mod(avg(time2 - time1),60)/100, 2) AS Y
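will get close to what you want, provided the average is expressed in minutes. (Subtracting two DATE columns in Oracle gives a number of days, so multiply the difference by 24*60 first.) You can then set the y-axis minimum to 0 and the maximum to 24,
so that 12.23 means 12 hours and 23 minutes.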
After running these commands all day, my head is on fire, so I am now reaching out.
Please don't direct me to the papers about suest that are commonly mentioned on the web; I have already checked them.
The problem seems to be with storing the dy/dx values of the AMEs from different models so that suest can combine them and the test command can be run.
What I would like to test is whether the AME of the lower class/upper class/middle class in one regime/context is significantly different from that of the lower class in another regime/context.
Dependent variable: 3 categories: renter, mortgaged homeownership, outright homeownership.
*Liberal_market
mlogit owner_housing_debt2 United_States United_Kingdom Swizerland c.age_centered ///
ib0.lower_class ib0.upper_class if homeownership_regimes==1, ///
baseoutcome(1)
margins , dydx(lower_class upper_class) coeflegend post
est store Liberal_market
*Family_financial_support
mlogit owner_housing_debt2 Belgium Finland France Ireland Luxembourg Norway Spain ///
ib0.lower_class ib0.upper_class if homeownership_regimes==2, baseoutcome(1)
margins , dydx(lower_class upper_class) coeflegend post
est store Family_financial_support
est table Liberal_market Family_financial_support
suest Liberal_market Family_financial_support
**In the end, this is what I want to do:
test [Liberal_market]1.lower_class =[Family_financial_support]1.lower_class
*error message
Liberal_market was estimated with a nonstandard vce (delta)
r(322);
Unfortunately, the following answer from Statalist regarding the nonstandard vce in suest didn't help me either:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1511169-can-not-use-suest-for-margins-after-probit-or-regress
Will appreciate your solution:)
Thank you. I tried your recommendation. Unfortunately, I could not find an organized document with examples for xlincom with margins.
I tried the following code, but my problem is extracting the margins of the independent variable categories (1.lower_class and 1.upper_class) from the two separate mlogit regressions after suest. That is, how do I specify in margins and in xlincom that I want 1.lower_class from model A and 1.lower_class from model B? Please see my example below:
mlogit owner_housing_debt2 ib0.lower_class ib0.upper_class if regime==1, ///
baseoutcome(1)
est store A
mlogit owner_housing_debt2 ib0.lower_class ib0.upper_class if regime==2, ///
baseoutcome(1)
est store B
suest A B
margins 1.lower_class 1.upper_class, coeflegend post
lincom _b[1.lower_class] - _b[1.upper_class]
I can't quite grasp how to use a variable's value labels as titles for a graph.
For example, in sysuse auto, the variable foreign takes the value of 0 or 1 where 0 is labeled "Domestic" and 1 is labeled "Foreign".
In the following snippet, I want to plot the average price for each category of the variable foreign using a loop:
sysuse auto, clear
forvalues i=0/1{
local t = foreign[`i']
graph bar (mean) price if foreign == `i', ///
over(rep78, sort(price) descending) asyvars ///
title("`t'") name(p_`i', replace) nodraw
local graphs `graphs' p_`i'
}
gr combine `graphs'
but it does not even display the category value correctly in the title.
What am I doing wrong?
Your code
local t = foreign[`i']
sets the local macro t to the value of the variable foreign first in observation 0 and then in observation 1: these will be missing and 0, respectively.
What you want is the value label corresponding to the values 0 and 1, which you can obtain with
local t : label (foreign) `i'
Swap this into your code and your graphs will be labelled Domestic and Foreign, respectively.
The syntax of the replacement command may be unfamiliar; macro "extended functions" are described in help extended_fcn.
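For reference, here is the loop from the question with only that one line swapped in. This is a sketch of the fix as described, not a recommended way to draw the graph; the explicit clearing of the local macro graphs is an addition, made so that the loop can be rerun in the same session.

```stata
sysuse auto, clear
local graphs                            // start with an empty list of graph names
forvalues i = 0/1 {
    local t : label (foreign) `i'       // value label attached to the value `i'
    graph bar (mean) price if foreign == `i', ///
        over(rep78, sort(price) descending) asyvars ///
        title("`t'") name(p_`i', replace) nodraw
    local graphs `graphs' p_`i'
}
graph combine `graphs'
```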
Note that either of these graph commands
sysuse auto, clear
graph bar (mean) price , ///
over(rep78, sort(price) descending) asyvars over(foreign)
graph bar (mean) price , ///
over(rep78, sort(price) descending) asyvars by(foreign)
uses value labels automatically and produces a combined graph directly. This may not be the main question, but the original code is not a good solution on those grounds.
I have the following model:
Y_{it} = alpha_i + B1*weight_{it} + B2*Dummy_Foreign_{i} + B3*(weight*Dummy_Foreign)_{it} + e_{it}
and I am interested in the effect of weight on Y for foreign cars, and in graphing the evolution of the relevant coefficient across quantiles, with the respective standard errors. That is, I need to see the evolution of the sum of coefficients (B1 + B3). I know this is a non-linear effect, and would require some sort of delta method to obtain the variance-covariance matrix needed for the standard error of (B1 + B3).
Before I delve into writing a program that attempts to do this, I thought I would try and ask if there is a way of doing it with grqreg. If this is not possible with grqreg, would someone please guide me into how they would start writing a code that computes the proper standard errors, and graphs the quantile coefficient.
For a cross section example of what I am trying to do, please see code below.
I use grqreg to generate the evolution of the separate coefficients, but I need the joint one: a single graph for the evolution of (B1 + B3) with its respective standard errors.
Thanks.
(I am using Stata 14.1 on Windows 10):
clear
sysuse auto
set scheme s1color
gen gptm = 1000/mpg
label var gptm "gallons / 1000 miles"
gen weight_foreign= weight*foreign
label var weight_foreign "Interaction weight and foreign car"
qreg gptm weight foreign weight_foreign , q(.5)
grqreg weight weight_foreign , ci ols olsci reps(40)
*** Question 1: How to construct the plot of the coefficient of interest?
Your second question is off-topic here since it is statistical. Try the CV SE site or Statalist.
Here's how you might do (1) in a cross section, using margins and marginsplot:
clear
set more off
sysuse auto
set scheme s1color
gen gptm = 1000/mpg
label var gptm "gallons / 1000 miles"
sqreg gptm c.weight##i.foreign, q(10 25 50 75 95) reps(500) coefl
margins, dydx(weight) predict(outcome(q10)) predict(outcome(q25)) predict(outcome(q50)) predict(outcome(q75)) predict(outcome(q95)) at(foreign=(0 1))
marginsplot, xdimension(_predict) xtitle("Quantile") ///
legend(label(1 "Domestic") label(2 "Foreign")) ///
xlabel(none) xlabel(1 "Q10" 2 "Q25" 3 "Q50" 4 "Q75" 5 "Q95", add) ///
title("Marginal Effect of Weight By Origin") ///
ytitle("GPTM")
This produces a graph like this:
I didn't recast the CI here since it would look cluttered, but that would make it look more like your graph. Just add recastci(rarea) to the options.
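As a quick check at a single quantile, the sum B1 + B3 is a linear combination of coefficients, so lincom reports it with a standard error directly after qreg. A minimal sketch, assuming the same factor-variable specification as above; the coefficient name 1.foreign#c.weight is how I would expect the interaction to be labelled, and the coeflegend option will confirm the exact names:

```stata
sysuse auto, clear
gen gptm = 1000/mpg

* Median regression with weight, foreign, and their interaction
qreg gptm c.weight##i.foreign, quantile(.5)

* Effect of weight for foreign cars: B1 + B3
lincom weight + 1.foreign#c.weight
```

The point estimate should agree with what margins, dydx(weight) at(foreign=1) reports for the same quantile.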
Unfortunately, none of the panel quantile regression commands play nicely with factor variables and margins. But we can hack something together. First, you can calculate the sums of coefficients with nlcom (instead of the more natural lincom, which lacks the post option), store them, and use Ben Jann's coefplot to graph them. Here's a toy example to give you the main idea, in which we look at the effect of tenure for union members:
set more off
estimates clear
webuse nlswork, clear
gen tXu = tenure*union
local quantiles 1 5 10 25 50 75 90 95 99 // K quantiles that you care about
local models "" // names of K quantile models for coefplot to graph
local xlabel "" // for x-axis labels
local j=1 // counter for quantiles
foreach q of numlist `quantiles' {
qregpd ln_wage tenure union tXu, id(idcode) fix(year) quantile(`q')
nlcom (me_tu:_b[tenure]+_b[tXu]), post
estimates store me_tu`q'
local models `"`models' me_tu`q' || "'
local xlabel `"`xlabel' `j++' "Q{sub:`q'}""'
}
di "`models'"
di `"`xlabel'"'
coefplot `models' ///
, vertical bycoefs rescale(100) ///
xlab(none) xlabel(`xlabel', add) ///
title("Marginal Effect of Tenure for Union Members On Each Conditional Quantile Q{sub:{&tau}}", size(medsmall)) ///
ytitle("Wage Change in Percent" "") yline(0) ciopts(recast(rcap))
This makes a dromedary curve, which suggests that the effect of tenure is larger in the middle of the wage distribution than at the tails:
I'm trying to create a bar chart in which the frequency is outside the bar and the percentage inside, is it possible? Would post a picture but the system doesn't allow for it yet.
As others pointed out, this is a poor question without code.
It is possible to guess that you are using graph bar. That makes you choose at most one kind and position of bar labels. Much more is possible with twoway bar so long as you do a little work.
sysuse auto, clear
contract rep78 if rep78 < .
su _freq
gen _pc = 100 * _freq / r(sum)
gen s_pc = string(_pc, "%2.1f") + "%"
gen one = 1
twoway bar _freq rep78, barw(0.9) xla(1/5, notick) bfcolor(none) ///
|| scatter one _freq rep78, ms(none ..) mla(s_pc _freq) mlabcolor(black ..) ///
mlabpos(0 12) scheme(s1color) ysc(r(0 32)) yla(, ang(h)) legend(off)
In short:
contract collapses to a dataset of frequencies.
Calculation of percents is trivial, but you need a formatted version in a string variable if the labels are not to look silly. The precise format is your choice.
The frequency scale on the axis is arguably redundant given the bar labels, and could be omitted.
The example puts labels within the bar just above its base at the level of frequency equal to 1. That's a choice for this example and would be too close to the axis if the typical frequencies were much higher.