American Community Survey, SAS EG code for margin of error - sas
The Census Bureau gives the mathematical formula for calculating the margin of error for the American Community Survey, but doesn't include the SAS code for it. The formula is on page 24 of the documentation here: http://www2.census.gov/programs-surveys/acs/tech_docs/accuracy/ACS_Accuracy_of_Data_2014.pdf
Does anyone have the SAS code for the margin of error? It would have to incorporate all 80 replicate weights (pwgtp1 through pwgtp80).
Here is the relevant code. It uses a 90% confidence level because that is what the Census Bureau uses for the margins of error it publishes on American FactFinder. To use a different confidence level, change the multiplier 1.64537 at the start of each expression (the Census documentation rounds the 90% value to 1.645).
/* Margin of Error 90% confidence code*/
1.64537*(SQRT(.05*(SUM((SUM(t1.pwgtp1)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp2)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp3)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp4)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp5)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp6)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp7)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp8)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp9)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp10)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp11)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp12)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp13)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp14)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp15)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp16)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp17)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp18)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp19)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp20)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp21)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp22)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp23)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp24)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp25)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp26)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp27)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp28)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp29)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp30)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp31)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp32)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp33)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp34)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp35)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp36)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp37)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp38)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp39)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp40)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp41)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp42)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp43)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp44)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp45)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp46)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp47)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp48)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp49)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp50)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp51)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp52)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp53)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp54)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp55)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp56)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp57)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp58)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp59)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp60)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp61)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp62)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp63)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp64)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp65)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp66)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp67)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp68)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp69)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp70)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp71)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp72)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp73)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp74)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp75)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp76)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp77)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp78)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp79)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp80)-(SUM(t1.PWGTP)))**2))))
AS Margin_of_Error,
/* Plus_Minus */
(1.64537*(SQRT(.05*(SUM((SUM(t1.pwgtp1)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp2)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp3)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp4)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp5)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp6)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp7)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp8)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp9)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp10)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp11)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp12)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp13)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp14)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp15)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp16)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp17)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp18)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp19)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp20)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp21)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp22)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp23)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp24)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp25)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp26)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp27)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp28)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp29)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp30)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp31)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp32)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp33)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp34)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp35)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp36)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp37)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp38)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp39)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp40)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp41)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp42)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp43)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp44)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp45)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp46)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp47)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp48)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp49)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp50)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp51)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp52)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp53)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp54)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp55)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp56)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp57)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp58)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp59)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp60)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp61)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp62)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp63)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp64)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp65)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp66)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp67)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp68)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp69)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp70)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp71)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp72)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp73)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp74)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp75)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp76)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp77)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp78)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp79)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp80)-(SUM(t1.PWGTP)))**2)))))/ (SUM(t1.PWGTP))
FORMAT=PERCENT6.1 AS Plus_Minus_Percent
/* End of Margin of Error Code */
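For reference, the expression above appears to follow the replicate-weight formula in the Census documentation. In the notation below (symbols chosen here for illustration, not taken from the post), X is the estimate computed with the full-sample weight PWGTP and X_r is the same estimate computed with replicate weight pwgtp r:

```latex
\mathrm{SE}(X) = \sqrt{\frac{4}{80}\sum_{r=1}^{80}\left(X_r - X\right)^2},
\qquad
\mathrm{MOE}_{90\%}(X) = 1.645 \times \mathrm{SE}(X)
```

The .05 in the SAS code is the factor 4/80, and Plus_Minus_Percent is the margin of error divided by X.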
To see where this fits into a query, here is the full code of a query with the margin of error code embedded.
/* Example program */
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_COMBINEDACS2013_SAS7BD(label="QUERY_FOR_combinedacs2013.sas7bdat") AS
SELECT /* SUM_of_PWGTP */
(SUM(t1.PWGTP)) FORMAT=Z5. AS SUM_of_PWGTP,
t1.SCHL,
1.64537*(SQRT(.05*(SUM((SUM(t1.pwgtp1)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp2)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp3)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp4)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp5)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp6)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp7)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp8)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp9)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp10)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp11)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp12)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp13)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp14)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp15)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp16)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp17)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp18)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp19)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp20)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp21)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp22)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp23)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp24)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp25)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp26)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp27)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp28)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp29)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp30)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp31)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp32)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp33)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp34)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp35)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp36)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp37)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp38)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp39)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp40)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp41)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp42)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp43)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp44)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp45)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp46)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp47)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp48)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp49)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp50)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp51)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp52)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp53)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp54)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp55)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp56)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp57)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp58)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp59)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp60)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp61)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp62)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp63)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp64)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp65)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp66)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp67)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp68)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp69)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp70)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp71)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp72)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp73)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp74)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp75)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp76)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp77)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp78)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp79)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp80)-(SUM(t1.PWGTP)))**2))))
AS Margin_of_Error,
/* Plus_Minus */
(1.64537*(SQRT(.05*(SUM((SUM(t1.pwgtp1)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp2)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp3)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp4)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp5)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp6)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp7)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp8)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp9)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp10)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp11)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp12)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp13)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp14)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp15)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp16)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp17)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp18)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp19)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp20)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp21)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp22)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp23)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp24)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp25)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp26)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp27)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp28)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp29)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp30)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp31)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp32)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp33)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp34)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp35)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp36)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp37)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp38)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp39)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp40)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp41)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp42)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp43)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp44)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp45)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp46)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp47)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp48)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp49)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp50)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp51)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp52)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp53)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp54)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp55)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp56)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp57)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp58)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp59)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp60)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp61)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp62)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp63)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp64)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp65)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp66)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp67)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp68)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp69)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp70)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp71)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp72)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp73)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp74)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp75)-(SUM(t1.PWGTP)))**2
,(SUM(t1.pwgtp76)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp77)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp78)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp79)-(SUM(t1.PWGTP)))**2,(SUM(t1.pwgtp80)-(SUM(t1.PWGTP)))**2)))))/ (SUM(t1.PWGTP))
FORMAT=PERCENT6.1 AS Plus_Minus_Percent
FROM EC100005.combinedacs2013 t1
GROUP BY t1.SCHL;
QUIT;
/* End example program */
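One way to sanity-check the output of a query like this is to recompute the margin of error for a single group outside SAS. The following is a minimal Python sketch, assuming you have already exported the full-sample total and the 80 replicate totals for one SCHL group; the function name `acs_margin_of_error` and the numbers are made up for illustration and are not part of the original query.

```python
import math

def acs_margin_of_error(full_total, replicate_totals, z=1.645):
    """Margin of error from a full-sample total and 80 replicate totals.

    Mirrors the SAS expression: z * sqrt(.05 * sum of squared differences).
    """
    if len(replicate_totals) != 80:
        raise ValueError("expected 80 replicate totals")
    squared_diffs = sum((rep - full_total) ** 2 for rep in replicate_totals)
    variance = (4 / 80) * squared_diffs  # 4/80 is the .05 in the SAS code
    return z * math.sqrt(variance)

# Hypothetical totals for one group, for illustration only
full = 1000.0
reps = [1010.0] * 40 + [990.0] * 40

moe = acs_margin_of_error(full, reps)
print(round(moe, 2))          # margin of error, comparable to Margin_of_Error
print(round(moe / full, 4))   # relative value, comparable to Plus_Minus_Percent
```

Note that this sketch defaults to 1.645 rather than the 1.64537 used in the SAS code, so results will differ slightly unless you pass the same multiplier as `z`.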
Related
'metan': forest plot options
I want to create a forest plot with the metan ado in Stata. This works fine so far, but I can't solve two problems:
How do I make the value range on the x-axis start after the column headings rather than at them?
How do I label the column on the far right with "p-value"?
Can someone assist with this? The code I used is:
metan or_num lci_num uci_num, label (namevar=Variable) by(sort_variable) xlabel(0,1,2,3,4,5,6) xsize(10) ysize(8) null(1) effect(Odds ratio) labtitle(Subgroup) nohet nobetween nooverall nosubgroup nosecsub lcols() rcols(p_num) astext(40) boxscale(0)
Precisions and counts
I am working with an educational dataset called IPEDS from the National Center for Education Statistics. They track students in college based upon major, degree completion, etc. The problem in Stata is that I am trying to determine the total count of degrees obtained in a specific major. They have a variable cipcode which contains values that serve as "majors". cipcode might be 14.2501 "Petroleum Engineering", 16.0102 "Linguistics" and so forth. When I write a particular command like
tab cipcode if cipcode==14.2501
it reports no observations. What code will give me the totals?
/*Convert Float Variable to String Variable and use Force Replace*/
tostring cipcode, gen(cipcode_str) format(%6.4f) force
replace cipcode_str = reverse(substr(reverse(cipcode_str), indexnot(reverse(cipcode_str), "0"), .))
replace cipcode_str = reverse(substr(reverse(cipcode_str), indexnot(reverse(cipcode_str), "."), .))
/* Created a total variable called total_t1 for total count of all stem majors listed in table 1*/
gen total_t1 = cipcode_str== "14.2501" + "14.3901" + "15.0999" + "40.0601"
This minimal example confirms your problem. (See, by the way, https://stackoverflow.com/help/mcve for advice on good examples.)
* code
clear
input code
14.2501
14.2501
14.2501
end
tab code if code == 14.2501
tab code if code == float(14.2501)
* results
. tab code if code == 14.2501
no observations
. tab code if code == float(14.2501)
       code |      Freq.     Percent        Cum.
------------+-----------------------------------
    14.2501 |          3      100.00      100.00
------------+-----------------------------------
      Total |          3      100.00
The keyword is one you use, precision. In Stata, search precision for resources, starting with blog posts by William Gould. A decimal like 14.2501 is hard (impossible) to hold exactly in binary and the details of holding a variable as type float can bite.
It's hard to see what you're doing with your last block of code, which you don't explain. The last statement looks puzzling, as you're adding strings. Consider what happens with
. gen whatever = "14.2501" + "14.3901" + "15.0999" + "40.0601"
. di whatever[1]
14.250114.390115.099940.0601
The result is a long string that cannot be a valid cipcode. I suspect that you are reaching towards
... if inlist(cipcode_str, "14.2501", "14.3901", "15.0999", "40.0601")
which is quite different. But using float() is the minimal trick for this problem.
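The same precision effect can be reproduced outside Stata. Here is a small Python sketch, offered only as an illustration of the general floating-point behaviour and not as a description of Stata's internals: it rounds 14.2501 to single precision (the helper name `to_float32` is made up here) and compares the result with the double-precision literal.

```python
import struct

def to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

# What a single-precision (float) variable would hold for 14.2501
stored = to_float32(14.2501)

print(stored == 14.2501)              # False: float32 value vs. double literal
print(stored == to_float32(14.2501))  # True: both sides rounded the same way
print(abs(stored - 14.2501) < 1e-6)   # True: the two values differ only slightly
```

Rounding the literal to single precision before comparing plays the same role as float() in the Stata fix above.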
Graph evolution of quantile non-linear coefficient: can it be done with grqreg? Other options?
I have the following model:
Y_{it} = alpha_i + B1*weight_{it} + B2*Dummy_Foreign_{i} + B3*(weight*Dummy_Foreign)_{it} + e_{it}
and I am interested in the effect of weight on Y for foreign cars, and in graphing the evolution of the relevant coefficient across quantiles, with the respective standard errors. That is, I need to see the evolution of the coefficients (B1+B3). I know this is a non-linear effect, and would require some sort of delta method to obtain the variance-covariance matrix to obtain the standard error of (B1+B3). Before I delve into writing a program that attempts to do this, I thought I would ask whether there is a way of doing it with grqreg. If this is not possible with grqreg, would someone please guide me on how they would start writing code that computes the proper standard errors and graphs the quantile coefficient? For a cross-section example of what I am trying to do, please see the code below. I use grqreg to generate the evolution of the separate coefficients, but I need the joint one: one graph for the evolution of (B1+B3) with its respective standard errors. Thanks. (I am using Stata 14.1 on Windows 10):
clear
sysuse auto
set scheme s1color
gen gptm = 1000/mpg
label var gptm "gallons / 1000 miles"
gen weight_foreign= weight*foreign
label var weight_foreign "Interaction weight and foreign car"
qreg gptm weight foreign weight_foreign , q(.5)
grqreg weight weight_foreign , ci ols olsci reps(40)
*** Question 1: How to construct the plot of the coefficient of interest?
Your second question is off-topic here since it is statistical. Try the CV SE site or Statalist. Here's how you might do (1) in a cross section, using margins and marginsplot:
clear
set more off
sysuse auto
set scheme s1color
gen gptm = 1000/mpg
label var gptm "gallons / 1000 miles"
sqreg gptm c.weight##i.foreign, q(10 25 50 75 95) reps(500) coefl
margins, dydx(weight) predict(outcome(q10)) predict(outcome(q25)) predict(outcome(q50)) predict(outcome(q75)) predict(outcome(q95)) at(foreign=(0 1))
marginsplot, xdimension(_predict) xtitle("Quantile") ///
    legend(label(1 "Domestic") label(2 "Foreign")) ///
    xlabel(none) xlabel(1 "Q10" 2 "Q25" 3 "Q50" 4 "Q75" 5 "Q95", add) ///
    title("Marginal Effect of Weight By Origin") ///
    ytitle("GPTM")
This produces a graph of the marginal effect of weight by origin at each quantile. I didn't recast the CI here since it would look cluttered, but that would make it look more like your graph. Just add recastci(rarea) to the options.
Unfortunately, none of the panel quantile regression commands play nice with factor variables and margins. But we can hack something together. First, you can calculate the sums of coefficients with nlcom (instead of the more natural lincom, which lacks the post option), store them, and use Ben Jann's coefplot to graph them.
Here's a toy example to give you the main idea, where we will look at the effect of tenure for union members:
set more off
estimates clear
webuse nlswork, clear
gen tXu = tenure*union
local quantiles 1 5 10 25 50 75 90 95 99 // K quantiles that you care about
local models "" // names of K quantile models for coefplot to graph
local xlabel "" // for x-axis labels
local j=1 // counter for quantiles
foreach q of numlist `quantiles' {
    qregpd ln_wage tenure union tXu, id(idcode) fix(year) quantile(`q')
    nlcom (me_tu:_b[tenure]+_b[tXu]), post
    estimates store me_tu`q'
    local models `"`models' me_tu`q' || "'
    local xlabel `"`xlabel' `j++' "Q{sub:`q'}""'
}
di "`models'"
di `"`xlabel'"'
coefplot `models' ///
    , vertical bycoefs rescale(100) ///
    xlab(none) xlabel(`xlabel', add) ///
    title("Marginal Effect of Tenure for Union Members On Each Conditional Quantile Q{sub:{&tau}}", size(medsmall)) ///
    ytitle("Wage Change in Percent" "") yline(0) ciopts(recast(rcap))
This makes a dromedary curve, which suggests that the effect of tenure is larger in the middle of the wage distribution than at the tails.
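The nlcom step above combines two coefficients into one estimate with its own standard error. For a plain sum of two coefficients, the delta method reduces to the usual variance-of-a-sum formula, which the following Python sketch spells out. It illustrates the arithmetic only and is not a description of how nlcom is implemented; the function name and all input numbers are hypothetical.

```python
import math

def sum_of_coefficients(b1, b3, var_b1, var_b3, cov_b1_b3, z=1.96):
    """Estimate, standard error, and normal-approximation CI for b1 + b3."""
    estimate = b1 + b3
    # Var(b1 + b3) = Var(b1) + Var(b3) + 2 * Cov(b1, b3)
    se = math.sqrt(var_b1 + var_b3 + 2 * cov_b1_b3)
    return estimate, se, (estimate - z * se, estimate + z * se)

# Made-up coefficient estimates and covariance entries, for illustration only
est, se, (lower, upper) = sum_of_coefficients(
    b1=0.02, b3=0.01, var_b1=0.0004, var_b3=0.0009, cov_b1_b3=-0.0002
)
print(est, se, lower, upper)
```

The covariance term matters: ignoring it and adding the two variances alone would overstate the standard error in this made-up example, because the two estimates are negatively correlated.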
Setting bounds in order to work on MMM modeling using SAS Procedures
I am trying to build an MM model using the SAS procedure proc nlin on the following columns (4 lines of the actual data are shown):
ACVDO  ACVFO  ACVFD  ACVPR
7.84   12.82  1.21   44.34
4.96   22.54  2.77   40.69
6.51   12.23  0.63   32.42
4.97   13.3   2.46   37.06
The code I used is as follows:
proc nlin data=Bags UNCORRECTEDDF;
parameters b0=0 b1=0 b2=0 b3=0.004,b4=0 ;
bounds 0.004<b3<=0.00717 ;*/
bounds b4>0.00253;
model mi_EQ = b0+b1*ACVDO+b2*ACVFO+b3*ACVFD+b4*ACVPR;
output out=nlinout predicted=pred ;
run;
There are other variables that also have to be included in the model, but the problem I am facing with this code is that, in spite of setting a range, the coefficients I get take one of the extremes rather than a value within the specified range. Can someone help me fix this issue? Am I missing something in the code?
Is it possible to use the 'where' statement in elasticnet (SAS)?
Here is the code I am using for variable selection:
proc glmselect data=abct;
where incex1=1;
title 'GLMSELECT with Elastic Net';
model devmood_c = asetot age yrseduc sex employyn cohabyn caucyn asitot penntot anxdis ahealthuse ahospit ventxpwk acmn nhospit bmi comorb aqllimmn aqlsubmn aqlsympmn aqlemotmn aqlenvirmn aqltotmn smoke3gp nalcwkcurr /selection=elasticnet(steps=120 L2=0.001 choose=validate);
run;
The problem is that, when I run it, it tells me:
ERROR: Variable incex1 is not on file WORK.ABCT.
The incex1 variable is used to exclude people in our database who scored too high on a particular question. It works with LASSO, but even though the code is similar, it doesn't seem to work with elasticnet. Does anyone know how I could use it, or whether there is another way to exclude the patients who scored under a certain threshold on a questionnaire? This is how incex1 has been coded:
if devmood_c = 0 then incex1=1;
if devmood_c = 1 then incex1=1;
if devmood_c = . then incex1=0;
if bdisev > 2 then incex1=0;
label incex1 = "1=no mood at baseline or BDI > 20, 0=excluded";
This works in test data, so it is likely an issue with your source data not having the characteristics you expect. For example,
ods graphics on;
proc glmselect data=sashelp.Leutrain valdata=sashelp.Leutest plots=coefficients;
where x1>0;
model y = x2-x7129/ selection=elasticnet(steps=120 l2=0.001 choose=validate);
run;
That works as expected.