Applying jack-knife weights to categorical variables in SAS

I am using SAS proc surveyfreq with jack-knife replicate weights to describe frequencies across variables in a survey that used address-based sampling. Some of the questions are coded by individual selection. For example, one survey question asks respondents to pick their top three choices, so the dataset stores each possible choice as its own variable with a Yes/No response coded 0/1. Which SAS procedure that incorporates jack-knife weights should I use in this case to describe the frequencies for the question as a whole, across the top three choices?
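One possible approach, offered as a sketch rather than a definitive answer: because each choice is stored as a 0/1 indicator, the weighted mean of an indicator is the estimated proportion of respondents who selected that choice, so PROC SURVEYMEANS with replicate weights can summarize every choice in a single table. The data set name (survey), the weight names (finalwt, repwt1-repwt80), and the indicator names (choice_a to choice_e) are all hypothetical. The survey documentation may also specify jackknife coefficients, which would go in the JKCOEFS= option of the REPWEIGHTS statement.

```sas
/* Hypothetical names: survey, finalwt, repwt1-repwt80, choice_a ... choice_e */
proc surveymeans data=survey varmethod=jackknife mean stderr clm;
  weight finalwt;              /* full-sample weight */
  repweights repwt1-repwt80;   /* jackknife replicate weights supplied with the data */
  var choice_a choice_b choice_c choice_d choice_e;   /* 0/1 indicators, one per choice */
run;
```

Listing the same indicators in a TABLES statement of proc surveyfreq would give one frequency table per choice instead of one combined table.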

Related

Do you still need to include "site" as a random effect when modeling a matched data set?

I am working on a multicenter propensity-matched cohort study. The primary outcome is binary and the secondary outcome is continuous. First, I performed multiple imputation to address the missing data. I initially planned to match exactly on site in addition to matching on the other variables of interest, but got very poor matches. I then used variables that describe the characteristics of the sites; I compared these with the site variable using the c statistic, and they had similar values. With these new variables and the other variables of interest, I got a much better match. I then ran conditional logistic regression within each imputation for the binary outcome and pooled the results. For the secondary outcome I used negative binomial regression, with the match ID in the CLASS statement and in a REPEATED statement. Do I need to include 'site' in a RANDOM statement in the model? I don't know whether this is possible in conditional logistic regression. What would be the best way to model these data after matching? I used SAS for the analysis.
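For readers following the workflow described above, the per-imputation conditional logistic step and the combining step might look like the sketch below. It does not settle the question about site. The names matched_mi, match_id, treat, and outcome are hypothetical, and the code assumes the imputed data sets are stacked with the usual _Imputation_ index variable and that treat is a numeric 0/1 variable.

```sas
/* Hypothetical names: matched_mi, match_id, treat (0/1), outcome (0/1) */
proc sort data=matched_mi;
  by _imputation_;
run;

proc logistic data=matched_mi;
  by _imputation_;                   /* one fit per imputed data set */
  strata match_id;                   /* conditions on each matched set */
  model outcome(event='1') = treat;
  ods output ParameterEstimates=cl_parms;
run;

proc mianalyze parms=cl_parms;       /* combines the per-imputation estimates */
  modeleffects treat;
run;
```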

Relationship between continuous outcome variable (with many zeroes) and predictors using SAS

I have a continuous dependent variable (volume of chemical) with many values equal to 0, and a set of continuous and categorical predictors. I want to examine the relationship between the predictors and the volume of chemical. I was thinking of using multiple linear regression, but many values of the outcome variable are 0, so I am not sure how to proceed.
I am using SAS.
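One commonly used option for an outcome like this is a two-part model: a logistic regression for whether the volume is non-zero, followed by a model for the size of the volume among the non-zero records. The sketch below is only an illustration of that idea, not a recommendation for this particular data set. The names chem, volume, x1, and grp are hypothetical, and the gamma distribution with a log link is an assumption that would need checking against the data.

```sas
/* Hypothetical names: chem, volume, x1 (continuous), grp (categorical) */
/* Assumes volume has no missing values; a missing volume would be coded 0 here */
data chem2;
  set chem;
  any_volume = (volume > 0);    /* 1 if any chemical was recorded, else 0 */
run;

/* Part 1: probability that the volume is non-zero */
proc logistic data=chem2;
  class grp / param=ref;
  model any_volume(event='1') = x1 grp;
run;

/* Part 2: size of the volume among the non-zero records */
proc genmod data=chem2;
  where volume > 0;
  class grp;
  model volume = x1 grp / dist=gamma link=log;
run;
```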

Details of predictions in proc logistic

I am fitting a logit model to a data set of households, using as the dependent variable whether the household is classified as poor (1 if it is poor, 0 if it is not):
proc logistic data=regression;
model poor(event="1") = variable1 variable2 variable3 variable4;
run;
Using proc logistic in SAS, I obtained the table "Association of predicted probabilities and observed responses", which tells me the percentage of concordant pairs. However, I need detailed information on how many households are correctly classified as poor, laid out as a classification table.
I would appreciate your help with this issue.
Add the CTABLE option to your MODEL statement.
model poor(event="1") = variable1 variable2 variable3 variable4 / ctable;
CTABLE classifies the input binary response observations according to
whether the predicted event probabilities are above or below some
cutpoint value z in the range (0,1). An observation is predicted as an
event if the predicted event probability exceeds or equals z. You can
supply a list of cutpoints other than the default list by specifying
the PPROB= option. Also, you can compute positive and negative
predictive values as posterior probabilities by using Bayes’ theorem.
You can use the PEVENT= option to specify prior probabilities for
computing these statistics. The CTABLE option is ignored if the data
have more than two response levels. This option is not available with
the STRATA statement.
For more information, see the section Classification Table.
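As a sketch of the PPROB= option mentioned above, the model from the question can request the classification table at a chosen list of cutpoints. The cutpoint values here are illustrative only. For each cutpoint, the resulting table reports how many events and non-events were classified correctly and incorrectly.

```sas
proc logistic data=regression;
  model poor(event="1") = variable1 variable2 variable3 variable4
        / ctable pprob=(0.3 to 0.7 by 0.1);   /* cutpoints chosen for illustration only */
run;
```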

Proc reg using by variable (month): How do you take average of all coefficients across all months?

How do you take an average of the coefficients across all months?
Please refer to this earlier question:
How do I perform regression by month on the same SAS data set?
The comments in the linked question provide the code to save the estimates in a data set. You would then run PROC MEANS on the saved data set to get the averages. You could also run the model without the BY variable to get a single set of estimates across all months. In general, it isn't common to average parameter estimates this way, except in a bootstrapping process.
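A minimal sketch of the two steps described above, assuming a data set mydata with a BY variable month, a response y, and regressors x1 and x2 (all hypothetical names). The OUTEST= data set holds one row of estimates per month, with the coefficients stored in variables named Intercept, x1, and x2.

```sas
/* Hypothetical names: mydata, month, y, x1, x2 */
proc sort data=mydata;
  by month;
run;

proc reg data=mydata outest=est noprint;
  by month;                 /* one regression per month */
  model y = x1 x2;
run;
quit;

proc means data=est mean;   /* average each coefficient across months */
  var intercept x1 x2;
run;
```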

"Automatically" calculate linear combination of parameter estimates with PROC GLM

Background: I have a categorical variable, X, with four levels that I fit as separate dummy variables. Thus, there are three total dummy variables representing x=1, x=2, x=3 (x=0 is baseline).
Problem/issue: I want to be able to calculate the value of a linear combination (i.e. using SAS as a calculator) of these dummy variables. For example, 2*B1 + 2*B2 + B3.
In Stata, this can be done using the lincom command, which uses the stored beta estimates to calculate linear combinations of the parameters.
In SAS in a procedure such as PROC GLM, I think I should use the ESTIMATE statement, but I'm not sure how I would specify the "weights" for each variable in this case.
You are looking for PROC SCORE. This takes output regression or factor estimates and scores a new data set. See this example: http://support.sas.com/documentation/cdl/en/statug/66859/HTML/default/viewer.htm#statug_score_examples02.htm
FYI, PROC MODEL does allow this in the model statement, which may be less work than PROC SCORE. I know PROC MODEL can readily be used in place of PROC REG, but I'm not sure how advanced the modeling in PROC MODEL can be, so it may not be an option for more complex models. I was hoping for something with less coding, but given the nature of SAS, I think this and PROC SCORE are the best I'm going to get.
What if you add your linear combination as a variable in your input dataset?
data myDatasetWithLinCom;
set mydata;
LinComb=2*(x=1)+ 2*(x=2)+(x=3); /*equivalent to 2*B1 + 2*B2 + B3*/
run;
Then you can specify LinComb as one of the explanatory variables and look up its coefficient directly in the output.
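For comparison, the ESTIMATE statement raised in the question can be sketched as follows. It assumes the three dummy variables are created explicitly and entered in the model as ordinary numeric regressors; the names mydata_d, y, d1, d2, and d3 are hypothetical. Each value after a variable name is the weight applied to that variable's coefficient.

```sas
/* Hypothetical names: mydata_d, y, d1-d3 (0/1 dummies for x=1, x=2, x=3) */
data mydata_d;
  set mydata;
  d1 = (x = 1);
  d2 = (x = 2);
  d3 = (x = 3);             /* x=0 is the baseline */
run;

proc glm data=mydata_d;
  model y = d1 d2 d3 / solution;
  estimate '2*B1 + 2*B2 + B3' d1 2 d2 2 d3 1;   /* intercept weight defaults to 0 */
run;
quit;
```

The output lists the estimated value of the combination together with its standard error and t test, similar to what lincom reports in Stata.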