R language Access Menu

Title Text Both  

Multiple regression for ordinal outcome

Ordinal regression is used if the outcome or dependent variable is of ordered or ordinal type. Function lrm of rms package can be used for this: 

Code:

> lrm(race~., data=bwdf)

Logistic Regression Model

lrm(formula = race ~ ., data = bwdf)

                         Model Likelihood     Discrimination    Rank Discrim.    
                            Ratio Test            Indexes          Indexes       
Obs              189    LR chi2      41.76    R2       0.230    C       0.747    
 1                96    d.f.             8    g        1.168    Dxy     0.494    
 2                26    Pr(> chi2) <0.0001    gr       3.214    gamma   0.495    
 3                67                          gp       0.247    tau-a   0.297    
max |deriv| 0.000003                          Brier    0.200                     

        Coef    S.E.   Wald Z Pr(>|Z|)
y>=2     2.9884 0.9618  3.11  0.0019  
y>=3     2.2860 0.9475  2.41  0.0158  
low=1    0.7727 0.3481  2.22  0.0265  
age     -0.0580 0.0301 -1.92  0.0543  
lwt     -0.0090 0.0053 -1.69  0.0910  
smoke=1 -1.7551 0.3450 -5.09  <0.0001 
ptl      0.1737 0.3307  0.53  0.5995  
ht=1     0.2745 0.6526  0.42  0.6741  
ui=1     0.0589 0.4298  0.14  0.8910  
ftv     -0.1792 0.1550 -1.16  0.2479  

The function polr of MASS package is also commonly used but it does not give P values, though thay can be calculated easily. 

Code:

res = polr(race~., data=bwdf)
> summary(res)

Re-fitting to get Hessian

Call:
polr(formula = race ~ ., data = bwdf)

Coefficients:
           Value Std. Error t value
low1    0.772657   0.348134  2.2194
age    -0.057965   0.030123 -1.9243
lwt    -0.008951   0.005303 -1.6878
smoke1 -1.755090   0.344926 -5.0883
ptl     0.173654   0.330725  0.5251
ht1     0.274575   0.652671  0.4207
ui1     0.058921   0.429790  0.1371
ftv    -0.179145   0.155038 -1.1555

Intercepts:
    Value   Std. Error t value
1|2 -2.9884  0.9620    -3.1066
2|3 -2.2860  0.9477    -2.4120

Residual Deviance: 330.4168 
AIC: 350.4168 

Following function can be used to list the pvalues: 

Code:

rnpolr_pvals = function(res){
# to get p values from polr result
# from: http://www.ats.ucla.edu/stat/r/dae/ologit.htm
    ctable <- coef(summary(res))
    ## calculate and store p values
    p <- pnorm(abs(ctable[, "t value"]), lower.tail = FALSE) * 2
    ## combined table
    ctable <- cbind(ctable, "p value" = p)
    ctable
}
> rnpolr_pvals(res)

Re-fitting to get Hessian

              Value Std. Error    t value        p value
low1    0.772656858 0.34813412  2.2194229 0.026457964671
age    -0.057965474 0.03012316 -1.9242825 0.054319179717
lwt    -0.008950877 0.00530327 -1.6878033 0.091448993515
smoke1 -1.755090336 0.34492593 -5.0883108 0.000000361267
ptl     0.173653582 0.33072526  0.5250690 0.599535222649
ht1     0.274574637 0.65267077  0.4206939 0.673978589879
ui1     0.058920557 0.42979004  0.1370915 0.890958480223
ftv    -0.179144528 0.15503774 -1.1554898 0.247889888811
1|2    -2.988419742 0.96195072 -3.1066246 0.001892365658
2|3    -2.285967938 0.94773195 -2.4120406 0.015863514669

Following function can be used to plot these coefficients: 

Code:

rnpolrcoefplot = function(res){
    dd = rnpolr_pvals(res)
    dd = data.frame(dd)
    dd$text = paste('P=',round(dd$p.value, 3))
    dd = dd[-3]
    dd$var = rownames(dd)
    dd$lower = with(dd, Value-1.96*Std..Error)
    dd$upper = with(dd, Value+1.96*Std..Error)
    dd = dd[-c(2,3)]
    names(dd)[1]='value'
    rnggforest(dd)
}
rnpolrcoefplot(res)

Output graph:

                  

 

References: 
Frank E Harrell Jr (2015). rms: Regression Modeling Strategies. R package version 4.3-1.
http://CRAN.R-project.org/package=rms
 


    Comments & Feedback