Graphical representation of contingency tables

There are many ways to show a contingency table graphically. The two categorical factors can be placed on the x and y axes, with the points jittered so that the number of observations in each combination is visible:

code:

> library(ggplot2)
> ggplot(bwdf, aes(race, smoke)) +
+     geom_jitter(position = position_jitter(width = 0.25, height = 0.25))
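
The data frame bwdf and the table tt used in this section are not defined on this page. As a minimal sketch, assuming (this is a guess, not confirmed by the text) that bwdf is the birthwt data set from the MASS package with race and smoke converted to factors, and that tt is the race-by-smoke table, they could be built like this:

```r
library(MASS)  # ships with R; provides the birthwt data set

# Assumption: bwdf is birthwt with race and smoke treated as categorical
bwdf <- birthwt
bwdf$race  <- factor(bwdf$race)
bwdf$smoke <- factor(bwdf$smoke)

# Assumption: tt is the two-way contingency table of race by smoking status
tt <- table(race = bwdf$race, smoke = bwdf$smoke)
print(tt)
```

Converting the two columns to factors matters here: left as numbers, ggplot would treat them as continuous axes rather than categories.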

               

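An association plot can also be used. With shading turned on, it highlights the cells whose counts deviate significantly from what would be expected if the two factors were independent: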

code:

> library(vcd)
> assoc(tt, shade=TRUE, gp=shading_max)

              

In the above plot, blue bars mark groups with more counts than expected, while red bars mark groups with fewer counts than expected. Hence, there are more smokers than expected in race 1 and fewer in race 3. Race group 2 has roughly the expected numbers of smokers and non-smokers.

A similar display can be produced with a mosaic plot:
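
The "expected" counts behind these colors can be inspected directly. The following is a rough sketch using base R's chisq.test, which returns the expected counts under independence and the Pearson residuals (observed minus expected, divided by the square root of expected). It assumes, as a guess, that tt is the race-by-smoke table of the birthwt data in MASS:

```r
# Assumption: tt corresponds to race by smoke in MASS::birthwt
tt <- with(MASS::birthwt, table(race = race, smoke = smoke))

res <- chisq.test(tt)

# Counts expected in each cell if race and smoking were independent
print(round(res$expected, 1))

# Pearson residuals: positive means more observations than expected,
# negative means fewer
print(round(res$residuals, 2))
```

A positive residual corresponds to a bar above the baseline in the association plot, and a negative residual to a bar below it.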

code:

> library(vcd)
> mosaic(tt, shade=TRUE, gp=shading_max)

              

Tile plot:

This plot is often used to show two categorical variables (on the x and y axes) and one numeric variable (as the fill color of the tiles), which makes it well suited to plotting contingency tables.

code:

> ggplot(bwdf, aes(race, smoke, fill=age))+ geom_tile()
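
Note that geom_tile draws one tile per row of the data, so with raw observations the tiles for the same race and smoke combination are drawn on top of each other. To display the contingency table itself, one option is to tabulate first and fill each tile by its cell count. This is only a sketch, again assuming that the data come from birthwt in MASS; the name counts is introduced here for illustration:

```r
library(ggplot2)

# Assumption: the underlying data are MASS::birthwt
tt <- with(MASS::birthwt, table(race = race, smoke = smoke))

# as.data.frame() on a table gives one row per cell, with the count in Freq
counts <- as.data.frame(tt)

p <- ggplot(counts, aes(race, smoke, fill = Freq)) +
  geom_tile() +
  geom_text(aes(label = Freq), colour = "white")  # print the count in each tile
print(p)
```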

                 

Dotchart: 

This is a simple but effective plot that clearly shows the data in a table.

code:

> library(datasets)
> VADeaths
      Rural Male Rural Female Urban Male Urban Female
50-54       11.7          8.7       15.4          8.4
55-59       18.1         11.7       24.3         13.6
60-64       26.9         20.3       37.0         19.3
65-69       41.0         30.9       54.6         35.1
70-74       66.0         54.3       71.1         50.0

> dotchart(VADeaths)

              

Correspondence analysis plot: 

Correspondence analysis can also be used to display the relationships in a contingency table:

code:

> mytable
           performance
subject     Good Fair Poor
  Language    76   32   46
  Economics   48   23   47
  Science     45   34   78

> library(FactoMineR)
> plot(CA(mytable, graph=FALSE))

               

As can be seen in the plot, Science is associated with Poor performance, while Language is associated with Good performance.
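
The table mytable is printed above but its construction is not shown. A possible way to rebuild it from the printed counts, and to check the stated associations numerically, is sketched below. It uses the Pearson residuals from base R's chisq.test; correspondence analysis decomposes these same residuals (up to a scaling constant), so their signs should agree with the plot:

```r
# Rebuild the table from the counts printed above
mytable <- as.table(matrix(
  c(76, 32, 46,
    48, 23, 47,
    45, 34, 78),
  nrow = 3, byrow = TRUE,
  dimnames = list(subject     = c("Language", "Economics", "Science"),
                  performance = c("Good", "Fair", "Poor"))
))

# Pearson residuals: positive values mark subject/performance pairs
# that occur more often than independence would predict
res <- chisq.test(mytable)$residuals
print(round(res, 2))
```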
