2021-01-09
Multiple Correspondence Analysis with R, or how to dig into qualitative variable associations?
When working with factor variables, it is often of interest to understand the preferential associations between factor levels: is a patient's blood type preferentially associated with disease-free status? A frequent strategy is to use a chi-squared test to check whether the modalities of two factors are independently distributed. Multiple Correspondence Analysis can be seen as a generalisation to more than two factors, in which the chi-squared distance corresponds to a distance on the resulting factorial map.
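As a point of reference, the two-factor chi-squared strategy mentioned above can be sketched in base R. The data here are simulated and the names `blood_type` and `status` are hypothetical, chosen only to mirror the blood type question; they are not taken from the original analysis.

```r
# Minimal sketch on simulated data (assumption: not the author's data set).
set.seed(1)
n <- 200
blood_type <- factor(sample(c("A", "B", "AB", "O"), n, replace = TRUE))
status     <- factor(sample(c("disease", "disease_free"), n, replace = TRUE))

# Contingency table crossing the two factors
tab <- table(blood_type, status)

# Chi-squared test of independence between the two factors
test <- chisq.test(tab)
test$p.value

# Pearson residuals: positive values flag pairs of levels observed
# more often than expected under independence
round(test$residuals, 2)
```

The residuals are what make the association "preferential": they show which pairs of levels drive any departure from independence, not just whether a departure exists.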
The example below relies on the FactoMineR package to compute the Multiple Correspondence Analysis factorial map. The visualization of the results is enhanced with ggplot2 to represent patients, factor levels, and an illustrative variable (a factor projected onto the map that did not contribute to its definition).
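The workflow described above could look roughly like the following sketch. It assumes a simulated `patients` data frame with hypothetical columns (`blood_type`, `smoker`, `sex`, `status`), with `status` treated as the illustrative variable; the helper `as_map()` is introduced here for convenience only. It is an illustration under those assumptions, not the original code.

```r
library(FactoMineR)
library(ggplot2)

# Simulated patients (assumption: hypothetical columns, random values)
set.seed(42)
n <- 200
patients <- data.frame(
  blood_type = factor(sample(c("A", "B", "AB", "O"), n, replace = TRUE)),
  smoker     = factor(sample(c("yes", "no"), n, replace = TRUE)),
  sex        = factor(sample(c("F", "M"), n, replace = TRUE)),
  status     = factor(sample(c("disease", "disease_free"), n, replace = TRUE))
)

# Column 4 (status) is illustrative: projected on the map, not used to build it
res <- MCA(patients, quali.sup = 4, graph = FALSE)

# Hypothetical helper: keep the first two dimensions plus row labels
as_map <- function(coord) {
  data.frame(dim1 = coord[, 1], dim2 = coord[, 2], label = rownames(coord))
}
ind <- as_map(res$ind$coord)        # one row per patient
lev <- as_map(res$var$coord)        # one row per active factor level
sup <- as_map(res$quali.sup$coord)  # one row per illustrative level

# Percentage of variance carried by the first two dimensions
pct <- round(res$eig[1:2, 2], 1)

p <- ggplot() +
  geom_point(data = ind, aes(dim1, dim2), colour = "grey70", alpha = 0.5) +
  geom_text(data = lev, aes(dim1, dim2, label = label), colour = "steelblue") +
  geom_text(data = sup, aes(dim1, dim2, label = label),
            colour = "firebrick", fontface = "italic") +
  labs(x = paste0("Dim 1 (", pct[1], "%)"),
       y = paste0("Dim 2 (", pct[2], "%)"))
p
```

Because the simulated factors are independent, no strong structure should be expected on this map; with real data, levels that tend to occur together in the same patients are drawn close to each other.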