Missing data in craniometrics: a simulation study
Abstract
Craniometric measurements represent a useful tool for studying the differentiation of mammal populations. However, the fragility of skulls often leads to incomplete data matrices. Damaged specimens or incomplete sets of measurements are usually discarded prior to statistical analysis. We assessed the performance of two strategies that avoid elimination of observations: (1) pairwise deletion of missing cells, and (2) estimation of missing data using available measurements. The effects of these distinct approaches on the computation of inter-individual distances and on population differentiation analyses were evaluated using craniometric measurements obtained from insular populations of deer mice Peromyscus maniculatus (Wagner, 1845). In our simulations, Euclidean distances were greatly altered by pairwise deletion, whereas Gower's distance coefficient corrected for missing data provided accurate results. Among the different estimation methods compared in this paper, the regression-based approximations weighted by coefficients of determination (r²) outperformed the competing approaches. We further show that incomplete sets of craniometric measurements can be used to compute distance matrices, provided that an appropriate coefficient is selected. However, the application of estimation procedures provides a flexible approach that allows researchers to analyse incomplete data sets.
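The contrast between the two distance coefficients can be illustrated with a minimal sketch. It is not the code used in the study: the function names, the four measurements, and the variable ranges are hypothetical, and the sketch assumes that pairwise deletion simply sums squared differences over the variables shared by two specimens, while Gower's coefficient for quantitative variables averages range-scaled absolute differences over those same variables (taken here as D = 1 - S).

```python
import math

def shared_variables(x, y):
    """Indices of measurements observed (not NaN) in both specimens."""
    return [k for k in range(len(x))
            if not (math.isnan(x[k]) or math.isnan(y[k]))]

def euclidean_pairwise_deletion(x, y):
    """Euclidean distance over shared variables only, with no rescaling.

    Assumption: this is what "pairwise deletion" amounts to for one pair.
    Every missing cell removes a non-negative term from the sum.
    """
    idx = shared_variables(x, y)
    return math.sqrt(sum((x[k] - y[k]) ** 2 for k in idx))

def gower_distance(x, y, ranges):
    """Gower distance for quantitative variables, taken as D = 1 - S.

    Each absolute difference is divided by the range of its variable and
    the result is averaged over the shared variables, so the value stays
    on the same scale regardless of how many cells are missing.
    """
    idx = shared_variables(x, y)
    if not idx:
        return math.nan  # no variable in common: distance is undefined
    return sum(abs(x[k] - y[k]) / ranges[k] for k in idx) / len(idx)

# Hypothetical skull measurements (mm) and ranges, not data from the study.
specimen_a = [25.0, 13.0, 10.0, 4.0]
specimen_b = [27.0, 14.0, 11.0, 5.0]
ranges = [4.0, 2.0, 2.0, 2.0]

# Same specimen with the first measurement lost to damage.
specimen_b_damaged = [math.nan] + specimen_b[1:]

for label, b in (("complete", specimen_b), ("damaged", specimen_b_damaged)):
    print(label,
          round(euclidean_pairwise_deletion(specimen_a, b), 3),
          round(gower_distance(specimen_a, b, ranges), 3))
```

In this contrived example the range-scaled differences are equal across variables, so Gower's distance is identical for the complete and the damaged specimen, whereas the Euclidean distance shrinks once a measurement is dropped. With real data Gower's value would also change, but it would remain an average over the available variables rather than a sum that depends on their number.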