Book Description
Missing values are a common problem in microarray data analysis. Many of the clustering algorithms applied to microarray data require a complete data matrix. The traditional solution is imputation: missing entries are filled in with estimates before clustering begins. Once imputed, these estimates remain fixed throughout the subsequent clustering process, so poorly estimated data points impair the reliability of the cluster analysis. In this study, we tested the ability of a novel clustering method based on a Bayesian infinite mixture model (IMM) to accommodate missing data. Using a simulation study and a prostate cancer dataset, we examined the specificity and sensitivity of the resulting clusters and found that the IMM method increases the precision of the cluster analysis without requiring prior imputation. The IMM is more robust in clustering an incomplete dataset than traditional clustering methods, which require prior imputation.
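To make the traditional approach concrete, the following is a minimal sketch of impute-then-cluster preprocessing. It assumes row-mean imputation, which is only one of several possible schemes and is not named in the description above; the function `impute_row_means` and the toy `expression` matrix are hypothetical and serve only as illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def impute_row_means(matrix):
    """Replace each None with the mean of the observed values in its row.

    Row-mean imputation is an assumption made for this sketch.
    """
    completed = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        if not observed:
            raise ValueError("row has no observed values to average")
        mean = sum(observed) / len(observed)
        completed.append([mean if v is None else v for v in row])
    return completed


# Hypothetical toy expression matrix: rows are genes, columns are samples.
expression = [
    [1.0, None, 3.0],
    [2.0, 2.0, None],
    [9.0, 8.0, 10.0],
]

completed = impute_row_means(expression)

# Any distance-based clustering applied from here on treats the imputed
# estimates exactly like observed measurements; they are never revised.
d01 = dist(completed[0], completed[1])
d02 = dist(completed[0], completed[2])
```

In this sketch the distances between genes depend directly on the filled-in values, which is why a poor estimate can carry through to the final clusters.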