Data-driven segmentation is an important tool for analyzing patterns of association in social survey data; however, comparing the quality of segmentations obtained by different methods remains a challenge. We present a statistical framework for quantifying the quality of segmentations of human values by evaluating their ability to predict held-out data. By comparing clusterings of human values survey data from the fourth round of the European Social Survey (ESS-4), we show that demographic markers such as age or country predict better than random, yet are outperformed by data-driven segmentation methods. We also show that a Bayesian version of Latent Class Analysis (LCA) outperforms standard maximum likelihood LCA in predictive performance and is more robust to the choice of the number of clusters.
- Bayesian latent class analysis
- Human value segmentation
- Predictive evaluation