The Role of Simulated Data in Making the Best Predictions

Stephen D. Ousley, George R. Milner, Jesper L. Boldsen, Richard L. Jantz

Publication: Contribution to journal › Conference abstract in journal › Research › peer review

Abstract

Machine Learning (ML) methods for regression and classification, along with the bootstrap, have revolutionized the analysis of data through resampling. The resulting simulated data sets are used to select the best fitting models and to estimate prediction precision and accuracy. These two tasks are especially important in forensic analyses, which should reflect predictive data analysis because they will be applied to new cases, rather than summarized in descriptive data analysis. Naturally, we want to use the methods that are expected to be the most accurate and precise for new cases. However, as the great Zen master Berra noted, "It’s tough to make predictions, especially about the future." Predictive methods must therefore incorporate the "Known Unknowns" (Rumsfeld, 2002), and avoid overfitting by analyzing multiple independent training and test samples, each of which ideally should be large. Bootstrap and Monte Carlo methods mimic sampling variability that would be present in future cases, and both methods are incorporated into numerous routines to estimate prediction accuracy. No routine is perfect due to bias and variance issues, and to the nature of the data and the analytical method. New routines are always being explored. This presentation provides results from two forensic scenarios: predicting sex and ancestry using bone measurements, and predicting age using many osteological traits with a new method (TA3). We demonstrate that the consequences of supposed overfitting may be relatively small in classification, and predicting age using TA3 is far more accurate than using previous methods, even with their underestimated prediction error.
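As a rough illustration of the resampling idea described in the abstract, the sketch below compares the apparent (resubstitution) accuracy of a classifier with a bootstrap out-of-bag estimate. It is not the authors' procedure and has no connection to TA3: the simulated one-variable "measurement", the class means and spread, the midpoint-threshold classifier, and every function name are assumptions made only for illustration.

```python
import random


def simulate(n_per_class, seed):
    """Monte Carlo draw of a hypothetical single measurement for two classes."""
    rng = random.Random(seed)
    class0 = [(rng.gauss(40.0, 3.0), 0) for _ in range(n_per_class)]
    class1 = [(rng.gauss(46.0, 3.0), 1) for _ in range(n_per_class)]
    return class0 + class1


def fit_threshold(train):
    """Toy classifier: cut at the midpoint between the two class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    m0 = sum(xs0) / len(xs0)
    m1 = sum(xs1) / len(xs1)
    return (m0 + m1) / 2.0, m1 >= m0


def predict(model, x):
    threshold, one_is_high = model
    return int((x > threshold) == one_is_high)


def accuracy(model, data):
    hits = sum(1 for x, y in data if predict(model, x) == y)
    return hits / len(data)


def bootstrap_oob_accuracy(data, n_boot=200, seed=0):
    """Train on each bootstrap resample, test on the cases left out of it."""
    rng = random.Random(seed)
    n = len(data)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        chosen = set(idx)
        train = [data[i] for i in idx]
        test = [data[i] for i in range(n) if i not in chosen]
        # Skip degenerate resamples (one class missing, or nothing left out).
        if not test or len({y for _, y in train}) < 2:
            continue
        scores.append(accuracy(fit_threshold(train), test))
    return sum(scores) / len(scores), scores


data = simulate(100, seed=1)
apparent = accuracy(fit_threshold(data), data)   # same cases for fit and test
oob_mean, oob_scores = bootstrap_oob_accuracy(data)
print(f"apparent accuracy:    {apparent:.3f}")
print(f"bootstrap OOB mean:   {oob_mean:.3f}")
print(f"OOB range over draws: {min(oob_scores):.3f} to {max(oob_scores):.3f}")
```

The spread of the out-of-bag scores gives a sense of the sampling variability to expect in new cases, while the gap between the apparent and out-of-bag figures is one simple indicator of overfitting.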
Original language: English
Journal: American Journal of Physical Anthropology
Volume: 165
Issue number: 66
Pages (from-to): 195
Number of pages: 1
ISSN: 0002-9483
Status: Published - 1 Apr 2018
Event: 87th Annual Meeting of the American Association of Physical Anthropologists - Austin, USA
Duration: 11 Apr 2018 to 15 Apr 2018

Conference

Conference: 87th Annual Meeting of the American Association of Physical Anthropologists
Country/Territory: USA
City: Austin
Period: 11/04/2018 to 15/04/2018

Bibliographical note

87th Annual Meeting of the American Association of Physical Anthropologists (AAPA), Austin, TX, Apr 11-14, 2018
