Optimized Risk Assessment in Forensic Practice: A Comparison of Machine Learning and Manual Scoring Approaches
Behavioral Sciences & the Law
Published online on February 02, 2026
Abstract
Behavioral Sciences & the Law, EarlyView.

As correctional jurisdictions and risk instrument developers look to optimize scoring for specific population needs, an open question remains: which method is optimal? Popular scoring methods range from simple manual approaches (e.g., Burgess) to more complex machine learning algorithms (e.g., random forests). Prior comparisons have found that these approaches produce similarly acceptable levels of predictive validity. This study extends the comparison beyond predictive validity to calibration, item inclusion, and item weighting, and discusses the drawbacks of each approach. Scoring was developed for an actuarial release decision-making risk assessment tool, the Reduction in Capacity Evaluation (ReduCE), using manual (unweighted, Burgess, Nuffield, Nuffield 2.5, regression) and machine learning (artificial neural network, random forests) scoring methods. The machine learning methods did not outperform the manual methods in predictive validity or calibration, and they introduced drawbacks in item inclusion and weighting. The optimal approach for ReduCE was the Nuffield 2.5 method.
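To make the contrast between manual scoring methods concrete, the following is a minimal sketch of unweighted versus Burgess-style scoring, with predictive validity summarized as the area under the ROC curve (AUC). It assumes the common description of the Burgess method, in which a binary item earns one point when cases with the item present fail at a rate above the sample base rate. The data, the function names (`burgess_weights`, `score`, `auc`), and the three-item layout are hypothetical illustrations and are not taken from the ReduCE study.

```python
# Hypothetical toy data: (binary item responses, outcome), where outcome 1 = failure.
# These values are invented for illustration only.
rows = [
    ((1, 1, 0), 1),
    ((1, 0, 0), 1),
    ((1, 1, 1), 1),
    ((0, 1, 1), 0),
    ((0, 0, 1), 0),
    ((0, 0, 0), 0),
    ((1, 0, 1), 0),
    ((0, 1, 0), 1),
]


def burgess_weights(rows, n_items):
    """Give an item 1 point if cases with the item present fail at a
    rate above the overall base rate; otherwise give it 0 points."""
    base_rate = sum(outcome for _, outcome in rows) / len(rows)
    weights = []
    for i in range(n_items):
        present = [outcome for items, outcome in rows if items[i] == 1]
        rate = sum(present) / len(present) if present else base_rate
        weights.append(1 if rate > base_rate else 0)
    return weights


def score(items, weights):
    """Total score: sum of item responses multiplied by item weights."""
    return sum(w * x for w, x in zip(weights, items))


def auc(scores, outcomes):
    """Probability that a randomly chosen failure outscores a randomly
    chosen non-failure, counting ties as one half."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


outcomes = [outcome for _, outcome in rows]

unweighted = [1, 1, 1]              # every item counts one point
burgess = burgess_weights(rows, 3)  # items kept only if above base rate

auc_unweighted = auc([score(items, unweighted) for items, _ in rows], outcomes)
auc_burgess = auc([score(items, burgess) for items, _ in rows], outcomes)

print("Burgess weights:", burgess)
print("AUC unweighted:", auc_unweighted)
print("AUC Burgess:", auc_burgess)
```

In this toy sample the Burgess rule drops the third item, which illustrates how a scoring method can change item inclusion as well as discrimination. The AUCs here are computed on the same cases used to derive the weights, so they are optimistic; a real comparison would use a held-out validation sample and would also examine calibration.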