Evaluating and optimizing the performance of software commonly used for the taxonomic classification of DNA metabarcoding sequence data
The International Journal of Health Planning and Management
Published online on November 21, 2016
Abstract
The taxonomic classification of DNA sequences has become a critical component of numerous ecological research applications; however, few studies have evaluated the strengths and weaknesses of commonly used sequence classification approaches. Further, the methods and software available for sequence classification are diverse, which can make it difficult to determine the best course of action and to understand the trade‐offs associated with different classification approaches. Here, we provide an in silico evaluation of three DNA sequence classifiers: the RDP Naïve Bayesian Classifier, RTAX and UTAX. We also discuss the results, merits and limitations of both the classifiers and our method of classifier evaluation. Our methods of comparison are simple, yet robust, and will provide researchers with a methodological and conceptual foundation for making such evaluations in a variety of research situations. Generally, we found a considerable trade‐off between accuracy and sensitivity for the classifiers tested, indicating a need for further improvement of sequence classification tools.