Big Data Bioinformatics
Journal of Cellular Physiology
Published online on May 06, 2014
Abstract
Recent technological advances allow for high-throughput profiling of biological systems in a cost‐efficient manner. The low cost of data generation is leading us to the “big data” era. The availability of big data provides unprecedented opportunities, but it also raises challenges in data mining and analysis. In this review, we introduce key concepts in the analysis of big data, including “machine learning” algorithms, with both “unsupervised” and “supervised” examples. We note packages for the R programming language that are available to perform machine learning analyses. In addition to programming-based solutions, we review webservers that allow users with limited or no programming background to perform these analyses on large data compendia. J. Cell. Physiol. © 2014 Wiley Periodicals, Inc.