Ordinal outlier detection based on recursive uniform partitioning
Transactions of the Institute of Measurement and Control
Published online on December 12, 2011
Abstract
Outlier detection plays an important role in intelligent cyber systems, especially fault-tolerant and adaptive ones. Traditional algorithms generally need to evaluate distances or densities, which is very time-consuming. In response to the increasingly urgent demand for real-time applications, various novel algorithms have been proposed over the past few years. They are much faster, but less stable and less accurate. To address these problems, we propose an ordinal outlier detection (OOD) algorithm. It builds on the core idea of ordinal optimization and the ‘few and different’ characteristics of outliers, and introduces the concept of outlier probability. OOD extracts outliers according to the order in which they are isolated during a recursive uniform partitioning of the data space. It does not need any distance or density evaluation, and the complexity is reduced to O(n). Experiments show that the CPU time of OOD increases linearly as the data set grows linearly. Furthermore, compared with the recent iForest algorithm, OOD is about 30 times faster, achieves a 20–30% improvement in accuracy, and in particular is much more stable. OOD also scales well, so it works well on high-dimensional data sets with a huge number of instances and irrelevant attributes.
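To make the idea of ranking points by the order in which they are isolated more concrete, the following is a minimal illustrative sketch, not the authors' OOD implementation. It assumes one simple partitioning rule (each cell is halved at its midpoint, cycling through the dimensions), which the abstract does not specify, and it does not model the outlier probability concept. The names isolation_depths, rank_outliers and max_depth are hypothetical, and the sketch makes no attempt to reproduce the O(n) complexity reported for OOD.

```python
from typing import List, Sequence


def isolation_depths(points: Sequence[Sequence[float]], max_depth: int = 32) -> List[int]:
    """For each point, return the depth at which it first sits alone in a cell.

    Assumption (not from the paper): cells are halved at their midpoint,
    cycling through the dimensions one at a time.
    """
    n = len(points)
    dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dims)]
    hi = [max(p[d] for p in points) for d in range(dims)]
    # Points that are never isolated (e.g. exact duplicates) keep max_depth.
    depths = [max_depth] * n
    # Each work item: (point indices in the cell, lower bounds, upper bounds, depth).
    stack = [(list(range(n)), lo, hi, 0)]
    while stack:
        idx, cell_lo, cell_hi, depth = stack.pop()
        if len(idx) == 1:
            depths[idx[0]] = depth  # isolated: record the depth
            continue
        if depth >= max_depth:
            continue
        d = depth % dims
        mid = (cell_lo[d] + cell_hi[d]) / 2.0
        left = [i for i in idx if points[i][d] < mid]
        right = [i for i in idx if points[i][d] >= mid]
        if left:
            left_hi = list(cell_hi)
            left_hi[d] = mid
            stack.append((left, cell_lo, left_hi, depth + 1))
        if right:
            right_lo = list(cell_lo)
            right_lo[d] = mid
            stack.append((right, right_lo, cell_hi, depth + 1))
    return depths


def rank_outliers(points: Sequence[Sequence[float]]) -> List[int]:
    """Return point indices ordered from earliest isolated to latest isolated."""
    depths = isolation_depths(points)
    return sorted(range(len(points)), key=lambda i: depths[i])


# Four clustered points and one far-away point.
sample = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (10.0, 10.0)]
sample_depths = isolation_depths(sample)
sample_ranking = rank_outliers(sample)
```

In this sketch the far-away point is separated from the cluster by the very first split, whereas the clustered points need many more splits before each sits alone in a cell. Only the order of isolation is used; no distance or density is ever computed.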