|
- 2015
Fast online detection of outliers using least-trimmed squares regression with non-dominated sorting based initial subsets
Abstract: This paper presents a new algorithm for computing the Least Trimmed Squares (LTS) estimator. The algorithm consists of two steps. In the first step, a non-dominated sorting algorithm is applied to the design matrix of the regression data to select a clean subset of observations. In the second step, C-steps are iterated to refine the LTS estimates. The algorithm is fast and precise for small sample sizes; however, the sorting algorithm is computationally inefficient for large datasets. For online data, a fast update mechanism can be used with only a linear increase in computation time. Some properties of the sorting algorithm under certain transformations are also investigated. Results on several well-known datasets and Monte Carlo simulations show that the proposed algorithm is suitable in many cases where computation time is the main concern and a moderate level of precision is sufficient.
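The two-step structure described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation. The abstract does not say how the clean subset is derived from the sorting, so the rule used here (keep the `h` observations whose domination counts are closest to the median count) is an assumption made only for illustration. All function names (`dominates`, `domination_counts`, `initial_subset`, `ols`, `c_steps`) and the toy data are hypothetical.

```python
def dominates(a, b):
    """True if a >= b in every coordinate and a > b in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def domination_counts(rows):
    """For each row, count the rows it dominates (quadratic in n)."""
    return [sum(dominates(r, s) for s in rows) for r in rows]

def initial_subset(X, h):
    """ASSUMED rule: keep the h rows whose counts are nearest the median count."""
    counts = domination_counts(X)
    med = sorted(counts)[len(counts) // 2]
    order = sorted(range(len(X)), key=lambda i: (abs(counts[i] - med), i))
    return sorted(order[:h])

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols(X, y, idx):
    """Least-squares coefficients (intercept first) fitted on the rows in idx."""
    rows = [[1.0] + list(X[i]) for i in idx]
    k = len(rows[0])
    XtX = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * y[i] for r, i in zip(rows, idx)) for a in range(k)]
    return solve(XtX, Xty)

def c_steps(X, y, subset, h, max_iter=50):
    """Refit on the h smallest squared residuals until the subset stops changing."""
    for _ in range(max_iter):
        beta = ols(X, y, subset)
        res2 = [(y[i] - beta[0] - sum(b * x for b, x in zip(beta[1:], X[i]))) ** 2
                for i in range(len(y))]
        new = sorted(sorted(range(len(y)), key=lambda i: res2[i])[:h])
        if new == subset:
            break
        subset = new
    return ols(X, y, subset), subset

# Toy data: 20 clean points on y = 2 + 3*x1 - x2, plus 4 high-leverage outliers.
X = [(float(i % 5), float(i // 5)) for i in range(20)]
y = [2.0 + 3.0 * x1 - x2 for x1, x2 in X]
X += [(10.0 + j, 10.0 + j) for j in range(4)]
y += [0.0] * 4

h = 18  # number of observations the LTS fit is allowed to trust
start = initial_subset(X, h)
beta, subset = c_steps(X, y, start, h)
print(beta, subset)
```

In this toy setting the four outliers sit far from the bulk of the design matrix, so the assumed median rule excludes them from the starting subset and the C-steps only need to confirm the fit. The pairwise domination count is quadratic in the number of observations, which is consistent with the abstract's remark that the sorting step becomes inefficient for large datasets.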
|