Path: Top -> Journal -> International Journal -> King Saud University -> 2021 -> Volume 33, Issue 10, December
Improving outliers detection in data streams using LiCS and voting
By : Fatima-Zahra Benjelloun, Ahmed Oussous, Amine Bennani, Samir Belfkih, Ayoub Ait Lahcen, King Saud University
Created : 2022-02-15, with 0 files
Keywords : Data streams, Outlier detection, High-dimensional data, Big data mining, Intrusion detection
URL : http://www.sciencedirect.com/science/article/pii/S1319157819301454
Document source : web
Detecting outliers in real time is increasingly important for many real-world applications, such as detecting abnormal heart activity, system intrusions, spam, or abnormal credit card transactions. However, detecting outliers in data streams raises many challenges, such as high dimensionality, dynamic data distributions, and unpredictable relationships. Our simulations demonstrate that some advanced solutions still show drawbacks. In this paper, we first improve the outlier-detection capacity of both micro-cluster-based algorithms (MCOD) and distance-based algorithms (Abstract-C and Exact-Storm), which are known for their performance. We do this by adding a layer called LiCS that classifies online the K-nearest neighbors (Knn) of each node based on their evolutionary status. This layer aggregates the results and uses a count threshold to better classify nodes. Experiments on SpamBase datasets confirmed that our technique enhances the accuracy and precision of these algorithms and helps reduce the number of unclassified nodes. Second, we propose a hybrid solution based on iterative majority voting and our LiCS. Experiments on real data prove that it outperforms the discussed algorithms in terms of accuracy, precision, and sensitivity in detecting outliers. It also minimizes the issue of unclassified instances and consolidates the different outputs of the algorithms.
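The abstract does not give the details of the iterative voting scheme or of LiCS, so the following is only a minimal sketch of plain (single-pass) majority voting over the labels produced by several detectors, not the authors' method. The function name `majority_vote`, the label strings, and the convention that `None` marks an unclassified instance are all assumptions made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence

# Hypothetical label values; the paper's actual encoding is not given in the abstract.
OUTLIER = "outlier"
INLIER = "inlier"


def majority_vote(votes: Sequence[Optional[str]]) -> Optional[str]:
    """Combine the labels several detectors assign to one instance.

    A vote of None means the detector left the instance unclassified.
    Returns the label with strictly more votes than any other label,
    or None when no detector voted or the top labels are tied.
    """
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None  # every detector abstained
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the leading labels
    return ranked[0][0]


# Illustrative votes, e.g. from MCOD, Abstract-C and Exact-Storm (made-up values).
combined = majority_vote([OUTLIER, None, OUTLIER])
```

In this sketch an abstaining detector simply does not count, so an instance that only one algorithm could classify still receives a label, which is one simple way a combined output can reduce the number of unclassified instances.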
Property | Property Value |
---|---|
Publisher ID | gdlhub |
Organization | King Saud University |
Contact Name | Herti Yani, S.Kom |
Address | Jln. Jenderal Sudirman |
City | Jambi |
Region | Jambi |
Country | Indonesia |
Telephone | 0741-35095 |
Fax | 0741-35093 |
Administrator E-mail | elibrarystikom@gmail.com |
CKO E-mail | elibrarystikom@gmail.com |
Contributors...
- Editor: Calvin