Telkomnika, Vol 17, No 6, December 2019

An adaptive clustering and classification algorithm for Twitter data streaming in Apache Spark

Journal from gdlhub / 2020-01-09 14:49:04
By: Raed A. Hasan, Royida A. Ibrahem Alhayali, Nashwan Dheyaa Zaki, Ahmed Hussien Ali, Telkomnika
Created: 2020-01-09, with 1 file

Keywords: classification, clustering, data streaming, optimization, pre-processing
URL: http://journal.uad.ac.id/index.php/TELKOMNIKA/issue/view/640
Document source: web

Streaming big data from social networking sites such as Twitter and Facebook has been an active research topic in recent decades because of its timeliness, accessibility, and popularity; however, there may be a trade-off in accuracy. Moreover, clustering of Twitter data has caught the attention of researchers. As such, an algorithm that can cluster data in less computational time, especially for data streaming, is needed. The presented adaptive clustering and classification algorithm for data streaming in Apache Spark addresses these problems in two phases. In the first phase, the pre-processed Twitter data is clustered using an Improved Fuzzy C-means algorithm, and the clustering is further improved by an Adaptive Particle Swarm Optimization (PSO) algorithm. The clustered streaming data is then evaluated in the Spark engine. In the second phase, the pre-processed Higgs data is classified using a modified support vector machine (MSVM) classifier with grid search optimization. Finally, the optimized data is evaluated in the Spark engine, and the resulting values are used to compute a confusion matrix. The proposed work uses the Twitter dataset and the Higgs dataset for data streaming in Apache Spark. The computational experiments demonstrate the superiority of the presented approach over existing methods in terms of precision, recall, F-score, convergence, ROC curve, and accuracy.
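
The abstract reports results as a confusion matrix together with precision, recall, F-score, and accuracy. As a minimal illustration of how those metrics follow from a binary confusion matrix, the sketch below uses the standard textbook definitions in plain Python. The function names and the sample labels are hypothetical and are not taken from the paper, which runs its evaluation in Apache Spark.

```python
from typing import Dict, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            counts["tp"] += 1
        elif predicted == 1 and actual == 0:
            counts["fp"] += 1
        elif predicted == 0 and actual == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def classification_metrics(cm: Dict[str, int]) -> Dict[str, float]:
    """Derive precision, recall, F-score (F1), and accuracy from the counts."""
    tp, fp, fn, tn = cm["tp"], cm["fp"], cm["fn"], cm["tn"]
    total = tp + fp + fn + tn
    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Hypothetical labels, for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

cm = confusion_matrix(y_true, y_pred)
scores = classification_metrics(cm)
```

With these sample labels there are three true positives, one false positive, one false negative, and three true negatives, so all four metrics come out to 0.75.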


Property: Property value
Publisher ID: gdlhub
Organization: Telkomnika
Contact name: Herti Yani, S.Kom
Address: Jln. Jenderal Sudirman
City: Jambi
Region: Jambi
Country: Indonesia
Telephone: 0741-35095
Fax: 0741-35093
Administrator e-mail: elibrarystikom@gmail.com
CKO e-mail: elibrarystikom@gmail.com


Contributors:

  • Editor: Calvin
