Path: Top -> Journal -> Jurnal ITB -> 2016 -> Vol.10 No.2

Social Media Text Classification by Enhancing Well-Formed Text Trained Model

Journal from gdlhub / 2017-08-14 09:11:15
By : Phat Jotikabukkana, Virach Sornlertlamvanich, Okumura Manabu, Choochart Haruechaiyasak, ITB
Created : 2016-08-10, with 1 file

Keyword : online news, semi-supervised learning, social media text, well-formed text, Term Frequency-Inverse Document Frequency (TF-IDF) weighting, Word Article Matrix (WAM)
Url : http://journals.itb.ac.id/index.php/jictra/article/view/1879
Document source : Web

Social media are a powerful communication tool in our era of digital information. The large amount of user-generated data is a useful new source of information, even though extracting value from this vast and noisy trove is not easy. Since classification is an important part of text mining, many techniques have been proposed to classify this kind of information. We developed an effective technique for social media text classification based on semi-supervised learning, using an online news source consisting of well-formed text. The system first automatically extracts news categories, already well categorized by publishers, as classes for topic classification. A bag of words taken from news articles provides the initial keywords related to each category in the form of word vectors. The principal task is to retrieve a set of new productive keywords. Term Frequency-Inverse Document Frequency (TF-IDF) weighting and the Word Article Matrix (WAM) are the main methods used. A modified WAM is recomputed until it becomes the most effective model for social media text classification. The key success factor was enhancing the model with effective keywords from social media. A promising result of 99.50% accuracy was achieved, with Precision, Recall, and F-measure each above 98.5%, after updating the model three times.
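The pipeline described in the abstract (TF-IDF weights stored in a word-by-article matrix, then enriched with keywords from social media posts) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_wam`, `classify`, and `update`, the smoothed IDF formula, the choice to treat each news category as a single matrix column, the confidence `threshold`, and the toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def build_wam(docs_by_class):
    """Build a word-by-class matrix of TF-IDF weights.

    Assumption: all articles of a class are pooled into one column.
    Returns {word: {class_name: weight}}.
    """
    tf = {
        c: Counter(w for doc in docs for w in doc.lower().split())
        for c, docs in docs_by_class.items()
    }
    n_classes = len(tf)
    # Document frequency: in how many classes does each word occur?
    df = Counter(w for counts in tf.values() for w in counts)
    wam = defaultdict(dict)
    for c, counts in tf.items():
        total = sum(counts.values())
        for w, k in counts.items():
            # Smoothed IDF (an assumed variant, chosen to avoid zero weights).
            idf = math.log((1 + n_classes) / (1 + df[w])) + 1
            wam[w][c] = (k / total) * idf
    return wam


def classify(wam, text, classes):
    """Score a text per class by summing the weights of its known words."""
    scores = {c: 0.0 for c in classes}
    for w in text.lower().split():
        for c, weight in wam.get(w, {}).items():
            scores[c] += weight
    best = max(scores, key=scores.get)
    return best, scores


def update(docs_by_class, posts, threshold):
    """One semi-supervised round: add confidently labeled posts, then rebuild."""
    wam = build_wam(docs_by_class)
    for post in posts:
        label, scores = classify(wam, post, docs_by_class)
        if scores[label] >= threshold:  # hypothetical confidence cut-off
            docs_by_class[label].append(post)
    return build_wam(docs_by_class)


# Toy "well-formed" news text, invented for this example.
news = {
    "sports": ["the team won the football match",
               "the striker scored a late goal"],
    "politics": ["the parliament passed the election law",
                 "the minister gave a speech on the vote"],
}
post = "what a goal by the striker lol"

wam = build_wam(news)
label, scores = classify(wam, post, news)

# After one update round, slang from the post becomes a keyword of its class.
wam2 = update(news, [post], threshold=0.5)
label2, _ = classify(wam2, "lol what a match", news)
```

In this sketch the post is first labeled using only news vocabulary; once it is absorbed into its class, informal words such as "lol" gain a weight and can help classify later posts. Repeating `update` over several rounds mirrors, in spirit, the repeated model updates mentioned in the abstract.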


Property : Property Value
Publisher ID : gdlhub
Organization : ITB
Contact Name : Herti Yani, S.Kom
Address : Jln. Jenderal Sudirman
City : Jambi
Region : Jambi
Country : Indonesia
Telephone : 0741-35095
Fax : 0741-35093
Administrator E-mail : elibrarystikom@gmail.com
CKO E-mail : elibrarystikom@gmail.com


Contributors :

  • Editor: sustriani
