STIKOM DB Digital Library

Journal: Telkomnika, 2021, Vol 19, No 4, August

Enhancing text classification performance by preprocessing misspelled words in Indonesian language

Journal from gdlhub / 2021-09-10 13:59:02
By: Reza Setiabudi, Ni Made Satvika Iswari, Andre Rusli, Telkomnika
Created: 2021-09-10, with 0 files

Keywords: Indonesian language, Levenshtein distance, text classification, typo correction, user feedback
URL: http://journal.uad.ac.id/index.php/TELKOMNIKA/article/view/20369
Document source: Web

Supervised learning with shallow machine learning methods remains popular for text processing, despite rapid advances in unsupervised deep learning methods. Supervised classification of application user feedback sentiment in the Indonesian language is one such application, popular in both the research community and industry. However, because of the nature of shallow machine learning approaches, various text preprocessing techniques are required to clean the input data. This research implements and evaluates the role of the Levenshtein distance algorithm in detecting and preprocessing misspelled Indonesian words before the text is used to train a user feedback sentiment classification model based on multinomial Naïve Bayes. The research experimented with various evaluation scenarios and found that preprocessing misspelled Indonesian words with the Levenshtein distance algorithm could be useful, showing a promising 8.2% increase in the accuracy of the model's ability to classify user feedback text by sentiment.
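
The typo-correction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny vocabulary `VOCAB`, the `max_distance` threshold of 2, and the helper names `correct_word` and `correct_text` are hypothetical choices made for illustration, since this record does not state which dictionary or threshold the paper used. The sketch computes the Levenshtein distance with the standard dynamic-programming recurrence and replaces each unknown word with its nearest vocabulary entry when that entry is close enough.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance where insertion, deletion and substitution each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca -> cb (or match)
            ))
        previous = current
    return previous[-1]


# Hypothetical toy vocabulary of correctly spelled Indonesian words.
VOCAB = {"aplikasi", "bagus", "sangat", "lambat", "tidak"}


def correct_word(word: str, vocabulary=VOCAB, max_distance: int = 2) -> str:
    """Return the nearest vocabulary word, or the input if none is close enough."""
    if word in vocabulary:
        return word
    # sorted() makes the choice deterministic when two candidates tie.
    best = min(sorted(vocabulary), key=lambda v: levenshtein(word, v))
    return best if levenshtein(word, best) <= max_distance else word


def correct_text(text: str, vocabulary=VOCAB) -> str:
    """Lowercase, split on whitespace, and correct each token independently."""
    return " ".join(correct_word(tok, vocabulary) for tok in text.lower().split())


if __name__ == "__main__":
    print(correct_text("Aplikssi sangat bagsu"))
```

In a pipeline like the one the abstract describes, the corrected text would then be vectorized and passed to a multinomial Naïve Bayes classifier. The threshold matters: a value that is too high rewrites legitimate out-of-vocabulary words, while one that is too low leaves many typos untouched.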

Property	Property Value
Publisher ID	gdlhub
Organization	Telkomnika
Contact Name	Herti Yani, S.Kom
Address	Jln. Jenderal Sudirman
City	Jambi
Region	Jambi
Country	Indonesia
Phone	0741-35095
Fax	0741-35093
Administrator E-mail	elibrarystikom@gmail.com
CKO E-mail	elibrarystikom@gmail.com

Contributors

Editor: Calvin
