Journal: Jurnal Nasional Teknik Elektro dan Teknologi Informasi, 2018, Vol 7, No 2

Self-Training Naive Bayes Berbasis Word2Vec untuk Kategorisasi Berita Bahasa Indonesia (Word2Vec-Based Self-Training Naive Bayes for Indonesian News Categorization)

Journal from gdlhub / 2019-11-15 10:58:14
By : Joan Santoso, Agung Dewa Bagus Soetiono, Gunawan Gunawan, Endang Setyati, Eko Mulyanto Yuniarno, Mochamad Hariadi, Mauridhi Hery Purnomo, JNTETI
Created : 2018-07-25, with 1 file

Keywords : News Categorization, Word2Vec, Skip-Gram, Self-Training, Naive Bayes, Semi-supervised Learning, Indonesian Language
Url : http://ejnteti.jteti.ugm.ac.id/index.php/JNTETI/article/view/418
Document source : WEB

News, a kind of information needed in daily life, is widely available on the internet. News websites often categorize their articles by topic to help users access the news more easily. Document classification has been widely used to do this automatically. However, the labeled training data currently available is insufficient for a machine to build a good model, because data annotation requires considerable cost and time to produce a sufficient quantity of labeled training data. Semi-supervised algorithms address this problem by using both labeled and unlabeled data to build a classification model. This paper proposes a semi-supervised news classification system using the Self-Training Naive Bayes algorithm. The features used for text classification come from the Word2Vec Skip-Gram model, which is widely used in computational linguistics and text mining research as a word representation method. Word2Vec is used as a feature because it can carry the semantic meaning of words into this classification task. The data used in this paper consist of 29,587 news documents from Indonesian online news websites. The Self-Training Naive Bayes algorithm achieved its highest F1-score of 94.17%.
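The self-training procedure summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how word vectors are combined into a document feature, so the sketch averages them; it uses a Gaussian Naive Bayes variant because the features are real-valued; and the confidence threshold, the round limit, the toy two-dimensional embeddings, and all function names are hypothetical. In practice the embeddings would come from a trained Word2Vec Skip-Gram model.

```python
import math

def doc_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens (assumed aggregation scheme)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def fit_gnb(X, y, eps=1e-3):
    """Fit per-class log prior, mean and variance (Gaussian Naive Bayes)."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        n, dim = len(rows), len(rows[0])
        mean = [sum(r[i] for r in rows) / n for i in range(dim)]
        # eps keeps the variance away from zero on tiny samples
        var = [sum((r[i] - mean[i]) ** 2 for r in rows) / n + eps
               for i in range(dim)]
        model[label] = (math.log(n / len(y)), mean, var)
    return model

def predict_proba(model, x):
    """Return {label: posterior probability} for one feature vector."""
    log_joint = {}
    for label, (log_prior, mean, var) in model.items():
        log_lik = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                      for xi, m, v in zip(x, mean, var))
        log_joint[label] = log_prior + log_lik
    top = max(log_joint.values())  # subtract the max for numerical stability
    scores = {k: math.exp(v - top) for k, v in log_joint.items()}
    total = sum(scores.values())
    return {k: s / total for k, s in scores.items()}

def classify(model, x):
    """Return the most probable label for one feature vector."""
    proba = predict_proba(model, x)
    return max(proba, key=proba.get)

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Move confidently predicted unlabeled items into the training set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = fit_gnb(X_lab, y_lab)
    for _ in range(max_rounds):
        confident, rest = [], []
        for x in pool:
            proba = predict_proba(model, x)
            label = max(proba, key=proba.get)
            if proba[label] >= threshold:
                confident.append((x, label))
            else:
                rest.append(x)
        if not confident:  # nothing new to learn from, stop early
            break
        X_lab.extend(x for x, _ in confident)
        y_lab.extend(label for _, label in confident)
        pool = rest
        model = fit_gnb(X_lab, y_lab)  # retrain on the enlarged labeled set
    return model

# Hypothetical toy embeddings; real ones would have hundreds of dimensions.
embeddings = {
    "gol": [1.0, 0.1], "liga": [0.9, 0.0], "pemain": [0.8, 0.2],
    "saham": [0.0, 1.0], "bank": [0.1, 0.9], "rupiah": [0.2, 0.8],
}
labeled = [(["gol", "liga"], "sport"), (["pemain", "gol"], "sport"),
           (["saham", "bank"], "economy"), (["rupiah", "saham"], "economy")]
unlabeled = [["liga", "pemain"], ["bank", "rupiah"]]

X_lab = [doc_vector(tokens, embeddings, 2) for tokens, _ in labeled]
y_lab = [label for _, label in labeled]
X_unlab = [doc_vector(tokens, embeddings, 2) for tokens in unlabeled]
model = self_train(X_lab, y_lab, X_unlab)
print(classify(model, doc_vector(["gol", "pemain", "liga"], embeddings, 2)))
```

A higher threshold admits fewer pseudo-labeled documents per round but lowers the risk of reinforcing early mistakes, which is the usual trade-off in self-training.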

Property: Property Value
Publisher ID: gdlhub
Organization: JNTETI
Contact Name: Herti Yani, S.Kom
Address: Jln. Jenderal Sudirman
City: Jambi
Region: Jambi
Country: Indonesia
Telephone: 0741-35095
Fax: 0741-35093
Administrator E-mail: elibrarystikom@gmail.com
CKO E-mail: elibrarystikom@gmail.com

Contributors...

  • Editor: sukadi

Download...

  • Download is for members only.

    File : 418-701-1-SM.pdf (1455318 bytes)