STIKOM DB Digital Library

Source collection: International Journals, King Saud University, 2020, Volume 32, Issue 5 (June)

Classifying protein-protein interaction articles from biomedical literature using many relevant features and context-free grammar

Journal from gdlhub / 2021-08-24 11:54:42
By: Sabenabanu Abdulkadhar, Gurusamy Murugesan, Jeyakumar Natarajan, King Saud University
Created: 2021-08-04, with 0 files

Keyword : Article classification task, Protein-protein interaction, Named entity recognition, Boosting classifier, Latent semantic analysis, Context free grammar
Url : http://www.sciencedirect.com/science/article/pii/S1319157817301829
Document source: Web

Detecting articles that describe protein-protein interactions (PPI) is a significant step in biological information extraction. In this paper, we present a hybrid text classification (TC) method to identify protein-protein interaction articles. Our methodology comprises four modules: i) feature extraction, ii) semantic-similarity-based feature selection, iii) ensemble learning, and iv) context-free grammar (CFG) based post-processing to classify PPI-relevant articles. In the first module, we extracted many linguistic and domain-specific features, such as protein names and interaction cues, to classify the documents. The second module used similarity-based feature selection to retain the most relevant features. In the third module, we employed AdaBoost-based ensemble learning to improve the performance of weak learning classifiers. The final module incorporates CFG-based pattern matching to correct errors made by the classifiers. Our hybrid TC method was trained and tested on the BioCreative III corpus, on which it attained a precision of 0.5813 and a recall of 0.6582. The overall F-score of the system was 0.6228, and our hybrid approach, combining the ensemble classifier with CFG post-processing, outperforms most state-of-the-art systems.
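
For readers who want to relate the three reported figures, the following is a minimal sketch of the standard F-measure, assuming the balanced (beta = 1) harmonic mean of precision and recall. The function name `f_measure` and the `beta` parameter are illustrative choices, not taken from the paper, and the abstract does not state which averaging scheme produced the reported F-score.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta = 1 gives the harmonic mean (F1)."""
    # Assumption: a score of 0 is returned when either input is non-positive,
    # which avoids dividing by zero.
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall exactly as reported in the abstract.
reported_precision = 0.5813
reported_recall = 0.6582

f1 = f_measure(reported_precision, reported_recall)
print(round(f1, 4))
```

Under this assumption the result is about 0.6174, which is close to but not identical to the reported 0.6228, so the paper's figure may come from a different averaging or evaluation setup that the abstract does not describe.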

Property	Value
Publisher ID	gdlhub
Organization	King Saud University
Contact Name	Herti Yani, S.Kom
Address	Jln. Jenderal Sudirman
City	Jambi
Region	Jambi
Country	Indonesia
Telephone	0741-35095
Fax	0741-35093
Administrator E-mail	elibrarystikom@gmail.com
CKO E-mail	elibrarystikom@gmail.com

Contributors...

Editor: Calvin
