Path: Top -> Journal -> International Journal -> King Saud University -> 2020 -> Volume 32, Issue 4, May

An improved sine cosine algorithm to select features for text categorization

Journal from gdlhub / 2021-08-24 11:54:33
By : Mouhoub Belazzoug, Mohamed Touahria, Farid Nouioua, Mohammed Brahimi, King Saud University
Created : 2021-08-03, with 0 files

Keyword : Text categorization, Information gain, Feature subset selection, Wrapper methods, Improved sine cosine algorithm
Url : http://www.sciencedirect.com/science/article/pii/S1319157819301958
Document source : Web

The bag-of-words model is commonly used for text categorization. Its main drawback is the large number of features involved, which degrades the performance of the categorization task. Feature selection is therefore necessary: by reducing the dimensionality of the problem, it cuts computation time and improves categorization performance. In this paper, we propose an improved version of the original Sine Cosine Algorithm (SCA) for feature selection that explores the search space more thoroughly. Unlike SCA, which uses only the best solution to generate a new solution, the proposed algorithm (ISCA) takes two positions into account: (i) the position of the best solution found so far, and (ii) a random position in the search space. This combination yields a simple algorithm that avoids premature convergence and obtains very satisfactory performance. To validate ISCA, we carried out a series of experiments on nine text collections, comparing the results with several search algorithms, including the original SCA, some of its improved versions, and the Moth-Flame Optimizer (MFO). The Genetic Algorithm (GA) and Ant Colony Optimization (ACO) were also selected from the state of the art for the comparative study. The evaluation results demonstrate the high performance of the proposed ISCA, which makes it very useful for text categorization problems.
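
The abstract describes the two-position idea only in words, so the following is a minimal sketch under stated assumptions, not the paper's method. `sca_step` follows the widely published original SCA update, X' = X + r1 * sin(r2) * |r3 * P - X| (or cos when r4 >= 0.5), with r1 decreasing linearly from `a` to 0. `two_guide_step`, the sign-based binarisation in `to_mask`, the search bounds of [-1, 1], and `toy_fitness` (a stand-in for the classifier accuracy a wrapper method would measure) are all hypothetical choices made for illustration.

```python
import math
import random


def sca_step(x, best, r1, rng):
    """One original-SCA update of a single agent toward the best position."""
    new_x = []
    for xj, bj in zip(x, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)
        r4 = rng.random()
        trig = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        new_x.append(xj + r1 * trig * abs(r3 * bj - xj))
    return new_x


def two_guide_step(x, best, r1, rng, lo=-1.0, hi=1.0):
    """ASSUMED variant: combine a pull toward the best position with a pull
    toward a random position. Illustrative only, not the paper's equation."""
    new_x = []
    for xj, bj in zip(x, best):
        rand_j = rng.uniform(lo, hi)  # random point of the search space
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)
        step = (r1 * math.sin(r2) * abs(r3 * bj - xj)
                + r1 * math.cos(r2) * abs(r3 * rand_j - xj))
        new_x.append(min(hi, max(lo, xj + step)))  # clamp to the bounds
    return new_x


def to_mask(x):
    """ASSUMED binarisation: keep feature j when its coordinate is positive."""
    return [xj > 0.0 for xj in x]


def toy_fitness(mask, relevant):
    """Hypothetical stand-in for wrapper accuracy: reward relevant features
    that are kept and lightly penalise the size of the subset."""
    hits = sum(1 for j in relevant if mask[j])
    return hits - 0.1 * sum(mask)


def search(n_features, relevant, n_agents=10, n_iter=50, a=2.0, seed=0):
    """Run the assumed two-guide search and return (feature mask, fitness)."""
    rng = random.Random(seed)
    agents = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
              for _ in range(n_agents)]
    best = max(agents, key=lambda x: toy_fitness(to_mask(x), relevant))[:]
    best_fit = toy_fitness(to_mask(best), relevant)
    for t in range(n_iter):
        r1 = a - t * a / n_iter  # linearly decreasing, as in original SCA
        for i in range(n_agents):
            agents[i] = two_guide_step(agents[i], best, r1, rng)
            fit = toy_fitness(to_mask(agents[i]), relevant)
            if fit > best_fit:
                best, best_fit = agents[i][:], fit
    return to_mask(best), best_fit


if __name__ == "__main__":
    mask, fit = search(n_features=20, relevant=[0, 3, 7])
    print([j for j, keep in enumerate(mask) if keep], fit)
```

In a real wrapper setting, `toy_fitness` would be replaced by training and scoring a text classifier on the selected bag-of-words columns; the random-position term is what keeps agents from collapsing onto the current best too early.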



Property : Property value
Publisher ID : gdlhub
Organization : King Saud University
Contact name : Herti Yani, S.Kom
Address : Jln. Jenderal Sudirman
City : Jambi
Region : Jambi
Country : Indonesia
Telephone : 0741-35095
Fax : 0741-35093
Administrator e-mail : elibrarystikom@gmail.com
CKO e-mail : elibrarystikom@gmail.com


Contributors :

  • Editor: Calvin