Source: International Journal, King Saud University, Volume 31, Issue 1, January 2019

Arabic Web page clustering: A review

Journal from gdlhub / 2020-04-08 09:46:07
By : Hanan M. Alghamdi, Ali Selamat, King Saud University
Created : 2019-01-08, with 1 file

Keyword : Feature selection, Feature reduction, K-means, Review, Text clustering, Arabic Web page
Url : http://www.sciencedirect.com/science/article/pii/S1319157817300290
Document Source : WEB

Clustering groups Web pages containing related information into clusters, which makes relevant information easier to locate. Clustering performance depends largely on the characteristics of the text features. The Arabic language has a complex morphology and is highly inflected, so selecting appropriate features improves clustering performance. Many studies have addressed the clustering problem in Web pages with Arabic content. There are three main challenges in applying text clustering to Arabic Web page content. The first challenge is identifying significant term features that represent the original content while taking its hidden knowledge into account. The second challenge is reducing data dimensionality without losing essential information. The third challenge is designing a suitable model for clustering Arabic text that is capable of improving clustering performance. This paper presents an overview of existing Arabic Web page clustering methods, with the goals of clarifying existing problems and examining feature selection and reduction techniques for solving clustering difficulties. In line with the objectives and scope of this study, the present research is a joint effort to improve feature selection and vectorization frameworks in order to enhance current text analysis techniques that can be applied to Arabic Web pages.
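
To illustrate the pipeline the abstract refers to (feature selection, vectorization, then K-means), here is a minimal sketch in plain Python. It is not the authors' method. The function names, the document-frequency threshold min_df, the farthest-point initialization, and the four toy Arabic documents are all assumptions made for this example, and tokens are split on whitespace with no stemming or other morphological processing.

```python
import math
from collections import Counter


def tfidf_vectors(docs, min_df=2):
    """Build L2-normalized TF-IDF vectors, keeping terms found in >= min_df documents."""
    tokenized = [doc.split() for doc in docs]  # whitespace tokens; no stemming (assumption)
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Feature reduction: drop rare terms to shrink the vector space.
    vocab = sorted(tok for tok, count in df.items() if count >= min_df)
    n_docs = len(docs)
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid dividing by zero
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy corpus: two sports pages and two economy pages.
docs = [
    "كرة القدم مباراة فريق",
    "فريق كرة مباراة لاعب",
    "اقتصاد سوق بنك استثمار",
    "بنك سوق اقتصاد أسهم",
]
vocab, vectors = tfidf_vectors(docs, min_df=2)
labels = kmeans(vectors, k=2)
print(len(vocab), labels)  # 6 [0, 0, 1, 1]
```

With min_df=2 the ten distinct terms shrink to six, and the two sports pages and two economy pages land in separate clusters. A real system for Arabic text would likely add normalization and stemming before vectorization, which this sketch leaves out.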
