Arabic Text Categorization: A Comparative Study of Different Representation Modes

2010
Journal from gdlhub / 2017-08-14 11:52:32
By : Zakaria Elberrichi and Karima Abidi, STIKOM Dinamika Bangsa Jambi
Created : 2012-06-23, with 1 file

Keywords : Categorization, Arabic texts, Arabic WordNet, bag of words, N-grams, and concepts
Subject : Arabic Text Categorization: A Comparative Study of Different Representation Modes
Url : http://www.ccis2k.org/iajit/PDF/vol.9,no.5/2983-10.pdf
Document source : Internet

The quantity of information accessible on the Internet is phenomenal, and its categorization remains one of the most
important problems. Most current work focuses on English, understandably, since it is the dominant language of the Web.
However, the Web is becoming more multilingual every day, so a need arises for other languages, and the need is
especially pressing for Arabic. Our research addresses the categorization of Arabic texts; its originality lies in the use of
a conceptual representation of the text, for which we use Arabic WordNet (AWN) as a lexical and semantic resource. To
assess its effect, we include it in a comparative study with the other usual modes of representation (bag of words and
N-grams), and we use the K-NN learning scheme with different similarity measures. The results show the advantages of
this representation over the more conventional methods and indicate that adding the semantic dimension is one of the
most promising directions for the automatic categorization of Arabic texts.
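
As a rough illustration of the comparison described in the abstract, the sketch below classifies a short text with K-NN under two of the representation modes named there (bag of words and character N-grams) and two similarity measures. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which similarity measures were used, so cosine and Jaccard are assumed here; the toy training texts and all function names are invented for illustration; and the concept-based representation is omitted because it depends on the Arabic WordNet resource.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as word counts (whitespace tokenization only)."""
    return Counter(text.split())


def char_ngrams(text, n=3):
    """Represent a text as counts of overlapping character n-grams."""
    normalized = " ".join(text.split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Jaccard similarity on the sets of features present (assumed measure)."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0


def knn_classify(text, training, represent, similarity, k=3):
    """Label a text by majority vote among its k most similar training texts."""
    query = represent(text)
    scored = sorted(
        ((similarity(query, represent(doc)), label) for doc, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy training set, invented for illustration (not the paper's corpus).
TRAINING = [
    ("كرة القدم مباراة فريق", "sport"),
    ("فريق لاعب هدف مباراة", "sport"),
    ("اقتصاد سوق بنك مال", "economy"),
    ("بنك استثمار سوق اسهم", "economy"),
]

if __name__ == "__main__":
    # Same classifier, different representation and similarity measure.
    print(knn_classify("مباراة كرة فريق", TRAINING, bag_of_words, cosine))
    print(knn_classify("سوق بنك", TRAINING, char_ngrams, jaccard))
```

Because the representation and the similarity measure are passed in as functions, a concept-based representation (for example, one that maps each word to an AWN concept identifier before counting) could be swapped in without changing the classifier itself.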

Property : Value
Publisher ID : gdlhub
Organization : STIKOM Dinamika Bangsa Jambi
Contact Name : Herti Yani, S.Kom
Address : Jln. Jenderal Sudirman
City : Jambi
Region : Jambi
Country : Indonesia
Telephone : 0741-35095
Fax : 0741-35093
Administrator E-mail : elibrarystikom@gmail.com
CKO E-mail : elibrarystikom@gmail.com

Contributors :

  • Editor: fachruddin
