Web classification using extraction and machine learning techniques

3 Author(s)
Yusuf, L.M. (Fac. of Comput. Sci. & Inf. Syst., Univ. Teknol. Malaysia, Skudai, Malaysia); Othman, M.S.; Salim, J.

Easier access to internet services has contributed to a drastic increase in the number of web pages. This growth makes it harder for internet users to retrieve the latest, most relevant, and highest-quality web information, because the enormous volume of web content causes problems in restructuring that information. Web document classification is therefore necessary to ensure that current, relevant, high-quality web information remains optimally retrievable. This paper discusses the results of classifying web documents using extraction and machine learning techniques. Four kernels, namely the Radial Basis Function (RBF), linear, polynomial, and sigmoid kernels, are applied to test the accuracy of the classification. The results show that classification accuracy increases as more web documents are used. The results also show that the linear kernel performs best for web document classification compared with the RBF, polynomial, and sigmoid kernels.
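For readers unfamiliar with the four kernels named in the abstract, the sketch below writes them out in their commonly used textbook forms. It is an illustration only: the abstract does not give the classifier, the kernel parameters, or the feature representation, so the parameter names (`gamma`, `coef0`, `degree`), their default values, and the toy term-count vectors are all assumptions, not the authors' settings.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def dot(x: Vector, y: Vector) -> float:
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x: Vector, y: Vector) -> float:
    """Linear kernel: K(x, y) = x . y"""
    return dot(x, y)


def polynomial_kernel(x: Vector, y: Vector, gamma: float = 1.0,
                      coef0: float = 0.0, degree: int = 3) -> float:
    """Polynomial kernel: K(x, y) = (gamma * x . y + coef0) ** degree"""
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x: Vector, y: Vector, gamma: float = 1.0,
                   coef0: float = 0.0) -> float:
    """Sigmoid kernel: K(x, y) = tanh(gamma * x . y + coef0)"""
    return math.tanh(gamma * dot(x, y) + coef0)


# Hypothetical registry so the four kernels can be compared side by side.
KERNELS: Dict[str, Callable[[Vector, Vector], float]] = {
    "linear": linear_kernel,
    "polynomial": polynomial_kernel,
    "rbf": rbf_kernel,
    "sigmoid": sigmoid_kernel,
}

if __name__ == "__main__":
    # Toy term-count vectors standing in for two extracted web documents.
    doc_a = [1.0, 0.0, 2.0]
    doc_b = [0.0, 1.0, 2.0]
    for name, kernel in KERNELS.items():
        print(name, kernel(doc_a, doc_b))
```

Each function returns a similarity score for a pair of document vectors; a kernel-based classifier would use such scores in place of raw inner products. Which kernel works best depends on the data, which is the comparison the paper reports.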

Published in:

Information Technology (ITSim), 2010 International Symposium in (Volume: 2)

Date of Conference:

15-17 June 2010