Combination of Cluster Method for Segmentation of Web Visitors

Yuhefizar Yuhefizar, Budi Santosa, I Ketut Eddy P, Yoyon K. Suprapto

Abstract


Clustering is an important part of web usage mining for the purpose of segmenting visitors. Segmentation in turn supports web personalization and web modification. In this paper, we cluster web visitors by applying a combination of hierarchical and non-hierarchical clustering methods to web log data. The hierarchical method is used to determine the number of clusters, and the non-hierarchical method is used to form the clusters. Cluster analysis is preceded by data pre-processing and factor analysis. With this approach, the website owner can find visitors’ access patterns more effectively and gain new knowledge about visitor segmentation. A test on ITS’s web log data yielded 6 clusters of web visitors, of which cluster 3 has the largest number of members. This information can help web management focus on the behavioral patterns of cluster 3’s members when personalizing or modifying the website. The test results show the feasibility and efficiency of this method.
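
The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the linkage rule, the non-hierarchical algorithm, or the rule for reading the cluster count off the hierarchy. The sketch assumes centroid-linkage agglomeration, a "largest jump in merge distance" stopping rule, and k-means seeded with the hierarchical centroids. The ten 2-D points are made-up stand-ins for per-visitor factor scores, and the function names (`agglomerate`, `choose_k`, `kmeans`) are hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def agglomerate(points):
    """Centroid-linkage agglomerative clustering (an assumed linkage).
    Returns one (merge_distance, clusters_after_merge) pair per merge."""
    clusters = [[p] for p in points]
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
        history.append((d, [list(c) for c in clusters]))
    return history

def choose_k(history):
    """Stop just before the largest jump in merge distance (an assumed rule)."""
    dists = [d for d, _ in history]
    jumps = [dists[t + 1] - dists[t] for t in range(len(dists) - 1)]
    t = max(range(len(jumps)), key=jumps.__getitem__)
    clusters = history[t][1]
    return len(clusters), clusters

def kmeans(points, centers, iters=100):
    """Plain k-means, started from the given centers."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        new_centers = [centroid(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups

# Made-up factor scores for ten visitors (illustrative only).
visitors = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.8),
]

# Stage 1: the hierarchy suggests the number of clusters.
k, seed_clusters = choose_k(agglomerate(visitors))

# Stage 2: a non-hierarchical method forms the final clusters.
centers, groups = kmeans(visitors, [centroid(c) for c in seed_clusters])

sizes = sorted(len(g) for g in groups)
print("clusters:", k, "sizes:", sizes)
```

Seeding k-means with the hierarchical centroids is one common way to connect the two stages. On real web log data, the pre-processing and factor analysis mentioned in the abstract would come first, and a library implementation would be preferable to this quadratic-time loop.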







DOI: http://doi.org/10.12928/telkomnika.v11i1.906



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
Universitas Ahmad Dahlan, 4th Campus
Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191
Phone: +62 (274) 563515, 511830, 379418, 371120
Fax: +62 274 564604
