Web usage mining inspects the navigation patterns recorded in web access logs and extracts previously unknown, useful information. This information can inform strategies for various web-oriented applications, such as website restructuring, recommender systems, and web page prediction. The current work clusters user sessions of uneven lengths to discover access patterns, and proposes a distance measure for grouping those sessions. The proposed hybrid distance measure uses access path information to compute the distance between any two sessions without altering the order in which web pages are visited. The R² statistic is used to decide the number of clusters to construct. The Jaccard index and the Davies–Bouldin validity index are employed to assess the resulting clustering. The results obtained with these two standard statistical measures are encouraging and illustrate the goodness of the clusters created.
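The abstract does not define the hybrid distance measure, so the sketch below is only an illustration of the general idea of comparing sessions of uneven length while preserving page order. It substitutes a longest-common-subsequence (LCS) distance, which is an assumption and not the measure proposed in the paper; the function names and the sample sessions are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two page sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def session_distance(a, b):
    """Order-aware distance in [0, 1]; 0 means identical access paths.

    Stand-in for illustration only, not the paper's hybrid measure.
    """
    if not a and not b:
        return 0.0
    return 1.0 - lcs_length(a, b) / max(len(a), len(b))


# Hypothetical sessions of different lengths.
s1 = ["home", "products", "cart", "checkout"]
s2 = ["home", "cart", "checkout"]
s3 = ["checkout", "cart", "home"]

print(session_distance(s1, s2))  # same order, one page skipped
print(session_distance(s2, s3))  # same pages, reversed order
```

Because the LCS respects visit order, the reversed session s3 ends up farther from s2 than s1 does, even though s2 and s3 contain exactly the same pages. A pairwise matrix of such distances could then feed any clustering method that accepts precomputed distances.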
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Human-Computer Interaction
- Information Systems
- Media Technology