Analysis of Web Page Links and Web Content Similarity Using HITS Algorithm
X. Leela Mary1, G. Silambarasan2
1X. Leela Mary, M.Phil Scholar, Department of Computer Science, Annai Vailankanni of Arts & Science College, Thanjavur (Tamil Nadu), India.
2G. Silambarasan, Assistant Professor, Department of Computer Science, Annai Vailankanni of Arts & Science College, Thanjavur (Tamil Nadu), India.
Manuscript received on June 08, 2017. | Revised Manuscript received on June 09, 2017. | Manuscript published on June 25, 2017. | PP: 7-9 | Volume-4 Issue-11, June 2017. | Retrieval Number: K11900641117
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: HITS (Hyperlink-Induced Topic Search) is a classical link analysis algorithm used in Web Structure Mining (WSM). The algorithm takes the structural information of links into account but ignores the correlation between pages and topics. In some cases this causes “topic drift”, a deviation between the search results and the query topic. To address this, the paper presents an improved algorithm that takes into account both web content similarity and link analysis. Our experiment shows that the improved algorithm enhances the correlation between search results and the query topic and limits the occurrence of topic drift to some degree.
Keywords: HITS algorithm; Web content similarity; Authority page; Hub page
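The idea described in the abstract, combining link analysis with content similarity, might be sketched as follows. This is only an illustrative sketch, not the authors' method: the abstract does not say how similarity enters the computation, so the code assumes that each link p → q is weighted by the cosine similarity between the query and the text of the target page q. The function names (`cosine_similarity`, `weighted_hits`), the whitespace tokenization, and the toy pages are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def _normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {p: v / norm for p, v in scores.items()} if norm else scores


def weighted_hits(links, relevance=None, iterations=50):
    """Iterate hub and authority scores over a link graph.

    links: dict mapping each page to the list of pages it links to.
    relevance: optional dict mapping page -> similarity to the query.
        ASSUMPTION: a link p -> q is weighted by relevance[q].
        With relevance=None every link has weight 1 (classical HITS).
    """
    pages = set(links) | {q for targets in links.values() for q in targets}

    def weight(q):
        return 1.0 if relevance is None else relevance.get(q, 0.0)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_auth[q] += weight(q) * hub[p]
        auth = _normalize(new_auth)
        # Hub update: weighted sum of authority scores of pages linked to.
        new_hub = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_hub[p] += weight(q) * auth[q]
        hub = _normalize(new_hub)
    return hub, auth


# Toy example (hypothetical pages): the query is about cars, but the
# page about the animal has more in-links.
query = "jaguar car"
pages_text = {
    "hub": "jaguar car reviews and car dealers",
    "car_page": "jaguar car models and engine specs",
    "cat_page": "jaguar big cat habitat in the rainforest",
    "zoo": "rainforest wildlife and big cat habitat",
}
links = {"hub": ["car_page", "cat_page"], "zoo": ["cat_page"]}

relevance = {p: cosine_similarity(query, t) for p, t in pages_text.items()}
_, auth_plain = weighted_hits(links)
_, auth_weighted = weighted_hits(links, relevance)
```

In this toy graph, link structure alone ranks `cat_page` above `car_page` as an authority because it has more in-links, which is a small instance of topic drift. Under the assumed weighting, `car_page` is ranked higher because its text is closer to the query.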