webranking presentation (revised version).ppt


Introduction

- Early search engines mainly compared the content similarity of the query and the indexed pages, i.e., they used information retrieval methods such as cosine similarity, TF-IDF, ...
- Starting around 1996, it became clear that content similarity alone was no longer sufficient.
  - The number of pages grew rapidly in the mid-to-late 1990s. For example, for the query "classification technique", Google estimates 10 million relevant pages. How can a search engine choose only 30-40 of these pages and rank them suitably to present to the user?
  - Content similarity is easily spammed. A page owner can repeat some words and add many related words to boost the rankings of his
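
The content-similarity ranking described above, and the way repeated words can game it, can be sketched roughly as follows. This is only an illustration under stated assumptions: the three toy pages, the whitespace tokenizer, the smoothed IDF formula, and all function names (`tokenize`, `idf_table`, `tfidf_vector`, `cosine`, `rank`) are hypothetical and do not come from the slides or from any particular search engine.

```python
import math
from collections import Counter

# Hypothetical toy pages (not from the slides): an honest page, a
# keyword-stuffed page, and an unrelated page.
PAGES = {
    "honest": "classification technique for text documents using decision trees",
    "stuffed": "classification technique classification technique "
               "classification technique cheap pills",
    "unrelated": "pasta recipes and quick dinner ideas",
}
QUERY = "classification technique"


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def idf_table(token_lists):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0
            for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector: raw term count times IDF, for known terms only."""
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, pages):
    """Return (page name, score) pairs sorted by similarity to the query."""
    tokenized = {name: tokenize(text) for name, text in pages.items()}
    idf = idf_table(list(tokenized.values()))
    query_vec = tfidf_vector(tokenize(query), idf)
    scores = {name: cosine(query_vec, tfidf_vector(tokens, idf))
              for name, tokens in tokenized.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank(QUERY, PAGES)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On this toy data the keyword-stuffed page scores higher than the honest page, because repeating the query words pulls its vector toward the query direction. That is the weakness of relying on content similarity alone.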
