Information Retrieval 6: tf-idf Overview 1 (.ppt)

  • About 11,400 characters
  • About 47 pages
  • Published 2017-07-09 from Hubei

College of Computer and Communication, Hunan University. Liu Yufeng

Review
  • 1. Chinese word segmentation
  • 2. Dictionary compression
  • 3. Postings list compression
  • 4. tf-idf

Scoring documents
  • How do we construct an index?
  • What strategies can we use with limited main memory?

Scoring
  • We wish to return, in order, the documents most likely to be useful to the searcher
  • How can we rank-order the docs in the corpus with respect to a query?
  • Assign a score, say in [0,1], to each doc for each query
  • Begin with a perfect world: no spammers
  • Nobody stuffing keywords into a doc to make it match queries
  • More on "adversarial IR" under web search

Linear zone combinations
  • First generation of scoring methods: use a linear com
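
The preview breaks off partway through the "Linear zone combinations" slide, so the exact formulation used in the deck is not visible. As a minimal sketch of the general idea, the Python below assumes the common textbook form: each document zone gets a weight, the weights sum to 1, and a document's score is the sum of the weights of the zones that contain every query term. This keeps the score in [0,1], as the Scoring slide asks. The zone names, the weights, and the functions `weighted_zone_score` and `rank` are hypothetical and chosen only for illustration.

```python
def weighted_zone_score(doc_zones, query_terms, weights):
    """Sum the weights of the zones that contain every query term.

    Assumption (not from the slides): the zone weights sum to 1, so the
    returned score always lies in [0, 1].
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("zone weights must sum to 1")
    score = 0.0
    for zone, weight in weights.items():
        # Naive whitespace tokenisation; a real system would use its index.
        tokens = set(doc_zones.get(zone, "").lower().split())
        if all(term.lower() in tokens for term in query_terms):
            score += weight
    return score


def rank(docs, query_terms, weights):
    """Return (score, doc_id) pairs, highest score first, ties by doc_id."""
    scored = [
        (weighted_zone_score(zones, query_terms, weights), doc_id)
        for doc_id, zones in docs.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical zones and weights, for illustration only.
WEIGHTS = {"title": 0.5, "body": 0.3, "author": 0.2}
DOCS = {
    "d1": {"title": "tf idf scoring", "body": "ranking documents by tf idf weights", "author": "liu"},
    "d2": {"title": "index construction", "body": "tf idf mentioned here", "author": "liu"},
}

if __name__ == "__main__":
    for score, doc_id in rank(DOCS, ["tf", "idf"], WEIGHTS):
        print(doc_id, round(score, 2))
```

With these made-up weights, a document that matches the query in its title ranks above one that matches only in its body. Choosing the weights well is the hard part, and this sketch does not address it.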


