On-line Index Maintenance Explained

  • About 4,950 words
  • About 18 pages
  • Published 2017-06-17 (Hubei)
On-line Index Maintenance

Introduction
  • A list of terms, together with their postings lists, comprises the inverted index.
  • A term's postings list is a sequence of nodes, where each node contains the docID and a list of the positions of the term in that document.
  • Two approaches: off-line indexing and on-line indexing.

Off-line Indexing Process
When the collection fits in main memory:
  • Tokenize the input documents, forming a list of (term, doc) pairs.
  • Sort the list lexicographically by term.
  • Merge all pairs that have the same term to form a list of docs.
When the collection is too large to process in main memory:
  • Split it into smaller collections
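
The in-memory off-line process (tokenize, sort, merge) can be sketched in a few lines of Python. This is only an illustrative sketch, not the method from the slides: the function names `tokenize` and `build_positional_index`, the whitespace tokenizer, and the choice to carry token positions alongside each (term, doc) pair are all assumptions made here so that the result matches the postings-list description (docID plus a list of positions).

```python
from itertools import groupby


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for a real tokenizer.
    return text.lower().split()


def build_positional_index(docs):
    """Build an inverted index from docs, a dict mapping docID -> text.

    Returns a dict mapping term -> postings list, where each postings
    node is a (docID, [positions]) tuple.
    """
    # Step 1: tokenize, forming (term, docID, position) triples.
    triples = [
        (term, doc_id, pos)
        for doc_id, text in docs.items()
        for pos, term in enumerate(tokenize(text))
    ]

    # Step 2: sort lexicographically by term (then by docID and position).
    triples.sort()

    # Step 3: merge all entries that share a term into one postings list.
    index = {}
    for term, term_group in groupby(triples, key=lambda t: t[0]):
        postings = []
        for doc_id, doc_group in groupby(term_group, key=lambda t: t[1]):
            postings.append((doc_id, [pos for _, _, pos in doc_group]))
        index[term] = postings
    return index
```

Because the triples are sorted before merging, every postings list comes out ordered by docID and every position list is ascending. The whole list of triples must fit in memory, which is the limit that motivates splitting a large collection into smaller ones.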
