- Posted 2017-04-04, Jiangsu
Exact Set Matching
Charles Yan, 2008

Goal
To find all occurrences in a text T of any pattern in a set of patterns P = {p1, p2, …, pz}.
n: the total length of all the patterns in P.
m: the length of T.
k: the number of occurrences in T of the patterns from P.
O(n + zm) vs. O(n + m + k): matching each of the z patterns separately with a linear-time method costs O(n + zm); the target for the set as a whole is O(n + m + k).

Keyword Tree
The keyword tree for a set P is a rooted directed tree K satisfying three conditions:
(1) each edge is labeled with one character;
(2) any two edges out of the same node have distinct labels; and
(3) every pattern pi in P maps to some node v of K such that the characters on the path from the root of K to v exactly spell out pi, and every leaf of K is mapped to by some pattern in P.

Keyword Tree: example
P = {potato, poetry, pottery, science, school}

Keyword Tree: construction
K1: the tree that includes only pattern p1.
Ki: the tree that includes patterns p1, …, pi.
Assuming a fixed-size alphabet, constructing Ki by adding pi to Ki-1 costs O(|pi|).
Thus the total construction time is O(n).

Keyword Tree: naive use for exact set matching
Start from each position l in T and follow the unique path from the root r of K that matches a substring of T starting at l.
Time: O(mn).

Keyword Tree: the dictionary problem
To find whether an input word is contained in the dictionary.
The words in the dictionary (P) are encoded in a keyword tree.
The problem reduces to whether the input word (T) completely matches some pattern in P.
This requires that the set of patterns be known in advance.

Keyword Tree: speeding up exact set matching
(1) Shift the tree by more than one position at a time.
(2) Skip comparisons that have already been made in previous steps.

Failure Link
v: a node in the keyword tree K.
L(v): the label of v, that is, the concatenation of the characters on the path from the root to v.
lp(v): the length of the longest proper suffix of the string L(v) that is a prefix of some pattern in P. Let this substring be a.
Lemma. There is a unique node in the keyword tree that is labeled by the string a. Let this node be nv.
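
The definitions above can be tried out with a small Python sketch. It is an illustration under stated assumptions, not code from the slides: the names `Node`, `build_keyword_tree`, `naive_set_match`, `find_node` and `failure_target` are hypothetical, children are kept in a dict rather than a fixed-size alphabet array, and `failure_target` computes lp(v) and nv directly from the definition by trying proper suffixes of L(v) from longest to shortest, which is simple but not the efficient way to obtain failure links. The second pattern set is chosen only so that lp(v) is nonzero for some node.

```python
class Node:
    def __init__(self):
        self.children = {}    # edge label (one character) -> child Node
        self.pattern = None   # the pattern that ends at this node, if any


def build_keyword_tree(patterns):
    """Build K by adding the patterns one at a time; adding p costs O(|p|)."""
    root = Node()
    for p in patterns:
        node = root
        for ch in p:
            # Reuse the existing edge labeled ch, or create a new one.
            node = node.children.setdefault(ch, Node())
        node.pattern = p
    return root


def naive_set_match(root, text):
    """Naive use of K: restart from the root at every position l of text."""
    hits = []
    for l in range(len(text)):
        node = root
        for i in range(l, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break                      # no path continues; try l + 1
            if node.pattern is not None:
                hits.append((l, node.pattern))
    return hits


def find_node(root, s):
    """Return the node labeled s, or None if s is not a prefix of any pattern."""
    node = root
    for ch in s:
        node = node.children.get(ch)
        if node is None:
            return None
    return node


def failure_target(root, label):
    """Return (lp(v), n_v) for the node v with L(v) == label, by definition."""
    for start in range(1, len(label) + 1):   # proper suffixes, longest first
        suffix = label[start:]
        node = find_node(root, suffix)
        if node is not None:                 # suffix is a prefix of a pattern
            return len(suffix), node
    return 0, root                           # only reached for the root itself


if __name__ == "__main__":
    K = build_keyword_tree(["potato", "poetry", "pottery", "science", "school"])
    print(naive_set_match(K, "sciencepotato"))  # [(0, 'science'), (7, 'potato')]

    # Assumed second pattern set, used only to show a nonzero lp(v).
    K2 = build_keyword_tree(["potato", "tattoo", "theater", "other"])
    lp, nv = failure_target(K2, "potat")
    print(lp)                                   # 3, because "tat" starts "tattoo"
```

Because a string is a prefix of some pattern exactly when a path from the root spells it out, `find_node` either fails or returns the single node labeled by that string, which is the content of the lemma.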