PODS 2000, 5/17/00, ATT Labs

Selectivity Estimation For Boolean Queries
Zhiyuan Chen (Speaker), Flip Korn, Nick Koudas, S. Muthukrishnan

Motivation (2)
- Use estimates for:
  - Query optimization: best filtering order.
  - Interactive query refinement.
- Hard to write a query having 1-20 answers.
- Ranking approach does not always work and expensive.
- Estimate - refine query - ... - exact answer

Outline
- Problem Definition
- Related work
- Our approach
- Experiments
- Conclusions

Problem Definition
- Substring predicate: true for a string $s$ iff $s$ contains the predicate's pattern as a substring.
- Boolean queries: substring predicates concatenated with AND, OR, NOT.
- For a string set $S$ and a Boolean query $q$: selectivity $P(q)$ is the fraction of strings in $S$ that satisfy $q$.

Related Work (1)
- Histograms: not suitable for strings.
- Selectivity of adjacent substrings often differs a lot!

Related Work (2)
- Existing work for substring queries.
- Conjunction-only queries: KVI96, WVI97, JNS99, JKNS99

Use Previous Approach?

Set-Oriented Approach - Store Correlation Implicitly

Set-Hashing Approach
- A Monte Carlo technique (Cohen94, Broder98)

Implementation Issues
- Approximate permutations: use a set of independent hash functions. Pick the minimal hash images as signature components: $\mathrm{Sig}(A) = \min\{h(x) \mid x \in A\}$.
- Signature of unions: $\mathrm{Sig}(\text{peanut} \lor \text{butter})$ is the pair-wise min of $\mathrm{Sig}(p)$ and $\mathrm{Sig}(b)$.

Algorithm Outline - No Pruning
- With negations:
  - $|(\text{peanut} \lor \text{butter}) \land \lnot \text{sandwich}|$
  - $= |(p \land \lnot s) \lor (b \land \lnot s)|$ (convert to DNF)
  - $= |p \land \lnot s| + |b \land \lnot s| - |p \land b \land \lnot s|$ (eliminate disjunction by set inclusion-exclusion)
  - $= |p| - |p \land s| + |b| - |b \land s| - |p \land b| + |p \land b \land s|$ (eliminate negations)

Complexity
- Theorem:
  - Preprocessing (building tree and signatures): O(signature length * database size) time and space.
  - Online estimate: $O(2^{O(L)})$, where $L$ is the query length.
- Online time only related to query length.
- $L$ is small in real life.
- Below 1 millisecond in experiments.

Experiments - Setup
- Data set: real ATT data - service description. 130 K strin
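
The signature rules under "Implementation Issues" and the negation-elimination steps under "Algorithm Outline" can be illustrated with a small Python sketch. It is not the speakers' implementation: the function names, the use of salted BLAKE2b as a stand-in for independent hash functions, and the five sample strings are all assumptions made for illustration. To keep the arithmetic checkable, the conjunction sizes are counted exactly from sets of string ids, whereas the approach in the slides would estimate them from the signatures.

```python
import hashlib


def make_hashes(k):
    """Build k hash functions (assumption: salted BLAKE2b stands in for
    the 'set of independent hash functions' mentioned in the slides)."""
    def make(i):
        salt = i.to_bytes(8, "little")

        def h(x):
            digest = hashlib.blake2b(
                str(x).encode(), digest_size=8, salt=salt
            ).digest()
            return int.from_bytes(digest, "little")
        return h
    return [make(i) for i in range(k)]


def signature(ids, hashes):
    """Sig(A): for each hash function, the minimal hash image over A."""
    return [min(h(x) for x in ids) for h in hashes]


def union_signature(sig_a, sig_b):
    """Signature of a union: pair-wise min of the two signatures."""
    return [min(a, b) for a, b in zip(sig_a, sig_b)]


def resemblance(sig_a, sig_b):
    """Fraction of matching components, a Monte Carlo estimate of
    |A and B| / |A or B|."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def matching_ids(strings, pattern):
    """Ids of the strings that contain pattern as a substring."""
    return {i for i, s in enumerate(strings) if pattern in s}


def count_and(*sets):
    """Size of a conjunction (exact here; estimated in the slides)."""
    return len(set.intersection(*sets))


def or_and_not_count(p, b, s):
    """|(p OR b) AND NOT s| via the final line of the derivation:
    |p| - |p^s| + |b| - |b^s| - |p^b| + |p^b^s|."""
    return (len(p) - count_and(p, s)
            + len(b) - count_and(b, s)
            - count_and(p, b) + count_and(p, b, s))


# Hypothetical sample data, not taken from the experiments in the talk.
STRINGS = [
    "peanut butter sandwich",
    "peanut brittle",
    "butter cookie",
    "ham sandwich",
    "peanut butter jar",
]
```

With `p`, `b`, and `s` built by `matching_ids(STRINGS, ...)` for "peanut", "butter", and "sandwich", `or_and_not_count(p, b, s)` agrees with counting `(p | b) - s` directly, and dividing by `len(STRINGS)` gives the selectivity. Only positive conjunctions appear in the final formula, which is why signatures that support unions and intersections are enough to handle NOT.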