Querying Websites Using Compact Skeletons
CS345: Compact Skeletons

Compact Skeletons
- Assume tuple components are scattered over the website
- We have a tagger that can tag all tuple components on the website
- Assume no noise for now
- Goal: reconstruct the relation

Skeletons
- Labeled trees
- Transformation from data graphs to relations

Compact Skeletons
- A skeleton is compact if all overlays are consistent
- It is perfect if each node and edge of the data graph is covered by at least one overlay
- Given a data graph G, does G have a Perfect Compact Skeleton (PCS)?
  - Not always
  - But if it exists, it is unique

Partial Compact Skeletons
- For data graphs with incomplete information, we allow partial overlays
- This results in nulls in the relation
- If we can use consistent partial overlays to cover every node and edge of the graph, we have a partially perfect compact skeleton (PPCS)

Tuple subsumption
- Tuple t subsumes tuple u if t and u agree on every component of u that is not null

Noisy Data Graphs
- Real-life websites are noisy
- False positives, e.g., MS = degree, state, or Microsoft?
- Non-skeleton links, e.g., featured products

[Figure: Data graph for a retail website]

Coverage of a skeleton
[Figures: two candidate skeletons overlaid on the same data graph. Skeleton K1: coverage = 28. Skeleton K2: coverage = 12.]

Skeletons for Noisy Data Graphs
- Problem: find the skeleton K with optimal coverage, called the best-fit skeleton (BFS)
- NP-complete

Greedy Heuristic for BFS
[Figures: data graphs with their parent-count tables (columns: Label, Parent, Count) and the resulting greedy skeletons]

Weighted Greedy Heuristic
- The simple greedy heuristic uses parent counts
  - "Memory-less"
- The weighted greedy heuristic takes past selections into account to improve on the simple greedy selection
  - Computes the "benefit" of each decision at every stage
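The slides define tuple subsumption precisely but describe the simple greedy heuristic only as one that "uses parent counts". The Python sketch below shows one plausible reading of both, not the deck's own algorithm. It assumes that nulls are represented by `None`, that the data graph is a list of (parent, child) node pairs with a label per node, and that each label simply picks its most frequent parent label. The function names, the example graph, and the tie-breaking rule are all hypothetical. Note that this memory-less choice does not by itself guarantee that the result is a tree.

```python
from collections import Counter

def subsumes(t, u):
    """True if t subsumes u: t and u agree on every non-null component of u.
    Nulls are modeled as None (an assumption of this sketch)."""
    return len(t) == len(u) and all(b is None or a == b for a, b in zip(t, u))

def parent_counts(edges, label):
    """Count, for each (child label, parent label) pair, how many edges have it."""
    counts = Counter()
    for parent, child in edges:
        counts[(label[child], label[parent])] += 1
    return counts

def greedy_skeleton(edges, label):
    """Memory-less greedy choice: each label takes its most frequent parent label.
    Ties go to the alphabetically first parent label (an arbitrary choice)."""
    best = {}
    for (child_label, parent_label), n in sorted(parent_counts(edges, label).items()):
        if child_label not in best or n > best[child_label][1]:
            best[child_label] = (parent_label, n)
    return {child: parent for child, (parent, _) in best.items()}

# Hypothetical data graph: root R, one A node, one B node, three C nodes, four D nodes.
label = {"r": "R", "a1": "A", "b1": "B",
         "c1": "C", "c2": "C", "c3": "C",
         "d1": "D", "d2": "D", "d3": "D", "d4": "D"}
edges = [("r", "a1"), ("r", "b1"),
         ("a1", "c1"), ("a1", "c2"), ("b1", "c3"),
         ("c1", "d1"), ("c1", "d2"), ("c2", "d3"), ("c3", "d4")]

skeleton = greedy_skeleton(edges, label)  # maps each label to its chosen parent label
```

In this example, label C has two candidate parents (A with count 2, B with count 1), so the sketch attaches C under A and leaves the B-to-C edge uncovered. A weighted variant, as the slides describe it, would additionally score each choice against the selections already made.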