14. Evaluating Relational Operators.ppt

  1. This document has 59 pages, all of which can be viewed.
14. Evaluating Relational Operators

  - Table Statistics
  - Operators and example schema
  - Selection
  - Projection
  - Equijoins
  - General Joins
  - Set Operators
  - Buffering

Intergalactic standard reference

  - G. Graefe, Query Evaluation Techniques for Large Databases, ACM Computing Surveys, 25(2) (1993), pp. 73-170

Learning Objectives

  - In a typical major DBMS, what statistics are automatically collected, and when?
  - Given collected statistics, estimate a predicate's output size/selectivity.
  - For each relational operator, describe the major algorithms, their optimizations, their pros and cons, and their costs:
    - Selection
    - Projection
    - Equijoins
    - General Joins
    - Set Operators

Motivation/Review

  - Assume Sailors has an index on age. Does the optimal plan for this query use the index?

      SELECT *
      FROM Sailors S
      WHERE S.age 31

  - Moral: In order to choose the optimal plan, we need to know the selectivity of the predicate, that is, the size of the output.

Collecting Statistics

  - What are statistics?
    - Table sizes, index sizes, ranges of values, etc.
  - Where are statistics kept?
    - In the system catalog.
  - Why collect statistics?
    - Statistics are needed to determine the selectivity of predicates and the sizes of operator outputs.
    - Data in the previous bullet is needed to calculate the costs of plans.
    - Data in the previous bullet is needed by the optimizer to find the optimal plan.
  - How often are statistics collected?
    - Typically when 10% of the data has been updated.
    - Can be overridden manually (UPDATE STATISTICS, ANALYZE).
    - Typically done by sampling if the table is large.
  - Which tables/columns are monitored?
    - Typically all tables/columns are monitored.

Sailors Example

  - What statistics would you collect to find the number of rows satisfying Age 31? Rank = 4?
  - In Intro DB we assumed data was uniformly distributed.

Histograms

  - Histograms are more accurate than the uniformity assumption.
  - Equiwidth Histogram: Estimate number of rows/selectivity for the predicate "age 31":
  - Equidepth Histogram: Estimate number of rows/selectivity for the predicate "age 31":
  - Actua
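
The difference between the uniformity assumption and a histogram estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the comparison operator in the slides' predicate is not visible in the text, so the sketch assumes "age > 31"; the bucket boundaries and counts are made up and are not the Sailors data from the slides; and the function names are hypothetical. Values are treated as continuous and as uniformly spread inside each bucket.

```python
def uniform_selectivity(low, high, value):
    """Selectivity of `col > value`, assuming values are uniform on [low, high]."""
    if value >= high:
        return 0.0
    if value < low:
        return 1.0
    return (high - value) / (high - low)


def histogram_rows_greater(bounds, counts, value):
    """Estimated number of rows with `col > value`.

    bounds holds len(counts) + 1 bucket edges; bucket i covers
    [bounds[i], bounds[i + 1]). The same routine serves equiwidth and
    equidepth histograms because the edges are given explicitly.
    """
    total = 0.0
    for i, count in enumerate(counts):
        lo, hi = bounds[i], bounds[i + 1]
        if value <= lo:
            # The whole bucket lies above the constant.
            total += count
        elif value < hi:
            # The constant falls inside this bucket: take the upper
            # fraction, assuming uniformity within the bucket.
            total += count * (hi - value) / (hi - lo)
    return total


# Hypothetical equiwidth histogram on age: five buckets of width 10.
bounds = [0, 10, 20, 30, 40, 50]
counts = [5, 10, 20, 10, 5]
n_rows = sum(counts)

rows_hist = histogram_rows_greater(bounds, counts, 31)  # 14.0
sel_hist = rows_hist / n_rows
sel_uniform = uniform_selectivity(bounds[0], bounds[-1], 31)
```

With these made-up counts the histogram puts the selectivity at 14 of 50 rows, while the uniformity assumption over the whole range gives roughly 0.38. The gap comes from the histogram knowing that most rows sit in the middle buckets.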
