Lec 6: External Sorting
External Sorting
Jianlin Feng, School of Software, SUN YAT-SEN UNIVERSITY
(courtesy of Joe Hellerstein for some slides)

Why Sort?
- A classic problem in computer science!
- Data requested in sorted order, e.g., find students in increasing gpa order.
- First step in bulk loading a B+ tree index.
- Useful for eliminating duplicates (Why?)
- Useful for summarizing groups of tuples.
- Sort-merge join algorithm involves sorting.
- Problem: sort 100 GB of data with 1 GB of RAM. Why not virtual memory?

2-Way Sort: Requires 3 Buffers
- Pass 0: Read a page, sort it, write it.
  - Only one buffer page is used.
  - Each sorted page (or subfile) is called a run.
- Pass 1, 2, 3, …, etc.: requires 3 buffer pages.
  - Merge pairs of runs into runs twice as long.
  - Three buffer pages are used.
[Figure: Merging Runs. Disk -> main memory buffers (INPUT 1, INPUT 2, OUTPUT) -> Disk]

Two-Way External Merge Sort
- Each pass we read + write each page in file.
- N pages in the file, so the number of passes = $\lceil \log_2 N \rceil + 1$
- So total cost is: $2N \left( \lceil \log_2 N \rceil + 1 \right)$
- Idea: Divide and conquer: sort subfiles and merge.

Input file:            3,4 | 6,2 | 9,4 | 8,7 | 5,6 | 3,1 | 2
PASS 0 (1-page runs):  3,4 | 2,6 | 4,9 | 7,8 | 5,6 | 1,3 | 2
PASS 1 (2-page runs):  2,3 4,6 | 4,7 8,9 | 1,3 5,6 | 2
PASS 2 (4-page runs):  2,3 4,4 6,7 8,9 | 1,2 3,5 6
PASS 3 (8-page runs):  1,2 2,3 3,4 4,5 6,6 7,8 9

General External Merge Sort
- More than 3 buffer pages. How can we utilize them?
- To sort a file with N pages using B buffer pages:
  - Pass 0: use B buffer pages. Produce $\lceil N/B \rceil$ sorted runs of B pages each.
  - Pass 1, 2, …, etc.: merge B-1 runs.
[Figure: Merging Runs. Disk -> B main memory buffers (INPUT 1, INPUT 2, ..., INPUT B-1, OUTPUT) -> Disk]

Cost of External Merge Sort
- Number of passes: $1 + \lceil \log_{B-1} \lceil N/B \rceil \rceil$
- Cost = 2N * (# of passes)
- E.g., with 5 buffer pages, to sort 108 page file:
  - Pass 0: $\lceil 108/5 \rceil$ = 22 sorted runs of 5 pages each (last run is only 3 pages)
  - Pass 1: $\lceil 22/4 \rceil$ = 6 sorted runs of 20 pages each (last run is only 8 pages)
  - Pass 2: 2 sorted runs, 80 pages and 28 pages
  - Pass 3: Sorted file of 108 pages
- Formula check: $\lceil \log_4 22 \rceil = 3$, and 3 + 1 = 4 passes ✓

Number of Passes of External Sort (
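The pass counts in the slides above can be checked with a small Python sketch. It only counts runs and page I/Os under the slides' cost model, Cost = 2N * (# of passes), and does not sort any data. The function names `run_counts` and `merge_sort_cost` are illustrative assumptions, not names from the lecture.

```python
def run_counts(n_pages, fan_in, initial_run_pages):
    """Return the number of sorted runs that exist after each pass.

    n_pages: file size N in pages.
    fan_in: how many runs one merge step combines (B-1 in the slides).
    initial_run_pages: length of each run produced by pass 0.
    """
    if n_pages <= 0 or fan_in < 2 or initial_run_pages < 1:
        raise ValueError("need n_pages > 0, fan_in >= 2, initial_run_pages >= 1")
    # Integer ceiling division avoids floating-point log rounding issues.
    runs = -(-n_pages // initial_run_pages)
    counts = [runs]              # runs left after pass 0
    while runs > 1:
        runs = -(-runs // fan_in)
        counts.append(runs)      # runs left after each merge pass
    return counts


def merge_sort_cost(n_pages, n_buffers):
    """Passes and page I/Os for the general external merge sort with B buffers."""
    if n_buffers < 3:
        raise ValueError("merging needs at least 3 buffer pages")
    counts = run_counts(n_pages, fan_in=n_buffers - 1, initial_run_pages=n_buffers)
    passes = len(counts)
    # Every pass reads and writes each page once: 2N I/Os per pass.
    return passes, 2 * n_pages * passes, counts


if __name__ == "__main__":
    # The 108-page, 5-buffer example from the slides.
    print(merge_sort_cost(108, 5))   # (4, 864, [22, 6, 2, 1])
    # Two-way sort of the 7-page example: pass 0 makes 1-page runs, fan-in is 2.
    print(run_counts(7, fan_in=2, initial_run_pages=1))   # [7, 4, 2, 1]
```

For the 108-page file this reproduces the 22, 6, 2 and 1 runs listed for passes 0 to 3, and four passes at 2 * 108 page I/Os each give 864 I/Os in total.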