Lecture 2 – Theoretical Underpinnings of MapReduce - UBC ECE
Key idea 3: Scale out, not up!
- For data-intensive workloads, a large number of commodity servers is preferred over a small number of high-end servers.
- The cost of supercomputers does not scale linearly.

Some numbers
- Processing data is quick, I/O is very slow: 1 HDD = 75 MB/sec; 1000 HDDs = 75 GB/sec.
- Data volume processed: 80 PB/day at Google; 60 TB/day at Facebook (~2012).

Key idea 4: "Shared-nothing" infrastructure (both hardware and software)
- Sharing vs. shared nothing:
- Sharing: manage a common/global state.
- Shared nothing: independent entities, no common state.

Functional programming as key enabler
- No side effects.
- Recovery from failures is much easier.
- map/reduce as a subset of functional programming.

More examples
- Distributed Grep: The map function emits a line if it matches a supplied pattern. The reduce function is an identity function that just copies the supplied intermediate data to the output.
- Count of URL Access Frequency: The map function processes logs of web page requests and outputs <URL, 1>. The reduce function adds together all values for the same URL and emits a <URL, total count> pair.
- Reverse Web-Link Graph: The map function outputs <target, source> pairs for each link to a target URL found in a page named source. The reduce function concatenates the list of all source URLs associated with a given target URL and emits the pair <target, list(source)>.
- Term-Vector per Host: …

More info
- MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat, /papers/mapreduce.html
- The Google File System. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, /papers/gfs.html

MapReduce: Acknowledgements
- Some slides from Google University (licensed under the Creative Commons Attribution 2.5 License), others from Jure Leskovec.

MapReduce
- Concept from functional programming.
- Applied to a large number of problems.

Java:
    int fooA(String[] list) {
        return bar1(list) + bar2(list);
    }
    int fooB(String[] list) {
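
Two of the examples listed above (Count of URL Access Frequency and Reverse Web-Link Graph) can be sketched in plain Java as follows. This is only a minimal single-process illustration, not how the MapReduce library itself is implemented: the class `MapReduceSketch`, the driver `run`, and the helpers `urlAccessCounts` and `reverseLinks` are hypothetical names introduced here, and the sketch assumes Java 9 or later (for `List.of` and `Map.entry`). The mapper and reducer are side-effect-free functions, which is the property that makes re-running a failed task safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical in-memory sketch; a real system distributes these phases across machines.
final class MapReduceSketch {

    // Map phase: apply the mapper to every input. Group phase: collect values by key.
    // Reduce phase: apply the reducer once per key.
    static <I, K extends Comparable<K>, V, R> Map<K, R> run(
            List<I> inputs,
            Function<I, List<Map.Entry<K, V>>> mapper,
            BiFunction<K, List<V>, R> reducer) {
        Map<K, List<V>> groups = new TreeMap<>();
        for (I input : inputs) {
            for (Map.Entry<K, V> pair : mapper.apply(input)) {
                groups.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                      .add(pair.getValue());
            }
        }
        Map<K, R> output = new TreeMap<>();
        for (Map.Entry<K, List<V>> group : groups.entrySet()) {
            output.put(group.getKey(), reducer.apply(group.getKey(), group.getValue()));
        }
        return output;
    }

    // Count of URL access frequency: map emits <URL, 1>, reduce sums the ones.
    static Map<String, Integer> urlAccessCounts(List<String> requestedUrls) {
        return MapReduceSketch.<String, String, Integer, Integer>run(
            requestedUrls,
            url -> List.of(Map.entry(url, 1)),
            (url, ones) -> ones.stream().mapToInt(Integer::intValue).sum());
    }

    // Reverse web-link graph: input maps each source page to the targets it links to.
    // Map emits <target, source>; reduce returns the collected list of sources.
    static Map<String, List<String>> reverseLinks(Map<String, List<String>> linksByPage) {
        return MapReduceSketch
            .<Map.Entry<String, List<String>>, String, String, List<String>>run(
                new ArrayList<>(linksByPage.entrySet()),
                page -> {
                    List<Map.Entry<String, String>> pairs = new ArrayList<>();
                    for (String target : page.getValue()) {
                        pairs.add(Map.entry(target, page.getKey()));
                    }
                    return pairs;
                },
                (target, sources) -> sources);
    }

    public static void main(String[] args) {
        System.out.println(urlAccessCounts(List.of("/a", "/b", "/a")));

        Map<String, List<String>> links = new TreeMap<>();
        links.put("p1", List.of("t1", "t2"));
        links.put("p2", List.of("t1"));
        System.out.println(reverseLinks(links));
    }
}
```

Because `run` only calls pure functions and keeps no state between keys, each mapper call and each per-key reducer call could in principle run on a different machine, which is the "shared-nothing" idea described above.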