火龙果软件 - MapReduce Courseware (.pptx)

  • Published 2025-09-02 in Sichuan

MapReduce Courseware

Outline
  01 MapReduce overview
  02 Discussion Questions
  03 MapReduce

Motivation
  • 200+ processors
  • 200+ terabyte database
  • 10^10 total clock cycles
  • 0.1 second response time
  • 5¢ average advertising revenue
From: /~bryant/presentations/DISC-FCRC07.ppt

Motivation: Large Scale Data Processing
  • Want to process lots of data (1 TB)
  • Want to parallelize across hundreds/thousands of CPUs
  • …
  • Want to make this easy
Google Earth uses 70.5 TB: 70 TB for the raw imagery and 500 GB for the index data.

MapReduce
  • Automatic parallelization and distribution
  • Clean abstraction for programmers
  • Fault-tolerant
  • Provides status and monitoring tools

Programming Model
  • Borrows from functional programming
  • Users implement an interface of two functions:
      map(in_key, in_value) -> (out_key, intermediate_value) list
      reduce(out_key, intermediate_value list) -> out_value list
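The two signatures above can be written down as Python type aliases. This is only a sketch: the alias names `MapFn` and `ReduceFn`, the concrete `str`/`int` types, and the two sample functions are assumptions made for illustration and do not come from the slides.

```python
from typing import Callable, Iterable, List, Tuple

# map(in_key, in_value) -> (out_key, intermediate_value) list
MapFn = Callable[[str, str], Iterable[Tuple[str, int]]]
# reduce(out_key, intermediate_value list) -> out_value list
ReduceFn = Callable[[str, List[int]], List[int]]


def line_length_map(in_key: str, in_value: str) -> List[Tuple[str, int]]:
    """Hypothetical map: emit the file name with the length of one line."""
    return [(in_key, len(in_value))]


def max_reduce(out_key: str, intermediate_values: List[int]) -> List[int]:
    """Hypothetical reduce: keep the largest intermediate value for the key."""
    return [max(intermediate_values)]


pairs = line_length_map("a.txt", "hello world")
result = max_reduce("a.txt", [11, 3, 7])
```

Everything else (splitting the input, grouping by key, scheduling) is left to the framework; the user supplies only these two functions.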

map
  • Records from the data source (lines out of files, rows of a database, etc.) are fed into the map function as key/value pairs, e.g., (filename, line).
  • map() produces one or more intermediate values along with an output key from the input.
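As a small illustration of a map function fed with (filename, line) records, the sketch below emits one (word, filename) pair per word. The function name `index_map` and the choice of an inverted-index style output are assumptions, not something the slides prescribe.

```python
from typing import Iterator, Tuple


def index_map(filename: str, line: str) -> Iterator[Tuple[str, str]]:
    """Hypothetical map: the output key is a word, the value is the file it came from."""
    for word in line.lower().split():
        yield (word, filename)


# One input record produces several intermediate (out_key, value) pairs.
emitted = list(index_map("notes.txt", "Map then reduce"))
```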

reduce
  • After the map phase is over, all the intermediate values for a given output key are combined together into a list.
  • reduce() combines those intermediate values into one or more final values for that same output key (in practice, usually only one final value per key).
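The grouping step and the reduce call described above might look like this in plain Python. The helper names `group_by_key` and `sum_reduce` are hypothetical, and a real framework would do the grouping across machines rather than in one dictionary.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Collect every intermediate value that shares an output key into one list."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def sum_reduce(out_key, intermediate_values):
    """Hypothetical reduce: one final value per key, returned as a one-element list."""
    return [sum(intermediate_values)]


intermediate = [("a", 1), ("b", 1), ("a", 1)]
grouped = group_by_key(intermediate)
final = {key: sum_reduce(key, values) for key, values in grouped.items()}
```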

Parallelism
  • map() functions run in parallel, creating different intermediate values from different input data sets.
  • reduce() functions also run in parallel, each working on a different output key.
  • All values are processed independently.
  • Bottleneck: the reduce phase can't start until the map phase is completely finished.
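A rough single-machine sketch of this structure, using a thread pool as a stand-in for the worker machines, is shown below. The names `run_parallel`, `length_map`, and `count_reduce` are assumptions; in CPython the threads mainly illustrate the shape of the computation (parallel maps, a barrier, then parallel reduces) rather than a real speedup.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def length_map(name, text):
    """Hypothetical map: key each word by its length."""
    return [(len(word), word) for word in text.split()]


def count_reduce(key, values):
    """Hypothetical reduce: count how many words share this length."""
    return key, len(values)


def run_parallel(inputs, map_fn, reduce_fn, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map phase: one task per input split. list() waits for every task,
        # which is the barrier: no reduce starts before all maps finish.
        mapped = list(pool.map(lambda kv: map_fn(*kv), inputs))
        groups = defaultdict(list)
        for pairs in mapped:
            for key, value in pairs:
                groups[key].append(value)
        # Reduce phase: one task per output key, independent of the others.
        reduced = list(pool.map(lambda kv: reduce_fn(*kv), groups.items()))
    return dict(reduced)


splits = [("s1", "to be or"), ("s2", "not to be")]
counts = run_parallel(splits, length_map, count_reduce)
```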

Example: Count word occurrences

map(String input_key, String input_value):
  // input_key: document name
  // input_value: document contents
  for each word w in input_value:
    EmitIntermediate(w, 1);

reduce(String output_key, Iterator intermediate_values):
  // output_key: a word
  // output_values: a list of counts
  int result = 0;
  for each v in
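The listing above breaks off inside the reduce loop. A runnable Python sketch of the same word count follows. It assumes the loop simply adds up the counts, which is what the comments in the pseudocode suggest, and the sequential driver `word_count` is a hypothetical stand-in for the framework.

```python
from collections import defaultdict


def wc_map(input_key, input_value):
    # input_key: document name, input_value: document contents
    for w in input_value.split():
        yield (w, 1)


def wc_reduce(output_key, intermediate_values):
    # output_key: a word, intermediate_values: a list of counts
    # Assumption: the truncated loop sums the counts.
    result = 0
    for v in intermediate_values:
        result += v
    return result


def word_count(documents):
    """Hypothetical sequential driver: map every document, group, then reduce."""
    groups = defaultdict(list)
    for name, contents in documents.items():
        for word, count in wc_map(name, contents):
            groups[word].append(count)
    return {word: wc_reduce(word, counts) for word, counts in groups.items()}


docs = {"d1": "the cat sat", "d2": "the cat ran"}
totals = word_count(docs)
```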
