- About 6,580 words
- About 28 pages
- Published 2017-01-18 in Hunan
PPT slides - ppt - thestanforduniversityinfolab
CS 345A Data Mining
MapReduce

Single-node architecture

Commodity Clusters
- Web data sets can be very large
  - Tens to hundreds of terabytes
- Cannot mine on a single server (why?)
- Standard architecture emerging:
  - Cluster of commodity Linux nodes
  - Gigabit ethernet interconnect
- How to organize computations on this architecture?
  - Mask issues such as hardware failure

Cluster Architecture

Stable storage
- First order problem: if nodes can fail, how can we store data persistently?
- Answer: Distributed File System
  - Provides global file namespace
  - Google GFS; Hadoop HDFS; Kosmix KFS
- Typical usage pattern
  - Huge
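
One way to approach the "(why?)" on the Commodity Clusters slide is a back-of-envelope estimate of how long a single pass over the data takes. The sketch below is illustrative only: the 100 MB/s per-node read bandwidth and the 1,000-node cluster size are assumptions chosen for the arithmetic, not figures from the slides, and `scan_time_hours` is a hypothetical helper.

```python
TB = 10**12  # bytes in a terabyte (decimal)

# Assumption for illustration: each node reads from disk at about 100 MB/s.
ASSUMED_DISK_BW = 100 * 10**6  # bytes per second


def scan_time_hours(dataset_bytes: float, bytes_per_second: float, nodes: int = 1) -> float:
    """Hours needed to read the dataset once, split evenly across `nodes`."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    seconds = dataset_bytes / (bytes_per_second * nodes)
    return seconds / 3600


# 100 TB sits inside the "tens to hundreds of terabytes" range on the slide.
single = scan_time_hours(100 * TB, ASSUMED_DISK_BW)
# Hypothetical cluster size, ignoring network and coordination overhead.
cluster = scan_time_hours(100 * TB, ASSUMED_DISK_BW, nodes=1000)

print(f"single server: {single:.1f} hours")
print(f"1000 nodes:    {cluster:.2f} hours")
```

Under these assumptions a single server needs well over a week just to read the data once, while spreading the same scan across many nodes brings it down to minutes. That gap is the motivation for the cluster architecture and the distributed file system described on the slides.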