Big Data and Data Mining - 1 Overview Analysis Report.ppt

* Data mining on relational databases: the traditional approach is facing challenges; common tuning techniques; the all-in-one appliance concept; big data storage and analysis.
* The big data market is in a phase of explosive growth. National leaders attach great importance to big data, and big data industries are emerging across the regions. Politburo member Wang Yang made big data a priority during his tenure in Guangdong, and Internet industry leader Lei Jun is pushing hard for a national big data development plan. Five success factors for putting big data into practice: infrastructure, industry chain, talent, technology, and legislation. Big data infrastructure is the foundation and holds enormous business opportunities.
* User behavior analysis, anti-fraud, anti-money laundering.
* Cluster filesystem: a distributed filesystem that is not a single server with a set of clients, but instead a cluster of servers that all work together to provide high-performance service to their clients. To the clients the cluster is transparent (it is just the filesystem), but the filesystem software deals with distributing requests to elements of the storage cluster. Examples include the HP (DEC) Tru64 cluster; Spinnaker, a clustered NAS (NFS) service; and Panasas ActiveScale, a cluster filesystem.
  Parallel filesystem: a filesystem with support for parallel applications, in which all nodes may access the same files at the same time, reading and writing concurrently. Data for a single file is striped across multiple storage nodes to provide scalable performance to individual files. Examples include Panasas ActiveScale, Lustre, GPFS and Sistina. NFSv4.1 will feature an extension to the NFS standard that supports parallel IO.
  These definitions overlap.
* Level 4: distributed file systems, such as Locus, Sun Network File System (NFS) and CMU Andrew, where multiple users who are physically dispersed in a network of autonomous computers share in the use of a common file system. New issues: location transparency (file names are dynamically mapped to storage sites), availability, fault tolerance, replication, consistency.
  Distributed Storage is software for synchronizing files and directories locally and between many remote computers connected via LAN or Internet (wiki).
  Consistency, availability and performance tend to be mutually contradictory goals in a distributed system.
* Object means an ordered set of bytes (within the OSD) that is associated with a unique identifier.
* …old problems, new requirements. Cloud storage is a model of networked online storage where data is s
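The parallel filesystem note above says that data for a single file is striped across multiple storage nodes. As a rough illustration only, the sketch below shows round-robin striping in Python. The function names (`stripe`, `locate`, `reassemble`), the in-memory dictionary standing in for storage nodes, and the round-robin layout are all assumptions made for this example; they are not taken from the slides and do not describe how Lustre, GPFS or Panasas actually lay out data.

```python
from typing import Dict, List, Tuple


def stripe(data: bytes, num_nodes: int, stripe_size: int) -> Dict[int, List[bytes]]:
    """Split data into fixed-size units and deal them round-robin to nodes.

    The dict is a hypothetical stand-in for real storage nodes.
    """
    nodes: Dict[int, List[bytes]] = {n: [] for n in range(num_nodes)}
    for start in range(0, len(data), stripe_size):
        unit = start // stripe_size              # global stripe-unit index
        nodes[unit % num_nodes].append(data[start:start + stripe_size])
    return nodes


def locate(offset: int, num_nodes: int, stripe_size: int) -> Tuple[int, int, int]:
    """Map a file byte offset to (node, unit index on that node, offset in unit)."""
    unit = offset // stripe_size
    return unit % num_nodes, unit // num_nodes, offset % stripe_size


def reassemble(nodes: Dict[int, List[bytes]]) -> bytes:
    """Read the stripe units back in global order to rebuild the file."""
    num_nodes = len(nodes)
    total_units = sum(len(units) for units in nodes.values())
    return b"".join(
        nodes[u % num_nodes][u // num_nodes] for u in range(total_units)
    )


if __name__ == "__main__":
    layout = stripe(b"abcdefghij", num_nodes=3, stripe_size=2)
    node, local_unit, inner = locate(7, num_nodes=3, stripe_size=2)
    print(layout)
    print(node, local_unit, inner)
    print(reassemble(layout))
```

Because consecutive stripe units land on different nodes, a large sequential read can in principle be served by several nodes at once, which is the scalability argument the slide makes for individual files.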
