Research on Hadoop Parallel Computing Optimization Processing Performance Based on Big Data
- About 55,700 characters
- About 70 pages
- Published 2018-05-18 in Shanghai
Abstract

With the development and popularization of new-generation mobile communication, the Internet of Things, and cloud computing, data traffic is growing explosively and placing increasing pressure on data processing. By virtue of its powerful data-processing capability, the Hadoop MapReduce programming framework has become a mature solution in text analysis, natural language processing and business data processing, and it can relieve the data-processing bottleneck of communication systems. However, the lack of cost-based parameter optimization in MapReduce frameworks becomes a major limiting factor as MapReduce usage grows beyond large Web companies to new applications. Of roughly 200 configuration parameters, about 13 have a major effect on cluster performance. To address these problems, this thesis designs a new parameter-configuration analysis system for Hadoop tuning, in which each individual task receives optimized parameters to improve its performance.

Building on the MapReduce framework, we propose three new components: the Profiler, the Judge-Engine and the Cost-based Optimizer. The Profiler collects detailed statistical information from unmodified MapReduce programs. The Judge-Engine performs fine-grained cost estimation. The Cost-based Optimizer provides the best, simplified set of parameters based on the output of the other two components.

By comparing optimized parameters with default parameters in typical MapReduce applications (text analysis, natural language processing and business data processing), we demonstrate the effectiveness of each component through a comprehensive evaluation across representative MapReduce application domains. The results show that, with the help of these three new components, the new optimization model makes Hadoop parameter optimization much easier.

Keywords: Hadoop, Performance Optimization, Parameters, MapReduce

Contents
Glossary of Terms
Chapter 1 Introduction
1.1 Research Background
1.2 Research Status at Home and Abroad
1.3 Main Contributions and Organization of This Thesis
Chapter 2 Hadoop-related tech…