Hadoop-Based Power Grid Data Quality Checking Method and Verification System - Nanjing University PASA Big.PDF

Journal of Computer Research and Development (Supplement)

Data Quality Verification System for Power Grid Based on Hadoop

Zhang Zhiliang (1), Sun Yuhua (1), Chen Chengzhi (2), Long Qinglin (2), Liang Guohui (2), Gu Rong (3), Yang Bincheng (3), Huang Yihua (3)

1 (Guangzhou Power Grid)
2 (Guangzhou Keteng Information Technology Inc. Ltd)
3 (Collaborative Innovation Center of Novel Software Technology and Industrialization)
(zhangzhiliang@ , yhuang@ )

Abstract  Among the many power grid data processing applications, quality monitoring of power grid data is one of the most important services. As the scale of power grid data and the number of data quality checking rules keep growing, the processing power of the current data quality checking system, which is built on traditional RDBMSs and computing platforms, has become a serious bottleneck. Data quality monitoring and checking can no longer be completed in time, and the system is hard to scale as the data volume and the number of checking rules increase. As a result, the current system struggles to meet the needs of management and operational decision making. Big data technology provides effective technical means and support platforms for power grid big data processing. In this paper, we therefore propose a big data solution to power grid big data processing. We study and design Hadoop-based techniques for distributed data storage and parallel computing to execute data quality checking rules. After choosing a few typical scenarios of batch-style and streaming-style power grid data quality checking for verification study, we design and implement an indexing mechanism for data quality checking, building a fast search index for the attribute
