- Published 2018-09-16 in Hubei
DataStage EE Technical Exchange
DataStage tuning
- Databases and text files: read only the fields that are needed
- Text reads: increase Number of Readers Per Node
- Database parallelism: increase the degree of parallelism, e.g. /*+parallel(f_Crd_Dbt_Card,8)*/
- Stage selection, e.g. Lookup versus Join stages
- Load method: avoid Upsert where possible
- Where one table is loaded several times, change it to a single load
- Network conditions

Exercises
- Fetching database columns dynamically
- Range lookup
- Aggregation and summarization
- Sorting and deduplication
- Sequential loop
- Parallel loop
- Text file settings

All material comes from the help documentation shipped with the installation (Parallel Job Developer's Guide).

Stage and link
- Passive stage
- Active stage
- Link type
Note: every stage between two passive stages is an active stage.

Advanced Development and Maintenance

Traditional Batch Processing
- Write to disk and read from disk before each processing operation
- Sub-optimal utilization of resources
  - a 10 GB stream leads to 70 GB of I/O
  - processing resources can sit idle during I/O
- Very complex to manage (lots and lots of small jobs)
- Becomes impractical with big data volumes

Data Pipelining
- Start a downstream process while an upstream process is still running
- This eliminates intermediate storing to disk, which is critical for big data
- This also keeps the processors busy
- Still has limits on scalability
- Think of a conveyor belt moving the rows from process to process!

Data Partitioning
- Break up big data into partitions
- Run one partition on each processor
- 4X faster on 4 processors
- With data big enough: 100X faster on 100 processors

Repartitioning

Why EE is so Effective
- Parallel processing paradigm
  - More hardware, faster processing
  - Level of parallelization is determined by a configuration file read at runtime
- Emphasis on memory
  - Data is read into memory and lookups are performed as in a hash table

Parallel processing = executing your application on multiple CPUs
- Multiple CPUs with a single operating system
- Programs communicate using shared memory
- All CPUs share system resources

Partitioning Icons
- 1. auto partitioning
- 2. repartitioning
- 3. being partitioned
- 4. same partitioning
- 5. no partitioning

Partitioning Type
- Round robin
- Random
- Same
- Entire
- Hash by field
- Modulus
- Range
- DB2
- Auto

Random
- Rows are randomly distributed across all processing nodes
- Produces approximately equal-sized partitions
- slightly higher overhead
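To make the partitioning types listed above more concrete, here is a minimal Python sketch of four of them (round robin, random, hash by field, modulus). It only illustrates the general idea of how rows could be assigned to partitions; it is not how DataStage implements its partitioners. The function names, the sample `id` and `card` fields, and the use of CRC32 as the hash are all assumptions made for this example.

```python
import random
import zlib
from typing import Dict, List

Row = Dict[str, object]


def round_robin(rows: List[Row], n: int) -> List[List[Row]]:
    """Deal rows out in turn: row i goes to partition i mod n."""
    parts: List[List[Row]] = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def random_partition(rows: List[Row], n: int, seed: int = 0) -> List[List[Row]]:
    """Send each row to a randomly chosen partition (seeded for repeatability)."""
    rng = random.Random(seed)
    parts: List[List[Row]] = [[] for _ in range(n)]
    for row in rows:
        parts[rng.randrange(n)].append(row)
    return parts


def hash_by_field(rows: List[Row], n: int, key: str) -> List[List[Row]]:
    """Rows with the same key value always land in the same partition."""
    parts: List[List[Row]] = [[] for _ in range(n)]
    for row in rows:
        # CRC32 is an arbitrary choice here; it is stable across runs.
        h = zlib.crc32(str(row[key]).encode("utf-8"))
        parts[h % n].append(row)
    return parts


def modulus(rows: List[Row], n: int, key: str) -> List[List[Row]]:
    """Partition number is the integer key value modulo n."""
    parts: List[List[Row]] = [[] for _ in range(n)]
    for row in rows:
        parts[int(row[key]) % n].append(row)
    return parts


# Hypothetical sample data: 20 rows, 5 distinct card values.
rows = [{"id": i, "card": f"C{i % 5}"} for i in range(20)]

rr_parts = round_robin(rows, 4)
rnd_parts = random_partition(rows, 4)
hash_parts = hash_by_field(rows, 4, "card")
mod_parts = modulus(rows, 4, "id")

print([len(p) for p in rr_parts])
```

Round robin and random spread rows evenly without looking at the data, which suits stages that do not care which rows sit together. Hash by field and modulus are key-based, so rows sharing a key end up in the same partition, which is what key-dependent operations such as joins, aggregation, or deduplication need.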