Software Tuning Fundamentals
Chen Jian (陈健), 2003/3

Why Tune?
- Same code, different performance

Objectives
- Identify the main tasks of performance tuning
- Define some important performance tuning terms
- Use Intel tools to help

Agenda
- Performance Cycle Overview
  - The Performance Cycle
  - When to Start
  - Performance Gains
  - When to Stop
  - Putting it into Perspective
- Performance Cycle Details
- Summary

The Tuning Cycle

When (Why) to Start
- User requirement?
- Software vendor requirement?
- Put the performance requirement into the requirements document
- Performance should be considered at every stage of the product life cycle (requirements gathering, design, and testing)
- Exception: do "code tuning" only after a simple, readable, non-optimized version of the application exists

Effort vs. Result

When to Stop
- Architecture is at maximum efficiency?
  - Be sure you know what this is: calculate the theoretical maximum performance
- Requirement is satisfied
- Incrementally do wide-mesh optimizations until done

Tuning Principle
- "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." (Donald Knuth)

Agenda
- Performance Cycle Overview
- Performance Cycle Details
  - Gather Performance Data
  - Analyze Data and Identify Issues
  - Generate Alternatives to Resolve Issues
  - Implement Enhancements
- Summary

Gather Performance Data
- Timer
  - Use to get wall-clock time
  - Accuracy, low overhead
- Use Intel® VTune™ Performance Analyzer
  - Profiler: gather information about code usage
  - Performance monitor: gather information about system resource usage

Workload
- A good workload should have these characteristics:
  - measurable
  - reproducible
  - static
  - representative

Analyze Data and Draw Conclusions
- Baseline current performance
- Examine hot spots
- Identify bottlenecks
- Calculate potential maximum performance

Examine Hot Spots
- The Pareto Principle, a.k.a. the 80/20 rule
- Concentrate on the vital few vs. the trivial many
- Hot spot: the part of an application or system that accounts for most of the computation
  - Generally consists of a loop
- For applications that don't have hot spots, examine:
  - Memory layout
  - Exceptions
  - Effective compiler usage

Additional Topics
- Big O
- Utilization, efficiency, throughput, latency
- Bottlenecks: I/O, memory, CPU
- MIPS/FLOPS/CPI
- Concurrency, parallelism
- Scalability
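The timer and workload points above (wall-clock timing with low overhead, on a measurable, reproducible, static workload) can be illustrated with a minimal Python sketch. It is not taken from the slides: `workload`, `time_workload`, and `summarize` are hypothetical names, and the loop is only a stand-in for a real, representative workload. It uses the standard library's `time.perf_counter` as the wall-clock timer and repeats the run so that a baseline is not based on a single noisy measurement.

```python
import statistics
import time


def workload(n=200_000):
    """Hypothetical stand-in workload: fixed input, deterministic result."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_workload(func, repeats=5):
    """Return the wall-clock duration in seconds of each of `repeats` runs."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()  # high-resolution, monotonic clock
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce repeated timings to a small baseline record."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    baseline = summarize(time_workload(workload))
    print("baseline (seconds):", baseline)
```

Recording the same summary again after each change gives a like-for-like comparison against the baseline, provided the workload and its input stay fixed between runs.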