Compiler Parameter Tuning Methods (.ppt)

  • About 19,900 characters
  • About 31 pages
  • Published 2019-09-06 (Guangdong)
Which Processor: [a]x?

To require at least...                                                            Use   Windows*   Linux*
Pentium Pro and Pentium II processors with CMOV and FCMOV instructions            i     Qaxi       axi
Pentium processors with MMX instructions                                          M     QaxM       axM
Pentium III processor with Streaming SIMD Extensions (implies i and M above)      K     QaxK       axK
Pentium 4 processor with Streaming SIMD Extensions 2 (implies i, M and K above)   W     QaxW       axW

Automatic Processor Dispatch
  • Single executable
      • Pentium 4 target that runs on all x86 processors
      • For the target processor it uses:
          • Processor-specific opcodes
          • Prefetch (Pentium III only)
          • Vectorization
  • Low overhead
      • Some increase in code size
  • Can mix and match: -xK and -axW together make Xeon/Pentium 4 the target and Pentium III the default

Agenda
  • General Xeon processor optimizations
  • Loop level optimizations
  • Multi-pass optimizations
  • Other

Vectorization
  • Automatically converts loops to use MMX/SSE/SSE2 instructions and registers
  • Data types: char/short/int/float/double (but not mixed)
  • Can use the Short Vector Math Library
  • Enabled through -[Q]xW, -[Q]xK, -[Q]axW, -[Q]axK
  • -vec_report3 tells you which loops were vectorized and, if not, why not

High Level Optimizer
  • Windows: /O3 or Linux: -O3
  • Use with -xW, -xK, -QxW, -QxK, etc.
  • Additional loop optimizations
  • More aggressive dependency analysis
  • Scalar replacement
  • Software prefetch (-xK on Pentium III)
  • Loops must meet criteria related to those for vectorization

Under the Covers: Xeon SMP parallelism
  • OpenMP
      • Easy multithreading using directives
      • Use KSL tools for development
      • Use Intel tools to optimize for IA in tandem with OpenMP
  • Auto-parallelization
      • Simple loops threaded by the compiler alone
      • Loops must meet certain criteria…

OpenMP* Support
  • OpenMP 1.1 for Fortran, 1.0 for C/C++
  • Debugger info support for OpenMP
  • Assure for Threads supported with the Intel Compiler
  • OpenMP switches:
      • -Qopenmp, -openmp (or -openmpP)
      • -QopenmpS, -openmpS (serial, for debugging)
      • -openmp_report[n] (diagnostics)
wo
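The vectorization and OpenMP slides describe the kinds of loops the compiler can transform. The sketch below is a hypothetical illustration, not taken from the deck: `saxpy` and `dot` are made-up names, and `restrict` assumes a C99-capable compiler. The first loop uses a single data type with no dependence between iterations, which matches the vectorization criteria listed above. The second shows a loop threaded with an OpenMP directive.

```c
#include <stddef.h>

/* Hypothetical example: a candidate for auto-vectorization.
   One data type (float), unit stride, and no dependence between
   iterations. `restrict` (C99) tells the compiler x and y do not overlap. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical example: a simple loop threaded with an OpenMP directive.
   The reduction clause gives each thread a private partial sum and
   combines them at the end. Without OpenMP enabled, the pragma is
   ignored and the loop runs serially with the same result. */
double dot(int n, const double *x, const double *y)
{
    double sum = 0.0;
    int i;
#pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

With the Linux switches named in the slides, the first function would presumably be built with something like `-O3 -xW -vec_report3` to see whether the loop was vectorized, and the second with `-openmp`. The exact spelling depends on the compiler version, so treat these as an assumption and check the compiler's own documentation.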
