Performance Libraries: Intel Math Kernel Library (MKL)
* Intel® Math Kernel Library Contents
Each BLAS routine comes in four data types: single- and double-precision real, and single- and double-precision complex. With a few exceptions, the functions behave identically in every data type. The extended BLAS are a set of Level 1 BLAS routines that support sparse data.

* Intel® Math Kernel Library Contents
Intel MKL's value-add to the LAPACK code includes:
- Building the library for you (just building the LAPACK code takes some effort)
- Threading key portions of the functions
- Optimizing key functions through the use of recursion
The new Fourier transforms meet the needs of a far wider audience than the previous radix-2 FFTs did. This list shows key features. Optimization of the functions will continue for some time yet, but the complex transforms are already well optimized for IPF-2.
VML and VSL offer improved performance over scalar implementations of the underlying functions, provided the user can vectorize the code.

* Roll Your Own/Dot Product
Roll Your Own: this is a simple, straightforward dot-product approach to matrix multiplication. The innermost loop is a dot product, so it can be replaced with a call to the dot-product routine, as shown in the second panel.

* DGEMV/DGEMM
The two innermost loops make up a matrix-vector multiply, which can serve as the central operation of matrix multiplication.
DGEMV parameters: incx = 1; incy = ldb; alpha = 1.0; beta = 0.0; transa = t;
DGEMM parameters: alpha = 1.0; beta = 0.0;

* Intel® Math Kernel Library Optimizations in LAPACK*
Threading at higher levels (at the LAPACK factorization rather than at DGEMM, for instance) opens additional parallelization opportunities. The blocking strategy used in traditional LAPACK can be extended to the factorization of the block columns to improve locality of reference and minimize vector operations. NETLIB LAPACK makes numerous intrinsic function calls, which raises the need for run-time library support. All of these calls have been implemented within Intel MKL, so no run-time
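
The progression in the Roll Your Own/Dot Product and DGEMV/DGEMM notes (explicit loops, then a dot-product call, then a matrix-vector call) can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `dot`, `matvec`, `matmul_loops`, `matmul_dot` and `matmul_gemv` are hypothetical stand-ins, matrices are row-major lists of lists, and nothing here calls Intel MKL or reproduces its API.

```python
def dot(x, y):
    """Dot product of two equal-length sequences (stand-in for a BLAS dot call)."""
    return sum(a * b for a, b in zip(x, y))


def matvec(a, x):
    """Matrix-vector product y = A * x (stand-in for a DGEMV-style call)."""
    return [dot(row, x) for row in a]


def matmul_loops(a, b):
    """Roll-your-own C = A * B: the innermost loop is a dot product."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            s = 0.0
            for p in range(k):  # dot product of row i of A and column j of B
                s += a[i][p] * b[p][j]
            c[i][j] = s
    return c


def matmul_dot(a, b):
    """Same product with the innermost loop replaced by a call to dot()."""
    n = len(b[0])
    cols = [[row[j] for row in b] for j in range(n)]  # columns of B
    return [[dot(row, col) for col in cols] for row in a]


def matmul_gemv(a, b):
    """Same product with the two inner loops replaced by matvec().

    Column j of C is A times column j of B.
    """
    m, n = len(a), len(b[0])
    c_cols = [matvec(a, [row[j] for row in b]) for j in range(n)]
    # Transpose the computed columns back into row-major form.
    return [[c_cols[j][i] for j in range(n)] for i in range(m)]


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    B = [[5.0, 6.0], [7.0, 8.0]]
    print(matmul_loops(A, B))
    print(matmul_dot(A, B))
    print(matmul_gemv(A, B))
```

All three functions compute the same product; each step hands a larger share of the loop nest to a single routine, which is the part an optimized library would replace.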