NCHC00210201F

Presenter notes:
  - This should be emphasized! Maybe another slide on "This is the first thing"
  - Try to update tool slides with the tutorial?
  - What are the options that the students can use?

© David Kirk/NVIDIA and Wen-mei W. Hwu, Taiwan, June 30-July 2, 2008

Compiling a CUDA Program

  [Figure: compilation flow]
    C/C++ CUDA Application -> NVCC -> CPU Code
                                   -> PTX Code (virtual)
    PTX Code -> PTX to Target Compiler -> Target code for G80 … GPU (physical)

  Parallel Thread eXecution (PTX)
    - Virtual Machine and ISA
    - Programming model
    - Execution resources and state

  CUDA source:
    float4 me = gx[gtid];
    me.x += me.y * me.z;

  Corresponding PTX:
    ld.global.v4.f32 {$f1,$f3,$f5,$f7}, [$r9+0];
    mad.f32 $f1, $f5, $f3, $f1;

Compilation

  - Any source file containing CUDA language extensions must be compiled with NVCC
  - NVCC is a compiler driver
      - Works by invoking all the necessary tools and compilers like cudacc, g++, cl, ...
  - NVCC outputs:
      - C code (host CPU Code)
          - Must then be compiled with the rest of the application using another tool
      - PTX
          - Object code directly
          - Or, PTX source, interpreted at runtime

Linking

  - Any executable with CUDA code requires two dynamic libraries:
      - The CUDA runtime library (cudart)
      - The CUDA core library (cuda)

Debugging Using the Device Emulation Mode

  - An executable compiled in device emulation mode (nvcc -deviceemu) runs completely on the host using the CUDA runtime
      - No need of any device and CUDA driver
      - Each device thread is emulated with a host thread
  - Running in device emulation mode, one can:
      - Use host native debug support (breakpoints, inspection, etc.)
      - Access any device-specific data from host code and vice-versa
      - Call any host function from device code (e.g. printf) and vice-versa
      - Detect deadlock situations caused by improper usage of __syncthreads

Device Emulation Mode Pitfalls
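To make the compilation slides concrete, here is a minimal sketch of a complete program built around the two-line float4 snippet shown with its PTX. Only those two lines come from the slides. The kernel name mad_kernel, the helper run_mad_example, the array size, the launch configuration and the store back to gx are all assumptions added for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel wrapping the slide's snippet (name and bounds check are assumptions).
__global__ void mad_kernel(float4 *gx, int n)
{
    int gtid = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (gtid < n) {
        float4 me = gx[gtid];   // vector load; the slide shows ld.global.v4.f32 for this line
        me.x += me.y * me.z;    // multiply-add; the slide shows mad.f32 for this line
        gx[gtid] = me;          // store added here so the host can check the result
    }
}

// Host-side driver: fills an array, runs the kernel, verifies every element.
bool run_mad_example()
{
    const int n = 256;                       // illustrative size
    static float4 host[n];
    for (int i = 0; i < n; ++i)
        host[i] = make_float4(1.0f, 2.0f, 3.0f, 0.0f);

    float4 *dev = NULL;
    if (cudaMalloc((void **)&dev, sizeof(host)) != cudaSuccess)
        return false;
    if (cudaMemcpy(dev, host, sizeof(host), cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dev);
        return false;
    }

    mad_kernel<<<(n + 127) / 128, 128>>>(dev, n);

    // The blocking copy back also surfaces errors from the kernel launch.
    cudaError_t err = cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess)
        return false;

    for (int i = 0; i < n; ++i)
        if (host[i].x != 7.0f)               // 1 + 2 * 3
            return false;
    return true;
}

int main()
{
    bool ok = run_mad_example();
    printf("%s\n", ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

Assuming the file is saved as mad.cu, `nvcc mad.cu -o mad` builds the executable, and `nvcc -ptx mad.cu` writes the intermediate PTX so it can be compared with the slide's listing. Register names and the exact instructions will likely differ by toolkit version and target architecture. The `-deviceemu` flag belongs to the toolkit of the slides' era, so check whether your nvcc still accepts it before relying on it.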
