Introduction to parallel computing for scientists and engineers. Shared-memory parallel architectures and programming; distributed-memory, message-passing, and data-parallel architectures and their programming.
A hands-on introduction to parallel programming and optimization for GPUs with 1000+ cores, covering their architecture, the CUDA programming model, and performance analysis. Students implement various ...
Students will be able to analyze the computing and memory architecture of a supercomputing node and use OpenMP directives to improve vectorization of their programs. This module focuses on the key ...