Parallel Programming using OpenACC C

This page is under construction and will be updated regularly.

OpenACC is a directive-based parallel programming model designed to simplify writing high-performance computing (HPC) applications. OpenACC code can run on a wide variety of platforms, including CPUs and GPUs from different vendors.

If you are looking for OpenACC for Fortran, please click here.

OpenACC Basics

  • Introduction to OpenACC
  • OpenACC Programming Model
  • Setting Up OpenACC Environment
  • First Parallel Region
  • Execution Model Overview
  • Compiler Feedback and Analysis
  • Error Checking and Debugging
  • Performance Measurement
  • Code Portability
  • OpenACC vs Sequential Code

Parallel Constructs

  • Parallel Directive Basics
  • Kernels Directive
  • Parallel vs Kernels
  • Num_gangs Clause
  • Num_workers Clause
  • Vector_length Clause
  • Combining Gang, Worker, Vector
  • Async and Wait Clauses

Loop Constructs

  • Loop Directive Basics
  • Independent Clause
  • Gang, Worker, Vector Loop Clauses
  • Seq Clause
  • Collapse Clause
  • Tile Clause
  • Auto Clause
  • Reduction Clause
  • Private and Firstprivate Clauses
  • Loop Optimization Techniques

Data Management

  • Data Directive Overview
  • Copyin Clause
  • Copyout Clause
  • Copy Clause
  • Create Clause
  • Present Clause
  • Present_or_copyin/copyout/create
  • Delete Clause
  • Update Directive
  • Data Regions and Scope
  • Enter Data and Exit Data
  • Default(present) Clause

Optimization Techniques

  • Understanding Gang/Worker/Vector Mapping
  • Loop Scheduling Optimization
  • Memory Access Patterns
  • Minimizing Data Transfers
  • Collapse for Better Parallelism
  • Tile for Cache Efficiency
  • Reduction Optimization
  • Private vs Firstprivate Performance
  • Async Execution Optimization
  • Compiler Optimization Flags
  • Profiling and Performance Analysis
  • Performance Tuning Workflow

Advanced Topics

  • Routine Directive
  • Atomic Operations
  • Cache Directive
  • Declare Directive
  • Host_data Directive
  • Device Type Clause
  • If Clause
  • Link Clause
  • Nested Parallelism
  • Multi-Device Programming

Interoperability & Tools

  • OpenACC with CUDA
  • OpenACC with OpenMP
  • OpenACC with MPI
  • Using OpenACC Libraries
  • Debugging Tools
  • Profiling with NVIDIA Tools
  • Compiler-Specific Features
  • Porting Legacy Code


Mandar Gurav

Parallel Programmer, Trainer and Mentor




