OpenACC is a directive-based parallel programming model designed to simplify writing high-performance computing (HPC) applications. OpenACC code can run on a wide variety of platforms, including multicore CPUs and GPUs from different vendors.
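To make "directive-based" concrete, here is a minimal sketch of a Fortran loop offloaded with a single OpenACC directive. The program name, array names, and problem size are illustrative assumptions, not part of the course material.

```fortran
program saxpy_sketch
  implicit none
  integer, parameter :: n = 1000000   ! illustrative problem size
  real, allocatable :: x(:), y(:)
  real :: a
  integer :: i

  allocate(x(n), y(n))
  a = 2.0
  x = 1.0
  y = 3.0

  ! The directive below is a specially formatted comment. A compiler without
  ! OpenACC enabled ignores it and runs the loop serially.
  ! copyin(x): x is copied to the device only.
  ! copy(y):   y is copied to the device and back to the host afterwards.
  !$acc parallel loop copyin(x) copy(y)
  do i = 1, n
     y(i) = a * x(i) + y(i)
  end do

  ! Every element of y should now hold 2.0 * 1.0 + 3.0 = 5.0.
  print *, 'y(1) =', y(1), ' y(n) =', y(n)

  deallocate(x, y)
end program saxpy_sketch
```

Assuming a compiler with OpenACC support, this could be built with, for example, `nvfortran -acc saxpy_sketch.f90` or `gfortran -fopenacc saxpy_sketch.f90`. Built without those flags, the same source runs as ordinary serial Fortran, which is the sense in which the model is portable.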
Beginner Level – Fundamentals
- Introduction to OpenACC with Fortran
- First OpenACC Fortran Program – Parallel Loop on GPU
- Understanding Fortran Array Indexing in OpenACC
- Basic Parallel Loop Directive with DO Loops
- Loop Directive with Num_gangs, Num_workers, Vector_length
- Data Clauses – Copyin, Copyout, Copy with Fortran Arrays
- Data Clauses – Create, Present, Delete
- Array Sections and Partial Array Transfers
Beginner Level – Data Management
- Data Regions with Fortran Arrays
- Working with Allocatable Arrays
- Multi-dimensional Arrays in OpenACC
- Update Directive for Array Synchronization
- Managed Memory with Fortran
- Column-Major vs Row-Major Performance Considerations
Intermediate Level – Parallel Constructs
- Kernels vs Parallel Directives in Fortran
- Gang, Worker, Vector with Nested DO Loops
- Reduction Operations on Fortran Arrays
- Atomic Operations in Fortran
- Loop Collapse for Nested DO Loops
- Tile Clause with Multi-dimensional Arrays
- Cache Directive for Fortran Arrays
Intermediate Level – Advanced Directives
- Async and Wait with Fortran Arrays
- Multiple Arrays and Data Dependencies
- Routine Directive for Subroutines
- Routine Directive for Functions
- Sequential and Auto Clauses
- Host_data Directive in Fortran
Intermediate Level – Dynamic Memory
- Allocatable Arrays with OpenACC
- Dynamic Memory Management
- Resizing Arrays in Parallel Regions
- Memory Allocation on Device
Advanced Level – Optimization
- Optimizing Fortran Array Transfers
- Data Reuse with Fortran Arrays
- Minimizing Memory Movement
- Efficient Multi-dimensional Array Access
Advanced Level – Complex Patterns
- Matrix Operations with OpenACC
- Stencil Patterns in Fortran
- Parallel Reductions on 2D Arrays
- Transpose Operations
Performance Analysis
- Compiler Feedback for Fortran Programs
- Profiling Fortran OpenACC Applications
- Performance Bottleneck Identification
CPU Fallback
- Running Fortran OpenACC on Multicore CPUs
- Performance Portability in Fortran
- Target-specific Optimizations