Compiling and Running OpenACC Fortran Codes using PGI Fortran

In this tutorial we will learn how to compile and run an OpenACC Fortran code with the PGI Fortran compiler. Let’s look at a sample vector addition code that is parallelized with the OpenACC ‘parallel loop’ construct.

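program vector_addition
  implicit none

  integer, dimension(10) :: A, B, C
  integer :: i

  do i = 1, 10            ! initialize the arrays on the host
    A(i) = i*100
    B(i) = i
    C(i) = 1000           ! placeholder, overwritten by the parallel loop
  enddo

  !$acc parallel loop
  do i = 1, 10            ! element-wise addition, offloaded by OpenACC
    C(i) = A(i) + B(i)
  enddo
  !$acc end parallel loop

  do i = 1, 10            ! print the result on the host
    print *, C(i)
  enddo

end program vector_addition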

We can compile this code for an Nvidia GPU using the following command:

pgfortran -ta=tesla ./vector_addition.f90

Or

pgfortran -acc ./vector_addition.f90

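Here, the ‘-ta=tesla’ option tells the compiler to generate OpenACC code for Nvidia GPUs. The ‘-acc’ option enables the OpenACC directives and leaves the choice of accelerator target to the compiler's default.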

We can add the ‘-Minfo=accel’ option to see how the compiler translated the OpenACC directives.

pgfortran -ta=tesla -Minfo=accel ./vector_addition.f90

With the ‘-Minfo=accel’ option, the compiler prints the following output:

vector_addition:
     13, Generating Tesla code
         14, !$acc loop gang, vector(10) ! blockidx%x threadidx%x
     13, Generating implicit copyout(c(:)) [if not already present]
         Generating implicit copyin(b(:),a(:)) [if not already present]

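This output helps us understand the OpenACC code generated by the compiler. The leading numbers are source line numbers. The compiler generated GPU code for the parallel region and mapped the loop to gangs and to a vector of length 10, which correspond to CUDA thread blocks and threads (blockidx%x and threadidx%x). It also added the data movement on its own: arrays a and b are copied to the GPU (copyin) and array c is copied back to the host (copyout).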
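
The data movement that the compiler reports as implicit can also be written out by hand. The following is a minimal sketch, assuming the same arrays as the sample program; the program name vector_addition_explicit and the parameter n are illustrative and do not appear in the original code. The copyin and copyout clauses mirror the choices shown in the compiler output.

```fortran
program vector_addition_explicit
  implicit none

  integer, parameter :: n = 10      ! illustrative name, not in the original
  integer, dimension(n) :: A, B, C
  integer :: i

  ! Initialize the inputs on the host
  do i = 1, n
    A(i) = i*100
    B(i) = i
  enddo

  ! A and B are only read on the device, so they are copied in;
  ! C is only written, so it is copied out after the loop.
  !$acc parallel loop copyin(A, B) copyout(C)
  do i = 1, n
    C(i) = A(i) + B(i)
  enddo

  ! Print the result on the host
  do i = 1, n
    print *, C(i)
  enddo

end program vector_addition_explicit
```

Compiled with the same options as before, this version should report the copyin and copyout operations without the word "implicit", since the clauses are now given in the source.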

The compiler writes the executable to a.out, which we can run with the ‘./a.out’ command. It produces the following output:

          101
          202
          303
          404
          505
          606
          707
          808
          909
         1010

If you see the above output on your screen, then you have successfully executed your OpenACC code on the Nvidia GPU!
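
The printed values by themselves do not show where the loop ran, because the host would compute the same numbers. As a rough additional check, the OpenACC runtime API can report whether an accelerator device is visible to the program. The following is a minimal sketch, assuming the code is compiled with OpenACC enabled (for example with ‘-acc’) so that the openacc module is available; the program name check_device is hypothetical.

```fortran
program check_device
  use openacc                       ! OpenACC runtime API module
  implicit none

  integer :: ndev

  ! Count the devices that are not the host (accelerators such as GPUs)
  ndev = acc_get_num_devices(acc_device_not_host)

  if (ndev > 0) then
    print *, 'Accelerator devices found:', ndev
  else
    print *, 'No accelerator device found'
  endif

end program check_device
```

A positive count only shows that a device is available to the OpenACC runtime. It does not prove by itself that a particular loop was offloaded.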