In this tutorial, we will learn how to compile and run an OpenACC Fortran program with the PGI Fortran compiler (pgfortran). Let’s look at a sample vector addition code that is parallelized with the OpenACC parallel loop construct.
program vector_addition
  integer, dimension(10) :: A, B, C
  integer :: i
  do i = 1, 10
    A(i) = i*100
    B(i) = i
    C(i) = 1000
  enddo
  !$acc parallel loop
  do i = 1, 10
    C(i) = A(i) + B(i)
  enddo
  !$acc end parallel loop
  do i = 1, 10
    print *, C(i)
  enddo
end program vector_addition
We can compile this code for an NVIDIA GPU using the following command:
pgfortran -ta=tesla ./vector_addition.f90
Or
pgfortran -acc ./vector_addition.f90
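Here, the ‘-ta=tesla’ option tells the compiler to generate OpenACC code for an NVIDIA GPU target, while the ‘-acc’ option enables the OpenACC directives and leaves the choice of target to the compiler's default.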
We can add the ‘-Minfo=accel’ option to see details about the OpenACC code generated by the compiler.
pgfortran -ta=tesla -Minfo=accel ./vector_addition.f90
With the ‘-Minfo=accel’ option, the compiler prints the following output:
vector_addition:
     13, Generating Tesla code
         14, !$acc loop gang, vector(10) ! blockidx%x threadidx%x
     13, Generating implicit copyout(c(:)) [if not already present]
         Generating implicit copyin(b(:),a(:)) [if not already present]
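This output helps us understand the OpenACC code generated by the compiler. The leading numbers are line numbers in the compiled source file. The compiler generated GPU (Tesla) code for the parallel region, mapped the loop to gangs with a vector length of 10, and added implicit data clauses that copy A and B to the GPU (copyin) and copy C back to the host (copyout).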
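The implicit data movement reported above can also be written out by hand. The following is a minimal sketch, not the compiler's actual generated code: it assumes the same arrays as the example and adds explicit copyin and copyout clauses to the parallel loop directive. The program name vector_addition_explicit and the parameter n are introduced here only for illustration.

```fortran
program vector_addition_explicit
  implicit none
  integer, parameter :: n = 10   ! illustrative name, not in the original example
  integer, dimension(n) :: A, B, C
  integer :: i

  ! Initialize the input arrays on the host
  do i = 1, n
    A(i) = i*100
    B(i) = i
    C(i) = 1000
  enddo

  ! A and B are only read on the GPU, so they are copied in;
  ! C is only written, so it is copied back out to the host.
  !$acc parallel loop copyin(A, B) copyout(C)
  do i = 1, n
    C(i) = A(i) + B(i)
  enddo
  !$acc end parallel loop

  do i = 1, n
    print *, C(i)
  enddo
end program vector_addition_explicit
```

Compiled with the same options as before, this version should print the same ten sums. Writing the clauses explicitly mainly documents the intended data movement instead of relying on the compiler's analysis.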
The compiler has now generated the executable file (a.out), and we can run it with the ‘./a.out’ command. It produces the following output.
101
202
303
404
505
606
707
808
909
1010
If you see the above output on your screen, then you have successfully executed your OpenACC code on an NVIDIA GPU!
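The printed sums alone would look the same if the loop had run on the host, so it can help to confirm that the OpenACC runtime actually sees a GPU. The following is a small sketch under stated assumptions: it uses the acc_get_num_devices routine from the openacc module and assumes the acc_device_nvidia device-type constant is available, as it is with the PGI compiler. The program name check_device is made up for this example.

```fortran
program check_device
  use openacc            ! OpenACC runtime API module
  implicit none
  integer :: ndev

  ! Ask the runtime how many NVIDIA devices it can use.
  ! acc_device_nvidia is assumed to be provided by the compiler's openacc module.
  ndev = acc_get_num_devices(acc_device_nvidia)

  if (ndev > 0) then
    print *, 'NVIDIA devices visible to OpenACC:', ndev
  else
    print *, 'No NVIDIA device found by the OpenACC runtime'
  endif
end program check_device
```

This sketch would be compiled with OpenACC enabled, for example with the ‘-acc’ option used earlier, and run like the vector addition example.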