Author: Mandar Gurav

  • OpenMP C/C++: Hello World!

    This is the first OpenMP program one can write to understand the parallelization process using OpenMP. First, let us find out how to compile and execute this code. The ‘-fopenmp’ option here requests the compiler to generate parallel threads for the given code using OpenMP. If we do not provide this option, the compiler will ignore all…

  • Profiling OpenACC Code using NVPROF

    Profiling your OpenACC code on a remote system can be tricky. We often need to profile code in a cluster environment, where jobs must be submitted through a job scheduler. In such scenarios, command-line profiling comes in handy. This tutorial provides some usage examples for NVIDIA’s command-line profiler…

  • Compiling and Running OpenACC Fortran Codes using PGI Fortran

    In this tutorial we will learn how to compile and execute an OpenACC Fortran code using the PGI Fortran compiler. Let’s look at a sample vector addition code parallelized using the OpenACC Fortran parallel loop construct. We can compile this code for an NVIDIA GPU using either of the following commands. Here, the ‘-ta=tesla’ option informs the compiler that compiler…

  • Profiling Serial C codes using GNU’s profiler – gprof

    This post covers the steps to profile a C code using GNU’s profiler – gprof. Profiling your serial code is one of the most important steps in writing parallel codes. We use profilers to find the most time-consuming parts of the code. Let us consider the following sample C code. For this code, we…