Author: Mandar Gurav

  • Message Passing Interface (MPI) : MPI_Gather example

    This post talks about the gather operation in MPI using MPI_Gather. MPI_Gather is a collective operation in the Message Passing Interface (MPI) that collects data from every process in a communicator and assembles it, in rank order, on a single root process. Syntax for MPI_Gather: int MPI_Gather(const void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm…

  • Message Passing Interface (MPI) : MPI_Scatter example

    This post talks about an MPI function – MPI_Scatter. MPI_Scatter is a collective operation in the Message Passing Interface (MPI) used in parallel programming. It takes data from one process (the root) and distributes equal-sized chunks to every process in the communicator, the root included. Syntax for MPI_Scatter: int MPI_Scatter(const void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype,…

  • Message Passing Interface (MPI) : MPI_Reduce example

    This post talks about the reduction operation in MPI using MPI_Reduce. In a reduction operation, each process contributes its local data, and these values are combined on the root process according to the specified operation, such as summation, finding the maximum, or a custom-defined function. Syntax for MPI_Reduce: int MPI_Reduce(const void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root,…

  • Message Passing Interface (MPI) : MPI_Bcast example

    This post talks about an MPI broadcast function – MPI_Bcast. The broadcast operation sends data from one process (the root) to all other processes within a communicator, so every process ends up with the same data. Syntax for MPI_Bcast: int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm) Input Parameters Input/Output Parameters Example code – To…

  • Nvidia Nsight Systems : Profiling for CUDA code

    In this post we will look at the steps involved in profiling CUDA code using Nvidia Nsight Systems. Let’s take a simple code which performs some array operations. To compile this code, we can use the following command. Please note that I am using “-arch=sm_86”, which instructs the compiler to generate code for compute capability 8.6…