Tag: Beginner

  • CUDA: Error Handling

    by

    in

    Robust CUDA programs require systematic error checking since GPU operations can fail silently. When you start a kernel on the GPU, it runs immediately without giving an error code if something goes wrong. Using cudaError_t, cudaGetLastError(), and error-checking macros helps catch problems like running out of memory, bad launch settings, or trying to access memory…

  • CUDA: Compilation and Execution

    by

    in

    CUDA programs require special compilation to generate both CPU and GPU code. The nvcc tool helps by splitting the code into two parts: host (C++) and device (PTX/SASS). Then it combines them. Using the right compiler flags is important, especially the -arch flag, which tells the program which GPUs to run on. This makes your…

  • CUDA: Thread Indexing and IDs

    by

    in

    Thread indexing is how each parallel thread determines which data element to process. Computing a unique global thread ID from threadIdx, blockIdx, and blockDim enables thousands of threads to safely access different array elements without conflicts. This way of connecting threads to data is very important for all kinds of GPU tasks. It works for…

  • CUDA: Hello World Kernel

    by

    in

    Our first CUDA kernel helps connect CPU and GPU programming. It runs a simple function using many parallel threads. This is different from normal “Hello World” programs because it shows true parallelism, where hundreds or thousands of threads work at the same time. To understand how this works, you need to know some basics: These…

  • CUDA Programming Model

    by

    in

    The CUDA programming model splits work between two parts: the CPU (host) and the GPU (device). The CPU controls what happens in the program and sends tasks called kernels to the GPU for processing. To write good CUDA programs, you need to understand how these two parts work together and how tasks are organized into…