Learn Parallel Programming

Sharing knowledge on Parallel Programming

Tag: CUDA

What is parallel computing and why does it matter?

by Mandar Gurav in CUDA

Parallel computing means executing many calculations simultaneously: a large problem is divided into smaller, independent pieces that run at the same time on multiple processors. A modern GPU like the NVIDIA RTX 5060 Ti contains 4,608 CUDA cores, all of which can work on your data in parallel. Parallel computing is not a niche…
CUDA: Vector Addition Example

by Mandar Gurav in CUDA

Vector addition (C[i] = A[i] + B[i]) is our first parallel CUDA program, integrating memory management, data transfer, kernel execution, and error handling. This complete example demonstrates the full CUDA workflow: allocate device memory with cudaMalloc(), copy data with cudaMemcpy(), launch the kernel, retrieve the results, verify correctness, and free the allocated memory. Refer to following…
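The workflow described in this excerpt could be sketched roughly as follows. This is a minimal illustration, not the code from the full post: the names vectorAdd and vectorAddOnGpu, the block size of 256 threads, and the test data are all assumptions chosen for the example.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread adds one pair of elements; the bounds check covers the
// last block when n is not a multiple of the block size.
__global__ void vectorAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Hypothetical helper: computes c = a + b on the GPU.
// Returns false if the sizes differ or any CUDA call fails.
bool vectorAddOnGpu(const std::vector<float>& a, const std::vector<float>& b,
                    std::vector<float>& c) {
    if (a.size() != b.size()) return false;
    const int n = static_cast<int>(a.size());
    c.assign(a.size(), 0.0f);
    if (n == 0) return true;  // nothing to launch
    const size_t bytes = a.size() * sizeof(float);

    // 1. Allocate device memory.
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    bool ok = cudaMalloc(&dA, bytes) == cudaSuccess &&
              cudaMalloc(&dB, bytes) == cudaSuccess &&
              cudaMalloc(&dC, bytes) == cudaSuccess;

    // 2. Copy the inputs from host to device.
    ok = ok && cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
            && cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    // 3. Launch enough blocks to cover all n elements.
    if (ok) {
        const int threads = 256;  // assumed block size
        const int blocks = (n + threads - 1) / threads;
        vectorAdd<<<blocks, threads>>>(dA, dB, dC, n);
        ok = cudaGetLastError() == cudaSuccess;
    }

    // 4. Copy the result back; this cudaMemcpy waits for the kernel to finish.
    ok = ok && cudaMemcpy(c.data(), dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    // 5. Free device memory (cudaFree on a null pointer is a no-op).
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return ok;
}

int main() {
    const int n = 1 << 20;
    std::vector<float> a(n), b(n), c;
    for (int i = 0; i < n; ++i) { a[i] = 1.0f * i; b[i] = 2.0f * i; }

    if (!vectorAddOnGpu(a, b, c)) {
        std::fprintf(stderr, "vector addition failed on the GPU\n");
        return 1;
    }
    // Verify against the same sum computed on the CPU.
    for (int i = 0; i < n; ++i) {
        if (std::fabs(c[i] - (a[i] + b[i])) > 1e-5f) {
            std::fprintf(stderr, "mismatch at index %d\n", i);
            return 1;
        }
    }
    std::printf("verified %d elements\n", n);
    return 0;
}
```

Assuming the file is saved as vector_add.cu, it should build with something like `nvcc vector_add.cu -o vector_add`.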
CUDA: Device Query

by Mandar Gurav in CUDA

Calling cudaGetDeviceProperties() lets your program learn about the GPU’s features. It reports things like the GPU’s compute capability, how much memory it has, and how many multiprocessors it has. This information helps you write CUDA code that works well on different types of GPUs. For example, it can help you decide the…
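As a rough illustration of such a query, the sketch below lists a few fields of cudaDeviceProp for every visible GPU. The helper name printDeviceInfo and the choice of fields are assumptions made for this example, not taken from the full post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints a few properties of every visible GPU.
// Returns the number of devices, or -1 if the device count cannot be read.
int printDeviceInfo() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return -1;
    }
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, d) != cudaSuccess) continue;
        std::printf("Device %d: %s\n", d, prop.name);
        std::printf("  Compute capability:    %d.%d\n", prop.major, prop.minor);
        std::printf("  Global memory:         %.1f GiB\n",
                    prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
        std::printf("  Multiprocessors:       %d\n", prop.multiProcessorCount);
        std::printf("  Max threads per block: %d\n", prop.maxThreadsPerBlock);
    }
    return count;
}

int main() {
    return printDeviceInfo() < 0 ? 1 : 0;
}
```

A value such as maxThreadsPerBlock can then be used as an upper bound when choosing a launch configuration, rather than hard-coding one.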
CUDA: Error Handling

by Mandar Gurav in CUDA

Robust CUDA programs require systematic error checking, since GPU operations can fail silently. A kernel launch is asynchronous: it returns control to the host immediately and does not return an error code, so a failure goes unnoticed unless you check for it. Using cudaError_t, cudaGetLastError(), and error-checking macros helps catch problems like running out of memory, bad launch settings, or trying to access memory…
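One common way to apply this is a small checking macro wrapped around every runtime call. The sketch below is only an illustration: CUDA_CHECK is a hypothetical name (it is not part of the CUDA runtime), and the fill kernel and fillAndReadFirst helper are invented for the example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical macro: aborts with file and line information if a CUDA
// runtime call does not return cudaSuccess.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error at %s:%d: %s\n",         \
                         __FILE__, __LINE__,                          \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Illustrative kernel: writes the same value into every element.
__global__ void fill(int* out, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

// Hypothetical helper: fills n integers on the GPU and returns the first one.
int fillAndReadFirst(int n, int value) {
    int* d = nullptr;
    CUDA_CHECK(cudaMalloc(&d, n * sizeof(int)));

    fill<<<(n + 255) / 256, 256>>>(d, n, value);
    CUDA_CHECK(cudaGetLastError());       // reports invalid launch settings
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while running

    int first = 0;
    CUDA_CHECK(cudaMemcpy(&first, d, sizeof(int), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d));
    return first;
}

int main() {
    std::printf("first element: %d\n", fillAndReadFirst(1024, 7));
    return 0;
}
```

Checking cudaGetLastError() right after the launch and cudaDeviceSynchronize() afterwards separates launch failures from failures that occur during execution. Exiting on the first error is a design choice that suits small examples; a larger program might return the error to the caller instead.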
CUDA: Compilation and Execution

by Mandar Gurav in CUDA

CUDA programs require special compilation to generate both CPU and GPU code. The nvcc compiler driver splits the source into two parts, host code (C++) and device code (PTX/SASS), compiles each, and then combines them into a single executable. Using the right compiler flags is important, especially the -arch flag, which selects the GPU architecture the device code is built for. This makes your…
