Controlling the thread count is key to getting good performance on multi-core systems. OpenMP offers three ways to set it: the num_threads clause, the OMP_NUM_THREADS environment variable, and runtime library functions such as omp_set_num_threads(). Understanding how these interact lets a program adapt to different hardware configurations and workload requirements.
Refer to the following diagram for the thread count control hierarchy.

Core Concept
OpenMP resolves the thread count in a fixed order of precedence. The num_threads clause on a parallel directive has the highest priority: it applies to that region only and overrides every other setting. The omp_set_num_threads() function changes the default for subsequent parallel regions in the running program. The OMP_NUM_THREADS environment variable sets the initial default for the process at startup, not for the whole system. If none of these is set, the default is implementation-defined; most runtimes use the number of available logical processors. In every case the value is a request, and the runtime may grant fewer threads.
Key Points
- num_threads Clause: Highest priority, directive-specific control
- OMP_NUM_THREADS: Environment variable for global default
- omp_set_num_threads(): Programmatic default setting
- omp_get_num_threads(): Query actual thread count (call from parallel region)
- omp_get_max_threads(): Query default thread count for next parallel region
- Dynamic Adjustment: When enabled (omp_set_dynamic() or OMP_DYNAMIC), the runtime may grant fewer threads than requested, for example if system resources are limited
Code Example
Demonstrating various methods to control and query thread count
OpenMP Implementation:
#include <stdio.h>
#include <omp.h>

int main(void) {
    // Query default settings
    printf("Default max threads: %d\n", omp_get_max_threads());
    printf("Available processors: %d\n\n", omp_get_num_procs());

    // Method 1: Default behavior (uses OMP_NUM_THREADS or system default)
    #pragma omp parallel
    {
        #pragma omp master
        printf("Region 1 - Default: %d threads\n", omp_get_num_threads());
    }

    // Method 2: num_threads clause (highest priority)
    #pragma omp parallel num_threads(2)
    {
        #pragma omp master
        printf("Region 2 - num_threads(2): %d threads\n",
               omp_get_num_threads());
    }

    // Method 3: Runtime function
    omp_set_num_threads(6);
    #pragma omp parallel
    {
        #pragma omp master
        printf("Region 3 - omp_set_num_threads(6): %d threads\n",
               omp_get_num_threads());
    }

    // Method 4: num_threads overrides omp_set_num_threads
    #pragma omp parallel num_threads(3)
    {
        #pragma omp master
        printf("Region 4 - num_threads(3) override: %d threads\n",
               omp_get_num_threads());
    }

    return 0;
}
Expected Output (with OMP_NUM_THREADS=4 on a machine with 8 processors):
Default max threads: 4
Available processors: 8
Region 1 - Default: 4 threads
Region 2 - num_threads(2): 2 threads
Region 3 - omp_set_num_threads(6): 6 threads
Region 4 - num_threads(3) override: 3 threads
Usage & Best Practices
When to Use
- Adapting to different hardware configurations
- Testing scalability with varying thread counts
- Limiting threads for memory-constrained applications
- Preventing oversubscription in nested parallelism
Best Practices
- Use num_threads for fine-grained control per region
- Set OMP_NUM_THREADS for application-wide defaults
- Query omp_get_num_procs() to avoid oversubscription
- Call omp_get_num_threads() only from within parallel regions
- Test performance with different thread counts to find optimal configuration
Common Mistakes
- Calling omp_get_num_threads() outside a parallel region: it always returns 1 there, so use omp_get_max_threads() to query the default instead
Key Takeaways
Summary:
- Thread count control hierarchy: num_threads > omp_set_num_threads() > OMP_NUM_THREADS > system default
- Use omp_get_num_threads() within parallel regions to query actual count
- Use omp_get_max_threads() outside parallel regions to query default
- Proper thread management ensures optimal resource utilization
Quick Reference
Thread Count Precedence:
1. num_threads(N) clause [Highest]
2. omp_set_num_threads(N)
3. OMP_NUM_THREADS=N
4. System default [Lowest]
Query Functions:
omp_get_num_threads() // Inside parallel region
omp_get_max_threads() // Default for next region
omp_get_num_procs() // Available processors