[IPOL discuss] Nested openMP loops
Miguel Colom
colom at cmla.ens-cachan.fr
Tue Jun 11 10:07:51 CEST 2013
Dear all,
We've detected some problems with demos that use nested loops with OpenMP.
Nesting parallel loops with OpenMP is generally a bad idea: when
nested parallelism is enabled, every thread of the outer loop starts
its own team of threads for the inner loop.
For example:
#pragma omp parallel for
for (i = 1; i <= 10; i++) {
    (...)
    #pragma omp parallel for
    for (j = 1; j <= 10; j++) {
        (...)
    }
}
With one thread per iteration, this would create 10*10=100 threads if
we configure OpenMP to allow nested parallelism. Of course, that
doesn't make any sense since our server has just 32 cores.
Moreover, if the inner iterations execute quickly, the execution time
of an iteration may be less than the cost of assigning/releasing a
thread.
In general, a good solution is to rewrite the code to use a single
parallelized loop and precomputed indices. This way OpenMP can keep
the number of threads it uses in line with the number of cores, which
gives the best performance.
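As a rough illustration of the single-loop rewrite, here is a minimal
sketch in C. The names fill_flat and process are hypothetical, and
process only stands in for the real loop body. The indices are
recomputed from a flat counter here rather than read from a
precomputed table, which is an assumption made to keep the example
short.

```c
#include <stddef.h>

/* Hypothetical per-element work; stands in for the "(...)" loop body. */
static double process(int i, int j)
{
    return (double)i * 10.0 + (double)j;
}

/* Fill out[rows * cols] using one parallel loop over a flat index k.
 * The outer and inner indices are recovered from k, so OpenMP sees a
 * single loop of rows * cols iterations and uses a single team of
 * threads. Without OpenMP support the pragma is ignored and the loop
 * simply runs sequentially. */
void fill_flat(double *out, int rows, int cols)
{
    #pragma omp parallel for
    for (int k = 0; k < rows * cols; k++) {
        int i = k / cols;   /* index of the former outer loop */
        int j = k % cols;   /* index of the former inner loop */
        out[k] = process(i, j);
    }
}
```

Calling fill_flat(out, 10, 10) covers the same 100 (i, j) pairs as the
nested version, but the work is shared among one team of threads. When
the two loops are perfectly nested, the collapse(2) clause of OpenMP
3.0 and later may achieve a similar effect without manual index
arithmetic.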
Best,
Miguel