[IPOL discuss] possible bug in NL-means (ipol)
Miguel Colom
colom at cmla.ens-cachan.fr
Sun Oct 14 17:29:38 CEST 2012
Dear all,
Mauricio and I have found a way to modify the code to avoid the
critical sections (SO mutex, slow) and use atomic operations (hardware
lock, very fast).
The source code as published currently in IPOL is absolutely buggy.
So, to correct the code you have to put a "#pragma omp atomic" at the
beginning of lines 229 (fpCount[il]++) and another for line 232
(fpO[ii][il] += fpODenoised[ii][iindex] / fTotalWeight;) in
libdenoising.cpp
Best,
Miguel
Quoting Miguel Colom <Miguel.Colom at cmla.ens-cachan.fr>:
> Hi,
> I think the problem is at line 232:
> fpO[ii][il] += fpODenoised[ii][iindex] / fTotalWeight;
>
> The fpODenoised is private for every thread, so it's not problematic.
>
> Perhaps instead of protecting the loop with a critical section, it'd
> be faster to just declare the fpO update at line 232 as an atomic
> operation:
>
> #pragma omp atomic
> fpO[ii][il] += fpODenoised[ii][iindex] / fTotalWeight;
>
> Best,
> Miguel
>
> Quoting Mauricio Delbracio <mdelbra at gmail.com>:
>
>> Dear all,
>>
>> I may have found a bug in the NL-means IPOL implementation. The
>> problem is due to parallelism, particularly when updating shared
>> values by more than one threading (race condition, race bug).
>>
>> In NL-means this happens in the filtered image (fpO) which is shared
>> and updated by all the threads. The image is decomposed in overlapped
>> patches that are denoised each of them separately, and then they are
>> aggregated in the final image (fpO). The problem is that when the
>> update of fpO takes place, may happen that two overlapped patches
>> running in different threads try to update the same memory address.
>>
>> This not happen very frequently, since it is very unlikely that two
>> overlapped patches take the same amount of time to arrive at that part
>> of the code...The probability of this to happen increases if the
>> search block (the block where similar patches are searched) is very
>> small.
>>
>> To eliminate the bug, the way I found (a.k.a the easy way) is to
>> enclose the writing of fpO in a "omp critical" block. This means that
>> only one thread at at a time can write in fpO (there's a loss in
>> execution time, probably there other ways of avoiding this problem,
>> e.g. by using local memory blocks and then update at the end? i don't
>> know I'm not really familiar with OpenMP).
>>
>> I attach a possible patch to libdenoising.cpp.
>>
>> The block of code I refer is in libdenoising.cpp between lines 219-238
>> http://www.ipol.im/pub/art/2011/bcm_nlm/srcdoc/libdenoising_8cpp_source.html
>>
>> I also attach an image showing the effect of the bug (parameters
>> bloc=1, win=1) and some pixels that I manually marked.
>>
>> I appreciate any feedback (there are surely many OpenMP experts in
>> this mailing-list), so let me know what you think.
>>
>> best
>> m
>>
>
>
> _______________________________________________
> discuss mailing list
> discuss at list.ipol.im
> https://tools.ipol.im/mailman/listinfo/discuss
>
More information about the discuss
mailing list