[IPOL discuss] possible bug in NL-means (ipol)

Sun Oct 14 17:44:53 CEST 2012

Can you test it on the same image?
Best,
JM

On Sun, Oct 14, 2012 at 5:29 PM, Miguel Colom <
Miguel.Colom at cmla.ens-cachan.fr> wrote:

> Dear all,
> Mauricio and I have found a way to modify the code to avoid the critical
> sections (SO mutex, slow) and use atomic operations (hardware lock, very
> fast).
>
> The source code as published currently in IPOL is absolutely buggy.
>
> So, to correct the code you have to put a "#pragma omp atomic" at the
> beginning of lines 229 (fpCount[il]++) and another for line 232
> (fpO[ii][il] += fpODenoised[ii][iindex] / fTotalWeight;) in libdenoising.cpp
>
> Best,
> Miguel
>
>
> Quoting Miguel Colom <Miguel.Colom at cmla.ens-cachan.**fr<Miguel.Colom at cmla.ens-cachan.fr>
> >:
>
>  Hi,
>> I think the problem is at line 232:
>>  fpO[ii][il] += fpODenoised[ii][iindex] / fTotalWeight;
>>
>> The fpODenoised is private for every thread, so it's not problematic.
>>
>> Perhaps instead of protecting the loop with a critical section, it'd be
>> faster to just declare the fpO update at line 232 as an atomic operation:
>>
>> #pragma omp atomic
>> fpO[ii][il] += fpODenoised[ii][iindex] / fTotalWeight;
>>
>> Best,
>> Miguel
>>
>> Quoting Mauricio Delbracio <mdelbra at gmail.com>:
>>
>>  Dear all,
>>>
>>> I may have found a bug in the NL-means IPOL implementation. The
>>> problem is due to parallelism, particularly when updating shared
>>> values by more than one threading (race condition, race bug).
>>>
>>> In NL-means this happens in the filtered image (fpO) which is shared
>>> and updated by all the threads. The image is decomposed in overlapped
>>> patches that are denoised each of them separately, and then they are
>>> aggregated in the final image (fpO). The problem is that when the
>>> update of fpO  takes place, may happen that two overlapped patches
>>> running in different threads try to update the same memory address.
>>>
>>> This not happen very frequently, since it is very unlikely that two
>>> overlapped patches take the same amount of time to arrive at that part
>>> of the code...The probability of this to happen increases if the
>>> search block (the block where similar patches are searched) is very
>>> small.
>>>
>>> To eliminate the bug, the way I found (a.k.a the easy way) is to
>>> enclose the writing of fpO in a "omp critical" block. This means that
>>> only one thread at at a time can write in fpO (there's a loss in
>>> execution time, probably there other ways of avoiding this problem,
>>> e.g. by using local memory blocks and then update at the end? i don't
>>> know I'm not really familiar with OpenMP).
>>>
>>> I attach a possible patch to libdenoising.cpp.
>>>
>>> The block of code I refer is in libdenoising.cpp between lines 219-238
>>> http://www.ipol.im/pub/art/**2011/bcm_nlm/srcdoc/**
>>> libdenoising_8cpp_source.html<http://www.ipol.im/pub/art/2011/bcm_nlm/srcdoc/libdenoising_8cpp_source.html>
>>>
>>> I also attach an image showing the effect of the bug (parameters
>>> bloc=1, win=1) and some pixels that I manually marked.
>>>
>>> I appreciate any feedback (there are surely many OpenMP experts  in
>>> this mailing-list), so let me know what you think.
>>>
>>> best
>>> m
>>>
>>>
>>
>> ______________________________**_________________
>> discuss mailing list
>> discuss at list.ipol.im
>> https://tools.ipol.im/mailman/**listinfo/discuss<https://tools.ipol.im/mailman/listinfo/discuss>
>>
>>
>
> ______________________________**_________________
> discuss mailing list
> discuss at list.ipol.im
> https://tools.ipol.im/mailman/**listinfo/discuss<https://tools.ipol.im/mailman/listinfo/discuss>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://tools.ipol.im/mailman/archive/discuss/attachments/20121014/d6abc8b7/attachment.html>