[IPOL discuss] possible bug in NL-means (ipol)

Sun Oct 14 13:38:28 CEST 2012

Yep, I tried to make that an atomic operation, but I couldn't get it
to work. Perhaps the statement is too complex to be protected as an
atomic operation? Or maybe I just did it wrong.
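
For reference, here is a minimal sketch of what the atomic variant might
look like. It is not the libdenoising.cpp code: aggregateAtomic, out,
contrib and overlap are hypothetical names for a 1-D toy version of the
aggregation step. As far as I understand the OpenMP rules, an update of the
form "x += expr" is allowed under atomic as long as x is a scalar lvalue and
expr does not refer to x; only the update of x itself is atomic, not the
evaluation of expr.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D stand-in for the aggregation step: patch p adds
// contrib[p] / totalWeight to the `overlap` consecutive output pixels
// starting at p, so neighbouring patches write to the same addresses.
void aggregateAtomic(std::vector<float>& out, const std::vector<float>& contrib,
                     float totalWeight, int overlap)
{
    const int nPatches = static_cast<int>(contrib.size());
    assert(overlap >= 1);
    assert(out.size() >= contrib.size() + static_cast<std::size_t>(overlap) - 1);

    float* o = out.data();  // plain pointer, so the atomic target is a simple scalar lvalue

    #pragma omp parallel for
    for (int p = 0; p < nPatches; ++p) {
        // The right-hand side is computed outside the atomic construct.
        const float delta = contrib[p] / totalWeight;
        for (int k = 0; k < overlap; ++k) {
            // Only this read-modify-write is protected.
            #pragma omp atomic
            o[p + k] += delta;
        }
    }
}
```

Compiled without OpenMP support the pragmas are ignored and the loop simply
runs serially. With OpenMP, the order in which the contributions are added
is not fixed, so the float sums may differ in the last bits from run to run.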

I also tried to make only that line (232) critical, but the overhead
was too much. It's better to use a single critical block around the
whole loop, since the overhead seems to be in entering/leaving a
critical block.
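
Another option, along the lines of the "local memory blocks" idea from the
original report, would be to let each thread accumulate into its own buffer
and enter a critical block only once, when merging. The following is only a
sketch under that assumption: aggregateLocal, out, contrib and overlap are
hypothetical names for a 1-D toy version, not the actual libdenoising.cpp
code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D stand-in for the aggregation step: patch p adds
// contrib[p] / totalWeight to the `overlap` consecutive output pixels
// starting at p.
void aggregateLocal(std::vector<float>& out, const std::vector<float>& contrib,
                    float totalWeight, int overlap)
{
    const int nPatches = static_cast<int>(contrib.size());
    const int nOut = static_cast<int>(out.size());
    assert(overlap >= 1);
    assert(nOut >= nPatches + overlap - 1);

    #pragma omp parallel
    {
        // Private accumulation buffer: no other thread can touch it.
        std::vector<float> local(out.size(), 0.0f);

        #pragma omp for nowait
        for (int p = 0; p < nPatches; ++p) {
            for (int k = 0; k < overlap; ++k) {
                local[p + k] += contrib[p] / totalWeight;
            }
        }

        // One critical block per thread: merge the private buffer into
        // the shared output.
        #pragma omp critical
        {
            for (int i = 0; i < nOut; ++i) {
                out[i] += local[i];
            }
        }
    }
}
```

The price is one extra image-sized buffer per thread plus a serial merge, so
whether this beats a critical block around the whole loop would have to be
measured.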

m

On Sun, Oct 14, 2012 at 1:32 PM, Miguel Colom <colom at cmla.ens-cachan.fr> wrote:
> Hi,
> I think the problem is at line 232:
>  fpO[ii][il] += fpODenoised[ii][iindex] / fTotalWeight;
>
> fpODenoised is private to each thread, so it's not problematic.
>
> Perhaps instead of protecting the loop with a critical section, it'd be
> faster to just declare the fpO update at line 232 as an atomic operation:
>
> #pragma omp atomic
> fpO[ii][il] += fpODenoised[ii][iindex] / fTotalWeight;
>
> Best,
> Miguel
>
>
> Quoting Mauricio Delbracio <mdelbra at gmail.com>:
>
>> Dear all,
>>
>> I may have found a bug in the NL-means IPOL implementation. The
>> problem is due to parallelism, specifically a race condition when
>> more than one thread updates shared values.
>>
>> In NL-means this happens in the filtered image (fpO), which is shared
>> and updated by all the threads. The image is decomposed into
>> overlapping patches, each of which is denoised separately, and the
>> patches are then aggregated into the final image (fpO). The problem is
>> that when fpO is updated, two overlapping patches running in
>> different threads may try to update the same memory address.
>>
>> This does not happen very often, since it is unlikely that two
>> overlapping patches take the same amount of time to reach that part
>> of the code. The probability increases if the search block (the
>> block where similar patches are searched) is very small.
>>
>> To eliminate the bug, the way I found (a.k.a. the easy way) is to
>> enclose the writing of fpO in an "omp critical" block. This means that
>> only one thread at a time can write to fpO. There is a loss in
>> execution time, and there are probably other ways of avoiding this
>> problem, e.g. by using local memory blocks and then updating at the
>> end? I don't know; I'm not really familiar with OpenMP.
>>
>> I attach a possible patch to libdenoising.cpp.
>>
>> The block of code I refer to is in libdenoising.cpp, lines 219-238:
>>
>> http://www.ipol.im/pub/art/2011/bcm_nlm/srcdoc/libdenoising_8cpp_source.html
>>
>> I also attach an image showing the effect of the bug (parameters
>> bloc=1, win=1) and some pixels that I manually marked.
>>
>> I appreciate any feedback (there are surely many OpenMP experts on
>> this mailing list), so let me know what you think.
>>
>> best
>> m
>>