[IPOL discuss] possible bug in NL-means (ipol)

Sun Oct 14 15:51:43 CEST 2012

paI've tried to compile the code from IPOL and the first I get is a  
compilation error, because the Makefile seems to be wrong. It should  
be corrected as soon as possible!

So I created a new Makefile and then I've put the #pragma omp atomic  
before line 232. With the new Makefile it compiles and executes in a  
highly parallelized  without any significant loss of speed because of  
the atomic operation.

I copy below the Makefile I wrote to compile the program (only NL-means):

COPT      = -O99
CFLAGS    = $(COPT)
CSTRICT   = -Wall -Wextra -ansi
CXXOPT    = -O99
CXXFLAGS  = $(CXXOPT)
CXXSTRICT = -Wall -Wextra -ansi
LDFLAGS   = -lpng -lgomp -fopenmp
EXEC      = nl-means-miguel

default: $(EXEC)
all: $(EXEC)

# ------- C files -------
./mt19937ar.o: ./mt19937ar.c ./mt19937ar.h
	$(CC) $(CFLAGS) -c ./mt19937ar.c -o ./mt19937ar.o

./io_png.o: ./io_png.c ./io_png.h
	$(CC) $(CFLAGS) -c ./io_png.c -o ./io_png.o

# ------- C++ files -------
./libauxiliar.o: ./libauxiliar.cpp ./libauxiliar.h
	$(CXX) $(CXXFLAGS) -c ./libauxiliar.cpp -o ./libauxiliar.o

./nlmeans_ipol.o: ./nlmeans_ipol.cpp
	$(CXX) $(CXXFLAGS) $(LDFLAGS) -c ./nlmeans_ipol.cpp -o ./nlmeans_ipol.o

./libdenoising.o: ./libdenoising.cpp ./libdenoising.h
	$(CXX) $(CXXFLAGS) $(LDFLAGS) -c ./libdenoising.cpp -o ./libdenoising.o

# ------- Main -------
$(EXEC):  ./mt19937ar.o ./io_png.o ./libauxiliar.o ./nlmeans_ipol.o  
./libdenoising.o
	$(CXX)  ./mt19937ar.o ./io_png.o ./libauxiliar.o ./nlmeans_ipol.o  
./libdenoising.o $(LDFLAGS) -o $(EXEC)

lint:
	$(MAKE) CFLAGS="$(CFLAGS) $(CSTRICT)" CXXFLAGS="$(CXXFLAGS) $(CXXSTRICT)"

clean:
	rm -f *.o

distclean: clean
	rm -f $(EXEC)

Quoting Mauricio Delbracio <mdelbra at gmail.com>:

> Yep, I tried to do an atomic operation that but I couldn't get it
> work. The problem is that this sentence seems to be too complex to be
> protected as an atomic operation? Or maybe I just did it wrong.
>
> I also I tried to make only that line (232) critical, but the overhead
> was too much (It's better to do only one whole critical block for the
> whole loop since it seems that the overhead is in starting/ending a
> critical block).
>
> m
>
>
>
> On Sun, Oct 14, 2012 at 1:32 PM, Miguel Colom  
> <colom at cmla.ens-cachan.fr> wrote:
>> Hi,
>> I think the problem is at line 232:
>>  fpO[ii][il] += fpODenoised[ii][iindex] / fTotalWeight;
>>
>> The fpODenoised is private for every thread, so it's not problematic.
>>
>> Perhaps instead of protecting the loop with a critical section, it'd be
>> faster to just declare the fpO update at line 232 as an atomic operation:
>>
>> #pragma omp atomic
>> fpO[ii][il] += fpODenoised[ii][iindex] / fTotalWeight;
>>
>> Best,
>> Miguel
>>
>>
>> Quoting Mauricio Delbracio <mdelbra at gmail.com>:
>>
>>> Dear all,
>>>
>>> I may have found a bug in the NL-means IPOL implementation. The
>>> problem is due to parallelism, particularly when updating shared
>>> values by more than one threading (race condition, race bug).
>>>
>>> In NL-means this happens in the filtered image (fpO) which is shared
>>> and updated by all the threads. The image is decomposed in overlapped
>>> patches that are denoised each of them separately, and then they are
>>> aggregated in the final image (fpO). The problem is that when the
>>> update of fpO  takes place, may happen that two overlapped patches
>>> running in different threads try to update the same memory address.
>>>
>>> This not happen very frequently, since it is very unlikely that two
>>> overlapped patches take the same amount of time to arrive at that part
>>> of the code...The probability of this to happen increases if the
>>> search block (the block where similar patches are searched) is very
>>> small.
>>>
>>> To eliminate the bug, the way I found (a.k.a the easy way) is to
>>> enclose the writing of fpO in a "omp critical" block. This means that
>>> only one thread at at a time can write in fpO (there's a loss in
>>> execution time, probably there other ways of avoiding this problem,
>>> e.g. by using local memory blocks and then update at the end? i don't
>>> know I'm not really familiar with OpenMP).
>>>
>>> I attach a possible patch to libdenoising.cpp.
>>>
>>> The block of code I refer is in libdenoising.cpp between lines 219-238
>>>
>>> http://www.ipol.im/pub/art/2011/bcm_nlm/srcdoc/libdenoising_8cpp_source.html
>>>
>>> I also attach an image showing the effect of the bug (parameters
>>> bloc=1, win=1) and some pixels that I manually marked.
>>>
>>> I appreciate any feedback (there are surely many OpenMP experts  in
>>> this mailing-list), so let me know what you think.
>>>
>>> best
>>> m
>>>
>>
>>
>> _______________________________________________
>> discuss mailing list
>> discuss at list.ipol.im
>> https://tools.ipol.im/mailman/listinfo/discuss
> _______________________________________________
> discuss mailing list
> discuss at list.ipol.im
> https://tools.ipol.im/mailman/listinfo/discuss
>