[IPOL discuss] Best Practices for Scientific Computing

Miguel Colom colom at cmla.ens-cachan.fr
Mon Oct 8 14:16:55 CEST 2012


I agree, you can hide the use of pointers, but trying to avoid the  
low-level characteristic of the language doesn't turns it into  
high-level.

Let's look at some real IPOL C/C++ source code. For example, the  
Quasi-Euclidean Epipolar Rectification article.

If we look, for example, at the MissStereo/src/stereoAC/main.cpp at  
function principalComponents, the code is the following:

/// Matrix mapping patch to principal components
static LWImage<float> principalComponents() {
#include "dataStereo/pca_basis.dat"
     int psize = sizeof(pca)/sizeof(*pca);
     int npcskept = sizeof(pca[0])/sizeof(*pca[0]);
     if(bIgnoreContrast)
         --npcskept;
     LWImage<float> PCs = alloc_image<float>(npcskept, psize);
     float* pc = PCs.data;
     for(int i=0; i < psize; i++)
         for(int j=0; j<npcskept; j++)
             *pc++ = pca[i][j+(bIgnoreContrast? 1: 0)];
     return PCs;
}

The line "int psize = sizeof(pca)/sizeof(*pca);" includes:
- Getting the size of a pointer.
- Dereferencing the pointer and computing its size.
- Make a division to know the size of the pointer.

I think this is quite low-level!
The same for "int npcskept = sizeof(pca[0])/sizeof(*pca[0]);"

Moreover, if we look carefully at the function, we see that it creates  
an object  PCs that is returned at the end. But inside a function,  
this local object should be destroyed after the return, and therefore  
it shouldn't be valid anymore outside the function.

It works because the function is declared to be "static", so the  
variables at the heap memory will be still allocated when it finished.

So, finally this function needs:
- Non-evident pointer arithmetic.
- Knowledge about how the heap memory works.
- Knowledge about the static functions and how they're stored in the memory.
- Dynamic memory allocation.
- Keep in mind that the allocated memory should be destroyed  
afterward, because of the leak of a garbage collector.

This function is absolutely correct and of course it work perfectly.  
But it's clear that you need some low-level details to understand it  
or try to modify it.

And I think that is very easy to find examples like that in most of  
the C/C++ code.

Best,
Miguel

Quoting Pascal Monasse <monasse at imagine.enpc.fr>:

> No, in many situations you do not have to manipulate pointers in C++. One
> example:
>
> #include <vector>
> #include <iostream>
> int main() {
> 	std::vector<int> vec;
> 	for(int i=0; i<10000; i++)
> 		vec.push_back(i);
> 	for(int i=0; i<vec.size(); i++)
> 		std::cout << vec[i] << std::endl;
> }
>
> It builds an array of 10000 integers and I do not have to care about
> allocation, delete, bound-checking...
>
> Pascal
>
> On Monday, October 08, 2012 12:07:19 PM Miguel Colom wrote:
>> Well, nowadays the terms "high-level" and "low-level" are relative and
>> they're redefined with the characteristics of the new languages.
>>
>> Of course, when C was developed, it was considered a very high-level
>> language, because the alternative was to use assembler code. Even
>> assembler was considered once high-level compared to coding directly
>> with ones and zeros using punched cards!
>>
>> It's true that it's possible to use compiler flags to check the bounds
>> of the arrays, but you have still to deal with memory pointers and
>> memory allocation.
>>
>> A bad pointer allocation in a compiled C/C++ can happen easily and
>> errors in the memory allocation are also common (a double free, a bad
>> pointer in the free, ...)
>> Since most operations involve pointers, there's plenty of situations
>> in which they can lead to bad behavior, sometimes in a silent and
>> difficult to trace way.
>>
>> Of course, there're compiler flags to check array bounds, like Pascal
>> mentioned,  and also to check the memory assignment/freeing/region
>> overlapping/etc, such as Valdrind and others.
>>
>> But, in my opinion, the very existence of these tools means that the
>> language is low-level. And even the most simple programs need to use
>> pointer arithmetic and memory allocation.
>>
>> However, I agree that for us in IPOL C/C++ is the best alternative,
>> for the reason that have been already discussed.
>>
>> Best,
>> Miguel
>>
>> Quoting Pascal Monasse <monasse at imagine.enpc.fr>:
>> > Hi all,
>> >
>> > I agree totally with Pascal, it is a good summary of the constraints we
>> > have in the choice of language.
>> >
>> > I also think that C++ permits high-level programming, depending on
>> > your choice
>> > of structures. For the most important structure, an array, you can use it
>> > as in C, with explicit memory allocation and deletion, but you can also
>> > use the vector class of STL, which has an optional bound checking (though
>> > the way to enable it is not in the standard):
>> > - automatic in debug mode under Windows with Visual C++
>> > - with compiler flag -D_GLIBCXX_DEBUG with gcc
>> > These options are not on by default, because you always have to choose:
>> > 1. bound-checking (slow, safe)
>> > 2. no bound-checking (fast, dangerous)
>> > Whatever the language, you cannot have fast+safe. It is true that most of
>> > the time option 2 is chosen, but the programmer *can* choose option 1. In
>> > this sense, C++ can be used as a high level language.
>> >
>> > Best,
>> > Pascal
>> >
>> > On Friday, October 05, 2012 08:12:53 PM Pascal Getreuer wrote:
>> >> Dear all,
>> >>
>> >> > Write code in the highest-level language possible.: Are we doing that?
>> >>
>> >> Reading between the lines, Jean-Michel's question highlights a
>> >> limitation on IPOL.  By requiring C/C++, we prevent authors from
>> >> choosing the highest-level language for the application (MATLAB,
>> >> Mathematica, Java, or Python, for example).  We are pretty much making
>> >> the choice for them, C++ notwithstanding.
>> >>
>> >> It would be really awesome if some day IPOL could accept code in more
>> >> languages.  There are challenges though.
>> >>
>> >> Foremost, languages should preferably be well-known, portable, and
>> >> free.  If a language is not free (MATLAB...) or not portable
>> >> (Microsoft VBA...), it locks out potential IPOL readers from using it.
>> >>
>> >>  And if the language is less well-known, it would be harder to find
>> >>
>> >> reviewers.
>> >>
>> >> There are also technical considerations.  Languages need to be stable
>> >> and efficient on the demo servers.  Specific maintenance to individual
>> >> programs is not possible.  This reminds me of the discussions we have
>> >> had about libraries :-\.
>> >>
>> >> So the answer is "no", we do not live up to the ideal of allowing "the
>> >> highest-level language possible."  It is still a pretty good balance
>> >> of the philosophies.  The Software Guidelines advises C/C++ by default
>> >> and to consider other languages only if they are ISO-standardized---I
>> >> agree this is the way to go.
>> >>
>> >> Best,
>> >> Pascal
>> >>
>> >> On Fri, Oct 5, 2012 at 4:23 PM, Miguel Colom <colom at cmla.ens-cachan.fr>
>> >
>> > wrote:
>> >> > Quoting Pierre Moulon <pmoulon at gmail.com>:
>> >> >> And for me C++ is already a High-Level language.
>> >> >> It offer high level programming option (concepts, traits, C++11:
>> >> >> functors,
>> >> >> lambda), but if you want you can go to the deep/dark side (intrinsic,
>> >> >> pointer arithmetic...).
>> >> >
>> >> > Well, for me it's not a high-level language, because:
>> >> > - The programmer has to deal with the calls of the memory
>> >> > allocation/freeing and pointers to actual memory zones are returned. -
>> >> > The references to the objects/data are just pointers that contain
>> >> > actual
>> >> > memory addresses.
>> >> >
>> >> > In my opinion, this is enough to say C/C++ is low-level.
>> >> >
>> >> > Think, for example, in the case of declaring an array of 100 positions.
>> >> > If you try to access position 123, a Python script would say that
>> >> > you're
>> >> > trying to access a position beyond the bounds of the array. In C/C++ it
>> >> > would cause a SegFault exception, in case it is detected by the OS. If
>> >> > the buffer overwrite does not produce a segment violation, it wouldn't
>> >> > be
>> >> > either detected.
>> >> >
>> >> > The majority of the errors in a C/C++ program are caused by trying to
>> >> > access memory not allocated. This gives an idea of the low-level or how
>> >> > close the language is to the CPU.
>> >> >
>> >> > Indeed, it's not casual that the Linux kernel and device drivers are
>> >> > written in C, since it's very low-level.
>> >> >
>> >> > Best,
>> >> > Miguel
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > discuss mailing list
>> >> > discuss at list.ipol.im
>> >> > https://tools.ipol.im/mailman/listinfo/discuss
>> >>
>> >> _______________________________________________
>> >> discuss mailing list
>> >> discuss at list.ipol.im
>> >> https://tools.ipol.im/mailman/listinfo/discuss
>> >
>> > --
>> > Pascal Monasse
>> > IMAGINE, Ecole des Ponts ParisTech
>> > Tel: 01 64 15 21 76
>> > _______________________________________________
>> > discuss mailing list
>> > discuss at list.ipol.im
>> > https://tools.ipol.im/mailman/listinfo/discuss
>>
>> _______________________________________________
>> discuss mailing list
>> discuss at list.ipol.im
>> https://tools.ipol.im/mailman/listinfo/discuss
> --
> Pascal Monasse
> IMAGINE, Ecole des Ponts ParisTech
> Tel: 01 64 15 21 76
> _______________________________________________
> discuss mailing list
> discuss at list.ipol.im
> https://tools.ipol.im/mailman/listinfo/discuss
>




More information about the discuss mailing list