[IPOL discuss] Python as a platform for reproducible research (... not!)

Pascal Monasse monasse at imagine.enpc.fr
Wed Nov 20 11:46:21 CET 2013


Hi Miguel,

I do not quite agree with your statements:
- C and C++ are defined by international (ISO) standards. If your code
conforms to the standard and the compilers are compliant, you should never
have to change a single line of code.
- Concerning the libraries, it is precisely the reason why IPOL has a very 
conservative set of allowed external libraries. These are libraries whose API 
is very stable.
- The changes we had to make to IPOL codes for newer versions of compilers
were all related to the build procedure, not to the source code itself
(correct me if I am wrong). For example, there was an important restriction in
gcc 4.7 with respect to the order of linked libraries (former versions were
more forgiving). This kind of change does not require understanding the code
itself.
- By contrast, Python is still a young language and a moving target: consider
the changes between Python 2 and Python 3 in the core language itself.
- Yes, Python comes with "batteries included" and may be easier for
prototyping an algorithm. But many of its libraries are still changing, often
with little regard for API stability.
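
As a small illustration of the Python 2 / Python 3 point (only a sketch, not
taken from any IPOL code; the function names halve and halve_floor are made up
for this example): the "/" operator applied to two integers performs floor
division in Python 2 and true division in Python 3, so the same source line
can silently change its result between the two.

```python
import sys


def halve(n):
    """Divide n by 2 with the plain '/' operator.

    For an integer n, Python 2 performs floor division here,
    while Python 3 performs true division and returns a float.
    """
    return n / 2


def halve_floor(n):
    """Divide n by 2 with the explicit floor-division operator '//'.

    For an integer n this gives the same result in Python 2 and Python 3.
    """
    return n // 2


if __name__ == "__main__":
    # Print the major version of the interpreter and both results,
    # so the script can be run under different interpreters and compared.
    print(sys.version_info[0], halve(7), halve_floor(7))
```

Code that relied on the old integer behaviour of "/" has to be edited (for
instance by switching to "//") to keep producing the same numbers.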

Of course, Python is a very valuable tool, but I would not recommend it for 
reproducible code.

Best,
Pascal

On Wednesday, November 20, 2013 11:16:33 AM Miguel Colom wrote:
> Hi all,
> this is the classic problem of software libraries changing their
> interfaces over time and breaking the code that depends on them.
> It's not specific to Python; it affects any software that uses libraries.
> 
> Indeed, we have the same problem in IPOL with the C libraries. We
> decide that libraries are "stable" because they haven't changed their
> interfaces for a long time, but we can never be sure that this will
> remain true in the long term.
> 
> After all, these troubles have to do with software lifetime. All
> software needs to be maintained; this isn't optional, but mandatory.
> If the software is no longer maintained, it has reached the end of its
> lifetime. After that point, it might still be executed and be useful, but
> without any warranty that it won't break, be incompatible with the
> newer interfaces of the components it relies on, or contain bugs that
> will never be fixed.
> 
> In IPOL we have already experienced this. With the new version of the
> compiler, some source codes won't compile. Therefore, the only
> solution is to maintain them, even if the modifications are not
> performed by the authors, but by IPOL as updates and bug fixes.
> 
> To summarize, my opinion is that Python is an excellent tool for
> reproducible research and that it has no more problems with libraries
> than any other system. And of course, as a language it is far more
> powerful than C/C++ and allows fast prototyping. The
> problem is that, like any VM-based language, it's really slow when
> executing operations outside the compiled C/C++ code of its libraries.
> 
> Best,
> Miguel
> 
> Quoting Nicolas Limare <nicolas.limare at cmla.ens-cachan.fr>:
> > Hi everyone,
> > 
> > Here is an excerpt from a blog post by Konrad Hinsen about how Python
> > is not a perfect solution for long-term software stability, even when
> > only using the numerical computation Python package. The full text is
> > at
> > http://khinsen.wordpress.com/2013/11/19/python-as-a-platform-for-reproducible-research/
> > 
> > 8<----------8<----------8<----------8<----------8<----------8<----------
> > 
> > Python as a platform for reproducible research
> > 
> > The other day I was looking at the release notes for the recently
> > published release 1.8 of NumPy, the library that is the basis for most
> > of the Scientific Python ecosystem. As usual, it contains a list of
> > new features and improvements, but also sections such as "dropped
> > support" (for Python 2.4 and 2.5) and "future changes", to be
> > understood as "incompatible changes that you should start to prepare
> > for". [...]
> > 
> > From the point of view of reproducible research, all these changes are
> > bad news. They mean that libraries and scripts that work today will
> > fail to work with future NumPy releases, in ways that their users, who
> > are usually not the authors, cannot easily understand or fix. Actively
> > maintained libraries will of course be adapted to changes in NumPy,
> > but much, perhaps most, scientific software is not actively
> > maintained. A PhD student doing computational research might well
> > publish his/her software along with the thesis, but then switch
> > subjects, or leave research altogether, and never look at the old code
> > again. There are also specialized libraries developed by small teams
> > who don't have the resources to do as much maintenance as they would
> > like. [...]
> > 
> > One popular attitude is to say: Just run old Python packages with old
> > versions of Python, NumPy, etc. This is an option as long as the
> > versions you need are recent enough that they can still be built and
> > installed on a modern computer system. And even then, the practical
> > difficulties of working with parallel installation of multiple
> > versions of several packages are considerable [...]
> > 
> > --
> > Nicolas LIMARE
> > http://nicolas.limare.net/                         pgp:0xFA423F4F
> 
> --
> IPOL - Image Processing On Line   - http://ipol.im/
> 
> contact     edit at ipol.im          - http://www.ipol.im/meta/contact/
> news+feeds  twitter @IPOL_journal - http://www.ipol.im/meta/feeds/
> announces   announce at list.ipol.im - http://tools.ipol.im/mm/announce/
> discussions discuss at list.ipol.im  - http://tools.ipol.im/mm/discuss/
-- 
Pascal Monasse
IMAGINE, Ecole des Ponts ParisTech
Tel: 01 64 15 21 76

