[IPOL discuss] Python as a platform for reproducible research (... not!)

Miguel Colom colom at cmla.ens-cachan.fr
Wed Nov 20 11:16:33 CET 2013


Hi all,
this is the classic problem of software libraries changing their
interfaces over time and breaking the code that depends on them.
It is not specific to Python; it affects any software that uses libraries.

Indeed, we have the same problem in IPOL with the C libraries. We
decide that libraries are "stable" because they haven't changed their
interfaces for a long time, but we can never be sure that this will
remain true in the long term.
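
One way to make such interface drift at least visible is to record the
exact versions a result was produced with. The following is only a
minimal sketch, not something IPOL does: the helper name
snapshot_environment and the package names are hypothetical, and it
assumes Python 3.8 or later for importlib.metadata.

```python
import platform
from importlib import metadata


def snapshot_environment(packages):
    """Return the interpreter version and the installed version of each package.

    Packages that are not installed are recorded as None instead of raising.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {"python": platform.python_version(), "packages": versions}


if __name__ == "__main__":
    # Hypothetical dependency list, used only for illustration.
    print(snapshot_environment(["numpy", "scipy"]))
```

Storing this dictionary next to the published results does not prevent
breakage, but it tells a later reader which versions to look for.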

After all, these troubles have to do with software lifetime. All
software needs to be maintained, and this isn't optional but mandatory.
If the software is no longer maintained, it has reached the end of its
lifetime. After that point it might still run and be useful, but
without any warranty that it won't break, become incompatible with the
newer interfaces of the components it relies on, or contain bugs that
will never be fixed.

In IPOL we have already experienced this. With the new version of the
compiler, some source code no longer compiles. Therefore, the only
solution is to maintain it, even if the modifications are not
made by the authors but by IPOL, as updates and bug fixes.

To summarize, my opinion is that Python is an excellent tool for
reproducible research and that it has no more problems with libraries
than any other system. And of course, as a language it is far more
powerful than C/C++ and allows fast prototyping. The drawback is
that, like any VM-based language, it is really slow when executing
operations outside the compiled C/C++ code of the libraries.
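
As a rough illustration of that last point, here is a minimal sketch
that times an explicit Python loop against the built-in sum, which is
implemented in C in CPython (the interpreter is an assumption here).
The function name python_sum and the list size are hypothetical, and
the measured times depend entirely on the machine.

```python
import timeit


def python_sum(values):
    """Add the values one by one in interpreted Python code."""
    total = 0
    for v in values:
        total += v
    return total


values = list(range(100_000))

# Time 20 runs of each variant; only the ratio between them is meaningful.
loop_time = timeit.timeit(lambda: python_sum(values), number=20)
builtin_time = timeit.timeit(lambda: sum(values), number=20)

print("interpreted loop: %.4f s" % loop_time)
print("built-in sum:     %.4f s" % builtin_time)
```

Both variants compute the same result; the difference lies only in
where the loop runs.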

Best,
Miguel

Quoting Nicolas Limare <nicolas.limare at cmla.ens-cachan.fr>:
> Hi everyone,
>
> Here is an excerpt from a blog post by Konrad Hinsen about how Python
> is not a perfect solution for long-term software stability, even when
> only using the numerical computation Python packages. The full text is
> at
> http://khinsen.wordpress.com/2013/11/19/python-as-a-platform-for-reproducible-research/
>
> 8<----------8<----------8<----------8<----------8<----------8<----------
>
> Python as a platform for reproducible research
>
> The other day I was looking at the release notes for the recently
> published release 1.8 of NumPy, the library that is the basis for most
> of the Scientific Python ecosystem. As usual, it contains a list of
> new features and improvements, but also sections such as "dropped
> support" (for Python 2.4 and 2.5) and "future changes", to be
> understood as "incompatible changes that you should start to prepare
> for". [...]
>
> From the point of view of reproducible research, all these changes are
> bad news. They mean that libraries and scripts that work today will
> fail to work with future NumPy releases, in ways that their users, who
> are usually not the authors, cannot easily understand or fix. Actively
> maintained libraries will of course be adapted to changes in NumPy,
> but much, perhaps most, scientific software is not actively
> maintained. A PhD student doing computational research might well
> publish his/her software along with the thesis, but then switch
> subjects, or leave research altogether, and never look at the old code
> again. There are also specialized libraries developed by small teams
> who don't have the resources to do as much maintenance as they would
> like. [...]
>
> One popular attitude is to say: Just run old Python packages with old
> versions of Python, NumPy, etc. This is an option as long as the
> versions you need are recent enough that they can still be built and
> installed on a modern computer system. And even then, the practical
> difficulties of working with parallel installation of multiple
> versions of several packages are considerable [...]
>
> --
> Nicolas LIMARE
> http://nicolas.limare.net/                         pgp:0xFA423F4F
>



