Max Planck Digital Library
Internal Blog

Reproducible Research

One of the most interesting sessions for me at Berlin 6 was titled “Open Data and Reproducible Research: Blurring the Boundaries between Research and Publication” and chaired by Mark Liberman.

In particular, the first two talks, by Sergey Fomel and Patrick Vandewalle, resonated with me.

Both talked about how to ensure reproducibility in computational science, a field characterized in the third talk as a hybrid between the epistemological model of mathematics and the model of experimental sciences.

When using computational methods there’s software involved, and to make the results reproducible the software has to be made available. We probably know where this is going: open source.

Now there are repositories for research papers, and in the fourth talk, about Earth System Science Data, we learned about repositories for data like Pangaea. What about repositories for software? Obviously they do exist: SourceForge and Google Code are just two prominent examples. But are these two what arXiv is to physics papers, that is, the community service as opposed to the institutional service? Patrick Vandewalle said he felt that SourceForge and Google Code would be adequate as repositories, but acknowledged that others might object to making the reproducibility of research dependent on the hosting policies of third parties.

So is there demand for institutional software repositories? Is this a mission for our underutilized hosting platform?

And should we come up with a compound object description format (potentially based on OAI-ORE) for paper-data-code packages, to be disseminated via, say, PubMan, as was suggested in a question from Wolfram Horstmann?
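To make the idea of a compound object a bit more concrete, here is a minimal sketch in Python of what such a package description might look like. It borrows the terms ResourceMap, Aggregation, "describes" and "aggregates" from the OAI-ORE vocabulary, but the JSON layout, the "role" field, the function name and all example.org URLs are assumptions made up for illustration. This is not an actual OAI-ORE serialization and not anything PubMan provides.

```python
import json


def build_package_description(package_url, paper_url, data_urls, code_url):
    """Describe a paper-data-code package as one aggregation of web resources.

    Hypothetical helper: the dictionary layout is an illustration only.
    """
    aggregated = [{"uri": paper_url, "role": "paper"}]
    aggregated += [{"uri": url, "role": "data"} for url in data_urls]
    aggregated.append({"uri": code_url, "role": "code"})
    return {
        "type": "ResourceMap",
        # Assumed convention: the description lives next to the package itself.
        "uri": package_url + "/rem",
        "describes": {
            "type": "Aggregation",
            "uri": package_url,
            "aggregates": aggregated,
        },
    }


# All URLs below are placeholders, not real resources.
description = build_package_description(
    "http://example.org/packages/42",
    "http://example.org/papers/42.pdf",
    ["http://example.org/data/42.csv"],
    "http://example.org/code/42.tar.gz",
)
print(json.dumps(description, indent=2))
```

The point of the sketch is only that the package as a whole gets its own URL, distinct from the URLs of the paper, the data and the code it groups together.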

One solution presented by the first speaker was to publish a frozen version of the code together with a paper, as is done with the Reproducible Research Repository.
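What "frozen" could mean in practice can be sketched with a checksum manifest: record a hash of every file in the code directory at publication time, so that readers can later check that they are running exactly the published code. This is only an illustration under my own assumptions, not a description of how the Reproducible Research Repository works; the function names and the example file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def freeze_manifest(code_dir):
    """Map each file's relative path to the SHA-256 digest of its contents."""
    root = Path(code_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def is_unchanged(code_dir, manifest):
    """True if the directory still matches the manifest published with the paper."""
    return freeze_manifest(code_dir) == manifest


# Usage: freeze a throwaway directory, then change the code and check again.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "analysis.py"  # hypothetical script name
    script.write_text("print('figure 1')\n")
    manifest = freeze_manifest(tmp)
    matches_before = is_unchanged(tmp, manifest)
    script.write_text("print('figure 2')\n")
    matches_after = is_unchanged(tmp, manifest)

print(matches_before, matches_after)
```

A manifest like this only pins down one snapshot; it says nothing about how the code got there, which is where a real version history comes in.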

But I would see the additional information that a true software repository could provide, such as version history and changesets, as analogous to the method of research blogging presented in yet another talk by Lilia Efimova: it provides more context about the genesis of ideas, and hooks for the community to contribute.

In any case, I am more and more settled in my belief that URLs are the words of the new web language. They identify the things and concepts that make up the universe of discourse, and the more things are addressable on the web, the richer this language will be.

So here’s a nice mission statement for me: “Provide the scientific debate 2.0 with words!”

The president of the Linguistic Society of America, Stephen Anderson, characterized in a different session the attitude of young researchers as “what’s not on my screen is not knowledge”. While this may have been somewhat sarcastic, I think it’s accurate.

So let’s put knowledge on screens.

As with any good conference, many questions, fewer answers, and some ideas to follow up on.

4 Responses to “Reproducible Research”

  1. Berlin 6 Open Access Conference » Wrapping up Berlin 6 Says:

    [...] MPDL Blog [...]

  2. Berlin 6 Open Access Conference at Pixeltje Blog Says:

    [...] Max Planck Digital Library Blog, by Robert Forkel [...]

  3. University blog hubs | CorpBlawg Says:

    [...] and mentioned the one maintained by the University of Cape Town. Thanks to Patrick Vandewalle and Robert Forkel, I’ve found a few more and compiled a list. Here it [...]

  4. MPDL-Internal Blog » Blog Archive » 25. DV Treffen der MPG Says:

    [...] source-code-hosting Service [...]
