Open Source, Open Science

August 30th, 2006

At Sci Foo Brian Behlendorf and I hosted a session about how the lessons learned from the open source software experience might be applicable to scientific endeavors. The hope is that we can support the “open” movement in science as well. By “open” I mean a system where effort and resources are pooled and the result shared. This is in contrast to an increasing focus on what’s “my intellectual property, how can I best protect that intellectual property, use it to create a closed system and then extract the most value for me from that closed system.”

We ended up with a list of things that are different about the life sciences that make it difficult to transport open source software methods wholesale. These are listed below in no particular order. I use “science” here to cover the range of topics, although it feels a bit basic. The real value is in trying to figure out how to alleviate some of these problems. I haven’t tried to do that here; rather I’m trying to start a list of the various issues.

  1. A lot of scientific effort is expensive. It’s hard to work in many areas without being tied to an institution that provides the equipment, the labs and other necessary support. This greatly reduces an individual’s ability to break out of the standard way of doing things.
  2. A lot of scientific efforts require long periods of outlays before getting meaningful results — it’s harder to find incremental projects that can demonstrate value (whether economic or social) quickly.
  3. It’s much more difficult to “scratch one’s own itch.” Someone choosing to work in many scientific fields is unlikely to be solving his or her own immediate problem. The result may be years away, unknown, and not directly applicable to his or her own life. This is quite different from software development, where many people get involved to fix something that bugs them in their daily work.
  4. There’s no accepted set of free and unencumbered tools and building blocks for the life sciences. This problem was raised by Richard Jefferson, who notes that the technologies used to pursue the scientific process are encumbered by patents in such a way that the end result is hard (or impossible) to use and share freely. It’s as if a patent on a compiler (or all compilers) applied to any code that had been compiled. Richard’s pithy summation of this problem is: “there’s no LAMP stack.” (Thanks to Richard for permission to attribute this to him, which is required under the Chatham House Rule under which SciFoo operated.)
  5. There’s already a recognition system in place through the peer-reviewed journals. This mechanism has a variety of problems itself and may be due for change. But even so, there is an accepted review, recognition and advancement system for the sciences outside of collaboration.
  6. Collaboration often needs to occur between institutions rather than individuals. This makes it harder to get started than simply having a few people decide to try something.

One comment for "Open Source, Open Science"

  

    Jean-Claude Bradley said on September 4th, 2006 at 2:53 am:

    Addressing your point #2, using Web2.0 technologies enables the sharing of raw scientific data, including failed experiments, at little or no cost to the author and reader. This means that the smallest unit of meaningful scientific contribution can be much smaller than a full research article and can be available the same day that the experiment was performed. There is a huge potential here for speeding up the progress of science this way, even if relatively few scientists engage in this mode of communication.

