I wrote an article with Josh Denny (Vanderbilt), David Glazer (Verily Life Sciences), Benedict Paten (University of California at Santa Cruz), and Anthony Philippakis (Broad Institute) about what we are calling a data biosphere for biomedical research.

In this article, we introduce four governing principles. A data biosphere should be:

  1. modular, composed of functional components with well-specified interfaces;
  2. community-driven, created by many groups to foster a diversity of ideas;
  3. open, developed under open-source licenses that enable extensibility and reuse, with users able to add custom, proprietary modules as needed; and
  4. standards-based, consistent with standards developed by coalitions such as the Global Alliance for Genomics and Health (GA4GH).

As in biological ecosystems, the health of the Data Biosphere will be measured by both its activity and its diversity.