A Case for Data Commons

Sep 10, 2016

A paper making the case for the importance of data commons in supporting research.

An overview of the architecture of a data commons.

A new paper, "A Case for Data Commons," recently appeared in the IEEE journal Computing in Science & Engineering (CiSE).

The arXiv version can be found here, and the DOI is 10.1109/MCSE.2016.92.

In this paper, we define a "data commons" as cyberinfrastructure that co-locates data, storage, and computing infrastructure with commonly used services and tools for analyzing and sharing data, creating an interoperable resource for the research community. We describe six requirements for a data commons, along with the need for its services to scale out to the size of a data center, which we sometimes view as a seventh requirement.

We also describe our experience developing and operating data commons, beginning with the OCC-NASA Project Matsu in 2008 and continuing with the Open Science Data Cloud in 2010, the Bionimbus Protected Data Cloud in 2014, the OCC-NOAA Data Commons in 2015, and the NCI Genomic Data Commons in 2016.