Medical advances facilitated by NSF-funded foundational research provide alternative to large, costly and cumbersome storage infrastructure.
May 15, 2013
The University of Chicago launched the first secure cloud-based computing system that enables researchers to access and analyze human genomic cancer information without the costly and cumbersome infrastructure normally needed to download and store massive amounts of data.
The Bionimbus Protected Data Cloud, as it is called, enables researchers who are authorized by the National Institutes of Health (NIH) to access and analyze data in The Cancer Genome Atlas (TCGA) without first having to set up secure, compliant computing environments capable of managing and analyzing terabytes of data, download the data (which can take weeks), and then install the tools needed to perform the desired analyses.
Using technology that was developed in part by the Open Science Data Cloud, a National Science Foundation-supported project that is developing cloud infrastructure for large scientific datasets, the Bionimbus Protected Data Cloud provides researchers with a more cost- and time-effective mechanism to extract knowledge from massive amounts of data. Drawing insights from big data is imperative for addressing some of today’s most vexing environmental, health and safety challenges.
“The open source technology underlying the Open Science Data Cloud enables researchers to manage and analyze the large data sets that are essential to tackling some of today’s greatest challenges: from environmental monitoring to cancer genomics,” said Robert L. Grossman, the director of the Open Science Data Cloud Project and a professor at the University of Chicago.
Today, as the only NIH-approved cloud-based system for TCGA data, the Bionimbus Protected Data Cloud allows researchers to focus on the analysis of large-scale cancer genome sequencing, which experts believe can unlock paths to early detection, appropriate treatment and prevention of cancer.
“We are excited that the Bionimbus Protected Data Cloud is now used for cancer genomics data so that researchers can more easily work with large datasets to understand genomic variations that seem to be one of the keys to the precise diagnosis and treatment of cancer,” continued Grossman.
“With funding provided by NSF’s Partnerships for International Research and Education [PIRE] program, NSF has sought to narrow the gap between the capability of modern scientific instruments to produce data and the ability of researchers to access, manage, analyze and share those data in a reliable and timely manner,” said NSF Program Director Harold Stolberg.
“By embracing cloud computing as a global issue, this PIRE project brings together the expertise of many researchers, not only in the United States, but worldwide. Its success in helping researchers to access and analyze important human genomic cancer information is an exciting indicator of future developments with these technologies,” he said.
Megan McNerney, an instructor of pathology at the University of Chicago, used Bionimbus to analyze data that led to her discovery that the gene CUX1, which acts as a tumor suppressor, is frequently inactivated in acute myeloid leukemia.
“Bionimbus was critical for my work, as it was used for all aspects of the project, including secure storage of protected data, quality control of next-generation sequencing results, alignments, expression analysis, and algorithm development,” she said. “The strength of Bionimbus, however, is the support that is provided for end users, which enabled both expert and non-expert team members to use the cloud.”