Building a Data Support Infrastructure for the Cancer Care Engineering OMIC Study

By Ann Christine Catlin

Purdue University

Category: Teaching Materials

Abstract

OMIC profiles hold enormous promise for transforming our understanding of biological processes and revealing the underlying biological mechanisms, especially the functional and interactive networks that contribute to complex biological systems. Rapid advances in large-scale OMIC analyses have produced complex, multi-scale, high-dimensional data at disparate levels of resolution. Predictive patterns in these data elucidate the key signatures of important changes in biological systems.

We will develop statistical models to integrate multiple types of OMIC data, permitting the transition from molecular biology to ‘modular biology’, where the functions of genes and their products can be examined via a systems approach. To manage the massive data, we will create an environment that lets scientists work at the knowledge level, combining their own knowledge and analytical reasoning with seamless, guided, semi-automated computational analysis and exploration. This requires an interactive visualization and analysis environment that provides information for guided discovery, built on novel data integration, analysis, and knowledge management solutions. We will bring the OMIC analysis laboratories, statistical modelers, and visual analysts together by developing a cyberinfrastructure that unifies the data acquisition, data synthesis, and data analysis efforts, ensuring that the continuously evolving data and analysis resources that drive the discovery process are united in a shared research environment.

We will leverage existing HUBzero™ technology while building and integrating a powerful data framework with full support for the OMIC data lifecycle. Unification of data and analysis tools in a single, dynamic, and evolving environment will result in transformational workflows that allow researchers to work at the knowledge level by taking advantage of continuously expanding knowledge profiles created for both data and tools. We believe our integrated approach for analyzing across OMIC platforms within a unified, collaborative environment will provide new insights that transform not only biological research, but many fields of scientific discovery that depend upon the advanced analysis and integration of massive data.

The success of the knowledge discovery process therefore depends critically on a cyberinfrastructure that unifies the data acquisition, data synthesis, and data analysis research efforts. An effective infrastructure must support community-driven contribution and sharing of data and analysis resources, and it must also support the annotation, interaction, integration, tracking, feedback, and validation of those resources. The unification of data and analysis tools in a single, dynamic, and evolving environment will result in transformational workflows that allow researchers to work at the knowledge level instead of the process level, by taking advantage of continuously expanding knowledge profiles created for both data and tools. We have identified the following goals to achieve our vision for a transformational OMIC research and discovery environment:

  • Acquire and archive OMIC datasets from well-developed protocols, with metadata for systematic annotation and tracking;
  • Create statistical modeling and visual analytic tools for synthesis and analysis of OMIC data;
  • Build and integrate a data support framework into the HUBzero™ cyberinfrastructure, offering full support for the OMIC data lifecycle;
  • Support community-driven, community-shared deployment and integration of OMIC data, tools and workflows;
  • Utilize the developed cyberinfrastructure to extract predictive knowledge from OMIC data.
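
As a rough illustration of the first goal, the metadata for an archived dataset could be held in a small record that supports annotation and tracking. This is a sketch only: the class name `OmicDatasetRecord`, its fields, and the example values are hypothetical and are not taken from the cceHUB implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class OmicDatasetRecord:
    """Hypothetical metadata record for one archived OMIC dataset."""
    dataset_id: str
    omic_type: str   # e.g. "proteomics"; the category names are assumptions
    protocol: str    # name of the acquisition protocol (assumed field)
    annotations: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def annotate(self, key: str, value: str) -> None:
        """Attach an annotation and log the change so it can be tracked."""
        self.annotations[key] = value
        stamp = datetime.now(timezone.utc).isoformat()
        self.history.append((stamp, f"annotated {key}"))


# Example values are made up for illustration.
record = OmicDatasetRecord("ds-001", "proteomics", "example-protocol")
record.annotate("sample_source", "plasma")
```

Keeping the change log next to the annotations is one simple way to make every edit to a dataset's metadata traceable; a production system would more likely store both in a database.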

Our approach is to develop an infrastructure whose analytic capabilities support the entire process, from the acquisition of massive raw datasets, to the extraction and integration of the data needed for analysis, to an interactive, integrated visual analytic environment. We believe our integrated approach for analyzing across OMIC platforms will provide new insights that transform not only biological research, but many fields of scientific discovery that depend upon the advanced analysis and integration of massive data.

Contributor Ann Christine Catlin
Cite this work

Researchers should cite this work as follows:

  • Ann Christine Catlin (2009), "Building a Data Support Infrastructure for the Cancer Care Engineering OMIC Study," http://ccehub.org/resources/246.

Tags
  1. cceHUB
  2. data explorer
  3. data infrastructure
  4. data upload
  5. metadata
  6. metadata database