openBIS: a flexible framework for managing and analyzing complex data in biology research.

Bauch A, Adamczyk I, Buczek P, Elmer FJ, Enimanev K, Glyzewski P, Kohler M, Pylak T, Quandt A, Ramakrishnan C, Beisel C, Malmström L, Aebersold R, Rinn B - BMC Bioinformatics (2011)

Bottom Line: Ease of integration with data analysis pipelines and other computational tools is a key requirement for it. This framework can be extended and has been customized for different data types acquired by a range of technologies. openBIS is currently being used by several SystemsX.ch and EU projects applying mass spectrometric measurements of metabolites and proteins, High Content Screening, or Next Generation Sequencing technologies. The attributes that make it interesting to a large research community involved in systems biology projects include versatility, simplicity in deployment, scalability to very large data, flexibility to handle any biological data type and extensibility to the needs of any research domain.


Affiliation: Department of Biosystems Science and Engineering, Center for Information Sciences and Databases, Swiss Federal Institute of Technology (ETH) Zurich, Switzerland.

ABSTRACT

Background: Modern data generation techniques used in distributed systems biology research projects often create datasets of enormous size and diversity. We argue that in order to overcome the challenge of managing those large quantitative datasets and maximise the biological information extracted from them, a sound information system is required. Ease of integration with data analysis pipelines and other computational tools is a key requirement for it.

Results: We have developed openBIS, an open source software framework for constructing user-friendly, scalable and powerful information systems for data and metadata acquired in biological experiments. openBIS enables users to collect, integrate, share and publish data, and to connect to data processing pipelines. This framework can be extended and has been customized for different data types acquired by a range of technologies.

Conclusions: openBIS is currently being used by several SystemsX.ch and EU projects applying mass spectrometric measurements of metabolites and proteins, High Content Screening, or Next Generation Sequencing technologies. The attributes that make it interesting to a large research community involved in systems biology projects include versatility, simplicity in deployment, scalability to very large data, flexibility to handle any biological data type and extensibility to the needs of any research domain.


Figure 2: Data organization and metadata. Data are organized using entities and relations that are familiar to scientists. To organize metadata, the concepts of experiment, sample and dataset types have been introduced. Structured metadata will be assigned at the respective levels by using Property Types. Property Types are unique and can be reused and allow researchers to provide custom properties (or annotations) to experiments, samples and datasets.
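The Property Type idea in the caption can be illustrated with a minimal sketch. This is a hypothetical in-memory model, not the openBIS API: the class names `PropertyType` and `EntityType`, the `assign` and `validate` methods, and the example codes `ORGANISM`, `PLATE` and `HCS_SCREEN` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class PropertyType:
    """A reusable metadata field with a unique code (hypothetical model)."""
    code: str
    data_type: type


@dataclass
class EntityType:
    """An experiment, sample or dataset type plus its assigned property types."""
    code: str
    kind: str  # "EXPERIMENT", "SAMPLE" or "DATASET"
    property_types: Dict[str, PropertyType] = field(default_factory=dict)

    def assign(self, prop: PropertyType) -> None:
        # The same PropertyType object can be assigned to many entity types.
        self.property_types[prop.code] = prop

    def validate(self, properties: Dict[str, Any]) -> Dict[str, Any]:
        """Return the properties unchanged if they match the assigned types."""
        for code, value in properties.items():
            if code not in self.property_types:
                raise KeyError(f"{code} is not assigned to type {self.code}")
            expected = self.property_types[code].data_type
            if not isinstance(value, expected):
                raise TypeError(f"{code} expects {expected.__name__}")
        return properties


# One property type, defined once and reused at two levels.
organism = PropertyType("ORGANISM", str)
plate = EntityType("PLATE", "SAMPLE")
screen = EntityType("HCS_SCREEN", "EXPERIMENT")
plate.assign(organism)
screen.assign(organism)

annotated = plate.validate({"ORGANISM": "Homo sapiens"})
```

Defining the property type once and assigning it to several entity types keeps annotations comparable across experiments, samples and datasets, which is the reuse the caption describes.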

Mentions: To organize the stored data in a logical and transparent manner and to manage access privileges, we created a hierarchical data structure built from five entities: i) Data Space, ii) Project, iii) Experiment, iv) Sample and v) Dataset (Figure 2). Permission rules are applied at the highest level, the data space. These rules determine what a user is allowed to see and which operations the user can perform. A data space contains projects, each of which groups one or more related experiments. An experiment is an empirical approach to acquiring data. The experiment in turn typically contains at least one sample, the object being measured or observed. A sample can have one or more datasets associated with it, where a dataset is a set of files containing the measured or derived data. For example, the same microtiter plate (sample) read twice by a microscope yields two different datasets, both associated with that single sample. Such a hierarchical structure is a prerequisite for organizing larger collections of experimental datasets efficiently. For example, raw data and processed data can be stored as individual datasets that are linked to each other and to a sample or an experiment. The hierarchy also supports parent-child relationships between samples and between datasets.
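The five-level hierarchy described above can be sketched as plain Python data classes. This is an illustrative model under stated assumptions, not the openBIS implementation or API: the `readers` set is a deliberately simplified stand-in for the permission rules, and all class names, field names, codes and file names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Dataset:
    code: str
    files: List[str]
    parents: List["Dataset"] = field(default_factory=list)  # e.g. raw -> processed


@dataclass
class Sample:
    code: str
    datasets: List[Dataset] = field(default_factory=list)
    parents: List["Sample"] = field(default_factory=list)


@dataclass
class Experiment:
    code: str
    samples: List[Sample] = field(default_factory=list)


@dataclass
class Project:
    code: str
    experiments: List[Experiment] = field(default_factory=list)


@dataclass
class DataSpace:
    code: str
    projects: List[Project] = field(default_factory=list)
    readers: Set[str] = field(default_factory=set)  # assumed, simplified permission rule

    def visible_datasets(self, user: str) -> List[Dataset]:
        """Permissions are checked once, at the data space level."""
        if user not in self.readers:
            return []
        return [
            dataset
            for project in self.projects
            for experiment in project.experiments
            for sample in experiment.samples
            for dataset in sample.datasets
        ]


# The same microtiter plate read twice gives two datasets on one sample;
# a processed dataset is linked to the raw dataset it was derived from.
plate = Sample("PLATE-1")
raw_1 = Dataset("RAW-1", ["scan_1.tiff"])
raw_2 = Dataset("RAW-2", ["scan_2.tiff"])
processed = Dataset("FEATURES-1", ["features.csv"], parents=[raw_1])
plate.datasets += [raw_1, raw_2, processed]

space = DataSpace(
    "SCREENING_LAB",
    [Project("SIRNA", [Experiment("SCREEN-1", [plate])])],
    readers={"alice"},
)
visible_to_alice = [d.code for d in space.visible_datasets("alice")]
visible_to_bob = space.visible_datasets("bob")
```

Checking access only at the data space keeps the rule set small: everything below a space inherits its visibility, so adding a project or dataset needs no extra permission bookkeeping in this sketch.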

