MiMiR--an integrated platform for microarray data sharing, mining and analysis.

Tomlinson C, Thimma M, Alexandrakis S, Castillo T, Dennis JL, Brooks A, Bradley T, Turnbull C, Blaveri E, Barton G, Chiba N, Maratou K, Soutter P, Aitman T, Game L - BMC Bioinformatics (2008)

Bottom Line: The new MiMiR suite of software enables systematic and effective capture of extensive experimental and clinical information with the highest MIAME score, and secure data sharing prior to publication. MiMiR currently contains more than 150 experiments corresponding to over 3000 hybridisations and supports the Microarray Centre's large microarray user community and two international consortia. The MiMiR flexible and scalable hardware and software architecture enables secure warehousing of thousands of datasets, including clinical studies, from microarray and potentially other -omics technologies.


Affiliation: Microarray Centre, MRC Clinical Sciences Centre and Imperial College, Hammersmith Hospital, London, UK. chris.tomlinson@imperial.ac.uk

ABSTRACT

Background: Despite considerable efforts within the microarray community to standardise data format, content and description, microarray technologies present major challenges in managing, sharing, analysing and re-using the large amount of data generated locally or internationally. Additionally, it is recognised that inconsistent and low-quality experimental annotation in public data repositories significantly compromises the re-use of microarray data for meta-analysis. MiMiR, the Microarray data Mining Resource, was designed to tackle some of these limitations and challenges. Here we present new software components and enhancements to the original infrastructure that increase accessibility, utility and opportunities for large-scale mining of experimental and clinical data.

Results: A user-friendly Online Annotation Tool allows researchers to submit detailed experimental information via the web at the time of data generation rather than at the time of publication. This ensures that the collected meta-data are easy to access and highly accurate. Experiments are programmatically built in the MiMiR database from the submitted information, and details are systematically curated and further annotated by a team of trained annotators using a new Curation and Annotation Tool. Clinical information can be annotated and coded with a clinical Data Mapping Tool within an appropriate ethical framework. Users can visualise experimental annotation, assess data quality, and download and share data via a web-based experiment browser called MiMiR Online. All requests to access data in MiMiR are routed through a sophisticated middleware security layer, thereby allowing secure data access and sharing amongst MiMiR registered users prior to publication. Data in MiMiR can be mined and analysed using the integrated EMAAS open source analysis web portal or via export of data and meta-data into the Rosetta Resolver data analysis package.

Conclusion: The new MiMiR suite of software enables systematic and effective capture of extensive experimental and clinical information with the highest MIAME score, and secure data sharing prior to publication. MiMiR currently contains more than 150 experiments corresponding to over 3000 hybridisations and supports the Microarray Centre's large microarray user community and two international consortia. The MiMiR flexible and scalable hardware and software architecture enables secure warehousing of thousands of datasets, including clinical studies, from microarray and potentially other -omics technologies.



Figure 2: Screenshots of the Curation and Annotation Tools. a: Visualisation using the Curation Tool of experimental information submitted online. An object model of the experiment is programmatically built and represented graphically, where nodes represent the experiment and biomaterials (organisms with a prefix BS_ and samples with a prefix BSM_), and treatments are represented by arcs. Nodes are colour-coded to represent different stages of sample preparation; for example, beige and pink nodes correspond to the extracted total RNA and hybridisation cocktail, respectively. The biomaterial information supplied by users via the Online Annotation Tool is automatically displayed in the Curation Tool and assigned MGED Ontology terms, as indicated by the prefix "MO:" (bottom table views). Where appropriate, NCI Metathesaurus terms and accession numbers are also automatically assigned and indicated by the "NCI:" prefix. b: Table views of biosample details in the Annotation Tool. The table view enables rapid validation of detailed information including sample types, descriptions and names assigned to samples by users. Double-clicking on an item in the table opens a pop-up window (inset) where more detailed ontology information can be viewed and edited.
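The prefix conventions named in the caption can be illustrated with a short Python sketch. This is not MiMiR code: only the four prefixes (BS_, BSM_, "MO:", "NCI:") come from the caption, while the function names, the lookup tables and the example identifiers are assumptions made for illustration.

```python
from typing import Optional

# Prefixes quoted in the Figure 2 caption. The lookup-table layout is an
# assumption for illustration, not MiMiR's internal representation.
BIOMATERIAL_PREFIXES = {"BS_": "organism", "BSM_": "sample"}
ONTOLOGY_PREFIXES = {"MO:": "MGED Ontology", "NCI:": "NCI Metathesaurus"}


def classify_biomaterial(node_id: str) -> Optional[str]:
    """Return 'organism' or 'sample' based on the node identifier prefix."""
    for prefix, kind in BIOMATERIAL_PREFIXES.items():
        if node_id.startswith(prefix):
            return kind
    return None  # not a biomaterial node (e.g. the experiment node itself)


def ontology_source(term: str) -> Optional[str]:
    """Return the vocabulary a term was drawn from, based on its prefix."""
    for prefix, source in ONTOLOGY_PREFIXES.items():
        if term.startswith(prefix):
            return source
    return None  # free text with no recognised ontology prefix
```

With hypothetical identifiers, `classify_biomaterial("BSM_12")` would report a sample and `ontology_source("NCI:C12345")` would report the NCI Metathesaurus; anything without a recognised prefix yields `None`.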

Mentions: The experimental descriptions submitted by researchers via the Online Annotation Tool are programmatically extracted and assembled into an experiment by a specially designed Curation Tool. The Curation Tool is a Java application that uses an internal UML (Unified Modelling Language) object model to capture all the submitted information, including details on experimental design, biosource and biosample descriptions, compounds, protocols, treatment steps, user details and relevant publications, as well as the relationships between these entities. Automatically built experiment information is presented to annotators in a graphical form (Figure 2a) where nodes represent entities such as biosources, biosamples, treated biosamples, labelled extracts, hybridisation cocktails and scans, while arcs represent the actions required to move from one entity to another (i.e. treatment steps). MGED Ontology and NCI Metathesaurus terms are added to systematically describe certain experimental entities and can be viewed in the Curation Tool. Following creation of the experiment object model, the Annotation Tool is used to further annotate biomaterials and the relationships between them. The Annotation Tool is a Java application that displays information pertaining to biomaterials and hybridisations in a table view, enabling annotators to inspect subsets of data for consistency and accuracy, and to edit fields as appropriate (Figure 2b). MGED Ontology terms can be appended to experimental components using the existing MGED Ontology Viewer available through the Annotation Tool. A comprehensive user guide for the Annotation and Curation tools is available in Additional File 3.
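The node-and-arc structure described above can be sketched as a small directed graph. The sketch is in Python for brevity, although the Curation Tool itself is a Java application, and it does not reproduce MiMiR's UML object model: the class names, the entity identifiers and the treatment-step labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Entity:
    """A node: biosource, biosample, labelled extract, and so on."""
    name: str
    kind: str
    ontology_terms: List[str] = field(default_factory=list)


class ExperimentGraph:
    """Directed graph where arcs are treatment steps between entities."""

    def __init__(self) -> None:
        self.entities: Dict[str, Entity] = {}
        # arcs[source] holds (treatment step, target) pairs
        self.arcs: Dict[str, List[Tuple[str, str]]] = {}

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.name] = entity
        self.arcs.setdefault(entity.name, [])

    def add_treatment(self, source: str, step: str, target: str) -> None:
        if source not in self.entities or target not in self.entities:
            raise KeyError("both endpoints must be added before linking them")
        self.arcs[source].append((step, target))

    def descendants(self, start: str) -> List[str]:
        """Depth-first walk returning every entity derived from `start`."""
        seen: List[str] = []
        stack = [start]
        while stack:
            current = stack.pop()
            for _step, target in self.arcs[current]:
                if target not in seen:
                    seen.append(target)
                    stack.append(target)
        return seen


# Hypothetical linear sample-preparation chain.
graph = ExperimentGraph()
graph.add_entity(Entity("BS_1", "biosource", ["MO:organism"]))
graph.add_entity(Entity("BSM_1", "biosample"))
graph.add_entity(Entity("BSM_2", "labelled extract"))
graph.add_entity(Entity("BSM_3", "hybridisation cocktail"))
graph.add_treatment("BS_1", "dissection", "BSM_1")
graph.add_treatment("BSM_1", "RNA extraction and labelling", "BSM_2")
graph.add_treatment("BSM_2", "cocktail preparation", "BSM_3")
```

Calling `graph.descendants("BS_1")` walks the arcs from the biosource and lists every downstream biomaterial. A consistency check over such a lineage is one way an annotator-facing tool could flag samples that are not connected to any biosource.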

