Open Access Article

Data management pipeline for plant phenotyping in a multisite project

Kenny Billiau A, Heike Sprenger A, Christian Schudoma A, Dirk Walther A and Karin I. Köhl A,B

A Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam OT Golm, Germany.
B Corresponding author. Email: koehl@mpimp-golm.mpg.de

Functional Plant Biology 39(11) 948-957 http://dx.doi.org/10.1071/FP12009
Submitted: 13 January 2012  Accepted: 22 June 2012   Published: 15 August 2012

Abstract

In plant breeding, plants have to be characterised precisely, consistently and rapidly by different people at several field sites within defined time spans. For a meaningful data evaluation and statistical analysis, standardised data storage is required. Data access must be provided on a long-term basis and be independent of organisational barriers without endangering data integrity or intellectual property rights. We discuss the associated technical challenges and demonstrate adequate solutions exemplified in a data management pipeline for a project to identify markers for drought tolerance in potato. This project involves 11 groups from academia and breeding companies, 11 sites and four analytical platforms. Our data warehouse concept combines central data storage in databases and a file server and integrates existing and specialised database solutions for particular data types with new, project-specific databases. The strict use of controlled vocabularies and the application of web-access technologies proved vital to the successful data exchange between diverse institutes and data management concepts and infrastructures. By presenting our data management system and making the software available, we aim to support related phenotyping projects.

Additional keywords: controlled vocabulary, data integration, field trials, marker assisted selection, mixed schema design, ontologies.
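The role of a controlled vocabulary in multisite data exchange can be illustrated with a minimal sketch. It assumes that each site submits observations as simple records and that a central loader accepts only records whose trait term and unit match the shared vocabulary. The names `CONTROLLED_VOCABULARY`, `Observation`, `validate` and `load`, as well as the trait terms and units, are hypothetical and chosen for illustration only; they are not taken from the project's software.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary: trait term -> required unit.
# These entries are illustrative placeholders, not the project's vocabulary.
CONTROLLED_VOCABULARY = {
    "plant_height": "cm",
    "tuber_fresh_weight": "g",
}


@dataclass(frozen=True)
class Observation:
    """One phenotypic measurement submitted by a field site (assumed layout)."""
    site: str
    plant_id: str
    trait: str
    value: float
    unit: str


def validate(obs, vocabulary):
    """Return a list of problems for one observation; empty means acceptable."""
    problems = []
    if obs.trait not in vocabulary:
        problems.append(f"unknown trait term: {obs.trait!r}")
    elif obs.unit != vocabulary[obs.trait]:
        problems.append(
            f"unit {obs.unit!r} does not match {vocabulary[obs.trait]!r} "
            f"required for {obs.trait!r}"
        )
    return problems


def load(observations, vocabulary):
    """Split submissions into accepted records and rejected (record, problems) pairs."""
    accepted, rejected = [], []
    for obs in observations:
        problems = validate(obs, vocabulary)
        if problems:
            rejected.append((obs, problems))
        else:
            accepted.append(obs)
    return accepted, rejected


if __name__ == "__main__":
    submissions = [
        Observation("site_1", "P001", "plant_height", 42.0, "cm"),
        Observation("site_2", "P002", "height", 40.0, "cm"),       # term not in vocabulary
        Observation("site_3", "P003", "plant_height", 0.4, "m"),   # wrong unit
    ]
    accepted, rejected = load(submissions, CONTROLLED_VOCABULARY)
    print(len(accepted), "accepted;", len(rejected), "rejected")
    for obs, problems in rejected:
        print(obs.site, obs.plant_id, problems)
```

Rejecting non-conforming records at load time, rather than correcting them silently, keeps the central store consistent and returns the decision on ambiguous entries to the site that made the measurement.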




   
 
    

© CSIRO 1996-2013