Australian Systematic Botany Australian Systematic Botany Society
Taxonomy, biogeography and evolution of plants
RESEARCH ARTICLE (Open Access)

Towards a new online species-information system for legumes

Anne Bruneau https://orcid.org/0000-0001-5547-0796 A,M, Leonardo M. Borges https://orcid.org/0000-0001-9269-7316 B, Robert Allkin C, Ashley N. Egan https://orcid.org/0000-0001-7803-4444 D,L, Manuel de la Estrella https://orcid.org/0000-0002-4484-3566 E, Firouzeh Javadi F, Bente Klitgaard https://orcid.org/0000-0002-8509-0556 G, Joseph T. Miller https://orcid.org/0000-0002-5788-9010 H, Daniel J. Murphy https://orcid.org/0000-0002-8358-363X I, Carole Sinou https://orcid.org/0000-0002-6718-6669 A, Mohammad Vatanparast https://orcid.org/0000-0002-9644-0566 J and Rong Zhang K

A Institut de Recherche en Biologie Végétale and Département de Sciences Biologiques, Université de Montréal, 4101 Sherbrooke Est, Montréal, QC, H1X 2B2, Canada.

B Universidade Federal de São Carlos, Departamento de Botânica, Rodovia Washington Luís, quilômetro 235, São Carlos, SP, 13565-905, Brazil.

C Biodiversity Informatics and Spatial Analysis Department, Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK.

D Department of Biosciences, Aarhus University, Ny Munkegade 116, DK-8000 Aarhus, Denmark.

E Departamento de Botánica, Ecología y Fisiología Vegetal, Campus de Rabanales, Universidad de Córdoba, E-14071, Córdoba, Spain.

F Institute of Decision Science for a Sustainable Society, Kyushu University, 744 Motooka, Nishiku, Fukuoka, 819-0395, Japan.

G Identification and Naming Department, Biodiversity Informatics and Spatial Analysis Department, Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK.

H Global Biodiversity Information Facility, 15 Universitetparken, DK-2100 Copenhagen, Denmark.

I Royal Botanic Gardens Victoria, Birdwood Avenue, Melbourne, Vic. 3004, Australia.

J Department of Geosciences and Natural Resource Management, Rolighedsvej 23, DK-1958 Frederiksberg C, University of Copenhagen, Denmark.

K Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, No.132 Lanhei Road, Kunming, 650201, PR China.

L Present address: Department of Biology, Utah Valley University, 800 W University Parkway, Orem, UT 84058, USA.

M Corresponding author. Email: anne.bruneau@umontreal.ca

Australian Systematic Botany 32(6) 495-518 https://doi.org/10.1071/SB19025
Submitted: 2 April 2019  Accepted: 25 July 2019   Published: 1 October 2019

Journal Compilation © CSIRO 2019 Open Access CC BY-NC-ND

Abstract

The need for scientists to exchange, share and organise data has resulted in a proliferation of biodiversity research-data portals over recent decades. These cyber-infrastructures have had a major impact on taxonomy and helped the discipline by allowing faster access to bibliographic information, biological and nomenclatural data, and specimen information. Several specialised portals aggregate particular data types for a large number of species, including legumes. Here, we argue that, despite access to such data-aggregation portals, a taxon-focused portal, curated by a community of researchers who specialise in a particular taxonomic group and who have the interest, commitment, existing collaborative links, and knowledge necessary to ensure data quality, would be a useful resource in itself and make important contributions to more general data providers. Such an online species-information system focused on Leguminosae (Fabaceae) would serve useful functions in parallel to and different from international data-aggregation portals. We explore best practices for developing a legume-focused portal that would support data sharing, provide a better understanding of what data are available, missing, or erroneous, and, ultimately, facilitate cross-analyses and direct development of novel research. We present a history of legume-focused portals, survey existing data portals to evaluate what is available and which features are of most interest, and discuss how a legume-focused portal might be developed to respond to the needs of the legume-systematics research community and beyond. We propose taking full advantage of existing data sources, informatics tools and protocols to develop a scalable and interactive portal that will be used, contributed to, and fully supported by the legume-systematics community in the easiest manner possible.

Additional keywords: data exchange, data standards, genetic data, nomenclature, occurrence data, phylogenetic data, specialist data curation, taxonomic backbone, trait data.

