Biodiversity Knowledge Organization System
The GBIF Knowledge Organization System (KOS) task group (Catapano et al 2011) provided recommendations for the uptake of KOS technology by GBIF. The KOS report, as well as previous task group reports on metadata (Jones et al 2009) and persistent identifiers (Cryer et al 2010; Richards et al 2011), recommended building on existing persistent identifiers, or establishing new ones, for each vocabulary term and concept. These reports further recommended reusing existing terms and concepts wherever possible.
Biodiversity Information Standards (TDWG) maintains standards for biodiversity data. Many of these standards have in the past been expressed using the XML Schema language (XSD). With the advance of the semantic web there is growing interest within TDWG in expressing vocabularies as RDF resources. We have proposed a Vocabulary Management Task Group (VoMaG) (Endresen et al 2012a) to develop best practices and guidelines for maintaining RDF vocabularies of terms and concepts from biodiversity informatics. Members of the task group would contribute to the evaluation of these best practices. Previous task groups of this sort have produced a technical report to summarize their results. One of the tasks of the vocabulary management task group would be to evaluate software tools, including ISOcat and Semantic MediaWiki, for the collaborative development and maintenance of vocabularies of basic concepts (declared here to be reused by other resources). These resources are likely to be maintained by a small group of people and used by a much larger group.
One of the uses for these basic concepts is as a repository of terms for the data sharing profiles in use by the GBIF network (Endresen et al 2012b). These data sharing profiles include the Darwin Core (Wieczorek et al 2012) "extensions" and the "vocabularies" of controlled values that are declared for some of the terms included in the "extensions". The overall outline is that the terms included in the "extensions" and in the "vocabularies" would be drawn from the basic concepts declared by the RDF vocabularies. The GBIF Vocabulary Server (Harman et al 2009) provides a collaborative software tool for the development of Darwin Core "extensions" and "vocabularies" of controlled values. The GBIF Vocabulary Server is based on the Scratchpad platform maintained by the ViBRANT project (Smith et al 2009).
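The outline above, where "extensions" and "vocabularies" draw their terms from a shared set of basic concepts, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GBIF Vocabulary Server data model: the `http://example.org/terms/` namespace, the `Concept` class and the `build_vocabulary` and `build_extension` helpers are all hypothetical, and the term names are borrowed from Darwin Core purely as familiar labels.

```python
from dataclasses import dataclass

# Hypothetical namespace; real terms would use their own persistent identifiers.
BASE = "http://example.org/terms/"


@dataclass(frozen=True)
class Concept:
    """A basic concept declared once and identified by a persistent URI."""
    uri: str
    label: str


def build_vocabulary(concepts):
    """Index the basic concepts by URI, rejecting duplicate identifiers."""
    vocab = {}
    for concept in concepts:
        if concept.uri in vocab:
            raise ValueError(f"duplicate identifier: {concept.uri}")
        vocab[concept.uri] = concept
    return vocab


def build_extension(name, term_uris, vocab, controlled_values=None):
    """Assemble an extension whose terms and controlled values must all
    be drawn from concepts already declared in the vocabulary."""
    controlled_values = controlled_values or {}
    referenced = list(term_uris)
    for values in controlled_values.values():
        referenced.extend(values)
    missing = sorted({uri for uri in referenced if uri not in vocab})
    if missing:
        raise KeyError(f"{name}: undeclared terms {missing}")
    return {
        "name": name,
        "terms": [vocab[uri] for uri in term_uris],
        "controlled_values": {
            term: [vocab[uri].label for uri in values]
            for term, values in controlled_values.items()
        },
    }


vocab = build_vocabulary([
    Concept(BASE + "scientificName", "scientificName"),
    Concept(BASE + "basisOfRecord", "basisOfRecord"),
    Concept(BASE + "PreservedSpecimen", "PreservedSpecimen"),
    Concept(BASE + "HumanObservation", "HumanObservation"),
])

extension = build_extension(
    "occurrence-sketch",
    [BASE + "scientificName", BASE + "basisOfRecord"],
    vocab,
    controlled_values={
        BASE + "basisOfRecord": [
            BASE + "PreservedSpecimen",
            BASE + "HumanObservation",
        ],
    },
)
```

In this sketch an extension cannot introduce a term of its own: `build_extension` raises `KeyError` for any identifier that has not first been declared among the basic concepts, which is one way to enforce reuse.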
A further best practice guideline could be to recommend the "reuse" of terms from a flat RDF vocabulary when building richer ontologies (OWL resources). In our current thinking, the OWL ontologies would not provide the normative definition of a basic term, but rather an instance of the term based on the normative definition provided by the flat RDF vocabulary resource. The rationale is that the more elaborate ontology (OWL) resources might be too complex for most users to consume and would create a bottleneck for the "reuse" of the terms. A further rationale is that it might be difficult to agree on ONE semantic ontology description of such concepts, and that more than one such ontology resource might appropriately be developed to express differing opinions on the richer semantics of a term. OWL ontologies could be published from an instance of the NCBO BioPortal platform.
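One possible reading of this guideline is sketched below, with triples modelled as plain Python tuples rather than with an RDF library. The flat vocabulary holds the only definition of a term, and each ontology adds its own semantics while pointing back to the flat term. The namespaces under `http://example.org/`, the helper names `reuse_term` and `check_ontology`, and the choice of `rdfs:isDefinedBy` as the linking property are assumptions made for illustration; the source does not prescribe a particular property.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical namespaces for the flat vocabulary and two ontologies.
VOCAB = "http://example.org/terms/"
ONTO_A = "http://example.org/ontology-a/"
ONTO_B = "http://example.org/ontology-b/"

# Flat vocabulary: the single place where the normative definition lives.
flat_vocabulary = {
    (VOCAB + "Specimen", SKOS + "definition", "placeholder definition"),
}


def reuse_term(ontology_ns, local_name, flat_uri, parent_uri):
    """Declare an ontology term that refers to the flat term for its
    definition and adds only richer semantics (here, a superclass)."""
    subject = ontology_ns + local_name
    return {
        (subject, RDFS + "isDefinedBy", flat_uri),
        (subject, RDFS + "subClassOf", parent_uri),
    }


def check_ontology(ontology, flat):
    """Return a list of problems: definitions restated in the ontology,
    or references to terms missing from the flat vocabulary."""
    declared = {s for (s, _, _) in flat}
    problems = []
    for s, p, o in sorted(ontology):
        if p == SKOS + "definition":
            problems.append(f"{s} restates a definition")
        if p == RDFS + "isDefinedBy" and o not in declared:
            problems.append(f"{s} refers to undeclared term {o}")
    return problems


# Two ontologies may hold different opinions about the same basic term.
ontology_a = reuse_term(ONTO_A, "Specimen", VOCAB + "Specimen",
                        ONTO_A + "MaterialEntity")
ontology_b = reuse_term(ONTO_B, "Specimen", VOCAB + "Specimen",
                        ONTO_B + "Evidence")

problems_a = check_ontology(ontology_a, flat_vocabulary)
problems_b = check_ontology(ontology_b, flat_vocabulary)
```

Both ontologies pass the check even though they place the term under different parent classes, which mirrors the point that several ontology resources can coexist over one flat, normative vocabulary.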
We have proposed that the task group convene at the GBIF Community site http://community.gbif.org/pg/groups/21382/vocabulary-management/. Members of the task group would start by creating a user profile at the GBIF Community site and joining the Vocabulary Management group. The ISOcat demo http://kos.gbif.org/isocat/interface/ and the Semantic Wiki (this site) should also be open for users to start signing up.
The annual TDWG conference (http://www.tdwg.org/) could provide an opportunity for some of the task group members to meet. However, most of the work would take the form of contributions to the discussions and to the evaluation of the software tools.
- Catapano T, Hobern D, Lapp H, Morris RA, Morrison N, Noy N, Schildhauer M, and Thau D (2011). Recommendations for the use of knowledge organization systems by GBIF. Released on 4 February 2011. Global Biodiversity Information Facility (GBIF), Copenhagen. Available at http://www.gbif.org/orc/?doc_id=2942&l=en, verified 26 March 2012.
- Cryer P, Hyam R, Miller C, Nicolson N, O Tuama E, Page R, Rees J, Riccardi G, Richards K, and White R (2010). Adoption of persistent identifiers for biodiversity informatics: Recommendations of the GBIF LSID GUID task group, 6 November 2009. Global Biodiversity Information Facility (GBIF), Copenhagen. Available at http://www.gbif.org/orc/?doc_id=2956&l=en, verified 26 March 2012.
- Endresen DTF, Ó Tuama É, and Remsen D (2012a). Vocabulary Management Task Group Charter: A Task Group of the TAG Interest Group. [Technical Report] Available at http://community.gbif.org/pg/blog/read/21387/
- Endresen DTF, Ó Tuama É, and Remsen D (2012b). Biodiversity Knowledge Organization System: Proposed Architecture. [Technical Report] Available at http://community.gbif.org/pg/file/read/21582/
- Harman KT, Hyam R, and Remsen DP (2009). Vocabularies - Managing Them. Proceedings of TDWG 2009. Available at http://www.tdwg.org/proceedings/article/view/605, verified 26 March 2012.
- Jones MB, Bertrand N, Holetschek J, Hutchison V, Ko BC-J, Suarez-Mayorga A, Meaux M, Ulate W, Watts D, Robertson T, and O Tuama E (2009). Report of the GBIF metadata implementation framework task group (MIFTG). September 15, 2009. Global Biodiversity Information Facility (GBIF), Copenhagen. Available at http://imsgbif.gbif.org/CMS_NEW/get_file.php?FILE=2d85d0e8c76408129024c09aa072d6, verified 26 March 2012.
- Lapp H, Morris RA, Catapano T, Hobern D, and Morrison N (2011). Organizing our knowledge of biodiversity. Bulletin of the American Society for Information Science and Technology 37(4): 38-42. DOI: 10.1002/bult.2011.1720370411
- Richards K, White R, Nicolson N, and Pyle R (2011). A beginner’s guide to persistent identifiers, version 1.0. Released on 9 February 2011. Global Biodiversity Information Facility (GBIF), Copenhagen. Available at http://www.gbif.org/orc/?doc_id=2428, verified 26 March 2012.
- Smith VS, Rycroft SD, Harman KT, Scott B, and Roberts D (2009). Scratchpads: a data-publishing framework to build, share and manage information on the diversity of life. BMC Bioinformatics 10 (Suppl 14) p. S6. DOI:10.1186/1471-2105-10-S14-S6. Available at http://www.biomedcentral.com/1471-2105/10/S14/S6, verified 26 March 2012.
- Wieczorek J, Bloom D, Guralnick R, Blum S, Döring M, Giovanni R, Robertson T, Vieglais D (2012). Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS ONE 7(1): e29715. DOI: 10.1371/journal.pone.0029715