|
PhytoKeys 2012
From text to structured data: Converting a word-processed floristic checklist into Darwin Core Archive formatKeywords: Data mining , taxonomic checklists , Darwin Core Archive Abstract: The paper describes a pilot project to convert a conventional floristic checklist, written in a standard word processing program, into structured data in the Darwin Core Archive format. After peer-review and editorial acceptance, the final revised version of the Checklist was converted into Darwin Core Archive by means of regular expressions and published thereafter in both human-readable form as traditional botanical publication and Darwin Core Archive data files. The data were published and indexed through the Global Biodiversity Information Facility (GBIF) Integrated Publishing Toolkit (IPT) and significant portions of the text of the paper were used to describe the metadata on IPT. After publication, the data will become available through the GBIF infrastructure and can be re-used on their own or collated with other data.
|