The International Barcode of Life Project (iBOL) will assemble a reference library composed of short, standardized gene sequence profiles (DNA barcodes) to enable the molecular identification of known species and facilitate the discovery of new ones. iBOL aims to make this reference library comprehensive, accessible, and relevant to the needs of both the research community and end users in the public and private sectors.
The Ft. Lauderdale Conference on sharing data from large-scale biological research projects defined a community resource project as “a research project specifically devised and implemented to create a set of data, reagents or other material whose primary utility will be as a resource for the broad scientific community.” Members of the International Barcode of Life Project (iBOL) are constructing such a community resource and, in accordance with the guidelines established at Ft. Lauderdale, iBOL is committed to rapid data release and sharing. This position is typical of most large-scale collaborations in genetics (e.g. the International HapMap Project) and mirrors the data release policies of organizations such as the National Human Genome Research Institute, Genome Canada, the Gordon and Betty Moore Foundation in the USA, and the Wellcome Trust in the UK.
The iBOL data release and resource sharing policy (as outlined below) seeks to accelerate the timely development of products that will benefit humankind by providing rapid access to the primary outputs from iBOL: DNA sequences associated with high-quality metadata, including taxonomic assignments.
The working philosophy of the iBOL project is full release of data within 18 months of a sequence being produced. This 18-month period is expected to shorten as the project progresses, and from the outset more rapid data release is encouraged whenever practical.
The finer details of this data release policy and its guidelines were developed for the Biodiversity Institute of Ontario, at the University of Guelph, and are outlined below. We recognise that different core facilities and major barcoding projects may wish to adapt these to reflect local funding requirements and constraints. Nevertheless, participation in the iBOL project as a contributor towards building the community resource of the barcode reference library requires adherence to the general principles of the data release policy (full data release to a public sequence repository within 18 months, with more rapid data release encouraged whenever possible).
Sequences and samples processed at the Biodiversity Institute of Ontario will have their data housed in the Barcode of Life Datasystems (BOLD). Data submitted to iBOL-affiliated projects in BOLD will be made public in BOLD and transferred to GenBank for public release prior to user-initiated publication. Data release will follow a two-phase process:
Phase I will involve the quarterly release of all generated sequence data and high-level taxonomic information. This early release is intended to liberate enough information to be useful to other researchers and to monitor progress in the growth of barcode records for each iBOL Working Group. It will be performed automatically and will involve data that can be released following computerized quality checks and generation of Barcode Index Numbers (BINs; for animals). In detail, the following data will be released and made publicly available in BOLD in Phase I (within three months of sequence generation):
Phase II will involve the release of all additional data elements that require manual curatorial effort and detailed taxonomic enquiry. Phase II will occur when manuscripts are submitted for publication or 18 months after generation of the barcode record, whichever is sooner. Some researchers have indicated their intent to support rapid release of all data elements even if the early versions of the release involve substantial errors in taxonomic assignment; such errors will be corrected through an ongoing update process. Phase II data include the full release of the best available taxonomic information: an identification (preferably to species level), the individual credited for the identification, and an indication of identification certainty.
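The timing rules for the two phases can be sketched in a few lines of Python. This is only an illustrative sketch, not part of BOLD or of any iBOL tooling: the function names are hypothetical, and it assumes that "three months" and "18 months" are counted as whole calendar months from the date the sequence was generated.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping the day
    to the last day of the target month (assumed convention)."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def phase_one_deadline(generated: date) -> date:
    """Latest Phase I release date: within three months of generation."""
    return add_months(generated, 3)


def phase_two_deadline(generated: date,
                       manuscript_submitted: Optional[date] = None) -> date:
    """Latest Phase II release date: manuscript submission or 18 months
    after generation, whichever is sooner."""
    cap = add_months(generated, 18)
    if manuscript_submitted is None:
        return cap
    return min(manuscript_submitted, cap)
```

Under these assumptions, a record generated on 15 January 2011 would reach its Phase I deadline on 15 April 2011 and its Phase II deadline on 15 July 2012, unless a manuscript is submitted earlier.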
The members of iBOL recognize that it is very important for all researchers, whether in academia, government, or industry, and whether graduate students, post-doctoral fellows, or professors, to be acknowledged for the data that they have generated and made available to the wider scientific community. This is important for several reasons, including the fact that publications and citations are globally recognized as performance measures for projects and for researchers. It is also important because those who wish to make use of publicly released data benefit from knowing who generated them, so that they may explore further partnership or collaboration opportunities, seek further information, or offer data of their own that would benefit the original researcher. Thus, it is important that the data released as described above contain information about the submitter, and that appropriate text is included with those releases to encourage citations and acknowledgments. Users of the data should acknowledge the source.
There are other mechanisms whereby researchers may receive appropriate accreditation for early data release. Two such avenues that are strongly encouraged by iBOL members are: Project Description and Data Release Publications. These serve a multitude of purposes by (a) providing information for accreditation of data submitters, so that those researchers may be cited, (b) providing the iBOL team and the greater research and public community with opportunities to provide input and fresh data that can be used to refine and improve upon preliminary data, (c) providing opportunities for information exchange that can lead to new partnerships and new funding being leveraged, and (d) providing a forum for a "statement of intent", in which researchers can outline their intended use of the data under the terms of the Fort Lauderdale agreement, to avoid being "scooped" by other researchers using the data for the same purpose.
Project Description Publications are just that: predictive descriptions of large projects undertaken by teams such as iBOL. They may contain little or no data, or preliminary data, but contain a description of the project, its objectives, the team, methodologies that will be used, timelines and deliverables, and mechanisms and timing of data release. Examples exist for other community resource projects, for example see:
Project descriptions may also be purely web-based; see, for example, the OneKP (One thousand transcriptomes of plants) project.
Data Release Publications are peer-reviewed publications that describe preliminary datasets within a project. It is recognized and stated within the publication that these are preliminary data that will be refined and further analyzed at later stages in the project, e.g. Hubert et al., 2008. Although the data and taxonomic identifications in that paper are relatively complete, additional validation derived from ongoing research in this group will provide much value to the scientific community and to the public data resource.
iBOL members consider all barcode data within BOLD a community resource to be shared publicly according to the terms and conditions outlined in this policy. There is no Intellectual Property associated with these data.
Generally, there are few privacy concerns associated with DNA barcode data. However, some restrictions may apply, for example, in the case of data associated with samples within the Human Pathogen Working Group. Any iBOL researcher who collects samples from human subjects is expected to comply with national and institutional ethical requirements, and all proper documentation must be submitted before the start of specific sub-projects. Consistent with applicable privacy legislation or other policies, any data associated with such projects must be sufficiently anonymized that no personal identification can be made.
As samples are collected to develop the iBOL DNA Barcode Library, some of these samples will be taken from ecologically fragile and sensitive areas. Some countries may have concerns about releasing information that may publicly identify those ecologically sensitive sites. For example, access to GPS data of orchid species within tropical forests could increase the risk to those species. If such a concern is raised, iBOL researchers will commit to holding such sensitive data confidential, or providing other means of anonymizing the data; GPS data are not mandatory.
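One possible way to anonymize locality data, shown here purely as an illustration, is to coarsen coordinates before release or to withhold them entirely, since GPS data are not mandatory. The sketch below is an assumption, not a procedure specified by this policy or by BOLD; the function name and the `decimals` and `withhold` parameters are hypothetical.

```python
from typing import Optional, Tuple


def generalize_coordinates(lat: float, lon: float,
                           decimals: int = 1,
                           withhold: bool = False
                           ) -> Optional[Tuple[float, float]]:
    """Return coarsened coordinates for public release.

    decimals=1 rounds to 0.1 degree, roughly 11 km in latitude, which
    hides the exact collection site. withhold=True drops the
    coordinates altogether.
    """
    if withhold:
        return None
    return (round(lat, decimals), round(lon, decimals))


# A hypothetical orchid record from a sensitive site.
public_coords = generalize_coordinates(-0.6789, -76.3987, decimals=1)
hidden_coords = generalize_coordinates(-0.6789, -76.3987, withhold=True)
```

The appropriate precision, or whether to withhold coordinates altogether, would depend on the concerns raised by the country or institution involved.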
Several members of the iBOL team are researchers within government departments and are mandated to monitor for invasive or harmful species within their country or region. Such departments and organizations have protocols that proscribe public release of information about invasive or otherwise harmful species. There may be concerns that early release of DNA barcode data for an invasive species, as described above, may contravene government protocols. It is incumbent upon any iBOL researcher who is bound by such protocols to include an assessment of identification of invasive species in the QA/QC protocol, as described above. If preliminary data identify an invasive or harmful species, then those data must not be released until the restrictions or requirements of the specific government or department are satisfied. The Board of Directors of iBOL will be informed when such situations arise that limit the release of data.
All members of iBOL will adhere to this Policy. Any requests for extensions of timelines, advice on interpretation, or other questions will be directed to the iBOL Board of Directors.