Here’s an Editorial from tomorrow’s Nature on the need for scientists to routinely record spatial data with samples, viral sequences, field observations, and other entities. It proposes a major change in the policies of journals and databases to mandate recording of such data as a prerequisite for having a scientific paper accepted. Feel free to use this blog’s comment facility to express your opinion on this, or email me at firstname.lastname@example.org.
During the research for this Editorial, Nature picked up considerable frustration from spatial scientists in many fields about the fact that the lack of spatial data in otherwise valuable datasets made them all but useless for more quantitative spatial analysis. For the sake of brevity and readability in this short article, we reduced the concept of spatial data to latitude and longitude, but clearly any working system would require more detailed spatial standards, depending on the field.
Nature 453, 2 (1 May 2008) | doi:10.1038/453002a; Published online 30 April 2008
A place for everything
More researchers must record the latitude and longitude of their data.
Who, what, where and when? Among the basic elements of scientific record-keeping, too often the ‘where?’ gets neglected. Now advances in satellite-positioning technology, online databases and geographical information systems offer opportunities to make good that neglect, and to add a much-needed spatial dimension to many types of biological research. Location data are essential for those modelling species’ responses to climate change, or the spread of viruses, for example. Failure to include spatial information from the get-go may close off potentially highly productive routes to analysis — including those not yet foreseen. But those data are frequently inadequate or absent.
Many museums and herbaria are trying to make good this problem as best they can, geo-referencing their collections and putting them online. This frequently requires nightmarish work translating place names from various historical eras, languages and conventions into latitudes and longitudes. Although this is a necessary evil in matters retrospective, going forward there is a much simpler and easier answer in the form of coordinates and a time-stamp taken from the Global Positioning System (GPS) at the point of capture, or any other specified point of relevance.
This technology means that there is now much less excuse for allowing spatial data to fall by the wayside simply because they are not relevant to the data collectors’ project in hand. Not only are the data easily collected, they are easily stored too. GenBank, for example, introduced fields for latitude and longitude in the metadata attached to its nucleotide sequence records in 2005. But few yet contain such information.
Gene sequence and structure databases have flourished in part because journals require authors to submit published data to them. It is worth considering a similar requirement that all samples in a published study be registered, along with GPS coordinates, in online databases such as the Global Biodiversity Information Facility. At the same time, it would behove spatial scientists to articulate to the broader research community the potential of recording and making accessible spatial data in the appropriate formats — and the painlessness of the process.