Paleonet: An Open Letter in Support of Digital Data Archiving

Jere Lipps jlipps at berkeley.edu
Sun Mar 27 14:04:59 GMT 2011


Ross:

First, I agree with your efforts to archive data and to make it 
available to all.  I've been involved with this in several ways for a 
long time, and have come to identify the economics of this as a 
significant obstacle.

Thank you for the reference to Brooks Hanson's editorial on this in 
Science--I had not seen it.  My reference was to comments he made 
last Fall at a special seminar at Berkeley.   In his editorial he more 
mildly states that: "However, online supplements have too often become 
unwieldy, and journals are not equipped to curate huge data sets. For 
very large databases without a plausible home, we have therefore 
required authors to enter into an archiving agreement, in which the 
author commits to archive the data on an institutional Web site, with a 
copy of the data held at Science. But such agreements are only a stopgap 
solution; more support for permanent, community-maintained archives is 
badly needed."  Much of this is about Science's new effort, but the last 
statement refers to the problem that I noted, for which Brooks now writes 
that support is "badly needed".   In his seminar, he added that funding to fix 
this need was very uncertain and that some existing data repositories 
were in financial distress already.  I interpreted this as a "bleak 
future" [Brooks may not have used that phrase, I don't remember, but 
that was my view after listening to him.]  This is a major problem.  I 
cc him on this in case he might clarify any of this for PaleoNet readers.

I cannot accept your analysis of the charges for this because I have no 
knowledge of the costs, not only of depositing the data but also of 
maintaining the servers and databases.  It's quite possible that doing it 
right will require hiring people, buying large-capacity servers, etc.  A $25 
charge to authors might pay to get it deposited, but how much is required 
for the future of the data?  I don't know.  Perhaps some readers of PaleoNet 
might know or know someone who does.

Science magazine's suggestion to temporarily use local institutional 
resources is not sustainable either, especially in the current financial 
crises our universities and colleges find themselves in.   Even in the 
best of times, universities never supported this kind of activity, the  
faculty instead generating the support through grants and gifts.  That's 
the way we put our UCMP data on-line--Sun Microsystems donated the 
servers and our grad students and staff, paid by UCMP funds, populated 
the database.

Along with your admirable personal efforts to get data archives 
functioning, we need to figure out how to pay for it realistically and 
then how to get the money.  It will come in several ways, I'm sure.  
Perhaps your efforts should happen first to demonstrate the desires, 
needs and aims of our community, then we'd have a foundation for funding 
requests.   Clearly if we don't make an issue of this we'll never get 
it.   So I too encourage everyone to sign your petition.

Jere


On 3/27/2011 8:38 AM, rcpm20 at bath.ac.uk wrote:
> Dear Jere,
>
> Many thanks for taking the time to read and comment on our Open Letter.
> We encourage every palaeontologist and everyone with an interest in 
> palaeontology to do the same so that all our views are represented in 
> one place for all to see.
>
> We believe data archiving is 'pushed' by the NSF and increasingly all 
> other funding bodies, all over the world, because there are a 
> multitude of compelling reasons to do so, as we have briefly outlined 
> in our Open Letter (http://supportpalaeodataarchiving.co.uk/).
>
> Yes, we acknowledge that there is a monetary cost associated with data 
> archiving, and it is an ongoing cost. But in my opinion, relative to 
> the potential benefits reaped (more science, more synthesis, more 
> transparency, more 'reproducibility'...) it will certainly be 
> well worth it as evaluated by any considered cost/benefit analysis.
>
> To quantify this, let us compare the 'page costs' of publishing a 
> paper with the cost of archiving the associated data with Dryad. I'm 
> led to believe that Dryad currently charges ~25USD per article to 
> archive data, charged to the journal (pers. comm. Todd Vision) - a 
> reasonable and wise request to ensure the long-term future of the 
> availability of the data. Page costs vary from journal to journal:  
> $55 per page (Evolution), $100 per page (Paleobiology), $195 per page 
> (PNAS), $1300 (total, PLoS ONE), with the notable exception of ZooKeys 
> which can be *free* provided certain conditions are met [1]. Clearly, 
> in my opinion, relative to page charges, data archiving charges are 
> slight, even insignificant.
>
> As for the smaller journals published by smaller, less wealthy 
> societies, I would advocate a 'Freemium' approach and archive data in 
> free repositories, e.g. FigShare, MorphoBank, MorphBank, PaleoDB, 
> ZooBank, BioTorrents, TreeBASE II, and others. These may perhaps be 
> less certain in terms of long-term sustainability, but they have good 
> short-term prospects, and they're free, so why not? As for the 
> additional time cost, it took me only 60 seconds to upload a test file 
> and associated metadata to FigShare just recently, so again, this is 
> not a problem.
>
> And to those who say that $25 per article may only subsidise the 'true 
> cost' of archiving (I don't know the economics of this myself), this 
> is where funding body support comes in: would it not be hypocritical 
> for funding bodies to urge data archiving and not help to fund it, at 
> least in part?
>
> The key thing *we* can do to ensure that archiving stays funded and 
> viable is to embrace and support good and useful data archives by 
> depositing our data there AND re-using data from these archives (with 
> full and proper citation of both the original dataset(s) used AND the 
> repository from which one obtained the dataset(s)).
>
> Finally, as for the article you mention:
> Hanson, B., Sugden, A., and Alberts, B. 2011. Making data maximally 
> available. Science 331:649. http://dx.doi.org/10.1126/science.1203354
>
> I encourage everyone to read this if you have access. It's a good 
> article and I do not think it finds the "future bleak" as you say - 
> quite the opposite in my opinion. There is a wealth of data out there 
> and we must learn new ways and build new tools to be able to fully 
> make use of it all.
>
> I would think from this that the future is very bright for data 
> archiving and palaeontological science :)
>
>
> Kind regards to one and all,
>
> Ross
>
>
> 1: 
> http://pensoftonline.net/zookeys/index.php/journal/about/submissions#authorFees 
>
> thanks to V. Blagoderov (NHM) for pointing this out to me
>
> -/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-
> Ross Mounce
> PhD Student
> Fossils, Phylogeny and Macroevolution Research Group
> University of Bath
> 4 South Building, Lab 1.07
> http://bath.academia.edu/RossMounce
> -/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-
>



