How do you evaluate a database?


I was speaking with a colleague recently who, like many of us, had experienced the frustration of trying to support his on-line resources.  He has assembled a comprehensive on-line resource; it is used by the community and has been used by others to publish their studies.  It is not GenBank or EBI; it is one of the thousands of on-line databases created by individuals or small groups that the Neuroscience Information Framework and others have catalogued.  My colleague has spent years on this resource, pored over hundreds of references, and entered close to a million statements in the database.  By many measures, it is a successful resource.  But in the grant review, he was criticized for not having enough publications.  I experienced the same thing in a failed grant application for the resource that I had created, the Cell Centered Database.  In fairness, that was not the most damning criticism, but it just seemed so very misplaced.  I had succeeded in standing up and populating a resource well before there was any thought of actually sharing data.  People used the database and published papers on it, but apparently I should have been spending more time writing about it and less time working on it.

The problems of creating and maintaining these types of resources are well known and were discussed at Beyond the PDF2: to be funded, you have to be innovative.  But you don't have to be innovative to be useful.  To quote or paraphrase Carole Goble at the recent conference, "Merely being useful is not enough."

But presumably there is a threshold of perceived value where "merely being useful" is enough.  I am thinking of the Protein Data Bank or PubMed.  These resources are well funded and well used, but hardly innovative.  I am guessing that many resources like those my colleague and I created were started with the hope that they would be as well supported and as integral to people's work as the PDB or PubMed.  But the truth is, they are not in the same class.  Still, they are valuable and represent works of scholarship.  We are now allowed to list them on our biosketch for NSF.  So my question to you is: how do we evaluate these thousands of smaller databases?

Ironically, our peers have no trouble evaluating an article about our databases, but they have much more trouble evaluating the resource itself.  How does one weigh 30,000 curated statements against one article?  What level of page views, visits, downloads, and citations makes a database worthwhile?  If my colleague had published 10 papers, the reviewers likely wouldn't have checked how often they were cited, particularly if they were recent.  What is the equivalent of a citation classic for databases?  If you don't have the budget of NCBI, then what level of service can you reasonably expect from these databases?  I thought that the gold standard was a published study that utilized your database to do something else, by a group unconnected to you.  Grant reviewers found that unconvincing.  Perhaps I didn't have enough?  But how many of these do you need, relative to the size of your community, and on what time frame should you expect them to appear?  Sometimes studies take years to publish.  Do they need to be from the community that you thought you were targeting (and whose institute may have funded your resource), or does evidence from other communities count?

So perhaps if we want to accept databases and other artefacts in lieu of the article, we should help define a reasonable set of criteria by which they can be evaluated.  Anyone care to help here?

About Maryann Martone

A short biography:

I received my BA from Wellesley College in biological psychology and my Ph.D. in neuroscience in 1990 from the University of California, San Diego, where I am currently a Professor in the Department of Neuroscience. My background is in neuroanatomy, particularly light and electron microscopy, but I spend most of my...




It is fantastic that our community has been able to push funding agencies to recognize entities such as databases as scholarly products of our scientific work, as evidenced by the new NSF biosketch guidelines (yay NSF). However, not only do we require unique IDs for such entities so that we can reference them in other places within the scholarly communication cycle and build tooling around them, but we also need to provide guidance as to how such things should be evaluated. For all too long, informatics resources have been viewed by review panels as not innovative enough. Yet scientists are as dependent on them as they are on brown paper towels. And, similar to brown paper towels, scientists use such resources in innovative ways on a daily basis. So how do we change program officers' and review panels' consideration of such resources, so that different criteria are applied than those used for traditional hypothesis-driven research? In a world where "big data" is a hot topic, our funding agencies need a new mechanism to review the resources, and the people who build them, that enable biologists to take advantage of all of this data. How can such resources or people not be innovative?

How many of you now list such things as ontologies, databases, and code on your CV? Do it now! Perhaps one thing we can do is to change how we are evaluated one CV at a time. But this will only help show our innovation to our evaluators; it doesn't help with the evaluation of the databases themselves, whether their use, their functioning, or even proposals to build new resources. How is a review panel to know that a new resource is needed, based on an evaluation of the landscape of existing resources? For new resources, shouldn't one have an adoption plan? An integration plan?

The question of how to evaluate something that isn't a journal article or book has been a preoccupation of digital humanities practitioners for some time.  Some of the questions that these scholars have attempted to give guidance on relate to the narrow question of how to evaluate a database.  Here is a hastily assembled list of links on the topic: (see the "Evaluating Digital Scholarship" section of this issue, which is available through open access)

Thanks, Kevin.  These are very useful, and I think we need to establish the same for the sciences.  I liked this guideline from the IDHMC document:

"Approximating Equivalencies: Is a digital research project “equivalent” to a book published by a university press, an edited volume, a research article, or something else?  These sorts of questions are often misguided since they are predicated on comparing fundamentally different knowledge artifacts and, perhaps more problematically, consider print publications as the norm and benchmark from which to measure all other work.  Reviewers should be able to assess the significance of the digital work based on a number of factors: the quality and quantity of the research that contributed to the project; the length of time spent and the kind of intellectual investment of the creators and contributors; the range, depth, and forms of the content types and the ways in which this content is presented; and the nature of the authorship and publication process."


Thanks a lot for the interesting question, Maryann.

I hope not to be mistaken for a waterfall type for mentioning conformance to requirements :-) which is one of the classic metrics for quality evaluation in systems engineering.

Also, a quality model can be set up either upfront or retrospectively; conformance to it is then what evaluates quality. A quality model has a multiplicity of dimensions and perspectives, ideally specified by the widest possible stakeholder base!

I haven't seen this mentioned much in the links already provided in the other comments.