Pre-Conference Meetings Sunday, January 11, 2015

REGISTRATION: All registered attendees of the FORCE2015 conference are welcome to attend any pre-conference meetings; no advance sign-up is necessary. PRE-CONFERENCE MEETINGS LOCATION: The Sunday meetings will be held at the University of Oxford, E-Research Center.

Do we know how to cite data? If so, let's go....

08:30 - 13:00 - Open to invited attendees only

Contact: Maryann Martone

SESSION FULL (updated 1-8-2015)

Organizers: Maryann Martone, UCSD and Susanna Sansone, University of Oxford and Nature Publishing Group

As part of the US National Institutes of Health BD2K initiative, the bioCADDIE (Biomedical and healthCAre Data Discovery and Indexing Ecosystem) consortium center was established to examine and test issues surrounding the creation of a Data Discovery Index (DDI). The goal of the DDI is to do for data what PubMed (and PubMed Central) did for the literature. bioCADDIE serves as the organizational framework for investigating and testing key issues in making data discoverable, accessible and citable through task forces and pilot projects. A key component of bioCADDIE is broad engagement with the international community working across these key areas, particularly to ensure that our efforts are aligned with and informed by existing international efforts.

This workshop aims to introduce the goals and current projects under way in bioCADDIE around the area of data citation and to engage the broader community in determining whether sufficient standards, tools, infrastructure and best practices exist for us to implement data citation across NIH-funded activities. We know that standards, tools and best practices for citing data sets, whether in articles or in other artifacts, have been addressed by multiple groups spanning the academic research community, publishers and the library sciences community. Data citation serves as a mechanism for assigning credit and attribution to those who produce data, a means of linking claims about data to the data themselves, a means of discovering new data through the context in which they were used, a means of mining data through aggregation of findings around individual data, and a means of measuring impact and tracking compliance with data sharing mandates. This workshop will invite a variety of stakeholders to measure the requirements of use cases derived from the BD2K centers, bioCADDIE pilot projects/supplements and the community against proposed implementations to see how bioCADDIE should move forward in the area of data citation. For example, if the current specifications seem to meet the requirements of bioCADDIE use cases, how would we facilitate adoption and testing? If they do not, what else is needed? We anticipate that the outcome will be a revised charter that will serve as a planning document for future task forces and pilot projects.

Contribution and Attribution in the Context of the Scholar

13:00 - 18:00 - Open to all

Contact: Melissa Haendel

Scholarship is changing: no longer are scientists considered simply from the perspective of the number of papers written, citations garnered, and grant dollars awarded. There has been a fundamental shift that recognizes both the interdisciplinary, team-based approach to science and the more fine-grained characterization and contextualization of the hundreds and thousands of contributions of varying types that are necessary to move science forward. Unfortunately, little infrastructure exists to aggregate, present, and (ultimately) assess the impact of these contributions. This workshop will move this forward with a discussion of:

  • Contribution roles in publications as authors or in acknowledgements, but much more broadly the many roles played in the creation of other scholarly products such as: code, datasets, works of art, material resources like organisms and reagents, visualizations, equipment, research methods, etc.
  • Attribution for contributions at all career stages: from faculty to undergrads to citizen scientists
  • Temporal nature of scholarly activity and contribution
  • Requirements for research expertise and attribution systems that are needed to support aggregation, display, and discovery of contributors and their contributions, along with metrics and "science of team science" analysis functions.

This workshop will be highly interactive and energetic with the goal of fostering future discussions and active collaborations at FORCE15 and beyond.

Kristi Holmes, Northwestern University
Dean Krafft, Cornell University

Data Citation Support in Reference Managers

13:00 - 18:00 - Open to all

Contact: Martin Fenner and Ian Mulvany

For data citation to gain widespread adoption by researchers, it is essential that reference managers support it properly. This workshop/hackathon will look at what popular reference managers such as Papers, Mendeley, Endnote, and Zotero support already, what is missing, and what we need to work on. By the end of the day we want to have some working solutions and a clear understanding of what work is left to do.

Workshop: Wikimedia as a Platform for Science Communication

13:00 - 18:00 - Open to all

Contact: Daniel Mietchen

Wikipedia and its sister projects increasingly interact with research communication: they are amongst (a) the most consulted online resources about many scientific topics, (b) the top referrers of CrossRef DOIs, (c) the top reusers of openly licensed scholarly materials, (d) the most collaborative environments for education about research-related topics, (e) the most interesting model systems for opening up scholarly workflows.

The workshop would familiarize FORCE2015ers with the ecosystem of Wikimedia projects (which spans over 1000 wikis), some of the inner workings, existing activities in scholarly communication, and opportunities for further interaction. Participants would be encouraged to bring their own materials to be added to Wikimedia platforms, and to share past experiences (good and bad) they may have had in this regard.

Common APIs for FAIR Data Repositories

13:00 - 18:00 - Open to all

Contact: Tim Clark

Session Chairs: Luiz Olavo Bonino (DTL), Tim Clark (Harvard), Merce Crosas (Harvard), Mark Wilkinson (U Polytecnica de Madrid)

FAIR Data has been defined as data which is "Findable, Accessible, Interoperable, and Reusable". This is a useful synopsis of several requirements for citable data put forward in the Joint Declaration of Data Citation Principles.

This workshop is an activity of the Force11 Data Citation Implementation Group. Its purpose is to work on piloting the FAIRPORT model of common API models for data repositories, to support "FAIR" data. Such models are of special significance in terms of data citation; publishing workflow; metadata and identifier aggregation; content indexing; and search.

While this is a complex topic, we feel there is significant interest in and activity around this area, and the time is right to try to build some consensus (if possible!) and open the path to actual pilot implementations. Our vision is that ALL repositories commonly used for biomedical data ought to be able to meet the "FAIR" standard by implementing such an interface.

Our intent for this workshop is to:

(1) Bring together informaticians and repository managers to discuss requirements for common APIs, at least for representing and operating upon metadata, and potentially payloads as well;
(2) Review the outputs of last summer's ELIXIR FAIRPORT skunkworks project, led by Mark Wilkinson and Michel Dumontier;
(3) Understand implications of the proposed peer-to-peer, "everyone can be an aggregator", FAIRPORT ecosystem for development of data discovery indexes;
(4) Understand practical implications and path forward for a pilot implementation of this model (or an alternative if needed) at several repositories;
(5) Recruit participants for FAIRPORT pilots against multiple existing repositories and set work for the pilot in motion.

Future Commons: Beyond Beyond the PDF

13:00 - 16:00 - Open to all

Contact: Robin Champieux

Session chairs: Robin Champieux and Dan O'Connell

A striking feature of research communication is the isolation in which different communities work. While openness, reuse, and reproducibility are goals that transcend any one domain, the ways in which different communities achieve these goals have often been addressed in isolation. Problems considered solved in one community (e.g. author identification, discoverability, XML standards, "going beyond the PDF") remain open questions in others.

This planning meeting will open discussions on how we can identify and leverage the knowledge of our disparate local constituencies, focussing on how we can build upon interdisciplinary opportunities presented at the Force2015 meeting. We will develop a strategy for identifying potential participants, establish goals, and plan our subsequent work at this meeting (Birds of a Feather lunches and potential $1k challenge). Our ultimate goal is to develop a larger-scale funding proposal for work on research communication that crosses disciplinary, national, and regional silos.

ContentMine Workshop

15:00 - 18:00 - Open to all

Contact: Jenny Molloy

This workshop is designed to introduce scientists and scholars to ContentMine: a global, collaborative community of researchers, and a suite of open source text and data mining tools. ContentMine seeks to make scientific knowledge available to everyone (including machines), through extracting millions of facts from the scientific literature. The hands-on workshop will cover journal scraping and extracting facts like species, sequences and identifiers before preparing participants to tackle real, research-related mining challenges through an introduction to responsible content mining, including the current legal restrictions. Interested participants can also join the team at the FORCE2015 hackathon to build their own scrapers and tools and further explore related technologies.


