FORCE2015 Demos and Posters
#34 "Marco Civil Watch: collaboratively monitoring court decisions that affect the Internet"
Abstract: "Marco Civil" is the Brazilian Civil Rights Framework for the Internet. It was recently approved through an experimental collaborative process engaging multiple stakeholders. Following the approval of the law, judges take a prominent position in defining standards of interpretation for its provisions. The intention of this research project is to create a public database of decisions applying this law. Feeding this database will be an ongoing process that will depend on (a) a technological aspect (possibly crawlers searching for new decisions published in the courts' repositories) and (b) a human aspect (possibly a group of researchers double-checking this process frequently). This database will also be available on a platform incorporating a wiki so people can comment on and analyse the decisions, making this monitoring a collaborative process.
Presentation Link: http://www.internetlab.org.br/en/
#35 RRIDs, We are ready, Use them!
Abstract: A central tenet of reproducibility is an unambiguous description of the data, methods, and material resources in publications. We recently attempted to identify research resources (model organisms, antibodies, knockdown reagents, constructs, and cell lines) in the biomedical literature, and the results showed that only 54% of resources were uniquely identifiable, regardless of domain, journal impact factor, or reporting requirements (https://peerj.com/articles/148/), largely due to a lack of unique identifiers such as catalog numbers. The Resource Identification Initiative was formed by a dedicated group of academics, government and non-government institute officials, publishers, and commercial antibody companies. In the pilot study, authors are asked to include RRIDs prior to publication for three resource types: antibodies, model organisms, and tools (including software and databases). RRIDs meet key criteria: they are unique, machine readable, free to generate and access, and consistent across publishers and journals. To facilitate access to RRIDs, we aggregated data into the Resource Identification Portal (scicrun.ch/resources). Centralized instructions to authors were modified at many publishers, the data show that authors are using the portal, and there are now over 200 papers containing over 1000 unique identifiers. The first 100 papers have been deeply analyzed, and the analysis addresses many of the initial questions raised by journal editors and publishers. The overall accuracy of authors is over 96%, and the data show that, when asked, authors are happy to comply with this standard. Furthermore, the methods sections in participating journals are better than those in the same journals prior to the pilot.
#36 Enhanced Publication Management System
Abstract: Enhanced publications (EPs) are a novel way to represent scientific research results by enriching traditional digital publications with (links to) other research outcomes such as data, processing workflows, and software. The literature shows several examples of information systems devised for the management of EPs in specific application domains and use cases: Enhanced Publication Information Systems (EPISs). The majority of those systems are realised “from scratch”, so that functionalities shared across disciplines and user communities are re-implemented every time. In fact, EP-oriented software is realised by integrating technologies that are general-purpose (e.g. databases, file stores) and Digital Library-oriented (e.g. repository software, cataloguing systems). The resulting products are often not flexible enough to be adapted to the evolving requirements of the community they target, and are hardly re-usable or configurable enough to be re-used in different application domains with similar requirements. Such a “from scratch” approach entails non-negligible realisation and maintenance costs that could be decreased by adopting a more systemic approach, as was done in the past with Database Management Systems (DBMSs). The approach we propose is the adoption of an Enhanced Publication Management System (EPMS), a software framework that plays the role of a DBMS in the world of EPs. The framework supports developers of EPISs with tools that (i) hide the complexity of implementing domain-independent requirements, (ii) allow the definition of personalised EP data models, and (iii) support the realisation and configuration of functionalities based on the defined EP data model.
#37 Capture, Embed, Interoperate: the Metadata Round Trip
Abstract: This poster describes the results of recent work undertaken by ORCID in partnership with CrossRef and DataCite to enhance the interoperability of scholarly information: capturing embedded unique identifiers for researchers and their research outputs (journal articles and datasets) and supporting the improved flow of information between systems. Researchers can include ORCID iDs when they submit datasets to data centres or article manuscripts to journals. The publisher or data centre then mints a DOI for the submission and includes the ORCID iD of the author in the metadata. DataCite and CrossRef then send the new metadata back to the ORCID record of the author. This means that information about new research outputs can flow freely and automatically between these identifier systems and into other, integrated systems. The benefits, and potential, of these developments are exciting: researchers benefit from automatic updates to their personal record; systems linked to the ORCID registry can re-use the information, saving more time and effort; and other IDs and information sources can link to the new metadata. Building connections like this, to improve the flow of information through the scholarly ecosystem, is a great example of why we refer to ORCID as plumbing. IDs help make the e-infrastructure for research work better.
Presentation Link: http://www.slideshare.net/ORCIDSlides/force2015-orcid-poster
#38 Walk softly and carry a large carrot: how to give credit for academic work
Abstract: Researchers want to know how their work impacts their communities and the wider world, including through research outputs other than peer-reviewed journal publications. The journal paper provides a way of claiming and defining an area of intellectual work, and citation of articles allows the acknowledgement of that work by others. Yet the paper can only give an overview of the work: it is not possible to publish in a paper everything that is needed to make the work fully reproducible. For providing credit (and for making recruitment and promotion decisions) we abstract the paper further. Instead of reading every citing paper, we count the citations, reckoning this an appropriate proxy for the quality of the paper, and hence of the described work. Citation counts for datasets are one of the “carrots” promised to researchers for their efforts in citing and publishing data, also producing a metric by which the quality of a dataset can be evaluated. Quality is a slippery concept when it comes to data, which can be good quality for one purpose and bad for another. Measuring the impact of research directly is difficult, so we resort to measuring what we can (the number of citations). Care must be taken with indirect measurements to ensure that they map appropriately onto what we really want to measure. This presentation will address the issue of providing credit, especially for non-article research outputs, using recent surveys of researchers’ desires for evaluating their work’s impact.
#1 A Bridge from Publishing Words to Publishing Data
Abstract: As data publishing technology and data management policies have evolved over the past decade, more academic journals are working with data repositories to disseminate the data associated with published articles. The Dataverse Project at Harvard University’s Institute for Quantitative Social Science (IQSS) recently received a two-year grant from the Sloan Foundation, in partnership with the Odum Institute at the University of North Carolina at Chapel Hill, to collaborate with a variety of publishers, repositories, and the scientific community in order to integrate and automate data publication with more traditional scholarly communication, thus helping make data sharing and preservation an intrinsic part of the scholarly publication process. This project expands on the work the Dataverse Project has done over the past two years with Stanford University's Public Knowledge Project, which implemented automated data deposit integration between Dataverse and the Open Journal Systems publishing platform. The new project will focus on engaging more with the wider research community, including FORCE11, RDA, and FAIRPORT, so that the data repository API is aligned with a common standard across disciplines, publishing systems, and repositories. This will include extending support for more metadata (beyond Dublin Core) and evaluating a move from SWORD to a more generic and suitable data repository API that will allow data sharing across a wider variety of publishing systems (beyond just journals) and support more data review workflows, along with automating data citation and bi-directional linking between article and data.
#18 Geungle: An Example of Interdisciplinary Research Using Research Cases
Abstract: Geungle is a platform for scholarly genealogical research. By sharing a lively genealogical research project, we will use Geungle as a means of showing the purpose, scope, and potential of Research Cases. Research Cases are a way to do and share research using the Web architecture. Semantic knowledge drives the composition of the research as it moves along the process, from question to answer, and everything in between. That research can then be preserved, shared, built upon, and assessed for quality, impact, and technical accuracy. By using a research field that is inherently interdisciplinary, such as genealogy, research cases will be put to the test.
Presentation Link: https://geungle.com
#39 Opening Knowledge
Abstract: Open access to research journals is a great step, but does not solve the fundamental problem of open access to knowledge. We live in a time of widespread intellectual affluence. Knowledge is available for free at your nearest Internet connection. Yet true open access to knowledge must include access to the knowledge-creation process, so that the foundations of knowledge can be visible for all to see and build upon.
Presentation Link: http://pentandra.com/files/FORCE2015/opening_knowledge.pdf
#2 Riffyn - Experimental Process Design & Analysis Software for Reproducible Research
Abstract: Riffyn is making manufacturing-grade quality available to academic and industrial researchers by creating process design and improvement software which represents a step-change relative to existing systems in terms of ease-of-use, flexibility, analytical power and collaborative capabilities. Riffyn offers:
* Collaborative process design - a visual “blueprint” for all experimental work, which can be built, shared and collaboratively improved in a flexible and iterative fashion
* Data capture and integration - Riffyn takes an open standard and open source approach to data acquisition, developing drivers, import/export filters and Internet of Things hardware to allow data from both legacy and modern instruments to be brought together and silos to be eliminated
* Analytics - by mapping experimental data to process designs, Riffyn identifies noise-reduction and quality improvement opportunities, just as Google Maps helps find the quickest route by mapping traffic data onto the road network
* Reproducibility - Riffyn makes it easy for researchers to share, register and publish detailed process descriptions, in both human- and computer-readable form
* Scalability - Riffyn uses a SaaS model, with public- or private-cloud hosting, to address the challenges researchers face when their needs outgrow the capacity of traditional lab software architectures
Presentation Link: https://www.youtube.com/watch?v=kH56tapZwsg
#19 Collaborating to Support Authors around the Globe
Co-Presenter: John Hammersley, Overleaf
Abstract: Over 2 million papers are published each year by authors around the world. Scientific production is now an increasingly international endeavor, with scholarly publishers reporting the rapid rise of manuscript submissions from across the globe, most notably Asia. New authoring tools are also facilitating collaborations between researchers in every time zone, making global research production faster and easier than ever. However, English is still the primary language of international journals, and language can be a barrier to the broad dissemination of research findings for many authors who speak English as a foreign language. To support these authors, Overleaf and Research Square (AJE) are collaborating on a new project to provide English-language editing services within the Overleaf environment. By serving international authors through a collaborative authoring tool, we hope to enable the creation of an efficient publication process driven by a diverse community, which can increase the pace of scientific discovery itself.
#40 OpenAIRE2020 a framework for open scholarship
Abstract: Open access to scientific peer-reviewed publications has been anchored as an underlying principle in Horizon 2020, being present in the Regulation and the Rules of Participation as well as in the relevant provisions of the grant agreement for all projects. Fifty partners, from all EU countries and beyond, are collaborating on a large-scale initiative - OpenAIRE2020 - that aims to promote open scholarship and substantially improve the discoverability and reusability of research publications and data. A network of people, represented by the National Open Access Desks (NOADs), will organise activities to collect H2020 project outputs and support research data management. OpenAIRE2020 will assist in monitoring H2020 research outputs and will be a key infrastructure for reporting H2020's scientific publications, as it will be loosely coupled to the EC's IT backend systems. The EC's Research Data Pilot will be supported through European-wide outreach for best research data management practices and through Zenodo, which will provide long-tail data storage. OpenAIRE currently operates an interoperable and validated network of more than 520 repositories and OA journals, integrating more than 9 million OA publications and 1,000 datasets, with 50,000 organizations and 30,000 projects from two funders. It has identified over 100,000 FP7 publications from about half of the 26,000 FP7 projects, and offers literature-data integration services.
#20 The CERN OpenData Portal
Co-Presenter: Tim Smith
Abstract: The CERN Open Data portal is the access point to a growing range of data outputs produced through the research performed at CERN, the world's leading particle physics laboratory. The portal provides Open Access to the output of various research activities, such as the LHC experiments, and includes the accompanying software and documentation needed to understand and analyze the complex datasets. The data releases on the portal adhere to established global standards in data preservation and Open Science: the products are shared under open licenses, and they are issued with a digital object identifier (DOI) to make them citable objects in the scientific discourse. Finally, the data shared through this portal is captured in established community aggregators (e.g. the institutional repository CERN Document Server, and INSPIRE, serving the whole High-Energy Physics community). The latter enables the tracking of data citation. The demonstration will show the portal’s information architecture and how the content is prepared and disseminated. Emphasis will be given to the facilitation of data and software citation following the FORCE11 declaration.
Presentation Link: http://opendata.cern.ch
#3 Beyond Data Publishing: Increasing Reproducibility and Accessibility with Galaxy
Abstract: GigaScience is a journal with a focus on the publication of reproducible research. This is facilitated by its GigaDB database, where the data and the tools used for analysis within a publication may be deposited by authors and made publicly available with citable DOIs. To further increase reproducibility, several tools have been implemented in our Galaxy server (http://galaxy.cbiit.cuhk.edu.hk), where any user can access the published workflow and test data to reproduce results. This level of accessibility has allowed our collaborative partners at CUHK and, independently, groups at Brown University to use the platform for teaching purposes. Here we demonstrate our GigaGalaxy server and the tools and pipelines that have been implemented to date, e.g. Galaxy tools to study genome diversity, doi:10.5524/100069; SOAPdenovo2, doi:10.5524/100044; SmileFinder, doi:10.5524/100100. The link between the GigaScience article, the GigaDB repository and the GigaGalaxy analysis platform can be seen. As a specific example, we have investigated the extent to which the results from articles published in GigaScience can be reproduced by recreating in GigaGalaxy a comparison of three genome assembly workflows (SOAPdenovo1, SOAPdenovo2, ALL-PATHS-LG) as reported in Luo et al. (2012). Re-creating the pipelines used by the authors met with mixed success: whilst the results generated by SOAPdenovo2 could be reproduced by our Galaxy implementation, we were less successful with SOAPdenovo1 and ALL-PATHS-LG. This demonstrates the need for researchers and publishers to collaborate on the sharing of methods during the review process.
Presentation Link: http://galaxy.cbiit.cuhk.edu.hk/
#41 The research scholarship face-off: Ghana and Nigeria, will you take up this challenge, and let e-scholarship take off?
Abstract: This is a simple idea that is intended to harness the (often) playful competitive spirit between these two countries to promote research scholarship. Even though new technologies are improving access to knowledge and tool sharing in ways that are advancing research in many parts of the world, the reality for most countries, especially those in the developing world, is different. This is certainly the case with Ghana and Nigeria, contexts with which I’m fairly familiar. Generally, open access to scholarly resources is still a challenge, and the use of open licence tools (for example Creative Commons) is not actively promoted across the board. Policy is lagging behind practice, and practice is not on par with worldwide trends. This idea therefore offers a challenge to technologists, academics, and archivists in these two countries to do more within the coming year to ensure that these tools, policies and practices are put in place. Drawing from discussions, demos, hackathons, and other such activities that I have been part of over the past couple of years, both as a researcher and as a practitioner in the open community, I will challenge one institution from each of the two countries to replicate a number of these actions in their respective research communities. Context considerations are important here, so we will work together to ensure that these are tailored to each country context. Specifically, we will explore the options for funding, policy-making, business models and tools development that fit these contexts. I’m really hoping that they take up this experiment, and I look forward to what we can collectively produce if we challenge ourselves in this area. The aim is that by the next FORCE conference (or any closely related event in the coming year), the narrative about e-scholarship from these contexts will be different.
#21 A Data Center's Journey from Good to (not quite) Great
Co-Presenter: Mingyu Hu
Abstract: Data facilities begin operations with the tools and techniques of the time of their inception, or may rely on even earlier versions if the group inherits legacy operations. The center must carefully identify and reassess its starting point and ultimate goals, which requires a continual evolution of thinking, programming, networking, and archiving strategies, constantly evaluating and (when appropriate) utilizing the latest technologies and methodologies. This is the story of one such geophysical data center, a mostly unfinished tale of service to its research community, told from both a technical and a human perspective.
Presentation Link: http://cchdo.ucsd.edu
#22 Making Data Count
Co-Presenter: Jennifer Lin
Abstract: The research community has been calling for solutions for data discovery and for ways to more broadly capture the value of the work that is at the core of the researcher's scholarly pursuit. PLOS, CDL, and DataONE are collaborating in a new project aimed at designing the metrics needed to capture the activity surrounding research data in a valid and credible way. "Data-level metrics" (DLM) will measure data use across a broad range of activity surrounding the reach and use of data as a research output. The project will augment the existing scholarly cyberinfrastructure, which is currently focused on journal articles, and introduce data as a valued scholarly output into the framework. DLM data will provide a clear and growing picture of the activity around the dissemination and reach of research data, especially as this activity is linked to other research entities and objects in the research information ecosystem. Once the impact of the research is exposed and used to support reporting needs for funders and institutions, data metrics can create incentives that support data sharing and usage and increase the velocity of information dissemination across a wide range of disciplines. In the presentation, we will report preliminary results collected from the first set of data deposited in our pilot repositories.
Presentation Link: http://dlm.plos.org
#42 Research Object Profiles - There’s More Than One Way To Do It
Co-Presenter: Carole Goble, The University of Manchester, UK
Abstract: What do research objects look like in practice? The value proposition of research objects is now well understood - supporting the publication of more than just PDFs, making data, code, and other resources first-class citizens of scholarship. What is less well understood is which formats, protocols, and standards researchers need to use to make their research objects a reality. Rather than a one-size-fits-all approach to the realisation of research objects, we propose instead a growing series of Research Object Profiles. Requirements for research objects in areas such as identity, versioning, and provenance will vary depending on the purpose of the research object (archiving, exchange, publication, replication, learning, etc.), as well as the prevailing norms of the specific domain. Each profile will therefore describe how to combine existing formats, standards, and software such as Docker, IPython, Git, PROV, etc. to build research objects for a specific purpose, and from the tools that people already use. This series of profiles must be designed with two goals in mind: 1) a common core set of features (identity, minimal provenance, etc.) to ensure interoperability; 2) flexibility to support domain-specific requirements. We discuss our vision for Research Object Profiles, showing examples of how profiles work in practice for research objects in systems biology and clinical health care, for packaging software, and for packaging studies with workflows and data.
#43 Measuring knowledge in neuroscience
Abstract: What do we know and what are we researching? Traditionally we answer these questions by reading or summarizing the published literature. However, developing a cursory overview of our knowledge of a single biological system requires extensive reading across a huge diversity of literatures with few articles providing links between them. Through the data collected by the Neuroscience Information Framework we can now address this question by measuring the amount of data produced by many different techniques which all claim to be about the same thing, such as a cell type or a brain region. We quantify the amount of data produced about various brain areas, species, diseases, drugs of abuse, cell types, and neurotransmitters. We present these as heatmaps of instances of terms from the NIF standard ontology found in data sources from the NIF data federation. We have also developed measures to determine how well a certain term is understood by quantifying the diversity of techniques referring to it.
#4 SciCrunch: A cooperative and collaborative data and resource discovery platform for scientific communities
Abstract: SciCrunch was designed to allow communities of researchers to create focused portals that provide access to resources, databases and tools of relevance to their research areas. SciCrunch is one of the largest aggregations of scientific data and tools available on the Web. One can think of SciCrunch as a “PubMed” for tools and data. Just as you can search across all the biomedical literature through PubMed, regardless of journal, SciCrunch lets you search across hundreds of databases and millions of data records from a single interface. SciCrunch was designed to break down the traditional portal silos created by different communities, so that communities can take advantage of work done by others and share their expertise as well. When a community brings in a data source, it becomes available to other communities, thus ensuring that valuable resources are shared by other communities who might need them. SciCrunch currently supports a diverse collection of communities, each with their own data needs: CINERGI – focuses on constructing a community inventory and knowledge base on geoscience information resources; Neuroscience Information Framework (NIF; neuinfo.org) – is a biological search engine that allows students, educators, and researchers to navigate data resources relevant to neuroscience; NIDDK Information Network (dkNET; dknet.org) – serves the needs of basic and clinical investigators by providing seamless access to large pools of data relevant to the mission of the National Institute of Diabetes and Digestive and Kidney Diseases; Resource Identification Initiative (RII) – aims to promote research resource identification, discovery, and reuse.
Presentation Link: http://scicrunch.org/
#73 Digital Scientific Knowledge as a Living Discourse - Supporting flux in scientific knowledge
Abstract: A Research Object (RO) is an aggregation of various digital assets – data, methods, software and workflows – and thus relies on the underlying digital ecosystem of science, which is fragmented among a multitude of disconnected tools and systems – data repositories, software tools, workflow systems, digital journals, wikis, social networks, etc. This fragmentation makes it difficult to handle the perpetually changing nature of scientific knowledge. How often do we see a scholarly output being changed or revised when the underlying method has a new version? I would like to see scholarly communication as a living discourse that keeps pace with the continuous influx of scientific knowledge, along with a deeper appreciation of the processes and connections among artifacts and activities. Complementary to ROs, which focus on individual instances of science, I propose we should also focus on representing science as an ongoing process (in the digital world). This could be achieved by a digital ecosystem in which a network of communicating tools harmonizes scientific artifacts by exchanging information as they change and evolve. Such a network of digital tools will facilitate virtual collaboration, and thus communication, among communities that often don’t interact (for example, database administrators, ontology engineers and researchers). I am proposing aggregation across the digital science ecosystem alongside the aggregation of a single experiment. My research tries to demonstrate these ideas through a prototype (AdvoCate) that supports category evolution and connects and synchronizes databases and ontologies that consume or use those categories.
Presentation Link: http://dx.doi.org/10.6084/m9.figshare.1284650
#5 The Development of figshare for Institutions
Co-Presenter: Dan Valen
Abstract: Management of, dissemination of, and collaboration around research data are key to the development of scholarly communication, and are quickly becoming a requirement as more and more funders mandate open data. With new funder mandates, researchers as well as institutions are faced with a research data management puzzle. figshare for institutions provides a solution for both of these stakeholders by catering to their individual needs. figshare allows academic institutions to distribute, share and get credit for their research data by hosting videos, datasets, posters, code, figures and theses in a secure and manageable way. For researchers, large amounts of private storage space are available to better manage their research outputs. The intuitive interface allows quick upload of data and easy retrieval of files by filtering on keywords and file type. When it comes to compliance, these research outputs are then just one click away from being made openly and persistently available, as well as citable and trackable, via a Digital Object Identifier (DOI). These affiliated, publicly available files are aggregated at the departmental and/or institutional level, automatically providing a self-populating institutional data repository with reporting capabilities. Technology is helping researchers to communicate both positive and negative research results more openly, to share data that has previously been locked away on closed hard drives, and to develop research objects as publishing outputs in their own right. This demo and poster will show how figshare is helping reshape the open data landscape at both the individual researcher and the institutional level.
Presentation Link: http://www.figshare.com
#44 The Chrysalis Project White Paper: Opportunities for New Types of Research Alliances
Co-Presenter: Dr. Katherine Skinner (Educopia Executive Director) and Christina Drummond (Chrysalis Project Manager)
Abstract: How can we collectively design systems and organizations that will most effectively sustain innovative new varieties of digital scholarship over time? The Chrysalis project undertook a one-year planning effort, funded by the Andrew W. Mellon Foundation, to study this question by researching models for implementing vertically integrated research alliances (VIRAs) designed to sustain digital scholarship. Such research alliances would comprise combinations of scholars, libraries, scholarly associations, and university presses. The white paper examines potential revenue streams associated with these models, including the possible redistribution of current university library investments in information resources and the creation of new sources of revenue to sustain research publishing. The planning project interviewed a broad range of research stakeholders through a series of meetings and iterative discussions, including faculty, academic libraries, research presses, and scholarly societies. The name of the project, “Chrysalis”, was selected because organisms with a pupal developmental stage undergo metamorphosis within a chrysalis, which allows for internal reorganization in preparation for longer-term survival. If successful, VIRAs could constitute a chrysalis of sorts for innovative digital research publications that would otherwise be at risk. The project was undertaken by researchers at the Educopia Institute and the University of North Texas Libraries. Presenters will include Dr. Martin Halbert (UNT Dean of Libraries and Educopia President), Dr. Katherine Skinner (Educopia Executive Director), and Christina Drummond (Chrysalis Project Manager).
#23 Sumatra: Automated tracking of scientific computations
Abstract: Sumatra is a free and open source software tool for managing and tracking projects based on numerical simulation or data analysis, with automated capture of provenance. It can be thought of as an "automated electronic lab notebook" for simulation/analysis projects, with the aim of supporting reproducible research. The latest version (0.7) aims to substantially improve the usability of the tool by offering a new, data-centric view alongside the existing, process-centric methods of viewing and interacting with project history. This greatly extends the potential utility of the software as a platform supporting reproducible open data projects.
Presentation Link: http://neuralensemble.org/sumatra/
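As a rough illustration of the kind of provenance record such a tool captures, consider the sketch below. This is a hedged approximation, not Sumatra's actual API: the function name `capture_provenance` and the exact fields recorded are assumptions.

```python
import json
import platform
import subprocess
import sys
import time

def capture_provenance(command, args):
    """Record what was run, with which code version, on which platform, and when."""
    record = {
        "command": command,
        "args": args,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    try:
        # If the project lives in a git repository, pin the exact code revision.
        record["git_commit"] = subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True).strip()
    except Exception:
        record["git_commit"] = None
    return json.dumps(record)
```

Storing such a record alongside each simulation output is what lets a later reader reconstruct exactly how a result was produced.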
#45 Improving Reuse and Exploration of Long-tail Data Through Enhanced Metadata
Abstract: In order for experimental data from scientific research to be reused and explored by future scholars, researchers must be encouraged to annotate their original data with informative and useful metadata. While long-term repositories such as Dryad are critical to the preservation of data created and utilized in scientific research, this data is of limited utility if it is preserved without sufficient context to orient future scholars. The ISA Commons provides a framework for this contextualization, but the challenge will be in reducing barriers to annotation for the researchers who are depositing their data in repositories. This collaboration between Dryad and ISA Commons will investigate methods for automating some of this metadata creation, making it simpler for researchers in many fields to see suggestions for contextual metadata that is common to multiple experiments, and allowing the creation of unique and trusted identifiers that are connected to multiple levels of data creation and use, encouraging the crosslinking of researchers, funding sources, institutions, and data banks across multiple studies and publications.
#46 OpenBookBag Concept Platform
Abstract: OpenBookBag is a concept platform for an alternative peer to peer sharing and review mechanism that socializes the solitary act of literature review, while also providing an open channel for “boosting” popular reading. OpenBookBag combines hardware and software using mesh networking through WiFi, cellular, and Bluetooth for direct file sharing and interaction. The first aspect of this platform is the OpenBookBag smartphone application, which allows people within a short radius to browse and download open-source articles or documents from individually curated libraries. Imagine attending conferences, where everyone's research is available for immediate download while you interact with the author. Then imagine, after the conference, being able to go online to a companion web app to see the networks that your articles reach: different departments, different disciplines, different countries. The second aspect of this platform is the embedding of the software into everyday things, such as a light pole or bench. Imagine walking down a campus path beside the mechanical engineering department, and as you pass by you can browse a repository of mechanical engineering faculty-related articles stored in and distributed from a light pole on your OpenBookBag smartphone app. You might be an artist, but something in an article's abstract catches your eye that you think would be useful for your own network. "Boost" that article via your smartphone application and let the author know that there are people who would love to read their article in a popular vernacular that can be applied to everyday life.
Presentation Link: https://www.smore.com/8zdft
#47 Legal Liabilities of the Digital Documents of Academic Libraries in India
Abstract: Intellectual property in libraries' digital resources is a crucial issue, as more and more of libraries' print resources are replaced by digital ones. These resources are either born digital or digital equivalents of existing print resources. Moreover, digital resources have a further dimension: they are accessed online, over the Internet. Issues arise over copyright and over licences governing access for a given number of users. This paper takes an overview of the copyright problems raised by the digital contents of modern-day libraries.
#6 International Image Interoperability Framework
Abstract: Access to image-based resources is fundamental to research, scholarship and the transmission of cultural knowledge. Digital images are a container for much of the information content in the Web-based delivery of images, books, newspapers, manuscripts, maps, scrolls, single sheet collections, and archival materials. Yet much of the Internet’s image-based resources are locked up in silos, with access restricted to bespoke, locally built applications. A growing community of the world’s leading research libraries and image repositories have embarked on an effort to collaboratively produce an interoperable technology and community framework for image delivery. IIIF (International Image Interoperability Framework) has the following goals: 1. To give scholars an unprecedented level of uniform and rich access to image-based resources hosted around the world. 2. To define a set of common application programming interfaces that support interoperability between image repositories. 3. To develop, cultivate and document shared technologies, such as image servers and web clients, that provide a world-class user experience in viewing, comparing, manipulating and annotating images.
Presentation Link: http://iiif.io
#48 OMIM Concept Annotation: Steps Towards Automated Tagging the Disease Literature using PhenoMiner Phenotypes
Abstract: Genetic dispositions play an important role in numerous human diseases. In order to understand the causes linking genes to associated diseases, researchers have employed high-throughput experiments to discover evidence. Online Mendelian Inheritance in Man (OMIM) is a manually curated database, hosted by Johns Hopkins University, that provides bibliographic details of over 13,000 human genes and 7,800 diseases. In these experiments we show our first steps towards a phenotype-based system that can automatically enrich full text scientific articles in Europe PubMed Central (http://europepmc.org/) with OMIM concept identifiers. We compare an experimental system based on automatically mined phenotypes in the PhenoMiner database (DOI 10.5281/zenodo.12493) against two standard data sets: the OMIM-curated open access literature and author-mentioned OMIM accession numbers in the open access literature. In this pilot study we found a high degree of overlap with author mentions of OMIM identifiers but less overlap with curator-assigned concept identifiers. Future work will look at more sophisticated methods for inferring OMIM concept labels using phenotypes and take the lessons learnt to refine our approach to phenotype discovery in text.
#49 Analysis of PhenoMiner Phenotypes in the Open Access Full Text Literature
Abstract: The free text scientific literature has enormous potential to support knowledge discovery in biomedical pipelines. Although substantial work has taken place to identify and link biomedical terms such as genes and diseases, until now phenotypic descriptions have not received much attention. Phenotype terms denote important clinical concepts through variation from normal morphology, physiology or behaviour, and can be used as evidence to help identify the disease under consideration. In recent years ontologies have been constructed by hand using expert intuition, but we still know little about the actual terms used by authors or their distribution in the literature. In this paper we begin to make progress in this area by using a database of phenotypes (DOI 10.5281/zenodo.12493) mined from the BMC open access corpus and looking at its distributional properties across the whole of the open access literature. We show how this distribution varies across time and across journals. The data appears to show increased mentioning of phenotypes over time, and an extremely small set of very high frequency phenotypes which may be focused on a narrow set of disorders such as breast cancer, skin cancer and pain.
#50 Implementing Transitive Credit with JSON-LD
Abstract: Science and engineering research increasingly relies on activities that facilitate research but are not currently rewarded or recognized, such as: data sharing; developing common data resources, software and methodologies; and annotating data and publications. To promote and advance these activities, we must develop mechanisms for assigning credit, facilitate the appropriate attribution of research outcomes, devise incentives for activities that facilitate research, and allocate funds to maximize return on investment. In this article, we focus on addressing the issue of assigning credit for both direct and indirect contributions, specifically by using JSON-LD to implement a prototype transitive credit system.
Presentation Link: http://arxiv.org/abs/1407.5117
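The core idea of transitive credit, fractional weights that propagate through chains of products, can be sketched as follows. This is an illustrative toy, not the authors' JSON-LD schema; the product names and weights are invented for the example.

```python
# Hypothetical credit maps: each product assigns fractional credit
# (summing to 1.0) to its direct contributors, which may themselves be
# products with their own credit maps.
credit_maps = {
    "paper-A": {"author:alice": 0.6, "software:toolkit-B": 0.3, "data:set-C": 0.1},
    "software:toolkit-B": {"author:bob": 0.8, "software:lib-D": 0.2},
}

def transitive_credit(product, weight=1.0, totals=None):
    """Propagate credit for `product` down through nested credit maps."""
    if totals is None:
        totals = {}
    for contributor, share in credit_maps.get(product, {}).items():
        if contributor in credit_maps:
            # Indirect contribution: recurse with the diluted weight.
            transitive_credit(contributor, weight * share, totals)
        else:
            # Terminal contributor: accumulate their final share.
            totals[contributor] = totals.get(contributor, 0.0) + weight * share
    return totals
```

Here the author of a library used by cited software (bob, via toolkit-B) receives a share of the credit for the final paper, weighted by each intermediate product's credit map, and the total credit still sums to 1.0.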
#51 Working towards Sustainable Software for Science: Practice and Experiences (WSSSPE)
Abstract: The Working towards Sustainable Software for Science: Practice and Experiences (WSSSPE) community has had 3 workshops, reaching a few hundred people who are all working towards making science software more sustainable. In these workshops, we've discussed about 75 accepted papers, and started discussions about actions that we can move forward on, both together and in smaller groups. This poster will explain some of the previous workshops and encourage people to join the WSSSPE community to make future meetings and discussions more productive and more impactful.
Presentation Link: http://wssspe.researchcomputing.org.uk
#24 Make me a Research Object (or a Digital Music Object)
Abstract: Research Objects have been proposed as a means to capture and publish the context of research along with its findings. But there are few tools for creating Research Objects, or connecting findings to their research context. Projects are diverse, and a tool addressing the needs of one research project is not necessarily applicable for another. General purpose tools such as spreadsheets are commonly used, but are not so easy to combine with contextual information, or to re-use for different purposes without extrinsic description. Annalist is a work-in-progress to create a "linked data notebook" to fill the gap between spreadsheets and existing sharing and publishing platforms. Building on web linked data ideas, it is a flexible tool for user entry of attributes and relations linking research and other digital artifacts. Annalist collections can be accessed directly as web resources, or packaged as composite objects for sharing and publication through existing systems. A key notion is to allow evolution of structured descriptions of resources, rather than working within predefined structures, with the goal of allowing researchers to add relevant information as it becomes available or needs are identified. The Annalist demonstration prototype can be used "out of the box" to collect information about artifacts without any prior design of data structures or schema. For the FAST-IMPACt project (http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/L019981/1), we plan to trial it for associating musicological annotations with digital music artifacts.
Presentation Link: http://annalist.net/
#52 101 Innovations in Scholarly Communication: the Changing Research Workflow
Co-Presenter: Jeroen Bosman
Abstract: In the fast developing world of scholarly communication it is good to take a step back and look at the patterns and processes of innovation in this field. To this end, we have selected 101 innovations (in the form of tools & sites) and graphically displayed them by year and also according to 6 phases of the research workflow: collection of data & literature, analysis, writing, publishing & archiving, outreach and assessment. This overview facilitates discussion on processes of innovation, disruption, diffusion, consolidation, competition and success, but also of failure and stagnation, over the last 3 decades. We describe some of the trends, expectations, uncertainties, opportunities and challenges within each of the workflow phases. Also, based on the graphical overview we present a juxtaposition of typical traditional, innovative and experimental workflows.
Presentation Link: http://innoscholcomm.silk.co
#53 Researcher Perspectives on Publication and Peer Review of Data.
Abstract: Data “publication” attempts to appropriate for data the prestige of publication in the scholarly literature. While the scholarly communication community substantially endorses the idea, it hasn’t fully resolved what a data publication should look like or how data peer review should work. To contribute an important and neglected perspective on these issues, we surveyed ~250 researchers across the sciences and social sciences, asking what expectations “data publication” raises and what features would be useful to evaluate the trustworthiness and impact of a data publication and the contribution of its creator(s).
#54 Automatic Keyword Assignment using Multilingual Topic Models
Abstract: Keywords specify the principal subject matter of an article. Many scientific publications require authors to provide keywords for their scholarly work from a controlled vocabulary. Author-based keyword annotation is a costly and time-consuming process prone to errors and author bias. Aside from providing a concise summary of the article, keywords serve as index terms for performing search and comparing articles in collections of scholarly work. In the latter role, author-based keywords often fail to provide additional benefit compared to indexing words. To alleviate this problem, work has been done to explore approaches for automatic keyword assignment. We are exploring the feasibility of using multilingual topic models for automatic keyword assignment by assuming that the word and keyword representations of an article are two documents written in different languages that cover the same set of topics. In this poster we’ll present our initial exploration results using articles from the Astrophysical Journal (ApJ).
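Once a topic model has been trained, the assignment step amounts to a similarity ranking in topic space. The sketch below assumes topic distributions have already been inferred by a multilingual topic model; the vectors and keyword names are purely illustrative, not from the poster.

```python
import math

def cosine(a, b):
    """Cosine similarity between two topic-distribution vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rank_keywords(article_topics, keyword_topics, k=3):
    """Rank controlled-vocabulary keywords by topic similarity to the article."""
    scored = sorted(keyword_topics.items(),
                    key=lambda kv: cosine(article_topics, kv[1]),
                    reverse=True)
    return [kw for kw, _ in scored[:k]]
```

The point of the parallel-document assumption is precisely that both the article's words and its keywords can be projected into the same shared topic space, making this comparison meaningful.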
#7 Publishing without Publishers: A Decentralized Server Network for Scientific Data
Abstract: There exist currently no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. To solve this problem, we propose to design scientific data publishing as a Web-based bottom-up process, without top-down control of central authorities such as publishing companies. We present a protocol and a server network to decentrally store and archive data in the form of nanopublications, an RDF-based format to represent scientific data with formal semantics. We show how this approach allows researchers to produce, publish, retrieve, address, verify, and recombine datasets and their individual nanopublications. Due to the use of trusty URIs, which include cryptographic hash values of the content they represent, all content in the network is verifiable and immutable. Our evaluation of the current small network shows that this system is efficient and reliable, and we discuss how it could grow to handle the large amounts of structured data that modern science is producing and consuming. We believe that this network can serve as a solid basis for semantic publishing and could contribute to improve the availability and reproducibility of scientific results. See also this preprint article: http://arxiv.org/abs/1411.2749
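The verification property of trusty URIs can be illustrated with a simplified sketch. The real scheme also canonicalises the RDF content and prefixes the hash with a module code; here we just hash raw bytes, so treat this as an idea sketch rather than the actual specification.

```python
import base64
import hashlib

def make_trusty_uri(base: str, content: bytes) -> str:
    """Append a base64url-encoded SHA-256 digest of the content to the URI,
    so the identifier itself commits to the bytes it names."""
    digest = base64.urlsafe_b64encode(
        hashlib.sha256(content).digest()).decode().rstrip("=")
    return base + digest

def verify(uri: str, content: bytes) -> bool:
    """Recompute the digest from the content and check it matches the URI."""
    base = uri[: uri.rindex("/") + 1]
    return uri == make_trusty_uri(base, content)
```

Because the digest is part of the identifier, anyone holding the content can check the link, and republishing modified content under the same URI is immediately detectable, which is what makes the network's content verifiable and immutable.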
#25 The New ADS
Abstract: The Smithsonian/NASA Astrophysics Data System is now more than 20 years old. During the last five years we have been building a new, modern system to replace our old and venerable one. I will demo the new version, and will be asking for comments and suggestions. This will be the first public demonstration of this system.
Presentation Link: http://adsabs-bumblebee.elasticbeanstalk.com/search/
#74 Libraccess: Making open access publications widely available
Abstract: Libraccess is a not-for-profit project to build an online platform indexing international repositories of open access documents, datasets and code bases, across all scientific disciplines. Libraccess will also be a provider of services based on these indexes. We invite all proponents of open access to support this project. Open access is the new standard of scientific publication. Today, documents, code and datasets are scattered across different repositories, disciplines and countries, which makes them hard to search and explore, and slows down the creation of innovative services based on an aggregated supply of documents, code bases and datasets. Libraccess aims to become the reference solution for servicing open access material, in two ways: 1. Aggregation and indexing of documents, code and data, with access to the platform offered for free via APIs and a web interface. 2. Services based on these indexes: search on authors, search on higher education organizations, and benchmarking of publication performance. These services are offered for a fee (typically to universities and research agencies), which serves to develop and maintain the platform.
Presentation Link: http://www.libraccess.org
#66 Open Access Button
Abstract: You know the frustration of finding the research paper you need, but being unable to afford the $30 to access it. That’s why the Open Access Button exists. The Open Access Button helps you get the research you want right now, and adds papers you still need to your wishlist to be found later. You can download the free mobile and web app at openaccessbutton.org. The Button, a project run by students and early career researchers from across the world, aims to help build tools that help people access research, and to deliver data, stories and anecdotes to the movement for Open Access to research. After a beta launch in November 2013, the Button collected thousands of stories of people denied access to research; now, after a worldwide launch in Open Access Week 2014, the Button aims to scale up this work with the launch of new web and mobile platforms.
Presentation Link: https://www.openaccessbutton.org
#17 Thomson Reuters
Abstract: Thomson Reuters supports the full research lifecycle for all involved in the process. Our expertise allows us to integrate and centralize data across multiple sources for reliable research and development analysis. The systematic and objective assembly of data lets administrators, researchers and faculty explore and build upon research at the institutional, regional, national and global levels.
#8 Europe PMC: A Full-text Literature Database for the Life Sciences and Platform for Innovation
Abstract: The goal of Europe PMC (http://europepmc.org) is to contribute to an integrated information infrastructure that supports life science research. Europe PMC is a comprehensive database of abstracts (PubMed) and full-text life science research articles, produced in collaboration with PMC USA. Europe PMC is enriched in a number of ways to deliver: (1) integration of the literature with data and grant information (2) scientific credit systems, and (3) applications developed on open access article content. With respect to data integration, Europe PMC provides database crosslinks and text-mines data citations from full text articles, updated daily. Europe PMC has developed tools and search features around ORCID (over one million unique articles in Europe PMC have been claimed by at least one ORCID), and provides a platform for application developers of text-based applications such as text and data miners. Through supporting innovative applications developed in-house and by others, we hope to extend the ways in which the literature is searched and used in the course of research and related processes. As far as licensing allows, Europe PMC content, including the enrichments described above, can be downloaded via the website (via an Export feature for small datasets), via RESTful web services, and by FTP (open access article XML).
Presentation Link: http://europepmc.org
#26 HUBzero: Your Workday on Steroids
Abstract: You may have heard of nanoHUB.org, a science gateway for the nanotechnology community with more than 300,000 users worldwide. But you may not know that the software powering nanoHUB.org also powers 60+ other gateways for many different areas, including cancer research, pharmaceutical manufacturing, environmental modeling, earthquake damage, volcanic activity, cybersecurity, and molecular diagnostics, to name a few. All of these sites combined reach more than 1,800,000 visitors worldwide every year. The underlying infrastructure is available as free open source software known as the HUBzero® Platform for Scientific Collaboration. You may not realize all of the capabilities that HUBzero brings to all of these communities, helping users work together online. HUBzero acts as an “app store” for each community, but with scientific apps that can leverage high-performance and high-throughput computing resources. HUBzero provides data management capabilities that allow users to collaborate in private project spaces, and ultimately publish datasets with digital object identifiers and web analytics to track impact. HUBzero includes a learning management system for online instruction, so that communities can conduct courses to train their users. HUBzero includes dozens of other features, such as wikis, blogs, calendars, and a Pinterest-style mechanism for sharing scientific information. This demo will walk through a series of vignettes that show how researchers and educators use gateways powered by HUBzero as part of their day-to-day work.
#55 Future of Scholarly communication through Social Media
Abstract: The ubiquity of social media in the 21st century has pushed the scientific community to rethink the traditional publication model, which has many disadvantages for creating, publishing and accessing research data, and which also makes it difficult to measure the impact of research output. This paper discusses the disadvantages of the traditional scholarly communication channel and proposes an alternative way of scholarly communication using Scientific Social Media (SSM), i.e. the application of social media tools and techniques to scholarly communication, in order to overcome the issues of the existing traditional publication model, increase the visibility of research data, and provide greater transparency in peer review at low cost.
#9 Rexplore: Exploring Research Dynamics
Abstract: Making sense of scholarly data is very important for a variety of users, including researchers, research managers, academic publishers, funding bodies, companies operating in the innovation space and others. Rexplore is an innovative solution, which leverages and integrates techniques in large-scale data mining, semantic technologies and visual analytics, to provide a powerful environment for exploring and making sense of scholarly data. In particular, Rexplore allows users i) to detect and make sense of important trends in research, e.g., significant migrations of researchers from one area to another or the emergence of new topics; ii) to identify a variety of interesting relations between researchers, e.g., recognizing authors who share similar research trajectories; iii) to perform fine-grained expert search with respect to detailed multi-dimensional parameters; and iv) to analyse research performance at different levels of abstraction, including individual researchers, organizations, countries, and research communities identified on the basis of dynamic criteria.
Presentation Link: http://technologies.kmi.open.ac.uk/rexplore/
#10 SEEK for Science: A Research Asset Commons for Open Science
Abstract: ‘Open Science’ is vital for ensuring the accessibility and reproducibility of research. Publishing findings in open access journals is not enough; readers need access to the research assets (data, models, SOPs etc.) made during, or used within, the research for validation and reuse. SEEK for Science (http://www.seek4science.org/about) is a commons, providing a platform to support access-controlled research asset sharing across large consortia of researchers working in systems biology. A scientist may add their research assets either through a direct upload or through a web link to a resource in a public database. It supports the research lifecycle from experiment design, through data collection and annotation, to publishing of the results. It encourages open data and the citation of research assets to back up claims in publications, through the use of persistent identifiers (e.g. DOIs). SEEK handles the versioning of all assets in order to retain their history. Clear credit to other scientists and projects can be given where due. The Investigation, Study, Assay (ISA) framework is used to provide scientific context to the registered assets, and to describe relationships between them through interlinking. The goal is to make this available as an encapsulated sharable bundle, known as a Research Object. SEEK is open-source software, and is highly configurable and adaptable. Although its origins are in systems biology research, it is being used and adapted in several other projects including BioVeL (http://www.biovel.eu) and the Virtual Liver Network (http://www.virtual-liver.de/wordpress/en). It forms a pillar of the FAIRDOM project, a four-nation-funded collaborative initiative to support systems biology data and model management in Europe. Open Access Licence: BSD Code URL: https://github.com/seek4science/seek
Presentation Link: http://seek4science.org/
#56 Making the whole research map available: interactivity and multimedia in policy research
Co-Presenter: Caio Wemeck and Javier Guillot
Abstract: Data collection could be seen as a process of drawing/designing a map. This map often contains (i) data made by researchers, (ii) data produced by others and collected by researchers, and (iii) intermediary analysis, among other things. When the research is done, the common procedure is to make public just one (or more) linear path through that map (papers, research reports, conference talks). To create that linear narrative, researchers select what's essential – usually a small part of their data and analysis – and the rest gets lost forever on a hard disk. Using as a prototype a policy research project ("Fostering sustainability in land reform policy: lessons from ecovillages in Brazil and Germany"), we want to explore new ways to make the whole map available, instead of just one path. Bringing together a multimedia approach and qualitative research methods, we intend to open all the data produced, collected and processed, so that potential users can navigate through the research map and its paths in an interactive and multimedia way. Tagging and coding our material and arranging different visualization schemes are essential. We also intend to provide a "zoom in/zoom out" tool with which readers can select topics to get more details about, and to offer other people the possibility of uploading their own materials and research, helping to construct collaborative knowledge. In the session, we would like to present our research strategy and discuss ways for researchers to share and communicate their whole research map with peers and non-academics interested in the topic.
#27 ResQuotes.com: Turn your Notes and Highlights into Research Ideas
Abstract: ResearchQuotes is a personal information management tool that supports the extraction, indexing and re-organization of digital annotations (notes, highlights, text snippets) researchers create while working with electronic texts or other scholarly literature available in digital format. The goal of ResearchQuotes is to make it easier for people to review, compare, combine and share novel ideas as they are found in the texts they read. Different levels of privacy controls and support for collaborative work are being developed, so to incrementally make ResearchQuotes also a platform for the exploration and discovery of trending ideas within research communities.
Presentation Link: http://www.resquotes.com
#11 The Online Research Database Service (ORDS)
Abstract: The Online Research Database Service (ORDS), based at the University of Oxford, allows researchers to upload or create relational databases, and to edit, query, and share these using a simple online interface. The service forms part of Oxford’s growing research data management infrastructure, and has been designed with two aims in mind: to help researchers work more efficiently, and to encourage and facilitate data sharing. Because ORDS databases are cloud-hosted, they can be accessed from anywhere in the world. Working copies of data can be easily and securely shared with collaborators in Oxford and beyond. It’s also quick and straightforward to publish a dataset on the Web, allowing researchers to expose the fruit of their research to a wider audience, or to make available data which underpins the conclusions of another research output such as a journal article. Currently in development is additional functionality that will allow one-click archiving of databases at the end of a project, in either Oxford’s institutional data archive or a specialist repository elsewhere. ORDS is based on open source software. The code will be released in 2015, and the ORDS team is happy to engage with other institutions that would like to explore using the software to offer a similar service to their own researchers. The service was launched in August 2014. ORDS was developed by IT Services at the University of Oxford, with the help of funding from Jisc and HEFCE. A copy of the ORDS poster is available from the ORDS blog: http://blogs.it.ox.ac.uk/ords/
Presentation Link: http://ords.ox.ac.uk/
#12 The Semantic Lancet Project: Triplestore and Tools
Abstract: Scholarly publishing has seen ever-increasing interest in Linked Open Data (LOD). However, existing datasets are mainly designed as translations of existing data sources into RDF. Although that is a crucial step, a lot of useful information is not expressed in RDF, and humans are still required to infer important additional knowledge by reading and making sense of the text. Examples are the reasons why authors cite other papers, the rhetorical structure of scientific discourse, bibliometric measures, provenance, and so on. In this presentation we introduce the Semantic Lancet Project, whose goal is to make available an RDF triplestore which includes the formalisation of some useful knowledge hidden within the textual content of papers. We have developed and implemented a set of rules for reengineering and enhancing data extracted from some publishers' legacy paper repositories. In addition, we also show four tools we have developed by exploiting the data in our triplestore in order to: * present the data in an intuitive way (allowing users to browse data about scholarly authors and their scientific publications); * assess the impact of a researcher using citation functions (which allow users to give a more precise weight to each citation); * search related works using "semantic" abstracts (which allow users to find papers also taking into account the entities, events and roles referred to by the abstract); * simplify the maintenance of such data by means of provenance information (which allows users to spot and fix inconsistencies).
Presentation Link: http://www.semanticlancet.eu
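For instance, a citation function can be expressed as a single RDF statement using the CiTO vocabulary. A minimal sketch follows; the helper function and the DOIs are illustrative, though the CiTO namespace and property names are real.

```python
def citation_triple(citing_doi, cited_doi, function="cito:cites"):
    """Serialise a citation and its rhetorical function as one N-Triples line."""
    prefixes = {"cito": "http://purl.org/spar/cito/"}
    pfx, local = function.split(":")
    predicate = prefixes[pfx] + local
    return (f"<https://doi.org/{citing_doi}> "
            f"<{predicate}> "
            f"<https://doi.org/{cited_doi}> .")
```

Properties such as `cito:supports` and `cito:disagreesWith` let a triplestore record why a citation was made, not merely that it exists, which is what enables weighting citations by function.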
#57 How technology could support collaboration between academics and librarians.
Abstract: The importance of collaboration between librarians and academics in enhancing the academic success of students and research capacity has been increasingly recognised in universities. Their main areas of collaboration are library resources development, teaching and integrating information and research skills into the curriculum, collaborative research projects, and data management. The development of technology in higher education has significantly affected their collaboration practices: the domain of activities has shifted radically from physical locations to online spaces. However, their online collaborative activities are undertaken on many separate technology platforms. For teaching and learning support, academics work with students via the learning management system, which is separate from the library management system that librarians work on. There are no clear pathways to link or integrate these systems so as to facilitate collaboration in the teaching and learning support activities that both communities act on. Library staff need access to the learning management system to get to know the course content, assessment tasks and online discussions in order to support the scholarly resources and information skills that students need to complete their courses successfully. The divide between the systems that academics and library staff work on forms an invisible boundary between these two communities. Similarly, for research-related activities, there is a lack of a common online platform where academics and librarians can work collaboratively on data management, research projects, publication management and institutional repositories. Future development of collaborative technology in the university environment should focus on integrating the various separate systems that academics and librarians work on, to facilitate their collaboration.
#28 SoftwareX - Get Your Software Published
Abstract: We are delighted to introduce our new journal, SoftwareX! SoftwareX publishes software and aims to acknowledge the impact of software on today's research practice, and on new scientific discoveries in almost all research domains. Benefits of publishing your software in SoftwareX: - Your software gets indexed, is discoverable, accessible and citable - Metadata gets captured to expose your software to interested readers and users - SoftwareX publishes software from all fields, including yours! Above all, SoftwareX aims to inform researchers about software applications, tools and libraries with a potential to impact the process of scientific discovery in various domains. The journal is multidisciplinary and accepts submissions from within and across subject domains.
Presentation Link: http://www.journals.elsevier.com/softwarex
#13 PLOT-IT Public Led Online Trials-Infrastructure and Tools
Co-Presenter: Amanda Burls
Abstract: TITLE PLOT-IT: Public Led Online Trials-Infrastructure and Tools. INTRODUCTION PLOT is a pragmatic "does it work" innovation for putting public-led online clinical trials into practice. It is the clinical trials arm of ThinkWell, the not-for-profit organization through which people can become engaged partners in healthcare via responsible, informed, shared decision-making. The plan is to further engage, train and empower the public to plan, prioritize and take part in community-based randomized controlled trials. METHODS PLOT-IT crowdsources research ideas and health data (with academic health researchers providing a support service to ensure that the research is ethical, methodologically sound and clinically safe, and that personal data are protected). Participants are involved in multiple aspects of an online clinical trial. AIMS To provide an infrastructure where the public can become informed and responsible shared decision-makers who help prioritize, initiate, design, organize and participate in health research through online randomized controlled trials about health and well-being. OBJECTIVE To develop a scalable online platform for running RCTs of interventions people can do for themselves. RESULTS Embedded methodological research will enable reporting of the methods used for informed consent, demographics, outcome measures, recruitment barriers/facilitators, reduction of inequalities and attrition, along with participants' site-use patterns and recommendations for improvement. LIMITATIONS Early PLOT-IT data will focus on exploratory and pragmatic questions; explanatory research is outside the remit of the project at this time. CONCLUSIONS The public has untapped potential for improving decision-making, education and methodology in clinical trials. PLOT-IT aims to engage them directly in health research.
Presentation Link: http://www.ithinkwell.org/finding-answers/crisp/
#58 The proof is in the eating
Abstract: The field of eating behaviour abounds with relatively small studies that can each only paint a small part of the total picture, whereas the big picture is needed to inform the food industry, governmental policy makers and scientists on topics such as overeating. There is therefore a rising need to organise the ever-growing amount of scientific knowledge. The eATING project aims to improve the understanding of results in a difficult and scattered science by developing an easy-to-use, interactive network database tool. Eating behaviour is a particularly difficult field because it is influenced by food properties, our surroundings, our physiology and our psychology, and is therefore addressed by many disciplines spanning the molecular and the social sciences. With the computerised eATING tool we will build a network of causal relationships between the many factors that influence eating behaviour. This network will enable interactive exploration to facilitate the integration of knowledge between disciplines, provide a broader mechanistic understanding of eating behaviour, spawn discussion among scientists, and expose knowledge gaps, thereby generating new research hypotheses and unexpected insights. The causal relationships will be based on results from scientific publications. Besides the network, the eATING tool will provide an overview of all the results from the literature that underlie the causal relations. Finally, eATING can serve as a communication tool for researchers, industry and citizens.
Presentation Link: http://innoscholcomm.silk.co
#29 Collaborative Scholarship with PeerLibrary
Abstract: No scholar is an island. We all must stand on the shoulders of giants, looking not only to our predecessors for guidance but also to our peers. Modern research survives on a team basis and should be analyzed and discussed as such. PeerLibrary offers an interface for group annotation and highlighting of uploaded PDF documents. It can be used by researchers, scholars, information scientists or students from any background to discuss academic literature. The end goal is to make academia as readily available as possible to anyone, including the general public. Research, just like learning, is often most effective when done collaboratively. This interactive growth in the academic community will not only make scholarship more distributed and discussed, but will also guide the progress of future research to build upon the network of scholars' deliberations. PeerLibrary facilitates the convenient sharing of resources within a research group and provides an outlet for published research. A typical PeerLibrary workflow could have a researcher create a group with their collaborators, add relevant articles or data to the group, and solicit feedback using the built-in commenting and annotation features. Collaborative reading and discussion of publications that are foundational to the team's research ensures that the entire group understands the necessary information, reducing the risk of confusion or miscommunication.
Presentation Link: http://www.peerlibrary.org
#59 A Solution to the Silo Problem of Academic Literature
Abstract: Academic literature suffers from a silo problem: papers are published in journals that cannot be accessed without an expensive subscription and do not provide for easy analysis and discussion. As a result, the velocity of information transferred across a research field and to the public suffers: the research community has to rely largely on infrequent conferences, and the public has to rely on an often-inaccurate popular scientific press. Content aggregators and search engines can help alleviate the discovery problem, but cannot solve the problems of access (if one lacks the required journal subscription) and analysis. And while open access journals can help, most prominent journals are still closed and show no signs of opening in the near future. One solution is to create an annotation and discussion layer over the whole of academic literature, allowing those with the relevant permissions to view the entire full text of an article and allowing everyone to provide and view commentary and analysis from other users. Such an enveloping layer of knowledge helps everyone better understand the latest literature. To prevent another lock-in, this layer has to be open and shared across the ecosystem. But a thriving ecosystem requires an active community of users, which is easier to build around a centralized digital library with a unified interface for all annotation and discussion content.
#14 ISA Data Publication in Action
Co-Presenter: Alejandra Gonzalez-Beltran
Abstract: Advances in biosensing and digital signal acquisition have transformed experimental biology into a data-intensive science. This is affecting the scholarly process by modifying the modalities and possibilities now offered to evaluate the solidity of scientific findings. The availability of the data is crucial for the transparency of the scientific process, for the reproducibility of results and, ultimately, for the re-use of data in further studies, optimizing the return on the investment consented by society to the scientific enterprise. In this demo & poster, we will document how the Investigation/Study/Assay (ISA) model facilitates life science data publication owing to its flexibility, extensibility and ease of use. The ISA infrastructure has been successfully applied to biological data management in various domains [Sansone-NatGenet2012] and adopted by public repositories (e.g. MetaboLights: http://www.ebi.ac.uk/metabolights/), data journals (e.g. BMC GigaScience: http://www.gigasciencejournal.com/ and NPG Scientific Data: http://www.nature.com/sdata/) and research consortia (e.g. ToxBank: http://www.toxbank.net/); we will review its use by public repositories and journals in their efforts to preserve research artefacts while implementing community-vetted annotation guidelines. We will show how the ISA model is easily enacted as a tabular format (ISA-Tab) --- given biologists' familiarity with spreadsheet software --- and supported by a collection of open-source software tools for manipulating the format. We will also show how a fuller, semantically richer object can be obtained by producing a named graph using the LinkedISA component for release to the Linked Data Cloud.
Presentation Link: http://isa-tools.org
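The abstract's point that ISA metadata is "easily enacted as a tabular format" can be illustrated with a toy example. The column names below follow common ISA-Tab conventions, but the table content is invented and a real submission also carries an investigation file and assay files; this is only a sketch:

```python
import csv
import io

# A minimal, illustrative study-sample table in the spirit of ISA-Tab:
# tab-delimited columns that link source material to derived samples.
isa_tab_study = (
    "Source Name\tCharacteristics[organism]\tProtocol REF\tSample Name\n"
    "patient-1\tHomo sapiens\tsample collection\tliver-sample-1\n"
    "patient-2\tHomo sapiens\tsample collection\tliver-sample-2\n"
)

def read_study_table(text):
    """Parse a tab-delimited study table into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

rows = read_study_table(isa_tab_study)
```

Because the format is plain tab-delimited text, it can be authored in any spreadsheet program, which is precisely the rationale the abstract gives for ISA-Tab.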
#15 BioSharing Catalogue of Standards: Supporting Scientists, Publishers and Funders to make data shareable and discoverable
Co-Presenter: Alejandra Gonzalez-Beltran
Abstract: Kaze Multhyomicz is a scientist in the process of writing a grant application whose guidelines require him to produce a detailed management plan, including the syntaxes, terminologies and public repositories he should be considering in his application. Kaze is a wet-lab expert and is really annoyed at such requirements. However, Kaze comes across the BioSharing (http://www.biosharing.org) resources. After a few queries using keywords corresponding to the lab techniques his team regularly uses, he has identified 6 databases, 5 standards and 4 vocabularies of relevance. He also discovers that there is no standard for reporting western blots. What can he do about it? Palmito Khondry is an editor and reviewer for a prestigious molecular biology journal. He needs to ensure that submitters comply with reporting guidelines, but the Board has provided little help apart from a link to BioSharing. There, Dr Khondry can navigate the editorial policies of several major journals and immediately identify the relevant resources and databases for his field. Ben Schemark is a Program Officer at a leading funding agency. He has been getting feedback that the records on standardization efforts in some areas of science and technology are 'patchy'. He also knows of several competing efforts in another area. Under severe budget cuts following a tough fiscal year, Ben needs to find usage metrics to justify the allocation of resources to standardization groups. Ben visits the BioSharing website and checks a number of resources based on their popularity, usage and prevalence.
Presentation Link: http://www.biosharing.org
#30 Preparations at Oxford for Open Access Compliance for the Next REF
Abstract: The demonstration will include presentation of the pilot Oxford AAM (Article Accepted Manuscript) repository quick deposit service. There will be discussion around the approach being considered during the Oxford 'OA service design project.' Attendees are encouraged to participate in the discussion.
#31 Machine Methods for Retrieving Data for Display or Mining Utilising Persistent (data-DOI) identifiers
Presentation Link: http://doi.org/10042/a3v2b
#60 Taking Legal Research Beyond Traditional Communication Media
Abstract: Most discussions on the dissemination of, and access to, scholarly publications and research data have focused on natural science research. Even the limited discussions within the humanities and social sciences are often restricted to specific fields like economics. But aren't many of the issues regarding dissemination and access equally relevant for legal research materials? Law touches the lives of people in many different ways every day. Unfortunately, most legal research materials, such as scholarly publications and empirical data, are locked behind the pay-walls of traditional journals and databases. Even many statutes and judgements of various courts are beyond the reach of the general public and researchers. This issue of access to legal research materials is particularly challenging in the global south, yet these challenges have not received due attention. In this context, this presentation will explore different tools that may enable new forms of dissemination of legal research, taking India as a case study. The presentation will begin by examining the current modes of dissemination in the Indian legal research scenario. Based on a comparison of those findings with those from other fields and other countries, suggestions for new communication strategies in legal research will be derived. Though the study is done in the context of Indian legal research, the findings will be of high importance for many other countries, particularly those in the global south.
#61 Trolls in the peanut gallery?
Abstract: The poster takes as its point of departure a comment thread titled "Fist-clenchingly poor science" left on a PLoS One article. We provide a preliminary analysis of this hostile comment thread. We contextualize this by considering the norms for public scientific communication. We ask: how does computer-mediated discourse impact professional scientists' opportunities to react to each other's work? How does moderation of this public-professional discourse impact the conversation? This is work in progress to be presented at the 14th International Pragmatics Conference, Antwerp, Belgium on the "Researching and understanding the language of aggression and conflict" panel.
#62 The Impact of Information and Communication Technologies on Informal Scientific Communication: a Naturalistic Inquiry Approach
Abstract: This poster is produced as a result of an ongoing PhD project titled "The impact of information and communication technologies on informal scientific communication: a naturalistic inquiry approach". The project aims to accomplish three main objectives: first, to investigate the role and impact of information and communication technologies on the practice of science in the UK; second, to examine and characterise changes in scholarly communication activities such as information seeking, publishing and collaboration; and third, to investigate the validity of current scholarly communication models and determine whether there is a need for a new model. The study deployed a naturalistic inquiry approach using semi-structured interviews as a qualitative research tool. A theoretical sample of 40 researchers at four universities was interviewed to gather data on informal scholarly communication practices, the factors that affect researchers' decisions, and changes in the scholarly communication system. The results suggest that there are three types of scholars who engage in scholarly communication activities: the 'Orthodox Scholar', who only uses formal and traditional scholarly communication approaches; the 'Moderate Scholar', who prioritises formal communication approaches but at the same time tries to benefit from informal channels; and the 'Heterodox Scholar', who uses all available channels of scholarly communication. The study also proposes a model of scholarly communication that reflects the current changes in scholarly research.
#63 Building the Digital Manuscripts Toolkit
Abstract: Digital Manuscripts Toolkit is a two-year project based at the Bodleian Libraries, funded by The Andrew W. Mellon Foundation. The aim of the project is to build a toolkit, based on the International Image Interoperability Framework (IIIF), www.iiif.io, which will allow scholars to use, manipulate and repurpose digitized medieval manuscripts in innovative ways. The proposed poster will illustrate some of the principles of IIIF, and will demonstrate some use cases (e.g. dispersed manuscripts, damage & conservation, unusual layouts or formats) drawing on digitized manuscripts held by the Bodleian. It will also outline the project’s upcoming Call for Proposals for small projects to test the toolkit’s functionality, which will give three Oxford-affiliated scholars £5000 each to develop their work involving digitized manuscripts.
Presentation Link: http://dmt.bodleian.ox.ac.uk
#64 Research shared: ZENODO
Co-Presenter: Sünje Dallmeier-Tiessen
Abstract: Zenodo provides an open, dependable home for the long tail of science: a service that enables researchers, scientists, projects and institutions to store, share and showcase multidisciplinary research results (data, software and publications) that have not found a home in existing institutional or subject-based repositories. It encourages data sharing and preservation by removing the pain and the barriers, accepting data in all its forms, of all formats and with all licences. Zenodo further promotes reuse by preserving data analysis software and offering easy GitHub integration. It ensures the citability of research artifacts through the assignment of DOIs, encourages use and reuse through open APIs and harvesting, and supports Open Science and those who drive it by ensuring that funding information is captured and made available.
Presentation Link: https://zenodo.org
#65 Data Management in the Laboratory
Abstract: Transparency in documenting laboratory data and processes is a prerequisite for reproducibility. Experimental food research is gradually evolving from an individual activity into a collective activity between the scientific community and the food industry, further increasing the importance of transparency. All research objects - methods, samples, datasets etc. - that are used or generated in an experiment should be sufficiently structured and documented. In practice, however, data are often poorly specified and thus hard to understand and re-use. Tiffany is a platform used at TI Food and Nutrition to help food researchers not only store their data but also describe and structure it around their workflow. The Tiffany model is a refinement of the W3C PROV-O model for provenance. It combines the ability to trace back the workflow with extra information such as the type of activity and the research question being investigated. The Tiffany model provides the necessary structure for traceability and transparency. However, our experience with Tiffany has shown that researchers are reluctant to spend time structuring their data and providing contextual information; this is regarded as unnecessary overhead rather than a core part of good research. It is essential to provide support that minimizes the effort required for the annotation and documentation of data in laboratories. We have developed a model, RDF RecordTable, which links annotations to the data and also models the table structure. To validate our model, we have implemented it in an Excel add-in called Rosanne. Moreover, to address the problem of descriptions, we will devise tools and methods that help researchers describe and document their experimental method while conducting an experiment in the laboratory. In the first phase of this research, we take an ontology-based approach to understand, describe and formally model the information structure of method description protocols.
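The workflow traceability that the Tiffany model builds on W3C PROV-O can be sketched with plain triples. This is not the Tiffany or RecordTable implementation; the entity and activity names below are invented for illustration, and only the two core PROV-O relations are shown:

```python
# PROV-O pattern: activities "use" input entities and "generate"
# outputs, so any dataset can be traced back through the workflow.
provenance = [
    # (subject, relation, object)
    ("weighing", "prov:used", "flour-sample"),
    ("raw-weights", "prov:wasGeneratedBy", "weighing"),
    ("analysis", "prov:used", "raw-weights"),
    ("report", "prov:wasGeneratedBy", "analysis"),
]

def trace_back(entity, triples):
    """Walk wasGeneratedBy/used links from an entity to its origins."""
    lineage = [entity]
    for subj, rel, obj in triples:
        if subj == entity and rel == "prov:wasGeneratedBy":
            lineage.append(obj)  # the generating activity
            for s, r, o in triples:
                if s == obj and r == "prov:used":
                    lineage.extend(trace_back(o, triples))
    return lineage

lineage = trace_back("report", provenance)
```

Starting from the final report, the walk recovers the full chain back to the raw sample, which is the kind of end-to-end traceability the abstract describes.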
#67 Academic Discipline and Research Data Practices
Abstract: Effective support of research data management requires a deeper understanding of research practices, cultures and the needs of disciplines. Existing literature suggests that data practices vary not only across disciplines but also within sub-fields. The research presented in this poster focuses on geography, a discipline connected to the physical sciences, social sciences and humanities. It seeks to provide in-depth insight into researchers' diverse data practices and their attitudes towards data sharing, and to explore how these shape or are shaped by the nature of disciplinarity. The research uses mixed methods. In the first phase, desk-based research was used to analyse UK geography department websites and research groups and the web profiles of 68 researchers; a bibliometric study of collaboration patterns and researchers' publications was also undertaken. The second phase is based on interviews with geographers from 12 UK institutions. Phase 1 showed that there is a strong division between human and physical geography, not only in research methods but also in collaborations and sources of funding. Furthermore, the organisation of research groups and departments is very different in research-intensive and teaching universities. Preliminary findings from the interviews show that most human geographers understand the rationale behind data sharing; in practice, however, sharing is limited to research funded by the research councils. Most are sceptical about data anonymisation and the re-usability of data. In contrast, physical geographers are increasingly sharing data beyond research teams in all forms of research.
#32 CERMINE: Automatic Extraction of Metadata and References from Scientific Literature
Abstract: CERMINE is a comprehensive open source system for the automatic extraction of metadata and bibliography from scientific articles in born-digital form. The system is based on a modular workflow whose loosely coupled architecture allows individual components to be evaluated and adjusted, enables effortless improvement or replacement of independent parts of the algorithm, and simplifies future expansion of the architecture. The implementations of most steps are based on supervised and unsupervised machine-learning techniques, which simplifies the procedure of adapting the system to new document layouts and styles. An extensive evaluation of the extraction workflow, carried out on a large dataset, showed good performance for most metadata types, with an average F-score of 78%. CERMINE has proved useful in a number of projects: for providing missing metadata and full text for large document collections (OpenAIRE+, Paperity), and for intelligent submission interfaces (Public Knowledge Project, Infona). The CERMINE system is available under an open-source licence and can be accessed at http://cermine.ceon.pl. During my presentation I will outline the overall workflow architecture and mention the most interesting details of the individual implementations. I will also describe the evaluation methodology and report its results. The demonstration will cover the usage of the extraction tools, as well as the procedure for adapting the algorithms to specific document layouts.
Presentation Link: http://cermine.ceon.pl
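The loosely coupled architecture the abstract describes, where independent components can be evaluated and swapped individually, might be sketched as follows. The functions below are toy stand-ins for illustration only, not CERMINE's actual components or API:

```python
# A pipeline of replaceable steps: any component (here "layout" or
# "classify") can be swapped for a retrained or improved version
# without touching the rest of the workflow.
def layout_analysis(document_text):
    # toy stand-in: split the document into "zones" (paragraphs)
    return [z for z in document_text.split("\n\n") if z.strip()]

def classify_zone(zone):
    # toy stand-in for a trained zone classifier
    if zone.lower().startswith("references"):
        return "bibliography"
    if len(zone) < 80:
        return "metadata"
    return "body"

def run_pipeline(document_text, steps):
    zones = steps["layout"](document_text)
    return [(steps["classify"](z), z) for z in zones]

steps = {"layout": layout_analysis, "classify": classify_zone}
doc = "A Short Title\n\n" + "x" * 100 + "\n\nReferences\n[1] ..."
labels = [label for label, _ in run_pipeline(doc, steps)]
```

Swapping an entry in `steps` is all it takes to replace a component, which is the property that makes per-component evaluation and adaptation to new layouts cheap.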
#68 The Route from Publication to Database to New Conclusions: Examples from the IEDB
Co-Presenter: James A. Overton, Julia Ponomarenko, and Bjoern Peters
Abstract: The Immune Epitope Database and Analysis Resource (IEDB) is a free resource which presents data from scientific publications. Curators read and manually annotate journal articles to provide immunologists with enriched content regarding the recognition of epitopes by the adaptive immune system. Because the IEDB aggregates immunological data across thousands of manuscripts, it can integrate these data and mine them for new conclusions. One example of this new value is the Immunome Browser. Immune responses, particularly in humans, can vary significantly between individuals, and only by studying many individuals can conclusions be drawn. The Immunome Browser takes all published data on the immune response to a given protein and plots them along the length of the protein, displaying how many subjects responded to each region, as well as how many did not respond. Through this aggregation of data, combined with an enhanced visual display, immunologists can see the larger picture of the immune response to a specific protein. The ability to aggregate information from different sources would be greatly facilitated if publications were already annotated in a semantically rich format. Currently, much detective work is required for curators to locate the epitope-specific data. An epitope is the portion of a protein which is recognized by an adaptive immune receptor; to properly identify an epitope, its exact amino acid structure, as well as its position in the protein, must be known. The lack of an enforced protein nomenclature results in many publications that use a shorthand name for a protein rather than a strict UniProt identifier. We believe that the current paradigm of traditional publication, curation into semantically aware databases, and subsequent re-analysis of the data could be moved towards semantically richer original publications, which would greatly reduce the effort necessary to enable the meta-analyses made possible by secondary databases such as the IEDB.
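The aggregation the Immunome Browser performs, counting per protein position how many subjects responded across all published epitopes, can be sketched in miniature. The data below are invented and the function is an illustrative simplification, not the IEDB's implementation:

```python
def immunome_profile(protein_length, epitope_results):
    """Aggregate epitope-level results along a protein.

    epitope_results: (start, end, responded, tested) per epitope,
    with 1-based inclusive residue positions.
    """
    responded = [0] * (protein_length + 1)  # index 0 unused
    tested = [0] * (protein_length + 1)
    for start, end, resp, test in epitope_results:
        for pos in range(start, end + 1):
            responded[pos] += resp
            tested[pos] += test
    return responded, tested

results = [
    (1, 9, 4, 10),   # epitope at residues 1-9: 4 of 10 subjects responded
    (5, 12, 7, 8),   # overlapping epitope at residues 5-12: 7 of 8
]
resp, tested = immunome_profile(20, results)
# Residue 5 is covered by both epitopes, so its counts combine.
```

Plotting `resp` against `tested` along the sequence gives the per-region response picture the abstract describes, including regions where subjects were tested but did not respond.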
#69 Applying a Rubric to Data Management Plans to Investigate Data Sharing
Abstract: Research data is increasingly considered a component of research that should be managed well and shared openly. In the United States some funding agencies, such as the National Science Foundation (NSF), require that data management plans (DMPs) be submitted to accompany research funding proposals, with a strong emphasis on data sharing. In parallel with these requirements, academic libraries and their partners have begun to develop research data management services in support of data management and curation. Data management plans can offer insight into how researchers might manage, share, and archive their research data. With that in mind, the IMLS-funded “Data management plans as A Research Tool (DART) Project” has developed an analytic rubric to standardize the review of NSF data management plans. The information gleaned from these evaluations can be leveraged for improving internal and external research data management services and infrastructure, from training to tools to repositories. This poster will introduce the rubric developed through a collaboration among five institutions. The focus will be on examining the intentions of researchers toward data publication, as expressed through a preliminary review of a subset of DMPs. We will examine the following questions: - To what degree do DMPs adequately respond to funding agency DMP requirements? - What data publishing options are proposed, and how do they reflect the current avenues for data sharing? - What considerations should be made when using DMPs for this type of analysis?
Presentation Link: http://blogs.oregonstate.edu/dmpresearch/
Abstract: The Hypothes.is Project is a non-profit company dedicated to bringing an open source, open standards annotation layer to the web, with a particular emphasis in the scholarly world. See how annotation can be added to any web page, or carried with you as a browser extension.
Presentation Link: http://hypothes.is
#70 Lines In The Ice: An Interactive Writer-In-Residence At The British Library
Abstract: This poster explores how digital interactive narratives can be used as a method to enhance visitor experiences of cultural heritage institutions' exhibitions, using Rob Sherman, the British Library's Interactive Writer-In-Residence for the 'Lines In The Ice' exhibition, as a case study. This exhibition runs from November 2014 to March 2015 and includes documents, maps and paraphernalia relating to Arctic exploration expeditions, including John Franklin's ill-fated voyage to find the Northwest Passage in 1845. In a hybrid physical and digital creative writing installation project called "On My Wife's Back", Rob Sherman has created a fictional character, Isaac Scinbank, who goes looking for John Franklin's missing expedition. In addition to a new digital interactive narrative created using the open source Twine platform, and content distributed from a PirateBox used as a digital "cairn" in the exhibition gallery, the residency includes live events, encouraging in-person and online visitors to be actively involved in the story-writing process and putting flesh on the human stories behind the displayed artefacts. This residency is a pilot study exploring the role of interactive narratives and collaborative creativity in enhancing and enriching visitor experiences of cultural heritage institutions' collections and their interpretations of exhibitions; it is hoped the findings will be insightful to other libraries, archives, museums and writers who wish to share their content in immersive new ways.
Presentation Link: http://onmywifesback.tumblr.com
#71 The British Library Ventures Off the Map
Abstract: The British Library is interested in how videogame technologies can be used to enable innovative new interpretations of library collections. The Off the Map competition is a collaboration between the British Library and GameCity, a videogame culture festival run in partnership with Nottingham Trent University. In 2013 and 2014, Off the Map challenged higher education students based in the UK to create videogames inspired by the British Library's collections, using Crytek's cutting-edge real-time technology CRYENGINE. The 2014 Off the Map competition accompanied the British Library's current exhibition Terror and Wonder: The Gothic Imagination. Curators picked a number of items, ranging from maps, sounds, text, images and illustrations to architectural plans, to provide three Gothic themes for entrants to base their videogames on: author William Beckford's home Fonthill Abbey, Edgar Allan Poe's short story The Masque of the Red Death, and the seaside town of Whitby, which features in Bram Stoker's Dracula. The 2014 winning entry, Nix, created by three computer gaming students from the University of South Wales, challenges gamers to reconstruct Fonthill Abbey via a series of puzzles in a spooky underwater world. It uses Oculus Rift, a virtual reality headset for 3D gaming, to enable the user to virtually explore the Abbey. You can see a flythrough of the game at http://youtu.be/8ESieZO4VHw. Plans are currently underway for the third competition, "Alice's Adventures Off the Map", which accompanies the Library's Alice in Wonderland exhibition in 2015.
Presentation Link: http://gamecity.org/alices-adventures-off-the-map/
#72 CLARIN: Common Language Resources and Technology Infrastructure
Abstract: CLARIN is an emerging European research infrastructure which is building shared services to support and promote the use of language data and tools in research across the humanities and social sciences. CLARIN is built on national infrastructures and expert centres specialising in areas such as data archiving, online linguistic analysis tools, multimedia resources, resource discovery, knowledge-sharing, legal and ethical issues, federated identity management, persistent identifiers, and web services. As the construction and operation of the CLARIN infrastructure gathers pace, the key challenge becomes supporting existing and changing scholarly processes around digital data: creation, use, re-use, publication, citation, and long-term archiving and preservation.