PKP Presents: Finally, A Completely Automated XML Publication Pipeline

Author / Presenter

Alex Garnett

After many false starts, and after years of supposedly automated solutions that in fact involve a lot of effort on the part of grad students or outsourcing, the Public Knowledge Project has put together a new service-oriented toolchain that provides a fully automated solution for converting Word-authored article drafts to NLM XML. The toolchain consists of several cutting-edge scholarly parsing tools, including pdfx, Pandoc, ParsCit, citeproc, and a boatload of XSL. It allows very low-budget open access publishers to finally move beyond PDF in their publication workflows, and it enables us to develop additional functionality within Open Journal Systems and Open Monograph Press for editing and mining richly formatted articles.

Meeting / Conference
Name: Beyond the PDF2