BioLexicon is a large-scale English terminological resource developed to address the needs emerging from text mining efforts in the biomedical domain. The database, which is available for purchase, contains over 2.2M lexical entries (3.3M semantic relations), terminological variants, and rich linguistic information (subcategorization frames) that supports text mining systems. Its comprehensive coverage of biological terms makes BioLexicon a unique linguistic resource within the domain. It is primarily intended to support text mining and information retrieval in the biomedical domain; however, its standards-based structure and rich content make it a valuable resource for many other kinds of application. It has also been evaluated in IR/IE tasks. Of particular interest is the use of the BioLexicon in the UKPubMedCentral evidencefinder service that we are building, where the BioLexicon provides specific information to help determine the relevant facts to be extracted.

BioLexicon is available in a relational database format (MySQL dump format) and adheres to the EAGLES/ISO standards for lexical resources. It is available from the European Language Resources Association (ELRA) Catalog. It is a collective achievement by EMBL-EBI, CNR-ILC, and the University of Manchester in the EC BOOTStrep Project.

Reference: Sasaki, Y., Montemagni, S., Pezik, P., Rebholz-Schuhmann, D., McNaught, J., and Ananiadou, S. (2008). BioLexicon: A Lexical Resource for the Biology Domain. In: Salakoski, T., and Pyysalo, S. (Eds.), Proceedings of the Third International Symposium on Semantic Mining in Biomedicine (SMBM 2008), Turku, Finland, pp. 109-116. Turku Centre for Computer Science (TUCS).

Resource Type: 
Parent organization: 
National Centre for Text Mining
Supporting agency: