Europe PMC Text-annotator is now open source

Introducing Text-annotator

Europe PMC has open-sourced Text-annotator, a JavaScript library to locate and annotate plain text in HTML. The annotation process includes:

  1. Search: Look for a piece of plain text in the HTML. If it is found, its location is stored and an index identifying that location is returned for later annotation.
  2. Annotate: Annotate the found text, given its index.

Splitting annotation into these two steps allows more flexibility: for example, the user can first search for all pieces of text and then annotate them later as required. Text-annotator can be used in the browser or on a Node.js server.
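The two-step idea can be illustrated with a minimal, self-contained sketch. This is not the Text-annotator API: the class name `SketchAnnotator`, its methods and the `highlight` CSS class are hypothetical, chosen only to show how a search step can store locations and hand back indices that a later annotate step consumes. It assumes well-formed HTML, does not decode entities such as `&amp;`, and assumes matches do not overlap or cross tag boundaries.

```javascript
// Hypothetical sketch of "search, then annotate"; NOT the Text-annotator API.
class SketchAnnotator {
  constructor(html) {
    this.html = html;
    this.plain = "";            // the HTML with tags stripped
    this.map = [];              // map[i] = offset in html of plain[i]
    this.locations = [];        // stored matches, addressed by index
    this.annotated = new Set(); // indices that have been annotated
    let inTag = false;
    for (let i = 0; i < html.length; i++) {
      const ch = html[i];
      if (ch === "<") {
        inTag = true;
      } else if (inTag) {
        if (ch === ">") inTag = false;
      } else {
        this.plain += ch;
        this.map.push(i);
      }
    }
  }

  // Step 1: find the first occurrence of plain text, store its location
  // in the HTML, and return an index for later use (-1 if not found).
  search(text) {
    const start = text.length > 0 ? this.plain.indexOf(text) : -1;
    if (start === -1) return -1;
    const last = start + text.length - 1;
    this.locations.push({ start: this.map[start], end: this.map[last] + 1 });
    return this.locations.length - 1;
  }

  // Step 2: annotate a previously found piece of text by its index and
  // return the HTML with every annotation made so far applied.
  annotate(index) {
    if (!this.locations[index]) throw new RangeError("unknown index: " + index);
    this.annotated.add(index);
    // Insert from the end of the document backwards so earlier offsets stay valid.
    const spans = [...this.annotated]
      .map((i) => this.locations[i])
      .sort((a, b) => b.start - a.start);
    let out = this.html;
    for (const { start, end } of spans) {
      out =
        out.slice(0, start) +
        '<span class="highlight">' +
        out.slice(start, end) +
        "</span>" +
        out.slice(end);
    }
    return out;
  }
}

// Usage: search for everything first, annotate later as required.
const annotator = new SketchAnnotator("<p>The <b>p53</b> gene</p>");
const p53Index = annotator.search("p53");
const geneIndex = annotator.search("gene");
console.log(annotator.annotate(geneIndex));
console.log(annotator.annotate(p53Index));
```

Because the search step only records locations, the caller is free to decide later which of the found pieces of text to annotate, and in what order.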

How is Text-annotator used in Europe PMC?

Text-annotator is used in the following Europe PMC features:

Article title highlighting

When you search in Europe PMC, the search term is highlighted in the titles of the articles displayed on the search results page.

Search example https://europepmc.org/search?query=p53

Snippets

Text-annotator also supports snippets, which are highlighted passages from the article that match the search query.

Snippets example https://europepmc.org/article/MED/33121131

Every search result displays two snippets, separated by an ellipsis. The following example shows the snippets in an article returned for the search term ‘cancer’:

Search result snippets example https://europepmc.org/search?query=cancer

SciLite

Text-annotator was also used to develop SciLite, a tool for highlighting biological entities and relationships, such as genes/proteins, accession numbers, protein interactions, diseases and gene-disease relationships, in the full text of life sciences articles in Europe PMC. This function combines the two steps of Text-annotator, search and annotation. SciLite currently provides access to over 1.3 billion annotations in articles.

SciLite example https://europepmc.org/article/PMC/PMC6423025

Linkback

In SciLite, a highlighted term provides a popup window showing the name of the annotation provider and a link to the data resource. Text-annotator also supports the feature that lets users obtain a linkback to the annotation via the ‘Share’ option.

Linkback example http://europepmc.org/article/PMC/PMC6423025#europepmc-b1865e80dfcfdd5d919150aaa56e7b0b

The source code for Text-annotator is available on GitLab. We welcome feedback and hope that you find it useful! Check out some of the other projects we’ve open-sourced on GitLab and GitHub.

For more information, contact helpdesk@europepmc.org.