Europe PMC Text-annotator is now open source
Introducing Text-annotator
Europe PMC has open-sourced Text-annotator, a JavaScript library that locates and annotates plain text in HTML. The annotation process has two steps:
- Search: Search for a piece of plain text in the HTML. On finding it, store its location and return an index that identifies that location for later annotation.
- Annotate: Annotate the found text given its index.
Splitting annotation into these two steps allows more flexibility: for example, a user can search for all pieces of text first and annotate them later as required. Text-annotator can be used in the browser or on a Node.js server.
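The two-step idea can be sketched in a few lines of JavaScript. This is a toy illustration only, not the Text-annotator API: the class `TinyAnnotator`, its methods, and the use of `<mark>` as the wrapping tag are all hypothetical names chosen for this example. It assumes each match lies inside a single text node and that matches do not overlap.

```javascript
// Toy illustration of "search first, annotate later".
// NOT the Text-annotator API: every name here is hypothetical.
class TinyAnnotator {
  constructor(html) {
    this.html = html;
    this.text = "";    // plain text with the tags stripped out
    this.offsets = []; // offsets[i] = position in html of text[i]
    this.found = [];   // stored locations, addressed by index
    let inTag = false;
    for (let i = 0; i < html.length; i++) {
      const ch = html[i];
      if (inTag) {
        if (ch === ">") inTag = false;
      } else if (ch === "<") {
        inTag = true;
      } else {
        this.text += ch;
        this.offsets.push(i);
      }
    }
  }

  // Step 1: locate the plain text, store its location, return an index.
  search(needle) {
    const at = this.text.indexOf(needle);
    if (needle.length === 0 || at === -1) return -1;
    const start = this.offsets[at];
    const end = this.offsets[at + needle.length - 1] + 1;
    this.found.push({ start, end, annotated: false });
    return this.found.length - 1;
  }

  // Step 2: mark a stored location as annotated and return the new HTML.
  annotate(index) {
    this.found[index].annotated = true;
    // Insert tags at the rightmost location first so that the
    // offsets of locations further left remain valid.
    const marked = this.found
      .filter((f) => f.annotated)
      .sort((a, b) => b.start - a.start);
    let out = this.html;
    for (const { start, end } of marked) {
      out = out.slice(0, start) + "<mark>" + out.slice(start, end) +
        "</mark>" + out.slice(end);
    }
    return out;
  }
}

// Search for everything first, then annotate only what is needed.
const annotator = new TinyAnnotator("<p>The <b>p53</b> gene suppresses tumours.</p>");
const geneIndex = annotator.search("p53");
const tumourIndex = annotator.search("tumours");
const annotatedHtml = annotator.annotate(tumourIndex);
```

Here both searches run before any annotation, and only the second result is annotated at first; calling `annotator.annotate(geneIndex)` later adds the other highlight. A match that crosses a tag boundary would produce mis-nested markup in this sketch, which is one of the cases a real library has to handle.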
How is Text-annotator used in Europe PMC?
Text-annotator is used in the following Europe PMC features:
Article title highlighting
When you search Europe PMC, the search term is highlighted in the titles of the articles displayed on the search results page.
https://europepmc.org/search?query=p53
Snippets
Text-annotator also supports snippets, which are highlights from the article that match the search query.
https://europepmc.org/article/MED/33121131
Every search result displays two snippets, separated by an ellipsis. The following shows the snippets in an article returned for the search term ‘cancer’:
https://europepmc.org/search?query=cancer
SciLite
Text-annotator was also used to develop SciLite, a tool that highlights biological entities and relationships, such as genes/proteins, accession numbers, protein interactions, diseases, and gene-disease relationships, in the full text of life sciences articles in Europe PMC. SciLite combines the two steps of Text-annotator, search and annotate. SciLite currently provides access to over 1.3 billion annotations in articles.
https://europepmc.org/article/PMC/PMC6423025
Linkback
In SciLite, a highlighted term provides a popup window with the name of the annotation provider and a link to the data resource. Text-annotator also supports the linkback feature, which lets users get a link back to the annotation via the ‘Share’ option.
http://europepmc.org/article/PMC/PMC6423025#europepmc-b1865e80dfcfdd5d919150aaa56e7b0b
The source code for Text-annotator is available on GitLab. We welcome feedback and hope that you find it useful! Check out some of the other projects we’ve open-sourced on GitLab and GitHub.
For more information, contact helpdesk@europepmc.org.