Extraction of protein-protein interactions (PPIs) from the literature by deep convolutional neural networks with various feature embeddings
Journal of Information Science
Published online on November 14, 2016
Abstract
The automatic extraction of protein–protein interactions (PPIs) reported in scientific publications is of great significance for biomedical researchers, because it lets them efficiently grasp recent research results on biochemical events and molecular processes when conducting their own original studies. This article introduces a deep convolutional neural network (DCNN) equipped with various feature embeddings to overcome the limitations of existing machine learning-based PPI extraction methods. The proposed model learns and optimises word embeddings initialised from publicly available word vectors, and it also exploits position embeddings to identify the locations of the target protein names in sentences. Furthermore, it can employ various linguistic feature embeddings to improve PPI extraction. Extensive experiments on the AIMed data set, which is known as the most difficult collection, not only show the superiority of the proposed model but also offer important implications for optimising the network parameters and hyperparameters.
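To illustrate how position embeddings can mark the target protein names, the following minimal sketch computes, for each token, a clipped relative distance to a target protein and shifts it to a non-negative index that could be used to look up an embedding vector. This is one common formulation, not necessarily the exact scheme used in the article; the function name `position_indices`, the `max_distance` parameter, and the placeholder tokens `PROT1` and `PROT2` are assumptions made for illustration.

```python
def position_indices(num_tokens, target_index, max_distance=10):
    """Return one embedding-lookup index per token.

    Each index encodes the token's relative distance to the target token,
    clipped to [-max_distance, max_distance] and shifted by max_distance
    so that every value is non-negative. (Hypothetical helper.)
    """
    indices = []
    for i in range(num_tokens):
        distance = i - target_index
        clipped = max(-max_distance, min(max_distance, distance))
        indices.append(clipped + max_distance)
    return indices


# Illustrative sentence with the two candidate proteins replaced by placeholders.
tokens = ["PROT1", "binds", "to", "PROT2", "in", "vitro"]

# One index sequence per target protein; a model would embed both and
# concatenate them with the word embedding of each token.
first = position_indices(len(tokens), tokens.index("PROT1"), max_distance=3)
second = position_indices(len(tokens), tokens.index("PROT2"), max_distance=3)
```

With `max_distance=3`, tokens more than three positions away from a target share the same index, which keeps the position vocabulary small (here `2 * max_distance + 1` entries).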