Cross-Domain Limitations of Neural Models on Biomedical Relation Classification
Relation extraction (RE) aims to extract relational facts from plain text, a task that is essential to biomedical research given the rapid growth of the biomedical literature and the large volumes of biomedicine-related text produced by diverse sources. Numerous annotated corpora and state-of-the-art models have been introduced in the past five years, yet there are no general guidelines for evaluating models on these corpora in single- and cross-domain settings with diverse entity and relation types. We aim to fill this gap for the task of detecting whether a relation holds between two biomedical entities in a given text span. We present a fine-grained comparative evaluation of four biomedical RE benchmarks (the PHAEDRA, i2b2/VA, BC5CDR, and MADE corpora) and assess the effectiveness of state-of-the-art neural architectures, namely a Long Short-Term Memory (LSTM) model with cross-attention and Bidirectional Encoder Representations from Transformers (BERT), for relation extraction across two main domains: scientific abstracts and electronic health records. Our evaluation of BioBERT and the LSTM for binary classification shows a significant divergence between in-domain and out-of-domain performance, with an average drop in F1-measure of 34.2% for BioBERT. The cross-attention LSTM model developed in this work exhibits better cross-domain performance, with a drop of only 27.6% in F1-measure.
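The binary framing described above (deciding whether a relation holds between two marked entities in a text span, and comparing in-domain against out-of-domain F1) can be sketched as follows. The marker tokens, example sentence, and scores are illustrative assumptions, not the paper's exact preprocessing or results, and whether the reported drops are relative percentages is likewise an assumption here.

```python
def mark_entities(text, span1, span2):
    """Wrap two entity mentions in marker tokens for binary relation
    classification. Spans are (start, end) character offsets, and span1
    is assumed to precede span2. The [E1]/[E2] marker style is a common
    convention, assumed here rather than taken from the paper."""
    (s1, e1), (s2, e2) = span1, span2
    return (text[:s1] + "[E1] " + text[s1:e1] + " [/E1]"
            + text[e1:s2] + "[E2] " + text[s2:e2] + " [/E2]" + text[e2:])


def f1_drop(in_domain_f1, out_of_domain_f1):
    """Relative drop in F1 (percent) when moving from in-domain to
    out-of-domain evaluation; assumes the drop is measured relative
    to the in-domain score."""
    return (in_domain_f1 - out_of_domain_f1) / in_domain_f1 * 100.0


# Hypothetical drug/effect instance from the scientific-abstract domain.
sentence = "Aspirin may cause gastric bleeding in some patients."
marked = mark_entities(sentence, (0, 7), (18, 34))
print(marked)
# [E1] Aspirin [/E1] may cause [E2] gastric bleeding [/E2] in some patients.

# Hypothetical scores: a model trained on abstracts, tested on abstracts
# (in-domain) vs. on clinical records (out-of-domain).
print(round(f1_drop(0.80, 0.60), 1))  # 25.0
```

The marked sentence would then be fed to the classifier (BioBERT or the cross-attention LSTM) with a binary label indicating whether the relation holds between the two marked entities.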