Sculpting enhanced dependencies for Belarusian
Enhanced Universal Dependencies (EUD) are enhanced graphs expressed on top of basic dependency trees. EUD support representation of deeper syntactic relations in constructions such as coordination, gapping, relative clauses, and argument sharing through control and raising. The paper presents experiments on the EUD parsing of the low-resource Belarusian language, for which no corpora with enhanced annotations were available.
Models trained on the Universal Dependencies treebanks of two closely related Slavic languages, Russian and Ukrainian, were used to parse sentences translated from Belarusian. The EUD were then projected onto the original sentences, which gave an ELAS (Enhanced Labeled Attachment Score) of 78.1% for both Russian and Ukrainian in evaluation. We also trained a model of one of the IWPT 2020 Shared Task participants on the obtained Belarusian annotations and achieved an ELAS of 83.4%. The analysis shows that the most common mistakes of cross-lingual parsing are rooted in the different theoretical perspectives and practical approaches to the annotation of particular types of clauses in the three Slavic treebanks. Russian and Ukrainian EUD transfer models tend to make mistakes when dealing with predicate-argument relations, which are hard to identify without understanding the semantics of the sentence. The alignment method decreases the quality of the annotation by confusing tokens that occur in a sentence more than once.
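For readers unfamiliar with the metric, the following is a minimal sketch of how an ELAS-style score can be computed for a single sentence, assuming the IWPT 2020 Shared Task definition of ELAS as the F1 score over labeled enhanced edges. The `Edge` representation, the function name `elas`, and the toy edges are illustrative assumptions, not taken from the paper; the official evaluation script handles details that are not modeled here.

```python
from typing import Set, Tuple

# Assumed edge representation: (head index, dependent index, relation label).
Edge = Tuple[int, int, str]


def elas(gold: Set[Edge], predicted: Set[Edge]) -> float:
    """Return F1 over labeled enhanced edges, as a value in [0, 1]."""
    if not gold and not predicted:
        return 1.0  # nothing to predict and nothing predicted
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical): the gold graph contains an extra nsubj edge
# that shares the subject with the second conjunct; the prediction misses it.
gold = {(0, 2, "root"), (2, 1, "nsubj"), (2, 4, "conj"), (4, 1, "nsubj")}
pred = {(0, 2, "root"), (2, 1, "nsubj"), (2, 4, "conj")}
score = elas(gold, pred)
```

In this toy case the prediction has perfect precision but recovers only three of the four gold edges, so the score falls below 1. Missing shared-argument edges of this kind are one way errors in predicate-argument relations would lower the metric.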