Text Classification for Monolingual Political Manifestos with Words Out of Vocabulary
In this position paper, we implement an automatic coding algorithm for electoral programs from the Manifesto Project Database. We propose a new approach for handling words that are outside the training vocabulary: each such word is replaced with the training-vocabulary word that is its closest neighbor in the space of word embeddings. A set of simulations demonstrates that the proposed algorithm achieves classification accuracy comparable to state-of-the-art benchmarks for monolingual multi-label classification. The algorithm's agreement levels are comparable to those of manual labeling. Results are compared across a broad set of model hyperparameters.
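The out-of-vocabulary replacement step can be illustrated with a minimal sketch. It assumes cosine similarity as the closeness measure and a plain dictionary of pretrained vectors; the helper name `build_oov_mapper` and the toy embeddings are hypothetical and serve only as illustration, not as the implementation used in the paper.

```python
import numpy as np


def build_oov_mapper(embeddings, train_vocab):
    """Return a function that maps a token to a training-vocabulary token.

    embeddings: dict mapping word -> 1-D vector (assumed pretrained, non-zero).
    train_vocab: iterable of words seen during training.
    """
    train_set = set(train_vocab)
    # Only training words that have an embedding can serve as replacements.
    candidates = [w for w in train_vocab if w in embeddings]
    matrix = np.stack([np.asarray(embeddings[w], dtype=float) for w in candidates])
    # Normalise rows so that a dot product equals cosine similarity.
    matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)

    def replace(token):
        # Known words, and words without any embedding, are left unchanged.
        if token in train_set or token not in embeddings:
            return token
        vec = np.asarray(embeddings[token], dtype=float)
        vec = vec / np.linalg.norm(vec)
        # Pick the training word with the highest cosine similarity.
        return candidates[int(np.argmax(matrix @ vec))]

    return replace


# Toy two-dimensional embeddings (hypothetical values for illustration only).
toy_embeddings = {
    "tax": np.array([1.0, 0.0]),
    "school": np.array([0.0, 1.0]),
    "levy": np.array([0.9, 0.1]),
    "university": np.array([0.2, 0.8]),
}
replace = build_oov_mapper(toy_embeddings, ["tax", "school"])
mapped = [replace(t) for t in ["levy", "on", "university", "tax"]]
print(mapped)
```

In this sketch the unseen words "levy" and "university" are mapped to their nearest training words, while "on", which has no embedding, passes through unchanged. How such tokens should be treated in practice is a design choice the abstract does not specify.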