Rotations and Interpretability of Word Embeddings: The Case of the Russian Language

A. Zobnin

doi:10.1007/978-3-319-73013-4_11

Ch. 11. P. 116–128.

Consider a continuous word embedding model. Usually, the cosines between word vectors are used as a measure of word similarity. These cosines do not change under orthogonal transformations of the embedding space. We demonstrate that, using certain canonical orthogonal transformations derived from the SVD, it is possible both to make some components more meaningful and to make the components more stable under re-training. We study the interpretability of components for publicly available models for the Russian language (RusVectores, fastText, RDT).
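
The invariance claim above can be checked numerically. The following is a minimal sketch, assuming a random toy matrix in place of a real embedding model and using the right singular vectors of the embedding matrix as one possible orthogonal transformation; the names X, X_rot and cosine_matrix are illustrative and not taken from the paper, and the paper's own canonical transformations may differ.

```python
import numpy as np

def cosine_matrix(X):
    """Pairwise cosine similarities between the rows of X."""
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    return unit @ unit.T

rng = np.random.default_rng(0)
# Toy stand-in for an embedding matrix: 50 "words", 8 dimensions.
# Assumption: random data, not one of the models named in the abstract.
X = rng.normal(size=(50, 8))

# Thin SVD: X = U @ diag(S) @ Vt, where Vt is an 8x8 orthogonal matrix.
U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Rotate the embedding space by the right singular vectors.
# Algebraically, X @ Vt.T equals U * S.
X_rot = X @ Vt.T

# Cosines between word vectors are the same before and after the rotation.
cos_before = cosine_matrix(X)
cos_after = cosine_matrix(X_rot)
max_diff = np.max(np.abs(cos_before - cos_after))

# In the rotated basis the components are mutually uncorrelated
# (the Gram matrix of the columns is diagonal, with entries S**2).
gram = X_rot.T @ X_rot
off_diag = gram - np.diag(np.diag(gram))

print("max cosine difference:", max_diff)
print("max off-diagonal Gram entry:", np.max(np.abs(off_diag)))
```

With a real model, X would be replaced by the loaded embedding matrix; whether the rotated components are easier to interpret is an empirical question that this sketch does not address.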

Keywords: SVD, word embeddings, interpretability

In book

Analysis of Images, Social Networks and Texts. 6th International Conference, 2017, Revised Selected Papers

Vol. 10716. Cham: Springer, 2018.

From the Unknown to Transparency: A Review of Explainable AI (XAI) Technologies

Avdoshin S. M., Pesotskaya E. Y., Информационные технологии 2026 Vol. 32 No. 4 P. 185–194

With the rapid advancement of artificial intelligence, and deep learning in particular, models have emerged that are capable of delivering highly accurate predictions. However, the internal logic of such models remains difficult to interpret—an issue of critical importance, especially in domains where the correctness of an algorithm directly affects high-stakes decision-making. One promising avenue for ...

Added: May 8, 2026

Mechanistic Permutability: Match Features Across Layers

Balagansky N., Maximov I., Gavrilov D., in: Proceedings of the 13th International Conference on Learning Representations (ICLR 2025). ICLR, 2025. P. 57940–57957.

Understanding how features evolve across layers in deep neural networks is a fundamental challenge in mechanistic interpretability, particularly due to polysemanticity and feature superposition. While Sparse Autoencoders (SAEs) have been used to extract interpretable features from individual layers, aligning these features across layers has remained an open problem. In this paper, we introduce SAE Match, ...

Added: February 25, 2026

Application of MIMO technology in wideband millimeter range wireless communications systems

Tiraspolsky S.A., Ermolayev V. T., Flaksman A. G. et al., Radioelectronics and Communications Systems 2011 Vol. 54 P. 219–226

A concept of using MIMO technology in millimeter range wireless communications systems with orthogonal frequency division multiplexing is considered. The concept is based on dividing transmitting and receiving multi-element antenna arrays into separate sub-arrays with analogue radiation pattern shaping and on using two most powerful space sub-channels for information transmission. Sequence and structure of transmitted ...

Added: February 10, 2026

mmWave SVD-based beamformed MIMO communication systems

Sergey Tiraspolsky, Jeon B., Kim J. et al., Proceedings of the 7th IEEE conference on Consumer communications and networking (CCNC’2010) 2010 P. 834–838

This paper presents the concept of a data transmission protocol for millimeter wave (mmWave) wireless systems operating in a Non-Line-of-Sight environment. The concept is designed to provide effective and practical functioning of a Multiple-Input Multiple-Output (MIMO) transmission mode that exploits a combination of Singular Value Decomposition (SVD) of the channel matrix and non-adaptive beamforming. The proposed protocol reduces complexity of ...

Added: February 10, 2026

LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers

Anton R., Mikhalchuk M., Rahmatullaev T. et al., in: Findings of the Association for Computational Linguistics: NAACL 2025. Association for Computational Linguistics, 2025. P. 7757–7764.

We introduce methods to quantify how Large Language Models (LLMs) encode and store contextual information, revealing that tokens often seen as minor (e.g., determiners, punctuation) carry surprisingly high context. Notably, removing these tokens — especially stopwords, articles, and commas — consistently degrades performance on MMLU and BABILong-4k, even if removing only irrelevant tokens. Our analysis ...

Added: November 6, 2025

An unstructured algorithm for the singular value decomposition of biquaternion matrices

Wang G., Applied Mathematics Letters 2025 Vol. 163 Article 109436

With the modeling of the biquaternion algebra in multidimensional signal processing, it has become possible to address issues such as data separation, denoising, and anomaly detection. This paper investigates the singular value decomposition of biquaternion matrices (SVDBQ), establishing an SVDBQ theorem that ensures unitary matrices formed by the left and right singular vectors, while also introducing a new form for singular ...

Added: October 2, 2025

New Interfaces and New Mediators

Maksimenkova O. V., Сегал А. П., Вопросы философии 2025 No. 10 P. 67–76

The study is devoted to the interaction between humans and artificial intelligence (AI). The authors view this interaction as mediated by interfaces that both simplify it and hide the real mechanisms of encoding and decoding messages (according to Shannon). In such a situation, the characteristics of the actor of communication are blurred, and it is not ...

Added: October 2, 2025

Of Models and Men: Probing Neural Networks for Agreement Attraction with Psycholinguistic Data

Bazhukov M., Voloshina E., Sergey Pletnev et al., in: Proceedings of the 28th Conference on Computational Natural Language Learning. Association for Computational Linguistics, 2024. P. 280–290.

Added: March 11, 2025

On Rank of Multivectors in Geometric Algebras

Dmitry Shirokov, Mathematical Methods in the Applied Sciences 2025 Vol. 48 No. 11 P. 11095–11102

We introduce the notion of rank of multivector in Clifford geometric algebras of arbitrary dimension without using the corresponding matrix representations and using only geometric algebra operations. We use the concepts of characteristic polynomial in geometric algebras and the method of SVD. The results can be used in various applications of geometric algebras in computer ...

Added: December 4, 2024

On SVD and Polar Decomposition in Real and Complexified Clifford Algebras

Shirokov D., Advances in Applied Clifford Algebras 2024 Vol. 34 Article 23

In this paper, we present a natural implementation of singular value decomposition (SVD) and polar decomposition of an arbitrary multivector in nondegenerate real and complexified Clifford geometric algebras of arbitrary dimension and signature. The new theorems involve only operations in geometric algebras and do not involve matrix operations. We naturally define these and other related ...

Added: August 23, 2024

A review of Explainable Artificial Intelligence in healthcare

Sadeghi Z., Alizadehsani R., Cifci M. A. et al., Computers and Electrical Engineering 2024 Vol. 118 No. A Article 109370

Explainable Artificial Intelligence (XAI) encompasses the strategies and methodologies used in constructing AI systems that enable end-users to comprehend and interpret the outputs and predictions made by AI models. The increasing deployment of opaque AI applications in high-stakes fields, particularly healthcare, has amplified the need for clarity and explainability. This stems from the potential high-impact ...

Added: June 8, 2024

On Singular Value Decomposition and Polar Decomposition in Geometric Algebras

Shirokov D., in: Advances in Computer Graphics: 40th Computer Graphics International Conference, CGI 2023, Shanghai, China, August 28 – September 1, 2023, Proceedings, Part IV* 4. Vol. 14498. Springer, 2024. P. 391–401.

This paper is a brief note on the natural implementation of singular value decomposition (SVD) and polar decomposition of an arbitrary multivector in nondegenerate real (Clifford) geometric algebras of arbitrary dimension and signature. We naturally define these and other related structures (operation of Hermitian conjugation, Euclidean space, and Lie groups) in geometric algebras. The results ...

Added: December 25, 2023

Study on precoding optimization algorithms in massive MIMO system with multi-antenna users

Bobrov E., Kropotov D., Troshin S. et al., Optimization Methods and Software 2022 P. 1–16

The paper studies the multi-user precoding problem as a non-convex optimization problem for wireless multiple inputs and multiple outputs (MIMO) systems. In our work, we approximate the target Spectral Efficiency function with a novel computationally simpler function. Then, we reduce the precoding problem to an unconstrained optimization task using a special differential projection method and ...

Added: October 26, 2022

Tradeoff search methods between interpretability and accuracy of the identification fuzzy systems based on rules

Yankovskaya A. E., Gorbunov I. V., Hodashinsky I. A., Pattern Recognition and Image Analysis 2021 Vol. 2 No. 27 P. 243–265

This paper starts with a brief historical overview of the emergence and development of fuzzy systems and their applications. Integration methods are proposed to construct a fuzzy system using other AI methods, achieving a synergy effect. Accuracy and interpretability are selected as the main properties of rule-based fuzzy systems. The tradeoff between interpretability and accuracy is considered to be ...

Added: September 27, 2021

A note on the hyperbolic singular value decomposition without hyperexchange matrices

Shirokov D., Journal of Computational and Applied Mathematics 2021 Vol. 391 Article 113450

We present a new formulation of the hyperbolic singular value decomposition (HSVD) for an arbitrary complex (or real) matrix without hyperexchange matrices and redundant invariant parameters. In our formulation, we use only the concept of pseudo-unitary (or pseudo-orthogonal) matrices. We show that computing the HSVD in the general case is reduced to calculation of eigenvalues, ...

Added: February 12, 2021

Unsupervised Discovery of Interpretable Directions in the GAN Latent Space

Voynov A., Babenko A., in: International Conference on Machine Learning (ICML 2020). Vol. 119. PMLR, 2020. P. 9728–9738.

Added: January 14, 2021

A resource-light method for cross-lingual semantic textual similarity

Glavas G., Franco-Salvador M., Ponzetto S. et al., Knowledge-Based Systems 2018 Vol. 143 P. 1–9

Recognizing semantically similar sentences or paragraphs across languages is beneficial for many tasks, ranging from cross-lingual information retrieval and plagiarism detection to machine translation. Recently proposed methods for predicting cross-lingual semantic similarity of short texts, however, make use of tools and resources (e.g., machine translation systems, syntactic parsers or named entity recognition) that for many ...

Added: October 29, 2020

Scalable and language-independent embedding-based approach for plagiarism detection considering obfuscation type: no training phase

Gharavi E., Veisi H., Россо П., Neural Computing and Applications 2020 Vol. 32 No. 14 P. 10593–10607

The efficiency and scalability of plagiarism detection systems have become a major challenge due to the vast amount of available textual data in several languages over the Internet. Plagiarism occurs in different levels of obfuscation, ranging from the exact copy of original materials to text summarization. Consequently, designed algorithms to detect plagiarism should be robust ...

Added: October 29, 2020

Evaluation of Vector Transformations for Russian Word2Vec and FastText Embeddings

Korogodina O., Karpik O., Klyshinsky E., in: GraphiCon 2020 - Proceedings of the 30th International Conference on Computer Graphics and Machine Vision. St. Petersburg: CEUR-WS, 2020.

The authors of Word2Vec claimed that their technology could solve the word analogy problem using vector transformations in the introduced vector space. However, practice demonstrates that this is not always true. In this paper, we investigate several Word2Vec and FastText models trained for the Russian language and identify the reasons for this inconsistency. We ...

Added: October 21, 2020

Word2vec not dead: predicting hypernyms of co-hyponyms is better than reading definitions

Arefyev N. V., Fedoseev M., Kabanov A. et al., in: Computational Linguistics and Intellectual Technologies: Proceedings of the Annual International Conference "Dialogue" (Moscow, June 17–20, 2020). Issue 19(26): supplementary volume. 2020. P. 13–32.

Expert-built lexical resources are known to provide information of good quality at the cost of low coverage. This property limits their applicability in modern NLP applications. Building descriptions of lexical-semantic relations manually in sufficient volume requires a huge amount of qualified human labour. However, given that some initial version of a taxonomy is already built, automatic ...

Added: October 9, 2020

How much does a word weight? Weighting word embeddings for word sense induction

Arefyev N., Ermolaev P., Panchenko A., in: Computational Linguistics and Intellectual Technologies. International Conference "Dialogue 2018" Proceedings. M.: Conference Proceedings Editorial board, 2018. P. 68–84.

The paper describes our participation in the first shared task on word sense induction and disambiguation for the Russian language RUSSE'2018 [Panchenko et al., 2018]. For each of several dozen ambiguous words, the participants were asked to group text fragments containing it according to the senses of this word, which were not provided beforehand, ...

Added: October 9, 2020

Word Embedding for Semantically Related Words: An Experimental Study

Karyaeva M., Braslavski P., Sokolov V., Automatic Control and Computer Sciences 2019 Vol. 53 P. 638–643

The ability to identify semantic relations between words has made a word2vec model widely used in NLP tasks. The idea of word2vec is based on a simple rule that a higher similarity can be reached if two words have a similar context. Each word can be represented as a vector, so the closest coordinates of vectors can be interpreted ...

Added: April 10, 2020