The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models

P. 868–874.
Razzhigaev A., Mikhalchuk M., Goncharova E., Oseledets I., Dimitrov D. V., Kuznetsov A.
Language: English
Keywords: LLM, Transformers, Intrinsic dimension, Anisotropy

In book

Findings of the Association for Computational Linguistics: EACL 2024
Association for Computational Linguistics, 2024.
Similar publications
Proceedings of the 4th Workshop on NLP for Music and Audio (NLP4MusA 2026)
Buzaev F., Mullakhmetov R., Bogachev R. et al., Association for Computational Linguistics, 2026.
Playlist generation based on textual queries using large language models (LLMs) is becoming an important interaction paradigm for music streaming platforms. User queries span a wide spectrum from highly personalized intent to essentially catalog-style requests. Existing systems typically rely on non-personalized retrieval/ranking or apply a fixed level of preference conditioning to every query, which can ...
Added: June 22, 2026
B3Emo: Quantifying Affect as a Double-Edged Sword in Strategic LLM Interactions
Stepin A., Mozikov M., Kabanov A. et al., IEEE Access 2026 Vol. 14 P. 48127–48144
The deployment of large language models (LLMs) in interactive roles such as automated negotiators, customer service agents, and strategic partners requires them to handle not only logical tasks but also the socio-emotional dimensions of interaction. In these situations, success often relies on understanding social cues, building trust, and using persuasion effectively. These skills are closely ...
Added: June 16, 2026
Analysis of Cultural References in the Works of A. Voznesensky: A Digital Study of Personal Names
Tyuryakova-Matveeva D., Цифровые гуманитарные исследования 2026 No. 1 P. 4–26
The article explores cultural references in the works of Andrei Voznesensky by analyzing the personalities he mentions. A total of 1,678 works were processed, including poetry, prose, and early unpublished poems. NER methods based on Natasha, spaCy, and LLM Grok tools made it possible to study the frequency of mentions of famous people and their ...
Added: May 31, 2026
Optimizing Computational Infrastructure for Large Language Models in Bioinformatics: A Case Study
Beknazarov N., in: Parallel Computational Technologies, 19th International Conference, PCT 2025, Moscow, Russia, April 8–10, 2025, Revised Selected Papers (CCIS, volume 2891). Vol. 2891. Springer, 2026. P. 3–16.
This paper addresses the challenge of efficiently training Large Language Models (LLMs) on large-scale, sparse omics datasets in high-performance computing (HPC) environments. Using over 1000 BED tracks as a representative data source, we propose a method combining interval-based chunked storage, sparse matrix transformation, and parallel data loading, integrated within a PyTorch Lightning training framework. Our ...
Added: May 19, 2026
Efficient Incorporation of New Interactions in Graph Recommenders via Folding-In
Yusupov V., Sukhorukov N., Frolov E., User Modeling and User-Adapted Interaction 2026 Vol. 36 Article 2
Graph-based recommender systems have emerged as a powerful paradigm for personalized recommendations. However, their reliance on full model retraining to incorporate new users or new interactions creates scalability barriers. The task becomes infeasible in real-life recommender systems due to excessive time and resource costs involved. To address this limitation, we propose a fast and efficient ...
Added: March 15, 2026
Efficient Incorporation of New Interactions in Graph Recommenders via Folding-In
Yusupov V., Sukhorukov N., Frolov E., User Modeling and User-Adapted Interaction 2025 P. 1–24
Graph-based recommender systems have emerged as a powerful paradigm for personalized recommendations. However, their reliance on full model retraining to incorporate new users or new interactions creates scalability barriers. The task becomes infeasible in real-life recommender systems due to excessive time and resource costs involved. To address this limitation, we propose a fast and efficient ...
Added: March 14, 2026
When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs
Seleznyov M., Chaichuk M., Ershov G. et al., in: Findings of the Association for Computational Linguistics: EMNLP 2025. Association for Computational Linguistics, 2025. P. 20370–20385.
Large Language Models (LLMs) are highly sensitive to subtle, non-semantic variations in prompt phrasing and formatting. In this work, we present the first systematic evaluation of 4 methods for improving prompt robustness within a unified experimental framework. We benchmark these techniques on 8 models from Llama, Qwen and Gemma families across 52 tasks from Natural ...
Added: February 3, 2026
Measuring Chemical LLM robustness to molecular representations: a SMILES variation-based framework
Ganeeva V., Khrabrov K., Kadurin A. et al., Journal of Cheminformatics 2025 No. 17 Article 164
The recent integration of natural language processing into chemistry has advanced drug discovery. Molecule representations in language models (LMs) are crucial to enhance chemical understanding. We explored the ability of models to match the same chemical structures despite their different representations. Recognizing the same substance in different representations is an important component of emulating the ...
Added: February 3, 2026
Efficient Incorporation of New Interactions in Graph Recommenders via Folding-In
Yusupov V., Sukhorukov N., Frolov E., in: User Modeling and User-Adapted Interaction. Springer, 2026. Ch. 36.2 P. 1–24.
Graph-based recommender systems have emerged as a powerful paradigm for personalized recommendations. However, their reliance on full model retraining to incorporate new users or new interactions creates scalability barriers. The task becomes infeasible in real-life recommender systems due to excessive time and resource costs involved. To address this limitation, we propose a fast and efficient ...
Added: January 29, 2026
Autoregressive generation strategies for Top-K sequential recommendations
Anna Volodkevich, Danil Gusak, Klenitskiy A. et al., User Modeling and User-Adapted Interaction 2025 No. 35 Article 13
The goal of modern sequential recommender systems is often formulated in terms of next-item prediction. In this paper, we explore the applicability of transformer-based generative models for the Top-K sequential recommendation task, where the goal is to predict items that a user is likely to interact with in the “near future.” This goal aligns with ...
Added: January 26, 2026
Diagnosis of the Severity of Depression Using Speech Recording Analysis
Sherman K., Ignatov D. I., Tatiana I. Shishkovskaya et al., in: Analysis of Images, Social Networks and Texts, 12th International Conference, AIST 2024, Bishkek, Kyrgyzstan, October 17–19, 2024, Revised Selected Papers. Vol. 15419. Springer, 2024. P. 94–108.
More than 3% of people worldwide experience depression. This diagnosis is established through interviews and clinical observations, which is a time- and money-demanding process. Additionally, there are a variety of symptoms associated with depression that are difficult to capture due to the limited capabilities of a human being. Many studies propose methods of automatic mental ...
Added: January 23, 2026
Aspect-Based Sentiment Analysis Using Large Language Models on Museum Visitor Reviews
Anastasia V. Kolmogorova, Elizaveta R. Kulikova, Vladislav V. Lobanov, Supercomputing Frontiers and Innovations 2025 Vol. 12 No. 3 P. 121–140
Museum reviews provide rich insight into visitor preferences and can drive useful change within institutions, yet they have attracted little attention in sentiment research owing to limited commercial interest and the multi-thematic nature of reviews. In this study we analysed over 12 000 reviews in Russian for 15 museum sites collected from nine different platforms. ...
Added: November 30, 2025
AutoJudge: Judge Decoding Without Manual Annotation
Roman Garipov, Fedor Velikonivtsev, Ivan Ermakov et al., in: 39th Conference on Neural Information Processing Systems (NeurIPS 2025). NeurIPS, 2025. P. 94605–94642.
We introduce AutoJudge, a method that accelerates large language model (LLM) inference with task-specific lossy speculative decoding. Instead of matching the original model output distribution token-by-token, we identify the generated tokens that affect the downstream quality of the response, relaxing the distribution match guarantee so that the "unimportant" tokens can be generated faster. Our approach relies ...
Added: November 6, 2025
Strategizing with AI: Insights from a Beauty Contest Experiment
Iuliia Alekseenko, Dagaev D., Sofiia Paklina et al., Journal of Economic Behavior and Organization 2025 Vol. 240 Article 107330
Added: November 6, 2025
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
Anton R., Mikhalchuk M., Rahmatullaev T. et al., in: Findings of the Association for Computational Linguistics: NAACL 2025. Association for Computational Linguistics, 2025. P. 7757–7764.
We introduce methods to quantify how Large Language Models (LLMs) encode and store contextual information, revealing that tokens often seen as minor (e.g., determiners, punctuation) carry surprisingly high context. Notably, removing these tokens — especially stopwords, articles, and commas — consistently degrades performance on MMLU and BABILong-4k, even if removing only irrelevant tokens. Our analysis ...
Added: November 6, 2025
Well-Being Research Using Advanced Natural Language Processing (NLP) Methods: Prospects and Limitations
Voevodina E., Современная зарубежная психология 2025 Vol. 14 No. 3 P. 172–181
Context and relevance. Well-being research faces methodological limitations of conventional psychometric measures, criticized for poor ecological validity, limited information yield, and inadequate capture of multidimensional construct of well-being. Advanced natural language processing (NLP) technologies offer solutions to these constraints. Objective. To evaluate opportunities and challenges of transformer-based NLP for well-being research. Methods and materials. We conducted an analytical review of ...
Added: October 9, 2025
Evaluating LLMs by Their Readiness to Solve ESG Management Tasks
Storchevoy M., Mylnikov L., Чернышев В. В. et al., SSRN. "Working Papers" series. 2025.
Attention to nature conservation is becoming increasingly important for business, on the one hand because environmental legislation is tightening, and on the other because ESG ratings are used in decisions about companies' commercial activity. Compiling a rating of LLM systems capable of providing advisory services in environmental protection and ESG makes it possible to select such a system for ...
Added: September 18, 2025
The Digital Theatre of the Absurd: Can Neural Networks Pose a New Scientific Problem for Psychology? A Case Comparison of ChatGPT and DeepSeek
Хашутогова У. П., Berezner T., Poddiakov A., Новые психологические исследования 2025 No. 3 P. 100–125
The rapid advancement of artificial intelligence technologies has drawn increasing attention from psychological researchers. While neural networks are being integrated into nearly all domains of human activity, the boundaries of their applicability remain unclear — particularly regarding the originality and practical value of the content they generate. Proponents advocate for their widespread adoption, whereas skeptics ...
Added: September 4, 2025
Interpreting Metaphorical Language: A Challenge to Artificial Intelligence
Skrynnikova I.V., Вестник Волгоградского государственного университета. Серия 2: Языкознание 2025 Vol. 23 No. 5 P. 99–107
In recent years, numerous studies have pointed to the ability of artificial intelligence (AI) to generate and analyze expressions of natural language. However, the question of whether AI is capable of actually interpreting human language, rather than imitating its understanding, remains open. Metaphors, being an integral part of human language, as both a common figure ...
Added: August 1, 2025
Comparative Study of LoRA and Full Fine-Tuning in Large Language Models
E.V. Surikova, E.A. Sabidaeva, in: Parallel Computational Technologies – XIX All-Russian Conference with International Participation, ПаВТ'2025, Moscow, April 8–10, 2025. Short Papers and Poster Descriptions. Chelyabinsk: Издательский центр ЮУрГУ, 2025. P. 90–98.
Added: July 3, 2025
HR-Tech Automation: A Case Study of Resume Design using GenAI Technologies
Suleykin, A., Babenko, R., Panfilov, P., in: Proceedings of the 35th International DAAAM Virtual Symposium "Intelligent Manufacturing & Automation". Vol. 1. NY: DAAAM International Vienna, 2024. Ch. 20 P. 0157–0164.
Added: April 5, 2025
OmniDialog: A Multimodal Benchmark for Generalization Across Text, Visual, and Audio Modalities
Razzhigaev A., Kurkin M., Goncharova E. et al., in: Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP. Association for Computational Linguistics, 2024. P. 183–195.
We introduce OmniDialog — the first trimodal comprehensive benchmark grounded in a knowledge graph (Wikidata) to evaluate the generalization of Large Multimodal Models (LMMs) across three modalities. Our benchmark consists of more than 4,000 dialogues, each averaging 10 turns, all annotated and cross-validated by human experts. The dialogues in our dataset are designed to prevent ...
Added: February 21, 2025