LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers

Anton R.; Mikhalchuk M.; Rahmatullaev T.; Goncharova E.; Druzhinina P.; Oseledets I.; Kuznetsov A.

doi:10.18653/v1/2025.findings-naacl.432

P. 7757–7764.

We introduce methods to quantify how Large Language Models (LLMs) encode and store contextual information, revealing that tokens often seen as minor (e.g., determiners, punctuation) carry surprisingly high context. Notably, removing these tokens (especially stopwords, articles, and commas) consistently degrades performance on MMLU and BABILong-4k, even when only irrelevant tokens are removed. Our analysis also shows a strong correlation between contextualization and linearity, where linearity measures how closely the transformation from one layer’s embeddings to the next can be approximated by a single linear mapping. These findings underscore the hidden importance of “filler” tokens in maintaining context. For further exploration, we present LLM-Microscope, an open-source toolkit that assesses token-level nonlinearity, evaluates contextual memory, visualizes intermediate layer contributions (via an adapted Logit Lens), and measures the intrinsic dimensionality of representations. This toolkit illuminates how seemingly trivial tokens can be critical for long-range understanding.
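The notion of linearity described in the abstract can be illustrated with a minimal sketch. It assumes that linearity is scored as one minus the normalized least-squares residual of fitting a single linear map between two centered sets of embeddings; the paper's exact normalization and the LLM-Microscope API may differ. The function name `linearity_score` and the synthetic "layers" below are hypothetical, chosen only for illustration.

```python
import numpy as np


def linearity_score(x: np.ndarray, y: np.ndarray) -> float:
    """Score how well y is approximated by a single linear map of x.

    x, y: arrays of shape (n_tokens, dim) holding embeddings from two
    consecutive layers. Returns a value in [0, 1]; 1 means perfectly linear.
    This normalization is an assumption, not the paper's definition.
    """
    # Center both sets so the fit does not need a bias term.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Best linear map in the least-squares sense: minimize ||x @ a - y||.
    a, *_ = np.linalg.lstsq(x, y, rcond=None)
    residual = np.linalg.norm(x @ a - y) ** 2
    total = np.linalg.norm(y) ** 2
    return float(1.0 - residual / total)


# Synthetic stand-ins for layer embeddings (not real model activations).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 16))
w = rng.normal(size=(16, 16))

linear_next = x @ w                      # exactly linear "next layer"
nonlinear_next = np.tanh(3.0 * (x @ w))  # saturating, nonlinear "next layer"

score_linear = linearity_score(x, linear_next)
score_nonlinear = linearity_score(x, nonlinear_next)
print(f"linear: {score_linear:.4f}, nonlinear: {score_nonlinear:.4f}")
```

In practice, `x` and `y` would be hidden states taken from two adjacent transformer layers for the same tokens; a token-level variant would inspect the per-row residuals rather than the aggregate.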

Language: English

Keywords: NLP, interpretability, LLM, large language models, natural language processing

In book

Findings of the Association for Computational Linguistics: NAACL 2025

Association for Computational Linguistics, 2025.

Optimizing Computational Infrastructure for Large Language Models in Bioinformatics: A Case Study

Beknazarov N., in: Parallel Computational Technologies, 19th International Conference, PCT 2025, Moscow, Russia, April 8–10, 2025, Revised Selected Papers (CCIS, volume 2891). Springer, 2026. P. 3–16.

This paper addresses the challenge of efficiently training Large Language Models (LLMs) on large-scale, sparse omics datasets in high-performance computing (HPC) environments. Using over 1000 BED tracks as a representative data source, we propose a method combining interval-based chunked storage, sparse matrix transformation, and parallel data loading, integrated within a PyTorch Lightning training framework. Our ...

Added: May 19, 2026

From the Unknown to Transparency: A Review of Explainable AI (XAI) Technologies

Avdoshin S. M., Pesotskaya E. Y., Информационные технологии 2026 Vol. 32 No. 4 P. 185–194

With the rapid advancement of artificial intelligence, and deep learning in particular, models have emerged that are capable of delivering highly accurate predictions. However, the internal logic of such models remains difficult to interpret—an issue of critical importance, especially in domains where the correctness of an algorithm directly affects high-stakes decision-making. One promising avenue for ...

Added: May 8, 2026

AI-Based Personalized Feedback: A Model for Humanities Master's Programs

Подболотова М. И., Адамский А. И., Kolachev N. et al., Высшее образование в России 2026 Vol. 35 No. 4 P. 21–35

The purpose of the article is to present and justify a pedagogical model of personalized feedback based on large language models (LLM) for the educational process in a humanities-oriented master’s program. The relevance of the study is determined by the objectives of digital transformation of higher education in the Russian Federation, outlined in Presidential Decree No. 474 ...

Added: May 4, 2026

On the Ideological Biases of Generative AI: The Russian-Ukrainian Conflict as Represented by ChatGPT

Baysha O., Trofimov V., Российская школа связей с общественностью 2026 No. 40 P. 171–191

A growing number of scholars are warning about the dangers of the reproduction by generative AI of socio-political and ideological biases absorbed by models from the texts on which they were trained. If a given model was trained on Western media texts, it may generate narratives that reproduce West-centric views of world events. This ...

Added: April 21, 2026

Large Language Models as Political Actors: Cultural Bias and Epistemic Power

Seredkina E., Seletkova G., Mikhailovsky A., Technology and Language 2026 Vol. 7 No. 1 P. 63–79

The rapid diffusion of Large Language Models (LLMs) into socially and politically sensitive domains raises critical questions about the nature and origins of political bias in artificial intelligence. While existing research often treats bias as a technical flaw to be minimized, this article advances a broader philosophical and cultural interpretation of LLM bias as an ...

Added: April 1, 2026

Granular computing-based deep learning for text classification

Behzadidoost R., Mahan F., Izadkhah H., Information Sciences 2024 Vol. 652 Article 119746

Granular computing involves a comprehensive process that encompasses theories, methodologies, and techniques to solve complex problems, rather than being just an algorithm. As the volume of generated data continues to grow rapidly, data-driven problems have become increasingly complex. Although deep learning models have outperformed traditional machine learning models in solving complex problems, there is still room for enhancing their performance. ...

Added: March 12, 2026

Mechanistic Permutability: Match Features Across Layers

Balagansky N., Maximov I., Gavrilov D., in: Proceedings of the 13th International Conference on Learning Representations (ICLR 2025). ICLR, 2025. P. 57940–57957.

Understanding how features evolve across layers in deep neural networks is a fundamental challenge in mechanistic interpretability, particularly due to polysemanticity and feature superposition. While Sparse Autoencoders (SAEs) have been used to extract interpretable features from individual layers, aligning these features across layers has remained an open problem. In this paper, we introduce SAE Match, ...

Added: February 25, 2026

When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs

Seleznyov M., Chaichuk M., Ershov G. et al., in: Findings of the Association for Computational Linguistics: EMNLP 2025. Association for Computational Linguistics, 2025. P. 20370–20385.

Large Language Models (LLMs) are highly sensitive to subtle, non-semantic variations in prompt phrasing and formatting. In this work, we present the first systematic evaluation of 4 methods for improving prompt robustness within a unified experimental framework. We benchmark these techniques on 8 models from Llama, Qwen and Gemma families across 52 tasks from Natural ...

Added: February 3, 2026

30th International Conference on Applications of Natural Language to Information Systems, NLDB 2025, Kanazawa, Japan, July 4–6, 2025, Proceedings, Part I. Natural Language Processing and Information Systems. (LNCS, volume 15836)

Springer, 2025.

The two-volume set LNCS 15836 and 15837 constitutes the proceedings of the 30th International Conference on Applications of Natural Language to Information Systems, NLDB 2025, held in Kanazawa, Japan, during July 4–6, 2025. The 33 full papers, 19 short papers and 2 demo papers presented in this volume were carefully reviewed and selected from 120 submissions. ...

Added: February 3, 2026

Measuring Chemical LLM robustness to molecular representations: a SMILES variation-based framework

Ganeeva V., Khrabrov K., Kadurin A. et al., Journal of Cheminformatics 2025 No. 17 Article 164

The recent integration of natural language processing into chemistry has advanced drug discovery. Molecule representations in language models (LMs) are crucial to enhance chemical understanding. We explored the ability of models to match the same chemical structures despite their different representations. Recognizing the same substance in different representations is an important component of emulating the ...

Added: February 3, 2026

Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

INCOMA Ltd, 2021.

Added: January 28, 2026

A Multi-Aspect Evaluation of Tokenizer Adaptation Methods for Large Language Models in Russian

Андрющенко Г. Д., Godunova M., Иванов В. В. et al., Доклады Российской академии наук. Математика, информатика, процессы управления (formerly Доклады Академии Наук. Математика) 2025 Vol. 527 P. 320–331

Large language models (LLMs) pretrained on English-centered corpora have biases and perform sub-optimally on other natural languages. Adapting an LLM's vocabulary provides a resource-efficient way to improve the quality of a pretrained model. Previously proposed adaptation techniques focus on performance (accuracy) and size metrics (fertility), ignoring other aspects in comparison, such as inference latency, compute ...

Added: January 15, 2026

Aspect-Based Sentiment Analysis Using Large Language Models on Museum Visitor Reviews

Anastasia V. Kolmogorova, Elizaveta R. Kulikova, Vladislav V. Lobanov, Supercomputing Frontiers and Innovations 2025 Vol. 12 No. 3 P. 121–140

Museum reviews provide rich insight into visitor preferences and can drive useful change within institutions, yet they have attracted little attention in sentiment research owing to limited commercial interest and the multi-thematic nature of reviews. In this study we analysed over 12 000 reviews in Russian for 15 museum sites collected from nine different platforms. ...

Added: November 30, 2025

Applying Large Language Models to the Analysis of the Value-Based and Patriotic Discourse of Russian-Speaking Users

Balakina Y. V., Григорьева М. В., Соколова Е. Н., Вестник Российского фонда фундаментальных исследований. Гуманитарные и общественные науки 2025 Vol. 123 No. 4 P. 56–69

The article examines the potential of large language models (LLMs) for automated analysis of value-laden and patriotic discourse in Russian-language social media. Using a corpus of posts from VK, Odnoklassniki and Telegram (2023–2025), it investigates the extent to which automatic coding results align with expert annotation based on a specially developed categorical scheme. The codebook ...

Added: November 26, 2025

Empaths at SemEval-2025 Task 11: Retrieval-Augmented Approach to Perceived Emotions Prediction

Morozov L., Mogilevskii A., Shirnin A., in: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025). Association for Computational Linguistics, 2025. P. 2000–2007.

This paper describes LIBU (LoRA enhanced influence-based unlearning), an algorithm to solve the task of unlearning - removing specific knowledge from a large language model without retraining from scratch and compromising its overall utility (SemEval-2025 Task 4: Unlearning sensitive content from Large Language Models). The algorithm combines classical influence functions to remove the influence of ...

Added: November 17, 2025

Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

Association for Computational Linguistics, 2025.

Added: November 17, 2025

AutoJudge: Judge Decoding Without Manual Annotation

Roman Garipov, Fedor Velikonivtsev, Ivan Ermakov et al., in: 39th Conference on Neural Information Processing Systems (NeurIPS 2025). NeurIPS, 2025. P. 94605–94642.

We introduce AutoJudge, a method that accelerates large language model (LLM) inference with task-specific lossy speculative decoding. Instead of matching the original model output distribution token-by-token, we identify the generated tokens that affect the downstream quality of the response, relaxing the distribution match guarantee so that the "unimportant" tokens can be generated faster. Our approach relies ...

Added: November 6, 2025

Strategizing with AI: Insights from a Beauty Contest Experiment

Iuliia Alekseenko, Dagaev D., Sofiia Paklina et al., Journal of Economic Behavior and Organization 2025 Vol. 240 Article 107330

Added: November 6, 2025

Findings of the Association for Computational Linguistics: NAACL 2025

Association for Computational Linguistics, 2025.

Added: November 6, 2025