
 



Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models

Alina Shutova, Vladimir Malinovskii, Vage Egiazarian, Denis Kuznedelev, Ivan Ermakov
Language: English
Keywords: large language models

In book

Volume 267: International Conference on Machine Learning, 13-19 July 2025, Vancouver Convention Center, Vancouver, Canada
Vol. 267. [s.n.], 2025.
Similar publications
Personalized AI-Based Feedback: A Model for Humanities Master's Programs
Подболотова М. И., Адамский А. И., Kolachev N. et al., Высшее образование в России 2026 Vol. 35 No. 4 P. 21–35
The purpose of the article is to present and justify a pedagogical model of personalized feedback based on large language models (LLM) for the educational process in a humanities-oriented master’s program. The relevance of the study is determined by the objectives of digital transformation of higher education in the Russian Federation, outlined in Presidential Decree No. 474 ...
Added: May 4, 2026
Applying Large Language Models to Analyze the Value-Laden and Patriotic Discourse of Russian-Speaking Users
Balakina Y. V., Григорьева М. В., Соколова Е. Н., Вестник Российского фонда фундаментальных исследований. Гуманитарные и общественные науки 2025 Vol. 123 No. 4 P. 56–69
The article examines the potential of large language models (LLMs) for automated analysis of value-laden and patriotic discourse in Russian-language social media. Using a corpus of posts from VK, Odnoklassniki and Telegram (2023–2025), it investigates the extent to which automatic coding results align with expert annotation based on a specially developed categorical scheme. The codebook ...
Added: November 26, 2025
New Interfaces and New Mediators
Maksimenkova O. V., Сегал А. П., Вопросы философии 2025 No. 10 P. 67–76
The study is devoted to the interaction between humans and artificial intelligence (AI). The authors view this interaction as mediated by interfaces that both simplify it and hide the real mechanisms of encoding and decoding messages (according to Shannon). In such a situation, the characteristics of the actor of communication are blurred, and it is not ...
Added: October 2, 2025
Artificial Neural Networks and Machine Learning. ICANN 2025 International Workshops and Special Sessions: 34th International Conference on Artificial Neural Networks, Kaunas, Lithuania, September 9–12, 2025, Proceedings, Part V
Cham: Springer, 2025.
This book constitutes the refereed proceedings of the international workshops held in conjunction with the 34th International Conference on Artificial Neural Networks and Machine Learning, ICANN 2025, in Kaunas, Lithuania, September 9–12, 2025. The 20 full papers and 8 abstracts included in this workshop volume were carefully reviewed and selected from 42 submissions. ...
Added: September 29, 2025
Rewriting the Rules: LLMs Vs. Traditional ML in University Admissions
Chepikov I., Karpov I., in: 26th International Conference, AIED 2025, Palermo, Italy, July 22–26, 2025, Proceedings, Part I. Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium, Blue Sky, and WideAIED.: Springer, 2025. P. 352–358.
Modern LLMs such as BERT, ChatGPT, and DeepSeek have shown great potential in solving various tasks, including text classification, text generation, and the analysis and summarization of documents. In this paper, we show that these models are close to classical ML approaches based on decision trees not only in text processing, but also in processing classical tabular data ...
Added: September 4, 2025
Handwritten Text Recognition and Intelligent Analysis: The Potential of Neural Technologies (The Case of F. P. Litke's "Diary")
Boltunova E., Laptev A., Имагология и компаративистика 2025 No. 23 P. 358–379
Added: June 16, 2025
Assessing Student Papers in English Academic Writing Instruction in the Context of Developing Artificial Intelligence Tools
Bakulev A., in: The Professionalism of the Foreign Language Teacher and Its Realization. Collected articles from the research and methodology symposium with international participation "Lempert Readings XXVII", May 15-17, 2025.: Pyatigorsk: Pyatigorsk State University Publishing House, 2025. P. 270–279.
The paper focuses on assessing students’ written papers in the discipline “Academic Writing in English” in the context of AI tools’ capabilities. AI tools, specifically large language models (LLMs), appear to be able to tackle and solve a wide range of educational and research tasks. Foreign language teaching is no exception: AI tools are utilized ...
Added: June 5, 2025
Generative AI-based Approach to Concept Drift Generation in Streaming Text Data
Belov B., Peter Panfilov, WSEAS Transactions on Information Science and Applications 2025 Vol. 22 P. 11–20
Real-time analysis of text streams is crucial for industrial and business processes and scenarios. It is expected to be one of the important future research topics in the text processing and understanding domain. Analysis of text data is based on the use of pre-trained machine learning/data mining (ML/DM) models that may demonstrate performance degradation over ...
Added: April 5, 2025
The Role of Large Language Models in Next-Generation Integrated Development Environments
Ишанхонов А. Ю., Pshichenko D., Можаровский Е. А. et al., Программные системы и вычислительные методы 2024 No. 4 P. 140–150
The article examines the role of large language models (LLMs) in new-generation integrated development environments (IDEs). Tools such as GitHub Copilot, IntelliCode and Alice Code Assistant are explored in the context of their use in programming. The authors examine how LLMs enable the automation of key development tasks, including code autocompletion, error detection, refactoring, and code generation, ...
Added: March 10, 2025
Ensuring trustworthy code: leveraging a static analyzer to identify and mitigate defects in generated code
D. Shaikhelislamov, Drobyshevskiy M., A. Belevantsev, Journal of Mathematical Sciences 2024 Vol. 540 P. 233–251
The rise of large language models (LLMs) has greatly advanced code generation capabilities. A recent StackOverflow survey found that 70% of developers are using or planning to use AI coding tools this year. However, most current methods focus on supervised fine-tuning objectives derived from text generation, often overlooking the distinct sequence-level properties of code, such as compilability, and ...
Added: February 3, 2025
Organizational Knowledge Management and Large Language Models
Zelenkov Y., Российский журнал менеджмента 2024 Vol. 22 No. 3 P. 573–601
Purpose: to summarize, classify and analyze current scientific publications on the use of large language models (LLM) in knowledge management in organizations. Methodology: a systematic literature review was conducted, based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) framework. 75 publications were selected for analysis, including academic articles and reports of consulting companies published ...
Added: January 4, 2025
ChatGPT, Text, Information: A Critical Analysis
Komashko M. N., Труды по интеллектуальной собственности 2024 Vol. 50 No. 3 P. 118–128
The paper deals with theoretical and practical issues related to large language models as a type of artificial intelligence, in particular ChatGPT. The main attention is paid to spheres of human activity in which the exchange of information stated in the form of text is of the greatest importance: science, education and journalism (media sphere). The ...
Added: December 29, 2024
DAREL: Data Reduction with Losses for Training Acceleration of Real and Hypercomplex Neural Networks
Demidovskij A., Трутнев А. И., Тугарев А. М. et al., Series "NeurIPS 2023 Workshop". 2023.
Neural network training requires a lot of resources, and there are situations where training time and memory usage are limited. It makes specialized algorithms for training neural networks within the constraints of resource limitations an important and significant challenge. Data Reduction with Losses is a novel training data reduction method that operates with training samples ...
Added: January 17, 2024