Toolken+: Improving LLM Tool Usage with Reranking and a Reject Option

K. Yakovlev; Nikolenko S.; Bout A.

?

Toolken+: Improving LLM Tool Usage with Reranking and a Reject Option

P. 5967–5974.

Yakovlev K., Nikolenko S., Bout A.

The recently proposed ToolkenGPT tool learning paradigm demonstrates promising performance but suffers from two major issues: first, it cannot benefit from tool documentation, and second, it often makes mistakes in whether to use a tool at all. We introduce Toolken+ that mitigates the first problem by reranking top-k tools selected by ToolkenGPT and the second problem with a special REJECT option such that the model will generate a vocabulary token if REJECT is ranked first. We demonstrate the effectiveness of Toolken+ on multistep numerical reasoning and tool selection tasks.

Language: English

Full text

Keywords: Natural Language Processing (NLP)large language model (LLM)

In book

Findings of the Association for Computational Linguistics: EMNLP 2024

Association for Computational Linguistics, 2024.

Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной международной конференции «Диалог». Выпуск 24.

M.: Max press, 2026.

The volume includes 64 papers from the international conference on computational linguistics and intelligent technologies 'Dialogue 2026,' representing a broad spectrum of theoretical and applied research in the field of natural language description, language process modeling, and the development of practically applicable computational linguistic technologies. For specialists in theoretical and applied linguistics and intelligent technologies. ...

Added: June 27, 2026

Benchmarking DNA large language models on quadruplexes

Cherednichenko O., Herbert A., Poptsova M., Computational and Structural Biotechnology Journal 2025 Vol. 27 P. 992–1000

Large language models (LLMs) in genomics have successfully predicted various functional genomic elements. While their performance is typically evaluated using genomic benchmark datasets, it remains unclear which LLM is best suited for specific downstream tasks, particularly for generating whole-genome annotations. Current LLMs in genomics fall into three main categories: transformer-based models, long convolution-based models, and state-space models ...

Added: June 19, 2026

Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation

Severin N., Kartushov D., Urzhumov V. et al., , in: Advances in Information Retrieval: 48th European Conference on Information Retrieval, ECIR 2026, Delft, The Netherlands, March 29 – April 2, 2026, Proceedings, Part II. (LNCS, volume 16484).: Cham: Springer Publishing Company, 2026. P. 508–517.

Sequential recommender systems have achieved significant success in modeling temporal user behavior but remain limited in cap-turing rich user semantics beyond interaction patterns. Large Language Models (LLMs) present opportunities to enhance user understanding with their reasoning capabilities, yet existing integration approaches cre-ate prohibitive inference costs in real time. To address these limitations, we present a ...

Added: June 18, 2026

ESQA: Event Sequences Question Answering

Abdullaeva I., Karpukhin I., Filatov A. et al., IEEE Access 2026 Vol. 14 P. 59390–59408

Event sequences, a specialized type of tabular data annotated with timestamps, are prevalent across practical domains such as finance, retail, social networks, and healthcare. Despite the importance of event sequence modeling and analysis, there has been little effort to adapt Large Language Models (LLMs) to this domain. In this paper, we propose a novel solution ...

Added: June 16, 2026

Proceedings of the Sixth Workshop on Teaching NLP (TeachNLP 2024)

Association for Computational Linguistics, 2024.

Added: June 14, 2026

Analysis of Images, Social Networks and Texts. AIST 2024

Cham: Springer, 2024.

Added: June 14, 2026

Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)

Association for Computational Linguistics, 2026.

Added: June 13, 2026

LoRA meets Riemannion: Muon Optimizer for Parametrization-independent Low-Rank Adapters

Vladimir Bogachev, Aletov V., Alexander Molozhavenko et al., , in: The Fourteenth International Conference on Learning Representations (ICLR 2026).: ICLR, 2026. Ch. 20503 P. 1–26.

This work presents a novel, fully Riemannian framework for Low-Rank Adaptation (LoRA) that geometrically treats low-rank adapters by optimizing them directly on the fixed-rank manifold. This formulation eliminates the parametrization ambiguity present in standard Euclidean optimizers. Our framework integrates three key components to achieve this: (1) we derive Riemannion, a new Riemannian optimizer on the fixed-rank ...

Added: April 29, 2026

Bridging the Semantic Gap in Metadata Management using Large Language Models

Сулейкин А. С., Сорокина В., Пятецкий В. Е., , in: 2025 7th International Conference on Control Systems, Mathematical Modeling, Automation and Energy Efficiency.: [б.и.], 2025. P. 748–753.

Effective metadata management is fundamental to data governance, ensuring that data assets are discoverable, understandable, and usable across the enterprise. However, traditional metadata systems often remain purely technical, describing structures without conveying business meaning. This disconnect — known as the semantic gap — limits the interpretability and value of metadata for business users. To address ...

Added: April 17, 2026

A textual fingerprint learning model to detect fake information spreaders in social networks

Behzadidoost R., Neurocomputing 2025 Vol. 665 P. 1–21

While earlier research has focused on detecting misinformation content, identifying the users who spread it, referred to in this paper as fake information spreaders, remains a relatively new challenge. These users deliberately mix true and false information, making detection more difficult. This paper proposes a textual fingerprint learning model to detect fake information spreaders. The ...

Added: March 12, 2026

SynEL: A synthetic benchmark for entity linking

Karpov I., Kirillovich A., Goncharova E. et al., Plos One 2026 Vol. 21 No. 1 Article e0339468

Large language models (LLMs) offer significant potential for constructing commonsense knowledge graphs from text, demonstrating adaptability across diverse domains. However, their effectiveness varies significantly with domain-specific language, highlighting a critical need for specialized benchmarks to assess and optimize knowledge graph construction sub-tasks like named entity recognition, relation extraction, and entity linking. Currently, domain-specific benchmarks are ...

Added: January 15, 2026

Разработка и интеграция AI-ассистента в систему управления обучением.

Караваева Е. А., Василевский В. И., Ланин Г. М. et al., Труды Института системного программирования РАН 2025 Т. 37 № 4 С. 175–190

The ongoing digitalization of education requires new ways of presenting information and attention retention mechanisms. The aim of the presented work is to propose a solution for implementing a large language model, which will interactively generate prompts of different types, within an e-learning course on programming. The main approaches are the analysis of existing relatively ...

Added: December 25, 2025

Prediction of protein-protein interactions using point transformer and spherical Convex Hull graphs

David Arteaga, Poptsova M., Computational and Structural Biotechnology Journal 2026 Vol. 31 P. 82–93

Accurate predictions and large-scale identification of protein-protein interactions (PPIs) are crucial for understanding their inherent biological mechanisms and protein functions in virtually all biological processes. Nowadays, graph-based deep learning models have made significant contributions in modeling proteins with physicochemical and geometric features. However, most of these models rely on conventional graph construction methods, such as ...

Added: December 22, 2025

Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence

Washington, United States of America: AAAI Press, 2025.

AAAI-25 Technical Tracks 23 (Natural Language Processing II) collects peer-reviewed research papers that advance the state of natural language processing, with an emphasis on large language models, efficient inference, instruction following, retrieval augmentation, and multimodal language understanding. The papers address both theoretical and practical challenges, including model efficiency, interactive generation, grounding in external knowledge and ...

Added: December 18, 2025

Lacuna Inc. at SemEval-2025 Task 4: LoRA-Enhanced Influence-Based Unlearning for LLMs

Kudelya A., Shirnin A., , in: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025).: Association for Computational Linguistics, 2025. P. 1528–1533.

This paper describes LIBU (LoRA enhanced influence-based unlearning), an algorithm to solve the task of unlearning - removing specific knowledge from a large language model without retraining from scratch and compromising its overall utility (SemEval-2025 Task 4: Unlearning sensitive content from Large Language Models). The algorithm combines classical influence functions to remove the influence of ...

Added: November 17, 2025

Empaths at SemEval-2025 Task 11: Retrieval-Augmented Approach to Perceived Emotions Prediction

Morozov L., Mogilevskii A., Shirnin A., , in: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025).: Association for Computational Linguistics, 2025. P. 2000–2007.

Added: November 17, 2025

Findings of the Association for Computational Linguistics: EMNLP 2025

Association for Computational Linguistics, 2025.

The book contains this year’s edition of the Conference on Empirical Methods in Natural Language Processing! Importantly, it marks the 30th edition of EMNLP. With over 8,000 submissions, more than 3,000 accepted papers, and thousands of attendees, we have come a long way from that first workshop, which had 14 accepted papers. As the field ...

Added: November 16, 2025

3MDBench: Medical Multimodal Multi-agent Dialogue Benchmark

Sviridov I., Miftakhova A., Tereshchenko A. et al., , in: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP).: Association for Computational Linguistics, 2025. Ch. 1353 P. 26625–26665.

Though Large Vision-Language Models (LVLMs) are being actively explored in medicine, their ability to conduct complex real-world telemedicine consultations combining accurate diagnosis with professional dialogue remains underexplored. This paper presents 3MDBench (Medical Multimodal Multi-agent Dialogue Benchmark), an open-source framework for simulating and evaluating LVLM-driven telemedical consultations. 3MDBench simulates patient variability through temperament-based Patient Agent and evaluates diagnostic accuracy and dialogue quality ...

Added: November 16, 2025

Findings of the Association for Computational Linguistics: NAACL 2025

Association for Computational Linguistics, 2025.

Added: November 6, 2025

Explainable Document Classification via Concept Whitening and Stable Graph Patterns

Parakal E. G., Kuznetsov S., Makarov I. et al., IEEE Access 2025 Vol. 13 P. 149657–149678

This paper proposes a novel explainable document classification framework that integrates Concept Whitening (CW) with graph concepts that are derived from stable graph patterns, and extracted via methods based on Formal Concept Analysis (FCA) and pattern structures. Document graphs are constructed using Abstract Meaning Representation (AMR) graphs, from which graph concepts are extracted and aligned ...

Added: October 22, 2025

Building a Clean Bartangi Language Corpus and Training Word Embeddings for Low-Resource Language Modeling

Shumen: INCOMA Ltd, 2025.

This paper introduces a rule-based lemmatization and word embedding pipeline for the endangered Bartangi language, part of the Pamiri language group. The system combines a manually constructed lemma dictionary with morphological suffix rules to improve linguistic consistency in low-resource settings. The results demonstrate enhanced lemmatization accuracy and higher-quality embeddings for downstream NLP tasks. The work ...

Added: October 20, 2025

Исследования благополучия с помощью передовых методов обработки естественного языка (NLP): перспективы и ограничения

Voevodina E., Современная зарубежная психология 2025 Т. 14 № 3 С. 172–181

Context and relevance. Well-being research faces methodological limitations of conventional psychometric measures, criticized for poor ecological validity, limited information yield, and inadequate capture of multidimensional construct of well-being. Advanced natural language processing (NLP) technologies offer solutions to these constraints. Objective. To evaluate opportunities and challenges of transformer-based NLP for well-being research. Methods and materials. We conducted an analytical review of ...

Added: October 9, 2025

CLEF 2025 Working Notes

CEUR Workshop Proceedings, 2025.

Added: October 6, 2025