Disk storage management for LHCb based on Data Popularity estimator

Hushchyn M.; Charpentier P.; Ustyuzhanin A.

doi:10.1088/1742-6596/664/8/082025

Journal of Physics: Conference Series. 2015. Vol. 664.

This paper presents a system that provides recommendations for optimizing LHCb data storage. The LHCb data storage system is a hybrid: all datasets are kept as archives on magnetic tape, and the most popular datasets are kept on disk. The recommendation system uses the dataset usage history and metadata (size, type, configuration, etc.) to generate a recommendation report. In this article we present how we use machine learning algorithms to predict future data popularity. These predictions make it possible to estimate which datasets should be removed from disk. We use regression algorithms and time series analysis to find the optimal number of replicas for datasets that are kept on disk. Based on the data popularity and the optimized number of replicas, the recommendation system minimizes a loss function to find the optimal data distribution. The loss function represents all requirements for data distribution in the data storage system. We demonstrate how the recommendation system helps to save disk space and to reduce waiting times for jobs that use this data.
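
The pipeline described in the abstract (predict popularity, then pick a replica count by minimizing a loss) can be illustrated with a minimal sketch. The abstract does not give the actual models or the loss function, so everything below is an assumption made for illustration: the moving-average forecast stands in for the regression and time series models, and the loss terms, cost weights, and the names `Dataset`, `predict_accesses`, `loss`, and `recommend` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Dataset:
    name: str
    size_tb: float              # dataset size, taken from metadata
    weekly_accesses: List[int]  # usage history, oldest week first


def predict_accesses(history: List[int], window: int = 4) -> float:
    """Forecast next-week accesses as the mean of the last `window` weeks.

    Assumption: a moving average is only a stand-in for the regression
    and time series models, which the abstract does not specify.
    """
    recent = history[-window:]
    return sum(recent) / len(recent) if recent else 0.0


def loss(ds: Dataset, replicas: int, predicted: float,
         disk_cost: float = 1.0, miss_cost: float = 5.0,
         wait_cost: float = 0.5) -> float:
    """Hypothetical loss for keeping `replicas` disk copies of one dataset."""
    if replicas == 0:
        # No disk copy: every predicted access pays a tape-restore penalty.
        return miss_cost * predicted
    # Disk copies cost space; more replicas spread the load and cut waiting.
    return disk_cost * ds.size_tb * replicas + wait_cost * predicted / replicas


def recommend(datasets: List[Dataset], max_replicas: int = 4) -> Dict[str, int]:
    """Return the loss-minimizing replica count per dataset (0 = tape only)."""
    report = {}
    for ds in datasets:
        predicted = predict_accesses(ds.weekly_accesses)
        report[ds.name] = min(range(max_replicas + 1),
                              key=lambda r: loss(ds, r, predicted))
    return report


if __name__ == "__main__":
    catalog = [
        Dataset("hot", 1.0, [40, 50, 60, 50]),
        Dataset("cold", 10.0, [1, 0, 0, 0]),
        Dataset("warm", 4.0, [8, 8, 8, 8]),
    ]
    for name, replicas in recommend(catalog).items():
        print(name, replicas)
```

In this sketch each dataset is optimized independently, which keeps the example short; a real system would likely also need a global constraint such as total disk capacity, which is not modeled here.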

Research target: Computer Science

Priority areas: IT and mathematics

Keywords: distributed computing technology

A methodological approach to designing services for simplified integration of distributed IT resources

Хрусталев Е. Ю., Chumichkin A. A., Информационные ресурсы России 2012 No. 3 P. 2–6

We analyze patterns in the development of modern information systems in terms of their requirements for IT resources. As a way of solving problems in high-performance computing, we propose combining three promising concepts: the GRID concept for integrating distributed IT resources, the cloud computing concept for providing flexibility in their provision ...

Added: September 6, 2012

Intelligent Computing: SAI 2020: Volume 3

Cham: Springer, 2020.

This book focuses on the core areas of computing and their applications in the real world. Presenting papers from the Computing Conference 2020, it covers a diverse range of research areas and describes various detailed techniques that have been developed and implemented. The Computing Conference 2020, which provided a venue for academic and industry practitioners to share new ...

Added: July 7, 2020

Distributed Computing and Grid-technologies in Science and Education 2016.

CEUR-WS, 2017.

Selected Papers of the 7th International Conference Distributed Computing and Grid-technologies in Science and Education ...

Added: February 15, 2017

The complexity of the 3-colorability problem in the absence of a pair of small forbidden induced subgraphs

Malyshev D., Discrete Mathematics 2015 Vol. 338 No. 11 P. 1860–1865

We completely determine the complexity status of the 3-colorability problem for hereditary graph classes defined by two forbidden induced subgraphs with at most five vertices. ...

Added: April 7, 2014

Priority Queueing for Packets with Two Characteristics

Chuprikov P., Nikolenko S. I., Davydow A. et al., IEEE Transactions on Networking 2018 Vol. 26 No. 1 P. 342–355

Modern network elements are increasingly required to deal with heterogeneous traffic. Recent works consider processing policies for buffers that hold packets with different processing requirements (number of processing cycles needed before a packet can be transmitted out) but uniform value, aiming to maximize the throughput, i.e., the number of transmitted packets. Other developments deal with ...

Added: March 14, 2018

Algorithms and methods for solving scheduling problems and other extremum problems on large-scale graphs

Chernyshev S. V., Cherepanov E. A., Pankratiev E. V. et al., Journal of Mathematical Sciences 2005 Vol. 128 No. 6 P. 3487–3495

Added: January 27, 2014

Hardness of Approximation for H-free Edge Modification Problems

Bliznets Ivan, Cygan M., Komosa P. et al., ACM Transactions on Computation Theory 2018 Vol. 10 No. 2 P. 1–32

The H-free Edge Deletion problem asks, for a given graph G and integer k, whether it is possible to delete at most k edges from G to make it H-free—that is, not containing H as an induced subgraph. The H-free Edge Completion problem is defined similarly, but we add edges instead of deleting them. The study of these two problem families has recently been the subject of intensive studies from the point of ...

Added: October 30, 2018

On the choice of software tools for cognitive computer visualization

Baibikova T., Domoratsky E., Вестник Московского финансово-юридического университета 2017 No. 1 P. 200–206

This paper considers several questions of scientific visualization. It also discusses the peculiarities of applying cognitive computer graphics and singles out a range of scientific visualization tasks. The paper gives a brief overview of modern support tools for program visualization, trends in their development, and their main characteristics. A module ...

Added: June 10, 2017

Probably approximately correct learning of Horn envelopes from queries

Borchmann D., Hanika T., Obiedkov S., Discrete Applied Mathematics 2020 Vol. 273 P. 30–42

We propose an algorithm for learning the Horn envelope of an arbitrary domain using an expert, or an oracle, capable of answering certain types of queries about this domain. Attribute exploration from formal concept analysis is a procedure that solves this problem, but the number of queries it may ask is exponential in the size ...

Added: October 29, 2019

Proceedings of 11th Industrial Conference on Data Mining (ICDM 2012)

Springer, 2012.

Added: January 29, 2013

A technology for collecting spatial data in urban field research

Goncharov R., Сапанов П. М., Яшунский А. Д., Социология власти 2013 No. 3 P. 57–72

The article presents a technology for collecting spatially localized data on objects of the urban environment during field research. The technology is based on automatically linking photographs to spatial coordinates. A plan of field and desk activities is given, and options for GIS processing of the data collected in this way are proposed. As an example, data on the use of the Belarusian language in the public space of cities in Belarus are presented. ...

Added: April 12, 2015

A software package for modeling physical processes in the computer-aided design of secondary power supplies for complex on-board systems

Sotnikova S., Динамика сложных систем 2012 No. 3 P. 84–87

The article describes a software package developed for modeling physical processes, which also makes it possible to identify the parameters of a printed circuit assembly (the physical model). The on-board secondary power supply being designed is implemented on this printed circuit assembly. Interfaces linking the controlling program with a known modeling and optimization program were developed for it. ...

Added: December 5, 2014

Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, May 29 to June 1, 2019). Issue 18 (25)

Moscow: Russian State University for the Humanities Publishing Center, 2019.

The collection includes 27 papers from the international conference on computational linguistics and intellectual technologies "Dialogue 2019" that were not included in the yearbook "Computational Linguistics and Intellectual Technologies" but were recommended by the Program Committee for presentation at the conference. It is intended for specialists in theoretical and applied linguistics and intellectual technologies. ...

Added: December 10, 2019

Particle Simulation for Predicting Effective Properties of Short Fiber Reinforced Composites

Skoptsov K. A., Sheshenin S., Galatenko V. V. et al., International Journal of Applied Mechanics 2016 Vol. 8 No. 2 P. 1650016-01–1650016-18

We present a method for evaluating elastic properties of a composite material produced by molding a resin filled with short elastic fibers. A flow of the filled resin is simulated numerically using a mesh-free method. After that, assuming that spatial distribution and orientation of fibers are not significantly changed during polymerization, effective elastic moduli of ...

Added: May 22, 2016

On one-dimensional projections of polytopes of discrete optimization problems

Vyalyi M., Дискретная математика 1991 Vol. 3 No. 3 P. 35–45

Added: October 17, 2014

Sheath parameters for non-Debye plasmas: Simulations and arc damage

Morozov I., Norman G. E., Insepov Z. et al., Physical Review Special Topics - Accelerators and Beams 2012 Vol. 15 P. 053501

This paper describes the surface environment of the dense plasma arcs that damage rf accelerators, tokamaks, and other high gradient structures. We simulate the dense, nonideal plasma sheath near a metallic surface using molecular dynamics (MD) to evaluate sheaths in the non-Debye region for high density, low temperature plasmas. We use direct two-component MD simulations ...

Added: October 28, 2013

Database on the Bandgap of Inorganic Substances and Materials

Kiselyova N. N., Dudarev V.A., Korzhuev M. A., Inorganic Materials: Applied Research 2016 Vol. 7 No. 1 P. 34–39

A database (DB) on the bandgap of inorganic substances available via the Internet (http://bg.imetdb.ru) was developed for the information service of specialists in the sphere of inorganic chemistry and materials science. The DB is integrated with other information systems on the properties of inorganic substances and materials, which provides the search of a wide range ...

Added: February 23, 2016

Pre-experiments on Annotation of Russian Coreference Corpus

Toldova S., Azerkovich I., Гришина Ю. et al., / NRU HSE. Series WP BRP "Linguistics". 2015.

Building benchmark corpora in the domain of coreference and anaphora resolution is an important task for developing and evaluating NLP systems and models. Our study is aimed at assessing the feasibility of enhancing corpora with information about coreference relations. The annotation procedure includes identification of text segments that are subject to annotation (markables), marking their ...

Added: December 15, 2015

Operating Systems: Textbook and Workbook

Gostev I. M., Moscow: Юрайт, 2016.

Computer science is currently developing rapidly. New versions of operating systems appear every one and a half to two years, so it was decided to include in this book material that will not become outdated. The textbook presents some of the most general principles of operating system design, which were developed more than 50 years ago and have remained practically unchanged since then. ...

Added: October 13, 2009

On some slowly terminating term rewriting systems

Beklemishev L. D., Оноприенко А. А., Математический сборник 2015 Vol. 206 No. 9 P. 3–20

We formulate some term rewriting systems in which the number of computation steps is finite for each output, but this number cannot be bounded by a provably total computable function in Peano arithmetic PA. Thus, the termination of such systems is unprovable in PA. These systems are derived from an independent combinatorial result known as the Worm ...

Added: March 13, 2016

On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era

Shuranov E., / Series Computer Science "arxiv.org". 2021.

Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER) ever since. Yet, it is challenging to explain the effect of each information stream on the SER systems. Further, more clarification is required for analysing the impact of ASR's word error rate (WER) on linguistic emotion ...

Added: February 14, 2023

Sixth All-Russian Scientific and Practical Conference on Simulation Modeling and Its Application in Science and Industry "Simulation Modeling. Theory and Practice": Conference Proceedings. Collected Papers

Kazan: Fen Publishing House of the Academy of Sciences of the Republic of Tatarstan, 2013.

Proceedings and papers of the Sixth All-Russian Scientific and Practical Conference on Simulation Modeling and Its Application in Science and Industry. ...

Added: December 14, 2013

Formation of Control Structures in Static Swarms

Karpov V. E., Karpova I. P., Procedia Engineering 2015 Vol. 100 P. 1459–1468

Work solutions are proposed for problems of leader definition and role distribution in homogeneous groups of robots. It is shown that transition from a swarm to a collective of robots with hierarchical organization is possible using exclusively local interaction. The local revoting algorithm is central to the procedure for choice of leader while redistribution of roles can ...

Added: March 14, 2015

Proceedings of the NI Academic Days 2017 conference, Moscow, April 13-14, 2017

Moscow: National Instruments Russia, 2017.

The collection consists of papers presenting previously unpublished results of original research and technical solutions. We hope that the collection will be useful to specialists working in various fields of science and technology, to a wide range of instructors, graduate students, and university students, and to teachers at secondary schools and technical colleges. ...

Added: May 10, 2017