Distributed horizontally scalable solutions for data management


Proceedings of the Institute for System Programming of the RAS. 2013. Vol. 24. P. 327–358.

С.Д. Кузнецов, Посконин А. В.

Many modern applications (such as large-scale websites, social networks, research projects, business analytics, etc.) have to deal with very large data volumes (also referred to as “big data”) and high read/write loads. These applications require the underlying data management systems to scale well in order to accommodate data growth and increasing workloads. High throughput, low latency and data availability are also very important, as are data consistency guarantees. Traditional SQL-oriented DBMSs, despite their popularity, ACID transactions and rich features, do not scale well and thus are not suitable in certain cases. A number of new data management systems and approaches intended to resolve scalability issues have emerged over the last decade. This paper reviews several classes of such systems and the key problems they are able to solve. The wide variety of systems and approaches stems from the general trend toward specialization in the field of data management systems: each system is tailored to a certain class of problems. Thus, the choice of a specific solution depends on the problem to be solved: the expected load, the ratio of reads to writes, the form of data storage and the query types, the desired level of consistency, reliability requirements, the availability of client libraries for the chosen language, etc.

Research target: Computer Science

Priority areas: IT and mathematics

Language: Russian


Data management systems of the NoSQL category

С.Д. Кузнецов, Посконин А. В., Программирование 2014 Vol. 40 No. 6 P. 34–47

In the last decade, a new class of data management systems, collectively called NoSQL systems, has emerged and is now under intensive development. The main feature of these systems is that they abandon the relational data model and SQL, do not fully support ACID transactions, and use a distributed architecture (even though there are non-distributed NoSQL systems ...

Added: November 6, 2017

NoSQL data management systems

S.D. Kuznetsov, Посконин А. В., Programming and Computer Software 2014 Vol. 40 No. 6 P. 323–332

Added: November 6, 2017

Big data: modern approaches to storage and processing

Клеменков П. А., Kuznetsov S. D., Proceedings of the Institute for System Programming of the RAS 2012 Vol. 23 P. 143–158

Big data has challenged traditional storage and analysis systems in several new ways. In this paper we try to figure out how to overcome these challenges and why this cannot be done efficiently, and we describe three modern approaches to big data handling: NoSQL, MapReduce and real-time stream processing. The first section of the paper is ...

Added: October 31, 2017

Matrix multiplication and universal scalability of the time on the Intel Scalable processors

Russkov A., Shchur L., Journal of Physics: Conference Series 2019 Vol. 1163 No. 012079 P. 1–5

Matrix multiplication is one of the core operations in many areas of scientific computing. We present the results of experiments on multiplying large matrices whose size is comparable to that of the onboard memory, which is 1.5 terabytes in our case. We run experiments on the computing board with two sockets ...

Added: March 26, 2019

Is cooperation between SQL and NoSQL possible?

Сергей Кузнецов, Посконин А. В., Открытые системы. СУБД 2013 No. 9 P. 38–41

Many different data management systems are available nowadays, ranging from familiar SQL-based solutions to completely new systems designed from scratch. The wide range of available options makes it possible to choose the one that best suits the application's requirements. However, one can benefit even more from using different solutions within a single application for particular tasks. This paper ...

Added: January 30, 2018

Angara interconnect makes GPU-based Desmos supercomputer an efficient tool for molecular dynamics calculations

Stegailov V., Dlinnova E., Ismagilov T. et al., International Journal of High Performance Computing Applications 2019 Vol. 33 No. 3 P. 507–521

In this paper, we describe the Desmos supercomputer that consists of 32 hybrid nodes connected by a low-latency high-bandwidth Angara interconnect with torus topology. This supercomputer is aimed at cost-effective classical molecular dynamics calculations. Desmos serves as a test bed for the Angara interconnect that supports 3D and 4D torus network topologies, and verifies its ...

Added: November 11, 2018

Triclustering in Big Data Setting

Egurnov D., Ignatov D. I., Точилкин Д. С. / Series LNCS "Lecture Notes in Computer Science". 2020.

In this paper, we describe versions of triclustering algorithms adapted for efficient calculations in distributed environments with the MapReduce model or the parallelisation mechanisms provided by modern programming languages. The OAC family of triclustering algorithms shows good parallelisation capabilities due to the independent processing of triples of a triadic formal context. We provide the time and space complexity of ...

Added: November 10, 2020

Applying MapReduce to Conformance Checking

Shugurov I., Mitsyuk A. A., Proceedings of the Institute for System Programming of the RAS 2016 Vol. 28 No. 3 P. 103–122

Process mining is a relatively new research field that offers methods for analyzing and improving business processes based on their execution history (event logs). Conformance checking is one of the main sub-fields of process mining. Conformance checking algorithms aim to assess how well a given process model, typically represented by a Petri ...

Added: September 12, 2016

Synchronization of Conservative Parallel Discrete Event Simulations on a Small-World Network

Ziganurova L., Shchur L., Physical Review E - Statistical, Nonlinear, and Soft Matter Physics 2018 Vol. 98 No. 022218 P. 1–15

We examine the question of the influence of sparse long-range communications on the synchronization in parallel discrete event simulations (PDES). We build a model of the evolution of local virtual times (LVT) in a conservative algorithm including several choices of local links. All network realizations belong to the small-world network class. We find that synchronization ...

Added: July 13, 2018

Features of distributed systems simulation

Замятина Е.Б., Миков А. И., Михеев Р. А., Вестник Пермского университета. Серия: Математика. Механика. Информатика 2013 No. 4(23) P. 107–118

The problems of simulating distributed information systems are discussed. There are a large number of specialized software systems dedicated to designing simulation models of distributed systems and running simulation experiments. The article deals with software and language tools of CAD Triad.Net, considers the main features of the simulation model and how ...

Added: March 10, 2015

CRC error probability in the presence of random burst interference

Baranov P., Baranov A., Проблемы информационной безопасности. Компьютерные системы 2017 No. 4

The article analyzes the possibility of errors in telecommunication protocols that use packet data transmission. A probabilistic model of a prolonged-action additive interference is represented as a sequence of realizations of independent interference blocks of definite length. The paper shows that in certain conditions concerning a polynomial of degree k, used for creation of CRC code, with block ...

Added: February 26, 2018

The scalability analysis of a parallel tree search algorithm

Posypkin M., Kolpakov R., Optimization Letters 2020 Vol. 14 No. 8 P. 2211–2226

Increasing the number of computational cores is a primary way of achieving the high performance of contemporary supercomputers. However, developing parallel applications capable of harnessing the enormous number of cores is a challenging task. It is very important to understand the principal limitations on the scalability of parallel applications imposed by the algorithm’s structure. The ...

Added: October 30, 2020

A study of FlowVision scalability on a cluster with the Angara interconnect

Акимов В. С., Силаев Д. П., Симонов А. С. et al., Вычислительные методы и программирование: новые вычислительные технологии 2017 Vol. 18 P. 406–415

The scalability of computations in FlowVision CFD software on the Angara-C1 cluster equipped with Angara interconnect is studied. Several test problems with 260 thousand, 5.5 million and 26.8 million computational cells are considered. Computations in FlowVision are performed using a new solver of linear systems based on the algebraic multigrid (AMG) method. It is shown ...

Added: October 30, 2019

A Constrained Shortest Path Scheme for Virtual Network Service Management

Chemodanov D., Esposito F., Calyam P. et al., IEEE Transactions on Network and Service Management 2019 Vol. 16 No. 1 P. 127–142

Virtual network services that span multiple data centers are important to support emerging data-intensive applications in fields such as bioinformatics and retail analytics. Successful virtual network service composition and maintenance requires flexible and scalable “constrained shortest path management” both in the management plane for virtual network embedding (VNE) or network function virtualization service chaining (NFV-SC), ...

Added: December 3, 2019

Particle Simulation for Predicting Effective Properties of Short Fiber Reinforced Composites

Skoptsov K. A., Sheshenin S., Galatenko V. V. et al., International Journal of Applied Mechanics 2016 Vol. 8 No. 2 P. 1650016-01–1650016-18

We present a method for evaluating elastic properties of a composite material produced by molding a resin filled with short elastic fibers. A flow of the filled resin is simulated numerically using a mesh-free method. After that, assuming that spatial distribution and orientation of fibers are not significantly changed during polymerization, effective elastic moduli of ...

Added: May 22, 2016

Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, May 29 to June 1, 2019). Issue 18 (25)

Moscow: Publishing Center of the Russian State University for the Humanities, 2019.

The collection includes 27 papers from the international conference on computational linguistics and intellectual technologies "Dialogue 2019" that were not included in the yearbook "Computational Linguistics and Intellectual Technologies" but were recommended by the Program Committee for presentation at the conference. Intended for specialists in theoretical and applied linguistics and intellectual technologies. ...

Added: December 10, 2019

Algorithms and methods for solving scheduling problems and other extremum problems on large-scale graphs

Chernyshev S. V., Cherepanov E. A., Pankratiev E. V. et al., Journal of Mathematical Sciences 2005 Vol. 128 No. 6 P. 3487–3495

Added: January 27, 2014

A software package for modeling physical processes in the computer-aided design of secondary power supplies for complex on-board systems

Sotnikova S., Динамика сложных систем 2012 No. 3 P. 84–87

The article describes a software package for modeling physical processes that also makes it possible to identify the parameters of a printed circuit assembly (the physical model). The on-board secondary power supply being designed is implemented on this printed circuit assembly. Interfaces between the controlling program and well-known modeling and optimization programs have been developed for it. ...

Added: December 5, 2014

A technology for collecting spatial data in urban field research

Goncharov R., Сапанов П. М., Яшунский А. Д., Социология власти 2013 No. 3 P. 57–72

The article presents a technology for collecting spatially localized data on objects of the urban environment during field research. The technology is based on automatically linking photographs to spatial coordinates. A plan of field and desk activities is given, and options for GIS processing of the data collected in this way are proposed. As an example, data on the use of the Belarusian language in the public space of Belarusian cities are presented. ...

Added: April 12, 2015

Priority Queueing for Packets with Two Characteristics

Chuprikov P., Nikolenko S. I., Davydow A. et al., IEEE Transactions on Networking 2018 Vol. 26 No. 1 P. 342–355

Modern network elements are increasingly required to deal with heterogeneous traffic. Recent works consider processing policies for buffers that hold packets with different processing requirements (number of processing cycles needed before a packet can be transmitted out) but uniform value, aiming to maximize the throughput, i.e., the number of transmitted packets. Other developments deal with ...

Added: March 14, 2018

The complexity of the 3-colorability problem in the absence of a pair of small forbidden induced subgraphs

Malyshev D., Discrete Mathematics 2015 Vol. 338 No. 11 P. 1860–1865

We completely determine the complexity status of the 3-colorability problem for hereditary graph classes defined by two forbidden induced subgraphs with at most five vertices. ...

Added: April 7, 2014

Sixth All-Russian Scientific and Practical Conference on Simulation Modeling and Its Application in Science and Industry "Simulation Modeling. Theory and Practice". Conference Proceedings. Collection of Papers

Kazan: Fen Publishing House of the Academy of Sciences of the Republic of Tatarstan, 2013.

Materials and papers of the Sixth All-Russian Scientific and Practical Conference on simulation modeling and its application in science and industry. ...

Added: December 14, 2013

Formation of Control Structures in Static Swarms

Karpov V. E., Karpova I. P., Procedia Engineering 2015 Vol. 100 P. 1459–1468

Working solutions are proposed for the problems of leader definition and role distribution in homogeneous groups of robots. It is shown that the transition from a swarm to a collective of robots with hierarchical organization is possible using exclusively local interaction. The local revoting algorithm is central to the procedure for choosing a leader while redistribution of roles can ...

Added: March 14, 2015

On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era

Shuranov E. / Series Computer Science "arxiv.org". 2021.

Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER) ever since. Yet, it is challenging to explain the effect of each information stream on the SER systems. Further, more clarification is required for analysing the impact of ASR's word error rate (WER) on linguistic emotion ...

Added: February 14, 2023