Information Spaces for Big Data Processing: Unification and Parallelization of Sequential Information Accumulation Procedures

P. Golubtsov

doi:10.1109/CBI.2019.00031

P. 212–220.

Golubtsov P.

In large-scale research, data are usually collected on many sites, have a huge volume, and new data are constantly generated. Since it is often impossible to collect all the relevant data on a single computer, much attention is paid to the algorithms that provide sequential or parallel accumulation of information and do not need to store all the original data. As an example of information accumulation, the Bayesian updating procedure for linear experiments is analyzed. The corresponding information spaces are defined and the relations between them are studied. It is shown that processing can be unified and simplified by introducing a special canonical form of information representation and transforming all the data and the original prior information into this form. Thanks to the rich algebraic properties of the canonical information space, the sequential Bayesian procedure allows various parallelization options that are ideally suited for distributed data processing platforms, such as Hadoop MapReduce. This opens up the possibility of a flexible and efficient scaling of information accumulation in distributed data processing systems.
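The abstract does not spell out the canonical form, so the following is only a minimal sketch under stated assumptions. It takes a scalar unknown x, observations y = a·x + noise with known noise variance, and the standard "information form" in which a piece of information is the pair (precision-weighted evidence, precision). The names `Info`, `from_prior`, `from_observation` and `sequential_update` are hypothetical and are not taken from the paper. The sketch illustrates why an additive representation lets the sequential Bayesian update be split into independently processed chunks, in the spirit of a map and reduce step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Info:
    """Hypothetical canonical information about a scalar unknown x."""
    t: float  # precision-weighted evidence
    h: float  # accumulated precision (inverse variance)

    def __add__(self, other):
        # Combining independent pieces of information is plain addition,
        # which is associative and commutative.
        return Info(self.t + other.t, self.h + other.h)

    def posterior(self):
        # Convert back to (mean, variance); requires h > 0.
        return self.t / self.h, 1.0 / self.h


ZERO = Info(0.0, 0.0)  # "no information": neutral element of the addition


def from_prior(mean, var):
    """Assumed mapping of a Gaussian prior into canonical form."""
    return Info(mean / var, 1.0 / var)


def from_observation(a, y, noise_var):
    """Assumed mapping of one measurement y = a*x + noise into canonical form."""
    return Info(a * y / noise_var, a * a / noise_var)


def sequential_update(mean, var, a, y, noise_var):
    """Classical scalar Bayesian update (prior -> posterior), for comparison."""
    gain = var * a / (a * a * var + noise_var)
    return mean + gain * (y - a * mean), (1.0 - gain * a) * var


# Made-up illustrative data: (a, y, noise_var) per observation.
observations = [(1.0, 2.1, 0.5), (2.0, 3.9, 1.0), (0.5, 1.2, 0.25), (1.5, 2.8, 2.0)]
prior = (0.0, 10.0)  # prior mean and variance

# Sequential route: each posterior becomes the prior for the next observation.
seq = prior
for a, y, s in observations:
    seq = sequential_update(seq[0], seq[1], a, y, s)

# Parallel-style route: map every observation to canonical form,
# combine each chunk independently, then merge chunks with the prior.
pieces = [from_observation(a, y, s) for a, y, s in observations]
chunk_a = sum(pieces[:2], ZERO)  # could run on one worker
chunk_b = sum(pieces[2:], ZERO)  # could run on another worker
par = (chunk_b + from_prior(*prior) + chunk_a).posterior()

print("sequential:", seq)
print("canonical :", par)
```

Because the combination is commutative and associative, the chunks and the prior can be merged in any order and any grouping; the two routes agree up to floating-point rounding.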

Keywords: big data; Bayesian information update; information spaces; distributed data processing

In book

21st IEEE Conference on Business Informatics (CBI)

IEEE Computer Society, 2019.

Information spaces: optimizing sequential and parallel processing in big data

Golubtsov P., in: 7th International Conference "Problems of Mathematical Physics and Mathematical Modelling" (2018). Book of abstracts. M.: National Research Nuclear University "MEPhI", 2018. P. 173–176.

The process of Bayesian information update is essentially sequential: as a result of an observation, prior information is transformed into posterior information, which is then treated as the prior for the next observation, and so on. It is shown that this procedure can be unified and parallelized by converting both the measurement results and the original prior ...

Added: January 23, 2019

Procedia Computer Science. 3rd International Conference on Information Technology and Quantitative Management, ITQM 2015

Amsterdam: Elsevier, 2015.

Welcome to the Third International Conference on Information Technology and Quantitative Management (ITQM 2015), July 21-24, 2015, Rio de Janeiro, Brazil. The theme of ITQM 2015 is "Exploring Data Science in IT and Quantitative Management". ITQM 2015 is organized by the International Academy of Information Technology and Quantitative Management (IAITQM) and Ibmec/RJ, Brazil. ...

Added: September 19, 2015

Big Data in Education: DATA-ANTHROPO for Development Policies and Practices

Наука, 2022.

The book presents the conceptual DATA-ANTHROPO approach to educational data analytics. The approach is based on data analysis methods that reveal the determinants and correlations of the development of individuals and human groups. To this end, it uses not the standard system of analysis indicators, as in the institutional approach, but a system of indicators that includes metrics of human potential development (development values, satisfaction with the development opportunities provided, conditions for self-realization, choice, participation ...

Added: October 19, 2022

Modeling and Optimization of Educational Processes: The Example of a Model for Working with Electronic Educational Resources

Прокофьев Д. О., Starykh V., Информационные технологии 2015

This study investigates the main problems of automating and optimizing educational processes with the help of BPMS and Big Data. It raises questions of process modeling, particularly those related to the integration of process-oriented and business analysis systems. The main goal of the study is to find a possible new way to implement the ideas of metadata ...

Added: October 9, 2015

A System for Automatic Processing of Russian-Language Texts

Dubov M., Mirkin B., Шаль А. А., Открытые системы. СУБД 2014 No. 10 P. 15–17

Automating text processing and analysis is currently a major trend in IT applications. At present, there is no unified approach to the analysis and visualization of large volumes of text data. Our system, LM Monitor (Latent Meaning Monitor), generates so-called reference graphs, which can be considered part of the popular technology of ...

Added: December 16, 2014

Geo-Economy of the Future: Sustainable Agriculture and Alternative Energy

Portanskiy A., Springer, 2022.

This book presents an international review of the modern geo-economy and a scientific take on the geo-economy of the future. It identifies the challenges of climate change and their impact on the modern geo-economy. Prospects for the geo-economy of the future are outlined based on sustainable agriculture and alternative energy. Policy implications are put forward ...

Added: July 4, 2022

AgroTech. AI, Big data, IoT

Springer, 2022.

At present, agricultural economics faces the complex and important task of ensuring food security. The difficulty is that the innovative business trends of recent years were concentrated in other spheres of the economy and were only indirectly connected to the agricultural economy. Thus, in the first ...

Added: March 15, 2024

Big Data and Its Applications in the Electric Power Industry: From Business Analytics to Virtual Power Plants

Krylov V., Крылов С. В., M.: Нобель Пресс, 2014.

Intended for students and specialists in the development of information systems, including those for the electric power industry, for heads of enterprise IT departments, and for everyone who works on planning the development of the electric power industry or is simply interested in progress in this field. The book examines the area of data processing known as Big Data and describes its techniques and technologies. The main focus ...

Added: October 10, 2015

Multimodal Clustering of Boolean Tensors on MapReduce: Experiments Revisited

Ignatov D. I., Egurnov D., Точилкин Д. С., in: Supplementary Proceedings ICFCA 2019 Conference and Workshops. Vol. 2378. CEUR Workshop Proceedings, 2019. P. 137–151.

This paper presents further development of distributed multimodal clustering. We introduce a new version of the multimodal clustering algorithm for distributed processing in Apache Hadoop on computer clusters. Its implementation allows a user to conduct clustering on data with modality greater than two. We provide the time and space complexity of the algorithm and justify its relevance. ...

Added: October 31, 2019

On the Socio-Economic Consequences of Introducing Advanced Digital Technologies

Рейнгольд Л. А., Соловьев А. В., Klychikhina O., in: Russia: Trends and Prospects of Development. Yearbook. Issue 16. Part 2: XII International Scientific and Practical Conference "Regions of Russia: Development Strategies and Mechanisms for Implementing Priority National Projects and Programs", conference "Scientific and Technological Development of Russia: Priorities, Problems, Solutions" / RAS. INION. Department of Scientific Cooperation; Ed. by В. И. Герасимов. M., 2021. Part 2. 1024 p. ISBN 978-5-248-01003-5. Part 2. Issue 16. INION RAS, Department of Scientific Cooperation, 2021. P. 367–373.

A new digital environment of society is currently taking shape, and the socio-economic consequences of technological transformation need to be understood. Qualitative changes are under way in digital technologies, as a result of which the natural environment will acquire fundamentally new properties within the next few years. The following sets of problems arise in a new form: the hidden complexity of the human environment, despite its apparent simplification; the need for the coexistence of subjects with ever ...

Added: October 12, 2021

Formation of the Economic Properties of the Digital Environment

Dneprovskaya N., Шевцова И. В., Вестник Московского университета. Серия 6: Экономика 2024 Vol. 59 No. 4 P. 114–134

The digital environment comprises a huge number of information and telecommunication technologies (IT), and their use gives rise to features common to all of them. The goal of the study is to identify the cumulative factor of the digital environment and to analyze ways of exploiting it in economic activity. The analysis of quantitative indicators of the state of the digital environment in Russia and abroad ...

Added: September 6, 2024

Artificial Intelligence as a Driver of the Digital Transformation of Law

Дейнеко А. Г., in: Third Bachilov Readings. Digital Transformation: Challenges to Law and Directions of Research. M.: Проспект, 2020. P. 205–210.

The article will consider the changes that, in the author’s opinion, the Russian legislation will have to undergo due to the rapid development of artificial intelligence and robotics technologies. The main approaches to such improvement and their problem points will be identified. ...

Added: September 7, 2021

Computational Management Science. Network Analysis and Applications

Springer, 2024.

Big data has become an integral part of modern networks. With the increasing amount of data generated by devices, machines, and applications, networks are constantly being challenged to handle and process this data in a timely and efficient manner. The size, complexity, and variety of data in networks are increasing rapidly, which requires new approaches ...

Added: June 9, 2024

Assessing the Effectiveness of Legal Norms amid the Development of "Big Data"

Churakov V., in: Regulatory Policy in Russia: Problems of Theory and Practice. M.: Проспект, 2019. P. 59–68.

Given the current state of the legal system, the assessment of regulatory effectiveness is becoming especially relevant. Official statistics prompt reflection on the value of law and on whether there is a real need to adopt so many regulatory legal acts. For example, the State Duma of the Russian Federation of the VII convocation, in session since autumn 2016, has adopted (approved) 768 bills. Of these, only 3 were subsequently rejected by the President of the Russian ...

Added: December 8, 2019

Editorial: Network Analysis and Applications

Pardalos P. M., Pardalos P., Kalyagin V. A. et al., Computational Management Science 2024 Vol. 21 No. 1 Article 35

Added: February 22, 2025

Basic Data Structures of the FCART Decision Support System

Parinov A., Научно-техническая информация. Серия 2: Информационные процессы и системы 2014

The article examines combinations of the basic data structures of the local storage of the FCART decision support system and reports timing characteristics for large data volumes. ...

Added: November 19, 2013

SEQUENCE-BASED AND STRUCTURE-BASED MACHINE-LEARNING MODELS FOR RECOGNITION OF 3’-END L1 AND ALU STEM-LOOPS IN HUMAN GENOME

Poptsova M., Шеин А. В., Zaikin A., in: The proceedings of International congress «Biotechnology: state of the art and perspectives», February 25-27, 2019. LLC “RED GROUP”, 2019. P. 356–356.

We built and evaluated two types of models, sequence-based and structure-based, for recognition of 3’-end stem-loops of human L1s and Alus, and found the most important parameters contributing to recognition: Shift, Tilt and Rise, and also hydrophilicity. ...

Added: November 12, 2019

SIGMOD/PODS '21: Proceedings of the 2021 International Conference on Management of Data

NY: ACM, 2021.

The annual ACM SIGMOD/PODS Conference is a leading international forum for database researchers, practitioners, developers, and users to explore cutting-edge ideas and results, and to exchange techniques, tools, and experiences. The conference includes a fascinating technical program with research and industrial talks, tutorials, demos, and focused workshops. It also hosts a poster session to learn about innovative ...

Added: April 28, 2021

PROSPECTS OF TRANSFERRING THE LARGE VOLUMES OF RADIO ASTRONOMY DATA

Isaev E., Tarasov P. A., Odessa Astronomical Publications 2014 Vol. 27 No. 2 P. 72–73

Added: November 24, 2014

A Combined Algorithm for Community Detection in Graphs of Interacting Objects

Chepovskiy A., Лобанова С. Ю., Бизнес-информатика 2017 Vol. 42 No. 4 P. 64–73

In this paper, we propose and implement a method for detecting intersecting and nested communities in graphs of interacting objects of different natures. For this, two classical algorithms are taken: a hierarchical agglomerative algorithm and one based on the search for k-cliques. The combined algorithm presented is based on their consistent application. In addition, parametric options ...

Added: December 10, 2017

Towards a Cloud Computing Paradigm for Big Data Analysis in Smart Cities

Massobrio R., Nesmachnow S., Tchernykh A. et al., Programming and Computer Software 2018 Vol. 44 No. 3 P. 181–189

In this paper, we present a Big Data analysis paradigm related to smart cities using cloud computing infrastructures. The proposed architecture follows the MapReduce parallel model implemented using the Hadoop framework. We analyse two case studies: a quality-of-service assessment of a public transportation system using historical bus location data, and a passenger-mobility estimation using ticket sales ...

Added: August 10, 2018

Monopolization of the Media Market as a Challenge to Democratic Governability. Digital Transformation of Print Media and State Regulation Policy (the Case of the USA)

Balayan A. A., Томин Л. В., Информационное общество 2020 No. 5 P. 80–88

The paper focuses on the transformation of the advertising market under the influence of platform companies, using the US example to show the mechanism of digital disruption in the print media business model. The development of digital infrastructure has allowed platform companies to collect and monetize data and to deliver personalized ads to users throughout the internet. A ...

Added: October 28, 2020

Who’s Bad? Attitudes Toward Resettlers From the Post-Soviet South Versus Other Nations in the Russian Blogosphere

Svetlana S. Bodrunova, Koltsova O., Sergey Koltcov et al., International Journal of Communication 2017 Vol. 11 P. 3242–3264

Communication in social media is increasingly being found to reproduce or even reinforce ethnic prejudice and hostility toward migrants. In Russia of the 2010s, with the world’s second-largest immigrant population, polls have detected high levels of hostility of the Russian population toward migranty (migrants), a label attached to resettlers from Central Asia and the ...

Added: October 4, 2017

Brand Personification: Applying Psychometric Methodology to Social Media Big Data (Russian Market Data)

Mahar D. H., Syropyatov V. V., Елисеева В. С., Креативная экономика 2023 Vol. 17 No. 5 P. 1705–1730

Today, brand marketing is directly tied to the detailed study of big data generated by consumers when they make choices and reveal personal preferences and motivations. The company tries to involve the consumer as much as possible in creating and developing not only the product but also the brand. Personification is ...

Added: May 24, 2024