Metadata-Driven Industrial-Grade ETL System

P. 2433–2442.
Suleykin A., Panfilov P.

Digital transformation of a railway system based on big data technologies relies on integrating large volumes of streaming data into digitally enabled enterprise systems to form a comprehensive and efficient intelligent transportation system. The data requirements of smart railway transportation involve large amounts of unstructured and semi-structured data, including railway KPI data. Traditional ETL technology cannot cope with the fast-growing demand for processing large volumes of real-time data collected from heterogeneous sources, both inside the system and in its environment. Based on the characteristics of railway KPI data, this paper proposes the design of an automated ETL system with greater versatility and data processing efficiency. To reach these goals, we optimize the ETL workflow using a proprietary metadata management framework. Making ETL suitable for a big-data-driven railway transportation environment requires redesigning the ETL processing rules using a metadata model and then optimizing the extraction, transformation, and loading processes of the ETL system. Our experimental results with actual railway KPI data show that the proposed metadata-supported automated ETL system can effectively serve railway KPI data processing using open-source distributed big data technologies. The proposed metadata framework proved efficient in processing the complex data structures and large data volumes typical of big data.
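
The idea of driving ETL processing rules from a metadata model can be illustrated with a minimal sketch. This is not the authors' framework, which the abstract describes only as proprietary and built on distributed big data technologies. Everything here is a hypothetical assumption chosen for illustration: the `METADATA` layout, the table and column names, the sample KPI rows, and the use of an in-memory SQLite database as the load target.

```python
import csv
import io
import sqlite3

# Hypothetical metadata record: one entry per target column. In a
# metadata-driven design, rules like these live in a repository rather
# than in the pipeline code, so a new feed needs new metadata, not new code.
METADATA = {
    "target_table": "railway_kpi",
    "columns": [
        {"source": "station", "target": "station_id", "cast": str},
        {"source": "kpi", "target": "kpi_name", "cast": str},
        {"source": "val", "target": "kpi_value", "cast": float},
    ],
}

# Mapping from Python cast functions to SQLite column types (assumption).
SQL_TYPES = {str: "TEXT", float: "REAL", int: "INTEGER"}


def extract(raw_csv):
    """Extract: parse CSV text into a list of dicts keyed by source field."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows, metadata):
    """Transform: select and cast fields exactly as the metadata prescribes."""
    columns = metadata["columns"]
    return [
        tuple(col["cast"](row[col["source"]].strip()) for col in columns)
        for row in rows
    ]


def load(conn, records, metadata):
    """Load: create the target table from metadata and insert the records."""
    columns = metadata["columns"]
    table = metadata["target_table"]
    ddl = ", ".join(f'{c["target"]} {SQL_TYPES[c["cast"]]}' for c in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({ddl})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", records)
    conn.commit()


def run_etl(raw_csv, metadata, conn):
    """Run extract, transform, and load in sequence for one metadata record."""
    load(conn, transform(extract(raw_csv), metadata), metadata)


if __name__ == "__main__":
    # Made-up sample rows standing in for railway KPI data.
    sample = "station,kpi,val\nS1,delay_min, 4.5\nS2,delay_min,0\n"
    connection = sqlite3.connect(":memory:")
    run_etl(sample, METADATA, connection)
    print(connection.execute("SELECT * FROM railway_kpi").fetchall())
```

In this sketch the three ETL functions contain no source-specific logic, so supporting another data source would mean adding another metadata record rather than changing the functions.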

Language: English
Keywords: metadata, intelligent transportation systems, big data technologies, KPI data, ETL processes

In book

2020 IEEE International Conference on Big Data (Big Data 2020)
IEEE, 2020.
Similar publications
Federated Reinforcement Learning for Intelligent Traffic Signal Control: A Privacy-Preserving Approach with Edge-Assisted Aggregation
Ali J. Dayoub, Ehab S. Suleiman, in: Proceedings of the 2026 8th International Youth Conference on Radio Electronics, Electrical and Power Engineering (REEPE). IEEE, 2026. Ch. 159 P. 1–5.
Urban traffic congestion costs the global economy over $1 trillion annually, necessitating intelligent traffic signal control (ITSC) solutions. Traditional centralized approaches face critical limitations: privacy violations from vehicle trajectory data sharing, prohibitive communication overhead, and scalability challenges in heterogeneous urban environments. This paper presents a federated reinforcement learning (FRL) framework for privacy-preserving traffic signal ...
Added: April 30, 2026
Parallel Multi-Level Simulation for Large-Scale Detailed Intelligent Transportation System Modeling
Stepanyants V., Karpov A., Margaryan A. et al., FUTURE TRANSPORTATION 2025 Vol. 5 No. 4 Article 141
Nowadays, the problems of traffic accidents, inefficiency, and congestion still affect transportation systems. Conventional solutions often do not resolve and can even exacerbate the problems. Intelligent transportation system (ITS) technology, including intelligent vehicles, could provide a solution for these problems. However, such technologies should be thoroughly verified and validated before their large-scale adoption. Computer simulation ...
Added: October 17, 2025
A New Software Platform for Modelling Traffic Flows Involving Unmanned Vehicles
Beklaryan A., Вестник ЦЭМИ 2023 Vol. 6 No. 1 Article 5
The article presents a new software platform for modelling traffic flows involving unmanned vehicles, using a number of advanced technological solutions, in particular, the FLAME GPU supercomputer agent modelling framework, intelligent software modules based on fuzzy and hierarchical clustering, genetic optimization algorithms, a subsystem for visualizing the state of agents-vehicles based on OpenGL, etc. As ...
Added: June 4, 2023
Optimising the Characteristics of an Intelligent Transportation System Using a Real-Coded Genetic Algorithm Based on Adaptive Mutation
Akopov A. S., Бекларян Л. А., Beklaryan A., Информационные технологии 2023 Vol. 29 No. 3 P. 115–125
A novel real-coded genetic algorithm (FCGA-AM) that uses the proposed adaptive mutation (AM) operator is presented. The algorithm is designed to optimise the characteristics of the developed intelligent transportation system. The performance of the proposed genetic algorithm was evaluated in comparison with other methods of multicriteria heuristic optimization with the use of various test instances ...
Added: June 4, 2023
Hybrid Deep Learning Enabled Air Pollution Monitoring in ITS Environment
Dutta A. K., Sampson J., Ahmad S. et al., Computers, Materials and Continua 2022 Vol. 72 No. 1 P. 1157–1172
Intelligent Transportation Systems (ITS) have become a vital part in improving human lives and modern economy. It aims at enhancing road safety and environmental quality. There is a tremendous increase observed in the number of vehicles in recent years, owing to increasing population. Each vehicle has its own individual emission rate; however, the issue arises ...
Added: April 11, 2022
Digital Ecosystem-Based KPI-Driven Railway Communication Network Reporting System
Panfilov, P., Suleykin, A., ElDarawany, A., in: MEDES '21: Proceedings of the 13th International Conference on Management of Digital EcoSystems. NY: Association for Computing Machinery (ACM), 2021. P. 163–166.
This research is focused on architectural and modeling issues of design and development of digital reporting system aimed at the railway communication network infrastructure. Our approach to these problems is based on digital ecosystem paradigm and open-source Big Data technologies. It also aims at methodology for KPIs data preparation and collection in railway communication networks. ...
Added: January 15, 2022
Automatic Vehicle License Plate Recognition Using Optimal K-Means with Convolutional Neural Network for Intelligent Transportation Systems
Pustokhina I., Pustokhin D. A., Rodrigues J. J. et al., IEEE Access 2020 Vol. 8 P. 92907–92917
Due to recent developments in highway research and increased utilization of vehicles, there has been significant interest paid on latest, effective, and precise Intelligent Transportation System (ITS). The process of identifying particular objects in an image plays a crucial part in the fields of computer vision or digital image processing. Vehicle License Plate Recognition (VLPR) ...
Added: October 2, 2021
Automatic vehicle license plate recognition using optimal deep learning model
Vaiyapuri T., Nandan Mohanty S., Sivaram M. et al., Computers, Materials and Continua 2021 Vol. 67 No. 2 P. 1881–1897
The latest advancements in highway research domain and increase in the number of vehicles every day led to wider exposure and attention towards the development of efficient Intelligent Transportation System (ITS). One of the popular research areas, i.e., Vehicle License Plate Recognition (VLPR), aims at determining the characters that exist in the license plate of the vehicles. The ...
Added: October 2, 2021
On Machine Learning Applicability to Transaction Time Prediction for Time-Critical C-ITS Applications
Stepanov N., Veprev A., Sharapova A. et al., in: 2021 44th International Conference on Telecommunications and Signal Processing (TSP). IEEE Computer Society, 2021. P. 408–413.
The demand in providing a reliable operation of modern Cooperative Intelligent Transport Systems (C-ITS) is growing tremendously. One of the enablers for improving efficiency is applying Machine Learning (ML) techniques to predict network characteristics. This work proposes methods to solve the Regression and Classification tasks for the real-life C-ITS mission-critical (transaction) data between buses and ...
Added: October 2, 2021
Data Clustering, Keyword Extraction, and Lexical Diversity in Essay Texts of a Learner Corpus
Scherbakova A., in: Intercultural Space: Linguistic and Didactic Aspects. Proceedings of the sections "Intercultural Linguistics", "Intercultural Translatology" and the student research forum. Plenary session and the section "Intercultural Didactics". Part 2. Издательство ПетрГУ, 2021.
The paper focuses on the task of clustering essays produced by ESL (English as a Second Language) learners. The data was taken from a learner corpus REALEC. The division of texts by certain characteristics can be useful to speed up the analysis of a single corpus or access to the necessary sections of a large ...
Added: September 30, 2021
Two Problems of Russian Statistics: A User's View
Bessonov V. A., Вопросы статистики 2021 Vol. 28 No. 4 P. 5–22
The article discusses two groups of problems in Russian statistics that still have no viable solutions. The first one - the state of the statistics interface – is the set of channels through which users obtain statistical information. The second – metadata status – is the information on how the indicators are constructed. The ...
Added: September 14, 2021
Chapter 8 Building Resilience into the Metadata-Based ETL Process Using Open Source Big Data Technologies
Panfilov P., Suleykin A., in: Resilience in the Digital Age. Vol. 12660: Lecture Notes in Computer Science. Springer, 2021. Ch. 8 P. 139–153.
Extract-transform-load (ETL) processes play a crucial role in data analysis in real-time data warehouse environments which demand low latency and high availability features for functionality. In essence, ETL processes are becoming bottlenecks in such environments due to complexity growth, number of steps in data transformations, number of machines used for data processing and finally, increasing impact of ...
Added: February 5, 2021
The Transformation of Academic Writing in the Digital Age
Safonova M., Safonov A. A., Высшее образование в России 2021 No. 2 P. 144–153
Retaining its unifying function in creating academic texts, academic writing is undergoing various changes brought about by digitisation. The key trends are related to three aspects of academic writing: formal bibliometrics, principles of collaboration, and interaction with the readers. The first trend concerns the growing importance of identifiers, citation standards, key words and other ...
Added: December 12, 2020
A Multi-Agent Control System for Ground-Based Unmanned Vehicles
Akopov A. S., Бекларян Л. А., Khachatryan N. et al., Информационные технологии 2020 Vol. 26 No. 6 P. 342–353
This article presents a ground-based unmanned vehicles (UV) control system developed using agent-based simulation methods (supported by AnyLogic). An important feature of such a system is the ability to assess the influence of various parameters (such as average initial speeds, input stream intensities, frequency of data exchange between BTS agents, etc.) on the behavior and ...
Added: June 18, 2020
Industrial track: Architecting railway KPIs data processing with Big Data technologies
Suleykin, A., Panfilov, P., Bakhtadze, N., in: 2019 IEEE International Conference on Big Data (Big Data). IEEE, 2019. P. 2047–2056.
In our conducted research we have built the data processing pipeline for storing railway KPIs data based on Big Data open-source technologies – Apache Hadoop, Kafka, Kafka HDFS Connector, Spark, Airflow and PostgreSQL. Created methodology for data load testing allowed to iteratively perform data load tests with increased data size and evaluate needed cluster software ...
Added: February 27, 2020
Implementing Big Data Processing Workflows Using Open Source Technologies
Suleykin, A., Panfilov P., in: 30th DAAAM International Symposium on Intelligent Manufacturing and Automation 2019. Vol. 30 (1): Proceedings of the 30th DAAAM International Symposium "Intelligent Manufacturing & Automation". Curran Associates, Inc., 2019. Ch. 054 P. 0394–0404.
In our implementation research, we apply workflow approach to the modeling and development of the Big Data processing pipeline using open source technologies. The data processing workflow is a set of interrelated steps which launch some particular jobs such as Spark job, shell job or PostgreSQL command. All workflow steps are chained to form ...
Added: February 2, 2020
A European Perspective on Privacy and Mass Surveillance at the Crossroads
Rusinova V. / Series WP BRP "Basic research program". 2019. No. 87.
This article concentrates on two recent judgments issued by the European Court of Human Rights (ECHR) Chambers, on Centrum för Rättvisa v. Sweden and Big Brother Watch and Others v. the United Kingdom, which expressly acknowledged that mass surveillance per se does not violate the Convention on the Protection of Human Rights and Fundamental Freedoms. These judgments have been recently referred to ...
Added: April 1, 2019
Big Data and travel industry
Булгаков А. Л., Financial and Economic Tools Used in the World Hospitality Industry: Proceedings of the 5th International Conference on Management and Technology in Knowledge, Service, Tourism & Hospitality 2017 (SERVE 2017), 21-22 October 2017 & 30 November 2017 2018 Vol. 1 P. 265–270
The use of Big Data technology has been a modern trend in the travel industry over the last 10 years. At present, almost all travel companies that desire to stay profitable and be customer-oriented use the Big Data technology. Therefore, we have several questions to answer: should we use Big Data in tourism or should ...
Added: October 30, 2018
Towards a Cloud Computing Paradigm for Big Data Analysis in Smart Cities
Massobrio R., Nesmachnow S., Tchernykh A. et al., Programming and Computer Software 2018 Vol. 44 No. 3 P. 181–189
In this paper, we present a Big Data analysis paradigm related to smart cities using cloud computing infrastructures. The proposed architecture follows the MapReduce parallel model implemented using the Hadoop framework. We analyse two case studies: a quality-of-service assessment of public transportation system using historical bus location data, and a passenger-mobility estimation using ticket sales ...
Added: August 10, 2018
The pros and cons of the Intelligent Transportation System implementation at toll plaza in Russia
Plaksin S., Kondrashov A., Ястребова Е. В. et al. / Series WP BRP "Basic research program". 2015. No. 02/URB/2015.
The implementation of Intelligent Transportation System elements into the toll plaza system is an actual topic nowadays and its positive effect is the subject of wide speculations. It is considered that the toll plaza Intelligent Transportation System can play a significant role in construction and operating costs reduction and improve the traffic safety. Also, the ...
Added: December 9, 2015