Fitting Pattern Structures for Knowledge Discovery in Big Data

S. Kuznetsov

P. 254–266.

Pattern structures, an extension of FCA to data with complex descriptions, offer an alternative to conceptual scaling (binarization) by providing a direct route to knowledge discovery in complex data such as logical formulas, graphs, strings, and tuples of numerical intervals. Whereas classification with pattern structures based on generating classifiers in advance can lead to doubly exponential complexity, combining lazy evaluation with projection approximations of the initial data, randomization, and parallelization reduces the algorithmic complexity to a low-degree polynomial, which makes the approach feasible for big data.
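To make the idea of lazy evaluation concrete, the following is a minimal sketch of lazy classification on tuples of numerical intervals, one of the description types named in the abstract. It is an illustration under stated assumptions, not the paper's algorithm: the function names (`to_pattern`, `meet`, `covers`, `votes`, `classify_lazy`) and the simple vote-counting rule are hypothetical, and projections, randomization, and parallelization are left out. Instead of generating all classifiers in advance, the test object is intersected with each training example, and the resulting pattern is kept only if it covers no example of the opposite class.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]
Pattern = Tuple[Interval, ...]


def to_pattern(values: Sequence[float]) -> Pattern:
    """Describe a numeric object as a tuple of degenerate intervals [v, v]."""
    return tuple((v, v) for v in values)


def meet(a: Pattern, b: Pattern) -> Pattern:
    """Similarity of two interval patterns: the componentwise convex hull."""
    return tuple((min(x[0], y[0]), max(x[1], y[1])) for x, y in zip(a, b))


def covers(pattern: Pattern, obj: Pattern) -> bool:
    """True if every interval of obj lies inside the matching interval of pattern."""
    return all(p[0] <= o[0] and o[1] <= p[1] for p, o in zip(pattern, obj))


def votes(test: Pattern, same: List[Pattern], other: List[Pattern]) -> int:
    """Count examples in `same` whose meet with `test` covers no example in `other`."""
    count = 0
    for example in same:
        candidate = meet(test, example)
        if not any(covers(candidate, counter) for counter in other):
            count += 1
    return count


def classify_lazy(test_values, positives, negatives) -> str:
    """Classify one object on demand (hypothetical voting rule, for illustration)."""
    test = to_pattern(test_values)
    pos = [to_pattern(v) for v in positives]
    neg = [to_pattern(v) for v in negatives]
    pos_votes = votes(test, pos, neg)
    neg_votes = votes(test, neg, pos)
    if pos_votes > neg_votes:
        return "positive"
    if neg_votes > pos_votes:
        return "negative"
    return "undefined"


positives = [(1.0, 1.0), (2.0, 1.5)]
negatives = [(8.0, 8.0), (9.0, 9.0)]
label_near_positives = classify_lazy((1.5, 1.2), positives, negatives)
label_near_negatives = classify_lazy((8.5, 8.5), positives, negatives)
```

In this sketch each query performs on the order of |G+| · |G−| · n interval comparisons for n numerical attributes, which is polynomial in the size of the training data; no set of classifiers is built beforehand.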

Language: English

Keywords: knowledge discovery, pattern structures, big data

Publication based on the results of:

Mathematical Models, Algorithms, and Software Tools for the Intelligent Analysis of Big Textual and Structural Data (2013)

In book

Proc. 11th International Conference on Formal Concept Analysis (ICFCA 2013)

Vol. 7880. Springer, 2013.

SIGMOD/PODS '21: Proceedings of the 2021 International Conference on Management of Data

NY: ACM, 2021.

The annual ACM SIGMOD/PODS Conference is a leading international forum for database researchers, practitioners, developers, and users to explore cutting-edge ideas and results, and to exchange techniques, tools, and experiences. The conference includes a fascinating technical program with research and industrial talks, tutorials, demos, and focused workshops. It also hosts a poster session to learn about innovative ...

Added: April 28, 2021

Clustering of Big Medical Data as a Toolkit for Decision Support Systems in Mathematical Cardiology Using Cloud Technologies

Shmid A., Новопашин М. А., Зимина Е. Ю., Системный администратор 2018 Vol. 188-189 No. 07-08 P. 92–96

The widespread use of mobile devices for recording electrocardiograms (ECGs) is increasing the number of ECGs from many patients that are available for study. This opens up new opportunities for studying the oscillatory processes in the long-term dynamics of the individual state of any patient's cardiovascular system (CVS). The article demonstrates new possibilities for long-term continuous monitoring of the CVS state of large numbers of patients, making it possible to identify patterns in CVS dynamics that lead to ...

Added: September 13, 2018

Proceedings of the International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at ECAI 2014)

Prague: CEUR Workshop Proceedings, 2014.

The first and second editions of the FCA4AI Workshop showed that many researchers working in Artificial Intelligence are indeed interested in a well-founded method for classification and mining such as Formal Concept Analysis (see http://www.fca4ai.hse.ru/). The first edition of FCA4AI was co-located with ECAI 2012 in Montpellier and published as http://ceur-ws.org/Vol-939/ while the ...

Added: September 12, 2014

Digital Autocracy: Institutional Specifics of Relations Between the State and IT Companies

Balayan A. A., Томин Л. В., Публичная политика 2020 Vol. 4 No. 2 P. 101–115

The article is devoted to the study of the genealogy of models of digital autocracies as a historical constellation of economic, political and technological factors. They were formed as a response to a number of challenges: adaptation to the transforming neoliberal economic system, the preservation of socio-political stability in the context of further market reforms ...

Added: June 30, 2021

Pattern structures for news clustering

Makhalova T., Ilvovsky D., Galitsky B., in: Proceedings of the International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at IJCAI 2015). Buenos Aires: [s.n.], 2015. P. 35–42.

Web search results are usually presented as a long list of document snippets, which makes it difficult for users to navigate the collection of texts. We propose a clustering method that uses a pattern structure constructed on augmented syntactic parse trees. In addition, we compare our method to other clustering methods and demonstrate the limitations of the competing methods. ...

Added: October 11, 2016

Storage and Processing of Social Network Graphs

Polyakov I. V., Chepovskiy A., Chepovskiy A., Вестник Новосибирского государственного университета. Серия: Информационные технологии 2013 Vol. 11 No. 4 P. 77–83

This paper presents a special data structure for storing and operating on large social graphs. We mainly discuss searching for paths in the graph, obtaining subgraphs, and adding new edges and vertices. ...

Added: October 17, 2013

Basic Data Structures of the FCART Decision Support System

Parinov A., Научно-техническая информация. Серия 2: Информационные процессы и системы 2014

The article examines combinations of the basic data structures of the local storage of the FCART decision support system and reports timing characteristics for large volumes of data. ...

Added: November 19, 2013

FCA and pattern structures for mining care trajectories

Buzmakov A. V., Egho E., Jay N. et al., in: Proceedings of the International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at IJCAI 2013). Issue 1058. Beijing: CEUR Workshop Proceedings, 2013. P. 7–14.

In this paper, we are interested in the analysis of sequential data and we propose an original framework based on Formal Concept Analysis (FCA). For that, we introduce sequential pattern structures, an original specification of pattern structures for dealing with sequential data. Pattern structures are used in FCA for dealing with complex data such as ...

Added: October 23, 2015

Sequence-Based and Structure-Based Machine-Learning Models for Recognition of 3’-End L1 and Alu Stem-Loops in Human Genome

Poptsova M., Шеин А. В., Zaikin A., in: The proceedings of International congress «Biotechnology: state of the art and perspectives» February 25 - 27, 2019. LLC “RED GROUP”, 2019. P. 356–356.

We built and evaluated two types of models, sequence-based and structure-based, for recognition of 3’-end stem-loops of human L1s and Alus, and found the most important parameters contributing to recognition: Shift, Tilt and Rise, and also hydrophilicity. ...

Added: November 12, 2019

Synthesis of an Information System for Managing the Technical Support Subsystems of Intelligent Buildings

Vikentyeva O., Deryabin A. I., Shestakova L. V. et al., Вестник Московского государственного строительного университета 2017 Vol. 12 No. 10 P. 1191–1201

Subject: smart house maintenance requires taking into account a number of factors - resource conservation, mitigating working expenditures, safety enhancement, ensuring comfort of leisure and operation. Automation of such engineering systems networks as illumination, climate control, security and communication, may be achieved through utilization of contemporary technologies (e.g. IoT – Internet of Things). However, storing ...

Added: November 21, 2017

Loan Portfolio Dataset From MakerDAO Blockchain Project

Chaleenutthawut Y., Davydov V., Evdokimov M. et al., IEEE Access 2024 Vol. 12 P. 24843–24854

Decentralized finance (DeFi) offers a range of financial instruments and services that leverage the capabilities of web3 technology. Maker protocol, which enables users to obtain loans backed by cryptocurrencies, is one of them. Unlike traditional banks, Maker’s data is transparently recorded on the Ethereum blockchain. In this research paper, we focus on analyzing the lending ...

Added: September 4, 2024

Data-Driven Authoritarianism: Non-Democracies and Big Data

Kabanov Y., Karyagin M., in: Digital Transformation and Global Society. Third International Conference, DTGS 2018, St. Petersburg, Russia, May 30 – June 2, 2018, Revised Selected Papers, Part I. Issue 858. Cham: Springer, 2018. P. 144–155.

The article discusses the problems of power asymmetry and political dynamics in the era of Big Data, assessing the impact Big Data may have on power relations and political regimes. While the issues of political ethics of the data turn are mostly discussed in relation to democracies, little attention has been given to hybrid regimes ...

Added: October 27, 2018

2020 Global Smart Industry Conference (GloSIC)

IEEE, 2020.

Added: December 3, 2020

Proceedings of The 11th International Conference on Theory and Practice of Electronic Governance (ICEGOV2018)

Styrin E. M., Sandoval-Almazan R., NY: ACM Press, 2018.

The 11th International Conference on Theory and Practice of Electronic Governance (ICEGOV2018) took place in Galway, Republic of Ireland, between 4 and 6 April 2018. The conference was held under the high patronage of the Department of Public Expenditure and Reform (DPER), Government of Ireland. The Insight Centre for Data Analytics, part of the National ...

Added: June 12, 2018

Current Trends in the Collection and Analysis of Medical Statistics Data

Arkhipova M., Sirotin V., in: Digital Statistics. New Challenges and Trajectory: Proceedings of the IV Congress of Medical Statisticians of Moscow, September 21–23, 2022. Moscow: ГБУ «НИИОЗММ ДЗМ», 2022. P. 4–5.

Added: January 12, 2023

Big Data in Education: DATA-ANTHROPO for Development Policies and Practices

Наука, 2022.

The book presents the conceptual DATA-ANTHROPO approach to educational data analytics. The approach is based on data analysis methods that reveal the determinants and correlations of the development of individuals and human groups. Instead of the standard system of analysis indicators used in the institutional approach, it uses a system of indicators that includes metrics of human potential development (development values, satisfaction with the development opportunities provided, conditions for self-realization, choice, participation ...

Added: October 19, 2022

AgroTech. AI, Big data, IoT

Springer, 2022.

At present, agricultural economics has to solve the complex and responsible task of provision of food security. The problem with the achievement of this task is that the innovative business trends of recent years were concentrated in other spheres of the economy and were only indirectly connected to the agricultural economy. Thus, in the first ...

Added: March 15, 2024

Service-Oriented Computing

Berlin, Heidelberg: Springer, 2013.

The proceedings of the 11th International Conference on Service-Oriented Computing (ICSOC 2013), held in Berlin, Germany, December 2–5, 2013, contain high-quality research papers that represent the latest results, ideas, and positions in the field of service-oriented computing. Since the first meeting more than ten years ago, ICSOC has grown to become the premier international forum ...

Added: March 21, 2014

Fast Generation of Best Interval Patterns for Nonmonotonic Constraints

Buzmakov A. V., Kuznetsov S., Napoli A., in: Machine Learning and Knowledge Discovery in Databases. European Conference, ECML PKDD 2015, Porto, Portugal, September 7-11, 2015, Proceedings, Part 2. Vol. 9285. Dordrecht, L., Cham, Heidelberg, NY: Springer, 2015. P. 157–172.

In pattern mining, the main challenge is the exponential explosion of the set of patterns. Typically, to solve this problem, a constraint for pattern selection is introduced. One of the first constraints proposed in pattern mining is support (frequency) of a pattern in a dataset. Frequency is an anti-monotonic function, i.e., given an infrequent pattern, ...

Added: October 22, 2015

Proceedings of the XVIII International Conference DAMDID/RCDL’2016, October 11-14, 2016, Ershovo, Moscow Region, Russia

НИЯУ МИФИ, 2016.

In 2016 the International Conference “Data Analytics and Management in Data Intensive Domains” (DAMDID/RCDL’2016) was held on October 11 – 14 in the Holiday Center, Ershovo (Moscow region). By tradition the “Data Analytics and Management in Data Intensive Domains” conference (DAMDID) is planned as a multidisciplinary forum of researchers and practitioners from various domains of science and research, promoting ...

Added: January 26, 2017

Directions for Regulating Big Data and Privacy Protection in the New Economic Reality

Savelyev A., Закон 2018 No. 5 P. 122–143

The paper is focused on the analysis of possible ways of improvement of existing personal data legislation in order to eliminate unreasonable barriers to the implementation of Big Data technologies in light of associated risks: data leaks, improper use of data, processing of imprecise data, and discriminatory practices. The author criticizes legislative initiatives to liberalize ...

Added: May 15, 2018

Concept-based chatbot for interactive query refinement in product search

Goncharova E., Ilvovsky D., Galitsky B., in: Proceedings of the 9th International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI 2021). Vol. 2972. CEUR-WS, 2021. P. 51–58.

Added: October 28, 2021

Information Spaces for Big Data Processing: Unification and Parallelization of Sequential Information Accumulation Procedures

Golubtsov P., in: 21st IEEE Conference on Business Informatics (CBI). IEEE Computer Society, 2019. P. 212–220.

In large-scale research, data are usually collected on many sites, have a huge volume, and new data are constantly generated. Since it is often impossible to collect all the relevant data on a single computer, much attention is paid to the algorithms that provide sequential or parallel accumulation of information and do not need to ...

Added: July 31, 2019

Big Data and Its Applications in the Electric Power Industry: From Business Analytics to Virtual Power Plants

Krylov V., Крылов С. В., Moscow: Нобель Пресс, 2014.

Intended for students and specialists in information systems development, including for the electric power industry, for heads of enterprise IT departments, and for anyone who works on planning the development of the electric power industry or is simply interested in progress in this area. The book examines the field of data processing known as Big Data and describes its techniques and technologies. The main focus ...

Added: October 10, 2015