Chapter
Discovery of technology trends from patent data on the basis of predictive analytics
In book
The full texts from the third international conference on data analytics are presented.
The article addresses the problem of forecasting the size of a company's customer base as part of the customer management task. The author proposes a new approach to segment-oriented forecasting of customer numbers, based on an adaptation of O.V. Staroverov's personnel movement model. The article also discusses the conditions under which this model is applicable and how its main provisions are modified depending on the nature of the relationship between the customer and the company.
The practical relevance of process mining is increasing as more and more event data become available. Process mining techniques aim to discover, monitor and improve real processes by extracting knowledge from event logs. The two most prominent process mining tasks are: (i) process discovery: learning a process model from example behavior recorded in an event log, and (ii) conformance checking: diagnosing and quantifying discrepancies between observed behavior and modeled behavior. The increasing volume of event data provides both opportunities and challenges for process mining. Existing process mining techniques have problems dealing with large event logs referring to many different activities. Therefore, we propose a generic approach to decompose process mining problems. The decomposition approach is generic and can be combined with different existing process discovery and conformance checking techniques. It is possible to split computationally challenging process mining problems into many smaller problems that can be analyzed easily and whose results can be combined into solutions for the original problems.
Pattern structures, an extension of FCA (formal concept analysis) to data with complex descriptions, offer an alternative to conceptual scaling (binarization) by providing a direct way to discover knowledge in complex data such as logical formulas, graphs, strings, and tuples of numerical intervals. Whereas classification with pattern structures based on generating classifiers in advance can lead to doubly exponential complexity, combining lazy evaluation with projection approximations of the initial data, randomization, and parallelization reduces the algorithmic complexity to a low-degree polynomial and is thus feasible for big data.
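To make the idea of lazy classification concrete, the following is a minimal sketch for one kind of description named above, tuples of numerical intervals. It is not taken from the paper: the function names (`meet`, `subsumes`, `lazy_classify`) and the simple voting rule are illustrative assumptions. The similarity of two interval tuples is taken to be their componentwise convex hull, and no classifier is built in advance. Each test object is compared only with the training examples, so the cost per test object is polynomial in the number of examples.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]
Pattern = List[Interval]


def to_pattern(values: Sequence[float]) -> Pattern:
    """Describe a numeric object as a tuple of degenerate intervals [v, v]."""
    return [(v, v) for v in values]


def meet(p: Pattern, q: Pattern) -> Pattern:
    """Similarity of two interval tuples: the componentwise convex hull."""
    return [(min(a1, a2), max(b1, b2)) for (a1, b1), (a2, b2) in zip(p, q)]


def subsumes(general: Pattern, specific: Pattern) -> bool:
    """True if every interval of `specific` lies inside that of `general`."""
    return all(
        g_lo <= s_lo and s_hi <= g_hi
        for (g_lo, g_hi), (s_lo, s_hi) in zip(general, specific)
    )


def lazy_classify(
    test: Sequence[float],
    positives: Sequence[Sequence[float]],
    negatives: Sequence[Sequence[float]],
) -> str:
    """Vote with similarities computed on demand (illustrative rule).

    A positive example votes for the test object if the similarity of the
    two covers no negative example, and symmetrically for negatives.
    """
    t = to_pattern(test)
    pos = [to_pattern(x) for x in positives]
    neg = [to_pattern(x) for x in negatives]
    votes_pos = sum(
        1 for p in pos if not any(subsumes(meet(t, p), n) for n in neg)
    )
    votes_neg = sum(
        1 for n in neg if not any(subsumes(meet(t, n), p) for p in pos)
    )
    if votes_pos > votes_neg:
        return "positive"
    if votes_neg > votes_pos:
        return "negative"
    return "undetermined"


# Hypothetical toy data: two numeric attributes per object.
positives = [(1.0, 1.0), (2.0, 1.5)]
negatives = [(5.0, 5.0), (6.0, 4.0)]
label = lazy_classify((1.5, 1.2), positives, negatives)
```

The projections, randomization, and parallelization mentioned in the abstract are not shown here. The sketch covers only the lazy step, which computes similarities per test object instead of enumerating all classifiers beforehand.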
The proceedings of the 11th International Conference on Service-Oriented Computing (ICSOC 2013), held in Berlin, Germany, December 2–5, 2013, contain high-quality research papers that represent the latest results, ideas, and positions in the field of service-oriented computing. Since the first meeting more than ten years ago, ICSOC has grown to become the premier international forum for academics, industry researchers, and practitioners to share, report, and discuss their ground-breaking work. ICSOC 2013 continued along this tradition, in particular focusing on emerging trends at the intersection between service-oriented, cloud computing, and big data.
The article analyzes the prospects for using Big Data technology in jurisprudence. It argues that Big Data should be used both to explain phenomena and to forecast consequences. The author describes the problems that arise when applying Big Data in legal research. These problems may be technical (data access, technical capabilities, data verification) or substantive (interpretation of the obtained data and correlations). The article concludes that research using Big Data should be intensified, taking the described limitations into account.
Operational processes leave trails in the information systems supporting them. Such event data are the starting point for process mining – an emerging scientific discipline relating modeled and observed behavior. The relevance of process mining is increasing as more and more event data become available. The increasing volume of such data (“Big Data”) provides both opportunities and challenges for process mining. In this paper we focus on two particular types of process mining: process discovery (learning a process model from example behavior recorded in an event log) and conformance checking (diagnosing and quantifying discrepancies between observed behavior and modeled behavior). These tasks become challenging when there are hundreds or even thousands of different activities and millions of cases. Typically, process mining algorithms are linear in the number of cases and exponential in the number of different activities. This paper proposes a very general divide-and-conquer approach that decomposes the event log based on a partitioning of activities. Unlike existing approaches, this paper does not assume a particular process representation (e.g., Petri nets or BPMN) and allows for various decomposition strategies (e.g., SESE- or passage-based decomposition). Moreover, the generic divide-and-conquer approach reveals the core requirements for decomposing process discovery and conformance checking problems.
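The log-splitting step of such a divide-and-conquer approach can be sketched as follows. This is an illustrative sketch, not the paper's implementation: an event log is assumed to be a list of traces, each trace a list of activity names, and the helper names (`project_trace`, `decompose_log`) and the example clusters are hypothetical. Each sublog keeps one projected trace per case, so a discovery or conformance technique of any kind could then be run on each smaller sublog independently.

```python
from typing import Dict, List, Set

Trace = List[str]
EventLog = List[Trace]


def project_trace(trace: Trace, activities: Set[str]) -> Trace:
    """Keep only the events whose activity is in `activities`, in order."""
    return [a for a in trace if a in activities]


def decompose_log(
    log: EventLog, clusters: Dict[str, Set[str]]
) -> Dict[str, EventLog]:
    """Build one sublog per activity cluster by projecting every trace.

    Raises ValueError if some activity in the log belongs to no cluster,
    since that activity would otherwise be silently lost.
    """
    seen = {a for trace in log for a in trace}
    covered = set().union(*clusters.values()) if clusters else set()
    missing = seen - covered
    if missing:
        raise ValueError(f"activities not in any cluster: {sorted(missing)}")
    return {
        name: [project_trace(trace, acts) for trace in log]
        for name, acts in clusters.items()
    }


# Hypothetical toy log with three cases and five activities.
log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "e", "d"],
]
clusters = {"first": {"a", "b"}, "second": {"c", "d", "e"}}
sublogs = decompose_log(log, clusters)
```

Because each sublog mentions only a subset of the activities, an algorithm whose cost grows exponentially with the number of distinct activities operates on a much smaller alphabet per subproblem. How the partial results are recombined depends on the decomposition strategy and is not shown here.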