This book constitutes the proceedings of the 16th International Conference on Formal Concept Analysis, ICFCA 2021, held in Strasbourg, France, in June/July 2021.
The 14 full papers and 5 short papers presented in this volume were carefully reviewed and selected from 32 submissions. The book also contains four invited contributions in full paper length.
The research part of this volume is divided into four sections. First, "Theory" collects works on theoretical advances in FCA. Second, the section "Rules" consists of contributions devoted to implications and association rules. The third section, "Methods and Applications", comprises results on new algorithms and their applications. Finally, "Exploration and Visualization" introduces different approaches to data exploration.
This book constitutes the proceedings of the 20th International Conference on Mathematical Optimization Theory and Operations Research, MOTOR 2021, held in Irkutsk, Russia, in July 2021.
The 29 full papers and 1 short paper presented in this volume were carefully reviewed and selected from 102 submissions. Additionally, 2 full invited papers are presented in the volume. The papers are grouped in the following topical sections: combinatorial optimization; mathematical programming; bilevel optimization; scheduling problems; game theory and optimal control; operational research and mathematical economics; data analysis.
The book begins with a discussion of what a performant system is and progresses to measuring performance and setting performance goals. It introduces different classes of queries and the optimization techniques suited to each, such as the use of indexes and specific join algorithms. You will learn to read and understand query execution plans, along with techniques for influencing those plans to achieve better performance. The book also covers advanced topics such as the use of functions and procedures, dynamic SQL, and generated queries. All of these techniques are then used together to produce performant applications, avoiding the pitfalls of object-relational mappers.
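As a quick illustration of what reading an execution plan looks like, here is a minimal sketch using Python's built-in sqlite3 module for portability; the table, index, and query are hypothetical, and other engines (e.g., PostgreSQL's EXPLAIN) use a different syntax:

```python
# Minimal sketch: inspect a query plan before and after adding an index.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                 [(i % 100, i * 1.5) for i in range(1000)])

# Without an index, the plan reports a full table scan.
for row in conn.execute("EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"):
    print(row)

# With an index on the filter column, the plan switches to an index search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
for row in conn.execute("EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"):
    print(row)
```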
The materials of the International Scientific and Practical Conference are presented below. The conference reflects the current state of innovation in education, science, industry, and the socio-economic sphere from the standpoint of the introduction of new information technologies. It will be of interest to a wide range of researchers, teachers, graduate students, and practitioners in the field of innovation and information technologies.
This special conference starts a new series and thereby launches a tradition to follow, as well as an opportunity for a rapid spin-up. The ICCQ conference was organized by the HSE and leading innovative IT companies such as Huawei and Yandex. ICCQ 2021 attracted a number of renowned experts, including Jens Palsberg, Anders Møller, and David West. Papers were submitted from all over the world, and the conference attracted speakers and attendees from the USA, Europe, and Asia, making it a truly international event. Although ICCQ started as a relatively small single-day conference, it immediately gained IEEE support. The plan for the coming years is to embrace the world while keeping high quality standards.
This book focuses on crisis management in software development, covering forecasting, response, and adaptive engineering models, methods, patterns, and practices. It helps stakeholders understand and identify the key technology, business, and human factors that may result in a software production crisis. These factors are particularly important for enterprise-scale applications, which are typically complex in both managerial and technological aspects and are therefore specifically addressed by the discipline of software engineering. The book sheds light on crisis-responsive, resilient methodologies and practices, and also examines their evolution and the resulting benefits.
The international conference “Linguistic Forum 2020: Language and Artificial Intelligence” took place on November 12–14, 2020, in Moscow, Russia. The conference was organized by the Institute of Linguistics, Russian Academy of Sciences, and is part of a series of annual forums initiated by the Institute of Linguistics RAS in 2019. The aim of the 2020 forum was to foster dialogue among researchers working at the interface of linguistics and artificial intelligence, including those engaged in computational linguistics and natural language processing. Developments in AI have been responsible for recent advances in natural language generation and comprehension; they have also expanded the boundaries of these technologies’ applicability. Neural networks and dense embeddings have replaced models based on feature engineering and the traditional discrete categories of linguistic analysis. As a result, the boundary between fundamental and applied linguistic research is being eroded. Empirical linguistics is taking on board these new technologies, in part to enable better modelling of language and documentation of data. AI is also increasingly becoming part of the everyday life of language users. Can fundamental linguistics currently offer technologically viable ideas or methods? These and similar conceptual and methodological problems were the focus of the forum.
In this paper, a statistical game is defined and solved. Its solution consists of the optimal randomized decision rule, the probability of a correct decision under this rule, and the worst-case a priori distribution of the test subjects' knowledge levels. We developed a method for assessing the accuracy and reliability of decision making based on test results. The proposed program makes it possible to assess the reliability of the solution for a test containing 10 items of different difficulty levels and 11 distinct knowledge levels.
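For illustration, a minimal sketch of solving such a statistical (zero-sum) game by linear programming with scipy; the 3x3 payoff matrix is a hypothetical stand-in for the paper's test model, not its actual data:

```python
# Sketch: optimal randomized decision rule as the maximin strategy of a
# zero-sum game. A[i, j] = probability of a correct decision when the true
# knowledge level is i and the rule outputs level j (hypothetical values).
import numpy as np
from scipy.optimize import linprog

A = np.array([[0.9, 0.3, 0.1],
              [0.4, 0.8, 0.4],
              [0.1, 0.3, 0.9]])
n_levels, n_decisions = A.shape

# Variables: mixed rule x (n_decisions entries) and game value v.
# Maximize v subject to: for every level i, sum_j A[i, j] x_j >= v; sum x = 1.
c = np.zeros(n_decisions + 1)
c[-1] = -1.0                                      # linprog minimizes, so minimize -v
A_ub = np.hstack([-A, np.ones((n_levels, 1))])    # encodes v - A @ x <= 0
b_ub = np.zeros(n_levels)
A_eq = np.zeros((1, n_decisions + 1))
A_eq[0, :n_decisions] = 1.0
b_eq = [1.0]
bounds = [(0, None)] * n_decisions + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, v = res.x[:n_decisions], res.x[-1]
print("optimal randomized rule:", x)
print("guaranteed probability of a correct decision:", v)
# The worst a priori distribution over knowledge levels is the dual solution
# of this LP (obtainable by solving the symmetric dual problem).
```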
The article suggests integrating a neural network as a parallel element base into a telecommunication system. The main advantage exploited here is the network's ability to learn and adapt to external conditions. Where applicable, this ability can improve the noise immunity, reliability, and operability of telecommunication systems. As an example, the article considers the integration of a neural network into a discrete matched signal filter. It is noted that the use of parallel mathematical methods in signal processing yields the greatest gains in the quality parameters of such telecommunication elements.
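For reference, a minimal sketch of the classical (non-neural) discrete matched filter that serves as the baseline here; the template shape and noise level are illustrative:

```python
# Sketch: a discrete matched filter reduces to correlating the received
# sequence with the known signal template (the filter's impulse response
# is the time-reversed template).
import numpy as np

rng = np.random.default_rng(0)
template = np.array([1.0, 1.0, -1.0, 1.0, -1.0, -1.0, 1.0])  # known signal shape
received = rng.normal(scale=0.5, size=200)                    # channel noise
received[120:120 + template.size] += template                 # signal at offset 120

# Matched filtering == cross-correlation with the template.
output = np.correlate(received, template, mode="valid")
print("estimated signal position:", int(np.argmax(output)))   # ~120
```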
Artificial intelligence and machine learning help to improve the quality of customer service and are changing how companies operate. For this reason, enterprises should consider integrating these technologies into their digital transformation plans to remain competitive. Low-code machine learning platforms allow companies and business professionals with minimal coding experience to create applications and fill personnel gaps in their organizations. Automated machine learning (AutoML) technology represents the next step in the evolution of machine learning, enabling non-technical companies to create machine learning applications quickly and cheaply.
The article discusses the possibilities of studying the state of the social sphere, by administrative areas and city districts, using the repository of the Moscow Government open data portal together with intelligent technologies from Business Intelligence platforms and Data Science and Machine Learning platforms. It presents opportunities for applying machine learning technologies within business analytics platforms to uncover hidden patterns and thereby support informed management decisions.
Proceedings of the international conference "Neural Information Processing Systems 2020" (NeurIPS 2020).
Proceedings of Machine Learning Research, Volume 119: International Conference on Machine Learning, 12–18 July 2020.
The 24th European Conference on Advances in Databases and Information Systems (ADBIS 2020) was set to be held in Lyon, France, during August 25–28, 2020, in conjunction with the 24th International Conference on Theory and Practice of Digital Libraries (TPDL 2020) and the 16th EDA days on Business Intelligence & Big Data (EDA 2020). However, because of the worldwide COVID-19 crisis, ADBIS, TPDL, and EDA had to take place online during August 25–27, 2020. Yet, the three conferences joined forces to propose common keynotes, workshops, and a Doctoral Consortium.
MiRNA isoforms (isomiRs) are single-stranded small RNAs originating from the same pri-miRNA hairpin as a result of cleavage by the Drosha and Dicer enzymes. Variations at the 5ʹ-end of a miRNA alter the seed region of the molecule, thus affecting the targetome of the miRNA. In this manuscript, we analysed the distribution of miRNA cleavage positions across 31 different cancers using miRNA sequencing data from the TCGA project. We found that the processing positions are not tissue specific and that all miRNAs can be correctly classified as exhibiting homogeneous or heterogeneous cleavage at one of the four cleavage sites. In 42% of cases (42 out of 100 miRNAs), we observed imprecise 5ʹ-end Dicer cleavage, while this fraction was only 14% for Drosha (14 out of 99). By contrast, almost all 3ʹ-end cleavage sites (whether Drosha or Dicer) were heterogeneous. Using only the four nucleotides surrounding a 5ʹ-end Dicer cleavage position, we built a model that distinguishes homogeneous from heterogeneous cleavage with reasonable quality (ROC AUC = 0.68). Finally, we illustrated a possible application of the study by analysing two 5ʹ-end isoforms originating from the same exogenous shRNA hairpin. It turned out that the less expressed shRNA variant was functionally active, which led to increased off-targeting. Thus, the obtained results can be applied to the design of shRNAs whose processing results in a single 5ʹ-variant.
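A minimal sketch of the general modelling idea (not the authors' exact pipeline): one-hot encode the 4-nt window around the cleavage site and fit a simple classifier scored by ROC AUC; the sequences and labels below are synthetic placeholders:

```python
# Sketch: classify cleavage precision from the four nucleotides around a
# cleavage position, using one-hot features and logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
NUC = "ACGU"

def one_hot(window):
    """Encode a 4-nt window as a flat 16-dimensional 0/1 vector."""
    v = np.zeros(16)
    for i, n in enumerate(window):
        v[4 * i + NUC.index(n)] = 1.0
    return v

windows = ["".join(rng.choice(list(NUC), size=4)) for _ in range(300)]
# Placeholder rule standing in for real homogeneous/heterogeneous labels:
y = np.array([1 if w[1] in "GC" else rng.integers(0, 2) for w in windows])
X = np.array([one_hot(w) for w in windows])

model = LogisticRegression(max_iter=1000)
auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
print(f"cross-validated ROC AUC: {auc:.2f}")
```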
The goal of this paper is to develop a new algorithm for predicting whether a company will go bankrupt, based on imbalanced data. To this end, we treat the classification task as a multi-objective optimization problem and construct a prediction model as an ensemble while simultaneously minimizing the false positive rate (FPR) and the false negative rate (FNR). To create the ensemble, the proposed Multi-Objective Classifier Selection (MOCS) algorithm selects only classifiers that belong to the Pareto-optimal set in FPR/FNR space, i.e., classifiers between which there is no dominance, and that satisfy some additional conditions. In the general case, MOCS is determined by three parameters: two threshold values that limit the false rates (FNR and FPR), and the crowding distance, which enforces the uniqueness of a classifier's results. We tested the proposed algorithm on data collected from 2457 Russian companies, 456 of which went bankrupt, and 5910 Polish companies, 410 of which received bankruptcy status. The datasets contain features such as financial ratios and business environment factors. In the testing, we used more than 70 combinations of under-sampling, over-sampling, and no-sampling methods with static and dynamic classification models. The final ensembles include seven classifiers for the Russian dataset and four classifiers for the Polish dataset, combined by a soft voting rule. In both cases, the proposed algorithm produces a significant improvement in prediction results, both in terms of standard metrics (geometric mean, area under the ROC curve) and in the visual representation in FNR/FPR space, namely a shift of the Pareto-optimal set of classifiers.
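A minimal sketch of the selection step as we read it: keep classifiers that respect the false-rate thresholds and are non-dominated in (FPR, FNR) space, with a simple distance-based diversity filter standing in for the paper's crowding distance; the threshold values are illustrative assumptions:

```python
# Sketch: Pareto-based classifier selection in (FPR, FNR) space.
import numpy as np

def mocs_select(rates, fpr_max=0.4, fnr_max=0.4, min_dist=0.02):
    """rates: array of shape (n, 2) with (FPR, FNR) per candidate classifier."""
    rates = np.asarray(rates, dtype=float)
    selected = []
    for i, (fpr, fnr) in enumerate(rates):
        if fpr > fpr_max or fnr > fnr_max:
            continue  # violates the false-rate thresholds
        dominated = any(
            (rates[j, 0] <= fpr and rates[j, 1] <= fnr) and
            (rates[j, 0] < fpr or rates[j, 1] < fnr)
            for j in range(len(rates)) if j != i
        )
        if dominated:
            continue  # another classifier is at least as good on both rates
        # Diversity filter: keep only classifiers with distinct error profiles.
        if all(np.linalg.norm(rates[k] - rates[i]) >= min_dist for k in selected):
            selected.append(i)
    return selected

rates = [(0.10, 0.35), (0.12, 0.30), (0.30, 0.12), (0.25, 0.25), (0.50, 0.05)]
print("selected classifier indices:", mocs_select(rates))
```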
Recent statistics report that more than 3.7 million new cases of cancer occur in Europe yearly, and the disease accounts for approximately 20% of all deaths. High-throughput screening of cancer cell cultures has dominated the search for novel, effective anticancer therapies in the past decades. Recently, functional assays with patient-derived ex vivo 3D cell cultures have gained importance for drug discovery and precision medicine. We recently evaluated the major advancements and needs of 3D cell culture screening and concluded that strictly standardized and robust sample preparation is the most needed development. Here we propose an artificial-intelligence-guided low-cost 3D cell culture delivery system. It consists of a light microscope, a micromanipulator, a syringe pump, and a controller computer. The system performs morphology-based feature analysis on spheroids and can select uniformly sized or shaped spheroids to transfer them between various sample holders. It can pick samples from standard sample holders, including Petri dishes and microwell plates, and transfer them to a variety of holders up to 384-well plates. The device performs reliable semi- and fully automated spheroid transfer. This results in highly controlled experimental conditions and eliminates non-trivial side effects of sample variability, which is a key step towards next-generation precision medicine.
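A minimal sketch of the kind of morphology-based selection involved, assuming pre-segmented binary masks; the size and circularity thresholds are illustrative, not the system's actual parameters:

```python
# Sketch: keep spheroids whose area and circularity fall in a target band.
import numpy as np
from skimage.measure import label, regionprops

def select_spheroids(mask, min_area=500, max_area=5000, min_circularity=0.8):
    """Return centroids of connected components passing size/shape filters."""
    keep = []
    for region in regionprops(label(mask)):
        # Circularity = 4*pi*A / P^2, equal to 1.0 for a perfect disc.
        circularity = 4 * np.pi * region.area / max(region.perimeter, 1e-9) ** 2
        if min_area <= region.area <= max_area and circularity >= min_circularity:
            keep.append(region.centroid)  # target for the micromanipulator
    return keep

# Synthetic example: one disc-shaped "spheroid" in an empty field.
yy, xx = np.mgrid[:200, :200]
mask = (yy - 100) ** 2 + (xx - 100) ** 2 < 30 ** 2
print(select_spheroids(mask))
```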
We consider a linear-quadratic control problem in which the time parameter evolves according to a stochastic time scale, defined via a stochastic process with continuously differentiable paths. We obtain an optimal infinite-horizon control law under criteria similar to long-run averages. Several examples of stochastic time scales from various applications are examined.
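To fix ideas, a schematic formulation under our reading of the abstract; the dynamics, weight matrices Q and R, and the form of the criterion are illustrative assumptions, not the authors' exact setup:

```latex
% Schematic LQ problem on a stochastic time scale \tau(t), a stochastic
% process with continuously differentiable, increasing paths.
\[
  \dot{x}(t) = \bigl(A\,x(t) + B\,u(t)\bigr)\,\dot{\tau}(t),
\]
\[
  J(u) \;=\; \limsup_{T \to \infty} \frac{1}{T}\,
  \mathbb{E}\!\int_0^T \bigl( x(t)^{\top} Q\,x(t) + u(t)^{\top} R\,u(t) \bigr)\,\dot{\tau}(t)\,dt
  \;\longrightarrow\; \min_{u}.
\]
```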
The need for accurate balancing in electricity markets and a larger integration of renewable sources of electricity require accurate forecasts of electricity loads in residential buildings. In this paper, we consider the problem of short-term (one-day ahead) forecasting of the electricity-load consumption in residential buildings. In order to generate such forecasts, historical electricity consumption data are used, presented in the form of a time series with a fixed time step. Initially, we review standard forecasting methodologies including naive persistence models, auto-regressive-based models (e.g., AR and SARIMA), and the triple exponential smoothing Holt-Winters (HW) model. We then introduce three forecasting models, namely i) the Persistence-based Auto-regressive (PAR) model, ii) the Seasonal Persistence-based Regressive (SPR) model, and iii) the Seasonal Persistence-based Neural Network (SPNN) model. Given that the accuracy of a forecasting model may vary during the year, and the fact that models may differ with respect to their training times, we also investigate different variations of ensemble models (i.e., mixtures of the previously considered models) and adaptive model switching strategies. Finally, we demonstrate through simulations the forecasting accuracy of all considered forecasting models validated on real-world data generated from four residential buildings. Through an extensive series of evaluation tests, it is shown that the proposed SPR forecasting model can attain approximately a 7% forecast error reduction over standard techniques (e.g., SARIMA and HW). Furthermore, when models have not been sufficiently trained, ensemble models based on a weighted average forecaster can provide approximately a further 4% forecast error reduction.
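For concreteness, a minimal sketch of the seasonal-persistence idea underlying these models (PAR, SPR, and SPNN themselves are the authors' contributions and are not reproduced here): forecast each hour of the next day with the load observed at the same hour one week earlier; the synthetic series is a placeholder for real consumption data:

```python
# Sketch: naive seasonal-persistence baseline for one-day-ahead forecasting.
import numpy as np

rng = np.random.default_rng(0)
hours = 24 * 28                                   # four weeks of hourly data
t = np.arange(hours)
load = 1.0 + 0.5 * np.sin(2 * np.pi * t / 24) + rng.normal(scale=0.1, size=hours)

season = 24 * 7                                   # weekly seasonality
forecast = load[-24 - season:-season]             # same day one week earlier
actual = load[-24:]                               # the day we "predict"
mape = np.mean(np.abs((actual - forecast) / actual)) * 100
print(f"seasonal-persistence MAPE over the last day: {mape:.1f}%")
```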
High energy physics experiments rely heavily on detailed detector simulation models for many tasks. Running these detailed models typically consumes a notable share of the computing time available to the experiments. In this work, we demonstrate a new approach to speeding up the simulation of the Time Projection Chamber tracker of the MPD experiment at the NICA accelerator complex. Our method is based on a Generative Adversarial Network, a deep learning technique allowing for implicit estimation of the population distribution for a given set of objects. This approach lets us learn and then sample from the distribution of raw detector responses, conditioned on the parameters of the charged-particle tracks. To evaluate the quality of the proposed model, we integrate a prototype into the MPD software stack and demonstrate that it produces high-quality events similar to those of the detailed simulator, with a speed-up of at least an order of magnitude. The prototype is trained on responses from the inner part of the detector and, once expanded to the full detector, should be ready for use in physics tasks.
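A minimal sketch of a conditional GAN of the general kind described (the authors' actual architecture is not reproduced here); the dimensions, networks, and training details are illustrative assumptions:

```python
# Sketch: generator maps (noise, track parameters) -> fake detector response;
# discriminator scores (response, track parameters) pairs.
import torch
import torch.nn as nn

NOISE, COND, RESP = 16, 4, 64   # noise dim, track-parameter dim, response dim

gen = nn.Sequential(nn.Linear(NOISE + COND, 128), nn.ReLU(), nn.Linear(128, RESP))
disc = nn.Sequential(nn.Linear(RESP + COND, 128), nn.ReLU(), nn.Linear(128, 1))
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_resp, params):
    batch = real_resp.size(0)
    z = torch.randn(batch, NOISE)
    fake_resp = gen(torch.cat([z, params], dim=1))

    # Discriminator: real pairs -> 1, generated pairs -> 0.
    d_loss = bce(disc(torch.cat([real_resp, params], dim=1)), torch.ones(batch, 1)) + \
             bce(disc(torch.cat([fake_resp.detach(), params], dim=1)), torch.zeros(batch, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: fool the discriminator on conditioned samples.
    g_loss = bce(disc(torch.cat([fake_resp, params], dim=1)), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Placeholder data standing in for (response, track-parameter) pairs.
for _ in range(3):
    print(train_step(torch.randn(32, RESP), torch.randn(32, COND)))
```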
The notion of a software architecture style is frequently encountered in the software engineering literature and is treated as important in books on software architecture and in university courses. However, many software developers are not so enthusiastic about it, and it is not clear whether the notion is just an academic concept or is actually used in the software industry. In this paper, we measured industrial software developers' attitudes towards the concept of a software architecture style and investigated the popularity of eleven concrete architecture styles. We applied two methods. A developer survey was used to estimate developers' overall attitude and to determine what the community thinks about automatic recognition of software architecture styles. Automatic crawlers were used to mine open-source code from the GitHub platform; these crawlers identified style smells in repositories using the features we proposed for the architecture styles. We found that the notion of a software architecture style is not merely an academic concept: many software developers apply it in their work. We formulated features for the eleven concrete software architecture styles and developed crawlers based on these features. The results of repository mining using the features show which styles are popular among developers of open-source projects from commercial companies and non-commercial communities. The automatic mining results were additionally validated by a survey of GitHub developers.
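For illustration, one hypothetical style feature of the sort a crawler could check; the MVC directory-name heuristic below is our own example, not one of the paper's published features:

```python
# Sketch: a repository-level style feature check over a checked-out tree.
import os

MVC_MARKERS = {"models", "views", "controllers"}

def looks_like_mvc(repo_root):
    """Heuristic style feature: does the tree contain all three MVC folders?"""
    found = set()
    for dirpath, dirnames, _ in os.walk(repo_root):
        found |= {d.lower() for d in dirnames} & MVC_MARKERS
    return found == MVC_MARKERS

print(looks_like_mvc("."))  # run against a cloned repository
```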
Finding relevant data while searching the internet is a major challenge for web users due to the enormous amount of information available on the web; these difficulties are related to the well-known problem of information overload. In this work, we propose an online web assistant called OWNA, a fully integrated framework for making recommendations in real time based on web usage mining techniques. Our pipeline starts by preparing raw data and then extracts useful information that helps build a knowledge base, assigning specific weights to certain factors. The experiments show the advantages of the proposed model over alternative approaches.
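A minimal sketch of a web-usage-mining recommender in this general spirit (the paper's actual weighting scheme and pipeline are not specified here): count page co-occurrences within sessions and recommend the pages most often visited together with the current one; the sessions are toy placeholders:

```python
# Sketch: session co-occurrence counts as a simple usage-based recommender.
from collections import defaultdict
from itertools import combinations

sessions = [["home", "search", "item3"],
            ["home", "item3", "item7"],
            ["search", "item3", "item7"]]

cooc = defaultdict(float)
for session in sessions:
    for a, b in combinations(set(session), 2):
        cooc[(a, b)] += 1.0
        cooc[(b, a)] += 1.0

def recommend(page, k=2):
    scores = {b: w for (a, b), w in cooc.items() if a == page}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("item3"))  # e.g., ['home', 'item7'] given the toy sessions
```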
Classical molecular dynamics (MD) calculations account for a significant share of the utilization time of high-performance computing systems. As a rule, the efficiency of such calculations rests on an interplay of software and hardware that is nowadays moving towards hybrid GPU-based technologies. Several well-developed open-source MD codes targeting GPUs differ both in their data management capabilities and in performance. In this work, we analyze the performance of the LAMMPS, GROMACS, and OpenMM MD packages with different GPU backends on Nvidia Volta and AMD Vega20 GPUs. We consider the efficiency of solving two identical MD models (generic for materials science and biomolecular studies) using different software and hardware combinations. We also describe our experience in porting the CUDA backend of LAMMPS to ROCm HIP, which shows considerable benefits for AMD GPUs compared to the OpenCL backend.