New Approaches for Boosting to Uniformity
The use of multivariate classifiers has become commonplace in particle physics. To enhance the performance, a series of classifiers is typically trained; this is a technique known as boosting. This paper explores several novel boosting methods that have been designed to produce a uniform selection efficiency in a chosen multivariate space. Such algorithms have a wide range of applications in particle physics, from producing uniform signal selection efficiency across a Dalitz-plot to avoiding the creation of false signal peaks in an invariant mass distribution when searching for new particles.
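As a rough illustration of what "uniform selection efficiency" means, the sketch below measures the fraction of events passing a classifier cut in each bin of a chosen variable. It is only a diagnostic, not one of the boosting methods from the paper. The names `efficiency_per_bin` and `uniformity_spread`, the use of a single variable (called mass here), and the max-minus-min spread are all assumptions made for the example.

```python
def efficiency_per_bin(scores, masses, threshold, edges):
    """Fraction of events with score > threshold in each [edges[i], edges[i+1]) bin."""
    n_bins = len(edges) - 1
    passed = [0] * n_bins
    total = [0] * n_bins
    for score, mass in zip(scores, masses):
        for i in range(n_bins):
            if edges[i] <= mass < edges[i + 1]:
                total[i] += 1
                if score > threshold:
                    passed[i] += 1
                break
    # Empty bins have no defined efficiency.
    return [p / t if t else None for p, t in zip(passed, total)]


def uniformity_spread(efficiencies):
    """Hypothetical flatness measure: max minus min efficiency over non-empty bins."""
    values = [e for e in efficiencies if e is not None]
    return max(values) - min(values)


# Toy data (invented for illustration): classifier scores and masses of signal events.
scores = [0.9, 0.2, 0.8, 0.4, 0.7, 0.1]
masses = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
effs = efficiency_per_bin(scores, masses, threshold=0.5, edges=[1, 2, 3, 4])
spread = uniformity_spread(effs)
```

A spread close to zero indicates that the cut selects roughly the same fraction of events in every bin, which is the property the uniform boosting methods aim for.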
Product reviews on e-commerce sites are a valuable resource for evaluating customers' behavior, preferences, and needs. This paper provides an approach for sentiment analysis of product reviews in Russian using convolutional neural networks. We use pre-trained Word2Vec vectors as inputs for the neural networks. The approach uses no hand-crafted features or sentiment lexicons. The training dataset was collected from reviews of top-ranked goods on a major e-commerce site in Russia, with the user-assigned scores used as class labels. The system achieved an F-measure of up to 75.45% in three-class classification. The collected training dataset and word embeddings are available to the research community.
In this paper, we analyze a new approach for demand prediction in retail. One of the significant gaps in demand prediction by machine learning methods is the failure to account for censorship in sales data. Econometric approaches to modeling censored demand are used to obtain consistent and unbiased estimates of parameters. These approaches can also be transferred to different classes of machine learning models to reduce the prediction error of sales volume. In this study we build two ensemble models to predict demand, one with and one without accounting for demand censorship, each aggregating the predictions of machine learning methods such as linear regression, ridge regression, LASSO, and random forest. Having estimated the predictive properties of both models, we test whether the model that accounts for the censored nature of demand has the better predictive power.
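The effect of censorship on sales data can be sketched with a toy example. The abstract does not specify the censoring mechanism, so the sketch assumes the common stock-out case, where observed sales cannot exceed the available stock. The function names and all numbers are hypothetical.

```python
def observe_sales(demand, stock):
    """Observed sales under the assumed stock-out censoring: min(demand, stock)."""
    return [min(d, s) for d, s in zip(demand, stock)]


def censored_flags(sales, stock):
    """Mark observations where sales reached the stock level (possibly censored)."""
    return [sold >= available for sold, available in zip(sales, stock)]


# Invented daily demand and stock levels for one product.
demand = [3, 8, 5, 12]
stock = [10, 6, 5, 10]

sales = observe_sales(demand, stock)
flags = censored_flags(sales, stock)

mean_demand = sum(demand) / len(demand)
mean_sales = sum(sales) / len(sales)
```

In this toy case the mean of observed sales is below the mean of true demand, which is the bias a model trained directly on sales inherits if censorship is ignored. A day on which demand exactly equals stock is also flagged, so the flag means "possibly censored" rather than "certainly censored".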
This 2-volume set constitutes the refereed proceedings of the 9th Iberian Conference on Pattern Recognition and Image Analysis, IbPRIA 2019, held in Madrid, Spain, in July 2019.
The 99 papers in these volumes were carefully reviewed and selected from 137 submissions. They are organized in topical sections named:
Part I: best ranked papers; machine learning; pattern recognition; image processing and representation.
Part II: biometrics; handwriting and document analysis; other applications.
This book constitutes the proceedings of the 23rd International Symposium on Foundations of Intelligent Systems, ISMIS 2017, held in Warsaw, Poland, in June 2017. The 56 regular and 15 short papers presented in this volume were carefully reviewed and selected from 118 submissions. The papers include both theoretical and practical aspects of machine learning, data mining methods, deep learning, bioinformatics and health informatics, intelligent information systems, knowledge-based systems, mining temporal, spatial and spatio-temporal data, text and Web mining. In addition, four special sessions were organized; namely, Special Session on Big Data Analytics and Stream Data Mining, Special Session on Granular and Soft Clustering for Data Science, Special Session on Knowledge Discovery with Formal Concept Analysis and Related Formalisms, and Special Session devoted to ISMIS 2017 Data Mining Competition on Trading Based on Recommendations, which was launched as a part of the conference.
This two-volume set LNCS 10305 and LNCS 10306 constitutes the refereed proceedings of the 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, held at Gran Canaria, Spain, in June 2019. The 150 revised full papers presented in this two-volume set were carefully reviewed and selected from 210 submissions. The papers are organized in topical sections on machine learning in weather observation and forecasting; computational intelligence methods for time series; human activity recognition; new and future tendencies in brain-computer interface systems; random-weights neural networks; pattern recognition; deep learning and natural language processing; software testing and intelligent systems; data-driven intelligent transportation systems; deep learning models in healthcare and biomedicine; deep learning beyond convolution; artificial neural network for biomedical image processing; machine learning in vision and robotics; system identification, process control, and manufacturing; image and signal processing; soft computing; mathematics for neural networks; internet modeling, communication and networking; expert systems; evolutionary and genetic algorithms; advances in computational intelligence; computational biology and bioinformatics.
A method of topological data analysis is proposed that allows one to find out the homotopy type of the object under study. Unlike mature and widely used methods based on persistent homologies, our method is based on computing differential invariants of some map associated with an approximating map. Differential topology tools and the analogy with the main result in Morse theory are used. The approximating map can be constructed in the usual way using a neural network or otherwise. The method allows one to identify the homotopy type of an object in the plane because the number of circles in the homotopy equivalent object representation as a wedge is expressed through the degree of some map associated with the approximating map. The performance of the algorithm is illustrated by examples from the MNIST database and transforms thereof. Generalizations and open questions relating to a higher-dimension case are discussed.
There are many different methods for computing relevant patterns in sequential data and interpreting the results. In this paper, we compute emerging patterns (EP) in demographic sequences using sequence-based pattern structures, along with different algorithmic solutions. The purpose of this method is to meet the following domain requirement: the obtained patterns must be (closed) frequent contiguous prefixes of the input sequences. This is required in order for demographers to fully understand and interpret the results.
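The prefix requirement can be illustrated with a minimal sketch that counts the support of every contiguous prefix in two groups of sequences and keeps the frequent prefixes whose support grows sharply from one group to the other. This is not the pattern-structure algorithm of the paper, and it does not filter for closedness. The function names, the thresholds, and the event labels are assumptions made for the example.

```python
from collections import Counter


def prefix_support(sequences):
    """Relative support of every contiguous prefix occurring in the sequences."""
    counts = Counter()
    for seq in sequences:
        for k in range(1, len(seq) + 1):
            counts[tuple(seq[:k])] += 1
    n = len(sequences)
    return {prefix: c / n for prefix, c in counts.items()}


def emerging_prefixes(target, other, min_support=0.5, min_growth=2.0):
    """Prefixes frequent in `target` whose support ratio target/other is large."""
    sup_target = prefix_support(target)
    sup_other = prefix_support(other)
    result = {}
    for prefix, s in sup_target.items():
        if s < min_support:
            continue
        s_other = sup_other.get(prefix, 0.0)
        growth = float("inf") if s_other == 0 else s / s_other
        if growth >= min_growth:
            result[prefix] = growth
    return result


# Invented life-course sequences for two groups of respondents.
group_a = [["work", "marriage", "child"], ["work", "marriage"], ["education", "work"]]
group_b = [["education", "work"], ["education", "marriage"], ["work", "child"]]

patterns = emerging_prefixes(group_a, group_b)
```

Because only prefixes are counted, every reported pattern reads as "the life course starts with these events in this order", which is the kind of result the abstract says demographers can interpret directly.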
One of the main objectives of strategic management is the development and selection of strategies to achieve the desired results. The main goal of this paper is to analyze the main areas in which machine learning can be applied to support strategic planning and decision making. The research methodology relies on methods and procedures of modeling and intelligent analysis. The paper is theoretical and empirical in equal measure. It addresses the implementation of machine learning and the ways in which intelligent models and systems can support strategic planning in the context of the theory of economic growth and development. At the preprocessing stage, using a modeled base of example strategy options, the paper demonstrates the use of clustering methods to form groups of similar parameters that influence the choice of strategy and groups of similar enterprise objects, each of which has a certain type of strategy. At the next step, ranked characteristics that affect the choice of strategy are selected. At the stage of choosing strategies, neural network and neuro-fuzzy approaches are used. The advantage of this hybrid method is that it can combine the strengths of neural networks with those of fuzzy logic.
In this paper, we describe a deep-learning system for emotion detection in textual conversations that participated in SemEval-2019 Task 3 “EmoContext”. We designed a specific bidirectional LSTM architecture that not only learns semantic and sentiment feature representations but also captures user-specific conversation features. To fine-tune word embeddings using distant supervision, we additionally collected a significant amount of emotional texts. The system achieved a micro-averaged F1 score of 72.59% for the emotion classes on the test dataset, thereby significantly outperforming the officially released baseline. Word embeddings and the source code were released for the research community.
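The reported metric, a micro-averaged F1 score restricted to the emotion classes, can be sketched as follows. True positives, false positives, and false negatives are pooled over the selected classes before precision and recall are computed. The label names (`happy`, `sad`, `angry`, `others`) and the toy predictions are assumptions for the example, not data from the paper.

```python
def micro_f1(y_true, y_pred, classes):
    """Micro-averaged F1 over the given classes; other labels count only as errors."""
    tp = fp = fn = 0
    for c in classes:
        for truth, pred in zip(y_true, y_pred):
            if pred == c and truth == c:
                tp += 1
            elif pred == c:
                fp += 1  # predicted c, but the true label differs
            elif truth == c:
                fn += 1  # true label is c, but something else was predicted
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented labels: "others" is excluded from the averaged classes.
y_true = ["happy", "sad", "angry", "others", "others", "happy"]
y_pred = ["happy", "others", "angry", "sad", "others", "angry"]

score = micro_f1(y_true, y_pred, classes=["happy", "sad", "angry"])
```

Excluding the non-emotion label from the averaged classes means that a correct "others" prediction earns no credit, while confusing an emotion with "others" is still penalized.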