Universal Algorithm for Trading in Stock Market Based on the Method of Calibration
We present a universal method for algorithmic trading in the stock market that performs asymptotically at least as well as any stationary trading strategy that computes the investment at each step using a continuous function of the side information. During the game, the trader makes decisions using predictions computed by a randomized well-calibrated algorithm. We use Dawid's notion of calibration with more general checking rules and a modification of Kakade and Foster's randomized rounding algorithm to compute the well-calibrated forecasts. The method of randomized calibration is combined with Vovk's method of defensive forecasting in RKHS. Unlike in statistical theory, no stochastic assumptions are made about the stock prices.
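As a rough illustration of what "well-calibrated" means, the sketch below measures the calibration error of probability forecasts for binary outcomes: forecasts are grouped into bins, and within each bin the mean forecast is compared with the observed frequency. This is a simplified textbook check, not the randomized algorithm or the checking rules described above; the function name `calibration_error` and the parameter `n_bins` are assumptions made for this example.

```python
def calibration_error(forecasts, outcomes, n_bins=10):
    """Weighted average gap between mean forecast and outcome frequency per bin.

    forecasts: probabilities in [0, 1]; outcomes: 0 or 1 (illustrative setting).
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    total = len(forecasts)
    if total == 0:
        return 0.0

    # Per bin: [sum of forecasts, sum of outcomes, count]
    bins = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to the last bin
        bins[idx][0] += p
        bins[idx][1] += y
        bins[idx][2] += 1

    # sum_b (count_b / total) * |mean_forecast_b - mean_outcome_b|
    return sum(abs(sp - sy) for sp, sy, count in bins if count) / total


# Forecasting 0.25 four times with one positive outcome is perfectly calibrated.
print(calibration_error([0.25] * 4, [1, 0, 0, 0]))
```

A forecaster is well calibrated in this simplified sense when the returned value tends to zero as the number of steps grows.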
This work tackles the problem of modeling author style in Russian. In particular, we solve the task of authorship attribution using a collected dataset of 1506 texts by 30 authors, written between the 18th and 21st centuries. We apply several classifiers to the attribution problem: Random Forest, Logistic Regression, and SVM. For text representation, we use seven models at three language levels: lexis, morphology, and syntax. Most importantly, we propose our own set of morpho-syntactic features that perform at about the same level as doc2vec but are fully interpretable. The experiments show that these features are effective on their own and that classification quality increases when they are combined with the classic doc2vec-based approach. All code, including feature extraction, is made freely available. Additionally, we analyze the performance of individual features as style markers. Finally, we study classification errors in order to identify patterns in the misattribution of specific authors.
This volume contains the refereed proceedings of the 8th International Conference on Analysis of Images, Social Networks, and Texts (AIST 2019). The previous conferences during 2012–2018 attracted a significant number of data scientists – students, researchers, academics, and engineers working on interdisciplinary data analysis of images, texts, and social networks.
Predicting the impact of news events on changes in the price of financial assets can be used to manage the value of the company. The work demonstrates the possibility of using the method of content analysis to identify the degree of influence of non-financial risks on the market value of a public company. Using four public companies of the Russian market as an example, the authors test the method and show its limitations and possibilities.
In this paper, we study deep learning methods for a new multiclass text classification problem: identifying user interests from text messages. We used an original dataset of almost 90 thousand forum text messages, labeled for ten interests. We experimented with different modern neural network architectures: recurrent and convolutional, as well as simpler feedforward networks. Classification accuracy was evaluated for different architectures, text representations, and sets of miscellaneous parameters.
There are many different methods for computing relevant patterns in sequential data and interpreting the results. In this paper, we compute emerging patterns (EP) in demographic sequences using sequence-based pattern structures, along with different algorithmic solutions. The purpose of this method is to meet the following domain requirement: the obtained patterns must be (closed) frequent contiguous prefixes of the input sequences. This is required in order for demographers to fully understand and interpret the results.
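The prefix requirement can be illustrated with a minimal sketch, assuming the usual definition of an emerging pattern as one whose support in a target class is much higher than in a background class. It counts the support of every contiguous prefix and keeps the frequent ones with a high growth rate. It does not use pattern structures and does not filter for closedness; the function names, thresholds, and event labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def prefix_support(sequences):
    """Relative support of each non-empty prefix: share of sequences starting with it."""
    counts = Counter()
    for seq in sequences:
        for k in range(1, len(seq) + 1):
            counts[tuple(seq[:k])] += 1
    n = len(sequences)
    return {prefix: c / n for prefix, c in counts.items()}


def emerging_prefixes(target, background, min_support=0.5, min_growth=1.5):
    """Frequent prefixes of `target` whose growth rate over `background` is high."""
    supp_t = prefix_support(target)
    supp_b = prefix_support(background)
    result = {}
    for prefix, s_t in supp_t.items():
        if s_t < min_support:
            continue  # not frequent enough in the target class
        s_b = supp_b.get(prefix, 0.0)
        growth = float("inf") if s_b == 0 else s_t / s_b
        if growth >= min_growth:
            result[prefix] = growth
    return result


# Made-up life-course sequences for two groups (illustrative data only).
target = [("work", "marriage", "child"), ("work", "marriage"), ("education", "work")]
background = [("education", "work"), ("education", "marriage"), ("work", "child")]
print(emerging_prefixes(target, background))
```

Because every reported pattern is a prefix, it reads directly as "the first events in the life course", which is the interpretability property the domain requirement asks for.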
This 2-volume set constitutes the refereed proceedings of the 9th Iberian Conference on Pattern Recognition and Image Analysis, IbPRIA 2019, held in Madrid, Spain, in July 2019.
The 99 papers in these volumes were carefully reviewed and selected from 137 submissions. They are organized in topical sections named:
Part I: best ranked papers; machine learning; pattern recognition; image processing and representation.
Part II: biometrics; handwriting and document analysis; other applications.
We trained a Random Forest model to recognize patterns of nucleosome and non-B DNA structures, considered as potential nucleosome barriers in the mouse genome. We showed that among four types of structures (Z-DNA, H-DNA, G-quadruplexes and SIDD regions), recognition of G-quadruplexes and H-DNA showed the best performance.
One of the most challenging data analysis tasks of modern High Energy Physics experiments is the identification of particles. In these proceedings we review the new approaches used for particle identification at the LHCb experiment. Machine-learning-based techniques are used to identify the species of charged and neutral particles using several observables obtained by the LHCb sub-detectors. We show the performance of various solutions based on Neural Network and Boosted Decision Tree models.