The 8th Russian Summer School in Information Retrieval (RuSSIR 2014) was held on August 18-22, 2014 in Nizhniy Novgorod, Russia.1 The school was co-organized by the National Research University Higher School of Economics2 and the Russian Information Retrieval Evaluation Seminar (ROMIP)

This paper provides the reader with a report on 9th Russian Summer School in Information Retrieval (RuSSIR 2015).

Here we use computational Bayesian phylogenetic methods to generate a phylogeny of Tungusic languages and estimate the time-depth of the family. Our analysis is based on a dataset of 254 basic vocabulary items collected for 21 Tungusic doculects. Our results are consistent with two previously proposed basic classifications: variants of the Manchu-Tungusic and the North-South classification. We infer a time-depth between the 8th century BC and the 12th century AD. The application of Bayesian phylogenetic methods to Tungusic languages is unprecedented and provides a reliable quantitative basis for previous estimates based on classical historical linguistic and lexicostatistic approaches.

We propose a concatenated code construction based on convolutional codes. We prove that minimum distance of this construction equals product of free distances of component codes.

The paper considers the phoneme recognition by facial expressions of a speaker in voice-activated control systems. We have developed a neural network recognition algorithm by using the phonetic words decoding method and the requirement for isolated syllable pronunciation of voice commands. The paper presents the experimental results of viseme (facial and lip position corresponding to a particular phoneme) classification of Russian vowels. We show the dependence of the classification accuracy on the used classifier (multilayer feed-forward network, support vector machine, k-nearest neighbor method), image features (histogram of oriented gradients, eigenvectors, SURF local descriptors) and the type of camera (built-in or Kinect one). The best accuracy of speaker-dependent recognition is shown to be 85% for a built-in camera and 96% for Kinect depth maps when the classification is performed with the histogram of oriented gradients and the support vector machine.

Methods of identification of the form of objects based on the signature analysis and invariant to affine transformations are considered. It is shown as these methods it is possible to apply to surface quality assurance. Questions of sensitivity of these methods are considered. Dependences of these methods on noise are brought.

The Cell Formation Problem (CFP) is an important optimisation problem in manufacturing. It has been introduced in the Group Technology (GT) and its goal is to group machines and parts processed on them into production cells minimising the movement of parts to other cells for processing and maximising for each cell the loading of its machines with operations on its parts. We consider one of the computationally hardest formulations of this problem – the CFP with a variable number of cells and the grouping efficacy objective, which is a fractional function. The CFP literature contains many heuristic algorithms, but only a small number of exact approaches especially for this formulation. In the current paper, we present an exact branch-and-bound algorithm for the same hard CFP formulation. To linearise the fractional objective function, we apply the Dinkelbach approach. We have been able to solve 24 of the 35 instances from the well known GT benchmark. For the remaining 11 instances, the difference in the grouping efficacy with the best known solutions is less than 2.6%.

A new method for deriving estimates of the rate of convergence of optimal methods for solving problems of smooth (strongly) convex stochastic optimization is described. The method is based on the results of stochastic optimization derived from results on the convergence of optimal methods under the conditions of inexact gradients with small noises of nonrandom nature. In contrast to earlier results, all estimates in the present paper are obtained in model generality

We propose an accelerated gradient-free method with a non-Euclidean proximal operator associated with the *p*-norm (1 ⩽ *p* ⩽ 2). We obtain estimates for the rate of convergence of the method under low noise arising in the calculation of the function value. We present the results of computational experiments.

We propose a new way of justifying the accelerated gradient sliding of G. Lan, which allows one to extend the sliding technique to a combination of an accelerated gradient method with an accelerated variance reduction method. New optimal estimates for the solution of the problem of minimizing a sum of smooth strongly convex functions with a smooth regularizer are obtained

Recently, it has been shown how, on the basis of the usual accelerated gradient method for solving problems of smooth convex optimization, accelerated methods for more complex problems (with a structure) and problems that are solved using various local information about the behavior of a function (stochastic gradient, Hessian, etc.) can be obtained. The term “accelerated methods” here means, on the one hand, the presence of some unified and fairly general way of acceleration. On the other hand, this also means the optimality of the methods, which can often be proved rigorously. In the present work, an attempt is made to construct in the same way a theory of accelerated methods for solving smooth convex-concave saddle-point problems with a structure. The main result of this article is the obtainment of in some sense necessary and sufficient conditions under which the complexity of solving nonlinear convex-concave saddle-point problems with a structure in the number of calculations of the gradients of composites in direct variables is equal in order of magnitude to the complexity of solving bilinear problems with a structure

A new version of accelerated gradient descent is proposed. The method does not require any a priori information on the objective function, uses a linesearch procedure for convergence acceleration in practice, converge according to well-known lower bounds for both convex and nonconvex objective functions, and has primal-dual properties. A universal version of this method is also described.

This work is devoted to the investigation of particle acceleration during magnetospheric dipolarizations. A numerical model is presented taking into account the four scenarios of plasma acceleration that can be realized: (A) total dipolarization with characteristic time scales of 3 min; (B) single peak value of the normal magnetic component Bz occurring on the time scale of less than 1 min; (C) a sequence of rapid jumps of Bz interpreted as the passage of a chain of multiple dipolarization fronts (DFs); and (D) the simultaneous action of mechanism (C) followed by the consequent enhancement of electric and magnetic fluctuations with the small characteristic time scale 1 s. In a frame of the model, we have obtained and analyzed the energy spectra of four plasma populations: electrons e, protons Hþ, helium Heþ, and oxygen Oþ ions, accelerated by the above-mentioned processes (A)–(D). It is shown that Oþ ions can be accelerated mainly due to the mechanism (A); Hþ and Heþ ions (and to some extent electrons) can be more effectively accelerated due to the mechanism (C) than the single dipolarization (B). It is found that high-frequency electric and magnetic fluctuations accompanying multiple DFs (D) can strongly accelerate electrons e and really weakly influence other populations of plasma. The results of modeling demonstrated clearly the distinguishable spatial and temporal resonance character of particle acceleration processes. The maximum particle energies depending on the scale of the magnetic acceleration region and the value of the magnetic field are estimated. The shapes of energy spectra are discussed.

We study properties of Markov chain Monte Carlo simulations of classical spin models with local updates. We derive analytic expressions for the mean value of the acceptance rate of single-spin-flip algorithms for the one-dimensional Ising model. We find that for the Metropolis algorithm the average acceptance rate is a linear function of energy. We further provide numerical results for the energy dependence of the average acceptance rate for the three- and four-state Potts model, and the XY model in one and two spatial dimensions. In all cases, the acceptance rate is an almost linear function of the energy in the critical region. The variance of the acceptance rate is studied as a function of the specific heat. While the specific heat develops a singularity in the vicinity of a phase transition, the variance of the acceptance rate stays finite.

Cardiovascular disease associated with metabolic syndrome has a high prevalence, but the mechanistic basis of metabolic cardiomyopathy remains poorly understood. We characterised the cardiac transcriptome in a murine metabolic syndrome (MetS) model (LDLR−/−; ob/ob, DKO) relative to the healthy, control heart (C57BL/6, WT) and the transcriptional changes induced by ACE-inhibition in those hearts. RNA-Seq, differential gene expression and transcription factor analysis identified 288 genes differentially expressed between DKO and WT hearts implicating 72 pathways. Hallmarks of metabolic cardiomyopathy were increased activity in integrin-linked kinase signalling, Rho signalling, dendritic cell maturation, production of nitric oxide and reactive oxygen species in macrophages, atherosclerosis, LXR-RXR signalling, cardiac hypertrophy, and acute phase response pathways. ACE-inhibition had a limited effect on gene expression in WT (55 genes, 23 pathways), and a prominent effect in DKO hearts (1143 genes, 104 pathways). In DKO hearts, ACE-I appears to counteract some of the MetS-specific pathways, while also activating cardioprotective mechanisms. We conclude that MetS and control murine hearts have unique transcriptional profiles and exhibit a partially specific transcriptional response to ACE-inhibition.

Modeling tools provide a valuable support for DNA origami design. However, current solutions have limited application for conformational analysis of the designs. In this work we present a tool for a thorough study of DNA origami structure and dynamics. The tool is based on a novel coarse-grained model dedicated to geometry optimization and conformational analysis of DNA origami. We explored the ability of the model to predict dynamic behavior, global shapes, and fine details of two single-layer systems designed in hexagonal and square lattices using atomic force microscopy, Förster resonance energy transfer spectroscopy, and all-atom molecular dynamic simulations for validation of the results. We also examined the performance of the model for multilayer systems by simulation of DNA origami with published cryo-electron microscopy and atomic force microscopy structures. A good agreement between the simulated and experimental data makes the model suitable for conformational analysis of DNA origami objects. The tool is available at http://vsb.fbb.msu.ru/cosm as a web-service and as a standalone version.

Checking the correctness of distributed systems is one of the most difficult and urgent problems in software engineering. A combined toolset for the verification of real-time distributed systems (RTDS) is described. RTDSs are specified as statecharts in the Universal Modeling Language (UML). The semantics of statecharts is defined by means of hierarchical timed automata. The combined toolset consists of a UML statechart editor, a verification tool for model checking networks of real-time automata in UPPAAL, and a translator of UML statecharts into networks of timed automata. The focus is on the translation algorithm from UML statecharts into networks of hierarchical timed automata. To illustrate the proposed approach to the verification of RTDSs, a toy example of a real-time crossroad traffic control system is analyzed.

We study the computational complexity of the dominating set problem for hereditary graph classes, i.e., classes of simple unlabeled graphs closed under deletion of vertices. Every hereditary class can be defined by a set of its forbidden induced subgraphs. There are numerous open cases for the complexity of the problem even for hereditary classes with small forbidden structures. We completely determine the complexity of the problem for classes defined by forbidding a five-vertex path and any set of fragments with at most five vertices. Additionally, we also prove polynomial-time solvability of the problem for some two classes of a similar type. The notion of a boundary class is a helpful tool for analyzing the computational complexity of graph problems in the family of hereditary classes. Three boundary classes were known for the dominating set problem prior to this paper. We present a new boundary class for it.

With technology evolving rapidly and proliferating, it is imperative to pay attention to mobile devices’ security being currently responsible for various sensitive data processing. This phase is essential as an intermediate before the cloud or distributed ledger storage delivery and should be considered additional care due to its inevitability. This paper analyzes the security mechanisms applied for internal use in the Android OS and the communication between the Android OS and the remote server. Presented work aims to examine these mechanisms and evaluate which cryptographic methods and procedures are most advantageous in terms of energy efficiency derived from execution time. Nonetheless, the dataset with the measurements collected from 17 mobile devices and the code for reproducibility is also provided. After analyzing the collected data, specific cryptographic algorithms are recommended to implement an application that utilizes native cryptographic operations on modern Android devices. In particular, selected algorithms for symmetric encryption are AES256 / GCM / No Padding; for digital signature – SHA512 with RSA2048 / PSS, and for asymmetric encryption – RSA3072 / OAEP with SHA512 and MGF1 Padding.

We show that the inequality H(A|B, X) + H(A|B, Y) ≤ H(A|B) for jointly distributed random variables A, B, X, Y, which does not hold in general case, holds under some natural condition on the support of the probability distribution of A, B, X, Y. This result generalizes a version of the conditional Ingleton inequality: if for some distribution I(X : Y|A) = H(A|X, Y) = 0, then I(A : B) ≤ I (A : B|X) + I(A : B|Y)+I(X : Y). We present two applications of our result. The first one is the following easy-to-formulate theorem on edge colorings of bipartite graphs: assume that the edges of a bipartite graph are colored in K colors so that each two edges sharing a vertex have different colors and for each pair (left vertex x, right vertex y) there is at most one color a such both x and y are incident to edges with color a; assume further that the degree of each left vertex is at least L and the degree of each right vertex is at least R. Then K LR. The second application is a new method to prove lower bounds for biclique cover of bipartite graphs.