A hybrid approach to the automated identification and monitoring of technology trends is presented. The approach combines ontology-based information extraction (OBIE) with statistical methods for processing the OBIE results. Its key point is the so-called ‘black box’ principle: trends are identified on the basis of heuristics stemming from an elaborate ontology of a technology trend.

Conventional image recognition methods usually divide the keypoint neighborhood (for local features) or the whole object (for global features) into a grid of blocks, compute the gradient magnitude and orientation at each image sample point, and unite the orientation histograms of all blocks into a single descriptor. The query image is recognized by matching its descriptors with the descriptors of reference images. The matching is usually done by summing the distances between descriptors of corresponding blocks. Unfortunately, such an approach does not yield a correct distance between the sets of points (the histograms of the individual blocks) if the popular squared Euclidean distance is used as the dissimilarity measure. To obtain a correct dissimilarity, we propose to sum the square roots (or, more generally, an appropriate nonlinear transformation) of the distances between block histograms. This approach is experimentally examined in a face recognition problem on the FERET and AT&T datasets. The results support the claim that the proposed approach provides higher accuracy (by up to 5.5%) than state-of-the-art methods, not only for the Euclidean distance but also for other popular similarity measures (L1, Kullback–Leibler, Jensen–Shannon, chi-squared, and the homogeneity-testing probabilistic neural network).
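The proposed aggregation can be sketched in a few lines of NumPy (an illustrative reimplementation, not the authors' code; the function name and descriptor layout are assumptions):

```python
import numpy as np

def block_dissimilarity(desc_a, desc_b, transform=np.sqrt):
    """Aggregate per-block histogram distances between two descriptors.

    desc_a, desc_b: arrays of shape (n_blocks, n_bins), one orientation
    histogram per block.  The conventional scheme sums squared Euclidean
    distances over blocks; here each block distance is first passed
    through a nonlinear transform (square root by default).
    """
    per_block = np.sum((desc_a - desc_b) ** 2, axis=1)  # squared L2 per block
    return float(np.sum(transform(per_block)))
```

With `transform=lambda d: d` the function reduces to the conventional sum of squared block distances, so the two schemes can be compared directly.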

Several notions of links between contexts – intensionally related concepts, shared intents, and bonds, as well as their interrelations – are considered. The algorithmic complexity of the problems related to the respective closure operators is studied. An expression of bonds in terms of shared intents is given.

Lexicographically minimal and lexicographically maximal suffixes of a string are fundamental notions of stringology. It is well known that the lexicographically minimal and maximal suffixes of a given string S can be computed in linear time and space by constructing a suffix tree or a suffix array of S. Here we consider the case when S is a substring of another string T of length n. We propose two linear-space data structures for T that allow computing the minimal suffix of S in O(log^(1+ε) n) time (for any fixed ε > 0) and the maximal suffix of S in O(log n) time. Both data structures take O(n) time to construct.
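For the whole string (rather than the substring queries handled by the data structures above), the lexicographically maximal suffix can be computed by a standard linear-time, constant-space scan; a sketch:

```python
def max_suffix(s):
    """Return the starting index of the lexicographically maximal
    suffix of s, using the classic two-candidate scan: i is the best
    candidate so far, j a challenger, k the length of their common
    prefix.  Runs in O(len(s)) time and O(1) extra space."""
    i, j, k = 0, 1, 0
    n = len(s)
    while j + k < n:
        a, b = s[i + k], s[j + k]
        if a == b:
            k += 1              # extend the common prefix
        elif a > b:
            j += k + 1          # challenger loses; skip past the mismatch
            k = 0
        else:
            i = max(i + k + 1, j)  # current candidate loses
            j = i + 1
            k = 0
    return i
```

For example, `max_suffix("banana")` returns 2, the start of the suffix `"nana"`.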

In this paper we propose a woven block code construction based on two convolutional codes. We also propose a soft-input decoder that allows this construction to achieve better error-correction performance than turbo codes with a conventional decoder. Computer simulations have shown a 0.1 dB energy gain relative to the LTE turbo code. Asymptotically, the proposed code has a distance greater than the product of the free distances of the component codes.

Gaussian graphical model selection is the statistical problem of identifying the Gaussian graphical model from observations. Existing Gaussian graphical model selection methods focus on the error rate for incorrect edge inclusion. However, when comparing statistical procedures, it is also important to take into account the error rate for incorrect edge exclusion. To handle this issue, we consider the graphical model selection problem in the framework of multiple decision theory. We show that the statistical procedure based on simultaneous inference with UMPU individual tests is optimal in the class of unbiased procedures.
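As a rough illustration of per-edge selection (thresholding sample partial correlations obtained from the inverse sample covariance matrix, not the UMPU multiple-decision procedure of the paper; the function name and threshold are assumptions):

```python
import numpy as np

def select_edges(X, threshold=0.1):
    """Naive graphical model selection sketch.

    X: (n_samples, n_features) data matrix.  Edge (i, j) is included
    when the sample partial correlation, computed from the inverse
    sample covariance matrix, exceeds a fixed threshold in magnitude.
    Illustrative only; no error rates are controlled here.
    """
    K = np.linalg.inv(np.cov(X, rowvar=False))   # sample precision matrix
    d = np.sqrt(np.diag(K))
    partial = -K / np.outer(d, d)                # sample partial correlations
    p = X.shape[1]
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(partial[i, j]) > threshold}
```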

The probabilistic neural network (PNN) is a well-known instance-based learning algorithm, widely used in various pattern classification and regression tasks when only a rather small number of instances is available for each class. A known disadvantage of this network is the high computational complexity of classification. A common way to overcome this drawback is to apply reduction techniques that select the most typical instances. Such an approach shifts the estimates of the class probability distributions and, in turn, decreases classification accuracy. In this paper we examine another possible solution: replacing the Gaussian window and the Parzen kernel with the orthogonal-series Fejér kernel, and using the naïve assumption of feature independence. It is shown that our approach achieves much better runtime complexity in comparison with either the original PNN or its modification with preliminary clustering of the training set.
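For reference, the classical baseline being modified (a PNN with a Gaussian window, not the Fejér-kernel variant proposed here) can be sketched as follows; names and the choice of sigma are assumptions:

```python
import numpy as np

def pnn_classify(x, X_train, y_train, sigma=1.0):
    """Classical PNN: a Parzen density estimate per class with a
    Gaussian window; the predicted class maximizes the estimate.
    Note the cost: every stored instance is visited per query."""
    scores = {}
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        d2 = np.sum((Xc - x) ** 2, axis=1)            # squared distances
        scores[c] = np.mean(np.exp(-d2 / (2 * sigma ** 2)))
    return max(scores, key=scores.get)
```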

We initiate a comprehensive study of the complexity of computing Boolean functions by polynomial threshold functions (PTFs) on general Boolean domains. A typical example of a general Boolean domain is {1,2}^n. We are mainly interested in the length (the number of monomials) of PTFs, with their degree and weight being of secondary interest.
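A PTF is simply the sign of a sparse polynomial; a toy evaluator over the {1,2}^n domain (purely illustrative, with an assumed representation of monomials):

```python
from math import prod

def ptf(monomials, x):
    """Evaluate a polynomial threshold function at a point.

    monomials: list of (weight, index_tuple) pairs; the PTF's length is
    the number of monomials.  x: a point of {1,2}^n (here any tuple of
    numbers).  Returns the sign of the polynomial's value.
    """
    value = sum(w * prod(x[i] for i in idx) for w, idx in monomials)
    return 1 if value >= 0 else -1
```

For example, over {1,2}^2 the length-2 PTF `[(1, (0,)), (-1, (1,))]` is the sign of x0 - x1.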

First we motivate the study of PTFs over the {1,2}^n domain by showing their close relation to depth-two threshold circuits. In particular, we show that PTFs of polynomial length and polynomial degree compute exactly the functions computed by polynomial-size THR ∘ MAJ circuits. We note that known lower bounds for THR ∘ MAJ circuits extend to the likely strictly stronger model of PTFs. We also show that a “max-plus” version of PTFs is related to AC^0 ∘ THR circuits.

We exploit this connection to gain a better understanding of threshold circuits. In particular, we show that (super-logarithmic) lower bounds for 3-player randomized communication protocols with unbounded error would yield (super-polynomial) size lower bounds for THR ∘ THR circuits.

Finally, having thus motivated the model, we initiate structural studies of PTFs. These include relationships between weight and degree of PTFs, and a degree lower bound for PTFs of constant length.

We describe pretty-printing combinators with choice which provide optimal document layout in polynomial time. Bottom-up tree rewriting and dynamic programming (BURS) is used to calculate a set of possible layouts for a given output width. We also present the results of an evaluation of the suggested approach and discuss its application to the implementation of pretty-printers.
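The optimization problem can be illustrated by a brute-force sketch that enumerates all layouts of a document with choices and picks the cheapest for a given width (exponential enumeration, not the polynomial BURS algorithm of the paper; the document representation is an assumption):

```python
def layouts(doc):
    """All candidate layouts (lists of lines) of a document with choices.

    doc: ('text', s) | ('hcat', a, b) | ('vcat', a, b) | ('choice', a, b)
    """
    tag = doc[0]
    if tag == 'text':
        return [[doc[1]]]
    if tag == 'choice':
        return layouts(doc[1]) + layouts(doc[2])
    left, right = layouts(doc[1]), layouts(doc[2])
    if tag == 'vcat':
        return [a + b for a in left for b in right]
    # hcat: glue the last line of a to the first line of b
    return [a[:-1] + [a[-1] + b[0]] + b[1:] for a in left for b in right]

def best_layout(doc, width):
    """Pick the layout minimizing (overflow beyond width, line count)."""
    def cost(lines):
        return (sum(max(0, len(l) - width) for l in lines), len(lines))
    return min(layouts(doc), key=cost)
```

For a document offering a horizontal/vertical choice, a wide output width selects the one-line layout and a narrow width the stacked one.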

We address the question of synchronisation in parallel discrete event simulation (PDES) algorithms. We study synchronisation in a conservative PDES model with long-range connections added between processing elements, and investigate how the fraction of random long-range connections in the synchronisation scheme influences the simulation time profile of PDES. We find that a small fraction of random distant connections enhances synchronisation: the width of the local-virtual-time profile remains constant as the number of processing elements increases. At the same time, the conservative PDES algorithm on small-world networks remains free from deadlocks. We compare our results with case-study simulations.
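The topology under study can be sketched as a ring of processing elements augmented with random long-range links (an illustrative construction, not the paper's simulation code; names are assumptions):

```python
import random

def ring_with_shortcuts(n, p, rng=None):
    """Ring of n processing elements, each linked to its two ring
    neighbours; in addition, each element gets a random long-range
    link with probability p (a small-world-style rewiring)."""
    rng = rng or random.Random(0)
    adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    for i in range(n):
        if rng.random() < p:
            j = rng.randrange(n)
            if j != i and j not in adj[i]:
                adj[i].add(j)
                adj[j].add(i)
    return adj
```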

Pair-wise sequence alignment is the basic method of comparative analysis of proteins and nucleic acids. Studying the results of an alignment, one has to consider two questions: (1) did the program find all the interesting similarities (“sensitivity”), and (2) are all the found similarities interesting (“selectivity”)? Of course, one has to specify which alignments are considered interesting. Analogous questions can be addressed to each of the obtained alignments: (3) which fraction of the aligned positions is aligned correctly (“confidence”), and (4) does the alignment contain all pairs of corresponding positions of the compared sequences (“accuracy”)? Naturally, the answers to these questions depend on the definition of a correct alignment. The presentation addresses the above two pairs of questions, which are extremely important in interpreting the results of sequence comparison.
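The questions above presuppose a pairwise alignment procedure; for context, a minimal global alignment score computation (Needleman-Wunsch dynamic programming, with assumed scoring parameters):

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score (Needleman-Wunsch) in O(len(a)*len(b))
    time and O(len(b)) space, keeping one DP row at a time."""
    m, n = len(a), len(b)
    prev = [j * gap for j in range(n + 1)]        # row for empty prefix of a
    for i in range(1, m + 1):
        cur = [i * gap] + [0] * n
        for j in range(1, n + 1):
            sub = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(sub, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[n]
```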

The nearest neighbor search problem has been well known since the 1960s, and many approaches have been proposed. One of them is to build a graph over the set of objects from a given database and use a greedy walk as the basis for a search algorithm. If the greedy walk is able to find the nearest neighbor in the graph starting from any vertex in a small number of steps, such a graph is called a navigable small world. In this paper we propose a new algorithm for building graphs with navigable small world properties. The main advantage of the proposed algorithm is that it is free from input parameters and is able to adapt on the fly to any changes in the distribution of the data. The algorithm is based on the idea of removing local minima by adding new edges; we realize this idea by using the queries issued at execution time to improve the search properties of the structure. An empirical study of the proposed algorithm and a comparison with previous works are reported in the paper.
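The greedy walk that underlies such search algorithms can be sketched directly (an illustrative version; names are assumptions):

```python
def greedy_walk(graph, dist, query, start):
    """Greedy descent to a local minimum of dist(., query) on the graph.

    graph: dict mapping each vertex to an iterable of its neighbours.
    Returns the vertex none of whose neighbours is closer to the query;
    in a navigable small world this is the true nearest neighbour with
    high probability, in few steps."""
    current = start
    while True:
        closer = [v for v in graph[current]
                  if dist(v, query) < dist(current, query)]
        if not closer:
            return current              # local minimum reached
        current = min(closer, key=lambda v: dist(v, query))
```

The "removing local minima" idea of the paper amounts to adding an edge whenever such a walk stops at a vertex that is not the true nearest neighbour of a query.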

Parallel computing algorithms are explored to improve the efficiency of image recognition with large databases. A novel parallel version of the directed enumeration method (DEM) is proposed. Experimental results for the face recognition problem on the FERET and Essex datasets are presented. We compare the performance of our parallel DEM with the original DEM and with parallel implementations of the nearest neighbor rule and the conventional Best Bin First (BBF) k-d tree. It is shown that the proposed method is characterized by increased computing efficiency (2-10 times in comparison with exhaustive search and the BBF) and a lower error rate than the original DEM.
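The exhaustive-search baseline against which the parallel DEM is compared can be parallelized by splitting the database into chunks (an illustrative sketch, not the authors' implementation; Python threads help here only insofar as NumPy releases the GIL):

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def parallel_nn(query, database, workers=4):
    """Exhaustive nearest-neighbour search with the database split into
    chunks, each chunk scanned by a separate worker; the per-chunk
    winners are then reduced to the global nearest neighbour."""
    chunks = np.array_split(database, workers)

    def best(chunk):
        d = np.linalg.norm(chunk - query, axis=1)
        i = int(np.argmin(d))
        return d[i], chunk[i]

    with ThreadPoolExecutor(max_workers=workers) as ex:
        results = list(ex.map(best, chunks))
    return min(results, key=lambda t: t[0])[1]
```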

We consider the problem of sorting a sequence of n keys in a RAM-like environment where memory faults are possible. An algorithm is said to be δ-resilient if it can tolerate up to δ memory faults during its execution. A resilient sorting algorithm must produce a sequence in which every pair of uncorrupted keys is ordered correctly. Finocchi, Grandoni, and Italiano devised a δ-resilient deterministic mergesort algorithm that runs in O(n log n + δ²) time. We present a δ-resilient randomized algorithm (based on quicksort) that runs in O(n log n + δ√(n log n)) expected time and its deterministic variation that runs in O(n log n + δ√(n log n)) worst-case time. This improves the previously known result for δ > √(n log n). Our deterministic sorting relies on the notion of an approximate kth order statistic. For this auxiliary problem, we devise a deterministic algorithm that runs in O(n + δ√n) time and produces a key (either corrupted or not) whose order rank differs from k by at most O(δ).
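The resilient-sorting guarantee stated above (every pair of uncorrupted keys ordered correctly) can be expressed as a simple check; the representation of corrupted positions as a set of indices is an assumption for illustration:

```python
def faithfully_ordered(seq, corrupted):
    """Check the resilient-sorting guarantee: after removing the keys
    at the corrupted positions, the remaining (uncorrupted) keys must
    appear in nondecreasing order.

    corrupted: set of indices whose keys may have been corrupted."""
    good = [x for i, x in enumerate(seq) if i not in corrupted]
    return all(a <= b for a, b in zip(good, good[1:]))
```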

In this paper we present a new approach to reducing the computational time spent on coloring in one of the recent branch-and-bound algorithms for the maximum clique problem. In this algorithm, candidates for the maximum clique are colored in every search tree node. We suggest that the coloring computed in the parent node be reused for the child nodes when it does not lead to many new branches: we reuse the same coloring only in the nodes for which the upper bound exceeds the current best solution by no more than a small value δ. The obtained performance increase reaches 70% on benchmark instances.
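The coloring bound used in such branch-and-bound algorithms comes from greedy sequential coloring: since every clique needs pairwise distinct colors, the number of color classes upper-bounds the clique size among the candidates. A sketch (not the paper's implementation):

```python
def greedy_coloring(vertices, adj):
    """Greedy sequential coloring of the subgraph induced by `vertices`.

    adj: dict mapping each vertex to the set of its adjacent vertices.
    Returns a dict vertex -> color index (0-based); the number of
    distinct colors used is an upper bound on the maximum clique size
    within `vertices`."""
    color = {}
    for v in vertices:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:        # smallest color not used by a neighbour
            c += 1
        color[v] = c
    return color
```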

The article addresses the problem of ringing artifact suppression. The ringing effect is caused by the corruption or loss of high-frequency information; it appears as waves or oscillations near strong edges. We propose a novel method for ringing artifact suppression after Fourier cut-off filtering. It can also be used for image deringing in the case of image resampling and in other applications where the frequency loss can be estimated. The method is based on the joint sparse coding approach. The proposed method preserves more small image details than state-of-the-art algorithms based on total variation minimization and outperforms them in terms of image quality metrics.
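The artifact itself is easy to reproduce: a hard Fourier cut-off applied to a 1-D step edge produces the characteristic Gibbs oscillations (an illustrative demonstration, not the proposed suppression method):

```python
import numpy as np

# A 1-D step edge; zeroing the high frequencies (a hard low-pass
# cut-off) produces Gibbs oscillations ("ringing") near the edge.
n = 128
signal = np.zeros(n)
signal[n // 2:] = 1.0

spectrum = np.fft.rfft(signal)
spectrum[10:] = 0.0                    # hard frequency cut-off
filtered = np.fft.irfft(spectrum, n=n)

overshoot = filtered.max() - 1.0       # ringing overshoot above the step
undershoot = -filtered.min()           # and undershoot below it
```

For an ideal step, the overshoot approaches the classical ~9% Gibbs level regardless of how many coefficients are kept, which is why the lost frequencies must be modeled (here, via joint sparse coding) rather than simply smoothed away.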