### Article

## Lines in hypergraphs

One of the De Bruijn-Erdős theorems deals with finite hypergraphs where every two vertices belong to precisely one hyperedge. It asserts that, except in the perverse case where a single hyperedge equals the whole vertex set, the number of hyperedges is at least the number of vertices and the two numbers are equal if and only if the hypergraph belongs to one of simply described families, near-pencils and finite projective planes. Chen and Chvátal proposed to define the line *uv* in a 3-uniform hypergraph as the set of vertices that consists of *u, v*, and all w such that {*u;v;w*} is a hyperedge. With this definition, the De Bruijn-Erdős theorem is easily seen to be equivalent to the following statement: If no four vertices in a 3-uniform hypergraph carry two or three hyperedges, then, except in the perverse case where one of the lines equals the whole vertex set, the number of lines is at least the number of vertices and the two numbers are equal if and only if the hypergraph belongs to one of two simply described families. Our main result generalizes this statement by allowing any four vertices to carry three hyperedges (but keeping two forbidden): the conclusion remains the same except that a third simply described family, complements of Steiner triple systems, appears in the extremal case.

Given a closed interval $I=[a,b]$ and a metric space $(M,d)$, we introduce a

nondecreasing sequence $\{\nu_n\}$ of pseudometrics on $M^I$ (the set of all

functions from $I$ into $M$), called the {\it joint modulus of variation}. We show that

if two sequences of functions $\{f_j\}$ and $\{g_j\}$ from $M^I$ are such that

$\{f_j\}$ is pointwise relatively compact on $I$, $\{g_j\}$ is pointwise convergent on $I$,

and $\limsup_{j\to\infty}\nu_n(f_j,g_j)=o(n)$ as $n\to\infty$, then $\{f_j\}$ admits

a pointwise convergent subsequence whose limit on $I$ is a conditionally regulated function.

The notion of a modular is introduced as follows. A (metric) *modular* on a set *X* is a function *w*:(0,*∞*)×*X*×*X*→[0,*∞*] satisfying, for all *x*,*y*,*z*∈*X*, the following three properties: *x*=*y* if and only if *w*(*λ*,*x*,*y*)=0 for all *λ*>0; *w*(*λ*,*x*,*y*)=*w*(*λ*,*y*,*x*) for all *λ*>0; *w*(*λ*+*μ*,*x*,*y*)≤*w*(*λ*,*x*,*z*)+*w*(*μ*,*y*,*z*) for all *λ*,*μ*>0. We show that, given *x*0∈*X*, the set *X**w*={*x*∈*X*:lim*λ*→*∞**w*(*λ*,*x*,*x*0)=0} is a metric space with metric , called a *modular space*. The modular *w* is said to be *convex* if (*λ*,*x*,*y*)↦*λ**w*(*λ*,*x*,*y*) is also a modular on *X*. In this case *X**w* coincides with the set of all *x*∈*X* such that *w*(*λ*,*x*,*x*0)<*∞* for some *λ*=*λ*(*x*)>0 and is metrizable by . Moreover, if or , then ; otherwise, the reverse inequalities hold. We develop the theory of metric spaces, generated by modulars, and extend the results by H. Nakano, J. Musielak, W. Orlicz, Ph. Turpin and others for modulars on linear spaces.

Similarity searching has a vast range of applications in various fields of computer science. Many methods have been proposed for exact search, but they all suffer from the curse of dimensionality and are, thus, not applicable to high dimensional spaces. Approximate search methods are considerably more efficient in high dimensional spaces. Unfortunately, there are few theoretical results regarding the complexity of these methods and there are no comprehensive empirical evaluations, especially for non-metric spaces. To fill this gap, we present an empirical analysis of data structures for approximate nearest neighbor search in high dimensional spaces. We provide a comparison with recently published algorithms on several data sets. Our results show that small world approaches provide some of the best tradeoffs between efficiency and effectiveness in both metric and non-metric spaces.

We propose a novel approach for solving the approximate nearest neighbor search problem in arbitrary metric spaces. The distinctive feature of our approach is that we can incrementally build a non-hierarchical distributed structure for given metric space data with a logarithmic complexity scaling on the size of the structure and adjustable accuracy probabilistic nearest neighbor queries. The structure is based on a small world graph with vertices corresponding to the stored elements, edges for links between them and the greedy algorithm as base algorithm for searching. Both search and addition algorithms require only local information from the structure. The performed simulation for data in the Euclidian space shows that the structure built using the proposed algorithm has navigable small world properties with logarithmic search complexity at fixed accuracy and has weak (power law) scalability with the dimensionality of the stored data.

This proceedings publication is a compilation of selected contributions from the “Third International Conference on the Dynamics of Information Systems” which took place at the University of Florida, Gainesville, February 16–18, 2011. The purpose of this conference was to bring together scientists and engineers from industry, government, and academia in order to exchange new discoveries and results in a broad range of topics relevant to the theory and practice of dynamics of information systems. Dynamics of Information Systems: Mathematical Foundation presents state-of-the art research and is intended for graduate students and researchers interested in some of the most recent discoveries in information theory and dynamical systems. Scientists in other disciplines may also benefit from the applications of new developments to their own area of study.

A form for an unbiased estimate of the coefficient of determination of a linear regression model is obtained. It is calculated by using a sample from a multivariate normal distribution. This estimate is proposed as an alternative criterion for a choice of regression factors.