Joint Optimization of Segmentation and Color Clustering
Binary energy optimization is a popular approach for segmenting an image into foreground/background regions. To model region appearance, color, a relatively high-dimensional feature, must be handled effectively. A full color histogram is usually too sparse to be reliable. One approach is to reduce dimensionality by color space clustering. Another popular approach is to fit GMMs for soft color space clustering. These approaches work well when the foreground and background are sufficiently distinct. In cases of more subtle differences in appearance, both approaches may reduce or even eliminate the foreground/background distinction. This happens because color clustering is performed either entirely independently of segmentation, as a preprocessing step (in clustering), or independently for the foreground and for the background (in GMM fitting). We propose to make clustering an integral part of segmentation by including a new clustering term in the energy. Our energy favors clusterings that make foreground/background appearance more distinct. Exact optimization is not feasible; therefore, we develop an approximate algorithm. We show the advantage of including the color clustering term in the energy function on camouflage images, as well as on standard segmentation datasets.
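The abstract's key claim, that the energy favors clusterings separating foreground and background colors, can be illustrated with a minimal sketch. This is not the paper's actual energy or algorithm; it is a hypothetical toy in which the appearance (data) term is the negative log-likelihood of each pixel's cluster under its own region's cluster histogram, plus a Potts smoothness term over neighboring pixel pairs:

```python
import numpy as np

def segmentation_energy(labels, clusters, edges, n_clusters, lam=1.0):
    """Toy appearance + smoothness energy for a given segmentation and color clustering.
    labels:   0/1 per pixel (background/foreground)
    clusters: color-cluster index per pixel
    edges:    list of neighboring pixel-index pairs (Potts smoothness)
    The data term is -log of the empirical cluster histogram of the pixel's own
    region, so clusterings that separate fg/bg colors yield lower energy."""
    labels = np.asarray(labels)
    clusters = np.asarray(clusters)
    energy = 0.0
    for region in (0, 1):
        idx = clusters[labels == region]
        hist = np.bincount(idx, minlength=n_clusters) + 1e-6  # smoothed counts
        p = hist / hist.sum()
        energy += -np.log(p[idx]).sum()
    for i, j in edges:  # penalize label discontinuities between neighbors
        energy += lam * (labels[i] != labels[j])
    return energy

# A clustering aligned with the segmentation scores lower than a mixed one:
labels = [0, 0, 1, 1]
e_sep = segmentation_energy(labels, [0, 0, 1, 1], [(1, 2)], n_clusters=2)
e_mix = segmentation_energy(labels, [0, 1, 0, 1], [(1, 2)], n_clusters=2)
```

Here `e_sep < e_mix`: when each region's pixels concentrate in their own clusters, the histograms are peaked and the negative log-likelihoods shrink, which is the intuition behind the joint clustering term.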
This article presents a new technique for collaborative filtering based on pre-clustering of website usage data. The key idea is to use clustering methods to identify distinct groups of users.
The article is devoted to the history and problems of creating interfaces. It shows the complexity and importance of effective interfaces and notes that this problem is multilevel and interdisciplinary. In new systems, serious attention should be given to human efficiency: the human remains the leading element determining the efficiency of any ergatic (human-machine) system. The main means of control in ergatic systems, including computers, is the graphic manipulator (GM), which operates the on-screen controls. The main styles of user interface are reviewed; the most popular are GUIs (GUI - Graphical User Interface) and the WUIs (WUI - Web User Interface) built on them. The development of hardware and computer modeling technology has led to the active introduction of virtual reality technology for immersing people in artificial worlds. Their main feature is full control over all parameters of the environment and the emergence of a sense of presence in the people inhabiting these environments, which are therefore called immersive. Induced-environment technologies enable a number of new interfaces, not generally available at present, built on specially engineered virtual environments. Much attention is paid to creating the most advanced systems, contactless control systems, which consist of a camera and sophisticated software. The drawbacks of modern contactless control are discussed. A contactless intelligent interface is being developed that will: provide control using data from a video camera installed on the computer; have high noise immunity; reliably identify the user; recognize the situational environment; and have an acceptable cost.
This is a textbook in data analysis. Its contents are heavily influenced by the idea that data analysis should help in enhancing and augmenting knowledge of the domain as represented by its concepts and the statements of relations between them. According to this view, the two main pathways for data analysis are summarization, for developing and augmenting concepts, and correlation, for enhancing and establishing relations. Visualization, in this context, is a way of presenting results in a cognitively comfortable way. The term summarization is understood quite broadly here to embrace not only simple summaries like totals and means, but also more complex summaries such as the principal components of a set of features or cluster structures in a set of entities.
The material presented from this perspective is a unique mix of subjects from the fields of statistical data analysis, data mining, and computational intelligence, which otherwise follow different systems of presentation.
Most of today’s machine learning techniques require large amounts of manually labeled data. This problem can be addressed by using synthetic images. Our main contribution is to evaluate methods of traffic sign recognition trained on synthetically generated data and to show that the results are comparable with those of classifiers trained on real data. To obtain a representative synthetic dataset, we model different sign image variations such as intra-class variability, imprecise localization, blur, lighting, and viewpoint changes. We also present a new method for traffic sign segmentation, based on a nearest neighbor search in a large set of synthetically generated samples, which improves current traffic sign recognition algorithms.
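The matching step behind the nearest-neighbor idea can be sketched minimally. This is a hypothetical illustration, not the paper's method: it assumes sign candidates and synthetic samples have already been reduced to fixed-length feature vectors, and simply assigns each query the label of its closest synthetic sample under Euclidean distance:

```python
import numpy as np

def nn_classify(query, samples, labels):
    """Assign the label of the nearest synthetic sample (Euclidean distance).
    samples: (n, d) array of synthetic feature vectors
    labels:  length-n sequence of class labels for those samples"""
    dists = np.linalg.norm(samples - np.asarray(query, dtype=float), axis=1)
    return labels[int(np.argmin(dists))]

# Hypothetical 2-D features for two synthetic sign renderings:
samples = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = ["stop", "yield"]
```

In practice the synthetic set is large, so an index structure (e.g. a k-d tree) would replace the brute-force distance scan, but the labeling rule is the same.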
The volume contains the abstracts of the 12th International Conference "Intelligent Data Processing: Theory and Applications". The conference is organized by the Russian Academy of Sciences, the Federal Research Center "Informatics and Control" of the Russian Academy of Sciences, and the Scientific and Coordination Center "Digital Methods of Data Mining". The conference has been held biennially since 1989. It is one of the most recognized scientific forums on data mining, machine learning, pattern recognition, image analysis, signal processing, and discrete analysis. The Organizing Committee of IDP-2018 is grateful to Forecsys Co. and CFRS Co. for providing assistance in the conference preparation and execution. The conference is funded by RFBR, grant 18-07-20075. The conference website is http://mmro.ru/en/.
The paper describes the results of an experimental study of topic models applied to the task of single-word term extraction. The experiments encompass several probabilistic and non-probabilistic topic models and demonstrate that topic information improves the quality of term extraction, and that NMF with KL-divergence minimization is the best among the models under study.
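NMF with KL-divergence minimization, the best-performing model in the study, can be sketched with the classic Lee-Seung multiplicative updates. This is a generic illustration of the factorization itself, not the paper's experimental pipeline; the toy matrix below stands in for a term-document count matrix:

```python
import numpy as np

def nmf_kl(V, k, iters=1000, seed=0):
    """Factorize a nonnegative matrix V ~ W @ H by minimizing the generalized
    KL divergence D(V || WH), using multiplicative updates (Lee & Seung)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + 1e-3
    H = rng.random((k, m)) + 1e-3
    eps = 1e-12  # guards against division by zero
    for _ in range(iters):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

# Toy nonnegative rank-2 "term-document" matrix:
V = np.array([[1.0, 0.0, 2.0],
              [4.0, 1.0, 4.0],
              [5.0, 1.0, 6.0]])
W, H = nmf_kl(V, k=2)
```

The multiplicative form keeps `W` and `H` nonnegative throughout, which is why it is the standard solver for the KL objective.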
A vast number of documents on the Web have duplicates, which poses a challenge for developing efficient methods to compute clusters of similar documents. In this paper we use an approach based on computing (closed) sets of attributes having large support (large extent) as clusters of similar documents. The method is tested in a series of computer experiments on large public collections of web documents and compared to other established methods and software, such as biclustering, on the same datasets. The practical efficiency of different algorithms for computing frequent closed sets of attributes is also compared.
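The core object of the approach, a frequent closed attribute set, can be made concrete with a small sketch. This brute-force version is for illustration only (feasible only on tiny attribute sets; the paper compares specialized algorithms): a frequent itemset is closed when no proper superset has the same support, i.e. the same extent of documents:

```python
from itertools import combinations

def closed_frequent_sets(docs, min_support):
    """docs: list of attribute sets (one per document).
    Returns {frozenset: support} for all closed itemsets with
    support >= min_support, by brute-force enumeration."""
    attrs = sorted(set().union(*docs))
    frequent = {}
    for r in range(1, len(attrs) + 1):
        for cand in combinations(attrs, r):
            s = set(cand)
            supp = sum(1 for d in docs if s <= d)  # documents containing s
            if supp >= min_support:
                frequent[frozenset(s)] = supp
    # keep only closed sets: no frequent proper superset with equal support
    return {s: supp for s, supp in frequent.items()
            if not any(s < t and frequent[t] == supp for t in frequent)}

docs = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}]
closed = closed_frequent_sets(docs, min_support=2)
```

Here `{b}` and `{c}` are frequent but not closed (their closures are `{a,b}` and `{a,c}`, with the same supports), so the clusters of similar documents are described by `{a}`, `{a,b}`, and `{a,c}` alone.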
Technology mining (TM) helps to acquire intelligence about the evolution of research and development (R&D), technologies, products, and markets for various science, technology, and innovation (STI) areas, and about what is likely to emerge in the future, by identifying trends. The present chapter introduces a methodology for the identification of trends through a combination of “thematic clustering” based on the co-occurrence of terms, and “dynamic term clustering” based on the correlation of their dynamics across time. In this way, it is possible to identify and distinguish four patterns in the evolution of terms, which eventually lead to (i) weak signals of future trends, as well as (ii) emerging, (iii) maturing, and (iv) declining trends. Key trends identified are then further analyzed by looking at the semantic connections between terms identified through TM. This helps to understand the context and further features of the trend. The proposed approach is demonstrated in the field of photonics as an emerging technology with a number of potential application areas.
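The "dynamic term clustering" step, grouping terms by the correlation of their dynamics across time, can be sketched as follows. This is a hypothetical simplification (single-link grouping by a correlation threshold), not the chapter's actual procedure; the term names and yearly counts below are invented:

```python
import numpy as np

def dynamic_term_groups(freq, threshold=0.8):
    """Group terms whose per-year frequency series are highly correlated.
    freq: dict mapping term -> sequence of counts per year.
    Greedy single-link grouping on the Pearson correlation matrix."""
    terms = list(freq)
    series = np.array([freq[t] for t in terms], dtype=float)
    corr = np.corrcoef(series)  # pairwise Pearson correlations
    groups, assigned = [], set()
    for i, t in enumerate(terms):
        if t in assigned:
            continue
        group = [t]
        assigned.add(t)
        for j in range(i + 1, len(terms)):
            if terms[j] not in assigned and corr[i, j] >= threshold:
                group.append(terms[j])
                assigned.add(terms[j])
        groups.append(group)
    return groups

# Invented yearly term frequencies: two rising terms and one declining term.
freq = {"laser": [1, 2, 3, 4],
        "photonics": [2, 4, 6, 8],
        "crt": [4, 3, 2, 1]}
```

Terms rising together land in one group while the declining term stays alone, which is the kind of co-dynamics signal used to separate emerging from declining trends.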