Black-Box Optimization with Local Generative Surrogates
We propose a novel method for gradient-based optimization of black-box simulators using differentiable local surrogate models. In fields such as physics and engineering, many processes are modeled with non-differentiable simulators with intractable likelihoods. Optimization of these forward models is particularly challenging, especially when the simulator is stochastic. To address such cases, we introduce the use of deep generative models to iteratively approximate the simulator in local neighborhoods of the parameter space. We demonstrate that these local surrogates can be used to approximate the gradient of the simulator, and thus enable gradient-based optimization of simulator parameters. In cases where the dependence of the simulator on the parameter space is constrained to a low dimensional submanifold, we observe that our method attains minima faster than baseline methods, including Bayesian optimization, numerical optimization and approaches using score function gradient estimators.
The seminal model by Laurent Itti and Cristoph Koch demonstrated that we can compute the entire flow of visual processing from input to resulting fixations. Despite many replications and follow-ups, few have matched the impact of the original model - so what made this model so groundbreaking? We have selected five key contributions that distinguish the original salience model by Itti and Koch; namely its contribution to our theoretical, neural and computational understanding of visual processing. Further, the model showed how salience could be used to make predictions for both spatial and temporal distributions of fixations. During the last 20 years, advances in the field have brought up various techniques and approaches to salience modeling, many of which tried to augment the initial Itti and Koch model. One of the most recent trends has been to adopt the computational power of deep learning neural networks, however, this has also shifted their primary focus to spatial classification. We present a review of recent approaches to modeling salience, and discuss the models from the point of view of their contribution to computational cognitive neuroscience.
RGBD images, combining high-resolution color and lower-resolution depth from various types of depth sensors, are increasingly common. One can significantly improve the resolution of depth maps by taking advantage of color information; deep learning methods make combining color and depth information particularly easy. However, fusing these two sources of data may lead to a variety of artifacts. If depth maps are used to reconstruct 3D shapes, eg, for virtual reality applications, the visual quality of upsampled images is particularly important. The main idea of our approach is to measure the quality of depth map upsampling using renderings of resulting 3D surfaces. We demonstrate that a simple visual appearance-based loss, when used with either a trained CNN or simply a deep prior, yields significantly improved 3D shapes, as measured by a number of existing perceptual metrics. We compare this approach with a number of existing optimization and learning-based techniques.
Bayesian inference is known to provide a general framework for incorporating prior knowledge or specific properties into machine learning models via carefully choosing a prior distribution. In this work, we propose a new type of prior distributions for convolutional neural networks, deep weight prior (DWP), that exploit generative models to encourage a specific structure of trained convolutional filters e.g., spatial correlations of weights. We define DWP in the form of an implicit distribution and propose a method for variational inference with such type of implicit priors. In experiments, we show that DWP improves the performance of Bayesian neural networks when training data are limited, and initialization of weights with samples from DWP accelerates training of conventional convolutional neural networks.
This article is devoted to the application of the cellular automata mathematical apparatus to the problem of continuous optimization. The cellular automaton with an objective function is introduced as a new modification of the classic cellular automaton. The algorithm of continuous optimization, which is based on dynamics of the cellular automaton having the property of geometric symmetry, is obtained. The results of the simulation experiments with the obtained algorithm on standard test functions are provided, and a comparison between the analogs is shown.
The law of accelerating returns can be viewed as a concept that describes acceleration of technological progress. The idea is that tools are used for developing more advanced tools that are applied for creating even more advanced tools etc. A similar idea has been implemented in algorithms for advancing artificial intelligence. In this paper, the results of applying these algorithms in games are discussed. Nevertheless, real life tasks seem more complicated. The game theoretic approach can be applied for transition from theoretical and unrealistic games to more complex and practical tasks. Applications of the game theoretic approach to advance artificial intelligence in solving tasks in the credit industry are proposed.
We propose a novel multi-texture synthesis model based on generative adversarial networks (GANs) with a user-controllable mechanism. The user control ability allows to explicitly specify the texture which should be generated by the model. This property follows from using an encoder part which learns a latent representation for each texture from the dataset. To ensure a dataset coverage, we use an adversarial loss function that penalizes for incorrect reproductions of a given texture. In experiments, we show that our model can learn descriptive texture manifolds for large datasets and from raw data such as a collection of high-resolution photos. We show our unsupervised learning pipeline may help segmentation models. Moreover, we apply our method to produce 3D textures and show that it outperforms existing baselines.
his volume presents the results of the Neural Information Processing Systems Competition track at the 2018 NeurIPS conference. The competition follows the same format as the 2017 competition track for NIPS. Out of 21 submitted proposals, eight competition proposals were selected, spanning the area of Robotics, Health, Computer Vision, Natural Language Processing, Systems and Physics.
Competitions have become an integral part of advancing state-of-the-art in artificial intelligence (AI). They exhibit one important difference to benchmarks: Competitions test a system end-to-end rather than evaluating only a single component; they assess the practicability of an algorithmic solution in addition to assessing feasibility.
Applications that cater to the needs of disaster incident response generate large amount of data and demand large computational resource access. Such datasets are usually collected in real-time at the incident scenes using different Internet of Things (IoT) devices. Hierarchical clouds, i.e., core and edge clouds, can help these applications’ real-time data orchestration challenges as well as with their IoT operations scalability, reliability and stability by overcoming infrastructure limitations at the ad-hoc wireless network edge. Routing is a crucial infrastructure management orchestration mechanism for such systems. Current geographic routing or greedy forwarding approaches designed for early wireless ad-hoc networks lack efficient solutions for disaster incident-supporting applications, given the high-speed and low-latency data delivery that edge cloud gateways impose. In this paper, we present a novel Artificial Intelligent (AI)-augmented geographic routing approach, that relies on an area knowledge obtained from the satellite imagery (available at the edge cloud) by applying deep learning. In particular, we propose a stateless greedy forwarding that uses such an environment learning to proactively avoid the local minimum problem by diverting traffic with an algorithm that emulates electrostatic repulsive forces. In our theoretical analysis, we show that our Greedy Forwarding achieves in the worst case a path stretch approximation bound with respect to the shortest path, without assuming presence of symmetrical links or unit disk graphs. We evaluate our approach with both numerical and event-driven simulations, and we establish the practicality of our approach in a real incident-supporting hierarchical cloud deployment to demonstrate improvement of application level throughput due to a reduced path stretch under severe node failures and high mobility challenges of disaster response scenarios.
A model for organizing cargo transportation between two node stations connected by a railway line which contains a certain number of intermediate stations is considered. The movement of cargo is in one direction. Such a situation may occur, for example, if one of the node stations is located in a region which produce raw material for manufacturing industry located in another region, and there is another node station. The organization of freight traﬃc is performed by means of a number of technologies. These technologies determine the rules for taking on cargo at the initial node station, the rules of interaction between neighboring stations, as well as the rule of distribution of cargo to the ﬁnal node stations. The process of cargo transportation is followed by the set rule of control. For such a model, one must determine possible modes of cargo transportation and describe their properties. This model is described by a ﬁnite-dimensional system of diﬀerential equations with nonlocal linear restrictions. The class of the solution satisfying nonlocal linear restrictions is extremely narrow. It results in the need for the “correct” extension of solutions of a system of diﬀerential equations to a class of quasi-solutions having the distinctive feature of gaps in a countable number of points. It was possible numerically using the Runge–Kutta method of the fourth order to build these quasi-solutions and determine their rate of growth. Let us note that in the technical plan the main complexity consisted in obtaining quasi-solutions satisfying the nonlocal linear restrictions. Furthermore, we investigated the dependence of quasi-solutions and, in particular, sizes of gaps (jumps) of solutions on a number of parameters of the model characterizing a rule of control, technologies for transportation of cargo and intensity of giving of cargo on a node station.
Event logs collected by modern information and technical systems usually contain enough data for automated process models discovery. A variety of algorithms was developed for process models discovery, conformance checking, log to model alignment, comparison of process models, etc., nevertheless a quick analysis of ad-hoc selected parts of a journal still have not get a full-fledged implementation. This paper describes an ROLAP-based method of multidimensional event logs storage for process mining. The result of the analysis of the journal is visualized as directed graph representing the union of all possible event sequences, ranked by their occurrence probability. Our implementation allows the analyst to discover process models for sublogs defined by ad-hoc selection of criteria and value of occurrence probability
Existing approaches suggest that IT strategy should be a reflection of business strategy. However, actually organisations do not often follow business strategy even if it is formally declared. In these conditions, IT strategy can be viewed not as a plan, but as an organisational shared view on the role of information systems. This approach generally reflects only a top-down perspective of IT strategy. So, it can be supplemented by a strategic behaviour pattern (i.e., more or less standard response to a changes that is formed as result of previous experience) to implement bottom-up approach. Two components that can help to establish effective reaction regarding new initiatives in IT are proposed here: model of IT-related decision making, and efficiency measurement metric to estimate maturity of business processes and appropriate IT. Usage of proposed tools is demonstrated in practical cases.