The Deep Weight Prior
Bayesian inference provides a general framework for incorporating prior knowledge or specific properties into machine learning models by carefully choosing a prior distribution. In this work, we propose a new type of prior distribution for convolutional neural networks, the deep weight prior (DWP), which exploits generative models to encourage a specific structure in trained convolutional filters, e.g., spatial correlations of weights. We define DWP in the form of an implicit distribution and propose a method for variational inference with this type of implicit prior. In experiments, we show that DWP improves the performance of Bayesian neural networks when training data are limited, and that initializing weights with samples from DWP accelerates the training of conventional convolutional neural networks.
We present a model for freight train travel time prediction based on station network analysis and dedicated feature engineering. We discuss the first pipeline to improve freight trip duration prediction in Russia. While every freight company relies only on the reference book issued by RZD (Russian Railways), which is based on railroad distances and whose accuracy is measured in days, we argue that one can predict the trip duration with an error of less than twenty hours, and reduce the error to twelve hours for certain types of freight trains.
The seminal model by Laurent Itti and Christof Koch demonstrated that we can compute the entire flow of visual processing from input to resulting fixations. Despite many replications and follow-ups, few have matched the impact of the original model, so what made it so groundbreaking? We have selected five key contributions that distinguish the original salience model by Itti and Koch, namely its contributions to our theoretical, neural, and computational understanding of visual processing. Further, the model showed how salience could be used to make predictions for both the spatial and temporal distributions of fixations. Over the last 20 years, advances in the field have brought various techniques and approaches to salience modeling, many of which tried to augment the initial Itti and Koch model. One of the most recent trends has been to adopt the computational power of deep neural networks; however, this has also shifted the models' primary focus to spatial classification. We present a review of recent approaches to modeling salience and discuss the models from the point of view of their contribution to computational cognitive neuroscience.
Proceedings of Machine Learning Research: Volume 97: International Conference on Machine Learning, 9-15 June 2019, Long Beach, California, USA
We propose a novel multi-texture synthesis model based on generative adversarial networks (GANs) with a user-controllable mechanism. The user control allows one to explicitly specify the texture that the model should generate. This property follows from using an encoder that learns a latent representation for each texture in the dataset. To ensure dataset coverage, we use an adversarial loss function that penalizes incorrect reproductions of a given texture. In experiments, we show that our model can learn descriptive texture manifolds for large datasets and from raw data such as a collection of high-resolution photos. We also show that our unsupervised learning pipeline can aid segmentation models. Moreover, we apply our method to produce 3D textures and show that it outperforms existing baselines.
This volume presents the results of the Neural Information Processing Systems Competition track at the 2018 NeurIPS conference. The competition track follows the same format as the 2017 competition track for NIPS. Out of 21 submitted proposals, eight competition proposals were selected, spanning the areas of Robotics, Health, Computer Vision, Natural Language Processing, Systems, and Physics.
Competitions have become an integral part of advancing the state of the art in artificial intelligence (AI). They exhibit one important difference from benchmarks: competitions test a system end to end rather than evaluating only a single component, and they assess the practicability of an algorithmic solution in addition to its feasibility.
The Varieties of Democracy (V-Dem) project relies on country experts who code a host of ordinal variables, providing subjective ratings of latent (that is, not directly observable) regime characteristics over time. Sets of around five experts rate each case (country-year observation), and each of these raters works independently. Since raters may diverge in their coding because of either differences of opinion or mistakes, we require systematic tools with which to model these patterns of disagreement. These tools allow us to aggregate ratings into point estimates of latent concepts and to quantify our uncertainty around these point estimates. In this chapter we describe item response theory models that account and adjust for differential item functioning (i.e., differences in how experts apply ordinal scales to cases) and variation in rater reliability (i.e., random error). We also discuss key challenges specific to applying item response theory to expert-coded cross-national panel data, explain the approaches that we use to address these challenges, highlight potential problems with our current framework, and describe long-term plans for improving our models and estimates. Finally, we provide an overview of the different forms in which we present model output.
Data sets quantifying phenomena of social-scientific interest often use multiple experts to code latent concepts. While it remains standard practice to report the average score across experts, experts likely vary in both their expertise and their interpretation of question scales. As a result, the mean may be an inaccurate statistic. Item-response theory (IRT) models provide an intuitive method for taking these forms of expert disagreement into account when aggregating ordinal ratings produced by experts, but they have rarely been applied to cross-national expert-coded panel data. We investigate the utility of IRT models for aggregating expert-coded data by comparing the performance of various IRT models to the standard practice of reporting average expert codes, using both data from the V-Dem data set and ecologically motivated simulated data. We find that IRT approaches outperform simple averages when experts vary in reliability and exhibit differential item functioning (DIF). IRT models are also generally robust even in the absence of simulated DIF or varying expert reliability. Our findings suggest that producers of cross-national data sets should adopt IRT techniques to aggregate expert-coded data measuring latent concepts.
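The core claim above, that a plain average across experts misleads when rater reliability varies, can be illustrated with a small simulation. The sketch below is not the authors' IRT model: it uses an inverse-variance weighted mean as a crude stand-in for IRT-style aggregation, and all names and parameter values (number of cases, per-expert noise levels) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate expert-coded data: one latent score per case, rated by five
# experts whose reliability (noise level) varies widely.
n_cases, n_experts = 200, 5
latent = rng.normal(0.0, 1.0, n_cases)              # true latent scores
noise_sd = np.array([0.2, 0.4, 0.6, 1.2, 2.0])      # assumed per-expert error
ratings = latent[:, None] + rng.normal(0.0, noise_sd, (n_cases, n_experts))

# Standard practice: report the plain average across experts.
mean_est = ratings.mean(axis=1)

# Crude stand-in for an IRT-style aggregate: weight each expert by the
# inverse of their error variance, so reliable raters count for more.
weights = 1.0 / noise_sd**2
weighted_est = (ratings * weights).sum(axis=1) / weights.sum()

def rmse(est):
    return np.sqrt(np.mean((est - latent) ** 2))

print(f"plain mean RMSE:    {rmse(mean_est):.3f}")
print(f"weighted mean RMSE: {rmse(weighted_est):.3f}")
```

Because the unreliable raters receive nearly as much weight as the careful ones under simple averaging, the weighted estimate recovers the latent scores with a noticeably lower error; a full IRT model additionally estimates the reliabilities and scale thresholds from the data rather than assuming them known.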
The book by Adrian Mackenzie, a professor in the Department of Sociology at Lancaster University, is of a kind unprecedented within the emerging, but still limited, literature in the humanities and social sciences that explores how machine learning (ML) works. The spectacular advances of this branch of artificial intelligence (AI) in recent years have eclipsed other approaches in the field and have suddenly turned AI into a social and political problem. Several authors have already insisted on the need to focus attention on the tools of AI themselves, pointing out the limits of work that deals only with the social effects of "algorithms". As the anthropologist of science and technology Nick Seaver notes, most work on the subject frets about "algorithms" or "big data", insisting on their harmful, even catastrophic, effects on society without ever specifying exactly what they are. Yet the transfer of knowledge and perspectives between specialists in AI and in the humanities and social sciences (in both directions, moreover) is indispensable for producing an informed and effective critique.