Subspace Inference for Bayesian Deep Learning

Vetrov D., Izmailov P., Maddox W., Kirichenko P., Garipov T., Gordon Wilson A.


P. 1–11.

Bayesian inference was once a gold standard for learning with neural networks, providing accurate full predictive distributions and well-calibrated uncertainty. However, scaling Bayesian inference techniques to deep neural networks is challenging due to the high dimensionality of the parameter space. In this paper, we construct low-dimensional subspaces of parameter space, such as the first principal components of the stochastic gradient descent (SGD) trajectory, which contain diverse sets of high-performing models. In these subspaces, we are able to apply elliptical slice sampling and variational inference, which struggle in the full parameter space. We show that Bayesian model averaging over the induced posterior in these subspaces produces accurate predictions and well-calibrated predictive uncertainty for both regression and image classification.
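
The subspace construction described in the abstract can be illustrated with a minimal sketch. It assumes the SGD iterates are already available as flattened weight vectors, and it scales each principal direction by its standard deviation; that scaling, and the names build_pca_subspace, to_weights and model_average, are illustrative assumptions and are not taken from the paper. The subspace samples z would normally come from elliptical slice sampling or variational inference; here standard normal draws stand in for them.

```python
import numpy as np


def build_pca_subspace(iterates, rank):
    """Fit an affine subspace (mean, basis) to a set of weight iterates.

    iterates: array of shape (T, d), one flattened weight vector per row.
    Returns mean of shape (d,) and basis of shape (d, rank).
    """
    iterates = np.asarray(iterates, dtype=float)
    mean = iterates.mean(axis=0)
    deviations = iterates - mean
    # Thin SVD: the rows of vt are the principal directions of the trajectory.
    _, singular_values, vt = np.linalg.svd(deviations, full_matrices=False)
    # Assumption: scale each direction by its standard deviation along the trajectory.
    scale = singular_values[:rank] / np.sqrt(max(len(iterates) - 1, 1))
    basis = (vt[:rank] * scale[:, None]).T
    return mean, basis


def to_weights(mean, basis, z):
    """Map low-dimensional coordinates z to a full weight vector."""
    return mean + basis @ z


def model_average(predict, mean, basis, z_samples):
    """Average the predictions of the models indexed by the subspace samples."""
    return np.mean([predict(to_weights(mean, basis, z)) for z in z_samples], axis=0)


# Toy usage: a random walk stands in for an SGD trajectory (hypothetical data).
rng = np.random.default_rng(0)
trajectory = np.cumsum(rng.normal(size=(20, 50)), axis=0)
mean, basis = build_pca_subspace(trajectory, rank=3)

# Placeholder samples; a real run would use posterior samples over z.
z_samples = rng.normal(size=(10, 3))
averaged = model_average(lambda w: w[:2], mean, basis, z_samples)
```

Inference then only has to explore the three coordinates of z rather than all fifty weights, which is the reason samplers that struggle in the full parameter space become practical in the subspace.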

Language: English

Keywords: Bayesian network; variational inference; Bayesian learning; gradient methods

In book

Proceedings of the 35th Uncertainty in Artificial Intelligence Conference (UAI-2019)

[s.n.], 2019.

Classification Using Marginalized Maximum Likelihood Estimation and Black-Box Variational Inference

Shalileh S., in: Data Analysis and Optimization. In Honor of Boris Mirkin's 80th Birthday. Springer, 2023. P. 349–361.

Based upon variational inference (VI), a new set of classification algorithms has recently emerged. This set of algorithms aims (A) to increase generalization power and (B) to decrease computational complexity. However, the complex math and implementation considerations have led to the emergence of black-box variational inference methods (BBVI). Relying on these principles, we assume the existence ...

Added: October 17, 2023

MARS: Masked Automatic Ranks Selection in Tensor Decompositions

Kodryan M., Kropotov D., Vetrov D. / Series QTNML 2020 "First Workshop on Quantum Tensor Networks in Machine Learning, NeurIPS 2020". 2020.

Tensor decomposition methods have recently proven to be efficient for compressing and accelerating neural networks. However, the problem of determining the optimal decomposition structure is still not well studied, despite its importance. Specifically, decomposition ranks are the crucial parameter controlling the compression-accuracy trade-off. In this paper, we introduce MARS, a new efficient method for ...

Added: February 5, 2021

Bayesian Social Learning from Consumer Reviews

Zseleva A., Ifrach B., Maglaras C. et al., Operations Research 2018

Added: June 19, 2018

Finding equilibria in the traffic assignment problem with primal-dual gradient methods for stable dynamics model and Beckmann model

Kubentayeva M., Gasnikov A., Mathematics 2021 Vol. 9 No. 11 Article 1217

In this paper, we consider the application of several gradient methods to the traffic assignment problem: we search for equilibria in the stable dynamics model (Nesterov and De Palma, 2003) and the Beckmann model. Unlike the celebrated Frank–Wolfe algorithm widely used for the Beckmann model, these gradient methods solve the dual problem and then reconstruct a ...

Added: October 29, 2021

Analysing the firm failure process using Bayesian networks

Zelenkov Y., Business Informatics 2022 Vol. 16 No. 1 P. 22–41

This work analyses the firm failure process stages using the Bayesian network as a modelling tool because it allows us to identify causal relationships in the firm profile. We use publicly available data on French, Italian and Russian firms containing five samples corresponding to periods from one to five years before observation. Our results confirm ...

Added: April 18, 2022

Scalable Gaussian Processes with Billions of Inducing Inputs via Tensor Train Decomposition

Izmailov P., Novikov A., Kropotov D., in: Proceedings of Machine Learning Research. Proceedings of The International Conference on Artificial Intelligence and Statistics (AISTATS 2018). [s.n.], 2018. P. 726–735.

We propose a method (TT-GP) for approximate inference in Gaussian Process (GP) models. We build on previous scalable GP research including stochastic variational inference based on inducing inputs, kernel interpolation, and structure exploiting algebra. The key idea of our method is to use Tensor Train decomposition for variational parameters, which allows us to train GPs ...

Added: December 10, 2018

Variational Inference for Sequential Distance Dependent Chinese Restaurant Process.

Bartunov S. O., Vetrov D., Journal of Machine Learning Research 2014 Vol. 32 No. 1 P. 1404–1412

The recently proposed distance dependent Chinese Restaurant Process (ddCRP) generalizes the extensively used Chinese Restaurant Process (CRP) by accounting for dependencies between data points. Its posterior is intractable, and so far only MCMC methods have been used for inference. Because of the very different nature of ddCRP, no prior developments in variational methods for Bayesian nonparametrics are applicable. In ...

Added: July 9, 2014

Model Criticism of Bayesian Networks in Educational Assessment: A Systematic Review

Uglanova I., Practical Assessment, Research and Evaluation 2021 Vol. 26 Article 22

There is increased use of Bayesian networks (BN) in educational assessment. In psychometrics, BN serves as a measurement model with high flexibility, suitable to model educational assessment data with a complex structure. BN is a novel psychometric approach and not all aspects of its application are well-known. The article aims to provide the systematization of ...

Added: November 15, 2021

Doubly Semi-Implicit Variational Inference

Molchanov D., Kharitonov V., Sobolev A. et al., in: Proceedings of Machine Learning Research, Volume 89: The 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019). PMLR, 2019. P. 2593–2602.

We extend the existing framework of semi-implicit variational inference (SIVI) and introduce doubly semi-implicit variational inference (DSIVI), a way to perform variational inference and learning when both the approximate posterior and the prior distribution are semi-implicit. In other words, DSIVI performs inference in models where the prior and the posterior can be expressed as an ...

Added: November 20, 2019

Bayesian Learning of Consumer Preferences for Residential Demand Response

Gubko M. V., Kuznetsov S., Neznanov A. et al., IFAC-PapersOnLine 2016 Vol. 49 No. 32 P. 24–29

In coming years, residential consumers will face real-time electricity tariffs with energy prices varying from day to day, and effective energy saving will require automation: a recommender system that learns a consumer's preferences from her actions. A consumer chooses a scenario of home appliance use to balance her comfort level and the energy bill. We propose ...

Added: January 24, 2017

Counterfactual explanations based on synthetic data generation

Yuri A. Zelenkov, Elizaveta V. Lashkevich, Business Informatics 2024 Vol. 18 No. 3 P. 24–40

A counterfactual explanation is the generation, for a particular sample, of a set of instances that belong to the opposite class but are as close as possible in the feature space to the factual being explained. Existing algorithms that solve this problem are usually based on complicated models that require a large amount of training data and significant ...

Added: October 13, 2024

Knowledge Generation in Raw Material Industries

Krukov V. A., Milyaev D. V., Dushenin D. I. et al., Studies on Russian Economic Development 2022 Vol. 33 No. 3 P. 257–266

The article presents a mathematical toolkit for forecasting the development of innovative technologies that can be used for cost-effective development of hard-to-recover hydrocarbon reserves. The proposed approach is a symbiosis of agent-based models, Bayesian networks, learning curves, statistical analysis, and numerical simulation modeling methods. These techniques have not been used as a set in ...

Added: October 29, 2022

Improving Maximum Likelihood Estimation Using Marginalization and Black-Box Variational Inference

Shalileh S., in: Intelligent Data Engineering and Automated Learning – IDEAL 2021. Springer, 2021. P. 204–212.

Based upon Black Box Variational Inference, a new set of classification algorithms has recently emerged. The goals of this set of algorithms are twofold: 1) increasing generalization power; 2) decreasing computational and implementation complexity. To this end, we assume a set of latent variables during the generation of data points. We subsequently marginalize the conventional ...

Added: April 19, 2022

Estimating the default probabilities of Russian banks: an empirical analysis

Smirnov S. N., Финансовый бизнес 2011 No. 3 P. 9–16

We suggest an econometric model of probability of default based on regular financial disclosures of Russian banks. We also suggest a quantization of the continuous explanatory variables that allows us to account for non-linear effects and to achieve superior accuracy compared with regression tree and Bayesian network models estimated over the same sample. The econometric estimates ...

Added: November 30, 2012

Entropy Dimension Reduction Method for Randomized Machine Learning Problems

Popkov Y., Dubnov Y. A., Popkov A. Y., Automation and Remote Control 2018 Vol. 79 No. 11 P. 2038–2051

The direct and inverse projections (DIP) method was proposed to reduce the feature space to the given dimensions oriented to the problems of randomized machine learning and based on the procedure of “direct” and “inverse” design. The “projector” matrices are determined by maximizing the relative entropy. It is suggested to estimate the information losses by ...

Added: February 12, 2019

Faster variational inducing input Gaussian process classification

Izmailov P., Kropotov D., Journal of machine learning and data analysis 2017 Vol. 3 No. 1 P. 20–35

Background: Gaussian processes (GP) provide an elegant and effective approach to learning in kernel machines. This approach leads to a highly interpretable model and allows using the Bayesian framework for model adaptation and incorporating the prior knowledge about the problem. The GP framework is successfully applied to regression, classification, and dimensionality reduction problems. Unfortunately, the ...

Added: December 6, 2018

Machine Learning and Data Mining in Pattern Recognition

Springer, 2014.

This book constitutes the refereed proceedings of the 10th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2014, held in St. Petersburg, Russia in July 2014. The 40 full papers presented were carefully reviewed and selected from 128 submissions. The topics range from theoretical topics for classification, clustering, association rule and ...

Added: September 30, 2014

Universal gradient methods for convex optimization problems

Nesterov Y., Mathematical Programming 2014

Added: January 28, 2016

Variational Inference for Sequential Distance Dependent Chinese Restaurant Process

Bartunov S., Vetrov D., in: JMLR Workshop and Conference Proceedings. Issue 32: Proceedings of The 31st International Conference on Machine Learning. Beijing: Microtome Publishing, 2014. P. 1404–1412.

Added: March 4, 2015

Community Embeddings with Bayesian Gaussian Mixture Model and Variational Inference

Anton I.N. Begehr, Peter B. Panfilov, in: 2022 IEEE 24th Conference on Business Informatics (CBI). Vol. 2: CBI Forum and Workshop Papers. IEEE, 2022. P. 88–96.

Graphs, such as social networks, emerge naturally from various real-world situations. Recently, graph embedding methods have gained traction in data science research. The graph and community embedding algorithm ComE aims to preserve first-, second- and higher-order proximity. ComE requires prior knowledge of the number of communities K. In this paper, ComE is extended to utilize ...

Added: December 6, 2022

Fast and modular regularized topic modelling. 21st Conference of Open Innovations Association, FRUCT 2017; Helsinki; Finland; 6 November 2017 to 10 November 2017; Category number CFP1767Z-ART; Code 134240

Vorontsov K. V., Kochedykov D., Apishev M. et al., IEEE Computer Society, 2017.

Topic modelling is an area of text mining that has been actively developed in the last 15 years. A probabilistic topic model extracts a set of hidden topics from a collection of text documents. It defines each topic by a probability distribution over words and describes each document with a probability distribution over topics. In ...

Added: December 6, 2019

Matrix equations of local logical-probabilistic inference of truth estimates for elements in algebraic Bayesian networks

Tulupyev A. L., Sirotkin A., Вестник Санкт-Петербургского университета. Серия 1. Математика. Механика. Астрономия 2012 No. 3 P. 63–72

The processing of probabilistically uncertain knowledge patterns in intellectual decision support systems falls into three kinds of probabilistic-logic inference, such as reconciliation, a priori and a posteriori inference. The paper presents formulae that allow for putting the process down in terms of matrix-vector language. ...

Added: March 25, 2014

Solving Smooth Min-Min and Min-Max Problems by Mixed Oracle Algorithms

Gladin E., Sadiev A., Gasnikov A. et al., in: Mathematical Optimization Theory and Operations Research: 20th International Conference, MOTOR 2021, Irkutsk, Russia, July 5–10, 2021, Proceedings. Cham: Springer, 2021. P. 19–40.

In this paper, we consider two types of problems that have some similarity in their structure, namely, min-min problems and min-max saddle-point problems. Our approach is based on considering the outer minimization problem as a minimization problem with an inexact oracle. This inexact oracle is calculated via an inexact solution of the inner problem, which ...

Added: November 29, 2024

Robust Variational Inference

Figurnov M., Struminsky K., Vetrov D. / Series arXiv:1611.09226 "arxiv.org". 2016.

Variational inference is a powerful tool for approximate inference. However, it mainly focuses on the evidence lower bound as variational objective and the development of other measures for variational inference is a promising area of research. This paper proposes a robust modification of evidence and a lower bound for the evidence, which is applicable when ...

Added: November 30, 2016