Gray-box Inference for Structured Gaussian Process Models

P. 353–361.
Galliani P., Dezfouli A., Bonilla E., Quadrianto N.

We develop an automated variational inference method for Bayesian structured prediction problems with Gaussian process (GP) priors and linear-chain likelihoods. Our approach does not need to know the details of the structured likelihood model and can scale up to a large number of observations. Furthermore, we show that the required expected likelihood term and its gradients in the variational objective (ELBO) can be estimated efficiently by using expectations over very low-dimensional Gaussian distributions. Optimization of the ELBO is fully parallelizable over sequences and amenable to stochastic optimization, which we use along with control variate techniques to make our framework useful in practice. Results on a set of natural language processing tasks show that our method can be as good as (and sometimes better than, in particular with respect to expected log-likelihood) hard-coded approaches including SVM-struct and CRFs, and overcomes the scalability limitations of previous inference algorithms based on sampling. Overall, this is a fundamental step toward developing automated inference methods for Bayesian structured prediction.
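
The idea of estimating the expected log-likelihood and its gradient from samples of a low-dimensional Gaussian, with a control variate to reduce variance, can be illustrated roughly as follows. This is only a minimal sketch under strong assumptions and not the authors' implementation: it uses a one-dimensional Gaussian q(f) = N(m, s^2), a hypothetical black-box `log_lik` (a logistic likelihood chosen purely for illustration, not the linear-chain likelihood from the paper), and a generic score-function gradient estimator. The names `log_lik` and `expected_loglik_and_grad` are invented for this example.

```python
import numpy as np

def log_lik(y, f):
    # Hypothetical black-box log-likelihood (assumption, for illustration only):
    # Bernoulli with a logistic link, labels y in {-1, +1}.
    return -np.logaddexp(0.0, -y * f)

def expected_loglik_and_grad(y, m, s, n_samples=2000, seed=0):
    """Monte Carlo estimate of E_q[log p(y|f)] and its gradient w.r.t. m,
    where q(f) = N(m, s^2). Only evaluations of log_lik are needed."""
    rng = np.random.default_rng(seed)
    f = m + s * rng.standard_normal(n_samples)   # samples from q(f)
    h = log_lik(y, f)                            # black-box evaluations
    score = (f - m) / s**2                       # d/dm log q(f), has zero mean
    raw = h * score                              # score-function gradient terms
    # Control variate: subtract a * score, which leaves the expectation
    # unchanged; a is the variance-minimising coefficient, estimated from the
    # same samples (this introduces a small bias in finite samples).
    a = np.cov(raw, score)[0, 1] / np.var(score, ddof=1)
    controlled = raw - a * score
    return h.mean(), controlled.mean(), raw.var(ddof=1), controlled.var(ddof=1)

ell, grad, var_raw, var_cv = expected_loglik_and_grad(y=1, m=0.5, s=1.0)
print(f"expected log-lik: {ell:.3f}, gradient wrt m: {grad:.3f}")
print(f"per-sample variance without / with control variate: {var_raw:.3f} / {var_cv:.3f}")
```

Because the estimator only calls `log_lik` and never differentiates it, the likelihood can be swapped for another model without changing the inference code, which is the sense in which such a scheme treats the likelihood as a black box.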

Language: English
Keywords: Gaussian processes, structured prediction

In book

Proceedings of Machine Learning Research. Vol. 54: Artificial Intelligence and Statistics. [s.n.], 2017.
Similar publications
Surrogate uncertainty estimation for your time series forecasting black-box: learn when to trust
Erlygin L., Zholobov V., Baklanova V. et al., in: 2023 IEEE International Conference on Data Mining Workshops (ICDMW) 1–4 December 2023, Shanghai, China. Shanghai: IEEE Computer Society, 2023. P. 1247–1258.
Machine learning models play a vital role in time series forecasting. These models, however, often overlook an important element: point uncertainty estimates. Incorporating these estimates is crucial for effective risk management, informed model selection, and decision-making. To address this issue, our research introduces a method for uncertainty estimation. We employ a surrogate Gaussian process regression model. ...
Added: March 20, 2024
Uncertainty Estimation in Autoregressive Structured Prediction
Andrey Malinin, Gales M., in: Proceedings of the 9th International Conference on Learning Representations (ICLR 2021). ICLR, 2021. P. 1–31.
Added: November 1, 2021
Gaussian processes with multidimensional distribution inputs via optimal transport and Hilbertian embedding
Bachoc F., Suvorikova A., Ginsbourger D. et al., Electronic journal of statistics 2020 Vol. 14 No. 2 P. 2742–2772
In this work, we propose a way to construct Gaussian processes indexed by multidimensional distributions. More precisely, we tackle the problem of defining positive definite kernels between multivariate distributions via notions of optimal transport and appealing to Hilbert space embeddings. Besides presenting a characterization of radial positive definite and strictly positive definite kernels on general ...
Added: October 30, 2020
High extremes of Gaussian chaos processes: a discrete time approximation approach
A. I. Zhdanov, V. I. Piterbarg, Theory of Probability and its Applications 2018 Vol. 63 No. 1 P. 1–21
Let $\mathbf{\boldsymbol{\xi}}(t)=(\xi_{1}(t),\ldots,\xi_{d}(t))$ be a Gaussian zero mean stationary a.s. continuous vector process. Let $g\colon{\mathbb{R}}^{d}\to {\mathbb{R}}$ be a homogeneous function of positive degree. We study probabilities of high extrema of the Gaussian chaos process $g(\mathbf{\boldsymbol{\xi}}(t))$. Important examples are products of Gaussian processes, $\prod_{i=1}^{d}\xi_{i}(t)$, and quadratic forms $\sum_{i,j=1}^{d}a_{ij}\xi_{i}(t)\xi_{j}(t)$. Methods of our studies include the Laplace saddle point ...
Added: November 14, 2019
On probability of high extremes for product of two Gaussian stationary processes
A. I. Zhdanov, Theory of Probability and its Applications 2015 Vol. 60 No. 3 P. 520–527
Let $(X(t),Y(t))$, $t\ge0$, be a zero-mean stationary Gaussian vector process with covariance functions for components $r_i(t)$ satisfying Pickands' condition $r_i(t)=1-c_i|t|^{\alpha_i}(1+o(1))$, $t\to 0$, $c_i>0$, $0<\alpha_i\le2$, $i=1,2.$ Let $r_i(t)<1$, $i=1,2$, $t>0.$ Assuming that $r\equiv {\bf E}\,X(t)Y(t)\in(-1,1)$ and $\lim_{t,s\rightarrow0}({\bf E}\,X(t)Y(s)-r)/|t-s|^{\min(\alpha_1,\alpha_2)}$ exists, we study the behavior of probability ${\bf P}(\max_{t\in\lbrack0,p]}X(t)Y(t)>u)$ as $u\rightarrow\infty$ for any $p$. In particular, we ...
Added: November 14, 2019
On probability of high extremes for product of two independent Gaussian stationary processes
Zhdanov A., Piterbarg V.I., Extremes 2015 Vol. 18 No. 1 P. 99–108
Let X(t), Y(t), t ≥ 0, be two independent zero-mean stationary Gaussian processes, whose covariance functions are such that r_{i}(t) = 1 − |t|^{a_{i}} + o(|t|^{a_{i}}) as t → 0, with 0 < a_{i} ≤ 2, i = 1, 2 and both of the functions are less than one for non-zero t. We derive for any p ...
Added: November 14, 2019
Exact asymptotics of small deviations in the weighted L_2-norm for some Gaussian processes
Pusev R., Nazarov A. I., Zapiski Nauchnykh Seminarov POMI 2009 Vol. 364 P. 166–199
We find the exact small ball asymptotics under weighted L_2-norm for a wide class of Gaussian processes which generate boundary-value problems for ordinary differential equations. Sharp constants in the asymptotics are derived for a number of processes connected with special functions. ...
Added: January 28, 2019
Scalable Gaussian Processes with Billions of Inducing Inputs via Tensor Train Decomposition
Izmailov P., Novikov A., Kropotov D., in: Proceedings of Machine Learning Research. Proceedings of The International Conference on Artificial Intelligence and Statistics (AISTATS 2018). [s.n.], 2018. P. 726–735.
We propose a method (TT-GP) for approximate inference in Gaussian Process (GP) models. We build on previous scalable GP research including stochastic variational inference based on inducing inputs, kernel interpolation, and structure exploiting algebra. The key idea of our method is to use Tensor Train decomposition for variational parameters, which allows us to train GPs ...
Added: December 10, 2018
Faster variational inducing input Gaussian process classification
Izmailov P., Kropotov D., Journal of machine learning and data analysis 2017 Vol. 3 No. 1 P. 20–35
Background: Gaussian processes (GP) provide an elegant and effective approach to learning in kernel machines. This approach leads to a highly interpretable model and allows using the Bayesian framework for model adaptation and incorporating the prior knowledge about the problem. The GP framework is successfully applied to regression, classification, and dimensionality reduction problems. Unfortunately, the ...
Added: December 6, 2018
Quantifying Learning Guarantees for Convex but Inconsistent Surrogates
Struminsky K., Lacoste-Julien S., Osokin A., in: Advances in Neural Information Processing Systems 31 (NIPS 2018). [s.n.], 2018. P. 1–9.
We study consistency properties of machine learning methods based on minimizing convex surrogates. We extend the recent framework of Osokin et al. (2017) for the quantitative analysis of consistency properties to the case of inconsistent surrogates. Our key technical contribution consists in a new lower bound on the calibration function for the quadratic surrogate, which ...
Added: October 29, 2018
Marginal Weighted Maximum Log-likelihood for Efficient Learning of Perturb-and-Map models
Shpakova T., Bach F., Osokin A., in: Proceedings of the international conference on Uncertainty in Artificial Intelligence (UAI 2018). [s.n.], 2018. P. 1–11.
We consider the structured-output prediction problem through probabilistic approaches and generalize the "perturb-and-MAP" framework to more challenging weighted Hamming losses, which are crucial in applications. While in principle our approach is a straightforward marginalization, it requires solving many related MAP inference problems. We show that for log-supermodular pairwise models these operations can be performed efficiently ...
Added: October 29, 2018
SEARNN: Training RNNs with global-local losses
Leblond R., Alayrac J., Osokin A. et al., in: Proceedings of the 6th International Conference on Learning Representations (ICLR 2018). [s.n.], 2018. P. 1–16.
We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the "learning to search" (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an ...
Added: October 29, 2018
Scalable Gaussian Processes with Billions of Inducing Inputs via Tensor Train Decomposition
Izmailov P., Novikov A., Kropotov D. / Series arXiv "math". 2017.
We propose a method (TT-GP) for approximate inference in Gaussian Process (GP) models. We build on previous scalable GP research including stochastic variational inference based on inducing inputs, kernel interpolation, and structure exploiting algebra. The key idea of our method is to use Tensor Train decomposition for variational parameters, which allows us to train GPs ...
Added: October 20, 2017
On Structured Prediction Theory with Calibrated Convex Surrogate Losses
Osokin A., Bach F., Lacoste-Julien S., in: Advances in Neural Information Processing Systems 30 (NIPS 2017). Montreal: Curran Associates, 2017. P. 302–313.
We provide novel theoretical insights on structured prediction in the context of efficient convex surrogate loss minimization with consistency guarantees. For any task loss, we construct a convex surrogate that can be optimized via stochastic gradient descent and we prove tight bounds on the so-called "calibration function" relating the excess surrogate risk to the actual ...
Added: October 19, 2017
Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs
Osokin A., Alayrac J., Lukasewitz I. et al., in: Proceedings of Machine Learning Research. Proceedings of the International Conference on Machine Learning (ICML 2016). Vol. 48. NY: [s.n.], 2016. P. 885–925.
In this paper, we propose several improvements on the block-coordinate Frank-Wolfe (BCFW) algorithm from Lacoste-Julien et al. (2013) recently used to optimize the structured support vector machine (SSVM) objective in the context of structured prediction, though it has wider applications. The key intuition behind our improvements is that the estimates of block gaps maintained by ...
Added: October 19, 2017
Context-Aware CNNs for Person Head Detection
Vu T., Osokin A., Laptev I., in: Proceedings of the IEEE International Conference on Computer Vision (ICCV 2015). Santiago de Chile: IEEE, 2015. P. 2893–2901.
Person detection is a key problem for many computer vision tasks. While face detection has reached maturity, detecting people under full variation of camera view-points, human poses, lighting conditions and occlusions is still a difficult challenge. In this work we focus on detecting human heads in natural scenes. Starting from the recent R-CNN object detector, ...
Added: October 19, 2017
Perceptually Inspired Layout-Aware Losses for Image Segmentation
Osokin A., Kohli P., in: Lecture Notes in Computer Science. Proceedings of the 13th European Conference on Computer Vision (ECCV 2014)* 2. Vol. 8690. Zürich: Springer, 2014. P. 663–678.
Interactive image segmentation is an important computer vision problem that has numerous real world applications. Models for image segmentation are generally trained to minimize the Hamming error in pixel labeling. The Hamming loss does not ensure that the topology/structure of the object being segmented is preserved and therefore is not a strong indicator of the ...
Added: October 19, 2017