Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
Overestimation bias is one of the major impediments to accurate off-policy learning. This paper investigates a novel way to alleviate the overestimation bias in a continuous control setting. Our method, Truncated Quantile Critics (TQC), blends three ideas: distributional representation of a critic, truncation of the critics' predictions, and ensembling of multiple critics. Distributional representation and truncation allow for arbitrarily fine-grained control of overestimation, while ensembling provides additional score improvements. TQC outperforms the current state of the art on all environments from the continuous control benchmark suite, demonstrating a 25% improvement on the most challenging Humanoid environment.
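The core of the truncation idea in the abstract above can be sketched as follows: each critic outputs a set of quantile estimates of the return, all estimates are pooled and sorted, and the largest atoms are dropped before averaging, which biases the target downward to counteract overestimation. This is a minimal illustrative sketch, not the authors' implementation; the function name and the per-critic drop count are assumptions.

```python
import numpy as np

def truncated_target(quantiles_per_critic, drop_per_critic=2):
    """Pool quantile atoms from all critics, sort them, and drop the
    top `drop_per_critic * n_critics` atoms before averaging.

    quantiles_per_critic: list of 1-D arrays, one per critic, each holding
    that critic's quantile estimates of the return for a single state-action.
    """
    pooled = np.sort(np.concatenate(quantiles_per_critic))
    n_drop = drop_per_critic * len(quantiles_per_critic)
    kept = pooled[:len(pooled) - n_drop]  # discard the largest atoms
    return kept.mean()

# Two critics, four quantile atoms each; dropping one atom per critic
# removes the two largest pooled values (the two 4.0 atoms).
critics = [np.array([1.0, 2.0, 3.0, 4.0]), np.array([1.0, 2.0, 3.0, 4.0])]
target = truncated_target(critics, drop_per_critic=1)  # mean of [1,1,2,2,3,3] = 2.0
```

Dropping atoms from the pooled, sorted set (rather than per critic) is what makes the amount of underestimation tunable at the granularity of a single quantile atom.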
We present a model for freight train travel time prediction based on station network analysis and dedicated feature engineering. We describe the first pipeline to improve freight trip duration prediction in Russia. While every freight company uses only the reference book issued by RZD (Russian Railways), which is based on railroad distances and has an accuracy measured in days, we argue that one can predict trip duration with an error of less than twenty hours, decreasing to twelve hours for certain types of freight trains.
A measurement of the charm-mixing parameter yCP using D0 → K+K−, D0 → π+π−, and D0 → K−π+ decays is reported. The D0 mesons are required to originate from semimuonic decays of B− and B0 mesons. These decays are partially reconstructed in a data set of proton-proton collisions at center-of-mass energies of 7 and 8 TeV collected with the LHCb experiment and corresponding to an integrated luminosity of 3 fb−1. The yCP parameter is measured to be (0.57 ± 0.13 (stat) ± 0.09 (syst))%, in agreement with, and as precise as, the current world-average value.
Online social networks have become an essential communication channel for the broad and rapid sharing of information. Currently, the mechanics of such information sharing is captured by the notion of cascades, which are tree-like networks composed of (re)sharing actions. However, it is still unclear what factors drive cascade growth. Moreover, there is a lack of studies outside Western countries and platforms such as Facebook and Twitter. In this work, we aim to investigate which factors contribute to the scope of information cascading and how to predict this variation accurately. We examine six machine learning algorithms for their predictive and interpretative capabilities concerning cascades' structural metrics (width, mass, and depth). To do so, we use data from a leading Russian-language online social network, VKontakte, capturing cascades of 4,424 messages posted by 14 news outlets during a year. The results show that the best models in terms of predictive power are the Gradient Boosting algorithm for width and depth and the Lasso Regression algorithm for the mass of a cascade, while depth is the least predictable metric. We find that the most potent factor associated with cascade size is the number of reposts at its origin level. We examine its role along with other factors such as content features and characteristics of sources and their audiences.
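The modeling setup described in this abstract (Gradient Boosting for width/depth, Lasso for mass) can be sketched with standard scikit-learn estimators. This is a hypothetical illustration on synthetic data: the feature names, data, and hyperparameters are assumptions, not the study's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n = 200
# Illustrative features: origin-level reposts, source audience size, content length.
X = rng.normal(size=(n, 3))
# Synthetic targets loosely mimicking the abstract's setup: width driven mostly
# by the first feature (origin-level reposts), mass by a sparse linear mix.
width = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=n)
mass = X @ np.array([1.5, 0.0, 0.5]) + rng.normal(scale=0.1, size=n)

width_model = GradientBoostingRegressor(random_state=0).fit(X, width)
mass_model = Lasso(alpha=0.1).fit(X, mass)
```

A side benefit of the Lasso choice for mass, consistent with the abstract's interpretability goal, is that uninformative features receive coefficients driven toward zero, making the surviving factors easy to read off.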
The law of accelerating returns describes the acceleration of technological progress: tools are used to develop more advanced tools, which in turn are applied to create even more advanced tools, and so on. A similar idea has been implemented in algorithms for advancing artificial intelligence. In this paper, the results of applying these algorithms to games are discussed. Real-life tasks, however, are more complicated. The game-theoretic approach can be applied to move from theoretical, unrealistic games to more complex and practical tasks. Applications of the game-theoretic approach to advancing artificial intelligence in tasks from the credit industry are proposed.
Proceedings of Machine Learning Research: Volume 97: International Conference on Machine Learning, 9-15 June 2019, Long Beach, California, USA
The Fifth HCT Information Technology Trends conference (ITT 2018) is a major international research conference for the presentation of innovative ideas, approaches, technologies, research findings and outcomes, best practices and case studies, national and international projects, and institutional standards and policies on Emerging Technologies for Artificial Intelligence. ITT 2018 will provide an outstanding forum for researchers, practitioners, students, policy makers, and users to exchange ideas, techniques, and tools, raise awareness, and share experiences related to all practical and theoretical aspects of Emerging Technologies for Artificial Intelligence, so as to develop solutions in communications, computer science and engineering, and control systems, as well as interdisciplinary research and applications.
The book by Adrian Mackenzie, a professor in the Department of Sociology at Lancaster University, is of a kind unprecedented within the emerging, but still limited, literature in the humanities and social sciences that explores how machine learning (ML) works. The spectacular advances of this branch of artificial intelligence (AI) in recent years have eclipsed the other approaches in the field and have suddenly turned AI into a social and political problem. Several authors have already insisted on the need to focus attention on the tools of AI themselves, pointing out the limits of work that deals only with the social effects of "algorithms". As the anthropologist of science and technology Nick Seaver observes, most work on the subject frets about "algorithms" or "big data", insisting on their harmful, even catastrophic, effects on society without ever specifying exactly what they are. Yet the transfer of knowledge and perspectives between specialists in AI and in the humanities and social sciences (in both directions, moreover) is indispensable for producing an informed and effective critique.
A search for CP violation in the Cabibbo-suppressed D0 → K+K−π+π− decay mode is performed using an amplitude analysis. The measurement uses a sample of pp collisions recorded by the LHCb experiment during 2011 and 2012, corresponding to an integrated luminosity of 3.0 fb−1. The D0 mesons are reconstructed from semileptonic b-hadron decays into D0μ−X final states. The selected sample contains more than 160 000 signal decays, allowing the most precise amplitude modelling of this D0 decay to date. The obtained amplitude model is used to perform the search for CP violation. The result is compatible with CP symmetry, with a sensitivity ranging from 1% to 15% depending on the amplitude considered.