mGPT: Few-Shot Learners Go Multilingual
arXiv, 2022.
Shliazhko O., Fenogenova A., Tikhonova M., Mikhailov V., Kozlova A., Shavrina T.
Recent studies report that autoregressive language models can successfully solve many NLP tasks via zero- and few-shot learning paradigms, which opens up new possibilities for using pre-trained language models. This paper introduces two autoregressive GPT-like models with 1.3 billion and 13 billion parameters trained on 60 languages from 25 language families using Wikipedia and the Colossal Clean Crawled Corpus. We reproduce the GPT-3 architecture using GPT-2 sources and the sparse attention mechanism; the DeepSpeed and Megatron frameworks allow us to parallelize the training and inference steps effectively. The resulting models show performance on par with the recently released XGLM models by Facebook, covering more languages and enhancing NLP possibilities for low-resource languages of CIS countries and Russian small nations. We detail the motivation for the choices of the architecture design, thoroughly describe the data preparation pipeline, and train five small versions of the model to choose the optimal multilingual tokenization strategy. We measure the model perplexity in all covered languages and evaluate it on a wide spectrum of multilingual tasks, including classification, generation, sequence labeling and knowledge probing. The models were evaluated with the zero-shot and few-shot methods. Furthermore, we compared their results on the classification tasks with those of the state-of-the-art multilingual model XGLM. The source code and the mGPT XL model are publicly released.
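As an illustration of the perplexity measurement mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes per-token natural-log probabilities have already been obtained from some language model. The function names `perplexity` and `zero_shot_choice`, the label names, and the numbers are hypothetical, and choosing the label with the lowest perplexity is one common zero-shot scoring scheme assumed here for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity of one sequence from per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token, then exponentiate.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


def zero_shot_choice(log_probs_by_label):
    """Return the label whose candidate text has the lowest perplexity.

    Assumption: each label maps to the token log-probabilities of the
    prompt completed with that label.
    """
    return min(log_probs_by_label, key=lambda lab: perplexity(log_probs_by_label[lab]))


# Hypothetical values for illustration only, not model outputs.
candidates = {
    "positive": [math.log(0.5), math.log(0.5)],
    "negative": [math.log(0.1), math.log(0.1)],
}
print(zero_shot_choice(candidates))
```

A sequence whose tokens each receive probability 0.25 has a perplexity of about 4, which matches the reading of perplexity as an effective branching factor.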
Switzerland : Springer, 2017
This book constitutes the refereed proceedings of the Third Russian Supercomputing Days, RuSCDays 2017, held in Moscow, Russia, in September 2017. The 41 revised full papers and one revised short paper presented were carefully reviewed and selected from 120 submissions. The papers are organized in topical sections on parallel algorithms; supercomputer simulation; high performance architectures, ...
Added: November 15, 2017
20th International Conference, AIED 2019, Chicago, IL, USA, June 25-29, 2019, Proceedings, Part II ...
Added: July 19, 2019
Proceedings of the international conference on Uncertainty in Artificial Intelligence (UAI 2018) ...
Added: October 29, 2018
This book constitutes the refereed post-conference proceedings of the 5th Russian Supercomputing Days, RuSCDays 2019, held in Moscow, Russia, in September 2019. The 60 revised full papers presented were carefully reviewed and selected from 127 submissions. ...
Added: December 11, 2019
Coimbra : Association for Computational Creativity, 2020
Added: September 29, 2020
American Association for Artificial Intelligence (AAAI) Press, 2015
Added: September 18, 2017
Integrated Models and Soft Computing in Artificial Intelligence. Proceedings of the 7th International Scientific and Practical Conference (Kolomna, May 20-22, 2013)
M. : Fizmatlit, 2013
The conference is devoted to the application of integrated models and soft computing in artificial intelligence. ...
Added: May 26, 2013
Added: November 3, 2019
Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16-22
Added: February 25, 2019
Innovations in Digital Economy. First International Conference, SPBPU IDE 2019, St. Petersburg, Russia, October 24–25, 2019, Revised Selected Papers
Switzerland : Springer, 2020
This book constitutes the revised and extended papers of the First International Conference on Innovations in Digital Economy, SPBPU IDE 2019, held in St. Petersburg, Russia, in October 2019. The 8 papers presented were thoroughly reviewed and selected for publication from 78 submissions. The papers are organized according to the following topical sections: economic efficiency and social consequences ...
Added: October 9, 2020
M. : -, 2016
Proceedings of ISP RAS is a double-blind peer-reviewed journal publishing scientific articles in the areas of system programming, software engineering, and computer science. The journal's goal is to develop a respected network of knowledge in the above-mentioned areas by publishing high-quality open-access articles. The journal is intended for researchers, students, and ...
Added: September 14, 2016
The sixteen-volume set comprising the LNCS volumes 11205-11220 constitutes the refereed proceedings of the 15th European Conference on Computer Vision, ECCV 2018, held in Munich, Germany, in September 2018. The 776 revised papers presented were carefully reviewed and selected from 2439 submissions. The papers are organized in topical sections on learning for vision; computational photography; human analysis; ...
Added: October 30, 2018
The State and Perspective of Russian Studies in Artificial Intelligence (Based on the Proceedings of the 13th Russian Conference on Artificial Intelligence with International Participation)
, , Automatic Documentation and Mathematical Linguistics 2013 Vol. 47 No. 1 P. 36-43
The main directions of research in the field of artificial intelligence are presented on the basis of the Proceedings of the 13th Russian Conference on Artificial Intelligence with International Participation. ...
Added: September 28, 2013
Proceedings of Machine Learning Research. Proceedings of the International Conference on Machine Learning (ICML 2017)
Sydney : [s.n.], 2017
Proceedings of Machine Learning Research. Volume 70: International Conference on Machine Learning, 6-11 August 2017, International Convention Centre, Sydney, Australia ...
Added: February 25, 2018
P. : Université Paris 13 - Paris Sorbonne Cité, 2013
In this workshop we will bring together participants who have solutions for one or more of the following problems: How can mutual understanding be optimized with the help of technology in hospitals where both patients and professionals have varying language skills, cultural backgrounds and cognitive capacities? Can domain ontologies, natural language processing tools, multilingual knowledge-based ...
Added: December 18, 2014
L. : IEEE, 2016
The IntelliSys 2016 conference will focus on areas of intelligent systems and artificial intelligence and how they apply to the real world. It is an opportunity for researchers in this field to meet and discuss solutions, scientific results, and methods for solving important problems in this field. Conference topics include, but are not limited to: Artificial ...
Added: February 25, 2017
, , et al., Всероссийский криминологический журнал 2015 Vol. 9 No. 3 P. 423-430
Modern criminalists do not share a common opinion regarding the choice of parameters which could be used to work out a system of characteristics to differentiate a maniac killer from an ordinary person. This hinders the development of efficient software for investigation purposes. The paper describes the experience of developing a neural network that can ...
Added: October 1, 2015
Formal Concept Analysis: 16th International Conference, ICFCA 2021, Strasbourg, France, June 29 – July 2, 2021, Proceedings
This book constitutes the proceedings of the 16th International Conference on Formal Concept Analysis, ICFCA 2021, held in Strasbourg, France, in June/July 2021. The 14 full papers and 5 short papers presented in this volume were carefully reviewed and selected from 32 submissions. The book also contains four invited contributions in full paper length. The research part ...
Added: July 10, 2021
Intelligent Decision Technologies. Proceedings of the 12th KES International Conference on Intelligent Decision Technologies (KES-IDT 2020)
Singapore : Springer, 2020
This volume contains the proceedings of the 12th International KES Conference on Intelligent Decision Technologies (KES-IDT 2020), held as a virtual conference on June 17–19, 2020. KES-IDT is an international annual conference organized by KES International and is a sub-series of the KES Conference series. It is an interdisciplinary conference and provides opportunities for the presentation ...
Added: August 19, 2020
, , et al., Искусственный интеллект и принятие решений 2020 No. 1 P. 3-16
The article is devoted to a review of the latest natural language processing (NLP) technologies that can be applied in strategic analytics. The introduction discusses the main problems in this area and specific tasks that can be solved using NLP tools. The article provides an overview of the main application areas in which these tools ...
Added: May 6, 2020
, , , Прикладная информатика 2014 No. 3(51) P. 128-135
The article describes a new view on positioning the theory of building situational management systems as one of the directions of artificial intelligence that possesses generation mechanisms. The proposed perspective makes it possible to identify the problems whose solution has driven the development of this theory, with the possibility of creating software systems of a new type. Keywords: ...
Added: August 19, 2014
, , , Прикладная информатика 2014 No. 4(52) P. 63-84
The article describes a new view on positioning the theory of building situational management systems as one of the directions of artificial intelligence that possesses generation mechanisms. The proposed perspective makes it possible to identify the problems whose solution has driven the development of this theory, with the possibility of creating software systems of a new type. ...
Added: August 19, 2014
IEEE Computer Society, 2020
ICTAI 2020: The annual IEEE International Conference on Tools with Artificial Intelligence (ICTAI) provides a major international forum where the creation and exchange of ideas related to artificial intelligence are fostered among academia, industry, and government agencies. The conference facilitates the cross-fertilization of these ideas and promotes their transfer into practical tools, for developing intelligent systems ...
Added: January 30, 2021
Parallel Computational Technologies. 12th International Conference, PCT 2018, Rostov-on-Don, Russia, April 2–6, 2018, Revised Selected Papers
Cham : Springer, 2018
This book constitutes the refereed proceedings of the 12th International Conference on Parallel Computational Technologies, PCT 2018, held in Rostov-on-Don, Russia, in April 2018. The 24 revised full papers presented were carefully reviewed and selected from 167 submissions. The papers are organized in topical sections on high performance architectures, tools and technologies; parallel numerical algorithms; supercomputer simulation. ...
Added: March 11, 2019