Training Transformers Together
Borzunov A., Ryabinin M., Dettmers T., Lhoest Q., Saulnier L., Diskin M., Jernite Y., Wolf T.
Bayesian Sparsification of Recurrent Neural Networks. 2017.
Recurrent neural networks show state-of-the-art results in many text analysis tasks but often require a lot of memory to store their weights. Recently proposed Sparse Variational Dropout (Molchanov et al., 2017) eliminates the majority of the weights in a feed-forward neural network without significant loss of quality. We apply this technique to sparsify recurrent neural ...
Added: October 19, 2017
In: Supercomputing. RuSCDays 2020. Communications in Computer and Information Science. Vol. 1331: 6th Russian Supercomputing Days, RuSCDays 2020, Moscow, Russia, September 21–22, 2020, Revised Selected Papers. Switzerland: Springer, 2020. P. 634-646.
Added: October 29, 2021
In: 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Seoul: IEEE, 2020. P. 2800-2805.
Added: March 29, 2021
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019). IEEE, 2019. P. 9601-9611.
We introduce ABC-Dataset, a collection of one million Computer-Aided Design (CAD) models for research of geometric deep learning methods and applications. Each model is a collection of explicitly parametrized curves and surfaces, providing ground truth for differential quantities, patch segmentation, geometric feature detection, and shape reconstruction. Sampling the parametric descriptions of surfaces and curves allows ...
Added: November 26, 2019
In: Proceedings of the 7th International Conference on Learning Representations (ICLR 2019). ICLR, 2019. P. 1-17.
Bayesian inference is known to provide a general framework for incorporating prior knowledge or specific properties into machine learning models through a carefully chosen prior distribution. In this work, we propose a new type of prior distribution for convolutional neural networks, the deep weight prior (DWP), which exploits generative models to encourage a specific structure of ...
Added: September 2, 2019
Experience in Organizing Flexible Access to Remote Computing Resources from JupyterLab Environment Using Technologies of Everest and Templet Projects
In: Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2021), Dubna, Russia, July 5-9, 2021. CEUR Workshop Proceedings, 2021. P. 558-561.
The paper describes the experience of building distributed web applications based on the interactive computing technologies of the Jupyter project. A new architecture for such applications is proposed that allows a Jupyter notebook server to be deployed separately from computing resources and to interact with several computing resources simultaneously. These features are implemented ...
Added: October 30, 2022
In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings. Vol. 3. Springer, 2021. P. 677-693.
Added: November 12, 2021
In: Supercomputing. RuSCDays 2018. Communications in Computer and Information Science. Vol. 965. Cham: Springer, 2019. P. 687-698.
High-performance computing plays an increasingly important role in modern science and technology. However, the lack of convenient interfaces and automation tools greatly complicates the widespread use of HPC resources among scientists. The paper presents an approach to solving these problems relying on Everest, a web-based distributed computing platform. The platform enables convenient access to HPC ...
Added: October 19, 2019
In: Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021). Association for Computational Linguistics, 2021. P. 1249-1254.
This work describes our approach to the subtasks of SemEval-2021 Task 8: MeasEval: Counts and Measurements, which took official first place in the competition. To solve all subtasks, we use multi-task learning in a question-answering-like manner. We also use learnable scalar weights to weight each subtask's contribution to the final loss in multi-task training. We fine-tune ...
Added: September 23, 2021
Dagstuhl Publishing, 2021
Welcome to DISC 2021, the 35th International Symposium on Distributed Computing, held on October 4–8, 2021. DISC is an international forum on the theory, design, analysis, and implementation of distributed systems and networks, focusing on distributed computing in all its forms. DISC is organized in cooperation with the European Association for Theoretical Computer Science ...
Added: October 14, 2021
NY : ACM, 2016
Added: August 30, 2018
Association for Computing Machinery (ACM), 2021
Welcome to the 40th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC 2021), held virtually (due to the COVID-19 pandemic) on July 26-30, 2021. PODC is the premier forum for presentation of research on all aspects of distributed computing, including the theory, design, implementation, and applications of distributed algorithms, systems, and networks. This volume contains ...
Added: October 14, 2021
In: Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Kazan, Russia, July 17–19, 2019, Revised Selected Papers. Communications in Computer and Information Science. Vol. 1086. Springer, 2020. P. 154-159.
In this paper, we study deep learning methods for a new multiclass text classification problem: identifying user interests from text messages. We used an original dataset of almost 90 thousand forum text messages, labeled for ten interests. We experimented with different modern neural network architectures: recurrent and convolutional, as well as simpler ...
Added: November 7, 2019
Journal of Industrial Information Integration 2021 Vol. 23 Article 100216
Automated early process fault detection and prediction remains a challenging problem in industrial processes. Traditionally, it has been done by multivariate statistical analysis of sensor readings and, more recently, with the help of machine learning methods. The quality of machine learning models strongly depends on feature engineering, which in turn relies heavily on the expertise of ...
Added: March 21, 2021
Speech decoding from a small set of spatially segregated minimally invasive intracranial EEG electrodes with a compact and interpretable neural network
Journal of Neural Engineering 2022 Vol. 19 No. 6 Article 066016
Objective. Speech decoding, one of the most intriguing brain-computer interface applications, opens up plentiful opportunities, from rehabilitation of patients to direct and seamless communication between humans. Typical solutions rely on invasive recordings with a large number of distributed electrodes implanted through craniotomy. Here we explored the possibility of creating a speech prosthesis in a minimally ...
Added: December 9, 2022
In: Machine Learning in Clinical Neuroimaging and Radiogenomics in Neuro-oncology. Third International Workshop, MLCN 2020, and Second International Workshop, RNO-AI 2020. Lecture Notes in Computer Science. Vol. 12449. Springer, 2020. Ch. 5. P. 45-55.
Electroencephalography (EEG) is a well-established non-invasive technique for measuring brain activity, albeit with limited spatial resolution. Variations in electric conductivity between different tissues distort the electric fields generated by cortical sources, resulting in smeared potential measurements on the scalp. One needs to solve an ill-posed inverse problem to recover the original neural activity. In this article, ...
Added: December 10, 2020
In: European Conference on Visual Perception 2017 Abstract Book. [s.n.], 2017. Ch. 2. P. 18.
Approximately twenty years ago, Laurent Itti and Christof Koch created a saliency map of visual attention in an attempt to recreate the work of biological pyramidal neurons by mimicking neurons with centre-surround receptive fields. The Saliency Model launched many studies that contributed to the understanding of layers of vision and the sphere of visual attention. ...
Added: October 15, 2018
Advanced Computing. 10th International Conference, IACC 2020, Panaji, Goa, India, December 5–6, 2020, Revised Selected Papers, Part II
Series: Communications in Computer and Information Science, vol. 1368 (2021) ...
Added: July 7, 2021
In: Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP 2020). Vol. 4. SciTePress, 2020. P. 214-221.
We propose a novel multi-texture synthesis model based on generative adversarial networks (GANs) with a user-controllable mechanism. User control makes it possible to specify explicitly which texture the model should generate. This property follows from using an encoder part which learns a latent representation for each texture from the dataset. To ensure ...
Added: November 8, 2020
In: The NeurIPS '18 Competition: From Machine Learning to Intelligent Conversations. Springer, 2020. P. 295-315.
Added: February 20, 2021
Frontiers in Genetics 2021 Article 638191
We propose a method for generating an electrocardiogram (ECG) signal for one cardiac cycle using a variational autoencoder. Our goal was to encode the original ECG signal using as few features as possible. Using this method, we extracted a vector of 25 new features, which in many cases can be interpreted. The generated ECG has ...
Added: October 29, 2021
The performance of machine learning methods is heavily dependent on the choice of data representation (or features) on which they are applied. The rapidly developing field of representation learning is concerned with questions surrounding how we can best learn meaningful and useful representations of data. We take a broad view of the field and include ...
Added: October 31, 2018
Intelligent Distributed Computing VII. Proceedings of the 7th International Symposium on Intelligent Distributed Computing - IDC 2013, Prague, Czech Republic, September 2013
Dordrecht, L., Cham, Heidelberg, NY : Springer, 2014
This book represents the combined peer-reviewed proceedings of the Seventh International Symposium on Intelligent Distributed Computing - IDC-2013, of the Second Workshop on Agents for Clouds - A4C-2013, of the Fifth International Workshop on Multi-Agent Systems Technology and Semantics - MASTS-2013, and of the International Workshop on Intelligent Robots - iR-2013. All the events were ...
Added: March 13, 2015
Siberian Journal of Life Sciences and Agriculture 2021 Vol. 13 No. 1 P. 144-155
Background. Development of a convolutional neural network model for detecting cassava diseases from a mobile phone photo. Materials and methods. The research material consisted of images showing various types of cassava diseases, published in open access on the Kaggle platform. Research methods: theory of design and development of information systems, programming, methods of augmentation and extension ...
Added: November 17, 2021