Proceedings of the Workshop of the 5th International Conference on Learning Representations (ICLR)
In this paper, we consider the problem of fine-tuning a discrete event simulator of a distributed storage system using a neural network trained on real data with reinforcement learning algorithms. The simulator has a set of control parameters that affect its behaviour and can be tuned during the simulation. Varying these parameters changes how realistic the simulation is. The problem of simulator tuning is therefore equivalent to discovering an optimal control strategy that leads to sensible results. We investigate different optimization metrics and demonstrate the viability of the approach.
This research is motivated by the sustainability problems of oil palm expansion. Fast-growing industrial Oil Palm Plantations (OPPs) in the tropical belt of Africa, Southeast Asia and parts of Brazil lead to significant loss of rainforest and contribute to global warming through the corresponding decrease in carbon dioxide absorption. We propose a novel approach to monitoring the expansion of OPPs, based on applying state-of-the-art Fully Convolutional Neural Networks (FCNs) to the semantic segmentation problem for Landsat imagery. The proposed approach significantly outperforms per-pixel classification methods based on Random Forest using texture features, NDVI, and all Landsat bands. Moreover, the trained FCN is robust to spatial and temporal shifts of the input data. The paper provides a proof of concept that FCNs, as semi-automated methods, enable OPP mapping of entire countries and may serve for yearly detection of oil palm expansion.
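The per-pixel baseline above uses NDVI as one of its features. As a reminder of what that feature computes, the following is a minimal sketch, assuming the near-infrared and red bands are already available as nested lists of reflectance values. The function names `ndvi` and `ndvi_image` are illustrative and do not come from the paper.

```python
def ndvi(nir, red, eps=1e-12):
    """Normalized Difference Vegetation Index for a single pixel.

    NDVI = (NIR - Red) / (NIR + Red). Returns 0.0 when both bands are
    (numerically) zero, which is a convention assumed here, not the paper's.
    """
    denom = nir + red
    if abs(denom) < eps:
        return 0.0
    return (nir - red) / denom


def ndvi_image(nir_band, red_band):
    """Apply `ndvi` pixel by pixel to two equally sized 2-D bands."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]
```

Dense vegetation reflects strongly in the near-infrared and weakly in red, so its NDVI is close to 1, while bare soil and water give values near or below 0.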
Intelligent Systems Conference (IntelliSys) 2018 is the fourth research conference in the series. The conference is part of the SAI conferences, which have been held since 2013. Each year the series has featured keynote talks, special sessions, poster presentations, tutorials, workshops, and contributed papers. The conference focuses on areas of intelligent systems and artificial intelligence (AI) and how they apply to the real world. IntelliSys is one of the best respected Artificial Intelligence (AI) conferences.
We present a probabilistic model with discrete latent variables that control the computation time in deep learning models such as ResNets and LSTMs. A prior on the latent variables expresses the preference for faster computation. The amount of computation for an input is determined via amortized maximum a posteriori (MAP) inference. MAP inference is performed using a novel stochastic variational optimization method. The recently proposed adaptive computation time mechanism can be seen as an ad-hoc relaxation of this model. We demonstrate training using the general-purpose concrete relaxation of discrete variables. Evaluation on ResNet shows that our method matches the speed-accuracy trade-off of adaptive computation time, while allowing for evaluation with a simple deterministic procedure that has a lower memory footprint.
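The concrete relaxation mentioned above replaces a discrete variable with a continuous one that can be differentiated through. The following is a minimal sketch for the binary case, assuming each latent variable is a Bernoulli decision parameterized by a logit. The function name, the `temperature` argument and the clipping constant are illustrative choices, not details taken from the paper.

```python
import math
import random


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sample_binary_concrete(logit, temperature, rng=random):
    """Draw one relaxed Bernoulli sample in the interval [0, 1].

    logit:       log-odds of the underlying Bernoulli variable
    temperature: positive scalar; smaller values push samples towards 0 or 1
    rng:         any object with a .random() method (assumed interface)
    """
    # Clip the uniform draw away from 0 and 1 so the logs stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    logistic_noise = math.log(u) - math.log(1.0 - u)
    return _sigmoid((logit + logistic_noise) / temperature)
```

At a low temperature the samples are nearly binary, which approximates the discrete variable, while a higher temperature gives smoother samples and therefore smoother gradients with respect to `logit`.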
This paper deals with the automatic classification of questions in the Russian language. In contrast to previously used methods, we introduce a convolutional neural network for question classification. We took advantage of an existing corpus of 2008 questions, manually annotated in accordance with a pragmatic 14-class typology. We modified the data by reducing the typology to 13 classes, expanding the dataset and improving the representativeness of some of the question types. For supervised learning, the training data were represented as a combination of word embeddings and binary regular-expression-based features. As the baseline model, we used a state-of-the-art Russian-language question classification algorithm: an SVM classifier with a linear kernel and questions represented as word trigram counts (60.22% accuracy on the new dataset). We also tested several widely used machine learning methods (logistic regression, Bernoulli Naïve Bayes) trained on the new question representation. The best result, 72.38% accuracy (micro), was achieved with the CNN model. We also ran feature selection experiments with a simple Multinomial Naïve Bayes classifier, using word features only, Add-1 smoothing and no strategy for out-of-vocabulary words. Surprisingly, the setting with the top 1200 informative word features (by PPMI) and equal priors achieved only slightly lower accuracy, 70.72%, which also beats the baseline by a large margin.
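The Multinomial Naïve Bayes setting described above (word features only, Add-1 smoothing, equal priors) can be sketched as follows. This is a minimal illustration under stated assumptions: the vocabulary is taken as already selected, out-of-vocabulary words are simply skipped, and the function names are hypothetical rather than the authors' code.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, vocabulary):
    """Estimate log P(word | class) with Add-1 (Laplace) smoothing.

    docs:       list of token lists
    labels:     list of class labels, one per document
    vocabulary: set of selected word features; other words are ignored
    """
    counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        counts[label].update(t for t in tokens if t in vocabulary)

    vocab_size = len(vocabulary)
    log_likelihood = {}
    for label, word_counts in counts.items():
        total = sum(word_counts.values())
        log_likelihood[label] = {
            w: math.log((word_counts[w] + 1) / (total + vocab_size))
            for w in vocabulary
        }
    return log_likelihood


def classify(tokens, log_likelihood):
    """Pick the class with the highest summed log-likelihood.

    With equal priors the prior term is the same for every class,
    so it is omitted from the score.
    """
    scores = {
        label: sum(ll[t] for t in tokens if t in ll)
        for label, ll in log_likelihood.items()
    }
    return max(scores, key=scores.get)
```

In this sketch, restricting `vocabulary` to the top-ranked words is where a PPMI-based selection step would plug in.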
The performance of machine learning methods is heavily dependent on the choice of data representation (or features) on which they are applied. The rapidly developing field of representation learning is concerned with questions surrounding how we can best learn meaningful and useful representations of data. We take a broad view of the field and include topics such as deep learning and feature learning, metric learning, compositional modeling, structured prediction, reinforcement learning, and issues regarding large-scale learning and non-convex optimization. The range of domains to which these techniques apply is also very broad, from vision to speech recognition, text understanding, gaming, music, etc.
Brain-computer interfaces (BCIs) find application in a number of different areas and have the potential to be used for research as well as for practical purposes. The clinical use of BCIs includes current studies on neurorehabilitation (Frolov et al., 2013; Ang et al., 2010), and there is the prospect of using BCIs to restore movement and communication capabilities, providing effective alternative pathways to those that may be lost due to injury or illness. The processing of electrophysiological data requires analysis of high-dimensional, nonstationary, noisy signals reflecting complex underlying processes and structures. We have shown that for non-invasive neuroimaging methods such as EEG, the potential for improvement lies in the field of machine learning and involves designing data analysis algorithms that can model the physiological and psychoemotional variability of the user. Such algorithms can be developed in different ways, including the classical Bayesian paradigm as well as modern deep learning architectures. Interpreting the nonlinear decision rules implemented by multilayer structures would enable automatic and objective knowledge extraction from the data of neurocognitive experiments. Despite the advantages of non-invasive neuroimaging methods, a radical increase in the bandwidth of the BCI communication channel and the use of this technology for prosthesis control are possible only through invasive technologies. The electrocorticogram (ECoG) is the least invasive of such technologies, and in the final part of this work we demonstrate the possibility of using ECoG to decode the kinematic characteristics of finger movement.
GRU-based recurrent neural networks (RNNs) for constructing recommendation systems are proposed. Such systems are mainly developed by large companies for specific domains, while small companies do not have the resources needed to develop their own. They therefore need a universal recommendation system (or recommender platform) that is automatically customized for a specific domain. Such a system allows companies whose services are still under development to build their own recommendation system from scratch. An RNN-based approach is proposed for session-based recommendation with automatic modelling of the domain. This approach is based on content analysis of the web sites. Several modifications to classic RNNs that make them more viable for this specific problem, such as a ranking loss function, are considered. The general scheme of the approach and the architecture of a recommendation system based on this scheme are described in this paper.
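The abstract does not say which ranking loss is used. As an illustration only, the following sketch shows one common pairwise choice, a BPR-style loss, which rewards the network when the item the user actually visited next scores higher than sampled negative items. The function name and the score inputs are assumptions made for this example.

```python
import math


def _log_sigmoid(x):
    """Numerically stable log of the logistic function."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def pairwise_ranking_loss(positive_score, negative_scores):
    """BPR-style loss for one step of a session (illustrative choice).

    positive_score:  network score of the item actually visited next
    negative_scores: non-empty list of scores of sampled other items
    Returns the mean of -log(sigmoid(positive - negative)).
    """
    terms = [_log_sigmoid(positive_score - s) for s in negative_scores]
    return -sum(terms) / len(terms)
```

The loss is small when the positive item outranks the negatives by a wide margin and equals log 2 when the scores are tied, so minimizing it pushes the correct next item up the ranking rather than fitting absolute score values.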
This workshop aims to bring together researchers, educators, and practitioners who are interested in techniques for, and applications of, compact and efficient neural network representations. One main theme of the workshop discussion is to build consensus in this rapidly developing field and, in particular, to establish a close connection between researchers in the machine learning community and engineers in industry. We believe the workshop will benefit both academic researchers and industrial practitioners.