Deep convolutional neural networks capabilities for binary classification of polar mesocyclones in satellite mosaics
Polar mesocyclones (MCs) are small marine atmospheric vortices. The class of intense MCs, called polar lows, is accompanied by extremely strong surface winds and heat fluxes and thus largely influences deep ocean water formation in the polar regions. Accurate detection of polar mesocyclones in high-resolution satellite data is challenging and, when performed manually, time-consuming. Existing algorithms for the automatic detection of polar mesocyclones are based on the conventional analysis of cloudiness patterns and involve various empirically defined thresholds of geophysical variables. As a result, different detection methods typically yield very different results when applied to a single dataset. We develop a conceptually novel approach for the detection of MCs based on deep convolutional neural networks (DCNNs). As a first step, we demonstrate that a DCNN model is capable of binary classification of 500 × 500 km patches of satellite images with respect to the presence of MC patterns in them. The training dataset is based on a reference database of MCs manually tracked in the Southern Hemisphere from satellite mosaics. We use a subset of this database with MC diameters in the range of 200–400 km. This dataset is further used for testing several different DCNN setups: a DCNN built “from scratch”, a DCNN based on VGG16 pre-trained weights with the Transfer Learning technique, and a DCNN based on VGG16 with the Fine Tuning technique. Each of these networks is further applied to both infrared (IR) and a combination of infrared and water vapor (IR + WV) satellite imagery. The best skill (97% in terms of the binary classification accuracy score) is achieved with the model that averages the estimates of an ensemble of different DCNNs.
The algorithm can be further extended into a numerical scheme for automatic identification and tracking and applied to other atmospheric phenomena that are characterized by a distinct signature in satellite imagery.
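The ensemble step in the mesocyclone abstract above, averaging the estimates of several DCNNs before the binary decision, can be illustrated with a minimal sketch. The probabilities, the labels, the 0.5 threshold and all function names below are hypothetical; the abstract does not specify how the averaging is implemented.

```python
from typing import List, Sequence


def ensemble_average(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average per-patch probabilities over all models in the ensemble."""
    n_samples = len(prob_sets[0])
    if any(len(p) != n_samples for p in prob_sets):
        raise ValueError("every model must score the same number of patches")
    return [sum(p[i] for p in prob_sets) / len(prob_sets) for i in range(n_samples)]


def classify(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """1 = MC pattern present in the patch, 0 = absent (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in probs]


def accuracy(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of patches whose predicted label matches the reference label."""
    return sum(int(a == b) for a, b in zip(pred, truth)) / len(truth)


# Hypothetical outputs of three different networks for four patches.
model_probs = [
    [0.9, 0.4, 0.2, 0.6],
    [0.8, 0.7, 0.1, 0.4],
    [0.7, 0.6, 0.3, 0.3],
]
labels = [1, 1, 0, 0]  # hypothetical reference labels

ensemble_pred = classify(ensemble_average(model_probs))
print("ensemble accuracy:", accuracy(ensemble_pred, labels))
print("first model alone:", accuracy(classify(model_probs[0]), labels))
```

In this toy data the first model misclassifies two patches on its own, while the averaged estimate labels all four correctly, which is the kind of gain an ensemble is meant to provide.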
Autonomous taxis are in high demand in smart city scenarios. Such taxis travel along well-specified paths. Therefore, these vehicles require only two kinds of parameters: detection parameters and control parameters. The detection parameters cover turn detection and obstacle detection; the control parameters cover steering control and speed control. In this paper, a novel autonomous taxi model is proposed for the smart city scenario. Deep learning is used to model the capabilities of a human driver for the autonomous taxi. A hierarchical Deep Neural Network (DNN) architecture is utilized to train various driving aspects. At the first level, the proposed DNN architecture classifies the road as straight or turning. A parallel DNN is used to detect obstacles at level one. At the second level, the DNN discriminates the turn direction, i.e. left or right, for steering and speed control. Two multi-layered DNNs were used on an Nvidia Tesla K40 GPU-based system with a Core i7 processor. The mean squared errors (MSE) for the detection parameters, viz. speed and steering angle, were 0.018 and 0.0248 percent, respectively, with a real-time response delay of 15 milliseconds.
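The two-level hierarchy described in the autonomous taxi abstract above can be sketched as a simple dispatch over three classifiers. This is only an illustration of the control flow: the trained networks are replaced by hypothetical threshold functions, and the `Frame` type, the `Decision` record and all function names are assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Frame = Sequence[float]  # hypothetical stand-in for features of one camera frame


@dataclass(frozen=True)
class Decision:
    obstacle: bool
    road: str  # "straight", "left" or "right"


def hierarchical_decision(
    frame: Frame,
    is_turn: Callable[[Frame], bool],
    has_obstacle: Callable[[Frame], bool],
    turn_side: Callable[[Frame], str],
) -> Decision:
    # Level 1: two classifiers look at the same frame side by side.
    turning = is_turn(frame)
    obstacle = has_obstacle(frame)
    # Level 2: the left/right classifier is consulted only when a turn is reported.
    road = turn_side(frame) if turning else "straight"
    return Decision(obstacle=obstacle, road=road)


# Toy stand-ins for trained networks; frame = (curvature, obstacle_score).
def toy_is_turn(frame: Frame) -> bool:
    return abs(frame[0]) > 0.2


def toy_has_obstacle(frame: Frame) -> bool:
    return frame[1] > 0.5


def toy_turn_side(frame: Frame) -> str:
    return "left" if frame[0] < 0 else "right"


clear_road = hierarchical_decision((0.05, 0.1), toy_is_turn, toy_has_obstacle, toy_turn_side)
blocked_turn = hierarchical_decision((-0.6, 0.9), toy_is_turn, toy_has_obstacle, toy_turn_side)
print(clear_road)
print(blocked_turn)
```

A downstream controller would map such a decision to steering and speed commands; that mapping is not described in the abstract and is left out here.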
It has been shown that the activations invoked by an image within the top layers of a large convolutional neural network provide a high-level descriptor of the visual content of the image. In this paper, we investigate the use of such descriptors (neural codes) within the image retrieval application. In the experiments with several standard retrieval benchmarks, we establish that neural codes perform competitively even when the convolutional neural network has been trained for an unrelated classification task (e.g. Image-Net). We also evaluate the improvement in the retrieval performance of neural codes when the network is retrained on a dataset of images that are similar to images encountered at test time. We further evaluate the performance of the compressed neural codes and show that a simple PCA compression provides very good short codes that give state-of-the-art accuracy on a number of datasets. In general, neural codes turn out to be much more resilient to such compression in comparison to other state-of-the-art descriptors. Finally, we show that discriminative dimensionality reduction trained on a dataset of pairs of matched photographs improves the performance of PCA-compressed neural codes even further. Overall, our quantitative experiments demonstrate the promise of neural codes as visual descriptors for image retrieval.
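As a rough illustration of the PCA compression and retrieval described in the neural codes abstract above, the sketch below compresses toy 4-dimensional vectors to 2 dimensions and ranks a database by similarity to a query. It uses only the standard library and finds principal directions by power iteration with deflation; the toy vectors, the L2 normalization and the cosine ranking are assumptions of this sketch, and real neural codes would be network activations of far higher dimension.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v: Sequence[float]) -> Vector:
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def fit_pca(codes: Sequence[Vector], k: int, iters: int = 100,
            seed: int = 0) -> Tuple[Vector, List[Vector]]:
    """Return the mean and the top-k principal directions (power iteration)."""
    rng = random.Random(seed)
    dim = len(codes[0])
    mean = [sum(c[j] for c in codes) / len(codes) for j in range(dim)]
    centered = [[x - m for x, m in zip(c, mean)] for c in codes]
    components: List[Vector] = []
    for _ in range(k):
        v = l2_normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
        for _ in range(iters):
            # Apply the (unnormalized) covariance matrix as X^T (X v).
            proj = [dot(row, v) for row in centered]
            w = [sum(p * row[j] for p, row in zip(proj, centered)) for j in range(dim)]
            # Deflation: remove the directions that were already found.
            for c in components:
                d = dot(w, c)
                w = [x - d * y for x, y in zip(w, c)]
            v = l2_normalize(w)
        components.append(v)
    return mean, components


def compress(code: Vector, mean: Vector, components: List[Vector]) -> Vector:
    """Project a code onto the principal directions and L2-normalize the result."""
    centered = [x - m for x, m in zip(code, mean)]
    return l2_normalize([dot(centered, c) for c in components])


def retrieve(query: Vector, database: Sequence[Vector]) -> List[int]:
    """Database indices ordered by decreasing cosine similarity to the query."""
    return sorted(range(len(database)), key=lambda i: -dot(query, database[i]))


# Hypothetical "neural codes": two images of one scene, two of another.
codes = [
    [1.0, 0.9, 0.0, 0.1],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.1, 0.0, 0.9, 1.0],
]
mean, components = fit_pca(codes, k=2)
short_codes = [compress(c, mean, components) for c in codes]
query = compress([1.0, 0.8, 0.0, 0.2], mean, components)
ranking = retrieve(query, short_codes)
print(ranking)
```

The sign of each principal direction is arbitrary, but the ranking does not depend on it because the query and the database are projected with the same directions.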
The performance of machine learning methods is heavily dependent on the choice of data representation (or features) on which they are applied. The rapidly developing field of representation learning is concerned with questions surrounding how we can best learn meaningful and useful representations of data. We take a broad view of the field and include topics such as deep learning and feature learning, metric learning, compositional modeling, structured prediction, reinforcement learning, and issues regarding large-scale learning and non-convex optimization. The range of domains to which these techniques apply is also very broad, from vision to speech recognition, text understanding, gaming, music, etc.
Objective: Brain-computer interfaces (BCIs) decode information from neural activity and send it to external devices. The use of Deep Learning approaches for decoding allows for automatic feature engineering within the specific decoding task. Physiologically plausible interpretation of the network parameters ensures the robustness of the learned decision rules and opens the exciting opportunity for automatic knowledge discovery. Approach: We describe a compact convolutional network-based architecture for adaptive decoding of electrocorticographic (ECoG) data into finger kinematics. We also propose a novel theoretically justified approach to interpreting the spatial and temporal weights in architectures that combine adaptation in both space and time. The obtained spatial and frequency patterns characterizing the neuronal populations pivotal to the specific decoding task can then be interpreted by fitting appropriate spatial and dynamical models. Main results: We first tested our solution using realistic Monte-Carlo simulations. Then, when applied to the ECoG data from the Berlin BCI Competition IV dataset, our architecture performed comparably to the competition winners without requiring explicit feature engineering. Using the proposed approach to interpreting the network weights, we could unravel the spatial and spectral patterns of the neuronal processes underlying the successful decoding of finger kinematics from an ECoG dataset. Finally, we also applied the entire pipeline to the analysis of a 32-channel EEG motor-imagery dataset and observed physiologically plausible patterns specific to the task. Significance: We described a compact and interpretable CNN architecture derived from basic principles and encompassing the knowledge in the field of neural electrophysiology. For the first time in the context of such multibranch architectures with factorized spatial and temporal processing, we presented theoretically justified weight interpretation rules.
We verified our recipes using simulations and real data and demonstrated that the proposed solution offers a good decoder and a tool for investigating motor control neural mechanisms.
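The phrase "factorized spatial and temporal processing" in the abstract above can be made concrete with a small sketch of a single branch: a spatial filter collapses the channels into one virtual signal, and a temporal filter is then applied to that signal. The weights and data below are hypothetical toy values, and the sketch omits everything else a real decoder needs (nonlinearities, multiple branches, training).

```python
from typing import List, Sequence


def spatial_filter(x: Sequence[Sequence[float]], w: Sequence[float]) -> List[float]:
    """Weighted sum over channels; x is indexed as x[channel][time]."""
    n_times = len(x[0])
    return [sum(w[c] * x[c][t] for c in range(len(w))) for t in range(n_times)]


def temporal_filter(s: Sequence[float], h: Sequence[float]) -> List[float]:
    """Causal FIR filter, keeping only samples where the full kernel fits."""
    k = len(h)
    return [sum(h[j] * s[t - j] for j in range(k)) for t in range(k - 1, len(s))]


# Hypothetical two-channel recording with four time samples.
recording = [
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 1.0, 0.0, 1.0],
]
spatial_weights = [1.0, -1.0]   # assumed values: difference of the two channels
temporal_weights = [0.5, 0.5]   # assumed values: two-point moving average

virtual_channel = spatial_filter(recording, spatial_weights)
branch_output = temporal_filter(virtual_channel, temporal_weights)
print(virtual_channel)
print(branch_output)
```

Because the two stages are separate, the spatial weights and the temporal weights can be inspected independently, which is what makes weight interpretation in such architectures tractable.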
Brain-computer interfaces find application in a number of different areas and have the potential to be used for research as well as for practical purposes. The clinical use of BCI includes current studies on neurorehabilitation ([Frolov et al., 2013; Ang et al., 2010]), and there is the prospect of using BCI to restore movement and communication capabilities, providing alternative effective pathways to those that may be lost due to injury or illness. The processing of electrophysiological data requires analysis of high-dimensional, nonstationary, noisy signals reflecting complex underlying processes and structures. We have shown that for non-invasive neuroimaging methods such as EEG the potential improvement lies in the field of machine learning and involves designing data analysis algorithms that can model physiological and psychoemotional variability of the user. The development of such algorithms can be conducted in different ways, including the classical Bayesian paradigm as well as modern deep learning architectures. The interpretation of nonlinear decision rules implemented by multilayer structures would enable automatic and objective knowledge extraction from the neurocognitive experiments data. Despite the advantages of non-invasive neuroimaging methods, a radical increase in the bandwidth of the BCI communication channel and the use of this technology for the prosthesis control is possible only through invasive technologies. Electrocorticogram (ECoG) is the least invasive of such technologies, and in the final part of this work we demonstrate the possibility of using ECoG to decode the kinematic characteristics of the finger movement.
We propose a novel multi-texture synthesis model based on generative adversarial networks (GANs) with a user-controllable mechanism. The user control makes it possible to explicitly specify the texture that the model should generate. This property follows from using an encoder that learns a latent representation for each texture in the dataset. To ensure dataset coverage, we use an adversarial loss function that penalizes incorrect reproductions of a given texture. In experiments, we show that our model can learn descriptive texture manifolds for large datasets and from raw data such as a collection of high-resolution photos. We show that our unsupervised learning pipeline may help segmentation models. Moreover, we apply our method to produce 3D textures and show that it outperforms existing baselines.
A new public dataset of traffic sign images is presented. The dataset is intended for training and testing the algorithms of traffic sign recognition. We describe the dataset structure and guidelines for working with the dataset, comparing it with the previously published traffic sign datasets. The evaluation of modern detection and classification algorithms conducted using the proposed dataset has shown that existing methods of recognition of a wide class of traffic signs do not achieve the accuracy and completeness required for a number of applications.
A model for organizing cargo transportation between two node stations connected by a railway line that contains a certain number of intermediate stations is considered. Cargo moves in one direction. Such a situation may occur, for example, if one of the node stations is located in a region that produces raw material for a manufacturing industry located in another region, where the other node station is situated. The organization of freight traffic is performed by means of a number of technologies. These technologies determine the rules for taking on cargo at the initial node station, the rules of interaction between neighboring stations, and the rule for distributing cargo to the final node stations. The process of cargo transportation follows a set rule of control. For such a model, one must determine the possible modes of cargo transportation and describe their properties. The model is described by a finite-dimensional system of differential equations with nonlocal linear restrictions. The class of solutions satisfying the nonlocal linear restrictions is extremely narrow. This makes it necessary to “correctly” extend solutions of the system of differential equations to a class of quasi-solutions whose distinctive feature is gaps at a countable number of points. Using the fourth-order Runge–Kutta method, we were able to construct these quasi-solutions numerically and determine their rate of growth. Technically, the main difficulty was obtaining quasi-solutions that satisfy the nonlocal linear restrictions. Furthermore, we investigated how the quasi-solutions and, in particular, the sizes of the gaps (jumps) of the solutions depend on a number of model parameters characterizing the rule of control, the technologies for cargo transportation, and the intensity of cargo supply at a node station.
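For reference, the classical fourth-order Runge–Kutta method mentioned in the abstract above can be sketched as follows. The abstract does not give the model's equations or its nonlocal restrictions, so the right-hand side used here (a single decaying quantity, dy/dt = -y) is a purely illustrative assumption, as are the function names.

```python
import math
from typing import Callable, List, Sequence

State = List[float]
Rhs = Callable[[float, Sequence[float]], State]


def rk4_step(f: Rhs, t: float, y: Sequence[float], h: float) -> State:
    """One classical fourth-order Runge-Kutta step for the system y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def integrate(f: Rhs, t0: float, y0: Sequence[float], h: float, steps: int) -> State:
    """Advance the state by a fixed number of equal steps."""
    t, y = t0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


def toy_rhs(t: float, y: Sequence[float]) -> State:
    # Illustrative right-hand side only; not the cargo transportation model.
    return [-y[0]]


approx = integrate(toy_rhs, 0.0, [1.0], h=0.1, steps=10)
print("RK4 value at t = 1:", approx[0])
print("exact value exp(-1):", math.exp(-1.0))
```

Constructing quasi-solutions with jumps would additionally require restarting the integration at each gap point with an adjusted state, which depends on restrictions that are not specified in the abstract.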
Event logs collected by modern information and technical systems usually contain enough data for automated discovery of process models. A variety of algorithms has been developed for process model discovery, conformance checking, log-to-model alignment, comparison of process models, etc.; nevertheless, quick analysis of ad hoc selected parts of a journal still has no full-fledged implementation. This paper describes a ROLAP-based method of multidimensional event log storage for process mining. The result of the analysis of the journal is visualized as a directed graph representing the union of all possible event sequences, ranked by their occurrence probability. Our implementation allows the analyst to discover process models for sublogs defined by ad hoc selection of criteria and of the occurrence probability value.
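One way to picture the directed graph described in the abstract above is a transition graph built from a sublog and pruned by a probability threshold. The sketch below is an in-memory illustration only: the event fields (`case`, `activity`, `dept`), the sample events and the definition of edge probability as the share of transitions leaving an activity are all assumptions, and the ROLAP storage itself is not modelled.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Event = Dict[str, str]
Edge = Tuple[str, str]


def build_traces(events: Iterable[Event],
                 keep: Callable[[Event], bool] = lambda e: True) -> List[List[str]]:
    """Group the selected events by case; events are assumed to be in time order."""
    traces: Dict[str, List[str]] = defaultdict(list)
    for event in events:
        if keep(event):
            traces[event["case"]].append(event["activity"])
    return list(traces.values())


def transition_graph(traces: Iterable[List[str]], min_prob: float = 0.0) -> Dict[Edge, float]:
    """Edges a -> b weighted by the share of transitions leaving a that go to b."""
    counts: Counter = Counter()
    outgoing: Counter = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
            outgoing[a] += 1
    graph = {edge: n / outgoing[edge[0]] for edge, n in counts.items()}
    return {edge: p for edge, p in graph.items() if p >= min_prob}


# Hypothetical event log with one extra dimension ("dept") for ad hoc selection.
log = [
    {"case": "1", "activity": "register", "dept": "A"},
    {"case": "2", "activity": "register", "dept": "A"},
    {"case": "3", "activity": "register", "dept": "B"},
    {"case": "1", "activity": "check", "dept": "A"},
    {"case": "2", "activity": "check", "dept": "A"},
    {"case": "3", "activity": "approve", "dept": "B"},
    {"case": "1", "activity": "approve", "dept": "A"},
    {"case": "2", "activity": "reject", "dept": "A"},
]

full_graph = transition_graph(build_traces(log))
dept_a_graph = transition_graph(build_traces(log, keep=lambda e: e["dept"] == "A"))
frequent_only = transition_graph(build_traces(log), min_prob=0.5)

for edge, prob in sorted(full_graph.items(), key=lambda item: -item[1]):
    print(edge, round(prob, 3))
```

The `keep` predicate plays the role of the analyst's ad hoc selection criteria, and `min_prob` the role of the occurrence probability value used to prune rare transitions.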
Existing approaches suggest that IT strategy should be a reflection of business strategy. In practice, however, organisations often do not follow the business strategy even if it is formally declared. Under these conditions, IT strategy can be viewed not as a plan but as a shared organisational view of the role of information systems. This approach generally reflects only a top-down perspective on IT strategy. It can therefore be supplemented by a strategic behaviour pattern (i.e., a more or less standard response to changes, formed as a result of previous experience) to implement a bottom-up approach. Two components that can help to establish an effective reaction to new IT initiatives are proposed here: a model of IT-related decision making, and an efficiency measurement metric to estimate the maturity of business processes and the appropriate IT. The use of the proposed tools is demonstrated in practical cases.