Application of Kullback-Leibler divergence for short-term user interest detection
In this paper we show how several similarity measures can be combined to compute the similarity between a pair of users for Collaborative Filtering in Recommender Systems. By aggregating several measures we find super similar and super dissimilar user pairs and assign a different similarity value to each of these types of pairs. We also introduce another type of similarity relationship, which we call medium similar user pairs, and use traditional JMSD to assign similarity values to them. By experimentation with real data we show that our method for finding similarity by aggregation performs better than each of the individual similarity metrics. Moreover, as we apply all the traditional metrics in the same setting, we can assess their relative performance.
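As a rough illustration of the JMSD measure mentioned above, the sketch below assumes the common definition JMSD = Jaccard x (1 - MSD), with the mean squared difference taken over co-rated items after scaling ratings by the rating range. The function name `jmsd`, the dictionary representation of a user, and the 1-to-5 rating scale are assumptions made for the example; the abstract does not specify them, and the aggregation rule for super similar and super dissimilar pairs is not reproduced here.

```python
def jmsd(ratings_u, ratings_v, r_min=1.0, r_max=5.0):
    """Jaccard x (1 - MSD) similarity between two users.

    ratings_u, ratings_v: dicts mapping item id -> rating (hypothetical layout).
    r_min, r_max: bounds of the rating scale (assumed 1..5 here).
    """
    common = ratings_u.keys() & ratings_v.keys()
    union = ratings_u.keys() | ratings_v.keys()
    if not common:
        # No co-rated items: treat the pair as having zero similarity.
        return 0.0
    jaccard = len(common) / len(union)
    span = r_max - r_min
    # Mean squared difference of ratings scaled to [0, 1].
    msd = sum(((ratings_u[i] - ratings_v[i]) / span) ** 2 for i in common) / len(common)
    return jaccard * (1.0 - msd)


if __name__ == "__main__":
    alice = {"a": 5, "b": 3, "c": 4}
    bob = {"a": 4, "b": 3, "d": 2}
    print(jmsd(alice, bob))
```

Multiplying by the Jaccard index penalises pairs that agree on only a few co-rated items relative to everything they have rated.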
This volume contains the papers presented at the ACM RecSys Challenge 2015 workshop held on September 16, 2015, in Vienna, Austria. The challenge offered participants the opportunity to work on a large-scale e-commerce dataset from a big retailer in Europe. Participants tackled the problem of predicting what items a user intends to purchase, if any, given a click sequence performed during an activity session on the e-commerce website. The challenge was launched on November 15, 2014, and ran for seven months, attracting 850 teams from 49 countries which submitted a total of 5,437 solutions. The winners were determined based on the final ranking of the scores at the end of the challenge. However, in order to receive the monetary prize, the participants were required to submit, and have accepted, a paper detailing the applied algorithms, and attend the challenge's workshop. There were 22 submissions, and each submission was reviewed by at least two program committee members. The following table contains a summary of the 12 accepted papers and the corresponding score and rank in the final leaderboard.
The 4th International Conference on Educational Data Mining (EDM 2011) brings together researchers from computer science, education, psychology, psychometrics, and statistics to analyze large datasets to answer educational research questions. The conference, held in Eindhoven, The Netherlands, July 6-9, 2011, follows the three previous editions (Pittsburgh 2010, Cordoba 2009 and Montreal 2008), and a series of workshops within the AAAI, AIED, EC-TEL, ICALT, ITS, and UM conferences. The increase of e-learning resources such as interactive learning environments, learning management systems, intelligent tutoring systems, and hypermedia systems, as well as the establishment of state databases of student test scores, has created large repositories of data that can be explored to understand how students learn. The EDM conference focuses on data mining techniques for using these data to address important educational questions.
Understanding the relation between (sensory) stimuli and the activity of neurons (i.e., "the neural code") lies at the heart of understanding the computational properties of the brain. However, quantifying the information between a stimulus and a spike train has proven to be challenging. We propose a new (in vitro) method to measure how much information a single neuron transfers from the input it receives to its output spike train. The input is generated by an artificial neural network that responds to a randomly appearing and disappearing "sensory stimulus": the hidden state. The sum of this network activity is injected as current input into the neuron under investigation. The mutual information between the hidden state on the one hand and spike trains of the artificial network or the recorded spike train on the other hand can easily be estimated due to the binary shape of the hidden state. The characteristics of the input current, such as the time constant as a result of the (dis)appearance rate of the hidden state or the amplitude of the input current (the firing frequency of the neurons in the artificial network), can be varied independently. As an example, we apply this method to pyramidal neurons in the CA1 of mouse hippocampi and compare the recorded spike trains to the optimal response of the "Bayesian neuron" (BN). We conclude that, as in the BN, information transfer in hippocampal pyramidal cells is non-linear and amplifying: the information loss between the artificial input and the output spike train is high if the input to the neuron (the firing of the artificial network) is not very informative about the hidden state. If the input to the neuron does contain a lot of information about the hidden state, the information loss is low. Moreover, neurons increase their firing rates when the (dis)appearance rate is high, so that the (relative) amount of transferred information stays constant.
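To make the quantity being measured concrete, the sketch below computes a plug-in (empirical-frequency) estimate of the mutual information between a binary hidden state and a binarised response, one sample per time bin. This is a generic textbook estimator and not necessarily the estimation procedure used in the study; the function name `mutual_information_bits` and the per-bin binarisation of the spike train are assumptions made for the example.

```python
from collections import Counter
import math


def mutual_information_bits(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


if __name__ == "__main__":
    hidden = [0, 0, 1, 1, 1, 0, 0, 1]   # hypothetical hidden state per time bin
    spikes = [0, 0, 1, 1, 0, 0, 1, 1]   # hypothetical spike/no-spike per time bin
    print(mutual_information_bits(hidden, spikes))
```

Plug-in estimates are biased upward for short recordings, so in practice a bias correction or a longer sample would be needed before comparing conditions.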
We propose a new approach to collaborative filtering based on Boolean Matrix Factorisation (BMF) and Formal Concept Analysis. In a series of experiments on real data (the MovieLens dataset) we compare the approach with an SVD-based one in terms of Mean Absolute Error (MAE). One experimental finding is that binary-scaled rating data are enough for BMF to obtain almost the same MAE as the SVD-based algorithm achieves on non-scaled data.
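For reference, the MAE criterion used in this comparison is the average absolute gap between predicted and observed ratings. The sketch below shows that computation together with one possible binary scaling of ratings; the threshold of 3 and the function names `mae` and `binarize` are assumptions made for the example, since the abstract does not state how the scaling was done.

```python
def mae(predicted, actual):
    """Average of |predicted - actual| over paired ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def binarize(ratings, threshold=3):
    """Map ratings to 1 (above threshold) or 0 (otherwise); threshold is assumed."""
    return [1 if r > threshold else 0 for r in ratings]


if __name__ == "__main__":
    observed = [5, 3, 4, 1]
    predicted = [4.5, 3.5, 4.0, 2.0]
    print(mae(predicted, observed))
    print(binarize(observed))
```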
A method for the phonetic decoding of words in automatic speech recognition is considered. The properties of Kullback–Leibler divergence are used to synthesize an estimate of the distribution of the divergence between minimum speech units (e.g., single phonemes) inside a single class. It is demonstrated that the minimum variance of the intraphonemic divergence is reached when the phonetic database is tuned to the voice of a single speaker. The estimates are supported by experimental results on the recognition of vowel sounds and isolated words of the Russian language.
We establish a new upper bound for the Kullback-Leibler divergence of two discrete probability distributions which are close in the sense that typically the ratio of probabilities is nearly one and the number of outliers is small.
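The quantity being bounded here is the standard Kullback-Leibler divergence of two discrete distributions, D(p || q) = sum_i p_i log(p_i / q_i). The sketch below only evaluates that definition, to show how small the divergence becomes when the probability ratios are close to one; it does not implement the bound itself, which the abstract does not state. The function name `kl_divergence` and the example distributions are assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """D(p || q) in nats for two discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            # Terms with p_i = 0 contribute nothing (0 * log 0 is taken as 0).
            continue
        if qi == 0.0:
            # p puts mass where q has none: the divergence is infinite.
            return math.inf
        total += pi * math.log(pi / qi)
    return total


if __name__ == "__main__":
    p = [0.50, 0.50]
    q = [0.51, 0.49]   # every ratio p_i / q_i is close to one
    print(kl_divergence(p, q))
```

Note that the divergence is not symmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.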
This proceedings publication is a compilation of selected contributions from the “Third International Conference on the Dynamics of Information Systems”, which took place at the University of Florida, Gainesville, February 16–18, 2011. The purpose of this conference was to bring together scientists and engineers from industry, government, and academia in order to exchange new discoveries and results in a broad range of topics relevant to the theory and practice of dynamics of information systems. Dynamics of Information Systems: Mathematical Foundation presents state-of-the-art research and is intended for graduate students and researchers interested in some of the most recent discoveries in information theory and dynamical systems. Scientists in other disciplines may also benefit from the applications of new developments to their own area of study.