Working paper
Application of Kullback-Leibler divergence for short-term user interest detection
In this paper we show how several similarity measures can be combined to compute the similarity between a pair of users for Collaborative Filtering in Recommender Systems. By aggregating several measures we identify super similar and super dissimilar user pairs and assign a different similarity value to each of these types of pairs. We also introduce a third type of similarity relationship, which we call medium similar user pairs, and use traditional JMSD to assign similarity values to them. By experimentation with real data we show that our method for finding similarity by aggregation performs better than each of the individual similarity metrics. Moreover, as we apply all the traditional metrics in the same setting, we can assess their relative performance.
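As an illustration of the JMSD measure mentioned above, the following minimal sketch assumes the commonly used form in which JMSD is the Jaccard index of the two users' rated-item sets multiplied by one minus the mean squared difference of their normalized ratings on co-rated items. The function name, the dictionary representation of ratings, and the 1 to 5 rating scale are assumptions made for this example, not details taken from the paper.

```python
def jmsd(ratings_u, ratings_v, r_min=1.0, r_max=5.0):
    """JMSD similarity sketch: Jaccard * (1 - MSD).

    ratings_u, ratings_v: dicts mapping item id -> rating (assumed layout).
    r_min, r_max: bounds of the rating scale (assumed 1..5), used to
    normalize rating differences into [0, 1].
    """
    common = ratings_u.keys() & ratings_v.keys()
    union = ratings_u.keys() | ratings_v.keys()
    if not common:
        # No co-rated items: treat the pair as having zero similarity.
        return 0.0
    jaccard = len(common) / len(union)
    span = r_max - r_min
    # Mean squared difference of normalized ratings over co-rated items.
    msd = sum(((ratings_u[i] - ratings_v[i]) / span) ** 2 for i in common) / len(common)
    return jaccard * (1.0 - msd)


u = {"a": 5, "b": 3, "c": 1}
v = {"a": 5, "b": 3, "d": 2}
similarity = jmsd(u, v)  # two co-rated items out of four, identical ratings
```

In this form the Jaccard factor rewards overlap in what the two users rated, while the MSD factor penalizes disagreement on the items they both rated.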
This volume contains the papers presented at the ACM RecSys Challenge 2015 workshop held on September 16, 2015, in Vienna, Austria. The challenge offered participants the opportunity to work on a large-scale e-commerce dataset from a big retailer in Europe. Participants tackled the problem of predicting what items a user intends to purchase, if any, given a click sequence performed during an activity session on the e-commerce website. The challenge was launched on November 15, 2014, and ran for seven months, attracting 850 teams from 49 countries which submitted a total of 5,437 solutions. The winners were determined based on the final ranking of the scores at the end of the challenge. However, in order to receive the monetary prize, the participants were required to submit, and have accepted, a paper detailing the applied algorithms, and attend the challenge's workshop. There were 22 submissions, and each submission was reviewed by at least two program committee members. The following table contains a summary of the 12 accepted papers and the corresponding score and rank in the final leaderboard.
The 4th International Conference on Educational Data Mining (EDM 2011) brings together researchers from computer science, education, psychology, psychometrics, and statistics to analyze large datasets to answer educational research questions. The conference, held in Eindhoven, The Netherlands, July 6-9, 2011, follows the three previous editions (Pittsburgh 2010, Cordoba 2009 and Montreal 2008), and a series of workshops within the AAAI, AIED, EC-TEL, ICALT, ITS, and UM conferences. The increase of e-learning resources such as interactive learning environments, learning management systems, intelligent tutoring systems, and hypermedia systems, as well as the establishment of state databases of student test scores, has created large repositories of data that can be explored to understand how students learn. The EDM conference focuses on data mining techniques for using these data to address important educational questions.
We propose a new approach to Collaborative Filtering which is based on Boolean Matrix Factorisation (BMF) and Formal Concept Analysis. In a series of experiments on real data (the MovieLens dataset) we compare the approach with an SVD-based one in terms of Mean Absolute Error (MAE). One experimental finding is that binary-scaled rating data are sufficient for BMF to obtain almost the same quality in terms of MAE as the SVD-based algorithm obtains on non-scaled data.
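For reference, MAE is conventionally computed as the mean absolute difference between predicted and actual ratings. The sketch below uses made-up ratings purely for illustration; the function name and the sample values are assumptions for this example and do not come from the experiments described above.

```python
def mean_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over paired ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical ratings on a 1..5 scale and hypothetical predictions.
actual = [4, 3, 5]
predicted = [3.5, 3.0, 4.0]
mae = mean_absolute_error(actual, predicted)  # (0.5 + 0.0 + 1.0) / 3
```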
A method for phonetic decoding of words in automatic speech recognition is considered. The properties of the Kullback–Leibler divergence are used to derive an estimate of the distribution of the divergence between minimal speech units (e.g., single phonemes) within a single class. It is demonstrated that the minimum variance of the intraphonemic divergence is reached when the phonetic database is tuned to the voice of a single speaker. The estimates are confirmed by experimental results on the recognition of vowel sounds and isolated words of the Russian language.
We establish a new upper bound for the Kullback-Leibler divergence of two discrete probability distributions which are close in the sense that typically the ratio of probabilities is nearly one and the number of outliers is small.
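To make the quantity concrete, the sketch below computes the Kullback-Leibler divergence of two discrete distributions from its standard definition, D(p || q) = sum_i p_i log(p_i / q_i), in nats. It does not implement the upper bound from the abstract; the function name and the sample distributions are assumptions chosen only to show that probability ratios near one give a small divergence.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) of discrete distributions, in nats."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # convention: 0 * log(0 / q) = 0
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical distributions whose probability ratios are all close to one.
p = [0.25, 0.25, 0.25, 0.25]
q = [0.26, 0.24, 0.25, 0.25]
d_close = kl_divergence(p, q)

# A clearly different pair, for contrast.
d_far = kl_divergence([0.9, 0.1], [0.1, 0.9])
```

Here `d_close` is far smaller than `d_far`, which matches the intuition behind bounding the divergence for distributions that are close in the ratio sense.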
We consider certain spaces of functions on the circle, which naturally appear in harmonic analysis, and superposition operators on these spaces. We study the following question: which functions have the property that each of their superpositions with a homeomorphism of the circle belongs to a given space? We also study the multidimensional case.
We consider the spaces of functions on the m-dimensional torus whose Fourier transform is p-summable. We obtain estimates for the norms of the exponential functions deformed by a C^1-smooth phase. The results generalize to the multidimensional case the one-dimensional results obtained by the author earlier in “Quantitative estimates in the Beurling–Helson theorem”, Sbornik: Mathematics, 201:12 (2010), 1811–1836.
We consider the spaces of functions on the circle whose Fourier transform is p-summable. We obtain estimates for the norms of exponential functions deformed by a C^1-smooth phase.