Proceedings of the 2015 International ACM Recommender Systems Challenge
This volume contains the papers presented at the ACM RecSys Challenge 2015 workshop held on September 16, 2015, in Vienna, Austria. The challenge offered participants the opportunity to work on a large-scale e-commerce dataset from a big retailer in Europe. Participants tackled the problem of predicting what items a user intends to purchase, if any, given a click sequence performed during an activity session on the e-commerce website. The challenge was launched on November 15, 2014, and ran for seven months, attracting 850 teams from 49 countries which submitted a total of 5,437 solutions. The winners were determined based on the final ranking of the scores at the end of the challenge. However, in order to receive the monetary prize, the participants were required to submit, and have accepted, a paper detailing the applied algorithms, and attend the challenge's workshop. There were 22 submissions, and each submission was reviewed by at least two program committee members. The following table contains a summary of the 12 accepted papers and the corresponding score and rank in the final leaderboard.
In this paper, we describe the winning approach for the RecSys Challenge 2015. Our key points are (1) two-stage classification, (2) massive usage of categorical features, (3) strong classifiers built by gradient boosting and (4) threshold optimization based directly on the competition score. We describe our approach and discuss how it can be used to build scalable personalization systems.
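Point (4), choosing a decision threshold by maximizing the evaluation score directly, can be illustrated with a minimal sketch. The abstract does not define the competition score, so the `f1` function below is only a stand-in assumption, and `best_threshold` is a hypothetical helper name rather than the authors' code.

```python
def best_threshold(probs, labels, score_fn):
    """Scan candidate thresholds and keep the one with the highest score.

    probs    -- predicted probabilities from any classifier
    labels   -- true binary labels (1 or 0)
    score_fn -- evaluation metric taking (labels, predictions);
                an assumption standing in for the real competition score
    """
    best_t, best_s = None, float("-inf")
    for t in sorted(set(probs)):
        preds = [p >= t for p in probs]
        s = score_fn(labels, preds)
        if s > best_s:
            best_t, best_s = t, s
    return best_t, best_s


def f1(labels, preds):
    """F1 score, used here only as a placeholder metric."""
    tp = sum(1 for y, p in zip(labels, preds) if y and p)
    fp = sum(1 for y, p in zip(labels, preds) if not y and p)
    fn = sum(1 for y, p in zip(labels, preds) if y and not p)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
```

In practice the threshold would be tuned on held-out data rather than on the training set, so that the selected value is not overfitted.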
In this paper we show how several similarity measures can be combined to compute the similarity between a pair of users for Collaborative Filtering in Recommender Systems. By aggregating several measures, we identify super similar and super dissimilar user pairs and assign a different similarity value to each of these types of pairs. We also introduce a third type of similarity relationship, which we call medium similar user pairs, and use the traditional JMSD measure to assign similarity values to them. Experiments with real data show that our aggregation method performs better than each of the individual similarity metrics. Moreover, as we apply all the traditional metrics in the same setting, we can assess their relative performance.
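The abstract does not spell out JMSD. A minimal sketch, assuming the commonly cited definition (Jaccard index of the rated item sets multiplied by one minus the mean squared difference of normalized ratings on co-rated items), could look as follows. The function name, the dictionary representation, and the 1 to 5 rating scale are assumptions for illustration; the aggregation rule for super similar and super dissimilar pairs is not given in the abstract and is not reproduced here.

```python
def jmsd(ratings_u, ratings_v, r_min=1.0, r_max=5.0):
    """JMSD similarity between two users (assumed definition).

    ratings_u, ratings_v -- dicts mapping item id -> rating
    r_min, r_max         -- bounds of the rating scale (assumed 1..5)
    """
    common = ratings_u.keys() & ratings_v.keys()
    if not common:
        return 0.0  # no co-rated items: treat the pair as not similar
    union = ratings_u.keys() | ratings_v.keys()
    jaccard = len(common) / len(union)
    span = r_max - r_min
    # Mean squared difference of ratings normalized to [0, 1].
    msd = sum(((ratings_u[i] - ratings_v[i]) / span) ** 2
              for i in common) / len(common)
    return jaccard * (1.0 - msd)
```

For example, two users who agree exactly on two co-rated items out of four items rated overall get a similarity of 0.5 under this definition.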
In this paper, we describe a special recommender approach based on features extracted from images of clothes. The feature extraction method relies on a pre-trained deep neural network, followed by transfer learning on the dataset. Recommendations are generated by the neural network as well. All the experiments are based on items of the category Clothing, Shoes and Jewelry from the Amazon product dataset. It is demonstrated that the proposed approach outperforms the baseline collaborative filtering method. (The first author is the 1st place winner of the Open HSE Student Research Paper Competition (NIRS) in 2017, Computer Science nomination, with the topic “Extraction of Visual Features for Recommendation of Products”, as a 2017 graduate of the “Data Science” master program at the Computer Science Faculty, HSE, Moscow.)
Co-authorship networks contain invisible patterns of collaboration among researchers. The process of writing a joint paper can depend on different factors, such as friendship, common interests, and university policy. We show that, given a temporal co-authorship network, it is possible to predict future publications. We treat the problem of recommending collaborators as a link prediction task, using graph embeddings obtained from the co-authorship network. We run experiments on data from the HSE publications graph and compare our approach with relevant models.
This volume contains the papers that were presented at the ACM Recommender Systems Challenge Workshop 2018, which was held at ACM RecSys 2018, the 12th ACM Conference on Recommender Systems. The authors of these papers participated in the RecSys Challenge 2018 by designing and implementing recommender system algorithms for automatic music playlist continuation. We received 24 paper submissions, each of which received between two and four reviews from recognized experts in the area of recommender systems, information retrieval, and music. We eventually accepted 16 for presentation in the workshop.
A model for organizing cargo transportation between two node stations connected by a railway line containing a certain number of intermediate stations is considered. Cargo moves in one direction only. Such a situation may occur, for example, if one of the node stations is located in a region that produces raw material for a manufacturing industry located in another region, where the other node station is situated. Freight traffic is organized by means of a number of technologies. These technologies determine the rules for taking on cargo at the initial node station, the rules of interaction between neighboring stations, and the rule for distributing cargo to the final node stations. The process of cargo transportation is governed by a given control rule. For such a model, one must determine the possible modes of cargo transportation and describe their properties. The model is described by a finite-dimensional system of differential equations with nonlocal linear restrictions. The class of solutions satisfying the nonlocal linear restrictions is extremely narrow. This leads to the need for a “correct” extension of the solutions of the system of differential equations to a class of quasi-solutions whose distinctive feature is gaps at a countable number of points. Using the fourth-order Runge–Kutta method, we were able to construct these quasi-solutions numerically and determine their rate of growth. We note that the main technical difficulty was obtaining quasi-solutions that satisfy the nonlocal linear restrictions. Furthermore, we investigated the dependence of the quasi-solutions and, in particular, of the sizes of the gaps (jumps) of the solutions on a number of model parameters characterizing the control rule, the technologies for cargo transportation, and the intensity of cargo supply at a node station.
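For readers unfamiliar with the numerical scheme mentioned above, the classical fourth-order Runge–Kutta step for a system y' = f(t, y) can be sketched as follows. The abstract does not give the model's equations, so `f` is a generic right-hand side and the function names are illustrative assumptions; the treatment of jumps and of the nonlocal linear restrictions is not covered by this sketch.

```python
import math


def rk4_step(f, t, y, h):
    """One classical RK4 step for y' = f(t, y), with y a list of floats."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def integrate(f, t0, y0, h, steps):
    """Advance the system by `steps` RK4 steps of size h from (t0, y0)."""
    t, y = t0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


if __name__ == "__main__":
    # Toy check on y' = -y, y(0) = 1, whose exact solution is exp(-t).
    approx = integrate(lambda t, y: [-y[0]], 0.0, [1.0], 0.01, 100)
    print(approx[0], math.exp(-1.0))
```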
Event logs collected by modern information and technical systems usually contain enough data for automated discovery of process models. A variety of algorithms have been developed for process model discovery, conformance checking, log-to-model alignment, comparison of process models, etc.; nevertheless, quick analysis of ad hoc selected parts of a log has not yet received a full-fledged implementation. This paper describes a ROLAP-based method of multidimensional event log storage for process mining. The result of the log analysis is visualized as a directed graph representing the union of all possible event sequences, ranked by their occurrence probability. Our implementation allows the analyst to discover process models for sublogs defined by an ad hoc selection of criteria and a value of occurrence probability.
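The idea of building a graph from the event sequences of a filtered sublog can be illustrated with a small in-memory sketch. It does not reproduce the ROLAP storage described in the paper; the event fields `case`, `activity`, and `ts`, the function name, and the equality-based filter are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def variant_graph(events, criteria=None, min_prob=0.0):
    """Return (variants, edges) for the sublog selected by `criteria`.

    events   -- list of dicts with assumed keys "case", "activity", "ts"
    criteria -- attribute -> required value; events not matching are dropped
    min_prob -- keep only event sequences with at least this probability
    """
    criteria = criteria or {}
    traces = defaultdict(list)
    for e in sorted(events, key=lambda e: e["ts"]):
        if all(e.get(k) == v for k, v in criteria.items()):
            traces[e["case"]].append(e["activity"])
    # Count how often each distinct sequence (variant) occurs.
    counts = Counter(tuple(t) for t in traces.values())
    total = sum(counts.values())
    variants = {v: c / total for v, c in counts.items()
                if c / total >= min_prob}
    # The graph is the union of consecutive-activity pairs of kept variants.
    edges = set()
    for v in variants:
        edges.update(zip(v, v[1:]))
    return variants, edges
```

Raising `min_prob` removes rare sequences and therefore simplifies the resulting graph, which is one plausible reading of selecting a sublog by occurrence probability.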
The geographic information system (GIS) is based on the first and only Russian Imperial Census of 1897 and the First All-Union Census of the Soviet Union of 1926. The GIS features vector data (shapefiles) of all provinces of the two states. For the 1897 census, there is information about linguistic, religious, and social estate groups. The part based on the 1926 census features nationality. Both shapefiles include information on gender, rural and urban population. The GIS allows for producing any necessary maps for individual studies of the period which require the administrative boundaries and demographic information.
Existing approaches suggest that IT strategy should be a reflection of business strategy. In practice, however, organisations often do not follow their business strategy even if it is formally declared. Under these conditions, IT strategy can be viewed not as a plan, but as an organisational shared view on the role of information systems. This approach generally reflects only a top-down perspective of IT strategy. It can therefore be supplemented by a strategic behaviour pattern (i.e., a more or less standard response to a change, formed as a result of previous experience) to implement a bottom-up approach. Two components that can help to establish an effective reaction to new IT initiatives are proposed here: a model of IT-related decision making, and an efficiency measurement metric to estimate the maturity of business processes and the appropriate IT. The use of the proposed tools is demonstrated in practical cases.