TY - JOUR
T1 - Pre-processing approaches for collaborative filtering based on hierarchical clustering
AU - de Aguiar Neto, Fernando S.
AU - da Costa, Arthur F.
AU - Manzato, Marcelo G.
AU - Campello, Ricardo J.G.B.
N1 - Funding Information:
This work was supported by the São Paulo Research Foundation (FAPESP), Grant Nos. 2016/04798-5 and 2016/20280-6, and by CNPq (Brazilian National Research Council), grant #302161/2017-1. The research was also carried out using the computational resources of the Center for Mathematical Sciences Applied to Industry (CeMEAI), funded by FAPESP (grant 2013/07375-0).
Publisher Copyright:
© 2020 Elsevier Inc.
PY - 2020/9
Y1 - 2020/9
N2 - Recommender Systems (RS) help users find relevant content, such as movies, books, songs, and other products, based on their preferences. Such preferences are gathered by analyzing users’ past interactions; however, the data collected for this purpose are typically prone to sparsity and high dimensionality. Clustering-based techniques have been proposed to handle those problems effectively and efficiently by segmenting the data into a number of similar groups based on predefined characteristics. Although such techniques have gained increasing attention in the recommender systems community, they are usually bound to a particular recommender system and/or require critical parameters, such as the number of clusters. In this paper, we present three variants of a general-purpose method to optimally extract user groups from a hierarchical clustering algorithm, specifically targeting RS problems. The proposed extraction methods do not require critical parameters and enable any recommender algorithm to be used at the recommendation step. Our experiments have shown promising recommendation results on nine well-known public datasets from different domains.
AB - Recommender Systems (RS) help users find relevant content, such as movies, books, songs, and other products, based on their preferences. Such preferences are gathered by analyzing users’ past interactions; however, the data collected for this purpose are typically prone to sparsity and high dimensionality. Clustering-based techniques have been proposed to handle those problems effectively and efficiently by segmenting the data into a number of similar groups based on predefined characteristics. Although such techniques have gained increasing attention in the recommender systems community, they are usually bound to a particular recommender system and/or require critical parameters, such as the number of clusters. In this paper, we present three variants of a general-purpose method to optimally extract user groups from a hierarchical clustering algorithm, specifically targeting RS problems. The proposed extraction methods do not require critical parameters and enable any recommender algorithm to be used at the recommendation step. Our experiments have shown promising recommendation results on nine well-known public datasets from different domains.
KW - Cluster quality
KW - Hierarchical clustering
KW - Optimal selection of clusters
KW - Pre-processing
KW - Recommender systems
KW - Sparsity reduction
U2 - 10.1016/j.ins.2020.05.021
DO - 10.1016/j.ins.2020.05.021
M3 - Journal article
AN - SCOPUS:85085275625
SN - 0020-0255
VL - 534
SP - 172
EP - 191
JO - Information Sciences
JF - Information Sciences
ER -