@marcelcaraciolo
Created December 15, 2011 03:40
RS Formal Definition
The problem addressed by Recommender Systems (RS) is that of estimating the evaluations of
items unknown to a user and using these estimates to recommend to the user a list of the
best-evaluated items, that is, the items most likely to be of interest to the user.
To make such estimates, one may use the evaluations of other items made by the same user
or the evaluations made by other users whose interests are similar to that user's.
Formalizing the problem, given a set of users U and a set of items I, let s be a utility
function which defines the score (the evaluation, or rating) of an item i for a user u.
That is: s: U x I -> P, in which P is a totally ordered set, formed by non-negative
values within an interval, 0 to 10, for example. For each user u, the system must
recommend an item i' which maximizes the utility function for that user:
i' = arg max s(u,i) , taken over all i belonging to I
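The arg max above can be illustrated with a minimal Python sketch. It assumes, only for
illustration, that s is already known for every candidate item; the names recommend,
known_scores and the user and item identifiers are hypothetical and not part of the
original definition.

```python
def recommend(user, items, s):
    """Return the item i' in `items` that maximizes s(user, i)."""
    # max() with a key function is an arg max; on ties it returns the first maximal item.
    return max(items, key=lambda item: s(user, item))


# Hypothetical, fully known utility values on the 0-10 scale.
known_scores = {
    ("u1", "item_a"): 3.0,
    ("u1", "item_b"): 9.0,
    ("u1", "item_c"): 6.5,
}


def s(user, item):
    """Utility function s: U x I -> P, backed here by a lookup table."""
    return known_scores[(user, item)]


best = recommend("u1", ["item_a", "item_b", "item_c"], s)
```

In practice s is only partially known, so the lookup would be replaced by an estimate of
the missing values.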
An element in the set U may be defined by several characteristics, which correspond to the
user's profile. Likewise, elements of the set I may also be defined by several
characteristics, these related to the domain of the items. A film, for instance, may have
as features its title, genres, year of release and the names of the artists, directors and
writers involved in its production.
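As one possible way to picture such an item profile, the film features listed above could
be held in a small record type. This is only a sketch: the class name Film, its field
names and the placeholder values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Film:
    """An element of the item set I, described by domain-specific features."""
    title: str
    year_of_release: int
    genres: List[str] = field(default_factory=list)
    artists: List[str] = field(default_factory=list)
    directors: List[str] = field(default_factory=list)
    writers: List[str] = field(default_factory=list)


# Placeholder values, not a real film.
example_film = Film(
    title="Example Film",
    year_of_release=2000,
    genres=["drama", "comedy"],
    directors=["Example Director"],
)
```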
Since the function s is not defined over the whole space U x I, it must be extrapolated,
so that users can be presented with items they have not yet evaluated and which will
probably be of interest to them. This is the central problem in RS. This extrapolation
may be carried out through heuristics that define the utility function and are validated
empirically, or through an estimate of the utility function obtained by optimizing a
certain performance criterion, such as the mean squared error. More specifically,
the estimates of the evaluations may be obtained using methods from approximation theory,
heuristic formulas such as cosine similarity, and Machine Learning techniques such as
Bayesian classifiers, Support Vector Machines, Artificial Neural Networks and clustering
techniques.
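One heuristic extrapolation of s can be sketched as follows: an unknown evaluation is
estimated as the average of other users' evaluations of the same item, weighted by the
cosine similarity between users. This is just one of many possible heuristics, not a
method prescribed by the text. The sketch assumes that each user's evaluations are stored
in a dictionary, that missing evaluations count as 0 when computing vector norms, and
that all values are non-negative; every function and variable name is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse evaluation vectors (dicts item -> value)."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(evaluations, user, item):
    """Estimate s(user, item) as a similarity-weighted average over other users."""
    num = den = 0.0
    for other, other_evals in evaluations.items():
        if other == user or item not in other_evals:
            continue
        w = cosine_similarity(evaluations[user], other_evals)
        num += w * other_evals[item]
        den += w
    # None signals that no similar user has evaluated the item.
    return num / den if den > 0 else None


def mean_squared_error(pairs):
    """Mean squared error over (estimated, actual) pairs; a performance criterion."""
    pairs = list(pairs)
    return sum((est - act) ** 2 for est, act in pairs) / len(pairs)


# Hypothetical evaluations on the 0-10 scale.
evaluations = {
    "ana": {"a": 8, "b": 2},
    "bob": {"a": 8, "b": 2, "c": 10},
    "carl": {"a": 2, "b": 8, "c": 0},
}
estimate = predict(evaluations, "ana", "c")
```

Here "ana" is closer to "bob" than to "carl", so the estimate for item "c" is pulled toward
bob's evaluation. Comparing such estimates with held-out evaluations through
mean_squared_error is one way to validate the heuristic empirically.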