It’s not magic! Why Netflix knows you so well
Online services like Spotify and Netflix use algorithms to recommend exactly what their users want. It may seem like magic, but it actually rests on simple mathematical principles, explains Associate Professor Jes Frellsen. These algorithms are not only used in the entertainment business; they can also be applied in other areas such as cancer research.
Research, Computer Science Department
Written 7 February, 2017 08:53 by Vibeke Arildsen
It can almost seem like magic when your music service time and again recommends new music that matches your taste exactly. But in fact, it can be simple maths that determines what web shops and streaming services recommend to you.
"When you look at the machine learning methods used in recommender systems, they are really just algorithms based on assumptions about the similarities between users, products and users’ preferences for these products. Both these assumptions and the mathematical principles can be relatively simple, and it works amazingly well," says Jes Frellsen, researcher in machine learning and artificial intelligence at ITU.
Like asking a friend - or 10
When designing a recommender system, a company can choose between three overall approaches, explains Jes Frellsen. You might select a content-based approach that uses information about the products to generate suggestions for similar products. If a user has watched The Lord of the Rings, you might suggest that he or she watch The Hobbit next. You can also use demographic information, such as the user's gender and age, to suggest content that similar users like.
Finally, you can base the algorithm exclusively on data about the behaviour of different users. This is particularly useful when the service does not have any other data about the users, and when classifying the products is difficult.
"In these so-called collaborative filtering methods, you look exclusively at the user’s preferences for different products and compare them to the preferences of other users. For example, you could find the 10 users whose ratings of some films are the most similar to your own ratings. If you then want to predict what you will think of a certain film, you can use a weighted average of the ratings made by these 10 most similar users. It is kind of the same as asking a friend with similar tastes to recommend a film," explains Jes Frellsen.
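The nearest-neighbour idea in the quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the users, films and ratings are made up, cosine similarity is one of several possible similarity measures (the article does not name one), and the function names are hypothetical.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale: user -> {film: rating}.
ratings = {
    "you":   {"Alien": 5, "Amelie": 1, "Heat": 4},
    "anna":  {"Alien": 5, "Amelie": 2, "Heat": 4, "Brazil": 5},
    "bo":    {"Alien": 1, "Amelie": 5, "Heat": 2, "Brazil": 1},
    "carla": {"Alien": 4, "Amelie": 1, "Heat": 5, "Brazil": 4},
}

def similarity(a, b):
    """Cosine similarity over the films both users have rated (0.0 if none)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[f] * b[f] for f in shared)
    norm_a = sqrt(sum(a[f] ** 2 for f in shared))
    norm_b = sqrt(sum(b[f] ** 2 for f in shared))
    return dot / (norm_a * norm_b)

def predict(ratings, user, film, k=10):
    """Similarity-weighted average of the ratings given to `film`
    by the k users most similar to `user` who have rated it."""
    candidates = [
        (similarity(ratings[user], theirs), theirs[film])
        for name, theirs in ratings.items()
        if name != user and film in theirs
    ]
    nearest = sorted(candidates, reverse=True)[:k]  # most similar first
    total_weight = sum(sim for sim, _ in nearest)
    if total_weight == 0:
        return None  # nobody comparable has rated this film
    return sum(sim * rating for sim, rating in nearest) / total_weight

print(predict(ratings, "you", "Brazil", k=2))   # only the two closest users
print(predict(ratings, "you", "Brazil", k=10))  # everyone who rated the film
```

With `k=2` the prediction is driven by the two users whose tastes are closest to yours; with a larger `k` the dissimilar user also gets a say, although with a smaller weight.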
The more data, the better
Companies often opt for a combination of the three approaches, particularly large enterprises for which predicting users’ needs is a key part of the business model.
"Generally, recommender systems get better the more data you have. If you combine information about content, demographics and user behaviour, you will often get the best predictions," says Jes Frellsen.
Simple recommendation systems are often more transparent, however. If a company wants to explain to its users why they are getting certain recommendations, this is easier to do when using a simple method.
When the algorithm is wrong
Most people have experienced getting suggestions for films or other products that are completely off the mark. This can happen, for instance, if the service does not have enough data to provide good suggestions.
"A classic problem for recommender systems is how to handle new users or new products. If you are a new user who has only rated a few films, it can be difficult to tell which users you are similar to. The system has to get to know you, so to speak. Likewise, it can be hard to say which users will like a new film that no one has rated yet, unless the system uses information about, for instance, the genre. In both cases, the system will typically make poor predictions and thus give recommendations you won’t necessarily like," says Jes Frellsen.
"And then there are films that are just difficult to predict. The film Napoleon Dynamite is a classic example. The ratings of this film are typically very polarized, and people who otherwise have similar tastes can completely disagree about how to rate it. This can cause problems for the algorithms," he says.
Machine learning in cancer research
Recommender systems are almost an entire academic discipline in themselves, and one that has been developing rapidly over the last decade. These prediction methods are not only relevant in commercial contexts. Jes Frellsen has been part of a research collaboration that developed machine learning methods for predicting how well different types of drugs work on various cancer types.
"In this case we looked at a dataset with measurements of how effective various drugs are on different cell lines representing different types of cancer and tissue. In the dataset, not all combinations of drugs and cell lines were measured, and we tried to predict these missing measurements. The method we developed was of exactly the same kind as those you would use in recommender systems," he says.
"With recommender systems, you want to predict a user’s preference for a product that the person has not rated, while in our case we wanted to predict how effective different drugs are on cell lines they have not been tested on. The problems are very similar; it is really just about predicting the unobserved values."
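To make "predicting the unobserved values" concrete, here is a minimal sketch of one common recommender technique, low-rank matrix factorisation, applied to a small table with a missing cell. The article does not say which model the researchers used, so this is an assumption chosen purely for illustration. The numbers are invented, the function name is hypothetical, and the sketch uses a single factor per row and column with no noise handling, which real drug-response data would certainly need.

```python
def complete_rank1(table, iterations=200):
    """Fit table[i][j] ~ row[i] * col[j] on the observed cells by
    alternating least squares, then fill in every cell.
    None marks a missing measurement. Assumes each row and each
    column contains at least one observed, non-zero value."""
    n_rows, n_cols = len(table), len(table[0])
    row = [1.0] * n_rows
    col = [1.0] * n_cols
    for _ in range(iterations):
        # Best row factors given the current column factors.
        for i in range(n_rows):
            seen = [j for j in range(n_cols) if table[i][j] is not None]
            row[i] = (sum(table[i][j] * col[j] for j in seen)
                      / sum(col[j] ** 2 for j in seen))
        # Best column factors given the updated row factors.
        for j in range(n_cols):
            seen = [i for i in range(n_rows) if table[i][j] is not None]
            col[j] = (sum(table[i][j] * row[i] for i in seen)
                      / sum(row[i] ** 2 for i in seen))
    return [[row[i] * col[j] for j in range(n_cols)] for i in range(n_rows)]

# Made-up responses: rows are drugs, columns are cell lines.
# The last drug was never tested on the last cell line.
responses = [
    [1.0, 2.0, 4.0],
    [2.0, 4.0, 8.0],
    [3.0, 6.0, None],
]

filled = complete_rank1(responses)
print(filled[2][2])  # estimate for the unmeasured combination
```

Swapping "drugs" and "cell lines" for "users" and "films" gives the recommender setting described in the quote: the table is the same, only the labels change.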
Jes Frellsen, Associate Professor, phone +45 7218 5030, email jefr@itu.dk
Vibeke Arildsen, Press Officer, phone 2555 0447, email viar@itu.dk