Building Your Own Recommendation Engine


This tutorial explains how to build a recommendation engine yourself using Ruby on Rails.

The main idea is to collect data about everything users do on the site.

For example, for a video site, the data would be:

  • Who uploaded a video?
  • Who commented on a video?
  • Which tags were created?
  • Who visited the video? (also tracking anonymous visitors)
  • Who favorited a video?
  • Who rated a video?
  • Which channels was the video assigned to?
  • Text streams of title, description, tags, channels and comments are collected by a fulltext indexer which puts weight on each of the data sources.
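
As a minimal sketch of how such activity could be recorded, the following keeps events in a plain in-memory log. The names `Event`, `ActivityLog`, `record` and `for_video` are hypothetical and not part of Rails; in a real Rails application this would more likely be a database-backed model.

```ruby
# One recorded activity: who did what to which video.
# `value` carries optional extra data (e.g. a rating or a tag name).
Event = Struct.new(:actor, :action, :video_id, :value, :at)

# Hypothetical in-memory activity log (assumption: a real site would
# persist these rows in a database table instead).
class ActivityLog
  def initialize
    @events = []
  end

  # Appends an event and returns self so that calls can be chained.
  def record(actor, action, video_id, value = nil, at: Time.now)
    @events << Event.new(actor, action, video_id, value, at)
    self
  end

  # All events for a video, optionally filtered by action.
  def for_video(video_id, action = nil)
    @events.select do |e|
      e.video_id == video_id && (action.nil? || e.action == action)
    end
  end
end

log = ActivityLog.new
log.record("alice", :upload, 1)
   .record("anon-7", :visit, 1)   # anonymous visitors get a session-based id
   .record("bob", :rate, 1, 4)
```

Keeping every kind of activity in one uniform shape makes it easy to query the same log from several recommendation functions later.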

Typically, recommendations for a page can be derived from the following sources:

  1. Find similarity by fulltext search on title
  2. Find similarity by fulltext search on description
  3. Find similarity by fulltext search on comments
  4. Find similarity by fulltext search on tags
  5. Other pages where the same user has been active (e.g. rating, commenting)
  6. Other pages with the same tags (weighted by “expressiveness” of tags).
  7. Other pages favorited by the users who favorited this page.
  8. Other pages rated by the users who rated this page (weighted)
  9. Other pages browsed by people who browsed this page.
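
To illustrate the last point, here is a minimal sketch of "other pages browsed by people who browsed this page". The `VISITS` data and the `also_browsed` function are made up for the example; the weight is simply the number of visitors a page shares with the current one, which is one possible choice among many.

```ruby
# Hypothetical visit log: visitor id => ids of the pages they visited.
VISITS = {
  "alice"  => [1, 2, 3],
  "bob"    => [1, 3],
  "carol"  => [2, 4],
  "anon-7" => [1, 3, 4]
}.freeze

# Returns [page_id, weight] pairs for pages that share visitors with
# `page_id`. The weight is the number of shared visitors.
def also_browsed(page_id, visits = VISITS)
  counts = Hash.new(0)
  visits.each_value do |pages|
    next unless pages.include?(page_id)

    pages.uniq.each { |other| counts[other] += 1 unless other == page_id }
  end
  # Highest weight first; ties are broken by page id for stable output.
  counts.sort_by { |id, weight| [-weight, id] }
end

also_browsed(1) # => [[3, 3], [2, 1], [4, 1]]
```

The favorites and ratings sources (points 7 and 8) could follow the same pattern with a different input log.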

We would therefore write one function per source, each returning a list of (id, weight) tuples. Some functions consider only a limited number of pages (e.g. the last 50); others adjust the weight, e.g. by rating or by tag count (the more often a tag is used, the less expressive it is).
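
One such function, for the "same tags" source, might look like the sketch below. The `PAGE_TAGS` data, the function names and the formula 1 / (number of pages using the tag) are assumptions chosen for illustration; the only idea taken from the text is that frequently used tags should count less. The sketch also assumes the hash is ordered oldest to newest, so `last(limit)` picks the most recent pages.

```ruby
# Hypothetical data: page id => tags, ordered from oldest to newest page.
PAGE_TAGS = {
  1 => %w[ruby rails video],
  2 => %w[ruby video],
  3 => %w[rails video],
  4 => %w[video]
}.freeze

# Assumed expressiveness measure: a tag used on many pages weighs less.
def tag_weight(tag, page_tags)
  usage = page_tags.count { |_, tags| tags.include?(tag) }
  1.0 / usage
end

# Returns [page_id, weight] pairs for pages sharing tags with `page_id`,
# looking only at the `limit` most recent pages.
def similar_by_tags(page_id, page_tags = PAGE_TAGS, limit: 50)
  own_tags = page_tags.fetch(page_id)
  scores = Hash.new(0.0)
  page_tags.to_a.last(limit).each do |other_id, tags|
    next if other_id == page_id

    (tags & own_tags).each do |tag|
      scores[other_id] += tag_weight(tag, page_tags)
    end
  end
  scores.sort_by { |id, weight| [-weight, id] }
end

similar_by_tags(1) # => [[2, 0.75], [3, 0.75], [4, 0.25]]
```

Here "video" appears on all four pages and contributes only 0.25 per match, while "ruby" and "rails" appear on two pages each and contribute 0.5.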

All these lists are then combined into a single list by summing the weights per page id and sorting by total weight. The whole process runs as a cron job and has to be rerun frequently to keep the recommendations current.
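
The combination step can be sketched in a few lines. The function name `combine` and the sample lists are hypothetical; the input is any number of lists of [id, weight] pairs as described above.

```ruby
# Merges several lists of [page_id, weight] pairs into one list by
# summing the weights per page id, then sorts by total weight
# (highest first, ties broken by page id).
def combine(*lists)
  totals = Hash.new(0.0)
  lists.each do |list|
    list.each { |id, weight| totals[id] += weight }
  end
  totals.sort_by { |id, weight| [-weight, id] }
end

# Example input, as two different sources might produce it.
by_tags   = [[2, 0.75], [3, 0.75], [4, 0.25]]
by_visits = [[3, 3], [2, 1], [4, 1]]

combine(by_tags, by_visits) # => [[3, 3.75], [2, 1.75], [4, 1.25]]
```

Because the sources are simply added, a source with larger raw weights dominates the result; scaling each list before combining is one way to balance them.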