Item-based collaborative filtering is a model-based algorithm for recommender engines. Similarities between items are calculated from the rating matrix, and these similarities are then used to predict a user’s preference for an item the user has not yet rated. Here is a step-by-step worked example, using the following sample data of preferences of four users for three items:
ID | user | item | rating |
241 | u1 | m1 | 2 |
222 | u1 | m3 | 3 |
276 | u2 | m1 | 5 |
273 | u2 | m2 | 2 |
200 | u3 | m1 | 3 |
229 | u3 | m2 | 3 |
231 | u3 | m3 | 1 |
239 | u4 | m2 | 2 |
286 | u4 | m3 | 2 |
Step 1: Write the user-item ratings data in a matrix form. The above table gets rewritten as follows:
user | m1 | m2 | m3 |
u1 | 2 | ? | 3 |
u2 | 5 | 2 | ? |
u3 | 3 | 3 | 1 |
u4 | ? | 2 | 2 |
Here the rating of user u1 for item m3 is 3. A "?" marks a missing rating: user u1 has not rated item m2, user u2 has not rated item m3, and user u4 has not rated item m1.
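Step 1 can be sketched in a few lines of Python. This is only an illustration, not part of the original R-based workflow: the names `ratings`, `build_matrix`, and `user_item` are assumptions made for this sketch, and a missing dictionary key stands in for a "?" cell.

```python
# Ratings as (user, item, rating) triples, copied from the sample table.
ratings = [
    ("u1", "m1", 2), ("u1", "m3", 3),
    ("u2", "m1", 5), ("u2", "m2", 2),
    ("u3", "m1", 3), ("u3", "m2", 3), ("u3", "m3", 1),
    ("u4", "m2", 2), ("u4", "m3", 2),
]


def build_matrix(triples):
    """Return {user: {item: rating}}; a missing key means 'not rated'."""
    matrix = {}
    for user, item, rating in triples:
        matrix.setdefault(user, {})[item] = rating
    return matrix


user_item = build_matrix(ratings)
items = sorted({item for _, item, _ in ratings})

# Print one row per user, with "?" for items the user has not rated.
for user in sorted(user_item):
    row = [str(user_item[user].get(item, "?")) for item in items]
    print(user, "|", " | ".join(row), "|")
```

A nested dictionary is used here rather than a dense array so that "not rated" is never confused with a rating of zero.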
Step 2: We will now create an item-to-item similarity matrix. The idea is to calculate how similar one item is to another. There are a number of ways of calculating this; we will use the cosine similarity measure. To calculate the similarity between items m1 and m2, for example, look at all those users who have rated both items. In our case, both m1 and m2 have been rated by users u2 and u3. We create two item-vectors, v1 for item m1 and v2 for item m2, in the user-space of (u2,u3) and then find the cosine of the angle between these vectors. A zero angle (overlapping vectors, cosine of 1) means total similarity: across these users, the ratings of the two items are in the same proportion. An angle of 90 degrees means a cosine of 0, or no similarity. The two item-vectors are:
v1 = 5 u2 + 3 u3
v2 = 2 u2 + 3 u3
The cosine similarity between the two vectors, v1 and v2, is then:
cos(v1,v2) = (5*2 + 3*3)/sqrt[(25 + 9)*(4 + 9)] = 0.90
Similarly, to calculate the similarity between m1 and m3, we consider only users u1 and u3, who have rated both these items. The two item-vectors, v1 for item m1 and v3 for item m3, in the user-space are:
v1 = 2 u1 + 3 u3
v3 = 3 u1 + 1 u3
The cosine similarity between v1 and v3 is:
cos(v1,v3) = (2*3 + 3*1)/sqrt[(4 + 9)*(9 + 1)] = 0.79
We can similarly calculate the similarity between items m2 and m3 using the ratings given to both by users u3 and u4. The two item-vectors v2 and v3 are:
v2 = 3 u3 + 2 u4
v3 = 1 u3 + 2 u4
And the cosine similarity between them is:
cos(v2,v3) = (3*1 + 2*2)/sqrt[(9 + 4)*(1 + 4)] = 0.87
We now have the complete item-to-item similarity matrix:
item | m1 | m2 | m3 |
m1 | 1 | 0.90 | 0.79 |
m2 | 0.90 | 1 | 0.87 |
m3 | 0.79 | 0.87 | 1 |
Step 3: For each user, we next predict ratings for the items that the user has not rated. We will calculate the rating of user u1 for item m2 (the target item). To do this, we take the ratings the user has already given and weight each one by the just-calculated similarity between the rated item and the target item. We then divide this weighted sum by the sum of the similarities so that the predicted rating stays within the rating scale. In general, for user u and target item i, with the sums running over the items j that u has already rated:
rating(u,i) = sum_j [sim(i,j) * rating(u,j)] / sum_j sim(i,j)
Thus, the predicted rating of item m2 for user u1 uses the similarities for (m2,m1) and (m2,m3) together with u1's ratings for m1 and m3:
rating = (2 * 0.90 + 3 * 0.87)/(0.90 + 0.87) = 2.49
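The arithmetic of Steps 2 and 3 can be checked with a short Python sketch. It is a minimal illustration under the assumptions stated in this post (cosine similarity restricted to users who rated both items, and a similarity-weighted average for the prediction). It is not how recommenderlab is implemented, and the names `user_item`, `cosine_similarity`, and `predict` are made up for this example.

```python
from math import sqrt

# User-item ratings from Step 1; a missing key means "not rated".
user_item = {
    "u1": {"m1": 2, "m3": 3},
    "u2": {"m1": 5, "m2": 2},
    "u3": {"m1": 3, "m2": 3, "m3": 1},
    "u4": {"m2": 2, "m3": 2},
}


def cosine_similarity(matrix, item_a, item_b):
    """Cosine of the angle between two item-vectors, built only from
    users who rated both items. Returns 0.0 if no such user exists."""
    pairs = [
        (row[item_a], row[item_b])
        for row in matrix.values()
        if item_a in row and item_b in row
    ]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_a = sqrt(sum(a * a for a, _ in pairs))
    norm_b = sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_a * norm_b)


def predict(matrix, user, target):
    """Similarity-weighted average of the user's existing ratings.
    Returns None when no rated item is similar to the target."""
    rated = matrix[user]
    sims = {
        item: cosine_similarity(matrix, target, item)
        for item in rated
        if item != target
    }
    total = sum(sims.values())
    if total == 0:
        return None
    return sum(sims[item] * rated[item] for item in sims) / total


for a, b in [("m1", "m2"), ("m1", "m3"), ("m2", "m3")]:
    print(a, b, round(cosine_similarity(user_item, a, b), 2))

print("u1 -> m2:", round(predict(user_item, "u1", "m2"), 2))
```

The sketch keeps full floating-point precision for the similarities and rounds only when printing, so its last digit can differ slightly from a hand calculation that rounds each similarity first.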
A recommender engine using item-based collaborative filtering can be built with the R package recommenderlab. See my blog here.
References:
1. Item-based collaborative filtering
2. recommenderlab: A Framework for Developing and Testing Recommendation Algorithms
June 8, 2016 at 11:12 am |
Thanks a lot for this. This is very helpful. It would be great to learn more about recommenders. Also, if I do not have ratings data for any item, how do I go about it? That is, I only know which items a user purchased and have no information on ratings.
August 9, 2016 at 5:37 am |
Thanks! That’s a great easy to understand explanation. Very helpful.
February 14, 2019 at 6:55 am |
Even I’m looking for the same information. Please help me out in case you have found any insights.
December 2, 2016 at 5:24 am |
Thank you so much. It was a great tutorial for beginners 🙂
January 19, 2017 at 9:01 am |
Thanks for this article. I think v2 = 3 u2 + 3 u3 is wrong and it should be v2 = 2 u2 + 3 u3.
March 26, 2017 at 10:11 am |
Wondering the same thing myself
March 28, 2017 at 5:28 am |
Good, explanation is in detail, but v2 = 3 u2 + 3 u3 should be v2 = 2 u2 + 3 u3, right ? Please confirm.
February 28, 2017 at 2:25 pm |
value for v1 must be v1= 5u2 + 2u2
March 16, 2017 at 5:14 am |
Thanks.Can i have some more example?
May 13, 2017 at 4:24 am |
thanks 🙂
July 23, 2017 at 6:51 pm |
Hi, I need help! What is the formula when you calculate the rating rating = (2 * 0.76 + 3 * 0.86)/(0.76+0.86) = 2.53 ? So what is the general formula without any numbers in it?
August 27, 2017 at 3:56 pm |
Wrong example … i don’t understand where he take the results. Just calculate it by yourselves.