Favorites and Ratings
- We removed star ratings because most people rated either 1 star or 5 stars
- The ratings ended up extremely close to one another and became meaningless
- Our experience with a +/- system showed that having a down vote made for a toxic experience for everyone
- We're now using just the favorite button to simplify and keep it a positive experience for all
Read on for more details.
We've had a rocky relationship with ratings. When we started Instructables in 2005, we had a "+/-" rating system where members could express either an up or a down vote for an Instructable. Good Instructables generally received lots of "+" votes, and the list of the most negatively voted Instructables was pretty amusing (with this taking the bottom position for quite some time). Our experience with this system was twofold: first, negative ratings really didn't add much value to the community and were often given in a mean-spirited way rather than in a spirit of constructive criticism; second, the community expressed an interest in giving a more nuanced rating to an Instructable. In fact, for a while members would comment on a project and assign it a letter grade from A to F.
So, we implemented a 5-star rating system. An Instructable's rating was calculated as the average of all of its ratings. This method worked fine when there were lots of ratings, but gave poor results when there were few. For example, if the first two raters of a newly published Instructable both rated it 1 star (for whatever reason), that Instructable would appear terrible by its rating, and the rating would only improve as more people decided to give it a chance and perhaps rate it. It was an awful experience for new authors to watch their work get rated down at first and then, maybe, improve over several days.
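To see the pitfall concretely, here's a minimal sketch of a plain-average rating (Python used purely for illustration; the function name is mine, not Instructables' actual code):

```python
def plain_average(ratings):
    """Naive rating: the arithmetic mean of all star ratings."""
    return sum(ratings) / len(ratings)

# Two early 1-star votes sink a brand-new Instructable to the floor...
print(plain_average([1, 1]))           # 1.0
# ...and it takes a pile of 5-star votes just to claw back to mediocre.
print(plain_average([1, 1, 5, 5, 5]))  # 3.4
```

With only a handful of votes, each individual rater has enormous leverage over the displayed score, which is exactly the problem the Bayesian approach below was built to fix.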
To address this, we built a Bayesian rating system with 5 stars. The objective was to push the rating of an Instructable towards the average rating of all Instructables when it had few raters. As more people rated it, the rating could diverge further and further from the average. We spent quite some time optimizing this system: we built messages helping people understand what 1 star meant vs. 5 stars; we made it so the first couple dozen ratings of a new member had lower weight than later ratings (so it was difficult to change an Instructable's rating by rating it with lots of newly created accounts); and we tweaked all of the underlying knobs and ratios. When everything had settled, a newly published Instructable would have a rating around 3, a very good Instructable could get to around 4, and an extremely popular and highly rated Instructable approached 4.8 or 4.9.
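The post doesn't spell out the exact formula, but a common way to build such a Bayesian (damped) average is to treat the site-wide mean as a batch of "virtual" votes that real votes gradually outweigh. A sketch under that assumption, with made-up constants rather than Instructables' real parameters:

```python
def bayesian_average(ratings, prior_mean=3.0, prior_weight=10):
    """Pull a rating toward the site-wide mean until enough real votes
    accumulate.

    prior_mean   -- assumed site-wide average rating (illustrative value)
    prior_weight -- how many virtual votes the prior counts for (illustrative)
    """
    total = prior_mean * prior_weight + sum(ratings)
    count = prior_weight + len(ratings)
    return total / count

# Two early 1-star votes only nudge a new project below the mean:
print(round(bayesian_average([1, 1]), 2))     # 2.67
# Even 200 straight 5-star votes approach, but never quite reach, 5:
print(round(bayesian_average([5] * 200), 2))  # 4.9
```

With these toy constants, the behavior matches the shape described above: new Instructables hover near 3, and only a heavily rated favorite climbs toward 4.9. The per-member vote weighting mentioned above would be a further refinement on top of this, not shown here.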
It seemed like the system was working, and it gave a quantitative measure of the quality of an Instructable in relation to other Instructables. However, it was ultimately deeply dissatisfying. After all this work, and lots of input from the community through ratings, thousands of Instructables were all rated pretty much the same, with only slight differences in the rating's third digit. The slight variations were most likely noise, and the ranking didn't give a definitive sense of quality to a new visitor nor a sense of accomplishment to an author. What does it mean to say Sweet Potato Fries is rated 4.22 compared to Arduino-Controlled Robotic Drum at 4.23? Is the rating going to help you decide which Instructable you might try your hand at? Further, when we looked at how people actually rated, it turned out that ratings were heavily skewed towards 1 star or 5 stars. Even after requesting more nuanced ratings, and despite how we described what various star levels meant, people were still only giving As or Fs -- either they liked the Instructable or they didn't.
This system probably would have been ok if not for two factors: star ratings appearing in Google searches and the rise of the "Like" button.
If you search for "sweet potato fries" on Google you'll likely get a couple of recipes including one from Instructables. Many first time visitors to Instructables find us through searches -- many of them food-related -- so it's very important how our content appears in search results. A while back, Google started including star ratings in some of the search results, primarily recipes. Many recipe sites have 5-star rating systems, so I guess the presumption was that including the ratings might help searchers better choose between results. Unfortunately, this doesn't work because you cannot compare the ratings of one site to the ratings of another. For example, many of the results for a "sweet potato fries" search have 5-star ratings using a basic average system (you can easily tell by the number of raters), while the result from Instructables might have more raters giving it 5-stars, but because of our Bayesian system, the overall rating might only be 4 stars. On Instructables, 4 stars represents a very good project, while on another site, everything -- even garbage -- might be rated 5. All of our work to create a meaningful rating system was being used to penalize us in search results -- why would a searcher click through to a 4-star sweet potato fries recipe when they could go to a 5-star recipe?
We couldn't tell how many people were choosing not to visit Instructables because our projects had lower star ratings, but we needed to do something. I'm not proud to admit this, but our solution was to inflate all of our ratings by a full star. Instructables rated 4.9 were suddenly joined by Instructables previously rated 3.9. The ratings became even more meaningless because lots of Instructables were now all rated 4.5 or greater.
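The post doesn't say how the top of the scale was handled; a minimal sketch of a flat one-star inflation, assuming ratings were simply clamped at 5 stars:

```python
def inflate(rating, boost=1.0, max_stars=5.0):
    """Add a flat boost to a rating, clamped to the top of the scale.

    boost and max_stars are illustrative assumptions, not confirmed values.
    """
    return min(rating + boost, max_stars)

print(round(inflate(3.9), 1))  # 4.9
print(round(inflate(4.9), 1))  # 5.0 -- hits the assumed cap
```

A flat boost like this compresses everything near the top of the scale, which is why, as noted above, so many Instructables ended up bunched at 4.5 or greater.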
At this same time, Facebook's "Like" button was gaining momentum. The "Like" button is a pretty nice concept as it eliminates the ability to express a negative preference and it associates an individual with the rating. I think it's more meaningful to know that 10 people like my Instructable than it is to have a 4.5 (or whatever) rating, even if that rating is the result of 10 people giving me 5 stars.
Not everyone on Instructables uses Facebook, nor do we want to outsource our rating system to Facebook, but we wanted to learn from the success of the "Like" button and expand upon the concept to make our rating system less like giving a grade or being graded. To accomplish this, we've removed our ratings and made the "favorite" action the way to express a preference on an Instructable. Marking something as a favorite is more of a commitment than saying you like it or giving it a rating, but this is a distinction I see as important and worth developing. At Instructables, we hope to inspire you to take action and build something great, and that requires a greater commitment than simply liking something. In the future, we plan to expand what you can do with favorites, exploring concepts such as "I made it" and "I want to make it" among others.
Do we need something more akin to a like button in addition to favorites? I'm not sure, and I'd love your feedback. We'll be closely watching how people use favorites in its new, highly prominent position. As with most things, this is an experiment, and we're collecting data to see if this change is something that helps our authors. I hope you've enjoyed seeing a little bit under the hood about our rating system! Let us know what you think.