Ratings, so you know...

I spent a bit of time today looking again at the Instructables rating system, here's what I found:

A single rating lands an Instructable around the 3 mark; there is a huge spike just on the upper side of 3.0. The mean rating (rather close to Pi) is 3.14.

Most ratings seem to be positive: the overall distribution has 68.7% of rated Instructables "above average" (60.5% of all Instructables if you include unrated submissions).

There is a "wave" of ratings heading up towards 4, and a smaller ripple going down towards 2. I see this as some kind of fluid-motion effect, like waves on water, but I'd be interested in opinions.

50% of rated Instructables sit between 2.9 & 3.4
Instructables rated above 3.4 are in the top 25%
Instructables rated below 2.9 are in the lower 25%

Note: Rating is essentially a measure of popularity; it is not a linear indicator of quality / value.

Only 31 Instructables are rated above 4.5 (0.1%)
Only 19 Instructables are rated below 1.5, one of which is the lowest at 0.99
Only 12% of Instructables are unrated

The rating algorithm is in Rachel's FAQ if you're interested (I probably should have put this in earlier).
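The behaviour described above (a single rating landing near 3, and a heavy pull toward the site mean) is what a Bayesian weighted average produces. Here is a minimal sketch of that kind of algorithm; the prior weight and the use of the 3.14 site mean are my assumptions for illustration, not the site's actual constants (see Rachel's FAQ for the real thing):

```python
# Hypothetical sketch of a Bayesian weighted average.
# SITE_MEAN and PRIOR_WEIGHT are assumed values, not Instructables' own.

SITE_MEAN = 3.14    # site-wide mean rating quoted above
PRIOR_WEIGHT = 20   # assumed number of "phantom" average votes

def weighted_rating(votes):
    """Blend the raw average of `votes` with the site-wide mean.

    With few votes the result sits near SITE_MEAN; as votes accumulate
    it converges to the raw average.
    """
    if not votes:
        return None  # unrated
    raw_avg = sum(votes) / len(votes)
    n = len(votes)
    return (PRIOR_WEIGHT * SITE_MEAN + n * raw_avg) / (PRIOR_WEIGHT + n)

# With these assumed constants, a single 5-star vote barely moves the
# score off the mean: weighted_rating([5.0]) is about 3.23.
```

This is why one early vote shows up "around the 3 mark", and why the green and red zones (well above or below 3) take many votes to reach.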

[Image: chart of the ratings distribution]
yokozuna8 years ago
Quote: "Note: Rating is essentially a measure of popularity, it is not a linear indicator of quality / value." I agree, and don't think that's necessarily a good thing. I once was the first to rate an ible, and gave it a 3.5. It then showed up as having a 2.94 average or something like that. It was like I downrated the ible, which I didn't! We can already tell popularity by page views and to a lesser degree the number of comments. Since it seems so few people rate I would like to see the ratings be more of an average and less of bringing everything to the middle. JMO :)
lemonie (author)  yokozuna8 years ago
It works better with a lot of ratings, in the green & red zones you've got definitely liked and definitely not liked. L
You are correct, but I still perceive it to be a problem because few get to the red and green zones, mostly due to lack of votes.
The weighting, and the use of the site-wide average, is (I believe) an important feature of the system. It prevents the stupid bias of "five star" (or "zero star") Instructables coming from just one or two votes.

They are essentially imposing the Copernican assumption of "everything is average until proven otherwise." :-)
I'm not saying it shouldn't be weighted. I just don't think it should be weighted so much. Out of my published instructables, there isn't a full point of difference in the rating of any two of them. I think that some are quite a bit better quality than others, even if they are only leaning towards the two ends of average. In fact, with the exception of the Stolen Ibles one I'm pretty sure the ones that got rated more often tend to be rated higher, regardless of quality. Several of those benefited in page views (and thus number of ratings, and thus higher overall rating) from contests, which should have little to do with the overall rating. Again, it's all just my opinion, I don't mean to step on toes. I'm hoping the staff appreciates the feedback and can do whatever pleases them with the information. On a side note, one of my co-workers once told me "You should always strive for mediocrity, that way nobody steals your... stuff." Only he didn't say stuff.
I don't think you're stepping on toes! You've got a good, thoughtful opinion about an intrinsically hard problem. You're pointing out some clear shortcomings of the existing algorithm. Others have pointed out problems with the simplest alternative (no weighting). Rachel et al. could add more complexity to the algorithm to deal with your concerns (e.g., reduce the weight of the site-average into the score as the number of votes goes up). It's not clear that a more complex algorithm will be either palatable or understandable to the average member.
I think the algorithm works quite efficiently, it just doesn't provide as much range as I'd like. The simplest solution I can think of is: (current algorithm + actual rating average) / 2. Of course I think that may take it back too far the other direction. So perhaps something like (2 × current algorithm + actual rating average) / 3.
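Reading the proposal above as simple averaging, the two blends look like this; the input figures are made up for illustration, and `current` stands in for whatever the existing weighted algorithm produces:

```python
# Sketch of the two blends proposed above. Both input values are
# hypothetical: `current` is the existing algorithm's output, `raw` is
# the plain average of the actual votes.

def blend_half(current, raw):
    # (current algorithm + actual rating average) / 2
    return (current + raw) / 2

def blend_third(current, raw):
    # (2 * current algorithm + actual rating average) / 3
    return (2 * current + raw) / 3

current, raw = 3.3, 4.5   # e.g. a 4.5 vote the site displayed as 3.3
half = blend_half(current, raw)    # 3.9 - halfway back toward the raw vote
third = blend_third(current, raw)  # 3.7 - a gentler correction
```

The second blend keeps more of the existing weighting while still letting the raw votes show through, which matches the "don't weight it so much" goal.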
You ought to send Rachel and Eric a PM about this. You've got a concrete proposal, and a good argument supporting it. You need access to the raw data to test whether your proposal does what you want (and doesn't do what you don't want :-).
Ninzerbean8 years ago
But but but... it depends on how often you rate an ible as to how much weight your rating is worth - this is to prevent shills from over-inflating your tires or something like that. For example, today I rated an ible that had not been rated yet with a 4.5, and it was posted as a 3.3, and I rate pretty often, so I would have thought my rating might actually give the person a rating close to what I rated it, but obviously not. Any thoughts? My thoughts on this are that the folks who don't understand the system might feel quite slighted, and consequently discouraged, by a few ratings and the "score" being a low 3. I know I did until it was explained to me.
The rating system is impersonal: how often you rate doesn't change how much your vote counts. Ratings are just averages, but with a few extra data points thrown in to smooth them out (and prevent the "1 vote, 5 stars" issue).