The split-personality critic

The 5-star rating control is misused in most applications today. How can I say that? What's more natural than selecting 5 stars for a movie or song? Actually, the intuitive nature of the star rating control is the cause of this subtle usability issue.

Ratings have multiple interpretations. Does 5 stars mean I loved a movie? Does it mean I would recommend it to my friends? Does it mean Netflix should send more movies like that one? Does it mean the cinematography/main actor/director alone was worth watching? Does it mean I would watch the movie if I'm sad/happy/with friends? 5 stars cannot imply all these things. And therein lies the trouble.

Intuitively, a 5-star control uses the "Liked it" scale. I expect most people treat it this way: 5 stars means I loved a movie. 1 star means I hated it. This, at least, works.

Meanwhile, software interprets these numbers. Netflix tries to send me movies that resemble other 5-star movies. iTunes tries to play 5-star songs more often than 3-star songs. Imposing this correlation introduces a tension into my decision. If I give this movie 5 stars, I know that I'll get a bunch more like it, even if I don't really want them.
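To make "play 5-star songs more often" concrete, here is a rough sketch of one way a player could turn stars into playback frequency. This is only an illustration under my own assumptions: I don't know how iTunes actually does it, and the `pick_next_song` function and the sample library are made up.

```python
import random

def pick_next_song(ratings, rng):
    """Pick one title at random, weighted by its star rating.

    ratings: dict mapping song title -> stars (1 to 5).
    rng: a random.Random instance, passed in so results are repeatable.
    Hypothetical helper, not any real player's algorithm.
    """
    titles = list(ratings)
    weights = [ratings[title] for title in titles]
    # Random.choices draws with probability proportional to each weight.
    return rng.choices(titles, weights=weights, k=1)[0]

if __name__ == "__main__":
    library = {"Song A": 5, "Song B": 3, "Song C": 1}  # made-up data
    rng = random.Random(0)
    plays = [pick_next_song(library, rng) for _ in range(9000)]
    for title in library:
        print(title, plays.count(title))
```

With a rule like this, the star I meant as "I loved it" quietly becomes "play this five times as often as a 1-star song", which is exactly the second meaning I never agreed to.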

An example helps: Braveheart is not a 5-star movie in my book. Yet I want to see more movies of that caliber. So should I give it 3 stars on the "Liked it" scale, or 5 stars on the "Send me more" scale? I liked Borat, but I usually hate that genre of movie. So I could give that movie 4 stars or 2 stars, depending on the interpretation.
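The conflict is easy to see if the two scales are written down as separate pieces of data. The sketch below is purely illustrative: the `Rating` record, the `send_more` flag, and both helper functions are hypothetical names of mine, not anything Netflix exposes.

```python
from dataclasses import dataclass

@dataclass
class Rating:
    title: str
    liked: int       # 1 to 5 on the "Liked it" scale
    send_more: bool  # hypothetical explicit "Send me more" answer

def seeds_from_stars(ratings, threshold=4):
    """What software does when it has only the stars to go on."""
    return [r.title for r in ratings if r.liked >= threshold]

def seeds_from_flag(ratings):
    """What I would actually want recommendations to be based on."""
    return [r.title for r in ratings if r.send_more]

my_ratings = [
    Rating("Braveheart", liked=3, send_more=True),
    Rating("Borat", liked=4, send_more=False),
]

print(seeds_from_stars(my_ratings))  # ['Borat']
print(seeds_from_flag(my_ratings))   # ['Braveheart']
```

The two lists do not overlap at all, so a single star value cannot carry both answers for these two movies.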

What is the proper way to incorporate a ratings system into an application if the end user's interpretation may differ from how the application makes use of the rating? I don't have a good answer other than to suggest that applications, including Netflix, should think very hard about how to incorporate the ratings system into their interface. There is a balance to strike between a design that is simple and easy to use and one that is complex enough to carry nuanced meaning.