The split personality critic

The 5-star rating control is misused in most applications today.  How can I say that?  What's more natural than selecting 5 stars for a movie or song?  Actually, the intuitive nature of the star rating control is the cause of this subtle usability issue.

Ratings have multiple interpretations.  Does 5 stars mean I loved a movie?  Does it mean I would recommend it to my friends?  Does it mean Netflix should send more movies like that one?  Does it mean the cinematography/main actor/director alone was worth watching?  Does it mean I would watch the movie if I'm sad/happy/with friends?  5 stars cannot imply all these things.  And therein lies the trouble.

Intuitively, a 5-star control uses the "Liked it" scale.  I expect most people treat it this way:  5 stars means I loved a movie.  1 star means I hated it.  This, at least, works.

Meanwhile, software interprets these numbers.  Netflix tries to send me movies that resemble other 5-star movies.  iTunes tries to play 5-star songs more often than 3-star songs.  Imposing this correlation introduces a tension into my decision.  If I rate a movie 5 stars, I now know that I'll get a bunch more like it, even if I didn't really want them.

An example helps:  Braveheart is not a 5-star movie in my book.  Yet I want to see more movies of that caliber.  So should I give it 3 stars on the "I liked it" scale, or 5 stars on the "Send me more" scale?  I liked Borat, but I usually hate that genre of movie.  So I could give that movie 4 stars or 2 stars depending on the interpretation.
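To make the two scales in this example concrete, here is a minimal sketch of storing them as separate fields.  This is only an illustration, not how Netflix or iTunes actually work; the names `Rating`, `liked`, `send_more`, and `recommendation_signal` are all made up for this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rating:
    """One user's opinion of one title, split into two scales (hypothetical model)."""
    title: str
    liked: int                       # 1-5 on the "I liked it" scale
    send_more: Optional[int] = None  # 1-5 on the "Send me more" scale, if given


def recommendation_signal(rating: Rating) -> int:
    """Pick the number a recommender would use.

    Prefer the explicit "Send me more" answer; fall back to the
    "I liked it" score only when the user did not give one.
    """
    if rating.send_more is not None:
        return rating.send_more
    return rating.liked


# The two movies from the example above.
braveheart = Rating("Braveheart", liked=3, send_more=5)
borat = Rating("Borat", liked=4, send_more=2)
```

With a single star control, the software has to guess which of the two numbers I meant.  With both stored, the stars I display can stay on the "I liked it" scale while the recommender reads the other one.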

What is the proper way to incorporate a ratings system into an application if the end user's interpretation may differ from how the application makes use of the rating?  I don't have a good answer other than to suggest that applications, including Netflix, should think very hard about how to incorporate the ratings system into their interface.  There is a balance to strike between a simple design that is quick to use and a more complex one that captures nuanced meaning.

Comments (3)

  1. Doug says:

    If your two meanings are in tension with one another, one must give way.

    For example, a file folder on the computer screen behaves totally differently from a file folder on a real desk.  Yet, we understand from the context what each is supposed to do.

    A familiar name/metaphor is often used to get people to try something new and the new thing often usurps the old meaning which is soon forgotten.

  2. Dan says:

    An interesting idea that I like more than ratings is using a [fixed] set of tags to tag items rather than using ratings.

    For example, on the Facepunch Forums (Garry’s Mod for Half-Life 2) you can rate posts… but not on a scale.  Instead, you get a series of 12 or so tags you can mark a post with (Agree, Disagree, Good Idea, Helpful, Poor Spelling, Artistic, etc).

    For a movie, instead of having a scale for a rating, you could have options like: Liked, Disliked, Would Recommend, Like Genre, Favorite Movie, etc, which might give more valuable information, depending on what you’d need it for.

    As a side note, Facepunch adds another twist by giving YOU a label based on how you label others’ posts.  I was having a grand ol’ time giving people’s posts thumbs down and wrong spelling icons, and the forums branded me an a**hole.  D’oh!

  3. benkaras says:

    I like the power of the ratings control since it fits a natural user thought model.  

    I like the flexibility of adding data about how I arrived at my rating (à la Slashdot’s moderation or Facepunch Forums).

    I desire the ability to rank several attributes orthogonally.

    I fear that I will not be rewarded for my time doing so.
