Music Recommendations Update

The results from my second submission finally came back this morning. The server’s estimated time kept bouncing around from about 10 hours
at the start to “indetermined.” I’m not quite sure how that can happen when
the service is done on a first-come, first-served basis, but a number of
supposedly deterministic applications do exhibit stochastic behavior :-). At no
point, however, did the server tell me that it would take a week before I’d get
my recommendations back.

In any event, the results are interesting in a number of ways. Matching the eclectic nature of my tastes, the suggestions range from
Billie Holiday and Frank Sinatra to G. Love and Special Sauce. It even managed
to find rare artists like Django Reinhardt. Interesting, also, is that the
results included Jane Monheit, but not Norah Jones or Diana Krall. Of the
three, the last would be my favorite choice, due in no small part to her
ability as a pianist as well as a vocalist.

Also interesting are the three songs by Chris Botti. I’m not sure where they come from, because I’m one of those people who believe
that “smooth jazz” is an oxymoron. The only music in my collection that might
be considered “smooth jazz” would be one album by Lee Ritenour, and that’s Alive
in LA, an album distinctly lacking in
songs that one would consider “smooth jazz.”

In the comments, Steven asks how long it took me to rate my collection, and how I dealt with the “I like song X when I’m in mood Y”
problem. The answer to the first question is that it took about 45 minutes,
but, then, I’m quite familiar with everything that’s in my collection. I
didn’t use the method that’s described on the UICU web site. The answer to the
second is that I didn’t bother. The algorithm rather folds that into the
general method by picking songs of a similar nature. As I noted above, the
returned list is as eclectic as my tastes are, so the list of recommendations
will likely match the various moods I have when listening to music.

As I see it, there are three problems with this method. The first is rather simple: my collection doesn’t include songs I don’t like. As
such, there’s nothing that I would rate lower than 3 stars, and a large
majority of my songs are rated either 4 or 5 stars (my “Top Rated” song list
has 622 of the 880 songs in my library). This probably accounts for the Chris
Botti recommendations even though there’s a very small chance that I’ll like
any of Chris Botti’s music.
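
To make the skew problem concrete, here is a minimal sketch. It assumes the service compares libraries with something like cosine similarity, which is purely an assumption since the actual method isn't described here. The two listeners and their ratings are made up. When every rating sits between 3 and 5 stars, two people with opposite preferences still look nearly identical until the ratings are centered on each person's own average.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity over the songs that both listeners rated."""
    shared = [song for song in a if song in b]
    dot = sum(a[s] * b[s] for s in shared)
    norm_a = sqrt(sum(a[s] ** 2 for s in shared))
    norm_b = sqrt(sum(b[s] ** 2 for s in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no overlap, or no variation left after centering
    return dot / (norm_a * norm_b)

def centered(ratings):
    """Subtract the listener's own average so 'just ok' becomes negative."""
    mean = sum(ratings.values()) / len(ratings)
    return {song: stars - mean for song, stars in ratings.items()}

# Hypothetical listeners: every rating is 3 stars or higher,
# but the two disagree on which songs are the good ones.
me = {"song_a": 5, "song_b": 4, "song_c": 5, "song_d": 3}
other = {"song_a": 3, "song_b": 5, "song_c": 3, "song_d": 5}

raw_similarity = cosine(me, other)                          # above 0.9
centered_similarity = cosine(centered(me), centered(other))  # below zero
```

Centering helps only if the ratings vary at all. A library rated almost entirely 4 and 5 stars leaves very little to center.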

The second problem is more difficult to deal with. There is a common thread that runs through all of the music I like, but it’s really
difficult to quantify that thread. I like music that’s multi-dimensional. For
me to like it, the music needs to blend a couple of different styles in a way
that brings some combination of harmonic and rhythmic complexity to the
experience. Also, if the song lacks dynamics from loud to soft and back, then
the chances I’ll like it are greatly reduced. I like the blues, but not
necessarily straight-ahead blues. There are some blues guitarists who are
particularly adept at wringing the emotion out of a song, but, to be perfectly
honest, I’d much rather listen to the Allman Brothers’ Desdemona than Eric
Clapton’s version of Walkin’; or Stevie Ray Vaughan’s Riviera over his
rendition of Change. The basic emotions are the same, but
the ways in which they’re evoked are vastly different.

The basic problem with any collaborative filtering algorithm is that it sidesteps questions like how to quantify the
dimensionality of a piece of music by folding the whole issue into comparative
user preferences, which leads me to the third problem. The whole point of
doing this is to provide recommendations. Well, a recommendation is useless if
it suggests a piece of music I’ve already heard and already decided not to
include in my collection. One way around this is to limit recommendations to new
music. Regardless of how we might define “new,” this limitation will
significantly reduce the effectiveness of the algorithm. You run into a
catch-22. For the recommendations to match any individual’s preferences, the
database needs to have a large number of ratings in it. However, the more
ratings a song has, the less likely it is that any given individual will not
have heard the song and/or the artist before.
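
The catch-22 can be sketched in a few lines. Everything here is hypothetical: the song titles, rating counts and predicted scores are invented, and using the number of ratings as a stand-in for "probably already heard" is my assumption, not something the service is known to do. A minimum rating count keeps the prediction trustworthy, a maximum keeps the song plausibly new, and the window between the two closes quickly.

```python
def recommend(candidates, library, min_ratings, max_ratings):
    """Keep songs outside the library whose rating count falls between
    the two thresholds, highest predicted score first."""
    picks = [
        c for c in candidates
        if c["title"] not in library
        and min_ratings <= c["ratings"] <= max_ratings
    ]
    return sorted(picks, key=lambda c: c["score"], reverse=True)

# Hypothetical candidates: "ratings" is how many users rated the song,
# "score" is the predicted star rating for this listener.
candidates = [
    {"title": "Song A", "ratings": 5000, "score": 4.8},
    {"title": "Song B", "ratings": 1200, "score": 4.5},
    {"title": "Song C", "ratings": 40, "score": 4.9},
    {"title": "Song D", "ratings": 3, "score": 5.0},
]
library = {"Song B"}

# No thresholds: the top pick rests on only three ratings.
unfiltered = recommend(candidates, library, 0, float("inf"))

# Well-supported but not ubiquitous: the only song that qualifies
# is one already in the library, so nothing is left to recommend.
filtered = recommend(candidates, library, 100, 2000)
```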

Because of this basic problem, of the 50 songs that the program recommended for me, I might actually
buy one or two. None of them are songs that I’ve since listened to and
decided I have to go out and buy. Based on this, we’d have to consider the
exercise a failure. On the other hand, no experiment is a failure, because we
at least learn something about the limitations of any given algorithm or
approach. So, I still encourage people to submit their libraries. If nothing
else, it will help us to figure out more effective ways to give people tools
that will help them to expand their musical/artistic horizons.



Comments (3)

  1. Scott says:

    I only have two songs in my iTunes library. That one by Jet that they use in the commercial "Are you gonna be my girl" and one by DMX.

    It’ll be interesting to see what they recommend for me. If it’s just a mean, probably lots of Limp Bizkit 😀

  2. steven vore says:

    thanks for the answers, Rick.

    "my collection doesn’t include songs I don’t like" – that’s a big difference. I tend to rip a whole CD at a time, and frequently while I like almost the whole thing, there will be 1 or 2 that I *really* like and 1 or 2 that are just ok.

    I’m impressed with 45 minutes; with almost 2100 songs I think I’d be hard pressed to rate them in under a few hours – but I’d probably have to listen to at least a sample of each.

  3. Not rating songs you don’t like seems like a bad thing, if you don’t leave out that information, the algorithm should have a much better chance of giving "good recommendations". ( gives me rather good recommendations, my biggest problem with it is that it recommends things I’ve bought elsewhere and haven’t told it about).

    And a week to get back recommendations seems awfully long.
