Solving Tag-Hell


Backup Brain recently highlighted one of the problems with the tagosphere – the ‘which tag do I use’ problem.

I say ‘one’ of the problems, because there are at least three tagospheric problems that are creating a Tag-Hell:

1. When in content tagging mode which tag should I use for this post or photo? Example by Dori Smith:

Even though a proposed tag was advertised to the MSN Search Champs event attendees, chaos reigned anyhow – that’s human nature and not an atypical scenario.

2. When in content search mode which tag should I use to search and find the content I mean to find? Example by Dori Smith:

“Nine different searches, nine different sets of results; but all of them are, at their heart, looking for the exact same thing. That’s not working, by any meaning I know for the word.”

3. Another tag search problem. If I’m looking for ‘foo’, I may be looking for ‘foo’bar rather than kung’foo’ (please – no correction comments: this was a poor attempt at a play on words that don’t exist). Example:

The above tags are similar and they are different. But are they different enough to merit different tags, and are they similar enough to force consolidation? Yes and Affirmative I’d say. Do those who use the ‘taxonomy’ tag really mean something so different from those using ‘tagsonomy’ or ‘folksonomy’? Maybe.

Tag-Hell Solutions

So as a possible solution, do we all delegate some uber-semantic-taxonomiz(s)ation-folkosonimis(z)ation-committee-slash-group to solve it on our behalf?  What would they call themselves, and would we agree to call the group by their chosen label?  And if this elite assemblage of chosen ones could agree a common ontology, how would it be ‘enforced’? Would it be used? How could their system let language evolve, let memetic forces do their thing and yet keep our tagging habits in order?

Given that the people who are actually interested in this stuff can’t even agree, do we really expect them to solve the problem for us?

Or, do we continue the present course and make a few degrees course correction? If so, what could we do to solve these three tag issues?

The current tagosphere’s course could be described as the ‘order emerging from chaos’ course – let the market decide, the market being us. It’s the bottom-up approach. But as Dori has shown, we clearly need help.

What’s missing from the lists of tags above is the tagosphere’s feedback and signals back to the user that can help the user decide which tags to use when tagging content and searching for tagged content.

Tag Data

There are three types of tag data that can act as signals, which I’m calling Aggregate Tag Counts, Related (or Relational) Tags and Social Tags. (I’m sure there are proper folksonomic terms for these but I don’t know what they are.)

Aggregate Tag Counts

The numbers (that I’ve made up for illustration) represent the number of posts / items that have been tagged with each tag.  If you saw this metadata and felt that the tags ‘tag’ and ‘tags’ meant the same thing (to you), I’d say you’d go with the flow. It’s human nature – we like to fit in. So the feedback has a reinforcing effect.

In search mode, this feedback could also help locate where the majority of the content being searched or browsed for exists. Technorati used to provide this metadata as part of tag search results, but this feature has, alas, recently disappeared. Since I use Technorati to help me decide which tags to use, I find this loss unhelpful: without the counts I’m left groping for the tag that has reached critical mass among other users.

This lack of feedback can only make tagnoise worse.
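As a rough sketch of how this kind of feedback could work, the snippet below counts how many posts carry each tag and then suggests the most widely used of a set of tags the user considers equivalent. The sample posts, the counts and the function names are all made up for illustration; nothing here is taken from Technorati or any other real tagging service.

```python
from collections import Counter

# Made-up sample data: each entry is the set of tags on one post.
posts = [
    {"tags", "folksonomy"},
    {"tag", "tagging"},
    {"tags", "tagging"},
    {"tags"},
    {"folksonomy", "tagsonomy"},
]


def aggregate_tag_counts(posts):
    """Count how many posts carry each tag."""
    counts = Counter()
    for tags in posts:
        counts.update(tags)
    return counts


def most_used(candidates, counts):
    """Of the tags the user treats as equivalent, pick the most widely used."""
    # A Counter returns 0 for tags it has never seen, so unknown tags are safe.
    return max(candidates, key=lambda tag: counts[tag])


counts = aggregate_tag_counts(posts)
suggestion = most_used(["tag", "tags"], counts)
```

Shown next to the tag entry box or the search box, a suggestion like this is the "go with the flow" signal described above: it nudges the user towards the variant that already has critical mass.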

Related (or Relational) Tags

Another feedback signal, more widely used by tag-aware systems, is related tags. I’ve explored this topic previously so I won’t labour the point other than to say that –

Related tags should be:

  • bi-directional
  • represented by their relative strength to other tags
  • surfaced to the user in tagging and search modes.

In Del.icio.us the same tags ‘software‘ and ‘programming‘ are mostly bi-directional and have relationships to other tags but their relative strengths are not represented.
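To make the first two properties concrete, here is a minimal sketch that derives related tags from co-occurrence on the same post. The sample posts and function names are invented, and Jaccard similarity is just one assumed way to express relative strength; it is not how Del.icio.us or any other service is known to do it. Because the score is symmetric, the relationship comes out bi-directional by construction.

```python
from collections import Counter
from itertools import combinations

# Made-up sample data: each entry is the set of tags on one bookmark.
posts = [
    {"software", "programming", "tools"},
    {"software", "programming"},
    {"software", "reviews"},
    {"programming", "python"},
]


def related_tags(posts):
    """Map each tag to its related tags, with a strength between 0 and 1."""
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in posts:
        tag_counts.update(tags)
        # Sorting gives every pair one canonical order, so (a, b) == (b, a).
        for a, b in combinations(sorted(tags), 2):
            pair_counts[(a, b)] += 1

    strength = {}
    for (a, b), both in pair_counts.items():
        either = tag_counts[a] + tag_counts[b] - both
        score = both / either  # Jaccard: shared posts / posts with either tag
        strength.setdefault(a, {})[b] = score  # a -> b
        strength.setdefault(b, {})[a] = score  # b -> a (bi-directional)
    return strength


def top_related(tag, strength, n=3):
    """Return the n strongest related tags for display to the user."""
    ranked = sorted(strength.get(tag, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


strength = related_tags(posts)
best_for_software = top_related("software", strength, n=1)
```

The scores are what a UI would need in order to show not just that ‘software’ and ‘programming’ are related, but how strongly compared with their other neighbours.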

Social Tags

It might help to leverage the social dimension of tags. An idea proposed by Scott Koon (aka Lazycoder) is that you could use your social circle or reading list / OPML file to navigate tagged content, so that in your aggregator / feedreader / tagware you can either locate content tagged with tags that you’ve pre-defined, or browse a tagcloud scoped to your OPML file / friends list.

It’s a nice idea and plays squarely in the Attention space.

Let’s take it further – could we use the tags used by your reading list / OPML cloud to help decide which tags to use when tagging or searching content? The premise is that if I find you interesting, then I’m likely to find the tags you find interesting, so if I have a choice between ‘folksonomy’ and ‘tagsonomy’ I might choose the former because you use it. In terms of solving Tag-Hell, I’m not sure if this would be as helpful as the other two types of tag data described earlier, Aggregate Tag Counts and Related (or Relational) Tags, but there may be something in it. Who knows? (…and if you’ve read this far, you might well be thinking, who cares?).
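A minimal sketch of that premise might look like the following. It assumes the reading list has already been reduced to a plain set of author names (parsing a real OPML file is left out), and the authors, posts and function names are hypothetical. Tags are ranked first by how often people on the reading list use them, with overall usage as a tie-breaker.

```python
from collections import Counter

# Hypothetical data: (author, tags) for each tagged post.
tagged_posts = [
    ("dori", {"folksonomy", "tags"}),
    ("scott", {"folksonomy"}),
    ("stranger1", {"tagsonomy"}),
    ("stranger2", {"tagsonomy"}),
    ("stranger3", {"tagsonomy"}),
]

# Assumption: the authors found in my reading list / OPML file.
reading_list = {"dori", "scott"}


def tag_counts_for(tagged_posts, authors):
    """Count tag usage, restricted to posts written by the given authors."""
    counts = Counter()
    for author, tags in tagged_posts:
        if author in authors:
            counts.update(tags)
    return counts


def suggest(candidates, tagged_posts, reading_list):
    """Prefer the tag my reading list uses; fall back to overall usage."""
    everyone = {author for author, _ in tagged_posts}
    social = tag_counts_for(tagged_posts, reading_list)
    overall = tag_counts_for(tagged_posts, everyone)
    return max(candidates, key=lambda tag: (social[tag], overall[tag]))


choice = suggest(["folksonomy", "tagsonomy"], tagged_posts, reading_list)
```

With this data the social signal picks ‘folksonomy’ even though ‘tagsonomy’ is more common overall; with an empty reading list the same function falls back to the aggregate count.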

Tagspace Browsers

Quick note on these: Tagspace browsers are nice and fun tag discovery tools but really don’t help me decide which tags to use, nor find content.

At last, the last sentence of this post

What I’d really like to see are the Aggregate Tag Counts and Related Tags data surfaced by all tag-aware systems.

Actually, the following is the last sentence of this post, sorry

If these aids became standard UI features of tag-aware systems (they do exist in some), we could pull ourselves out of Tag-Hell.


Comments (26)

  1. MSDNArchive says:

    Manual trackback:

    ‘Social Tagging’ – Technicalities blog, Paul Dundon: http://paulstechnicalities.blogspot.com/2006/02/social-tagging.html

  2. Anne Zelenka says:

    Alex, have you seen Clay Shirky’s article "Ontology is Overrated"? It’s here:

    http://www.shirky.com/writings/ontology_overrated.html

    In it, he discusses using URLs to merge categories rather than trying to merge categories directly:

    "You don’t merge tagging schemes at the category level and then see what the contents are. As with the ‘merging ISBNs’ idea, you merge individual contents, because we now have URLs as unique handles. You merge from the URLs, and then try and derive something about the categorization from there. This allows for partial, incomplete, or probabilistic merges that are better fits to uncertain environments — such as the real world — than rigid classification schemes."

    I don’t think what we should be aiming for is a Yahoo-style ontology, which is what I initially thought tagging was about and what Dori seems to want.

  3. MSDNArchive says:

    Thanks for the pointer Anne, yes I saw (and heard) Shirky’s essay a while ago. It really shaped a lot of my thinking on this.

    Agreed that the best way is probably the bottom-up approach…

  4. Scott says:

    "Let’s take it further – could we use the tags used by your reading list / OPML cloud to help decide which tags to use when tagging or searching content. "

    See, I’m thinking along another line. What if, when you tag something, all of its other tags are mapped to your tag, either automagically or manually by the user? Specifically, if you tag something that a person known to you had tagged with a different tag, the new tag is mapped to your tag. So any search you initiate which includes your tag now also includes the tag of the known person.

    So if I tag a post of Dori’s with "AppleOSX" but she’s tagged it with "OSX", my tag search will look at not only my personal tags during my next search for "AppleOSX", but also at least glance around at other tags, like "OSX", associated with the original post that are related by that content and present any content associated with them to me. Maybe a section titled, "You haven’t tagged these, but they share a tag with something you have tagged." Some sort of Bayesian inference based on the content along with a neural net or some kind of simulated annealing or another Monte Carlo type routine could limit the sheer volume and hopefully hit the target more than miss.

    The problem is it kind of requires people using tags with a laser-like precision. Otherwise after a time, a tag search ends up returning what looks like a Google/MSN/Yahoo search. So I’d suggest only extending the "Friend of a Tag" motif to people in my trust-zone. Friends/family/co-workers/people I know not to be stupid or indiscriminate taggers. What do you think?

  5. My pet peeve is that some sites space separate tags, and others use the comma. Argh.

  6. Jack Vinson says:

    David Weinberger said it well in his Release 1.0 piece last year on tagging. (paraphrase) It doesn’t matter that you can’t find every last item related to your topic. There are too many already. What matters is that you can find _something_ and that you can also use knowledge about who is doing the tagging to inform you as to the quality of the underlying link.

  7. Frank Smadja says:

    Hi Alex –

    Great discussion.  I have read this and your other post on tag relationships.  I believe that the right approach is based on the following 2 points:

    1- let the "advanced user" define tag relationships

    2- Use clustering and other statistical based techniques to infer tag relationships for the searcher and the less advanced tagger.

    This is the approach we are working on at RawSugar.  Also I have written a short paper with 2 other (external) researchers on tag clustering and I’d be happy to send it to you if you’re interested.  You can email me at myfirstname@mycompanyname.com

    Thanks for the great discussion.

    Frank

  8. Keven says:

    Things I read about Web 2.0 (including Library 2.0) while browsing websites over the Spring Festival, with some brief commentary.

  9. David Weinberg says:

    Interesting thread Alex …  I’ll stay tuned.

    Looks like the tag paradigm is set to remain in a state of flux for some time yet. I guess this is a healthy reflection of the high levels of innovative activity in that space.

    DumbFind’s new tag based search engine launched last week is a case in point. Steering clear of the old fashioned keyword-only search models, DumbFind wears its tag oriented search paradigm on its sleeve (www.dumbfind.com).

    It helps that it is one of the few search engines to maintain its own search index, making DumbFind independent of the major players. That independence will become increasingly important as other players like Google cave in under pressure from China, for example, and start the regrettable practice of customizing their content to suit the censorship whims of local politics.

    Let the tagging debate continue!

  10. Don Demsak says:

    Alex, you must have missed my post "Random Acts of Senseless Tagging" http://donxml.com/allthingstechie/archive/2005/11/14/2272.aspx

    The Publishing Industry already has an XML based language that will handle expressing tags and tag hierarchies called Controlled Vocabs: http://www.prismstandard.org/specifications/1.2/modularized/PRISM_controlled_vocabulary_namespace_12.pdf

    The problem is that there is no Web Service API (currently) around this spec, which is what I am working on.

    Microsoft sites like The Working Network, http://theworkingnetwork.com/blogs/blog/Default.aspx and CodeZone would be perfect uses of a technology like this.

  11. dumbfounder says:

    Dumbfind is definitely tackling this problem. We have built what we call a "relationship index" that shows the connections between all tags in the system. When you search for keywords in combination with a tag on Dumbfind, we produce results for the keywords, and then actually compute the similarity between each of the tags assigned to the results and the tag you search for. It is much more than a simple term expansion, and much different than a traditional inverted index.

  12. James Corbett suggests Raw Sugar may help out with the Tag Hell I described a little while ago:

    "Raw…

  13. MSDNArchive says:

    Raw Sugar is an advanced Social Bookmarking system that John Tropea has been recommending for ages now…

    http://eirepreneur.blogs.com/eirepreneur/2006/03/raw_sugar_trump.html

  14. MSDNArchive says:

    I just read a post on Alex Barnett’s blog, Solving Tag-Hell, which raises a tagging problem that many people run into: which tag should you use? For the same subject matter, everyone has their own favourite tags.

    http://blog.tinyau.net/archives/2006/02/04/current-problem-of-tagging/

  15. (Warning, this is a highly unstructured, random-thoughts-externalized-type post)

    First a quick definition:…

  16. Clearly a classic post on the topic of tagging, but I’ve not seen it before: A cognitive analysis of…
