Tuesday, January 24, 2006

The Nature of Tagging: Public vs Private

By implication, tags are public. But should they be public or private? That's been the ongoing debate between me and Chris. Chris's stance is that tags should be private, because he only cares how he classifies things, not how everyone else does. My stance is that the power of public tagging lies in searching everyone's content, not in classifying individual content.

Tagging is no more than putting keywords on things to group them. And if tags were private, it would be no more than that. Gmail, for example, has a new feature that allows the user to group contacts. Public tagging, by contrast, gets its power from the aggregation and feedback loop of lots of people's sets of tags. The statement that you don't care how other people classify things holds only if you don't use the other aspect of tagging, namely, searching for new information through tags. If you only use tagging for classification, then it makes no difference whether the tags are public or private. The power of tagging is not that it's a better classification system; it's a way of utilizing humans to create a search index.
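To make the "search index" idea concrete, here is a minimal sketch of how many people's tags could be aggregated into one index. The function names, the (user, item, tag) triple format, and the example URLs are all made up for illustration; this is not how del.icio.us or any other real service is implemented.

```python
from collections import defaultdict


def build_tag_index(taggings):
    """Aggregate (user, item, tag) triples into tag -> item -> set of users.

    Hypothetical data model: each triple means "this user put this tag
    on this item". Tags are lowercased so "Python" and "python" merge.
    """
    index = defaultdict(lambda: defaultdict(set))
    for user, item, tag in taggings:
        index[tag.lower()][item].add(user)
    return index


def search(index, tag):
    """Return items carrying the tag, most widely tagged first."""
    items = index.get(tag.lower(), {})
    # Rank by number of distinct users; break ties by item name.
    return sorted(items, key=lambda item: (-len(items[item]), item))


# Invented example data: three people tagging three bookmarks.
taggings = [
    ("alice", "http://example.com/a", "python"),
    ("bob", "http://example.com/a", "Python"),
    ("bob", "http://example.com/b", "python"),
    ("carol", "http://example.com/c", "cooking"),
]
index = build_tag_index(taggings)
results = search(index, "python")
```

With private tags, each person would only ever see their own rows. The ranking by number of distinct users is the part that only exists when the tags are pooled.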

How could an index made up of an aggregation of other people's tags be anything other than a hodgepodge of random keywords? It would be a hodgepodge only if the set of tags that one person uses on a particular object were independent of (uncorrelated with) everyone else's set of tags. We know that to be untrue for bookmarks, thanks to del.icio.us's work. This is because the words that describe a piece of content are correlated with each other. Therefore, there is a limited set of keywords that people will use for any single piece of content. In fact, it's because of this that search engines can do their job. So while one person's set of keywords may not be the same as another's, on average they will overlap. Thus, searching for more content on the same subject is something public tagging does really well. And that's a task lots of people do on the internet: they're looking for related content.
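The overlap argument can be sketched in a few lines. The example below uses Jaccard similarity (shared tags divided by all tags used) as one possible measure of overlap; that choice, the function names, and the sample tag sets are my own assumptions for illustration, not something taken from del.icio.us.

```python
def jaccard(a, b):
    """Overlap between two tag collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_items(item_tags, target):
    """Rank other items by how much their pooled tags overlap the target's.

    item_tags is a hypothetical mapping of item -> set of all tags that
    anyone has put on it. Items with no shared tags are dropped.
    """
    target_tags = item_tags[target]
    scored = [
        (jaccard(target_tags, tags), other)
        for other, tags in item_tags.items()
        if other != target
    ]
    # Highest overlap first; break ties by item name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored if score > 0]


# Invented example: pooled tags for four bookmarks.
item_tags = {
    "a": {"python", "tutorial", "programming"},
    "b": {"python", "programming"},
    "c": {"cooking", "recipes"},
    "d": {"tutorial", "cooking"},
}
related = related_items(item_tags, "a")
```

If people's tags really were uncorrelated, the scores would be near zero across the board and the ranking would be noise. The claim in this post is that, at least for bookmarks, they are not.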

Now, will [public] tagging work for classifying or searching all other objects? With all the rage about Web 2.0, some people think that everything should be tagged. Not necessarily, I say.

Tagging has been demonstrated to work if an object has intrinsic properties; a book is a book to everyone. But what if you're tagging your relationship to an object? I think that changes the nature of tagging a little bit, because the relationship to an object is not the same for all people, and more importantly, sometimes these relationships are non-transferable between people. A book will always be a book to anyone, but a book that your father gave you on your eighteenth birthday carries a relationship between that book and only you, not between that book and anyone else.

That said, there is sometimes a shared (collective) relationship between groups of people and an object. A group of people might feel that Lover's Lane is a special teenage place for them, but not everyone in the world would share that relationship to Lover's Lane. Everyone can agree, though, that Lover's Lane is a make-out place, because that's an intrinsic property of the object. Therefore, I will have to soften my earlier hard position that tags describing relationships to objects would be uncorrelated. They will be less correlated than tags on intrinsic properties of an object, but some degree of correlation would exist. And therefore, you can search for other objects that have the same relationship to people.

So should tags be public or private? I think that objects with intrinsic properties that people are searching for are very conducive to public tags. However, objects such as 'gifts I received for my birthday' might not be conducive to public tags. You might not care what everyone else received for their birthday. In addition, since they are different people than you, you might not want the same things they got.

I think the application of tags to objects in a system should be carefully scrutinized, rather than assuming that everything is good for tagging. Are there instances of objects that should be privately tagged as opposed to publicly tagged?
