More Differences

Just when I finished my previous post on the problems with focusing on individual differences, I discovered this post, once again by Clay Shirky. He makes a good point about why it might be important to think about individual differences when it comes to tag use. He says:

Here’s what’s radical about what portends: My vocabulary on folksonomy is personal, not vernacular — no one knows or needs to know which class I’m talking about when I tag something ‘class’, or that I use LOC to mean Library of Congress. This isn’t the same as, say, the dictionary of thieves slang from the mid-18th c. because no one else needs to know my bookmark system, and I don’t need to know anyone else’s …

So pretty clearly the view is that an individual can do whatever s/he wants, in complete isolation from every other user, and the “radical” system will accommodate each individual equally. Thus, while many people might agree, the system is “ensuring that the emergent consensus view does not have to be pushed onto any given participant”.

So this is why individual differences matter: because they are allowed to exist. Taxonomies are bad because they force everyone to be the same; folksonomies are good because they allow everyone to be individuals. This is probably good ideology, but is it good science? Is it even interesting? Is it useful?

These are the questions I have been asking all along. Where, pray tell, is the evidence that highly unusual tags that differ from the “norm” are even useful? Here are some of my less frequent tags: adhoc, bar, and controlled. I have NO IDEA what any of them stand for! Maybe I am a bad tagger? So ignore me.

Here is an observation based on a single example (so take it with a grain of salt ... the same grain you use for all other single examples!): a few weeks ago the most popular tag for the New York Times was “news” with 2093 instances. The next few are “newspaper” with 550, “daily” with 370, “nyc” and “media” with 229. These tags kinda make sense ... but after this you have “english” with 15 instances, “business” with 17, and “noticias” with 16. These last few are a real mix. I wouldn’t be surprised if “business” turned out to be useless for most users: it sounds like a tag you add when there is pressure to improve on just “news”, but it is probably never used as a retrieval aid! (Clearly an empirical question, but would someone so interested in business news ever bookmark news sources that DIDN’T have business news?) “English” and “noticias” are interesting in that they appear to cater to internationalization needs. This is definitely an important requirement for category systems, which many don’t address adequately.

But the question is, is there a significant benefit after “news”, “newspaper”, and say, “daily”? How many users would have so many links that they would not locate the New York Times without also including “business”???

But apart from the usefulness issue, the really interesting observation is still the remarkable overall agreement. Surely the New York Times is “daily”, and it includes a “business” section ... so these are all equally valid features by which to reference it. But why do (roughly) 6 to 120 times as many people prefer “news” to the other two?
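For anyone who wants to check the size of that gap, here is a minimal sketch in Python that divides the count for “news” by the count for each other tag. It uses only the counts quoted above; the names `tag_counts` and `ratio_to_top` are made up for this illustration, and “nyc” and “media” are left out because it is unclear whether 229 applies to each tag or to both together.

```python
# Tag counts for the New York Times bookmark, as quoted in this post.
# "nyc" and "media" are omitted: the post does not say whether 229
# applies to each tag or to the two combined.
tag_counts = {
    "news": 2093,
    "newspaper": 550,
    "daily": 370,
    "business": 17,
    "noticias": 16,
    "english": 15,
}


def ratio_to_top(counts):
    """Return, for every tag except the most frequent one,
    how many times more often the most frequent tag was used."""
    top_tag = max(counts, key=counts.get)
    top = counts[top_tag]
    return {tag: top / n for tag, n in counts.items() if tag != top_tag}


ratios = ratio_to_top(tag_counts)

# Print from the smallest gap to the largest.
for tag, r in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"news vs {tag}: {r:.1f}x")
```

On these numbers “news” comes out about 5.7 times as frequent as “daily” and about 123 times as frequent as “business”. It is still a single example, so the same grain of salt applies.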

And then there is my thought experiment. If the evil millionaire convinced 100,000 people to tag the New York Times as “finglewick”, how long would that survive? What about “really super cool site”? What about “should subscribe”? Why are some of these tags good but others not? Why not “finglewick”? Why would some good tags that would probably generally be judged as appropriate (like “should subscribe”) not survive (in my opinion), while some others, like “news”, definitely do? The system lets us label the New York Times any which way we like. But in spite of infinite freedom we all label it “news”. Now THAT is news to me!

