“Beneath the Metadata: Some Philosophical Problems with Folksonomy” by Elaine Peterson appears in this month’s D-Lib. It’s an interesting piece, but I have a couple of quibbles.
Statistics and Democracy
Peterson’s main point is that folksonomy is philosophically relativistic about what something “really” means, compared to a controlled vocabulary employed by a professional cataloger. She writes,
A philosophy of relativism allows folksonomy to draw on many users with various perceptions to classify a document instead of relying on one individual cataloger to set the index terms for that item. Thus, classification terms become relative to each user. Certainly all individuals’ perceptions are influenced by their own experiences and cultures, whereas the professional cataloger, even if trying to be unbiased, has only one viewpoint. Yet to include all viewpoints opens up a classification scheme to the inconsistency that allows a work to be both about A and not about A. There is no question that an individual might have a personal, valid interpretation of a text. That is not the issue. The issue is that adding enough of those individual interpretations through tags can lead to inconsistencies within the classification scheme itself.
This seems to envision systems containing only a handful of personal folksonomies, all on an equal footing and therefore yielding a plurality of interpretations, and it concludes that a handful is too many because it’s more than one. On the contrary, I would insist that a handful is too few. With much larger numbers of users, the popularity of different tags makes it clearer which viewpoints are commonly held and which are fringe. A consensus emerges through statistics, without any explicit coordination among users.
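To make the statistical point concrete, here is a minimal sketch of how tag popularity could separate common viewpoints from fringe ones. The sample tags, the function names, and the 50% cutoff are all my own assumptions for illustration; nothing here is taken from del.icio.us, flickr, or any real tagging system.

```python
from collections import Counter

# Hypothetical data: each set holds one user's tags for the same item.
user_tags = [
    {"folksonomy", "tagging"},
    {"folksonomy", "metadata"},
    {"folksonomy", "tagging", "libraries"},
    {"tagging", "folksonomy"},
    {"folksonomy", "toread"},
    {"tagging", "metadata"},
]

def tag_shares(taggings):
    """Return {tag: fraction of users who applied that tag}."""
    # set(tags) guards against a user listing the same tag twice.
    counts = Counter(tag for tags in taggings for tag in set(tags))
    total_users = len(taggings)
    return {tag: count / total_users for tag, count in counts.items()}

def split_consensus(taggings, threshold=0.5):
    """Split tags into (common, fringe) by an assumed popularity cutoff."""
    shares = tag_shares(taggings)
    common = sorted(tag for tag, share in shares.items() if share >= threshold)
    fringe = sorted(tag for tag, share in shares.items() if share < threshold)
    return common, fringe

common, fringe = split_consensus(user_tags)
print("common:", common)
print("fringe:", fringe)
```

No user in this toy example coordinates with any other, yet “folksonomy” and “tagging” stand out simply because most users chose them. With only six users the split is fragile, which is the reason a handful is too few.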
What are the magic numbers of users and tags that make a folksonomy successful in this way? I have no idea, but it would make an interesting experiment. You could take random samples of items from del.icio.us or flickr or wherever, along with random samples of users. For each item, you’d present the most popular tags, varying the number of users whose tags are counted, and ask people to independently assess how well those tags describe the item. (It would take a lot of human trials, which is why I’m not exactly jumping on this one.)
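The sampling half of that experiment is easy to sketch, even if the human trials are not. The code below is only a rough outline under my own assumptions: the data is made up, the names `top_tags` and `build_stimuli` are hypothetical, and the step where people actually rate the tags is left out entirely.

```python
import random
from collections import Counter

# Hypothetical data: each set holds one user's tags for the same item.
item_taggings = [
    {"folksonomy", "tagging"},
    {"folksonomy", "metadata"},
    {"folksonomy", "tagging", "libraries"},
    {"tagging", "folksonomy"},
    {"folksonomy", "toread"},
    {"tagging", "metadata"},
]

def top_tags(taggings, n=3):
    """Return the n most popular tags among the given users."""
    counts = Counter(tag for tags in taggings for tag in set(tags))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [tag for tag, _ in ranked[:n]]

def build_stimuli(taggings, sample_sizes, n=3, seed=0):
    """For each sample size k, draw k random users and list their top tags.

    The resulting tag lists are what human raters would be asked to judge.
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    stimuli = {}
    for k in sample_sizes:
        sampled_users = rng.sample(taggings, k)
        stimuli[k] = top_tags(sampled_users, n)
    return stimuli

stimuli = build_stimuli(item_taggings, sample_sizes=[2, 4, 6])
for k, tags in stimuli.items():
    print(k, "users:", tags)
```

Plotting the raters’ suitability scores against the sample size would then show roughly where adding more users stops improving the popular tags, if such a point exists.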
As an aside from her central argument, Peterson also makes this curious assertion:
A final criticism one could make of folksonomies as classification systems is that their advocates seem to assume everything on the Internet needs to be organized and classified. Anyone who has a home library knows that this is not necessarily true. Everyday, individuals make critical assessments of information bits they encounter. Their first decision is whether or not to retain the information, and if so, how to organize it. Folksonomy advocates seem not to recognize that critical, first decision about retention. The free labor available to create folksonomies is appealing only to those who have already agreed that the entire Internet needs some organization and cataloging. However, rather than being retained and organized, many Internet items could be eliminated, ignored, or allowed to die off. Most people put into the wastebasket (physically or online) flyers, ads and newsletters, and would not bother to organize ephemera.
Do you bookmark every webpage you visit, or every photo you see on flickr, or whatever? I sure don’t. I only bookmark the things I think are worth retaining. I have to assume that, if something is bookmarked, it was important to somebody. Now, I suppose I can imagine a cadre of people out there (probably catalogers in a former lifetime) who sit down for hours at a time surfing web pages just to tag them. These people are decidedly in the minority. Retention isn’t just a part of folksonomy; it’s the primary motivation for regular users to engage in the “free labor” of organization at all.
Now, just because something was important to me doesn’t mean it’s “important” in general. But folksonomy is a decentralized effort, so it’s vital to realize that, just as the system does not involve a single cataloger, it does not involve a single collection development librarian either, with all the attendant advantages and limitations of that approach. However, the lack of a cataloger doesn’t mean there’s no organization, and likewise the lack of a collection development librarian doesn’t mean there’s no selection and weeding. Statistics and popularity are again our guide: an item that only a few people bookmark evidently hasn’t been considered as important or interesting as one that many people bookmark.