16 November 2006

Philosophical Misunderstandings about Folksonomy?

Beneath the Metadata: Some Philosophical Problems with Folksonomy by Elaine Peterson appears in this month’s D-Lib. It’s an interesting piece, but I have a couple of quibbles.

Statistics and Democracy

Peterson’s main point is that folksonomy is philosophically relativistic about what something “really” means, compared to a controlled vocabulary employed by a professional cataloger. She writes,

A philosophy of relativism allows folksonomy to draw on many users with various perceptions to classify a document instead of relying on one individual cataloger to set the index terms for that item. Thus, classification terms become relative to each user. Certainly all individuals’ perceptions are influenced by their own experiences and cultures, whereas the professional cataloger, even if trying to be unbiased, has only one viewpoint. Yet to include all viewpoints opens up a classification scheme to the inconsistency that allows a work to be both about A and not about A. There is no question that an individual might have a personal, valid interpretation of a text. That is not the issue. The issue is that adding enough of those individual interpretations through tags can lead to inconsistencies within the classification scheme itself.

This seems to envision a system with a handful of personal folksonomies, all on a level playing field and therefore yielding a plurality of interpretations, and it concludes that a handful is too many because it’s more than one. On the contrary, I would insist that a handful is too few. With much larger numbers of users, it becomes clearer which are commonly held viewpoints and which are fringe ones, simply through the popularity of different tags. A consensus emerges through statistics, without explicitly coordinating users.
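To make the statistical idea concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the user names, the tags, the `tag_consensus` function, and the 50% cutoff between "consensus" and "fringe". It does not reflect how del.icio.us, flickr, or any other real service computes tag popularity.

```python
from collections import Counter

def tag_consensus(taggings, min_share=0.5):
    """Split one item's tags into consensus and fringe tags.

    taggings:  dict mapping a user name to the set of tags that user
               applied to the item (hypothetical data structure).
    min_share: assumed cutoff; a tag used by at least this fraction
               of users counts as consensus.
    """
    n_users = len(taggings)
    # Count each tag once per user who applied it.
    counts = Counter(tag for tags in taggings.values() for tag in set(tags))
    shares = {tag: count / n_users for tag, count in counts.items()}
    consensus = {t: s for t, s in shares.items() if s >= min_share}
    fringe = {t: s for t, s in shares.items() if s < min_share}
    return consensus, fringe

# Four made-up users tagging the same article.
taggings = {
    "user1": {"folksonomy", "tagging"},
    "user2": {"folksonomy", "metadata"},
    "user3": {"folksonomy", "tagging", "toread"},
    "user4": {"folksonomy", "philosophy"},
}

consensus, fringe = tag_consensus(taggings)
print("consensus:", sorted(consensus))
print("fringe:", sorted(fringe))
```

With only four users the split is crude, which is rather the point: the more users there are, the more clearly the shares separate the common readings from the idiosyncratic ones.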

What are the magic numbers for users and tags that make a folksonomy successful in this way? I have no idea, but it would be an interesting experiment. You could take random samples of items from del.icio.us or flickr or wherever, along with random samples of users. You’d have to present the popular tags for items, varying the number of users’ tags included, and ask people to independently assess the suitability of the popular tags for describing the item. (It would take a lot of human trials, which is why I’m not exactly jumping on this one.)
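The sampling half of that experiment could be sketched roughly as follows, again in Python. The function names (`popular_tags`, `build_trials`), the data, and the sample sizes are all assumptions made up for this example, and the hard part (having people judge the resulting tag lists) is not something code can stand in for.

```python
import random
from collections import Counter

def popular_tags(taggings, n_users, top_n=3, rng=None):
    """Most popular tags for one item, using only n_users randomly chosen users.

    taggings: dict mapping a user name to the set of tags that user applied
              (hypothetical structure, not any real service's format).
    """
    if rng is None:
        rng = random.Random()
    # Sort the user names so that a seeded rng gives a repeatable sample.
    users = rng.sample(sorted(taggings), min(n_users, len(taggings)))
    counts = Counter(tag for user in users for tag in taggings[user])
    return [tag for tag, _ in counts.most_common(top_n)]

def build_trials(taggings, sample_sizes, seed=0):
    """For each sample size, produce the tag list a human rater would assess."""
    rng = random.Random(seed)
    return {n: popular_tags(taggings, n, rng=rng) for n in sample_sizes}

# Made-up tag assignments for a single bookmarked item.
taggings = {
    "user1": {"folksonomy", "tagging"},
    "user2": {"folksonomy", "metadata"},
    "user3": {"folksonomy", "tagging", "toread"},
    "user4": {"folksonomy", "philosophy"},
}

trials = build_trials(taggings, sample_sizes=[1, 2, 4])
for n, tags in trials.items():
    print(f"{n} user(s): {tags}")
```

Each entry in `trials` is one condition to show to raters; comparing their suitability ratings across sample sizes is what would hint at how many users are "enough".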

Weeding

As an aside from her central argument, Peterson also makes this curious assertion:

A final criticism one could make of folksonomies as classification systems is that their advocates seem to assume everything on the Internet needs to be organized and classified. Anyone who has a home library knows that this is not necessarily true. Everyday, individuals make critical assessments of information bits they encounter. Their first decision is whether or not to retain the information, and if so, how to organize it. Folksonomy advocates seem not to recognize that critical, first decision about retention. The free labor available to create folksonomies is appealing only to those who have already agreed that the entire Internet needs some organization and cataloging. However, rather than being retained and organized, many Internet items could be eliminated, ignored, or allowed to die off. Most people put into the wastebasket (physically or online) flyers, ads and newsletters, and would not bother to organize ephemera.

Do you bookmark every webpage you visit, or every photo you see on flickr, or whatever? I sure don’t. I bookmark only the things that are worth retaining to me. I have to assume that, if something is bookmarked, it was important to somebody. Now, I suppose I can imagine a cadre of people out there (who were probably catalogers in a former lifetime) who sit down for hours at a time surfing web pages just to tag them. These people are decidedly in the minority. Retention isn’t just a part of folksonomy; it’s the primary motivation for regular users to engage in the “free labor” of organization at all.

Now, just because something was important to me doesn’t mean it’s “important” in general. But folksonomy is a decentralized effort, so it’s vital to realize that, just as the system does not involve a single cataloger, it does not involve a single collection development librarian, with all the attendant advantages and limitations of that approach. However, the lack of a cataloger doesn’t mean there’s no organization, and likewise the lack of a collection development librarian doesn’t mean there’s no selection and weeding. Statistics and popularity are again our guide: if only a few people bookmark an item, it evidently hasn’t been considered as important or interesting as one that many people bookmark.

6 November 2006

Safari Books Online

Sarah Houghton-Jan at Librarian in Black wrote about problems using Safari Books Online in a library setting. I left a comment saying that this was unfortunate, because we have Safari at work and I really like it. (Safari is a service from O’Reilly and Pearson offering e-books from a number of publishers, mostly on computers and technology.)

This prompted the following note:

Hello, I read your comment on librarianinblack blog about Safari. You mentioned that you appreciated it and thought there might be a problem with the Proquest Interface. How does your library get Safari? Directly from Safari, then? And, it is not a proquest subscription purchase?

Thanks, -DN Dussan (left in comments on another post)

Well, I don’t have a “library” per se; I work at a small consulting firm with a handful of people, so we just have individual accounts, which are available directly from Safari (safari.oreilly.com). At www.safaribooksonline.com, I noticed that there’s some information on corporate licensing and libraries, although they’re pretty silent about the fact that library access comes through ProQuest. As it turns out, ProQuest has a deal for exclusive distribution of Safari content to academic, public, and school libraries.

So, I guess the only recourse for dissatisfaction with ProQuest’s interface is to appeal to ProQuest. Then again, it can’t hurt to contact Safari directly with your concerns as well, because from what I’ve seen, O’Reilly is pretty committed to offering a useful, usable product. They seem to be quite interested in the academic market, because they also offer SafariU, a service for professors to remix and mash up books to create the ideal coursepack/textbook for their courses.

6 September 2006

Evergreen is live!

Just wanted to give a big hurrah and congratulations to the guys at PINES for launching their new ILS. It’s an open-source ILS and I think it’s rocking the library automation world. You can check out the slick OPAC, Evergreen.

I’m a little biased, of course. I was an intern on the project this summer during library school.

I’ve been slow to post lately—you know, vacation, organizing all my school stuff now that I’m finished, getting back to work, etc. More is coming soon though, including a long post about the final paper I did focusing on subject headings, tags, and LibraryThing.

14 August 2006

Philosophy and predicting the future

I thought I’d share a final essay I wrote for a course on “Organizing Information”, connecting Walter Ong on the eras of information culture with Bruce Sterling on the eras of technoculture, via Suzanne Briet considering objects as documents.

That was a mouthful. It goes something like this: Ong talked about the shifts in culture that occurred in the move from oral (spoken) transmission of information to literate (written) society. Sterling talks about the shifts in culture that occurred due to the way that things (material objects) are produced and consumed, from handmade to mass-produced to “smart”.

Suzanne Briet was a librarian and documentalist in France in the early 20th century. Her work has come into the light in recent years largely thanks to Michael Buckland and an article in JASIST titled “What is a Document?”, in which Buckland explores a variety of perspectives on what constitutes a “document”. Briet (now, rather famously, in library circles at least) asserted that, although an antelope in the wild was not a document, an antelope that was captured, put in a zoo, cataloged, and considered an object of study could be considered a document just as much as text printed on paper.

So, using Briet’s ideas about objects as documents, Ong’s cultures and Sterling’s begin to converge into a conglomerate in which it is (or will be) no longer easy to distinguish between the two. This is especially the case in an “Internet of Things”, in which objects are increasingly retrievable and increasingly record information about themselves.

You can get a copy of my essay (pdf) if you’re interested.

8 August 2006

Jay Datema is blogging

If you don’t know him, Jay Datema is the technology editor at Library Journal, and he’s been blogging (since June, but I’ve been hunkered down for the last semester and hadn’t noticed until now). Check it out.

2 August 2006

And we’re back…

Too many of my neighbors with their air conditioners turned down to 62 degrees in the heat wave = power outage last night, beginning while I was in the middle of revising a presentation for class today, also = website down. Power restored, presentation over, website back. More details soon on my paper/presentation, about the conjunction of folksonomies and controlled vocabularies on LibraryThing.

29 July 2006

Diversity and [systems] librarianship

There’s currently a lot of discussion about techie women in librarianship (which now has far too long a string of episodes for me to sort out and link here, but somebody should collect them all together in some appropriate spot: Lazyweb?). Kudos to Karen Schneider for her recent post on LGBT librarians. I’ve felt some kinship in the discussion of women in technology librarianship, but I didn’t really feel I had enough library experience to talk about it coherently. Karen’s done a good job pointing out the similarities and differences in relating to the techie library world as an LGBT person vs. as a woman or a person of color (namely, it’s pretty obvious that someone is a woman, but LGBT-ness isn’t very visible). She doesn’t really have any answers, and I don’t either, but her observations are good.

17 July 2006

Horn-tooting, for fun and profit

I’ve been pretty quiet lately, because there are three weeks left of library school (eep!) and I’ve been super-busy.

However, I’m briefly popping my head out of the burrow to engage in a little shameless self-promotion. I’d like to call the attention of the reader to two things I’ve been working on lately. First, you can check out my article in this month’s Library Journal, “Shoestring Digital Library”. It’s based on my experiences building a prototype digital library using Ruby on Rails (detailed in an earlier post) and provides some ideas for building digital libraries using software from outside the usual pool of suspects.

Second, I’ve been working this semester in an internship-for-credit with the wonderful folks at PINES who are developing Evergreen, an open-source ILS. It’s currently in public beta and it’s shaping up quite well! I’ve been making use of my technical writing skills by working on the documentation. (Be aware that it’s still in progress, one of many things I’ll be working fiendishly on in the next few weeks. But it’s also a wiki, so feel free to contribute if you have the inclination!)

11 June 2006

LibX toolbar for University of Pittsburgh

The Newman Library at Virginia Tech released LibX, an easy-to-configure browser extension that brings together previously disparate toolbars, extensions, and scripts for searching library catalogs, embedding links to library materials in Amazon and other web pages, and various other goodies. In about half an hour, without any special inside information beyond a little detective work to suss out the URLs for various services, I was able to create a toolbar for the University Library System at the University of Pittsburgh, where I’m a student.

Cool.

You can install my Pitt library toolbar by clicking this link. (Support for the “Reload this page using SSL VPN” and “Follow this link using SSL VPN” features for off-campus access is limited, since our system isn’t exactly supported.) If you find any problems, please let me know.

Learn how to make your own LibX toolbar on the LibX site.

7 June 2006

Jon Udell and narration of work

Jon Udell writes an interesting post about the bridge between education and career. He talks about “rickety bridges”, such as job fairs and brochures, between people choosing careers (students) and the actual career. (Of course, there are internships, mentorships, etc… but usually you’ve already more-or-less decided on a path by then, so I think he’s really onto something.) He says,

Thanks to personal online publishing and to an emerging cultural ethos of transparency, there is an exciting new possibility in the world. A young person today who is interested in software can find out what it is like to be a software developer — by evaluating products, by reading the accounts of people creating them, by making contact with those folks, and by contributing to real projects. I hope it will also become possible for young people to find out what it is like to be a psychologist, homebuilder, forester, teacher, retailer, or city planner. If we want to inspire the next generation we need to open windows onto our worlds, share our knowledge and passion, and invite them in.

This is just the kind of environment I’ve found in librarianship. (This idea makes Michael Gorman itch all over, I’m sure.) Of course, not every librarian has a personal website or blog, but then again, not every software developer does, either. And software development lends itself to happening online in some ways that many library-type projects don’t. But I’ve found so many stories and ideas online from people from many different kinds of libraries and library jobs, and it’s what really persuaded me to go to library school.

I’m glad to be working in this profession.