9 June 2008

LibraryThing API

While I’m getting back on the blogging horse…

I realize this is old news now, but LibraryThing announced an API for work data. This is great. But what’s really awesome? This little tidbit from the post:

Scope. This is an API to work information. Once I’ve worked through the kinks here, I plan to release a member API, allowing members to do clever things with their data. For example, members will be able to make their own widgets, not just rely on ours.

I will squeal with glee the day there is a member API. I’ve been harping on the issue for ages, because I really want a way to make a “to-read” list that mashes up LibraryThing with my local library data. I can’t wait.

19 July 2007

Yahoo Pipes, Google Mashups, etc.

Is anyone out there using Yahoo Pipes, Google Mashups, or something like Dapper or Coghead on a library website or for library services? If so, I want to talk to you! I’m writing an article. Email me at jonathanweber@mac.com.

18 July 2007

Open Library architecture

You’ve no doubt already heard about the Open Library demo site from the Internet Archive, brainchild of Brewster Kahle and Aaron Swartz. I think it’s a really exciting project, and I’m sure I’ll have more to say about it soon.

One thing that struck me as interesting is a technical detail. On the “About the technology” page, there’s this tidbit:

We wanted a database that could hold tens of millions of records, that would allow random users to modify its entries and keep a full history of their changes, and that would hold arbitrary semi-structured data as users added it. Each of these problems had been solved on its own, but nobody had yet built a technology that solved all three together.

So we created ThingDB (tdb), a new database framework that gives us this flexibility. ThingDB stores a collection of objects, called “things”. For example, on the Open Library site, each page, book, author, and user is a thing in the database. Each thing then has a series of arbitrary key-value pairs as properties. […] Each collection of key-value pairs is stored as a version, along with the time it was saved and the person who saved it. This allows us to store full semi-structured data, as well as travel back thru time to retrieve old versions of it.

This sounds really interesting. It also reminds me very much of Maya’s u-forms (pdf), aside from the fact that the identifiers aren’t UUIDs. I’m not really database-savvy enough to know much about the underlying infrastructure that makes any of this happen, so my interest is something like an ape staring at a power drill. Still, I thought it worth noting.
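To make the quoted description a bit more concrete, here is a minimal sketch in Python of a "thing" whose key-value properties are saved as a series of versions, each with a timestamp and an author. This is not ThingDB's actual code or API, which I haven't seen. The names `Thing`, `Version`, `save`, and `get` are made up for illustration, and everything lives in memory rather than in a real database.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Version:
    """One saved snapshot of a thing's key-value pairs (hypothetical)."""
    properties: Dict[str, Any]
    saved_at: datetime
    saved_by: str


@dataclass
class Thing:
    """A named object with a full history of its properties (hypothetical)."""
    name: str
    versions: List[Version] = field(default_factory=list)

    def save(self, properties: Dict[str, Any], saved_by: str) -> int:
        """Store a new version and return its revision number."""
        snapshot = Version(dict(properties), datetime.now(timezone.utc), saved_by)
        self.versions.append(snapshot)
        return len(self.versions) - 1

    def get(self, revision: Optional[int] = None) -> Version:
        """Return the latest version, or an older one by revision number."""
        if not self.versions:
            raise LookupError(f"{self.name!r} has no saved versions")
        if revision is None:
            return self.versions[-1]
        if not 0 <= revision < len(self.versions):
            raise LookupError(f"{self.name!r} has no revision {revision}")
        return self.versions[revision]


# Two users edit the same book record; both versions are kept.
book = Thing("othello")
first = book.save({"title": "Othello"}, saved_by="alice")
second = book.save(
    {"title": "Othello", "author": "William Shakespeare"}, saved_by="bob"
)
```

Calling `book.get()` returns the newest version, while `book.get(first)` "travels back in time" to the earlier one. Storing whole snapshots is the simplest choice for a sketch; a real system would presumably need something more economical at tens of millions of records.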

8 July 2007

Karen Schneider, hip “old lady”

The biblioblogosphere is fluttering with talk about the fluffy librarian-image piece in the New York Times style section. On one hand, it’s one of those “Librarians: we’re cooler than you think we are” articles, and as those go, it’s not a half bad one. I mean, Jessamyn gets mentioned, so that’s one thing going for it right there.

But Karen Schneider calls out what’s lacking. It is, after all, the style section, and there’s a lot of concentration on cocktails, clothes, and tattoos. There’s also a glossing-over of some stereotyping that deserves examining, and a lack of attention to the things that truly make librarians “hip”. Karen writes,

Jessamyn is of the hippest of the hip not because she routinely uses instant messaging, but because she is such a tireless advocate for small libraries and poor communities — the unserved, often voiceless communities many of us (including me) forget about when we get hopped up about some new new thing.

Right on. And she goes on to say,

I am cool in my subversive old-lady tech-loving the-user-is-not-broken way, and getting cooler all the time, and I count among my friends and colleagues librarians of all ages, dress codes, and evening habits.

Karen, if you want to identify as “old lady”, I’ll support you in whatever you want to be. ;) But I have to say, you’re also one of the coolest librarians I know of. Thanks for blogging.

26 January 2007

Fields are from Mars and Tags are from Venus: oh really?

When thinking about bibliographic data (for example) and social applications using tagging, it’s pretty easy to think that the data (title, author, and so on) is highly structured and therefore very different from tags, which are freeform and all that jazz. In many ways, that’s true, and it’s especially important for the purposes of bibliographic control. But in social applications where users are contributing data, the line can get a lot fuzzier. LibraryThing is an example: users contribute various structured and unstructured data about books. Some of the data comes from libraries or Amazon, some is put in by hand, and some of the library- or publisher-supplied data is cleaned up by users, because it’s not always right. Users can enter structured information in fields: information about the item in general, like title and author, but also personal information, like ratings and the date it was read. They can also enter tags and search and sort books by those tags.

Flickr has just introduced “machine tags” (or “triple tags”). These build on existing geotags, which encode locations like this: geo:long=123.456. They’re three-part tags, with a namespace and a key-value pair, and you could use them to express all manner of things, such as dc:title=Othello. (There are also some semi-official uses of namespaces on tags in del.icio.us, like system:unfiled and filetype:mp3, and various users have used namespaces and triple tags on services like these without official support.) You might think of them as a kind of really lightweight RDF.
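As a rough illustration of how little syntax is involved, here is a small Python sketch that splits a triple tag into its namespace, key, and value. The function name `parse_machine_tag` and the exact pattern (names start with a letter, then letters, digits, or underscores) are my own assumptions for the example, not Flickr's code or its official grammar.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    """The three parts of a tag such as dc:title=Othello (hypothetical type)."""
    namespace: str
    key: str
    value: str


# Assumed shape: namespace:key=value. The character rules are a guess
# made for this sketch, not an official specification.
_TRIPLE_TAG = re.compile(
    r"^(?P<namespace>[A-Za-z][A-Za-z0-9_]*)"
    r":(?P<key>[A-Za-z][A-Za-z0-9_]*)"
    r"=(?P<value>.+)$"
)


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Return the parsed triple, or None for an ordinary freeform tag."""
    match = _TRIPLE_TAG.match(tag.strip())
    if match is None:
        return None
    return MachineTag(**match.groupdict())


tags = ["dc:title=Othello", "geo:long=123.456", "shakespeare", "system:unfiled"]
parsed = [parse_machine_tag(tag) for tag in tags]
triples = [tag for tag in parsed if tag is not None]
titles = [tag.value for tag in triples if (tag.namespace, tag.key) == ("dc", "title")]
```

Here the plain tag and the two-part system:unfiled tag come back as `None`, while the two triple tags can be filtered by namespace and key just like fields in a record, which is more or less the whole point.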

Triple tags really blow away the distinction between structured fields and freeform tags. This is important, because it’s a step along a road in which it’s easier for Joe and Jane User to make sense of complicated sets of data by sorting and filtering. Once you’ve become comfortable searching and sorting your tags, it’s not too much of a stretch to apply the same tools to more structured data. Sure, maybe it’s the same data that’s always been there, but now maybe Jane User could be better at manipulating it because she doesn’t have to understand “databases”; she just has to grok “tags”, along with a little lightweight syntax. It’s just a different way of looking at the data, one that might prove more friendly. I know not all the tools are there yet, and I’m certainly not saying that everybody’s grandma is going to be putting machine tags on Flickr tomorrow, but I think this is a step in the right direction.

11 January 2007

On clever solutions…

When people come to the library, that’s a good thing. But sometimes lots of people at the library can mean the library gets noisy with people working together or just chatting. People who’ve come to the library for some peace and quiet to get work done can be disturbed.

The usual solutions to this problem are to have quiet study rooms that can be closed off, and/or to formally designate (or subtly design for) group spaces, a little bit separate from quiet spaces, where it’s OK to talk. Today, I saw a pretty clever additional idea from my undergraduate alma mater: noise-canceling headphones you can check out to use while you’re in the library. Cool!

3 January 2007

What is venture capital, and is library automation getting any?

I’ve seen the phrase “venture capital” bandied about in reference to Vista Equity Partners’ recent acquisition of SirsiDynix (pdf), and the earlier acquisition of Ex Libris/Endeavor by Francisco Partners. Venture capital is a somewhat nebulous term, meaning different things to different people. Since the rise and fall of the dot-com era, however, it’s most often applied to capital offered to start-ups, anticipating large returns for the relatively high risk of investment. It provides an infusion of cash to a new or small company, enabling innovation. Sometimes “venture capital” is also applied to an investment in a beleaguered company in order to turn it around, which can be similarly high risk/high reward.

Though I can appreciate the hopes of library automation customers that the recent acquisitions may signal an infusion of cash that will fuel innovation, that’s not exactly what’s going on here. These are buyouts by private equity firms of large, established companies. Although we sometimes talk about the state of library automation software in terms that might be described as “beleaguered”, I’m not really sure that describes these companies’ financial situations.

It may indeed be the case that Vista Equity Partners and Francisco Partners intend to invest resources into these companies to make them better and more profitable, and if so, I think that’s great. (I, for one, welcome our new private equity firm overlords.) On the other hand, these acquisitions could be an example of what’s known as a leveraged buyout, a strategy by which private equity firms acquire companies by borrowing against the assets they acquire. Often this involves the buyers paying themselves a big cash dividend, then doing just enough to keep the company afloat under the sometimes excessive debt burden inflicted during the acquisition, and attempting to sell it off again in a year or two.

I’m not saying I know which will happen, or even which is more likely. I didn’t do much research about Vista and Francisco’s previous acquisitions and what’s happened to them. I just wanted to make the point that we ought not look at acquisition of large companies like these the same way we look at VCs financing a startup. In these sorts of deals, there’s often a lot of fancy accounting going on that obscures the motives.

15 December 2006

Horn-tooting

I wrote an article in the current Library Journal on the development of the open-source Evergreen ILS. It makes an interesting case study for the development of a large and complicated piece of software from within a library consortium, and the resulting ILS and OPAC are pretty exciting! (Disclaimer: I was an intern on the project while I was in library school, so I’m biased.)

You can also find the article in the print issue.

14 December 2006

Google Patent Search

Google is beta-ing a patent search. Cool.

They’ve used the same technologies as Google Book Search on the historical database of patents, so you get full-text searching all the way back to the first US patents in 1790. The USPTO database has offered the images for some time, but its full-text searching only goes back to 1976.

17 November 2006

Tags and Subject Headings in LibraryThing

When I finished library school in August, I put all of my papers in some boxes and haven’t looked at them since, because I really needed to recover. Happily, yesterday’s post about folksonomy finally forced me to dredge them out and bring to light a paper I did for my indexing and abstracting class on the use of a folksonomy alongside a controlled vocabulary in LibraryThing.

The first part is a sort of “literature review” on folksonomy (such as it is) and an overview of the concepts involved. The second part takes a look at LibraryThing and compares tags and subject headings.

The full text of the paper follows, or you can download a pdf for printing (warning: it’s in ugly, formatted-to-turn-in-for-class format; one of these days I’ll get it prettied up). Some of the discussion of LibraryThing’s features is slightly out of date (Tim & co. move fast!), and the statistics about popular tags and subject headings certainly are, but I think the main points are still relevant. I’d like to do some more in-depth analysis, especially of the statistical data, at some point in the future.
