2 August 2006

And we’re back…

Too many of my neighbors with their air conditioners turned down to 62 degrees in the heat wave = power outage last night beginning while I was in the middle of revising a presentation for class today, also = web site down. Power restored, presentation over, website back. More details soon on my paper/presentation, about the conjunction of folksonomies and controlled vocabularies on LibraryThing.

17 July 2006

Horn-tooting, for fun and profit

I’ve been pretty quiet lately, because there are three weeks left of library school (eep!) and I’ve been super-busy.

However, I’m briefly popping my head out of the burrow to engage in a little shameless self-promotion. I’d like to call the attention of the reader to two things I’ve been working on lately. First, you can check out my article in this month’s Library Journal, “Shoestring Digital Library”. It’s based on my experiences building a prototype digital library using Ruby on Rails (detailed in an earlier post) and provides some ideas for building digital libraries using software from outside the usual pool of suspects.

Second, I’ve been working this semester in an internship-for-credit with the wonderful folks at PINES who are developing Evergreen, an open-source ILS. It’s currently in public beta and it’s shaping up quite well! I’ve been making use of my technical writing skills by working on the documentation. (Be aware that it’s still in progress, one of many things I’ll be working fiendishly on in the next few weeks. But it’s also a wiki, so feel free to contribute if you have the inclination!)

7 June 2006

Jon Udell and narration of work

Jon Udell writes an interesting post about the bridge between education and career. He talks about “rickety bridges”, such as job fairs and brochures, between people choosing careers (students) and the actual career. (Of course, there are internships, mentorships, etc… but usually you’ve already more-or-less decided on a path by then, so I think he’s really onto something.) He says,

Thanks to personal online publishing and to an emerging cultural ethos of transparency, there is an exciting new possibility in the world. A young person today who is interested in software can find out what it is like to be a software developer — by evaluating products, by reading the accounts of people creating them, by making contact with those folks, and by contributing to real projects. I hope it will also become possible for young people to find out what it is like to be a psychologist, homebuilder, forester, teacher, retailer, or city planner. If we want to inspire the next generation we need to open windows onto our worlds, share our knowledge and passion, and invite them in.

This is just the kind of environment I’ve found in librarianship. (This idea makes Michael Gorman itch all over, I’m sure.) Of course, not every librarian has a personal website or blog, but then again, not every software developer does either. And software development lends itself to happening online in some ways that many library-type projects don’t. But I’ve found so many stories and ideas online from people from many different kinds of libraries and library jobs, and it’s what really persuaded me to go to library school.

I’m glad to be working in this profession.

6 June 2006

Test Design

Well-designed tests are one of two kinds:

  • multiple-choice (or true/false, etc.) questions with unambiguous answers, like the SAT
  • short answer or essay questions where you actually get to express a thought in writing

I have a class this semester that has quizzes that are not well designed.

First of all, they are mostly multiple choice, but the instructions are as follows:

Choose a, b, c, d, or all that apply.

Wha? That makes 16 possible answers for every question, only one of which is “right”.
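
For the curious, here is where the 16 comes from: each of the four choices is either circled or not, so there are 2^4 combinations, assuming you count circling nothing at all as an answer. A quick Ruby sketch of the counting (mine, not anything from the quiz):

```ruby
# Each of the four choices is either circled or not circled.
choices = %w[a b c d]

# Build every combination of 0 to 4 circled choices.
answers = (0..choices.size).flat_map { |n| choices.combination(n).to_a }

puts answers.size  # prints 16
```

If circling nothing isn’t allowed, that drops to 15, which is hardly an improvement.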

On top of that, the answers are often ambiguous. If one interpretation of the answer would make it right, is that enough to make it “apply”, or should it be the only or primary interpretation of the answer?

For example:

A bibliographic record
a. a surrogate record
b. a metadata record
c. a description of an information package
d. a catalog card

What is the appropriate relationship between the question (“A bibliographic record”) and the correct responses? Should they be equivalent terms (a,b)? Definitions (c)? Examples (d)?

I circled all four, and wrote next to “a catalog card”: “It’s a subset—do the terms have to be perfectly equivalent?” I got it marked wrong; next to a, b, and c was written “intellectual info” and next to d was written “a thing”.

Well yeah, I knew that. In a short answer question, I could have said it beautifully. Unfortunately, the structure of the quiz (“or all that apply”) forces me to consider each multiple choice answer as a true/false statement. A surrogate record is a bibliographic record? True. A metadata record is a bibliographic record? True. A description of an information package is a bibliographic record? True. A catalog card is a bibliographic record? True. Oops, I mean… not true? There’s little room for nuance in a binary choice.

Likewise,

Chirographic refers to
a. physical handwriting
b. the “shadow style” used by Leonardo da Vinci
c. hand-typed manuscripts
d. manuscripts

I chose a; a and d together was the “correct” answer. How am I to interpret “manuscript”? I have several good dictionaries that have definitions of manuscript that include typewritten material, so not all manuscripts are hand-written—and the question reinforces this by including choice c, hand-typed manuscripts. So, does d mean all manuscripts, or just the hand-written ones, which could be properly considered chirographic?

Sigh.

In the end, it doesn’t really matter; these are worth a tiny portion of my grade. But really, they’re awfully designed.

24 May 2006

Thoughts on “Reading and Understanding Research”

The title of this post refers to the book Reading and Understanding Research by Locke, et al., assigned for my class in indexing and abstracting. It is, as you might guess, an introduction to reading “research” (we’ll get to what that is in a moment).

I’m going to suggest that there are a number of possible forms for such a book, among them these three:

  • a book about scientific research, discussing the scientific method, hypotheses, etc., and providing tools to distinguish between what’s scientific and what’s para- or pseudoscientific, which might be either for a lay audience or for undergraduate and graduate students in natural and social sciences
  • a book for skilled professionals (or students of skilled professions) such as nurses, counselors, teachers, and others, to understand social-scientific research that is applicable to their profession
  • a book about critical thinking generally, for a lay audience or students to apply in understanding academic research across many disciplines in the natural and social sciences and humanities

The back-cover blurbs on this book lead one to expect the first sketch above, and so does the introduction of the book. The focus is explicitly said to be on empirical research and to exclude history, philosophy, and other non-empirical lines of study.

However, I find the treatment less than satisfactory, if this is the goal. Whether the audience is a layperson or a student of science, the descriptions of scientific research are woefully inadequate. The word “hypothesis” is barely mentioned, and the concept of whether the hypothesis is testable doesn’t seem to be anywhere at all. The scientific method is loosely discussed, but it’s never called the “scientific method”. There is some discussion of “shoddy research”, under which I suppose pseudoscientific research would fall, but there’s very little discussion of how to actually detect this.

I’m not exactly trying to say the book is bad, or not useful, however. On reflection, this book is actually the second of the sketches I made above. The authors are all professors in Education departments, training teachers, and the book really appears to be aimed at their students. They want teachers to be able to read articles from psychology and sociology and other related disciplines and apply them to their work (and it could be pretty useful to librarians seeking to do the same thing, if you’re not familiar with how research works in the social sciences). This is a fine goal, but the back-cover blurbs and introduction really oversold me on what the book was all about.

As for its applicability to my indexing and abstracting class, with all due respect to my instructor, I think it could be better. I think the third of the sketches above would serve best, because it applies across all academic disciplines we might be called on to index or abstract. Why focus purely on empirical research and exclude philosophy, literary criticism, history, and so on? Although they might not be “scientific” per se, they’re analytical nevertheless, and a good book on critical thinking as applied to academic discourse should exist somewhere out there in the bibliographic universe… Now I should go find it to recommend for the future…

31 December 2005

Digital library chugging along on Rails

I took a digital libraries course this past semester, in which we learned all sorts of things about usability, accessibility, interoperability, and all the other things that digital libraries and other web applications ought to have.

At the end of the semester, we were charged with the task of actually creating a digital library in groups. Since no programming knowledge was required for the course, we were expected to use software such as Greenstone or DSpace, both open-source packages designed specifically for creating digital libraries.

Most groups chose to use Greenstone, because it’s easy to install and use. The flipside of “easy to install and use” is that the software is largely a black box, difficult to customize to support the desirable features in a digital library we’d spent the whole semester learning about. (And it’s written mostly in C++, so you need pretty good programming chops to hack away at it.) One group chose DSpace, which has the advantage of being written in easier-to-penetrate Java, but it’s more difficult to install and set up than Greenstone.

Our Project

One of my group members was working at the library at Point Park University, which has a theatre conservatory, and we were interested in pulling together materials on plays and productions, from scripts to playbills to reviews. In thinking about the design, we took cues from other databases/digital libraries such as IMDb and the theatre databases available from Alexander Street Press.

As we sketched out the design of our digital library, we saw the potential for rich interconnections among the data (authors, plays, productions, theaters, directors, actors, etc.), and we saw that Greenstone and DSpace wouldn’t serve us well without hacking them into unrecognizable forms, which none of us had the appropriate skills for. (I’ve read Dorothea’s accounts of taming DSpace, and she’s using it for its intended purpose. I had no desire to get entangled in attempting to half-rewrite it in the course of a several-week project, and learn Java at the same time.)

Don’t get me wrong, Greenstone and DSpace are excellent pieces of software, and they certainly have their uses. But both are tied to a “bibliographic record + item” paradigm, in which there is some metadata (title, author, etc.) that describes a digital document. (DSpace’s primary purpose is actually for institutional repositories.) Our data just didn’t fit this paradigm. So what to do?

Well, the short answer is, we need a database-driven application. The long answer follows.

Selecting Software

There’s PhiloLogic from the University of Chicago, the software on which the Alexander Street Press databases mentioned above are built (as are a number of other databases, such as ARTFL, which PhiloLogic was originally written for). This is great stuff, but it relies on texts marked up in TEI, an XML scheme for literature and other purposes. In our project, we’re using public domain texts, some from places like Project Gutenberg and others we’re scanning from books. Marking all these up in TEI would have been awfully labor-intensive for this project.

So my group put all their faith in me as I turned to the so-called “full-stack web development frameworks”: Ruby on Rails, TurboGears, and others. I first heard about these by reading about TurboGears on dchud’s work log, and was later blown away by the incredible Ruby on Rails video.

The idea behind these frameworks is to make it easy for lots of people to create web applications. A bunch of really smart programmers got together, cooked up a framework that handles the whole thing from end to end—the database, the business logic, the display views, and all—and packaged it up so it’s much easier to use than trying to string all those together by yourself.

Constructing the Application

I settled on Ruby on Rails (“Rails” for short), because Ruby, the programming language it uses, seemed similar to PHP, which I’ve had at least a little experience with from tweaking the templates to this blog.

Now, before I go into the nuts and bolts, let me just say, IANAP (I am not a programmer). I got a computer from Radio Shack when I was 8 and learned all about BASIC; I have an abstract understanding of logical structures from being a math major; and I am good with HTML, XML, and CSS. That’s it. I’ve never taken a programming class, written even an absurdly simple application on my own, nada. I am, however, a big subscriber to the “beat on it with a rock until it works” philosophy of computer programming (described wonderfully by Dorothea on Caveat Lector). And this is where the Rails framework is great: it has a feature called “scaffolding” that automatically sets up the basic structure of the application, including all the simple functions like viewing, adding, editing, and deleting records. No need to create something from scratch: have Rails create the scaffold, then beat with a rock until it’s the way you want it.

The basic steps were as follows:

  1. Create a database.

    I don’t really know much about SQL, and I didn’t really understand relational databases (being a hierarchical, XML kind of guy). Fortunately, it’s really easy to set up MySQL with the binary installers they’re now providing and GUI interfaces for administration (MySQL Administrator) and table creation and data entry (YourSQL). (YourSQL is for Mac, but similar things exist for Windows.)

    To create the database, you make tables for all the kinds of data you have, and name them with plurals (plays, authors, actors, etc.). Rails is smart enough to figure out that this means there are individual records for a play, an author, an actor, etc., and it creates the scaffolding for each kind of record based on the columns in the table (an author has a name, birth and death dates, etc.).

  2. Tell Rails about how the data is related.

    In the scaffolding, there’s a “model” for each of the types of records in the database. This is simply a file in which you tell Rails how the data are related. For example, here’s the model for a play:

    class Play < ActiveRecord::Base
        has_many :productions
        belongs_to :author
        belongs_to :genre
        has_many :characters
    end
    

    All I had to do was supply the has_many :productions-type lines, and include columns in the tables to contain the id of an associated piece of data. (For example, the productions table has a play_id column.)

  3. Enter the data into the database.

  4. Mark up templates.

    The scaffolding creates templates that use HTML, CSS, and some special Ruby markup that tells Rails where to drop in the data. Then you can hack away at these to get them to look and behave the way you’d like. Here’s an example (the Ruby commands are in the <% %> parts):

    <p><%= @play.description %></p>
    <p><b>By:</b> <%= link_to @play.author.name %></p>
    <p><b>Genre:</b> <%= link_to @play.genre.name %></p>
    

    This is pretty easy if you already know HTML. I caught on right away and found myself doing more and more complicated things pretty quickly, because it’s easy to experiment—just try it out and reload the browser.

    Also, I never really understood object-oriented programming until I saw how Rails treats the data. It uses a system called ActiveRecord (which you can see is being called on in the model above) to make the database look like objects, in the object-oriented programming sense. In the example above, @play is the current play, so @play.author finds the play’s author (because play belongs_to :author), and @play.author.name gets the author’s name from that column in the table. Rails even understands plurals, so @play.characters returns an array with all the characters associated with the play (because play has_many :characters). Cool!
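
If you want to see the idea behind @play.author and @play.characters without installing anything, here is a toy imitation in plain Ruby. To be clear, this is not how ActiveRecord is actually implemented, and the names and sample rows are placeholders I made up; it only shows how an id column in one table can turn into a method call on an object.

```ruby
# Toy imitation of ActiveRecord associations in plain Ruby.
# Rows are Structs; the *_id fields play the role of foreign-key columns.
Author    = Struct.new(:id, :name)
Character = Struct.new(:id, :play_id, :name)

# Placeholder "tables" (made-up rows, not data from the real collection).
AUTHORS    = [Author.new(1, "Example Author")]
CHARACTERS = [
  Character.new(1, 10, "First Character"),
  Character.new(2, 10, "Second Character"),
  Character.new(3, 11, "Someone From Another Play")
]

Play = Struct.new(:id, :author_id, :title) do
  # Like belongs_to :author: find the one row whose id matches our author_id.
  def author
    AUTHORS.find { |a| a.id == author_id }
  end

  # Like has_many :characters: collect every row whose play_id points at us.
  def characters
    CHARACTERS.select { |c| c.play_id == id }
  end
end

play = Play.new(10, 1, "Example Play")
puts play.author.name
puts play.characters.map(&:name).join(", ")
```

The real thing does the lookups with SQL queries instead of searching arrays, and you don’t have to write the author and characters methods yourself; the belongs_to and has_many lines in the model are what ask Rails for them.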

Okay, so I’ve simplified this a good deal. I’m not claiming just anybody could walk in off the street and write a web application using Ruby on Rails. It does take some mucking around on the command line (although that’s helped by the GUI packages for MySQL mentioned above) and it does take a little basic programming. But it does make it way more accessible than previously for non-programmers and amateurs to write web apps.

One more thing: this all needs a web server to make it go. Rails provides a lightweight web server of its own, but for Mac OS X, there’s something even easier. A package called Locomotive gives you a GUI for creating new Rails projects and running the webserver.

Bonuses

Ruby on Rails lets us do lots of things that are really hard with relatively opaque systems like Greenstone and DSpace.

The template system makes it highly extensible—adding support for interoperability standards is easy. Want a Dublin Core record for every item? Just make a template and have Rails fill in the appropriate information. Want to add OAI-PMH or COinS-PMH support or anything else? Just do it in the template.
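
Here is roughly what I mean by a Dublin Core template, as a standalone sketch using ERB, the templating library that ships with Ruby (the <% %> syntax is the same one the Rails templates use). The play hash and its field names are placeholders I’ve assumed for illustration; in the actual application the values would come from the database, and you would choose whichever Dublin Core elements fit your data.

```ruby
require "erb"
include ERB::Util  # provides h(), which escapes characters like & and <

# Hypothetical record standing in for a play pulled from the database.
play = { title: "Example Play", creator: "Example Author", date: "1920" }

# The template: Dublin Core elements with slots for the record's values.
template = ERB.new(<<~XML)
  <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title><%= h(play[:title]) %></dc:title>
    <dc:creator><%= h(play[:creator]) %></dc:creator>
    <dc:date><%= h(play[:date]) %></dc:date>
  </oai_dc:dc>
XML

# Fill in the slots using the local variables visible here.
xml = template.result(binding)
puts xml
```

Escaping with h() matters more than it looks: a title containing an ampersand would otherwise produce XML that a harvester can’t parse.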

It’s also easy to consume the web services of others. (Here comes the part that really wowed our classmates.) Part of our data was theater locations, and what better way to represent these than a map? Google Maps offers an API which is pretty easy to implement itself, but from Carol at Rawbrick’s airport map I found Google Maps EZ, which made my work even quicker.

The Demo

Okay, here are the goods: http://plays.dystmesis.com. Check it out. For now, you can only browse, because I never got around to making a search function work before this was due. Be sure to check out the map.

The collection is actually a selection of plays that opened in New York City in 1920. We chose these because there were lots of related public domain materials. This is just a sample collection to demonstrate the power of the architecture.

I’ve turned off write access to the database, but I’ve left the links to edit, add, and delete records exposed so you can check them out. I didn’t have a chance to improve on the scaffold forms for creating and editing records, but you can see that the scaffold already does a lot for you.

Conclusions

Open frameworks like Ruby on Rails and TurboGears are making web applications easier than ever, and they’re only likely to improve with time. As librarians who want to make materials available digitally, we should be aware of them and willing to roll up our sleeves and get our hands dirty. I highly recommend it.

There are lots of tools out there that we can make use of that don’t necessarily require too much programming knowledge. Take a look at Aaron’s Western Springs History Project using WordPress, which is designed to be blog software, but works pretty well as a content management system/database application/digital library too. Check out other content management systems like Drupal, Plone, Mambo, or PostNuke. (Ann Arbor Public Library has done spectacular things with Drupal, albeit with bona fide programmers on staff.) And don’t let me put you off Greenstone or DSpace either, because they’re good pieces of software if they’re the ones you need. (Greenstone developers are also working on a new Java-and-XML based version that promises to be more hackable.) Don’t be afraid to beat on things with rocks!

28 October 2005

Whew

Based on discussions from classes, several posts have been cooking in my head, so those will be coming soon. I haven’t had time to post because I had the following things due this week:

  • an annotated bibliography of articles in the library literature, which I did focusing on OPAC research over the last five years (thoughts on this forthcoming)
  • a reference assignment using encyclopedias and similar reference materials (I loved this)
  • a searching assignment using Dialog (I actually didn’t mind its CLI or strange archaic qualities)
  • an observation of reference transactions in a library
  • a plan for a group project on digital libraries (looks like we’re working with a local university library on some theatre collections; definitely more on this later, too)

Yeah, so I don’t really mind coming to work today, having two meetings and three different projects, because it’s really much less than I had to do earlier in the week. Now I’m “free” this weekend to work at i-fest (a series of events to promote the School of Information Sciences), finish roofing the shed, catch up on my reading, go to a Halloween party…

4 October 2005

What I Learned In School Today

I’m about five weeks into library school, and I’ve learned tons. Among the highlights:

  • It’s frickin’ freezin’ in the IS building, Mr. Bigglesworth. Except when it’s hot and stuffy. Note to self: wear layers.
  • No one has mentioned Ranganathan yet. Melvil Dui, of course, and even Paul Otlet, another forgotten library forefather, whose name popped up in the biblioblogosphere recently. What gives? How can I be an effective librarian without the Five Laws?

Seriously, though, I have been learning things at a nearly exhausting pace.

I am taking a fantastic reference class with an instructor who is terrified she’s going to be mentioned on a blog somewhere—she just found unflattering descriptions of herself on a former student’s blog. Well, I think she’s fabulous, so I’m mentioning her, though I’ll refrain from using her name.

I’m also taking introductory courses with completely opaque names like Understanding Information (the general theory of library-ness) and Retrieving Information (searching and retrieval) and a course on digital libraries. All very exciting. And working as a graduate student assistant. And working at my old job. All of this makes me feel pretty much whelmed, but it’s all manageable for now. And I can’t wait to be a real librarian.

31 August 2005

Why I already hate Blackboard

My school uses Blackboard, a learning management system (LMS). Basically, an LMS lets instructors and students have an online place to exchange information—instructors post announcements, syllabi, and readings; students drop off files for assignments; everybody discusses class topics in discussion boards.

It would be so lovely (for me, at least, and I suspect for others) if there were RSS feeds for the announcements and discussion boards. But there aren’t.

I’m expected to check my courses’ Blackboard sites regularly for information from my instructor. Some of my classes require me to post to discussion boards on a topic and respond to other students’ comments. It would be super if I could get these things the same way I get all the other often-updated websites I monitor 342 times a day, but I can’t. I have to click, type, log in, click, etc. to get to a butt-ugly website with the announcements instead of having them automagically appear in my pretty and elegant newsreader (Sage with a Safari-like skin).

I’ve half a mind to write something to screen-scrape the Blackboard pages to get the announcements, but IANAP and I feel that beating that into submission would probably take precious time away from what I’m actually supposed to be learning.