Our basic idea for our entry in the 2012 “Rails Rumble” was simple: build an API that reads and writes ISBNs, creating a basic catalog of associated bibliographic information in the process. There are many sources of ISBNs and bibliographic data out there. Our idea was to poll these sources and offer a simple, streamlined API based mostly on the ISBN rather than on the idea of the book itself. A clean and clear data stream, ambitiously targeted at every ISBN in the world.
But like any simple idea, complexity lurked just below the surface.
Though ISBNs are issued by a central agency, the “meaning” of these numerical strings is not particularly organized. The process begins, at least in the US, with a publisher purchasing a block of ISBNs. The publisher assigns these ISBNs to its products. Though primarily “books,” a publisher’s products might also include associated supplemental material, such as a CD-ROM accompanying a biology textbook or a plastic wand bundled with a Harry Potter book. Books are products first and books second, if at all.
The ISBN encodes a few facts about the product. The 13-digit string checks out as an EAN (European Article Number), an international standard despite its provincial name. The opening string 978 tells us this product comes from “bookland.” 979 also signifies bookland, but no ISBNs have yet been assigned to this expansionary prefix.
A following string identifies a designated country or language region. After it comes a string that identifies the publisher. Big publishers, such as Random House or Penguin or Oxford University Press, purchase big blocks at a time. Smaller publishers purchase small blocks of ISBNs or even a single number and thus have longer identifying strings. Finally, the check digit at the end of the ISBN can be recalculated from the twelve digits that precede it to verify that the EAN is in fact in a valid format.
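The check-digit step can be sketched in a few lines of Ruby. This is a minimal illustration of the standard EAN-13 rule (weights alternating 1 and 3 across the first twelve digits), not code taken from our project; the method names `isbn13_check_digit` and `valid_isbn13?` are made up for the example.

```ruby
# Compute the EAN-13 check digit from the first twelve digits.
# Digits in odd positions (1st, 3rd, ...) are weighted 1, even positions 3.
def isbn13_check_digit(first_twelve)
  digits = first_twelve.chars.map { |c| Integer(c, 10) }
  raise ArgumentError, "expected 12 digits" unless digits.length == 12

  sum = digits.each_with_index.sum { |d, i| d * (i.even? ? 1 : 3) }
  (10 - sum % 10) % 10
end

# True when the string is 13 digits (hyphens and spaces ignored),
# carries a bookland prefix, and ends in the right check digit.
def valid_isbn13?(isbn)
  digits = isbn.gsub(/[-\s]/, "")
  return false unless digits.match?(/\A\d{13}\z/)
  return false unless digits.start_with?("978", "979")

  isbn13_check_digit(digits[0, 12]) == digits[12].to_i
end

puts isbn13_check_digit("978030640615")   # prints 7
puts valid_isbn13?("978-0-306-40615-7")   # prints true
```

Note that a passing check only says the number is well formed. It says nothing about whether the ISBN was ever assigned to a product.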
And that’s the limit of what an ISBN can reveal, more or less. The remainder of the bibliography is paratextual to the ISBN; alien.
Which brings us back to our 2012 Rails Rumble project. Building records of and about ISBNs is a cataloging task. Every catalog is built with degrees of bias and blindness. The literature of Library Science revolves around catalogs and cataloging. As a discipline, Library Science began as the Computer Science of the predigital information age. When paper was the primary machinery of information, the catalog was (and remains) paper’s database.
Seymour Lubetzky was a metaphysician of data in mid-twentieth-century library science. His essays, though devoted to obsolete technologies such as the card catalog, remain relevant for their ability to get to the essence of information storage, organization and retrieval. For Lubetzky, the library begins with its catalog; without a catalog, the library is an inaccessible collection of material. The start of cataloging, though, is the opening of prejudice:
The book (i.e. the material record) and the work (i.e. the intellectual product embodied in it) are not coterminous; that, in cataloging, the medium is not to be taken as synonymous with the message; that the book is actually only one representation of a certain work which may be found in a given library or system of libraries in different media (as books, manuscripts, films, phonorecords, punched and magnetic tape, braille), different forms (as editions, translations, versions), and even under different titles.
Lubetzky continues to anticipate and describe the problems of building a library that best serves the user, a library of works rather than objects.
For the Rails Rumble project, we took Lubetzky’s warning as an invitation to simplification. Rather than attempt to build bodies of work from an ISBN’s metadata, we took the ISBN as proof of the object itself. The finished project, ISBN.IO, is just that: a solid known fact, the ISBN, along with placeholders for such trailing incidentals as Title, Author, Page Count.
The API allows trusted users to write ISBNs and submit paratext. Conflicting paratext is checked against previous entries; we attempt to establish the most correct information. For example, if two of three writers to the API prefer William Shakespeare to Wm Shakespeare, we keep the more popular expression.
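The “more popular expression” rule amounts to a majority vote over the values submitted for a field. The sketch below shows one way that could look in Ruby; it is an illustration only, not the ISBN.IO code. The method name `most_popular` is hypothetical, and breaking ties in favor of the earliest submission is an assumption, since nothing above says how ties are handled.

```ruby
# Return the value submitted most often for a single field.
# Assumption: on a tie, the value that was submitted first wins.
def most_popular(submissions)
  return nil if submissions.empty?

  counts = Hash.new(0)
  submissions.each { |value| counts[value] += 1 }

  best = counts.values.max
  # Hashes keep insertion order, so find returns the earliest top value.
  counts.find { |_value, count| count == best }.first
end

authors = ["William Shakespeare", "Wm Shakespeare", "William Shakespeare"]
puts most_popular(authors)   # prints William Shakespeare
```

A plain vote like this treats every writer equally and every spelling as a distinct value, so “Wm Shakespeare” and “Wm. Shakespeare” would split a vote.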
As a thought exercise, the Rails Rumble forced us to consider a key area of our business, the ISBN, as both an abstract and material entity. As a practical product, it deserves a prize as the least front-facing entry in this year’s Rumble. But its humble function is open source, public and, given some time, could touch upon every ISBN, the citizens of bookland. And who knows, perhaps it will appear as the backend for an interesting project in Rails Rumble 2013, such as fellow entry “Ideal Copy,” a user of our related Ruby gem, Vacuum.