RDF for all of O’Reilly’s titles (with OPMI)

Posted by Knud on March 14th, 2009

I might be a bit late (about a month) to discover this, but IT book publisher O’Reilly recently started a service called the O’Reilly Product Metadata Interface (OPMI), which provides RDF metadata for their whole catalogue of books. More details can be found on the O’Reilly Labs page.

I think it’s great news that a major publisher is starting to open up their data to the Semantic Web! Term-wise, they do the right thing and use vocabularies that have become de facto standards (FOAF and DC Terms in particular), alongside some newly coined terms in their own O’Reilly namespace. They also get brownie points for actually making their namespace dereferenceable. Good practice!

There are a few things that could be improved to make their data more useful, though:

  • They use non-HTTP URIs like this: urn:x-domain:oreilly.com:agent:pdb:1210. That’s perfectly fine RDF, but it breaks the linked data rules: URIs like that are not dereferenceable, which means interested agents cannot look them up to find out more about those resources.
  • Both the book URIs and the ontology namespace URI lead only to RDF. It would be nice if their servers also returned something human-readable when a client requests HTML. They acknowledge this problem themselves, so hopefully it will be addressed soon. Content negotiation to the rescue? For their vocabulary, these vocabulary publishing recipes might help (in combination with a tool like VocDoc).
  • The ontology source looks a bit messy, with weird namespace declarations like xmlns:p3="http://purl.org/dc/terms/#". These might be artifacts from the ontology editor they used, though. Not really harmful, just ugly.
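
To make the first two points a bit more concrete, here is a minimal Python sketch of the two ideas: checking whether a URI can be looked up over HTTP at all, and picking a representation based on the client's Accept header. This is not how O’Reilly's servers work. The function names (is_dereferenceable, parse_accept, negotiate) and the list of offered media types are assumptions made up for illustration, and the Accept parsing is deliberately simplified.

```python
from urllib.parse import urlparse

# Assumption: a hypothetical server offers these two representations
# for every book URI. The real OPMI service may differ.
OFFERED = ["application/rdf+xml", "text/html"]


def is_dereferenceable(uri):
    """Return True if the URI uses a scheme an HTTP client can look up."""
    return urlparse(uri).scheme in ("http", "https")


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(entries, key=lambda e: e[1], reverse=True)


def negotiate(accept_header, offered=OFFERED, default="application/rdf+xml"):
    """Pick the offered media type the client prefers, else the default."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in offered:
            return media_type
        if media_type == "*/*":
            return default
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "text/*" becomes "text/"
            for candidate in offered:
                if candidate.startswith(prefix):
                    return candidate
    return default


if __name__ == "__main__":
    print(is_dereferenceable("urn:x-domain:oreilly.com:agent:pdb:1210"))
    print(negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
    print(negotiate("application/rdf+xml"))
```

With something like this in place, a browser asking for text/html would get a human-readable page, while an RDF-aware agent asking for application/rdf+xml would still get the data. The urn: identifier from the first bullet fails the dereferenceability check because there is no HTTP location to ask.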