Ed Summers
edsu   
Twitter
@TomTague openid? 2 days ago
RT @w3c: W3C lets XHTML2 WG expire, puts energy into (X)HTML5... FAQ: http://www.w3.org/2009/06/xhtml-faq 2 days ago
@azaroth42 "Berkeley DB XML supports XQuery 1.0 and XPath 2.0 ..." http://is.gd/1lCR7 ; not seriously thinking of using it though :-> 2 days ago
identi.ca
@mjgiarlo thanks for the pointers re:twhirl on linux - i'm back in the saddle now too 200 days ago
@danbri what does the library do? 205 days ago
@dchud the idea of me architecting anything is crazy-scary, i think we're all just on this crazy journey together yo 206 days ago
inkdroid
canonical question   50 days ago

As the last post indicated I’m part of a team at loc.gov working on an application that serves up page views like this for historic newspapers–almost a million of them in fact. For each page view there is another URL for a view of the OCR text gleaned from that image, such as this. Yeah, kind of yuckster at the moment, but we’re working on that.

Perhaps it’s obvious, but the goal of making the OCR html view available is so that search engine crawlers can come and index it. Then when someone is searching for someone’s name, say Dr. Herbert D. Burnham in Google they’ll come to page 3 in the 08/25/1901 issue of the New York Tribune. And this can happen without the searcher needing to know anything about the Chronicling America project beforehand. Classic SEO…

rest, the semantic web and my feeble brain   50 days ago

Imagine you were minting close to a million URIs for historic newspaper pages such as:

http://chroniclingamerica.loc.gov/lccn/sn85066387/1898-01-01/ed-1/seq-1/

for pages like:

The web page allows you to zoom in quite close and see lots of detail in the page:

Now let's say I want to describe this Newspaper Page in RDF. I need to decide what subject URI to hang the description off of. Should I consider this Newspaper Page resource an information resource, or a real world resource? The answer to this question determines whether or not I can hang my description of the page off the above URI, for example:

<http://chroniclingamerica.loc.gov/lccn/sn85066387/1898-01-01/ed-1/seq-1/> dcterms:issued "1898-01-01"^^<http://www.w3.org/2001/XMLSchema#date> .

Or if I need to mint a new URI for the page as a real world thing:

<http://chroniclingamerica.loc.gov/lccn/sn85066387/1898-01-01/ed-1/seq-1#page> dcterms:issued "1898-01-01"^^<http://www.w3.org/2001/XMLSchema#date> .
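To make the two options concrete, here is a minimal sketch in Python that derives the "#page" URI from the page URL and prints both statements as N-Triples. The helper names (thing_uri, issued_triple) are made up for illustration, and expanding dcterms: to http://purl.org/dc/terms/ is an assumption, since the excerpt never spells out the prefix. It is not how the Chronicling America application does it.

```python
from urllib.parse import urldefrag

# Assumption: dcterms: in the post is the Dublin Core terms namespace.
DCTERMS_ISSUED = "http://purl.org/dc/terms/issued"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def thing_uri(document_uri, fragment="page"):
    """Mint a hash URI for the real world thing the document describes.

    Hypothetical helper: drops any existing fragment and the trailing
    slash, then appends '#page', matching the second example in the post.
    """
    base, _ = urldefrag(document_uri)
    return base.rstrip("/") + "#" + fragment


def issued_triple(subject_uri, date):
    """Format one dcterms:issued statement as an N-Triples line."""
    return '<{}> <{}> "{}"^^<{}> .'.format(
        subject_uri, DCTERMS_ISSUED, date, XSD_DATE
    )


page_url = "http://chroniclingamerica.loc.gov/lccn/sn85066387/1898-01-01/ed-1/seq-1/"

# Option 1: treat the page URL itself as the subject.
print(issued_triple(page_url, "1898-01-01"))

# Option 2: describe a separate resource identified by a hash URI.
print(issued_triple(thing_uri(page_url), "1898-01-01"))
```

Either way the statement is the same; only the subject changes, which is the whole question.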

AWWW 1 provides some guidance:

VocabularySoup (1)   100 days ago

It’s been great to see RDFa being picked up by web2.0 publishers like Digg and MySpace. You can use the RDFa Distiller to extract the RDFa from a given web page u by constructing a URI like:

http://www.w3.org/2007/08/pyRdfa/extract?format=turtle&uri=u

Which translates kind of nicely into a command line utility to add to your ~/bin:

#!/bin/sh
curl "http://www.w3.org/2007/08/pyRdfa/extract?format=turtle&uri=$1"
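One thing the script does not do is percent-encode its argument, so a page URL that carries its own query string (with an & in it) could be misread by the distiller. Here is a rough Python sketch of the same idea that encodes the uri parameter first. The function names are hypothetical, the service address is just the one quoted above, and whether it still answers is not something this checks.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# The distiller address as given in the post.
DISTILLER = "http://www.w3.org/2007/08/pyRdfa/extract"


def distiller_url(page_uri, fmt="turtle"):
    """Build the distiller request URL, percent-encoding the page URI."""
    return DISTILLER + "?" + urlencode({"format": fmt, "uri": page_uri})


def fetch_rdfa(page_uri, fmt="turtle"):
    """Fetch the distiller's output for a page (needs network access)."""
    with urlopen(distiller_url(page_uri, fmt)) as response:
        return response.read().decode("utf-8")


# Building the URL needs no network, so it is safe to try anywhere.
print(distiller_url("http://www.myspace.com/yolatengo"))
```

Calling fetch_rdfa("http://www.myspace.com/yolatengo") would then play the part of the rdfa command.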

So with that little shell script in hand I can now look at the RDFa in something like Yo La Tengo’s page on MySpace:

ed@rorty:~$ rdfa http://www.myspace.com/yolatengo
@prefix myspace: <http://x.myspacecdn.com/modules/sitesearch/static/rdf/profileschema.rdf#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xhv: <http://www.w3.org/1999/xhtml/vocab#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://www.myspace.com/YO LA TENGO> a myspace:MusicProfile ;
    myspace:profileType "Music" .

<http://www.myspace.com/yolatengo> xhv:stylesheet <http://x.myspacecdn.com/modules/common/static/css/global_j03fjftp.css>,
        <http://x.myspacecdn.com/modules/common/static/css/header/profileheader008.css>,
        <http://x.myspacecdn.com/modules/common/static/css/myspace_jvtnwmp4.css>,
        <http://x.myspacecdn.com/modules/common/static/css/profile_adl4r-y8.css>,
        <http://x.myspacecdn.com/modules/profiles/static/css/musicv2_wo4zzzd-.css> ;
    myspace:addToFriends <http://friends.myspace.com/index.cfm?fuseaction=invite.addfriend_verify&friendID=91362837> ;
    myspace:friendCount "33993" ;
    myspace:headline "\"<b>YO LA TENGO IS MURDERING THE CLASSICS</b>\""^^rdf:XMLLiteral ;
    myspace:photo <http://viewmorepics.myspace.com/index.cfm?fuseaction=user.viewAlbums&friendID=91362837> ;
    myspace:sendMessage <http://messaging.myspace.com/index.cfm?fuseaction=mail.message&friendID=91362837&MyToken=62964687-f06b-4b8b-8227-ba97f133a029> ;
    myspace:viewPictures <http://viewmorepics.myspace.com/index.cfm?fuseaction=user.viewAlbums&friendID=91362837> .


© 2008 Christopher Blizzard