Using terms like “artisanal data harvesting,” the Faculty Technology Day keynote speaker Ben Vershbow, founder of the NYPL Labs, New York Public Library’s in-house tech startup, guided a rapt audience through ways humans and computers can better collaborate. Fordham’s IT sponsored the May 21 event.
Vershbow delivered a mile-high perspective of digital possibilities for an august institution like the NYPL, a message that resonated with Fordham faculty wrestling with similar issues of how to engage a digital audience.
He described the NYPL as a 19th-century institution striving to remain relevant in the 21st century. With a world-class archival map collection whose digitization began in the last decade, the library is seeking to change the static nature of those digital images and make the maps more dynamic and interactive.
Vershbow showed a digital image of a 19th-century map of Lower Manhattan that lives online. He then explained how that map can be rectified with contemporary maps using Google Earth through an NYPL-designed website, Map Warper. The 19th-century map sits on top of the contemporary map, and a transparency dial can fade the older map to reveal the new one beneath it.
The result shows how extraordinarily accurate the 19th-century maps were, as the streets resolve almost perfectly with their contemporary cousins.
Such newly combined maps have the potential to be used with other data, for example, the recently released 1940 Census. Maps can then yield much richer information, such as who lived where and what they did for a living.
“We’re trying to drop pins in time,” said Vershbow. “If we open up the data, we can get the community involved.”
As the library operates on limited resources, Vershbow said that Map Warper is “a call to action” to engage the public on the site to rectify maps for themselves. Once completed, the newly rectified maps are available for all to see, or to correct if there’s a mistake.
NYPL has gone even further with crowdsourcing in other areas of its digitized collections. Vershbow said the library has a menu collection with approximately 45,000 menus dating back to 1848. However, as the fonts on the menus vary greatly, optical character recognition software is not able to recognize all the “funky fonts.”
In response, NYPL has set up a “What’s on the Menu” site and has invited the public to transcribe the menus. Menu items can then be linked to recipe sites like Epicurious.
In much the same way that Wikipedia asks its community to help with quality control, the NYPL calls on the public to check the veracity of its data, he said. This nuanced “artisanal data harvesting” can only be accomplished through human and computer cooperation. The quality control may not be perfect, but the people who delve into the site are a conscientious group. Like other forms of volunteerism, participants uphold certain civic ideals.
With the preservation and digital activation of the older collections underway, an obvious question is beginning to emerge: what about contemporary collections? The letters of Percy Bysshe Shelley have their own hallowed room within the library, but what’s one to do with Timothy Leary’s floppy disks if the information on them can’t be accessed?
Vershbow expressed a concern for preserving “tomorrow’s past,” which will include not just the emails of Salman Rushdie, but newer writings that will exist on the digital cloud. Media archeology labs are being developed at places like Emory University, but New York City still lacks a significant preservation lab, he said.
He warned of an “information hole” that could develop should academic communities rely on the promises of technology companies to preserve important information.