President Obama’s Metadata Problem

August 7th, 2012

When the word started circulating that I was born in Toronto, Canada and merely a dual Canadian/US citizen, it dashed my hopes of being elected president of the United States. But the furor was nothing compared to what President Obama has been subjected to in proving his birth in Honolulu, Hawaii in 1961. (Gosh. He’s five years younger than I am.)

The “birthers,” as they’re called, have been obsessing for years, convinced that the papers proving the president’s U.S. birth are somehow faked. What makes the story of particular interest to publishing is that the problem is being called out as a metadata issue.

Metadata for books is a complex subject that I’ve been covering in a number of blog posts, most recently in a look at SEO and metadata.

Metadata for one-page documents is far more common and far less complex than the metadata for books. Anyone who uses software like Microsoft Word and Adobe Acrobat (or other PDF creation tools) creates metadata every time they save a document. Most people don’t even know they author metadata. But it’s there. Let’s look at some.

On April 17, 2012 the Office of National Drug Control Policy issued a two-page press release (Word file) called “Fact Sheet: U.S. Drug Policy.” The press release is a Word “.doc” format file. It doesn’t say so in the text of the press release, but it was authored by Rafael Lemaitre who works in the Executive Office of the President (EOP). He started writing the press release on April 12 and revised it twice, the last time in a four-minute session. It’s all right there in the document metadata.
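For readers who want to see these fields for themselves, here is a minimal Python sketch. One big assumption: it targets the newer “.docx” format, which is a zip archive holding an XML part named docProps/core.xml, rather than the legacy “.doc” format of the press release, which needs a dedicated parser. The function name and the labels on the left side of FIELDS are my own, chosen for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative labels (left) mapped to the XML elements that hold them (right).
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_docx_core_properties(source):
    """Return the core metadata of a .docx file as a dict of strings.

    `source` may be a file path or a binary file-like object.
    Fields that are missing from the document come back as None.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    props = {}
    for label, tag in FIELDS.items():
        node = root.find(tag, NS)
        props[label] = node.text if node is not None else None
    return props
```

Calling read_docx_core_properties on a path such as "fact-sheet.docx" (a hypothetical file name) would return the author, the last editor, the revision count and the created and modified timestamps, provided the software that saved the file recorded them.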

A web search confirms that Mr. Lemaitre is the Communications Director in the Office of National Drug Control Policy (ONDCP), which is part of the EOP (although he may wish to revise his LinkedIn bio).

President Obama’s birth certificate can also be downloaded (as a PDF file) from the White House web site. A quick glance at the metadata in the file shows there’s not much there, just an indication that the file was created on a Macintosh computer directly from the operating system, without additional software.
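A “quick glance” of this kind can be approximated in a few lines of Python. This is a rough sketch, not a real PDF parser: it assumes the metadata sits in the file as plain parenthesised strings under the standard Info keys (Title, Author, Creator, Producer, CreationDate, ModDate). Files that store those values compressed, encrypted, hex-encoded or only as XMP will show nothing here, and a proper PDF library is the better tool for serious work. The function name is mine.

```python
import re

# Standard keys of a PDF document information dictionary.
INFO_KEYS = (b"Title", b"Author", b"Creator", b"Producer", b"CreationDate", b"ModDate")

# Matches e.g. "/Producer (some text)", allowing backslash escapes inside the string.
_PATTERN = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\()])*)\)",
    re.DOTALL,
)


def scan_pdf_info(data: bytes) -> dict:
    """Naively scan raw PDF bytes for Info-dictionary entries.

    Returns a dict mapping each key found to its first value in the file.
    """
    found = {}
    for key, value in _PATTERN.findall(data):
        # Undo the escaping of parentheses and backslashes.
        text = re.sub(rb"\\([()\\])", rb"\1", value)
        found.setdefault(key.decode("ascii"), text.decode("latin-1"))
    return found
```

To try it, read any PDF in binary mode and pass the bytes to scan_pdf_info. A file with sparse metadata typically yields little more than a Producer entry and a date or two.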

As the editors of the National Review point out in their second attempt to debunk the birther claims, “one of the features of a really good conspiracy theory is that very lack of evidence for the theory is taken to be yet more proof of the conspiracy.” The National Review describes itself as “America’s most widely read and influential magazine and website for Republican/conservative news, commentary and opinion,” so you might expect them to be almost eager to go along with the birthers’ claims if they found merit in them.

The person most responsible for keeping the birther beat buzzing is 80-year-old Maricopa County (Arizona) Sheriff Joe Arpaio, whom Rolling Stone last week called “America’s meanest and most corrupt politician.” The Sheriff has decided it’s somehow his responsibility to keep chasing the wild goose and sent a “posse” off to Honolulu to insult Hawaii’s state record keepers.

This was followed by the release of a new report from graphic artist Mara Zebest, one of six co-authors of the 2001 tome Inside Adobe Photoshop 6. She explains that she examined “the metadata and object code of Obama’s long-form birth certificate PDF file and explains how this information corroborates the claim that Obama’s PDF file never originated as a paper document, but rather was born in cyberspace or was — to put it another way — digitally manufactured (italics hers).” On page 7 of the report she explains, correctly, that just because “Preview is the only program displayed in the metadata does not mean Preview was the only program used in the creation process.” She goes on to state that “resaving a file (to PDF) from within Preview will remove any prior metadata evidence.”

I get stuck on the word “remove.” Preview doesn’t deliberately remove metadata from scans or illustrations. It just doesn’t bother reading it. If you cared about metadata you’d use a fancier tool than a Mac’s built-in Preview function. And if you wanted to fool with the metadata in a PDF file, there are ways to do that as well.

PDF metadata can be verbose. Indeed the infamous 2006 pink underwear press release (PDF) from Sheriff Joe’s office provides lots of detail on the document author and the software used to create the file.

But as with Ms. Zebest’s report, verbosity is no substitute for meaning. The metadata in Word files and in PDF documents can enable functionality for users intent on mining its strengths. But as with book metadata, the information that slips casually or inadvertently into a file offers modest benefits. Just try to spell your name right — that’s worth five bonus points at a clan rally.