Is Metadata Magic?

September 20, 2012 by Thad McIlroy

When I began researching The Metadata Handbook I believed that metadata was magic. I believed that if you could add rich and accurate metadata to a title listing you’d all but guarantee big sales. I’ve since learned that while metadata is enchanting, its powers are far more down to earth.

It’s easy to see why metadata gets mistaken for magic. It’s complex, and extremely complex to execute well. Conversancy with ONIX, the metadata standard, requires an appreciation of XML and DTDs and ISTCs and…you get the point. It was Arthur C. Clarke who first said “Any sufficiently advanced technology is indistinguishable from magic.” ONIX-based metadata nearly qualifies. But it’s not magic. It promises to make your books discoverable, and discoverability implies many new sales once your wonderful books are found.

Room for 10,000 more volumes

What this fails to take into account is the sheer volume of books published. Laura Dawson recently revealed that there are 32 million books in print. The Library of Congress “receives some 22,000 items each working day and adds approximately 10,000 items to the collections daily.” Amazon added 56,477 new Kindle ebooks in the last 30 days.

These volumes all but extinguish the magical flame of discoverability. Yes, metadata will make your book discoverable – at the same time it’s making another 10 or 20 or 30 thousand books discoverable. Take your pick.

In a recent post I considered titles on Amazon that fall under the heading “baking bread.” Amazon offers 1,515 titles in this category. When I search on Google for baking bread I need to wade through 47 results before I hit the first book, Emmanuel Hadjiandreou’s How to Make Bread. Most of the other results link to cooking sites and to how-to videos on YouTube. To be fair, the same search on Bing offered Beth Hensperger’s Baking Bread: Old and New Traditions as the 17th result.

Over at O’Reilly Tools of Change, Joe Wikert notes that “today’s search engine access is generally limited to our metadata, not full book content. As a result, books are at a disadvantage to most other forms of content online.” This leads him to the question: “At what point do we expose the book’s entire contents to all the search engines?”

I think that Joe has initiated an important discussion. A couple of points.

Paradoxically, the content of a book is also metadata about the book. Look at it this way: we’ve already got defined metadata fields for “table of contents”, “excerpt” and “index”. The entire text of a book is just a very long excerpt from within the book. What better metadata about a book than all of the words and ideas contained in the text?

The topic is controversial only, I think, because of concerns about theft of the content: that the book will be read online without payment. If you could index the content of a book so that it showed up in search engines without making it simple to download or read the book, then why not just do so? There are lots of ways to handle this technically, whether using JavaScript, tagged images or… (technical experts chime in here). At the very least let Amazon and Google index the contents.
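One way to picture "indexed but not readable" is an inverted index: a lookup table from each word to the titles that contain it. The sketch below is only an illustration of that idea, not a description of how Amazon or Google handle book content. The function names and the sample titles and text are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(books):
    """Map each word to the set of titles whose full text contains it.

    `books` is a dict of {title: full_text}. The index stores words and
    titles only; word order is discarded, so the running text cannot be
    read back from it.
    """
    index = defaultdict(set)
    for title, full_text in books.items():
        for word in tokenize(full_text):
            index[word].add(title)
    return index


def search(index, query):
    """Return the sorted titles that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical sample data, for illustration only.
books = {
    "Bread Basics": "Knead the sourdough until the dough is smooth.",
    "Garden Notes": "Plant tomatoes after the last frost.",
}
index = build_index(books)
print(search(index, "sourdough"))
print(search(index, "tomatoes frost"))
```

A searcher finds the book through words that appear only in its text, yet what comes back is the title, not the pages.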

To me this points to a topic that’s fallen off the radar of late: Creating a great website for every new book.

The Big Six publishers are still creating crummy websites, even for their big-name authors. Looking at a random sample of a few titles on the bestseller list this week:

1. How Children Succeed by Paul Tough. The site looks pretty, but there’s not much content.

2. A Father First: How My Life Became Bigger Than Basketball, by Dwyane Wade. A minimal text description and a video.

3. The Party Is Over by Mike Lofgren. A tiny excerpt from the book plus four reviews.

I’m not trying to single these out. I could find hundreds more just like them. They represent the standard publisher web site treatment for a new title.

Sure it’s hard work to create a good web site. But mostly it’s a challenge for the imagination. And it’s surely no harder than writing and publishing the book.

If authors and publishers want to maximize their sales opportunities in a desperately crowded market they’ll make sure they get the metadata right by controlling the one instance entirely within their control: the book’s web site. And they should take full advantage of a book’s most potent metadata: the full text.


  • Writer Unboxed » ‘Social’ Media: How Digital Is Your Reading?

    Sep 22nd, 2012 : 3:30 AM

    […] His piece is headlined Is Metadata Magic? […]

  • Peter_Turner

    Oct 17th, 2012 : 2:42 PM

    Hi Thad: I saw your comments over at @BrianOleary’s blog and thought I’d track back. I hate to put myself in the position of defending publishers, but I expect the true reason why author sites are so lame is that publishers don’t see them as a cost-effective way of driving sales. Wrongly, I believe, they (by and large) want to leave that to third-party retailing platforms.

  • Stephen

    Nov 8th, 2012 : 12:49 PM

    Hi Thad,

    I’d be really interested in your thoughts on metadata for fiction, as all the articles I read on metadata focus on non-fiction – such as in your example of baking bread. If I publish a book on a boy wizard going to a magic school, for instance, how would you see metadata helping in sales? Possibly a dumb question, but not from the perspective of someone who buys fiction not by searching Amazon for new books on boy wizards – because seriously, does anyone do that for fiction? – but by browsing within a genre, or searching for a particular title or author who we’ve already heard of, or winners of awards etc…

    I can see that some people would search for a novel set in their home town, perhaps, or a few specific instances along those lines. But in general I’d say that fiction isn’t found in the same way non-fiction is, and I’m baffled about where metadata fits into the mix.