Sloppy eBook Conversions in the Spotlight

October 8th, 2011

Ebooks have always had formatting problems. They usually look like they were assembled by a six-year-old using an Etch A Sketch.

If the book was all simple text, the problems were often no worse than bad line spacing, missing characters and unexpected typographic artifacts, a dingbat here and there.

If the page design contained extra elements, like tables or charts or a dash of poetry, the odds were that what you saw on the screen bore little or no resemblance to what was in the handsomely printed version.

Writers and readers have complained plenty (see my partial listing of links below). But except on a few dedicated ebook forums, the complainers were isolated voices easily ignored by publishers and ebook resellers.

All that changed a week ago when Neal Stephenson’s readers got angry. It’s not a good idea to screw up the ebook version of a brand new Neal Stephenson novel. He is one of the premier speculative fiction writers of our time, enormously respected and a big-time bestselling author.

His new novel, Reamde (a play on “Read Me”), is described by publisher HarperCollins thus:

The highly inventive, brilliant author of Snow Crash, Cryptonomicon, and Anathem returns with his most accessible novel to date, a high-stakes thriller in which a wealthy tech entrepreneur gets caught in the very real crossfire of his own online fantasy war game.

Reamde was released on September 20 to generally strong reviews. The first printing was 250,000 copies, and it’s at #11 on the Publishers Weekly 10/10/2011 hardcover fiction bestseller list.

Beyond Stephenson’s great track record over 13 novels, there are two other major reasons why a publisher should be extra careful with a Neal Stephenson ebook.

The first is that speculative fiction readers tend to hang on every word in a beloved author’s work. If there’s a typo they try to figure out if it’s intentional and what the meaning might be. A posting on Google+ illustrates the challenge:

At the reading Stephenson did in SF last week, someone asked him if the typos and errors, there being so many of them, might be intentional and part of some kind of code. He answered something along the lines of, “people thought the same thing about Cryptonomicon, but there wasn’t a code, just a lot of typos. But if there were a code, I wouldn’t tell you.” Long Pause. “But there isn’t.”

There’s perhaps an even bigger cause for caring about Stephenson’s ebook fidelity. Not merely a major author, Neal Stephenson is also CEO of a new publishing startup, Subutai Corporation. His major investor: Jeff Bezos. (He has also worked as an advisor to Blue Origin, another Jeff Bezos company, developing a manned sub-orbital launch system.)

So no one could have been pleased when the comments started appearing on Amazon’s Reamde page complaining about the obvious problems in the ebook.

Of course the only way to fix an ebook file full of errors is to replace it with a brand new file on Amazon’s servers. Which the publisher HarperCollins did. And then Stephenson’s fans really got angry. The major new offering from a favorite author was yanked from Amazon’s servers without notice. Their Kindle bookmarks were lost.

Worse still: what had changed between the two versions? No one was talking. Could a page or two have disappeared? Or an additional paragraph stealthily inserted?

According to a CNET report:

“After reading over 500 pages of this great book, Amazon tells me there was ‘missing content.’ After a live chat and talking to 2 support people, they won’t tell me what was missing, how much, what type of content, or why,” seethed reader cdale77.

And Cynthia Ewer vented, “As of this morning, I’m about 40 percent through the book – and I just received a notice that my Kindle edition was ‘missing content,’ and would be replaced… It seriously damages the reading experience. I’ve invested many hours in the book, overlooking various format errors along the way. Now – without more – I’m told that what I’ve read is incomplete. Do I begin again at the beginning? Do I plow on? Either way, the reading experience is fatally tainted.”

An upside of being a writer of speculative fiction is that if your publisher pulls your ebook file and replaces it with a new one, you can be sure that one of your readers is going to hack the files and analyze what’s changed.

God bless the diligent and skilled pseudonymous “jetmore”. When he got the new file he “ninja’d up a text copy of both the old and new version, then massaged them a bit to make diffing easier.” And what did jetmore discover? Not a whole lot, mostly the removal of unexpected hyphens. But a few words were missing here and there, as was one whole sentence: “In Russian, Csongor said, ‘What if we need to go into the world of T’Rain?’” Neal Stephenson readers care.
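For anyone curious what that kind of comparison might look like, here is a minimal sketch in Python. It is not jetmore's actual method, which wasn't published in detail. It assumes both editions have already been extracted to plain text, and it assumes the stray hyphens are soft hyphens (U+00AD), which is only a guess. The function names `normalize` and `changed_sentences` are made up for this illustration. The idea is to strip the formatting noise first so that genuinely missing or added sentences stand out.

```python
import difflib
import re

# Assumption: the "unexpected hyphens" are soft hyphens left over from print files.
SOFT_HYPHEN = "\u00ad"


def normalize(text: str) -> list[str]:
    """Strip soft hyphens, collapse whitespace, and split into sentence-like chunks."""
    text = text.replace(SOFT_HYPHEN, "")
    text = re.sub(r"\s+", " ", text).strip()
    # Naive sentence split: break at a space that follows ., ! or ?
    return re.split(r"(?<=[.!?]) ", text)


def changed_sentences(old: str, new: str) -> list[str]:
    """Return only the chunks removed ("- ") or added ("+ ") between two editions."""
    diff = difflib.ndiff(normalize(old), normalize(new))
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    old_edition = "He ran. She sat."
    new_edition = "He ran. In Russian, he spoke. She sat."
    for line in changed_sentences(old_edition, new_edition):
        print(line)
```

Because the soft hyphens are removed before comparing, this sketch deliberately hides the hyphenation fixes and reports only content changes. To count the hyphen removals as well, you would diff the raw text without the `replace` step.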

None of the reports suggest that these errors also occur in the print version. They appear to have resulted from the conversion of print-ready composition files into ebook file formats. Certainly that’s what causes most ebook file problems: insufficient attention to detail while translating from print-ready PDFs.

Not that ebook conversions are easy. I sympathize with publishers who have been struggling to get their backlists rapidly online during the past few years. There are too many things that can go wrong. And while publishers generally rely on professional conversion services, it’s too recent a procedure for best practices and QA processes to have been put in place. Amazon makes it extra tough by forcing publishers to convert their files separately to its brain-damaged Mobi file format when the rest of the industry has now thoroughly settled on the single EPUB standard.

But there’s enough money in ebooks now to make it worthwhile for publishers to take the time and spend the money to do it right. They owe it to their writers and they owe it to readers.

New and Improved on the Kindle

An Assortment of Links

Here is a selection of postings that consider the challenges of ebook formatting, on the Kindle and elsewhere:

Reamde, Pottermore, Hyperion and other mistakes from publishing “experts”

Bob is upset about these problems, and he’s “tired of giving away the expertise for free.”

10 things Amazon should correct in the Kindle

Guido Henkel spots the problems and is very generous with his advice.

A typographic critique of the Kindle

A river runs through it: “From a purely visual, typography standpoint, I’d give the Kindle a C+. Good effort, but poor attention to detail.”

An Example of Bad eBook Formatting

“It’s just about all one block of text! What were they thinking?”

Automated Conversion to Ebook — Problems And Limitations

The problems are mostly from automated conversions. Here’s what happens and why.

What to expect from automated conversion to eBook

A link to a slide deck which spells the problems out clearly.

How You’re Gonna Get Screwed by Ebook Formats

Too many devices, too many formats and DRM.

Please Amazon, add a Report Bad Formatting Button on Kindle Book Pages

There is a method available now. Maybe that’s how Reamde was caught.

The difference between ebook conversion and ebook formatting

…including the 23 steps to get it right.


October 14, see also: Terry Pratchett’s “Snuff”: an ebook full of typos – HarperCollins charges premium Agency Pricing for unproofed ebook.

October 18: Artisan offers a good “meet or exceed” checklist for e-book conversions.