Has XML Failed Publishing?

October 28, 2012 by Thad McIlroy

XML will be celebrating its official 15th anniversary on February 10, 2013.

By many measures it has been a huge success. There are thousands of XML dialects used across a vast swath of the sciences, in business and in ecommerce.

By the measure of the book publishing community it’s been less than a huge success.

How much less is a subject for conjecture: I can’t find anything resembling usable data on XML adoption in the book publishing community.

Part of the challenge of measuring XML adoption in book publishing is that there is no single book publishing community. It’s broken up into several groups. Even the divisions can be divisive. But let’s just try some groupings, roughly by increasing complexity and variety of page design.

1. Trade publishing (including children’s although page design is far different for this group)

2. Educational publishing (including both K-12 and higher education although page design is far different for each segment)

3. STM = Scientific, Technical and Medical (including reference, and journals)

XML’s proponents, a diverse group of wise and practical men and women, mostly believed that XML was suited to all forms of book publishing. I was among those proponents.

But it didn’t happen.

The degree of XML adoption appears to be roughly inverse to the order above. In other words, STM has a very high degree of XML usage, educational perhaps 50/50, and trade publishing something less than 5%. That means XML has met mixed results in a group that appeared to be a real natural: educational publishing, particularly higher ed. And XML has failed altogether for trade publishers (with only a handful of exceptions).

I want to understand why.

I think the answer is straightforward. XML is tremendously powerful but it is far too complex. Every effort made to simplify XML workflows has failed to make them simple enough for most editorial and production workers beneath the STM level. And the benefits have been insufficient for management to make the investment to forcibly train the required personnel.

My research on XML comprises over 550 files and some 600 megabytes of data. Plenty. The first file dates back to 1999 when I made notes at an Adobe briefing in San Jose. John Warnock, then Adobe’s president, said that “there is a broad misunderstanding that XML and PDF compete. We do not view XML as a competitor to PDF, but plan to incorporate XML encoding in PDF, as in most Adobe products.”

It’s really easy! (© 2012, RSI Content Solutions)

A Google search on “Adobe and XML” today reveals that there is significant support for XML in Adobe Acrobat, Adobe InDesign and Adobe FrameMaker. In none of those products is the support complete. I guess it’s much better than nothing: people manage XML workflows using each product. But the inconsistency of Adobe’s approach to XML is to me indicative of the broader XML problem. When it comes to the mainstream of publishing the use case is muddled. The best tools for XML publishing are specialized and out of the mainstream of authoring and production.

Indicative also of the XML mess is Microsoft’s very inconsistent history with XML support. Microsoft Office 2007 was at one point intended as a showcase for XML in the mainstream of document publishing. This was to have been a very big deal for XML. It failed slowly and painfully. It’s unusual to hear the letters X & M & L mentioned in sequence at Microsoft today.

What To Do Now?

I see three kinds of XML people in publishing today:

1. Those who have mastered XML to their satisfaction and use it productively.

2. Those who are still struggling to make XML work.

3. Those who gave up, along with the majority who never bothered trying to make XML work.

Once again, I wish I had some hard numbers but they’re lacking.

My sense is that we’re looking at 4/1/95. In other words, out of every 100 publishing professionals, 4 have got XML helping with their workflows (in many cases not a “full” or “robust” XML workflow, but it does the trick).

1 in 100 is stuck in that painful purgatory where they’re trying to make a go of XML, either because they think it will help them, or because they’ve been ordered to do so.

And the 95% sleep soundly every night never dreaming of embedded elements. God bless ’em.

Where Do We Go Next?

It’s my conviction that the benefits of an XML workflow are so enormous and so compelling that we have to find a way to make it work.

I’m going to be co-presenting a webinar for Aptara on November 8th. It’s a bit of a wolf in sheep’s clothing. We’re calling it A Roadmap to Efficiently Producing Multi-Format/Multi-Screen eBooks, but it’s secretly about XML and publishing. Because, after all, how else are you going to efficiently produce content for multiple formats without XML?

XML can be simplified.

People try to simplify it today by adopting a subset of a full XML workflow. This is a good solution that works well for some. But it doesn’t address XML’s fundamental complexity.

We’ve come a long way since Tim Bray, Jean Paoli and C. M. Sperberg-McQueen turned the Extensible Markup Language (XML) 1.0 into a W3C recommendation.

It’s time to develop ABCML, the ABC of markup languages. Who’s with me?

Some Background

I list here a series of resources for those interested in XML, broken down by topic.

A. Introductions to XML

I’ve chosen a handful of the most simplified introductions to XML that I’ve found on the web (and one book). Those of you who are new to XML will quickly appreciate its complexity when you attempt to digest these simplified explanations.

1. Introduction to XML: This IBM tutorial from 2002 is fairly gentle.

2. A Gentle Introduction to XML: If you say so.

3. An Introduction to XML Basics: Steven Holzner’s 2003 guide on the Peachpit Press site.

4. A Designer’s Guide to Adobe InDesign and XML: This 2008 book by James Maivald and Cathy Palmer is extremely well written. It’s specifically aimed at designers, which is to say that it’s written in a language that a designer can (at least potentially) understand. So it’s as good as it gets, but I think you’ll still find it challenging. This is the acid test of whether XML can be made safe for families with small children.

B: XML as a Failed State

1. XML Can Go to H***: One Designer’s Experience with the “Future of Publishing” (2004): According to Susan Glinert, who bears XML battle scars, the future is not bright.

2. The Truth about XML (2003): Systems powered by XML might someday prove to be the standard for information sharing between businesses, but not in the near future.

3. What will it take to get (end user) XML editors that people will use? (2011): Norman Walsh is a prominent proponent of XML, but here looks at some of the challenges surrounding the tools and user expectations.

4. XML Fever (2008): An in-depth piece from ACM, “This article is about the lessons gleaned from learning XML, from teaching XML, from dealing with overly optimistic assumptions about XML’s powers, and from helping XML users in the real world recover from these misconceptions.”




  • Guest

    Oct 28th, 2012 : 10:20 AM

    Don’t abandon hope, all ye who enter here.

  • David Blyth

    Oct 31st, 2012 : 7:55 AM


  • Bill Kasdorf

    Oct 28th, 2012 : 10:43 AM

    There are two fundamental flaws in this reasoning, based on common misconceptions about XML.

    First, the most obvious: it is in fact the case that virtually all book publishing uses XML. That’s because XML is fundamental to EPUB, and EPUB is fundamental to the digital ecosystem. So what is the flaw in the reasoning? That “using XML” means “XML first.” I do a lot of work in this area, and when a publisher tells me they want “XML first,” I ask them what they mean by that. XML authoring? No, of course not. Okay, XML editing? Well no, that really doesn’t work for us. XML _used for_ typesetting, or just XML as _a deliverable back from the typesetter_? XML in a repository for reuse? Or just for online and eBook products? When pressed, what most people mean by “XML first” is “XML for every aspect of the workflow after me.”

    Which gets to another fundamental issue. I’ll remind you, Thad, of all those Seybold seminars we did years ago, and the wise words of another Adobe sage, Jim King: saying “XML” is meaningless unless you say “XML for _what_.”

    Does a publisher of fiction need a repository of XML for repurposing and recombining content? Nope. They just need to get XML for their eBooks. But as it happens, one of the big six US trade publishers, Hachette, has an all-XML workflow that is in fact extremely efficient and successful for trade books. It’s based on XHTML. This is a strategy I’ve been advocating for some time; in fact, I’m doing a workshop on this subject, “Upfront XHTML,” at O’Reilly TOC in February, where I’ll provide examples of publishers ranging from Hachette to the World Bank to the University of Toronto Press to Harvard Business Publishing that are using this strategy. And I need to emphasize that this is _real XML_, every bit as rigorous, hierarchical, richly semantic, and loaded with metadata as XML in DocBook or NLM or TEI; it just uses XHTML as a framework, which removes a TON of complexity because it is already aligned with the web-based world we live in. Another client of mine, whom I can’t name, is doing the same thing but using HTML5–but again, as XHTML, richly semantic and well structured. All the big educational publishers are using XML. True, many of them have evolved from earlier models that were inadequate to new ones that work better, but that is because their work is inherently complex.

    Which brings me to another fundamental issue. It is not XML that is complex; it is _publishing_ that is complex. XML can be as simple or as complex as you want to make it. The problem is thinking there is one “flavor” of XML that will work for everybody. That is a fundamental misunderstanding of how XML works. The reason it has taken hold so well in STM is that there is a common structure and vocabulary and community of interest in that sphere that both enables and demands standardized markup. It’s a rare STM journal that is not done in XML. But trade books? Well, the present need for a common structure and vocabulary in _that_ sphere is for dissemination of eBooks; hence EPUB. Textbook publishers, especially the big ones, have HUGE repurposing, revision, and workflow issues within their own organizations that benefit from XML, and they are all working hard on it, but in their case it _is_ hard, not because XML is hard, but because _what they want to do with it is complicated_.

    One thing I counsel my consulting clients to do is not to start by fishing around for an XML DTD or schema that they can use. Start by thinking about the _vocabulary_. What are the components of your books, what do you call them, what do you need to do with them? THEN you can figure out the best way to express that, and accomplish those goals (what XML is _for_), in XML, whether by adopting or adapting a standard like NLM or DocBook or TEI or just going straight to XHTML. Those are all _general purpose frameworks_, and they can all be made to align with the structures and semantics you need. The problem is that people start at the wrong end, and try to shoehorn their content into a pre-existing model, thinking it will solve all their problems. If you start by thinking about what you want XML _for_, and thus what distinctions you need to make _in your content_, you will get XML that works for you.
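To make the vocabulary-first idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `class` values (`chapter`, `chapter-title`, `epigraph`), the sample text, and the helper names `chapter_titles` and `count_components` do not come from Hachette or any other publisher mentioned above. It only shows that well-formed XHTML can be read by an ordinary XML parser and queried by the component names a publisher has chosen.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment. The class names stand in for a house
# vocabulary ("what are the components of your books?").
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <section class="chapter">
      <h1 class="chapter-title">Arrival</h1>
      <blockquote class="epigraph"><p>All journeys begin somewhere.</p></blockquote>
      <p>The train pulled in at dawn.</p>
    </section>
    <section class="chapter">
      <h1 class="chapter-title">Departure</h1>
      <p>She left before noon.</p>
    </section>
  </body>
</html>"""

# XHTML elements live in this namespace, so queries must name it.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def chapter_titles(xhtml_text):
    """Return the text of every chapter title, in document order."""
    root = ET.fromstring(xhtml_text)
    return [h.text for h in root.findall(".//x:h1[@class='chapter-title']", NS)]


def count_components(xhtml_text, class_name):
    """Count elements tagged with one name from the (hypothetical) vocabulary."""
    root = ET.fromstring(xhtml_text)
    return sum(1 for el in root.iter() if el.get("class") == class_name)


if __name__ == "__main__":
    print(chapter_titles(XHTML))
    print(count_components(XHTML, "epigraph"))
```

The parser rejects markup that is not well-formed, which is the practical difference between XHTML used this way and "tag soup" HTML.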

    As you know, I could go on all day on this subject, so I’ll stop. But first let me make a sly plug for your upcoming book “The Metadata Handbook,” and point out that XML is also fundamental to metadata, which is fundamental to publishing. All publishing.

    So is XML working for publishers? Of course it is, all day every day. It’s just a matter of _for what purpose_. And increasingly, it is becoming fundamental to everything they do. (Okay, maybe not authoring. Okay, maybe not editing.) I know typesetters (my employer, Apex, is only one of them) who use XML _even when the customer doesn’t ask for it or get it_ because it makes the whole workflow work more smoothly. Giving up on XML? Gimme a break.

    –Your old friend and one of your biggest fans, Bill Kasdorf

  • David Blyth

    Oct 28th, 2012 : 11:29 PM

    Good points.

  • Thad McIlroy

    Oct 29th, 2012 : 7:16 PM

    Excellent, Bill. Thank you. I’m with you on “When pressed, what most people mean by ‘XML first’ is ‘XML for every aspect of the workflow after me.’”

    It points to something positive about the XML challenge: there’s a much higher degree of awareness in the book publishing community of the possibilities of XML as a tool than there was 5 or 10 years ago.

    Jim King at Adobe has many insights into PDF, and “XML for what?” is a central question. And the answer in my mind is XML for all of the things that XML can do for a publisher, things that are increasingly essential as publishing becomes an ever-more challenged business. Those include:

    1. XML to reduce composition costs, and at the same time, make it possible to more-or-less automatically reflow content into a wide range of formats for electronic and print publishing (including too many ebook formats).

    2. XML to “chunk” content to make it possible to recombine “reusable content objects” into new content products, often as part of online databases.

    Most important these days (and why I think we’ve reached a crisis point in XML adoption) is that XML affords the opportunity for rich semantic tagging. There have long been a range of benefits in STM publishing for rich semantic tagging.

    The big difference today is that semantic tagging has become the key to discoverability for all books. Working on The Metadata Handbook with Renée Register I finally realized that the ultimate book metadata is the entire text of the book. Search engines are sufficiently sophisticated today to derive a lot of the meaning of the text without markup. But the final word in discoverability is a semantically-enriched text.

    You write “Does a publisher of fiction need a repository of XML for repurposing and recombining content?” Perhaps not. But with hundreds of thousands of new books of fiction now being published each year the authors and publishers need the text of even a book of fiction to be semantically-enriched to maximize the possibilities for that book to find its broadest and most diffuse online audience. In the end, there’s no other way. Amazon features 2,458 results for “Book of the Dead”. Which one were you looking for?

  • John Dobbin

    Oct 28th, 2012 : 3:29 PM

    My very brief answer to your question Thad:

    When we (at Nexus in Sydney) first started presenting XML publishing solutions to publishers it was met with a lot of resistance from senior management. It was difficult for them to penetrate the jargon, and, as techos, we did a poor job of speaking plain English. It demanded change, but the only one that truly likes change is a wet baby. But most importantly, it didn’t offer significant enough financial benefit; publishers were too comfortable.

    Then along came Amazon. And then came the iPad. Publishers were catapulted out of their comfort zone by seismic changes to their industry. Now they are rushing to embrace the new world order and XML is certainly a part of that. Today we are flat out deploying XML based solutions across all categories of publishers. I wish I had time to expand but have three separate projects on the boil. I don’t think XML Publishing has failed, I think it is just getting started.

  • Thad McIlroy

    Oct 29th, 2012 : 7:16 PM

    Thanks, John. I’m encouraged. Certainly the proliferation of digital formats required for every new publication jumpstarts the ROI on XML. If it can get publishers past the jargon and the complexity it’s a big win.

  • Steve Werner

    Oct 28th, 2012 : 7:34 PM

    I’ll give a short answer to your question: Even though I’ve been a trainer/consultant in the graphics field for a very long time, I still place myself in the 95% who dream non-XML dreams.

    Just like you, I’ve survived through many transitions, but the crowd that I hang with (focused on InDesign, Illustrator, Photoshop, and getting heavily into digital publishing) don’t work with XML. The clients I teach in conferences, classrooms or online aren’t asking for it, indeed don’t even know it exists.

    I’ve gone to a few XML sessions at InDesign Conference and PEPCON, but then resist actually going through the slog that it would take to learn it, and, having no one asking for it, haven’t found it worth the trouble. (However, I have been going through the slog of learning to build eBooks with EPUB because there is strong demand there. And I’m starting to do that with Digital Publishing Suite for the same reason.)

    Another sign: I’m one of the regular contributors on the Adobe InDesign forum. Whenever (rarely) someone asks an XML question, it almost always hangs there unanswered. There are so few people who have the expertise to answer the question in that venue. By contrast, there is a pretty vibrant community of people developing books with EPUB. (I know it’s not the same thing, of course, just that it’s a technology that has movement and a strong community of support.)

  • Bill McCoy

    Oct 29th, 2012 : 7:53 AM

    That vibrant community of people developing books with EPUB is using XML! (see Bill Kasdorf’s comments above).

    My POV is that most trade book publishers won’t benefit from adopting a domain-specific or custom XML schema (the classic “XML first” architecture) but at the same time no publisher should leave their IP stuck in whole-book asset files (whether ID files or printers’ PDFs or even EPUB files). For marketing snippets and many other purposes a content repository makes sense, and the content nuggets will need to be represented in some manner in that repository. XML is not magic but almost certainly it makes sense to use some general XML schema for portability and toolability rather than plain text or RTF or “tag soup” HTML. XHTML is a natural choice and it comes for free if you are creating EPUB.

  • Thad McIlroy

    Oct 29th, 2012 : 7:15 PM

    Hi Steve: You’ve been one of the top trainers in publishing for as long as I’ve known you, and that’s got to be heading towards 25 years. It fascinates me to read your perspective on this issue: a big crowd of working designers and page engineers for whom XML is just not an issue. Thanks for writing.

  • David Blyth

    Oct 31st, 2012 : 7:54 AM


    What’s driving XML isn’t publishing or writing at all. It’s engineering. It’s easy to use (for an engineer) and simplifies or solves a lot of engineering bêtes noires.

  • Scott Abel

    Oct 28th, 2012 : 9:16 PM

    There’s much more to XML than the angle brackets and the opinions of consultants. Most dive too deep into this markup well and start explaining how fabulous the publishing industry is already because they use XML somewhere in the workflow. Who cares? That’s missing the point. Adopting XML for ANYTHING is a business decision, not a technical one. It doesn’t rely on what people think, their previous experience, blah, blah, blah… It involves (in my view) whether or not XML can help throughout the entire content lifecycle, and if so, how. Oh, and, of course, there’s the cost benefit ratio.

    When I speak to C-Level management, I have no problem selling the notion of why an organization should (or shouldn’t) include XML into their business mix, simply by looking at how efficient they are today. By closely examining the many ways that they waste time ($) and effort ($) on dozens of non-value-added tasks, there’s often a case for improvement to be made. Whether XML is used is not the issue at this point. It’s like a company’s first Alcoholics Anonymous meeting. First you have to admit you have a problem (“Hello, my name is McGraw-Hill and I take investors’ money and spend it making books in the most inefficient, yet traditional, manner possible”). Sorry MHP. #Fail.

    Becoming operationally efficient is the goal of many organizations that don’t think of themselves as publishers. And, yet — they are publishers. They create content. Content is the lifeblood of their organization. It’s the stuff that powers everything they do. With such responsibility placed upon it, it’s surprising that it’s taken so long for most organizations to realize that content is a business asset worthy of being managed efficiently and effectively — just as efficiently and effectively as the parts they use to manufacture cars, or the dollars they use to invest in portfolios, or the time they manage to get packages from one place on Earth to another in record time. Those who admit they are inefficient (“Hello, my name is McGraw-Hill and I have a problem”) are able to start thinking about how to improve their content production lifecycle — including authoring, content management, governance, workflow, delivery and every other time-sucking set of tasks currently loaded up with manual, inefficient processes.

    This model is not mine. I’m not THAT smart. It’s not even our idea (XML folks). It was likely thought of by smart business-minded folks many years ago and implemented (and made most famous) by the Japanese automobile industry. It’s simply a matter of becoming as efficient as possible. Stripping away all the unnecessary steps in the content manufacturing process often leaves us with a clear indication of just how inefficient we have been. Once you see the possibilities — and the numbers ($), change becomes much easier.

    What traditional publishing needs is for there to be a shakeup caused by the investors and the industry media. In my view, book publishing will go the way of the recording industry (it’s actually already heading there). I expect to see companies merge (Penguin House?) and some others try desperately to hold onto their outdated modes of operation, just adding improvements along the way in a piecemeal fashion, while the big players (the technology folks) change the publishing paradigm and provide authors with tools that will likely use XML (authoring, content management and delivery). I see it coming. Maybe I’m wrong. After all, I have been before.

    But, I doubt it. I’m working with a small publisher that understands an XML workflow. They see the benefits. They understand the efficiencies possible. They even have ideas how to change their product offerings by recombining various assets (using XML component content management and dynamic, device-agnostic delivery) into new digital products. But, they weren’t quite sure how to get their authors to author content in XML. They said, “Our authors are experts in their field, but not in XML authoring. We’ll never get them to be able to do that.” Then I suggested we build templates in XML that authors can fill out. Add a WYSIWYG editor (that has XML under the covers) and encourage them to make the change by sharing some of the wealth (savings from becoming hyper efficient) with them as an incentive. As it turns out, when you give a subject matter expert author two options: (1) send us your MS Word file and we’ll do all the work and we’ll pay you [insert traditional contract terms here] or (2) use our guided authoring template to submit your book (don’t worry, it’s easy to use – here’s how [link]) and we’ll pay you [insert much more attractive contract terms here that include higher percentage cut], authors almost always choose to fill out the template and provide the content in XML (without knowing they are doing it) in exchange for more money. This is a no-brainer!

    Most publishing industry folk have yet to realize several key things: 1) They don’t make books. 2) They are now technology companies. And, 3) everyone is a publisher.

    They also haven’t been able to escape the little box in which they have found shelter and a sense of security for many decades. Times have changed. We’ll have to see where this all leads. But, whether they adopt XML for the entire lifecycle (or not) is not even the real issue. It’s whether they re-imagine the book and the processes and approaches (standards and tools) they use to do the good work they do. If they do it by leveraging the work done by private industry, they can easily benefit from more than a decade of lessons learned, false starts, mistakes and resulting best practices.

    I produce, along with Ann Rockley (author of “Managing Enterprise Content: A Unified Content Strategy,” 2nd Edition, New Riders 2012), the annual Intelligent Content Conference, held next in San Francisco on February 7-8, 2013. We’re just getting the roster up now. The theme is “Corporate Publishing” and will showcase examples, standards, methods, and tools needed to deliver the right information, to the right people, at the right time, in the right format and language on the device(s) of the user’s choosing. I hope you’ll join us! http://www.intelligentcontentconference.com

  • Thad McIlroy

    Oct 29th, 2012 : 7:15 PM

    Wow, Scott: what do you really think 🙂 ?

    Where I differ with you is not on goals but on execution. I couldn’t agree more that all publishers (and that includes lots of “content producers” who are slowly learning that they’re publishers too) need to use the best tools that provide an optimal ROI.

    I wish that robust XML workflows could be the answer. I’ve wished that for a long time. But after watching what’s really happening in the field I’ve decided to stop blaming the publishers and start questioning the efficacy of the tools.

  • David Blyth

    Oct 28th, 2012 : 11:28 PM

    No. XML is not to blame. Publishing is designed around hard copy books. XML is designed for passing info between dbases on the Net. So asking if XML has failed publishing is like asking if jet engines have failed the automobile or the steam boat. They’re designed for different purposes on different media from the get go.

    BTW, there’s no reason why XML must be complicated. XML is a way to _design_ a language more than a language per se. If you want to design a simple XML language, then do so.
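As a rough sketch of what "design a simple XML language" can mean in practice, the Python below defines a five-element book vocabulary and checks a document against it. The element names (`book`, `chapter`, `title`, `para`, `emph`), the `ALLOWED` table, and the `check` function are all hypothetical, invented for this illustration; a production workflow would more likely express the same rules in a DTD or schema language.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini-vocabulary: each element maps to the set of child
# elements it may contain. Five names is the whole "language".
ALLOWED = {
    "book": {"title", "chapter"},
    "chapter": {"title", "para"},
    "title": set(),
    "para": {"emph"},
    "emph": set(),
}


def check(xml_text):
    """Return a list of problems; an empty list means the document fits."""
    problems = []
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    if root.tag != "book":
        problems.append(f"root must be <book>, found <{root.tag}>")
    stack = [root]
    while stack:
        parent = stack.pop()
        allowed = ALLOWED.get(parent.tag)
        if allowed is None:
            problems.append(f"unknown element <{parent.tag}>")
            continue
        for child in parent:
            if child.tag not in allowed:
                problems.append(f"<{child.tag}> not allowed inside <{parent.tag}>")
            else:
                stack.append(child)
    return problems


GOOD = ("<book><title>T</title><chapter><title>One</title>"
        "<para>Hello <emph>world</emph>.</para></chapter></book>")
BAD = "<book><chapter><table/></chapter></book>"

if __name__ == "__main__":
    print(check(GOOD))
    print(check(BAD))
```

This sketch only checks which elements may nest inside which; it says nothing about order, required elements, or attributes.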

    Also BTW, if you want a workflow language, try UML or WS-BPEL, not XML. Again, it’s an apples and oranges comparison. XML isn’t designed for workflow. Web-Services Business Process Execution Language is.

  • Thad McIlroy

    Oct 29th, 2012 : 7:14 PM

    @David Blyth
    Publishing is designed around pages, and the publishing industry has been challenged to translate the paper-based page metaphor to digital devices. XML’s lineage is in SGML, a document markup language. XML has subsequently proved also to be adept in moving data around the Internet, but that was not its original purpose.

    I appreciate your point about taking a simplified approach to XML, but here I’ll bring back your jet engine. I think that XML as currently constituted is in fact a jet engine and when publishers try to use it to push a book from one mobile device to another they find themselves unable to understand the jet engine’s manual and mechanisms.

  • Don Day

    Oct 31st, 2012 : 6:33 AM

    Thad, I appreciate your willingness to stand up for the issues that publishers themselves are having with XML concepts. The problem seems to be that publishers in your realm of content production have either been exposed to the wrong tools or that the XML consultants and providers in that space have not done their requirements analysis properly for whatever adoption issues your community is having. If the problem is about XML syntax, it could not be much simpler and still provide identity, scoping, properties, and access to both validating and well formed parsing services needed by effective publishing applications. XML is the language of the Web–HTML itself borrows so heavily on the concepts that lay users cannot tell the difference, and with HTML5’s defined algorithms for parsing tag soup HTML, the Web is becoming more well-formed all the time. XML’s pervasiveness makes statements like “they find themselves unable to understand” sound like someone who is in denial of the existence of 21% oxygen in the air “as currently constituted” while clearly breathing it every day. We might delve into the real issue by making sure we don’t keep bringing back the strawman positions:

    1. Is XML too complex? As I said, the syntax and delimiters are about as reasonably concise as can be. We could avoid the end tag by reverting to a brace or brackets syntax like Scheme or LISP, but clearly HTML has won acceptance for the current format for the Web, and it is quite readable as a clear text format, so let’s put the syntactic complexity argument to rest.
    2. Are XML publishing applications too complex? For those already using XML for publishing workflows, the question is nonsensical because they are already getting so much benefit from reliable production workflows based on common and well-hardened tools, years of community experience, and the repeatability of process inherent in the business rules-like aspect of XML processing transformations. While there is some learning curve involved in adopting these tools, all trades have their skills and proficiency certifications. XML publishing concepts are taught every day at community colleges. If you want to fly high, you have an incentive to understand those jet engines!
    3. Are XML authoring applications too complex? Without a doubt, for the most part. Fully validating XML editors are not simple tools to start with, and the most common use case of technical writing brings with it the requirement for these tools to integrate a plethora of other business process concerns for technical publishing: content management, review process, reading level analysis, term checking, trademark finding, managing conditionality, running quality assurance, and much more. So yes, XML editors are complex, and so are the tasks that they help to facilitate. But if I may characterize the authors you’ve mentioned as primarily book authors, then they have a different set of concerns that are adequately met by word processors. For them, we may as well ask, Is Microsoft Word too complex? It can be if you require it to be, but you can also use it in a “sufficient for the task” mode where it serves quite well (apart from not producing valid, presentation-free, properly-scoped and semantically rich markup as part of the deal).

    If I understand the root issues correctly then, maybe we can define a manifesto to get the right focus on the essential issues:
    * XML as a publishing technology and set of tools for book publishers needs to be marketed and taught in a systematic and approachable manner from the point of view of someone supporting those more general publishing requirements.
    * Communicate the advantages of XML more effectively into that community. It is not for lack of trying; organizations like GCA and IDEAlliance have been promoting XML and solutions like PRISM for general publishing for years. We could use your help in identifying where and how to direct a messaging campaign designed to lower the perceived mental barriers to XML adoption. What do executives and managers in publishing need to know? Engineers and programmers and press personnel? The staff writers? The extended writing pool?
    * Bring the right knowledge into this space. Perhaps everything that people think they know about XML publishing is all wrong. Is everyone aware that XSL-FO is not the only way you can publish XML? CSS3 has powerful layout and transformational capabilities for print that tap into 99% of the knowledge you may already have for producing well-adorned Web sites or eBooks. The XML plugins available for most word processors including Word and FrameMaker enable composition directly from those very capable engines. You can leverage any existing publishing tool by transforming XML content to its required format and continue doing business as usual while you plan for possibly moving to an all-XML workflow somewhere down the road. XML is NOT a case of converting beliefs and experiences at the point of a sword!
    * Simplify the XML vocabulary that authors need to work with. What XML vocabularies are needed for general publishing? Not many. As Dick Hamilton pointed out, there are already several standard vocabularies that are easy to adopt since both commercial and open source production tools are already available. But even HTML has more cruft than authors ever need to be exposed to. The beauty of XML is that it enables paring down the input requirements on the workflow to just what is required. Architectures like DITA make it easy to constrain the input experience to a set of only the necessary elements for the given authoring scope with no major adjustment to the CMS or production system–it just works.
    * Finally, simplify the authoring systems for writers who don’t need the full scope of XML workflow and business rules management. There are some very helpful things about XML editors that writers can quickly learn to appreciate, especially the ability to select and move whole structures at a time. If they do not need to manage attributes, then don’t show that sub-editor. If the steps to associate a title to a figure are difficult, provide a dialog or form to manage the input. If they are expected to insert semantic phrases or structures, provide a button or selection list at that insertion point. Do they need to ever see the markup? I would hope not, if we are treating XML just as we would a binary format. In fact, do authors need to be using XML under the covers in the first place? In some cases, probably not. I see no reason why popular writing tools such as Scrivener can’t be incorporated as authoring front ends into a regulated content workflow for high end XML-based publishing. Fix the authoring tool expectations, and you will likely turn the problem around from a case of supply-side deficiency to supply-side abundance of good XML content to work with. Wouldn’t that be a good problem to have?
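    To make the "pare down the vocabulary" point concrete, here is a minimal sketch in Python. It is not how DITA constraints or any commercial editor work. It only shows that checking a manuscript against a small, agreed set of elements takes very little code. The element names in ALLOWED and the sample document are hypothetical, invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical pared-down vocabulary for a simple book.
# These names are illustrative only; they are not taken from DITA or DocBook.
ALLOWED = {"book", "title", "chapter", "para", "emphasis"}


def unexpected_elements(xml_text):
    """Return the sorted names of elements that fall outside ALLOWED."""
    root = ET.fromstring(xml_text)
    found = {el.tag for el in root.iter()}  # iter() includes the root element
    return sorted(found - ALLOWED)


SAMPLE = """<book>
  <title>Example</title>
  <chapter>
    <title>One</title>
    <para>Some <emphasis>marked</emphasis> text.</para>
    <sidebar>Not in the vocabulary.</sidebar>
  </chapter>
</book>"""

if __name__ == "__main__":
    # Lists every element name the author used that the workflow does not expect.
    print(unexpected_elements(SAMPLE))
```

    A real workflow would more likely enforce this with a schema in the authoring tool, so the writer never sees a disallowed element in the first place. The check above is only the after-the-fact equivalent.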

    Does this manifesto help in defining some positive take-aways from the perception that XML is a failure for publishing? What can we change or add?

  • Thad McIlroy

    Oct 31st, 2012 : 7:43 PM

    @Don Day

    Lots to chew on here, Don.

    As I’ve been moderating these comments it strikes me that (at the risk of over-simplifying) there are two kinds of participants here: those whose work fully immerses them in XML, and who can no longer really appreciate what it’s like for everyone else, and those whose work features XML only in their advanced tool set.

    A straightforward statement like “XML is the language of the Web” must seem to you like “the sun rises in the East”. But for the average non-techie in publishing the language of the web is English.

    You can accept that XML authoring applications are too complex. I suggest you also recognize that many of the people who work in publishing are failed would-be authors who went into publishing to get a paycheck. A lot of the publishing personnel being tasked with XML today came up through editorial and WYSIWYG production. Of course it’s way over their heads.

    Where I’m with you is at this statement: “XML…for book publishers needs to be taught in a systematic and approachable manner from the point of view of someone supporting more general publishing requirements.” And we’re going to have to speak to users to find out what will make XML approachable and general.

    Thanks for commenting.

  • David Blyth

    Oct 31st, 2012 : 7:50 AM


    That’s an interesting viewpoint. I like it. I just don’t think that the problem is in the manual (to stretch the analogy) – the manual is clear. I think the problem is in the publisher’s mindset, tho we may agree on that.

  • Richard Hamilton

    Oct 29th, 2012 : 8:32 AM

    As a publisher, XML has definitely not failed my company (XML Press). We have an XML back-end process that lets us generate all of the output formats we need from a single source. And we’re able to do things that would be impossible, or extremely difficult, with an unstructured format.

    That said, there’s no way I can ask every author to write directly in XML. Some can, because they already use XML in their work environments, but for others it’s a show stopper.

    Our approaches have varied, but for authors who aren’t using XML, the most successful so far has been to work in a wiki that exports content either directly in XML or in a format that is easily converted. That gives us the advantages of a wiki (collaboration, source control, etc.) on the front end and the advantages of XML on the back end.

    BTW, I’d argue that the best thing for most authors is not ABCML, it’s no-ML. I’m not convinced that markup simplicity is a solution. Even though wikis use markup that is arguably about as simple as you can get, the trend is still towards WYSIWYG interfaces. The issue is making the mechanics of writing as painless as possible for the writer, while still capturing the structure and metadata needed to run your back-end processes.

    Regarding complexity, I think Bill Kasdorf nailed it in his comment. Publishing is complex, and you need tools that can deal with that complexity. XML is one of those tools.

  • Thad McIlroy

    Oct 29th, 2012 : 7:10 PM

    @Richard Hamilton
    “…for [some authors writing in XML is] a show stopper,” is something I hear all-too-frequently. The author of the to-be-published text cannot/will not take part in the workflow. That has to be called a flaw. Sure, we’ve got workarounds for this problem. But isn’t this representative of the larger problem with XML workflows for most book publishers?

  • David Nelson

    Oct 29th, 2012 : 8:47 AM

    For years I have struggled with the creative challenge of transforming printed books into digital form. Despite my pleas, the authoring, editing, and design work for our books is rarely allowed to pause so that my eBook crew can offer observations about why the randomness of certain tables, figures, and odd arrangements of text keep the save-as-EPUB button from making the process simple and painless. But…something wonderful has begun here at our small publishing house. Our eBook sales are increasing and suddenly eBooks are no longer a novelty, no longer something other staff tolerate hearing me enthusiastically talk about. A prediction of mine was recently voiced by our editor as an innovative new idea: let’s publish the eBook FIRST and then hand the files over to the art department and let them arrange the text and add decorative graphic devices as they wish. Instead of working with author-supplied, randomly styled Word files, the art crew will receive hierarchically correct, consistently tagged text files with thoroughly thought-out content organization for each and every content object in the book. For a small publisher like us, receiving XML source files (or even consistently styled Word docs) is a dream I gave up on long ago. BUT… now that I will be the main content conduit between editor-and-eBook, I will have the opportunity to review author manuscripts and shape the chapter structures BEFORE editing starts.

    And if the InDesign folks want to get into the act, importing XML into InDesign with tag-based formatting, all the better. And if they don’t, well that’s up to them. In my previous job, I fought the good fight for XML publishing and the holy grail of single-source outputting. While I learned a lot from that experience, I ended up being perceived as more of the problem than a promising solution. I continue to use XML (and all its flavors) and enjoy the continuing ride into EPUB3 and apps creation. But I have totally given up on trying to persuade authors, editors, and typesetters (oh, do we call them typesetters anymore?) to even think about the letters X, M, or L.

  • Thad McIlroy

    Oct 29th, 2012 : 7:10 PM

    @David Nelson
    It’s always interesting to get the word from the trenches. You’re not the only one who has given up on trying to persuade authors, editors, and…compositors…to think about the letters X, M, or L.

    I’ve thought of ebook-first as a strategy and it holds some appeal. The problem in my mind is what I call “bookishness” — authors still dream in print, not in digital, and for many authors dreaming in print influences how they approach the structure of their narratives. There’s nothing wrong with this: it’s worked well for centuries. Today’s ebooks are the pablum of book formats: bland, sloppy and ill-formed. The appearance conveys so little meaning. I’m not willing to say goodbye to design and structure as a component of the experience of writing and reading a book.

  • Thad McIlroy

    Oct 29th, 2012 : 1:47 PM

    “Semantic markup and automation will win in the long run because they offer compelling advantages in delivering a variety of products…” 100% agree. The process is invaluable and necessary. If only we could create tools that civilians found easier to use.

  • Thad McIlroy

    Oct 29th, 2012 : 1:50 PM

    Thanks for sharing both successes and failures, Steve. There is still so much to learn.

  • Stefan Gentz

    Oct 29th, 2012 : 3:03 PM

    First, David Blyth, XML is not a jet engine. And content is not a steam boat. Content is a rocket and XML is the launch vehicle. And old binary formats are silos. Let’s face it: We all know all the problems that come from unstructured, binary locked, inaccessible content. Especially when it’s not at least »pseudo-xmled« (that is, properly tagged with »speaking« paragraph, character etc. styles with a »technically invisible« but present structure and content logic).

    Second, the »XML is too complex« legend: No, it is not. I have given a great many XML trainings over the last years, primarily for technical writers, translators and graphic designers. And it has never taken me longer than half a day to give the trainees a basic understanding of XML. At the end of the day they understand XML, have developed their own first “mini architecture” and have built their first document on it.

    The first thing I tell my first-touch-with-xml trainees in the morning is:
    1. Forget everything you have heard about XML.
    2. XML is no rocket science.
    3. XML is fun. And satisfying. Like cleaning up.

    And you know what? It works! Of course: We can build huge architectures on XML and ancillary technologies. And yes, we can make it really, really complex. I have seen »grown« corporate XML architectures that might need days to understand. Some of them were indeed so complex that even the architects themselves got lost in them. So, yes, XML can be a pain in the neck. But so can unstructured Word documents with myriads of styles and formatting overrides.

    Third, David, maybe XML was not explicitly designed for traditional book publishing. It was designed with something much more general in mind, book publishing being just a part of it. The W3C writes: »The Extensible Markup Language (XML) is a simple text-based format for representing structured information: documents, data, configuration, books, transactions, invoices, and much more.« (http://www.w3.org/standards/xml/core). So yes, technically it’s for book publishing. And yes, there’s no reason not to use it for book publishing.

    And no, cost is not an issue. Cost can only become an issue if you have bad consultants and get stupid advice. Every proper and fair XML publishing expert can set up an XML authoring and publishing ecosystem for $10,000–$50,000 for small to medium complex requirements. And that includes a couple of licences for a proper XML content authoring software, building a Corporate XML Architecture and, yes, author training. Base your work on a standard and you can get it even cheaper. Put a few ten thousand bucks more into the pan and you also get a CMS and legacy data migration. It just scales up with the size of the company, world-wide rollout/training, licences and how much legacy content you want to migrate. And seriously, most bigger companies probably spend more money per year on toilet paper and cleaning stuff. If every company put the same amount of money that it spends on contract cleaners per year into cleaning and refactoring its content ecosystem, it could take away most of the pain very fast.

    Thad McIlroy requests: »It’s time to develop ABCML, the ABC of markup languages.« and asks: »Who’s with me?«. I’m not. Not that I don’t like the idea. I do love KISS (Keep it short and simple – okay, this post contradicts that …). But because that was exactly the idea behind XML. XML already has everything we need to create simple structures. You can create a classic novel with just a handful of elements. And a children’s book as well. If you use some attributes like »role«, »class« or the like, you can keep it very lean. We only need to resist the temptation to extend elegant structures with case-based requirements. If you look into the early discussions in the late ’90s, suggested names were »MAGMA« (Minimal Architecture for Generalized Markup Applications), »SLIM« (Structured Language for Internet Markup) and »MGML« (Minimal Generalized Markup Language). (See http://en.wikipedia.org/wiki/XML#History).
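    As a rough illustration of the "handful of elements" claim, and of the single-source idea raised earlier in the thread, the sketch below uses Python's standard library to turn a three-element novel markup into both HTML and plain text. The element names (novel, chapter, title, para) and the sample text are made up for this example. This is not any publisher's actual pipeline, and it ignores inline markup inside paragraphs.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical minimal vocabulary: novel, chapter, title, para.
NOVEL = """<novel>
  <title>A Short Tale</title>
  <chapter>
    <title>Beginnings</title>
    <para>It was a quiet morning.</para>
    <para>Nothing stirred &amp; nobody spoke.</para>
  </chapter>
</novel>"""


def to_html(xml_text):
    """Render the novel as a flat run of HTML headings and paragraphs."""
    root = ET.fromstring(xml_text)
    parts = ["<h1>%s</h1>" % escape(root.findtext("title", ""))]
    for chapter in root.findall("chapter"):
        parts.append("<h2>%s</h2>" % escape(chapter.findtext("title", "")))
        for para in chapter.findall("para"):
            # Only the leading text of each paragraph is used; inline
            # child elements are deliberately out of scope for this sketch.
            parts.append("<p>%s</p>" % escape(para.text or ""))
    return "\n".join(parts)


def to_text(xml_text):
    """Render the same source as plain text with an upper-cased book title."""
    root = ET.fromstring(xml_text)
    lines = [root.findtext("title", "").upper(), ""]
    for chapter in root.findall("chapter"):
        lines.append(chapter.findtext("title", ""))
        lines.extend(p.text or "" for p in chapter.findall("para"))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_html(NOVEL))
    print(to_text(NOVEL))
```

    Both outputs come from the same source string. Whether a vocabulary this small survives contact with a real trade list (footnotes, epigraphs, letters, verse) is exactly the question this thread is arguing about.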

    And, finally, we already have HTML. HTML5+CSS3 comes with more or less everything needed to create every possible kind of content. Yottabytes of content will be marked up in HTML5 soon. With just about 120 elements and a handful of attributes. And, thanks to the class attribute, it is highly flexible. And I would be even more radical and throw away quite a lot of those elements. Anyone who wants to tell me that this is not enough for his content authoring requirements will need some very good arguments to convince me. And will need to remember that I can guide authors through structure and hold them to it with a proper authoring tool like FrameMaker (read: take out some of the flexibility of HTML). Arguments so convincing that Yottabytes and billions of content producers weigh less than they do.

    So, has XML failed publishing? No. But some old schoolers from the last century have. Actually there are still quite a lot of them. Big, fundamental changes can take decades. XML is not even 15 years old. But it has already come a long way in this short time! Now it’s up to the publisher to finally jump on the wagon. The XML train is just starting up. But it’s getting faster and faster. Publishers need to understand very soon that they will have to jump now. There might be a handful of years left, but hey, remember that question: Whoops? Has really another year passed already?

    Proper processes are the backbone. XML is the mother missile. The booster for all kinds of information today and in the foreseeable future. XML is not an illness. It will not go away. It is there. And it’s already everywhere, even if you do not see it (yes, even in .docx). It’s not a pain in the neck. But binary locked content is. Last century thinking is. Medieval processes are.

  • Thad McIlroy

    Oct 29th, 2012 : 3:24 PM

    @Stefan Gentz

    I’m always encouraged to read upbeat success stories like yours.

    But then I have to ask myself: are all the failures just fools who don’t like having fun cleaning up with XML? Surely not. There are some problems here somewhere. My ABCML imagines a much simplified vocabulary around a similar structure. I sense that’s probably not the solution, but I tossed it out there.

    I’m with you in loathing (and fearing) “binary locked content” and “last century thinking”. But XML’s lineage goes back to Charles Goldfarb’s work on IBM’s Generalized Markup Language (GML) in the 1960s. That’s a l-o-n-g time ago!

  • David Blyth

    Oct 29th, 2012 : 4:04 PM

    Hi Stefan;
    1) “First, David Blyth, XML is not a jet engine. And content is not a steam boat. Content is a rocket and XML is the launch vehicle.”

    My point was simply that XML is not explicitly designed for publishing. This is something with which you somewhat agreed in your Point #3 when you said “Third, David, maybe XML was not explicitly designed for traditional book publishing.”

    So yup. We agree. I’ll comment more on this when I get to Point #3.

    2) “Second, the »XML is too complex« legend:”

    I never said it was. In fact, I said “…there’s no reason why XML must be complicated.” So yup. We agree.

    3) “It [XML] was designed with something much more general in mind, book publishing being just a part of it. The W3C writes: »The Extensible Markup Language (XML) is a simple text-based format for representing structured information: documents, data, configuration, books, transactions, invoices, and much more.« ”

    Agreed. The W3C says that XML is designed to represent structured info. Which means that it was not explicitly designed for publishing. If you need something explicitly designed for publishing try DITA or DocBook.

    BTW, not every published (hard copy) doc is structured, or at least structured easily. But I digress.

    4a) “And no, cost is not an issue.”

    Here I must disagree. Of course cost is an issue. Our group has trouble getting a few hundred smackers to attend a conference, much less 10K.

    4b) “And seriously, most bigger companies probably spend more money per year for toilet paper and cleaning stuff.”

    Absolutely. But that doesn’t mean they’ll spend it. There are documented cases where companies decide to spend a million to advertise their improved documentation rather than spend the same million to actually improve their documentation (thus saving 10 million). The same applies to XML publishing. The _ability_ to do something does not give someone the _wisdom_ to actually do it.

    5) “So, has XML failed publishing? No. But some old schoolers from the last century have.”


    6) “Proper processes are the backbone. XML is the mother missile.”

    Agreed. We may be near the same wavelengths, if not actually on the same one.

    Take care!

  • Joe Gollner

    Oct 30th, 2012 : 6:28 PM

    Thanks Thad for raising the question “Has XML Failed Publishing?” It is important to raise these questions from time to time, even if they spark a little controversy. I must also distribute thanks to all the commentators who have jumped in as there have been a lot of good points raised. And it is good to see the exchanges moving towards consensus and better still consensus that stands on firm ground. Although it may lead to a destabilization of the consensus, I am inclined to go back to Thad’s original question and to suggest that while it has some obvious refutations it also has some merit that we would do well to consider. But first, as something of an old-timer in the world of markup languages, I thought I would toss a few observations in just for good measure.

    The topic of XML and Publishing reminds me of a great man who is part of the story behind XML. The gentleman I am thinking of is Yuri Rubinsky. Back in the 1980s, he founded a company called SoftQuad and they released an SGML Editor named Author/Editor. Yuri sprang to mind because I remember how, in the very early 1990s, he somewhat quixotically undertook to show how a piece of fiction (a novel by Margaret Atwood) could be simultaneously published in several print formats as well as in braille and as a voice synthesized recording if the source had been prepared using the Standard Generalized Markup Language (SGML). Yuri’s goal was to enable unrestricted access to published work and it is also worth noting (as well as not being surprising) that Yuri was one of the first people to advocate for web accessibility standards. So Yuri saw an application within publishing for which structured markup would offer an overwhelmingly potent answer. It turned out, however, that the moral attractiveness of making content instantly, and cost-effectively, available to all potential consumers was not enough to spark the interest of most commercial publishers.

    This story helps to underscore an important point that has surfaced in the exchanges sparked by this post: XML provides a mechanism for specifying a markup language and that everything turns on the application to which it will be put (the answer to the “for what” question).

    Going back to Yuri Rubinsky again, it is my recollection that it was Yuri who was the most impassioned advocate for bringing markup intelligence to the web, back in the days when the web was brand new. It may sound like a bold claim, but it is one that I am happy to defend: I have routinely called Yuri the spiritual father of XML – the champion for intelligent content on the web who was the most thoughtful in balancing the competing demands that were, at the time, contending over what would later be called XML. And among the battling forces at work at the time, he was the one person who most completely embodied the need for XML to retain and to bolster features from SGML that were specifically designed to support publishers. As some of you know, the story of Yuri Rubinsky ends abruptly in early 1996 when he passed away to the shock of everyone in the industry. Besides being a loss that many of us still feel acutely to this day, Yuri’s passing left a massive hole in the community that was forging what would become the Extensible Markup Language (XML) recommendation. What was lost was a potent advocate for the needs of publishers.

    David raised one point that I have touched upon on several occasions in the past when he mentions that “XML is not explicitly designed for publishing”. In a presentation I gave in 2008 called “XML in the Wilderness”, and in a whitepaper I wrote not long after that called “The Emergence of Intelligent Content”, I argued that whereas SGML very much reflected a focus on publishing requirements (in their full-bodied complexity) XML can be constructively understood as a flight away from the needs of publishers and towards the needs of software developers building web applications. The balance that Yuri advocated for, and in many ways embodied, and that would have seen the needs of publishers receive more attention, was lost with Yuri’s passing. I have likened this period in the history of XML to a retreat into the desert (from the perspective of publishers at least) although (following the story of St. Jerome) I have come to see the end result as a positive thing (and perhaps especially for publishers). The redirection of XML towards enabling software architectures and applications that are native to the web (taken as broadly as possible) has yielded many benefits and these benefits, such as engaging and socially active application environments for authoring and collaboration, now offer us the means with which to make fundamentally better contributions to how we facilitate the business of publishing.

    Now this brings us all the way back to Thad’s original question “Has XML Failed Publishing?”

    If I limit my view to the most visible parts of the commercial publishing industry, setting aside areas such as legal publishing who were among the first at the SGML buffet line let alone the XML martini bar, and if I take the reference to XML to stand for the industry of technology and service providers that have operated in that space (myself included, I hasten to add) I would be inclined to say “yes!” As an industry we have not been anywhere near as effective as we could have been in making XML simpler to adopt, adapt and apply for publishers. Now it is true that until Apple and Amazon started to shake things up the driving need for XML, and for multi-format / multi-audience publishing, was not as tangible as it is now. It is also true that we now have a great number of tools and techniques at our disposal so that we can now deliver on the promise of XML just as publishers come to see a burning need for it.

    So perhaps fortune is sometimes kind just as it is sometimes cruel. Now that the need to change has become somewhat more pressing for publishers, it is indeed fortunate that we now have several decades of experience working with markup languages and this experience, when coupled with the latest generation in socially-enabled web applications, can now be used to meet the needs of publishers in ways that will not feel alien or awkward. We are now ready to follow Yuri’s example and to make a reality of his vision.

  • Thad McIlroy

    Oct 31st, 2012 : 5:59 AM

    @Joe Gollner
    I remember Yuri Rubinsky well: he was a tireless advocate for all things markup. But I also remember rolling my eyes when watching Yuri’s presentations. I appreciated his humor and energy, but thought he was visiting publishing from another planet, the Planet Complexity. I didn’t believe publishing was ready for SGML. And while the SGML folks like to point to XML as a greatly simplified version of SGML, that’s like saying hernia surgery is a greatly simplified version of brain surgery. It’s not any easier for the students who failed ‘Anatomy 101’.

  • Mike McNamara

    Oct 31st, 2012 : 12:47 AM

    What a great post, accompanied by some very interesting comments. For me, the answer has to be ‘No’ it has not failed. I think it is just taking a very long time for many non-technical publishers such as book publishers to understand the business benefits that they can derive from its use. Far too conservative and risk-averse to change, many publishers have stayed far too long with the status quo simply because there was no need to change as their market did not demand it.

    However, the ever increasing need to produce eBooks (of whatever ‘format’, platform and size – serial, snippets etc.) for the ‘mass market’ will only increase the focus on publishers having more ‘agile’ content for all their publications. In my opinion only xML based content can form the foundation for a more flexible use of publishers’ content (the lower case x is intentional so as to include the latest MLs such as HTML5 for the purpose of this comment).

    Although many publishers are now benefiting from their use of xML, I think perhaps some of the blame for its overall slow take up can be laid at the door of software vendors for taking far too long to deliver easy to use ‘xML’ products into the hands of authors and editorial users. As already stated in a previous comment, Microsoft clearly missed the mark with its own offerings, but add-on & conversion tool software developers have helped to a point.

    Getting xML fully adopted earlier in the content creation chain is still a challenge, but it is happening more often today than in the past. With an increasing variety of platforms now available for content consumption, increasing the Findability and Discoverability of relevant content derived from a better use of Metadata also increases the pressure for more publishers to adopt xML. As I mentioned above, xML hasn’t failed, it’s just taken much longer than we all thought it would take to be better accepted in some publishing circles.

  • Thad McIlroy

    Oct 31st, 2012 : 5:47 AM

    Hi Mike: Thanks for your comments. My fear is that we’re “blaming the victim” — we see publishers as “too conservative and risk-averse to change” [which of course they are, but who isn’t?] and then it’s their fault for not getting on the XML bandwagon, rather than an inherent failing of the proposed solution.

    I think you’re correct in your implication that now’s the time. There’s never been a more compelling use case for XML. The next year should prove once and for all if this is the tool we need.

  • Sarah O'Keefe

    Nov 1st, 2012 : 6:28 AM

    Publishers are hanging on to what they know. There’s an entrenched resistance to change because change means that a lot of people are either going to lose their jobs or have to change their job responsibilities. Most normal people dislike change. The rest become consultants.

  • Thad McIlroy

    Nov 2nd, 2012 : 5:07 AM

    @Sarah O’Keefe
    But I think that an “entrenched resistance” is too quickly demonized. It’s in fact a coping skill and a survival skill. It’s up to management and its consultants to ease staff forward into a comfort zone for change.

  • Mike McNamara

    Nov 2nd, 2012 : 5:41 AM

    In my opinion, implementing an XML workflow does not automatically result in job losses, but it will lead to changes in work related practices. That really has to be seen as a step forward. If you can release staff from doing mundane and repetitive work and increase the overall efficiency of the workflow, that has to be a good thing.

  • Sarah O'Keefe

    Nov 2nd, 2012 : 5:49 AM

    The problem is that people who are experts at one workflow are threatened by a new workflow in which their expertise becomes irrelevant. We can certainly argue that general publishing skills are important, but most people see themselves as experts in XYZ Software, and when you take away XYZ Software and replace it with ABC Software, they get cranky. Also, some people like not having to think too hard; those of us who live and breathe this stuff think that mundane and repetitive is to be avoided, but not everyone thinks that way. Even I can see the occasional appeal of a line-by-line review of copy fitting and kerning.

  • Mike McNamara

    Nov 6th, 2012 : 7:29 AM

    I agree that change is something that many in the workplace fear. However, managed correctly & reasons for change explained clearly, no one need get left behind. I’ve seen that work successfully many times. However, there will always be those that choose not to move forward.

  • BigEater

    Dec 19th, 2012 : 7:23 PM

    I have been through a million new software and new system introductions. The problem has never been unwillingness by the actual rank and file. It has inevitably been that management gets days of hand holding and training from the salesmen usually including lavish lunches and evening entertainment, while the people who have to do the actual work get–at most–a half a day with a junior-level engineer who spends the whole time in a distracted, pissy mood because he or she feels that his or her time and talents are being wasted on trying to teach these non-engineering baboons.
    Then when management sees people confused and unproductive because of the bad training, they blame it on the workers, not their own cheapskatery for not spending enough on training.

  • Mike McNamara

    Nov 2nd, 2012 : 5:36 AM

    I think it is sometimes a combination of both resistance to make any change & badly designed solutions – I’ve experienced and seen both! All of us involved with XML publishing do need to be better at explaining the business benefits and putting together productive and easy to use solutions.

  • Scott Abel

    Oct 31st, 2012 : 7:11 AM

    Let me propose something a little different. It’s not my idea, but it will perhaps explain a little about the successes of XML.

    First, tools aren’t to blame. XML introduces new ways of authoring content (this involves authors having to actually think differently about writing) and different ways of organizing, storing, reassembling, reusing, repurposing content. This means organizations have to refashion their processes, rethink employee roles and responsibilities, etc. You can wish there were tools that magically made things as easy as the Staples “Easy” button, but that’s unrealistic.

    In the corporate publishing arena, XML publishing (and authoring tools and component content management systems) are professional tools used by professionals. They are not consumer products. They are precision content creation, management and delivery tools. Surgeons, medical professionals who are being challenged to adopt new, more complex surgical tools (think laser over scalpel), also wish there was an easy button that would train their hands to hold and use lasers the way they are accustomed to holding and using scalpels. But, as it turns out, that’s not realistic. Laser surgical devices are held by the same hands, guided by the same brains and the same eyes, and used to perform the same tasks using existing knowledge of anatomy. But, they have one distinct and very demanding difference over scalpels. They require the surgeon to grip them in a way that is different than the scalpel, which requires the surgeon to develop new muscles in order to hold the device and use it as designed. While surgeons may wish that the laser scalpels worked just like their old school cutting tool cousins, they do not. In order for surgeons to use these tools — and provide their patients with the huge benefits laser surgery promises (decreased size of incision, reduced risk of infection, quicker healing times) the surgeon must adapt to the new-and-improved method and tool set.

    The rewards aren’t limited to the patient. Insurance companies benefit, as do surgical facilities and surgeons. Laser surgery can differentiate a surgeon from the rest of the pack, making him or her more desirable and more efficient. But to benefit, they must change. They must grow. Systems and programs, standards and processes must also change. There is no “easy” button.

    Is XML really that different? Is it possible that we are just crossing over from the old way of doing things to the new one? Is it equally possible that blaming the tool vendors for not making the laser surgery experience identical to the scalpel experience (or the XML authoring experience identical to unstructured word processing and desktop publishing) is asking a bit too much?

    It would be helpful to see the adoption of XML over time to understand the growth of XML authoring. It would be equally interesting to see the growth and adoption patterns of XML authoring compared with those of other paradigm-shifting technologies.

    And, while we’re looking at statistics and charts, it would be equally interesting to see what we can learn from industries that “wished” things were different when change came a knockin’. A quick look back at recent history, at the recording industry in particular, would be a great lesson.

    My take is XML is not to blame. It is used to help organizations accomplish majorly important work, increase profits, decrease expenses and eliminate waste, and it provides many other benefits. It can be implemented well (or poorly). It can be leveraged for maximum benefit, or under-utilized. It can be made more challenging than it need be, often because the people implementing it don’t understand how to do it successfully.

    One more point. We have been taught over time to shop for tools. We have a problem and often think “What software can make this problem go away?” Not surprisingly, this is a bad approach. Tools don’t come first. Analysis and planning do. Change is also critical. It’s possible to empower thousands of people to create XML content easily in an MS-Word-like environment (like the Irish government has) without breaking the bank. And, it’s possible, when done right, to empower book publishers to do the same. But, looking for an “easy” button first involves doing the hard work required in the research and planning stages.


  • Scott Abel

    Oct 31st, 2012 : 7:15 AM

    Thad: I think you should definitely attend Intelligent Content 2013. Maybe you could be on a panel and discuss these issues live with some of the folks in the comments field below. Thoughts? Email me offline at scottabel@mac.com.

  • David Blyth

    Oct 31st, 2012 : 8:16 AM

    Sarah O’Keefe

    What industry are you talking about when you say ‘the industry will reach a tipping point’?

    In chip design, I think XML is already past the tipping point. When you’re talking about a few hundred thousand registers (and a few billion transistors), efficiency is the Name Of The Game. Adding a few extra thousand efficient registers is far more cost effective than adding 5 custom designed ones – unless you’re talking about Mars Rovers or the like and that’s just too limited of a market. Making boring registers by the bucket load is cheap!

    But chip design is at the base of lots of other industries, starting with cell phones and tablets and working its way up. Maybe publishers haven’t seen XML that much because they (and lots of other industries) are near the top of the chain.

    But I think it’s coming straight at you. From where I sit, XML is a steam locomotive coming down the pike – to mix my analogies quite badly – and it’s smashing everything in its path. My job is planning out routes and laying down the rails.

    Tho I don’t mind throwing a shovel full of coal in once in a while…;)

  • Scott Abel

    Oct 31st, 2012 : 8:43 AM

    I think Sarah is with you on this one. As it turns out, we’re very aware of the chip industry, and telecom, and internet communications sectors and have worked on projects that utilize XML to deliver complex customizable product information, documentation and training to multiple customer types, using multiple configurations of products, in multiple languages, etc. XML is at the heart of all of these publishing initiatives.

  • David Blyth

    Nov 1st, 2012 : 8:07 AM


    I’m quite interested in this topic, but am too used to XML-speak. So can you clarify what you mean, especially when you say “XML is at the heart of all these publishing initiatives”? Do you mean “hard copy publishing”, “Web publishing”, “publishing” in some other sense, or “rendering” in the XML sense? It sounds like you mean “output publishing to end users” while I’m more concerned about “rendering to anyone”. But… I’m not positive.

    Telecom and chip design deal with multiple languages (VHDL, SPICE…) depending on the stage you’re at. You have to render to those languages as you move along anyway, so rendering to HTML, PDF, MS Word and so on is exactly the same process as rendering to anything else.

    Thus, there’s no particular need for engineers to cooperate with technical writing (TW) projects which render docs or training material at the end stage. In fact, doing so is inefficient because it’s much easier to just adapt the working solutions you already have for non-TW languages in the middle.
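
To make the "rendering is rendering" point concrete, here is a minimal Python sketch, not taken from any commenter's actual toolchain. It assumes a hypothetical `<registers>` XML vocabulary and renders the same source once as an HTML table for documentation and once as VHDL-style address constants. The element names, attribute names and the 8-bit address width are all illustrative assumptions.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical single-source description of two registers.
REGISTERS_XML = """
<registers>
  <register name="CTRL" address="0x10">Control register</register>
  <register name="STATUS" address="0x14">Status register</register>
</registers>
"""


def load_registers(xml_text):
    """Parse the source into (name, address, description) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (reg.get("name"), int(reg.get("address"), 16), reg.text.strip())
        for reg in root.iter("register")
    ]


def render_html(registers):
    """Render the registers as an HTML table for documentation."""
    rows = "".join(
        f"<tr><td>{html.escape(name)}</td><td>0x{addr:02X}</td>"
        f"<td>{html.escape(desc)}</td></tr>"
        for name, addr, desc in registers
    )
    return f"<table>{rows}</table>"


def render_vhdl(registers):
    """Render the same registers as VHDL-style constants (8-bit addresses assumed)."""
    return "\n".join(
        f'constant {name}_ADDR : std_logic_vector(7 downto 0) := x"{addr:02X}";  -- {desc}'
        for name, addr, desc in registers
    )


registers = load_registers(REGISTERS_XML)
html_out = render_html(registers)
vhdl_out = render_vhdl(registers)
print(html_out)
print(vhdl_out)
```

Both renderers consume the same parsed tuples, so adding a third target would mean adding one more function rather than a separate documentation pipeline.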

  • David Blyth

    Nov 1st, 2012 : 8:11 AM

    Side-thought. Yes, many engineers are rude. But there are also many _technological_ reasons why they don’t cooperate – it’s counter-productive to the product they’re making. It’s up to TWs to learn and exploit those reasons. Engineers already know what they are.

  • Sarah O'Keefe

    Oct 31st, 2012 : 12:28 PM

    I suppose the publishing industry, although there’s a lot of publishing happening in niches of other industries.

  • Scott Abel

    Oct 31st, 2012 : 8:41 AM

    Sarah O’Keefe: Me, too –> re: “I’d be quite interested in any examples where quality at a higher cost actually won over speed and efficiency in the larger market.” Then, I’d like to have a conversation with the investors in those companies and find out what they think.

  • Melissa Serdinsky

    Nov 1st, 2012 : 10:13 AM

    Thad has encouraged me to engage in this medium. I have tried to edit this down a bit. Here it is:

    I’m not the posting type, maybe I should be since I tend to scratch my head on why issues such as XML in publishing or a true standard ONIX feed continue to be a Holy Grail quest that walks a fine line between Monty Python and religious fervor. Your other posters are a bit intimidating and I can’t tell how many of them actually work in a publishing house and in what division if they do. It seems obvious to me they have the luxury of thinking about one thing – XML or content creation which is a luxury I do not have.

    The promise of XML as a savior to publishing was a communication gap between new style code jockeys and work-a-day publishing people. We heard XML is very structured and can be mapped to enable multiple outputs from one content source. What we didn’t hear or understand is that XML is very fluid, adaptable, without bounds or definition. Code jockeys did not hear or understand that books (aka content) are inherently visual and all those style elements that they want us to drop serve a purpose in print and, yes, in e.

    The very nature of XML (structure without rules) is difficult to navigate in non-fiction publishing where every subject is unique with unique elements. Footnotes, charts, tables, lists, instructions, asides, references, etc, etc are present for a reason and it is not just for print. Non-fiction can be dry so a reader needs a break visually in a case study to reflect on the subject. Recipes have a known structure and people like pictures of the food they are cooking. Footnotes prove the author’s point by referencing documented “evidence.” And on and on. I continue to run up against XML experts or coders or mappers who tell us we have to drop elements in our print books to get an electronic output and if we don’t we are living in the dark ages. Not an acceptable answer. You still use your eyes to read ebooks, right?

    In my role at a publishing house and as a digital distribution provider, I have had the luxury of having just enough rope to hang myself. To date, I have not kicked the stool out from under me, but I came close by trying to work with 2 different “XML first solution” providers who shall remain nameless. I thought we could partner and develop a system over several years that could be used by not only my publishing group of 10 imprints but the 380 publishers we represent in some form (physical or electronic distribution). We wanted to have a knowledge and technology share with dividends for all. My “partners” wanted me to foot their development bill and basically told me to publish fiction.

    And where are we housing, accessing, pushing, indexing all of this XML? I’ve got a DAMS but do I need an XML server with a custom interface and limitless connectivity to push it everywhere on the globe? You know we are in publishing, right? Not our core competency and if that needs to change, I’m ok with that but what is that new skill set? I’ll spare you my treatise on how Jackie O ruined publishing salaries for most of us, but we are not going to be paying developers’ salaries anytime soon. In our offices, we open up quickshops to educate people on what is out there, how to create basic elements and see who takes to it. Roundtables, open forums, internal Facebook…doing all we can to find the people who are interested enough to join the geek patrol and try to figure out how to handle some basic publishing issues. I’ve gone outside the industry to see what is going on – lots of talk, not much doing but we are learning how to “start the conversation” I think. I hope.

    And will XML get me into all those proprietary formats that are proliferating by the hour? And keep me up to speed when the code changes on the next chest thumping upgrade in the tablet war? This is why I say your posters seem to have a luxury of thinking in terms of a very defined space of XML. Publishers have to think about getting the content, massaging it, outputting it into one universal and a minimum of 2 proprietary formats + 1 additional fixed layout and the other A is working on their proprietary tool aka format. And metadata…if I can create the book to your proprietary format, why the hell can’t you take basic metadata in the industry acceptable format?

    Help me Obi Wan

  • Thad McIlroy

    Nov 2nd, 2012 : 5:05 AM

    @Melissa Serdinsky
    You make several important points.

    First is the communication gap between those who promised XML as “a savior to publishing” and the “work-a-day publishing people” who heard that promise. That rift remains a huge barrier moving forward. XML was derived from SGML, a language used in highly complex documentation. As such it was complexity handed down from on high, not a system built from the ground up around the real needs of book publishers.

    You point also to a technical employment gap: the staff required by publishers to make XML work cost far more than publishers can afford to pay. So even when publishers approach XML with open arms they find themselves poorly staffed to implement it.

    And finally you point to what I think is a key issue in the debate. Let’s just say that as a publisher you manage to build styled agility into your content. That, in itself, is only half the task: you must then output said content to PDF, EPUB 2.1, EPUB 3, Mobi, KF8, browsers and apps. Thus XML, for all its complexity, is not currently even a complete solution.


  • bowerbird

    Nov 1st, 2012 : 11:24 AM

    thad said:

    > XML’s proponents, a diverse group of
    > wise and practical men and woman,

    wait. you finally got a woman on-board?
    seriously? great! when did that happen?


    just kidding, of course.

    if i was gonna take issue with anything,
    it would be the “wise and practical” part.

    because only unwise and impractical
    proponents fail to consider the _costs_
    of implementing something and only
    focus on the “benefits” of doing it…


    oh, and speaking of those “benefits”,
    perhaps the silliest statement in an
    otherwise rather unflinching analysis
    — congratulations on that, thad —
    was where you asked the “question”
    (which i take it was supposed to be
    rhetorical in nature) about how people
    could publish to multiple platforms
    _without_ using x.m.l.? simple, dude.
    really, far simpler than _using_ x.m.l.

    but, you know, good luck selling that
    flawed x.m.l. product to the masses…

    renaming it is a good idea. that will
    probably help you fool a few people.


  • Thad McIlroy

    Nov 2nd, 2012 : 4:53 AM

    XML is tremendously over-specified for the average book project. I dream that we can return to the “extensible” essence of the Extensible Markup Language, by creating a far simpler base set of codes and practices that can be extended in logical steps to deal with more complex projects.

    Technically it’s a no-brainer. But after 15 years the existing XML “power structure” is (naturally enough) well in place. Can this group really be motivated toward real change? More likely a new group would form, respectful of tradition but committed to practical changes.

  • Michael Boses

    Nov 1st, 2012 : 2:14 PM

    Thanks for starting this great discussion. Here is my view of XML publishing today from having helped a fairly large number of authors create XML in projects such as the Irish Government that Scott mentioned.

    The most important question to ask is, “what are the tags there for?” Authors are increasingly amenable to some level of effort to provide semantics and portability, but most projects do not stop at that level of tags. Often the bulk of the tags are an attempt to simplify or eliminate downstream processes. Authors have never liked doing prepress work and XML does not change that. Here is a simple experiment for any publisher who has experienced a failed XML project:

    – Take one chapter of content and duplicate it in Word and on paper.

    – Have an author highlight the printed copy with the same level of information that was implemented in XML.

    – Do the same with the Word copy using any method you like; colored text, highlights, comments, etc.

    – If these activities seem untenable for your authors then the problem was not XML.

    We do need to reduce the cost of downstream publishing processes, but having authors manually apply XML is not a scalable way to do it. Semantics are a different story; the author is probably the best resource there. Having authors provide enough structure to support portability is usually not onerous. Even with profiling for adaptive content this is still not a very large set of tags–in fact, I would suggest it comes close to your ABCML, Thad.

    There are other technologies that can be used to analyze content and prepare it for publishing by adding more tags. So if this is true, why isn’t everyone already using these technologies? From my experience the answer is that XML is most often an “all or nothing” proposition. How often has someone said: “Our authors could not handle all of the markup so we just let them go back to Word,” and then, “Oh, we tried automating markup and the technology just isn’t there. Now we outsource manual conversion.”

    This reflects a tendency for organizations to see only two options: either put all the responsibility on the authors; or expect technology to be able to make sense of completely unstructured content (save perhaps some Word styles). A better approach is to expect far less of authors, and then use what they provide to reduce the technical challenge of automating the remaining tagging of the content.
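
As a rough illustration of that middle path, the following Python sketch assumes authors supply only a coarse style label per paragraph (`title`, `body`), and a rule-based pass then adds the finer inline tags. The style names, the `figref` and `cite` element names, and the two regular expressions are hypothetical; a real project would need far more robust rules.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical author input: one coarse style label per paragraph, nothing more.
AUTHOR_PARAGRAPHS = [
    ("title", "Knife Skills"),
    ("body", "Keep the blade sharp. See Figure 2 for the grip."),
    ("body", "Practice daily (Smith 2010)."),
]

# Illustrative rules for the automated pass: figure references and
# simple author-year citations.
INLINE_RULES = re.compile(
    r"(?P<figref>Figure \d+)|(?P<cite>\([A-Z][a-z]+ \d{4}\))"
)


def tag_inline(parent, text):
    """Add text to parent, wrapping each rule match in its own child element."""
    pos = 0
    last = None  # most recently created child, which owns the following text
    for match in INLINE_RULES.finditer(text):
        chunk = text[pos:match.start()]
        if last is None:
            parent.text = (parent.text or "") + chunk
        else:
            last.tail = (last.tail or "") + chunk
        last = ET.SubElement(parent, match.lastgroup)
        last.text = match.group()
        pos = match.end()
    rest = text[pos:]
    if last is None:
        parent.text = (parent.text or "") + rest
    else:
        last.tail = (last.tail or "") + rest


def build_chapter(paragraphs):
    """Turn author-labelled paragraphs into a chapter with automated inline tags."""
    chapter = ET.Element("chapter")
    for style, text in paragraphs:
        tag_inline(ET.SubElement(chapter, style), text)
    return chapter


chapter_xml = ET.tostring(build_chapter(AUTHOR_PARAGRAPHS), encoding="unicode")
print(chapter_xml)
```

The author's contribution here is two labels; everything inside the paragraphs is tagged by rules that can be tightened over time without sending anyone back to retag content.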

    I agree with many who feel that an effective approach to XML content is needed now more than ever. To get there for all types of publishers, we will have to do some things differently than they were done in the past.

  • Thad McIlroy

    Nov 2nd, 2012 : 4:48 AM

    @ Michael Boses
    I like your “simple experiment for any publisher who has experienced a failed XML project.” Great idea. But by the same measure I question “Semantics are a different story; the author is probably the best resource there. Having authors provide enough structure to support portability is usually not onerous.” Certainly a SME — subject matter expert — is a necessity for semantic tagging, and the author is the best expert. I still haven’t heard of a good toolset to encourage the author to get onboard.

    I like also your implication that there could be a “middle ground”. That is what I also now believe.

  • Pubfluence.Info

    Nov 2nd, 2012 : 6:22 AM

    “XML, like any automated workflow, results in products that are in some ways inferior to the custom-crafted alternative.”

    Sarah, sorry to say, but I’ve yet to meet a publisher that would accept this statement as a reason to move to an XML workflow whatever the increase in operational efficiency. That said, I accept that compromises will be made. Just look at the state of the quality of many of the eBooks available today!

  • Thad McIlroy

    Nov 3rd, 2012 : 6:01 PM

    “I’ve yet to meet a publisher that would accept [an inferior product] as a [byproduct of a] reason to move to an XML workflow…”

    This of course comes up all the time. Everything is a tradeoff: we all want the lowest price, fastest turnaround and best quality. In the end we settle for two out of three, or some fractional measures of all three.

    What you’re pointing to is that XML is at its very strongest when there’s no perceived loss of quality.

    Automated workflows, of which XML is the #1 flavor, are the cheapest per page once you get past the set-up/start-up costs. And nothing can touch automated workflows for turnaround.

    The quality is also excellent, but the design is inflexible. You lose the cost and turnaround benefit if you step outside the automated template. And lots of book design, whether it needs to be or not, is still one-off.

    So traditional publishing staff are asked to sacrifice their craft while undertaking to learn an extremely techy publishing system in order to maybe save some time and money. No wonder it never made it to Broadway!

  • BigEater

    Dec 19th, 2012 : 7:03 PM

    I admire you guys. If you can keep this XML thing going it will definitely make knowledge more accessible and help the arts and sciences to move forward more rapidly, so I’m happy to share my thoughts. I hope they’re a little bit useful.

    My background: I know absolutely nothing about XML except that from my days working at a giant publishing company I fear XML more than I fear the fiery torments of Hell.

    However, I have been a writer or editor in newspapers and magazines for 30 years and have used everything from hand-set lead type, to a Typositor, to a proprietary mainframe system, to ATEX, and am just finishing my first e-book in InDesign 6 output to .mobi via the Kindle plug-in. I also just finished my first semester of XHTML/CSS at a local college.

    Here’s my opinion.

    Forget about XML as a standalone gig that’s going to make you rich, famous, and admired by your spouse. Ke$ha is never going to beg you to come over and help her tag her lyrics. Take your ego out of that equation because it’s your ego that’s holding you back.

    Writers, photographers, editors, designers, stylists, art directors, these people have room for only one computer gestalt at a time. So if I’m deep in the Zen of Macintosh/InDesign/Photoshop/DreamWeaver I’m just not able to start grooving to your PC-based XML beat, no matter how hard you pound it on my head. Give up.

    The good news, based on my recent e-publishing venture is that InDesign is now a very HTML/CSS-like tool and I think it would be useful to explore how to make your XML work invisibly in the background behind InDesign/Photoshop/DW.

    Here’s why: the pre-planning, text tagging, and metadata entry required for a decent-looking Kindle book (mine has recipes and 22 full-size photos) might translate very elegantly to your XML tags. I have one set of tags for recipe titles, one for ingredients, another for method. I have the photos loaded down with tons of metadata, and even my InDesign template is packed with metadata. I also have tags for introductory text, author names, all that.
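
A small Python sketch of how such layout tags might map onto XML follows. It assumes the styled paragraphs have already been exported from the layout tool as (style name, text) pairs. The style names, the mapping table and the `<recipe>` vocabulary are hypothetical, and nothing here uses an actual InDesign or Kindle API.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from layout style names to (container, element) pairs.
# A container of None means the element sits directly under <recipe>.
STYLE_TO_ELEMENT = {
    "Recipe Title": (None, "title"),
    "Ingredient": ("ingredients", "ingredient"),
    "Method": ("method", "step"),
}

# Hypothetical export: one (style name, text) pair per styled paragraph.
STYLED_PARAGRAPHS = [
    ("Recipe Title", "Tomato Soup"),
    ("Ingredient", "6 ripe tomatoes"),
    ("Ingredient", "1 onion"),
    ("Method", "Simmer for 20 minutes."),
]


def recipe_to_xml(paragraphs):
    """Build a <recipe> element, grouping paragraphs under shared containers."""
    recipe = ET.Element("recipe")
    containers = {}  # container name -> element, created on first use
    for style, text in paragraphs:
        container_name, element_name = STYLE_TO_ELEMENT[style]
        if container_name is None:
            parent = recipe
        else:
            parent = containers.get(container_name)
            if parent is None:
                parent = ET.SubElement(recipe, container_name)
                containers[container_name] = parent
        ET.SubElement(parent, element_name).text = text
    return recipe


recipe_xml = ET.tostring(recipe_to_xml(STYLED_PARAGRAPHS), encoding="unicode")
print(recipe_xml)
```

The person doing the layout only ever applies the styles they already use; the XML is produced behind the scenes from that one mapping table.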

    So what you should be doing is figuring out how to work WITH me and create an application that uses the same kind of logic that I have already internalized from Adobe products. Same tools, same look and feel, same everything. Of course, no one wants to pay royalties to Adobe and maybe Adobe already told you to take a hike, but I think it’s the only way you people are ever going to make this happen on the scale that this deserves.