September 26, 2003
Continuous Improvement in Content
The obvious advantage of single-source publishing is the ability to "write once and publish many times." This begins to show benefits even if your goal is simply publishing to many formats, such as print and Web. The ROI becomes greater if you are also doing things like localization and translation of the single-source content. Larger organizations are beginning to reap great benefit from atomizing content in such a way that translation, as an example, can be better managed through a change control process. Write it once, translate it once, and use it many times.
But the real bang for the buck in content management will likely come when organizations reach a point where they are in a mode of continuous improvement on managed content. I have seen this a few times in my career, and the results were impressive.
I worked for a specialized software development and information publishing company in the early 1990s. We maintained large databases of dictionary and thesaurus data in English and a number of other languages. Because the data was highly structured (in SGML and in some relational tables), we could derive many forms of the data, including subsets of individual databases and supersets of multiple databases. As a result, we were able to help develop whole new product lines for the company, very efficiently.
This didn't happen all at once, of course. In many cases, the content had to first be digitized, then structured. Little or no digitization and structuring happens without manual cleanup, and because we were doing specialized work in multiple languages, the cleanup could be expensive and time-consuming. But once the content was structured, ongoing enhancement became efficient. If we wanted to add a field, or amplify an existing field, we could easily extract the existing content and set up an editorial tool for staff or outside contributors to use. The edits were made in a structured form as well, so re-import to the database was automatic.
Over time, we developed comprehensive, structured, and editorially enhanced databases that drove significant new product development for the company. We were also able to do scores of ad hoc, quick turn-around projects for little or no additional cost. We could easily extract or subset a database, for example, with a single query or script. We also had an excellent set of metrics for estimating ongoing and future work. Moreover, most of the internal tools we developed were reusable, given that the data was so consistent.
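To give a sense of what "a single query or script" can mean once data is consistently structured, here is a minimal sketch in Python. The sample data and the element and attribute names (entry, lang, headword, pos, synonym) are invented for illustration; they are not the actual schema we used, and the original work was done in SGML and relational tables rather than with this library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; these element and attribute names are
# assumptions made for this example, not the schema described above.
SOURCE = """
<dictionary>
  <entry lang="en"><headword>brisk</headword><pos>adj</pos><synonym>lively</synonym></entry>
  <entry lang="fr"><headword>vif</headword><pos>adj</pos><synonym>alerte</synonym></entry>
  <entry lang="en"><headword>stroll</headword><pos>verb</pos><synonym>amble</synonym></entry>
</dictionary>
"""

def subset(root, lang, pos=None):
    """Return a new <dictionary> holding only entries matching lang (and pos, if given)."""
    result = ET.Element("dictionary")
    for entry in root.findall(f"entry[@lang='{lang}']"):
        if pos is None or entry.findtext("pos") == pos:
            result.append(entry)
    return result

root = ET.fromstring(SOURCE)

# Derive a subset product: English adjectives only.
english_adjectives = subset(root, "en", pos="adj")
headwords = [e.findtext("headword") for e in english_adjectives]
print(headwords)
```

Because every entry follows the same structure, the same few lines can be pointed at a different language or part of speech without any per-project cleanup, which is roughly why the ad hoc projects were so cheap.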
This is a model that both commercial publishers and organizations that use content in product support should consider. Long-living content benefits from continuous improvement, and such continuous improvement works best with structured content.
Posted by Bill Trippe at 1:12 PM
September 25, 2003
Why Content Management Projects Succeed or Fail
Someone raised an excellent new topic on cms-list, seeking input on why CMS projects sometimes fail. This has already sparked some lively discussion. If you have a technical take on CMS efforts, cms-list is an especially useful list.
I have seen both successes and failures in content management projects. While I have not attempted to formally catalog the reasons, certain things seem to be consistent, at least on the "successful" side. I have noticed at least three major things in common with successful projects.
- Successful projects have champions. Someone has already mentioned this on cms-list, correctly noting that large CMS projects require support from a key driver in upper management. I would also argue that CMS projects need champions at the working level—beginning with the architects and including both editorial champions and graphic design champions. The editorial champions are those people who know the content, how the audience uses it, and how to continually improve it. The more input these people have in the project, the better the outcome.
- Successful projects have metrics. I believe in calculating ROI. My most successful clients have well developed metrics for how long things take to accomplish and how much they cost. They know how much it costs to develop content, and they know how much it costs to add functionality to their content management systems. They then make decisions based on hard data, with an eye on return on investment and profitability.
- Successful projects begin with a lot of skepticism. I had an engagement recently where I presented an analysis of three competing CMS proposals to my client. Each was from a reputable vendor, and had been prepared based on extensive input from the client and from me. We worked very hard to develop an apples-to-apples comparison for client management. The proposals all fell within a range of 15% of each other, including training, professional services, and maintenance over the first 18 months. When I was done with my presentation, one of the first questions from one of the corporate officers was, "So they all think this can be done for about $600,000?" I answered yes. So he turned to his colleagues and said, "Well we need to budget about $2 million then." This was followed by a hearty, knowing laugh. They had all been here before.
I realized the three elements listed here also have something in common: they are all related to a high level of engagement, communication, and consistency in management style. That same group of champions knows and works with the organization's metrics and is committed to ROI from major initiatives. That same group of champions is skeptical about what vendors (and consultants!) tell them and takes ownership themselves. I have been struck by how successful projects have open and ongoing communication. Skepticism is a healthy thing.
Posted by Bill Trippe at 2:23 PM | Comments (3)
September 23, 2003
Component Management of XML
Back in the day, serious content management meant component-level management of XML elements. This was a carryover from early SGML content management approaches, before the term "content management" had legs. Of course, early "content management" applications were really Web delivery engines (early versions of Vignette come to mind), and some systems now thought of as "content management" were actually document management systems (the aptly named Documentum).
At the time, there were a few technologies that handled component management of SGML data. These included Xyvision's PDM (once an acronym for Parlance Document Manager, and now called Content@), Interleaf's RDM, Chrystal Software's Astoria, and a product from a company called Texcel. (I forget the entire lineage, but somewhere along the way, Interleaf acquired the Texcel technology, and either incorporated it in RDM or began using it as part of a later offering called BladeRunner. Interleaf was then acquired by Broadvision, and the BladeRunner technology—I believe—is now part of what Broadvision calls "One-to-One Publishing.")
I mention this because the need for component-level management of (now) XML is still there, and these technologies still do the job. They are not the only technologies that do the job, anymore. However, it would be interesting to come up with a matrix of detailed requirements for single-source XML-based publishing and see how these technologies fare against other, newer ones. Do they still maintain a technical advantage?
Bill Trippe
btrippe@nmpub.com
Posted by Bill Trippe at 12:33 PM
September 18, 2003
278 E-mails, and Nothing's On
Is e-mail dead? I tend to not think too much of e-mail in the context of content management, but I do have clients who are interested in at least managing e-mail archives. But, in truth, is it really worth saving?
As of 2:00 today, I had received 130 e-mails in my primary (business) e-mail account. Of these, 74 were successfully identified as spam and filtered into a separate Microsoft Outlook folder. Another eight were infected with a virus, and were trapped by my anti-virus software and deleted. Another dozen were bounces from AOL e-mail accounts that had been spammed by someone else using my e-mail address. This leaves 36 legitimate business e-mails. I have a second business account that has excellent server-side spam filtering software. Every three days it sends me an e-mail showing me the (typically) scores of spam messages it has trapped and quarantined. In the same period of time, I may have received 20 legitimate e-mails.
I'm tempted to ask, "What the hell is wrong with this picture?" but a more accurate question might be, "Does this work at all?"
I manage, through significant effort and expense on my part, to make e-mail work for me and a small number of e-mail addresses that I manage for others. But the outcome of all of this effort is, at best, a handicapped medium. Before e-mail can be a useful part of enterprise content, it must first be, by and large, useful content. Right now it isn't.
Bill Trippe
btrippe@nmpub.com
Posted by Bill Trippe at 3:59 PM | Comments (1)
September 17, 2003
Whither SVG?
SVG (Scalable Vector Graphics) is, to my mind, an obviously great thing—an industry standard, XML-based language for rendering vector graphics, animation, and user interfaces. Yet it continues to languish. Despite a lot of push from the W3C and a cadre of interested vendors (led by Adobe but absent, notably, Microsoft to date), SVG still does not have a great deal of traction.
The surest sign that SVG is lagging is that there are still more books about SVG than related job postings on monster.com. This is an unscientific measure to be sure, but a telling one. I test this measure on several technologies semi-regularly. As of today, amazon.com was selling nine books on SVG (plus an instructional CD and an eBook), while monster.com listed seven jobs that mentioned SVG (and none of them all that prominently). As a co-author of one of these nine books, I have to hope that the seven people who get these SVG jobs plan to buy a lot of books.
Apart from my selfish interest, though, I would like to see SVG gain more ground. Is it simply a matter of Microsoft weighing in with deeper support for SVG? Is Flash and its family of products simply too entrenched? I think these factors play a part, but perhaps more significant is a continued lack of focus on an improved experience for the end user. SVG is one of many technologies that could improve the end user experience. When will client development benefit from more attention to the graphical user interface?
Posted by Bill Trippe at 9:32 AM | Comments (1)
September 16, 2003
How Much Content is Structured?
The conventional wisdom has recently been that 20% of content is structured, and the rest is unstructured. This factoid may be loosely borrowed from an older number, which is that 20% of an organization's data assets are in some kind of structured form (typically relational database tables). I wonder if either number really holds up.
I think it is the rare organization that has a great deal of "document" content in a structure such as XML (or SGML for older materials). In my experience, most large collections of XML-tagged content represent one silo of an organization's data (the technical manuals in a manufacturing company, or the catalog data in, say, an electrical supply company). Some legacy content may never end up in a structured form, but the toolsets are almost there to allow us to ask the question, "Should all new content, born digitally, be structured?"
I'm not sure I am ready to answer the question. Are you?
Bill Trippe
btrippe@nmpub.com
Posted by Bill Trippe at 11:01 PM | Comments (2)
September 15, 2003
More XML-Based Publishing?
Seybold was a very busy week for me, so I didn't get a chance to step back and really think about what trends seemed to be represented there. However, it does seem like there is more XML-based publishing going on. And this includes publishing to print, through desktop engines such as QuarkXPress and Adobe InDesign.
The conversations I had on the show floor seemed to indicate this expanded emphasis on XML comes largely from the requirement for simultaneous output to print, the Web, and other electronic formats. Nothing new there, of course, but the reality seems to be setting in that multiple output publishing is here to stay. As I have said elsewhere, conventional wisdom says everyone's "second business is publishing"; now everyone's second business is multiple output publishing. So, if that is the case, and XML does the job, it follows to use XML, doesn't it? Not always, of course, but apparently in more and more cases.
I hope that this new emphasis doesn't lead people down a path of complex, nearly impossible implementations of XML. Most documents can be supported by very simple XML Document Type Definitions (DTDs) or schemas, some of which are already in the public domain. (Although some of the public domain ones are also over-engineered and difficult to implement, too, so be careful there as well.) Keep the initial implementation very simple, starting with a pilot and going from there.
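As a rough illustration of how little structure a pilot needs, here is a minimal sketch in Python: one very simple XML document rendered to two outputs, HTML and plain text. The three element names (article, title, para) and the two rendering functions are assumptions made for this example; they are not taken from any particular public DTD or schema, and a real print workflow would go through a composition engine rather than a script like this.

```python
import xml.etree.ElementTree as ET
from html import escape

# A deliberately tiny document model: an <article> with one <title>
# and any number of <para> elements. These names are hypothetical.
ARTICLE = """
<article>
  <title>Storing your XML</title>
  <para>Many organizations now work with XML data.</para>
  <para>Where should it be stored?</para>
</article>
"""

def to_html(doc):
    """Render the article as an HTML fragment."""
    parts = [f"<h1>{escape(doc.findtext('title'))}</h1>"]
    parts += [f"<p>{escape(p.text or '')}</p>" for p in doc.findall("para")]
    return "\n".join(parts)

def to_text(doc):
    """Render the same article as plain text with an underlined title."""
    title = doc.findtext("title")
    lines = [title, "=" * len(title), ""]
    lines += [p.text or "" for p in doc.findall("para")]
    return "\n".join(lines)

doc = ET.fromstring(ARTICLE)
html_output = to_html(doc)
text_output = to_text(doc)
print(html_output)
print(text_output)
```

The point is only that the source is written once and each output is a separate, small transformation. Adding a third output means adding a third function, not touching the content.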
Bill Trippe
btrippe@nmpub.com
Posted by Bill Trippe at 10:59 PM
September 11, 2003
CMS, DAM, and Hosted Applications
I have spent some time at the show with Crownpeak, Atomz, and eMotion. The first two are Application Service Providers (ASPs) for content management, whereas eMotion is an ASP for digital asset management. When I start to think about the ASP option, I sometimes find myself thinking that everyone should simply do it; the argument seems so compelling.
Of course, it doesn't make sense for everyone to use a hosted application for content management. At certain ends of the market, it might not make financial sense (if you are either very small or very large, or if your content, workflow, and/or business requirements are inordinately (and necessarily) complex). But the track record for CMS implementations is still relatively spotty, so why not pay someone else to perform a set of tasks and functions for you at an agreed price? Sometimes it makes too much sense.
Posted by Bill Trippe at 7:06 PM | Comments (2)
Seybold, Day 3
Well, I knew my first couple of days at Seybold would be a blur, but it has been even more hurried than I thought it would be. The Gilbane Conference sessions on Tuesday went very well. A highlight for me was Dana Hallman's presentation on the US GSA's process of choosing a CMS for www.firstgov.gov. She did a great job of explaining the process GSA went through to evaluate different systems and companies. The upshot? For a large system evaluation, the evaluation itself has to be run as a complex project, and the major stakeholders need to have involvement and visibility at key stages. Given it was a government acquisition, the evaluation process needed to be open and fair. The result was a multimillion dollar award to Vignette—and no challenges or protests from the other bidders. The GSA clearly did a great job at this stage of the project.
The XML Web Services Intensive was going swimmingly until a bomb scare evacuated all three buildings in the Moscone Center. The result was that we had to cancel the last two sessions of the day, which was a disappointment. I am going to explore options for doing the last two sessions via the Web. If you were attending and are interested in a Webinar follow-up, please email me at btrippe@nmpub.com.
Today I am off to the show floor, which I have only seen for a few moments. The bookstore was certainly busy when I stopped by, which I always see as a sign of health.
Posted by Bill Trippe at 1:41 PM
September 9, 2003
Is DRM Emerging Again?
Seybold has more about Digital Rights Management than I thought it might, given the relative softness in the market. Bill Rosenblatt did an all-day intensive yesterday, and has an excellent keynote session scheduled for tomorrow, The Great Digital Copyright Debate.
Bill has lined up excellent speakers for this:
--Joe Kraus, Co-founder, Digitalconsumer.org
--Tim O'Reilly, Founder & President, O'Reilly & Associates
--Dean Marks, Senior Vice President, Intellectual Property, Corporate Business Development and Strategy, Warner Bros. Entertainment Inc.
Bill himself is a great moderator and speaker, and this is a stellar lineup. It should be a great session.
It's interesting, though most likely a coincidence, that Microsoft last week announced availability of their Windows Rights Management client. A number of people see Microsoft's Rights Management Services as a springboard for a lot of development. Perhaps Microsoft's client application will get some traction, and fuel more development of DRM solutions.
Posted by Bill Trippe at 12:43 PM
September 8, 2003
Seybold San Francisco 2003
Seybold San Francisco has always been my favorite trade show. You can't beat the locale, of course, but I have also found it to always be upbeat, informative, and a great indicator of the current marketplace. I will be reporting from here over the next several days. Wednesday will likely be a quiet day for reporting, as I am moderating the day-long XML-Web Services event (discussed elsewhere in the blog). I have a couple of current projects that will drive some of my research; in particular I have been thinking about Web Services style integration, and a couple of familiar (for me) but somewhat older topics--technical documentation in XML and ebooks (of all things!).
Posted by Bill Trippe at 8:51 PM
September 3, 2003
Storing your XML
Many organizations are now working with XML data in one or more applications. As the use of XML grows, an important question arises--where should XML data be stored?
I originally wrote about this in February of last year for Transform magazine. It's interesting that the primary argument still holds up more than a year and a half later; some of the vendors have changed of course.
If you take even a cursory glance at the XML storage market, you will see many vendors vying for your attention and your dollars. These include major database companies like Oracle and Microsoft, and a long list of companies with specialized XML repositories. The various products reflect completely different approaches to data storage and management, and understanding them in detail will require your technical staff to dig into some subtle and complex technical questions.
In fact, XML has reopened some fundamental questions of data storage that some people, certain vendors especially, had felt were already answered. For many years, object-oriented database vendors argued that their systems were superior for storing document data, but they never gained much marketshare or mindshare against the giant relational database vendors like Oracle. Customers seemed to accept the argument that relational databases would do the job well enough, and IT organizations wanted to shorten, not lengthen, the list of technologies in use and under maintenance.
Have things changed to the point where you should consider a specialized database for XML storage? The answer, of course, is that it depends. It depends on three things mainly--how much persistent XML data you have, whether you need high-performance, real-time access to it, and what kinds of querying you think you might want to do with the data. Let's take a quick look at each of these.
Persistent XML. Applications use XML in one of two ways--as a source format, or as something that is created and used for exchanging data between two data sources, usually temporarily. But you may well have source XML data that needs updating and maintenance. The more XML data you have, and the more user interaction and updating it requires, the more you may need a specialized tool for storing it.
Real-time access. If the data is stored in a format besides XML, and you need instantaneous access to it in XML format, then access could be slowed by requiring the application to go through transformation processes. A database that is optimized for XML storage will provide better performance, so if performance is key, you should consider an XML repository.
Querying. Relational databases support SQL for querying relational data, of course, but XML data cannot be queried with SQL. SQL is designed for the row-and-column orientation of relational data, and both end users and developers are comfortable with its approach. XML data, with its mix of elements, attributes, and textual data, requires a different approach. Instead, XML data is queried with tools based on XPath. While relational database vendors such as Oracle have basic XPath support, you may need a specialized repository if you have extensive and complex requirements for XPath-based querying.
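To make the contrast with row-and-column querying concrete, here is a minimal sketch of path-based queries in Python. The catalog data and its element names are invented for illustration. Note also that Python's standard xml.etree.ElementTree module supports only a limited subset of XPath, so this shows the style of querying rather than what a full XPath engine or a specialized repository can do.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog data; element and attribute names are assumptions.
# Note that only some products carry a nested <spec>, which is awkward
# to model as fixed rows and columns.
CATALOG = """
<catalog>
  <product category="lighting">
    <name>Track light</name>
    <spec><voltage>120</voltage></spec>
  </product>
  <product category="wiring">
    <name>Junction box</name>
  </product>
  <product category="lighting">
    <name>Dimmer</name>
    <spec><voltage>240</voltage></spec>
  </product>
</catalog>
"""

root = ET.fromstring(CATALOG)

# Attribute predicate: every product in the lighting category.
lighting = [p.findtext("name")
            for p in root.findall("./product[@category='lighting']")]

# Descendant search: every <voltage>, however deeply it is nested.
voltages = [v.text for v in root.findall(".//voltage")]

# Child-text predicate: the product whose <name> is exactly "Dimmer".
dimmer = root.find("./product[name='Dimmer']")

print(lighting)
print(voltages)
print(dimmer.get("category"))
```

Each query addresses the data by its position in the element hierarchy and by its attributes, rather than by table and column.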
If you do have one or more of these requirements, then it makes a lot of sense for you to look at specialized tools for storing your XML. And while there are many players in the space, there are two clear market leaders, Software AG with its Tamino XML Server (http://www.softwareag.com/tamino/), and Ixiasoft Corporation (http://www.ixiasoft.com/) with its TEXTML Server (XIS). Both companies bring a lot to the table. In Software AG's case, they are a long-established database company, having introduced the Adabas product some 30 years ago. They have made an aggressive and very strong entry to the market with Tamino. In Ixiasoft's case, they are a newer vendor with a sole focus on this technology.
This is not to say you would be saying goodbye to Oracle or Microsoft anytime soon. For one thing, you will still have many uses for relational data and the tools that work best with it. But as your use of XML increases, your need for specialized tools will also likely increase. This will be especially true if you have large volumes of XML data, real-time processing needs, or complex queries to run. Those are the same questions you ask of your database today, and will ask of your XML database tomorrow.
UPDATE (04/09/06): This was written a few years ago, but the general ideas still apply. I would say, since I originally wrote this, MS SQL Server has come on as more of a presence in storing XML, and Mark Logic's XML content server has carved out an impressive chunk of this market. And as Dave Kellogg, the CEO of Mark Logic noted recently, the question of XML storage is still a very live one for IBM and Oracle.
Posted by Bill Trippe at 5:19 PM








