December 18, 2007

Call for Papers: Gilbane San Francisco 2008

We are now accepting proposals for panel participation and presentations for Gilbane San Francisco 2008, to be held at the Westin Market Hotel, San Francisco, June 17 - 19, 2008.

Join content and information technology's leading analysts, IT strategists, and technologists at the industry's most popular and important conference this coming spring. Share your expertise and experience, and network with forward-thinking implementers and thought leaders.

How to be a speaker

Choose a topic area from the list below and see how to submit a proposal. The deadline is January 15, 2008. Topics to be covered in-depth include:

If you've never been to one of the Gilbane events and want to see what we have been covering in our conferences, check out the programs from the recent, hugely successful Gilbane Boston 2007 and Gilbane San Francisco 2007.

Posted by Bill Trippe at 10:14 AM | Comments (1)

December 4, 2007

Meanwhile, Over at Gilbane...

Tomorrow, I will be part of a webinar, What Every Publisher Needs to Know About Content Management. It's being put on by Book Business Magazine and sponsored by Follett Digital Resources. Matt Steinmetz, Special Projects Editor for Book Business, will be moderating, and I will be joined on the virtual dais by Jabin White, Vice President for Product Management at Silverchair.

I'm going to present a market overview, offer some definitions, and discuss some recent and emerging trends. I'm going to leave most of the heavy lifting to Jabin, though. He is truly one of the smart guys in the business and an excellent presenter, and I am looking forward to hearing what he has to say.

You can go right to the registration page here.

Posted by Bill Trippe at 8:40 PM

November 28, 2007

Wall Street Hearts AMZN

It's been an up and down week or so in the market, but not so for Amazon. Wishful eBook fans might imagine it is all due to Kindle, but impressive online Christmas shopping numbers are the more likely booster.

Posted by Bill Trippe at 10:53 AM

October 25, 2007

All the News that's Fit to Click?

eMarketer says that, "It’s wake-up time for the publishing industry. Like it or not, readers and advertisers are turning to the Internet, and print brands must follow." The numbers are compelling.

You can read some of the summary and purchase the report here.

Posted by Bill Trippe at 10:52 AM

October 24, 2007

A Billion Here, A Billion There

And sooner or later, you start talking about some serious revenue.

The Interactive Advertising Bureau (IAB) and PricewaterhouseCoopers (PwC) released the IAB Internet Advertising Revenue Report covering the second quarter and the first six months of 2007. Internet advertising revenues (U.S.) for the first six months of 2007 were nearly $10 billion, setting yet another new record and representing a nearly 27 percent increase over the first half of 2006. Internet advertising revenue totaled nearly $5.1 billion for the second quarter of 2007, exceeding the $5 billion mark for the first time in a quarter, a 25.4 percent increase over the same period in 2006.

Posted by Bill Trippe at 8:06 PM

August 20, 2007

Semantic Web Strategies Conference Program is Ready

Bob DuCharme reports that the Semantic Web Strategies program is ready.

I'm very happy to announce that the program for the Semantic Web Strategies conference in San Jose September 30 - October 2nd is finished and available. For keynote speakers, we've got some well-known names who all bring a combination of experience and creativity to their semantic web work: Eric Miller, Nova Spivack, and Kingsley Idehen. We also have presentations on many interesting projects from large and small organizations and well-known semantic web companies such as TopQuadrant, Zepheira, and Access Innovations (of DataHarmony fame) as sponsors.

Posted by Bill Trippe at 4:57 PM

March 13, 2007

Slow Blogging

I have been swamped with work, so I have been slow to blog. There are a few items of note, though.

Premium content does indeed seem to have a life. One of the interesting things about these three items is that two of them are top-shelf traditional publishers and the third is a top-shelf TV network. The lesson for me is that people will pay for premium content when the content is very good.

Posted by Bill Trippe at 5:56 PM

December 27, 2006

Goodbye 2006, Welcome 2007

Apoorv Durga says goodbye to 2006 and welcome to 2007 in the world of portals and content management.

2006 has been an exciting year for content technologies. Based on some of the interesting happenings, the following themes (in no particular order) have emerged that might have an impact on this space in coming years: Standards, or the lack of them was evident.

Posted by Bill Trippe at 9:00 AM

September 29, 2006

So Much for the Death of Print

I roll my eyes (well, not really, but figuratively) when I hear people crow too much about the death of print publishing. Clearly, a great deal of publishing is transitioning to electronic distribution, and--just as clearly--publishers are finding slower growth in print products, faster growth in electronic, and improving margins in electronic. But this headline, among others, reminds us that print is not dead.

Posted by Bill Trippe at 9:40 AM | Comments (0) | TrackBack

September 17, 2006

Monetizing that Content, Baby!

I have always done a few things to make a little money off this blog. I do the Amazon Associates thing, run Pheedo ads, am an affiliate of MarketingSherpa, and syndicate my content through Newstex. The results have been modest--no, make that paltry. Then today I got my first royalty statement from Newstex and found out that my first month's royalties totaled... drum roll please... $1.78.

So much for early retirement.

But I enjoy writing the blog, and I have learned that how much I get paid for a piece of writing does not necessarily equate to how good the writing is or how much I enjoyed the project. For instance I still think one of the best things I ever wrote (co-wrote actually) was a chapter in The Columbia Guide to Digital Publishing. And, for that, I have earned about $54 in royalties.

Posted by Bill Trippe at 5:54 PM | Comments (3) | TrackBack

September 8, 2006

Here and There

Slow blogging lately, as I have been heads down with some work. But here are some things for your consideration.

Posted by Bill Trippe at 2:07 PM

June 12, 2006

Correcting the Record about Microsoft

A-list blogger Robert Scoble is leaving Microsoft for a podcasting startup, but wants to correct the record about some assumptions some people are making about what this means for Microsoft.

I have read Scoble for a few months, and maybe I missed his heyday, but I don't find him to be terribly interesting. This is in part, I think, because I don't land very hard on either side of the Microsoft wars--that is, I don't see them as the evil empire or as the greatest organization in the world. As an analyst, I see them as a dominant software company, but not dominant in the areas that I follow most closely--content management and publishing systems. As a small business person, I look at Microsoft the way I look at my accountant and lawyer--as things I have to spend money on. (And, yes, I know that I could go the open source route, but I just have never taken the time to do it--I am too busy doing client work.) I guess my only other opinion about Microsoft is that their products are less reliable than they should be, given the enormous resources of the company. For example, with two million beta users, Vista better work well, and so far it doesn't.

But back to Scoble. I wish him well in his new venture. He sounds excited about it. But I think his value as a blogger just declined precipitously, as he is no longer in the belly of the beast.

UPDATE: Scoble was gracious enough to link to me.

Posted by Bill Trippe at 12:25 PM | Comments (1) | TrackBack

June 3, 2006

Coming Soon to a 737 Near You

As long as they do this instead of allowing people to yack on their cell phones, I will be happy.

Posted by Bill Trippe at 6:03 PM

June 2, 2006

User-Generated Content: Where Does it Fit?

I spoke at the NFAIS event today, and it seemed to go well. You can find the slides here.

Posted by Bill Trippe at 11:27 PM

May 31, 2006

Imagine

Dave Winer Imagine[s]™ what™ the™ world™ would™ be™ like™ if™ everyone™ trademarked™ every™ word™ that™ was™ ever™ added™ to™ the™ language.™ It™ would™ get™ pretty™ tiresome™ really™ fast.™ He is referring to this, but it is also worth reading this.

Posted by Bill Trippe at 6:39 PM

May 25, 2006

The Changing Face of Content

I will be speaking next Friday, June 2, at an NFAIS event, The Changing Face of Content: Creating Innovative Information Services for the 21st Century. My topic will be user-contributed content. Per the abstract:

A broad range of content is now being created by individuals as a result of readily accessible web tools. While this class of published information is not usually held to the more strict traditional publication process associated with books and journals, it nevertheless often constitutes material worthy of distribution and preservation. This session will focus on the challenges in enhancing the visibility of this new form of content and how such content can be incorporated into digital collections, products and services.

It's a day-long event, right in center city Philadelphia, and registration is still open.

Posted by Bill Trippe at 3:44 PM

April 8, 2006

Folio 40

Folio Magazine has listed their Folio 40, "the oldest and most prestigious list honoring publishers who’ve had a significant impact on their own products and the magazine industry in general." It honors individuals, and this year included Jon Udell. Jon is always worth reading, Folio is to be commended for recognizing Jon's leadership among technology writers, and Jon offered a gracious acknowledgment. But I couldn't help but be struck by the irony that Folio, a magazine about the magazine industry, is smart enough to recognize Jon but does not have an obvious RSS feed.

Posted by Bill Trippe at 10:47 AM

April 3, 2006

CM Professionals Spring 2006 Summit

Seth Gottlieb has a roundup of the speaker list from the CM Professionals Spring 2006 Summit. I will be delivering one of the keynote presentations on actionable content.

UPDATE: You can read the press release here.

Posted by Bill Trippe at 8:43 PM

March 29, 2006

How Gauche

An unintended effect of keyword advertising.

Posted by Bill Trippe at 8:46 PM

March 27, 2006

How Nerdy Are You?

Via Publishing 2.0, I learned that Newsweek is asking and answering this question with an interactive poll. Apparently, a few people at www.msnbc.com are not terribly nerdy, as the link to the poll (and apparently the entire article) causes my Firefox session to crash.

But it works just dandy in Internet Explorer. Coincidence? I think not!

It turns out I am barely a nerd. I scored a 30, which puts me just in the "Heading to Geekdom" (30-60) category and--egads!--on the cusp of "Stuck in the Last Century" (0-29). Well, come to think of it, in the last century we weren't stuck with this doofus, so maybe that isn't such a bad place to be after all. I am going to go shoot for a lower score...

Posted by Bill Trippe at 2:19 PM

February 28, 2006

The Long Tail, Redux

Gerry McGovern has some contrarian ideas on The Long Tail.

Posted by Bill Trippe at 8:05 AM

February 10, 2006

Blogs, Blogs Everywhere

I've commented on the ridiculous growth in blogs, but Tim Bray has a much more clever take on it, and thanks to Perl no less!

Posted by Bill Trippe at 12:03 PM

February 2, 2006

DRM at Davos

Bill Rosenblatt has a report on his talks at last week's World Economic Forum in Davos.

Posted by Bill Trippe at 12:16 PM

January 14, 2006

I Was the Internet Once

Via PaidContent.org, I learn that the Chicago Tribune is the latest major newspaper to stop printing the pages and pages of closing stock and mutual fund prices that have long been a staple of daily newspapers. They will still run the comprehensive tables on Sundays, and the daily papers will tabulate highlights from the day--most active stocks, local stocks, etc.

This probably strikes many people as a no-brainer. I probably last looked up a closing stock price in a newspaper in 1999 or so, and the Web is lush with financial information. But there are certainly some readers who--whether because of age or habit or temperament--still go to their daily newspaper for this kind of information. So I am sure some small percentage of the Tribune's readers will not be happy with the change.

And this is certainly more evidence of how the game is changing for print newspapers. Boston is now a two-newspaper town, but it had three dailies when I was a kid, and at least two of them had evening editions. Evening editions started disappearing as local television stations began producing more local news, and there are few evening papers anymore (if any?).

My first paying jobs were in newspapers. When I was 12, I started a paper route, and would eventually deliver about 60 newspapers a day to my neighborhood. I made about $15 a week, which was a small fortune in those days. I could buy all my essentials--baseball gloves, hockey equipment, and all the candy bars I could possibly eat. I also began working at a corner drug store, where I would deliver telegrams and prescriptions around the neighborhood. I also had the job of delivering four evening newspapers--one to a shut-in across the street and the others to the three brothers who ran the corner grocery store on the opposite corner. The brothers, I would learn later, were in the stock market, and the evening edition, which came out about 6:00 PM, would somehow--miraculously to my thinking--have the closing stock prices from when the market closed at 4:00.

I was the last link in this amazing information chain--from the markets in New York, to the presses somewhere in downtown Boston, by truck to the corner drugstore just outside Boston, and then into my hands. The driver would throw the bundle of papers from the truck onto the drugstore step. The clerk--a few years older than me and full of secret knowledge--would produce a boxcutter from his pocket and slice the twine from around the papers. I would grab the top four papers and negotiate the busy intersection. There were no walk signals in those days, and the traffic barely paused. At the first break in the traffic I would sprint across the street and through the door of the busy market. The brothers would be around the counter. Charlie, the butcher, in his long bloody apron was always the one to take the papers from me, slipping me a quarter--15 cents for the three newspapers and the remaining dime for me. Before I was back out the door they would have the papers open, scanning the closing prices to see how they had done that day.

Posted by Bill Trippe at 11:03 AM

January 12, 2006

Go Figure

So who said print was dead? TV Guide redesigns their magazine and increases newsstand sales by 38%. I've never been a subscriber, so I don't know what the redesign does for their existing readers, but clearly there is some appeal to the newly designed product. To paraphrase Mark Twain, the reports of the death of print were greatly exaggerated. Indeed, the equations of print vs online are more complicated than merely "print will shrink as online grows."

Posted by Bill Trippe at 12:03 PM

January 8, 2006

The Long Tail

OK, I'm slow sometimes. I finally got around to reading Chris Anderson's article, "The Long Tail," some fifteen months after it first appeared and ten months after Frank Gilbane commented on its relevance to enterprise software. It had caught on enough that I understood the basic idea, but the article is definitely worth reading, as is Anderson's blog. I find myself agreeing with the overall premise and a lot of his ideas, but he is enamored of some things that I am not terribly impressed with. Google Print comes up again and again, and all I can conclude about Google Print is that the search is only decent, the navigation frustrating, and the page rendering often abysmal (see here, here, and here for examples I found in a couple of minutes of random searching, and I have seen worse). I look at Google Print as a potential model that can exploit the long tail, but a crude and early attempt at something that will be done much better in the future--either by a later version of this product or an entirely different product. Of course, Yahoo and others are in the game too, and publishers such as Random House and HarperCollins seem to want to take things into their own hands. And while the details of these books-on-demand models get worked out, I am sure Anderson will be most directly pleased if you simply buy his upcoming book.

Apart from my nitpicks about some of Anderson's examples, the ideas are important--and I think very important for publishers. Anderson says it best himself in the original article (bolded emphasis mine):

What's really amazing about the Long Tail is the sheer size of it. Combine enough nonhits on the Long Tail and you've got a market bigger than the hits. Take books: The average Barnes & Noble carries 130,000 titles. Yet more than half of Amazon's book sales come from outside its top 130,000 titles. Consider the implication: If the Amazon statistics are any guide, the market for books that are not even sold in the average bookstore is larger than the market for those that are (see "Anatomy of the Long Tail"). In other words, the potential book market may be twice as big as it appears to be, if only we can get over the economics of scarcity. Venture capitalist and former music industry consultant Kevin Laws puts it this way: "The biggest money is in the smallest sales."

I hear this in different ways all the time from publishers who are ahead of the curve in electronic distribution of their content. Journal publishers who provide sales of single articles have found customers who would never have bought an entire subscription. Speciality publishers who have digitized old manuscripts and back issues of publications are finding small but whole new audiences for their content. The examples--and Anderson's ideas--are compelling and instructive.
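
The arithmetic behind Anderson's "twice as big" remark can be sketched in a few lines of Python. This is only an illustration: the function name and the sample shares are mine, and the one figure taken from the article is that more than half of Amazon's book sales come from outside its top 130,000 titles.

```python
def market_multiple(tail_share: float) -> float:
    """Ratio of total sales to sales of the titles a typical store stocks.

    tail_share is the fraction of all sales that come from titles the
    store does not carry (a hypothetical input for this sketch).
    """
    if not 0.0 <= tail_share < 1.0:
        raise ValueError("tail_share must be in [0, 1)")
    # If the store's titles account for (1 - tail_share) of all sales,
    # the whole market is 1 / (1 - tail_share) times the in-store market.
    return 1.0 / (1.0 - tail_share)


# Illustrative shares only; 0.50 is the break-even "half" case.
for share in (0.50, 0.60):
    multiple = market_multiple(share)
    print(f"tail share {share:.0%}: total market is {multiple:.2f}x the in-store market")
```

At exactly half, the multiple is 2; anything above half pushes it higher, which is why "more than half" supports "twice as big" as a floor rather than an estimate.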

Posted by Bill Trippe at 6:00 PM

January 4, 2006

Ambient Findability

If you've been hearing about Peter Morville and Ambient Findability, he has a very readable introductory article in the November/December issue of Online. Peter blogs at findability.org, which includes a link to his interview this week on NPR's OnPoint. Of course, you can also just buy the book.

Posted by Bill Trippe at 11:47 PM

eBook Fare: Bestsellers, SciFi, Reference, and More

The International Digital Publishing Forum has announced their eBook best seller list for 2005. It's an interesting mix, including traditional bestsellers (Dan Brown dominates the list), SciFi and Fantasy (Star Wars Episode III topped the charts), and staples like bibles and dictionaries. Here's the top ten, with their retail price.

  1. Star Wars: Episode III Revenge of the Sith by George Lucas (Del Rey, $7.99)
  2. The Da Vinci Code by Dan Brown (Doubleday, $14.95)
  3. Angels & Demons by Dan Brown (Pocket Books, $6.99)
  4. State of Fear by Michael Crichton (HarperCollins, $7.99)
  5. Digital Fortress by Dan Brown (St. Martin's Press, $5.99)
  6. Embers Falling on Dry Grass by Robert Jordan (Simon & Schuster, Inc., $3.50)
  7. Deception Point by Dan Brown (Pocket Books, $6.99)
  8. Merriam-Webster's Collegiate Dictionary, Eleventh Edition (Merriam-Webster, $25.95)
  9. Holy Bible, New International Version (Zondervan, $14.99)
  10. The Narrows by Michael Connelly (Little, Brown, $5.95)

Posted by Bill Trippe at 10:22 AM

January 2, 2006

2006: Another Good Year for Content Management?

I don't do or have much access to quantitative research, but many people (here, here, and here) pointed to 2005 as a growth year for content management. One of the drivers was the whole, dreary compliance thing, but I think there was more to it. The other contributing factors included:

If I were to rank the relative importance of these factors, I would probably put the last one first. I had a number of conversations in 2005 where senior managers said to me, "Content management technology is more useful now." When I first heard this, I didn't really buy it. To me, the technology of content has been relatively stable for about 3 years anyway, maybe more. But the more I heard people say this, the more I had to reflect on it, and the more I realized they were right. If you look at my bulleted list (and you could think of more), a lot of good things have come together over the past several years. The result has been that content applications are easier and less expensive to deploy. And the rising skill sets have combined with a recognition from management to create more projects, more applications, and more growth for the overall industry.

2005 was indeed a good year for the content management industry, and I expect 2006 to be as well.

Posted by Bill Trippe at 11:40 AM

December 21, 2005

Simplicity

I've been hearing a lot about simplicity as the new driving goal of technology design. Via the MeansBusiness newsletter Ideas in the News, we get these thoughts on simplicity from Jean-Philippe Courtois, president of Microsoft International.

"More is going to happen over the next ten years than it did in the past ten. By 2015, the world will again experience the kind of dramatic shift that the internet brought, which is a pretty exciting notion. A lot of this change is going to happen through software...Workers and organisations are already nearing the point of so-called information overload, where the sheer volume of data and the complexity of the applications necessary to work with it threaten to overwhelm the powers of human cognition. These distractions have a demonstrable effect on the productivity and health of workers. Along with the proliferation of channels and features that IT offers, we are looking to offer simplification and insight with our products. That means we are trying to address things like prioritisation, context, attention management, and also to bring in better and smarter ways to visualise and control volumes of complex data."

I have some better suggestions for Microsoft to accomplish first, beginning with a secure operating system, more stable software, and core applications that aren't bloated with thousands of features most users never even learn about, let alone use. Any one of these things would dramatically simplify the experience of millions of users around the world. And then Microsoft executives wouldn't have to bloviate about information overload--a topic that has been under discussion for 20 years and that this guy presents as if it is brand new.

Posted by Bill Trippe at 10:55 AM

December 8, 2005

The eBook Wars

Writing for Publish.com, Ben Charny has an excellent roundup of the online book efforts at Google, Yahoo, and elsewhere.

Posted by Bill Trippe at 5:52 PM

December 7, 2005

Virtual XML Garden

Every now and then I pop over to IBM's alphaWorks site to see about some of their new technology, and I am rarely disappointed. One of the new projects up there is a Virtual XML Garden, which is described as "an implementation of XML processing directly and efficiently over arbitrary, structured data" but is really more than that.

With Virtual XML Garden, users can write scripts in XPath (as well as a subset of the forthcoming XQuery language) that mix and match virtual XML views on a number of provided data sources, including XML access to ZIP archives, the file system, binary formatted data, and hierarchical (Information Management System (IMS)) databases.

Check it out. This strikes me as an incredibly useful thing.
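
To give a feel for the idea of querying non-XML data through an XML view, here is a minimal Python sketch that exposes a ZIP archive's table of contents as XML and runs an XPath-style query against it. It is not Virtual XML Garden's API: the element and attribute names are invented for the example, and unlike a true virtual view it builds the whole tree up front rather than on demand.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def zip_as_xml(archive: zipfile.ZipFile) -> ET.Element:
    """Build an XML view of a ZIP archive's table of contents.

    The names used here (archive, entry, name, size) are hypothetical,
    chosen only for this sketch.
    """
    root = ET.Element("archive")
    for info in archive.infolist():
        ET.SubElement(root, "entry", name=info.filename, size=str(info.file_size))
    return root


# Build a small in-memory archive so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("chapter1.xml", "<chapter/>")
    zf.writestr("notes.txt", "draft notes")

with zipfile.ZipFile(buffer) as zf:
    view = zip_as_xml(zf)

# ElementTree supports only a subset of XPath, but that covers a simple predicate.
matches = view.findall(".//entry[@name='notes.txt']")
print([(entry.get("name"), entry.get("size")) for entry in matches])
```

The appeal of the real project, as I read the description, is that the same query language works whether the source is a ZIP file, a directory, or a database.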

Posted by Bill Trippe at 2:43 PM

November 28, 2005

O Canada!

I have always said Canada, and especially Ontario, has had a special role in advancing XML (and SGML before it). This article backs me up on it, and makes the case that Ottawa deserves special mention. The article does a great job of giving proper due to companies like Exoterica and MicroStar, and the many forward-thinking and smart people who have worked for them over the years.

Posted by Bill Trippe at 5:48 PM

November 21, 2005

Random House Pushes Back

Google Print, aka, Google Book Search, is not without its challengers. Authors have sued, publishers have sued, and now publishing giant Random House has essentially said, "thanks, but no thanks."

(For some background, the blog, DigitalKoans, has a useful bibliography.)

I called Random House, but they wouldn't comment on the details of any relationship, so what I am about to say is purely speculation. But it seems to me that Random House is saying the digitization and control of their books is their job, and not Google's, and I wonder if this might play out in a certain way.

1. They opt out of the Google program and do their own digitization.
2. They post their digital files on a public web site for wide searching but controlled distribution.
3. They make their own arrangement with Google Book Search, offering limited rights to their own digitized files--or not. They would already be in organic Google results, and Google wouldn't shut them out because they're Random House and represent too much of the book business.

That seems to be the kind of control Random House is aiming at. This gives them organic search results in Google, with the specific Book Search results as well, if they want them. It also has the effect of calling Google's bluff. I mean, if Google is only doing this for altruistic reasons, why not let the publishers do their own digitization?

This makes a lot of sense to me. And, Random House aside, I would certainly take this approach if I were a publisher. Publishers have compelling reasons to digitize anyway—for marketing purposes alone, even if eBooks continue to yield small revenues. And the options for digitizing seem to be getting cheaper by the minute. If I'm a publisher, why should I cede the business to Google?

Posted by Bill Trippe at 4:32 PM

November 8, 2005

Current Trends in Publishing Technologies

I am speaking today at the SSP Fall Seminar, Tech Blitz: Embracing Technology and Process Changes. It is being held in Philadelphia at PALINET Headquarters. You can download a PDF of the slides here.

Posted by Bill Trippe at 7:18 AM

December 29, 2004

One-Stop Compassion

Amazon.com is providing a very easy way for its customers to give money to the American Red Cross for disaster relief in East Africa and South Asia in the aftermath of the earthquake and resulting tsunamis on December 26.

Click here to give. At this writing, 26,093 people have given $1,396,966.22. Give a little, and then keep refreshing that page and watch the total increase. Since I wrote this sentence, the totals have risen to 26,156 people giving $1,401,616.22.

Posted by Bill Trippe at 10:37 AM

December 24, 2004

Reuse

For a client project and for some current writing, I have been thinking a lot about reuse lately. When you do single-source publishing, especially with something like SGML or XML, you have a great opportunity not only to repurpose the content into other formats (print, HTML, help, etc.) but also to reuse the content objects in multiple content products. Thus an aircraft maintenance task, say, checking tire pressure, that is common to many different aircraft models can be used in many different manuals.

Ann Rockley makes this point very well in a recent column for Transform Magazine:

To graduate to object-oriented content reuse, you must create modular content components, such as procedures, product overviews and sales descriptions, and then reuse these components in as many ways as possible. Brochures, manuals, training materials, troubleshooting guides and positioning papers are all prime candidates for content reuse, and they may all exist in print, Web and other forms.

The folks at Data Conversion Labs have done some research recently that suggests that as much as 50% of your content may be redundant and thus a perfect candidate for reuse. They have a new product and services offering, Harmonizer, that assesses how much of your content may be redundant and runs processes to eliminate extraneous content and "harmonize" remaining text to be more standardized.

The idea of reuse is easy; the application is difficult. I have seen it done extraordinarily well, though, with enormous payback. Data Conversion Labs is on the right track by coming up with a standard offering to help with the initial analysis. Another good starting point is Ann Rockley's book, Managing Enterprise Content. The chapter on reuse, "Fundamental Concepts of Reuse," can be downloaded for free here.
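
For what it's worth, the core of component reuse can be sketched in a few lines of Python: each manual holds references to shared components instead of copies of their text. The component IDs, manual names, and wording below are all made up for illustration; a real system would store the components as SGML or XML and do far more.

```python
from collections import Counter

# Shared content components, keyed by a hypothetical component ID.
components = {
    "check-tire-pressure": "Check tire pressure against the placard value.",
    "inspect-brakes": "Inspect brake pads for wear.",
}

# Each manual is an ordered list of component IDs, not copied text.
manuals = {
    "Model A Maintenance Manual": ["check-tire-pressure", "inspect-brakes"],
    "Model B Maintenance Manual": ["check-tire-pressure"],
}


def assemble(manual_name: str) -> str:
    """Resolve a manual's component references into its full text."""
    return "\n".join(components[cid] for cid in manuals[manual_name])


def reuse_counts() -> Counter:
    """Count how many manuals reference each component."""
    return Counter(cid for ids in manuals.values() for cid in ids)


print(assemble("Model B Maintenance Manual"))
print(reuse_counts())
```

Edit the tire-pressure text once and every manual that references it picks up the change, which is where the payback comes from.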

Posted by Bill Trippe at 1:18 PM

December 17, 2004

A Few Thoughts

I have been very busy with work, so haven't been posting much. A few thoughts, though:

Posted by Bill Trippe at 9:42 PM

October 22, 2004

Is "Content Management" Undergoing a Redefinition?

TechWeb, among other sites, is reporting today on IBM's announcement of expanded offerings for compliance. What's noteworthy about IBM's announcement is how much it focuses on technologies like storage, electronic forms, and data archiving. Of course, you could view this as IBM simply playing to their strengths, with offerings like Tivoli Data Storage. But if so much marketing message goes to these points, do we eventually start thinking of content management differently from how we do now?

Posted by Bill Trippe at 5:18 PM | Comments (2)

October 10, 2004

Content Management and eCommerce

It feels sort of old-fashioned to write the word "eCommerce," whereas once it was the Holy Grail. The timeworn truth now is that the dot-com bubble burst, but the reality is that billions of dollars in business has moved to the Web. More significant is how much business process happens using the Internet as infrastructure.

Perhaps the question we should now be asking is, how much business can now be conducted impersonally over the Internet?

Content management, of course, plays a fundamental role in Internet-based commerce. Content management has been my core focus for several years now, and I have worked closely with large companies who have been automating how content is used in design, manufacturing, sales and marketing, logistics, and customer support. This is an area of intense focus for many companies now, and the platforms and systems to support content management are growing more powerful and more functional all the time.

Why is content management fundamental to eCommerce? Commerce involves intensive communication at all phases of the process, and eCommerce requires that much of the communication happen automatically and online--impersonally, as a colleague recently put it. When the products are complex, the content is correspondingly voluminous and complex, increasing the need for content management technology.

In fact, the challenge of content management is even more complex and interesting than that--this is precisely where the issues get interesting. I have been writing for years about how content management supports all kinds of business processes--research and development, design, manufacturing, marketing and sales, customer support, maintenance and supplies. I have not been able to successfully articulate this yet, but there is something fundamental about the connection between business processes and the content that supports these processes. There is some kind of lever here--the intimate relationship between content and business process at all points in the buying and selling process.

In the long run, organizations that most effectively tie content management and eCommerce together will profit from their efforts. Of course, I may have understated the potential impact. The real impact for these companies could be much more fundamental. Indeed, one could argue that being successful at this kind of process will soon become necessary for survival in many industries.

Posted by Bill Trippe at 7:34 PM

August 27, 2004

Enterprise DRM?

If we are going to have "enterprise DRM," such technology must become a key and well-integrated part of the larger enterprise infrastructure that includes content management, document management, application serving, security, and other enterprise technologies. Many organizations are investing in several of these applications, but don't yet know how they all fit together. There are some piecemeal solutions out there--where a collaboration tool or CMS has been linked with DRM. But does any organization yet have a fully integrated environment, where DRM spans single sign-on and all content applications?

Posted by Bill Trippe at 6:27 PM

July 16, 2004

SAP Enters the Content Management Market

My colleague Mark Walter called my attention to SAP's acquisition of privately held catalog vendor A2i, Inc. The press release has additional detail, but the subheading pretty much says it all, "Strategic Acquisition Integrates Enterprise-Wide Product Content Management and Data Aggregation Capabilities into SAP NetWeaver Platform; Solution to Deliver Global Data Synchronization for Consumer Products and Retail Industries."

In other words, all NetWeaver, all the time.

Posted by Bill Trippe at 9:09 PM

June 18, 2004

My Own Experience with Moore's Law

I bought a new hard drive yesterday for one of my desktop machines--a 100 Gigabyte drive for 100 dollars. It made me harken back to one of my biggest early computer purchases in 1987, when I bought a hard drive for my Macintosh Plus computer. I had been growing tired of swapping floppies in and out of my dual floppy drives. If I remember correctly, the first floppy disk held the operating system and the second floppy disk would hold the application, such as Microsoft Word. So if you were creating a document, you would swap the application disk in and out for a data disk when you needed to load a new document or save the one you were working on.

I was frugal enough then that I really thought long and hard about buying a hard drive. They were expensive, and in the Macintosh world, it meant choosing from a long and dizzying list of hard drive manufacturers. I finally settled on a Jasmine 40 MB hard drive for $700.

The math is pretty amazing. My 1987 purchase worked out to $17.50 per Megabyte, and my 2004 purchase worked out to about 0.1 cents per Megabyte. Check my math, but this is significantly faster than Moore's Law.
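
For anyone who does want to check the math, here is a quick sketch. It assumes decimal units (1 Gigabyte = 1,000 Megabytes) and reads Moore's Law loosely as a doubling every 18 to 24 months; the function and variable names are made up for illustration.

```python
import math

def dollars_per_mb(price_dollars, capacity_mb):
    """Cost of one Megabyte of storage, in dollars."""
    return price_dollars / capacity_mb

# 1987: Jasmine 40 MB drive for $700
cost_1987 = dollars_per_mb(700, 40)
# 2004: 100 GB drive for $100, assuming 1 GB = 1,000 MB
cost_2004 = dollars_per_mb(100, 100 * 1000)

improvement = cost_1987 / cost_2004        # how many times cheaper per MB
halvings = math.log2(improvement)          # number of times the price halved
months_per_halving = (2004 - 1987) * 12 / halvings

print(f"1987: ${cost_1987:.2f} per MB")
print(f"2004: {cost_2004 * 100:.2f} cents per MB")
print(f"{improvement:,.0f}x cheaper, one halving every {months_per_halving:.1f} months")
```

Under these assumptions the price per Megabyte halved roughly every 14 to 15 months, which is indeed quicker than an 18-to-24-month doubling period.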

Posted by Bill Trippe at 9:20 PM | Comments (2)

May 18, 2004

Steady Growth Ahead for Book Publishers

According to the Book Industry Study Group, book publishers will see modest growth over the next several years, with the educational segments of the market promising the healthiest expansion:

Annual consumer expenditures for books will reach $44 billion by 2008, according to Book Industry TRENDS 2004. Trade, mass-market and professional publishing revenues will rise roughly 10 percent between now and then. Revenue growth will be higher still for university press and college publishers, but the most significant growth will be in the elhi and standardized-test segments of the industry, with respective increases of more than 20 and 45 percent.

This larger trend seems to track with my own observation that educational publishers are in a spending mode on technology. This is good. As they experience growth, the technology investment can help them maintain and perhaps even improve on profitability amidst the growth.

Posted by Bill Trippe at 11:15 PM

April 25, 2004

The State of the eForms Market

I gave my opening presentation at the Seybold eForms summit. It went well, though I have to admit to being a bit in awe of the audience, which included many of the real thinkers and doers in the XHTML and eForms world. But my job was to lay the groundwork for the rest of the day, and the feedback was positive.

Posted by Bill Trippe at 1:30 PM | Comments (1)

April 13, 2004

Brief DRM Conference Report

I am at Bill Rosenblatt's DRM conference, and it is a big hit. Excellent crowd, excellent content. I moderated a session on the e-publishing market, which went well.

I was very interested in a session on legal issues that ended up being mainly about compliance, and the need for DRM in the enterprise. One of the speakers, Nick Ackerman from the law firm Dorsey & Whitney, made some excellent points. I was especially glad to hear him not define compliance narrowly, as in just Sarbanes-Oxley or just HIPAA. He made the point that compliance is an enterprise-wide problem, requiring an enterprise-wide solution, i.e., DRM.

George Everhart, CEO of DRM vendor Sealed Media, then took this a step further, saying that compliance should be viewed not as the final goal for an enterprise DRM strategy, but as the catalyst for a broader "information lifecycle management policy." This has DRM cozying up to records management in some ways, but also suggests that organizations need to look at all organizational information—and develop policies for all of it.

Posted by Bill Trippe at 4:58 PM

April 6, 2004

Content Management Market Consolidation

I have been researching some of the recent consolidation in the CMS market. The result is a new article in The Gilbane Report, "Content Management Industry Consolidation: What Does It Mean?"

To quote briefly from the introduction:

The content management community is abuzz with discussions about what content management is, and what the difference is between all of its various incarnations, including ECM, WCM, DM, KM, DAM, etc. The seemingly endless supplier consolidation has done nothing to lessen interest in this question, and there will continue to be consolidation for the foreseeable future even though there will also continue to be new companies and new products and content technologies emerging. What does this ongoing consolidation mean to businesses planning or implementing content management strategies?

This month Bill takes a look at some of the longer term trends behind the recent consolidation. There are obvious reasons some vendors need to "bulk-up" to maintain business models and stay competitive, but there are also more subtle trends underlying much of the consolidation. To fully understand how supplier consolidation will impact your content management plans you need to take a look at these less obvious trends. It is certainly important to understand who owns who, and whether the products are integrated, but it is also critical to appreciate the cumulative effect of consolidation and the influence of larger computing industry trends to be fully prepared to adjust your content management strategy as the industry evolves.

Posted by Bill Trippe at 4:21 PM

April 3, 2004

Editorial Process in Content Management

In some email discussion recently, people have been talking about putting more emphasis on the "C" in discussions of CMS, or content management systems. I couldn't agree more.

As I mentioned in this email discussion, content "management" isn't tied, per se, to technology. My colleague Jenn Accettola makes this point really well in saying that a content management "system" begins with the systematic processes that allow organizations to manage content effectively. The technology, then, can be viewed as a way to accelerate or enhance these processes.

I have worked in and with organizations that have all kinds of editorial and review processes that they conceive and plan independently of a particular technology. They then look at automated and nonautomated processes for accomplishing the task, and choose the better one--based on cost, effectiveness, and timeliness. In this way, they are making business decisions first, and technology decisions second. It is instructive to those of us who think "computer first" that the organization's overall goal is better content at the right price, not better content through automation.

Posted by Bill Trippe at 7:27 PM

March 9, 2004

Outsell Finds Market for Paid Content to be Larger than Thought

Outsell does excellent research about the publishing and information industries. They have some new research that suggests the market for paid content is much larger than previously thought. To quote briefly from their press release:

"Outsell, Inc. ... today released startling new analysis revealing the online paid content market to be 35 times larger than commonly reported. A new Outsell Briefing, Content Vendor Best Practices: Busting Up Fee Vs. Free, provides specific and actionable information for content vendors and information users alike. The Briefing includes profiles of more than 100 successful content providers focused on blended business models that create value for their users. Rather than worrying about fee OR free, innovative companies are taking a wide-open, both/and approach, creating a very large and often misrepresented market and ending the fee vs. free debate."

I am not at all surprised by this. Indeed, I think the market for paid Internet content is undercounted--as is the market for paid Internet advertising. There is a lot of good news out there for publishers.

However, the ability to capitalize on these opportunities depends on publishers being able to deploy multichannel publishing technology at a reasonable and predictable cost.

For publishers, the Internet can seem like a conundrum. In the midst of so much plenty, why is there so little real revenue? Indeed, the opportunity defines the challenge--so many potential opportunities, and yet so many of them are unproven.

Publishers are used to a model where they can focus on predicting audience and revenue against a relatively well-known set of costs. The Web, for all its potential, is still unformed, and few business models have any kind of track record.

Complicating matters is the difficult question of predicting costs. Many Web development efforts have not just proven costly--they have often suffered from cost overruns, unmet expectations, and enormous hidden costs. Add to this the constant change in technical requirements and infrastructure, and publishers are left with often staggering challenges.

Building a Web infrastructure is a complex, highly technical undertaking that many organizations are unprepared to face. Research from CAP Ventures and elsewhere suggests that as many as 60% of in-house Web development and integration efforts fail, and are abandoned at significant cost over time.

Publishers who consider building their own systems face high costs of software and integration, and the need to maintain and upgrade the system over time. Publishers are not typically staffed for this kind of operation, and may not even have the necessary skills and experience to contract for this work efficiently.

Any single component application of a Web site is itself complex and difficult to select, install, and customize. Yet integration of multiple component applications is even harder. Even a component technology such as a search engine, long considered "commodity" software, is difficult to integrate across a complex Web site to the point where the end user is ensured a consistent experience across the site.

Publishers who seek partners are often faced with "all or nothing" proposals that bind them to larger portals that could well cannibalize or overtake their own business.

Further complicating things is the perceived need to move quickly, even in the face of partial information.

Even in cases where the initial Web development effort has been completed somewhat successfully, the publisher is likely left with a maintenance headache. Systems and subsystems change so quickly that, by the time the project is completed, major components of the system will likely need to be upgraded or swapped out.

So What to Do?

The key is to effectively manage all aspects of the technology that supports your publishing. This begins with basic questions such as, "Should we even try to do this ourselves, or should we look at options for partnering, outsourcing, or relying on an Application Service Provider (ASP)?" If you are not the kind of organization that is accustomed to running a lot of technology, you should think twice before trying to run a complex publishing operation on your own.

Just my thoughts, as usual. I would love to hear from others on this topic.

Posted by Bill Trippe at 1:01 PM

February 28, 2004

The Future of Content Delivery: Services-Oriented Architectures now Available

The article I was researching and writing for Transform Magazine is now available online. To quote from the lead:

Approaches to software development come and go. Today's dominant approach gives way to the hot, emerging trend, and the cycle then repeats itself. When the dust settles, organizations still have much hard work left to make their systems and applications productive and efficient.

Web services, or, more broadly, service-oriented architectures (SOAs), are quickly becoming the dominant approach to software development and integration. SOAs are best understood as a method for integrating software as services, with networked software and content made available through standardized, XML-based mechanisms. In this approach, a software service can be easily and openly integrated with other services. As an example, you could flexibly tap into a software service that provides currency conversion, another one that translates text into other languages and another one that converts a document from one format to another.

The appeal of SOAs is consistent with the broader momentum toward the loose coupling of applications and data. As more organizations move content and applications out to the Web, software developers are pressed to find rapid, efficient means of bringing content and applications to the very thin client provided by browsers. Just as XML gave developers an open means of data transfer between applications, SOAs are giving developers an open means of having software communicate with and control other software over the Internet.
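
To make the excerpt's idea of loosely coupled, XML-based services a little more concrete, here is a minimal sketch. It is only an in-process illustration, not a real Web services stack: the service names, the `request`/`response` message format, the exchange rate, and the phrase table are all invented for the example. The point is that the caller knows only a service name and a message format, never the code behind it.

```python
import xml.etree.ElementTree as ET

# Hypothetical data for the sketch; not real rates or a real dictionary.
RATES = {("USD", "EUR"): 0.8}
PHRASES = {("en", "fr"): {"hello": "bonjour"}}

def currency_service(request_xml):
    """Convert an amount; takes an XML request string, returns an XML response string."""
    req = ET.fromstring(request_xml)
    rate = RATES[(req.get("source"), req.get("target"))]
    resp = ET.Element("response")
    resp.text = f"{float(req.text) * rate:.2f}"
    return ET.tostring(resp, encoding="unicode")

def translation_service(request_xml):
    """Translate a phrase; unknown phrases are returned unchanged."""
    req = ET.fromstring(request_xml)
    table = PHRASES[(req.get("source"), req.get("target"))]
    resp = ET.Element("response")
    resp.text = table.get(req.text, req.text)
    return ET.tostring(resp, encoding="unicode")

# The registry stands in for service discovery over a network.
SERVICES = {"convert-currency": currency_service, "translate": translation_service}

def call(service_name, text, **attrs):
    """Build an XML request, send it to the named service, unpack the XML reply."""
    req = ET.Element("request", attrs)
    req.text = text
    response_xml = SERVICES[service_name](ET.tostring(req, encoding="unicode"))
    return ET.fromstring(response_xml).text

print(call("convert-currency", "10", source="USD", target="EUR"))
print(call("translate", "hello", source="en", target="fr"))
```

Because every service speaks the same message format, a third service (say, document conversion) could be added to the registry without changing the caller.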

Posted by Bill Trippe at 3:37 PM

February 12, 2004

First, do no Harm: Can Privacy and Advanced Information Technology Coexist?

The following is an article I wrote last year for EContent magazine that delves into some of the privacy issues raised by the growth in scope, depth, and power of database technology. While it specifically discusses medical privacy and the HIPAA act, its broader questions apply to many kinds of content and information.

Advances in networking and database technology have brought vast amounts of data together, and as search and querying technology improves, these vast stores of data become increasingly meaningful to even the casual user. In the right hands, such networked data and content can be invaluable--the doctor who needs vital patient records, the security analyst who wants to glean some intelligence from financial records. But for all its potential good use, the same data has great potential for misuse--either inadvertent or intentional. Mishaps have already happened and, while policies are in place (and new ones are soon to be implemented), the risk remains. Perhaps the real question, then, is whether technology to preserve privacy can advance as quickly as the technology that seems to be putting privacy at risk.

In late December 2002, the U.S. Department of Defense reported that its efforts to computerize the medical records of military personnel were set back when hard drives containing the records of a half-million personnel were stolen. The records included names, social security numbers, and medical claims histories. According to the Associated Press, the Defense Department had seen the new computerized system "as a potential 'data gold mine' for military physicians and other healthcare professionals that will provide quick and easy access to military patient records worldwide."

While this is perhaps the most spectacular recent privacy breach, it is not the only one. According to news accounts, patient record information has been compromised at a major pharmaceutical chain, a health insurance company, and an online retailer of healthcare products, to name a few places. In each of these cases, the compromise has been inadvertent: in one case, information was emailed to the wrong parties, and in another case, sensitive information was accidentally posted to a public Web site. But when these accidental disclosures are considered in light of the Defense Department theft and some well-publicized security breaches at ecommerce companies, the concern begins to grow.

Indeed, many would argue that, when it comes to medical records, any compromise is unacceptable and that every reasonable effort should be made to safeguard such data. To that end, the federal government is mandating the enforcement of new patient privacy rules under the Health Insurance Portability and Accountability Act of 1996 (HIPAA). HIPAA is a broad law that called upon Congress to delineate what rights patients have to control their own medical information, and what procedures and mechanisms would be followed for appropriate sharing of that information. The result is a broad set of regulations to be followed by healthcare providers, insurers, and related organizations such as medical researchers--anyone who handles patient information.

PRIVACY IS FUNDAMENTAL

The assumption behind the protection of medical record information is that privacy is a fundamental right. In announcing the HIPAA regulations, the U.S. Department of Health and Human Services recognized that the new regulations would come at significant cost to the healthcare industry, but pointed out, "it is important not to lose sight of the inherent meaning of privacy: it speaks to our individual and collective freedom." While this may seem like lofty language, they cite the same basis many privacy organizations and advocates do--the Fourth Amendment guarantee that "the right of the people to be secure in their persons, houses, papers and effects, against unreasonable searches and seizures, shall not be violated."

To this end, HIPAA and its regulations seek to control how patient information is collected, safeguarded, and used over time. The overarching requirement is in some ways obvious: only clinicians with a need to know--and to whom you have granted access--should have access to your medical information. But the actual implementation is complex, as more information is digitized, as more systems are interconnected, and as increasingly powerful tools for querying become available.

But the real tension between privacy and usefulness stems from the basic requirement for automating patient information in the first place--to give clinicians ready access to the information they need to make on-the-spot, critical decisions. "It's a balance between confidentiality and ease of use," notes Dr. John Halamka, who as both a practicing physician and CIO for a Boston-area hospital group has a comprehensive view of the problem. In describing the tools they have developed at CareGroup Health System, Halamka talked about "including knowledge in the workflow" for an application such as order entry. Halamka offered the example of a doctor who is prescribing a hypertension drug for a diabetic, where the doctor would ideally have the patient's latest lab results as well as recent and relevant research about the medication "in the context of taking the action."

Again, while the requirement is in some ways obvious, the implementation is likely complex. To begin with, doctors operate in an information-saturated world. Primary medical research alone is a deluge of information. Halamka points out that if doctors took time out "to read eight research articles a night, they would be 800 years behind after one year." To solve that problem, Halamka's technology team at CareGroup gives clinicians access to databases such as Uptodate.com, where experts in the field read, abstract, and summarize the world's literature.

Moreover, even an individual patient's record may be lengthy and complex, and, depending on the action being taken at a given time, the clinician likely needs selected information rather than every detail about that patient. Halamka notes that the same doctor prescribing a hypertension drug would indeed want recent lab results, but would likely not need to read a summary from a recent psychological visit.

The key, then, is to provide authorized clinicians with precisely the information they need, when they need it--but only the precise information they need, so that privacy is not compromised. In an environment such as CareGroup, which deals with 40 terabytes of patient information, such careful handling requires a team of 16 data analysts who provide the necessary views, reports, and query tools for the clinicians to use. Depending on the nature of the query, some reports would need to be stripped of identifying patient information, for example, and others might need to generalize the results so no specific patient information could be inferred. In addition, Halamka emphasized the tools "need to recognize roles and rights based on clinical needs." A query that is appropriate for one clinician to perform may not be appropriate for another. Halamka noted an emergency room doctor might need ready access to a broad set of patient information. The tools, Halamka continued, "should allow you to do your job while also protecting the patient."
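
As a rough illustration of "roles and rights based on clinical needs," the sketch below filters a patient record down to the fields a given role may see. The roles, the field names, and the mapping between them are hypothetical, chosen only to mirror the examples in the text; this is not CareGroup's actual policy or software.

```python
from dataclasses import dataclass, asdict

@dataclass
class PatientRecord:
    name: str
    ssn: str
    lab_results: str
    psych_notes: str

# Hypothetical role-to-field mapping, for illustration only.
ROLE_FIELDS = {
    "prescribing_physician": {"name", "lab_results"},
    "emergency_physician": {"name", "lab_results", "psych_notes"},
    "researcher": {"lab_results"},  # de-identified view: no name, no SSN
}

def view_for(role, record):
    """Return only the fields the given role is allowed to see."""
    allowed = ROLE_FIELDS.get(role, set())  # unknown roles see nothing
    return {k: v for k, v in asdict(record).items() if k in allowed}

record = PatientRecord("Pat Doe", "000-00-0000", "glucose normal", "visit summary")
print(view_for("prescribing_physician", record))
print(view_for("researcher", record))
```

The default of "no access" for an unrecognized role is a deliberate choice in this sketch: a missing rule should fail closed rather than expose the whole record.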


IS TODAY'S TECHNOLOGY UP TO THE TASK?

Given the complexity of maintaining patient privacy in an increasingly digital world, it's reasonable to ask if the technology can support the requirement for privacy while also giving clinicians access to the information they need. Practitioners like Halamka would answer in the affirmative--"We do our very best with the tools we have"--but HIPAA compliance comes at a cost. (The Department of Health and Human Services estimates it will cost the industry $17 billion over ten years to implement the HIPAA privacy regulations.)

Some of the cost of HIPAA compliance is the human cost for the data curation work done at places like CareGroup. Other costs come as organizations integrate privacy software with patient record systems. At least one interested party, though, thinks the eventual solution to the patient privacy issue may involve a new approach to database technology itself. Researchers at IBM's Almaden Research Center in San Jose have been developing the technology behind Hippocratic Databases--databases that, according to IBM Fellow Dr. Rakesh Agrawal, support the primary mission of patient care while taking "responsibility for the data that they manage to prevent disclosure of private information."

Agrawal is widely recognized as a leading thinker in the field of datamining--the discovery of useful knowledge previously hidden in massive amounts of raw data--and has been writing about privacy issues for several years. Agrawal's idea of Hippocratic Databases presumes a system where "contracts" are created between databases and users to ensure the privacy and integrity of data. "This contract system is based on 10 principles," notes Agrawal, "including stipulations that the information will be kept accurate and up-to-date, the data is used solely for what it was specifically collected for, and the data is only retained for as long as it is needed."
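
Two of the principles Agrawal mentions, using data solely for the purpose it was collected for and retaining it only as long as needed, can be sketched in a few lines. This is an illustrative toy under assumed names (`StoredItem`, `read`), not IBM's design: each stored item carries its consented purposes and a retention period, and every read must state a purpose.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class StoredItem:
    value: str
    purposes: frozenset      # purposes the data subject agreed to
    collected_on: date
    retention_days: int      # how long the data may be kept

    def expired(self, today):
        return today > self.collected_on + timedelta(days=self.retention_days)

def read(item, purpose, today):
    """Release the value only for a consented purpose, within the retention period."""
    if item.expired(today):
        raise PermissionError("retention period has passed")
    if purpose not in item.purposes:
        raise PermissionError(f"data was not collected for purpose: {purpose}")
    return item.value

item = StoredItem("blood type A+", frozenset({"treatment"}), date(2004, 1, 1), 365)
print(read(item, "treatment", date(2004, 6, 1)))
```

A real system would also have to delete expired data and audit every access; the sketch only shows where the "contract" is checked.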

FIRST, DO NO HARM

"Whatever, in connection with my professional practice, or not in connection with it, I see or hear, in the life of men, which ought not to be spoken of abroad, I will not divulge, as reckoning that all such should be kept secret."

--Hippocratic Oath

Agrawal's interest in privacy and databases stems from his long and serious work in datamining. At various times, datamining has been viewed as problematic because of potential privacy concerns, and the topic has been frequently discussed at conferences where Agrawal was a speaker. Attending a conference in 1995, Agrawal was struck by a question from the audience, "Can technologists change the attitude that we are not responsible for the consequences of technology?" Agrawal admits, "the question stuck with me," and it motivated him to keep thinking about this issue of privacy. In Spring 2002, Agrawal and several colleagues from IBM presented a paper, "Hippocratic Databases," at the 28th Annual VLDB Conference in Hong Kong.

"We saw it as a call to the industry," said Agrawal, and the paper's introduction said, "We suggest that the database community has an opportunity to play a central role in this crucial debate involving the most cherished of human freedoms by re-architecting our database systems to include responsibility for the privacy of data as a fundamental tenet." And while patient record information is the most obvious and important problem, Agrawal is well aware that privacy extends to many other areas--finance immediately comes to mind. "Five years from now," according to Agrawal, "information about animate things in databases will completely dwarf information about inanimate things." Moreover, Agrawal suggests the logic of managing this animate information is very different, and privacy is just one issue that presents technical challenges to today's databases.

The challenges begin with how privacy clashes with some of the fundamental benefits of a traditional database, such as concurrency and recall. Databases are very good at capturing and committing records, and then immediately making these records available in views, query results, and reports. But, as Agrawal suggests, Hippocratic databases likely require more emphasis on "consented sharing" than on concurrency.

There are database technologies in use today that support privacy, but Agrawal would argue that they either don't go far enough or they don't support the kind of use cases that Hippocratic databases require. Medical researchers, for example, rely on statistical databases to provide meaningful answers to statistical questions (average, maximum, minimum, etc.) without compromising sensitive information about individuals. Statistical databases use techniques such as restricting types of queries and "data perturbation"--where noise is added or selected values are swapped. While Hippocratic databases would benefit from some of these statistical techniques, Agrawal and his colleagues point out that Hippocratic databases will need to support a much broader set of queries and usage.
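
The two statistical-database techniques named above, restricting queries and perturbing data, can be sketched as follows. Note the simplification: this toy adds random noise to the answer rather than to the stored values, and the minimum group size and noise scale are arbitrary numbers picked for the example.

```python
import random

MIN_QUERY_SET = 5   # hypothetical threshold: refuse statistics over very small groups

def noisy_average(values, noise_scale=1.0, rng=random):
    """Average with a query-size restriction and random noise added to the result."""
    if len(values) < MIN_QUERY_SET:
        raise ValueError("query set too small; the result could identify an individual")
    true_average = sum(values) / len(values)
    return true_average + rng.uniform(-noise_scale, noise_scale)

systolic = [118, 122, 130, 141, 127, 135]
print(noisy_average(systolic))
```

The answer stays useful for research (it is within `noise_scale` of the true average) while making it harder to work backward to any one patient's value.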

Security and encryption technologies are also increasingly in use with databases. Agrawal notes that databases can apply multiple levels of security to database items--e.g., top secret, secret, confidential, and so forth. To date, though, these techniques have been implemented in ways that can make query results uneven or inaccurate--a "top secret" query could leave "confidential" records unreported, for example. "Many of our architectural ideas about Hippocratic databases have been inspired by this [security] work," wrote Agrawal and his colleagues.

THE HIPPOCRATIC DATABASE

IBM's model for privacy-savvy databases may well have been inspired by the Hippocratic oath, but the principles of how to handle private information are broadly understood and articulated. Regulations in the United States and elsewhere in the world are largely based on the idea of "Fair Information Practices." These practices stem from the set of principles established in 1980 by the Organization for Economic Co-operation and Development (OECD). While the OECD delineated eight principles (which many countries have used to develop legal guidelines for the collection and use of personal information), IBM's researchers cite ten, which cover how the data shall be used, disclosed, retained, and safeguarded.

Along with these principles, Agrawal and his colleagues offer a strawman design and a set of use cases for how Hippocratic databases could be tested. The response has been enthusiastic according to Agrawal, and has bolstered his conviction that, "We can build the datamining models while still preserving the privacy of individuals." For Agrawal, it's a case of "the promise of the technology versus the risk, and the technical community can help reduce the risk."

Posted by Bill Trippe at 7:57 PM

January 4, 2004

Does Context Rule?

In the fall of 2000, there was a spate of e-book conferences--two in New York, one week apart, for example--and the same sorts of arguments about the advantages of digital content for publishers were once again trotted out. There's the lower publication costs point, together with the felling of fewer trees angle. There's the faster time to market due to virtual distribution across the Web, and because the aforementioned trees don't have to get chopped, chewed, and rolled out for printing presses. There's the "digital document is better" docket, where the tried and true search and retrieval achievements are pointed to, along with other usability improvements such as the ability to cut and paste, annotate, and customize dynamic documents. Updating information, integrating information, navigating information, and disseminating information are all part of the "digital is better" formation.

E-books are still with us, of course, but they never lived up to their hype. I remember sitting in one of these e-book conferences and trying to decide which was the better metaphor--e-book as 8-track tape, or e-book as videotext.

Just because these arguments can be mapped across a few decades--back to the online information services of the 1970s, through the first blush of CD-ROM in the 1980s, and right up to the Internet and Web and enterprise portals of today--doesn't take away from the force of these convictions. On the other hand, after the argument has been made for so long--and applied in so many different ways--there's a certain impulse to say, Been there, done that.

In fact, despite the presence of new digital content delivery platforms in the form of e-book readers, there is little new coming out of such conferences about e-books that goes much further than offering--ironically--an electronic analog of the print book. Never mind some of the new wrinkles being brought to bear in the digital publishing scene, of which digital rights management (DRM) has been thrust to the fore, right along with (and in the case of XrML, in combination with) XML-based content tagging and management systems.

As important as DRM and standards-based content management are to rational, efficient, and cost-effective document and information serving, and yes, even if that document is a book, there remains one challenge that often still comes up short: getting users of digital document systems exactly the right content at exactly the right time. While it is a great idea to get any content seeker the content he or she seeks, most of the real action of managing digital information is taking place within companies that have a real ROI interest to motivate good content handling, and among these businesses' partners and value chain participants.

Giving Content in Context

For enterprises wishing to benefit from the creation and management of content portals, the challenge is clear. Systems that manage content without managing the context fail.

Searching for content is a frustratingly difficult and easily overwhelming exercise. This is true even as search engines are increasingly bolstered by technologies and processes to help make them more effective--spiders, meta-data, standardized taxonomies, and human editorial intervention. The problem, of course, is an ever-greater avalanche of data. The projections for simply and effectively finding content are dire, and hardly a case of Chicken Little; for example, where there were fewer than 200,000 web sites in 1995, there are, a half-decade later, 22 million, and these numbers don't include most intranet sites that are closed off to Web indexing efforts by firewalls.

The Web's promise (among others) is to improve communication both within and outside the enterprise. To succeed, however, customizing the content delivered to employees, partners, and customers becomes important.

Getting Personal about Content

Personalization requires enterprises to have the means to capture information--the term "profiles" is typically used--about the information users. These profiles need to be useful in directing specific content to those profiled, which means that an enterprise also needs to know about its own content and, if used, third party content.

There are many elements that can be used to deliver content in context. These include:

· Registering content meta-data for Web and enterprise-wide search engines
· Implementing effective search engines (e.g., relevancy)
· Collecting and managing profiles of site users (e.g., personalization engines)
· Creating and maintaining taxonomies of content (e.g., subject classifications)
· Identifying communities of interest (e.g., portals)
· Enabling pass-along content delivery (e.g., superdistribution using DRM)
· Sending email content offers/links to profiled users

Some companies rely on powerful search engines that possess tools such as relevancy ranking, natural language query, built-in thesauruses, contextual hit results, and other improvements to electronic searching. Other companies simply rely on self-selection by their content users, on the assumption--as in the case of many enterprise and vertical portals--that the focus of the site carries enough implied context. The more effective solutions, of course, are those that use as many contextual content delivery strategies as possible.

The more robust, detailed, and accurate the meta-data, the easier it is to find content in huge content bases, return hits, and serve the content itself. If such search effectiveness is tied to content delivery mechanisms and to personalization profiles that track a content user's interests and requirements, content delivered in context becomes powerful indeed.
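
As a rough illustration of tying meta-data to profiles, the Python sketch below ranks content items by how many subject terms they share with a user's profile. The class names, the sample records, and the simple overlap-count scoring are all assumptions made for illustration; they are not drawn from any particular search or personalization product.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    title: str
    subjects: set = field(default_factory=set)  # taxonomy terms from the item's meta-data


@dataclass
class Profile:
    user: str
    interests: set = field(default_factory=set)  # terms captured about the user


def rank_for_profile(profile, items):
    """Return items sharing at least one term with the profile, best match first."""
    scored = []
    for item in items:
        overlap = len(item.subjects & profile.interests)
        if overlap > 0:
            scored.append((overlap, item))
    # Sort by overlap, largest first; ties keep their original order (stable sort).
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored]


# Hypothetical sample data.
catalog = [
    ContentItem("DRM primer", {"drm", "security"}),
    ContentItem("XML tagging guide", {"xml", "content-management"}),
    ContentItem("Portal design notes", {"portals"}),
]
reader = Profile("pat", {"xml", "content-management", "drm"})

for item in rank_for_profile(reader, catalog):
    print(item.title)
```

Here the XML guide ranks first (two shared terms), the DRM primer second (one), and the portal notes are left out entirely. A real system would weigh far more than term overlap, but the dependency is the same: the match is only as good as the meta-data and the profile behind it.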

But for enterprises today, perhaps the biggest benefit is gained by mastering how enterprise content can be served into specific contexts within the business process and partner chains, to deliver more on the promise of automation. Look for such management of content (which could be called "syndication") to play a growing role in tying the information of the enterprise to the many different parts of the enterprise's business actions.

(My thanks to David Guenette, who collaborated with me on an earlier version of this article.)

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 10:21 AM

November 25, 2003

Build vs. Buy in Content Management Systems

The listserv cms-list has had an interesting discussion lately on build vs. buy when it comes to content management systems. As one poster correctly noted today, build vs. buy is a simplistic way of stating the question. Most CMSs require extensive customization, and the work to make a CMS work for you needs to be thought of, at the least, as a substantial extension to a given CMS technology.

I wrote a white paper about the build vs. buy question a couple of years ago, but I think it still holds up fairly well. It was written for a particular vendor (Enigma) to distribute, but most of it is neutral.

Posted by Bill Trippe at 8:57 PM | Comments (2)

November 18, 2003

Important Emerging Trends in XML and Content Management?

For an upcoming Gilbane Report article, I am going to be writing on important trends in XML and what impact they will have on content management technology. Part of this has been spurred by the blizzard of announcements coming out of the W3C this month, including updates and last calls related to XQuery, XSLT, and XPath.

It is also driven by all the product announcements, and recent improvements and changes to various core open source tools.

I would love your thoughts on this. What is important among all of the recent announcements and changes? What will have an impact on content management, and what will not?

Posted by Bill Trippe at 3:42 PM | Comments (3)

October 27, 2003

Does Personalization Do Anything Useful?

I interviewed user interface expert Jared Spool a couple of years ago for a now defunct Web site. I really like Jared's ideas on all things technical, so it was a pleasure to discuss personalization with him. The full text of the original article follows.

For Jared Spool, Purpose, not Personalization, is All

Poorly conceived efforts to personalize can result in clumsy, off-putting sites

Jared Spool is a long-time--and highly sought after--advocate of taking time and care in designing human-computer interfaces. His Haverhill, MA-based company, User Interface Engineering, has been advising and training development teams since 1988 and is now 20 people strong. Spool and his team provide primary research, publications, and training for software developers involved in interface design. With the massive build-out of the Web, User Interface Engineering has turned its substantial resources to understanding how users interact with Web sites.

Spool himself conducts some of the training classes, and has a vast and ready arsenal of examples, ideas, and anecdotes of things done well and ill. We talked recently about the many efforts sites have made to personalize the visitor's experience.

One Fact Does Not Tell Much

"The thought of personalizing based on one or two known facts..." Spool begins, and then leaps to an example. "A person is walking down the mall and slows down to look in the window of one store. Will they slow down at the next similar store? If they happen to go into one store that sells dishes, will they go into every store that sells dishes? Will they ever buy dishes?" In other words, Spool is advising, one fact does not tell us much, if anything, about that person's future behavior.

UIE has researched shopping behavior both in malls and online. They've learned that even what people expressly declare they are interested in does not have a determinative effect on what they will do. "We observed one woman go to Crate and Barrel specifically to buy napkins. She never bought napkins. Instead, she happened upon a pitcher she liked, and bought it and the set of matching glasses. Had Crate and Barrel shown her only napkins, they may have ended up with no sale."

These two examples go directly to the two predominant types of information that Web sites use to personalize:

1. Information that can be inferred about a user based on, for example, click stream behavior, and

2. Explicit information that the user might agree to share with the web site.

We don't, Spool would contend, end up with enough information to fully understand the user's needs. And even if we did, we probably wouldn't know what to do with it. Web developers are coming face to face with a fundamentally complex problem--trying to determine what specific information a user might want at a specific point in time.

"Take Amazon.com, for example," says Spool. "A friend was pregnant two years ago, and purchased a book there related to pregnancy. Two years later, every time she goes to Amazon.com, they suggest baby naming books to her." Spool then lets the thought sink in for a minute. Of course, the baby is now a toddler and indeed already has a name. It's a single example, but a telling one for Spool, one that illustrates how the inference made from one piece of data has clearly led to an off-putting result.

Watch out for the Daily Special

Amazon.com is not alone in clumsily offering recommendations and specials. Spool's examples include the gardening site that won't sell you certain plants based on your zip code, apparently out of fear the climate may kill them. Another favorite example is the pharmacy site that will recommend specials on a certain medication, despite the fact that you've completed a lengthy on-line questionnaire and included information that you're allergic to that very medication. "Apparently," Spool deadpans, "It's ok to kill yourself but not your plants."

Spool's examples are catchy and often very funny, but they also make the point about personalization. "I'm not sure we know enough how to do it," says Spool. "And when we do it wrong, it's at best an annoyance."

The problem is what Spool calls the "indiscriminate attention" a site may end up paying to certain information. "It's nice at times for a third party to pay attention to certain interests and make recommendations. For example, if you go to a restaurant regularly, and a waiter knows you like a certain salmon dish that is sometimes available as a special, it's nice for the waiter to point that out. But you probably don't want that same waiter to start commenting on your recent choice in friends."

Other sites have adopted the approach of always fronting certain information, presenting the user with daily specials and offers, regardless of whether the user is there for such information. In fact, UIE's research shows that users almost always bypass that kind of information in favor of the things that interest them. "It's like running into a restaurant needing the restroom," says Spool. "And having the waiter insist on reading you the daily specials."

The lesson? Specific information about a user could yield some useful functionality. For example, it might be helpful for a floral web site to track the types of flowers you have sent, and then warn you not to send the same ones twice (or to suggest one that you've indicated was well received). But one or two facts, Spool suggests, don't merit redesigning the whole Web site. This is probably most true in complex applications involving knowledge workers. "You would have to predict what they need at any given time," says Spool. "And the odds of you being psychic are slim."

Ask the Right Question

For Spool, the way to tackle personalization isn't to start with the question, "What can we personalize?" The right questions to ask are, "What does the user need to see right now? What information does the user need?"

Spool offers another example. "Take a user coming to eBay. If that user has some bids on some items pending, it would be handy for the first screen they see to be a summary of their current bids. Which ones are being outbid? How long do the auctions have? If there are certain items that they've indicated they are always on the lookout for (personally, I'm a big fan of the 'Elvis Pezley PEZ dispenser'), eBay could display new items and have an easy way for them to place opening bids on those items."

Spool's final advice is that the ultimate solution may not even have to be expensive or particularly complex. "Watching users come to your site frequently would give you some ideas on the patterns that show up in their regular visits," says Spool. "Sometimes it will require personalization technology to optimize those visits, but often it can just be done with cleverly changing the content, without any real sophisticated tools."

Posted by Bill Trippe at 3:08 PM

October 14, 2003

EMC and Documentum? Bolt from the Blue?

I honestly cannot say I would have predicted that EMC would acquire Documentum, but I wasn't entirely floored either. There are several ways of looking at this as sensible, at least from EMC's perspective:


Perhaps most significantly, the combined EMC-Documentum will make sense to a significant number of prospective customers who see the enterprise content management problem as a storage problem first. This is especially true of those organizations and individuals who come to enterprise content management with an orientation toward records management and archive management. Such prospective customers will understand the combined offering more quickly than will some others.

Posted by Bill Trippe at 12:34 PM

October 13, 2003

Applications of Internet Publishing

At the request of Mark Cummings, VP and Publisher at Scholastic Library Publishing, I was a guest lecturer at a class he is teaching at NYU, Principles and Applications of Publishing on the Internet. The class has been delving into some real nuts and bolts--how a reference publisher, for example, goes about digitizing and structuring their content for effective publishing on the Web.

It was interesting to speak to a group of graduate students, some of whom are already working in the field and some of whom hope to. As I said to them, I spend so much time speaking with other technical people in the field, I am guilty of speaking too much in the jargon of the industry.

They are using The Columbia Guide to Digital Publishing as a text. Mark Walter and I co-wrote the chapter on content management.

If you would like to see the slide presentation from the NYU talk, you can download it here. My thanks to reader Brian Casey for taking the PowerPoint and creating the PDF.

Posted by Bill Trippe at 1:45 PM

October 6, 2003

Random Entries I Could Have Written

I have been very busy with some client work recently, so the blog has lagged behind some thinking I have been doing. Among the topics that I have been considering lately:


Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 4:55 PM

September 18, 2003

278 E-mails, and Nothing's On

Is e-mail dead? I tend to not think too much of e-mail in the context of content management, but I do have clients who are interested in at least managing e-mail archives. But, in truth, is it really worth saving?

As of 2:00 today, I had received 130 e-mails in my primary (business) e-mail account. 74 of them were successfully identified as spam and filtered into a separate Microsoft Outlook folder. Another eight were infected with a virus, and were trapped by my anti-virus software and deleted. Another dozen were bounces from AOL e-mail accounts that had been spammed by someone else using my e-mail address. This leaves 36 legitimate business e-mails. I have a second business account that has excellent server-side spam filtering software. Every three days it sends me an e-mail showing me the (typically) scores of spam it has trapped and quarantined. In the same period of time, I may have received 20 legitimate e-mails.
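
For anyone who wants to check the tally for the primary account, here is the arithmetic as a few lines of Python. The numbers come straight from the paragraph above; the variable names are just mine.

```python
# Figures for the primary account, as of 2:00 that day.
total_received = 130
spam_filtered = 74
virus_infected = 8
aol_bounces = 12  # "another dozen"

# Whatever is left after the junk is removed counts as legitimate mail.
legitimate = total_received - spam_filtered - virus_infected - aol_bounces
legitimate_share = legitimate / total_received

print(f"Legitimate e-mails: {legitimate}")
print(f"Share of all mail received: {legitimate_share:.0%}")
```

That works out to 36 messages, or a little over a quarter of what arrived.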

I'm tempted to ask, "What the hell is wrong with this picture?" but a more accurate question might be, "Does this work at all?"

I manage, through significant effort and expense on my part, to make email work for me and a small number of e-mail addresses that I manage for others. But the outcome of all of this effort is, at best, a handicapped medium. Before e-mail can be a useful part of enterprise content, it must first be, by and large, useful content. Right now it isn't.

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 3:59 PM | Comments (1)

September 16, 2003

How Much Content is Structured?

The conventional wisdom has recently been that 20% of content is structured, and the rest is unstructured. This factoid may be loosely borrowed from an older number, which is that 20% of an organization's data assets are in some kind of structured form (typically relational database tables). I wonder if either number really holds up.

I think it is the rare organization that has a great deal of "document" content in a structure such as XML (or SGML for older materials). In my experience, most large collections of XML-tagged content represent one silo of an organization's data (the technical manuals in a manufacturing company, or the catalog data in, say, an electrical supply company). Some legacy content may never end up in a structured form, but the toolsets are almost there to allow us to ask the question, "Should all new content, born digitally, be structured?"

I'm not sure I am ready to answer the question. Are you?

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 11:01 PM | Comments (2)

September 15, 2003

More XML-Based Publishing?

Seybold was a very busy week for me, so I didn't get a chance to step back and really think about what trends seemed to be represented there. However, it does seem like there is more XML-based publishing going on. And this includes publishing to print, through desktop engines such as QuarkXPress and Adobe InDesign.

The conversations I had on the show floor seemed to indicate this expanded emphasis on XML comes largely from the requirement for simultaneous output to print, the Web, and other electronic formats. Nothing new there, of course, but the reality seems to be setting in that multiple output publishing is here to stay. As I have said elsewhere, conventional wisdom says everyone's "second business is publishing"; now everyone's second business is multiple output publishing. So, if that is the case, and XML does the job, it follows to use XML, doesn't it? Not always, of course, but apparently in more and more cases.

I hope that this new emphasis doesn't lead people down a path of complex, nearly impossible implementations of XML. Most documents can be supported by very simple XML Document Type Definitions (DTDs) or schemas, some of which are already in the public domain. (Some of the public domain ones are over-engineered and difficult to implement, so be careful there as well.) Keep the initial implementation very simple, starting with a pilot and going from there.
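
To give a sense of how little structure a pilot needs, here is a minimal Python sketch that takes one small XML document and produces two outputs from it, an HTML fragment and plain text. The element names (article, title, para) and the sample text are invented for illustration and do not come from any published DTD; the sketch also skips validation, which a real implementation would want.

```python
import xml.etree.ElementTree as ET
from html import escape

# A deliberately simple, hypothetical structure: a title plus paragraphs.
SOURCE = """<article>
  <title>Keeping XML Simple</title>
  <para>Start with a pilot.</para>
  <para>Add structure only as needed.</para>
</article>"""


def to_html(root):
    """Render the article as a minimal HTML fragment."""
    parts = [f"<h1>{escape(root.findtext('title'))}</h1>"]
    parts += [f"<p>{escape(p.text)}</p>" for p in root.findall("para")]
    return "\n".join(parts)


def to_text(root):
    """Render the same article as plain text with an underlined title."""
    title = root.findtext("title")
    lines = [title, "=" * len(title)]
    lines += [p.text for p in root.findall("para")]
    return "\n".join(lines)


root = ET.fromstring(SOURCE)
print(to_html(root))
print()
print(to_text(root))
```

The point is the shape rather than the code: one source, three element types, and each additional output format is another small function.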

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 10:59 PM

September 11, 2003

CMS, DAM, and Hosted Applications

I have spent some time at the show with Crownpeak, Atomz, and eMotion. The first two are Application Service Providers (ASPs) for content management, whereas eMotion is an ASP for digital asset management. When I start to think about the ASP option, I sometimes find myself thinking that everyone should simply do it; the argument seems so compelling.

Of course, it doesn't make sense for everyone to use a hosted application for content management. At certain ends of the market, it might not make financial sense (if you are either very small or very large, or if your content, workflow, and/or business requirements are inordinately (and necessarily) complex). But the track record for CMS implementations is still relatively spotty, so why not pay someone else to perform a set of tasks and functions for you at an agreed price? Sometimes it makes too much sense.

Posted by Bill Trippe at 7:06 PM | Comments (2)

August 19, 2003

Content Management and the Enterprise

The growth of content management technology has far outpaced the technologies that were forebears to content management, such as document management and knowledge management.

Perhaps more significantly, content management has a broader and more important role in organizations than these other technologies. In the case of document management, the technology was often relegated to departmental roles; in the case of knowledge management, it often failed to move beyond pilot installations. Content management is taking on a central role within the enterprise.

Indeed, the very term “content management” has been edged out by “enterprise content management” as analysts, journalists, and IT professionals draw more and more direct connections between the content of the organization and the many, varied applications and interfaces that enterprises are deploying around an Internet-based infrastructure.

The bottom line is content management solutions must deal with a broad spectrum of challenges—beginning with the effective management of many types of content, and ending with the integration of this content into a wide and growing number of applications.

Content is becoming part of the complex synthesis of once-independent business processes, even while these business processes shift outside the enterprise’s four walls, and move towards integration with those of its customers, partners, and suppliers. Furthermore, enterprises are making not just content but also operational data available to consumers, customers, and business partners via the Web, in what many call the “extended enterprise.”

The Internet has made the Web into a huge business community, and with the development of the Java programming language and standardization on XML for data interchange, the movement of content and data from back office applications onto the Web and from the Web into back office applications has increased tremendously. When it comes to the integration of content management in the new extended enterprise, two fundamental questions arise:

· How can the enterprise leverage its existing infrastructure, applications, and data?
· How can the enterprise pursue content management integration with least effort and most success?

While much is being claimed for the “extended enterprise,” the fact is that for very many businesses today, content management needs remain simple. These companies’ needs can be met by literally hundreds of products available today, ranging from application service providers that, in effect, create, manage, and serve a company’s content for them, to basic HTML tools such as Microsoft FrontPage and Macromedia Dreamweaver that help Web designers and Webmasters create Web pages for serving through the company’s own Web server, to standalone Web content management platforms that provide much higher functionality, a wider range of contributor interfaces, and workflow and versioning administration.

But because of the growing role of the Internet in commerce, enterprises cannot stay with simple content management without peril. It behooves the strategically minded enterprise to look at content management tools that not only suit relatively simple content environments today, but can also anticipate and address the more complex challenges of fusing content and other business processes.

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 5:50 PM

August 15, 2003

XML and Content Categorization

I have begun researching an upcoming article for EContent Magazine on the role that XML is playing in content categorization tools and approaches. This seems to be a quickly widening and changing world. I am sure I will touch on certain core approaches such as RDF; I am not so sure yet if I will be talking about Topic Maps.

The larger vendors in this space seem to include Autonomy, NStein, and Verity on the tools side, and content management vendors such as Documentum and Stellent. It will be interesting to see what kinds of customer case studies they will be able to provide for the research and validation. It will also be interesting to see if XML is core to their categorization approach, is one of several approaches, or is an output or byproduct of their approach.

Some of my early research, and a reference from colleague Bob Boeri, points me to a Medical Subject Headings system (MeSH), and what looks to be a pretty comprehensive effort at the National Library of Medicine to create XML-encoded public databases that use MeSH encoding. Such databases are beginning to answer the "chicken and egg" problem XML initiatives have encountered. Many initiatives represent great ideas and well thought-out approaches to implement the great ideas, only to wither on the vine for lack of real data. Such critical masses of data seem to be emerging in areas such as scientific research and financial services.

Some other resources I will be exploring include the following:

(These last two were both written by Eric van der Vlist, cited on the www.xml.com Web site as an ODP editor and publisher and editor of XMLfr.)

Posted by Bill Trippe at 6:00 PM

August 14, 2003

Welcome

Web sites suffer from a number of maladies, but the most common one, by far, is atrophy. I have battled this in my own practice. I am so busy advising others on how to do things with their content, that I never get around to maintaining my primary Web site nearly as much as I should. Articles that I wrote last month won't appear as a link until months from now. Meanwhile, two year old articles—some of them hopelessly outdated—continue to be prominently featured.

So I have decided to try using a blogging tool as a means of keeping my primary site up to date. In addition, I will be trying some new features, many of which are only in the idea phase.

As a consultant and writer, I am fascinated with how much content the average person creates and consumes in the normal course of doing business. For example, I purchased a new notebook computer and began using it four months ago. As of today, the "My Documents" folder contains 2439 files totalling 349 MB of data. My quick analysis tells me that about 150 MB of that content was transferred over in bulk from another computer. The remaining 200 MB has been created or accumulated by me in the course of my work.

That's a lot of content.

One project is pretty typical, I think, of the constellation of content that one creates, consumes, or otherwise accumulates in the course of doing work. For a report that I am writing, I have accumulated 88 files, spread over three folders, amounting to 14 MB of information. My written output—including correspondence, outlines, summaries of interviews and research, and various drafts of the document and its sections—totals 2 MB. The remaining content is source material that I am