September 8, 2008

Amazon Video on Demand

People are often so focused on Google's plan for world domination that they fail to notice how much content distribution capability Amazon is developing. Today they announced a video on demand service.

As my friend and Gilbane colleague David Guenette has noted, wouldn't it make sense for Kindle to be the device for all this content?

Posted by Bill Trippe at 8:24 AM

September 5, 2008

Second Life Scripting

I see from LinkedIn that a friend of mine, Michael Thome, has co-written Scripting Your World: The Official Guide to Second Life Scripting. Congratulations, Mike! And all you Second Life users, click here and buy one now.

Posted by Bill Trippe at 4:42 PM

August 30, 2008

Acrobat and Flex Books, Software, and Resources

I finally got around to updating my Acrobat and Flex aStore to reflect the latest releases of Acrobat and Flex. One sign of a robust software business is an active program of independent writing and publishing around the products, and Adobe has plenty of that.

Posted by Bill Trippe at 1:34 PM

January 10, 2008

Contract Developer Needs: Moodle, Arbortext Editor

Two colleagues of mine are looking for some medium- to long-term help on projects.

One needs a Moodle developer. They would like to find someone to first do some quick consultation on whether their project is feasible to build with Moodle, and if so, they will then need help with creating the necessary extensions. The company is in greater Boston, and they would like the developer to be available to visit the office from time to time as the project progresses.

The other needs a developer who has experience customizing Arbortext Editor--developing DTDs or schemas, developing stylesheets, and supporting the overall implementation.

If you are experienced in either of these areas, please email me.


Posted by Bill Trippe at 6:15 PM

January 3, 2008

The Kindle Digital Text Platform

I was rooting around on Amazon the other day, seeing what other kinds of (non-book) content were available for the Kindle, when I discovered the Digital Text Platform that Amazon has made available for publishing content in Kindle format. "DTP" is listed as Beta, but I found it functional and easy to use. Basically, you create all the metadata for the title, including pricing information, and then upload the content for conversion to the Kindle format. To test it, I created an eBook out of a series of articles I have written on content management and XML. They seem to want HTML ("The preferred format for uploading content is as a single HTML file"), but I got impatient when I then read that you need to assemble linked images in a zip file using special instructions. So I went with a single Word .doc file ("standard .doc files will often convert without a hitch"). For the most part, it did convert without a hitch, though it did a woefully bad job with a small number of very simple tables. To work around that, I simplified a couple of the tables and deleted the others. In fairness to Amazon, I worked quickly, and could have experimented with HTML tables.

If you're a Kindle owner and happen to buy the title, I would love to hear from you about the experience. Since I don't own a Kindle yet, I had to rely on the preview capability in DTP, which basically gives you an HTML view of the content.

From the introduction to the eBook:

The following articles, white papers, and blog entries were written between 2000 and 2006. They appeared in one of several publications: The Gilbane Report, eContent Magazine, E-DOC Magazine, or Transform Magazine. Some appeared in my blog, www.billtrippe.com, or its predecessor blog, Ideas in Technology and Publishing. I undertook this compilation as an experiment in working with the beta version of Amazon.com's Digital Text Platform for creating content for the Kindle eBook reader.

I only edited the material lightly, so the articles are showing their age in places. Some links are likely out of date, some product references may be to versions of products that have since been superseded, and at least one product, XMetaL, has changed corporate ownership at least once since first written about in one or more of these articles. However, I chose these articles from many, many others I could have chosen because the material is evergreen and still useful, I think. I stand by what has been written here, especially for the price!

Posted by Bill Trippe at 8:56 AM

December 19, 2007

Meanwhile, Over at Facebook

I mentioned before that I have been trying the Facebook thing, but kind of don't get it. I am willing to chalk it up to a generational thing. But then Dave Kellogg, CEO of Mark Logic, posted this video to my wall, and it all makes sense now.

Posted by Bill Trippe at 9:54 AM

December 18, 2007

Call for Papers: Gilbane San Francisco 2008

They are now accepting proposals for panel participation and presentations for Gilbane San Francisco 2008, to be held at the Westin Market Hotel, San Francisco, June 17 - 19, 2008.

Join content and information technology's leading analysts, IT strategists, and technologists at the industry's most popular and important conference this coming spring. Share your expertise and experience, and network with forward-thinking implementers and thought leaders.

How to be a speaker

Choose a topic area from the list below and see how to submit a proposal. The deadline is January 15, 2008. Topics to be covered in-depth include:

If you've never been to one of the Gilbane events and want to see what we have been covering in our conferences, check out the programs from the recent, hugely successful Gilbane Boston 2007 and Gilbane San Francisco 2007.

Posted by Bill Trippe at 10:14 AM | Comments (1)

December 4, 2007

Meanwhile, Over at Gilbane...

Tomorrow, I will be part of a webinar, What Every Publisher Needs to Know About Content Management. It's being put on by Book Business Magazine and sponsored by Follett Digital Resources. Matt Steinmetz, Special Projects Editor for Book Business, will be moderating, and I will be joined on the virtual dais by Jabin White, Vice President for Product Management at Silverchair.

I'm going to present a market overview, offer some definitions, and discuss some recent and emerging trends. I'm going to leave most of the heavy lifting to Jabin, though. He is truly one of the smart guys in the business and an excellent presenter, and I am looking forward to hearing what he has to say.

You can go right to the registration page here.

Posted by Bill Trippe at 8:40 PM

November 30, 2007

Kindle Still "Sold Out"

I keep seeing references to Kindle being sold out, but I have yet to find a figure for how many have sold. The main Kindle page at Amazon now says you won't get one by Christmas. This seems like a problem to me--missing Christmas sales and also not even promising a specific ship date.

Posted by Bill Trippe at 4:04 PM

November 28, 2007

Social Networks

I am at the opening keynote at Gilbane. The speakers:

There is quite a bit of discussion on social networks.

I just passed 500 connections on LinkedIn. I mention this because I have found LinkedIn to be a valuable resource. It's a great way to keep in touch with colleagues, especially if they are also active users. I have found long-lost colleagues and friends, made useful connections, helped other people make useful connections, and even found projects and prospects there. I compare this with Facebook, which I joined more recently. Facebook is a powerhouse, no doubt, and there seems to be an endless number of applications and activities there. But I guess I am an old fart. I don't get half of the apps, and I don't like the default behavior, where every new app, and even every action in every app, asks your entire network to do the same thing with that app--take the same movie quiz, answer the same question, and so forth. It strikes me as the equivalent of forwarding the same email to every person in your contact list. Of course, you don't have to ask every contact to do something--you can select some or one or none. You can even do nothing with any of the applications, which is what I tend to do.

I don't know what the effect of Google's OpenSocial initiative will be. Conventional wisdom seems to be that it won't make a dent in Facebook, and, aside from LinkedIn, the founding members seem to be a who's who of failed social networks, including Google's own orkut. And, generally, I am deeply skeptical of anything Google does outside of consumer search and pay-per-click advertising. But assuming not everyone in the world will join precisely one social network, doesn't it make perfect sense for these networks to have a common API?

Posted by Bill Trippe at 8:46 AM | Comments (1)

November 19, 2007

Amazon Kindle

Amazon debuted Kindle, its eBook reader, today. I haven't seen it yet, of course, but I'm impressed by the number of titles they have available at launch. And the price points--NYT bestsellers at a standard price of $9.99.

Lots of interesting details about the feature set as well as the complementary content, like Wikipedia, newspapers, blogs. Another detail, reported by CNET, caught my eye:

Kindle, which was manufactured by an undisclosed Chinese original equipment manufacturer, connects to its specialized Amazon store via an EV-DO (Evolution Data Optimized) cellular network through "Amazon Whispernet," built atop Sprint's EV-DO network. No data plan or monthly bill is required. "We pay for all of that behind the scenes so that you can just read," Bezos said, adding that he estimated that it would take "less than a minute" to download a book.

If it is really that easy to use and keep up to date, they are on to something.

WSJ.com has a blog roundup (subscription), and proving that Kindle seems to be real news, it even made All Things Considered. And, last but not least, PW weighs in.

Posted by Bill Trippe at 12:52 PM

November 15, 2007

Microsoft SharePoint and ECM: Ready for Primetime?

If you are interested in SharePoint for ECM applications, the webinar I recently did for Gilbane is now recorded and available on the website of the sponsoring company, KnowledgeLake.

Posted by Bill Trippe at 8:27 PM

November 13, 2007

Adobe Management Changes

Adobe's CEO Bruce Chizen steps down, and the market is reacting. But Adobe also said Monday that "fourth-quarter sales would be near the top end of its guidance of $860 million to $890 million."

Oh, for the record, I don't own any stock or have any other financial interest in Adobe. As a rule, I avoid investing in companies that I cover or might do business with--partly to avoid a conflict of interest but also because I am terrible at picking tech stocks. They either go bust, or I sell them at a small loss or profit the second before they take off like a rocket.

Posted by Bill Trippe at 12:37 PM

November 7, 2007

What kind of computer do you have?

A white one.

You wonder if these are too good to be true, but they are still funny.

Posted by Bill Trippe at 10:44 AM | Comments (1)

November 6, 2007

Microsoft SharePoint and the Enterprise Content Management Market

On Thursday, I will be doing a webinar on SharePoint and ECM. I wrote a bit about the topic over at Gilbane (and we have a white paper on the topic coming out shortly). If you are interested in attending the Webinar, you can register over at KnowledgeLake, the company sponsoring the webinar.

Posted by Bill Trippe at 8:44 PM

November 1, 2007

XForms and Rich Text Editing

Over at Developerworks, Steve Speicher and Andy Smith show some approaches for adding rich text editing controls to an XForms application.

By following some of the integration rules defined by XForms, XBL, and a rich text editor, the end result is a simple and powerful addition to the XForms set of controls. This can further enable the application of XForms in a variety of applications, such as blogs, e-mails, social networking sites, and more. These can then leverage the built-in capabilities of XForms for validation, XML submission, declarative programming, and more.

This kind of thinking reminds me of some of my early thinking about XForms in particular and XML-based forms in general. When does a form end and a text editor begin?

Posted by Bill Trippe at 3:16 PM

October 24, 2007

Docmetrics Trial: Free $250 Credit

I've mentioned protectedpdf from Vitrium Systems in the past. I saw a demo and was impressed. They now have a companion technology, docmetrics, which allows you to measure reader behavior. They have a free docmetrics trial if you are interested.

Posted by Bill Trippe at 8:12 PM

October 19, 2007

Does XForms Technology Have Momentum?

I have a few Google news and blog alerts that help me keep track of some technologies of interest. One is for XForms, which I receive as a daily digest, and I always get something every day, usually four or five items, almost all from blogs. Almost every item is technical and fairly in-depth, usually about something the blogger is prototyping or developing. I compare this to my alert for InfoPath, which doesn't come every day, and the items that do trickle in are rarely technical. Usually they are PR about a product, where InfoPath is mentioned in a list of technologies that the product works with. In fairness to Microsoft, I just played with a search for "forms services" in blogs, and got more hits from that, so I will set up an alert. Interestingly though, in Google blog search, I get a total of 2,016 hits for "forms services" and 35,585 for XForms.

In today's XForms alert, John Boyer of IBM offers some ideas for talking to C-Level types about XForms. For John, the business value of XForms comes down to this:

Read the whole thing.

Posted by Bill Trippe at 9:28 AM

October 13, 2007

Here and There

Posted by Bill Trippe at 10:08 PM

October 4, 2007

Meanwhile, Over at Gilbane...

The sessions that I have been organizing on enterprise publishing technology have been coming together. For the session on DITA and related standards like S1000D, we have Bob Doyle of the Boston DITA Group and Don Bridges of Data Conversion Labs. We have another speaker from industry who will be talking about S1000D, but he is still awaiting the go-ahead from his corporate communications folks.

For the session on multi-channel publishing, John Parsons, Editorial Director of The Seybold Report, will be moderating, and two speakers are on board, again with a third likely to be joining soon. Rich Pasewark, a former colleague of mine from XyEnterprise and more recently with Quark, is working independently now on some very interesting projects. The second speaker is Mark Laroche, who is Director of Production for Digital Media at Random House. He is going to be talking about some very forward-thinking work they have been doing with the Fodor's travel guides.

Finally, for the metadata session we have two speakers, with a third to be announced shortly. We were very happy to talk our client Richard Ferrie from Pearson into speaking. Rick is Senior Vice President, Publishing Operations and Content Management for all of Pearson, and has some top-level lessons learned on what works and what doesn't in bringing metadata into publishing workflows and systems. Gilbane analyst Bill Rosenblatt will also be speaking, bringing his perspective on metadata efforts at some of the largest publishers and media companies out there.

Keep an eye on the conference session descriptions page and the Gilbane events blog as we add new speakers and elements to the conference.

Posted by Bill Trippe at 9:05 AM

October 3, 2007

Back, I Think

I had some problems with my Movable Type installation, which led me to upgrade to MT 4, but only after I had migrated to a new server at my hosting company.

Fun, fun, fun!

Posted by Bill Trippe at 2:53 PM

August 31, 2007

DCL's DITA Test Drive

Over at The Content Wrangler, Scott Abel shares his enthusiasm for the "DITA Test Drive" offering from Data Conversion Labs.

Sometimes the sheer volume of information on the internet is overwhelming. Even with the help of Google Alerts and RSS feeds, it’s easy to miss interesting news. That’s likely the reason we failed to notice this especially interesting offer from the folks at Data Conversion Laboratory (DCL). It’s called the DITA Test Drive Challenge, a program that allows content-heavy organizations a shortcut to DITA. For $3000 (okay, $2995, technically), DCL will convert 500 pages of legacy content to DITA and perform a Content Reuse Analysis on 2500 pages of legacy content. Wow! That’s quite an offer. Why would you want to take advantage of this offer? Because there’s a dirty little secret in XML authoring land. It’s next to impossible to evaluate an XML authoring tool without actually using some of your own content in it. Testing an XML editor with your own content will help you avoid selecting the wrong authoring tool for your organization. Those who skip this step generally purchase software based on the opinions of others and sometimes after having downloaded a free trial version of the software (which is pretty useless without your own DTD and some real content).

Posted by Bill Trippe at 8:39 PM

August 20, 2007

Unicode and Microsoft Internet Explorer

A scientific publishing client writes:

"We are making great progress converting all our documents to HTML (from SGML). One challenge we are facing is how to convert Unicode character entities into characters displayable in Internet Explorer. It appears that Netscape and Firefox work much better than IE in displaying Unicode. One option is to create glyphs for all of the non displayable characters; but, there are hundreds of them and that is not realistic for us.

Do you know how other publishers are handling the display of these special characters? If the characters appear in display equations, we are creating gifs. Our challenge is for those characters that appear in text, which are now displaying as boxes in IE. For example, the entity bsime is used for similar or equal to. Unicode represents this as &#x22CD; and it should display as ⋍ (Editor's note: you are seeing this if you are viewing this in Firefox or Netscape!)

Are there plug ins or sites that have all the glyphs or does Microsoft have special setups, etc? We have the same question out to a few of our vendors to see if they can help as well. This has become the critical path for us."

Thoughts?
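One small piece of the conversion the client describes, turning SGML entity names into references a browser can interpret, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the one-entry table are hypothetical, and a real entity set has hundreds of entries. The one mapping shown, bsime to U+22CD, is the example from the client's message. Note that this only normalizes the markup; whether a given browser then finds a glyph for the character is a separate question.

```python
import re

# Illustrative table only (an assumption): a real SGML entity set has
# hundreds of entries. bsime -> U+22CD is the example from the message.
ENTITY_CODEPOINTS = {
    "bsime": 0x22CD,
}

# Matches named references such as &bsime; but not numeric ones like &#x22CD;
ENTITY_PATTERN = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def entities_to_numeric(text, table=ENTITY_CODEPOINTS):
    """Replace known named entities with hexadecimal character references.

    Names missing from the table are left untouched so that they can be
    reviewed by hand rather than silently dropped.
    """
    def replace(match):
        codepoint = table.get(match.group(1))
        if codepoint is None:
            return match.group(0)
        return f"&#x{codepoint:X};"

    return ENTITY_PATTERN.sub(replace, text)


if __name__ == "__main__":
    print(entities_to_numeric("a &bsime; b"))
```

Leaving unknown names alone is a deliberate choice here: it keeps references such as &amp; intact and makes gaps in the table easy to spot.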

UPDATE:

I forgot to post this a while ago. My colleague Marc Dashevsky worked with the client and they came up with the following:

In short, the problem is solely with Internet Explorer V6.x. The Mozilla-based browsers and Internet Explorer V7.x all display the same subset of Unicode. Following is a description of the testing.

He set up his system as follows:

* The font Arial Unicode MS was already installed on his system.
* He explicitly set, in Internet Explorer, Arial Unicode MS to be the Web Page Font (Tools->Internet Options->Fonts).
* He ensured that Internet Explorer, Firefox and Netscape were all using UTF-8 encoding.


He then visited a web page that lists many characters in ISO 8859-1.

Just as the client had experienced with uncommon characters displayed in its HTML pages, on this page the Mozilla-based browsers and Internet Explorer V7 displayed many characters not displayed by Internet Explorer V6. All browsers successfully displayed all characters listed in the Latin Extended-A block. However, when he got to characters in the Minimum European Subset (a.k.a. the Multilingual European Subset No. 2), Internet Explorer V6 displayed open rectangles while Mozilla browsers displayed appropriate glyphs. (An open rectangle means that Internet Explorer knows what character it has encountered, but it cannot find a glyph to display it.)

There clearly is some problem with Internet Explorer V6, and it is not likely that there is a work-around for it. Microsoft fixed the problem in V7 and he is certain they have no interest in retrofitting it to V6.

Marc's solution is to have everyone switch to Firefox.

Makes sense to me.

Posted by Bill Trippe at 5:32 PM

August 11, 2007

Comcast Hates You: A Tragicomedy in Three Acts

Well, they do if you use their high-speed Internet service and want to send email.

I have Comcast at home for cable, Internet, and digital phone. I haven't been an unhappy customer by any means, though I have always found them expensive and Luddite. I got a kick out of it a few weeks ago when they called out of nowhere to announce a price drop, but this was on or about the day my city government approved Verizon to provide FIOS service here. Competition is a wonderful thing.

So anyway, yesterday I was working at home and I found my email was not sending successfully. I didn't fuss with it much for a bit, as I was busy with some work. When I started to debug it, though, I was able to figure out it wasn't local to my machine. We have three machines on a secure wireless network hung off the cable modem, and none of the three was successfully sending email. So I got online to chat with Comcast.

This was the start of my fun. Here's the Reader's Digest version of a much longer and more frustrating conversation I had:

Clueless Comcast Support Person #1: Do you need help configuring Outlook Express (the only email client they officially support, as they distribute it)?

Still Agreeable Me: No, the problem is not with the client. I have several machines with different clients, and they are all having the same problem.

Clueless Comcast Support Person #1: Since you don't need help configuring Outlook Express, bye and have a nice day!

Still Agreeable Me: Wait! None of the email clients work and they have all worked fine for years. What is the difference?

Clueless Comcast Support Person #1: (Long delay, mumble, mumble.) Oh, you were sending too much email (more than 100!), so we blocked your access to port 25.

Still Sort-of-Agreeable Me: Oh? I suddenly became a spammer after several years of never having been one? I run up-to-date security software on all my machines. Which one caused the problem?

Clueless Comcast Support Person #1: Comcast values your security [Ed note: Clearly a cut and paste!] and we cannot tell you that. However, if you follow this 12-foot-long instruction and send an email from your Comcast email, it will direct you to a URL that will explain how to unblock port 25.

I don't use Comcast email, but I had set up a Comcast login, so, good little computer user that I am, I tried what he said.

It didn't work of course.

So I called this time.

Distressed Me: I was online, trying to get my email to work, port 25 is blocked, I tried his suggested fix, and it didn't work.

Clueless Comcast Support Person #2: Do you need help configuring Outlook Express?

Aggravated Me: No, it has nothing to do with my client. You blocked my access to port 25, and I can't send...

Clueless Comcast Support Person #2: Since you don't need help configuring Outlook Express, bye and have a nice day!

Infuriated Me: If you say the words "Outlook Express" one more time, I am going to kill you. You are blocking my outgoing port for alleged security problems. Your fix didn't work. What can be done?

Clueless Comcast Support Person #2: (Long delay, mumble, mumble.) Our security department is going to look into it and it will be fixed in 24 hours.

This morning, I got online again, new guy.

Tired-but-Somehow-Hopeful-Me: I was checking to see if the problem with my email has been resolved? And please don't mention Outlook Expr...

Clueless Comcast Support Person #3: Do you need help configuring Outlook Express?

Ready-to-be-Livid-Me: Please look up my account details for the history on this problem.

Clueless Comcast Support Person #3: Do you need help configuring Outlook Express?

Livid-Me: Are you going to unblock port 25 or not?

Suddenly Clueful Comcast Support Person: We do not lift blocks on port 25.

Cool-as-a-Cucumber-Me: Do you have the number for Verizon?

POSTSCRIPT: I ended up talking with someone in Comcast security. Despite what the first two support people told me, they do not selectively lift blocks on port 25. He did not have information about whether my connection was used to spam (I am virtually certain it has not been), but implied instead that they are doing this across the board.

The fix is challenging. Comcast's online help--and the tech support people--are only prepared to help you reconfigure a comcast.net email to use an alternate port, port 587. They do not tell you how to configure other email addresses. What I ended up doing was configuring my other emails to use smtp.comcast.net for the outgoing email server (port 587). This works from here, and I am hoping it will also work when I am using this laptop elsewhere.
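For anyone who sends mail from a script rather than a desktop client, the same change can be sketched with Python's standard smtplib. This is a minimal sketch, not Comcast's documented procedure: the server name and port come from the paragraph above, while the addresses, credentials, and function names are hypothetical placeholders, and the use of STARTTLS plus a login is an assumption about how port 587 is typically set up.

```python
import smtplib
from email.message import EmailMessage

SMTP_HOST = "smtp.comcast.net"  # outgoing server named in the post
SMTP_PORT = 587                 # alternate port named in the post


def build_message(sender, recipient, subject, body):
    """Build a plain-text message; the From address need not be a comcast.net one."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_port_587(msg, username, password):
    """Send through port 587 instead of the blocked port 25.

    Assumption: the server offers STARTTLS and requires a login, which is
    common on port 587 but is not confirmed by the post.
    """
    with smtplib.SMTP(SMTP_HOST, SMTP_PORT, timeout=30) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)
```

Building the message separately from sending it means the message can be inspected or tested without touching the network.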

I find a few things about this episode absolutely amazing.

-- If Comcast is doing this to more than a few people, they are astonishingly arrogant to roll something like this out without informing people. I found a number of other blogs discussing this.
-- Comcast hates their customers, but they also hate their tech support staff. Imagine having calls coming in about something you don't have a clue how to answer?
-- Is it a blanket change in using this port, as the security guy said, or was something happening on my connection? Who knows, but Comcast should have their story straight.

If Comcast is doing this, as the other blogs suggest, to combat spam, well, good for them. I hate spam. But if they are taking my money, they should spend some of it to roll out such changes in a thoughtful, well supported way. Their tech support folks should be better informed, and their online doc and Help should address the thousands of users like me who use non-comcast.net emails.

UPDATE: Another blogger says Comcast's port change will be ineffectual against spammers.

ANOTHER UPDATE: My fix works at home, but not at my office, where I had to revert to the old port and the old SMTP server. So either I need to find a more general fix or toggle between the two sets of settings (I have four emails...). Fun, fun, fun!

Posted by Bill Trippe at 8:21 PM | Comments (4)

July 18, 2007

Mac OS X Leopard

If you have been waiting for Leopard, you can preorder it now at Amazon.com.

According to the article at Wikipedia, "Leopard contains over 300 changes and enhancements, according to Apple. Some notable features include support for writing 64-bit graphical user interface applications, an automated backup utility called Time Machine, support for Spotlight searches across multiple machines, and large revisions to most core operating system components."

Posted by Bill Trippe at 9:52 AM

May 30, 2007

Excel and XML

Since so much metadata, and even editorial content, is often produced in Microsoft Excel, shouldn't publishers consider using SpreadsheetML for long-term uses of Microsoft Excel? A tutorial over at Brian Jones' blog got me thinking about it. If you are interested in a more in-depth look at SpreadsheetML, start here.
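To make the idea concrete, here is a minimal Python sketch of reading metadata rows out of a spreadsheet saved as XML. It assumes the Excel 2003 "XML Spreadsheet" vocabulary; the newer Office Open XML packaging uses a different namespace and a zipped container and is not handled here. The sample data and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Excel 2003 "XML Spreadsheet" format (assumed format).
NS = {"ss": "urn:schemas-microsoft-com:office:spreadsheet"}

# Made-up sample metadata, not taken from any real title list.
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Titles">
    <Table>
      <Row>
        <Cell><Data ss:Type="String">ISBN</Data></Cell>
        <Cell><Data ss:Type="String">Title</Data></Cell>
      </Row>
      <Row>
        <Cell><Data ss:Type="String">0000000000</Data></Cell>
        <Cell><Data ss:Type="String">Sample Title</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>
"""


def read_rows(xml_text):
    """Return each Row as a list of cell strings.

    Simplification: sparse rows that skip cells via ss:Index are not
    realigned, so this suits only fully populated tables.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.iterfind(".//ss:Row", NS):
        rows.append([data.text or "" for data in row.iterfind("ss:Cell/ss:Data", NS)])
    return rows


if __name__ == "__main__":
    for row in read_rows(SAMPLE):
        print(row)
```

Because the file is ordinary XML, the same rows could just as easily be fed to an XSLT transform or loaded into a content management system.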

Posted by Bill Trippe at 10:25 AM

May 26, 2007

DITA for Small Groups

Are you a sole proprietor, sole documentation person, or part of a small doc group? Check out Lone-DITA.

Speaking of DITA, if you haven't already, you should check out DITA Storm, a browser-based DITA editor.

Posted by Bill Trippe at 1:59 PM

May 21, 2007

Thinking about DITA vs S1000D?

Over at TheContentWrangler.com, Joel Amoussou has some thoughts:

The subject of interoperability between S1000D and the Darwin Information Typing Architecture (DITA) has received significant attention within the technical documentation community recently. This article discusses the following issues:

--Shall we create DITA specializations for S1000D data modules?
--How can we facilitate interoperability between DITA and S1000D, to enable round-tripping transforms for example?
--Is the DITA specialization mechanism the best way to make S1000D extensible?
--How can users leverage the strengths of both DITA and S1000D without introducing complexity?

As they say in the blogosphere, read the whole thing.

Posted by Bill Trippe at 1:31 PM

May 8, 2007

QuarkXPress Server 7 and new QPS Users

I would like to speak to people who are using the new versions of QuarkXPress Server and also the new QPS for some research and writing that I am doing. Please email me and I will follow up.

Posted by Bill Trippe at 2:49 PM | Comments (2) | TrackBack

XML and Globalization

SDL Warns Businesses to Think Global When Migrating to XML

SDL, one of the big players in globalization solutions, announced today the findings of a research project into the use of XML in delivering global content across multiple channels. This is something I have written about for Gilbane (here and here), and I am very interested in best practices that will emerge as more and more companies use XML in producing content for global audiences.

SDL appropriately notes that the global implications of moving to XML must be considered up-front, and is providing seven "golden rules" at www.sdlglobalxml.com to ensure successful implementation of XML projects for communicating with global audiences:

  1. XML alone does not solve the issue of global content
  2. Think global from the start of your XML strategy
  3. Automate the process of managing higher volumes of smaller chunks, being sent more frequently for translation
  4. Ensure translators can visualize the context of XML chunks
  5. Optimize the structure of your XML for localization
  6. Protect your XML code during localization
  7. Ensure terminology and style are consistent across dispersed chunks

Posted by Bill Trippe at 2:44 PM | Comments (2)

April 28, 2007

Apollo Widget and Vista

That finetune.com Apollo widget I mentioned recently? Works like a charm on Windows XP, but not on Windows Vista. I downloaded it, installed it, and launched it. Weirdly, the process runs on Vista, but the GUI simply never appears. On XP, it is a nice little app from what I can see.

Posted by Bill Trippe at 9:03 PM

MathML 3.0 Working Draft Published

Mathematical Markup Language (MathML) Version 3.0: Working Draft

2007-04-27: The Math Working Group has published the First Public Working Draft of Mathematical Markup Language (MathML) Version 3.0. MathML is an XML application for describing mathematical notation and capturing both its structure and content. The goal of MathML is to enable mathematics to be served, received, and processed on the World Wide Web, just as HTML has enabled this functionality for text.

In related news, the W3C has also published a MathML for CSS profile.
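For readers who have not seen MathML, a small Python sketch may help show what "an XML application for describing mathematical notation" means in practice. It builds presentation markup for a base raised to an exponent, using element names (math, msup, mi, mn) that long predate version 3.0. The function name is hypothetical, and nothing here is specific to the new working draft.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def superscript(base, exponent):
    """Return a <math> element marking up `base` raised to `exponent`."""
    # Serialize with MathML as the default namespace rather than a prefix.
    ET.register_namespace("", MATHML_NS)
    math = ET.Element(f"{{{MATHML_NS}}}math")
    msup = ET.SubElement(math, f"{{{MATHML_NS}}}msup")
    ET.SubElement(msup, f"{{{MATHML_NS}}}mi").text = base      # identifier
    ET.SubElement(msup, f"{{{MATHML_NS}}}mn").text = exponent  # number
    return math


if __name__ == "__main__":
    print(ET.tostring(superscript("x", "2"), encoding="unicode"))
```

The markup records structure (an identifier with a numeric superscript) rather than appearance, which is what lets software process the mathematics instead of just displaying a picture of it.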

Posted by Bill Trippe at 5:33 PM

April 11, 2007

Can Blogs Persist in the Way Scholarly Information Does?

Jon Udell interviews Geoffrey Bilder, Director of Strategic Initiatives at CrossRef, and a veteran in the scholarly technology world. They discuss CrossRef's critical role in the scholarly information world, how Digital Object Identifiers (DOIs) work, and what this kind of technology means for blogs and other content.
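As a rough illustration of how a DOI works as a persistent identifier, the Python sketch below splits a DOI into its registrant prefix and item suffix and builds a resolver link. It is not drawn from the interview: the function names are hypothetical, the validation is deliberately loose, and the resolver address is the public doi.org service. The sample DOI, 10.1000/182, is the one assigned to the DOI Handbook.

```python
from urllib.parse import quote

RESOLVER = "https://doi.org/"  # public DOI resolver


def split_doi(doi):
    """Split a DOI into (prefix, suffix) at the first slash.

    Loose check only (an assumption): the prefix must start with "10."
    and the suffix must be non-empty.
    """
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def doi_to_url(doi):
    """Build a resolver URL, percent-encoding unusual suffix characters."""
    prefix, suffix = split_doi(doi)
    return RESOLVER + prefix + "/" + quote(suffix, safe="/")


if __name__ == "__main__":
    print(doi_to_url("10.1000/182"))
```

The point of the indirection is that the link stays the same even if the publisher moves the content; only the record behind the resolver changes.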

Posted by Bill Trippe at 12:16 PM

April 6, 2007

eBooks No, But ePaper Yes?

"Electronic paper" edging toward reality

"Electronic paper" has long been hyped as the future of newspapers and books, but products like e-books have been slow to take off. That may soon change, say executives involved in the pioneering technology. While Internet companies are scanning libraries of books and making them available online, E Ink Corp., which emerged out of the Massachusetts Institute of Technology a decade ago, is seeing a surge in orders for its portable, foldable displays that mimic conventional paper to carry such books. "Nine different companies launched products last year based on the technology," said Russell Wilcox, E Ink president. "In the last nine months we've gone from manufacturing tens of thousands of parts to millions of parts."

Posted by Bill Trippe at 7:03 PM | TrackBack

April 5, 2007

DRM-Free EMI: Microsoft Joins Apple

Over at DRMWatch, Bill Rosenblatt weighs in on DRM-Free music, EMI, Microsoft, and Apple.

As far as EMI is concerned, the deal was shortsighted, risky, and possibly irresponsible to the company's shareholders. EMI is the smallest of the four majors, enjoys no synergies with corporate siblings, and is undergoing financial hard times. This move with Apple was a lunge for near-term revenue, at the quite possible expense of longer term revenue for EMI and the rest of the industry. EMI gets a cash advance of US $5 Million from Apple. It should enjoy a short-term revenue spurt as some consumers respond to the hype and purchase DRM-free tracks for $1.29 (in the US market).

Posted by Bill Trippe at 8:23 PM

April 2, 2007

Currently Reading

One of the takeaways, er, giveaways from the Adobe Analyst Meetings last week was a nice little book, Apollo for Adobe Flex Developers Pocket Guide. I read most of it on the train ride home (the narrative parts anyway, a chunk of it is reference), and it made sense, though I am not a Flex developer. Note that it is indeed specifically for Flex developers, and it is indeed a pocket guide.

Posted by Bill Trippe at 12:06 AM

March 30, 2007

Adobe Analyst Meetings

Back from a couple of days in New York, where I attended the Adobe Analyst meetings. Impressive stuff, as I relate over at the Gilbane Group blog.

Posted by Bill Trippe at 4:11 PM

March 21, 2007

And a Busy Year it Was

The 2006 Year in Review for DITA

by Don Day, Chair, OASIS DITA Technical Committee
IBM Lead DITA Architect

The OASIS DITA standard:
The current standard is at DITA 1.0. During 2006, committee work was focused on developing the proposed DITA 1.1 features (see "Roadmap for DITA development"). Just last month, the committee released a Public Review draft for DITA 1.1, which is expected to be approved later this year.

Posted by Bill Trippe at 2:35 PM

March 19, 2007

Grazr

An article in Mass High Tech about RSS startup Grazr caught my eye, so I went to the Web site and played around with their widget.

It seems pretty cool. You can build your own here if you have an OPML file to start with.

Posted by Bill Trippe at 3:01 PM

February 16, 2007

Brilliant

Hat tip to my Gilbane colleague Leonor Ciarlone.

Posted by Bill Trippe at 10:51 AM | TrackBack

February 14, 2007

WS-AreYouEvenStillReadingThis

Writing for IBM developerWorks, Elliotte Rusty Harold offers Ten Predictions for XML in 2007. I've always liked Elliotte's work. When SGML was giving way to XML, Elliotte wrote the first good book about XML, and he has gone on to write several more. His XML in a Nutshell is the book I always recommend to people looking for a solid overview and authoritative first reference, so his predictions mean something. He weighs in on a number of topics you would expect to hear about (XQuery, XForms, open document formats), and some that are less well known (the Atom Publishing Protocol (APP)). But the thing that really caught my eye was his skepticism about Web Services. The money quote: "Enterprises have absorbed as much Web services machinery as they're able to stomach. Web Services Description Language (WSDL) and SOAP 1.2 are the end of the line. Many enterprises won't even get that far. WS-Choreography, WS-Transport, WS-Reliability, WS-Security, WS-Resource, WS-ServiceGroup, WS-BaseFaults, WS-Messaging, WS-KitchenSink, and WS-AreYouEvenStillReadingThis won't leave the station."

Posted by Bill Trippe at 6:47 AM

February 12, 2007

EMC Retrospect Extends Support for Microsoft Windows Vista

But maybe a week too late for me...

I was using the Retrospect software to back up my Windows XP notebook, which started to die an unceremonious death a week or so ago. So I ran out and bought a new machine, and found myself stuck with Microsoft Vista because 10 of the 11 notebooks at Best Buy were already running Vista. Then I discovered that my backups, faithfully created with the Retrospect software, had no way of getting to my new Vista machine, since Retrospect wasn't working on Vista. How delightful!

So how does a major hardware and software vendor like EMC not have software updated at the same time a new version of the dominant OS comes out? I have no idea. Will Retrospect restore the backups from my Windows XP machine onto my Windows Vista machine? I will let you know.

Posted by Bill Trippe at 11:26 PM

February 4, 2007

Vista, Schmista

Too often, it seems, I find myself building a new system for myself, my small office, or for family use. Building one for the family is actually pretty easy. Windows, Office, and away we go. My office machine is a little trickier, as I have to account for things like Quickbooks, and that is difficult because somewhere along the line I put myself on this treadmill of having bought one full version followed by upgrades. So I end up installing the original software, then a couple of upgrades, and then I have to go to the Intuit Web site for a patch--blah, blah, blah, blah, blah. Really, it should all be easier.

The toughest job is building a new system for myself. Windows, Office, Acrobat, my HTML editor, some XML tools, Firefox, my backup system, and then a bunch of small things that I have grown to use and like--including Google Desktop and the Onfolio tool (which unfortunately is now only part of the Windows Live toolbar--oy). Then there are all the settings--network accounts, email accounts, ftp accounts, RSS settings. The details drive me crazy, and I don't want to count the hours I have spent tinkering with the new machine I bought Thursday evening that is still not 100% "mine."

The new machine has Windows Vista, by the way. And while I have not done much exploring, Vista is, well, to be polite, underwhelming. I am sure someone with some knowledge could spell out some of the improvements, but it fails the "doh!" test. In other words, it still does poorly what it has always done poorly. It still takes forever for the system to boot and to shut down, and the performance seems, incredibly, worse than my two-year old Model T of a machine, despite the fact that the new machine has four times as much memory and a much, much faster chip. How is this possible?

I am sure that I can improve on the performance. (Well, I assume I can, if I spend some time looking at my power settings, and at what is launched during startup, and how big the paging file is, blah, blah, blah, blah.) But this is exactly my point. It shouldn't be so hard. We are 20-something years into the personal computer era; why do we still have to baby and tweak and cajole and troubleshoot these systems like they are a whole new invention?

Posted by Bill Trippe at 6:17 PM | TrackBack

February 3, 2007

XForms Tutorial

New Orbeon Forms Tutorial

After lots of promises over the past few months, the new Orbeon Forms tutorial is finally available! You can read it online or get it with any recent builds of Orbeon Forms.

The tutorial specifically targets the upcoming Orbeon Forms 3.5, of which you can find nightly builds here. The tutorial covers:

Installing and configuring Orbeon Forms.

Understanding the simple XForms Hello application.

Building from scratch the Bookcast application, which allows you to keep track of the books you have read...

Posted by Bill Trippe at 7:28 PM

January 18, 2007

eBooks in the K-12 Classroom?

TeleRead offers some thoughts on a WiFied eInk machine and perhaps a K-12 push for the Sony eReader.

Spurred by the threat of the rumored Kindle E Ink machine from Amazon, Sony is considering a WiFi-enhanced successor to the Sony Reader, as well as a push to get E Ink machines into the classroom.

Posted by Bill Trippe at 1:02 PM

January 7, 2007

Adobe Acrobat 8

I have my copy of Acrobat 8, but have been too busy to install it. But I was spending some time updating my eForms Resources page, and started looking at the list of new books about Acrobat 8. Not surprisingly, you could start a small library with them. So I decided to put together an Amazon aStore with Acrobat-8-related products. Shop early and often!

Posted by Bill Trippe at 3:14 PM | TrackBack

January 6, 2007

Someone is Bullish about eReaders

E-Paper Display Company Plastic Logic Receives $100 Million Funding

In one of the biggest venture capital rounds ever in Europe, UK electronic paper display technology company Plastic Logic has received $100 million in venture funding. The new round was led by Oak Investment Partners and Tudor Investment Corporation.

Posted by Bill Trippe at 10:06 AM

January 3, 2007

XForms for UBL

Micah Dubinko highlights a new XForms for UBL project at SourceForge.

The Universal Business Language (UBL) provides standard XML formats for business documents. This project is to provide XForms which allow creation, processing and editing of UBL documents and XSLT stylesheets to generate such forms.

Posted by Bill Trippe at 3:02 PM

December 28, 2006

Speaking of My Resource Pages...

... I also spent some time updating my eForms resources page.

Posted by Bill Trippe at 3:36 PM

December 27, 2006

Under the Christmas Tree

My teenage sons were very happy with their high-tech Christmas. Presents included an iPod Nano and Family Guy, Vol. 3 (Season 4, Part 1) for my younger son, while my older son has spent every waking moment with his Motorola Q Phone, taking time out to watch Pulp Fiction (Collector's Edition).

Meanwhile, I had a decidedly low-tech Christmas, highlighted by the Get Fuzzy 2007 Page a Day Daily Boxed Calendar. Satchel Pooch is my man.

Posted by Bill Trippe at 10:13 AM

December 24, 2006

Relational database integration with RDF/OWL

Bob DuCharme, one of the smartest guys in the business, reports that his XML 2006 paper is done and available. You can download the paper here and the PowerPoint slides here.

Posted by Bill Trippe at 11:43 AM | Comments (1)

December 19, 2006

SVG is Dead; Long Live SVG?

Every time I decide SVG has lost all of its traction, I read something like this that makes me at least consider that SVG still has legs.

From the article:

The NeuART II software works from these CDs, checking copyright and also making sure that the user has a valid copy of Adobe Illustrator, and then using a JavaScript program converting the files into a standard vector graphic (SVG) format. The SVG images are stored on the user's system and organized using the software, which works on Linux, Windows and Macintosh operating system computers.

Posted by Bill Trippe at 3:58 PM | Comments (3)

Stumbling Upon

I mentioned having some fun with StumbleUpon. Then today I found a delightful site called WordPerhect. Check it out. The startup tips are a hoot.

Posted by Bill Trippe at 12:18 AM

December 14, 2006

Drupal with Ad Serving and Web Analytics

A client is interested in adopting Drupal, but is simultaneously looking at Web analytics and ad serving software, mainly commercial packages. I would like to put together a snapshot of some Drupal sites that use different analytics and ad serving packages, including Drupal modules. Would anyone be willing to share what they are currently using? You can post here or email me. Thanks.

Posted by Bill Trippe at 11:11 AM | Comments (1) | TrackBack

IBM, Yahoo Partner on Free Enterprise Search for SMBs

This strikes me as an interesting challenge to the Google appliance, and a nice way for Yahoo to penetrate the enterprise with a well-regarded partner.

The free search package allows small and midsize businesses to search corporate file servers and databases as well as the public Web.

Posted by Bill Trippe at 10:35 AM | TrackBack

December 2, 2006

A Chapter for the Ladies

The joys of Project Gutenberg: baseball, as viewed in 1888.

On account of the associations by which a professional game of base-ball was supposed to be surrounded, it was for a long time thought not a proper sport for the patronage of ladies. Gradually, however, this illusion has been dispelled, until now at every principal contest they are found present in large numbers. One game is generally enough to interest the novice; she had expected to find it so difficult to understand and she soon discovers that she knows all about it; she is able to criticize plays and even find fault with the umpire; she is surprised and flattered by the wonderful grasp of her own understanding, and she begins to like the game. As with everything else that she likes at all, she likes it with all her might, and it is only a question of a few more games till she becomes an enthusiast. It is a fact that the sport has no more ardent admirers than are to be found among its lady attendants throughout the country.

Posted by Bill Trippe at 2:10 PM | TrackBack

November 22, 2006

Mixing MathML and SGML

Do you have any experience, or know of any instances, of mixing MathML within an SGML document instance? I have a client who is beginning the process of converting an extensive collection of SGML documents, and would like to go ahead and convert the equations first, into MathML, and then insert the equations back into the SGML document instances. One of their service providers is concerned about this. They are citing the SGML character entities in the current document instances versus the need--as they see it--to use Unicode in the MathML. However, as I read the MathML specification, you can still use SGML character entity references as long as you are using the MathML DTD and not the MathML XML Schema (see this section of the MathML recommendation).

Am I reading this correctly? Any experience with this?

I realize there are likely some other issues too, but this one came up in the first discussion...
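To make the entity-versus-Unicode question concrete, here is a minimal sketch in Python of the fallback option: rewriting named entity references as numeric character references, which need no DTD to resolve. It rests on assumptions. The function name to_numeric_refs and the sample markup are hypothetical, and Python's HTML entity table stands in for the ISO entity sets a real SGML collection would declare, so names missing from that table are simply left alone.

```python
import re
from html.entities import name2codepoint

# Entities predefined by XML itself; these must stay as named references.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches a named entity reference such as &alpha; or &plusmn;
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_refs(markup: str) -> str:
    """Replace known named entity references with hexadecimal
    numeric character references; leave everything else untouched."""

    def replace(match):
        name = match.group(1)
        # Assumption: the HTML entity table approximates the ISO sets.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#x{:X};".format(name2codepoint[name])

    return ENTITY_REF.sub(replace, markup)


# Hypothetical MathML fragment using entity names from the source SGML.
sample = "<math><mi>&alpha;</mi><mo>&plusmn;</mo><mn>1</mn></math>"
converted = to_numeric_refs(sample)
print(converted)
```

If the MathML DTD route works as described above, no such conversion is needed; the sketch only shows that the two forms name the same characters.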

Posted by Bill Trippe at 3:41 PM | TrackBack

November 21, 2006

Web Analytics Packages

A client is looking at web analytics software for a fairly complicated operation. They would like to track behavior from a number of domains (~40), and produce custom reports for both internal audiences and for advertisers. To complicate things, of course, the systems are heterogeneous (mainly Windows and Linux, but some Macintosh sprinkled in). They will be migrating to a common platform (TBD) in the foreseeable future, but may want to put the new analytics package in place before the migration.

They have started with the following list of packages to examine:

ClickTracks
CoreMetrics
WebSideStory
Sage Analytics
WebTrends
Omniture

Does this look like the right list? Any others they should be looking at?

Any and all comments welcome, and feel free to contact me off the list as well (though no sales calls please).

Posted by Bill Trippe at 11:04 PM | Comments (3) | TrackBack

Mashups

Every now and then I get buzzword fatigue, and I got it almost immediately upon hearing the word "mashup." (And it doesn't help that the Wikipedia definition has the word "seamlessly" in the first sentence.) Still, I am sure the word is useful for some people, and I am sure there are some good mashups out there. Then today I found one that combines maps and flickr photos really nicely, and I decided that maybe I like the word after all.

Posted by Bill Trippe at 11:45 AM

October 24, 2006

Adobe Digital Editions

Adobe announced Digital Editions today (press release here). Digital Editions is billed as a rich internet application for digital publishing, enabling users to acquire, read, and manage a variety of digital content. There is an obvious match here for eBooks, but the platform also has significance for digital editions of magazines, for example, and other content that would benefit from digital rights management (DRM) support.

Ryan Stewart already has a close look at ZDNet, and considers it "extremely compelling for both content providers and users on a number of fronts." Alan Safford has some more thoughts at PC World. David Utter of Webpronews.com discusses some of the hosting and distribution issues, and highlights that Digital Editions is the first Adobe product based on Flex 2 (a point Adobe CTO Kevin Lynch also mentioned this morning).

UPDATE: Publishers Weekly has more, focusing on the reader interface.

I saw it today, and it looked good. It is a Beta, but the interface is attractive and the performance is terrific. I didn't dig in too much, but what I saw was a set of books with an attractive point-and-click navigation and very quick retrieval and display of the titles in Acrobat and in XHTML. You can download it here. I did, and it installs very quickly and easily.

FURTHER UPDATE: Don Fluckinger has a great overview at PDFZone.com.

AND YET ONE MORE: Bill Rosenblatt has some thoughts on the DRM implications of the new offering.

Posted by Bill Trippe at 1:40 PM | Comments (1)

Blogging Has Been Light

I have been heads down with some project work and writing, so blogging has been light. I am at Adobe Max for a couple of days, and just saw a very cool demo of more integrated Web publishing beginning in Photoshop and extending through Fireworks and Dreamweaver. It was a "future," but I will find out more in a press briefing later today with Adobe CTO Kevin Lynch.

UPDATE: There is a beta program for Fireworks 9 if you are interested in applying.

Posted by Bill Trippe at 12:40 PM

October 6, 2006

RFP: Document Management

Jim Rapoza of eWeek Labs offers a sample RFP for Document Management.

Posted by Bill Trippe at 6:05 PM

October 3, 2006

CrossRef Indicators

I remember when I first heard about Digital Object Identifiers (DOIs) and thinking, "great idea... needs critical mass." Well, according to the latest CrossRef Indicators, they have long since passed critical mass.

CROSSREF INDICATORS (September 29, 2006)

Total no. participating publishers & societies: 1,683
% of non-profit publishers: 64%
Total no. participating libraries: 1,107
No. journals covered: 15,215
No. DOIs registered to date: 22,584,497
No. DOIs deposited in previous month: 294,257
No. DOIs retrieved (matched references) in previous month: 4,503,094
DOI resolutions (end-user clicks) in previous month: 11,007,980

The 11 million plus DOI resolutions is staggering really. That is 11 million clicks on specialized, authoritative content in one month.

Posted by Bill Trippe at 12:57 PM

October 2, 2006

There's Fraud in Online Advertising?

Over at MarketingProfs, John Jantsch wonders if the click fraud problem is overhyped, especially since the current issue of BusinessWeek screams across its cover, "Click Fraud - The Dark Side of Online Advertising."

I enjoy this kind of perspective--Jantsch discusses the "analog" analogue to pay-per-click and wisely suggests people not overreact. At the same time, I think the democratization of PPC advertising puts more people at risk than, say, the phantom billboard example Jantsch suggests. Hence the need for the key parties to be vigilant, and also provide open, accountable, and measurable ways for buyers to know that their investment is being well spent.

Posted by Bill Trippe at 12:05 PM

September 29, 2006

Speaking of DITA

I wrote an article about DITA for the magazine, Multilingual Computing. Unfortunately, the article is available by subscription only. (Also, unfortunately, I am having trouble reaching their site right now...) But I have four certificates entitling readers to a one-year subscription to the magazine. It's an excellent magazine. Email me with your contact information, and I will mail you one of the certificates. First come, first served.

UPDATE: Corrected "one-ear subscription" to "one-year subscription." No wonder they have been going slowly! I still have a couple left, so e-mail me if you are interested.

Posted by Bill Trippe at 10:12 AM

DITA Open Toolkit Release 1.3

Release 1.3 of the DITA Open Toolkit is now available. I have written about the rapid adoption of DITA (for example, here and here). One of the big reasons for the rapid adoption is the toolkit, which provides users with, among other things, a ready means of publishing DITA-encoded content in common formats such as PDF and Help.

Posted by Bill Trippe at 9:51 AM

September 27, 2006

Sony Reader Roundup

TeleRead has a good roundup of reviews on the Sony eReader.

Posted by Bill Trippe at 1:24 PM

September 26, 2006

Sony eReader Available

The Sony Portable Reader System PRS-500 is now available. TeleRead has a very thoughtful article about some of the challenges Sony faces. Meanwhile, I keep offering to review the thing, but no word from Sony.

More here from paidContent.org.

Posted by Bill Trippe at 7:54 PM

September 19, 2006

Meanwhile, Over at Gilbane

The Gilbane Group announced they have launched a blog for Chief Technology Officers (CTOs) who are involved in enterprise content applications, whether vendor, integrator, or enterprise implementer. The Content Technology CTO Blog is hosted by the Gilbane Group as a service to the content and information technology community. The purpose of the blog is to facilitate ongoing discussion and debate on technologies, approaches and architectures relevant to enterprise content applications.

CTOs have a wealth of critical information about technologies that is not always accessible to enterprise customers. CTOs also have demanding jobs, and have limited time available to meet with each other, with customers, or with other industry influencers. This blog is intended to encourage communication both between vendor CTOs and between enterprise customer CTOs and vendor CTOs. All CTOs are invited to participate as authors, and to comment.

Two CTO Blog charter authors have already contributed posts during the pre-launch testing. John Newton, a Documentum founder and now founder and CTO of Alfresco, provides a provocative take on "content management 2.0". Vern Imrich, CTO of Percussion Software, shares insights into the apparent contradiction of content management technology moving up and down the technology infrastructure stack at the same time.

Additional charter authors of the Content Technology CTO Blog include: Bill Cava, Ektron; James Gonthier, Refresh; Jason Hunter, Mark Logic; Vern Imrich, Percussion; John Newton, Alfresco; Bjørn Olstad, FAST; Eric Severson, Flatirons Solutions; and Carl Sutter, CrownPeak.

Posted by Bill Trippe at 10:15 AM

September 15, 2006

Functional Web Analytics

Writing at iMedia Connection, SEMphonics CEO Gary Angel asks some refreshingly direct questions about what companies actually do with web analytics:

Can you answer yes to all of these questions?

I especially like the last one. This is not to say I think people are lazy, but that if a report isn't relevant or actionable, people will simply discard or ignore it.

Posted by Bill Trippe at 9:31 AM | TrackBack

August 31, 2006

Ultimate Developer and Power Users Tool List for Windows

Ron Gustavson writes with a great link, Scott Hanselman's 2006 Ultimate Developer and Power Users Tool List for Windows. It includes all manner of tools, targeted at developers and super users, and has a very good section on XML tools.

Posted by Bill Trippe at 4:07 PM

August 29, 2006

XML Schema Book

Someone asked me to recommend a book about XML Schema, and I didn't hesitate to point to Priscilla Walmsley's fine book, Definitive XML Schema.

Posted by Bill Trippe at 3:44 PM

August 24, 2006

DRM Vendor Market Consolidates, but Deployments Seem to Be on Rise

Writing for DRM Watch, Brett Sheppard has a brief roundup of recent Enterprise DRM deployments.

Posted by Bill Trippe at 5:45 PM

August 23, 2006

Eliot Kimber Meets MarkLogic

Eliot Kimber gets his first look at MarkLogic and likes what he sees.

UPDATE: Mark Logic CEO Dave Kellogg was relieved to find out that Eliot liked the software despite Eliot's blog subtitle, "All tools suck. Some suck less than others."

NOTE: Yes, if you are reading closely, "Mark Logic" is the company but the product is called "MarkLogic Server." I have no idea why there is a space in the company name but no space in the product name.

WHICH REMINDS ME: If you are interested in exploring XQuery, Mark Logic's Stephen Buxton has co-authored an excellent book, Querying XML: XQuery, XPath, and SQL/XML in Context.

Posted by Bill Trippe at 9:42 AM | TrackBack

August 22, 2006

SEO and Content Management

Writing for CMS Watch, Randy Woods and Julie Batten offer some excellent, detailed advice about SEO and content technology.

Posted by Bill Trippe at 1:54 AM

August 16, 2006

Exploding Laptops and Other Nuisances

According to The New York Times, a Dell notebook computer in Thomas Forqueran’s pickup truck caught fire in July, "igniting ammunition in the glove box and then the gas tanks." (Emphasis added.)

I can see keeping ammunition in your glove box, but in the gas tanks too?

Of course, the real tragedy is that he may never get to read Instapundit again.

UPDATE: Click on the picture for a closer view. Do you suppose he was smoking a cigarette when he put the ammunition in the gas tanks?

Posted by Bill Trippe at 7:58 AM | Comments (1)

Cracking PDF

Over at PDFZone, Don Fluckinger has a great piece about the re-emergence of ElcomSoft as one of the good guys--or not.

Posted by Bill Trippe at 7:50 AM

August 13, 2006

Dear Sony: Please listen to Jane...

If you have a keen interest in eBook markets and technology, you really should follow the TeleRead blog. This weekend it has a number of fine entries, including Dear Sony: Please listen to Jane about your eBabel problem—if you want to woo romance readers. The advice applies to all kinds of readers, including romance readers.

Posted by Bill Trippe at 6:38 PM

August 12, 2006

Publishing Technology Survey

IDEAlliance is conducting a survey of publishing technology, and will be sharing the results. According to the Web site:

This IDEAlliance Publishing Technologies Survey is being conducted to assess the state of publishing technologies and standards in the industry today. First we ask for general information about your organization and your role. You do not have to reveal your name, company or position. However note that we provide survey results to any one who is interested. Next we focus on digital media assets both for archive and for product delivery. We hope to assess current media formats and identify trends for the next two years. We then move our focus to systems. Here we hope to assess the current systems that are installed and in use as well as the wish-list for the next 2 years. Other areas of inquiry include technology standards, both awareness and adoption.

Posted by Bill Trippe at 8:24 PM

August 11, 2006

eMail RIP, Redux

Everyone knows email is hopelessly broken. For example, note this excellent article, written three years ago, and the situation has only worsened. Indeed, if you google "email broken" the first several hits are from 2003. It is as if everyone has simply accepted it.

But should we? Every now and then I look at my own spam problem. In the last 10 hours, for example, I received 195 emails, and 134 of them were spam. 113 of the spam were successfully trapped in my Outlook spam folder, leaving me to clean up 21 of them from my Inbox. This is in addition to a spam filter that one of my ISPs provides; that filter traps about 200 spam a day. At one point, I was diligent about reviewing the spam to see if any real email was incorrectly trapped, but now I rarely do. Last night I cleaned about 7400 spam from my Outlook spam folder after a perfunctory search for a few keywords ("XML," "content," "Melrose" (my hometown)) and saving a half-dozen or so noncritical emails.
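As a quick check on those ten-hour numbers, here is a minimal sketch in Python that works out the spam share and the Outlook filter's catch rate. Only the three counts come from the paragraph above; the variable names are illustrative, and the sketch ignores the ISP-level filter, whose daily total cannot be compared directly with a ten-hour window.

```python
# Counts reported for the ten-hour window.
received = 195   # total emails that reached Outlook
spam = 134       # of those, how many were spam
trapped = 113    # spam caught by the Outlook spam folder

# Spam that slipped through to the Inbox.
missed = spam - trapped

# Share of delivered mail that was spam, and share of spam caught.
spam_share = spam / received
catch_rate = trapped / spam

print(f"Spam reaching the Inbox: {missed}")
print(f"Spam share of delivered mail: {spam_share:.1%}")
print(f"Outlook filter catch rate: {catch_rate:.1%}")
```

In other words, roughly two out of three delivered messages were spam, and about one in six of those still needed cleaning up by hand.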

Shouldn't there be more of a solution to this problem?

Posted by Bill Trippe at 11:28 AM

August 10, 2006

Wise advice to Amazon

Adobe's Bill McCoy and TeleRead's Michael Banks weigh in on Amazon's new push to have publishers use the bookseller's Mobipocket format.

Posted by Bill Trippe at 10:15 PM

protectedpdf

On behalf of a client, I sat through a demo yesterday of a DRM technology, protectedpdf, from Vitrium Systems. I was impressed. It embeds the client right in the PDF file, eliminating the requirement for a separate plug-in or client download. It also showed an impressive flexibility about the types of business and use models you could implement. For example, one use showed a marketing white paper where you could view the first few pages of the PDF, but then had to enter personal information (name, address, email, etc) in order to view the rest of the white paper. I didn't dig in too much, but I liked what I saw.

Posted by Bill Trippe at 9:18 PM

August 6, 2006

Improving eBook Reading

Jon Udell has a practical suggestion for improving the reading experience with eBooks.

Posted by Bill Trippe at 12:00 PM

July 27, 2006

What is RDF?

Over at XML.com, Joshua Tauberer has updated a very useful article, "What is RDF."

Posted by Bill Trippe at 8:27 PM

July 25, 2006

Sony Breaks Its Silence

Sony has been very quiet about their new eBook reader since an initial spate of publicity. But just today I received an email with a few details (not many really). I am reproducing the email here.

I continue to be underwhelmed by their marketing efforts. I contacted their PR folks after the initial announcements last December, and again a month or two ago. Still no word from them.

=================================================================

PICK A NICE SPOT FOR YOUR LIBRARY.

=================================================================

Thank you for your patience and for requesting updates on the
Sony(R) Reader, coming this fall, in time for the holidays.
It holds about 80 electronic books, is as easy to carry as a slim paperback and thanks to electronic paper, just as easy to read. Just load it up with tons of great electronic books from CONNECT(TM) eBooks, and you'll never read the same way again.

Explore the portable reader here.

-----------------------------------------------------------------

EASY READING

Breakthrough technology provides clarity that's almost paper-like. View from nearly any angle and adjust text size to your preference.

-----------------------------------------------------------------

PERFECTLY PORTABLE

It's lightweight, thin, and holds about 80 books. More with optional memory cards. So take your own mini-library wherever
you go.

-----------------------------------------------------------------

LONG BATTERY LIFE

The rechargeable battery allows you to turn up to 7,500
continuous pages on a single charge (when not providing audio).

-----------------------------------------------------------------

CONNECT EBOOKS

Designed with variety in mind, CONNECT eBooks will have over 10,000 titles online. You'll find many of the latest bestsellers and a deep catalog including more than 15 categories and over 100 subcategories. From mystery to history, sci-fi to self-help
and more, you're sure to find something to fit your taste.

- Sample titles that will be available at launch.

At Risk
by Patricia Cornwell from Putnam.
Number 9 on the New York Times Hardcover Fiction best seller list.*

Freakonomics
by Steven D. Levitt and Stephen J. Dubner from PerfectBound and Harper Collins.
Number 2 on the New York Times Hardcover Nonfiction best seller list.

The Da Vinci Code
by Dan Brown from Anchor and Random House.
Number 7 on the New York Times Paperback Fiction best seller list.

Digging to America
by Anne Tyler from Knopf Publishing.
Number 23 on the New York Times Hardcover Fiction best seller list.


- Categories that will be available.

Biography
Business
Entertainment
Fiction and Literature
Games
Graphics
Health, Mind and Body
Mystery and Thrillers
Nature
Politics and Government
Resources and Reference
Self Help and Improvement
Science Fiction
Technology
Thrillers

Posted by Bill Trippe at 3:22 PM

June 24, 2006

If You Had 20,000 Image Files...

One of my clients is interested in converting 20,000 or so images that are in perpetual use. They get published in very long-living documents that are under continuous review and get republished every few years on average. Currently, the documents are distributed in print and PDF only, so the client has been content to maintain the images as bitmaps--high-resolution TIFFs. This works fine for print, though it is cumbersome for ongoing review and changes, as most of the images are line art.

So now they are thinking about distributing the documents in other formats besides print and PDF. Candidate formats include HTML, various wireless formats, XML, and so on. This has led some of us to think about converting the line-drawing images to SVG. But here is where I pause, despite my interest in SVG. SVG makes a lot of sense--it is standards-based, rich enough for their drawings, convertible to other necessary formats, and displayable directly on many devices. Still, I fret about the lack of overall adoption and momentum. These drawings will be used for years--decades in many cases. Does SVG have those kinds of legs?

Posted by Bill Trippe at 11:49 AM | Comments (1) | TrackBack

More Progress on Digital Publishing Standards

I've discovered a blog, written by Bill McCoy, who is General Manager, ePublishing Business, at Adobe, and therefore keenly interested in the eBook business. He weighs in on some recent announcements from the IDPF.

Posted by Bill Trippe at 11:48 AM

June 9, 2006

'Viper' Bites at Last

Writing for eWeek, Lisa Vaas has an early look at IBM's Viper upgrade to DB2 and its "breakthrough XML handling."

Posted by Bill Trippe at 11:56 PM

June 7, 2006

Publishing to iTunes

Via PaidContent.org, news of PDF Magazine Downloads in iTunes. I have been hearing rumblings about publishing books and magazines to iTunes and, by extension, iPods. Obviously the screen size is an issue right now, but perhaps this suggests some future directions for iPods and other, similar, devices.

Posted by Bill Trippe at 5:32 PM

Does Implementing a CMS Help Search Engine Optimization?

Randy Woods of Toronto-based non-linear creations emails with a very solid white paper about this question. I think a new CMS implementation, done well, naturally lends itself to search engine optimization strategies. The mere fact that you are templating the pages encourages you to normalize markup, and that alone can go a long way toward helping the search engines. The white paper has lots of good detail about markup, navigation, site structure, and other issues, and concludes with an interesting case study.

You can download the white paper here (simple registration required).

Posted by Bill Trippe at 12:18 PM

May 25, 2006

Reading the MadCap Tea Leaves

Over at Palimpsest, Sarah O'Keefe has some interesting speculation about the authoring tool MadCap software is developing. I like her idea for a new MadCap slogan, "Annoying Adobe since 2005."

Sarah's thoughts are a nice counterpoint to what I am saying over at Gilbane about Quark. MadCap is moving in on FrameMaker, an established and successful product that has languished under uninterested management at Adobe. Meanwhile, Adobe moved in on and overtook QuarkXPress, an established and successful product that languished under arrogant management at Quark. Obviously, there is no telling what MadCap's tool will be like--it is only an announcement--but the useful lesson from Quark's loss of market share is that no product is immune from competition.

(Well, maybe Microsoft Word is, but, then again, maybe not.)

Posted by Bill Trippe at 12:43 PM | Comments (1) | TrackBack

May 24, 2006

Quark 7.0 is Out, But Does Anyone Care?

Over at the Gilbane blog, I ask and answer the question, Quark 7.0 is Out, But Does Anyone Care?

UPDATE: Thad McIlroy thinks you should.

Posted by Bill Trippe at 5:52 PM

May 18, 2006

That Google Appliance Again

Count Tony Byrne among the people who are not wowed by the Google appliance.

UPDATE: Neither is Mark Logic CEO Dave Kellogg. He wrote an in-depth entry about it, and wondered if it were worth missing a Dunkin' Donut for. That's a question I ask myself all the time.

Posted by Bill Trippe at 11:51 AM

May 17, 2006

XSLT 2.0 vs. XQuery

Over at IBM's developerWorks, Benoit Marchal has an article, Comparing XSLT 2.0 and XQuery. Quoting briefly from the intro:

Since it was introduced in November 1999, I have found that XSLT, the XSL Transformations language, is one of the most useful (if not the most useful) tools you can use to manipulate XML documents. Many available APIs and tools work with XML documents from Java or other languages, and I have used many of them in different projects, but cannot recall an XML project that did not use at least some XSLT.

It should come as no surprise, then, that I have followed the development of XSLT 2.0 with great interest. XSLT is a powerful language, sophisticated enough to handle even the most complex manipulation, but it is also very verbose and that makes it more difficult to debug and maintain large stylesheets. The W3C hopes to address this, and other problems, when it releases two languages: XSLT 2.0 and XQuery 1.0. This article compares the two upcoming languages and provides some pointers on how best to use them.

One of the great things about the Web, of course, is the abundance of technical information available in the clear and free of charge. I have always liked IBM's sites, in particular, though, because they seem to have the most vendor-neutral and useful content on important, emerging technologies.

Posted by Bill Trippe at 3:08 PM

InfoPath Client Not Needed Going Forward?

One of the email lists I read is the InfoPath group at Yahoo. A question came up about using SharePoint Forms as an alternative to InfoPath, since the current version of InfoPath requires the Windows client be present on each user's desktop. In response, Gray Knowlton, who identified himself as a Senior Product Manager for InfoPath 2007, said the next version of SharePoint will "include InfoPath Forms Services, which will render InfoPath forms to browsers and html-enabled mobile devices, and this will not require InfoPath on the form fillers' desktop, nor will it require any advance download on the part of the person completing the form."

This sounds like good news to me, and significant.

UPDATE: XForms guru Micah Dubinko agrees that it is significant, but also asks a pertinent question.

FURTHER THOUGHT: I wonder what this evolution in InfoPath means for companies like SharePoint Forms, which "provide out-of-the-box web forms for SharePoint... and [allow] organizations to deploy powerful yet simple electronic forms solutions with SharePoint without the need to deploy InfoPath on every desktop." What does their value proposition become?

I opened comments and trackback on this entry in case anyone wants to weigh in.

Posted by Bill Trippe at 10:48 AM | Comments (2) | TrackBack

May 16, 2006

XTech 2006 Week

Alessandro Vernet is reporting from XTech 2006 Week in Amsterdam. He kindly alerted folks to an excellent presentation on XHTML2 and XForms given by Steven Pemberton. Check out the CSS Zen Garden examples.

Posted by Bill Trippe at 8:21 PM

News from AIIM

Doug Henschen from Intelligent Enterprise is reporting from AIIM, where he offers an early look at SharePoint Server 2007 as well as a look at new releases from Ektron and Stellent.

Posted by Bill Trippe at 8:12 PM

April 29, 2006

Microsoft Gets Into eNews Business

Via Dave Winer, I learn that Microsoft is getting into the newspaper facsimile business. This is a space now occupied by folks like Zinio and Newsstand. Zinio and Newsstand have had modest success. Will Microsoft enjoy more simply because they are Microsoft? The reader being built right into Vista helps of course, but the functionality will have to be attractive and useful. I wonder if a demo is available out there...

UPDATE: Jeff Jarvis is unimpressed. I left a comment over there.

UPDATE: Jarvis's post has a number of excellent comments, and led me to a great new (for me) publishing blog, Hammorati. Make that two new blogs, as I missed the personal blog of Rex Hammock.

Posted by Bill Trippe at 12:12 PM

April 13, 2006

Atom

Looking at my logs, I noticed I was getting a lot of 404s on index.atom. I guess the default file naming from MovableType is atom.xml, so I have redirected hits on index.atom to go to atom.xml. I hope this solves the problem. Let me know if you still have trouble with this.
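The redirect described above can be sketched in a few lines. This is only an illustration of the idea, not the actual server configuration used on this site; the `REDIRECTS` table and the `resolve` function are hypothetical names.

```python
# Hypothetical redirect table: the path feed readers were requesting,
# mapped to the file that actually exists.
REDIRECTS = {"/index.atom": "/atom.xml"}


def resolve(path):
    """Return (status, location) for a requested path.

    Paths listed in REDIRECTS get a 301 (permanent redirect) to the
    real file; everything else is served as requested.
    """
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 200, path


print(resolve("/index.atom"))
print(resolve("/atom.xml"))
```

A permanent redirect is assumed here on the theory that the file name is not going to change back; a temporary one would work just as well for stopping the 404s.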

Posted by Bill Trippe at 6:31 PM

Simple Fix for Plugging Firefox Memory Leaks?

SteetTech points to a potentially simple fix for plugging Firefox memory leaks: Cybernet Technology News offers a quick fix that can help with Firefox's annoying memory leakage. "This fix will bump memory usage down to under 10MB every time you minimize Firefox (Windows OS, only). When minimized, it writes Firefox to the hard drive."

I have noticed this memory problem, and find myself killing and restarting Firefox a few times a day.

Posted by Bill Trippe at 3:32 PM

April 3, 2006

OPML File of My Feeds

I have been playing around with OPML a bit. So I spent some time organizing my personal feeds tonight and created an OPML file of them.

Posted by Bill Trippe at 9:57 PM

March 29, 2006

Madcap Flare

I love a company that is not afraid to come up with bold names. Other than that, I don't know the first thing about this product, but it looks pretty cool. Here's an interview with Mike Hamilton, their VP of Product Development. Apparently, it is a re-formation of many of the people who were behind RoboHelp originally.

Posted by Bill Trippe at 5:34 PM

Infopath hit with first virus - Computerworld

I somehow missed this Computerworld article a few weeks ago: Infopath hit with first virus. Apparently, a Trojan horse virus has been detected that targets Microsoft InfoPath.

Posted by Bill Trippe at 12:53 AM

March 28, 2006

Search Terms

Maybe my favorite part of looking at my usage logs is seeing what search terms brought people to my site. Some of them are really obvious, some of them funny, and some of them, well, odd. Here is a recent sampling, with the search term followed by the number of visits in the last couple of weeks:

dita+tutorial 25
ebook+device 16
sony+ebook 12
bill+trippe 10
coco+crisp+nationality 9
breece+pancake 6
publishing+white+papers 6
revenue+per+click 6
billy+packer+idiot 5
t-mobile+early+termination 4
coco+crisp+real+name 3
coco+crisp+pictures 3
swiss+spaghetti+harvest 3
cussing 3
%22block+that+metaphor%22 2
everett+hoagland 2
baseball+food 2
andre+dubus 2
hub+fans+bid+kid+adieu 2
chris+herren+iran 2

Coco is so popular, but I love how quickly the Billy Packer is an idiot theme made it out there.
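For anyone who wants to build a similar list from their own logs, here is a minimal sketch. It assumes you already have a list of referrer URLs and that the search engine puts the query in a `q` parameter; the sample URLs and the `search_terms` function are made up for illustration and are not how my log tool works.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs


def search_terms(referrers, param="q"):
    """Count search terms found in the query strings of referrer URLs."""
    counts = Counter()
    for url in referrers:
        # parse_qs decodes "+" to spaces and percent-escapes like %22.
        query = parse_qs(urlparse(url).query)
        for term in query.get(param, []):
            counts[term.strip().lower()] += 1
    return counts


# Made-up sample referrers.
sample = [
    "http://www.google.com/search?q=dita+tutorial",
    "http://www.google.com/search?q=dita+tutorial",
    "http://www.google.com/search?q=%22block+that+metaphor%22",
    "http://example.com/no-query",
]

tally = search_terms(sample)
for term, visits in tally.most_common():
    print(term, visits)
```

Referrers without the query parameter are simply skipped, so ordinary inbound links do not pollute the count.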

Posted by Bill Trippe at 5:08 PM

March 25, 2006

Adobe InDesign for Single Source Publishing?

I posted the following to the TECHWR-L list today, and thought I would also post here in case any of you have thoughts about this. Please comment or email me if you have ideas.

I have a client that will be using Adobe InDesign for their documentation. I say "will be" because the decision has already been made for a variety of reasons, so the mission now is to figure out how to support them in this effort. A few points:

--It's a small group. One writer and one editor, supported by a graphic designer who will create the templates.
--Their total volume of documentation is probably a few thousand pages a year, much of it updates and custom versions of one key deliverable (user documentation for software that runs as both client-server (Windows and Mac) and Web-based).
--They will be using a workflow solution, WoodWing, that will allow the authoring to be done in InCopy, with the templates being maintained by the designer using InDesign.

The print output is straightforward, but my concern is output to Help without too many manual steps by the small team. From reading the archives, it appears there is no tool that directly supports InDesign-to-Help output, but there are two indirect options:

--Use the XML capabilities in InDesign and InCopy to produce XML output that could then be transformed into Help.
--Produce a PDF file or files that could then be ingested by one of the Help tools (preferences or recommendations here? They use RoboHelp with MS Word now, but are not committed to it). I am assuming this approach means the PDF file needs to have enough hooks, links, etc., in it to make the transformation to Help automatic.

Do I seem to have captured the state of the art? Does anyone have any experience, recommendations, etc? Also, if you are a consultant who does this kind of work, I would be interested in hearing from you offlist.

Many thanks in advance.

Posted by Bill Trippe at 1:16 PM

March 23, 2006

Schematron

If you do a lot with XML-encoded content, you should know about Schematron. Betty Harvey also thinks so, and has launched an email list, the Schematron-love-in.

Posted by Bill Trippe at 8:53 AM

March 22, 2006

EclipseZone - Open source Eclipse/SWT XForms engine released

Writing for EclipseZone, Eric Borraco reports that Nuxeo has released the source code for an open source XForms engine for SWT and Eclipse. The technology will be used in the Apogee project to build rich client applications for collaboration and ECM.

Posted by Bill Trippe at 7:03 PM

Hosted Content Management: DocZone.com

Hosted content management is not new. There are mature and growing vendors like CrownPeak and Clickability, and Atomz, of course, is now part of Web analytics vendor WebSideStory. But those vendors are focused on Web content management. What about a hosted application for a content management application like XML-based multichannel publishing?

I think the conventional wisdom even a couple of years ago was that XML content management, especially for applications like technical documentation, was simply too complex and too variable for a hosted solution. How could one provider efficiently meet the needs of different customers with different DTDs--and perhaps more significantly, different transformations to print and Help and HTML?

But then DITA came along, and with it the promise of a single, extensible DTD (or XML schema if you prefer) that many people could use across various industries. So some clever people have come up with a hosted service for XML content management called DocZone.com. I got a briefing a few weeks ago from Dan Dube, Managing Director for DocZone's US Operations, and came away very impressed. They have chosen a great suite of technology, and their focus on DITA is a very smart bet. And they had a significant announcement this week, landing Dutch automotive company Spyker Cars as a customer.

Posted by Bill Trippe at 1:00 PM

Riya Photo Search

This is either cool or scary.

UPDATE: They've gone Beta, and you can sign up now.

Posted by Bill Trippe at 11:21 AM

March 21, 2006

Good at Hacking Word Files?

My son created this great menu for a school project, and then the Word file imploded. I have tried pretty much every tool I could find out there, and none of them could recover it. Interested in giving it a try? Click here for a zip file, with the Word file included. I have tested it for viruses and it is clean. If you can restore the file to its original state (four photos on the front, about 6 pages long), I will send you a copy of my SVG book or the DRM book that I helped write.

Posted by Bill Trippe at 12:47 PM | Comments (2)

March 13, 2006

XMP

I have been taking a (thus far cursory) look at XMP on behalf of a client, but I note a number of resources seem to be emerging. IDEAlliance is holding an XMP Open Day later this month in New York City. Meanwhile, Adobe has announced a Public Beta of the XMP Toolkit, Version 4.0, Prerelease 1. (It's interesting that the XMP Toolkit URL is at Adobe, which is billed as "Formerly Macromedia.") Meanwhile, a couple of people whose opinion I trust have weighed in on XMP, notably Bob DuCharme, as well as Ed Stevenson and Lisa Bos from Really Strategies.

Posted by Bill Trippe at 8:24 PM

March 11, 2006

More XForms

Preview Release Version 0.4 of Mozilla XForms is out.

Posted by Bill Trippe at 9:18 PM

I Kid You Not

XForms: The Movie!

Posted by Bill Trippe at 9:13 PM

February 27, 2006

Browser Toolbars

I am not a browser wonk, and since switching from Internet Explorer to Firefox in November 2004 because of the IE spyware curse, I have done nothing but keep my Firefox up to date, make my Onfolio tool work with Firefox, and add a single search engine (IBM developerWorks) to Firefox's convenient search box. But one of my clients, ThomasNet.com, has come up with a toolbar to highlight its content, and it looks quite useful. It also strikes me as a good example of how to blend a user's day-to-day habits with a publisher's content. If you are an industrial buyer or otherwise work in manufacturing, you should check it out. If you are a publisher, I would also recommend you take a look and consider how you could merge your content with your users' day-to-day tasks.

Posted by Bill Trippe at 12:04 PM

February 26, 2006

DITA or DocBook?

Since I have been spending a fair bit of my time talking about DITA lately, I often get asked about adopting DITA or DocBook. I should write a few thoughts down (and have in one of the Gilbane Report white papers), but in the meantime you can see what Eliot Kimber has to say.

Posted by Bill Trippe at 10:06 PM

February 21, 2006

More Thoughts on eBook Market

Burt Helm, who does a solid job of covering electronic publishing at Business Week, has a new article on the potential for the eBook market. He discusses the Sony device again, and also speculates on what Apple might be up to. (Which reminds me that I haven't heard from the Sony PR person yet.)

As I have said in a couple of places (here and here), having a good device is one thing, but you also need excellent foolproof sites for marketing the content and supporting the customers. I had a bear of a time with my older son's Napster installation this past weekend. A hiccup in his membership led to several hours of troubleshooting, and eventually led me to reinstall the firmware on his MP3 player. I have to say the Napster tech support was mediocre at best--and this after 20 minutes on hold. The Creative Labs folks (maker of his Zen Micro MP3 player) were excellent. They knew exactly what steps to walk me through, and were very systematic about it. So good device, and good technical support on the device, but the site definitely let him down.

Posted by Bill Trippe at 11:22 PM | Comments (2)

February 16, 2006

Traffic and Trackback Spam

I shouldn't have talked about my site traffic the other day in nearly the same breath I mentioned the problem I was having with trackback spam. It turns out the two things are related, as I am now being bombarded with attempts to post trackback spam. Props again to Brad Choate and his spamlookup tool; it has blocked 826 trackback spams today alone.

Posted by Bill Trippe at 8:45 PM

Vital Source

I am downloading and will be looking at the vitalsource bookshelf, an eBook reader and manager that one of my clients is interested in. So far, I like what I see. The installation went smoothly, and I went through their bookstore and selected a few free titles and a demo title (a seven-day license to the first four chapters of an introductory calculus text).

The downloading has a nice feature where you can immediately open a book as it finishes downloading, even when you have a number of other books in queue.

I have begun looking at some of the eBooks. The reader has a very simple interface (a good thing in my "book"), but I haven't quite grasped the basic navigation ideas yet. The basic rendering looks excellent, though the calculus book is a workbook, so it is hard to judge if the math is dumbed down or if the original typesetting of the book was as simple as the eBook seems to be.

I will be digging a little deeper.

Posted by Bill Trippe at 1:13 PM

Sony eBook Device Again

Over at DRM Watch, Bill Rosenblatt has some thoughts about whether the new Sony eBook Device will have an impact on the moribund eBook market. Bill focuses, naturally, on some of the DRM aspects of the device, and sees their execution on this product as a good test of their new DRM strategy. I agree with Bill, especially since Sony stumbled so badly recently with their music DRM. But I think the success of the eBook device also depends on Sony Connect, which is, well, er, ummm, lame.

Posted by Bill Trippe at 12:37 PM

February 14, 2006

Trackback Spam

Back in the day (oh, six months ago) I was plagued with comment spam. Then I installed a more recent version of MovableType, and comment spam was reduced to a dull roar. Then, last Friday, I started to get bombed with trackback spam. Beginning Friday and continuing through this morning, I deleted hundreds of trackback spams that originated from 72 different IP addresses. I went through the painstaking process of deleting the trackbacks and then adding each IP address to my list of banned IP addresses.
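The manual routine above (find the spam, collect the source addresses, add the new ones to the ban list) could be sketched roughly as follows. This is not how MovableType stores trackbacks; the record layout with `ip` and `is_spam` keys and the `new_bans` function are assumptions made up for the example.

```python
def new_bans(trackbacks, banned):
    """Return the sorted spam source IPs that are not already banned.

    trackbacks: iterable of dicts with hypothetical "ip" and "is_spam" keys.
    banned: iterable of IP strings already on the ban list.
    """
    spam_ips = {tb["ip"] for tb in trackbacks if tb["is_spam"]}
    return sorted(spam_ips - set(banned))


# Made-up records using documentation-only address ranges.
trackbacks = [
    {"ip": "192.0.2.10", "is_spam": True},
    {"ip": "192.0.2.10", "is_spam": True},
    {"ip": "198.51.100.7", "is_spam": True},
    {"ip": "203.0.113.5", "is_spam": False},
]
already_banned = ["198.51.100.7"]

print(new_bans(trackbacks, already_banned))
```

Using a set collapses the hundreds of individual spams down to the distinct addresses, which is the part that was so tedious to do by hand.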

This was really dreary, and I know the problem goes away with a further upgrade to MovableType, but I am swamped. So I did a little poking around and found a tool, spamlookup, written by Brad Choate. Brad was an independent developer, but is now part of the engineering team at MovableType developer Six Apart. I had a little trouble with the installation files, but was bailed out by Brad and my man Paul Crook. Since installing it a few minutes ago, it has already deflected three trackback spams.

My life is now 1% better.

Posted by Bill Trippe at 9:00 AM

February 4, 2006

Revenue-per-Click

I am spending a few days at Disney World at a client meeting that is focusing on issues like Web marketing, analytics, pay-per-click, and organic search. This morning one of the client's customers offered a case study of how he analyzes the relative success of approaches such as Google pay-per-click, other search engines such as Yahoo, and vertical search engines that focus on his business. Not surprisingly, the most general pay-per-clicks do not necessarily yield the best results. The best results often come from the vertical search engines--the more focused the search, the more specific idea the user has about what he or she is searching for, and so forth. Just by using the vertical search engine, the user has already qualified himself to a certain degree.

For this speaker, the right metric is not cost-per-click but rather revenue-per-click. How much real business follows from a given click through to your site? To accurately track this, he has his salespeople always enter a source for a lead into their sales tracking system. Did it come from Google? Another search engine? A referral from an existing customer? This field is mandatory (in fact, they have to enter this field first before they can create or enter the rest of the customer record). This encourages the sales person to get a very specific idea of the source of the lead, which they also find to be an important element in qualifying the customer.

After a couple of years of analyzing this, the speaker has a lot of proof that the highest revenue-per-click comes from the vertical search engines. Moreover, the general search engines like Google tend to produce too many unqualified leads--and these unqualified leads take additional time from the sales people working with more qualified leads. So this speaker is spending less money on Google pay-per-click going forward and will spend more money on the vertical search engines.
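To make the metric concrete, here is a minimal sketch of the calculation, assuming each closed lead is recorded with its source (as the speaker requires of his salespeople) and that click counts per source are available from the ad reports. The numbers and the `revenue_per_click` function are invented for illustration; they are not the speaker's data.

```python
from collections import defaultdict


def revenue_per_click(clicks, leads):
    """Compute revenue per click for each traffic source.

    clicks: dict mapping source name to number of clicks.
    leads: iterable of (source, revenue) pairs from the sales system.
    """
    revenue = defaultdict(float)
    for source, amount in leads:
        revenue[source] += amount
    # Sources with zero clicks are skipped to avoid dividing by zero.
    return {s: revenue[s] / n for s, n in clicks.items() if n > 0}


# Made-up figures.
clicks = {"general search": 1000, "vertical search": 200}
leads = [
    ("general search", 500.0),
    ("vertical search", 400.0),
    ("vertical search", 600.0),
]

rpc = revenue_per_click(clicks, leads)
for source, value in rpc.items():
    print(f"{source}: ${value:.2f} per click")
```

In this invented example the vertical engine brings in fewer clicks but ten times the revenue per click, which is the pattern the speaker described. Cost-per-click alone would not show that.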

It seems this speaker is ahead of the game, and has arrived at a metric that not enough people are thinking about yet. A quick search of Google (!) gives me 43.4 million hits for "pay-per-click," 7.15 million hits for "cost-per-click," but only 12,100 for "revenue-per-click."

It sounds like this is an area ripe for more exploration.

Posted by Bill Trippe at 10:29 AM

February 2, 2006

Seek and Ye Shall Find?

Courtesy of an excellent talk on vertical search engines by Mike Sack of Inceptor comes this nugget from a recent Nielsen/Net Ratings report: Internet users conducted 5.1 billion searches in October 2005. Mike said this works out to about 40 searches per user per month. The 5.1 billion represents a 15% increase over five months prior.
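The figures above hang together with a little arithmetic. The sketch below assumes that "a 15% increase over five months prior" means October's total is 1.15 times the earlier month's total, and it treats "about 40" as exactly 40, so the results are rough; the variable names are mine.

```python
searches_oct_2005 = 5.1e9      # searches reported for October 2005
searches_per_user = 40         # approximate monthly searches per user
growth = 0.15                  # increase over five months prior

# Number of searching users implied by the two figures.
implied_users = searches_oct_2005 / searches_per_user

# Monthly total five months earlier, if October is 15% higher.
prior_searches = searches_oct_2005 / (1 + growth)

print(f"Implied users: {implied_users / 1e6:.1f} million")
print(f"Searches five months prior: {prior_searches / 1e9:.2f} billion")
```

So the report implies something on the order of 127.5 million people searching, and roughly 4.43 billion searches a month five months earlier.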

Mike highlighted the continued impressive growth in search engine advertising spending, and said continued strong growth will come from areas such as vertical search, local search, and spending by smaller- and medium-sized companies. Just as broadcast and print advertising spans the biggest markets and audiences (ads for the Super Bowl), it also spans the smallest markets (local stations and even local cable). Mike predicts a similar stratification for search engine advertising, with vertical search engines filling many of the needs.

Posted by Bill Trippe at 11:55 AM

January 31, 2006

The Expanding World of XML Authoring

I have a new article in Intelligent Enterprise, XML Content Authoring For the Rest of Us. To quote from the intro:

XML authoring has long been viewed as difficult and arcane, and best left to specialists using complex thick-client software. Indeed, in some markets and applications, such as developing technical documentation for aircraft or automobiles, today's preferred XML tools look and act much like the SGML authoring tools of 1992. The same products, including Adobe FrameMaker, Arbortext Epic and Blast Radius XMetaL, still dominate.

But the world is changing, with browser-based apps, XML-enabled eForms, and the XML capabilities in Microsoft Word. The article provides a brief survey of some of these changes.

Posted by Bill Trippe at 12:27 PM | Comments (1)

January 23, 2006

Ping Recommendations

I am having a problem with pinging other people's entries when I want to. I get the following error (by example):

Ping 'http://gilbane.com/blog/mt-tb.cgi/173' failed: HTTP error: 403 Throttled

So I poked around a little, and I discovered there are some undocumented settings in mt.cfg, OneHourMaxPings and OneDayMaxPings, but I can't find specific syntax for editing the mt.cfg file and I am loath to make a change without the specific settings. Any help out there?

Also, I wonder if I am ping happy. For each post, I ping the following:

blo.gs
weblogs.com
technorati.com
http://api.my.yahoo.com/RPC2
http://rpc.pingomatic.com/
http://ping.weblogalot.com/rpc.php

Maybe I can cut back on these? Any suggestions would be welcome. Feel free to email me if you don't want to post a comment.

Thanks,

Bill

Posted by Bill Trippe at 8:35 AM

January 18, 2006

Web Marketing for Manufacturers

I will be attending this event tomorrow as part of my ongoing research into the topic of how much manufacturers rely on their Web sites--and really the content on their Web sites--to drive sales and marketing. I have written about this in the past for the Gilbane Report.

UPDATE: I am live blogging the event over at the Gilbane Blog.

Posted by Bill Trippe at 8:19 PM

January 16, 2006

That Sony eBook Device

I mentioned some speculation about a new Sony eBook reader that was announced at the Consumer Electronics Show earlier this month. There are some details out on the Sony Web site, but apparently it will not be available until Spring. I sent a note to the Sony PR person to inquire about a review copy.

Posted by Bill Trippe at 10:07 PM

A British Take on eBooks

You can find some very bullish views about eBooks in this article.

Posted by Bill Trippe at 9:56 PM

January 7, 2006

The World Wide Web

Do you remember the first time you saw the World Wide Web through a browser? I do. I am a little hazy on the date, but it was thanks to my friend Bill Stewart, who was then an Associate Dean at Boston University's College of Engineering. He had helped one of the professors set up a lab, and invited me by at lunch one day. He sat me in front of a browser and showed me a few sites. It seems to me the University of Hawaii was one of the sites, along with a couple of other academic and scientific ones.

I was tickled, and knew I was looking at a Great New Thing. The Internet was not new to me. I had worked at Mitre in the early 1980s, and we had Usenet and email access as early as I can remember. But the Web, of course, was graphical. Here were photographs and variable fonts, colors and backgrounds. I was looking at recent pictures from geological experiments thousands of miles away and reading things that had been posted in the last few days, hours, and minutes. Attractive, low-cost publishing that reaches users around the world, instantaneously.

Sometimes I still marvel at this basic truth about the World Wide Web--this instantaneous reach around the globe. Reading my site activity logs for the last week, I see that I have had visitors from almost 50 different countries. The geographical span starts out as you might expect--the United States far ahead of any other country, and then a few hundred visitors from the UK. There are a few dozen each from Germany, Canada, the Netherlands, Australia, Italy, and the Czech Republic. And then there is a long tail, and that is the piece that fascinates me--Iceland, Russia, Latvia, Morocco, Samoa, Singapore, Pakistan, and Iran. A warm hello to all of you out there.

Posted by Bill Trippe at 2:03 PM

December 30, 2005

The iPod of eBook Readers?

Burt Helm, who covers digital publishing for Business Week, has a new article speculating on a new eBook device from Sony. Sony hasn't said much about it yet, but details will be announced at the Consumer Electronics Show on January 4. Helm is reporting that Random House, Harper Collins, and Simon & Schuster will be offering content on the new device.

And guess what the portal will be for the new content? Sony Connect, which I am concluding is, well, not very good.

Posted by Bill Trippe at 4:27 PM

December 26, 2005

Medical Publishing and XML

Medical publishing was an early adopter of SGML and has naturally progressed to using XML for content development and repurposing. MarkLogic, with its Content Server XML repository, has had a laser-like focus on the medical publishing business, winning business from industry giants Elsevier and Wolters Kluwer, among others. They have now added The New England Journal of Medicine to their list of customers.

Posted by Bill Trippe at 10:27 AM

December 25, 2005

A Brief History of Podcasting

Courtesy of Dave Winer, Christopher Lydon provides a great story of how he got started on podcasting (and, in turn, how podcasting got started). Lydon also mentions one of my favorite people in the business, Bob Doyle.

Posted by Bill Trippe at 7:35 PM

December 19, 2005

XSLT Pays

I happened to learn about this from a posting on the DC-XMLUsers mailing list, but if you live in the DC area and are a highly experienced XSLT programmer, the government has a good job for you.

Posted by Bill Trippe at 7:50 PM

December 14, 2005

Phone Content

One area where SVG certainly has legs is in phone technology. Check out this article, which is on a Web site, phonecontent.com. Now there is a URL you might not have envisioned a few years ago.

Posted by Bill Trippe at 10:29 PM

December 12, 2005

Content Audit Templates

I am recommending a client undertake a content audit, and I wonder if there are some standard templates and tools out there. Please post in the comments here or email me if you have some ideas.

Posted by Bill Trippe at 11:59 AM | Comments (2)

December 5, 2005

Cool Name

And perhaps even a cooler idea. I heard about this last week, and haven't tried it yet, but might come up with something.

Posted by Bill Trippe at 4:46 PM

November 26, 2005

The Amazon.com Concordance Feature

I really like Amazon.com's concordance feature, which is better explained here. Here is the concordance for the DRM book that I helped write. It reads like every conference call I have had for the past 10 years. Better yet, here is the concordance for Joyce's Ulysses.

Posted by Bill Trippe at 9:37 PM | Comments (1)

November 24, 2005

Another XForms Book

Updating my eForms Resources page, I noted there is a second book out on XForms. I haven't read it yet, so I can't compare it to Micah Dubinko's book, which is excellent.

Posted by Bill Trippe at 9:33 AM

November 22, 2005

XML 2005

I didn't attend XML 2005, but Lisa Bos of Really Strategies did, and she has a roundup of some interesting announcements and some thoughts about Documents 2.0.

Posted by Bill Trippe at 8:19 AM

November 1, 2005

Another take on XML eForms

Writing for PDFzone, Don Fluckinger voices some skepticism on all the buzz about XML-based eForms.

Posted by Bill Trippe at 10:21 PM

October 6, 2005

Ajax

I spend a good chunk of any business day talking to people about technology. Every couple of years something new comes up and I suddenly find the topic being mentioned once in every conversation. Ajax is reaching that point. I have a brief blog entry on it over at Gilbane.com, and Jordan Frank has a very good article about it on XML.com.

Posted by Bill Trippe at 9:18 PM

September 20, 2005

The Complete New Yorker

This looks like a lot of fun--every page of The New Yorker ever printed, in a set of eight DVDs. It's called The Complete New Yorker, and Amazon.com has it for $63.00, well below the SRP of $100. I have to agree with the marketing tag line: A cultural monument, a journalistic gold mine, an essential research tool, an amazing time machine.

This is also a technology story of course. The back issues of the great American magazines are a treasure trove of material, and the question of how to get high fidelity versions of the back issues out to a wide reading audience has continued to present practical problems. PDF is a great format of course, but it can be bulky. The New Yorker project uses a page rendering technology from LizardTech. Document Express with DjVu provides readers with a high-fidelity page in a much more compressed format.

I emailed LizardTech some questions in response to this press release. I will let you know what I learn.

There is a nice flash demo here.

Justyna Bednarski of LizardTech wrote back with the following stats.

How many pages in total? The New Yorker had 500,000 pages scanned into TIFF-formatted files totaling 15 terabytes of document images, then used LizardTech's Document Express with DjVu Enterprise Edition to convert the TIFF files into the open source DjVu electronic document format, where they measured a mere 300th of the original scanned size.

Total stored data in GB? 15 terabytes prior to compressing into DjVu.
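The figures above imply some per-page numbers that are easy to work out. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 1,000 GB) and treats the quoted "300th" as an exact 300:1 ratio, neither of which LizardTech stated.

```python
# Rough arithmetic on the quoted figures; decimal units are an assumption.
TOTAL_TIFF_TB = 15          # scanned TIFF images, as quoted
PAGES = 500_000             # pages scanned, as quoted
COMPRESSION_RATIO = 300     # "a mere 300th of the original scanned size"

tiff_mb_per_page = TOTAL_TIFF_TB * 1_000_000 / PAGES
djvu_total_gb = TOTAL_TIFF_TB * 1_000 / COMPRESSION_RATIO
djvu_kb_per_page = tiff_mb_per_page * 1_000 / COMPRESSION_RATIO

print(f"TIFF per page: {tiff_mb_per_page:.0f} MB")
print(f"DjVu total:    {djvu_total_gb:.0f} GB")
print(f"DjVu per page: {djvu_kb_per_page:.0f} KB")
```

Under those assumptions, each page drops from roughly 30 MB as TIFF to roughly 100 KB as DjVu, and the whole archive comes to about 50 GB.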

I also asked how big the same collection would have been in PDF, but Bednarski didn't want to speculate on that. Bednarski added, "We cannot assert the size in PDF, as nobody wanted to try it because they didn't think it would be practical. The New Yorker didn't think it was practical as they say in the case study."

The case study can be found in PDF format here. You can also view it in DjVu format here (requires installation of the DjVu plugin, which can be found here).

Browsing The New Yorker web site, I also learned that The Complete New Yorker is a live project. According to the site, "Every year, The New Yorker will offer an updated Disk 1, which will include an additional year of issues, an updated index, and other enhancements."

Posted by Bill Trippe at 8:57 AM | Comments (4)

September 14, 2005

DITA and FrameMaker 7.2

I have a pretty lengthy entry over at the Gilbane Report blog on the new release of FrameMaker and its support for DITA.

Posted by Bill Trippe at 11:55 AM

September 12, 2005

eForms Resources

Well, at some point this past Spring, between my laptop dying and a hard disk frying, I lost a few versions of my eForms Resources. I republished it today, with some links updated and cleaned up, but I definitely lost some data along the way. If you want the RSS feed, you can find it here.

Posted by Bill Trippe at 9:49 PM

September 9, 2005

Roundup of DRM Technology

Writing for Information Today, Robert Smallwood has a nice roundup of DRM technology and its role in ECM and Enterprise Rights Management (ERM).

Posted by Bill Trippe at 4:03 PM

September 7, 2005

Newsfeed Reader Recommendations?

I am outfitting a couple of computers and wanted to put a newsfeed reader on them. Any recommendations? I run Windows XP.

Posted by Bill Trippe at 3:41 PM

December 30, 2004

2004 Year in Review: DRM Technologies

Bill Rosenblatt has a nice roundup of DRM technology news from this year at DRM Watch. Bill makes some very good points for enterprise DRM going forward:

Two things need to occur before we would consider Enterprise DRM to be ready to cross the chasm: large system integrators must build significant practices in information usage compliance that include DRM, and DRM needs to be integrated into larger enterprise solutions such as content management. We see evidence that the former is starting to happen at certain large consultancies. As for the latter, many industry watchers are waiting for the major enterprise document management vendors -- IBM, EMC (owners of Documentum), OpenText, and FileNet -- to make Enterprise DRM acquisitions and integrate them with their content management solutions.

Posted by Bill Trippe at 12:11 PM

December 29, 2004

ftp, secure shell

Can folks suggest a tool or tools that would help me (1) do ftp back and forth to a couple of web sites and (2) run a secure shell to my primary web site (which runs Linux)? I would want to run this on a couple of PCs, one of which is currently running Windows 2000, and another which is running XP.

I have been playing with CuteFTP for a couple of days, and like it. On the shell side of things, I have used something called AbsoluteTelnet. Both of these are still on their trial period, and seem ok, but I am open to suggestions.

And, silly question perhaps, but why isn't this sort of thing simply built into Windows? I was trying to run FTP on my XP machine through a DOS box, and I kept getting bad connections, etc.

Thanks,

Bill

Posted by Bill Trippe at 3:34 PM | Comments (4)

December 22, 2004

Internet Explorer, RIP?

If I am reading my web logs correctly, readers visiting my site use Internet Explorer less than 1% of the time. Mozilla is used about 50% of the time, bots seem to take another 12-15%, and a long list of other browsers and newsreaders account for the rest.

Is this possible?
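One thing worth checking before trusting numbers like these: Internet Explorer identifies itself with a user-agent string that begins with "Mozilla/4.0 (compatible; MSIE ...", so a tally that keys on the leading token will count IE visitors as Mozilla. Whether that is what happened with these particular logs is unknown. The sketch below uses three illustrative user-agent strings and two hypothetical classifier functions to show how far apart the counts can end up.

```python
from collections import Counter

# Illustrative user-agent strings; real logs will contain many more variants.
SAMPLE_AGENTS = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0",
    "Googlebot/2.1 (+http://www.google.com/bot.html)",
]

def naive_family(agent):
    """Classify by the leading token only, as a simple log report might."""
    return "Mozilla" if agent.startswith("Mozilla/") else "other"

def careful_family(agent):
    """Look inside the string before deciding (hypothetical rules, not exhaustive)."""
    lowered = agent.lower()
    if "bot" in lowered or "spider" in lowered or "crawler" in lowered:
        return "bot"
    if "msie" in lowered:
        return "Internet Explorer"
    if "gecko" in lowered:
        return "Mozilla/Firefox"
    return "other"

naive_counts = Counter(naive_family(a) for a in SAMPLE_AGENTS)
careful_counts = Counter(careful_family(a) for a in SAMPLE_AGENTS)

print("naive:  ", dict(naive_counts))
print("careful:", dict(careful_counts))
```

With the naive rule, the IE visitor lands in the "Mozilla" bucket; the more careful rule separates the three.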

Posted by Bill Trippe at 10:18 AM | Comments (5)

December 13, 2004

Is QuarkXPress Giving Way to InDesign?

I have an article in the new Seybold Report that asks and attempts to answer this question. At this writing, the article is available to subscribers only, but if they put it on the free portion of the Web site, I will let you know. The following is from the introduction to the article.

A tip of the hat to friend and colleague Kate Binder of Prospect Hill Publishing Services who had some great ideas for the article and offered the best quotes.

Since the early 1990s, QuarkXPress has been the leading desktop publishing tool. Many products have tried but failed to knock QuarkXPress from its perch over the years. Some of us are even old enough to remember one-time products such as Manhattan Graphics' ReadySetGo!, and many industry followers rooted in vain for challengers such as Aldus PageMaker (the product eventually acquired by Adobe).

Indeed, despite the overwhelming leadership of its flagship product, Quark Inc. as a company seemed determined to breathe life into its competitors by infuriating its customer base with half-hearted customer support and onerous licensing terms. Year after year, however, QuarkXPress maintained its dominant market position.

In this desktop publishing war, all eyes have been on Adobe since it introduced InDesign in 1999. Publishers and creative professionals have watched the development of Adobe InDesign closely, and many of them evaluated the earliest releases. While a critical mass of new users was not ready to switch to InDesign 1.0 and 2.0 releases, users were clearly tuned into the emerging product and returned to evaluate it with each new release.

Posted by Bill Trippe at 10:32 PM | Comments (1)

December 8, 2004

Another Shoe Drops

So Oracle has made their formal entree into the content management space with Tsunami. I hope to get an in-depth briefing next week, after Oracle Open World, which has them all busy this week. I will let you know what I think.

Posted by Bill Trippe at 4:56 PM

December 7, 2004

Datawatch

My friend and colleague Phil Storey, who has been a frequent speaker at Gilbane Report events, has landed at software vendor Datawatch. Datawatch has a number of business intelligence, report management, and data transformation tools, and has been moving steadily into the XML and content management arena. I like a number of their offerings, including VorteXML, which converts structured and unstructured content and data into XML. The Datawatch products are especially good at working with unstructured and semi-structured input such as what you get from print streams, legacy applications, and the like. If you have these kinds of requirements, they are worth checking out.

Posted by Bill Trippe at 4:20 PM

December 6, 2004

XMPie

Anyone have experience with XMPie or competitive products? Please feel free to post here or email me.

Posted by Bill Trippe at 8:38 PM | Comments (1)

December 4, 2004

Catalog Data Online

Any metrics out there on how much it costs companies to put their catalog data online as a step toward developing e-commerce? I am starting to work with a company that seems to have a surprisingly low-cost way of doing this.

Please feel free to post here or email me.

Thanks,

Bill

Posted by Bill Trippe at 1:51 PM

December 2, 2004

Interesting New Approach

Data Conversion Labs (DCL) has announced a new service, Harmonizer, which analyzes existing content sets for redundancy. As they describe the offering in their web site:

[The] Harmonizer™ content reuse service from DCL measures and eliminates redundant data from document sets. This unique system checks for duplicate content (and "near duplicates") and weeds out what you don't need; it also harmonizes text and grammar variations to fit your standard. In short, Harmonizer™ clears the clutter in order to reduce costs, improve accuracy, and speed up turn-around.

In announcing the service, DCL revealed some research they had done into some content sets, with an 83.1% level of redundancy in one aerospace company's maintenance manuals and 68.3% in a pharmaceutical firm's product data. DCL President Mark Gross said, "They are recreating text that has already been written - and are paying for the privilege!"
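To make the idea of duplicates and "near duplicates" concrete, here is a minimal sketch using Python's standard difflib module. It is not DCL's method, which the announcement does not describe; the sample passages, the function names, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a, b):
    """Return a 0..1 similarity score, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_redundant(passages, threshold=0.8):
    """List (i, j, score) for every pair of passages at or above the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(passages), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((i, j, round(score, 2)))
    return pairs

# Invented sample passages in the style of a maintenance manual.
passages = [
    "Remove the access panel before servicing the pump.",
    "Remove the access panel before servicing the pump.",
    "Remove the access panel prior to servicing the pump.",
    "Torque the retaining bolts to the specified value.",
]

redundant_pairs = find_redundant(passages)
for i, j, score in redundant_pairs:
    print(f"passage {i} and passage {j}: similarity {score}")
```

Pairwise comparison like this grows quadratically with the number of passages, so a real content set would need something smarter, but it shows why an exact-match check alone would miss the "prior to" variant.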

Well said! And real food for thought.

Posted by Bill Trippe at 9:30 PM

November 24, 2004

The More Things Change

According to Google Zeitgeist, "perl programming" was the second-most popular technology search in October 2004.

Posted by Bill Trippe at 10:47 AM

November 18, 2004

I Like a Company ...

... That's not afraid to choose a cute name. I always thought Stratify, while a fine company, was much better as Purple Yogi. But nobody asked me.

UPDATE: And then a Webinar invitation from this company lands in my inbox today.

Posted by Bill Trippe at 7:34 PM

November 15, 2004

Acrobat 7.0 Announced

Adobe announced Acrobat 7.0 today. One of the interesting things about Acrobat 7.0 is that the forms designer product, Adobe Designer, is now part of Acrobat Professional. I will be getting a copy of Acrobat 7.0 Professional at the end of December and will let you know what I think.

UPDATE: One of my favorite speakers, Chuck Myers, Technology Strategist in the ePaper Solutions Group at Adobe, will be demonstrating some of the new Acrobat 7.0 eForms features as part of my eForms panel at the Gilbane Conference.

Posted by Bill Trippe at 9:24 PM

November 12, 2004

Firefox

I have whined here in the past about my battles with spyware. This led me to switch to a Netscape browser on my home machine, but I have continued to run IE on my (Windows XP) notebook. Yesterday I was having sporadic problems reaching Google, and after a little analysis, decided that my browser had been hijacked again. So I took a colleague's advice and installed Firefox.

Lo and behold, I still had trouble getting to Google, but, damn, I like Firefox. (Not sure what the network problem was; it affected the three machines on my home network, and was resolved overnight.) I have not done any blow-by-blow analysis yet of Firefox vs. IE, but Firefox seems much more responsive. The application starts much more quickly, and I can add new windows (Ctrl+N) immediately. (This is something I do all the time when I am researching and writing. If I find something of interest, I leave the window open and start a new one. Another nice feature of Firefox is that the new window reverts to my start page and not the page currently displayed, which is also very useful.) The built-in popup blocking is quite good, and I really like the download manager.

So I am going to stick with it. I have noticed a couple of small things already. I can't run my Onfolio tool from within Firefox (maybe I need to reinstall it? I will check), and it looks like some favorite pages work a little differently. But I may have myself a new browser.

UPDATE: I found the answer to my Onfolio question. They have a new beta coming out that will support Firefox.

Posted by Bill Trippe at 11:32 PM | Comments (4)

November 4, 2004

Upgrade XMLSpy?

I am not a power user of XMLSpy, by any means, but I do several projects a year with it and have had very good success. I am still running XMLSpy Enterprise Edition, version 2004, release 3, but they have come up with a new release. Are people going with the upgrade? If so, why? If not, why not?

Thanks,

Bill

Posted by Bill Trippe at 11:33 AM

October 26, 2004

If You Were Building Your Own Workflow...

And you were mainly a Microsoft shop, would you somehow leverage Microsoft Exchange Server and Microsoft Project to do it?

I was involved in a project like this several years ago. This describes what my client did at a high level.

This is how it is described for Outlook and Project 2003.

Any thoughts out there? This would be a group of 30 users with a fairly traditional publishing workflow, editorial through production.

And, if you wouldn't build it, would you buy something?

Bill

Posted by Bill Trippe at 12:58 PM

October 12, 2004

Any thoughts on...

...DITA versus DocBook? I am just starting to look into this on behalf of a client.

Any DITA advocates out there who can make the case? Any DocBook users who are switching to DITA?

Posted by Bill Trippe at 8:29 AM

September 5, 2004

Privilege Management & Rights Management for Corporate Portals

Frank Gilbane has quietly made back issues of The Gilbane Report available to the public. If you go to the back issues page, you can see which issues are available in full text, and which only allow you to see the PDF introduction to the article.

I have a number of past issues I really like, and will highlight a few here over the next couple of months.

I really like an article I helped contributor Larry Gussin and former associate editor David Guenette write about security, digital rights management, and portals. I think it remains one of the few articles that addresses how DRM fits into the larger enterprise security picture.

An excerpt of the article follows, and the full text can be found here.

From, Privilege Management & Rights Management for Corporate Portals

With the quickly growing demand for intranet-based enterprise information systems, as well as for extranet extensions, the enterprise information portal (EIP) is becoming the primary emerging solution to the problem of intelligent user access.

Enterprise information portals extend Web content management (CM) solutions by delivering both enterprise and commercial content and core enterprise and industry information through a single, unified, and usually browser-based interface. An EIP may present Web sites, documents, databases, email, and other information types from multiple servers, and allow users to access this information through its portal server. The key EIP goal is to provide more efficient access to business-critical information for employees, customers, suppliers, and business partners.

With content management and portal technologies emerging as a new, robust framework for enterprise and extranet information, the traditional enterprise security solutions, which are predicated on online network sessions and on providing document level access, may no longer be adequate or efficiently manageable. IT managers should wonder, for example, how these firewall-based solutions will be able to support the potentially huge emerging requirements for extranet, offline, and more granular access to information.

Equally important is the question of how information access security can be managed. If the rise of EIPs reflects the need to address the growing number of information resources found within enterprises, these information resources still require security decisions from their business line managers. With the numbers and types of users of these information resources also growing in number, as well as being potentially tied to multiple locations and access relationships, the information access management challenges become even more daunting.

With all this complexity, enterprises must address important infrastructure requirements before they can enjoy the benefits of extending enterprise information internally among their business units and departments, and externally among their business participants. Two of these requirements address questions of how enterprise managers can ensure that:

--Users effectively access the information they need.
--Business rules govern how and by whom information is used.

Two distinct solution categories exist that can address some part of the extended enterprise's need for information and content security control: privilege management and digital rights management. The solutions available today are still caught up in their cultures of origin, but the real-world needs of enterprises may be answered by the right combination of these solutions. Such a combination of approaches would effectively manage both online and offline access to content, and provide a persistent protection and control of information throughout its lifecycle.

Posted by Bill Trippe at 3:16 PM

September 3, 2004

Is InDesign Gaining Traction?

For an upcoming Seybold Report article, I am looking at InDesign and where it seems to be gaining traction against QuarkXPress. This was definitely a theme at the Seybold conference, where I spoke to several large book and magazine publishers who are in the middle of making the switch.

I will be interviewing some folks from Adobe next week, and have put together the following list of questions so far. Any others you would like to see asked?

GENERAL

First, what is it about the CS release of InDesign that has convinced companies that this is the version to trust for production? Is this a matter of CS having the right feature list? Stability? Performance? Platform support? Integration with other tools (InCopy, Illustrator, Photoshop)?

FEATURES

Second, how does InDesign compare with QuarkXPress in terms of core composition and pagination features? Is it fair to say that InDesign CS is competitive with QuarkXPress on a feature-by-feature basis? If I were to create a matrix of composition and pagination features (or examine ones from Adobe and Quark), how would the two products stack up? Where does Quark still lead the way? Where does InDesign lead the way?

XML

Coming at InDesign more from the editorial side, two things seem to be attractive about it: support for XML and ability to integrate InDesign in a workflow where text needs to be "roundtripped" through a lot of editorial iterations. Can you comment on these things? Specifically:

-- How does InDesign support XML? Does it maintain XML throughout the process? If so, does it handle any XML schema? Only a single one? Same questions for InCopy.

-- What are some of the workflows involving XML? Are customers using XML in the editorial process and then publishing through InDesign where InDesign is kind of a black box? Are they using XML in the editorial process and then publishing through InDesign in a more iterative process where there is a lot of export in and out of InDesign back to XML?

EDITORIAL WORKFLOW--INDESIGN AND INCOPY

-- What about InDesign and InCopy? Precisely how do the products support iterative design and editorial work where both tools are used? What underlying data structure is maintained for the text and other elements while all of this editorial work is going on?

-- What about the combination of InDesign and InCopy with third-party content management platforms, such as those from Managing Editor? Do some of these questions of workflow and XML maintenance and support get answered by the third-party tools?

INDESIGN AND MULTICHANNEL PUBLISHING

-- The above questions go to the point of InDesign/InCopy as a "hub" for multichannel publishing. Publishers who have iterative and design-centric workflows have been "locked in" to tools such as QuarkXPress, where the "master" version of the content is locked into a complex, design-heavy, and proprietary format. In such a workflow, only the print can be most efficiently done, and the other formats--HTML, wireless, syndication format--lag in the process. For some types of publishers, these other formats have proven to be expensive and cumbersome to produce, even as they become increasingly important to the business (or, worse, not! where they are "must have" additional formats that do not necessarily bring additional revenue).

(long windup to the question...)

-- So does InDesign solve this problem? Can it be a better "hub" for multichannel publishing? Why?

PLATFORM SUPPORT, INTEGRATION

-- What about platform support? Mac vs Windows? What impact is this having?

-- What about integration with the rest of the Adobe creative suite? Does this differ materially from what people can do with Quark?

-- What about the programmability of InDesign? I hear from developers that InDesign has better support for programmers who want to automate steps in the workflow. I even heard at one point that InDesign's APIs are designed in a modular fashion, allowing developers to address individual elements of the InDesign functionality. Is this true? In general, how does the programmability of InDesign compare with QuarkXPress?

PERFORMANCE, PRODUCTIVITY

-- What about performance, support for humongous files, creating PostScript/PDF, other areas that heavy production users would worry about?

OTHERS?

Posted by Bill Trippe at 8:37 PM | Comments (1)

August 25, 2004

Web Services Technology Companies

I have been giving some thought to what companies sell technology to support Web Services, or, more broadly, Service Oriented Architectures. I have put together kind of a strawman taxonomy of the vendors, and have begun discussing it with some of the vendors. This is still preliminary, and does not reflect any of the feedback I have received thus far.

One consistent bit of feedback I have received thus far is that I should differentiate more between "platforms" and "tools." I will give this more thought.

Any thoughts out there?

"Major Platform Players"

> IBM
> Microsoft
> Sun
> Oracle
> BEA
> Hewlett Packard
> Computer Associates

"Traditional EAI (Enterprise Application Integration) Vendors" who are now Web Services focused

> SeeBeyond
> Tibco
> Vitria
> webMethods

"Web Services Pure Play Vendors"

> Intalio
> Lombardo
> Ultimus
> Savvion

"Web Services to Go" or "Web Services Networks"

> StrikeIron
> GrandCentral
> Infravio
> Actional
> AmberPoint

Posted by Bill Trippe at 6:04 PM

August 18, 2004

Traction for MathML?

Seybold does not have a concentration of XML vendors, but one area where there is quite a bit of activity is in math typesetting and conversion. MathML seems to be gaining ground, and one company, Design Science, really seems to be taking over. As some of you know, they provide the math editor in Microsoft Word, MathType, and also supply the math editor in ArborText Epic and XMetaL. They have announced, but are not yet shipping, MathType editors for Quark and InDesign.

One cool thing they provide is a free MathML "player" for Internet Explorer.

It displays MathML in the browser, and it also has an accessibility feature that "speaks" the equation. The text to speech will not blow you away, but it is a first version which was partially funded by the National Science Foundation, and they are still working on it. The text-to-speech work in the electronic version of the American Heritage Dictionary, as one example, is much better, but it's interesting to see it done with mathematics.

Design Science also has a new product offering, MathFlow, which manages the workflow and conversion of math from Word to desktop publishing formats.

Posted by Bill Trippe at 11:22 AM

August 10, 2004

Is Word Ready for XML Primetime?

I have a recent article in Transform magazine that looks at Word 2003 and how many organizations are using its XML capabilities. To quote briefly from the article:

With the release of Office 2003 last year, much was made of the XML capabilities built into Word, the ubiquitous word processing tool from Microsoft. Word 2003 supports XML in two ways: it lets users save documents to a Microsoft-specific XML format called WordProcessingML, and also enables users to structure documents according to valid XML schema.

Ten months after its release, Word 2003 is not yet commonly used for XML authoring on its own, but it has given rise to a cottage industry of XML editing add-ons.

Posted by Bill Trippe at 4:18 PM

August 8, 2004

DAM Consolidation Continues

Artesia has been acquired by OpenText, continuing the trend over the past two years for DAM vendors to be acquired by vendors of broader ECM platforms. This leaves NorthPlains as perhaps the last of the standalone "enterprise DAM" vendors.

Posted by Bill Trippe at 11:10 AM

July 8, 2004

More on Authoring XML in Word

My colleague Lisa Bos weighed in on the use of Microsoft Word in structured editing applications. Lisa is Vice President and Chief Architect at content management consulting firm Really Strategies and has a wealth of experience in XML implementations. Taking a step back from the specific question of Microsoft Word, Lisa offered the following useful overview of the kinds of XML editing tools one can consider. She puts a special emphasis on the integration of editorial tools with a content management system.

There are several broad categories of choices for XML editing:

1 - Commercial XML editors intended for use by editorial and production users. The major players in this category are XMetaL and Epic. These are also the only editors that have an integration with Documentum that allows XML documents to be appropriately processed (chunked) on check in and check out. (While Documentum allows the use of Word as an editor, special customizations would be needed to integrate Word as an XML editor.)

2 - Commercial or shareware/freeware XML editors mostly intended for use by software developers but also an option for editorial teams who can't afford more or who have very simple content. Products in this space range from XML Spy (expensive developer's tool) to a wide variety of free/inexpensive tools that work like glorified text editors (I don't mean to disparage these - their XML capabilities can be really great - but the view to the user is "geeky," not the word processing view available in XMetaL or Epic).

3 - Non-XML editors in combination with custom scripting/programs to convert content to and from XML. Even though it does have some new XML capabilities, MS Word still falls into this category. Whether you use its XML capabilities or not, you must still write scripts to transform your content to the needed DTD after you close a document in Word, and to transform the content into a form that Word can use when you open a document. This can work extremely well for content like news that does not have a high level of nesting in its DTD, but is expensive to maintain for complex content (it's difficult to script for all the myriad of things people can do to content in Word). For complex content, it ultimately requires more custom programming and maintenance to make Word into an XML editor than it does to add some nice word processing features to a native XML editor. Using Word or another word processor also has other negative side effects. In particular, it results in your editing and production staff not really understanding XML or your DTD.
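The point about nesting in the third category can be illustrated with a small sketch. A word processor stores a flat run of styled paragraphs, so a conversion script has to infer the hierarchy a DTD expects. The style names, the element names (article, section, title, para), and the function below are hypothetical and not taken from any particular DTD or from Word's own XML format.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from word-processor style names to heading depth.
HEADING_LEVELS = {"Heading 1": 1, "Heading 2": 2}

def flat_to_nested(paragraphs):
    """Turn a flat list of (style, text) pairs into nested <section> elements."""
    root = ET.Element("article")
    stack = [(0, root)]  # (heading level, element) pairs for the open sections
    for style, text in paragraphs:
        level = HEADING_LEVELS.get(style)
        if level is None:
            # Body text goes into the innermost open section.
            ET.SubElement(stack[-1][1], "para").text = text
            continue
        # Close any sections at the same or a deeper level before opening a new one.
        while stack[-1][0] >= level:
            stack.pop()
        section = ET.SubElement(stack[-1][1], "section")
        ET.SubElement(section, "title").text = text
        stack.append((level, section))
    return root

flat_document = [
    ("Heading 1", "Overview"),
    ("Normal", "First paragraph."),
    ("Heading 2", "Details"),
    ("Normal", "Nested paragraph."),
    ("Heading 1", "Summary"),
    ("Normal", "Closing paragraph."),
]

nested = flat_to_nested(flat_document)
print(ET.tostring(nested, encoding="unicode"))
```

Even this toy version has to track open sections by hand, and it ignores skipped heading levels, manual formatting, tables, and everything else authors can do in a word processor, which is where the maintenance cost comes from.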

It is true that there is a learning curve and cultural change required for staff to become familiar with XML and with using an XML editor versus a word processor. This change always involves some discomfort but also comes along with lessons that are highly valuable to a business. The two primary lessons are (1) a clearer understanding of the purpose of each person's role (e.g., my job is to add value to content, not to worry about how it *looks* in various outputs) and their ability to focus more clearly on their individual goals and (2) a broader understanding of the big picture - how what they do in content affects all outputs (all re-uses and all media). Yes, the transition to XML requires cultural change, but it is beneficial change. Yes, an XML editor constrains users, but it constrains them with a purpose. In the end the benefits of an XML editor outweigh the negatives significantly.

All that said, it is still possible for the *way* in which an XML editor is implemented to cause problems or be unnecessarily complicated for users, but that is a topic for another time.

Posted by Bill Trippe at 9:29 PM

July 7, 2004

Internet Explorer Security: Microsoft's Achilles Heel?

I recently gave up on trying to make my home computer safe from browser hijackers taking over Internet Explorer. Instead, I disabled IE and loaded Netscape. Problem solved, for now.

(And, yes, I disabled IE rather than removed it. This particular home computer still runs Windows 98. Now I know why that judge in the federal Microsoft antitrust case ordered Microsoft to make it easier to uninstall IE.)

This was after I spent substantial time trying many Adware killers and finally buying one. All to no avail. I simply could not eliminate spurious hijackings.

I am a big advocate for a better user experience through the browser and other Internet-aware clients. Thus my focus on tools like SVG, XForms, and other eForms technologies. Will Internet Explorer fade from use even faster because of these security problems? Will InfoPath suffer a similar fate?

Posted by Bill Trippe at 6:05 PM

July 6, 2004

Little Things Mean a Lot

I was parsing a DTD yesterday using a commercial editing tool and ended up with a syntax error. Pretty typical problem, but I was really surprised by the difference in error messages I received between the original editor I was using and another one I tried after a frustrating hour. The second product did a much better job of isolating the problem (it proved to be two problems). It isolated the first error to the line number and the type of problem, and then isolated the second problem to the line number. The first one essentially gave up, saying little more than "this structure doesn't belong here," and even pointing to the wrong place in the file.

I haven't looked at this in detail yet, but I have long held that XML parsing tools can be incredibly unhelpful in error reporting. As more and more people use XML, are they getting accustomed to this kind of painful troubleshooting, or are the products getting better?
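As a rough illustration of the kind of error reporting I have in mind, here is a minimal Python sketch. It is an assumption-laden stand-in: it checks XML well-formedness with the standard library rather than parsing a DTD, and the function name and sample document are made up for the example. The point is only that a parser can hand back both the type of problem and the line where it occurred.

```python
import xml.etree.ElementTree as ET


def describe_parse_error(xml_text):
    """Return None if xml_text is well formed, else a message with line and column."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple supplied by the parser.
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


# Hypothetical sample: the <para> element on line 3 is never closed.
broken = "<doc>\n  <title>Little things</title>\n  <para>unclosed\n</doc>\n"
print(describe_parse_error(broken))
```

A message that names the line and the kind of problem is exactly what made the second editor so much easier to work with.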

Posted by Bill Trippe at 12:51 PM

June 29, 2004

Ease of Use in XML Editing

What makes for ease of use in setting up editors and authors to begin using XML for content development? How much of it is making the DTD or schema useful and intuitive? How much of it is customizing the editorial tool to make it productive?

Or am I even asking the right questions?

Bear in mind that I am referring to the average content creator, and not necessarily someone who is comfortable with XML or even something like HTML markup. Imagine that this person, like most computer users, has created most text using a commercial word processor, an email client, and web-based forms.

Your thoughts?

Posted by Bill Trippe at 6:30 PM | Comments (1)

June 19, 2004

Structured Authoring in MS Word

Anyone doing structured authoring in Word such that they can produce reasonable XML at various points in the workflow? I would love to hear about user experiences. This would include folks who are using add-on tools such as those from i4i and Hypervision.

Feel free to post here or contact me off list.

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 9:04 PM | Comments (2)

June 9, 2004

Tools for Creating Graphical Views of DTDs or XML Schemas?

Do any of you use tools to create graphical or tree views of XML? See, for an example, the Elm Tree Structure in this example from the W3C:

http://www.w3.org/XML/1998/06/xmlspec-report-19980910.htm#AEN664

I have used DTD2html in the past, which creates a nice HTML tree view, but something like the Elm Tree would be better.

If you have any ideas, post here or email me at btrippe@nmpub.com.

Thanks!

Bill

Posted by Bill Trippe at 4:49 PM | Comments (3)

June 7, 2004

Correct Feature List for XML Repositories?

I have been looking in detail at XML repository tools on the market, as well as the major relational database vendors for their support of XML. One of the things I would like to include in the CMSWatch report is a feature matrix, showing how the various tools compare in key features. The following is the current feature list I am reviewing with the vendors.

APIs

DOM
Persistent DOM view
SAX
Java
JDOM
COM
SOAP
.NET
Conventional database APIs
WebDAV
XML:DB

STORAGE

Native Storage of XML
Validation on input/store
Validation on output
Accepts well-formed XML
Accepts non-XML data
Use DTD for database definition
Use W3C schema for database definition
Use RelaxNG schema for database definition
XML view of RDBMS data
Easy to update storage model if DTD/schema changes
Index at Database Creation
Ad hoc/multiple indexes after database creation
Incremental indexing
Type-aware queries

XML STANDARDS SUPPORT

XLink
XInclude
XQuery
XPath
XSLT
XQuery Update
XQuery Module

DATABASE MANAGEMENT FEATURES

Data replication and synchronization
Transaction support
Rollback
Versioning
Multi-level user security
Security based on XML tree/element
Online backup
Offline backup
Tuning and optimization
Span physical disks
Hardware optimization
XML triggers

OS/Platform SUPPORT

Windows 2000 Pro
Windows 2000 Server
Windows 2000 Adv. Server
Windows XP Pro
Sun & FSC Solaris 8 (32 bit) - UltraSPARC
Sun & FSC Solaris 8 (64 bit) - UltraSPARC
Sun & FSC Solaris 9 (64 bit) - UltraSPARC
AIX 5.2 (32bit)
AIX 5.1 & 5.2 (64bit)
HP-UX 11.0 (32bit)
HP-UX 11i (64bit)
Red Hat Linux Adv Server 2.1 (IA 32)
SuSE Linux Enterprise Server 8 (IA 32)
SuSE Linux Enterprise Server 8 for S/390 zSeries

OTHER

Data binding support
Extended SQL queries
Scalable: can run multiple instances
Scalable: single instance can manage multiple data models

ADVANCED TEXT SEARCH

Full text search
Wildcards
Boolean operators
Proximity searching
Structural search (utilizing the XML structure)
Optimize search based on structure
Stemming
Thesaurus Support
Fuzzy Searches

PERFORMANCE

Data shredded from XML to other formats?
Querying
Mass Load Performance
Indexing Performance

Posted by Bill Trippe at 9:11 PM

June 5, 2004

What Fresh Hell is This?

OK, I am guilty of always wanting to use that great line from Dorothy Parker, but I think it is apt for having to deal with Spam Commenting after weeks of dealing with spam, viruses, adware, dead and dying hardware, and my domain name being used as the return address in, apparently, three million spam emails that went out last week.

This blog is a modest enterprise. I have been happy to post 12 or so entries a month, and to enjoy some dialogue with readers. In truth, I seem to get more direct emails than I do formal comments, but that is fine. In fact, I prize the comments people have posted; they are the real value of the blog, if you ask me. Sometime in the last week though I began to get 10-20 spam comments a day. MovableType does allow you to block by IP address, but whatever or whoever is doing most of the spam commenting comes in on a fresh set of IP addresses every day (but uses the same maddeningly inane (and fake) email address (hrie@yahoo.com)). Weirder still, they are posting all manner of URLs (most recently prescription drugs and debt consolidation).

So, indeed, what fresh hell is this?

A tip of the cap to Jay Allen, who has made available a tool, MT-Blacklist, that can be added to a MovableType installation to block spam comments (and clean up ones that are already there). My thanks to Jay, and curses to you, hrie@yahoo.com.

Posted by Bill Trippe at 11:27 AM

May 19, 2004

Acrobat and XML

Here's a sentence I wouldn't have found myself writing a couple of years ago.

I am finding the latest version of Acrobat to be helpful in some XML work I am doing.

It's a simple thing really, but I have been taking some consistently styled Microsoft Word files, publishing them as PDF, and then using the "save as XML" feature in Acrobat 6.0. (I know that I could use the "save as XML" feature in the latest version of Word, but the machine I happen to be using has Word 2000.)

The Acrobat-produced XML is OK. Not great, but more than workable, which is what I need for this particular project. It's nice to have this kind of option, especially since there is no shortage of PDF documents out there.

Posted by Bill Trippe at 9:09 PM | Comments (1)

May 18, 2004

Reasonable Request, Don't You Think?

A poster to TECHWR-L wrote today:

My company is considering moving our documentation to an xml-based content management system. However, we are having a hard time finding tools that are easy to learn and reasonably priced (it must cost less than $12,000 to get 10 users up and running, but is expandable to 100 users). Our requirements are:

* wysiwyg xml authoring tool that was created for writers (not developers)
* includes xsl stylesheets for creating HTML help, Webhelp, oracle help
* includes workflow, version control, check in/check out
* can be used with sql and/or oracle databases

While a few of the details are a little out of place to me (e.g., why specifically mention SQL and Oracle?), I like the writer's focus on a price point for a fixed number of users. I always like thinking of technology rollouts in terms like this (getting n users up and running with x and y functionality).

I have to wonder how many solutions there are for the author of this post. ArborText Epic is too expensive, and a lower-cost, general-purpose, XML-aware CMS solution like that from Ektron would likely require some work to support output of chapter- and book-length material to PDF. Also, another poster has already correctly pointed out that the necessary XSL style sheets would require a fair bit of work.

Still, isn't it reasonable to assume you could launch basic single-source publishing for $1200/user?

Posted by Bill Trippe at 10:55 PM | Comments (1)

May 10, 2004

Sometimes I Wonder...

...if computers really save time. I do not want to log the time I have spent in the last week dealing with:


Though I have to say I love wireless networking. I now have 3 machines running on a small wireless network at home, and the setup has been very easy. Of course, once one of the existing machines was wired, it promptly got a virus. Then I was a good boy, and began downloading all of the Windows upgrades in preparation for adding a firewall. Halfway into the upgrade, the hard drive failed.

As I write this, I look over my shoulder at another (older) machine I am trying to reclaim only to see the following error message:

Operating System Not Found.

I don't think that is a good thing.

Posted by Bill Trippe at 8:28 PM

May 6, 2004

PureEdge Announces New Platform

XML-based eForms vendor PureEdge Solutions announced a new version of their software earlier this week.

They continue to innovate with an XML-centric focus to their products. One of the interesting aspects of the new release is their support of the open-source Eclipse rich client environment. PureEdge will be supporting Eclipse as both an IDE and a rich client, with the PureEdge Viewer embedded within the Eclipse client.

Posted by Bill Trippe at 1:43 PM

April 28, 2004

New Article on eForms in the Seybold Report

I have a new article on eForms in the Seybold Report, "Suddenly, E-Forms Matter." To quote briefly from the introduction:

"Electronic forms have long resided in a sleepy corner of the content-management landscape, perched somewhere between scan-and-capture applications and records management. Indeed, only recently have content-management vendors and analysts broadened the definition of Enterprise Content Management (ECM) to include the comparatively fixed content assets in applications such as records management and forms processing. As a result, the feature set and functionality of electronic forms have never been high on end users' lists for content-management solutions.

This lack of interest has not been from a lack of work, however. Electronic forms are ubiquitous on the Internet--think e-commerce applications, site registration and search forms. Indeed, HTML forms are the de facto standard for interfacing people and processes on the Web, and Internet and intranet applications are replete with HTML forms for both user and administrative interfaces.

Moreover, content-management applications especially have relied on HTML forms interfaces. Many Web applications, for example, depend heavily on storing content in relational database tables. HTML forms are a direct and ready means of developing an interface for such tables, so developers have relied on them.

But just as people eventually realized that HTML was not a robust markup language that could glue applications together--leading to XML--so too have people realized that HTML forms are not the best method for collecting and validating data and content."

Posted by Bill Trippe at 8:55 PM | Comments (1)

April 10, 2004

XML Forms and XML Editors

Quick thought.

Where do XML-based eForms end and XML-based editors begin?

There is clearly some overlap. Certain kinds of content entry are well served by electronic forms—we wouldn't have built all of these industrial-strength Web sites if this weren't true. Indeed, HTML forms for content entry are the most common interface for many content management systems. As the world trends toward XML, will XML editors become the predominant GUI for content entry and editing? What about XML-based eForms? Will they dominate instead?

There are some things that XML editors clearly do that eForms were simply not designed to do—entering and updating lengthy documents comes to mind. But with InfoPath supporting a rich text interface and must-have editorial features such as spell checking, will full-blown editors only be required when length is a factor?

I would love to hear your thoughts.

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 12:33 PM

April 1, 2004

Some Technology Highlights at the Gilbane Conference

I was busy at last week's Gilbane conference event, so I can't say I looked systematically at every vendor who was exhibiting. Overall, it was an excellent concentration of content management and XML vendors. A few quick thoughts:

There was more, of course, but these were the items that caught my eye. In the interest of equal time, I would be happy to hear more from any of the vendors who attended. I could then highlight some of them in future entries here.

Posted by Bill Trippe at 9:30 PM

March 26, 2004

Adobe Designer Beta Program

Chuck Myers from Adobe did a great job at today's Gilbane conference event on eForms. One of the things he discussed is the general availability of the Beta version of the Adobe Designer 6.0 forms creation software.

Please go here to get more information.

To quote briefly from the Adobe Web site:

Adobe Designer 6.0 software from Adobe enables users to extend the power of XML to create Portable Document Format (PDF) and HTML forms that effectively capture data and easily integrate it into enterprise systems. This capability enables a high return on investment because it uses embedded intelligence to validate data at the source of capture, eliminating the time and effort -- and costly mistakes -- associated with rekeying. And because PDF files can be filled in and submitted electronically, they speed document transactions and give users instant access to forms from anywhere on nearly any device.

I will be posting copies of the various presentations here shortly.

Posted by Bill Trippe at 2:32 AM

March 20, 2004

Onfolio

Allaire founders J. J. Allaire and Adam Berrey emerged this week with a new company and a new product, Onfolio. The premise of Onfolio is directly on-target—one of the primary uses of the Web is research—and the new product is designed to help users collect, catalog, organize, and share the information they find on the Web. My initial take on the product is that it adds a lot of value to this important process.

More than a shortcuts organizer, Onfolio allows you to collect and organize Web pages, snippets of Web pages, documents—in short, anything you can cut and paste or save in whole, such as PDFs and office documents. What is potentially much more valuable here is that you can usefully share and publish the results. In addition to the obvious features such as "share this link" or "share this collection of links," Onfolio allows you to publish your results as email, Web pages, and even RSS feeds.

The initial release of the product is closely tied to Microsoft. The installation works only on Windows XP, 2000, and 2003, and works best with Internet Explorer. However, it looks as if there is enough publishing flexibility to allow users to create generically useful HTML and RSS.

I will be trying it over the next few weeks and will report on it. I am going to start by collecting and organizing some current research I am doing on eForms.

Posted by Bill Trippe at 2:58 PM

March 17, 2004

What Works

Reviewing some recent postings, I realized I have been complaining a lot about email, so I want to balance the negative by spending some time thinking about and writing about what works well in the Internet. I am a suspicious sort (being a lifelong Red Sox fan will do that to you), so I fear that I may be jinxing myself. But I will try anyway.

So what does work? My first thought is that many of the large e-commerce sites seem to not only work well but seem to be improving over time. Now, I am sure people run into problems, but I wonder how the problem rate in major e-commerce sites compares with, for example, catalog ordering by phone. Has this been tabulated?

I mention this because of some recent experiences with eBay. I am an infrequent seller and buyer on eBay. As a seller, though, I am attuned to ease of use. Over the past year, eBay has added a number of features, most notably better integration with PayPal. You still have to log in separately to PayPal for many functions (a drag, but perhaps a necessary evil for security reasons), but the eBay selling interface provides a much more unified view of status than it formerly did. It feels more like an application, instead of a bunch of loosely aggregated links.

Now I can check payment status and shipping status on items much more easily. One newer feature caught my eye. I can now print postage and shipping labels directly from my PayPal account. The fee is charged automatically, the labels printed, and the recipient notified via email that the package is being shipped. The email notification can include the tracking information for the package. This newer feature eliminates many manual steps for me as a small seller.

The one drawback is fees. The profit from small sales, especially, can be eaten up by the combined fees of eBay and PayPal. The goal should be an e-commerce infrastructure that makes profitable even the smallest sales (think a total sale of 10 cents for an archived news article, with the total fees being less than a penny).

I can see small publishers taking advantage of this kind of infrastructure over time. Such an infrastructure should widen the kinds of products publishers can efficiently produce and profitably distribute.

Posted by Bill Trippe at 2:27 PM

March 4, 2004

Adobe FrameMaker 7.1

I had an excellent briefing from the project management folks at Adobe about the 7.1 release of FrameMaker. I won't try to discuss the entire new set of features, which is summarized here and in a more comprehensive PDF download here.

The features that caught my eye included the following:


Karl Matthews, the group product manager for FrameMaker, made an excellent point about SVG, and its applicability for localization. Too many images reside in what Matthews called "opaque binary file formats," where it is very difficult or impossible to get at and manipulate elements of an image. SVG, because it is hierarchical and an XML language, allows programmers to easily parse, navigate, and manipulate the image "tree" and all its elements. Thus, an SVG-encoded image could straightforwardly be manipulated to allow certain elements (captions, callouts, text) to be localized for different venues. So, along with the more obvious reasons to consider SVG for technical publishing (it's a vector format, it's scalable, it's cross-platform), organizations should now add localization to this list.
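To make the localization point concrete, here is a minimal Python sketch of swapping the text in an SVG image. Everything specific in it is an assumption for illustration: the sample drawing, the id values on the text elements, the translations, and the localize_svg function name. It has nothing to do with how FrameMaker itself handles SVG.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Keep the default SVG namespace on output instead of an "ns0:" prefix.
ET.register_namespace("", SVG_NS)

# Hypothetical drawing: one shape plus a caption and a callout.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="80">
  <rect x="10" y="10" width="180" height="40"/>
  <text id="caption" x="20" y="35">Power switch</text>
  <text id="callout" x="20" y="70">Press to start</text>
</svg>"""


def localize_svg(svg_text, translations):
    """Replace the content of each <text> element whose id has a translation."""
    root = ET.fromstring(svg_text)
    for text_el in root.iter(f"{{{SVG_NS}}}text"):
        key = text_el.get("id")
        if key in translations:
            text_el.text = translations[key]
    return ET.tostring(root, encoding="unicode")


french = localize_svg(
    SAMPLE_SVG,
    {"caption": "Interrupteur", "callout": "Appuyez pour démarrer"},
)
print(french)
```

The shapes and coordinates are untouched; only the text nodes change, which is the whole appeal compared with an opaque binary format.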

Posted by Bill Trippe at 8:34 AM

February 27, 2004

Updated Feature List for XML Editors

I have been working on the feature list for XML editors, which will be included in the upcoming report I am writing for CMSWatch. After some feedback from some initial respondents, I have revised it. This is by no means final. In fact, I am asking the respondents to add features as they see fit. My thanks to several readers who chimed in with ideas.

If you would like to see this in spreadsheet form, click here. I invited product folks from XML vendors to download and respond to the list. I cannot guarantee that your results will get in the first version of the report, but they will get into future versions of the report.

For those of you reviewing the list in spreadsheet form, note that the yellow rows are those rows added since the first version.

Thanks,

Bill

Updated Feature List for XML Editors

OS

Windows
Mac
Linux
Solaris

VALIDATION

XML Validation
SGML Validation
DTD Support
W3C Schema Support
RelaxNG Schema Support
Namespace Support
XInclude
XML Catalog
Interactive Validation
Batch Validation
Edit/manage Entities
Edit/manage CDATA
Edit/manage Attributes
Tag/Attribute help and support in context
Support special characters and character entities
Schematron
Allow invalid but well formed
Allow invalid and not well formed

TABLES and MATH

WYSIWYG CALS Tables
WYSIWYG HTML Tables
WYSIWYG MathML Editing

EDITING INTERFACE

WYSIWYG Editing
Source View Editing
Grid View editing*
Tree view editing*
Pretty printing of XML
Syntax coloring
Multiple Editing Windows
Allow changing markup (elements too)

* These both assume hierarchical editing support and one is typically offered instead of the other.

EDITORIAL FEATURES

Spell checking
Grammar checking
Collaboration features
Versioning
Document Compare
Document Merge
Search and Replace
Search and Replace with Regular Expression
Search and Replace with XML Context
Multiple-level Undo/Redo

PLUG-INS and SUPPORTING APPLICATIONS

XSLT Engine
XSL-FO Engine
Raster Image Editing
Raster Image Display
Vector Image Display
Vector Image Editing

CUSTOMIZATION FEATURES

Menus
Macros
Tool Bars
Keyboard Shortcuts
Forms/Interface Designer
Customize based on schema/DTD in use
Apply customizations to individual user
Apply customizations to entire workgroup
Apply customizations to entire organization
DOM Interface included
Web Services via SOAP, etc.

BROWSER AND INTEGRATION SUPPORT

ActiveX
Java
JavaScript
HTTP PUT
HTTP POST

LOCALIZATION (Please list # of languages)

Localized Menus
Localized Documentation
Localized Error Messages
Dictionary/Spellchecking
Integration with Translation Systems

Language Support
Tag Aliases
Localized User Interface
Thesaurus, multiple languages
Online Help, multiple languages

Posted by Bill Trippe at 6:56 PM

February 2, 2004

Here Lies Email, RIP

The signal-to-noise ratio for email reached an all-time low for me this weekend. Mydoom certainly did its part in this, especially since someone on a social mailing list to which I subscribe was infected, leading to scores of bogus messages per hour. I have basically four email addresses, two of which are "public," and then I administer four more emails for my primary domain. I use tools on the server side and the client side, and I still had over 700 bogus emails reach my inbox between Friday noon and Sunday evening. At one point, after not being online for a few hours, I logged on and downloaded 179 emails, not one of which had any value whatsoever.

How do larger organizations deal with this? And is anyone measuring the real economic impact of this? The costs begin with the server and storage costs, and the technical resources for managing the problematic sides of email. But then there is the loss of productivity for each user who must deal with problematic email that reaches them. Someone commented recently that each new filter and tool employed to deal with problem email raises expectations among end users, only to eventually disappoint and leave them even more frustrated than before.

Is the solution out there somewhere—and sometime soon?

Posted by Bill Trippe at 7:58 PM | Comments (2)

February 1, 2004

Would like to hear from experienced users of XML technology

I have been posting the following message to some XML-related newsgroups lately.

For some research and writing I am doing, I would like to hear from experienced users and technical implementors of the following commercial XML products (listing by vendor and not always by product):

ArborText Epic Editor
Corel XMetaL
Altova XMLSpy and Authentic
Ektron's eWebEditPro+XML
Ixiasoft TextML
Software AG Tamino
Ipedo's XML repository
Microsoft SQL Server (when used for storing XML)
Oracle (when used for storing XML)
Hypervision Worx
Hypervision Studio
CambridgeDocs XDoc Converter

Please contact me offline, and thanks very much.

Bill

-------------------------
Bill Trippe
New Millennium Publishing
763 Massachusetts Avenue
Cambridge, MA 02139
781 526 2564
btrippe@nmpub.com

Posted by Bill Trippe at 8:26 PM

January 23, 2004

Correct Feature List for XML Editors?

I have been looking in detail at the commercial XML editing tools on the market. One of the things I would like to include in the CMSWatch report is a feature matrix, showing how the various tools compare in key features. The following is a first cut at the feature list.

XML Editor Features

OS

Windows
Mac
Linux
Solaris

VALIDATION

XML Validation
SGML Validation
DTD Support
W3C Schema Support
RelaxNG Schema Support
Namespace Support
XInclude
XML Catalog
Interactive Validation
Batch Validation
Edit/manage Entities
Edit/manage CDATA
Edit/manage Attributes
Tag/Attribute help and support in context
Support special characters and character entities

TABLES and MATH

WYSIWYG CALS Tables Editing
WYSIWYG HTML Tables Editing
WYSIWYG MathML Editing

EDITING INTERFACE

WYSIWYG Editing
Source View Editing
Grid View editing
Tree view editing
Pretty printing of XML
Syntax coloring
Multiple Editing Windows

EDITORIAL FEATURES

Spell checking
Grammar checking
Collaboration features
Versioning
Document Compare
Document Merge
Search and Replace
Search and Replace with Regular Expression
Search and Replace with XML Context
Multiple-level Undo/Redo

PLUG-INS and SUPPORTING APPLICATIONS

XSLT Engine
XSL-FO Engine
Raster Image Editing
Raster Image Display
Vector Image Display
Vector Image Editing

CUSTOMIZATION FEATURES

Menus
Macros
Tool Bars
Keyboard Shortcuts
Forms/Interface Designer

BROWSER SUPPORT

ActiveX
Java
JavaScript

LOCALIZATION

Menus
Documentation
Dictionary/Spellchecking

Posted by Bill Trippe at 7:45 PM | Comments (3)

January 16, 2004

Major Relational Database Vendors and How They Support XML

I wrote the following article for EContent magazine late in 2002, and I am re-examining its conclusions now that I am taking a fresh look at RDBMS engines from Microsoft, Oracle, and elsewhere. Do the major RDBMS vendors do enough to support XML, or is there still a case for dedicated XML repository technology?

Now that XML has moved beyond being the latest cool thing, and is in fact being widely adopted and deployed, some practical questions are being asked about it. But these questions are only starting to be answered. Perhaps the biggest question about XML is, "Now that I've got it, where am I supposed to keep it?" Some of the big database players think they've got the answer. Organizations are replete with storage technologies: relational databases, file servers, and document management systems, to name a few. And, perhaps to no one's surprise, XML data is found in all of these places and more. There are also newer, specialized technologies specifically designed for native XML storage.

Yet, for most organizations, relational databases are the dominant mechanism for storing and managing data. Moreover, there is great concentration in the relational database market (with technology from Oracle, IBM, and Microsoft dominating). Given this concentration of technology and vendors, it's worth looking at what these vendors plan to do about XML. Specifically, it's worth looking at each vendor's flagship database products: Oracle's 9i Database, IBM's DB2, and Microsoft SQL Server.

It's clear why these key players are taking XML seriously: The market for XML storage is a big one. According to the analyst firm ZapThink, the market for XML storage will grow from $75 million in 2000 to over $4.1 billion in 2005. And while the relational database vendors currently consume only 15% of the XML storage market, that percentage will grow to 65% by 2005. That leaves plenty of money for the specialized XML vendors to make, but it also means that the relational databases will be storing plenty of XML for years to come.

XML Versus Relational Data
The distinctions between XML and relational data are by now widely discussed and, for the most part, well-understood. But regardless of what a salesperson may be telling you this week, the differences are fundamental. Relational data is all about tidy rows and columns of well-understood, previously defined chunks of information--like names, addresses, prices, and product codes. People have come to use the word "structured" to refer to relational data, and the term makes sense.

XML data can also be somewhat structured. A set of names and addresses can be represented, perhaps equally well, as both relational data and XML data. But XML has two fundamental differences: 1) XML can embed hierarchies of parent-child relationships in ways that relational data cannot; and 2) XML doesn't care a lick how long or complex a given "field" or "record" is, while relational data is all about how long and complex the fields and records are.

Take the extreme (but not all that unusual) case of a lengthy technical document coded in XML. The entire "record" or XML document can be megabytes in length. It can consist of many parent and child nodes. Thus, an XML document is not likely going to fit neatly into columns and rows. As a result, XML data can be an odd fit in a relational database. So, Oracle, Microsoft, and IBM have been working hard to extend their products to better ingest, store, manage, and manipulate XML data.

To begin with, all of the major vendors have improved on an already available method of storing large chunks of data as a means of better supporting XML. The so-called BLOB (Binary Large Object) space in a relational database can be used to store large XML documents, and the vendors have refined these to differentiate BLOBs from CLOBs (Character Large Objects). Using BLOBs or CLOBs, whole XML documents can be securely moved in and out of a database, and secondary tools, such as an XML parser, can then be used to manipulate the XML as it is moved in and out of the BLOB.
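The CLOB approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses SQLite's TEXT column as a stand-in for a CLOB, and the table name, column names, and sample document are hypothetical rather than taken from any of the vendors' products. The database stores the document whole, and a separate XML parser does the real work once the document comes back out.

```python
import sqlite3
import xml.etree.ElementTree as ET

# In-memory database; the TEXT column plays the role of a CLOB here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, body_xml TEXT)"
)

# Hypothetical document stored as one opaque chunk of characters.
manual = (
    "<manual><chapter><title>Setup</title>"
    "<para>Plug it in.</para></chapter></manual>"
)
conn.execute(
    "INSERT INTO documents (title, body_xml) VALUES (?, ?)",
    ("Setup guide", manual),
)

# The database returns the document unchanged; it knows nothing of its structure.
(stored,) = conn.execute(
    "SELECT body_xml FROM documents WHERE title = ?", ("Setup guide",)
).fetchone()

# A secondary tool (here, an XML parser) interprets the content after retrieval.
root = ET.fromstring(stored)
chapter_titles = [title.text for title in root.iter("title")]
print(chapter_titles)
```

Note that the query can only find the document by its relational columns. Anything that depends on the XML structure has to happen outside the database, which is the limitation the vendors' newer features try to address.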

For Robert Shimp, vice president for Oracle 9i Database marketing, the emergence of XML is part of the broader problem enterprises face as the growth and importance of "unstructured data" begins to rival the growth and importance of structured data. "Organizations are looking for a unified view of their data, both structured and unstructured," said Shimp. Moreover, according to Shimp, organizations suffer from a proliferation of too many data sources, many of which are too loosely managed. And this loose proliferation of assets is not good for companies, as it makes it difficult for them to efficiently manage and act on their intellectual capital. "It would be analogous to the CFO of a company handing out $100 of the company's money for each employee to manage," noted Shimp, "with no controls on how each employee would do it."

Solving the "Single Source" Publishing Problem
For organizations that have significant amounts of content, managing XML data becomes even more important. Publishers and others with large content stores are looking to solve the "single source" publishing problem, where they increasingly rely on both XML and structured data to be rendered into HTML, WAP, and other formats--often on-the-fly. Already, such automation could involve tying together many repositories, where a rendered HTML page could be derived from both structured and unstructured sources. In a manufacturing application, this could be a parts catalog where price and inventory data comes from a relational database and the product descriptions come from a document management system. In a magazine publishing application, this could be where the article content originates in a content management system while a related directory listing comes from a relational database.
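The parts catalog case can be sketched as follows, in Python. The sketch assumes a SQLite table for price and inventory and an XML string standing in for the document management system. The part numbers, element names, and the render_catalog function are all invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET
from html import escape

# Structured side: price and inventory live in a relational table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parts (part_no TEXT PRIMARY KEY, price REAL, in_stock INTEGER)"
)
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [("A-100", 12.5, 40), ("B-200", 3.75, 0)],
)

# Unstructured side: descriptions arrive as XML (hypothetical markup).
DESCRIPTIONS = """<catalog>
  <part no="A-100"><name>Hinge</name><desc>Steel hinge, 3 in.</desc></part>
  <part no="B-200"><name>Washer</name><desc>Nylon washer &amp; spacer</desc></part>
</catalog>"""


def render_catalog(conn, descriptions_xml):
    """Join each XML description to its relational row and emit an HTML table."""
    rows = []
    for part in ET.fromstring(descriptions_xml).iter("part"):
        part_no = part.get("no")
        # Assumes every described part has a row in the parts table.
        price, in_stock = conn.execute(
            "SELECT price, in_stock FROM parts WHERE part_no = ?", (part_no,)
        ).fetchone()
        rows.append(
            "<tr><td>{}</td><td>{}</td><td>{:.2f}</td><td>{}</td></tr>".format(
                escape(part.findtext("name")),
                escape(part.findtext("desc")),
                price,
                "in stock" if in_stock > 0 else "out of stock",
            )
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"


page = render_catalog(conn, DESCRIPTIONS)
print(page)
```

Even in this toy form, the page depends on two repositories staying in step, which is the coordination problem a unified store is meant to remove.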

Creating such a unified view of both unstructured and structured data is precisely where the major vendors see their offerings headed. Oracle, for instance, talks about "unifying...business data...and XML content," and IBM talks about "combining XML...and the power of data integration." And all of them, Microsoft included, are embracing the broader notions of Web services, where content and data are integrated over the Internet, using loosely-coupled components and XML as the all-purpose glue.

Besides the single-source publishing problem, other factors are driving the need for XML storage. ZapThink's research points to the growth in Web services, the increased use of XML for messaging, and the need for improved searching and querying of the XML. Taken together, these drivers suggest a growing need for storage technologies that provide more sophisticated management of XML data.

Looking Under the Hood
Database vendors are working to make their products function better with XML. While the products and approaches differ--and the big three all have both new commercial offerings and significant R&D under wraps--the approaches have some things in common.

To varying extents, they all rely on technologies for mapping the XML data to the relational fields, and back again. For example, IBM is developing "Extenders" for its DB2 database that will allow developers to map XML data to DB2 tables, and back again, and Microsoft has a programming facility called SQLXML for mapping and querying between, as the name suggests, SQL and XML. Oracle would argue--and industry analysts would tend to agree--that their mapping technologies are more deeply embedded in the product, especially with Oracle 9i, Revision 2, which is now generally available.
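The general idea of mapping XML to relational fields and back can be sketched in a few lines. This is not how the DB2 Extenders, SQLXML, or Oracle's facilities work internally; it is a generic illustration using Python's standard library, and the `<orders>` markup, the `orders` table, and the function names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical source document; each <order> maps to one table row.
ORDER_XML = (
    "<orders>"
    "<order id='1'><customer>Acme</customer><total>19.95</total></order>"
    "<order id='2'><customer>Globex</customer><total>5.00</total></order>"
    "</orders>"
)

def xml_to_rows(xml_text):
    """Shred each <order> element into an (id, customer, total) tuple."""
    return [
        (int(o.get("id")), o.findtext("customer"), float(o.findtext("total")))
        for o in ET.fromstring(xml_text).findall("order")
    ]

def rows_to_xml(rows):
    """Rebuild the XML document from relational rows."""
    root = ET.Element("orders")
    for order_id, customer, total in rows:
        order = ET.SubElement(root, "order", id=str(order_id))
        ET.SubElement(order, "customer").text = customer
        ET.SubElement(order, "total").text = f"{total:.2f}"
    return ET.tostring(root, encoding="unicode")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", xml_to_rows(ORDER_XML))

stored = conn.execute("SELECT id, customer, total FROM orders ORDER BY id").fetchall()
round_tripped = rows_to_xml(stored)
print(round_tripped)
```

A mapping like this works well for regular, record-like XML. It gets much harder with document-style XML, where element order and mixed content matter, which is part of why the vendors' approaches differ.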

The major vendors all fully support the more stable XML standards, such as the core XML syntax, though they vary in their support of emerging standards. So the different products can parse the XML, at least against a document type definition (DTD), and in some cases against an XML schema. The products can also use XPath to traverse the hierarchical structure of the XML, but many have stopped short of supporting newer, developing standards for querying, such as XQuery. Ron Schmelzer, senior analyst at ZapThink, follows XML data storage closely, and sees such standards support as critical to differentiating the various product offerings. Whereas relational database systems use SQL for querying, Schmelzer points out that SQL simply doesn't work "as well with the hierarchical nature of XML documents." As a result, said Schmelzer, "a number of initiatives exist to deal with XML-centric data query, insertion, and update operations."

And all of the vendors support emerging programming languages and application programming interfaces (APIs). The big three support Java for database access and connectivity, and emerging APIs for processing XML, such as the Document Object Model (DOM) and the Simple API for XML (SAX). The emphasis, correctly, seems to be on giving software developers a ready toolkit for accessing, manipulating, retrieving, and updating XML data, and quickly transforming it to other forms--HTML, relational, other forms of XML, and so on.
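For readers who have not used the two APIs, the difference is easiest to see side by side. The sketch below uses Python's standard-library implementations (`xml.dom.minidom` and `xml.sax`) rather than the Java bindings the vendors ship; the sample document and the function and class names are made up for illustration.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
DOC = (
    "<articles>"
    "<article><title>Storing XML</title></article>"
    "<article><title>XML and Print</title></article>"
    "</articles>"
)

def titles_with_dom(xml_text):
    """DOM: parse the whole document into an in-memory tree, then navigate it."""
    dom = minidom.parseString(xml_text)
    return [node.firstChild.data for node in dom.getElementsByTagName("title")]

class TitleHandler(xml.sax.ContentHandler):
    """SAX: react to events as the parser streams through the document."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

def titles_with_sax(xml_text):
    handler = TitleHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles

print(titles_with_dom(DOC))
print(titles_with_sax(DOC))
```

DOM is the more convenient of the two when you need to move around in or update a document; SAX never holds the whole document in memory, which matters for very large files.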

Integration as a Crutch?
This last point--integration--is a focus for several of the vendors, notably IBM and Microsoft, both of which are heavily invested in marketing software development tools and methodologies. IBM as well has a huge professional services business, a large chunk of which is dedicated to database and XML integration. The rollout of the Web has meant an explosion of database integration and access, and the continued growth of XML will only accelerate this trend.

Oracle's Shimp, among others, would caution that application integration is only part of the problem, that there is underlying and more fundamental data analysis and modeling that needs to be done. In a situation where the data stores have multiplied (often for reasons of expediency), Shimp reasons that simply integrating the various databases may be a "crutch" to avoid the harder work that is being left undone. Indeed, the increased mix of data types--relational, nonrelational; structured, unstructured; and XML especially--has brought a new challenge to organizations. This challenge is to truly analyze all the data and develop a more unified and comprehensive data model. XML isn't so much a new problem as a new and complex dimension of an existing problem.

The Data Model is the Key
Yet while all organizations would be wise to invest in this kind of comprehensive data modeling, the organization that has a lot of XML data indeed has some unique problems on its hands, and perhaps extra motivation to take a step back and analyze things. By its very nature, XML data is going to be different, and is going to require some different integration and handling. If you have various business databases, and a large store of XML, you likely are going to require at least separate instances of a relational database. For example, you could have all of your business or transactional data in one database, tuned to maximize the performance of that data. Your XML data could then reside in a second relational database that supports XML, such as the products from Oracle, IBM, and Microsoft.

These companies would likely argue that a single-vendor solution is preferable, and that, of course, their solution would be best. However, the reality is that you likely have many data sources already, from different vendors, and will likely live with some of these for some time to come. So while a comprehensive data model and more monolithic solution may be in your future, you will likely still have to knit some things together to create a comprehensive solution, at least for now.

Posted by Bill Trippe at 9:31 PM

January 15, 2004

What to Consider in Evaluating Databases for XML Storage

As I dig into the XML and relational repositories for the CMSWatch work I am doing, I find myself asking the vendors to provide detailed demonstrations of the products. In order to get an apples-to-apples comparison, I am asking for a demo that touches on certain consistent points and capabilities. The following is how I described it in correspondence with one of the vendors.

I need to understand how Product X can be used to store XML content.

This is best done by a demonstration of the product that shows a typical project:
--How the database is designed
--How the content is loaded and administered
--How the content is then accessed and updated
--How programmers typically will interact with, query, and manipulate the content
--Specifically how the XML will be stored in the database

Since this is focused on XML, I would also like to learn:
--How Product X supports the various XML technologies (especially XSLT, XQuery, XPath)
--How Product X's functionality compares with both relational and XML-specific repositories.

I have also considered asking them to work with the same data, but that may not be completely necessary. Any thoughts?

Posted by Bill Trippe at 9:08 PM

December 29, 2003

Some Important Considerations for Web Delivery

Nowadays every organization has a presence on the Web, so Web delivery is not a new challenge. But as organizations make more business functions available over the Web, they need to bring more content--and more types of content--to the Web in support of these business functions. This need is common to all types of firms, from the company seeking to enhance product support for retail customers, to the government agency that needs to distribute forms and information to constituents.

Regardless of the pressures that are moving more of your content to the Web, you are likely facing a greater need to automate Web delivery. Given the complexities of content management, this may feel a little bit like opening Pandora's Box. But fear not. There are some practical considerations that--when kept in mind--will help ensure that your Web delivery initiative is successfully implemented and easily maintained. The first two considerations--determining project scope and deciding whether to build or buy--are paramount, and deserve the most thought. With these issues resolved, you can consider some other important practical tips.

This is an article Jenn Accettola and I wrote.

Determine Scope, and Don't Lose Sight of It

Like the story of the blind men trying to determine the dimensions of an elephant, many web delivery projects fail because the people involved do not completely agree on the size and shape of the animal in the room with them. The first issue to be resolved should be scope. Are you creating a web delivery system for selected, specific information helpful to your end users? Or, are you going to be integrating applications and content from across functions and divisions of your entire company?

Web Content Management (WCM) tends to focus on very specific and similar parts of a much larger animal (say, the legs). This is not a bad choice if you are certain that you will not be seeking to expand your project in the near future.

Enterprise Content Management (ECM), however, is a much more complex proposition. Such an undertaking requires clearly defining the objectives of your project to assess the fulfillment of both internal and external business needs. ECM is a much more inclusive system that allows you to leverage as much corporate information as you need for website production, knowledge management intranets, enterprise portals, and more. ECM projects include functions often provided by separate applications such as content management, messaging & collaboration, workflow & file management, and archiving (records management).

In choosing between the narrower (WCM) and broader (ECM) approaches, don't let vendors frame your problems. Make sure you have a clear idea of what you want, and hammer out information models and processes ahead of time.

Decide Whether to Build or Buy--or Borrow

BUYING A SYSTEM should be based on clearly defined business objectives and processes. You should carefully develop requirements and match requirements to vendors. The more groundwork you do in the requirements building phase, the higher your chances for success with the project. And do keep an open mind as you match your requirements with potential vendors. Don't eliminate vendors based on vague perceptions, anecdotal evidence or incomplete information. The lesser known, specialized vendor may end up being the best fit for you; so too may the larger, more general-purpose vendor. It depends on your requirements.

BUILDING A SYSTEM is a real alternative and easier than it's ever been. In recent years, the spurt of "do-it-yourself" content management activity has resulted in a wide range of mid-to-small-scale content management systems (CMS) as well as many industry-specific systems. Consider your resources and motives for building your own system carefully. Companies often turn to in-house development as a reaction to (or, fear of) weak deployments, failed initiatives and expensive commercial CMS.

In-house development may be your best option if your web delivery initiative applies to limited properties, you don't plan to manage content across an organization, you have a small number of contributors, and you have a well-defined, simple workflow process. Otherwise, you may soon find it difficult to keep up with constant feature requests generated by external market forces or new technology.

Finally, YOU CAN BORROW a web delivery system by leasing it from an Application Service Provider (ASP), a vendor that deploys, hosts and manages access to a packaged application for multiple customers on a subscription basis. This model has become very popular with small-to-medium size businesses, offering lower entry-cost, shorter deployment schedules, and access to more robust networks with specialized technical support.

With the ASP model, companies can focus on their primary business instead of channeling resources into expanding IT budgets, hiring developers, and developing a product that is outside of the core business model. Using an ASP for pilot programs gives companies a better opportunity to test out software packages before making long-term financial commitments.

Selecting an ASP requires the additional steps of performing due diligence on the hosting company, establishing or reviewing service-level agreements, and managing the contract and relationship. The security of your data and the stability of the ASP are also important considerations. Content Management ASPs such as Atomz and CrownPeak Technology are growing in popularity.


Some Additional Considerations

CONSIDER THE ACRONYMS - they're not just clever sets of letters but indications of protocols and technologies that may make your projects more or less successful. Many de facto and official standards (notably XML and related technologies such as XSLT) are quickly becoming "tools of the trade" for developers. In the right hands, XML-based systems can help you realize significant return on investment.

SECURITY AND DIGITAL RIGHTS MANAGEMENT may be of critical importance, but will add extra layers of complexity to your web delivery model and require extra consideration. In your requirements development phase, be sure to lay out what kinds of security your content will require. Will something like single sign-on be sufficient, or is your content valuable enough to require the persistent protection that comes with Digital Rights Management (DRM)?

LEVERAGING CONTENT is often an important part of maximizing ROI. Some organizations now talk about "return on content" when discussing, for example, how an organization benefits from making complete use of internal and external content. Think of the R&D organization that needs ready access to up-to-date research and analysis. On a practical level, this comes down to how your content delivery system deals with syndication--both of your own content, and content that you are getting from third parties.

KISS--WHEN YOU CAN: Keeping it simple is one of the trends of the day, in healthy reaction to the often overly complicated flops that doomed some projects in the early years of CMS development. So, for example, if you have PDF files ready for distribution, and page fidelity is a requirement for your content, don't avoid the obvious solution of simply distributing the PDF.

IF YOUR CONTENT IS TIED TO REVENUE, REVIEW REVENUE MODELS FOR YOUR BUSINESS. Your revenue models may dictate the types of technology, content, services, and nature of your Web delivery processes. This is perhaps obvious for the business that sells content, but is an important consideration for all businesses--especially if you sell over the Web, develop prospects over the Web, educate your customers over the Web, or support your customers over the Web. (Chances are--whoever you are--this means you.)

Conclusions

Web delivery is no longer a brave new world, but it remains a growing one--in both importance and complexity. Yet despite the technical challenges of Web delivery, organizations that keep certain basic considerations in mind will enjoy greater initial success--and greater success in the long-term.

Posted by Bill Trippe at 1:17 PM

November 7, 2003

Adobe, InfoPath, XForms, and eForms

The latest version of The Gilbane Report contains my initial analysis of the recent changes in the eForms space, including, notably, the release of InfoPath and the approval of XForms as a W3C recommendation. I have never taken a keen interest in the eForms world, but the vendor announcements and the standards efforts are important.

To quote briefly from the article:

Electronic forms (eForms) have always represented a significant piece of the Enterprise Content Management puzzle. On one end of the marketplace, eForms have been implemented to replace traditional paper processes, such as in government and paper-intensive industries such as financial services. In a number of other applications, such as Web Content Management, eForms are the de facto user interface for such tasks as content entry, editing and system administration.

As eForms have proliferated in both of these types of applications and others, the functional and architectural requirements for eForms have grown. Where early eForms were successful merely for capturing and perhaps storing data, it didn't take long for developers to want to manipulate and work with the captured data. On the end of the market where dedicated eForms tools were being used to automate paper processes, such development typically involved working with the proprietary data structures and programming interfaces of the eForms vendor. In applications such as Web content management, the functionality and architecture of eForms were bounded mainly by HTML and related technologies such as JavaScript. As organizations have moved toward application server-centric architectures such as J2EE and .NET, both the proprietary approaches and HTML-based forms have failed to keep up.

Posted by Bill Trippe at 8:11 AM

November 3, 2003

XML and the Technologies for Taxonomy Development and Support

I have a new article in EContent Magazine that asks and answers the question, "Can XML Drive Taxonomies and Categorization?" The answer is yes, of course. As I suggest in the lead to the article:

If you google "XML," you do get a stunning 20.5 million hits, which is about four times as many as "Britney," but--sensibly--half as many as "God." So I guess XML falls short of omniscience. Still, the prevalence of XML has led to its being a too-ready answer to seemingly every question about information technology in general and content management in particular. The assumption seems to be that, no matter the requirement or problem, XML is the answer.

As always, the answer is in the details. Please see the article for more.

Posted by Bill Trippe at 8:06 PM

October 26, 2003

More SVG Support, and a Thought

Software company Beatware announced their latest version of e-Picture Pro with additional support for SVG, including both SVG Tiny and SVG Basic. What's interesting about this announcement is that Beatware is emphasizing the need to put more control over SVG in the hands of the graphic arts professional. For example, e-Picture Pro has built-in constraints that enable you to create illustrations while honoring the limitations of Basic and Tiny. This keeps the graphic artist in the driver's seat, and eliminates the need for hand-coding. This is the kind of product feature that will help expand the use of SVG.

My thought is that there should be a source of concentrated news on SVG. It is very hard to tease SVG news out of some product announcements (e.g., Adobe's recent release of its new creative suite). In other cases, smaller companies who have dedicated themselves to SVG have trouble getting the word out. Could the market use a newsletter or news source specifically to cover SVG? I have been thinking about it for a while. Please get in touch if you have some ideas on this.

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 10:04 AM

October 21, 2003

Enter InfoPath

InfoPath launched today, to quite a bit of fanfare from Microsoft and its many partners. Despite Microsoft's size and success, they don't often create the best buzz at a product launch. To some, they seem stingy about the details, and with a product line as ubiquitous as Microsoft Office, it is all about the details. But with InfoPath, they do seem to have done a good job of getting the word out. My inbox has been flooded with InfoPath-related press releases, especially from the partner companies. Moreover, technical and marketing folks have been very available to discuss the launch.

I am tempted to say something tongue-in-cheek about InfoPath (e.g., InfoPath is Latin for "Microsoft Office everywhere, damnit!"). But, in fairness to Microsoft, I have not really looked that closely at it yet. I have been using the beta version of Office 11 for several months, but I never installed the InfoPath componentry. I have spent some time working with Microsoft Word output to XML, and have a pretty good idea about that.

There are at least two interesting things about InfoPath. First, on the one hand, it seems to be very true to XML, but it still looks like a complex and heavy client installation. Second, it is pretty clear that InfoPath is positioned as Microsoft's entree to the electronic forms market; however, Microsoft has been "doing" forms for a long time through products such as Visual Basic, Access, Excel, and even Word. Is InfoPath more than the sum of those parts? Less? And what about XForms? InfoPath specifically does not support XForms. Does this set up InfoPath to be its own, unique vocabulary for forms development, despite the fact that it is based on XML?

With the public announcement of InfoPath today, Microsoft published a great deal more material about the product on its Web site. See, for example, the FAQ and some customer case studies.

I would love to hear from people in the field who have started working with InfoPath, especially in content applications, about their experiences thus far.

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 4:15 PM

October 17, 2003

WorX Studio Reviewed

EContent Magazine has published a review I wrote about WorX Studio from HyperVision, Ltd. As noted in the capsule summary of the review, "WorX for Word and its companion tool, WorX Studio, provide a novel way to bring Microsoft Word into an XML-based editorial workflow. WorX for Word acts as a plug-in to Word to provide structured authoring of XML. WorX Studio gives users a means to interactively convert unstructured documents into structured XML, and can be used in concert with WorX for Word. The product suite can be very useful for a group of writers who work on a small number of structured document types."

Posted by Bill Trippe at 10:49 PM | Comments (2)

September 23, 2003

Component Management of XML

Back in the day, serious content management meant component-level management of XML elements. This was a carryover from early SGML content management approaches, before the term "content management" had legs. Of course, early "content management" applications were really Web delivery engines (early versions of Vignette come to mind), and some systems now thought of as "content management" were actually document management systems (the aptly named Documentum).

At the time, there were a few technologies that handled component management of SGML data. These included Xyvision's PDM (once an acronym for Parlance Document Manager, and now called Content@), Interleaf's RDM, Chrystal Software's Astoria, and a product from a company called Texcel. (I forget the entire lineage, but somewhere along the way, Interleaf acquired the Texcel technology, and either incorporated it in RDM or began using it as part of a later offering called BladeRunner. Interleaf was then acquired by Broadvision, and the BladeRunner technology--I believe--is now part of what Broadvision calls "One-to-One Publishing.")

I mention this because the need for component-level management of (now) XML is still there, and these technologies still do the job. They are not the only technologies that do the job, anymore. However, it would be interesting to come up with a matrix of detailed requirements for single-source XML-based publishing and see how these technologies fare against other, newer ones. Do they still maintain a technical advantage?

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 12:33 PM

September 9, 2003

Is DRM Emerging Again?

Seybold has more about Digital Rights Management than I thought it might, given the relative softness in the market. Bill Rosenblatt did an all-day intensive yesterday, and has an excellent keynote session scheduled for tomorrow, The Great Digital Copyright Debate.

Bill has lined up excellent speakers for this:

--Joe Kraus, Co-founder, Digitalconsumer.org

--Tim O'Reilly, Founder & President, O'Reilly & Associates

--Dean Marks, Senior Vice President, Intellectual Property, Corporate Business Development and Strategy, Warner Bros. Entertainment Inc.

Bill himself is a great moderator and speaker, and this is a stellar lineup. It should be a great session.

It's interesting, though most likely a coincidence, that Microsoft last week announced availability of their Windows Rights Management client. A number of people see Microsoft's Rights Management Services as a springboard for a lot of development. Perhaps Microsoft's client application will get some traction, and fuel more development of DRM solutions.

Posted by Bill Trippe at 12:43 PM

September 3, 2003

Storing your XML

Many organizations are now working with XML data in one or more applications. As the use of XML grows, an important question arises--where should XML data be stored?

I originally wrote about this in February of last year for Transform magazine. It's interesting that the primary argument still holds up more than a year and a half later; some of the vendors have changed of course.

If you take even a cursory glance at the XML storage market, you will see many vendors vying for your attention and your dollars. These include major database companies like Oracle and Microsoft, and a long list of companies with specialized XML repositories. The various products reflect completely different approaches to data storage and management, and understanding them in detail will require your technical staff to dig into some subtle and complex technical questions.

In fact, XML has reopened some fundamental questions of data storage that some people, certain vendors especially, had felt were already answered. For many years, object-oriented database vendors argued that their systems were superior for storing document data, but they never gained much marketshare or mindshare against the giant relational database vendors like Oracle. Customers seemed to accept the argument that relational databases would do the job well enough, and IT organizations wanted to shorten, not lengthen, the list of technologies in use and under maintenance.

Have things changed now where you should consider a specialized database for XML storage? The answer, of course, is it depends. It depends on three things mainly--how much persistent XML data you have, whether you need high-performance, real-time access to it, and what kinds of querying you think you might want to do with the data. Let's take a quick look at each of these.

Persistent XML. Applications use XML in one of two ways--as a source format, or as something that is created and used for exchanging data between two data sources, usually temporarily. But you may well have source XML data that needs updating and maintenance. The more XML data you have, and the more user interaction and updating it requires, the more you may need a specialized tool for storing it.

Real-time access. If the data is stored in a format besides XML, and you need instantaneous access to it in XML format, then access could be slowed by requiring the application to go through transformation processes. A database that is optimized for XML storage will provide better performance, so if performance is key, you should consider an XML repository.

Querying. Relational databases support SQL for querying relational data, of course, but XML data cannot be queried with SQL. SQL is designed for the row-and-column orientation of relational data, and both end users and developers are comfortable with its approach. XML data, with its mix of elements, attributes, and textual data, requires a different approach. Instead, XML data is queried with tools based on XPath. While relational database vendors such as Oracle have basic XPath support, you may need a specialized repository if you have extensive and complex requirements for XPath-based querying.
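To show what the difference looks like in practice, here is a small sketch that asks the same question of a relational table with SQL and of an XML document with an XPath-style expression. It assumes Python's standard library, whose ElementTree module supports only a limited subset of XPath, so treat it as an illustration rather than a picture of what an XML repository offers. The table, the markup, and the variable names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Relational side: rows and columns, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT, section TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [("Storing your XML", "tech"), ("Seybold notes", "news")],
)
sql_titles = [
    row[0]
    for row in conn.execute("SELECT title FROM articles WHERE section = 'tech'")
]

# XML side: elements, attributes, and mixed text, queried by path.
ISSUE_XML = """
<issue>
  <article section="tech">
    <title>Storing your XML</title>
    <body><para>XML has <emph>reopened</emph> old questions.</para></body>
  </article>
  <article section="news">
    <title>Seybold notes</title>
    <body><para>DRM is back.</para></body>
  </article>
</issue>
"""
root = ET.fromstring(ISSUE_XML)

# Same question as the SQL query: titles of articles in the "tech" section.
xpath_titles = [t.text for t in root.findall("./article[@section='tech']/title")]

# A question with no natural row-and-column form: emphasized phrases at any depth.
emphasized = [e.text for e in root.findall(".//para/emph")]

# Mixed content: the full text of a paragraph, including its inline markup.
first_para = "".join(root.find(".//para").itertext())

print(sql_titles, xpath_titles, emphasized, first_para)
```

The first query translates cleanly between the two worlds. The last two are where the row-and-column model starts to strain, and they are typical of what document-oriented applications need.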

If you do have one or more of these requirements, then it makes a lot of sense for you to look at specialized tools for storing your XML. And while there are many players in the space, there are two clear market leaders, Software AG with its Tamino XML Server (http://www.softwareag.com/tamino/), and Ixiasoft Corporation (http://www.ixiasoft.com/) with its TEXTML Server (XIS). Both companies bring a lot to the table. In Software AG's case, they are a long-established database company, having introduced the Adabas product some 30 years ago. They have made an aggressive and very strong entry to the market with Tamino. In Ixiasoft's case, they are a newer vendor with a sole focus on this technology.

This is not to say you would be saying goodbye to Oracle or Microsoft anytime soon. For one thing, you will still have many uses for relational data and the tools that work best with it. But as your use of XML increases, your need for specialized tools will also likely increase. This will be especially true if you have large volumes of XML data, real-time processing needs, or complex queries to run. Those are the same questions you ask of your database today, and will ask of your XML database tomorrow.

UPDATE (04/09/06): This was written a few years ago, but the general ideas still apply. I would say, since I originally wrote this, MS SQL Server has come on as more of a presence in storing XML, and Mark Logic's XML content server has carved out an impressive chunk of this market. And as Dave Kellogg, the CEO of Mark Logic noted recently, the question of XML storage is still a very live one for IBM and Oracle.

Posted by Bill Trippe at 5:19 PM

August 20, 2003

XML and Print Publishing

One of the traditional arguments for document and content management is that, "everyone's 'second business' is publishing." That is, regardless of the nature of your business or organization, you are in the information creation and distribution business, so you would be wise to automate it.

If that traditional argument still holds true, then everyone's second business is still publishing--but now both in print and on the Web. Why? To paraphrase Mark Twain, the death of print has been greatly exaggerated. Indeed, print is not going away, even as the Web becomes an essential channel for organizational communications of all types--marketing, sales, and customer support to name a few.

The result is a new, compound requirement for organizations--to efficiently manage the flow of content into printed form, while at the same time getting this same content out to the Web. The task is made more challenging as organizations try to do this multichannel publishing economically--and with a mix of platforms for content creation, print production, and Web distribution.

If this problem sounds vaguely familiar, it is. Multichannel publishing is a more common problem because of the Web, but it is not an entirely new problem. Since the 1980s, organizations have been looking to distribute their information in ways besides print. CD-ROM was a popular format at one point, but it was overlapped by the Web--with its ubiquity and low cost of entry for basic communication.

Not only is the problem an old one, but the solution happens to be as well. In the 1980s and 1990s, organizations looked to an ISO standard called SGML--the Standard Generalized Markup Language. The promise of SGML was that you could capture content in a way that was format neutral--and then publish it to as many formats as you needed. "In the early days of SGML it was considered a breakthrough to mark up a document in a way that let it be published on more than one imaging device," noted Jon Parsons, Director of Product Marketing at XyEnterprise, a longtime vendor of content management and electronic publishing technology. "Then came the idea that that same generically marked content could also be published in a browsable version on CD."

Indeed, some organizations implemented SGML-based publishing systems, and a few were able to realize significant productivity gains from this approach. In the end, though, SGML proved to be too expensive and too complex for the average organization. Just as CD-ROM gave way to the Web, SGML would also give way to a generalized markup language that was more suited to the Web.

Enter XML

The eXtensible Markup Language (XML) was conceived by the World Wide Web Consortium as a "lighter weight" SGML that was more suitable for the wide distribution and HTML-oriented browsers of the Web. The thinking--correct then and correct now--was that HTML was too irregular and too format-oriented, and SGML was too complex. XML, then, emerged as a relatively simpler way to encode content in a format-neutral manner that would allow multichannel publishing from a single source.

As XyEnterprise's Parsons observed, "What's consistent is the idea that adding intelligence with granular mark-up and lots of metadata creates flexibility, increases efficiency through content reuse, and meets the goal of 'write once, use many.'" In fact, it is this ability to reuse content that is so powerful, and where organizations see the most dramatic return on investment (ROI). According to Parsons, "We've seen astounding ROI from single-source implementations in aerospace and automotive technical documentation, legal publishing, defense-related maintenance information, e-learning companies, and other markets."

Deja vu All Over Again?

If you have been in this business for a while, this is now sounding all too familiar. Is XML simply the latest all-purpose technology to fix the same problem that SGML never really solved? Well, in a word, no. XML may be heavily based on SGML, but it is succeeding where SGML didn't for many important reasons.

--Most significantly, XML is a key piece of all major software development platforms and components. This begins at the database, where major vendors such as Oracle, Microsoft, and IBM have made XML features a key part of their product roadmap. XML then permeates all the key applications and platforms--portal software, enterprise application integration, application servers, and, yes, content management. This is a significant change from SGML, which was only supported by a much smaller number of specialized products.
--As a result, XML is widely understood by programmers, who use XML in their daily work. This is becoming truer as organizations use XML-based approaches such as Web Services to tie existing and new applications together. Doug Tidwell, XML Evangelist at IBM, has pointed out that XML is seen as "the universal data access language" for the Web. Again, this is an enormous change from SGML, which was understood by a small cadre of specialists, and never became a part of the programmer's toolkit the way XML has.

Why is XML so much more useful and widespread than SGML? While there are some advantages to the XML language itself over SGML (mainly, it is lighter weight and easier for programmers to parse and process), the more important factor is that XML is supported by many important related standards and technologies. This begins with the transformation language XSLT (Extensible Stylesheet Language Transformations), which allows programmers to easily map XML to other formats (including HTML, other XML vocabularies, and document formats such as PostScript). But it also includes the XML Path Language (XPath), which is used to access specific objects within an XML document, and the Document Object Model (usually referred to as the DOM), which is an industry-standard programming interface for XML documents. The result is a ready toolkit for programmers to create, access, update, and transform XML data from one form to another.
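
To give a feel for that toolkit, here is a minimal sketch in Python using only the standard library. The service bulletin and its element names are invented for illustration. Note that ElementTree supports only a limited subset of XPath, so treat this as a taste of the approach rather than a full XPath example.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical sample content; the element names are invented for illustration.
SOURCE = """<bulletin id="SB-001">
  <title>Replacing the Filter</title>
  <step>Shut off the pump.</step>
  <step>Remove the housing cover.</step>
</bulletin>"""

# XPath-style access: ElementTree understands a limited subset of XPath.
root = ET.fromstring(SOURCE)
steps = [s.text for s in root.findall("./step")]

# DOM access: read the same document through the standard DOM interface.
dom = minidom.parseString(SOURCE)
title = dom.getElementsByTagName("title")[0].firstChild.data

# Update through the DOM, then serialize the result back to XML.
dom.documentElement.setAttribute("status", "revised")
updated = dom.documentElement.toxml()
```

The same few lines cover three of the four verbs above: access, update, and (by serializing) create. Transformation is where XSLT comes in.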

Indeed, to date, XML has become much more of a general-purpose data representation tool for programming than a markup language for document encoding. But it is still ideally suited for encoding content for single-source publishing, and industry experts say the time is right to begin leveraging XML in the enterprise. "Information technologists have understood the value of managing a single source of information that can be used in multiple ways for some time," said Frank Gilbane, editor of the Gilbane Report (www.gilbane.com). "The problem has been that the benefits were not apparent to business managers, and it was simply too difficult and expensive to accomplish. Today's need to deliver synchronized information to multiple channels (print, web, wireless, etc.) is something all business managers understand. This business need has also driven technology development and adoption to a point where single-source strategies, especially XML-based, should always be considered."

XML for Everything?

Gilbane's careful emphasis--that XML-based single sourcing should be considered--is precisely the right advice. In other words, don't drop everything and convert all your content to XML. As Parsons from XyEnterprise observed, "Successful single source solutions require careful analysis of the content, a clear focus on defined and measurable business objectives, and solid software support at each step in the workflow." So a reasonable first step would be to understand the business objectives tied to single sourcing--what do you hope to gain from single sourcing, and how will you know if you have achieved the objective?

For one engineering firm that I work with, the business objective was to make all their key documents available in print and on the Web--and as soon after updates occurred as possible. They employ a group of 16 technical writers and editors who are responsible for incorporating all updates into a document database of over 70,000 pages. When the documents only had to be available in print, this was a manageable but somewhat slow process. Updates could take several months to appear in a reprinted report. When they began to also produce HTML versions of the documents for distribution over the Web, the delays--and costs for contract help--only increased. They implemented an XML-based system for print and Web publishing with the goal of reducing the time for an update to be distributed--while maintaining current staffing levels. Two years into the project, they have dramatically shortened turnaround time and are producing print and Web versions of their documents with the same staff.

I advise clients to look first at a key business objective for their content, and then to undertake a pilot single-sourcing project that could support that business objective. For example, the business objective could be to make customers more self sufficient in the customer support process. The content tie-in could be to make key service bulletins, heretofore only available in print, also available for download in a searchable HTML database.

Consider a Pilot Project

The pilot project could be as simple as encoding a small sampling of content in XML, and then designing processes for print and Web rendering. You would begin by analyzing the content for its suitability for single sourcing. In XML parlance, this involves creating a Document Type Definition (DTD) or XML Schema that defines the content elements--how they are used, what content or subordinate elements they consist of, and what attributes they share. For example, a technical document may include a number of sequenced tasks, where a parts catalog may include part numbers and descriptions. Writing a DTD or schema is the formal expression of these elements. It's a marriage of the often well-understood but perhaps not formally codified rules of your content and the formal structure of XML encoding. It's important in a pilot project to keep this analysis relatively simple and high-level; remember this is a proof of concept.
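
As a rough illustration of what such a definition looks like, here is a sketch of a tiny DTD for a procedure made up of sequenced tasks, embedded in a sample document and parsed with Python's standard library. The element and attribute names are hypothetical. Python's ElementTree checks only that the document is well formed and does not validate against the DTD, so the small function below is a hand-written stand-in for what a validating parser would enforce.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD for a procedure made of sequenced tasks. It sits in the
# document's internal subset so the sample is one self-contained string.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE procedure [
  <!ELEMENT procedure (title, task+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT task (#PCDATA)>
  <!ATTLIST task seq CDATA #REQUIRED>
]>
<procedure>
  <title>Monthly Inspection</title>
  <task seq="1">Check the fluid level.</task>
  <task seq="2">Inspect the drive belt.</task>
</procedure>"""

# ElementTree checks well-formedness only; it does not validate against the DTD.
root = ET.fromstring(DOCUMENT)


def follows_content_model(proc):
    """Mirror the DTD by hand: one title, then one or more tasks with a seq."""
    children = list(proc)
    if len(children) < 2 or children[0].tag != "title":
        return False
    return all(t.tag == "task" and "seq" in t.attrib for t in children[1:])


is_valid = follows_content_model(root)
```

For a pilot, a definition about this size is often enough to prove the concept; the rules can be tightened once the real content has been analyzed.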

To see what XML encoding is like for a business user, you could have an experienced user test an XML editing tool such as Corel's XMetaL. This could give you a sense of the learning curve some users may face, and could also give you some metrics for future reference. (Keep in mind, though, that a full system may use a variety of tools and processes for the XML encoding, such as forms interfaces, so the actual tagging processes will likely differ.)

Once you have the XML-encoded content, you would need some means to render print and HTML versions of the content for distribution. Assuming you have kept the DTD or schema relatively simple, a programmer can quickly create an XSLT stylesheet for the HTML output. XSLT, or its companion language XSL-FO (XSL Formatting Objects), can be used to create the print output.
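
To make the single-source idea concrete, here is a sketch of the rendering step written in plain Python rather than XSLT, since Python's standard library has no XSLT processor. The sample content, element names, and the two function names are all invented for illustration, and the plain-text output is only a stand-in for real print formatting, which would more likely go through XSL-FO.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical single-source content; element names are invented for illustration.
SOURCE = """<procedure>
  <title>Monthly Inspection</title>
  <task>Check the fluid level.</task>
  <task>Inspect the drive belt &amp; pulleys.</task>
</procedure>"""


def to_html(root):
    """Web rendering: the title becomes a heading, the tasks an ordered list."""
    items = "".join(f"<li>{escape(t.text)}</li>" for t in root.findall("task"))
    return f"<h1>{escape(root.findtext('title'))}</h1><ol>{items}</ol>"


def to_print_text(root):
    """Plain-text stand-in for print output: numbered lines under the title."""
    lines = [root.findtext("title").upper()]
    lines += [f"{n}. {t.text}" for n, t in enumerate(root.findall("task"), start=1)]
    return "\n".join(lines)


root = ET.fromstring(SOURCE)
html_output = to_html(root)
print_output = to_print_text(root)
```

The point of the exercise is that both outputs come from the same encoded source, so an update made once in the XML shows up in print and on the Web without any rekeying.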

You would then have sample content, sample print and HTML output, and some metrics--the time it took to create the content, the informal DTD, and the associated stylesheets. Armed with this, you would be well positioned to plan a larger implementation--either with available in-house resources or by working with a vendor or system integrator.

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at 9:08 PM

August 18, 2003

XML and the Writing Process

I have always liked a quote attributed to Tim Berners-Lee, early in the life of the Web. Commenting on HTML, he supposedly said, "Who would want to type this stuff?" Without knowing the context of his remarks, I have to guess he said it amidst a discussion of tools—and the need for tools to make the author's life easier in order for the Web to flourish as a medium.

Well here we are many years later; have things really changed all that much?

There are all kinds of HTML editors, of course. You can save HTML out of your Word documents and such. Moreover, XML authoring of content is increasing, and with it has come an increase in the number of XML editing tools.

Despite this growth in tools, authoring for the Web remains, by and large, a fairly difficult proposition. The tool I am currently using (Movable Type) allows me to do rudimentary authoring (essentially paragraphs and some formatting) without resorting to hand coding. But the easiest way for me to accomplish slightly more complex formatting (lists, for example) is to dive back into HTML coding. Moreover, the editing window I use within the Movable Type application lacks several basic editorial tools I use pretty heavily--spell checking, a dictionary, and a thesaurus to start with. So I find myself writing in a word processor, then copying text over to the Movable Type window, reformatting it as necessary... really, there has to be an easier way than this, and I am not even talking about writing more lengthy documents or more complex text (with tables, math, and figures).

There are some emerging tools that are beginning to cover this gap. Ektron, for example, has its EWebEditPro and EWebEditPro+XML. Both provide an ActiveX control that gives the user a WYSIWYG editing interface where otherwise the user would face a plain text interface to XML and/or HTML. And Office 11 is adding more support for XML.

The ultimate tool would combine a familiar word processing interface, including tools such as spell-checking and a dictionary, with an ability to automatically embed the appropriate HTML (and ideally XML) markup. Along with the content creation itself, users would be able to easily add, review, and update metadata related to the content. This would be a rudimentary set of functions for content creation, and should be the starting point for a solid tool.

Posted by Bill Trippe at 3:41 PM | Comments (2)
