How Much Content is Structured?

September 16, 2003

Conventional wisdom has lately held that 20% of content is structured and the rest is unstructured. This factoid may be loosely borrowed from an older figure: that 20% of an organization’s data assets are in some kind of structured form (typically relational database tables). I wonder whether either number really holds up.

I think it is the rare organization that has a great deal of “document” content in a structure such as XML (or SGML for older materials). In my experience, most large collections of XML-tagged content represent one silo of an organization’s data (the technical manuals in a manufacturing company, or the catalog data in, say, an electrical supply company). Some legacy content may never end up in a structured form, but the toolsets are almost there to allow us to ask the question, “Should all new content, born digitally, be structured?”

I’m not sure I am ready to answer the question. Are you?

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at September 16, 2003 11:01 PM

Comments

STRUCTURED CONTENT IS IN THE EYE OF THE CREATOR.
Hey Bill,
Just found this page off the CMSwatch site. It's been a helpful source as I strive for CM knowledge and understanding.

I believe structure, in the accessibility and management sense, can only truly start to exist when the creator has the tools (templates, work area platforms, principles and graphical formatting, architects) to make information usable, so that it becomes content.

You mentioned the toolsets to let us ask the question. But where is the human element in digital content?
Your thoughts, please?

Enjoy your ideas.

ChrisMac

Posted by ChrisMac at September 19, 2003 4:54 PM

Hello Chris,

And thanks for your excellent post.

You are absolutely right. Even the best tools merely support human judgment and decisions. In my experience, people bring at least two kinds of expertise to the content creation process: knowledge of the content itself (i.e., domain expertise) and an understanding of how best to convey the information.

For example, I've done projects in certain vertical industries (automotive and aerospace come to mind) where the content creators are subject matter experts. Technical manuals are often formulaic in presentation—it is the technical detail the author provides that is paramount. In other areas (for instance e-learning and reference publishing), the content creators often bring a great deal of information design to the process.

Content authoring and content maintenance tools should support both of these kinds of content creators. To date, I think the support is still rudimentary, though the first kind of content creator is typically better supported than the second.

Thanks again for your post.

Bill Trippe
btrippe@nmpub.com

Posted by Bill Trippe at September 19, 2003 10:01 PM
