
Importing NewsML and XML

A recent gig building a customised content management module to work with NewsML has made me look at how content management systems out there can work with NewsML and other XML-based structures.

NewsML is an XML standard designed by the IPTC (International Press Telecommunications Council) specifically for structuring news articles and their metadata. Since it’s XML-based, it’s independent of media, so it can be manipulated for a variety of uses (web, hand-held devices and so on), from communication between agency systems, to RSS and other syndication formats, to efficient archiving. Oh, and of course it has its own website: www.newsml.org.

Loads of news services use it, from Reuters to Agence France Presse, from Business Wire to The Irish Times.

Workforce Guardian needed to publish employment relations news articles on their website and their secure service platform, sourced from a news agency feed they subscribe to, which is delivered as NewsML. To complicate matters, they source news from a variety of services, so one solution had to fit several XML schemas, as well as allow for traditional copy-from-Word content. They researched CM systems that would do this, and – guess what – they found none. Try searching CMS Watch for ‘import XML’: nada.

The solution turned out to be a neat custom module that I was able to build for Workforce Guardian using PHP, but it raises the question: how should existing CM systems out there work with current and emerging XML-based standards?
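To make the "one solution, several XML schemas" requirement a bit more concrete, here is a minimal sketch of one way to approach it: a lookup table of paths per feed type, all feeding one common article structure. It is written in Python rather than PHP and is not the module I built. The feed names, the `FEED_MAPPINGS` table, the `normalise_article` function and the sample documents are all made up for illustration, and the NewsML sample is heavily simplified rather than schema-valid.

```python
import xml.etree.ElementTree as ET

# Hypothetical mappings: for each feed type, where to find each common field.
# The "newsml" paths follow a simplified NewsML 1.x layout; "simple" is an
# invented flat schema standing in for some other agency's format.
FEED_MAPPINGS = {
    "newsml": {
        "headline": ".//NewsLines/HeadLine",
        "dateline": ".//NewsLines/DateLine",
        "body": ".//ContentItem/DataContent",
    },
    "simple": {
        "headline": "./title",
        "dateline": "./date",
        "body": "./text",
    },
}


def normalise_article(xml_text, feed_type):
    """Parse one XML document and return a dict with the common fields."""
    root = ET.fromstring(xml_text)
    article = {}
    for field, path in FEED_MAPPINGS[feed_type].items():
        node = root.find(path)
        if node is None:
            article[field] = ""
            continue
        # Gather text from nested markup (e.g. <p> elements) and
        # collapse runs of whitespace into single spaces.
        article[field] = " ".join(" ".join(node.itertext()).split())
    return article


NEWSML_SAMPLE = """
<NewsML>
  <NewsItem>
    <NewsComponent>
      <NewsLines>
        <HeadLine>Award rates rise</HeadLine>
        <DateLine>Sydney, 1 July</DateLine>
      </NewsLines>
      <ContentItem>
        <DataContent>
          <p>Minimum wages go up.</p>
          <p>Employers must comply.</p>
        </DataContent>
      </ContentItem>
    </NewsComponent>
  </NewsItem>
</NewsML>
"""

SIMPLE_SAMPLE = """
<article>
  <title>New leave rules</title>
  <date>2 July</date>
  <text>Parental leave changes take effect.</text>
</article>
"""

if __name__ == "__main__":
    print(normalise_article(NEWSML_SAMPLE, "newsml"))
    print(normalise_article(SIMPLE_SAMPLE, "simple"))
```

The appeal of this design is that supporting another news service means adding an entry to the table rather than writing new parsing code, and whatever comes out can be stored in the same place as content pasted in from Word.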

Can your CMS import XML?

All CM systems I know of have a WYSIWYG HTML authoring space that supports pasting content from MS Word documents. And that’s fair enough; Word is pretty much the de facto standard text content format of the business world. But as Web 2.0 really sinks in, and we get used to consuming more portable online content, I think the systems that businesses have bought should also be able to consume – and re-publish – content formats other than Word.

Can your CMS export XML?

It stands to reason that if a CMS can import XML content, it’d be good if it could export as XML too. Plone is the only CMS I know of that can do this, although any CMS that is RSS-enabled is effectively exporting content as XML already.
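To show why an RSS-enabled CMS is already exporting XML, here is a rough Python sketch that turns a list of stored articles into a bare-bones RSS 2.0 feed. The `articles_to_rss` function, the dictionary keys (`headline`, `url`, `summary`) and the example URLs are assumptions made for illustration; they are not taken from Plone or any other CMS.

```python
import xml.etree.ElementTree as ET


def articles_to_rss(site_title, site_link, articles):
    """Build a minimal RSS 2.0 document and return it as a string.

    `articles` is assumed to be a list of dicts with the hypothetical
    keys "headline", "url" and "summary".
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link and description on the channel.
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_link
    ET.SubElement(channel, "description").text = f"Latest content from {site_title}"

    for article in articles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = article["headline"]
        ET.SubElement(item, "link").text = article["url"]
        ET.SubElement(item, "description").text = article["summary"]

    # ElementTree escapes characters such as "&" when serialising.
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    feed = articles_to_rss(
        "Example Site",
        "https://example.com/",
        [
            {
                "headline": "Pay & conditions update",
                "url": "https://example.com/news/1",
                "summary": "A short summary of the article.",
            }
        ],
    )
    print(feed)
```

The same approach would work for any other XML vocabulary: swap the element names and the stored content goes out in a different format.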

RSS: just the beginning

The business world (apart from news agencies obviously) is really only just starting to grasp the potential of RSS. While RSS on intranets is becoming more common now, it’s only a matter of time before businesses will need to publish feeds of content more diverse than just news stories. And that’s a great opportunity for any CMS out there.

(By the way: if anyone knows of any CMS that can do these things, please feel free to let me know…)