December 18, 2013

EMC Archiving Strategy: From Documentum to InfoArchive (formerly EAS)

Written by: Tim Nelms and Jeff Hoopes – CrawfordTech Archiving Division

Where did we come from and where are we going? What should drive your archiving strategy with EMC IIG?

Documentum is without a doubt one of the heavyweights of the content management and archiving market. Having been around since the early 1990s, it has earned a reputation as a sophisticated content management platform for large enterprises. From a very early stage in its development Documentum adopted features supporting complex content management functions such as life cycles, check-in/check-out, virtual document management and workflows, which were closely integrated with a sophisticated security model. As a result Documentum’s sweet spot became associated with the complex workflows that document authors needed in industries such as creative marketing, life sciences and engineering. What these had in common was a need to closely manage and audit changes to (in general) human-authored content.
When did Documentum really take hold and what are the benefits?
Whilst complex document authoring was (and is) a core market for Documentum, it wasn't until 2003 that Documentum began to seriously look at leveraging the platform for archiving. Since the 1980s, IBM had developed a market for high-volume content archiving with the Content Manager On Demand (CMOD) platform. EMC naturally saw this market as an opportunity, so from 2003 to 2008 EMC Documentum developed a comprehensive strategy around high-volume archiving that included print streams, scanned content, enterprise applications and email. These offerings became known variously as Archive Services for Reports, Archive Services for Images, Archive Services for SAP and Archive Services for Email.
EMC Documentum’s archiving applications became very popular, but highlighted the challenges of adapting what was essentially a strong content authoring platform into an archiving platform. In particular Documentum had a rich meta-model - so rich that it overburdened archiving solutions with meta-data that was more appropriate to document authoring. Hence in about 2007 Documentum introduced light-weight system objects that made Documentum far more scalable as an archiving platform.
Despite these developments, by 2008 the pace of innovation had slowed and it seemed as if the opportunity to build a single enterprise archive platform was slipping away from EMC. On reflection, the slowdown in innovation allowed the product lines to mature at their own pace and tackle the distinct needs of their user bases. But at the same time a new approach to archiving was being conceived by the EMC team in Europe.
The Enterprise Archive Solution (recently re-launched as InfoArchive) was born out of an initiative at a large French bank to build a high-volume archive for both transactional documents, such as invoices and statements, and structured data from core banking applications. Later the solution also needed to be adapted to support archiving of high-volume transactional data such as Single Euro Payments Area (SEPA) transactions. Whilst transactional documents represented up to a billion objects, the latter need for structured data archiving required tens of billions of objects to be archived. EMC needed a new approach to this scale of problem.
During its regular series of acquisitions in the 2000s, EMC IIG had acquired X-Hive, the author of the market-leading XML database xDB. It was this that was to provide the key enabling technology for the envisaged enterprise archive solution.
EMC’s InfoArchive solution (as of Jan. 21st) cleverly combined the strengths of Documentum for protecting information at rest with the granularity and scalability of xDB to deliver a next-generation platform for archiving based on open standards.
To compare Documentum and InfoArchive is to compare apples with oranges. But here are our thoughts:
Comparison: Documentum and InfoArchive (formerly EAS)
Workflow & Case Management
  Documentum: Documentum archives leverage products like xCP and Captiva, which make it easier to integrate with business processes and workflow.
  InfoArchive: Whilst InfoArchive has plenty of integration points with workflow and enterprise applications, it does not have a tightly integrated case management platform (yet).
Volumes
  Documentum: We have seen Documentum successfully deployed with archives of up to 500,000 content objects. Whilst this is by no means the limit, and is good enough for many document archives, it is not as scalable as InfoArchive.
  InfoArchive: InfoArchive has been proven to be capable of archiving tens of billions of objects. Through its two-stage search mechanism users have incredible control over search and retrieval of large archived data sets.
Structured Data
  Documentum: It would be fair to say that outside of SAP this has never been Documentum’s sweet spot.
  InfoArchive: InfoArchive was built pretty much from day one to support structured data archiving and has the most elegant and effective model for structured data archiving we have seen.
Future Proof
  Documentum: Version 7.0 is the latest release of the Documentum platform and it is certainly in no danger of dying out. We expect to continue to see improvements over the coming years. However, as EMC has started to acknowledge, a ‘next-gen’ platform is in development and will have an architecture designed to support the cloud.
  InfoArchive: InfoArchive was built leveraging xDB, a fundamental building block of IIG’s ‘next-gen’ architecture. It seems reasonable to assume that InfoArchive will be more strategic to EMC’s future archive platform. We are certainly investing our resources into InfoArchive at the moment.
Print-Stream Archiving (Report Archiving)
  Documentum: From 2004 onward Documentum was successfully adopted by customers for print-stream archiving. Whilst these customers remain well supported, we recommend that they evaluate InfoArchive as a platform for their archiving needs.
  InfoArchive: InfoArchive has been very successfully deployed for print-stream archiving, and if you need both structured data archiving and print-stream archiving it is a must.
So our analysis is that EMC IIG’s archiving strategy (some years in the making) looks set to return with renewed strength. Our own investments are nowadays associated with the InfoArchive solution, and we would heartily recommend it to customers with structured data archiving and print-stream archiving needs.
CrawfordTech is a long-term EMC IIG partner for print-stream archiving, and over 30 customers around the world use our joint solutions. Our market-leading print-stream archiving offerings for the Documentum and InfoArchive platforms are designed to support the high-volume archiving needs of the banking, insurance and utility industries.
For more information about CrawfordTech’s print-stream archiving products for EMC visit our Document Archive Solutions and be sure to see our solutions by industry.