December 15, 2016

Understanding Your Archive: Storage Requirements, Pt. 2

Archive Storage

Our previous article in this Technical Series on Advanced Function Presentation (AFP), Understanding Your Archive: Storage Requirements, Pt. 1, looked at the data types within a Customer Communications Archive. This time we consider the storage and retrieval options available for long-term archiving.

Documents in a Customer Communications Archive are static: they are no longer updated as a standard part of any business process. Some may be accessed through web portals by internal and external customers, and some are retrieved for audit or compliance purposes, but the vast majority of Customer Communications will never be looked at again.

Depending on the composition tool or line-of-business system that generated them, documents can be stored either individually, each in its own file (burst), or as part of a larger file (non burst).

Burst Mode

With this storage technique, each communication is archived as its own self-contained document. PDF or PDF/A is the most common format, and each document stored in burst mode is managed as a separate entity. Storage overheads are greater with this technique because font and resource data are replicated in every document; how much extra space is required varies from case to case.

Fortunately, if the documents are not encrypted, techniques exist to reduce storage requirements by 80% or more.
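The arithmetic behind the burst mode overhead can be sketched in a few lines of Python. This is only an illustration: the function names and the sizes used in the example are hypothetical, chosen to show how replicated resources come to dominate the total, and it does not describe how any particular product achieves its savings.

```python
def burst_storage(doc_count, content_bytes, resource_bytes):
    """Total bytes when every document carries its own copy of the resources."""
    return doc_count * (content_bytes + resource_bytes)


def shared_storage(doc_count, content_bytes, resource_bytes):
    """Total bytes when the resources are held once and shared by all documents."""
    return doc_count * content_bytes + resource_bytes


def saving_fraction(doc_count, content_bytes, resource_bytes):
    """Fraction of burst storage saved by holding a single resource copy."""
    burst = burst_storage(doc_count, content_bytes, resource_bytes)
    shared = shared_storage(doc_count, content_bytes, resource_bytes)
    return (burst - shared) / burst


if __name__ == "__main__":
    # Hypothetical figures: 10,000 documents, 15 KB of page content each,
    # 85 KB of embedded fonts and other resources each.
    fraction = saving_fraction(10_000, 15_000, 85_000)
    print(f"Saving from sharing resources: {fraction:.1%}")
```

With these assumed sizes the resources account for most of every document, so holding one copy removes most of the footprint. Real archives will differ, which is why the overhead has to be assessed case by case.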

Non Burst Mode

In non burst mode, a single file is stored that contains multiple documents but only one set of resources. Each document is indexed by page number so that it can be identified within the file. A typical Advanced Function Presentation (AFP) file will contain one set of resources but thousands of documents. Creating a file with this structure is not as difficult as it might seem, particularly as many document composition systems already use this method by default when creating print files.
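The page-number indexing described above can be pictured as a simple lookup table. The following Python sketch assumes a hypothetical index layout (a document identifier, a 1-based first page, and a page count); real AFP indexing uses its own structures, so treat the names here as illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    doc_id: str
    first_page: int   # 1-based page number within the combined file
    page_count: int


def build_index(docs):
    """Build a page index from (doc_id, page_count) pairs given in file order."""
    index = {}
    next_page = 1
    for doc_id, page_count in docs:
        index[doc_id] = IndexEntry(doc_id, next_page, page_count)
        next_page += page_count
    return index


def page_range(index, doc_id):
    """Return the pages of the combined file that belong to one document."""
    entry = index[doc_id]
    return range(entry.first_page, entry.first_page + entry.page_count)


if __name__ == "__main__":
    # Three hypothetical statements stored back to back in one file.
    index = build_index([("stmt-A", 2), ("stmt-B", 3), ("stmt-C", 1)])
    print(list(page_range(index, "stmt-B")))
```

Because the index records only where each document starts and how long it is, the shared resources never need to be repeated per document.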

Additional Tools

For individual files which already contain resources, tools such as PRO Concatenator can identify and remove duplicate resources as the files are combined and loaded into the archive. These tools can be used in addition to other solutions that check for commonality across an entire archive, ensuring that only a single copy of the resource is held.

Retrieval software then recombines the resources and the individual document pages before presenting them to the user. The retrieved documents can include their own copy of the resources (PDF/A) and can, for example, be digitally signed without compromising archive integrity.
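One generic way to hold only a single copy of each resource, and to recombine it with a document at retrieval time, is to key resources by a hash of their content. The sketch below illustrates that general idea; it is not a description of how PRO Concatenator or any other product works, and the class and function names are hypothetical.

```python
import hashlib


class ResourceStore:
    """Holds each distinct resource once, keyed by the SHA-256 of its bytes."""

    def __init__(self):
        self._blobs = {}

    def add(self, data):
        digest = hashlib.sha256(data).hexdigest()
        # setdefault keeps the first copy and ignores later duplicates.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest):
        return self._blobs[digest]

    def __len__(self):
        return len(self._blobs)


def load_document(store, content, resources):
    """Archive a document: keep its content, replace resources with references."""
    refs = {name: store.add(data) for name, data in resources.items()}
    return {"content": content, "resources": refs}


def retrieve_document(store, record):
    """Recombine a stored record with its resources for presentation."""
    resources = {name: store.get(d) for name, d in record["resources"].items()}
    return {"content": record["content"], "resources": resources}


if __name__ == "__main__":
    store = ResourceStore()
    font = b"hypothetical font bytes"
    rec1 = load_document(store, b"page data 1", {"font": font})
    rec2 = load_document(store, b"page data 2", {"font": font})
    print(len(store))  # both documents reference the same stored font
    print(retrieve_document(store, rec1)["resources"]["font"] == font)
```

The retrieved copy is assembled on demand, so it can carry its own resources without changing what is held in the archive.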

Opportunities

By understanding the storage, compression and retrieval techniques that are available when files are created or loaded into an archive, important business decisions can be made. These may include moving documents from legacy platforms, eliminating costly, difficult-to-maintain systems from the IT infrastructure, and initiating storage savings during migration. Other options include adding transpromo messaging to transactional documents or creating accessible documents while managing and maintaining your storage footprint.

Next time we explore Dynamic Document Retrieval and how this uses your Customer Communications Archive.