
Input

To manage your content you need facilities to capture both electronic content and analogue content (paper, film etc), together with the associated data needed for indexing or for loading into other databases.

Electronic capture

When we talk about electronic capture we usually mean that the system is integrated with the software applications we use to create content, so that we can create content and save it directly into the system, where it will be managed for us. Examples include interfaces to office suites, so we can save word processing documents, spreadsheets and PowerPoint presentations into the system, plus interfaces with other applications, so we can save CAD files, graphics files, e-mails and attachments. In short, every electronic content object can be captured. In content management applications this also includes content entry via browser-based templates, which impose structure and Web publishing styles.

Increasingly, electronic capture also includes the ability to capture content and data created on other business administration systems or third-party content management systems: for example finance, HR or student administration applications, third-party web sites and other content management systems. Increasingly, the format used for this interchange is XML. Electronic capture also covers the capture of data via electronic forms and, where required for legal purposes, the capture of an image of each electronic form.
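As a minimal sketch of XML-based interchange, the Python below parses a hypothetical export from a student administration system and turns each record into a metadata dictionary ready for loading. The element names (records, record, student_id, title, created), the sample data and the function name parse_export are all assumptions made for illustration; real systems define their own schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical export; element and attribute names are illustrative only.
SAMPLE_EXPORT = """\
<records source="student-admin">
  <record>
    <student_id>S1001</student_id>
    <title>Enrolment form</title>
    <created>2008-09-15</created>
  </record>
  <record>
    <student_id>S1002</student_id>
    <title>Module change request</title>
    <created>2008-10-02</created>
  </record>
</records>
"""


def parse_export(xml_text):
    """Return one metadata dict per <record> element in the export."""
    root = ET.fromstring(xml_text)
    source = root.get("source", "unknown")
    items = []
    for record in root.findall("record"):
        # Each child element becomes a metadata field keyed by its tag name.
        metadata = {child.tag: (child.text or "").strip() for child in record}
        # Record which system the content came from (assumed attribute).
        metadata["source_system"] = source
        items.append(metadata)
    return items


if __name__ == "__main__":
    for item in parse_export(SAMPLE_EXPORT):
        print(item)
```

Keeping the parsing step separate from the loading step means the same metadata dictionaries could be produced from other source systems, each with its own small parser.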

Analogue/paper capture

In most cases, in addition to capturing electronic content, you will also need facilities to capture content held on paper, film or other analogue formats by digitising it. You will need a document and data capture subsystem to scan and digitise analogue content and to capture data from the documents, either for indexing (metadata) or, if you are capturing forms, for loading into other databases. The data capture can be done automatically, semi-automatically or manually.

There are a number of options when it comes to capturing paper documents and data. The following are options that you should be aware of.

  • Document image capture

  • Document image and unstructured full text capture

  • Document image capture and structured forms processing

  • Document image and semi-structured data capture

Document image capture

The first option is the traditional document image processing requirement. These systems allow you to scan paper documents, microfilm images, slides etc and capture digital bitonal, greyscale or colour images. They provide facilities for keying in metadata or capturing it from a barcode on the document, and for checking the quality of the captured image and metadata, with rejected items being rescanned or rekeyed. They also provide facilities for loading the image and metadata onto your management system.
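The scan, index, check and load sequence described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class ScannedDocument, the required fields doc_type and date, and the rule that a barcode alone is enough to index a document are all hypothetical, not features of any particular capture product.

```python
from dataclasses import dataclass, field


@dataclass
class ScannedDocument:
    """One captured image plus its index metadata (all fields hypothetical)."""
    image_file: str
    barcode: str = ""      # value read from a barcode on the document, if any
    keyed_metadata: dict = field(default_factory=dict)  # manually keyed fields
    image_ok: bool = True  # outcome of the image quality check


# Assumed indexing fields that must be keyed when there is no barcode.
REQUIRED_FIELDS = ("doc_type", "date")


def check_document(doc):
    """Return a list of problems; an empty list means the document passes."""
    problems = []
    if not doc.image_ok:
        problems.append("rescan: image failed quality check")
    if not doc.barcode:
        # Assumption: without a barcode, every required field must be keyed.
        missing = [f for f in REQUIRED_FIELDS if not doc.keyed_metadata.get(f)]
        if missing:
            problems.append("rekey: missing " + ", ".join(missing))
    return problems


def process_batch(batch):
    """Split a batch into documents ready to load and documents to redo."""
    accepted, rejected = [], []
    for doc in batch:
        problems = check_document(doc)
        if problems:
            rejected.append((doc, problems))
        else:
            accepted.append(doc)
    return accepted, rejected
```

Documents in the accepted list would then be passed to whatever loading facility your management system provides; the rejected list tells the operator which items to rescan or rekey.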

Most users will need to scan office documents up to A3 size, but the subsystems also support large-format scanners that scan documents up to A0 size or larger. You may also need to scan microfilm formats or 35mm slides etc. In this core application the only recognition software that tends to be supported is barcode reading. This is still the biggest market for document and data capture. Increasingly, however, users are looking to move on to more sophisticated subsystems that can capture the text of a document or the data held on a form. Manually keying data from an image is very labour intensive, so any system that can reduce the effort involved in data capture should be looked at carefully.
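To give a feel for what the choice of paper size and image type means in practice, the rough Python sketch below estimates the pixel dimensions and uncompressed size of a scanned page. The 300 dpi resolution used in the example is an assumption for illustration, as are the function names; the sheet sizes are the standard ISO 216 dimensions, and real image files will normally be much smaller once compressed.

```python
MM_PER_INCH = 25.4

# ISO 216 sheet sizes in millimetres (width, height).
PAPER_SIZES_MM = {"A3": (297, 420), "A0": (841, 1189)}

# Bits stored per pixel for each image type.
BITS_PER_PIXEL = {"bitonal": 1, "greyscale": 8, "colour": 24}


def pixel_dimensions(paper, dpi):
    """Return (width, height) in pixels for a sheet scanned at the given dpi."""
    width_mm, height_mm = PAPER_SIZES_MM[paper]
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


def uncompressed_bytes(paper, dpi, mode):
    """Estimate the raw image size in bytes, ignoring headers and row padding."""
    width_px, height_px = pixel_dimensions(paper, dpi)
    bits = width_px * height_px * BITS_PER_PIXEL[mode]
    return (bits + 7) // 8  # round up to a whole number of bytes


if __name__ == "__main__":
    for paper in PAPER_SIZES_MM:
        for mode in BITS_PER_PIXEL:
            size_mb = uncompressed_bytes(paper, 300, mode) / 1_000_000
            print(f"{paper} {mode}: about {size_mb:.1f} MB uncompressed")
```

Under these assumptions an A3 page at 300 dpi is 3508 by 4961 pixels, which is about 52 million bytes as uncompressed 24-bit colour but only about 2 million bytes as a bitonal image.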

