
Analogue capture

The National Archives (TNA) A.2 requirements also state that the system must be capable of importing images from a scanning or capture subsystem. However, the TNA requirements do not specify the detailed requirements for a document capture subsystem. For the requirements listed in the checklist under (1.2) above, you therefore need to decide how many of them you need at each phase of implementation and specify them in more detail.

For scanning and digitising, you need to indicate the size and type of documents you need to scan. We would always recommend specifying A3 scanners rather than A4 to cover any larger formats. If you have large maps and plans to scan, you need to consider large-format scanners or decide whether to send those documents off-site to a bureau for scanning.

You need to consider whether you can set up one centralised scanning facility adjacent to the post room or whether you will need to support distributed scanning. If you move to a centralised scanning solution, you need new procedures. Instead of staff from departments coming down to collect their post, they need to come down and supervise post opening and scanning. You need procedures for handling payments; official documentation that needs to be returned to the sender; junk mail; mail containing printed publications that may also be available in digital format; and so on.

If you opt for distributed scanning, you may incur high software costs for each scan station, and you will need to train a larger number of staff and supervise staff in several locations. One option is to provide distributed scanning facilities but route all scanned images to a single input queue, where central staff check them, apply first-level indexing and route them back to the relevant departments.
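The single input queue arrangement can be pictured with a minimal Python sketch. It is illustrative only and does not describe any particular EDRM product: the class names, the field names and the example station, file and department values are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ScannedImage:
    # Hypothetical record for one scanned image arriving from a scan station.
    station: str
    file_name: str
    index_terms: dict = field(default_factory=dict)


class CentralInputQueue:
    """Single queue that receives images from every distributed scan station."""

    def __init__(self):
        self._pending = deque()

    def submit(self, image):
        # Any scan station adds its images to the same central queue.
        self._pending.append(image)

    def pending_count(self):
        return len(self._pending)

    def process_next(self, department, index_terms):
        # Central staff check the oldest image, apply first-level index
        # terms and decide which department it is routed back to.
        image = self._pending.popleft()
        image.index_terms.update(index_terms)
        return department, image


queue = CentralInputQueue()
queue.submit(ScannedImage(station="station-A", file_name="scan001.tif"))
routed_to, image = queue.process_next("Finance", {"doc_type": "invoice"})
```

The point of the sketch is that scan stations only submit images; checking, indexing and routing decisions all happen in one place.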

You need to decide whether you want colour, greyscale or black and white scanners, and whether to scan all documents double-sided or to choose single- or double-sided scanning selectively. In most cases you will need a combination of flatbed scanners and rotary scanners. Flatbed scanners require some manual feeding of documents but can scan a wide range of document types. Rotary scanners can feed standard sets of documents automatically from a stack.

You then need to decide whether you want additional facilities beyond image capture. The first is text recognition. If you capture only the image of a document, you have to index it manually, which is labour intensive, and you cannot search the full text of that document. If you go on to use recognition software to identify the text characters in an image and encode them, you can search the full text of the document as if it were a word-processed document. The problem is that text recognition is not one hundred percent accurate. Most users will need a text recognition facility, but they will also need the ability to switch it off for certain categories or batches of documents. If you are scanning valuable reference documents, you will use text recognition and clean up the results. If you are scanning invoices or handwritten correspondence, you will switch it off.
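A per-category switch for text recognition could be expressed as a simple lookup, as in the Python sketch below. The category names and the function are hypothetical, and defaulting to recognition switched on for unlisted categories is an assumption made here for illustration, not a stated requirement.

```python
# Hypothetical policy table: True means run text recognition for that category.
OCR_BY_CATEGORY = {
    "reference_document": True,           # recognise text, then clean it up
    "invoice": False,                     # switched off
    "handwritten_correspondence": False,  # switched off
}


def should_run_ocr(category, default=True):
    # Assumption: categories not listed fall back to recognition being on.
    return OCR_BY_CATEGORY.get(category, default)


for category in ("reference_document", "invoice"):
    state = "on" if should_run_ocr(category) else "off"
    print(f"{category}: text recognition {state}")
```

In practice the same decision might be made per batch rather than per category, but the principle of an explicit, configurable switch is the same.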

The second option is paper forms processing. This can be a valuable facility if you design forms, send them out to customers to complete and return, and then need both to capture data from the forms into a database and to keep a copy of each form as evidence that the customer did order a specific service, answer yes to a specific survey question, and so on. You need high volumes of forms (at least 500 per day) to make it worthwhile, and you need to be capturing a significant volume of data from the forms. You also need to be in control of the forms so that you can design and print them in a format optimised for scanning. Institutions sending out high volumes of questionnaires may consider this option. Where practical, institutions are increasingly looking instead at electronic forms for customers to complete via the Web, but in many cases you need to provide both options. Student administration increasingly uses electronic forms, but if you still need to support paper forms in volume then consider forms processing.
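The conditions for forms processing can be written down as a rough screening check, sketched below in Python. The 500 forms per day figure comes from the guidance above; the function name, its parameters and the `min_fields` threshold standing in for "a significant volume of data" are hypothetical and should be replaced with your own figures.

```python
MIN_FORMS_PER_DAY = 500  # volume threshold given in the guidance above


def forms_processing_worthwhile(forms_per_day, controls_form_design,
                                fields_captured_per_form, min_fields=5):
    # min_fields is an assumed stand-in for "a significant volume of data";
    # the guidance does not give a number.
    return (
        forms_per_day >= MIN_FORMS_PER_DAY
        and controls_form_design
        and fields_captured_per_form >= min_fields
    )


# Example: 600 forms a day, designed in-house, 10 data fields per form.
print(forms_processing_worthwhile(600, True, 10))
```

A check like this is only a first filter; the decision also depends on whether electronic forms could replace the paper route altogether.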

The third option covers the automatic capture of data from uncontrolled forms. These are typically forms sent in by third parties, such as invoices and direct debit mandates, where you have little or no control over the layout or design of the form. This makes automatic data capture more difficult, but there are packages that can do some intelligent capture.

The final option is the specialised case where you are scanning maps, plans or design drawings and, once you have captured a digital image, you want software to recognise graphic symbols such as arcs, curves and polygons and carry out raster to vector conversion so that the symbols can be processed in a graphics package such as a CAD package. Normally this would not be done routinely as part of a post scanning operation. It would be conducted as a post-processing exercise in areas such as Estates or in departments that use GIS packages.

