The Administration Module
Kofax Capture Administrator's Guide
Similar Sample Pages
If all your sample pages are very similar, turning off page-level form identification could
result in a slight performance improvement. For example, you might be processing different
revisions of the same form. Each revision contains a marker (such as a form revision number
or a box) clearly printed in a known location on every revision.
If page-level form identification is selected, it will probably “pass” all revisions of the form
and rely on zone-level identification to distinguish between the samples. Since revision
markers are often printed in letters suitable for OCR processing or encoded in bar codes, the
results of a carefully drawn form identification zone are very reliable. In this case, bypassing
page-level form identification removes the overhead of the page-level processing without
diminishing successful results.
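As a rough illustration of the idea only (this is not Kofax Capture's implementation or API), the sketch below shows how text read from a form identification zone could be mapped to a form type. The marker strings, the form type names, and the identify_by_zone function are all hypothetical.

```python
# Hypothetical lookup from revision marker text to a form type name.
# Neither the markers nor the names come from Kofax Capture.
REVISION_MARKERS = {
    "REV 01/2019": "Claim Form (2019 revision)",
    "REV 06/2021": "Claim Form (2021 revision)",
}


def normalize(text):
    """Upper-case the text and collapse whitespace so spacing differences in the zone result do not matter."""
    return " ".join(text.upper().split())


def identify_by_zone(zone_text):
    """Return the form type whose marker matches the zone text, or None if no marker matches."""
    return REVISION_MARKERS.get(normalize(zone_text))


print(identify_by_zone("  rev 06/2021 "))
print(identify_by_zone("REV 03/2020"))
```

Because each revision prints a distinct marker in the same location, an exact lookup on the zone result is enough to tell the revisions apart, which is why the page-level comparison adds little in this case.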
You may also be able to identify your forms based on certain geometric shapes found on the
form. In such cases, you can use a recognition profile for your form identification zone.
Images with Varying DPIs
If you are processing images with varying dpis, or images whose dpi differs from that of the
associated sample pages, page-level form identification typically fails. You can either supply
sample pages at each expected dpi, or turn off page-level identification and rely on a form
identification zone to identify your forms.
One solution is to create multiple form types with identical settings, but use sample pages
scanned at all expected dpis. For example, consider this scenario: You know that some of
your forms are scanned at 200 dpi and others are scanned at 300 dpi. To account for the
differences in dpi, you can create one form type with sample pages scanned at 200 dpi and
another form type with identical settings and sample pages scanned at 300 dpi. Page-level
form identification is able to match the images against the sample pages with the same dpi.
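The arithmetic behind this scenario can be sketched as follows. The same physical page produces different pixel dimensions at each dpi, which is why a separate set of sample pages per dpi helps. The letter-size page, the form type names, and the helper functions are assumptions for illustration and are not part of Kofax Capture.

```python
def pixel_size(width_in, height_in, dpi):
    """Return the (width, height) in pixels of a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


# Hypothetical form types: identical settings, sample pages scanned at different dpis.
FORM_TYPES_BY_DPI = {
    200: "Order Form (200 dpi samples)",
    300: "Order Form (300 dpi samples)",
}


def form_type_for(dpi):
    """Return the form type whose sample pages were scanned at this dpi, or None."""
    return FORM_TYPES_BY_DPI.get(dpi)


# An 8.5 x 11 inch page at the two dpis from the scenario above.
print(pixel_size(8.5, 11, 200))  # (1700, 2200)
print(pixel_size(8.5, 11, 300))  # (2550, 3300)
print(form_type_for(300))
print(form_type_for(150))        # no sample pages at this dpi
```

A dpi with no matching sample pages (150 in this sketch) returns nothing, which corresponds to the case where a form identification zone is the better choice.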
If the preceding solution does not work (for example, you have no way of knowing the dpis
of the images you’ll be processing), you can use a form identification zone instead.
Distorted Images
Scanning forms printed on very thin paper (such as onion skin or pressure-sensitive
multipage form paper) may produce “stretched” or otherwise distorted images. Distorted
images are typically the result of how a scanner’s feeding mechanism pulls the pages through
the scanner. Because the scanner does not distort every image in exactly the same way,
page-level identification is unreliable for these images.
Even with the distortion, it may be possible to identify an area on the image that can be used
as a reliable form identification zone. If so, you should disable page-level identification and
rely on a form identification zone to identify your forms.
Image Variations on Sample Pages
Some forms contain sections that are standard, and other sections that can vary from form to
form. For example, the top half of a form might have a standard header format that is always
filled in the same way. The bottom half of the form might be non-standard and contain a
free-form section.
If you are processing this type of form, page-level identification fails. One solution is to
scan a sample page with the free-form section “blanked out.” (For example, you can tape a
white sheet of paper over the free-form area.) This may allow the pages you are scanning to