The Library of Congress
Technical Standards for Digital Conversion
Of Text and Graphic Materials
"For the general public, the Congress has endorsed the creation of a National Digital Library
through a private-public partnership that will create high-quality content in electronic form and
thereby provide remote access to the most interesting and educationally valuable core of the
Library's Americana collections. Schools, libraries, and homes will have access to new and
important material in their own localities along with the same freedom readers have always had
within public reading rooms to interpret, rearrange, and use the material for their own individual
James Billington, Librarian of Congress, Fall 1995
By the time Dr. Billington announced the new National Digital Library, there already existed a
significant history of digitization at the Library of Congress (hereafter referred to as the Library).
The conversion of materials from the collections of the Library of Congress has roots in the pilot
projects and programs of the 1980s. With the advent of the National Digital Library Program,
the Library staff began to develop a series of standards and best practices that have guided the
Library’s digital conversion programs. These standards have been modified over time; this
document presents the most current digitization standards available to date at the Library and also
features the historical documents on which the standards are based.
Over time, a digitization process has emerged at the Library that follows a pattern of planning,
content production, web assembly and site maintenance. This document focuses on the first two
stages of this process. Topics within the scope of this document include planning, digital image
capture (including device characterization, document handling, image quality standards, and
imaging workflow), digital file management (including file formats, naming and storage),
technical metadata (including technical, structural, preservation and descriptive metadata
contained in the TIFF header tags), and quality assurance.
1.1 Project Planning
The Library sets out a standard procedure for planning the digital conversion of materials from
its collections. After specific material has been selected and the project goals defined, the
process focuses on a formal requirements analysis that documents each element of the
digitization process. In addition to general project descriptive information, the requirements
analysis focuses on the materials proposed, the general digitization specifications, copyright,
conservation, access, storage, and “digital object behavior” requirements.
The Library of Congress is careful to respect copyright and individual privacy rights. Copyright
research and privacy clearances are done outside the actual imaging process. A preliminary
assessment of copyright and privacy issues is part of the planning process and materials are not
scanned without an understanding that the project outcomes will be within the legal restrictions
of copyright and privacy laws.
This presentation focuses on imaging of text and graphic materials – it does not include audio
and video conversion standards. Also beyond the scope of this web site are activities that take
place prior to imaging (materials selection, preservation, and conservation), details of digital
preservation, the development of descriptive metadata beyond the TIFF header data, the
creation of derivative image files, and the web design process.
2 Current Technical Standards
These technical standards are intended to summarize current standards and best practices used at
the Library. These standards provide guidance for the production of the “master” image to be
retained by the Library in its “warehouse” storage area. A variety of derivative files may be
prepared from these master images for display on the Library’s web site or for distribution.
In the Library’s best practices, TIFF master files are produced to different standards depending
on the intended usage of the files -
Grayscale TIFF files are produced where color content does not exist or is not deemed
significant. Books, manuscripts, and sheet music fall into this category.
Fine quality grayscale or color TIFF image files are produced where color information
exists in the content, or where the artifactual value is extremely important. Rare books,
maps, and photographs are within this category.
Rarely, bitonal TIFF files may be accepted when representation of the document content
is the sole requirement. This consideration is generally the result of external
2.1 Document Management and Handling
Library Divisions manage all collection materials, and handling requirements set by the Division
must be adhered to. Additionally, the Library's Conservation Office must be involved from the
beginning of all imaging projects.
All materials must have a conservation assessment prior to scanning. Based on the assessment,
materials may need conservation treatment or re-housing before they are taken to the scanning
workstation. In all cases a complete document collation should be prepared before scanning.
The Division curator and Library's Conservation Office staff must approve all equipment used in
the scanning process. No equipment used for image capture shall damage original materials nor
shall the manner of its use cause damage. This includes, but is not limited to –
Book cradles and other supports for bound materials.
Weights and special supports for materials.
The contractor may use other physical supports if approved by the Library such as
flexible, wedge-like supports combined with materials to support the book spine
as the weight of the text block shifts during scanning.
The contractor shall not use any materials that may result in tearing or chipping of
pages, damage to the spine or to the text block, or damage to the area where the
text block is attached to the cover of the book.
The contractor may use a sheet of glass applied gently by the operator to the
single page that is being scanned. Any glass that spans a book gutter must have
Special supports for unbound materials.
Certain unbound materials, such as folded sheets of music, may require other types of
support. For example, fragile sheet music that has been folded for long periods of
time has a tendency to tear at the fold.
These types of folded sheets shall not be scanned with the crease pressed flat against
the scanning bed. While these sheets can normally be inverted and scanned page-by-
page on a book scanner and sometimes on a typical flatbed scanner, the area or page
that is not being scanned must be supported to prevent damage or undue stress to the
crease or to the pages themselves.
The contractor shall provide a support mechanism that will accommodate these
requirements. This support structure need not be elaborate, but must be functionally
adequate to meet the requirements.
Lighting equipment of all kinds
The Division curator and Library's Conservation Office staff must approve both
general environmental illumination and scanner specific lighting.
2.1.3 Handling Pictorial Materials
The capture device(s) and production workflow to be utilized shall not cause harm to the
materials being scanned. Harm may be caused by such factors as excessive handling, inversion
of fragile items, flattening, surface abrasion, excessive illumination, and excessive heat.
Most of the black-and-white photographic negatives to be scanned are medium-format (4x5 and
5x7 inches) safety film. Other negatives range in size from 35mm to 11x14 inches. Any nitrate-
based negative materials to be scanned will be identified in each project. Work with nitrate-
based film shall be completed in accordance with the special handling rules and requirements as
specified by the Library.
Color transparencies and color negatives range in size from mounted 2 x 2-inch slides to 8 x 10-
inch sheet films. Color film materials are typically housed in Mylar jackets or sleeves within an
additional paper sleeve. All film-based materials, such as black-and-white photonegatives and
color transparencies, not in Mylar sleeves shall be handled with clean cotton gloves and resleeved
into their original housings. When rehousing, the emulsion side of film items shall face the non-
sealed side (the side without an adhesive seam) of the sleeve or jacket. It may be required that
glass negatives be scanned emulsion side up to prevent surface abrasion or image loss, and
laterally reversed during image post processing.
Items identified as either fragile or being curved, cupped, or warped shall not be (1) flattened
against or under glass or (2) turned face down for capture.
2.1.4 Other Materials
Specific handling instructions will be provided for each project. In consideration of the safety
of the collections, the Library may alter handling rules, workstation handling requirements, or
withdraw materials from scanning. In some situations, the Library may require that a Library
employee accompany and/or handle the material at the scanning station.
2.2 Scanner Characterization
The evaluation of technical image quality and adherence to standards generally is approached in
two stages. First, scanners are “characterized” through the use of sample images and targets
designed to measure tonal reproduction, dynamic range, resolution, noise, color accuracy and
additional characteristics that are determined to be of particular importance for a given
application. This data helps in the selection of an appropriate scanner for the project and helps
operators install, configure, and set the equipment controls properly. Second, a target may be
included with every image or on specific occasions such as the first and last image of a
document. This data helps ensure that the ongoing imaging work maintains the original quality
specifications. Historically, the Library has used the USAF 1951 Visual Resolution Test Target
(and similar derivatives) to establish that visual resolution is in compliance with requirements.
Tonal representation has been evaluated through visual inspection of sample and actual images
selected from scanner output.
Now ISO standards for the measurement of the appropriate image quality elements have been
approved and the Library has begun to test and characterize scanners through the use of Standard
ISO targets as well as with sample images. (Note that a more detailed discussion of image
evaluation is provided in the Quality Assurance section later in this document.) The Library is
currently implementing test procedures following the standards listed below:
ISO No. Date
Function (tonal reproduction)
Electronic still-picture cameras -- Methods for
measuring optoelectronic conversion functions (OECFs)
Resolution -- Spatial
Electronic scanners for photographic images -- Part 1:
Scanners for reflective media
Resolution -- Spatial
Electronic scanners for photographic images -- Part 2:
Electronic still picture imaging
Electronic scanners for photographic images
ICC Specification Revision
ICC.1:1998 File format for color profiles, version 4.1
2.2.1 Standard Targets and Tests
2.2.1 - A
In order to verify the calibration of the scanning equipment and to ensure the best possible
images, the Library requires that certain standard targets be scanned and that procedures for use
of the specified targets be followed. The Library may select targets that are appropriate to the
project and these are to be scanned and submitted prior to work commencing. Additionally, the
Library will require delivery of specified scanned technical targets during the installation and
configuration of scanning equipment, and may require them during the ongoing production of
images. Targets to be scanned and delivered will be specified by the Library at the time a project
The Library will provide standard targets to be scanned by the service provider. Corporate
owned targets may be used upon approval by the Library. The targets shall be delivered as
image files for subsequent analysis by the Library. The contractor may include the company
analysis and interpretation.
Target images may be required at the following times:
Prior to the initiation of a project (before a scanner is installed at the Library);
Upon initial installation of equipment at the Library;
When new equipment is installed;
Whenever a new operator is trained to operate the scanning or post-processing
Whenever the Library’s quality review indicates a significant increase in quality
problems. The Library will notify the contractor of this requirement;
On a regular basis for projects when many batches are delivered over a period of
performance greater than one month.
In general the Library expects imaging equipment set to yield images at 1:1 using optical
resolution without resampling. Thus a scanner with 6000 pixels in the long dimension (as
reported in TIFF tag 256, ImageWidth) might be set at a working height to yield 300 pixels per
inch (reported in tag 282, XResolution). A document as large as 20” on its long dimension
could be scanned. A straight vertical line 1/100 of an inch wide on a document would then show
on the image as a straight vertical line exactly 3 pixels wide. The line tonality on the image
would be uniform and similar to the tonality on the document, the line edges would be straight
and precise, and no stray pixels on either side on the line would be darkened. Unfortunately,
many scanners that are presented as “300ppi” in manufacturer’s literature cannot image the line
precisely. Until recently, the standard test of scanner resolution has been a target prepared with
pairs of lines of various widths presented for visual inspection. This test is not very rigorous –
reviewers can distinguish fine lines that are not uniform in tonality and there may be many stray
pixels inaccurately darkened; images from such equipment may look fuzzy and lack fine detail.
ISO standard 16067 was designed to overcome this problem. Frequently the results of the ISO
standard measurement are considerably lower than reported under the older visual inspection
method. Currently the Library uses both visual and ISO 16067 targets to provide a more
complete indication of scanner resolution.
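The arithmetic in the example above can be checked with a minimal sketch. The function names are hypothetical and chosen only for illustration; the numbers (6000 pixels, 300 ppi, a line 1/100 inch wide) come from the paragraph itself.

```python
def max_document_length_in(pixels_long: int, ppi: float) -> float:
    """Longest document dimension (inches) that fits the sensor at 1:1."""
    return pixels_long / ppi


def expected_line_width_px(line_width_in: float, ppi: float) -> float:
    """Ideal width in pixels of a line of the given physical width."""
    return line_width_in * ppi


# Values taken from the paragraph above.
image_width_px = 6000    # TIFF tag 256, ImageWidth
x_resolution_ppi = 300   # TIFF tag 282, XResolution

print(max_document_length_in(image_width_px, x_resolution_ppi))
print(round(expected_line_width_px(1 / 100, x_resolution_ppi), 6))
```

These are ideal figures only. As the text notes, a scanner sold as "300ppi" may not actually render the line as three clean pixels, which is why measured resolution is tested separately.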
2.2.1.A.1 Resolution Targets
The standard measure of resolution is based on Modulation Transfer Function (MTF) per
ISO-16067-1 (for reflective materials) or ISO-16067-2 (for transmission materials) using slant-
edge targets such as the QA-61 or QA-62 Targets. Visual measurements of resolution based on
ISO 12233 Standard targets, on the USAF 1951 test target, or on the RIT Alphanumeric target
may be used to supplement the MTF measurements.
Software produced MTF/SFR curves of vertical and horizontal resolution in both
center and at least one corner.
Visual inspection of the center and at least one corner for both vertical and horizontal
resolution may supplement the MTF analysis.
For content presentations that will not involve OCR, the current standard is a
minimum visual resolution of 300 ppi in the image center (both measurements) with
minimal loss of quality in the corners, and a minimum MTF10 resolution of 300 ppi
in the image center (both vertical and horizontal measurements) with minimal loss
of quality in the corners. A minimum resolution of 400 ppi is considered standard
practice with 300 ppi generally only used for large format materials where lower
resolution is mandated by device limitations and stitching is not practical or desirable.
For images that contain text that will be OCR’ed, the standard will be 400 ppi for
both visual and MTF measurements.
For Rare Book and other special materials the standard is a minimum of 400 ppi and
may be higher as planned project outcomes require.
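The resolution minimums listed above can be summarized as a small lookup, sketched here under stated assumptions: the category labels are hypothetical names invented for this example, and the check simply requires both the visual and the MTF10 measurement to reach the stated minimum.

```python
# Minimum resolution (ppi) per material category, as stated in the list above.
# The category keys are hypothetical labels chosen for this sketch.
MIN_PPI = {
    "standard": 400,      # standard practice for non-OCR content
    "large_format": 300,  # only where device limits apply and stitching is impractical
    "ocr_text": 400,      # images containing text that will be OCR'ed
    "rare_book": 400,     # minimum; projects may require more
}


def meets_minimum_resolution(category: str, visual_ppi: float, mtf10_ppi: float) -> bool:
    """Return True when both measurements reach the category minimum."""
    required = MIN_PPI[category]
    return visual_ppi >= required and mtf10_ppi >= required


print(meets_minimum_resolution("large_format", 310, 300))
print(meets_minimum_resolution("ocr_text", 400, 350))
```

A real acceptance check would also consider corner measurements, which the text asks to show minimal loss relative to the center.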
2.2.1 - B
Eight-bit grayscale is the minimum bit depth required for any digital conversion work at the
Library. Previous and current standards require a visual comparison between the original and the
image. The Library is beginning to formally characterize scanners using the ISO 14524 OECF
targets listed below.
Targets for tonal analysis
Sample scans of selected materials.
ISO-16067-1 based scanner targets such as the Kodak Q-13 target or the QA-61 and
QA-62 targets. (These targets commonly present a 20 step gray scale although
specific targets may vary. When different targets are used, the software necessary to
analyze the target may be requested by the Library)
Software analysis of a 20 patch grayscale to determine the number of discernable
steps, the relationship between steps, and the gamma of the tone curve.
Visual inspection of the sample images to confirm that the tonal match is very close
and that details in the dark and near white areas of the original have not been lost in
Visual inspection of the target to determine distinction of patches, particularly in the
dark grays near black and light grays near white.
No loss of detail in dark and light gray areas of the original.
A minimum of 18 steps should be visible on a 20 step scale.
Software analysis should show appropriate density steps and gamma.
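As a rough illustration of the 18-of-20 criterion above, the following sketch counts how many patches of a scanned gray scale differ measurably from the previously counted patch. This is an assumption-laden proxy, not the Library's analysis software: the `min_delta` threshold (in 8-bit code values) and the function name are hypothetical, and the source describes visual inspection as the primary test.

```python
def count_discernible_steps(patch_means, min_delta=1.0):
    """Count patches whose mean value differs from the last counted
    patch by at least min_delta (hypothetical threshold)."""
    if not patch_means:
        return 0
    steps = 1
    last = patch_means[0]
    for value in patch_means[1:]:
        if abs(value - last) >= min_delta:
            steps += 1
            last = value
    return steps


# Simulated 20-patch scale where the three darkest patches clip to the same value.
patches = [245 - 13 * i for i in range(17)] + [30, 30, 30]
visible = count_discernible_steps(patches)
print(visible, visible >= 18)
```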
2.2.1 - C
The dynamic range of an image is the ratio of the darkest area to the lightest area of the image.
The range of reflective materials is limited – a scanner should be able to reproduce a similar
2.2.1.C.1 Targets for dynamic range analysis
The grayscale targets used to analyze tonality will also be used to analyze dynamic
The density difference between the darkest and lightest discernable patches shall be
determined by visual inspection.
Software will provide similar measurements
For reflective materials, 5.5 f-stops or greater (6 f-stops or 1.9 db is preferred).
For transmission materials, 8 f-stops or greater (10 f-stops or 3.0 db is preferred).
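One way to relate a measured density difference to the f-stop figures above is sketched below. It assumes the parenthetical values (1.9 and 3.0) are optical density ranges, where one f-stop is a factor of two in light and therefore log10(2), about 0.301, in density. That reading is an interpretation and is not stated in the source; the function and dictionary names are hypothetical.

```python
import math

# One f-stop doubles the light, i.e. log10(2) in optical density units.
STOP_IN_DENSITY = math.log10(2)

# Minimum dynamic range in f-stops, from the criteria above.
MIN_STOPS = {"reflective": 5.5, "transmission": 8.0}


def density_range_to_stops(d_max: float, d_min: float) -> float:
    """Convert the density difference between the darkest and lightest
    discernible patches into f-stops."""
    return (d_max - d_min) / STOP_IN_DENSITY


def meets_dynamic_range(media: str, d_max: float, d_min: float) -> bool:
    return density_range_to_stops(d_max, d_min) >= MIN_STOPS[media]


print(round(density_range_to_stops(1.95, 0.05), 2))
print(meets_dynamic_range("reflective", 1.95, 0.05))
```

Under this assumption a density range of 1.9 works out to a little over 6 stops and 3.0 to just under 10 stops, which is broadly in line with the preferred values quoted above.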
2.2.1 - D
Noise introduced by the scanner must be limited and well controlled. Noise is most visible in
broad areas of tonality such as the sky in a photograph or the page background of a manuscript
2.2.1.D.1 Targets for noise analysis
The grayscale targets used to analyze tonality will also be used to analyze noise.
The dark patches will be inspected for visible noise.
Software curves and analytics will be generated.
Software analysis should show well-controlled noise with minimal RGB and
luminance channel variations.
An average luminance channel noise of approximately Y <=5% is expected.
2.2.1 - E
The Library is creating color images for many projects. Two problems are apparent: capturing
accurate color at the scanner and providing information to users that informs them how to
display and print images with reasonably accurate color. The first problem is being analyzed
using the targets, tests, and standards listed below. The Library is now examining how color
profiles are created, checked, and placed within a TIFF image tagged field so that the user is
provided with the necessary data for accurate color information.
2.2.1.E.1 Targets for color accuracy analysis
Gretag Macbeth ColorChecker - the large patch 8.5” x 11.5” target.
The Gretag Macbeth Digital Color Checker when appropriate analytic software is
Visual inspection under ISO standard viewing conditions.
Software generated comparative analysis and delta-E.
A delta-E of less than 8 is expected.
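The source does not say which delta-E formula is used. As a minimal sketch, the example below assumes the simple CIE76 definition (Euclidean distance between two colors in CIELAB space) and compares a measured patch against its reference value. The Lab numbers are made up for illustration.

```python
import math


def delta_e_cie76(lab_ref, lab_measured):
    """CIE76 color difference: Euclidean distance in L*a*b* space.
    (Assumed formula; the source does not name one.)"""
    return math.sqrt(sum((r - m) ** 2 for r, m in zip(lab_ref, lab_measured)))


# Hypothetical reference and scanned values for one ColorChecker patch.
reference = (50.0, 0.0, 0.0)
scanned = (50.0, 3.0, 4.0)

difference = delta_e_cie76(reference, scanned)
print(difference, difference < 8)
```

Later formulas such as CIEDE2000 weight the components differently and would give different numbers for the same pair of colors, so the threshold of 8 should be read together with whichever formula the analysis software reports.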
2.2.1 - F
The Library will supply representative sample page images of text for any project that includes
The Library will OCR and evaluate the supplied sample page images.
OCR test results on library materials vary greatly.
The Library has found that 400 ppi images produce improved OCR.
The Library’s general benchmark is 90% word accuracy on the text sample provided, but
different standards may be specified for special materials, such as pre-1820 newspapers.
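The source does not define how word accuracy is computed. One plausible sketch, assuming accuracy is the share of reference words that appear in the same order in the OCR output, uses the standard-library `difflib.SequenceMatcher` to align the two word sequences. The function name and sample text are hypothetical.

```python
from difflib import SequenceMatcher


def word_accuracy(reference_text: str, ocr_text: str) -> float:
    """Fraction of reference words matched, in order, by the OCR output."""
    ref_words = reference_text.split()
    ocr_words = ocr_text.split()
    if not ref_words:
        return 0.0
    matcher = SequenceMatcher(None, ref_words, ocr_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


reference = "the quick brown fox jumps over the lazy dog again"
ocr_output = "the quick brovvn fox jumps over the lazy dog again"

accuracy = word_accuracy(reference, ocr_output)
print(accuracy, accuracy >= 0.90)
```

Case folding, punctuation handling, and hyphenation at line ends all affect the result and would need to be agreed on before comparing figures against the 90% benchmark.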
2.2.1 - G
The Library is currently experimenting with controlled viewing conditions and monitor
calibration to establish standard environment(s) for reviewing and analyzing target and sample
images. New standards may be published on these topics soon.
2.3 Image Acquisition
2.3.1 Imaging procedures
Many procedures followed in the Library’s digitization process are specific to the resources and
policies of the Library. However, certain standards are central to all imaging projects, including
those performed by contractors either onsite or off.
2.3.1 - A
Target and test scans
Every imaging project requires certain target and test scans. Prior to beginning document
scanning, equipment operators should image the set of targets necessary to characterize the
scanner as described previously. This target set should be repeated as needed to ensure that all
scanning throughout the project meets the standards set at project startup.
If a significant number of images within a batch fail to meet the project specifications,
the Library may require the entire batch be rescanned.
At project startup and on occasion throughout the project, sample scans of typical documents
will be requested and evaluated as described below to ensure that quality standards are met
throughout the project.