The National Archives File Format Conversion Version: 1.2
Page 15 of 29
the conversion happening dynamically on
request. You may, however, store a
converted file to speed up any future
requests.
• You do not need to convert a large number of files in one go, which may be time consuming.
• Adding new files to the system is easy, as you have no need to provide them in all required
  formats upfront.
• The system can be updated to provide different formats as the need arises, again without having
  to process all your existing files up front.
adopt this strategy, you must assure
yourself that the conversion process is
sufficiently reliable for your requirements.
• The systems you are using may not allow you to issue dynamic requests for files in different
  formats. For example, if your files are accessed via a network file share, there is no way to
  interpose an on-demand conversion server.
• The system will require updating as different source formats are introduced.
• On-demand conversion may be slow, or place too great a load on your systems, depending on the
  size, complexity and number of conversions.
• This strategy generally only makes sense for static information. If editing of the data by users
  is required, then an on-demand format conversion strategy may not work, unless there is a clear
  master version, and only that version can be changed.
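As a rough illustration of the on-demand strategy described above, the sketch below converts a file only when a format is first requested and stores the result so that later requests are served from the stored copy. The names `OnDemandConverter` and `convert_fn`, and the layout of the cache folder, are assumptions made for this example and do not refer to any particular product. The conversion itself is a placeholder (upper-casing text) standing in for a real conversion tool.

```python
import tempfile
from pathlib import Path
from typing import Callable


class OnDemandConverter:
    """Convert a master file to a requested format only when it is asked for."""

    def __init__(self, cache_dir: Path, convert_fn: Callable[[bytes, str], bytes]):
        self.cache_dir = cache_dir
        self.convert_fn = convert_fn  # hypothetical stand-in for a real conversion tool
        self.conversions_run = 0

    def get(self, master: Path, target_format: str) -> bytes:
        # Include the master's modification time in the cache name so that a
        # changed master version is converted again rather than served stale.
        stamp = master.stat().st_mtime_ns
        cached = self.cache_dir / f"{master.stem}.{stamp}.{target_format}"
        if cached.exists():
            return cached.read_bytes()
        converted = self.convert_fn(master.read_bytes(), target_format)
        self.conversions_run += 1
        cached.write_bytes(converted)
        return converted


def demo() -> tuple[bytes, bytes, int]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        master = root / "minutes.txt"
        master.write_bytes(b"agreed actions")
        cache = root / "cache"
        cache.mkdir()
        # Placeholder conversion: upper-casing stands in for a format change.
        converter = OnDemandConverter(cache, lambda data, fmt: data.upper())
        first = converter.get(master, "up")
        second = converter.get(master, "up")
        return first, second, converter.conversions_run
```

In this sketch the second request does not run the conversion again, which reflects the option of storing a converted file to speed up future requests. Only the master file is ever edited; the stored copies are derived from it.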
3.2 Early and regular conversion
Early conversion means that you have decided to convert files to different formats as soon as
you can (but not on-demand). Early conversion is a batch-processing strategy: you convert a body of
files in a common format into another format which better fits your business requirements. It is
generally a replacement process. For example, if you have decided to use
a newer format provided by some upgraded software, you may convert all your previous files
into the new format.
Benefits
• The number of different formats you need to support is greatly reduced, converging your files on
  to a standardised set of formats. This can mean:
  o information is always encoded in a currently supported format
  o reduced support, maintenance and software licensing costs
  o increased flexibility in choosing alternate software to use
  o the risk of file format obsolescence becomes negligible.
• You have the opportunity to review information and allow for quality assurance of the files. With
  frequent conversion, these processes will be streamlined and each conversion will benefit from
  previous experience.
Downsides
• Each file has more frequent conversion and each conversion has an associated cost and risk of
  information loss.
• If your original or new formats are fairly new:
  o conversion tools may not be as readily available, may have bugs or fail to deal with complex or
    unusual files well. This can also impact both the cost and quality of your conversion process.
  o the new format may not be as widely supported, so you may also have to create additional
    formats if you need to share the information with users who have not yet upgraded.
• If you need the same information to be accessible in multiple formats, storing all the converted
  files will take more space than using on-demand conversion.
3.3 Late conversion
Late conversion means you have decided to defer conversion to the last sensible moment.
Obviously, the definition of “last sensible moment” will vary based on your own assessment of
the risks and benefits involved in your own environment.
For example, following a risk assessment of the file formats in use in your organisation, you may
find that you have a large amount of legacy information recorded in ten different file formats,
some of which are not accessible any more using current software. Some of this information
may not be needed for active business use; hence a preservation strategy is employed.
However, some of the information is still occasionally required, so a different format is selected
for this.
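One starting point for the kind of risk assessment described above is a simple inventory of the formats you hold. The sketch below counts files by extension under a folder. It assumes that file extensions are a reasonable first indication of format, which is not always true, so a dedicated format identification tool would give more reliable results. The function name `count_formats` and the sample file names are invented for this example.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_formats(root: Path) -> Counter:
    """Count files under root by lower-cased extension (a rough proxy for format)."""
    counts: Counter = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under a placeholder label.
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


def demo() -> Counter:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "legacy").mkdir()
        for name in ["a.doc", "b.DOC", "legacy/c.wpd", "README"]:
            (root / name).write_text("sample")
        return count_formats(root)
```

A count like this may help show how many distinct formats are in use and which of them hold the most files, before you decide which need preservation and which need conversion.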
Benefits
• Each file has less frequent conversion, therefore there is a lower risk of information loss and
  lower overall costs.
• If your target format is well established:
  o there will probably be far more conversion tools available to use
  o existing conversion tools will probably deal with unusual or complex files better, as there has
    been time for bugs and edge-cases to be worked out.
• You may be able to discard older information no longer deemed useful to the business, avoiding
  the need to convert at all.
Downsides
• You will have a greater variety of formats in use in your organisation at any one time. This can:
  o increase support, maintenance and software licensing costs
  o reduce your flexibility to choose different software
  o prevent older information from being usable in newer contexts.
• You will probably have to convert a greater number of files and a greater variety of formats in
  one go, making this a larger project to manage and more complicated to quality assess.
• You may misjudge the “last sensible moment” and find that converting some information is now
  economically or technically unfeasible.
• If you need the same information to be accessible in multiple formats, storing all the converted
  files will take more space than using on-demand conversion.
4. How to convert formats
Any major format conversion project should be managed using your organisation’s change
management processes⁹ including making appropriate impact assessments, risk analysis,
quality assurance and communications. You will need to work alongside a number of different
people in your organisation, including the relevant Information Asset Owners (IAOs) and primary
users of the information so that you understand their requirements, and that they understand
the changes.
This section presents a simple methodology for converting files from one format to another. It
will give you the steps you should go through in performing a file format conversion process,
and flag up areas of potential risk that you should consider.
Assuming you have already understood your drivers for conversion, and chosen when you need
to convert your files, follow these four steps:
• assess your information (see section 4.1)
• assess your environment (section 4.2)
• select your migration tools (section 4.3)
• migrate your files (section 4.4).
4.1 Assess your information
When assessing your information, you need to consider your business requirements – that is
how you need to be able to find, open, work with, understand and trust your information.¹⁰
These requirements may not be immediately obvious and you should liaise with the owner and
principal users of the information to ensure all their requirements are met. This will help inform
whether the information contained in the formats you are migrating from has particular
characteristics that you want to ensure remain unchanged. Some conversion processes only
change the format of the underlying information, but many conversion processes will alter some
aspect of the information as well. In general, very simple types of information can survive a
conversion process without change, but complex information will be altered in some way.
⁹ Digital Continuity for Change Managers:
nationalarchives.gov.uk/documents/information-management/digital-continuity-for-change-managers.pdf
¹⁰ See Identifying Information Assets and Business Requirements for more information:
nationalarchives.gov.uk/documents/information-management/identify-information-assets.pdf
For example, you may be converting from one document format to another. It is possible that
while the text of the document remains unchanged, the pagination, colours, styles and fonts
used within it will be altered in conversion.
Before conversion, identify the key characteristics of your information that must survive
conversion without (or with little) change. You should be aware that features which you do not
regard as essential may in fact be essential because of the way in which they have been used.
For example, while you may not regard the colours in a document to be important, users may
have annotated minutes using green to indicate things that are complete, and red for things that
are unfinished. Or the pagination of a document may change, breaking page references
embedded in the document, rendering a contract unusable. It is important to review your
information to determine whether changes to any aspect of it could subtly affect its meaning.
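A minimal way to record and check key characteristics is to list them for each file before conversion and compare them afterwards. In the sketch below the characteristic names (`text`, `page_count`, `colours`), the sample values and the `compare_characteristics` function are illustrative assumptions. How you actually extract these values depends on your formats and tools.

```python
def compare_characteristics(before: dict, after: dict, must_survive: set) -> dict:
    """Return the key characteristics whose values changed or were lost."""
    changed = {}
    for name in sorted(must_survive):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


# Hypothetical values extracted from a document before and after conversion.
before = {"text": "Minutes of meeting", "page_count": 12, "colours": ["green", "red"]}
after = {"text": "Minutes of meeting", "page_count": 13, "colours": []}

changed = compare_characteristics(before, after, {"text", "page_count", "colours"})
for name, (old, new) in changed.items():
    print(f"{name}: {old!r} -> {new!r}")
```

Here the text survives, but the page count and colours do not. Whether that matters depends on how those features were used, for example colour-coded annotations or page references in a contract.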
There are often some less obvious characteristics you also should consider, typically related to
complex or hidden functionality in the format. Below is a non-exhaustive list of a few of them:
Characteristic: Embedded metadata
Example: Many formats allow various pieces of descriptive metadata to be embedded in them. For
example, documents recording the author of the document, and photographs recording the geographic
location at which it was taken and the camera settings used.
Factors to consider:
• Whether any embedded metadata is required in the converted files.
• Whether your conversion tools will move this information across.

Characteristic: Embedded objects
Example: Many complex formats allow other files or formats to be embedded within them. For
example, documents may contain embedded images or spreadsheets, or presentations may contain
videos.
Factors to consider:
• Not all conversion tools will be able to deal with all kinds of embedded objects.
• You must test files with embedded objects to quality assure that the conversion process will
  work for them.

Characteristic: Scripts and macros
Example: Some formats can contain mini-programming languages. For example, documents often have
a macro feature which allows common tasks to be automated. In general, scripts and macros do not
survive conversion processes, unless the conversion is from one version of a format to another of
the same format. Occasionally, another format will provide the same support for the same embedded
scripts or macros, or provide equivalent ones, but this is rare.
Factors to consider:
• If you need the support of scripts and macros in your files, you may need to rewrite these
  manually for the newer format.

Characteristic: Digital signatures
Example: Some files allow digital signatures to be embedded within them (or you may have digital
signatures in external systems relating to those files). Digital signatures validate that a file
was signed by an authorised user, using strong cryptography over all of the information in the
file to prove the assertion.
Factors to consider:
• Any converted file, being different to the original, will lose this digital signature (or the
  signature will no longer be valid), and you will need to produce a new digital signature for it.
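The effect of conversion on digital signatures can be illustrated with a keyed hash, which is a simplified stand-in for a real digital signature scheme (real schemes normally use public-key cryptography). In the sketch below the key, the file contents and the function names are all invented for the example. A signature computed over the original bytes no longer verifies once the bytes have changed, so the converted file has to be signed again.

```python
import hashlib
import hmac

SIGNING_KEY = b"example-key"  # assumption: placeholder key for illustration only


def sign(data: bytes) -> str:
    """Compute a keyed SHA-256 digest over all of the file's bytes."""
    return hmac.new(SIGNING_KEY, data, hashlib.sha256).hexdigest()


def verify(data: bytes, signature: str) -> bool:
    """Check that the signature matches these exact bytes."""
    return hmac.compare_digest(sign(data), signature)


original = b"Contract v1 in the old format"
converted = b"Contract v1 in the new format"  # same content, different encoding

original_signature = sign(original)
original_still_valid = verify(original, original_signature)
converted_valid = verify(converted, original_signature)
new_signature = sign(converted)  # the converted file needs its own signature
```

Because the signature covers every byte of the file, even a conversion that leaves the visible content unchanged produces a file that the old signature does not match.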
You must make sure that your new format supports the required capabilities and that the
conversion process will maintain the characteristics through the transfer. If your new format
does not support the capabilities you may need to re-evaluate your choice of format, or whether
to migrate at all. The process for assessing file formats is described in another document:
Evaluating your File Formats.¹¹
¹¹ See Evaluating Your File Formats:
nationalarchives.gov.uk/documents/information-management/evaluating-file-formats.pdf