Computer-Assisted Translation Tools
The Translator’s Tool Box - © International Writers’ Group, LLC 303
As mentioned above, Xbench (see www.xbench.net) distinguishes itself by the
large range of bilingual file formats it supports and has become the preferred
QA tool for many in the field.
Figure 202: Supported file formats in Xbench
Aside from these desktop-based tools, there are also cloud-based quality
assurance tools for which you upload all or part of your content to an online
location.
This includes the Russian tool multiQA (see multiqa.com). Though multiQA
performs the typical variety of QA checks (albeit in a more limited fashion and
for a more limited number of formats), it really distinguishes itself with
terminology checks using specifically developed morphological engines (for
English, Russian, German, Ukrainian, Kazakh, Chinese, Spanish, Polish,
Slovak, Czech and Norwegian). This means that it allows you to check your
translation against a terminology database or glossary and won’t give you a
"false positive" when you have the singular nominative form in the termbase
and a plural form in the translation.
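To illustrate why morphology matters for terminology checks, here is a minimal Python sketch. It is not multiQA's engine: the function names naive_lemma and term_is_used are hypothetical, and the crude English-only plural stripping merely stands in for a real morphological engine.

```python
import re


def naive_lemma(word):
    """Rough English-only normalization (assumption: a stand-in for a real morphological engine)."""
    w = word.lower()
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"          # memories -> memory
    if w.endswith("es") and w[:-2].endswith(("s", "x", "z", "ch", "sh")):
        return w[:-2]                # boxes -> box
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]                # tools -> tool
    return w


def term_is_used(target_term, target_segment):
    """True if the termbase entry occurs in the segment once both are normalized."""
    term_lemmas = [naive_lemma(w) for w in re.findall(r"\w+", target_term)]
    seg_lemmas = [naive_lemma(w) for w in re.findall(r"\w+", target_segment)]
    n = len(term_lemmas)
    if n == 0:
        return False
    # Slide a window over the segment and compare normalized word sequences.
    return any(seg_lemmas[i:i + n] == term_lemmas
               for i in range(len(seg_lemmas) - n + 1))


# A plain string search would miss the plural form and report a false positive.
print(term_is_used("translation memory", "Both translation memories were updated."))  # True
```

A simple string comparison would flag this segment because "translation memory" does not literally occur in it; normalizing both sides first avoids that.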
A tool that was developed specifically for the application of the MQM quality
evaluation framework is the open-source tool translate5 (see
www.translate5.net). It presently supports only the Trados Studio-specific
format SDLXLIFF as well as generic CSV files, but since it’s open-source it can
be extended to support a larger range of bilingual file formats.
Here is what the interface of translate5 looks like:
Figure 203: translate5’s interface with an example of a critical terminology error
Since the interface is highly customizable, it might look very different when
you use it. The point is that the reviewer just needs to highlight any part of
the segment in question and select the necessary quality category from a
hierarchy-based picklist. Once that is done, you’ll immediately be able to see
tags added to the source or target text around the problematic sub-segment.
And when you’re done with your review, you can filter the text according to
the quality metrics and export or view reports and final assessments on them.
This approach is really helpful because it’s a completely browser-based, and
therefore highly accessible, process (even for the reviewer on the client’s side
who does not have access to a translation environment tool). And yet, any
changes that you or any proofreader or editor makes in translate5’s interface
can be reimported into Trados Studio as tracked changes, and comments that
were entered in translate5 can be transferred to Trados Studio as well.
Source Document Quality Assurance
Quality assurance tools that don’t rely on translation memory technology
include Acrocheck (see www.acrolinx.com) and FormatCheckers for Word and
FrameMaker from Star (see star-group.net/en/products/formatchecker.html).
Acrocheck is a corporate tool to check terminology, style and grammar of
documents in German, English and French. That doesn’t sound too fancy, but
I was really impressed by the program’s level of "intelligence." For instance,
you can load customized style sheets or modify the pre-loaded ones so that
the program steers authors to stay within the desired parameters. One of the
goals is to optimize your return from translation memory: the more
consistent the source text, the higher the return from TM.
Unfortunately, there is no desktop version available. However, the makers of
Acrocheck are looking at marketing a desktop version in the mid-term future.
A similar tool, though not quite as comprehensive (but available in a desktop
version), is Star’s FormatCheckers for Word and FrameMaker. This tool checks
about 50 different potential errors in Word or FrameMaker documents,
ranging from typographical errors to duplicated spaces, paragraph marks,
manual references, and many others. Much like Acrocheck, the intention is to
create well-formed documents before the translation even starts, thus aiming
at a better return on translation memory matches and/or better entry of data
into the translation memory.
Figure 204: Error checking in Star’s FormatChecker
A slightly different yet powerful way to maximize the use of translation
memory content is to author (i.e., write) the source document on the basis of
the translation memory. There are several tools that offer this feature,
including Congree, based on a partnership between the TEnT vendor Across
and the Society for the Promotion of Applied Information Sciences at
Saarland University (see www.congree.com), and Star MindReader (see
star-group.net/en/products/mindreader.html).
Though these do not strictly fall into the category of quality assurance, this is
a particularly exciting family of tools. Tools that allow authoring on the basis
of a translation memory not only extend the use of the translation memory (it
is obvious that you will have a huge number of matches in the translation
portion of a project if you adjust your writing to the source part of the
translation memory in the first place), but they also offer a whole new world of
opportunities to language providers! All of a sudden, authoring may become a
service that is much easier to add to the portfolio of individuals or companies
who have so far specialized in translation only.
Translation Memory Quality Maintenance
Some of the above-mentioned tools provide quality assurance features for
existing translation memories, but none of them allows you to maintain
translation memories on a large scale. While some TEnTs offer decent features
to do that, many do not. Olifant (see okapi.sourceforge.net/Release/Olifant/
Help/) allows you to open (and save) TMX, Trados 2007 and Wordfast
translation memories and filter and manipulate them to your heart’s content,
all in a clean, multi-column interface.
Figure 205: Open TMX file in Olifant with the available commands in the View menu
The TEnT developer Heartsome also developed the TMX Editor, an editor for
the translation memory exchange format. It’s Java-based so it runs on all
platforms, and it’s surprisingly fast. Just like the translation environment tool
(see page 232), it has also been released for free with an open-source license.
You can download it at github.com/heartsome/tmxeditor8.
As you can see in the following screenshot, it also offers a quality assurance
filter that you apply to your TMX file. Once problematic translation units are
found, you can either batch change or delete them or process them
individually. Other features include the modification and/or adding of
metadata (data about the translation unit), changing of the code page,
merging or splitting TMX files, or exporting TMX files into a great number of
other formats, including a number of text formats and Word or Excel formats.
Figure 206: Heartsome’s TMX Editor
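To give a rough idea of what a quality assurance filter on a TMX file does, here is a minimal Python sketch that uses only the standard library. The three checks (empty target, target identical to source, double space) and the function name flag_units are assumptions chosen for illustration; they are not the actual checks or code of Heartsome's TMX Editor.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def flag_units(tmx_text, source_lang, target_lang):
    """Return (index, reason) pairs for translation units that look suspicious."""
    root = ET.fromstring(tmx_text)
    flagged = []
    for index, tu in enumerate(root.iter("tu")):
        segs = {}
        for tuv in tu.iter("tuv"):
            # Older TMX versions use a plain "lang" attribute instead of xml:lang.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            # itertext() also collects any text nested inside inline elements.
            segs[lang] = "".join(seg.itertext()).strip() if seg is not None else ""
        source = segs.get(source_lang.lower(), "")
        target = segs.get(target_lang.lower(), "")
        if not target:
            flagged.append((index, "empty target"))
        elif target == source:
            flagged.append((index, "target identical to source"))
        elif "  " in target:
            flagged.append((index, "double space in target"))
    return flagged
```

Calling flag_units(tmx_text, "en", "de") on the contents of a TMX file returns a list of unit positions and reasons, which you could then review individually or process in a batch.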
ApSIC Xbench (see page 317) also allows for the conversion of translation
memory formats or other database exchange formats, and so do some of the
tools that are provided along with Swordfish (see page 232).
Terminology Mining
Terminology mining programs offer the possibility of extracting terminology
and building up terminology databases or glossaries by taking existing pairs of
source and target documents or bilingual translation memories, analyzing
them, and presenting you with a proposed translated terminology list. Once
this list is generated, it can be used either as a primary glossary for a project
(or sent to the client) or as a common glossary that can be shared among
multiple translators working on the project.
A number of TEnTs, including Across, CafeTran, Déjà Vu, Fluency, memoQ,
Star Transit, Text United and XTM, offer monolingual term extraction as a
regular part of the ongoing translation process or as a separate feature to
build up glossaries or terminology databases.
There are standalone tools for this process as well. SDL’s MultiTerm Extract
(see www.translationzone.com/products/sdl-multiterm/extract) works on a
purely mathematical level ("if word A always appears in sentences for which
word B always appears in the translated sentence, then these words must
form a word pair"). This means it supports all Windows-based languages.
Figure 207: Trados’ MultiTerm Extract (formerly ExtraTerm)
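The purely mathematical idea quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name candidate_pairs is hypothetical, and the Dice coefficient is one common co-occurrence measure that I picked for the sketch; it is not necessarily what MultiTerm Extract uses internally.

```python
from collections import Counter
from itertools import product


def candidate_pairs(segment_pairs, min_score=1.0):
    """Score source/target word pairs by how consistently they co-occur."""
    src_count, tgt_count, both_count = Counter(), Counter(), Counter()
    for source, target in segment_pairs:
        # Use sets so that each word counts once per segment.
        src_words = set(source.lower().split())
        tgt_words = set(target.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        both_count.update(product(src_words, tgt_words))
    scored = []
    for (s, t), both in both_count.items():
        # Dice coefficient: 1.0 means the two words never occur without each other.
        dice = 2 * both / (src_count[s] + tgt_count[t])
        if dice >= min_score:
            scored.append((s, t, dice))
    # Best scores first; ties are sorted alphabetically.
    return sorted(scored, key=lambda item: (-item[2], item[0], item[1]))


pairs = [
    ("open file", "abrir archivo"),
    ("close file", "cerrar archivo"),
    ("open window", "abrir ventana"),
]
for source_word, target_word, score in candidate_pairs(pairs):
    print(source_word, target_word, score)
```

Because nothing in this calculation depends on linguistic knowledge, it works for any language pair whose text can be split into words. On a corpus as tiny as this one, rare words easily produce perfect scores by coincidence, so real data would call for a minimum frequency threshold.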
Other tools use a combination of mathematics and linguistic data.
The most powerful application in the field of term extraction used to be the
Xerox Terminology Suite (XTS), which was designed for the deep pockets of
corporate users and was very powerful because it was based on preconfigured
linguistic data in various languages. The suite passed to TEMIS, which was
later acquired by Expert System, effectively halting any development for
translation-related purposes.
However, the translation environment tool Similis (see page 212) has
integrated the XTS engine and therefore comes with a very high-level
linguistic "knowledge" in seven EU languages (English, Dutch, German,
Spanish, Italian, Portuguese and French). Similis is able to apply a
combination of linguistic and statistical rules to a number of processes,
including automatic extraction of terms and phrases from translation memory
content with extremely high accuracy—but unfortunately only in a handful of
languages.
Figure 208: Similis with automatically extracted terminology
Another terminology extraction tool is SynchroTerm (see
www.terminotix.com). SynchroTerm aligns texts in their various formats to
bitexts, extracts terminology, and presents long lists of terminology with
reference information that can then be verified, annotated, and exported to a
number of formats, including LogiTerm, the two SDL MultiTerm formats, Excel
and the machine translation tool PROMT.
Aside from extracting terminology from aligned documents, it is also possible
to import TMX translation memories and have terminology extracted. There
are a good number of settings that allow you to govern the extraction,
including many fields you can automatically add to each term pair (so that
your TEnT will be able to take that into consideration when processing the
data and make suggestions to you).
Theoretically, all languages are supported with the tool; however, practically
speaking there are different tiers of language support. In general,
SynchroTerm relies on mathematical calculations to extract terminology pairs.
For a great number of Western languages it also uses long lists of stop words
to filter those out automatically, and for English and French it also makes use