What Word documents does it support?
Docx4j can read/write docx documents created by or for Word 2007 or later, plus earlier versions which
have the compatibility pack installed.
The relevant parts of docx4j are generated from the ECMA schemas, with the addition of the key
Microsoft proprietary extensions. For unsupported extensions, docx4j gracefully degrades to the
specified 2007 substitutes.
It is not really intended to read or write Word 2003 XML documents, although the package
org.docx4j.convert.in.word2003xml is a proof of concept for importing such documents.
For more information, please see Specification versions below.
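Before handing a file to docx4j it can help to check which container format it actually is, since a file's extension is not always reliable. The following is a minimal sketch, not part of docx4j: the class name WordFormatSniffer and its Kind enum are hypothetical, and it only inspects the leading bytes. It assumes Java 11 or later (for InputStream.readNBytes). A docx is a ZIP package, a legacy binary .doc is an OLE2 compound file, and anything else (including Word 2003 XML, which is plain XML text) is reported as UNKNOWN.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration only; docx4j does not ship this class.
class WordFormatSniffer {

    // ZIP local file header signature "PK\003\004" (docx/pptx/xlsx are ZIP packages).
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // OLE2 compound file signature (used by legacy binary .doc files).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    enum Kind { ZIP_PACKAGE, OLE2_BINARY, UNKNOWN }

    // Classify a file from its first bytes.
    static Kind sniff(byte[] header) {
        if (startsWith(header, ZIP_MAGIC)) {
            return Kind.ZIP_PACKAGE;
        }
        if (startsWith(header, OLE2_MAGIC)) {
            return Kind.OLE2_BINARY;
        }
        return Kind.UNKNOWN;
    }

    // Convenience overload: read up to 8 bytes from the file and classify them.
    static Kind sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return sniff(in.readNBytes(8));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

A ZIP_PACKAGE result only says the file is a ZIP archive; whether it is a valid docx is still decided when docx4j loads it.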
Handling legacy binary .doc files
Apache POI's HWPF can read .doc files, and docx4j's org.docx4j.convert.in.Doc does use this for
basic conversion of .doc to .docx. The problem with this approach is that POI's HWPF code fails on
many .doc files.
An effective approach is to use LibreOffice or OpenOffice (via jodconverter) to convert the doc to docx,
which docx4j can then process. If you need to return a binary .doc, LibreOffice or
OpenOffice/jodconverter can convert the docx back to .doc.
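As a rough illustration of the conversion step, the sketch below invokes LibreOffice's command-line converter directly rather than going through jodconverter, so it is a substitute for, not an example of, the jodconverter API. The class name DocToDocxCommand is hypothetical, and the code assumes a LibreOffice executable (passed in by the caller, for example "soffice") is installed and reachable. The --headless, --convert-to and --outdir options are LibreOffice command-line parameters.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration only; not part of docx4j or jodconverter.
class DocToDocxCommand {

    // Build the LibreOffice command line that converts `input` to `targetFormat`
    // (for example "docx", or "doc" for the reverse direction) into `outputDir`.
    static List<String> build(String sofficeExecutable, Path input,
                              Path outputDir, String targetFormat) {
        return List.of(
            sofficeExecutable,
            "--headless",                  // run without a user interface
            "--convert-to", targetFormat,  // desired output format
            "--outdir", outputDir.toString(),
            input.toString());
    }

    // Run the command and return the process exit code.
    // Assumption: the caller checks afterwards that the output file exists.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }
}
```

For example, DocToDocxCommand.run(DocToDocxCommand.build("soffice", Path.of("report.doc"), Path.of("out"), "docx")) would be expected to leave a docx in the out directory, which docx4j can then load. Passing "doc" as the target format covers the return trip to a binary .doc.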
Getting Help: the docx4j forum
Free community support is available in the docx4j forum, at http://www.docx4java.org/forums/ and on
Stack Overflow.
Before posting, please:
- check this document doesn't answer your question
- try to help yourself: people are unlikely to help you if it looks like you are asking someone else
  to do lots of work you presumably are being paid to do!
- ensure your post says which version of docx4j you are using, and contains your Java code
  (between [java] .. and .. [/java]) and XML (between [xml] .. and .. [/xml]), and if appropriate a
  docx/pptx/xlsx attachment
- consider browsing relevant docx4j source code
This discussion is generally in English. If you would like to moderate a forum in another language (for
example, French, Chinese, Spanish…), please let us know.