Quick Word Recovery
-
Version
0.6.0.1
Newer-format MS Word .docx files are really zipped
collections of mostly XML files, with the document text
stored entirely in the document.xml file. If this file
contains even one XML error, Word will report an error
and refuse to display your document. This program works
by truncating document.xml 50 characters before the
first XML error and then adding the closing tags
automatically with xmllint.
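The steps above can be sketched in Python as follows. This
is only an illustration, not the program's actual code: the
function names are hypothetical, the error position comes
from Python's expat parser rather than whatever validator
the program uses, and the use of xmllint's --recover option
is an assumption about how the closing tags get added. It
also assumes the zip container itself is still readable.

```python
import subprocess
import xml.parsers.expat
import zipfile


def read_document_xml(docx_path: str) -> bytes:
    """Pull word/document.xml out of the .docx zip archive."""
    with zipfile.ZipFile(docx_path) as archive:
        return archive.read("word/document.xml")


def first_xml_error_offset(data: bytes):
    """Return the byte offset of the first XML error, or None if well-formed."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError:
        return parser.ErrorByteIndex
    return None


def truncate_before_error(data: bytes, margin: int = 50) -> bytes:
    """Cut the data `margin` bytes before the first reported error."""
    offset = first_xml_error_offset(data)
    if offset is None:
        return data  # nothing to repair
    return data[:max(0, offset - margin)]


def close_tags_with_xmllint(xml_path: str) -> bytes:
    """Assumption: let `xmllint --recover` re-close the open tags.

    xmllint may exit non-zero on a damaged file, so the return code
    is not checked here; only its standard output is used.
    """
    result = subprocess.run(
        ["xmllint", "--recover", xml_path], capture_output=True
    )
    return result.stdout
```

A caller would write the truncated bytes to a temporary
file, pass that file to close_tags_with_xmllint, and put
the result back into a copy of the archive as
word/document.xml.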
The number of characters truncated is adjustable. The
truncation is needed because the XML validator often
reports the first XML error only several characters
after the actual corruption begins. If the 50-character
default ends the file inside a complex tag, xmllint may
not be able to close the open tags correctly. In those
cases it is useful to truncate fewer or more characters
so that the file ends in a text/data region instead of
a complex tag, or at least in a tag where xmllint knows
how to truncate and end the file by itself (xmllint
apparently does some truncating of its own).
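One way to look for a truncation amount that ends the file
in a text region is sketched below. This is a heuristic
written for illustration, not a feature of the program: the
function names and the list of candidate amounts are
hypothetical, and the check is approximate because a ">"
may legally appear inside an attribute value.

```python
def ends_inside_tag(fragment: bytes) -> bool:
    """True if the fragment stops after a '<' with no closing '>' yet."""
    return fragment.rfind(b"<") > fragment.rfind(b">")


def pick_margin(data: bytes, error_offset: int,
                margins=(50, 40, 60, 30, 70)):
    """Return the first candidate margin whose cut point is outside a tag.

    Returns None if every candidate would end the file inside a tag.
    """
    for margin in margins:
        cut = max(0, error_offset - margin)
        if not ends_inside_tag(data[:cut]):
            return margin
    return None
```

If pick_margin returns None, trying further amounts by
hand is still the fallback.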